mirror_ubuntu-kernels

mirror of https://git.proxmox.com/git/mirror_ubuntu-kernels.git synced 2026-01-17 17:58:29 +00:00

Author	SHA1	Message	Date
Steven Rostedt	7e49fcce1b	trace, lockdep: manual preempt count adding for local_bh_disable Impact: fix to preempt trace triggering lockdep check_flag failure In local_bh_disable, the use of add_preempt_count causes the preempt tracer to start recording the time preemption is off. But because it already modified the preempt_count to show softirqs disabled, and before it called the lockdep code to handle this, it causes a state that lockdep can not handle. The preempt tracer will reset the ring buffer on start of a trace, and the ring buffer reset code does a spin_lock_irqsave. This calls into lockdep and lockdep will fail when it detects the invalid state of having softirqs disabled but the internal current->softirqs_enabled is still set. The fix is to manually add the SOFTIRQ_OFFSET to preempt count and call the preempt tracer code outside the lockdep critical area. Thanks to Peter Zijlstra for suggesting this solution. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-23 11:10:57 +01:00
Matthew Ranostay	fd8757aed1	Add PCI DFI vendor ID Add a define for DFI PCI vendor id. Signed-off-by: Matthew Ranostay <mranostay@embeddedalley.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2009-01-23 08:08:27 +01:00
Benjamin Thery	4feb88e5c6	netns: ipmr: enable namespace support in ipv4 multicast routing code This last patch makes the appropriate changes to use and propagate the network namespace where needed in IPv4 multicast routing code. This consists mainly in replacing all the remaining init_net occurences with current netns pointer retrieved from sockets, net devices or mfc_caches depending on the routines' contexts. Some routines receive a new 'struct net' parameter to propagate the current netns: * vif_add/vif_delete * ipmr_new_tunnel * mroute_clean_tables * ipmr_cache_find * ipmr_cache_report * ipmr_cache_unresolved * ipmr_mfc_add/ipmr_mfc_delete * ipmr_get_route * rt_fill_info (in route.c) Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-22 13:57:41 -08:00
Benjamin Thery	5c0a66f5f3	netns: ipmr: store netns in struct mfc_cache This patch stores into struct mfc_cache the network namespace each mfc_cache belongs to. The new member is mfc_net. mfc_net is assigned at cache allocation and doesn't change during the rest of the cache entry life. A new net parameter is added to ipmr_cache_alloc/ipmr_cache_alloc_unres. This will help to retrieve the current netns around the IPv4 multicast routing code. At the moment, all mfc_cache are allocated in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-22 13:57:36 -08:00
Yinghai Lu	d52a61c04c	irq: clean up irq stat methods David Miller suggested, related to a kstat_irqs related build breakage: > Either linux/kernel_stat.h provides the kstat_incr_irqs_this_cpu > interface or linux/irq.h does, not both. So move them to kernel_stat.h. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-22 10:18:58 +01:00
Thomas Gleixner	6552ebae25	Merge branch 'core/debugobjects' into core/urgent	2009-01-22 10:03:02 +01:00
Thomas Gleixner	336f6c322d	debugobjects: add and use INIT_WORK_ON_STACK Impact: Fix debugobjects warning debugobject enabled kernels spit out a warning in hpet code due to a workqueue which is initialized on stack. Add INIT_WORK_ON_STACK() which calls init_timer_on_stack() and use it in hpet. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-01-22 10:02:07 +01:00
Cyrill Gorcunov	273ec51dd7	net: ppp_generic - introduce net-namespace functionality v2 - Each namespace contains ppp channels and units separately with appropriate locks Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 15:55:35 -08:00
Mark McLoughlin	9f4d26d0f3	virtio_net: add link status handling Allow the host to inform us that the link is down by adding a VIRTIO_NET_F_STATUS which indicates that device status is available in virtio_net config. This is currently useful for simulating link down conditions (e.g. using proposed qemu 'set_link' monitor command) but would also be needed if we were to support device assignment via virtio. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added future masking) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:34:53 -08:00
Gerrit Renker	883ca833e5	dccp: Initialisation and type-checking of feature sysctls This patch takes care of initialising and type-checking sysctls related to feature negotiation. Type checking is important since some of the sysctls now directly impact the feature-negotiation process. The sysctls are initialised with the known default values for each feature. For the type-checking the value constraints from RFC 4340 are used: * Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes), tested and confirmed that it works up to 4294967295 - for Gbps speed; * Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer); * CCIDs are between 0 .. 255; * request_retries, retries1, retries2 also between 0..255 for good measure; * tx_qlen is checked to be non-negative; * sync_ratelimit remains as before. Notes: ------ 1. Die s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c. 2. As pointed out by Arnaldo, the pattern of type-checking repeats itself in other places, sometimes with exactly the same kind of definitions (e.g. "static int zero;"). It may be a good idea (kernel janitors?) to consolidate type checking. For the sake of keeping the changeset small and in order not to affect other subsystems, I have not strived to generalise here. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:34:05 -08:00
Gerrit Renker	792b48780e	dccp: Implement both feature-local and feature-remote Sequence Window feature This adds full support for local/remote Sequence Window feature, from which the * sequence-number-validity (W) and * acknowledgment-number-validity (W') windows derive as specified in RFC 4340, 7.5.3. Specifically, the following is contained in this patch: * integrated new socket fields into dccp_sk; * updated the update_gsr/gss routines with regard to these fields; * updated handler code: the Sequence Window feature is located at the TX side, so the local feature is meant if the handler-rx flag is false; * the initialisation of `rcv_wnd' in reqsk is removed, since - rcv_wnd is not used by the code anywhere; - sequence number checks are not done in the LISTEN state (cf. 7.5.3); - dccp_check_req checks the Ack number validity more rigorously; * the `struct dccp_minisock' became empty and is now removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:34:04 -08:00
Gerrit Renker	f90f92eed7	dccp: Initialisation framework for feature negotiation This initialises feature negotiation from two tables, which are in turn are initialised from sysctls. As a novel feature, specifics of the implementation (e.g. that short seqnos and ECN are not yet available) are advertised for robustness. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:34:04 -08:00
Ben Hutchings	288379f050	net: Remove redundant NAPI functions Following the removal of the unused struct net_device * parameter from the NAPI functions named netif_rx_ in commit `908a7a1`, they are exactly equivalent to the corresponding napi_ functions and are therefore redundant. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:33:50 -08:00
Cesar Eduardo Barros	5ec99fdf8e	sc92031: use device id directly instead of made-up name Instead of making up a name for the device ids, put them directly in the device id table. Also move the vendor id to pci_ids.h. Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:03:41 -08:00
Krzysztof Hałasa	991990a12d	WAN: Convert generic HDLC drivers to netdev_ops. Also remove unneeded last_rx update from Synclink drivers. Synclink part mostly by Stephen Hemminger. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:03:37 -08:00
Krzysztof Hałasa	7cdc15f5f9	WAN: Generic HDLC now uses IFF_WAN_HDLC private flag. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:03:35 -08:00
Stephen Hemminger	5a7616af60	hdlcdrv: convert to internal net_device_stats Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:02:36 -08:00
Stephen Hemminger	9fd3238e95	ibmtr: convert to internal network_device_stats Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:02:26 -08:00
Stephen Hemminger	a1799af4d7	com20020: convert to net_devic_ops Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:02:20 -08:00
Stephen Hemminger	bca5b8939f	arcnet: convert to net_device_ops Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:02:19 -08:00
Stephen Hemminger	5803c5122a	arcnet: convert to internal stats Use pre-existing network_device_stats inside network_device rather than own private structure. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 14:02:19 -08:00
Steve Glendinning	0b491eee46	usbnet: allow type check of devdbg arguments in non-debug build Improve usbnet's devdbg to always type-check diagnostic arguments, like dev_dbg (device.h). This makes no change to the resulting size of usbnet modules. This patch also removes an #ifdef DEBUG directive from rndis_wlan so it's devdbg statements are always type-checked at compile time. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 12:35:43 -08:00
Ingo Molnar	198030782c	Merge branch 'x86/mm' into core/percpu Conflicts: arch/x86/mm/fault.c	2009-01-21 10:39:51 +01:00
Jesper Nilsson	c0e69a5bbc	klist.c: bit 0 in pointer can't be used as flag The commit `a1ed5b0cff` (klist: don't iterate over deleted entries) introduces use of the low bit in a pointer to indicate if the knode is dead or not, assuming that this bit is always free. This is not true for all architectures, CRIS for example may align data on byte borders. The result is a bunch of warnings on bootup, devices not being added correctly etc, reported by Hinko Kocevar <hinko.kocevar@cetrtapot.si>: ------------[ cut here ]------------ WARNING: at lib/klist.c:62 () Modules linked in: Stack from c1fe1cf0: c01cc7f4 c1fe1d11 c000eb4e c000e4de 00000000 00000000 c1f4f78f c1f50c2d c01d008c c1fdd1a0 c1fdd1a0 c1fe1d38 c0192954 c1fe0000 00000000 c1fe1dc0 00000002 7fffffff c1fe1da8 c0192d50 c1fe1dc0 00000002 7fffffff c1ff9fcc Call Trace: [<c000eb4e>] [<c000e4de>] [<c0192954>] [<c0192d50>] [<c001d49e>] [<c000b688>] [<c0192a3c>] [<c000b63e>] [<c000b63e>] [<c001a542>] [<c00b55b0>] [<c00411c0>] [<c00b559c>] [<c01918e6>] [<c0191988>] [<c01919d0>] [<c00cd9c8>] [<c00cdd6a>] [<c0034178>] [<c000409a>] [<c0015576>] [<c0029130>] [<c0029078>] [<c0029170>] [<c0012336>] [<c00b4076>] [<c00b4770>] [<c006d6e4>] [<c006d974>] [<c006dca0>] [<c0028d6c>] [<c0028e12>] [<c0006424>] <4>---[ end trace 4eaa2a86a8e2da22 ]--- ------------[ cut here ]------------ Repeat ad nauseam. Wed, Jan 14, 2009 at 12:11:32AM +0100, Bastien ROUCARIES wrote: > Perhaps using a pointerhackalign trick on this structure where > #define pointerhackalign(x) __attribute__ ((aligned (x))) > and declare > struct klist_node { > ... > } pointerhackalign(2); > > Because __attribute__ ((aligned (x))) could only increase alignment > it will safe to do that and serve as documentation purpose :) That works, but we need to do it not for the struct klist_node, but for the struct we insert into the void * in klist_node, which is struct klist. Reported-by: Hinko Kocevar <hinko.kocevar@cetrtapot.si Cc: Bastien ROUCARIES <roucaries.bastien@gmail.com> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-20 20:52:10 -08:00
Inaky Perez-Gonzalez	8adb711f36	debugfs: introduce stub for debugfs_create_size_t() when DEBUG_FS=n Toralf Förster <toralf.foerster@gmx.de> reported a build failure in the WiMAX stack when CONFIG_DEBUG_FS=n http://linuxwimax.org/pipermail/wimax/2009-January/000449.html This is due to debugfs_create_size_t() missing an stub that returns -ENODEV when the DEBUGFS subsystem is not configured in (like the rest of the debugfs API). This patch adds said stub. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-20 20:52:09 -08:00
Markus Metzger	b1818748b0	x86, ftrace, hw-branch-tracer: dump trace on oops Dump the branch trace on an oops (based on ftrace_dump_on_oops). Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-20 13:03:48 +01:00
Brian Gerst	0bd74fa8e2	percpu: refactor percpu.h Impact: cleanup Refactor the DEFINE_PER_CPU_* macros and add .data.percpu.first section. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-20 12:29:19 +09:00
Guennadi Liakhovetski	ef560682a9	dmaengine: add async_tx_clear_ack() macro To complete the DMA_CTRL_ACK handling API add a async_tx_clear_ack() macro. Signed-off-by: Guennadi Liakhovetski <lg@denx.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-19 15:36:21 -07:00
Dan Williams	c50331e8be	dmaengine: dma_issue_pending_all == nop when CONFIG_DMA_ENGINE=n The device list will always be empty in this configuration, so no need to walk the list. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-19 15:35:54 -07:00
Ingo Molnar	4092762aeb	Merge branch 'tracing/ftrace'; commit 'v2.6.29-rc2' into tracing/core	2009-01-18 20:15:05 +01:00
Ingo Molnar	b2b062b816	Merge branch 'core/percpu' into stackprotector Conflicts: arch/x86/include/asm/pda.h arch/x86/include/asm/system.h Also, moved include/asm-x86/stackprotector.h to arch/x86/include/asm. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-18 18:37:14 +01:00
Rafael J. Wysocki	aa8c6c9374	PCI PM: Restore standard config registers of all devices early There is a problem in our handling of suspend-resume of PCI devices that many of them have their standard config registers restored with interrupts enabled and they are put into the full power state with interrupts enabled as well. This may lead to the following scenario: * an interrupt vector is shared between two or more devices * one device is resumed earlier and generates an interrupt * the interrupt handler of another device tries to handle it and attempts to access the device the config space of which hasn't been restored yet and/or which still is in a low power state * the system crashes as a result To prevent this from happening we should restore the standard configuration registers of all devices with interrupts disabled and we should put them into the D0 power state right after that. Unfortunately, this cannot be done using the existing pci_set_power_state(), because it can sleep. Also, to do it we have to make sure that the config spaces of all devices were actually saved during suspend. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-16 12:57:58 -08:00
Jan Kara	cc33412fb1	quota: Improve locking We implement dqget() and dqput() that need neither dqonoff_mutex nor dqptr_sem. Then move dqget() and dqput() calls so that they are not called from under dqptr_sem. This is important because filesystem callbacks aren't called from under dqptr_sem which used to cause lots of problems with lock ranking (and with OCFS2 they became close to unsolvable). The patch also removes two functions which were introduced solely because OCFS2 needed them to cope with the old locking scheme. As time showed, they were not enough for OCFS2 anyway and it would be unnecessary work to adapt them to the new locking scheme in which they aren't needed. As a result OCFS2 needs the following patch to compile properly with quotas. Sorry to any bisecters which hit this in advance. Signed-off-by: Jan Kara <jack@suse.cz>	2009-01-16 18:02:10 +01:00
Theodore Ts'o	08ec8c3878	jbd2: On a __journal_expect() assertion failure printk "JBD2", not "EXT3-fs" Otherwise it can be very confusing to find a "EXT3-fs: " failure in the middle of EXT4-fs failures, and it makes it harder to track the source of the failure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-01-16 11:57:00 -05:00
Ingo Molnar	74296a8ed6	irq: provide debug_poll_all_shared_irqs() method under CONFIG_DEBUG_SHIRQ Provide a shared interrupt debug facility under CONFIG_DEBUG_SHIRQ: it uses the existing irqpoll facilities to iterate through all registered interrupt handlers and call those which can handle shared IRQ lines. This can be handy for suspend/resume debugging: if we call this function early during resume we can trigger crashes in those drivers which have incorrect assumptions about when exactly their ISRs will be called during suspend/resume. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-16 17:46:49 +01:00
Linus Torvalds	e58d4fd89a	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: sata_fsl: Return non-zero on error in probe() drivers/ata/pata_ali.c: s/isa_bridge/ali_isa_bridge/ to fix alpha build libata: New driver for OCTEON SOC Compact Flash interface (v7). libata: Add another column to the ata_timing table. sata_via: Add VT8261 support pata_atiixp: update port enabledness test handling [libata] get-identity ioctl: Fix use of invalid memory pointer	2009-01-16 08:40:57 -08:00
David Daney	3ada9c1264	libata: Add another column to the ata_timing table. The forthcoming OCTEON SOC Compact Flash driver needs an additional timing value that was not available in the ata_timing table. I add a new column for dmack_hold time. The values were obtained from the Compact Flash specification Rev 4.1. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-01-16 10:23:37 -05:00
Jeff Garzik	94be9a58d7	[libata] get-identity ioctl: Fix use of invalid memory pointer for SAS drivers. Caught by Ke Wei (and team?) at Marvell. Also, move the ata_scsi_ioctl export to libata-scsi.c, as that seems to be the general trend. Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-01-16 10:17:09 -05:00
Mandeep Singh Baines	e162b39a36	softlockup: decouple hung tasks check from softlockup detection Decoupling allows: * hung tasks check to happen at very low priority * hung tasks check and softlockup to be enabled/disabled independently at compile and/or run-time * individual panic settings to be enabled disabled independently at compile and/or run-time * softlockup threshold to be reduced without increasing hung tasks poll frequency (hung task check is expensive relative to softlock watchdog) * hung task check to be zero over-head when disabled at run-time Signed-off-by: Mandeep Singh Baines <msb@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-16 14:06:04 +01:00
Ingo Molnar	34cb61359b	sched: fix !CONFIG_SCHEDSTATS build failure Stephen Rothwell reported this linux-next build failure with !CONFIG_SCHEDSTATS: \| In file included from kernel/sched.c:1703: \| kernel/sched_fair.c: In function 'adaptive_gran': \| kernel/sched_fair.c:1324: error: 'struct sched_entity' has no member named 'avg_wakeup' The start_runtime and avg_wakeup metrics are now not just for statistics, but also for scheduling - so they always need to be available. (Also move out the nr_migrations fields - for future perfcounters usage.) Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-16 13:37:25 +01:00
Ingo Molnar	af2519fb22	Merge branch 'linus' into core/iommu Conflicts: arch/ia64/include/asm/dma-mapping.h arch/ia64/include/asm/machvec.h arch/ia64/include/asm/machvec_sn2.h	2009-01-16 10:09:10 +01:00
Linus Torvalds	3feeba1e53	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (95 commits) b44: GFP_DMA skb should not escape from driver korina: do not use IRQF_SHARED with IRQF_DISABLED korina: do not stop queue here korina: fix handling tx_chain_tail korina: do tx at the right position korina: do schedule napi after testing for it korina: rework korina_rx() for use with napi korina: disable napi on close and restart korina: reset resource buffer size to 1536 korina: fix usage of driver_data bnx2x: First slow path interrupt race bnx2x: MTU Filter bnx2x: Indirection table initialization index bnx2x: Missing brackets bnx2x: Fixing the doorbell size bnx2x: Endianness issues bnx2x: VLAN tagged packets without VLAN offload bnx2x: Protecting the link change indication bnx2x: Flow control updated before reporting the link bnx2x: Missing mask when calculating flow control ...	2009-01-15 16:53:15 -08:00
Jaswinder Singh Rajput	00bfddaf7f	include of <linux/types.h> is preferred over <asm/types.h> Impact: fix 15 make headers_check warnings: include of <linux/types.h> is preferred over <asm/types.h> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-15 16:39:41 -08:00
Qinghuang Feng	1bcbf31337	btrfs & squashfs: Move btrfs and squashfsto's magic number to <linux/magic.h> Use the standard magic.h for btrfs and squashfs. Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-15 16:39:38 -08:00
Randy Dunlap	6ae301e85c	resources: fix parameter name and kernel-doc Fix __request_region() parameter kernel-doc notation and parameter name: Warning(linux-2.6.28-git10//kernel/resource.c:627): No description found for parameter 'flags' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-15 16:39:38 -08:00
Randy Dunlap	3eabdb76a0	jbd: fix missing kernel-doc Fix jbd header file kernel-doc notation: Warning(linux-2.6.28-git13//include/linux/jbd.h:823): No description found for parameter 'j_average_commit_time' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-15 16:39:37 -08:00
Li Zefan	45ce80fb6b	cgroups: consolidate cgroup documents Move Documentation/cpusets.txt and Documentation/controllers/* to Documentation/cgroups/ Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-15 16:39:37 -08:00
Takashi Iwai	c0106d72b8	Merge branch 'topic/asoc' into next/asoc	2009-01-15 18:27:20 +01:00
Ingo Molnar	7f268f4352	Merge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu	2009-01-15 13:18:57 +01:00
Peter Zijlstra	831451ac4e	sched: introduce avg_wakeup Introduce a new avg_wakeup statistic. avg_wakeup is a measure of how frequently a task wakes up other tasks, it represents the average time between wakeups, with a limit of avg_runtime for when it doesn't wake up anybody. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-15 12:00:08 +01:00
Benjamin Herrenschmidt	937f1ba56b	net: Add init_dummy_netdev() and fix EMAC driver using it This adds an init_dummy_netdev() function that gets a network device structure (allocation and lifetime entirely under caller's control) and initialize the minimum amount of fields so it can be used to schedule NAPI polls without registering a full blown interface. This is to be used by drivers that need to tie several hardware interfaces to a single NAPI poll scheduler due to HW limitations. It also updates the ibm_newemac driver to use that, this fixing the oops on 2.6.29 due to passing NULL as "dev" to netif_napi_add() Symbol is exported GPL only a I don't think we want binary drivers doing that sort of acrobatics (if we want them at all). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-14 21:05:05 -08:00
Linus Torvalds	5393f78027	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (29 commits) powerpc/83xx: Move mcu_mpc8349emitx driver out of drivers/i2c/chips/ powerpc/83xx: Make serial ports work on MPC8315E-RDB w/ FSL U-Boots powerpc/e500mc: Doorbells need to be taken w/exceptions disabled powerpc: Enable PS3 options and QPACE in ppc64_defconfig powerpc/powermac: Fix occasional SMP boot failure powerpc/cacheinfo: Rename cache_dir per-cpu variable hvc_console: Use kzalloc() instead of kmalloc() + memset() hvc_console: Do not set low_latency when using interrupts hvc_console: Call free_irq() only if request_irq() was successful hvc_console: Change an mb() to smp_mb() and add some comments powerpc: Cleanup from l64 to ll64 change: drivers/net powerpc: Cleanup from l64 to ll64 change: drivers/char powerpc: Cleanup from l64 to ll64 change: arch code powerpc: Change u64/s64 to a long long integer type powerpc/kexec: Check crash_base for relocatable kernel powerpc: Make dummy section a valid note header Xilinx: SPI: updated driver for device tree drivers/of: Add the of_find_i2c_device_by_node function. powerpc/xsysace: add compatible string for non-ipcore instance powerpc/mpc52xx: remove dead code from GPIO driver ...	2009-01-14 20:00:28 -08:00
Linus Torvalds	bca268565f	Merge branch 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits) [CVE-2009-0029] s390 specific system call wrappers [CVE-2009-0029] System call wrappers part 33 [CVE-2009-0029] System call wrappers part 32 [CVE-2009-0029] System call wrappers part 31 [CVE-2009-0029] System call wrappers part 30 [CVE-2009-0029] System call wrappers part 29 [CVE-2009-0029] System call wrappers part 28 [CVE-2009-0029] System call wrappers part 27 [CVE-2009-0029] System call wrappers part 26 [CVE-2009-0029] System call wrappers part 25 [CVE-2009-0029] System call wrappers part 24 [CVE-2009-0029] System call wrappers part 23 [CVE-2009-0029] System call wrappers part 22 [CVE-2009-0029] System call wrappers part 21 [CVE-2009-0029] System call wrappers part 20 [CVE-2009-0029] System call wrappers part 19 [CVE-2009-0029] System call wrappers part 18 [CVE-2009-0029] System call wrappers part 17 [CVE-2009-0029] System call wrappers part 16 [CVE-2009-0029] System call wrappers part 15 ...	2009-01-14 19:58:40 -08:00
Harvey Harrison	74d96f0186	byteorder: make swab.h include asm/swab.h like a regular header Add swab.h to kbuild.asm and remove the individual entries from each arch, mark as unifdef as some arches have some kernel-only bits inside. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-14 19:56:50 -08:00
Linus Torvalds	c2919f2ab9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: IDE: fix sparse signed-ness errors with host->host_busy ide: fix suspend regression tx4938ide: Fix build error due to read_sff_dma_status moving ide: remove unused CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ sl82c105: remove dead code via82cxxx: fix cable warning message ide: can't use SSD/non-rotational queue flag for all CFA devices it821x.c: use dev->revision instead of pci_read_config_byte it821x: Add ultra_mask quirk for Vortex86SX ide: fix accidental LOCKDEP breakage caused by local_irq_set() removal	2009-01-14 19:51:26 -08:00
Ben Dooks	e720b9e498	IDE: fix sparse signed-ness errors with host->host_busy The host_busy field in struct ide_host defaults to a signed-long, where most arch's test_and_set_bit_* macros use an unsigned long. Change to using an unsigned long, which on ARM removes the following sparse errors: drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:681:8: expected unsigned long volatile p drivers/ide/ide-io.c:681:8: got long volatile <noident> drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:681:8: expected unsigned long volatile p drivers/ide/ide-io.c:681:8: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness) drivers/ide/ide-io.c:695:3: expected unsigned long volatile p drivers/ide/ide-io.c:695:3: got long volatile <noident> Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-14 19:19:04 +01:00
Brandon Philips	b94b898f31	it821x: Add ultra_mask quirk for Vortex86SX On Vortex86SX with IDE controller revision 0x11 ultra DMA must be disabled. This patch was tested by DMP and seems to work. It is a cleaned up version of their older Kernel patch: http://www.dmp.com.tw/tech/vortex86sx/patch-2.6.24-DMP.gz Tested-by: Shawn Lin <shawn@dmp.com.tw> Signed-off-by: Brandon Philips <bphilips@suse.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-14 19:19:02 +01:00
Peter Zijlstra	0d66bf6d35	mutex: implement adaptive spinning Change mutex contention behaviour such that it will sometimes busy wait on acquisition - moving its behaviour closer to that of spinlocks. This concept got ported to mainline from the -rt tree, where it was originally implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins. Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50) gave a 345% boost for VFS scalability on my testbox: # ./test-mutex-shm V 16 10 \| grep "^avg ops" avg ops/sec: 296604 # ./test-mutex-shm V 16 10 \| grep "^avg ops" avg ops/sec: 85870 The key criteria for the busy wait is that the lock owner has to be running on a (different) cpu. The idea is that as long as the owner is running, there is a fair chance it'll release the lock soon, and thus we'll be better off spinning instead of blocking/scheduling. Since regular mutexes (as opposed to rtmutexes) do not atomically track the owner, we add the owner in a non-atomic fashion and deal with the races in the slowpath. Furthermore, to ease the testing of the performance impact of this new code, there is means to disable this behaviour runtime (without having to reboot the system), when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y), by issuing the following command: # echo NO_OWNER_SPIN > /debug/sched_features This command re-enables spinning again (this is also the default): # echo OWNER_SPIN > /debug/sched_features Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-14 18:09:02 +01:00
Peter Zijlstra	41719b0309	mutex: preemption fixes The problem is that dropping the spinlock right before schedule is a voluntary preemption point and can cause a schedule, right after which we schedule again. Fix this inefficiency by keeping preemption disabled until we schedule, do this by explicity disabling preemption and providing a schedule() variant that assumes preemption is already disabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-14 18:09:00 +01:00
Nick Piggin	18e6959c38	mm: fix assertion This assertion is incorrect for lockless pagecache. By definition if we have an unpinned page that we are trying to take a speculative reference to, it may become the tail of a compound page at any time (if it is freed, then reallocated as a compound page). It was still a valid assertion for the vmscan.c LRU isolation case, but it doesn't seem incredibly helpful... if somebody wants it, they can put it back directly where it applies in the vmscan code. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-14 07:32:44 -08:00
Heiko Carstens	2b66421995	[CVE-2009-0029] System call wrappers part 33 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:32 +01:00
Heiko Carstens	d4e82042c4	[CVE-2009-0029] System call wrappers part 32 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:31 +01:00
Benjamin Herrenschmidt	ee6a093222	[CVE-2009-0029] powerpc: Enable syscall wrappers for 64-bit This enables the use of syscall wrappers to do proper sign extension for 64-bit programs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:17 +01:00
Heiko Carstens	1a94bc3476	[CVE-2009-0029] System call wrapper infrastructure From: Martin Schwidefsky <schwidefsky@de.ibm.com> By selecting HAVE_SYSCALL_WRAPPERS architectures can activate system call wrappers in order to sign extend system call arguments. All architectures where the ABI defines that the caller of a function has to perform sign extension probably need this. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:16 +01:00
Heiko Carstens	e55380edf6	[CVE-2009-0029] Rename old_readdir to sys_old_readdir This way it matches the generic system call name convention. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:15 +01:00
Heiko Carstens	2ed7c03ec1	[CVE-2009-0029] Convert all system calls to return a long Convert all system calls to return a long. This should be a NOP since all converted types should have the same size anyway. With the exception of sys_exit_group which returned void. But that doesn't matter since the system call doesn't return. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:14 +01:00
Heiko Carstens	4c696ba798	[CVE-2009-0029] Move compat system call declarations to compat header file Move declarations to correct header file. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:14 +01:00
Shaohua Li	f00012074b	ftrace, ia64: Add macro for ftrace_caller Define FTRACE_ADDR. In IA64, a function pointer isn't a 'unsigned long' but a 'struct {unsigned long ip, unsigned long gp}'. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-14 12:11:18 +01:00
Mandeep Singh Baines	baf48f6577	softlock: fix false panic which can occur if softlockup_thresh is reduced At run-time, if softlockup_thresh is changed to a much lower value, touch_timestamp is likely to be much older than the new softlock_thresh. This will cause a false softlockup to be detected. If softlockup_panic is enabled, the system will panic. The fix is to touch all watchdogs before changing softlockup_thresh. Signed-off-by: Mandeep Singh Baines <msb@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-14 11:48:07 +01:00
Richard Kennedy	daaf83d2b9	netfilter 09/09: remove padding from struct xt_match on 64bit builds reorder struct xt_match to remove 8 bytes of padding and make its size 128 bytes. This saves a small amount of data space in each of the xt netfilter modules and fits xt_match in one 128 byte cache line. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-12 21:18:37 -08:00
Krzysztof Hałasa	985ebdb5ed	net: Fix a comment in include/linux/netdevice.h. Fix a comment in include/linux/netdevice.h. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-12 21:18:32 -08:00
Benjamin Herrenschmidt	bd1f7936ab	Merge commit 'gcl/gcl-next' into next	2009-01-13 13:59:11 +11:00
Yinghai Lu	4a046d1754	x86: arch_probe_nr_irqs Impact: save RAM with large NR_CPUS, get smaller nr_irqs Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Mike Travis <travis@sgi.com>	2009-01-12 17:39:24 -08:00
Linus Torvalds	1181a24499	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sparc64: Fix cpumask related build failure smp_call_function_single(): be slightly less stupid, fix smp_call_function_single(): be slightly less stupid rcu: fix bug in rcutorture system-shutdown code	2009-01-12 16:28:26 -08:00
Linus Torvalds	b743791639	Merge branch 'for-next' of git://git.o-hand.com/linux-mfd * 'for-next' of git://git.o-hand.com/linux-mfd: mfd: Fix twl4030-core build mfd: Ensure sm501 GPIO pin mode is GPIO when configured mfd: dm355 evm MMC/SD card detection regulator: PCF50633 pmic driver input: PCF50633 input driver power_supply: PCF50633 battery charger driver rtc: PCF50633 rtc driver mfd: PCF50633 gpio support mfd: PCF50633 adc driver mfd: PCF50633 core driver	2009-01-12 16:27:24 -08:00
Linus Torvalds	23ead72912	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits) ucc_geth: use correct UCCE macros net_dma: acquire/release dma channels on ifup/ifdown cxgb3: Keep LRO off if disabled when interface is down sfc: SFT9001: Fix condition for LNPGA power-off dccp ccid-3: Fix RFC reference smsc911x: register irq with device name, not driver name smsc911x: fix smsc911x_reg_read compiler warning forcedeth: napi schedule lock fix net: fix section mismatch warnings in dccp/ccids/lib/tfrc.c forcedeth: remove mgmt unit for mcp79 chipset qlge: Remove dynamic alloc of rx ring control blocks. qlge: Fix schedule while atomic issue. qlge: Remove support for device ID 8000. qlge: Get rid of split addresses in hardware control blocks. qlge: Get rid of volatile usage for shadow register. forcedeth: version bump and copyright forcedeth: xmit lock fix netdev: missing validate_address hooks netdev: add missing set_mac_address hook netdev: gianfar: add MII ioctl handler ...	2009-01-12 16:22:31 -08:00
Linus Torvalds	ddb4a9dd6a	Merge branch 'for_2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6 * 'for_2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6: Fix small typo misdn: indentation and braces disagree - add braces misdn: one handmade ARRAY_SIZE converted drivers/isdn/hardware/mISDN: move a dereference below a NULL test indentation & braces disagree - add braces Make parameter debug writable BUGFIX: used NULL pointer at ioctl(sk,IMGETDEVINFO,&devinfo) when devinfo.id not registered	2009-01-12 15:57:34 -08:00
Geert Uytterhoeven	2e4c77bea3	m68k: dio - Kill warn_unused_result warnings warning: ignoring return value of 'device_register', declared with attribute warn_unused_result warning: ignoring return value of 'device_create_file', declared with attribute warn_unused_result Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>	2009-01-12 20:56:41 +01:00
Peter Zijlstra	6d612b0f94	locking, hpet: annotate false positive warning Alexander Beregalov reported that this warning is caused by the HPET code: > hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0 > hpet0: 3 comparators, 64-bit 14.318180 MHz counter > ODEBUG: object is on stack, but not annotated > ------------[ cut here ]------------ > WARNING: at lib/debugobjects.c:251 __debug_object_init+0x2a4/0x352() > Bisected down to `26afe5f2fb` > (x86: HPET_MSI Initialise per-cpu HPET timers) The commit is fine - but the on-stack workqueue entry needs annotation. Reported-and-bisected-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-12 13:33:20 +01:00
Mike Travis	92296c6d6e	cpumask, irq: non-x86 build failures Ingo Molnar wrote: > All non-x86 architectures fail to build: > > In file included from /home/mingo/tip/include/linux/random.h:11, > from /home/mingo/tip/include/linux/stackprotector.h:6, > from /home/mingo/tip/init/main.c:17: > /home/mingo/tip/include/linux/irqnr.h:26:63: error: asm/irq_vectors.h: No such file or directory Do not include asm/irq_vectors.h in generic code - it's not available on all architectures. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 19:13:45 +01:00
Mike Travis	9332fccded	irq: initialize nr_irqs based on nr_cpu_ids Impact: Reduce memory usage. This is the second half of the changes to make the irq_desc_ptrs be variable sized based on nr_cpu_ids. This is done by adding a new "max_nr_irqs" macro to irq_vectors.h (and a dummy in irqnr.h) to return a max NR_IRQS value based on NR_CPUS or nr_cpu_ids. This necessitated moving the define of MAX_IO_APICS to a separate file (asm/apicnum.h) so it could be included without the baggage of the other asm/apicdef.h declarations. Signed-off-by: Mike Travis <travis@sgi.com>	2009-01-11 19:13:38 +01:00
Mike Travis	802bf931f2	cpumask: fix bug in use cpumask_var_t in irq_desc Impact: fix bug where new irq_desc uses old cpumask pointers which are freed. As Yinghai pointed out, init_copy_one_irq_desc() copies the old desc to the new desc overwriting the cpumask pointers. Since the old_desc and the cpumask pointers are freed, then memory corruption will occur if these old pointers are used. Move the allocation of these pointers to after the copy. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org>	2009-01-11 19:13:02 +01:00
Rusty Russell	fbd59a8d1f	cpumask: Use topology_core_cpumask()/topology_thread_cpumask() Impact: reduce stack usage, use new cpumask API. This actually uses topology_core_cpumask() and topology_thread_cpumask(), removing the only users of topology_core_siblings() and topology_thread_siblings() Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: linux-net-drivers@solarflare.com	2009-01-11 19:12:49 +01:00
Mike Travis	7f7ace0cda	cpumask: update irq_desc to use cpumask_var_t Impact: reduce memory usage, use new cpumask API. Replace the affinity and pending_masks with cpumask_var_t's. This adds to the significant size reduction done with the SPARSE_IRQS changes. The added functions (init_alloc_desc_masks & init_copy_desc_masks) are in the include file so they can be inlined (and optimized out for the !CONFIG_CPUMASKS_OFFSTACK case.) [Naming chosen to be consistent with the other init*irq functions, as well as the backwards arg declaration of "from, to" instead of the more common "to, from" standard.] Includes a slight change to the declaration of struct irq_desc to embed the pending_mask within ifdef(CONFIG_SMP) to be consistent with other references, and some small changes to Xen. Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64 Signed-off-by: Mike Travis <travis@sgi.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: virtualization@lists.osdl.org Cc: xen-devel@lists.xensource.com Cc: Yinghai Lu <yhlu.kernel@gmail.com>	2009-01-11 19:12:46 +01:00
Martin Bachem	57de16e612	BUGFIX: used NULL pointer at ioctl(sk,IMGETDEVINFO,&devinfo) when devinfo.id not registered daxtar example # modprobe hfcsusb daxtar example # modprobe mISDN_l1loop daxtar example # ./misdnportinfo Found 3 devices id: 0 Dprotocols: 00000006 Bprotocols: 0000000e protocol: 0 nrbchan: 2 name: HFC-S_USB.1 id: 1 Dprotocols: 00000006 Bprotocols: 0000000e protocol: 0 nrbchan: 2 name: mISDN_l1loop.1 id: 2 Dprotocols: 00000006 Bprotocols: 0000000e protocol: 0 nrbchan: 2 name: mISDN_l1loop.2 daxtar example # rmmod hfcsusb daxtar example # ./misdnportinfo Found 2 devices Segmentation fault dmesg: [ 9914.939718] BUG: unable to handle kernel NULL pointer dereference at 000000d4 [ 9914.939721] IP: [<f8f9f2dd>] :mISDN_core:get_mdevice+0x19/0x22 [ 9914.939729] *pde = 00000000 [ 9914.939732] Oops: 0000 [#14] PREEMPT SMP [ 9914.939734] Modules linked in: mISDN_l1loop mISDN_core vmnet vmblock vmci vmmon coretemp w83627ehf hwmon_vid rfcomm l2cap blue tooth usbhid snd_usb_audio snd_usb_lib snd_rawmidi snd_hwdep fuse nvidia(P) uhci_hcd i2c_i801 ehci_hcd snd_hda_intel atl1 usbcore i2c_core parport_seria l [last unloaded: hfcsusb] [ 9914.939751] Pid: 29618, comm: misdnportinfo Tainted: P D (2.6.27.3 #5) [ 9914.939753] EIP: 0060:[<f8f9f2dd>] EFLAGS: 00210246 CPU: 0 [ 9914.939758] EIP is at get_mdevice+0x19/0x22 [mISDN_core] [ 9914.939760] EAX: 00000000 EBX: f8fa791c ECX: f6afaa58 EDX: f7960cf4 [ 9914.939762] ESI: 80044944 EDI: bfc2e62c EBP: bfc2e62c ESP: f5adbef4 [ 9914.939763] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 9914.939765] Process misdnportinfo (pid: 29618, ti=f5ada000 task=f6bec430 task.ti=f5ada000) [ 9914.939767] Stack: f8f9f4e0 00000000 f8f9f867 bfc2e62c 0000000a c02461e8 00200246 c042dde8 [ 9914.939771] 00000003 c042dde4 00000000 00000001 00200082 c0114775 00000000 00000000 [ 9914.939775] 00000003 f7088010 00200282 f8fa791c 80044944 bfc2e62c bfc2e62c c02f6615 [ 9914.939780] Call Trace: [ 9914.939782] [<f8f9f4e0>] _get_mdevice+0x0/0x18 [mISDN_core] [ 9914.939789] [<f8f9f867>] base_sock_ioctl+0x7a/0x129 [mISDN_core] [ 9914.939789] [<c02461e8>] opost+0x171/0x182 [ 9914.939789] [<c0114775>] __wake_up+0x29/0x39 [ 9914.939789] [<c02f6615>] sock_ioctl+0x1b5/0x1d9 [ 9914.939789] [<c02f6460>] sock_ioctl+0x0/0x1d9 [ 9914.939789] [<c016794c>] vfs_ioctl+0x1c/0x5d [ 9914.939789] [<c0167bcb>] do_vfs_ioctl+0x23e/0x24e [ 9914.939789] [<c0167c07>] sys_ioctl+0x2c/0x45 [ 9914.939789] [<c0102cbd>] sysenter_do_call+0x12/0x21 [ 9914.939789] [<c0350000>] pci_fixup_i450gx+0x4e/0x56 [ 9914.939789] ======================= [ 9914.939789] Code: 00 68 02 f0 f9 f8 e8 ae b4 2c c7 8b 44 24 04 5a 59 c3 83 ec 04 31 d2 89 04 24 89 e1 b8 ac df fa f8 68 e0 f4 f9 f8 e8 4a b5 2c c7 <8b> 80 d4 00 00 00 5a 59 c3 53 89 cb 8d 90 9c 00 00 00 89 c8 e8 [ 9914.939789] EIP: [<f8f9f2dd>] get_mdevice+0x19/0x22 [mISDN_core] SS:ESP 0068:f5adbef4 [ 9914.939858] ---[ end trace 50e18a715b019424 ]--- Signed-off-by: Martin Bachem <m.bachem@gmx.de> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-11 17:55:16 +01:00
Ingo Molnar	d19b85db9d	Merge commit 'v2.6.29-rc1' into timers/urgent	2009-01-11 15:34:05 +01:00
Dan Williams	649274d993	net_dma: acquire/release dma channels on ifup/ifdown The recent dmaengine rework removed the capability to remove dma device driver modules while net_dma is active. Rather than notify dmaengine-clients that channels are trying to be removed, we now rely on clients to notify dmaengine when they no longer have a need for channels. Teach net_dma to release channels by taking dmaengine references at netdevice open and dropping references at netdevice close. Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-11 00:20:39 -08:00
Ingo Molnar	0a6d4e1dc9	Merge branch 'sched/latest' of git://git.kernel.org/pub/scm/linux/kernel/git/ghaskins/linux-2.6-hacks into sched/rt	2009-01-11 04:58:49 +01:00
Ian Campbell	0b8698ab58	swiotlb: range_needs_mapping should take a physical address. The swiotlb_arch_range_needs_mapping() hook should take a physical address rather than a virtual address in order to support highmem pages. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 04:54:34 +01:00
Yinghai Lu	d7e51e6689	sparseirq: make some func to be used with genirq Impact: clean up sparseirq fallout on random.c Ingo suggested to change some ifdef from SPARSE_IRQ to GENERIC_HARDIRQS so we could some #ifdef later if all arch support genirq Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 04:46:26 +01:00
Ingo Molnar	e8b722f487	Merge commit 'v2.6.29-rc1' into irq/urgent	2009-01-11 04:45:50 +01:00
Ingo Molnar	99cd707489	Merge commit 'v2.6.29-rc1' into tracing/urgent	2009-01-11 03:43:52 +01:00
Andrew Morton	53ce3d9564	smp_call_function_single(): be slightly less stupid If you do smp_call_function_single(expression-with-side-effects, ...) then expression-with-side-effects never gets evaluated on UP builds. As always, implementing it in C is the correct thing to do. While we're there, uninline it for size and possible header dependency reasons. And create a new kernel/up.c, as a place in which to put uniprocessor-specific code and storage. It should mirror kernel/smp.c. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 03:41:58 +01:00
Balaji Rao	5ec271e745	regulator: PCF50633 pmic driver Changes from V1: - Removed support for suspend_enable & suspend_disable functions. Signed-off-by: Balaji Rao <balajirrao@openmoko.org> Cc: Andy Green <andy@openmoko.com> Cc: Liam Girdwood <lrg@slimlogic.co.uk> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-11 01:34:25 +01:00
Balaji Rao	f5714dc97d	power_supply: PCF50633 battery charger driver Signed-off-by: Balaji Rao <balajirrao@openmoko.org> Cc: Andy Green <andy@openmoko.com> Cc: David Woodhouse <dwmw2@infradead.org> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-11 01:34:24 +01:00
Balaji Rao	6a3d119b4c	mfd: PCF50633 gpio support What the PCF05633 calls as a 'GPIO' is much more than the GPIO in the linux sense and there are only 4 of them - which means, the gpiolib is not used here. Signed-off-by: Balaji Rao <balajirrao@openmoko.org> Cc: Andy Green <andy@openmoko.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-11 01:34:24 +01:00
Balaji Rao	08c3e06a5e	mfd: PCF50633 adc driver This patch adds basic support for the PCF50633 ADC. The subtractive mode is not supported yet. Since we don't have adc subsystem, it currently lives in drivers/mfd. Signed-off-by: Balaji Rao <balajirrao@openmoko.org> Cc: Andy Green <andy@openmoko.com> Acked-by: Jonathan Cameron <jonathan.cameron@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-11 01:34:23 +01:00
Balaji Rao	f52046b14b	mfd: PCF50633 core driver This patch implements the core of the PCF50633 driver. This core driver has generic register read/write functions and does interrupt management for its sub devices. Signed-off-by: Balaji Rao <balajirrao@openmoko.org> Cc: Andy Green <andy@openmoko.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-11 01:34:23 +01:00
Ingo Molnar	0811a433c6	Merge branch 'linus' into core/iommu	2009-01-11 00:51:06 +01:00
Arjan van de Ven	886ad09fc8	libata: Add a per-host flag to opt-in into parallel port probes This patch adds a per host flag that allows drivers to opt in into having its busses scanned in parallel. Drivers that do not set this flag get their ports scanned in the "original" sequence. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-10 15:06:52 -08:00
Linus Torvalds	4e9b1c184c	Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: [IA64] fix typo in cpumask_of_pcibus() x86: fix x86_32 builds for summit and es7000 arch's cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write cpumask: use cpumask_var_t in acpi-cpufreq.c cpumask: use work_on_cpu in acpi/cstate.c cpumask: convert struct cpufreq_policy to cpumask_var_t cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t x86: cleanup remaining cpumask_t ops in smpboot code cpumask: update pci_bus_show_cpuaffinity to use new cpumask API cpumask: update local_cpus_show to use new cpumask API ia64: cpumask fix for is_affinity_mask_valid()	2009-01-10 06:12:18 -08:00
Artem Bityutskiy	f4b477c473	rbtree: add const qualifier to some functions The 'rb_first()', 'rb_last()', 'rb_next()' and 'rb_prev()' calls take a pointer to an RB node or RB root. They do not change the pointed objects, so add a 'const' qualifier in order to make life of the users of these functions easier. Indeed, if I have my own constant pointer &const struct my_type *p, and I call 'rb_next(&p->rb)', I get a GCC warning: warning: passing argument 1 of ‘rb_next’ discards qualifiers from pointer target type Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-10 06:04:33 -08:00
Ingo Molnar	b17304245f	Merge branch 'linus' into x86/setup-lzma Conflicts: init/do_mounts_rd.c	2009-01-10 12:04:41 +01:00
Takashi Sato	fcccf50254	filesystem freeze: implement generic freeze feature The ioctls for the generic freeze feature are below. o Freeze the filesystem int ioctl(int fd, int FIFREEZE, arg) fd: The file descriptor of the mountpoint FIFREEZE: request code for the freeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 o Unfreeze the filesystem int ioctl(int fd, int FITHAW, arg) fd: The file descriptor of the mountpoint FITHAW: request code for unfreeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 Error number: If the filesystem has already been unfrozen, errno is set to EINVAL. [akpm@linux-foundation.org: fix CONFIG_BLOCK=n] Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:42 -08:00
Takashi Sato	c4be0c1dc4	filesystem freeze: add error handling of write_super_lockfs/unlockfs Currently, ext3 in mainline Linux doesn't have the freeze feature which suspends write requests. So, we cannot take a backup which keeps the filesystem's consistency with the storage device's features (snapshot and replication) while it is mounted. In many case, a commercial filesystem (e.g. VxFS) has the freeze feature and it would be used to get the consistent backup. If Linux's standard filesystem ext3 has the freeze feature, we can do it without a commercial filesystem. So I have implemented the ioctls of the freeze feature. I think we can take the consistent backup with the following steps. 1. Freeze the filesystem with the freeze ioctl. 2. Separate the replication volume or create the snapshot with the storage device's feature. 3. Unfreeze the filesystem with the unfreeze ioctl. 4. Take the backup from the separated replication volume or the snapshot. This patch: VFS: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they can return an error. Rename write_super_lockfs and unlockfs of the super block operation freeze_fs and unfreeze_fs to avoid a confusion. ext3, ext4, xfs, gfs2, jfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that write_super_lockfs returns an error if needed, and unlockfs always returns 0. reiserfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they always return 0 (success) to keep a current behavior. Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:42 -08:00
Harvey Harrison	69347a236b	memstick: annotate endianness of attribute structs The code was shifting the endianness appropriately everywhere, annotate the structs to avoid the sparse warnings when assigning the endian types to the struct members, or passing them to be[16\|32]_to_cpu: drivers/memstick/core/mspro_block.c:331:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:333:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:335:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:337:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:341:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:347:4: warning: cast to restricted __be32 drivers/memstick/core/mspro_block.c:356:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:358:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:364:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:367:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:369:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:371:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:377:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:478:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:480:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:482:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:484:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:486:4: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:689:22: expected unsigned int [unsigned] [assigned] data_address drivers/memstick/core/mspro_block.c:689:22: got restricted __be32 [usertype] <noident> drivers/memstick/core/mspro_block.c:697:3: warning: cast to restricted __be32 drivers/memstick/core/mspro_block.c:960:17: warning: incorrect type in initializer (different base types) drivers/memstick/core/mspro_block.c:960:17: expected unsigned short [unsigned] data_count drivers/memstick/core/mspro_block.c:960:17: got restricted __be16 [usertype] <noident> drivers/memstick/core/mspro_block.c:993:6: warning: cast to restricted __be16 drivers/memstick/core/mspro_block.c:995:28: warning: cast to restricted __be16 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:41 -08:00
Andi Kleen	85c210edc4	compiler-gcc.h: add more comments to RELOC_HIDE Requested by C. Lameter Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mike Travis <travis@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:41 -08:00
Linus Torvalds	0d34052dfe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: Revert "driver core: create a private portion of struct device" Revert "driver core: move klist_children into private structure" Revert "driver core: move knode_driver into private structure" Revert "driver core: move knode_bus into private structure"	2009-01-09 15:31:07 -08:00
Linus Torvalds	2fb585a10e	Merge branch 'for_2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6 * 'for_2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6: (28 commits) mISDN: Add HFC USB driver mISDN: Add layer1 prim MPH_INFORMATION_REQ mISDN: Fix kernel crash when doing hardware conference with more than two members mISDN: Added missing create_l1() call mISDN: Add MODULE_DEVICE_TABLE() to hfcpci mISDN: Minor cleanups mISDN: Create /sys/class/mISDN mISDN: Add missing release functions mISDN: Add different different timer settings for hfc-pci mISDN: Minor fixes mISDN: Correct busy device detection mISDN: Fix deactivation, if peer IP is removed from l1oip instance. mISDN: Add ISDN_P_TE_UP0 / ISDN_P_NT_UP0 mISDN: Fix irq detection mISDN: Add ISDN sample clock API to mISDN core mISDN: Return error on E-channel access mISDN: Add E-Channel logging features mISDN: Use protocol to detect D-channel mISDN: Fixed more indexing bugs mISDN: Make debug output a little bit more verbose ...	2009-01-09 15:27:39 -08:00
Greg Kroah-Hartman	926beadb3d	Revert "driver core: create a private portion of struct device" This reverts commit `2831fe6f9c`. Turns out that device_initialize shouldn't fail silently. This series needs to be reworked in order to get into proper shape. Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-09 15:06:12 -08:00
Greg Kroah-Hartman	e2d4077678	Revert "driver core: move klist_children into private structure" This reverts commit `11c3b5c3e0`. Turns out that device_initialize shouldn't fail silently. This series needs to be reworked in order to get into proper shape. Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-09 14:55:37 -08:00
Jon Smirl	2526c151c3	drivers/of: Add the of_find_i2c_device_by_node function. The of_find_i2c_device_by_node function allows you to follow a reference in the device tree to an i2c device node and then locate the linux device instantiated by the device tree. Example use: an I2S bus driver finding the i2c_device instance for a codec described by a device tree node. This was waiting for Anton's i2c patches that were just added. Signed-off-by: Jon Smirl <jonsmirl@gmail.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2009-01-09 15:49:06 -07:00
Greg Kroah-Hartman	cda5e83fde	Revert "driver core: move knode_driver into private structure" This reverts commit `93e746db18`. Turns out that device_initialize shouldn't fail silently. This series needs to be reworked in order to get into proper shape. Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-09 14:44:18 -08:00
Greg Kroah-Hartman	4db8e282f2	Revert "driver core: move knode_bus into private structure" This reverts commit `b9daa99ee5`. Turns out that device_initialize shouldn't fail silently. This series needs to be reworked in order to get into proper shape. Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-09 14:32:46 -08:00
Linus Torvalds	c40f6f8bbc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu: NOMMU: Support XIP on initramfs NOMMU: Teach kobjsize() about VMA regions. FLAT: Don't attempt to expand the userspace stack to fill the space allocated FDPIC: Don't attempt to expand the userspace stack to fill the space allocated NOMMU: Improve procfs output using per-MM VMAs NOMMU: Make mmap allocation page trimming behaviour configurable. NOMMU: Make VMAs per MM as for MMU-mode linux NOMMU: Delete askedalloc and realalloc variables NOMMU: Rename ARM's struct vm_region NOMMU: Fix cleanup handling in ramfs_nommu_get_umapped_area()	2009-01-09 14:00:58 -08:00
Linus Torvalds	d7d717fa88	Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds * 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds: ledtrig-timer - on deactivation hardware blinking should be disabled leds: Add suspend/resume to the core class leds: Add WM8350 LED driver leds: leds-pcs9532 - Move i2c work to a workqueque leds: leds-pca9532 - fix memory leak and properly handle errors leds: Fix wrong loop direction on removal in leds-ams-delta leds: fix Cobalt Raq LED dependency leds: Fix sparse warning in leds-ams-delta leds: Fixup kdoc comment to match parameter names leds: Make header variable naming consistent leds: eds-pca9532: mark pca9532_event() static leds: ALIX.2 LEDs driver	2009-01-09 13:55:37 -08:00
Linus Torvalds	b64dc5a484	Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight * 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight: backlight: Rename the corgi backlight driver to generic backlight: add support for Toppoly TDO35S series to tdo24m lcd driver backlight: Add suspend/resume support to the backlight core bd->props.brightness doesn't reflect the actual backlight level. backlight: Support VGA/QVGA mode switching in tosa_lcd backlight: Catch invalid input in sysfs attributes backlight: Value of ILI9320_RGB_IF2 register should not be hardcoded backlight: crbllcd_bl - Use platform_device_register_simple() backlight: progear_bl - Use platform_device_register_simple() backlight: hp680_bl - Use platform_device_register_simple()	2009-01-09 13:55:13 -08:00
Martin Bachem	3f75e84a6a	mISDN: Add layer1 prim MPH_INFORMATION_REQ MPH_INFORMATION provides full D- and B-Channel status overview - new layer1 primitive: MPF_INFORMATON_REQ - layer1 replies with MPH_INFORMATION_IND containing - dch->[state,Flags,nrbchan] - bch[]->[protocol,Flags] - hardware driver should send MPH_INFORMATION_IND on all ph state changes and BChannel state changes to MISDN_ID_ANY Signed-off-by: Martin Bachem <m.bachem@gmx.de> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:29 +01:00
Matthias Urlichs	b36b654a7e	mISDN: Create /sys/class/mISDN Create /sys/class/mISDN and implement functions to handle device renames. Signed-Off-By: Matthias Urlichs <matthias@urlichs.de> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:28 +01:00
Andreas Eversberg	1b4d33121f	mISDN: Fix deactivation, if peer IP is removed from l1oip instance. Added GETPEER operation. Socket now checks if device is already busy at a differen mode. Signed-off-by: Andreas Eversberg <andreas@eversberg.eu> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:27 +01:00
Martin Bachem	02282eee56	mISDN: Add ISDN_P_TE_UP0 / ISDN_P_NT_UP0 - new layer1 protocols for UP0 bus - helper #defines to test for TE/NT/S0/E1/UP0 Signed-off-by: Martin Bachem <m.bachem@gmx.de> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:27 +01:00
Andreas Eversberg	3bd69ad197	mISDN: Add ISDN sample clock API to mISDN core Add ISDN sample clock API to mISDN core (new file clock.c) hfcmulti and mISDNdsp use clock API. Signed-off-by: Andreas Eversberg <andreas@eversberg.eu> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:27 +01:00
Martin Bachem	1f28fa19d3	mISDN: Add E-Channel logging features New prim PH_DATA_E_IND. - all E-ch frames are indicated by recv_Echannel(), which pushes E-Channel frames into dch's rqueue - if dchannel is opened with channel nr 0, no E-Channel logging is requested - if dchannel is opened with channel nr 1, E-Channel logging is requested. if layer1 does not support that, -EINVAL in return is appropriate Signed-off-by: Martin Bachem <m.bachem@gmx.de> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:25 +01:00
Matthias Urlichs	837468d135	mISDN: Use struct device name field struct device already has a 'name' member, use it. Signed-off-by: Matthias Urlichs <matthias@urlichs.de> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:24 +01:00
Matthias Urlichs	8b6015f736	mISDN: Added an ioctl to change the device name To get persistent device names with hotplug we need to rename devices sometime. Signed-off-by: Matthias Urlichs <matthias@urlichs.de> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:23 +01:00
Andreas Eversberg	8dd2f36f31	mISDN: Add feature via MISDN_CTRL_FILL_EMPTY to fill fifo if empty This prevents underrun of fifo when filled and in case of an underrun it prevents subsequent underruns due to jitter. Improve dsp, so buffers are kept filled with a certain delay, so moderate jitter will not cause underrun all the time -> the audio quality is highly improved. tones are not interrupted by gaps anymore, except when CPU is stalling or in high load. Signed-off-by: Andreas Eversberg <andreas@eversberg.eu> Signed-off-by: Karsten Keil <kkeil@suse.de>	2009-01-09 22:44:22 +01:00
Linus Torvalds	4ce5f24193	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile: (31 commits) powerpc/oprofile: fix whitespaces in op_model_cell.c powerpc/oprofile: IBM CELL: add SPU event profiling support powerpc/oprofile: fix cell/pr_util.h powerpc/oprofile: IBM CELL: cleanup and restructuring oprofile: make new cpu buffer functions part of the api oprofile: remove #ifdef CONFIG_OPROFILE_IBS in non-ibs code ring_buffer: fix ring_buffer_event_length() oprofile: use new data sample format for ibs oprofile: add op_cpu_buffer_get_data() oprofile: add op_cpu_buffer_add_data() oprofile: rework implementation of cpu buffer events oprofile: modify op_cpu_buffer_read_entry() oprofile: add op_cpu_buffer_write_reserve() oprofile: rename variables in add_ibs_begin() oprofile: rename add_sample() in cpu_buffer.c oprofile: rename variable ibs_allowed to has_ibs in op_model_amd.c oprofile: making add_sample_entry() inline oprofile: remove backtrace code for ibs oprofile: remove unused ibs macro oprofile: remove unused components in struct oprofile_cpu_buffer ...	2009-01-09 12:43:06 -08:00
Linus Torvalds	7c51d57e9d	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (67 commits) [MTD] [MAPS] Fix printk format warning in nettel.c [MTD] [NAND] add cmdline parsing (mtdparts=) support to cafe_nand [MTD] CFI: remove major/minor version check for command set 0x0002 [MTD] [NAND] ndfc driver [MTD] [TESTS] Fix some size_t printk format warnings [MTD] LPDDR Makefile and KConfig [MTD] LPDDR extended physmap driver to support LPDDR flash [MTD] LPDDR added new pfow_base parameter [MTD] LPDDR Command set driver [MTD] LPDDR PFOW definition [MTD] LPDDR QINFO records definitions [MTD] LPDDR qinfo probing. [MTD] [NAND] pxa3xx: convert from ns to clock ticks more accurately [MTD] [NAND] pxa3xx: fix non-page-aligned reads [MTD] [NAND] fix nandsim sched.h references [MTD] [NAND] alauda: use USB API functions rather than constants [MTD] struct device - replace bus_id with dev_name(), dev_set_name() [MTD] fix m25p80 64-bit divisions [MTD] fix dataflash 64-bit divisions [MTD] [NAND] Set the fsl elbc ECCM according the settings in bootloader. ... Fixed up trivial debug conflicts in drivers/mtd/devices/{m25p80.c,mtd_dataflash.c}	2009-01-09 12:37:15 -08:00
Linus Torvalds	a3a798c88a	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (94 commits) ACPICA: hide private headers ACPICA: create acpica/ directory ACPI: fix build warning ACPI : Use RSDT instead of XSDT by adding boot option of "acpi=rsdt" ACPI: Avoid array address overflow when _CST MWAIT hint bits are set fujitsu-laptop: Simplify SBLL/SBL2 backlight handling fujitsu-laptop: Add BL power, LED control and radio state information ACPICA: delete utcache.c ACPICA: delete acdisasm.h ACPICA: Update version to 20081204. ACPICA: FADT: Update error msgs for consistency ACPICA: FADT: set acpi_gbl_use_default_register_widths to TRUE by default ACPICA: FADT parsing changes and fixes ACPICA: Add ACPI_MUTEX_TYPE configuration option ACPICA: Fixes for various ACPI data tables ACPICA: Restructure includes into public/private ACPI: remove private acpica headers from driver files ACPI: reboot.c: use new acpi_reset interface ACPICA: New: acpi_reset interface - write to reset register ACPICA: Move all public H/W interfaces to new hwxface ...	2009-01-09 11:55:14 -08:00
Linus Torvalds	d9e8a3a5b8	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (22 commits) ioat: fix self test for multi-channel case dmaengine: bump initcall level to arch_initcall dmaengine: advertise all channels on a device to dma_filter_fn dmaengine: use idr for registering dma device numbers dmaengine: add a release for dma class devices and dependent infrastructure ioat: do not perform removal actions at shutdown iop-adma: enable module removal iop-adma: kill debug BUG_ON iop-adma: let devm do its job, don't duplicate free dmaengine: kill enum dma_state_client dmaengine: remove 'bigref' infrastructure dmaengine: kill struct dma_client and supporting infrastructure dmaengine: replace dma_async_client_register with dmaengine_get atmel-mci: convert to dma_request_channel and down-level dma_slave dmatest: convert to dma_request_channel dmaengine: introduce dma_request_channel and private channels net_dma: convert to dma_find_channel dmaengine: provide a common 'issue_pending_all' implementation dmaengine: centralize channel allocation, introduce dma_find_channel dmaengine: up-level reference counting to the module level ...	2009-01-09 11:52:14 -08:00
Wolfgang Grandegger	fefae48bf8	[MTD] CFI: remove major/minor version check for command set 0x0002 The NOR Flash memory K8P2815UQB from Samsung uses the major version number '0'. Add a quirk to cope with it. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-09 12:16:28 +00:00
Mark Brown	a6ba2b2dab	ASoC: Implement WM8350 headphone jack detection Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2009-01-09 10:31:32 +00:00
Len Brown	ec9f168fcc	Merge branch 'simplify_PRT' into release Conflicts: drivers/acpi/pci_irq.c Note that this merge disables `e1d3a90846` pci, acpi: reroute PCI interrupt to legacy boot interrupt equivalent Signed-off-by: Len Brown <len.brown@intel.com>	2009-01-09 03:41:08 -05:00
Len Brown	b2576e1d44	Merge branch 'linus' into release	2009-01-09 03:39:43 -05:00
Len Brown	3cc8a5f4ba	Merge branch 'suspend' into release	2009-01-09 03:38:15 -05:00
Linus Torvalds	2150edc6c5	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits) jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs ext4: Remove "extents" mount option block: Add Kconfig help which notes that ext4 needs CONFIG_LBD ext4: Make printk's consistently prefixed with "EXT4-fs: " ext4: Add sanity checks for the superblock before mounting the filesystem ext4: Add mount option to set kjournald's I/O priority jbd2: Submit writes to the journal using WRITE_SYNC jbd2: Add pid and journal device name to the "kjournald2 starting" message ext4: Add markers for better debuggability ext4: Remove code to create the journal inode ext4: provide function to release metadata pages under memory pressure ext3: provide function to release metadata pages under memory pressure add releasepage hooks to block devices which can be used by file systems ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc ext4: Init the complete page while building buddy cache ext4: Don't allow new groups to be added during block allocation ext4: mark the blocks/inode bitmap beyond end of group as used ext4: Use new buffer_head flag to check uninit group bitmaps initialization ext4: Fix the race between read_inode_bitmap() and ext4_new_inode() ext4: code cleanup ...	2009-01-08 17:14:59 -08:00
Linus Torvalds	cd764695b6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (45 commits) [SCSI] qla2xxx: Update version number to 8.03.00-k1. [SCSI] qla2xxx: Add ISP81XX support. [SCSI] qla2xxx: Use proper request/response queues with MQ instantiations. [SCSI] qla2xxx: Correct MQ-chain information retrieval during a firmware dump. [SCSI] qla2xxx: Collapse EFT/FCE copy procedures during a firmware dump. [SCSI] qla2xxx: Don't pollute kernel logs with ZIO/RIO status messages. [SCSI] qla2xxx: Don't fallback to interrupt-polling during re-initialization with MSI-X enabled. [SCSI] qla2xxx: Remove support for reading/writing HW-event-log. [SCSI] cxgb3i: add missing include [SCSI] scsi_lib: fix DID_RESET status problems [SCSI] fc transport: restore missing dev_loss_tmo callback to LLDD [SCSI] aha152x_cs: Fix regression that keeps driver from using shared interrupts [SCSI] sd: Correctly handle 6-byte commands with DIX [SCSI] sd: DIF: Fix tagging on platforms with signed char [SCSI] sd: DIF: Show app tag on error [SCSI] Fix error handling for DIF/DIX [SCSI] scsi_lib: don't decrement busy counters when inserting commands [SCSI] libsas: fix test for negative unsigned and typos [SCSI] a2091, gvp11: kill warn_unused_result warnings [SCSI] fusion: Move a dereference below a NULL test ... Fixed up trivial conflict due to moving the async part of sd_probe around in the async probes vs using dev_set_name() in naming.	2009-01-08 16:27:31 -08:00
H. Peter Anvin	889c92d21d	bzip2/lzma: centralize format detection Centralize the compression format detection to a common routine in the lib directory, and use it for both initramfs and initrd. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-01-08 15:14:17 -08:00
Linus Torvalds	022992ee59	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: regulator: fix kernel-doc warnings regulator: catch some registration errors regulator: Add basic DocBook manual regulator: Fix some kerneldoc rendering issues regulator: Add missing kerneldoc regulator: Clean up kerneldoc warnings regulator: Remove extraneous kerneldoc annotations regulator: init/link earlier regulator: move set_machine_constraints after regulator device initialization regulator: da903x: make da903x_is_enabled return 0 or 1 regulator: da903x: add '\n' to error messages regulator: sysfs attribute reduction (v2) regulator: code shrink (v2) regulator: improved mode error checks regulator: enable/disable refcounting regulator: struct device - replace bus_id with dev_name(), dev_set_name()	2009-01-08 14:51:11 -08:00
Linus Torvalds	5fbbf5f648	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (84 commits) wimax: fix kernel-doc for debufs_dentry member of struct wimax_dev net: convert pegasus driver to net_device_ops bnx2x: Prevent eeprom set when driver is down net: switch kaweth driver to netdevops pcnet32: round off carrier watch timer i2400m/usb: wrap USB power saving in #ifdef CONFIG_PM wimax: testing for rfkill support should also test for CONFIG_RFKILL_MODULE wimax: fix kconfig interactions with rfkill and input layers wimax: fix '#ifndef CONFIG_BUG' layout to avoid warning r6040: bump release number to 0.20 r6040: warn about MAC address being unset r6040: check PHY status when bringing interface up r6040: make printks consistent with DRV_NAME gianfar: Fixup use of BUS_ID_SIZE mlx4_en: Returning real Max in get_ringparam mlx4_en: Consider inline packets on completion netdev: bfin_mac: enable bfin_mac net dev driver for BF51x qeth: convert to net_device_ops vlan: add neigh_setup dm9601: warn on invalid mac address ...	2009-01-08 14:25:41 -08:00
Linus Torvalds	894bcdfb1a	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: md: don't retry recovery of raid1 that fails due to error on source drive. md: Allow md devices to be created by name. md: make devices disappear when they are no longer needed. md: centralise all freeing of an 'mddev' in 'md_free' md: move allocation of ->queue from mddev_find to md_probe md: need another print_sb for mdp_superblock_1 md: use list_for_each_entry macro directly md: raid0: make hash_spacing and preshift sector-based. md: raid0: Represent the size of strip zones in sectors. md: raid0 create_strip_zones(): Add KERN_INFO/KERN_ERR to printk's. md: raid0 create_strip_zones(): Make two local variables sector-based. md: raid0: Represent zone->zone_offset in sectors. md: raid0: Represent device offset in sectors. md: raid0_make_request(): Replace local variable block by sector. md: raid0_make_request(): Remove local variable chunk_size. md: raid0_make_request(): Replace chunksize_bits by chunksect_bits. md: use sysfs_notify_dirent to notify changes to md/sync_action. md: fix bitmap-on-external-file bug.	2009-01-08 14:03:34 -08:00
Alan Cox	871af1210f	libata: Add 32bit PIO support This matters for some controllers and in one or two cases almost doubles PIO performance. Add a bmdma32 operations set we can inherit and activate it for some controllers Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-01-08 16:34:27 -05:00
NeilBrown	4044ba58dd	md: don't retry recovery of raid1 that fails due to error on source drive. If a raid1 has only one working drive and it has a sector which gives an error on read, then an attempt to recover onto a spare will fail, but as the single remaining drive is not removed from the array, the recovery will be immediately re-attempted, resulting in an infinite recovery loop. So detect this situation and don't retry recovery once an error on the lone remaining drive is detected. Allow recovery to be retried once every time a spare is added in case the problem wasn't actually a media error. Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:11 +11:00
NeilBrown	efeb53c0e5	md: Allow md devices to be created by name. Using sequential numbers to identify md devices is somewhat artificial. Using names can be a lot more user-friendly. Also, creating md devices by opening the device special file is a bit awkward. So this patch provides a new option for creating and naming devices. Writing a name such as "md_home" to /sys/modules/md_mod/parameters/new_array will cause an array with that name to be created. It will appear in /sys/block/ /proc/partitions and /proc/mdstat as 'md_home'. It will have an arbitrary minor number allocated. md devices that a created by an open are destroyed on the last close when the device is inactive. For named md devices, they will not be destroyed until the array is explicitly stopped, either with the STOP_ARRAY ioctl or by writing 'clear' to /sys/block/md_XXXX/md/array_state. The name of the array must start 'md_' to avoid conflict with other devices. Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:10 +11:00
NeilBrown	d3374825ce	md: make devices disappear when they are no longer needed. Currently md devices, once created, never disappear until the module is unloaded. This is essentially because the gendisk holds a reference to the mddev, and the mddev holds a reference to the gendisk, this a circular reference. If we drop the reference from mddev to gendisk, then we need to ensure that the mddev is destroyed when the gendisk is destroyed. However it is not possible to hook into the gendisk destruction process to enable this. So we drop the reference from the gendisk to the mddev and destroy the gendisk when the mddev gets destroyed. However this has a complication. Between the call __blkdev_get->get_gendisk->kobj_lookup->md_probe and the call __blkdev_get->md_open there is no obvious way to hold a reference on the mddev any more, so unless something is done, it will disappear and gendisk will be destroyed prematurely. Also, once we decide to destroy the mddev, there will be an unlockable moment before the gendisk is unlinked (blk_unregister_region) during which a new reference to the gendisk can be created. We need to ensure that this reference can not be used. i.e. the ->open must fail. So: 1/ in md_probe we set a flag in the mddev (hold_active) which indicates that the array should be treated as active, even though there are no references, and no appearance of activity. This is cleared by md_release when the device is closed if it is no longer needed. This ensures that the gendisk will survive between md_probe and md_open. 2/ In md_open we check if the mddev we expect to open matches the gendisk that we did open. If there is a mismatch we return -ERESTARTSYS and modify __blkdev_get to retry from the top in that case. In the -ERESTARTSYS sys case we make sure to wait until the old gendisk (that we succeeded in opening) is really gone so we loop at most once. Some udev configurations will always open an md device when it first appears. If we allow an md device that was just created by an open to disappear on an immediate close, then this can race with such udev configurations and result in an infinite loop the device being opened and closed, then re-open due to the 'ADD' even from the first open, and then close and so on. So we make sure an md device, once created by an open, remains active at least until some md 'ioctl' has been made on it. This means that all normal usage of md devices will allow them to disappear promptly when not needed, but the worst that an incorrect usage will do it cause an inactive md device to be left in existence (it can easily be removed). As an array can be stopped by writing to a sysfs attribute echo clear > /sys/block/mdXXX/md/array_state we need to use scheduled work for deleting the gendisk and other kobjects. This allows us to wait for any pending gendisk deletion to complete by simply calling flush_scheduled_work(). Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:10 +11:00
Cheng Renquan	cd2ac9321c	md: need another print_sb for mdp_superblock_1 md_print_devices is called in two code path: MD_BUG(...), and md_ioctl with PRINT_RAID_DEBUG. it will dump out all in use md devices information; However, it wrongly processed two types of superblock in one: The header file <linux/raid/md_p.h> has defined two types of superblock, struct mdp_superblock_s (typedefed with mdp_super_t) according to md with metadata 0.90, and struct mdp_superblock_1 according to md with metadata 1.0 and later, These two types of superblock are very different, The md_print_devices code processed them both in mdp_super_t, that would lead to wrong informaton dump like: [ 6742.345877] [ 6742.345887] md: ********************************** [ 6742.345890] md: * <COMPLETE RAID STATE PRINTOUT> * [ 6742.345892] md: ******************************** [ 6742.345896] md1: <ram7><ram6><ram5><ram4> [ 6742.345907] md: rdev ram7, SZ:00065472 F:0 S:1 DN:3 [ 6742.345909] md: rdev superblock: [ 6742.345914] md: SB: (V:0.90.0) ID:<42ef13c7.598c059a.5f9f1645.801e9ee6> CT:4919856d [ 6742.345918] md: L5 S00065472 ND:4 RD:4 md1 LO:2 CS:65536 [ 6742.345922] md: UT:4919856d ST:1 AD:4 WD:4 FD:0 SD:0 CSUM:b7992907 E:00000001 [ 6742.345924] D 0: DISK<N:0,(1,8),R:0,S:6> [ 6742.345930] D 1: DISK<N:1,(1,10),R:1,S:6> [ 6742.345933] D 2: DISK<N:2,(1,12),R:2,S:6> [ 6742.345937] D 3: DISK<N:3,(1,14),R:3,S:6> [ 6742.345942] md: THIS: DISK<N:3,(1,14),R:3,S:6> ... [ 6742.346058] md0: <ram3><ram2><ram1><ram0> [ 6742.346067] md: rdev ram3, SZ:00065472 F:0 S:1 DN:3 [ 6742.346070] md: rdev superblock: [ 6742.346073] md: SB: (V:1.0.0) ID:<369aad81.00000000.00000000.00000000> CT:9a322a9c [ 6742.346077] md: L-1507699579 S976570180 ND:48 RD:0 md0 LO:65536 CS:196610 [ 6742.346081] md: UT:00000018 ST:0 AD:131048 WD:0 FD:8 SD:0 CSUM:00000000 E:00000000 [ 6742.346084] D 0: DISK<N:-1,(-1,-1),R:-1,S:-1> [ 6742.346089] D 1: DISK<N:-1,(-1,-1),R:-1,S:-1> [ 6742.346092] D 2: DISK<N:-1,(-1,-1),R:-1,S:-1> [ 6742.346096] D 3: DISK<N:-1,(-1,-1),R:-1,S:-1> [ 6742.346102] md: THIS: DISK<N:0,(0,0),R:0,S:0> ... [ 6742.346219] md: ****************************** [ 6742.346221] Here md1 is metadata 0.90.0, and md0 is metadata 1.2 After some more code to distinguish these two types of superblock, in this patch, it will generate dump information like: [ 7906.755790] [ 7906.755799] md: ******************************** [ 7906.755802] md: * <COMPLETE RAID STATE PRINTOUT> * [ 7906.755804] md: ******************************** [ 7906.755808] md1: <ram7><ram6><ram5><ram4> [ 7906.755819] md: rdev ram7, SZ:00065472 F:0 S:1 DN:3 [ 7906.755821] md: rdev superblock (MJ:0): [ 7906.755826] md: SB: (V:0.90.0) ID:<3fca7a0d.a612bfed.5f9f1645.801e9ee6> CT:491989f3 [ 7906.755830] md: L5 S00065472 ND:4 RD:4 md1 LO:2 CS:65536 [ 7906.755834] md: UT:491989f3 ST:1 AD:4 WD:4 FD:0 SD:0 CSUM:00fb52ad E:00000001 [ 7906.755836] D 0: DISK<N:0,(1,8),R:0,S:6> [ 7906.755842] D 1: DISK<N:1,(1,10),R:1,S:6> [ 7906.755845] D 2: DISK<N:2,(1,12),R:2,S:6> [ 7906.755849] D 3: DISK<N:3,(1,14),R:3,S:6> [ 7906.755855] md: THIS: DISK<N:3,(1,14),R:3,S:6> ... [ 7906.755972] md0: <ram3><ram2><ram1><ram0> [ 7906.755981] md: rdev ram3, SZ:00065472 F:0 S:1 DN:3 [ 7906.755984] md: rdev superblock (MJ:1): [ 7906.755989] md: SB: (V:1) (F:0) Array-ID:<5fbcf158:55aa:5fbe:9a79:1e939880dcbd> [ 7906.755990] md: Name: "DG5:0" CT:1226410480 [ 7906.755998] md: L5 SZ130944 RD:4 LO:2 CS:128 DO:24 DS:131048 SO:8 RO:0 [ 7906.755999] md: Dev:00000003 UUID: 9194d744:87f7:a448:85f2:7497b84ce30a [ 7906.756001] md: (F:0) UT:1226410480 Events:0 ResyncOffset:-1 CSUM:0dbcd829 [ 7906.756003] md: (MaxDev:384) ... [ 7906.756113] md: ******************************** [ 7906.756116] this md0 (metadata 1.2) information dumping is exactly according to struct mdp_superblock_1. Signed-off-by: Cheng Renquan <crquan@gmail.com> Cc: Neil Brown <neilb@suse.de> Cc: Dan Williams <dan.j.williams@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:08 +11:00
Cheng Renquan	159ec1fc06	md: use list_for_each_entry macro directly The rdev_for_each macro defined in <linux/raid/md_k.h> is identical to list_for_each_entry_safe, from <linux/list.h>, it should be defined to use list_for_each_entry_safe, instead of reinventing the wheel. But some calls to each_entry_safe don't really need a safe version, just a direct list_for_each_entry is enough, this could save a temp variable (tmp) in every function that used rdev_for_each. In this patch, most rdev_for_each loops are replaced by list_for_each_entry, totally save many tmp vars; and only in the other situations that will call list_del to delete an entry, the safe version is used. Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:08 +11:00
Andre Noll	ccacc7d2cf	md: raid0: make hash_spacing and preshift sector-based. This patch renames the hash_spacing and preshift members of struct raid0_private_data to spacing and sector_shift respectively and changes the semantics as follows: We always have spacing = 2 * hash_spacing. In case sizeof(sector_t) > sizeof(u32) we also have sector_shift = preshift + 1 while sector_shift = preshift = 0 otherwise. Note that the values of nb_zone and zone are unaffected by these changes because in the sector_div() preceeding the assignement of these two variables both arguments double. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:08 +11:00
Andre Noll	83838ed878	md: raid0: Represent the size of strip zones in sectors. This completes the block -> sector conversion of struct strip_zone. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:07 +11:00
Andre Noll	6199d3db0f	md: raid0: Represent zone->zone_offset in sectors. For the same reason as in the previous patch, rename it from zone_offset to zone_start. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:07 +11:00
Andre Noll	019c4e2f3e	md: raid0: Represent device offset in sectors. Rename zone->dev_offset to zone->dev_start to make sure all users have been converted. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:06 +11:00
NeilBrown	0c3573f19d	md: use sysfs_notify_dirent to notify changes to md/sync_action. There is no compelling need for this, but sysfs_notify_dirent is a nicer interface and the change is good for consistency. Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:05 +11:00
Mike Rapoport	f4f6bda00f	backlight: add support for Toppoly TDO35S series to tdo24m lcd driver Signed-off-by: Mike Rapoport <mike@compulab.co.il> Acked-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>	2009-01-08 20:11:07 +00:00
Randy Dunlap	0ba4887c63	regulator: fix kernel-doc warnings Fix kernel-doc warnings in regulator/driver.h: Warning(linux-next-20090108//include/linux/regulator/driver.h:95): Excess struct/union/enum/typedef member 'set_current' description in 'regulator_ops' Warning(linux-next-20090108//include/linux/regulator/driver.h:95): Excess struct/union/enum/typedef member 'get_current' description in 'regulator_ops' Warning(linux-next-20090108//include/linux/regulator/driver.h:124): No description found for parameter 'irq' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> cc: Liam Girdwood <lrg@slimlogic.co.uk> cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>	2009-01-08 20:10:38 +00:00
Mark Brown	c8e7e4640f	regulator: Add missing kerneldoc This is only the documentation that the kerneldoc system warns about. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>	2009-01-08 20:10:33 +00:00
Mark Brown	69279fb9a9	regulator: Clean up kerneldoc warnings Remove kerneldoc warnings that don't relate to missing documentation, mostly by renaming parameters in the documentation to match their actual names. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>	2009-01-08 20:10:33 +00:00
David S. Miller	7f46b1343f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6	2009-01-08 11:05:59 -08:00
Richard Purdie	859cb7f2a4	leds: Add suspend/resume to the core class Add suspend/resume to the core class and remove all the now unneeded code from various drivers. Originally the class code couldn't support suspend/resume but since class_device can there is no reason for each driver doing its own suspend/resume anymore.	2009-01-08 17:55:03 +00:00
Linus Torvalds	85da1fb545	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (53 commits) serial: Add driver for the Cell Network Processor serial port NWP device powerpc: enable dynamic ftrace powerpc/cell: Fix the prototype of create_vma_map() powerpc/mm: Make clear_fixmap() actually work powerpc/kdump: Use ppc_save_regs() in crash_setup_regs() powerpc: Export cacheable_memzero as its now used in a driver powerpc: Fix missing semicolons in mmu_decl.h powerpc/pasemi: local_irq_save uses an unsigned long powerpc/cell: Fix some u64 vs. long types powerpc/cell: Use correct types in beat files powerpc: Use correct type in prom_init.c powerpc: Remove unnecessary casts mtd/ps3vram: Use _PAGE_NO_CACHE in memory ioremap mtd/ps3vram: Use msleep in waits mtd/ps3vram: Use proper kernel types mtd/ps3vram: Cleanup ps3vram driver messages mtd/ps3vram: Remove ps3vram debug routines mtd/ps3vram: Add modalias support to the ps3vram driver mtd/ps3vram: Add ps3vram driver for accessing video RAM as MTD powerpc: Fix iseries drivers build failure without CONFIG_VIOPATH ...	2009-01-08 09:10:16 -08:00
Wu Fengguang	91f68b7359	generic swap(): introduce global macro swap(a, b) There have been some local definitions of swap(), it's time to replace them all with a uniform one. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:15 -08:00
Kees Cook	f06295b44c	ELF: implement AT_RANDOM for glibc PRNG seeding While discussing[1] the need for glibc to have access to random bytes during program load, it seems that an earlier attempt to implement AT_RANDOM got stalled. This implements a random 16 byte string, available to every ELF program via a new auxv AT_RANDOM vector. [1] http://sourceware.org/ml/libc-alpha/2008-10/msg00006.html Ulrich said: glibc needs right after startup a bit of random data for internal protections (stack canary etc). What is now in upstream glibc is that we always unconditionally open /dev/urandom, read some data, and use it. For every process startup. That's slow. ... The solution is to provide a limited amount of random data to the starting process in the aux vector. I suggested 16 bytes and this is what the patch implements. If we need only 16 bytes or less we use the data directly. If we need more we'll use the 16 bytes to see a PRNG. This avoids the costly /dev/urandom use and it allows the kernel to use the most adequate source of random data for this purpose. It might not be the same pool as that for /dev/urandom. Concerns were expressed about the depletion of the randomness pool. But this patch doesn't make the situation worse, it doesn't deplete entropy more than happens now. Signed-off-by: Kees Cook <kees.cook@canonical.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:12 -08:00
Eric W. Biederman	61bce0f137	pid: generalize task_active_pid_ns Currently task_active_pid_ns is not safe to call after a task becomes a zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By reading the pid namespace from the pid of the task we can trivially solve this problem at the cost of one extra memory read in what should be the same cacheline as we read the namespace from. When moving things around I have made task_active_pid_ns out of line because keeping it in pid_namespace.h would require adding includes of pid.h and sched.h that I don't think we want. This change does make task_active_pid_ns unsafe to call during copy_process until we attach a pid on the task_struct which seems to be a reasonable trade off. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Bastian Blank <bastian@waldi.eu.org> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:12 -08:00
Eric W. Biederman	f9fb860f67	pid: implement ns_of_pid A current problem with the pid namespace is that it is easy to do pid related work after exit_task_namespaces which drops the nsproxy pointer. However if we are doing pid namespace related work we are always operating on some struct pid which retains the pid_namespace pointer of the pid namespace it was allocated in. So provide ns_of_pid which allows us to find the pid namespace a pid was allocated in. Using this we have the needed infrastructure to do pid namespace related work at anytime we have a struct pid, removing the chance of accidentally having a NULL pointer dereference when accessing current->nsproxy. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Bastian Blank <bastian@waldi.eu.org> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:12 -08:00
Li Zefan	6af866af34	cpuset: remove remaining pointers to cpumask_t Impact: cleanups, use new cpumask API Final trivial cleanups: mainly s/cpumask_t/struct cpumask Note there is a FIXME in generate_sched_domains(). A future patch will change struct cpumask doms to struct cpumask doms[]. (I suppose Rusty will do this.) Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Mike Travis <travis@sgi.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:11 -08:00
Paul Menage	e7c5ec9193	cgroups: add css_tryget() Add css_tryget(), that obtains a counted reference on a CSS. It is used in situations where the caller has a "weak" reference to the CSS, i.e. one that does not protect the cgroup from removal via a reference count, but would instead be cleaned up by a destroy() callback. css_tryget() will return true on success, or false if the cgroup is being removed. This is similar to Kamezawa Hiroyuki's patch from a week or two ago, but with the difference that in the event of css_tryget() racing with a cgroup_rmdir(), css_tryget() will only return false if the cgroup really does get removed. This implementation is done by biasing css->refcnt, so that a refcnt of 1 means "releasable" and 0 means "released or releasing". In the event of a race, css_tryget() distinguishes between "released" and "releasing" by checking for the CSS_REMOVED flag in css->flags. Signed-off-by: Paul Menage <menage@google.com> Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:10 -08:00
Paul Menage	999cd8a450	cgroups: add a per-subsystem hierarchy_mutex These patches introduce new locking/refcount support for cgroups to reduce the need for subsystems to call cgroup_lock(). This will ultimately allow the atomicity of cgroup_rmdir() (which was removed recently) to be restored. These three patches give: 1/3 - introduce a per-subsystem hierarchy_mutex which a subsystem can use to prevent changes to its own cgroup tree 2/3 - use hierarchy_mutex in place of calling cgroup_lock() in the memory controller 3/3 - introduce a css_tryget() function similar to the one recently proposed by Kamezawa, but avoiding spurious refcount failures in the event of a race between a css_tryget() and an unsuccessful cgroup_rmdir() Future patches will likely involve: - using hierarchy mutex in place of cgroup_lock() in more subsystems where appropriate - restoring the atomicity of cgroup_rmdir() with respect to cgroup_create() This patch: Add a hierarchy_mutex to the cgroup_subsys object that protects changes to the hierarchy observed by that subsystem. It is taken by the cgroup subsystem (in addition to cgroup_mutex) for the following operations: - linking a cgroup into that subsystem's cgroup tree - unlinking a cgroup from that subsystem's cgroup tree - moving the subsystem to/from a hierarchy (including across the bind() callback) Thus if the subsystem holds its own hierarchy_mutex, it can safely traverse its own hierarchy. Signed-off-by: Paul Menage <menage@google.com> Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:10 -08:00
KAMEZAWA Hiroyuki	b5a84319a4	memcg: fix shmem's swap accounting Now, you can see following even when swap accounting is enabled. 1. Create Group 01, and 02. 2. allocate a "file" on tmpfs by a task under 01. 3. swap out the "file" (by memory pressure) 4. Read "file" from a task in group 02. 5. the charge of "file" is moved to group 02. This is not ideal behavior. This is because SwapCache which was loaded by read-ahead is not taken into account.. This is a patch to fix shmem's swapcache behavior. - remove mem_cgroup_cache_charge_swapin(). - Add SwapCache handler routine to mem_cgroup_cache_charge(). By this, shmem's file cache is charged at add_to_page_cache() with GFP_NOWAIT. - pass the page of swapcache to shrink_mem_cgroup. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:10 -08:00
Daisuke Nishimura	a5e924f5f8	memcg: remove mem_cgroup_try_charge After previous patch, mem_cgroup_try_charge is not used by anyone, so we can remove it. Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:09 -08:00
KOSAKI Motohiro	c772be939e	memcg: fix calculation of active_ratio Currently, inactive_ratio of memcg is calculated at setting limit. because page_alloc.c does so and current implementation is straightforward porting. However, memcg introduced hierarchy feature recently. In hierarchy restriction, memory limit is not only decided memory.limit_in_bytes of current cgroup, but also parent limit and sibling memory usage. Then, The optimal inactive_ratio is changed frequently. So, everytime calculation is better. Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:09 -08:00
KOSAKI Motohiro	a7885eb8ad	memcg: swappiness Currently, /proc/sys/vm/swappiness can change swappiness ratio for global reclaim. However, memcg reclaim doesn't have tuning parameter for itself. In general, the optimal swappiness depend on workload. (e.g. hpc workload need to low swappiness than the others.) Then, per cgroup swappiness improve administrator tunability. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:08 -08:00
KOSAKI Motohiro	9439c1c95b	memcg: remove mem_cgroup_cal_reclaim() Now, get_scan_ratio() return correct value although memcg reclaim. Then, mem_cgroup_calc_reclaim() can be removed. So, memcg reclaim get the same capability of anon/file reclaim balancing as global reclaim now. Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:08 -08:00
KOSAKI Motohiro	3e2f41f1f6	memcg: add zone_reclaim_stat Introduce mem_cgroup_per_zone::reclaim_stat member and its statics collecting function. Now, get_scan_ratio() can calculate correct value on memcg reclaim. [hugh@veritas.com: avoid reclaim_stat oops when disabled] Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:08 -08:00
KOSAKI Motohiro	a3d8e0549d	memcg: add mem_cgroup_zone_nr_pages() Introduce mem_cgroup_zone_nr_pages(). It is called by zone_nr_pages() helper function. This patch doesn't have any behavior change. Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:08 -08:00
KOSAKI Motohiro	14797e2363	memcg: add inactive_anon_is_low() The inactive_anon_is_low() is key component of active/inactive anon balancing on reclaim. However current inactive_anon_is_low() function only consider global reclaim. Therefore, we need following ugly scan_global_lru() condition. if (lru == LRU_ACTIVE_ANON && (!scan_global_lru(sc) \|\| inactive_anon_is_low(zone))) { shrink_active_list(nr_to_scan, zone, sc, priority, file); return 0; it cause that memcg reclaim always deactivate pages when shrink_list() is called. To make mem_cgroup_inactive_anon_is_low() improve active/inactive anon balancing of memcgroup. Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: "Pekka Enberg" <penberg@cs.helsinki.fi> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:08 -08:00
KOSAKI Motohiro	6e9015716a	mm: introduce zone_reclaim struct Add zone_reclam_stat struct for later enhancement. A later patch uses this. This patch doesn't any behavior change (yet). Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:07 -08:00
KOSAKI Motohiro	f89eb90e33	inactive_anon_is_low: move to vmscan The inactive_anon_is_low() is called only vmscan. Then it can move to vmscan.c This patch doesn't have any functional change. Reviewd-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:07 -08:00
KAMEZAWA Hiroyuki	2c26fdd70c	memcg: revert gfp mask fix My patch, memcg-fix-gfp_mask-of-callers-of-charge.patch changed gfp_mask of callers of charge to be GFP_HIGHUSER_MOVABLE for showing what will happen at memory reclaim. But in recent discussion, it's NACKed because it sounds ugly. This patch is for reverting it and add some clean up to gfp_mask of callers of charge. No behavior change but need review before generating HUNK in deep queue. This patch also adds explanation to meaning of gfp_mask passed to charge functions in memcontrol.h. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:06 -08:00
KAMEZAWA Hiroyuki	a636b327f7	memcg: avoid unnecessary system-wide-oom-killer Current mmtom has new oom function as pagefault_out_of_memory(). It's added for select bad process rathar than killing current. When memcg hit limit and calls OOM at page_fault, this handler called and system-wide-oom handling happens. (means kernel panics if panic_on_oom is true....) To avoid overkill, check memcg's recent behavior before starting system-wide-oom. And this patch also fixes to guarantee "don't accnout against process with TIF_MEMDIE". This is necessary for smooth OOM. [akpm@linux-foundation.org: build fix] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Jan Blunck <jblunck@suse.de> Cc: Hirokazu Takahashi <taka@valinux.co.jp> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:06 -08:00
Lai Jiangshan	2e4d40915f	memcontrol: rcu_read_lock() to protect mm_match_cgroup() mm_match_cgroup() calls cgroup_subsys_state(). We must use rcu_read_lock() to protect cgroup_subsys_state(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:06 -08:00
Balbir Singh	28dbc4b6a0	memcg: memory cgroup resource counters for hierarchy Add support for building hierarchies in resource counters. Cgroups allows us to build a deep hierarchy, but we currently don't link the resource counters belonging to the memory controller control groups, in the same fashion as the corresponding cgroup entries in the cgroup hierarchy. This patch provides the infrastructure for resource counters that have the same hiearchy as their cgroup counter parts. These set of patches are based on the resource counter hiearchy patches posted by Pavel Emelianov. NOTE: Building hiearchies is expensive, deeper hierarchies imply charging the all the way up to the root. It is known that hiearchies are expensive, so the user needs to be careful and aware of the trade-offs before creating very deep ones. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:05 -08:00
Hirokazu Takahashi	f8d6654226	memcg: add mem_cgroup_disabled() We check mem_cgroup is disabled or not by checking mem_cgroup_subsys.disabled. I think it has more references than expected, now. replacing if (mem_cgroup_subsys.disabled) with if (mem_cgroup_disabled()) give us good look, I think. [kamezawa.hiroyu@jp.fujitsu.com: fix typo] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki	08e552c69c	memcg: synchronized LRU A big patch for changing memcg's LRU semantics. Now, - page_cgroup is linked to mem_cgroup's its own LRU (per zone). - LRU of page_cgroup is not synchronous with global LRU. - page and page_cgroup is one-to-one and statically allocated. - To find page_cgroup is on what LRU, you have to check pc->mem_cgroup as - lru = page_cgroup_zoneinfo(pc, nid_of_pc, zid_of_pc); - SwapCache is handled. And, when we handle LRU list of page_cgroup, we do following. pc = lookup_page_cgroup(page); lock_page_cgroup(pc); .....................(1) mz = page_cgroup_zoneinfo(pc); spin_lock(&mz->lru_lock); .....add to LRU spin_unlock(&mz->lru_lock); unlock_page_cgroup(pc); But (1) is spin_lock and we have to be afraid of dead-lock with zone->lru_lock. So, trylock() is used at (1), now. Without (1), we can't trust "mz" is correct. This is a trial to remove this dirty nesting of locks. This patch changes mz->lru_lock to be zone->lru_lock. Then, above sequence will be written as spin_lock(&zone->lru_lock); # in vmscan.c or swap.c via global LRU mem_cgroup_add/remove/etc_lru() { pc = lookup_page_cgroup(page); mz = page_cgroup_zoneinfo(pc); if (PageCgroupUsed(pc)) { ....add to LRU } spin_lock(&zone->lru_lock); # in vmscan.c or swap.c via global LRU This is much simpler. (*) We're safe even if we don't take lock_page_cgroup(pc). Because.. 1. When pc->mem_cgroup can be modified. - at charge. - at account_move(). 2. at charge the PCG_USED bit is not set before pc->mem_cgroup is fixed. 3. at account_move() the page is isolated and not on LRU. Pros. - easy for maintenance. - memcg can make use of laziness of pagevec. - we don't have to duplicated LRU/Active/Unevictable bit in page_cgroup. - LRU status of memcg will be synchronized with global LRU's one. - # of locks are reduced. - account_move() is simplified very much. Cons. - may increase cost of LRU rotation. (no impact if memcg is not configured.) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki	8c7c6e34a1	memcg: mem+swap controller core This patch implements per cgroup limit for usage of memory+swap. However there are SwapCache, double counting of swap-cache and swap-entry is avoided. Mem+Swap controller works as following. - memory usage is limited by memory.limit_in_bytes. - memory + swap usage is limited by memory.memsw_limit_in_bytes. This has following benefits. - A user can limit total resource usage of mem+swap. Without this, because memory resource controller doesn't take care of usage of swap, a process can exhaust all the swap (by memory leak.) We can avoid this case. And Swap is shared resource but it cannot be reclaimed (goes back to memory) until it's used. This characteristic can be trouble when the memory is divided into some parts by cpuset or memcg. Assume group A and group B. After some application executes, the system can be.. Group A -- very large free memory space but occupy 99% of swap. Group B -- under memory shortage but cannot use swap...it's nearly full. Ability to set appropriate swap limit for each group is required. Maybe someone wonder "why not swap but mem+swap ?" - The global LRU(kswapd) can swap out arbitrary pages. Swap-out means to move account from memory to swap...there is no change in usage of mem+swap. In other words, when we want to limit the usage of swap without affecting global LRU, mem+swap limit is better than just limiting swap. Accounting target information is stored in swap_cgroup which is per swap entry record. Charge is done as following. map - charge page and memsw. unmap - uncharge page/memsw if not SwapCache. swap-out (__delete_from_swap_cache) - uncharge page - record mem_cgroup information to swap_cgroup. swap-in (do_swap_page) - charged as page and memsw. record in swap_cgroup is cleared. memsw accounting is decremented. swap-free (swap_free()) - if swap entry is freed, memsw is uncharged by PAGE_SIZE. There are people work under never-swap environments and consider swap as something bad. For such people, this mem+swap controller extension is just an overhead. This overhead is avoided by config or boot option. (see Kconfig. detail is not in this patch.) TODO: - maybe more optimization can be don in swap-in path. (but not very safe.) But we just do simple accounting at this stage. [nishimura@mxp.nes.nec.co.jp: make resize limit hold mutex] [hugh@veritas.com: memswap controller core swapcache fixes] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki	27a7faa077	memcg: swap cgroup for remembering usage For accounting swap, we need a record per swap entry, at least. This patch adds following function. - swap_cgroup_swapon() .... called from swapon - swap_cgroup_swapoff() ... called at the end of swapoff. - swap_cgroup_record() .... record information of swap entry. - swap_cgroup_lookup() .... lookup information of swap entry. This patch just implements "how to record information". No actual method for limit the usage of swap. These routine uses flat table to record and lookup. "wise" lookup system like radix-tree requires requires memory allocation at new records but swap-out is usually called under memory shortage (or memcg hits limit.) So, I used static allocation. (maybe dynamic allocation is not very hard but it adds additional memory allocation in memory shortage path.) Note1: In this, we use pointer to record information and this means 8bytes per swap entry. I think we can reduce this when we create "id of cgroup" in the range of 0-65535 or 0-255. Reported-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Reported-by: Hugh Dickins <hugh@veritas.com> Reported-by: Balbir Singh <balbir@linux.vnet.ibm.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki	c077719be8	memcg: mem+swap controller Kconfig Config and control variable for mem+swap controller. This patch adds CONFIG_CGROUP_MEM_RES_CTLR_SWAP (memory resource controller swap extension.) For accounting swap, it's obvious that we have to use additional memory to remember "who uses swap". This adds more overhead. So, it's better to offer "choice" to users. This patch adds 2 choices. This patch adds 2 parameters to enable swap extension or not. - CONFIG - boot option Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki	d13d144309	memcg: handle swap caches SwapCache support for memory resource controller (memcg) Before mem+swap controller, memcg itself should handle SwapCache in proper way. This is cut-out from it. In current memcg, SwapCache is just leaked and the user can create tons of SwapCache. This is a leak of account and should be handled. SwapCache accounting is done as following. charge (anon) - charged when it's mapped. (because of readahead, charge at add_to_swap_cache() is not sane) uncharge (anon) - uncharged when it's dropped from swapcache and fully unmapped. means it's not uncharged at unmap. Note: delete from swap cache at swap-in is done after rmap information is established. charge (shmem) - charged at swap-in. this prevents charge at add_to_page_cache(). uncharge (shmem) - uncharged when it's dropped from swapcache and not on shmem's radix-tree. at migration, check against 'old page' is modified to handle shmem. Comparing to the old version discussed (and caused troubles), we have advantages of - PCG_USED bit. - simple migrating handling. So, situation is much easier than several months ago, maybe. [hugh@veritas.com: memcg: handle swap caches build fix] Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:05 -08:00
KAMEZAWA Hiroyuki	01b1ae63c2	memcg: simple migration handling Now, management of "charge" under page migration is done under following manner. (Assume migrate page contents from oldpage to newpage) before - "newpage" is charged before migration. at success. - "oldpage" is uncharged at somewhere(unmap, radix-tree-replace) at failure - "newpage" is uncharged. - "oldpage" is charged if necessary (1) But (1) is not reliable....because of GFP_ATOMIC. This patch tries to change behavior as following by charge/commit/cancel ops. before - charge PAGE_SIZE (no target page) success - commit charge against "newpage". failure - commit charge against "oldpage". (PCG_USED bit works effectively to avoid double-counting) - if "oldpage" is obsolete, cancel charge of PAGE_SIZE. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:04 -08:00
KAMEZAWA Hiroyuki	7a81b88cb5	memcg: introduce charge-commit-cancel style of functions There is a small race in do_swap_page(). When the page swapped-in is charged, the mapcount can be greater than 0. But, at the same time some process (shares it ) call unmap and make mapcount 1->0 and the page is uncharged. CPUA CPUB mapcount == 1. (1) charge if mapcount==0 zap_pte_range() (2) mapcount 1 => 0. (3) uncharge(). (success) (4) set page's rmap() mapcount 0=>1 Then, this swap page's account is leaked. For fixing this, I added a new interface. - charge account to res_counter by PAGE_SIZE and try to free pages if necessary. - commit register page_cgroup and add to LRU if necessary. - cancel uncharge PAGE_SIZE because of do_swap_page failure. CPUA (1) charge (always) (2) set page's rmap (mapcount > 0) (3) commit charge was necessary or not after set_pte(). This protocol uses PCG_USED bit on page_cgroup for avoiding over accounting. Usual mem_cgroup_charge_common() does charge -> commit at a time. And this patch also adds following function to clarify all charges. - mem_cgroup_newpage_charge() ....replacement for mem_cgroup_charge() called against newly allocated anon pages. - mem_cgroup_charge_migrate_fixup() called only from remove_migration_ptes(). we'll have to rewrite this later.(this patch just keeps old behavior) This function will be removed by additional patch to make migration clearer. Good for clarifying "what we do" Then, we have 4 following charge points. - newpage - swap-in - add-to-cache. - migration. [akpm@linux-foundation.org: add missing inline directives to stubs] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:04 -08:00
Paul Menage	a47295e6bc	cgroups: make cgroup_path() RCU-safe Fix races between /proc/sched_debug by freeing cgroup objects via an RCU callback. Thus any cgroup reference obtained from an RCU-safe source will remain valid during the RCU section. Since dentries are also RCU-safe, this allows us to traverse up the tree safely. Additionally, make cgroup_path() check for a NULL cgrp->dentry to avoid trying to report a path for a partially-created cgroup. [lizf@cn.fujitsu.com: call deactive_super() in cgroup_diput()] Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:03 -08:00
Lai Jiangshan	b2aa30f7bb	cgroups: don't put struct cgroupfs_root protected by RCU We don't access struct cgroupfs_root in fast path, so we should not put struct cgroupfs_root protected by RCU But the comment in struct cgroup_subsys.root confuse us. struct cgroup_subsys.root is used in these places: 1 find_css_set(): if (ss->root->subsys_list.next == &ss->sibling) 2 rebind_subsystems(): if (ss->root != &rootnode) rcu_assign_pointer(ss->root, root); rcu_assign_pointer(subsys[i]->root, &rootnode); 3 cgroup_has_css_refs(): if (ss->root != cgrp->root) 4 cgroup_init_subsys(): ss->root = &rootnode; 5 proc_cgroupstats_show(): ss->name, ss->root->subsys_bits, ss->root->number_of_cgroups, !ss->disabled); 6 cgroup_clone(): root = subsys->root; if ((root != subsys->root) \|\| All these place we have held cgroup_lock() or we don't dereference to struct cgroupfs_root. It's means wo don't need RCU when use struct cgroup_subsys.root, and we should not put struct cgroupfs_root protected by RCU. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Reviewed-by: Paul Menage <menage@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:02 -08:00
Duane Griffin	04143e2fb9	ext3: tighten restrictions on inode flags At the moment there are few restrictions on which flags may be set on which inodes. Specifically DIRSYNC may only be set on directories and IMMUTABLE and APPEND may not be set on links. Tighten that to disallow TOPDIR being set on non-directories and only NODUMP and NOATIME to be set on non-regular file, non-directories. Introduces a flags masking function which masks flags based on mode and use it during inode creation and when flags are set via the ioctl to facilitate future consistency. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:01 -08:00
Duane Griffin	2e8671cb56	ext3: don't inherit inappropriate inode flags from parent At present INDEX is the only flag that new ext3 inodes do NOT inherit from their parent. In addition prevent the flags DIRTY, ECOMPR, IMAGIC and TOPDIR from being inherited. List inheritable flags explicitly to prevent future flags from accidentally being inherited. This fixes the TOPDIR flag inheritance bug reported at http://bugzilla.kernel.org/show_bug.cgi?id=9866. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:01 -08:00
Pekka Enberg	5df096d67e	ext3: allocate ->s_blockgroup_lock separately As spotted by kmemtrace, struct ext3_sb_info is 17152 bytes on 64-bit which makes it a very bad fit for SLAB allocators. The culprit of the wasted memory is ->s_blockgroup_lock which can be as big as 16 KB when NR_CPUS >= 32. To fix that, allocate ->s_blockgroup_lock, which fits nicely in a order 2 page in the worst case, separately. This shinks down struct ext3_sb_info enough to fit a 1 KB slab cache so now we allocate 16 KB + 1 KB instead of 32 KB saving 15 KB of memory. Acked-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:00 -08:00
Josef Bacik	f420d4dc42	jbd: improve fsync batching There is a flaw with the way jbd handles fsync batching. If we fsync() a file and we were not the last person to run fsync() on this fs then we automatically sleep for 1 jiffie in order to wait for new writers to join into the transaction before forcing the commit. The problem with this is that with really fast storage (ie a Clariion) the time it takes to commit a transaction to disk is way faster than 1 jiffie in most cases, so sleeping means waiting longer with nothing to do than if we just committed the transaction and kept going. Ric Wheeler noticed this when using fs_mark with more than 1 thread, the throughput would plummet as he added more threads. This patch attempts to fix this problem by recording the average time in nanoseconds that it takes to commit a transaction to disk, and what time we started the transaction. If we run an fsync() and we have been running for less time than it takes to commit the transaction to disk, we sleep for the delta amount of time and then commit to disk. We acheive sub-jiffie sleeping using schedule_hrtimeout. This means that the wait time is auto-tuned to the speed of the underlying disk, instead of having this static timeout. I weighted the average according to somebody's comments (Andreas Dilger I think) in order to help normalize random outliers where we take way longer or way less time to commit than the average. I also have a min() check in there to make sure we don't sleep longer than a jiffie in case our storage is super slow, this was requested by Andrew. I unfortunately do not have access to a Clariion, so I had to use a ramdisk to represent a super fast array. I tested with a SATA drive with barrier=1 to make sure there was no regression with local disks, I tested with a 4 way multipathed Apple Xserve RAID array and of course the ramdisk. I ran the following command fs_mark -d /mnt/ext3-test -s 4096 -n 2000 -D 64 -t $i where $i was 2, 4, 8, 16 and 32. I mkfs'ed the fs each time. Here are my results type threads with patch without patch sata 2 24.6 26.3 sata 4 49.2 48.1 sata 8 70.1 67.0 sata 16 104.0 94.1 sata 32 153.6 142.7 xserve 2 246.4 222.0 xserve 4 480.0 440.8 xserve 8 829.5 730.8 xserve 16 1172.7 1026.9 xserve 32 1816.3 1650.5 ramdisk 2 2538.3 1745.6 ramdisk 4 2942.3 661.9 ramdisk 8 2882.5 999.8 ramdisk 16 2738.7 1801.9 ramdisk 32 2541.9 2394.0 Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: Andreas Dilger <adilger@sun.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ric Wheeler <rwheeler@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:00 -08:00
Duane Griffin	ef8b646183	ext2: tighten restrictions on inode flags At the moment there are few restrictions on which flags may be set on which inodes. Specifically DIRSYNC may only be set on directories and IMMUTABLE and APPEND may not be set on links. Tighten that to disallow TOPDIR being set on non-directories and only NODUMP and NOATIME to be set on non-regular file, non-directories. Introduces a flags masking function which masks flags based on mode and use it during inode creation and when flags are set via the ioctl to facilitate future consistency. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:00 -08:00
Duane Griffin	0e090f1e05	ext2: don't inherit inappropriate inode flags from parent At present BTREE/INDEX is the only flag that new ext2 inodes do NOT inherit from their parent. In addition prevent the flags DIRTY, ECOMPR, INDEX, IMAGIC and TOPDIR from being inherited. List inheritable flags explicitly to prevent future flags from accidentally being inherited. This fixes the TOPDIR flag inheritance bug reported at http://bugzilla.kernel.org/show_bug.cgi?id=9866. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:00 -08:00
Pekka J Enberg	18a82eb9f9	ext2: allocate ->s_blockgroup_lock separately As spotted by kmemtrace, struct ext2_sb_info is 17024 bytes on 64-bit which makes it a very bad fit for SLAB allocators. The culprit of the wasted memory is ->s_blockgroup_lock which can be as big as 16 KB when NR_CPUS >= 32. To fix that, allocate ->s_blockgroup_lock, which fits nicely in a order 2 page in the worst case, separately. This shinks down struct ext2_sb_info enough to fit a 1 KB slab cache so now we allocate 16 KB + 1 KB instead of 32 KB saving 15 KB of memory. Acked-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:31:00 -08:00
Alex Zeffertt	1107ba885e	xen: add xenfs to allow usermode <-> Xen interaction The xenfs filesystem exports various interfaces to usermode. Initially this exports a file to allow usermode to interact with xenbus/xenstore. Traditionally this appeared in /proc/xen. Rather than extending procfs, this patch adds a backward-compat mountpoint on /proc/xen, and provides a xenfs filesystem which can be mounted there. Signed-off-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:30:59 -08:00
Richard Purdie	c835ee7f41	backlight: Add suspend/resume support to the backlight core Add suspend/resume support to the backlight core and enable use of it by appropriate drivers. Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>	2009-01-08 15:37:43 +00:00
Robert Richter	d2852b932f	Merge branch 'oprofile/ring_buffer' into oprofile/oprofile-for-tip	2009-01-08 14:27:34 +01:00
Mark Brown	0081e8020e	leds: Add WM8350 LED driver The voltage and current regulators on the WM8350 AudioPlus PMIC can be used in concert to provide a power efficient LED driver. This driver implements support for this within the standard LED class. Platform initialisation code should configure the LED hardware in the init callback provided by the WM8350 core driver. The callback should use wm8350_isink_set_flash(), wm8350_dcdc25_set_mode() and wm8350_dcdc_set_slot() to configure the operating parameters of the regulators for their hardware and then then use wm8350_register_led() to instantiate the LED driver. This driver was originally written by Liam Girdwood, though it has been extensively modified since then. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>	2009-01-08 12:38:58 +00:00
Riku Voipio	934cd3f979	leds: leds-pcs9532 - Move i2c work to a workqueque Apparently these might be called under atomic context, and i2c operations may sleep. BUG found by Ross Burton <ross@burtonini.com> Signed-off-by: Riku Voipio <riku.voipio@iki.fi> Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>	2009-01-08 12:38:58 +00:00
Wolfram Sang	e2387d6c20	leds: Make header variable naming consistent There is one place where the struct led_classdev as the function argument is named differently. Fix it. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>	2009-01-08 12:38:57 +00:00
Paul Mundt	dd8632a12e	NOMMU: Make mmap allocation page trimming behaviour configurable. NOMMU mmap allocates a piece of memory for an mmap that's rounded up in size to the nearest power-of-2 number of pages. Currently it then discards the excess pages back to the page allocator, making that memory available for use by other things. This can, however, cause greater amount of fragmentation. To counter this, a sysctl is added in order to fine-tune the trimming behaviour. The default behaviour remains to trim pages aggressively, while this can either be disabled completely or set to a higher page-granular watermark in order to have finer-grained control. vm region vm_top bits taken from an earlier patch by David Howells. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Mike Frysinger <vapier.adi@gmail.com>	2009-01-08 12:04:47 +00:00
David Howells	8feae13110	NOMMU: Make VMAs per MM as for MMU-mode linux Make VMAs per mm_struct as for MMU-mode linux. This solves two problems: (1) In SYSV SHM where nattch for a segment does not reflect the number of shmat's (and forks) done. (2) In mmap() where the VMA's vm_mm is set to point to the parent mm by an exec'ing process when VM_EXECUTABLE is specified, regardless of the fact that a VMA might be shared and already have its vm_mm assigned to another process or a dead process. A new struct (vm_region) is introduced to track a mapped region and to remember the circumstances under which it may be shared and the vm_list_struct structure is discarded as it's no longer required. This patch makes the following additional changes: (1) Regions are now allocated with alloc_pages() rather than kmalloc() and with no recourse to __GFP_COMP, so the pages are not composite. Instead, each page has a reference on it held by the region. Anything else that is interested in such a page will have to get a reference on it to retain it. When the pages are released due to unmapping, each page is passed to put_page() and will be freed when the page usage count reaches zero. (2) Excess pages are trimmed after an allocation as the allocation must be made as a power-of-2 quantity of pages. (3) VMAs are added to the parent MM's R/B tree and mmap lists. As an MM may end up with overlapping VMAs within the tree, the VMA struct address is appended to the sort key. (4) Non-anonymous VMAs are now added to the backing inode's prio list. (5) Holes may be punched in anonymous VMAs with munmap(), releasing parts of the backing region. The VMA and region structs will be split if necessary. (6) sys_shmdt() only releases one attachment to a SYSV IPC shared memory segment instead of all the attachments at that addresss. Multiple shmat()'s return the same address under NOMMU-mode instead of different virtual addresses as under MMU-mode. (7) Core dumping for ELF-FDPIC requires fewer exceptions for NOMMU-mode. (8) /proc/maps is now the global list of mapped regions, and may list bits that aren't actually mapped anywhere. (9) /proc/meminfo gains a line (tagged "MmapCopy") that indicates the amount of RAM currently allocated by mmap to hold mappable regions that can't be mapped directly. These are copies of the backing device or file if not anonymous. These changes make NOMMU mode more similar to MMU mode. The downside is that NOMMU mode requires some extra memory to track things over NOMMU without this patch (VMAs are no longer shared, and there are now region structs). Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Mike Frysinger <vapier.adi@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org>	2009-01-08 12:04:47 +00:00
Benjamin Krill	5886188dc7	serial: Add driver for the Cell Network Processor serial port NWP device Add support for the nwp serial device which is connected to a DCR bus. It uses the of_serial device driver to determine necessary properties from the device tree. The supported device is added as serial port number 85. NWP stands for network processor and it is part of the QPACE - Quantum Chromodynamics Parallel Computing on the Cell Broadband Engine project. The implementation is a lightweight uart implementation with the focus to consume as little resources as possible and it is connected to a DCR bus. Signed-off-by: Benjamin Krill <ben@codiert.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-01-08 16:25:18 +11:00
Linus Torvalds	713404d608	Merge branch 'for-2.6.29' of git://linux-nfs.org/~bfields/linux * 'for-2.6.29' of git://linux-nfs.org/~bfields/linux: (67 commits) nfsd: get rid of NFSD_VERSION nfsd: last_byte_offset nfsd: delete wrong file comment from nfsd/nfs4xdr.c nfsd: git rid of nfs4_cb_null_ops declaration nfsd: dprint each op status in nfsd4_proc_compound nfsd: add etoosmall to nfserrno NFSD: FIDs need to take precedence over UUIDs SUNRPC: The sunrpc server code should not be used by out-of-tree modules svc: Clean up deferred requests on transport destruction nfsd: fix double-locks of directory mutex svc: Move kfree of deferral record to common code CRED: Fix NFSD regression NLM: Clean up flow of control in make_socks() function NLM: Refactor make_socks() function nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT SUNRPC: Ensure the server closes sockets in a timely fashion NFSD: Add documenting comments for nfsctl interface NFSD: Replace open-coded integer with macro NFSD: Fix a handful of coding style issues in write_filehandle() NFSD: clean up failover sysctl function naming ...	2009-01-07 17:21:24 -08:00
Linus Torvalds	b424e8d3b4	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (98 commits) PCI PM: Put PM callbacks in the order of execution PCI PM: Run default PM callbacks for all devices using new framework PCI PM: Register power state of devices during initialization PCI PM: Call pci_fixup_device from legacy routines PCI PM: Rearrange code in pci-driver.c PCI PM: Avoid touching devices behind bridges in unknown state PCI PM: Move pci_has_legacy_pm_support PCI PM: Power-manage devices without drivers during suspend-resume PCI PM: Add suspend counterpart of pci_reenable_device PCI PM: Fix poweroff and restore callbacks PCI: Use msleep instead of cpu_relax during ASPM link retraining PCI: PCIe portdrv: Add kerneldoc comments to remining core funtions PCI: PCIe portdrv: Rearrange code so that related things are together PCI: PCIe portdrv: Fix suspend and resume of PCI Express port services PCI: PCIe portdrv: Add kerneldoc comments to some core functions x86/PCI: Do not use interrupt links for devices using MSI-X net: sfc: Use pci_clear_master() to disable bus mastering PCI: Add pci_clear_master() as opposite of pci_set_master() PCI hotplug: remove redundant test in cpq hotplug PCI: pciehp: cleanup register and field definitions ...	2009-01-07 15:41:01 -08:00
Linus Torvalds	7c7758f99d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (123 commits) wimax/i2400m: add CREDITS and MAINTAINERS entries wimax: export linux/wimax.h and linux/wimax/i2400m.h with headers_install i2400m: Makefile and Kconfig i2400m/SDIO: TX and RX path backends i2400m/SDIO: firmware upload backend i2400m/SDIO: probe/disconnect, dev init/shutdown and reset backends i2400m/SDIO: header for the SDIO subdriver i2400m/USB: TX and RX path backends i2400m/USB: firmware upload backend i2400m/USB: probe/disconnect, dev init/shutdown and reset backends i2400m/USB: header for the USB bus driver i2400m: debugfs controls i2400m: various functions for device management i2400m: RX and TX data/control paths i2400m: firmware loading and bootrom initialization i2400m: linkage to the networking stack i2400m: Generic probe/disconnect, reset and message passing i2400m: host/device procotol and core driver definitions i2400m: documentation and instructions for usage wimax: Makefile, Kconfig and docbook linkage for the stack ...	2009-01-07 15:37:24 -08:00
Linus Torvalds	67acd8b4b7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async * git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async: async: don't do the initcall stuff post boot bootchart: improve output based on Dave Jones' feedback async: make the final inode deletion an asynchronous event fastboot: Make libata initialization even more async fastboot: make the libata port scan asynchronous fastboot: make scsi probes asynchronous async: Asynchronous function calls to speed up kernel boot	2009-01-07 15:35:47 -08:00
Benny Halevy	db43910cb4	nfsd: get rid of NFSD_VERSION Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-07 17:38:32 -05:00
Benny Halevy	87df4de807	nfsd: last_byte_offset refactor the nfs4 server lock code to use last_byte_offset to compute the last byte covered by the lock. Check for overflow so that the last byte is set to NFS4_MAX_UINT64 if offset + len wraps around. Also, use NFS4_MAX_UINT64 for ~(u64)0 where appropriate. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-07 17:38:31 -05:00
KOSAKI Motohiro	01d07820a0	sparseirq: make for_each_irq_desc() more robust Raja reported for_each_irq_desc() has possibility unsafeness: if anyone write folliwing code, for_each_irq_desc() doesn't work as intended: (right now this code does not exist at all) if (safe) for_each_irq_desc(irq, desc) { ... } else panic(); Reported-by: Raja R Harinath <harinath@hurrynot.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-07 23:18:08 +01:00
Robert Richter	14f0ca8eae	oprofile: make new cpu buffer functions part of the api This patch creates the new functions oprofile_write_reserve() oprofile_add_data() oprofile_write_commit() and makes them part of the oprofile api. Signed-off-by: Robert Richter <robert.richter@amd.com>	2009-01-07 22:48:15 +01:00
Linus Torvalds	c6906a2cb7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes: kbuild: fix typos (s/bin_shipped/bin.o_shipped/) in Documentation kbuild: add a symlink to the source for separate objdirs kconfig: add script to manipulate .config files on the command line kbuild: reintroduce ALLSOURCE_ARCHS support for tags/cscope bootchart: improve output based on Dave Jones' feedback fix modules_install via NFS qnx: include <linux/types.h> for definitions of __[us]{8,16,32,64} types	2009-01-07 13:11:28 -08:00
Anders Larsen	8d1a0a13ed	qnx: include <linux/types.h> for definitions of __[us]{8,16,32,64} types On 2008-12-30 11:32:33, Sam Ravnborg wrote: > We have added a few additional validation checks of the userspace headers: ... > 3) We should include <linux/types.h> and not <asm/types.h> > 4) If we use a __[us]{8,16,32,64} type then we must include <linux/types.h> Satisfy these requirements for the linux/qnx*.h headers. Signed-off-by: Anders Larsen <al@alarsen.net> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2009-01-07 21:44:20 +01:00
Linus Torvalds	fa7b906e7f	Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6 * 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: i2c: Use snprintf to set adapter names Input: apanel - convert to new i2c binding i2c: Drop I2C_CLASS_CAM_DIGITAL i2c: Drop I2C_CLASS_CAM_ANALOG and I2C_CLASS_SOUND i2c: Drop I2C_CLASS_ALL i2c: Get rid of remaining bus_id access i2c: Replace bus_id with dev_name(), dev_set_name()	2009-01-07 11:59:27 -08:00
Linus Torvalds	08249903ea	Merge git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6: avr32: Move syscalls.h under arch/avr32/include/asm/ avr32: Define DIE_OOPS avr32: Remove DMATEST from defconfigs arch/avr32: Eliminate NULL test and memset after alloc_bootmem avr32: data param to at32_add_device_mci() must be non-NULL atmel-mci: move atmel-mci.h file to include/linux avr32: Hammerhead board support avr32: Allow reserving multiple pins at once favr-32: Remove deprecated call MIMC200: Remove deprecated call avr: struct device - replace bus_id with dev_name(), dev_set_name() avr32: Introducing asm/syscalls.h	2009-01-07 11:58:30 -08:00
Linus Torvalds	57c44c5f6f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits) trivial: chack -> check typo fix in main Makefile trivial: Add a space (and a comma) to a printk in 8250 driver trivial: Fix misspelling of "firmware" in docs for ncr53c8xx/sym53c8xx trivial: Fix misspelling of "firmware" in powerpc Makefile trivial: Fix misspelling of "firmware" in usb.c trivial: Fix misspelling of "firmware" in qla1280.c trivial: Fix misspelling of "firmware" in a100u2w.c trivial: Fix misspelling of "firmware" in megaraid.c trivial: Fix misspelling of "firmware" in ql4_mbx.c trivial: Fix misspelling of "firmware" in acpi_memhotplug.c trivial: Fix misspelling of "firmware" in ipw2100.c trivial: Fix misspelling of "firmware" in atmel.c trivial: Fix misspelled firmware in Kconfig trivial: fix an -> a typos in documentation and comments trivial: fix then -> than typos in comments and documentation trivial: update Jesper Juhl CREDITS entry with new email trivial: fix singal -> signal typo trivial: Fix incorrect use of "loose" in event.c trivial: printk: fix indentation of new_text_line declaration trivial: rtc-stk17ta8: fix sparse warning ...	2009-01-07 11:31:52 -08:00
Detlef Riekenberg	940fbf411e	linux/types.h: Don't depend on __GNUC__ for __le64/__be64 The typedefs for __u64 and __s64 where fixed to be available for other compiler on May 2 2008 by H. Peter Anvin (in commit `edfa5cfa3d`) Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Detlef Riekenberg <wine.dev@web.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-07 11:27:12 -08:00
Rafael J. Wysocki	16cf0ebc35	x86/PCI: Do not use interrupt links for devices using MSI-X pcibios_enable_device() and pcibios_disable_device() don't handle IRQs for devices that have MSI enabled and it should treat the devices with MSI-X enabled in the same way. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:25 -08:00
Ben Hutchings	6a479079c0	PCI: Add pci_clear_master() as opposite of pci_set_master() During an online device reset it may be useful to disable bus-mastering. pci_disable_device() does that, and far more besides, so is not suitable for an online reset. Add pci_clear_master() which does just this. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:23 -08:00
Kenji Kaneshige	322162a71b	PCI: pciehp: cleanup register and field definitions Clean up register definitions related to PCI Express Hot plug. - Add register definitions into include/linux/pci_regs.h, and use them instead of pciehp's locally definied register definitions. - Remove pciehp's locally defined register definitions - Remove unused register definitions in pciehp. - Some minor cleanups. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:22 -08:00
Stephen Hemminger	db5679437a	PCI: add interface to set visible size of VPD The VPD on all devices may not be 32K. Unfortunately, there is no generic way to find the size, so this adds a simple API hook to reset it. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:18 -08:00
Stephen Hemminger	287d19ce2e	PCI: revise VPD access interface Change PCI VPD API which was only used by sysfs to something usable in drivers. * move iteration over multiple words to the low level * use conventional types for arguments * add exportable wrapper Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:17 -08:00
Bjorn Helgaas	68feac87de	PCI: add pci_common_swizzle() for INTx swizzling This patch adds pci_common_swizzle(), which swizzles INTx values all the way up to a root bridge. This common implementation can replace several architecture-specific ones. This should someday be combined with pci_get_interrupt_pin(), but I left it separate for now to make reviewing easier. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:12 -08:00
Kenji Kaneshige	e8c331e963	PCI hotplug: introduce functions for ACPI slot detection Some ACPI related PCI hotplug code can be shared among PCI hotplug drivers. This patch introduces the following functions in drivers/pci/hotplug/acpi_pcihp.c to share the code, and changes acpiphp and pciehp to use them. - int acpi_pci_detect_ejectable(struct pci_bus pbus) This checks if the specified PCI bus has ejectable slots. - int acpi_pci_check_ejectable(struct pci_bus pbus, acpi_handle handle) This checks if the specified handle is ejectable ACPI PCI slot. The 'pbus' parameter is needed to check if 'handle' is PCI related ACPI object. This patch also introduces the following inline function in include/linux/pci-acpi.h, which is useful to get ACPI handle of the PCI bridge from struct pci_bus of the bridge's secondary bus. - static inline acpi_handle acpi_pci_get_bridge_handle(struct pci_bus *pbus) This returns ACPI handle of the PCI bridge which generates PCI bus specified by 'pbus'. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:11 -08:00
Yu Zhao	fde09c6d8f	PCI: define PCI resource names in an 'enum' This patch moves all definitions of the PCI resource names to an 'enum', and also replaces some hard-coded resource variables with symbol names. This change eases introduction of device specific resources. Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:01 -08:00
Yu Zhao	14add80b51	PCI: remove unnecessary arg of pci_update_resource() This cleanup removes unnecessary argument 'struct resource *res' in pci_update_resource(), so it takes same arguments as other companion functions (pci_assign_resource(), etc.). Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:00 -08:00
Andrew Morton	1684f5ddd4	PCI: uninline pci_ioremap_bar() It's too large to be inlined. Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:57 -08:00
Bjorn Helgaas	57c2cf71c1	PCI: add pci_swizzle_interrupt_pin() This patch adds pci_swizzle_interrupt_pin(), which implements the INTx swizzling algorithm specified in Table 9-1 of the "PCI-to-PCI Bridge Architecture Specification," revision 1.2. There are many architecture-specific implementations of this swizzle that can be replaced by this common one. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:50 -08:00
Arjan van de Ven	e8de1481fd	resource: allow MMIO exclusivity for device drivers Device drivers that use pci_request_regions() (and similar APIs) have a reasonable expectation that they are the only ones accessing their device. As part of the e1000e hunt, we were afraid that some userland (X or some bootsplash stuff) was mapping the MMIO region that the driver thought it had exclusively via /dev/mem or via various sysfs resource mappings. This patch adds the option for device drivers to cause their reserved regions to the "banned from /dev/mem use" list, so now both kernel memory and device-exclusive MMIO regions are banned. NOTE: This is only active when CONFIG_STRICT_DEVMEM is set. In addition to the config option, a kernel parameter iomem=relaxed is provided for the cases where developers want to diagnose, in the field, drivers issues from userspace. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:32 -08:00
Andrew Patterson	2361694191	ACPI/PCI: remove obsolete _OSC capability support functions The acpi_query_osc, __pci_osc_support_set, pci_osc_support_set, and pcie_osc_support_set functions have been obsoleted in favor of setting these capabilities during root bridge discovery with pci_acpi_osc_support. There are no longer any callers of these functions, so remove them. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:32 -08:00
Andrew Patterson	07ae95f988	ACPI/PCI: PCI MSI _OSC support capabilities called when root bridge added The _OSC capability OSC_MSI_SUPPORT is set when the root bridge is added with pci_acpi_osc_support(), so we no longer need to do it in the PCI MSI driver. Also adds the function pci_msi_enabled, which returns true if pci=nomsi is not on the kernel command-line. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:31 -08:00
Andrew Patterson	3e1b16002a	ACPI/PCI: PCIe ASPM _OSC support capabilities called when root bridge added The _OSC capabilities OSC_ACTIVE_STATE_PWR_SUPPORT and OSC_CLOCK_PWR_CAPABILITY_SUPPORT are set when the root bridge is added with pci_acpi_osc_support(), so we no longer need to do it in the ASPM driver. Also add the function pcie_aspm_enabled, which returns true if pcie_aspm=off is not on the kernel command-line. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:29 -08:00
Andrew Patterson	0ef5f8f615	ACPI/PCI: PCI extended config _OSC support called when root bridge added The _OSC capability OSC_EXT_PCI_CONFIG_SUPPORT is set when the root bridge is added with pci_acpi_osc_support() if we can access PCI extended config space. This adds the function pci_ext_cfg_avail which returns true if we can access PCI extended config space (offset greater than 0xff). It currently only returns false if arch=x86 and raw_pci_ext_ops is not set (which might happen if pci=nommcfg is set on the kernel command-line). Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:28 -08:00
Andrew Patterson	990a7ac564	ACPI/PCI: call _OSC support during root bridge discovery Add pci_acpi_osc_support() and call it when a PCI bridge is added. This allows us to avoid having every individual PCI root bridge driver call _OSC support for every root bridge in their probe functions, a significant savings in boot time. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:27 -08:00
Andrew Patterson	8b62091e20	ACPI/PCI: include missing acpi.h file in pci-acpi.h. The pci-acpi.h file will not compile without including linux/acpi.h. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:26 -08:00
Sheng Yang	f7b7baae6b	PCI: add PCI Advanced Feature Capability defines PCI Advanced Features Capability is introduced by "Conventional PCI Advanced Caps ECN" (can be downloaded in pcisig.com). Add defines for the various AF capabilities, including function level reset (FLR). Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:24 -08:00
Inaky Perez-Gonzalez	e306987434	wimax: export linux/wimax.h and linux/wimax/i2400m.h with headers_install These two files are what user space can use to establish communication with the WiMAX kernel API and to speak the Intel 2400m Wireless WiMAX connection's control protocol. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:22 -08:00
Inaky Perez-Gonzalez	ea24652d25	i2400m: host/device procotol and core driver definitions The wimax/i2400m.h defines the structures and constants for the host-device protocols: - boot / firmware upload protocol - general data transport protocol - control protocol It is done in such a way that can also be used verbatim by user space. drivers/net/wimax/i2400m.h defines all the APIs used by the core, bus-generic driver (i2400m) and the bus specific drivers (i2400m-BUSNAME). It also gives a roadmap to the driver implementation. debug-levels.h adds the core driver's debug settings. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:18 -08:00
Inaky Perez-Gonzalez	ea912f4e7f	wimax: debug macros and debug settings for the WiMAX stack This file contains a simple debug framework that is used in the stack; it allows the debug level to be controlled at compile-time (so the debug code is optimized out) and at run-time (for what wasn't compiled out). This is eventually going to be moved to use dynamic_printk(). Just need to find time to do it. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:17 -08:00
Inaky Perez-Gonzalez	ace22f0881	wimax: headers for kernel API and user space interaction Definitions for the user/kernel API protocol through generic netlink. User space can copy it verbatim and use it. Kernel API definition declares the main data types and calls for the drivers to integrate into the WiMAX stack. Provides usage documentation. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:16 -08:00
Inaky Perez-Gonzalez	5e07878787	debugfs: add helpers for exporting a size_t simple value In the same spirit as debugfs_create_*(), introduce helpers for exporting size_t values over debugfs. The only trick done is that the format verifier is kept at %llu instead of %zu; otherwise type warnings would pop up: format ‘%zu’ expects type ‘size_t’, but argument 2 has type ‘long long unsigned int’ There is no real way to fix this one--however, we can consider %llu and %zu to be compatible if we consider that we are using the same for validating in debugfs_create_{x,u}{8,16,32}(). Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:16 -08:00
Greg Kroah-Hartman	34c65d82e0	USB: remove info() macro from usb.h USB should not be having it's own printk macros, so remove info() and use the system-wide standard of dev_info() wherever possible. No one in the tree is using the macro, so it can now be removed. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:14 -08:00
Greg Kroah-Hartman	338b67b0c1	USB: remove warn() macro from usb.h USB should not be having it's own printk macros, so remove warn() and use the system-wide standard of dev_warn() wherever possible. In the few places that will not work out, use a basic printk(). Now that all in-tree users are gone, remove the macro. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:14 -08:00
Alan Stern	25ff1c316f	USB: storage: add last-sector hacks This patch (as1189b) adds some hacks to usb-storage for dealing with the growing problems involving bad capacity values and last-sector accesses: A new flag, US_FL_CAPACITY_OK, is created to indicate that the device is known to report its capacity correctly. An unusual_devs entry for Linux's own File-backed Storage Gadget is added with this flag set, since g_file_storage always reports the correct capacity and since the capacity need not be even (it is determined by the size of the backing file). An entry in unusual_devs.h which has only the CAPACITY_OK flag set shouldn't prejudice libusual, since the device will work perfectly well with either usb-storage or ub. So a new macro, COMPLIANT_DEV, is added to let libusual know about these entries. When a last-sector access succeeds and the total number of sectors is odd (the unexpected case, in which guessing that the number is even might cause trouble), a WARN is triggered. The kerneloops.org project will collect these warnings, allowing us to add CAPACITY_OK flags for the devices in question before implementing the default-to-even heuristic. If users want to prevent the stack dump produced by the WARN, they can disable the hack by adding an unusual_devs entry for their device with the CAPACITY_OK flag. When a last-sector access fails three times in a row and neither the FIX_CAPACITY nor the CAPACITY_OK flag is set, we assume the last-sector bug is present. We replace the existing status and sense data with values that will cause the SCSI core to fail the access immediately rather than retry indefinitely. This should fix the difficulties people have been having with Nokia phones. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:11 -08:00
Oliver Neukum	856395d6e1	USB: extension of anchor API to unpoison an anchor This extension allows unpoisoning an anchor allowing drivers that resubmit URBs to reuse an anchor for methods like resume() Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:11 -08:00
Ming Lei	49367d8f1d	USB: mark "reject" field of struct urb as atomic_t It is enough to protect accesses to reject field of urb by marking it as atomic_t,also it is the only reason of existence of usb_reject_lock,so remove the lock to make code more clean. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:08 -08:00
Alan Stern	3b23dd6f8a	USB: utilize the bus notifiers This patch (as1185) makes usbcore take advantage of the bus notifications sent out by the driver core. Now we can create all our device and interface attribute files before the device or interface uevent is broadcast. A side effect is that we no longer create the endpoint "pseudo" devices at the same time as a device or interface is registered -- it seems like a bad idea to try registering an endpoint before the registration of its parent is complete. So the routines for creating and removing endpoint devices have been split out and renamed, and they are called explicitly when needed. A new bitflag is used for keeping track of whether or not the interface's endpoint devices have been created, since (just as with the interface attributes) they vary with the altsetting and hence can be changed at random times. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:08 -08:00
Bryan Wu	2ffcdb3bda	USB: musb: use new platform data interface of musb to replace old one Signed-off-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:06 -08:00
Alan Stern	65bfd2967c	USB: Enhance usage of pm_message_t This patch (as1177) modifies the USB core suspend and resume routines. The resume functions now will take a pm_message_t argument, so they will know what sort of resume is occurring. The new argument is also passed to the port suspend/resume and bus suspend/resume routines (although they don't use it for anything but debugging). In addition, special pm_message_t values are used for user-initiated, device-initiated (i.e., remote wakeup), and automatic suspend/resume. By testing these values, drivers can tell whether or not a particular suspend was an autosuspend. Unfortunately, they can't do the same for resumes -- not until the pm_message_t argument is also passed to the drivers' resume methods. That will require a bigger change. IMO, the whole Power Management framework should have been set up this way in the first place. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:03 -08:00
Philipp Zabel	68144e0cc9	USB: otg: add otg_put_transceiver() As Russell King points out, calling put_device(otg_transceiver->dev) directly in driver cleanup paths makes assumptions about otg_transceiver internals. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:02 -08:00
Philipp Zabel	6084f1bf0c	USB: otg: gpio_vbus transceiver stub gpio_vbus provides simple GPIO VBUS sensing for peripheral controllers with an internal transceiver. Optionally, a second GPIO can be used to control D+ pullup. It also interfaces with the regulator framework to limit charging currents when powered via USB. gpio_vbus requests the regulator supplying "vbus_draw" and can enable/disable it or limit its current depending on USB state. [dbrownell@users.sourceforge.net: use drivers/otg, cleanups ] Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:02 -08:00
Ben Efros	1537e0ad94	USB: storage devices and SAT Add the SANE SENSE flag to indicate that a device is capable of handling more than 18-bytes of sense data. This functionality is required for USB-ATA bridges implementing SAT. A future patch will actually enable this function for several devices. The logic behind this is that we can detect support for SANE_SENSE in a few ways: 1) ATA PASS THROUGH (12) or (16) execute successfully 2) SPC-3 or higher is in use 3) A previous CHECK CONDITION occurred with sense format 70-73 and had a length greater than 18-bytes total Signed-off-by: Ben Efros <ben@pc-doctor.com> Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:55 -08:00
Pete Zaitcev	f150fa1afb	USB: Allow usbmon as a module even if usbcore is builtin usbmon can only be built as a module if usbcore is a module too. Trivial changes to the relevant Kconfig and Makefile (and a few trivial changes elsewhere) allow usbmon to be built as a module even if usbcore is builtin. This is verified to work in all 9 permutations (3 correctly prohibited by Kconfig, 6 build a suitable result). Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:54 -08:00
Inaky Perez-Gonzalez	dc023dceec	USB: Introduce usb_queue_reset() to do resets from atomic contexts This patch introduces a new call to be able to do a USB reset from an atomic contect. This is quite helpful in USB callbacks to handle errors (when the only thing that can be done is to do a device reset). It is done queuing a work struct that will do the actual reset. The struct is "attached" to an interface so pending requests from an interface are removed when said interface is unbound from the driver. The call flow then becomes: usb_queue_reset_device() __usb_queue_reset_device() [workqueue] usb_reset_device() usb_probe_interface() usb_cancel_queue_reset() [error path] usb_unbind_interface() usb_cancel_queue_reset() usb_driver_release_interface() usb_cancel_queue_reset() Note usb_cancel_queue_reset() needs smarts to try not to unqueue when it is actually being executed. This happens when we run the reset from the workqueue: usb_reset_device() is called and on interface unbind time, usb_cancel_queue_reset() would be called. That would deadlock on cancel_work_sync(). To avoid that, we set (before running usb_reset_device()) usb_intf->reset_running and clear it inmediately after returning. Patch is against 2.6.28-rc2 and depends on http://marc.info/?l=linux-usb&m=122581634925308&w=2 (as submitted by Alan Stern). Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:53 -08:00
Alan Stern	9ac39f28b5	USB: add asynchronous autosuspend/autoresume support This patch (as1160b) adds support routines for asynchronous autosuspend and autoresume, with accompanying documentation updates. There already are several potential users of this interface, and others are likely to arise as autosuspend support becomes more widespread. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:53 -08:00
Harvey Harrison	d767d88875	USB: wusb: annotate association types withe proper endianness Also a trivial annotation in rh.c for: drivers/usb/wusbcore/rh.c:366:9: warning: incorrect type in assignment (different base types) drivers/usb/wusbcore/rh.c:366:9: expected unsigned short [unsigned] [short] [usertype] <noident> drivers/usb/wusbcore/rh.c:366:9: got restricted __le16 [usertype] <noident> drivers/usb/wusbcore/rh.c:367:9: warning: incorrect type in assignment (different base types) drivers/usb/wusbcore/rh.c:367:9: expected unsigned short [unsigned] [short] [usertype] <noident> drivers/usb/wusbcore/rh.c:367:9: got restricted __le16 [usertype] <noident> Association types annotation fixes piles of warnings similar to: drivers/usb/wusbcore/cbaf.c:238:30: warning: incorrect type in initializer (different base types) drivers/usb/wusbcore/cbaf.c:238:30: expected restricted __le16 [usertype] id drivers/usb/wusbcore/cbaf.c:238:30: got int drivers/usb/wusbcore/cbaf.c:238:30: warning: incorrect type in initializer (different base types) drivers/usb/wusbcore/cbaf.c:238:30: expected restricted __le16 [usertype] len drivers/usb/wusbcore/cbaf.c:238:30: got int Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: David Vrabel <david.vrabel@csr.com> Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:51 -08:00
Rodolfo Giometti	b92a78e582	usb host: Oxford OXU210HP HCD driver. This driver implements the support for Oxford OXU210HP USB high-speed host, no peripheral nor OTG. Signed-off-by: Rodolfo Giometti <giometti@linux.it> Cc: Kan Liu <kan.k.liu@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:50 -08:00
Peter Zijlstra	490dea45d0	itimers: remove the per-cpu-ish-ness Either we bounce once cacheline per cpu per tick, yielding n^2 bounces or we just bounce a single.. Also, using per-cpu allocations for the thread-groups complicates the per-cpu allocator in that its currently aimed to be a fixed sized allocator and the only possible extention to that would be vmap based, which is seriously constrained on 32 bit archs. So making the per-cpu memory requirement depend on the number of processes is an issue. Lastly, it didn't deal with cpu-hotplug, although admittedly that might be fixable. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-07 18:52:44 +01:00
Arjan van de Ven	efaee19206	async: make the final inode deletion an asynchronous event this makes "rm -rf" on a (names cached) kernel tree go from 11.6 to 8.6 seconds on an ext3 filesystem Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-01-07 08:47:24 -08:00
Arjan van de Ven	22a9d64567	async: Asynchronous function calls to speed up kernel boot Right now, most of the kernel boot is strictly synchronous, such that various hardware delays are done sequentially. In order to make the kernel boot faster, this patch introduces infrastructure to allow doing some of the initialization steps asynchronously, which will hide significant portions of the hardware delays in practice. In order to not change device order and other similar observables, this patch does NOT do full parallel initialization. Rather, it operates more in the way an out of order CPU does; the work may be done out of order and asynchronous, but the observable effects (instruction retiring for the CPU) are still done in the original sequence. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-01-07 08:45:46 -08:00
Jean Delvare	b305271861	i2c: Drop I2C_CLASS_CAM_DIGITAL There are a number of drivers which set their i2c bus class to I2C_CLASS_CAM_DIGITAL, however no chip driver actually checks for this flag, so we might as well drop it now. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2009-01-07 14:29:17 +01:00
Jean Delvare	994a075f0f	i2c: Drop I2C_CLASS_CAM_ANALOG and I2C_CLASS_SOUND There are no users left of these two i2c probe class flags so we can drop the now. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2009-01-07 14:29:17 +01:00
Jean Delvare	e1995f65be	i2c: Drop I2C_CLASS_ALL I2C_CLASS_ALL is almost never what bus driver authors really want. These i2c classes are really only about which devices must be probed, not what devices can be present. As device drivers get converted to the new i2c device driver model, only a few device types will keep relying on probing. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Sonic Zhang <sonic.zhang@analog.com>	2009-01-07 14:29:16 +01:00
Jani Nikula	d506fc322e	ALSA: Add support for video out to the jack reporting API Add support for reporting new jack types SND_JACK_VIDEOOUT and SND_JACK_AVOUT (a combination of LINEOUT and VIDEOOUT) to the jack reporting API. Also add the corresponding SW_VIDEOOUT_INSERT switch to the input system header. Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2009-01-07 10:55:18 +00:00
Haavard Skinnemoen	52435bfc66	Merge branches 'fixes', 'cleanups' and 'boards'	2009-01-07 11:05:42 +01:00
Linus Torvalds	ede6f5aea0	Fix up 64-bit byte swaps for most 32-bit architectures The __SWAB_64_THRU_32__ case of a 64-bit byte swap was depending on the no-longer-existant ___swab32() method (three underscores). We got rid of some of the worst indirection and complexity, and now it should just use the 32-bit swab function that was defined right above it. Reported-and-tested-by: Nicolas Pitre <nico@cam.org> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 21:17:57 -08:00
Harvey Harrison	637b180c23	byteorder: remove the now unused byteorder.h This implementation caused problems in userspace which can, and does define _both_ __LITTLE_ENDIAN and __BIG_ENDIAN. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 19:45:13 -08:00
Harvey Harrison	991c0e6d1a	byteorder: only use linux/swab.h The first step to make swab.h a regular header that will include an asm/swab.h with arch overrides. Avoid the gratuitous differences introduced in the new linux/swab.h by naming the ___constant_swabXX bits and __fswabXX bits exactly as found in the old implementation in byteorder/swab[b].h Use this new swab.h in byteorder/[big\|little]_endian.h and remove the two old swab headers. Although the inclusion of asm/byteorder.h looks strange in linux/swab.h, this will allow each arch to move the actual arch overrides for the swab bits in an asm file and then the includes can be cleaned up without requiring a flag day for all arches at once. Keep providing __fswabXX in case some userspace was using them directly, but the revised __swabXX should be used instead in any new code and will always do constant folding not dependent on the optimization level, which means the __constant versions can be phased out in-kernel. Arches that use the old-style arch macros will lose their optimized versions until they move to the new style, but at least they will still compile. Many arches have already moved and the patches to move the remaining arches are trivial. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 18:10:26 -08:00
Linus Torvalds	db30c70575	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (29 commits) Input: i8042 - add Dell Vostro 1510 to nomux list Input: gtco - use USB endpoint API Input: add support for Maple controller as a joystick Input: atkbd - broaden the Dell DMI signatures Input: HIL drivers - add MODULE_ALIAS() Input: map_to_7segment.h - convert to __inline__ for userspace Input: add support for enhanced rotary controller on pxa930 and pxa935 Input: add support for trackball on pxa930 and pxa935 Input: add da9034 touchscreen support Input: ads7846 - strict_strtoul takes unsigned long Input: make some variables and functions static Input: add tsc2007 based touchscreen driver Input: psmouse - add module parameters to control OLPC touchpad delays Input: i8042 - add Gigabyte M912 netbook to noloop exception table Input: atkbd - Samsung NC10 key repeat fix Input: atkbd - add keyboard quirk for HP Pavilion ZV6100 laptop Input: libps2 - handle 0xfc responses from devices Input: add support for Wacom W8001 penabled serial touchscreen Input: synaptics - report multi-taps only if supported by the device Input: add joystick driver for Walkera WK-0701 RC transmitter ...	2009-01-06 17:14:01 -08:00
Linus Torvalds	c861ea2cb2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3] Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]" SELinux: shrink sizeof av_inhert selinux_class_perm and context CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2] keys: fix sparse warning by adding __user annotation to cast smack: Add support for unlabeled network hosts and networks selinux: Deprecate and schedule the removal of the the compat_net functionality netlabel: Update kernel configuration API	2009-01-06 17:11:39 -08:00
Linus Torvalds	3610639d1f	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimer: splitout peek ahead functionality, fix hrtimer: fixup comments hrtimer: fix recursion deadlock by re-introducing the softirq hrtimer: simplify hotplug migration hrtimer: fix HOTPLUG_CPU=n compile warning hrtimer: splitout peek ahead functionality	2009-01-06 17:10:53 -08:00
Linus Torvalds	cfa97f993c	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: fix section mismatch sched: fix double kfree in failure path sched: clean up arch_reinit_sched_domains() sched: mark sched_create_sysfs_power_savings_entries() as __init getrusage: RUSAGE_THREAD should return ru_utime and ru_stime sched: fix sched_slice() sched_clock: prevent scd->clock from moving backwards, take #2 sched: sched.c declare variables before they get used	2009-01-06 17:10:33 -08:00
Linus Torvalds	7238eb4ca3	Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: provide irq_to_desc() to non-genirq architectures too	2009-01-06 17:10:19 -08:00
Linus Torvalds	f94181da71	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: fix rcutorture bug rcu: eliminate synchronize_rcu_xxx macro rcu: make treercu safe for suspend and resume rcu: fix rcutree grace-period-latency bug on small systems futex: catch certain assymetric (get\|put)_futex_key calls futex: make futex_(get\|put)_key() calls symmetric locking, percpu counters: introduce separate lock classes swiotlb: clean up EXPORT_SYMBOL usage swiotlb: remove unnecessary declaration swiotlb: replace architecture-specific swiotlb.h with linux/swiotlb.h swiotlb: add support for systems with highmem swiotlb: store phys address in io_tlb_orig_addr array swiotlb: add hwdev to swiotlb_phys_to_bus() / swiotlb_sg_to_bus()	2009-01-06 17:10:04 -08:00
Linus Torvalds	40d7ee5d16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (60 commits) uio: make uio_info's name and version const UIO: Documentation for UIO ioport info handling UIO: Pass information about ioports to userspace (V2) UIO: uio_pdrv_genirq: allow custom irq_flags UIO: use pci_ioremap_bar() in drivers/uio arm: struct device - replace bus_id with dev_name(), dev_set_name() libata: struct device - replace bus_id with dev_name(), dev_set_name() avr: struct device - replace bus_id with dev_name(), dev_set_name() block: struct device - replace bus_id with dev_name(), dev_set_name() chris: struct device - replace bus_id with dev_name(), dev_set_name() dmi: struct device - replace bus_id with dev_name(), dev_set_name() gadget: struct device - replace bus_id with dev_name(), dev_set_name() gpio: struct device - replace bus_id with dev_name(), dev_set_name() gpu: struct device - replace bus_id with dev_name(), dev_set_name() hwmon: struct device - replace bus_id with dev_name(), dev_set_name() i2o: struct device - replace bus_id with dev_name(), dev_set_name() IA64: struct device - replace bus_id with dev_name(), dev_set_name() i7300_idle: struct device - replace bus_id with dev_name(), dev_set_name() infiniband: struct device - replace bus_id with dev_name(), dev_set_name() ISDN: struct device - replace bus_id with dev_name(), dev_set_name() ...	2009-01-06 17:02:07 -08:00
Linus Torvalds	5fec8bdbf9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: clean up annotations of fc->lock fuse: fix sparse warning in ioctl fuse: update interface version fuse: add fuse_conn->release() fuse: separate out fuse_conn_init() from new_conn() fuse: add fuse_ prefix to several functions fuse: implement poll support fuse: implement unsolicited notification fuse: add file kernel handle fuse: implement ioctl support fuse: don't let fuse_req->end() put the base reference fuse: move FUSE_MINOR to miscdevice.h fuse: style fixes	2009-01-06 17:01:20 -08:00
Linus Torvalds	59e3af21e9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (41 commits) scc_pata: make use of scc_dma_sff_read_status() ide-dma-sff: factor out ide_dma_sff_write_status() ide: move read_sff_dma_status() method to 'struct ide_dma_ops' ide: don't set hwif->dma_ops in init_dma() method Resurrect IT8172 IDE controller driver piix: sync ich_laptop[] with ata_piix.c ide: update warm-plug HOWTO ide: fix ide_port_scan() to do ACPI setup after initializing request queues ide: remove now redundant ->cur_dev checks ide: remove unused ide_hwif_t.sg_mapped field ide: struct ide_atapi_pc - remove unused fields and update documentation ide: remove superfluous hwif variable assignment from ide_timer_expiry() ide: use ide_pci_is_in_compatibility_mode() helper in setup-pci.c ide: make "paranoia" ->handler check in ide_intr() more strict ide-cd: convert to ide-atapi facilities ide-cd: start DMA before sending the actual packet command ide-cd: wait for DRQ to get set per default ide: Fix drive's DWORD-IO handling ide: add port and host iterators ide: dynamic allocation of device structures ...	2009-01-06 17:00:50 -08:00
WANG Cong	8cd3ac3aca	fs/exec.c: make do_coredump() void No one cares do_coredump()'s return value, and also it seems that it is also not necessary. So make it void. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: WANG Cong <wangcong@zeuux.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:29 -08:00
Randy Dunlap	d3635abfee	rapidio: remove excess kernel-doc notation Remove excess kernel-doc notation from rio header and driver: Warning(include/linux/rio_drv.h:399): Excess function parameter or struct member 'buffer' description in 'rio_get_inb_message' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:28 -08:00
David Brownell	cabb3fc4bd	twl4030-gpio: cleanup debounce Provide a static debounce configuration mechanism for twl4030 GPIOs, replacing the previous dynamic one. The single user of that mechanism was for MMC card detect debouncing. Boards can provide a bitmask saying which GPIOs to debounce (30 msec). It's always enabled for pins with the MMC card-detect/VMMCx link active, so most boards won't need to set the debounce mask. This is a net code shrink, including runtime footprint. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:25 -08:00
Ian Kent	a92daf6ba1	autofs4: make autofs type usage explicit - the type assigned at mount when no type is given is changed from 0 to AUTOFS_TYPE_INDIRECT. This was done because 0 and AUTOFS_TYPE_INDIRECT were being treated implicitly as the same type. - previously, an offset mount had it's type set to AUTOFS_TYPE_DIRECT\|AUTOFS_TYPE_OFFSET but the mount control re-implementation needs to be able distinguish all three types. So this was changed to make the type setting explicit. - a type AUTOFS_TYPE_ANY was added for use by the re-implementation when checking if a given path is a mountpoint. It's not really a type as we use this to ask if a given path is a mountpoint in the autofs_dev_ioctl_ismountpoint() function. - functions to set and test the autofs mount types have been added to improve readability and make the type usage explicit. - the mount type is used from user space for the mount control re-implementtion so, for consistency, all the definitions have been moved to the user space include file include/linux/auto_fs4.h. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:23 -08:00
Ian Kent	730c9eeca9	autofs4: improve parameter usage The parameter usage in the device node ioctl code uses arg1 and arg2 as parameter names. This patch redefines the parameter names to reflect what they actually are in an effort to make the code more readable. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:23 -08:00
Masami Hiramatsu	e8386a0cb2	kprobes: support probing module __exit function Allows kprobes to probe __exit routine. This adds flags member to struct kprobe. When module is freed(kprobes hooks module_notifier to get this event), kprobes which probe the functions in that module are set to "Gone" flag to the flags member. These "Gone" probes are never be enabled. Users can check the GONE flag through debugfs. This also removes mod_refcounted, because we couldn't free a module if kprobe incremented the refcount of that module. [akpm@linux-foundation.org: document some locking] [mhiramat@redhat.com: bugfix: pass aggr_kprobe to arch_remove_kprobe] [mhiramat@redhat.com: bugfix: release old_p's insn_slot before error return] Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:21 -08:00
Masami Hiramatsu	1294156078	kprobes: add kprobe_insn_mutex and cleanup arch_remove_kprobe() Add kprobe_insn_mutex for protecting kprobe_insn_pages hlist, and remove kprobe_mutex from architecture dependent code. This allows us to call arch_remove_kprobe() (and free_insn_slot) while holding kprobe_mutex. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:20 -08:00
Masami Hiramatsu	a06f6211ef	module: add within_module_core() and within_module_init() This series of patches allows kprobes to probe module's __init and __exit functions. This means, you can probe driver initialization and terminating. Currently, kprobes can't probe __init function because these functions are freed after module initialization. And it also can't probe module __exit functions because kprobe increments reference count of target module and user can't unload it. this means __exit functions never be called unless removing probes from the module. To solve both cases, this series of patches introduces GONE flag and sets it when the target code is freed(for this purpose, kprobes hooks MODULE_STATE_* events). This also removes refcount incrementing for allowing user to unload target module. Users can check which probes are GONE by debugfs interface. For taking timing of freeing module's .init text, these also include a patch which adds module's notifier of MODULE_STATE_LIVE event. This patch: Add within_module_core() and within_module_init() for checking whether an address is in the module .init.text section or .text section, and replace within() local inline functions in kernel/module.c with them. kprobes uses these functions to check where the kprobe is inserted. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:20 -08:00
David Brownell	d29389de0b	spi_gpio driver Generalize the old at91rm9200 "bootstrap" bitbanging SPI master driver as "spi_gpio", so it works with arbitrary GPIOs and can be configured through platform_data. Such SPI masters support: - any number of bus instances (bus_num is the platform_device.id) - any number of chipselects (one GPIO per spi_device) - all four SPI_MODE values, and SPI_CS_HIGH - i/o word sizes from 1 to 32 bits; - devices configured as with any other spi_master controller When configured using platform_data, this provides relatively low clock rates. On platforms that support inlined GPIO calls, significantly improved transfer speeds are also possible with a semi-custom driver. (It's still painful when accessing flash memory, but less so.) Sanity checked by using this version to replace both native controllers on a board with six different SPI slaves, relying on three different SPI_MODE_* values and both SPI_CS_HIGH settings for correct operation. [akpm@linux-foundation.org: cleanups] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Magnus Damm <damm@igel.co.jp> Tested-by: Magnus Damm <damm@igel.co.jp> Cc: Torgil Svensson <torgil.svensson@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:19 -08:00
Hiroshi Shimamoto	5cf0cc4e67	binfmts.h: include list.h linux_binfmt uses list_head, so list.h is needed. [akpm@linux-foundation.org: fix `make headerscheck'] Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:19 -08:00
Jesper Juhl	8c3659347e	include/linux/interrupt.h: do not include linux/irqnr.h twice Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:14 -08:00
Eric Dumazet	179f7ebff6	percpu_counter: FBC_BATCH should be a variable For NR_CPUS >= 16 values, FBC_BATCH is 2NR_CPUS Considering more and more distros are using high NR_CPUS values, it makes sense to use a more sensible value for FBC_BATCH, and get rid of NR_CPUS. A sensible value is 2num_online_cpus(), with a minimum value of 32 (This minimum value helps branch prediction in __percpu_counter_add()) We already have a hotcpu notifier, so we can adjust FBC_BATCH dynamically. We rename FBC_BATCH to percpu_counter_batch since its not a constant anymore. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:13 -08:00
Tejun Heo	5f820f648c	poll: allow f_op->poll to sleep f_op->poll is the only vfs operation which is not allowed to sleep. It's because poll and select implementation used task state to synchronize against wake ups, which doesn't have to be the case anymore as wait/wake interface can now use custom wake up functions. The non-sleep restriction can be a bit tricky because ->poll is not called from an atomic context and the result of accidentally sleeping in ->poll only shows up as temporary busy looping when the timing is right or rather wrong. This patch converts poll/select to use custom wake up function and use separate triggered variable to synchronize against wake up events. The only added overhead is an extra function call during wake up and negligible. This patch removes the one non-sleep exception from vfs locking rules and is beneficial to userland filesystem implementations like FUSE, 9p or peculiar fs like spufs as it's very difficult for those to implement non-sleeping poll method. While at it, make the following cosmetic changes to make poll.h and select.c checkpatch friendly. * s/type * symbol/type symbol/ : three places in poll.h remove blank line before EXPORT_SYMBOL() : two places in select.c Oleg: spotted missing barrier in poll_schedule_timeout() Davide: spotted missing write barrier in pollwake() Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Ingo Molnar <mingo@elte.hu> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Brad Boyer <flar@allandria.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Roland McGrath <roland@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:12 -08:00
Darrick J. Wong	9fe06081ef	Create a DIV_ROUND_CLOSEST macro to do division with rounding Create a helper macro to divide two numbers and round the result to the nearest whole number. This is a helper macro for hwmon drivers that want to convert incoming sysfs values per standard hwmon practice, though the macro itself can be used by anyone. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:12 -08:00
Alexey Dobriyan	f1883f86de	Remove remaining unwinder code Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Gabor Gombas <gombasg@sztaki.hu> Cc: Jan Beulich <jbeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:11 -08:00
Matthew Wilcox	ea43546750	atomic_t: unify all arch definitions The atomic_t type cannot currently be used in some header files because it would create an include loop with asm/atomic.h. Move the type definition to linux/types.h to break the loop. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:10 -08:00
Oleg Nesterov	901608d904	mm: introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting xacct_add_tsk() relies on do_exit()->update_hiwater_xxx() and uses mm->hiwater_xxx directly, this leads to 2 problems: - taskstats_user_cmd() can call fill_pid()->xacct_add_tsk() at any moment before the task exits, so we should check the current values of rss/vm anyway. - do_exit()->update_hiwater_xxx() calls are racy. An exiting thread can be preempted right before mm->hiwater_xxx = new_val, and another thread can use A_LOT of memory and exit in between. When the first thread resumes it can be the last thread in the thread group, in that case we report the wrong hiwater_xxx values which do not take A_LOT into account. Introduce get_mm_hiwater_rss() and get_mm_hiwater_vm() helpers and change xacct_add_tsk() to use them. The first helper will also be used by rusage->ru_maxrss accounting. Kill do_exit()->update_hiwater_xxx() calls. Unless we are going to decrease rss/vm there is no point to update mm->hiwater_xxx, and nobody can look at this mm_struct when exit_mmap() actually unmaps the memory. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Hugh Dickins <hugh@veritas.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:09 -08:00
Nick Piggin	856bf4d717	fs: sys_sync fix s_syncing livelock avoidance was breaking data integrity guarantee of sys_sync, by allowing sys_sync to skip writing or waiting for superblocks if there is a concurrent sys_sync happening. This livelock avoidance is much less important now that we don't have the get_super_to_sync() call after every sb that we sync. This was replaced by __put_super_and_need_restart. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:09 -08:00
Nick Piggin	4f5a99d64c	fs: remove WB_SYNC_HOLD Remove WB_SYNC_HOLD. The primary motiviation is the design of my anti-starvation code for fsync. It requires taking an inode lock over the sync operation, so we could run into lock ordering problems with multiple inodes. It is possible to take a single global lock to solve the ordering problem, but then that would prevent a future nice implementation of "sync multiple inodes" based on lock order via inode address. Seems like a backward step to remove this, but actually it is busted anyway: we can't use the inode lists for data integrity wait: an inode can be taken off the dirty lists but still be under writeback. In order to satisfy data integrity semantics, we should wait for it to finish writeback, but if we only search the dirty lists, we'll miss it. It would be possible to have a "writeback" list, for sys_sync, I suppose. But why complicate things by prematurely optimise? For unmounting, we could avoid the "livelock avoidance" code, which would be easier, but again premature IMO. Fixing the existing data integrity problem will come next. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:09 -08:00
Hugh Dickins	edc315fd22	badpage: remove vma from page_remove_rmap Remove page_remove_rmap()'s vma arg, which was only for the Eeek message. And remove the BUG_ON(page_mapcount(page) == 0) from CONFIG_DEBUG_VM's page_dup_rmap(): we're trying to be more resilient about that than BUGs. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:07 -08:00
Hugh Dickins	2509ef26db	badpage: zap print_bad_pte on swap and file Complete zap_pte_range()'s coverage of bad pagetable entries by calling print_bad_pte() on a pte_file in a linear vma and on a bad swap entry. That needs free_swap_and_cache() to tell it, which will also have shown one of those "swap_free" errors (but with much less information). Similar checks in fork's copy_one_pte()? No, that would be more noisy than helpful: we'll see them when parent and child exec or exit. Where do_nonlinear_fault() calls print_bad_pte(): omit !VM_CAN_NONLINEAR case, that could only be a bug in sys_remap_file_pages(), not a bad pte. VM_FAULT_OOM rather than VM_FAULT_SIGBUS? Well, okay, that is consistent with what happens if do_swap_page() operates a bad swap entry; but don't we have patches to be more careful about killing when VM_FAULT_OOM? Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:07 -08:00
Hugh Dickins	79f4b7bf39	badpage: simplify page_alloc flag check+clear Simplify the PAGE_FLAGS checking and clearing when freeing and allocating a page: check the same flags as before when freeing, clear ALL the flags (unless PageReserved) when freeing, check ALL flags off when allocating. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:07 -08:00
Hugh Dickins	20137a490f	swapfile: swapon randomize if nonrot Swap allocation has always started from the beginning of the swap area; but if we're dealing with a solidstate swap device which can only remap blocks within limited zones, that would sooner wear out the first zone. Therefore sys_swapon() test whether blk_queue is non-rotational, and if so randomize the cluster_next starting position for allocation. If blk_queue is nonrot, note SWP_SOLIDSTATE for later use, and report it with an "SS" at the right end of the kernel's "Adding ... swap" message (so that if it's both nonrot and discardable, "SSD" will be shown there). Perhaps something should be shown in /proc/swaps (swapon -s), but we have to be more cautious before making any addition to that format. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Joern Engel <joern@logfs.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Donjun Shin <djshin90@gmail.com> Cc: Tejun Heo <teheo@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	7992fde72c	swapfile: swap allocation use discard When scan_swap_map() finds a free cluster of swap pages to allocate, discard the old contents of the cluster if the device supports discard. But don't bother when swap is so fragmented that we allocate single pages. Be careful about racing allocations made while we're scanning for a cluster; and hold up allocations made while we're discarding. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Joern Engel <joern@logfs.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Donjun Shin <djshin90@gmail.com> Cc: Tejun Heo <teheo@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	6a6ba83175	swapfile: swapon use discard (trim) When adding swap, all the old data on swap can be forgotten: sys_swapon() discard all but the header page of the swap partition (or every extent but the header of the swap file), to give a solidstate swap device the opportunity to optimize its wear-levelling. If that succeeds, note SWP_DISCARDABLE for later use, and report it with a "D" at the right end of the kernel's "Adding ... swap" message. Perhaps something should be shown in /proc/swaps (swapon -s), but we have to be more cautious before making any addition to that format. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Joern Engel <joern@logfs.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Donjun Shin <djshin90@gmail.com> Cc: Tejun Heo <teheo@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	ebebbbe904	swapfile: rearrange scan and swap_info Before making functional changes, rearrange scan_swap_map() to simplify subsequent diffs. Actually, there is one functional change in there: leave cluster_nr negative while scanning for a new cluster - resetting it early increased the likelihood that when we have difficulty finding a free cluster, another task may come in and try doing exactly the same - just a waste of cpu. Before making functional changes, rearrange struct swap_info_struct slightly: flags will be needed as an unsigned long (for wait_on_bit), next is a good int to pair with prio, old_block_size is uninteresting so shift it to the end. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	22c6f8fdb3	swapfile: remove SWP_ACTIVE mask Remove the SWP_ACTIVE mask: it just obscures the SWP_WRITEOK flag. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
KOSAKI Motohiro	69beeb1d34	mm: make vread() and vwrite() declaration Sparse output following warnings. mm/vmalloc.c:1436:6: warning: symbol 'vread' was not declared. Should it be static? mm/vmalloc.c:1474:6: warning: symbol 'vwrite' was not declared. Should it be static? However, it is used by /dev/kmem. fixed here. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	b962716b45	mm: optimize get_scan_ratio for no swap Rik suggests a simplified get_scan_ratio() for !CONFIG_SWAP. Yes, the gcc optimizer gives us that, when nr_swap_pages is #defined as 0L. Move usual declaration to swapfile.c: it never belonged in page_alloc.c. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:04 -08:00
Hugh Dickins	60371d971a	mm: add add_to_swap stub If we add a failing stub for add_to_swap(), then we can remove the #ifdef CONFIG_SWAP from mm/vmscan.c. This was intended as a source cleanup, but looking more closely, it turns out that the !CONFIG_SWAP case was going to keep_locked for an anonymous page, whereas now it goes to the more suitable activate_locked, like the CONFIG_SWAP nr_swap_pages 0 case. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:04 -08:00
Hugh Dickins	ac47b003d0	mm: remove gfp_mask from add_to_swap Remove gfp_mask argument from add_to_swap(): it's misleading because its only caller, shrink_page_list(), is not atomic at that point; and in due course (implementing discard) we'll sometimes want to allocate some memory with GFP_NOIO (as is used in swap_writepage) when allocating swap. No change to the gfp_mask passed down to add_to_swap_cache(): still use __GFP_HIGH without __GFP_WAIT (with nomemalloc and nowarn as before): though it's not obvious if that's the best combination to ask for here. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:04 -08:00
Hugh Dickins	a2c43eed83	mm: try_to_free_swap replaces remove_exclusive_swap_page remove_exclusive_swap_page(): its problem is in living up to its name. It doesn't matter if someone else has a reference to the page (raised page_count); it doesn't matter if the page is mapped into userspace (raised page_mapcount - though that hints it may be worth keeping the swap): all that matters is that there be no more references to the swap (and no writeback in progress). swapoff (try_to_unuse) has been removing pages from swapcache for years, with no concern for page count or page mapcount, and we used to have a comment in lookup_swap_cache() recognizing that: if you go for a page of swapcache, you'll get the right page, but it could have been removed from swapcache by the time you get page lock. So, give up asking for exclusivity: get rid of remove_exclusive_swap_page(), and remove_exclusive_swap_page_ref() and remove_exclusive_swap_page_count() which were spawned for the recent LRU work: replace them by the simpler try_to_free_swap() which just checks page_swapcount(). Similarly, remove the page_count limitation from free_swap_and_count(), but assume that it's worth holding on to the swap if page is mapped and swap nowhere near full. Add a vm_swap_full() test in free_swap_cache()? It would be consistent, but I think we probably have enough for now. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:03 -08:00
Hugh Dickins	7b1fe59793	mm: reuse_swap_page replaces can_share_swap_page A good place to free up old swap is where do_wp_page(), or do_swap_page(), is about to redirty the page: the data on disk is then stale and won't be read again; and if we do decide to write the page out later, using the previous swap location makes an unnecessary disk seek very likely. So give can_share_swap_page() the side-effect of delete_from_swap_cache() when it safely can. And can_share_swap_page() was always a misleading name, the more so if it has a side-effect: rename it reuse_swap_page(). Irrelevant cleanup nearby: remove swap_token_default_timeout definition from swap.h: it's used nowhere. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:03 -08:00
David Rientjes	2da02997e0	mm: add dirty_background_bytes and dirty_bytes sysctls This change introduces two new sysctls to /proc/sys/vm: dirty_background_bytes and dirty_bytes. dirty_background_bytes is the counterpart to dirty_background_ratio and dirty_bytes is the counterpart to dirty_ratio. With growing memory capacities of individual machines, it's no longer sufficient to specify dirty thresholds as a percentage of the amount of dirtyable memory over the entire system. dirty_background_bytes and dirty_bytes specify quantities of memory, in bytes, that represent the dirty limits for the entire system. If either of these values is set, its value represents the amount of dirty memory that is needed to commence either background or direct writeback. When a `bytes' or `ratio' file is written, its counterpart becomes a function of the written value. For example, if dirty_bytes is written to be 8096, 8K of memory is required to commence direct writeback. dirty_ratio is then functionally equivalent to 8K / the amount of dirtyable memory: dirtyable_memory = free pages + mapped pages + file cache dirty_background_bytes = dirty_background_ratio * dirtyable_memory -or- dirty_background_ratio = dirty_background_bytes / dirtyable_memory AND dirty_bytes = dirty_ratio * dirtyable_memory -or- dirty_ratio = dirty_bytes / dirtyable_memory Only one of dirty_background_bytes and dirty_background_ratio may be specified at a time, and only one of dirty_bytes and dirty_ratio may be specified. When one sysctl is written, the other appears as 0 when read. The `bytes' files operate on a page size granularity since dirty limits are compared with ZVC values, which are in page units. Prior to this change, the minimum dirty_ratio was 5 as implemented by get_dirty_limits() although /proc/sys/vm/dirty_ratio would show any user written value between 0 and 100. This restriction is maintained, but dirty_bytes has a lower limit of only one page. Also prior to this change, the dirty_background_ratio could not equal or exceed dirty_ratio. This restriction is maintained in addition to restricting dirty_background_bytes. If either background threshold equals or exceeds that of the dirty threshold, it is implicitly set to half the dirty threshold. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:03 -08:00
David Rientjes	364aeb2849	mm: change dirty limit type specifiers to unsigned long The background dirty and dirty limits are better defined with type specifiers of unsigned long since negative writeback thresholds are not possible. These values, as returned by get_dirty_limits(), are normally compared with ZVC values to determine whether writeback shall commence or be throttled. Such page counts cannot be negative, so declaring the page limits as signed is unnecessary. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	2afd1c928f	mm: make page_lock_anon_vma() static page_lock_anon_vma() and page_unlock_anon_vma() were made available to show_page_path() in vmscan.c; but now that has been removed, make them static in rmap.c again, they're better kept private if possible. Signed-off-by: Hugh Dickins <hugh@veritas.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	b5934c5318	mm: add_active_or_unevictable into rmap lru_cache_add_active_or_unevictable() and page_add_new_anon_rmap() always appear together. Save some symbol table space and some jumping around by removing lru_cache_add_active_or_unevictable(), folding its code into page_add_new_anon_rmap(): like how we add file pages to lru just after adding them to page cache. Remove the nearby "TODO: is this safe?" comments (yes, it is safe), and change page_add_new_anon_rmap()'s address BUG_ON to VM_BUG_ON as originally intended. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	6d91add09f	mm: add Set,ClearPageSwapCache stubs If we add NOOP stubs for SetPageSwapCache() and ClearPageSwapCache(), then we can remove the #ifdef CONFIG_SWAPs from mm/migrate.c. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	3c1d43787b	mm: remove GFP_HIGHUSER_PAGECACHE GFP_HIGHUSER_PAGECACHE is just an alias for GFP_HIGHUSER_MOVABLE, making that harder to track down: remove it, and its out-of-work brothers GFP_NOFS_PAGECACHE and GFP_USER_PAGECACHE. Since we're making that improvement to hotremove_migrate_alloc(), I think we can now also remove one of the "o"s from its comment. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:01 -08:00
Hugh Dickins	e5991371ee	mm: remove cgroup_mm_owner_callbacks cgroup_mm_owner_callbacks() was brought in to support the memrlimit controller, but sneaked into mainline ahead of it. That controller has now been shelved, and the mm_owner_changed() args were inadequate for it anyway (they needed an mm pointer instead of a task pointer). Remove the dead code, and restore mm_update_next_owner() locking to how it was before: taking mmap_sem there does nothing for memcontrol.c, now the only user of mm->owner. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:01 -08:00
KOSAKI Motohiro	64cdd548ff	mm: cleanup: remove #ifdef CONFIG_MIGRATION #ifdef in *.c file decrease source readability a bit. removing is better. This patch doesn't have any functional change. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
KOSAKI Motohiro	1b0bd11886	mm: get rid of pagevec_release_nonlru() speculative page references patch (commit: `e286781d5f`) removed last pagevec_release_nonlru() caller. So this function can be removed now. This patch doesn't have any functional change. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
Gary Hade	c04fc586c1	mm: show node to memory section relationship with symlinks in sysfs Show node to memory section relationship with symlinks in sysfs Add /sys/devices/system/node/nodeX/memoryY symlinks for all the memory sections located on nodeX. For example: /sys/devices/system/node/node1/memory135 -> ../../memory/memory135 indicates that memory section 135 resides on node1. Also revises documentation to cover this change as well as updating Documentation/ABI/testing/sysfs-devices-memory to include descriptions of memory hotremove files 'phys_device', 'phys_index', and 'state' that were previously not described there. In addition to it always being a good policy to provide users with the maximum possible amount of physical location information for resources that can be hot-added and/or hot-removed, the following are some (but likely not all) of the user benefits provided by this change. Immediate: - Provides information needed to determine the specific node on which a defective DIMM is located. This will reduce system downtime when the node or defective DIMM is swapped out. - Prevents unintended onlining of a memory section that was previously offlined due to a defective DIMM. This could happen during node hot-add when the user or node hot-add assist script onlines _all_ offlined sections due to user or script inability to identify the specific memory sections located on the hot-added node. The consequences of reintroducing the defective memory could be ugly. - Provides information needed to vary the amount and distribution of memory on specific nodes for testing or debugging purposes. Future: - Will provide information needed to identify the memory sections that need to be offlined prior to physical removal of a specific node. Symlink creation during boot was tested on 2-node x86_64, 2-node ppc64, and 2-node ia64 systems. Symlink creation during physical memory hot-add tested on a 2-node x86_64 system. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
David Rientjes	75aa199410	oom: print triggering task's cpuset and mems allowed When cpusets are enabled, it's necessary to print the triggering task's set of allowable nodes so the subsequently printed meminfo can be interpreted correctly. We also print the task's cpuset name for informational purposes. [rientjes@google.com: task lock current before dereferencing cpuset] Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:59 -08:00
Nick Piggin	1c0fe6e3bd	mm: invoke oom-killer from page fault Rather than have the pagefault handler kill a process directly if it gets a VM_FAULT_OOM, have it call into the OOM killer. With increasingly sophisticated oom behaviour (cpusets, memory cgroups, oom killing throttling, oom priority adjustment or selective disabling, panic on oom, etc), it's silly to unconditionally kill the faulting process at page fault time. Create a hook for pagefault oom path to call into instead. Only converted x86 and uml so far. [akpm@linux-foundation.org: make __out_of_memory() static] [akpm@linux-foundation.org: fix comment] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jeff Dike <jdike@addtoit.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:58 -08:00
Mel Gorman	3340289ddf	mm: report the MMU pagesize in /proc/pid/smaps The KernelPageSize entry in /proc/pid/smaps is the pagesize used by the kernel to back a VMA. This matches the size used by the MMU in the majority of cases. However, one counter-example occurs on PPC64 kernels whereby a kernel using 64K as a base pagesize may still use 4K pages for the MMU on older processor. To distinguish, this patch reports MMUPageSize as the pagesize used by the MMU in /proc/pid/smaps. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: "KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:58 -08:00
Mel Gorman	08fba69986	mm: report the pagesize backing a VMA in /proc/pid/smaps It is useful to verify a hugepage-aware application is using the expected pagesizes for its memory regions. This patch creates an entry called KernelPageSize in /proc/pid/smaps that is the size of page used by the kernel to back a VMA. The entry is not called PageSize as it is possible the MMU uses a different size. This extension should not break any sensible parser that skips lines containing unrecognised information. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: "KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:58 -08:00
James Morris	ac8cc0fa53	Merge branch 'next' into for-linus	2009-01-07 09:58:22 +11:00
David Howells	3699c53c48	CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3 ] Fix a regression in cap_capable() due to: commit `3b11a1dece` Author: David Howells <dhowells@redhat.com> Date: Fri Nov 14 10:39:26 2008 +1100 CRED: Differentiate objective and effective subjective credentials on a task The problem is that the above patch allows a process to have two sets of credentials, and for the most part uses the subjective credentials when accessing current's creds. There is, however, one exception: cap_capable(), and thus capable(), uses the real/objective credentials of the target task, whether or not it is the current task. Ordinarily this doesn't matter, since usually the two cred pointers in current point to the same set of creds. However, sys_faccessat() makes use of this facility to override the credentials of the calling process to make its test, without affecting the creds as seen from other processes. One of the things sys_faccessat() does is to make an adjustment to the effective capabilities mask, which cap_capable(), as it stands, then ignores. The affected capability check is in generic_permission(): if (!(mask & MAY_EXEC) \|\| execute_ok(inode)) if (capable(CAP_DAC_OVERRIDE)) return 0; This change passes the set of credentials to be tested down into the commoncap and SELinux code. The security functions called by capable() and has_capability() select the appropriate set of credentials from the process being checked. This can be tested by compiling the following program from the XFS testsuite: /* * t_access_root.c - trivial test program to show permission bug. * * Written by Michael Kerrisk - copyright ownership not pursued. * Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html / #include <limits.h> #include <unistd.h> #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <sys/stat.h> #define UID 500 #define GID 100 #define PERM 0 #define TESTPATH "/tmp/t_access" static void errExit(char msg) { perror(msg); exit(EXIT_FAILURE); } /* errExit / static void accessTest(char file, int mask, char mstr) { printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask)); } / accessTest / int main(int argc, char argv[]) { int fd, perm, uid, gid; char testpath; char cmd[PATH_MAX + 20]; testpath = (argc > 1) ? argv[1] : TESTPATH; perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM; uid = (argc > 3) ? atoi(argv[3]) : UID; gid = (argc > 4) ? atoi(argv[4]) : GID; unlink(testpath); fd = open(testpath, O_RDWR \| O_CREAT, 0); if (fd == -1) errExit("open"); if (fchown(fd, uid, gid) == -1) errExit("fchown"); if (fchmod(fd, perm) == -1) errExit("fchmod"); close(fd); snprintf(cmd, sizeof(cmd), "ls -l %s", testpath); system(cmd); if (seteuid(uid) == -1) errExit("seteuid"); accessTest(testpath, 0, "0"); accessTest(testpath, R_OK, "R_OK"); accessTest(testpath, W_OK, "W_OK"); accessTest(testpath, X_OK, "X_OK"); accessTest(testpath, R_OK \| W_OK, "R_OK \| W_OK"); accessTest(testpath, R_OK \| X_OK, "R_OK \| X_OK"); accessTest(testpath, W_OK \| X_OK, "W_OK \| X_OK"); accessTest(testpath, R_OK \| W_OK \| X_OK, "R_OK \| W_OK \| X_OK"); exit(EXIT_SUCCESS); } / main */ This can be run against an Ext3 filesystem as well as against an XFS filesystem. If successful, it will show: [root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043 ---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx access(/tmp/xxx, 0) returns 0 access(/tmp/xxx, R_OK) returns 0 access(/tmp/xxx, W_OK) returns 0 access(/tmp/xxx, X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK) returns 0 access(/tmp/xxx, R_OK \| X_OK) returns -1 access(/tmp/xxx, W_OK \| X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK \| X_OK) returns -1 If unsuccessful, it will show: [root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043 ---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx access(/tmp/xxx, 0) returns 0 access(/tmp/xxx, R_OK) returns -1 access(/tmp/xxx, W_OK) returns -1 access(/tmp/xxx, X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK) returns -1 access(/tmp/xxx, R_OK \| X_OK) returns -1 access(/tmp/xxx, W_OK \| X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK \| X_OK) returns -1 I've also tested the fix with the SELinux and syscalls LTP testsuites. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-01-07 09:38:48 +11:00
James Morris	29881c4502	Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2 ]" This reverts commit `14eaddc967`. David has a better version to come.	2009-01-07 09:21:54 +11:00
Oliver Hartkopp	1fa17d4ba4	can: omit unneeded skb_clone() calls The AF_CAN core delivered always cloned sk_buffs to the AF_CAN protocols, although this was _only_ needed by the can-raw protocol. With this (additionally documented) change, the AF_CAN core calls the callback functions of the registered AF_CAN protocols with the original (uncloned) sk_buff pointer and let's the can-raw protocol do the skb_clone() itself which omits all unneeded skb_clone() calls for other AF_CAN protocols. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-06 11:07:54 -08:00
Herbert Xu	e1c096e251	vlan: Add GRO interfaces This patch adds GRO interfaces for hardware-assisted VLAN reception. With this in place we're now at parity with LRO as far as the interface is concerned. That is, you can now take any LRO driver and convert it over to GRO. As the CB memory clashes with GRO's use of CB, I've removed it entirely by storing dev in skb->dev. This is OK because VLAN gets called first thing in netif_receive_skb and skb->dev is not used in between us calling netif_rx and netif_receive_skb getting called. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-06 10:50:09 -08:00
Herbert Xu	96e93eab20	gro: Add internal interfaces for VLAN Previously GRO's only entry point from the outside is through napi_gro_receive and napi_gro_frags. These interfaces are for device drivers. This patch rearranges things to provide a new set of interfaces for VLANs. These interfaces are for internal use only. The VLAN code itself can then provide a set of entry points for device drivers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-06 10:49:34 -08:00
Stephen Rothwell	b8ac9fc0e8	uio: make uio_info's name and version const These are only ever assigned constant strings and never modified. This was noticed because Wolfram Sang needed to cast the result of of_get_property() in order to assign it to the name field of a struct uio_info. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:44 -08:00
Hans J. Koch	e70c412ee4	UIO: Pass information about ioports to userspace (V2) Devices sometimes have memory where all or parts of it can not be mapped to userspace. But it might still be possible to access this memory from userspace by other means. An example are PCI cards that advertise not only mappable memory but also ioport ranges. On x86 architectures, these can be accessed with ioperm, iopl, inb, outb, and friends. Mike Frysinger (CCed) reported a similar problem on Blackfin arch where it doesn't seem to be easy to mmap non-cached memory but it can still be accessed from userspace. This patch allows kernel drivers to pass information about such ports to userspace. Similar to the existing mem[] array, it adds a port[] array to struct uio_info. Each port range is described by start, size, and porttype. If a driver fills in at least one such port range, the UIO core will simply pass this information to userspace by creating a new directory "portio" underneath /sys/class/uio/uioN/. Similar to the "mem" directory, it will contain a subdirectory (portX) for each port range given. Note that UIO simply passes this information to userspace, it performs no action whatsoever with this data. It's userspace's responsibility to obtain access to these ports and to solve arch dependent issues. The "porttype" attribute tells userspace what kind of port it is dealing with. This mechanism could also be used to give userspace information about GPIOs related to a device. You frequently find such hardware in embedded devices, so I added a UIO_PORT_GPIO definition. I'm not really sure if this is a good idea since there are other solutions to this problem, but it won't hurt much anyway. Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:44 -08:00
Kay Sievers	475b44c199	mtd: struct device - replace bus_id with dev_name(), dev_set_name() CC: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:38 -08:00
Mark McLoughlin	0aa0dc41bf	driver core: add root_device_register() Add support for allocating root device objects which group device objects under /sys/devices directories. Also add a sysfs 'module' symlink which points to the owner of the root device object. This symlink will be used in virtio to allow userspace to determine which virtio bus implementation a given device is associated with. [Includes suggestions from Cornelia Huck] Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:33 -08:00
Cornelia Huck	d0d85ff989	Make DEBUG take precedence over DYNAMIC_PRINTK_DEBUG Statically defined DEBUG should take precedence over dynamically enabled debugging; otherwise adding DEBUG (like, for example, via CONFIG_DEBUG_KOBJECT) does not have the expected result of printing pr_debug() and dev_dbg() messages unconditionally. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:33 -08:00
Greg Kroah-Hartman	b9daa99ee5	driver core: move knode_bus into private structure Nothing outside of the driver core should ever touch knode_bus, so move it out of the public eye. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:33 -08:00
Greg Kroah-Hartman	93e746db18	driver core: move knode_driver into private structure Nothing outside of the driver core should ever touch knode_driver, so move it out of the public eye. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:32 -08:00
Greg Kroah-Hartman	11c3b5c3e0	driver core: move klist_children into private structure Nothing outside of the driver core should ever touch klist_children, or knode_parent, so move them out of the public eye. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:32 -08:00
Greg Kroah-Hartman	2831fe6f9c	driver core: create a private portion of struct device This is to be used to move things out of struct device that no code outside of the driver core should ever touch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:32 -08:00
Matthew Wilcox	210272a284	driver core: Remove completion from struct klist_node Removing the completion from klist_node reduces its size from 64 bytes to 28 on x86-64. To maintain the semantics of klist_remove(), we add a single list of klist nodes which are pending deletion and scan them. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:30 -08:00
Matthew Wilcox	929d2fa595	driver core: Rearrange struct device for better packing This minor rearrangement saves 16 bytes from sizeof(struct device) according to pahole. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:30 -08:00
Alan Stern	7f4f5d4516	Fix misspellings in pm.h macros This patch (as1167) fixes some misspellings in various recently-added macros in pm.h. Fortunately these macros are not yet used anywhere. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Rafael J. Wysocki <rjw@sisk.pl>	2009-01-06 10:44:30 -08:00
Rafael J. Wysocki	adf094931f	PM: Simplify the new suspend/hibernation framework for devices PM: Simplify the new suspend/hibernation framework for devices Following the discussion at the Kernel Summit, simplify the new device PM framework by merging 'struct pm_ops' and 'struct pm_ext_ops' and removing pointers to 'struct pm_ext_ops' from 'struct platform_driver' and 'struct pci_driver'. After this change, the suspend/hibernation callbacks will only reside in 'struct device_driver' as well as at the bus type/ device class/device type level. Accordingly, PCI and platform device drivers are now expected to put their suspend/hibernation callbacks into the 'struct device_driver' embedded in 'struct pci_driver' or 'struct platform_driver', respectively. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:29 -08:00
Dan Williams	864498aaa9	dmaengine: use idr for registering dma device numbers This brings some predictability to dma device numbers, i.e. an rmmod/insmod cycle may now result in /sys/class/dma/dma0chan0 being restored rather than /sys/class/dma/dma1chan0 appearing. Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:21 -07:00
Dan Williams	41d5e59c12	dmaengine: add a release for dma class devices and dependent infrastructure Resolves: WARNING: at drivers/base/core.c:122 device_release+0x4d/0x52() Device 'dma0chan0' does not have a release() function, it is broken and must be fixed. The dma_chan_dev object is introduced to gear-match sysfs kobject and dmaengine channel lifetimes. When a channel is removed access to the sysfs entries return -ENODEV until the kobject can be released. The bulk of the change is updates to existing code to handle the extra layer of indirection between a dma_chan and its struct device. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:21 -07:00
Dan Williams	7dd6025101	dmaengine: kill enum dma_state_client DMA_NAK is now useless. We can just use a bool instead. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:19 -07:00
Dan Williams	f27c580c36	dmaengine: remove 'bigref' infrastructure Reference counting is done at the module level so clients need not worry that a channel will leave while they are actively using dmaengine. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:18 -07:00
Dan Williams	aa1e6f1a38	dmaengine: kill struct dma_client and supporting infrastructure All users have been converted to either the general-purpose allocator, dma_find_channel, or dma_request_channel. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:17 -07:00
Dan Williams	209b84a88f	dmaengine: replace dma_async_client_register with dmaengine_get Now that clients no longer need to be notified of channel arrival dma_async_client_register can simply increment the dmaengine_ref_count. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:17 -07:00
Dan Williams	74465b4ff9	atmel-mci: convert to dma_request_channel and down-level dma_slave dma_request_channel provides an exclusive channel, so we no longer need to pass slave data through dmaengine. Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:16 -07:00
Dan Williams	33df8ca068	dmatest: convert to dma_request_channel Replace the client registration infrastructure with a custom loop to poll for channels. Once dma_request_channel returns NULL stop asking for channels. A userspace side effect of this change if that loading the dmatest module before loading a dma driver will result in no channels being found, previously dmatest would get a callback. To facilitate testing in the built-in case dmatest_init is marked as a late_initcall. Another side effect is that channels under test can not be used for any other purpose. Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:15 -07:00
Dan Williams	59b5ec2144	dmaengine: introduce dma_request_channel and private channels This interface is primarily for device-to-memory clients which need to search for dma channels with platform-specific characteristics. The prototype is: struct dma_chan dma_request_channel(dma_cap_mask_t mask, dma_filter_fn filter_fn, void filter_param); When the optional 'filter_fn' parameter is set to NULL dma_request_channel simply returns the first channel that satisfies the capability mask. Otherwise, when the mask parameter is insufficient for specifying the necessary channel, the filter_fn routine can be used to disposition the available channels in the system. The filter_fn routine is called once for each free channel in the system. Upon seeing a suitable channel filter_fn returns DMA_ACK which flags that channel to be the return value from dma_request_channel. A channel allocated via this interface is exclusive to the caller, until dma_release_channel() is called. To ensure that all channels are not consumed by the general-purpose allocator the DMA_PRIVATE capability is provided to exclude a dma_device from general-purpose (memory-to-memory) consideration. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:15 -07:00
Dan Williams	f67b459992	net_dma: convert to dma_find_channel Use the general-purpose channel allocation provided by dmaengine. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:15 -07:00
Dan Williams	2ba05622b8	dmaengine: provide a common 'issue_pending_all' implementation async_tx and net_dma each have open-coded versions of issue_pending_all, so provide a common routine in dmaengine. The implementation needs to walk the global device list, so implement rcu to allow dma_issue_pending_all to run lockless. Clients protect themselves from channel removal events by holding a dmaengine reference. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:14 -07:00
Dan Williams	bec085134e	dmaengine: centralize channel allocation, introduce dma_find_channel Allowing multiple clients to each define their own channel allocation scheme quickly leads to a pathological situation. For memory-to-memory offload all clients can share a central allocator. This simply moves the existing async_tx allocator to dmaengine with minimal fixups: * async_tx.c:get_chan_ref_by_cap --> dmaengine.c:nth_chan * async_tx.c:async_tx_rebalance --> dmaengine.c:dma_channel_rebalance * split out common code from async_tx.c:__async_tx_find_channel --> dma_find_channel Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:14 -07:00
Dan Williams	6f49a57aa5	dmaengine: up-level reference counting to the module level Simply, if a client wants any dmaengine channel then prevent all dmaengine modules from being removed. Once the clients are done re-enable module removal. Why?, beyond reducing complication: 1/ Tracking reference counts per-transaction in an efficient manner, as is currently done, requires a complicated scheme to avoid cache-line bouncing effects. 2/ Per-transaction ref-counting gives the false impression that a dma-driver can be gracefully removed ahead of its user (net, md, or dma-slave) 3/ None of the in-tree dma-drivers talk to hot pluggable hardware, but if such an engine were built one day we still would not need to notify clients of remove events. The driver can simply return NULL to a ->prep() request, something that is much easier for a client to handle. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:14 -07:00
Chuck Lever	57ef692588	NLM: Rewrite IPv4 privileged requester's check Clean up. For consistency, rewrite the IPv4 check to match the same style as the new IPv6 check. Note that ipv4_is_loopback() is somewhat broader in its interpretation of what is a loopback address than simply "127.0.0.1". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:56 -05:00
Chuck Lever	d1208f7073	NLM: nlm_privileged_requester() doesn't recognize mapped loopback address Commit `b85e4676` added the nlm_privileged_requester() helper to check whether an RPC request was sent from a local privileged caller. It recognizes IPv4 privileged callers (from "127.0.0.1"), and IPv6 privileged callers (from "::1"). However, IPV6_ADDR_LOOPBACK is not set for the mapped IPv4 loopback address (::ffff:7f00:0001), so the test breaks when the kernel's RPC service is IPv6-enabled but user space is calling via the IPv4 loopback address. This is actually the most common case for IPv6- enabled RPC services on Linux. Rewrite the IPv6 check to handle the mapped IPv4 loopback address as well as a normal IPv6 loopback address. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:56 -05:00
Chuck Lever	8529bc51d3	NSM: Move nsm_addr() to fs/lockd/mon.c Clean up: nsm_addr_in() is no longer used, and nsm_addr() is used only in fs/lockd/mon.c, so move it there. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:55 -05:00
Chuck Lever	e6765b8397	NSM: Remove include/linux/lockd/sm_inter.h Clean up: The include/linux/lockd/sm_inter.h header is nearly empty now. Remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:55 -05:00
Chuck Lever	92fd91b998	NLM: Remove "create" argument from nsm_find() Clean up: nsm_find() now has only one caller, and that caller unconditionally sets the @create argument. Thus the @create argument is no longer needed. Since nsm_find() now has a more specific purpose, pick a more appropriate name for it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	3420a8c435	NSM: Add nsm_lookup() function Introduce a new API to fs/lockd/mon.c that allows nlm_host_rebooted() to lookup up nsm_handles via the contents of an nlm_reboot struct. The new function is equivalent to calling nsm_find() with @create set to zero, but it takes a struct nlm_reboot instead of separate arguments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	576df4634e	NLM: Decode "priv" argument of NLMPROC_SM_NOTIFY as an opaque The NLM XDR decoders for the NLMPROC_SM_NOTIFY procedure should treat their "priv" argument truly as an opaque, as defined by the protocol, and let the upper layers figure out what is in it. This will make it easier to modify the contents and interpretation of the "priv" argument, and keep knowledge about what's in "priv" local to fs/lockd/mon.c. For now, the NLM and NSM implementations should behave exactly as they did before. The formation of the address of the rebooted host in nlm_host_rebooted() may look a little strange, but it is the inverse of how nsm_init_private() forms the private cookie. Plus, it's going away soon anyway. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	7fefc9cb9d	NLM: Change nlm_host_rebooted() to take a single nlm_reboot argument Pass the nlm_reboot data structure directly from the NLMPROC_SM_NOTIFY XDR decoders to nlm_host_rebooted(). This eliminates some packing and unpacking of the NLMPROC_SM_NOTIFY results, and prepares for passing these results, including the "priv" cookie, directly to a lookup routine in fs/lockd/mon.c. This patch changes code organization but should not cause any behavioral change. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	7e44d3bea2	NSM: Generate NSMPROC_MON's "priv" argument when nsm_handle is created Introduce a new data type, used by both the in-kernel NLM and NSM implementations, that is used to manage the opaque "priv" argument for the NSMPROC_MON and NLMPROC_SM_NOTIFY calls. Construct the "priv" cookie when the nsm_handle is created. The nsm_init_private() function may look a little strange, but it is roughly equivalent to how the XDR encoder formed the "priv" argument. It's going to go away soon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:53 -05:00
Chuck Lever	67c6d107a6	NSM: Move nsm_find() to fs/lockd/mon.c The nsm_find() function sets up fresh nsm_handle entries. This is where we will store the "priv" cookie used to lookup nsm_handles during reboot recovery. The cookie will be constructed when nsm_find() creates a new nsm_handle. As much as possible, I would like to keep everything that handles a "priv" cookie in fs/lockd/mon.c so that all the smarts are in one source file. That organization should make it pretty simple to see how all this works. To me, it makes more sense than the current arrangement to keep nsm_find() with nsm_monitor() and nsm_unmonitor(). So, start reorganizing by moving nsm_find() into fs/lockd/mon.c. The nsm_release() function comes along too, since it shares the nsm_lock global variable. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:53 -05:00
Chuck Lever	36e8e668d3	NSM: Move NSM program and procedure numbers to fs/lockd/mon.c Clean up: Move the RPC program and procedure numbers for NSM into the one source file that needs them: fs/lockd/mon.c. And, as with NLM, NFS, and rpcbind calls, use NSMPROC_FOO instead of SM_FOO for NSM procedure numbers. Finally, make a couple of comments more precise: what is referred to here as SM_NOTIFY is really the NLM (lockd) NLMPROC_SM_NOTIFY downcall, not NSMPROC_NOTIFY. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	9c1bfd037f	NSM: Move NSM-related XDR data structures to lockd's xdr.h Clean up: NSM's XDR data structures are used only in fs/lockd/mon.c, so move them there. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	356c3eb466	NLM: Move the public declaration of nsm_unmonitor() to lockd.h Clean up. Make the nlm_host argument "const," and move the public declaration to lockd.h. Add a documenting comment. Bruce observed that nsm_unmonitor()'s only caller doesn't care about its return code, so make nsm_unmonitor() return void. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	c8c23c423d	NSM: Release nsmhandle in nlm_destroy_host The nsm_handle's reference count is bumped in nlm_lookup_host(). It should be decremented in nlm_destroy_host() to make it easier to see the balance of these two operations. Move the nsm_release() call to fs/lockd/host.c. The h_nsmhandle pointer is set in nlm_lookup_host(), and never cleared. The nlm_destroy_host() function is never called for the same nlm_host twice, so h_nsmhandle won't ever be NULL when nsm_unmonitor() is called. All references to the nlm_host are gone before it is freed. We can skip making h_nsmhandle NULL just before the nlm_host is deallocated. It's also likely we can remove the h_nsmhandle NULL check in nlmsvc_is_client() as well, but we can do that later when rearchitect- ing the nlm_host cache. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	1e49323c4a	NLM: Move the public declaration of nsm_monitor() to lockd.h Clean up. Make the nlm_host argument "const," and move the public declaration to lockd.h with other NSM public function (nsm_release, eg) and global variable declarations. Add a documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	29ed1407ed	NSM: Support IPv6 version of mon_name The "mon_name" argument of the NSMPROC_MON and NSMPROC_UNMON upcalls is a string that contains the hostname or IP address of the remote peer to be notified when this host has rebooted. The sm-notify command uses this identifier to contact the peer when we reboot, so it must be either a well-qualified DNS hostname or a presentation format IP address string. When the "nsm_use_hostnames" sysctl is set to zero, the kernel's NSM provides a presentation format IP address in the "mon_name" argument. Otherwise, the "caller_name" argument from NLM requests is used, which is usually just the DNS hostname of the peer. To support IPv6 addresses for the mon_name argument, we use the nsm_handle's address eye-catcher, which already contains an appropriate presentation format address string. Using the eye-catcher string obviates the need to use a large buffer on the stack to form the presentation address string for the upcall. This patch also addresses a subtle bug. An NSMPROC_MON request and the subsequent NSMPROC_UNMON request for the same peer are required to use the same value for the "mon_name" argument. Otherwise, rpc.statd's NSMPROC_UNMON processing cannot locate the database entry for that peer and remove it. If the setting of nsm_use_hostnames is changed between the time the kernel sends an NSMPROC_MON request and the time it sends the NSMPROC_UNMON request for the same peer, the "mon_name" argument for these two requests may not be the same. This is because the value of "mon_name" is currently chosen at the moment the call is made based on the setting of nsm_use_hostnames To ensure both requests pass identical contents in the "mon_name" argument, we now select which string to use for the argument in the nsm_monitor() function. A pointer to this string is saved in the nsm_handle so it can be used for a subsequent NSMPROC_UNMON upcall. NB: There are other potential problems, such as how nlm_host_rebooted() might behave if nsm_use_hostnames were changed while hosts are still being monitored. This patch does not attempt to address those problems. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:51 -05:00
Chuck Lever	f47534f7f0	NSM: Use modern style for sm_name field in nsm_handle Clean up: I'm about to add another "char " field to the nsm_handle structure. The sm_name field uses an older style of declaring a "char " field. If I match that style for the new field, checkpatch.pl will complain. So, fix the sm_name field to use the new style. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:50 -05:00
Chuck Lever	bc995801a0	NLM: Support IPv6 scope IDs in nlm_display_address() Scope ID support is needed since the kernel's NSM implementation is about to use these displayed addresses as a mon_name in some cases. When nsm_use_hostnames is zero, without scope ID support NSM will fail to handle peers that contact us via a link-local address. Link-local addresses do not work without an interface ID, which is stored in the sockaddr's sin6_scope_id field. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:49 -05:00
Chuck Lever	1df40b609a	NLM: Remove address eye-catcher buffers from nlm_host The h_name field in struct nlm_host is a just copy of h_nsmhandle->sm_name. Likewise, the contents of the h_addrbuf field should be identical to the sm_addrbuf field. The h_srcaddrbuf field is used only in one place for debugging. We can live without this until we get %pI formatting for printk(). Currently these buffers are 48 bytes, but we need to support scope IDs in IPv6 presentation addresses, which means making the buffers even larger. Instead, let's find ways to eliminate them to save space. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:49 -05:00
Chuck Lever	7538ce1eb6	NLM: Use modern style for pointer fields in nlm_host Clean up: I'm about to add another "char " field to the nlm_host structure. The h_name field, for example, uses an older style of declaring a "char " field. If I match that style for the new field, checkpatch.pl will complain. So, fix pointer fields to use the new style. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:48 -05:00
Jeff Layton	c9233eb7b0	sunrpc: add sv_maxconn field to svc_serv (try #3 ) svc_check_conn_limits() attempts to prevent denial of service attacks by having the service close old connections once it reaches a threshold. This threshold is based on the number of threads in the service: (serv->sv_nrthreads + 3) * 20 Once we reach this, we drop the oldest connections and a printk pops to warn the admin that they should increase the number of threads. Increasing the number of threads isn't an option however for services like lockd. We don't want to eliminate this check entirely for such services but we need some way to increase this limit. This patch adds a sv_maxconn field to the svc_serv struct. When it's set to 0, we use the current method to calculate the max number of connections. RPC services can then set this on an as-needed basis. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:47 -05:00
J. Bruce Fields	548eaca46b	nfsd: document new filehandle fsid types Descriptions taken from mountd code (in nfs-utils/utils/mountd/cache.c). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:47 -05:00
Sergei Shtylyov	592b531521	ide: move read_sff_dma_status() method to 'struct ide_dma_ops' Move apparently misplaced read_sff_dma_status() method from 'struct ide_tp_ops' to 'struct ide_dma_ops', renaming it to dma_sff_read_status() and making only required for SFF-8038i compatible IDE controller drivers (greatly cutting down the number of initializers) as its only user (outside ide-dma-sff.c and such drivers) appears to be ide_pci_check_simplex() which is only called for such controllers... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:21:02 +01:00
Shane McDonald	391ad1908a	Resurrect IT8172 IDE controller driver Support for the IT8172 IDE controller was removed from the kernel sometime after 2.6.18. Support for the only boards that used the IT8172 was removed from the kernel after 2.6.18, as they had never compiled since 2.6.0. However, there are a couple of platforms that use this chip: the PMC-Sierra Xiao Hu thin-client computer, which is no longer in production, and the Linksys NSS4000 Network Attached Storage box, which is based on the Xiao Hu board. I am attempting to add support for the Xiao Hu to the kernel, and this IT8172 IDE controller is the first bit of code in this effort. This patch resurrects the IT8172 IDE controller code. I began with the 2.6.18 version of the it8172.c file, and have moved it forward so that it works with the latest version of the kernel. I have run this driver on a PMC-Sierra Xiao Hu board with the 2.6.28 kernel, and I have had no problems with it in my configuration. The attached patch applies cleanly against 2.6.28. Signed-off-by: Shane McDonald <mcdonald.shane@gmail.com> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: alan@lxorguk.ukuu.org.uk [bart: s/HWIF(drive)/drive->hwif/] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:21:01 +01:00
Bartlomiej Zolnierkiewicz	94c96445f3	ide: remove unused ide_hwif_t.sg_mapped field Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:59 +01:00
Bartlomiej Zolnierkiewicz	906ef986a7	ide: struct ide_atapi_pc - remove unused fields and update documentation Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:59 +01:00
Borislav Petkov	d6251d4488	ide-cd: convert to ide-atapi facilities ... and remove no longer needed cdrom_start_packet_command and cdrom_transfer_packet_command. Tested lightly with ide-cd and ide-floppy. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:58 +01:00
Bartlomiej Zolnierkiewicz	2bd24a1cfc	ide: add port and host iterators Add ide_port_for_each_dev() / ide_host_for_each_port() iterators and update IDE code to use them. While at it: - s/unit/i/ variable in ide_port_wait_ready(), ide_probe_port(), ide_port_tune_devices(), ide_port_init_devices_data(), do_reset1(), ide_acpi_set_state() and scc_dma_end() - s/d/i/ variable in ide_proc_port_register_devices() There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:56 +01:00
Bartlomiej Zolnierkiewicz	5e7f3a4669	ide: dynamic allocation of device structures Allocate device structures dynamically instead of having them embedded in ide_hwif_t: * Remove needless zeroing of port structure from ide_init_port_data(). * Add ide_hwif_t.devices[MAX_DRIVES] (table of pointers to the devices). * Add ide_port_{alloc,free}_devices() helpers and use them respectively in ide_{host,free}_alloc(). * Convert all users of ->drives[] to use ->devices[] instead. While at it: * Use drive->dn for the slave device check in scc_pata.c. As a nice side-effect this patch cuts ~1kB (x86-32) from the resulting code size: text data bss dec hex filename 53963 1244 237 55444 d894 drivers/ide/ide-core.o.before 52981 1244 237 54462 d4be drivers/ide/ide-core.o.after Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:56 +01:00
Bartlomiej Zolnierkiewicz	627e05daa1	ide: remove ->error method from struct ide_driver * Remove (now superfluous) ->error method from struct ide_driver. * Unexport __ide_error() and make it static. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:54 +01:00
Bartlomiej Zolnierkiewicz	7f3c868ba7	ide: remove ide_driver_t typedef While at it: - s/struct ide_driver_s/struct ide_driver/ - use to_ide_driver() macro in ide-proc.c Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:53 +01:00
Bartlomiej Zolnierkiewicz	9892ec5497	ide: remove 'byte' typedef Just use u8 instead, also s/__u8/u8/ in ide-cd.h while at it. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:53 +01:00
Bartlomiej Zolnierkiewicz	c0ae502347	ide: remove ide_pci_enablebit_t typedef Remove needless parens while at it. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:52 +01:00
Bartlomiej Zolnierkiewicz	54cc1428cf	ide: remove local_irq_set() macro Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:52 +01:00
Bartlomiej Zolnierkiewicz	898ec223fe	ide: remove HWIF() macro Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:52 +01:00
Bartlomiej Zolnierkiewicz	b40d1b88f1	ide: move ide_init_port_data() and friends to ide-probe.c * Move IDE_DEFAULT_MAX_FAILURES to <linux/ide.h>. * Move ide_cfg_mtx, ide_hwif_to_major[], ide_port_init_devices_data(), ide_init_port_data(), ide_init_port_hw() and ide_unregister() to ide-probe.c from ide.c. * Make ide_unregister(), ide_init_port_data(), ide_init_port_hw() and ide_cfg_mtx static. While at it: * Remove stale ide_init_port_data() documentation and ide_lock extern. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:51 +01:00
Bartlomiej Zolnierkiewicz	b65fac32cf	ide: merge ide_hwgroup_t with ide_hwif_t (v2) * Merge ide_hwgroup_t with ide_hwif_t. * Cleanup init_irq() accordingly, then remove no longer needed ide_remove_port_from_hwgroup() and ide_ports[]. * Remove now unused HWGROUP() macro. While at it: * ide_dump_ata_error() fixups v2: * Fix ->quirk_list check in do_ide_request() (s/hwif->cur_dev/prev_port->cur_dev). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:50 +01:00
Bartlomiej Zolnierkiewicz	5b31f855f1	ide: use lock bitops for ports serialization (v2) * Add ->host_busy field to struct ide_host and use it's first bit together with lock bitops to provide new ports serialization method. * Convert core IDE code to use new ide_[un]lock_host() helpers. This removes the need for taking hwgroup->lock if host is already busy on serialized hosts and makes it possible to merge ide_hwgroup_t into ide_hwif_t (done in the later patch). * Remove no longer needed ide_hwgroup_t.busy and ide_[un]lock_hwgroup(). * Update do_ide_request() documentation. v2: * ide_release_lock() should be called inside IDE_HFLAG_SERIALIZE check. * Add ide_hwif_t.busy flag and ide_[un]lock_port() for serializing devices on a port. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:49 +01:00
Bartlomiej Zolnierkiewicz	efe0397eef	ide: remove hwgroup->hwif and {drive,hwif}->next * Add 'int port_count' field to ide_hwgroup_t to keep the track of the number of ports in the hwgroup. Then update init_irq() and ide_remove_port_from_hwgroup() to use it. * Remove no longer needed hwgroup->hwif, {drive,hwif}->next, ide_add_drive_to_hwgroup() and ide_remove_drive_from_hwgroup() (hwgroup->drive now only denotes the currently active device in the hwgroup). * Update locking documentation in <linux/ide.h>. While at it: * Rename ->drive field in ide_hwgroup_t to ->cur_dev. * Use __func__ in ide_timer_expiry(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:49 +01:00
Bartlomiej Zolnierkiewicz	ae86afaee6	ide: use per-port IRQ handlers Use hwif instead of hwgroup as {request,free}_irq()'s cookie, teach ide_intr() to return early for non-active serialized ports, modify unexpected_intr() accordingly and then use per-port IRQ handlers instead of per-hwgroup ones. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:48 +01:00
Bartlomiej Zolnierkiewicz	bd53cbcce5	ide: add ->cur_port to struct ide_host and use it for serialized hosts * Pass 'ide_hwif_t ' instead of 'ide_hwgroup_t ' to unexpected_intr(). * Cache pointer to the port currently being serviced in ->cur_port and use it instead of hwif->hwgroup on serialized hosts. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:48 +01:00
FUJITA Tomonori	f98eee8ea9	x86, ia64: remove duplicated swiotlb code This adds swiotlb_map_page and swiotlb_unmap_page to lib/swiotlb.c and remove IA64 and X86's swiotlb_map_page and swiotlb_unmap_page. This also removes unnecessary swiotlb_map_single, swiotlb_map_single_attrs, swiotlb_unmap_single and swiotlb_unmap_single_attrs. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-06 14:06:58 +01:00
FUJITA Tomonori	160c1d8e40	x86, ia64: convert to use generic dma_map_ops struct This converts X86 and IA64 to use include/linux/dma-mapping.h. It's a bit large but pretty boring. The major change for X86 is converting 'int dir' to 'enum dma_data_direction dir' in DMA mapping operations. The major changes for IA64 is using map_page and unmap_page instead of map_single and unmap_single. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-06 14:06:57 +01:00
FUJITA Tomonori	f0402a262e	generic: add common struct for dma map operations This adds struct dma_map_ops include/linux/dma-mapping.h, which, is used to handle multiple sets of dma mapping API. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-06 14:06:57 +01:00
Frederik Schwarzer	0211a9c850	trivial: fix an -> a typos in documentation and comments It is always "an" if there is a vowel _spoken_ (not written). So it is: "an hour" (spoken vowel) but "a uniform" (spoken 'j') Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-06 11:28:07 +01:00
Frederik Schwarzer	025dfdafe7	trivial: fix then -> than typos in comments and documentation - (better, more, bigger ...) then -> (...) than Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-06 11:28:06 +01:00
Ingo Molnar	3d7a96f5a4	Merge branch 'linus' into tracing/kmemtrace2	2009-01-06 09:53:05 +01:00
Ingo Molnar	d9be28ea91	Merge branches 'sched/clock', 'sched/cleanups' and 'linus' into sched/urgent	2009-01-06 09:33:57 +01:00
Ingo Molnar	fdbc0450df	Merge branches 'core/futexes', 'core/locking', 'core/rcu' and 'linus' into core/urgent	2009-01-06 09:32:11 +01:00
Rusty Russell	835481d9bc	cpumask: convert struct cpufreq_policy to cpumask_var_t Impact: use new cpumask API to reduce memory usage This is part of an effort to reduce structure sizes for machines configured with large NR_CPUS. cpumask_t gets replaced by cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or struct cpumask * (large NR_CPUS). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Dave Jones <davej@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-06 09:05:31 +01:00
Theodore Ts'o	b3881f74b3	ext4: Add mount option to set kjournald's I/O priority Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jens Axboe <jens.axboe@oracle.com>	2009-01-05 22:46:26 -05:00
Linus Torvalds	238c6d5483	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm snapshot: extend exception store functions dm snapshot: split out exception store implementations dm snapshot: rename struct exception_store dm snapshot: separate out exception store interface dm mpath: move trigger_event to system workqueue dm: add name and uuid to sysfs dm table: rework reference counting dm: support barriers on simple devices dm request: extend target interface dm request: add caches dm ioctl: allow dm_copy_name_and_uuid to return only one field dm log: ensure log bitmap fits on log device dm log: move region_size validation dm log: avoid reinitialising io_req on every operation dm: consolidate target deregistration error handling dm raid1: fix error count dm log: fix dm_io_client leak on error paths dm snapshot: change yield to msleep dm table: drop reference at unbind	2009-01-05 19:20:59 -08:00
Andi Kleen	ab4c142488	dm: support barriers on simple devices Implement barrier support for single device DM devices This patch implements barrier support in DM for the common case of dm linear just remapping a single underlying device. In this case we can safely pass the barrier through because there can be no reordering between devices. NB. Any DM device might cease to support barriers if it gets reconfigured so code must continue to allow for a possible -EOPNOTSUPP on every barrier bio submitted. - agk Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-01-06 03:05:09 +00:00
Kiyoshi Ueda	7d76345da6	dm request: extend target interface This patch adds the following target interfaces for request-based dm. map_rq : for mapping a request rq_end_io : for finishing a request busy : for avoiding performance regression from bio-based dm. Target can tell dm core not to map requests now, and that may help requests in the block layer queue to be bigger by I/O merging. In bio-based dm, this behavior is done by device drivers managing the block layer queue. But in request-based dm, dm core has to do that since dm core manages the block layer queue. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-01-06 03:05:07 +00:00
Mikulas Patocka	10d3bd09a3	dm: consolidate target deregistration error handling Change dm_unregister_target to return void and use BUG() for error reporting. dm_unregister_target can only fail because of programming bug in the target driver. It can't fail because of user's behavior or disk errors. This patch changes unregister_target to return void and use BUG if someone tries to unregister non-registered target or unregister target that is in use. This patch removes code duplication (testing of error codes in all dm targets) and reports bugs in just one place, in dm_unregister_target. In some target drivers, these return codes were ignored, which could lead to a situation where bugs could be missed. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-01-06 03:04:58 +00:00
Linus Torvalds	8e128ce331	Merge branch 'for-next' of git://git.o-hand.com/linux-mfd * 'for-next' of git://git.o-hand.com/linux-mfd: (30 commits) mfd: Fix section mismatch in da903x mfd: move drivers/i2c/chips/menelaus.c to drivers/mfd mfd: move drivers/i2c/chips/tps65010.c to drivers/mfd mfd: dm355evm msp430 driver mfd: Add missing break from wm3850-core mfd: Add WM8351 support mfd: Support configurable numbers of DCDCs and ISINKs on WM8350 mfd: Handle missing WM8350 platform data mfd: Add WM8352 support mfd: Use irq_to_desc in twl4030 code power_supply: Add Dialog DA9030 battery charger driver mfd: Dialog DA9030 battery charger MFD driver mfd: Register WM8400 codec device mfd: Pass driver_data onto child devices mfd: Fix twl4030-core.c build error mfd: twl4030 regulator bug fixes mfd: twl4030: create some regulator devices mfd: twl4030: cleanup symbols and OMAP dependency mfd: twl4030: simplified child creation code power_supply: Add battery health reporting for WM8350 ...	2009-01-05 19:04:09 -08:00
Linus Torvalds	0bbb275358	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: module: convert to stop_machine_create/destroy. stop_machine: introduce stop_machine_create/destroy. parisc: fix module loading failure of large kernel modules module: fix module loading failure of large kernel modules for parisc module: fix warning of unused function when !CONFIG_PROC_FS kernel/module.c: compare symbol values when marking symbols as exported in /proc/kallsyms. remove CONFIG_KMOD	2009-01-05 19:03:39 -08:00
Linus Torvalds	0578c3b4d4	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: swiotlb: Don't include linux/swiotlb.h twice in lib/swiotlb.c intel-iommu: fix build error with INTR_REMAP=y and DMAR=n swiotlb: add missing __init annotations	2009-01-05 19:03:11 -08:00
Linus Torvalds	8606ab6d30	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (22 commits) HID: fix error condition propagation in hid-sony driver HID: fix reference count leak hidraw HID: add proper support for pensketch 12x9 tablet HID: don't allow DealExtreme usb-radio be handled by usb hid driver HID: fix default Kconfig setting for TopSpeed driver HID: driver for TopSeed Cyberlink quirky remote HID: make boot protocol drivers depend on EMBEDDED HID: avoid sparse warning in HID_COMPAT_LOAD_DRIVER HID: hiddev cleanup -- handle all error conditions properly HID: force feedback driver for GreenAsia 0x12 PID HID: switch specialized drivers from "default y" to !EMBEDDED HID: set proper dev.parent in hidraw HID: add dynids facility HID: use GFP_KERNEL in hid_alloc_buffers HID: usbhid, use usb_endpoint_xfer_int HID: move usbhid flags to usbhid.h HID: add n-trig digitizer support HID: add phys and name ioctls to hidraw HID: struct device - replace bus_id with dev_name(), dev_set_name() HID: automatically call usbhid_set_leds in usbhid driver ...	2009-01-05 18:53:34 -08:00
Linus Torvalds	c54febae99	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (27 commits) GFS2: Use DEFINE_SPINLOCK GFS2: Fix use-after-free bug on umount (try #2) Revert "GFS2: Fix use-after-free bug on umount" GFS2: Streamline alloc calculations for writes GFS2: Send useful information with uevent messages GFS2: Fix use-after-free bug on umount GFS2: Remove ancient, unused code GFS2: Move four functions from super.c GFS2: Fix bug in gfs2_lock_fs_check_clean() GFS2: Send some sensible sysfs stuff GFS2: Kill two daemons with one patch GFS2: Move gfs2_recoverd into recovery.c GFS2: Fix "truncate in progress" hang GFS2: Clean up & move gfs2_quotad GFS2: Add more detail to debugfs glock dumps GFS2: Banish struct gfs2_rgrpd_host GFS2: Move rg_free from gfs2_rgrpd_host to gfs2_rgrpd GFS2: Move rg_igeneration into struct gfs2_rgrpd GFS2: Banish struct gfs2_dinode_host GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize ...	2009-01-05 18:52:54 -08:00
Linus Torvalds	15b0669072	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (44 commits) qlge: Fix sparse warnings for tx ring indexes. qlge: Fix sparse warning regarding rx buffer queues. qlge: Fix sparse endian warning in ql_hw_csum_setup(). qlge: Fix sparse endian warning for inbound packet control block flags. qlge: Fix sparse warnings for byte swapping in qlge_ethool.c myri10ge: print MAC and serial number on probe failure pkt_sched: cls_u32: Fix locking in u32_change() iucv: fix cpu hotplug af_iucv: Free iucv path/socket in path_pending callback af_iucv: avoid left over IUCV connections from failing connects af_iucv: New error return codes for connect() net/ehea: bitops work on unsigned longs Revert "net: Fix for initial link state in 2.6.28" tcp: Kill extraneous SPLICE_F_NONBLOCK checks. tcp: don't mask EOF and socket errors on nonblocking splice receive dccp: Integrate the TFRC library with DCCP dccp: Clean up ccid.c after integration of CCID plugins dccp: Lockless integration of CCID congestion-control plugins qeth: get rid of extra argument after printk to dev_* conversion qeth: No large send using EDDP for HiperSockets. ...	2009-01-05 18:44:59 -08:00
Linus Torvalds	e9af797d75	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Fix on resume, now preserves user policy min/max. [CPUFREQ] Add Celeron Core support to p4-clockmod. [CPUFREQ] add to speedstep-lib additional fsb values for core processors [CPUFREQ] Disable sysfs ui for p4-clockmod. [CPUFREQ] p4-clockmod: reduce noise [CPUFREQ] clean up speedstep-centrino and reduce cpumask_t usage	2009-01-05 18:33:38 -08:00
Linus Torvalds	10cc04f5a0	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (138 commits) ocfs2: Access the right buffer_head in ocfs2_merge_rec_left. ocfs2: use min_t in ocfs2_quota_read() ocfs2: remove unneeded lvb casts ocfs2: Add xattr support checking in init_security ocfs2: alloc xattr bucket in ocfs2_xattr_set_handle ocfs2: calculate and reserve credits for xattr value in mknod ocfs2/xattr: fix credits calculation during index create ocfs2/xattr: Always updating ctime during xattr set. ocfs2/xattr: Remove extend_trans call and add its credits from the beginning ocfs2/dlm: Fix race during lockres mastery ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list ocfs2/dlm: Hold off sending lockres drop ref message while lockres is migrating ocfs2/dlm: Clean up errors in dlm_proxy_ast_handler() ocfs2/dlm: Fix a race between migrate request and exit domain ocfs2: One more hamming code optimization. ocfs2: Another hamming code optimization. ocfs2: Don't hand-code xor in ocfs2_hamming_encode(). ocfs2: Enable metadata checksums. ocfs2: Validate superblock with checksum and ecc. ocfs2: Checksum and ECC for directory blocks. ...	2009-01-05 18:32:43 -08:00
Linus Torvalds	520c853466	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: inotify: fix type errors in interfaces fix breakage in reiserfs_new_inode() fix the treatment of jfs special inodes vfs: remove duplicate code in get_fs_type() add a vfs_fsync helper sys_execve and sys_uselib do not call into fsnotify zero i_uid/i_gid on inode allocation inode->i_op is never NULL ntfs: don't NULL i_op isofs check for NULL ->i_op in root directory is dead code affs: do not zero ->i_op kill suid bit only for regular files vfs: lseek(fd, 0, SEEK_CUR) race condition	2009-01-05 18:32:06 -08:00
Nick Piggin	e8c82c2e23	mm lockless pagecache barrier fix An XFS workload showed up a bug in the lockless pagecache patch. Basically it would go into an "infinite" loop, although it would sometimes be able to break out of the loop! The reason is a missing compiler barrier in the "increment reference count unless it was zero" case of the lockless pagecache protocol in the gang lookup functions. This would cause the compiler to use a cached value of struct page pointer to retry the operation with, rather than reload it. So the page might have been removed from pagecache and freed (refcount==0) but the lookup would not correctly notice the page is no longer in pagecache, and keep attempting to increment the refcount and failing, until the page gets reallocated for something else. This isn't a data corruption because the condition will be detected if the page has been reallocated. However it can result in a lockup. Linus points out that ACCESS_ONCE is also required in that pointer load, even if it's absence is not causing a bug on our particular build. The most general way to solve this is just to put an rcu_dereference in radix_tree_deref_slot. Assembly of find_get_pages, before: .L220: movq (%rbx), %rax #* ivtmp.1162, tmp82 movq (%rax), %rdi #, prephitmp.1149 .L218: testb $1, %dil #, prephitmp.1149 jne .L217 #, testq %rdi, %rdi # prephitmp.1149 je .L203 #, cmpq $-1, %rdi #, prephitmp.1149 je .L217 #, movl 8(%rdi), %esi # <variable>._count.counter, c testl %esi, %esi # c je .L218 #, after: .L212: movq (%rbx), %rax #* ivtmp.1109, tmp81 movq (%rax), %rdi #, ret testb $1, %dil #, ret jne .L211 #, testq %rdi, %rdi # ret je .L197 #, cmpq $-1, %rdi #, ret je .L211 #, movl 8(%rdi), %esi # <variable>._count.counter, c testl %esi, %esi # c je .L212 #, (notice the obvious infinite loop in the first example, if page->count remains 0) Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-05 18:31:12 -08:00
Dan Williams	07f2211e4f	dmaengine: remove dependency on async_tx async_tx.ko is a consumer of dma channels. A circular dependency arises if modules in drivers/dma rely on common code in async_tx.ko. It prevents either module from being unloaded. Move dma_wait_for_async_tx and async_tx_run_dependencies to dmaeninge.o where they should have been from the beginning. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-05 18:10:19 -07:00
Michael Kerrisk	4ae8978cf9	inotify: fix type errors in interfaces The problems lie in the types used for some inotify interfaces, both at the kernel level and at the glibc level. This mail addresses the kernel problem. I will follow up with some suggestions for glibc changes. For the sys_inotify_rm_watch() interface, the type of the 'wd' argument is currently 'u32', it should be '__s32' . That is Robert's suggestion, and is consistent with the other declarations of watch descriptors in the kernel source, in particular, the inotify_event structure in include/linux/inotify.h: struct inotify_event { __s32 wd; /* watch descriptor / __u32 mask; / watch mask / __u32 cookie; / cookie to synchronize two events / __u32 len; / length (including nulls) of name / char name[0]; / stub for possible name */ }; The patch makes the changes needed for inotify_rm_watch(). Signed-off-by: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: Robert Love <rlove@google.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-05 11:54:29 -05:00
Christoph Hellwig	4c728ef583	add a vfs_fsync helper Fsync currently has a fdatawrite/fdatawait pair around the method call, and a mutex_lock/unlock of the inode mutex. All callers of fsync have to duplicate this, but we have a few and most of them don't quite get it right. This patch adds a new vfs_fsync that takes care of this. It's a little more complicated as usual as ->fsync might get a NULL file pointer and just a dentry from nfsd, but otherwise gets afile and we want to take the mapping and file operations from it when it is there. Notes on the fsync callers: - ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the lower file - coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host file, and returning 0 when ->fsync was missing - shm wasn't calling either filemap_fdatawrite / filemap_fdatawait nor taking i_mutex. Now given that shared memory doesn't have disk backing not doing anything in fsync seems fine and I left it out of the vfs_fsync conversion for now, but in that case we might just not pass it through to the lower file at all but just call the no-op simple_sync_file directly. [and now actually export vfs_fsync] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-05 11:54:28 -05:00
Joel Becker	e06c8227fd	jbd2: Add buffer triggers Filesystems often to do compute intensive operation on some metadata. If this operation is repeated many times, it can be very expensive. It would be much nicer if the operation could be performed once before a buffer goes to disk. This adds triggers to jbd2 buffer heads. Just before writing a metadata buffer to the journal, jbd2 will optionally call a commit trigger associated with the buffer. If the journal is aborted, an abort trigger will be called on any dirty buffers as they are dropped from pending transactions. ocfs2 will use this feature. Initially I tried to come up with a more generic trigger that could be used for non-buffer-related events like transaction completion. It doesn't tie nicely, because the information a buffer trigger needs (specific to a journal_head) isn't the same as what a transaction trigger needs (specific to a tranaction_t or perhaps journal_t). So I implemented a buffer set, with the understanding that journal/transaction wide triggers should be implemented separately. There is only one trigger set allowed per buffer. I can't think of any reason to attach more than one set. Contrast this with a journal or transaction in which multiple places may want to watch the entire transaction separately. The trigger sets are considered static allocation from the jbd2 perspective. ocfs2 will just have one trigger set per block type, setting the same set on every bh of the same type. Signed-off-by: Joel Becker <joel.becker@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:30 -08:00
Jan Kara	7d9056ba20	quota: Export dquot_alloc() and dquot_destroy() functions These are default functions for creating and destroying quota structures and they should be used from filesystems. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:25 -08:00
Jan Kara	5cd9d5bb86	quota: Unexport dqblk_v1.h and dqblk_v2.h Unexport header files dqblk_v[12].h since except for quota format ID they don't contain information userspace should be interested in. Move ID definitions to quota.h. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:25 -08:00
Mark Fasheh	e97fcd95a4	jbd2: Add BH_JBDPrivateStart Add this so that file systems using JBD2 can safely allocate unused b_state bits. In this case, we add it so that Ocfs2 can define a single bit for tracking the validation state of a buffer. Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:24 -08:00
Jan Kara	12c77527e4	quota: Implement function for scanning active dquots OCFS2 needs to scan all active dquots once in a while and sync quota information among cluster nodes. Provide a helper function for it so that it does not have to reimplement internally a list which VFS already has. Moreover this function is probably going to be useful for other clustered filesystems if they decide to use VFS quotas. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:23 -08:00
Jan Kara	3d9ea253a0	quota: Add helpers to allow ocfs2 specific quota initialization, freeing and recovery OCFS2 needs to peek whether quota structure is already in memory so that it can avoid expensive cluster locking in that case. Similarly when freeing dquots, it checks whether it is the last quota structure user or not. Finally, it needs to get reference to dquot structure for specified id and quota type when recovering quota file after crash. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:22 -08:00
Jan Kara	571b46e40b	quota: Update version number Increase reported version number of quota support since quota core has changed significantly. Also remove __DQUOT_NUM_VERSION__ since nobody uses it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:22 -08:00
Jan Kara	4d59bce4f9	quota: Keep which entries were set by SETQUOTA quotactl Quota in a clustered environment needs to synchronize quota information among cluster nodes. This means we have to occasionally update some information in dquot from disk / network. On the other hand we have to be careful not to overwrite changes administrator did via SETQUOTA. So indicate in dquot->dq_flags which entries have been set by SETQUOTA and quota format can clear these flags when it properly propagated the changes. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:22 -08:00
Jan Kara	db49d2df48	quota: Allow negative usage of space and inodes For clustered filesystems, it can happen that space / inode usage goes negative temporarily (because some node is allocating another node is freeing and they are not completely in sync). So let quota code allow this and change qsize_t so a signed type so that we don't underflow the variables. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:21 -08:00
Jan Kara	e3d4d56b97	quota: Convert union in mem_dqinfo to a pointer Coming quota support for OCFS2 is going to need quite a bit of additional per-sb quota information. Moreover having fs.h include all the types needed for this structure would be a pain in the a**. So remove the union from mem_dqinfo and add a private pointer for filesystem's use. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:21 -08:00
Jan Kara	1ccd14b9c2	quota: Split off quota tree handling into a separate file There is going to be a new version of quota format having 64-bit quota limits and a new quota format for OCFS2. They are both going to use the same tree structure as VFSv0 quota format. So split out tree handling into a separate file and make size of leaf blocks, amount of space usable in each block (needed for checksumming) and structures contained in them configurable so that the code can be shared. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:21 -08:00
Jan Kara	cf770c1371	quota: Move quotaio_v[12].h from include/linux/ to fs/ Since these include files are used only by implementation of quota formats, there's no need to have them in include/linux/. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:58 -08:00
Jan Kara	ca785ec66b	quota: Introduce DQUOT_QUOTA_SYS_FILE flag If filesystem can handle quota files as system files hidden from users, we can skip a lot of cache invalidation, syncing, inode flags setting etc. when turning quotas on, off and quota_sync. Allow filesystem to indicate that it is hiding quota files from users by DQUOT_QUOTA_SYS_FILE flag. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:57 -08:00
Jan Kara	dcb30695f2	quota: Remove compatibility function sb_any_quota_enabled() Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:56 -08:00
Jan Kara	f55abc0fb9	quota: Allow to separately enable quota accounting and enforcing limits Split DQUOT_USR_ENABLED (and DQUOT_GRP_ENABLED) into DQUOT_USR_USAGE_ENABLED and DQUOT_USR_LIMITS_ENABLED. This way we are able to separately enable / disable whether we should: 1) ignore quotas completely 2) just keep uptodate information about usage 3) actually enforce quota limits This is going to be useful when quota is treated as filesystem metadata - we then want to keep quota information uptodate all the time and just enable / disable limits enforcement. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:56 -08:00
Jan Kara	e4bc7b4b7f	quota: Make _SUSPENDED just a flag Upto now, DQUOT_USR_SUSPENDED behaved like a state - i.e., either quota was enabled or suspended or none. Now allowed states are 0, ENABLED, ENABLED \| SUSPENDED. This will be useful later when we implement separate enabling of quota usage tracking and limits enforcement because we need to keep track of a state which has been suspended. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:56 -08:00
Jan Kara	12095460f7	quota: Increase size of variables for limits and inode usage So far quota was fine with quota block limits and inode limits/numbers in a 32-bit type. Now with rapid increase in storage sizes there are coming requests to be able to handle quota limits above 4TB / more that 2^32 inodes. So bump up sizes of types in mem_dqblk structure to 64-bits to be able to handle this. Also update inode allocation / checking functions to use qsize_t and make global structure keep quota limits in bytes so that things are consistent. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:55 -08:00
Jan Kara	74f783af95	quota: Add callbacks for allocating and destroying dquot structures Some filesystems would like to keep private information together with each dquot. Add callbacks alloc_dquot and destroy_dquot allowing filesystem to allocate larger dquots from their private slab in a similar fashion we currently allocate inodes. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:55 -08:00
Nicolas Ferre	c42aa775cc	atmel-mci: move atmel-mci.h file to include/linux Needed to use the atmel-mci driver in an architecture independant maner. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>	2009-01-05 16:35:31 +01:00
Ingo Molnar	be92d7af38	genirq: provide irq_to_desc() to non-genirq architectures too Impact: build fix on non-genirq architectures Sam Ravnborg reported this build failure on sparc32 allmodconfig, the GPIO drivers assume the presence of irq_to_desc(): drivers/gpio/gpiolib.c: In function `gpiolib_dbg_show': drivers/gpio/gpiolib.c:1146: error: implicit declaration of function 'irq_to_desc' Add it in the !genirq case too. Reported-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Sam Ravnborg <sam@ravnborg.org>	2009-01-05 14:53:30 +01:00
Ingo Molnar	46483d10e5	Merge branch 'core/iommu' into core/urgent Conflicts: lib/swiotlb.c	2009-01-05 14:17:24 +01:00
Alexey Korolev	d81408304b	[MTD] LPDDR extended physmap driver to support LPDDR flash Physmap is a generic map driver for different platforms and flash types. We added support of LPDDR to physmap. All changes here are related to introduction of new pfow_base parameter. This parameter is valid in case of LPDDR chips only. Signed-off-by: Alexey Korolev <akorolev@infradead.org> Acked-by: Jared Hulbert <jaredeh@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-05 13:57:28 +01:00
Alexey Korolev	d13e51e747	[MTD] LPDDR added new pfow_base parameter We need to supply additional parameter to mapping driver and tell LPDDR drivers where PFOW window is in chip mapping. It leads to necessity of map_info structure extendoing. Signed-off-by: Alexey Korolev <akorolev@infradead.org> Acked-by: Jared Hulbert <jaredeh@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-05 13:56:08 +01:00
Alexey Korolev	eb3db27507	[MTD] LPDDR PFOW definition LPDDR chips use PFOW window for sending commands, reading status and capabilites requesting. This pfow.h - contains definitions for PFOW window fileds, possible commands, error flags and some common macro function to avoid code duplications. Signed-off-by: Alexey Korolev <akorolev@infradead.org> Acked-by: Jared Hulbert <jaredeh@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-05 13:55:58 +01:00
Alexey Korolev	922ab535bb	[MTD] LPDDR QINFO records definitions There are declaraton of structures and macros definitions necessary for operations with QINFO in this patch. Signed-off-by: Alexey Korolev <akorolev@infradead.org> Acked-by: Jared Hulbert <jaredeh@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-05 13:55:54 +01:00
Li Zefan	c70f22d203	sched: clean up arch_reinit_sched_domains() - Make arch_reinit_sched_domains() static. It was exported to be used in s390, but now rebuild_sched_domains() is used instead. - Make it return void. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-05 13:54:16 +01:00
Peter Zijlstra	a6037b61c2	hrtimer: fix recursion deadlock by re-introducing the softirq Impact: fix rare runtime deadlock There are a few sites that do: spin_lock_irq(&foo) hrtimer_start(&bar) __run_hrtimer(&bar) func() spin_lock(&foo) which obviously deadlocks. In order to avoid this, never call __run_hrtimer() from hrtimer_start*() context, but instead defer this to softirq context. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-05 13:14:33 +01:00
David Woodhouse	353816f43d	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: arch/arm/mach-pxa/corgi.c arch/arm/mach-pxa/poodle.c arch/arm/mach-pxa/spitz.c	2009-01-05 10:50:33 +01:00
Paul E. McKenney	ea7d3fef42	rcu: eliminate synchronize_rcu_xxx macro Impact: cleanup Expand macro into two files. The synchronize_rcu_xxx macro is quite ugly and it's only used by two callers, so expand it instead. This makes this code easier to change. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-05 10:18:08 +01:00
Steven Whitehouse	e9079cce20	GFS2: Support for FIEMAP ioctl This patch implements the FIEMAP ioctl for GFS2. We can use the generic code (aside from a lock order issue, solved as per Ted Tso's suggestion) for which I've introduced a new variant of the generic function. We also have one exception to deal with, namely stuffed files, so we do that "by hand", setting all the required flags. This has been tested with a modified (I could only find an old version) of Eric's test program, and appears to work correctly. This patch does not currently support FIEMAP of xattrs, but the plan is to add that feature at some future point. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Eric Sandeen <sandeen@redhat.com>	2009-01-05 07:38:46 +00:00
Linus Torvalds	fe0bdec68b	Merge branch 'audit.b61' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b61' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: audit: validate comparison operations, store them in sane form clean up audit_rule_{add,del} a bit make sure that filterkey of task,always rules is reported audit rules ordering, part 2 fixing audit rule ordering mess, part 1 audit_update_lsm_rules() misses the audit_inode_hash[] ones sanitize audit_log_capset() sanitize audit_fd_pair() sanitize audit_mq_open() sanitize AUDIT_MQ_SENDRECV sanitize audit_mq_notify() sanitize audit_mq_getsetattr() sanitize audit_ipc_set_perm() sanitize audit_ipc_obj() sanitize audit_socketcall don't reallocate buffer in every audit_sockaddr()	2009-01-04 16:32:11 -08:00
David Howells	14eaddc967	CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2 ] Fix a regression in cap_capable() due to: commit 5ff7711e635b32f0a1e558227d030c7e45b4a465 Author: David Howells <dhowells@redhat.com> Date: Wed Dec 31 02:52:28 2008 +0000 CRED: Differentiate objective and effective subjective credentials on a task The problem is that the above patch allows a process to have two sets of credentials, and for the most part uses the subjective credentials when accessing current's creds. There is, however, one exception: cap_capable(), and thus capable(), uses the real/objective credentials of the target task, whether or not it is the current task. Ordinarily this doesn't matter, since usually the two cred pointers in current point to the same set of creds. However, sys_faccessat() makes use of this facility to override the credentials of the calling process to make its test, without affecting the creds as seen from other processes. One of the things sys_faccessat() does is to make an adjustment to the effective capabilities mask, which cap_capable(), as it stands, then ignores. The affected capability check is in generic_permission(): if (!(mask & MAY_EXEC) \|\| execute_ok(inode)) if (capable(CAP_DAC_OVERRIDE)) return 0; This change splits capable() from has_capability() down into the commoncap and SELinux code. The capable() security op now only deals with the current process, and uses the current process's subjective creds. A new security op - task_capable() - is introduced that can check any task's objective creds. strictly the capable() security op is superfluous with the presence of the task_capable() op, however it should be faster to call the capable() op since two fewer arguments need be passed down through the various layers. This can be tested by compiling the following program from the XFS testsuite: /* * t_access_root.c - trivial test program to show permission bug. * * Written by Michael Kerrisk - copyright ownership not pursued. * Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html / #include <limits.h> #include <unistd.h> #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <sys/stat.h> #define UID 500 #define GID 100 #define PERM 0 #define TESTPATH "/tmp/t_access" static void errExit(char msg) { perror(msg); exit(EXIT_FAILURE); } /* errExit / static void accessTest(char file, int mask, char mstr) { printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask)); } / accessTest / int main(int argc, char argv[]) { int fd, perm, uid, gid; char testpath; char cmd[PATH_MAX + 20]; testpath = (argc > 1) ? argv[1] : TESTPATH; perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM; uid = (argc > 3) ? atoi(argv[3]) : UID; gid = (argc > 4) ? atoi(argv[4]) : GID; unlink(testpath); fd = open(testpath, O_RDWR \| O_CREAT, 0); if (fd == -1) errExit("open"); if (fchown(fd, uid, gid) == -1) errExit("fchown"); if (fchmod(fd, perm) == -1) errExit("fchmod"); close(fd); snprintf(cmd, sizeof(cmd), "ls -l %s", testpath); system(cmd); if (seteuid(uid) == -1) errExit("seteuid"); accessTest(testpath, 0, "0"); accessTest(testpath, R_OK, "R_OK"); accessTest(testpath, W_OK, "W_OK"); accessTest(testpath, X_OK, "X_OK"); accessTest(testpath, R_OK \| W_OK, "R_OK \| W_OK"); accessTest(testpath, R_OK \| X_OK, "R_OK \| X_OK"); accessTest(testpath, W_OK \| X_OK, "W_OK \| X_OK"); accessTest(testpath, R_OK \| W_OK \| X_OK, "R_OK \| W_OK \| X_OK"); exit(EXIT_SUCCESS); } / main */ This can be run against an Ext3 filesystem as well as against an XFS filesystem. If successful, it will show: [root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043 ---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx access(/tmp/xxx, 0) returns 0 access(/tmp/xxx, R_OK) returns 0 access(/tmp/xxx, W_OK) returns 0 access(/tmp/xxx, X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK) returns 0 access(/tmp/xxx, R_OK \| X_OK) returns -1 access(/tmp/xxx, W_OK \| X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK \| X_OK) returns -1 If unsuccessful, it will show: [root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043 ---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx access(/tmp/xxx, 0) returns 0 access(/tmp/xxx, R_OK) returns -1 access(/tmp/xxx, W_OK) returns -1 access(/tmp/xxx, X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK) returns -1 access(/tmp/xxx, R_OK \| X_OK) returns -1 access(/tmp/xxx, W_OK \| X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK \| X_OK) returns -1 I've also tested the fix with the SELinux and syscalls LTP testsuites. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-01-05 11:17:04 +11:00
Herbert Xu	5d38a079ce	gro: Add page frag support This patch allows GRO to merge page frags (skb_shinfo(skb)->frags) in one skb, rather than using the less efficient frag_list. It also adds a new interface, napi_gro_frags to allow drivers to inject page frags directly into the stack without allocating an skb. This is intended to be the GRO equivalent for LRO's lro_receive_frags interface. The existing GSO interface can already handle page frags with or without an appended frag_list so nothing needs to be changed there. The merging itself is rather simple. We store any new frag entries after the last existing entry, without checking whether the first new entry can be merged with the last existing entry. Making this check would actually be easy but since no existing driver can produce contiguous frags anyway it would just be mental masturbation. If the total number of entries would exceed the capacity of a single skb, we simply resort to using frag_list as we do now. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-04 16:13:40 -08:00
Alain Knaff	bc22c17e12	bzip2/lzma: library support for gzip, bzip2 and lzma decompression Impact: Replaces inflate.c with a wrapper around zlib_inflate; new library code This is the first part of the bzip2/lzma patch The bzip patch is based on an idea by Christian Ludwig, includes support for compressing the kernel with bzip2 or lzma rather than gzip. Both compressors give smaller sizes than gzip. Lzma's decompresses faster than bzip2. It also supports ramdisks and initramfs' compressed using these two compressors. The functionality has been successfully used for a couple of years by the udpcast project This version applies to "tip" kernel 2.6.28 This part contains: - changed inflate.c to accomodate rest of patch - implementation of bzip2 compression (not used at this stage yet) - implementation of lzma compression (not used at this stage yet) - Makefile routines to support bzip2 and lzma kernel compression Signed-off-by: Alain Knaff <alain@knaff.lu> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-01-04 15:53:34 -08:00
Heiko Carstens	9ea09af3bd	stop_machine: introduce stop_machine_create/destroy. Introduce stop_machine_create/destroy. With this interface subsystems that need a non-failing stop_machine environment can create the stop_machine machine threads before actually calling stop_machine. When the threads aren't needed anymore they can be killed with stop_machine_destroy again. When stop_machine gets called and the threads aren't present they will be created and destroyed automatically. This restores the old behaviour of stop_machine. This patch also converts cpu hotplug to the new interface since it is special: cpu_down calls __stop_machine instead of stop_machine. However the kstop threads will only be created when stop_machine gets called. Changing the code so that the threads would be created automatically on __stop_machine is currently not possible: when __stop_machine gets called we hold cpu_add_remove_lock, which is the same lock that create_rt_workqueue would take. So the workqueue needs to be created before the cpu hotplug code locks cpu_add_remove_lock. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-01-05 08:40:14 +10:30
Helge Deller	088af9a6e0	module: fix module loading failure of large kernel modules for parisc When creating the final layout of a kernel module in memory, allow the module loader to reserve some additional memory in front of a given section. This is currently only needed for the parisc port which needs to put the stub entries there to fulfill the 17/22bit PCREL relocations with large kernel modules like xfs. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (renamed fn)	2009-01-05 08:40:13 +10:30
Alessandro Zummo	099e657625	rtc: add alarm/update irq interfaces Add standard interfaces for alarm/update irqs enabling. Drivers are no more required to implement equivalent ioctl code as rtc-dev will provide it. UIE emulation should now be handled correctly and will work even for those RTC drivers who cannot be configured to do both UIE and AIE. Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-04 13:33:20 -08:00
Nick Piggin	54566b2c15	fs: symlink write_begin allocation context fix With the write_begin/write_end aops, page_symlink was broken because it could no longer pass a GFP_NOFS type mask into the point where the allocations happened. They are done in write_begin, which would always assume that the filesystem can be entered from reclaim. This bug could cause filesystem deadlocks. The funny thing with having a gfp_t mask there is that it doesn't really allow the caller to arbitrarily tinker with the context in which it can be called. It couldn't ever be GFP_ATOMIC, for example, because it needs to take the page lock. The only thing any callers care about is __GFP_FS anyway, so turn that into a single flag. Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on this flag in their write_begin function. Change __grab_cache_page to accept a nofs argument as well, to honour that flag (while we're there, change the name to grab_cache_page_write_begin which is more instructive and does away with random leading underscores). This is really a more flexible way to go in the end anyway -- if a filesystem happens to want any extra allocations aside from the pagecache ones in ints write_begin function, it may now use GFP_KERNEL (rather than GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a random example). [kosaki.motohiro@jp.fujitsu.com: fix ubifs] [kosaki.motohiro@jp.fujitsu.com: fix fuse] Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Cleaned up the calling convention: just pass in the AOP flags untouched to the grab_cache_page_write_begin() function. That just simplifies everybody, and may even allow future expansion of the logic. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-04 13:33:20 -08:00
Pekka Enberg	c644f0e4b5	fs: introduce bgl_lock_ptr() As suggested by Andreas Dilger, introduce a bgl_lock_ptr() helper in <linux/blockgroup_lock.h> and add separate sb_bgl_lock() helpers to filesystem specific header files to break the hidden dependency to struct ext[234]_sb_info. Also, while at it, convert the macros to static inlines to try make up for all the times I broke Andrew Morton's tree. Acked-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-04 13:33:20 -08:00
Randy Dunlap	0a30c5cefa	spi.h uses/needs device.h Include header files as used/needed: In file included from drivers/leds/leds-dac124s085.c:16: include/linux/spi/spi.h:66: error: field 'dev' has incomplete type include/linux/spi/spi.h: In function 'to_spi_device': include/linux/spi/spi.h💯 warning: type defaults to 'int' in declaration of '__mptr' ... Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-04 13:33:20 -08:00
Al Viro	5af75d8d58	audit: validate comparison operations, store them in sane form Don't store the field->op in the messy (and very inconvenient for e.g. audit_comparator()) form; translate to dense set of values and do full validation of userland-submitted value while we are at it. ->audit_init_rule() and ->audit_match_rule() get new values now; in-tree instances updated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:42 -05:00
Al Viro	e45aa212ea	audit rules ordering, part 2 Fix the actual rule listing; add per-type lists _not_ used for matching, with all exit,... sitting on one such list. Simplifies "do something for all rules" logics, while we are at it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:42 -05:00
Al Viro	0590b9335a	fixing audit rule ordering mess, part 1 Problem: ordering between the rules on exit chain is currently lost; all watch and inode rules are listed after everything else _and_ exit,never on one kind doesn't stop exit,always on another from being matched. Solution: assign priorities to rules, keep track of the current highest-priority matching rule and its result (always/never). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:41 -05:00
Al Viro	57f71a0af4	sanitize audit_log_capset() * no allocations * return void * don't duplicate checked for dummy context Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:41 -05:00
Al Viro	157cf649a7	sanitize audit_fd_pair() * no allocations * return void Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:41 -05:00
Al Viro	564f6993ff	sanitize audit_mq_open() * don't bother with allocations * don't do double copy_from_user() * don't duplicate parts of check for audit_dummy_context() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:41 -05:00
Al Viro	c32c8af43b	sanitize AUDIT_MQ_SENDRECV * logging the original value of msg_prio in mq_timedreceive(2) is insane - the argument is write-only (i.e. syscall always ignores the original value and only overwrites it). merge __audit_mq_timed{send,receive} * don't do copy_from_user() twice * don't mess with allocations in auditsc part * ... and don't bother checking !audit_enabled and !context in there - we'd already checked for audit_dummy_context(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:40 -05:00
Al Viro	20114f71b2	sanitize audit_mq_notify() * don't copy_from_user() twice * don't bother with allocations * don't duplicate parts of audit_dummy_context() * make it return void Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:40 -05:00
Al Viro	7392906ea9	sanitize audit_mq_getsetattr() * get rid of allocations * make it return void * don't duplicate parts of audit_dummy_context() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:40 -05:00
Al Viro	e816f370cb	sanitize audit_ipc_set_perm() * get rid of allocations * make it return void * simplify callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:40 -05:00
Al Viro	a33e675100	sanitize audit_ipc_obj() * get rid of allocations * make it return void * simplify callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:39 -05:00
Al Viro	f3298dc4f2	sanitize audit_socketcall * don't bother with allocations * now that it can't fail, make it return void Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-04 15:14:39 -05:00
David Brownell	0931a4c6db	mfd: dm355evm msp430 driver Basic MFD framework for the MSP430 microcontroller firmware used on the dm355evm board: - Provides an interface for other drivers: register read/write utilities, and register declarations. - Directly exports: * Many signals through the GPIO framework + LEDs + SW6 through gpio sysfs + NTSC/nPAL jumper through gpio sysfs + ... more could be added later, e.g. MMC signals * Child devices: + LEDs, via leds-gpio child (and default triggers) + RTC, via rtc-dm355evm child device + Buttons and IR control, via dm355evm_keys - Supports power-off system call. Use the reset button to power the board back up; the power supply LED will be on, but the MSP430 waits to re-activate the regulators. - On probe() this: * Announces firmware revision * Turns off the banked LEDs * Exports the resources noted above * Hooks the power-off support * Muxes tvp5146 -or- imager for video input Unless the new tvp514x driver (tracked for mainline) is configured, this assumes that some custom imager driver handles video-in. This completely ignores the registers reporting the output voltages on the various power supplies. Someone could add a hwmon interface if that seems useful. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:43 +01:00
Mark Brown	ca23f8c1b0	mfd: Add WM8351 support The WM8351 is a WM8350 variant. As well as register default changes the WM8351 has fewer voltage and current regulators than the WM8350. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:42 +01:00
Mark Brown	645524a9c6	mfd: Support configurable numbers of DCDCs and ISINKs on WM8350 Some WM8350 variants have fewer DCDCs and ISINKs. Identify these at probe and refuse to use the absent DCDCs when running on these chips. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:42 +01:00
Mark Brown	9692063062	mfd: Add WM8352 support The WM8352 is a variant of the WM8350. Aside from the register defaults there are no software visible differences to the WM8350. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:42 +01:00
Mike Rapoport	856f6fd119	mfd: Dialog DA9030 battery charger MFD driver This patch amends DA903x MFD driver with definitions and methods needed for battery charger driver. Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:41 +01:00
David Brownell	b73eac7871	mfd: twl4030 regulator bug fixes This contains two bugfixes to the initial twl4030 regulator support patch related to USB: (a) always overwrite the old list of consumers ... else the regulator handles all use the same "usb1v5" name; (b) don't set up the "usbcp" regulator, which turns out to be managed through separate controls, usually ULPI directly from the OTG controller. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:40 +01:00
David Brownell	dad759ff8b	mfd: twl4030: create some regulator devices Initial code to create twl4030 voltage regulator devices, using the new regulator framework. Note that this now starts to care what name is used to declare the TWL chip: - TWL4030 is the "old" chip; newer ones have a bigger variety of VAUX2 voltages. - TWL5030 is the core "new" chip; TPS65950 is its catalog version. - The TPS65930 and TPS65920 are cost-reduced catalog versions of TWL5030 parts ... fewer regulators, no battery charger, etc. Board-specific regulator configuration should be provided, listing which regulators are used and their constraints (e.g. 1.8V only). Code that could ("should"?) leverage the regulator stuff includes TWL4030 USB transceiver support and MMC glue, LCD support for the 3430SDP and Labrador boards, and S-Video output. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:39 +01:00
David Brownell	67460a7c26	mfd: twl4030: cleanup symbols and OMAP dependency Finish removing dependency of TWL driver stack on platform-specific IRQ definitions ... and remove the build dependency on OMAP. This lets the TWL4030 code be included in test builds for most platforms, and will make it easier for non-OMAP folk to update most of this code for new APIs etc. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:39 +01:00
Mark Brown	4008e879e1	power_supply: Add battery health reporting for WM8350 Implement support for reporting battery health in the WM8350 battery interface. Since we are now able to report this via the classs remove the diagnostics from the interrupt handler. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:39 +01:00
Mark Brown	7e386e6e0e	power_supply: Add cold to the POWER_SUPPLY_HEALTH report values Some systems are able to report problems with batteries being under temperature. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:39 +01:00
Mark Brown	b797a55519	mfd: Refactor WM8350 chip identification Since the WM8350 driver was originally written the semantics for the identification registers of the chip have been clarified, allowing us to do an exact match on all the fields. This avoids mistakenly running on unsupported hardware. Also change to using the datasheet names more consistently for legibility and fix a printk() that should be dev_err(). Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:39 +01:00
Mark Brown	d756f4a444	mfd: Switch WM8350 revision detection to a feature based model Rather than check for chip revisions in the WM8350 drivers have the core code set flags for relevant differences. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:38 +01:00
Mark Brown	14431aa0c5	power_supply: Add support for WM8350 PMU This patch adds support for the PMU provided by the WM8350 which implements battery, line and USB supplies including a battery charger. The hardware functions largely autonomously, with minimal software control required to initiate fast charging. Support for configuration of the USB supply is not yet implemented. This means that the hardware will remain in the mode configured at startup, by default limiting the current drawn from USB to 100mA. This driver was originally written by Liam Girdwood with subsequent updates for submission by Mark Brown. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:38 +01:00
David Brownell	3fba19ec1a	mfd: allow reading entire register banks on twl4030 Minor change to the TWL4030 utility interface: support reads of all 256 bytes in each register bank (vs just 255). This can help when debugging, but is otherwise a NOP. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:38 +01:00
Mark Brown	6748852634	mfd: Add AUXADC support for WM8350 The auxiliary ADC in the WM8350 is shared between several subdevices so access to it needs to be arbitrated by the core driver. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:38 +01:00
Mark Brown	0c8a601678	mfd: Add WM8350 revision H support No other software changes are required. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2009-01-04 12:17:38 +01:00
Ingo Molnar	c66b9906f8	intel-iommu: fix build error with INTR_REMAP=y and DMAR=n dmar.o can be built in the CONFIG_INTR_REMAP=y case but iommu_calculate_agaw() is only available if VT-d is built as well. So create an inline version of iommu_calculate_agaw() for the !CONFIG_DMAR case. The iommu->agaw value wont be used in this case, but the code is cleaner (has less #ifdefs) if we have it around unconditionally. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-04 11:00:05 +01:00
Theodore Ts'o	c319106723	ext4: Remove code to create the journal inode This code has been obsolete in quite some time, since the supported method for adding a journal inode is to use tune2fs (or to creating new filesystem with a journal via mke2fs or mkfs.ext4). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-01-06 11:14:25 -05:00
Hannes Eder	725cf0f47d	HID: avoid sparse warning in HID_COMPAT_LOAD_DRIVER Impact: include a prototype for the exported function in the macro Fix about 20 of this warnings: drivers/hid/hid-a4tech.c:162:1: warning: symbol 'hid_compat_a4tech' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-04 01:00:53 +01:00
Jiri Slaby	3a6f82f7a2	HID: add dynids facility Allow adding new devices to the hid drivers on the fly without a need of kernel recompilation. Now, one can test a driver e.g. by: echo 0003:045E:00F0.0003 > ../generic-usb/unbind echo 0003 045E 00F0 > new_id from some driver subdir. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-04 01:00:52 +01:00
Jiri Slaby	0ed94b3342	HID: move usbhid flags to usbhid.h Move usbhid specific flags from global hid.h into local usbhid.h. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-04 01:00:51 +01:00
Jiri Kosina	9188e79ec3	HID: add phys and name ioctls to hidraw The hiddev interface provides ioctl() calls which can be used to obtain phys and raw name of the underlying device. Add the corresponding support also into hidraw. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-04 01:00:51 +01:00
Linus Torvalds	7d3b56ba37	Merge branch 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits) x86: setup_per_cpu_areas() cleanup cpumask: fix compile error when CONFIG_NR_CPUS is not defined cpumask: use alloc_cpumask_var_node where appropriate cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t x86: use cpumask_var_t in acpi/boot.c x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids sched: put back some stack hog changes that were undone in kernel/sched.c x86: enable cpus display of kernel_max and offlined cpus ia64: cpumask fix for is_affinity_mask_valid() cpumask: convert RCU implementations, fix xtensa: define __fls mn10300: define __fls m32r: define __fls h8300: define __fls frv: define __fls cris: define __fls cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS cpumask: zero extra bits in alloc_cpumask_var_node cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/ cpumask: convert mm/ ...	2009-01-03 12:04:39 -08:00
Linus Torvalds	269b012321	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu: (89 commits) AMD IOMMU: remove now unnecessary #ifdefs AMD IOMMU: prealloc_protection_domains should be static kvm/iommu: fix compile warning AMD IOMMU: add statistics about total number of map requests AMD IOMMU: add statistics about allocated io memory AMD IOMMU: add stats counter for domain tlb flushes AMD IOMMU: add stats counter for single iommu domain tlb flushes AMD IOMMU: add stats counter for cross-page request AMD IOMMU: add stats counter for free_coherent requests AMD IOMMU: add stats counter for alloc_coherent requests AMD IOMMU: add stats counter for unmap_sg requests AMD IOMMU: add stats counter for map_sg requests AMD IOMMU: add stats counter for unmap_single requests AMD IOMMU: add stats counter for map_single requests AMD IOMMU: add stats counter for completion wait events AMD IOMMU: add init code for statistic collection AMD IOMMU: add necessary header defines for stats counting AMD IOMMU: add Kconfig entry for statistic collection code AMD IOMMU: use dev_name in iommu_enable function AMD IOMMU: use calc_devid in prealloc_protection_domains ...	2009-01-03 12:03:52 -08:00
Linus Torvalds	f60a0a7984	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (34 commits) V4L/DVB (10173): Missing v4l2_prio_close in radio_release V4L/DVB (10172): add DVB_DEVICE_TYPE= to uevent V4L/DVB (10171): Use usb_set_intfdata V4L/DVB (10170): tuner-simple: prevent possible OOPS caused by divide by zero error V4L/DVB (10168): sms1xxx: fix inverted gpio for lna control on tiger r2 V4L/DVB (10167): sms1xxx: add support for inverted gpio V4L/DVB (10166): dvb frontend: stop using non-C99 compliant comments V4L/DVB (10165): Add FE_CAN_2G_MODULATION flag to frontends that support DVB-S2 V4L/DVB (10164): Add missing S2 caps flag to S2API V4L/DVB (10163): em28xx: allocate adev together with struct em28xx dev V4L/DVB (10162): tuner-simple: Fix tuner type set message V4L/DVB (10161): saa7134: fix autodetection for AVer TV GO 007 FM Plus V4L/DVB (10160): em28xx: update chip id for em2710 V4L/DVB (10157): Add USB ID for the Sil4701 radio from DealExtreme V4L/DVB (10156): saa7134: Add support for Avermedia AVer TV GO 007 FM Plus V4L/DVB (10155): Add TEA5764 radio driver V4L/DVB (10154): saa7134: fix a merge conflict on Behold H6 board V4L/DVB (10153): Add the Beholder H6 card to DVB-T part of sources. V4L/DVB (10152): Change configuration of the Beholder H6 card V4L/DVB (10151): Fix I2C bridge error in zl10353 ...	2009-01-03 12:02:18 -08:00
Yinghai Lu	2f98357001	sparseirq: move set/get_timer_rand_state back to .c those two functions only used in that C file Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-03 12:01:23 -08:00
Linus Torvalds	e9e67a8b57	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: mmc: warn about voltage mismatches mmc_spi: Add support for OpenFirmware bindings pxamci: fix dma_unmap_sg length mmc_block: ensure all sectors that do not have errors are read drivers/mmc: Move a dereference below a NULL test sdhci: handle built-in sdhci with modular leds class mmc: balanc pci_iomap with pci_iounmap mmc_block: print better error messages mmc: Add mmc_vddrange_to_ocrmask() helper function ricoh_mmc: Handle newer models of Ricoh controllers mmc: Add 8-bit bus width support sdhci: activate led support also when module mmc: trivial annotation of 'blocks' pci: use pci_ioremap_bar() in drivers/mmc sdricoh_cs: Add support for Bay Controller devices mmc: at91_mci: reorder timer setup and mmc_add_host() call	2009-01-03 12:00:07 -08:00
Linus Torvalds	61420f59a5	Merge branch 'cputime' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'cputime' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [PATCH] fast vdso implementation for CLOCK_THREAD_CPUTIME_ID [PATCH] improve idle cputime accounting [PATCH] improve precision of idle time detection. [PATCH] improve precision of process accounting. [PATCH] idle cputime accounting [PATCH] fix scaled & unscaled cputime accounting	2009-01-03 11:56:24 -08:00
Mike Travis	7eb1955336	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask into merge-rr-cpumask Conflicts: arch/x86/kernel/io_apic.c kernel/rcuclassic.c kernel/sched.c kernel/time/tick-sched.c Signed-off-by: Mike Travis <travis@sgi.com> [ mingo@elte.hu: backmerged typo fix for io_apic.c ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-03 18:53:31 +01:00
Theodore Ts'o	87d8fe1ee6	add releasepage hooks to block devices which can be used by file systems Implement blkdev_releasepage() to release the buffer_heads and pages after we release private data belonging to a mounted filesystem. Cc: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-01-03 09:47:09 -05:00
Joerg Roedel	e4754c96cf	VT-d: remove now unused intel_iommu_found function Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:11:08 +01:00
Joerg Roedel	d14d65777c	VT-d: adapt domain iova_to_phys function for IOMMU API Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:11:08 +01:00
Joerg Roedel	dde57a210d	VT-d: adapt domain map and unmap functions for IOMMU API Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:11:08 +01:00
Joerg Roedel	4c5478c94e	VT-d: adapt device attach and detach functions for IOMMU API Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:11:08 +01:00
Joerg Roedel	5d450806eb	VT-d: adapt domain init and destroy functions for IOMMU API Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:11:07 +01:00
Joerg Roedel	19de40a847	KVM: change KVM to use IOMMU API Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:11:07 +01:00
Joerg Roedel	4a77a6cf6d	introcude linux/iommu.h for an iommu api This patch introduces the API to abstract the exported VT-d functions for KVM into a generic API. This way the AMD IOMMU implementation can plug into this API later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:10:09 +01:00
Weidong Han	b653574a7d	Deassign device in kvm_free_assgined_device In kvm_iommu_unmap_memslots(), assigned_dev_head is already empty. Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:10:08 +01:00
Weidong Han	0a92035674	KVM: support device deassignment Support device deassignment, it can be used in device hotplug. Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:02:19 +01:00
Weidong Han	260782bcfd	KVM: use the new intel iommu APIs intel iommu APIs are updated, use the new APIs. In addition, change kvm_iommu_map_guest() to just create the domain, let kvm_iommu_assign_device() assign device. Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:02:19 +01:00
Weidong Han	faa3d6f5ff	Change intel iommu APIs of virtual machine domain These APIs are used by KVM to use VT-d Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:02:18 +01:00
Weidong Han	1b5736839a	calculate agaw for each iommu "SAGAW" capability may be different across iommus. Use a default agaw, but if default agaw is not supported in some iommus, choose a less supported agaw. Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-01-03 14:02:18 +01:00
Mark McLoughlin	2abd7e167c	intel-iommu: move iommu_prepare_gfx_mapping() out of dma_remapping.h Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:35 +01:00
Mark McLoughlin	58fa7304a2	intel-iommu: kill off duplicate def of dmar_disabled This is only used in dmar.c and intel-iommu.h, so dma_remapping.h seems like the appropriate place for it. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:35 +01:00
Mark McLoughlin	a647dacbb1	intel-iommu: move struct device_domain_info out of dma_remapping.h Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:35 +01:00
Mark McLoughlin	99126f7ce1	intel-iommu: move struct dmar_domain def out dma_remapping.h Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:35 +01:00
Mark McLoughlin	622ba12a4c	intel-iommu: move DMA PTE defs out of dma_remapping.h DMA_PTE_READ/WRITE are needed by kvm. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:35 +01:00
Mark McLoughlin	7a8fc25e0c	intel-iommu: move context entry defs out from dma_remapping.h Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:35 +01:00
Mark McLoughlin	46b08e1a76	intel-iommu: move root entry defs from dma_remapping.h We keep the struct root_entry forward declaration for the pointer in struct intel_iommu. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:35 +01:00
Mark McLoughlin	f27be03b27	intel-iommu: move DMA_32/64BIT_PFN into intel-iommu.c Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:34 +01:00
Mark McLoughlin	519a054915	intel-iommu: make init_dmars() static init_dmars() is not used outside of drivers/pci/intel-iommu.c Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:34 +01:00
Mark McLoughlin	015ab17dc2	intel-iommu: remove some unused struct intel_iommu fields The seg, saved_msg and sysdev fields appear to be unused since before the code was first merged. linux/msi.h is not needed in linux/intel-iommu.h anymore since there is no longer a reference to struct msi_msg. The MSI code in drivers/pci/intel-iommu.c still has linux/msi.h included via linux/dmar.h. linux/sysdev.h isn't needed because there is no reference to struct sys_device. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-03 11:57:34 +01:00
Ingo Molnar	6680598b44	Disallow gcc versions 3.{0,1} GCC 3.0 and 3.1 are too old to build a working kernel. Signed-off-by: Ingo Molnar <mingo@elte.hu> [ This check got dropped as obsolete when I simplified the gcc header inclusion mess in `f153b82121`, but Willy Tarreau reports actually having those old versions still.. -Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 12:19:34 -08:00
Linus Torvalds	b840d79631	Merge branch 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits) x86: export vector_used_by_percpu_irq x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and() sched: nominate preferred wakeup cpu, fix x86: fix lguest used_vectors breakage, -v2 x86: fix warning in arch/x86/kernel/io_apic.c sched: fix warning in kernel/sched.c sched: move test_sd_parent() to an SMP section of sched.h sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0 sched: activate active load balancing in new idle cpus sched: bias task wakeups to preferred semi-idle packages sched: nominate preferred wakeup cpu sched: favour lower logical cpu number for sched_mc balance sched: framework for sched_mc/smt_power_savings=N sched: convert BALANCE_FOR_xx_POWER to inline functions x86: use possible_cpus=NUM to extend the possible cpus allowed x86: fix cpu_mask_to_apicid_and to include cpu_online_mask x86: update io_apic.c to the new cpumask code x86: Introduce topology_core_cpumask()/topology_thread_cpumask() x86: xen: use smp_call_function_many() x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c ... Fixed up trivial conflict in kernel/time/tick-sched.c manually	2009-01-02 11:44:09 -08:00
Linus Torvalds	597b0d2162	Merge branch 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (140 commits) KVM: MMU: handle large host sptes on invlpg/resync KVM: Add locking to virtual i8259 interrupt controller KVM: MMU: Don't treat a global pte as such if cr4.pge is cleared MAINTAINERS: Maintainership changes for kvm/ia64 KVM: ia64: Fix kvm_arch_vcpu_ioctl_[gs]et_regs() KVM: x86: Rework user space NMI injection as KVM_CAP_USER_NMI KVM: VMX: Fix pending NMI-vs.-IRQ race for user space irqchip KVM: fix handling of ACK from shared guest IRQ KVM: MMU: check for present pdptr shadow page in walk_shadow KVM: Consolidate userspace memory capability reporting into common code KVM: Advertise the bug in memory region destruction as fixed KVM: use cpumask_var_t for cpus_hardware_enabled KVM: use modern cpumask primitives, no cpumask_t on stack KVM: Extract core of kvm_flush_remote_tlbs/kvm_reload_remote_mmus KVM: set owner of cpu and vm file operations anon_inodes: use fops->owner for module refcount x86: KVM guest: kvm_get_tsc_khz: return khz, not lpj KVM: MMU: prepopulate the shadow on invlpg KVM: MMU: skip global pgtables on sync due to cr3 switch KVM: MMU: collapse remote TLB flushes on root sync ...	2009-01-02 11:41:11 -08:00
Mauro Carvalho Chehab	e4cda3e072	V4L/DVB (10166): dvb frontend: stop using non-C99 compliant comments Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2009-01-02 17:15:07 -02:00
Klaus Schmidinger	cb889a2f35	V4L/DVB (10164): Add missing S2 caps flag to S2API The attached patch adds a capability flag that allows an application to determine whether a particular device can handle "second generation modulation" transponders. This is necessary in order for applications to be able to decide which device to use for a given channel in a multi device environment, where DVB-S and DVB-S2 devices are mixed. It is assumed that a device capable of handling "second generation modulation" can implicitly handle "first generation modulation". The flag is not named anything with DVBS2 in order to allow its use with future DVBT2 devices as well (should they ever come). Signed-off by: Klaus Schmidinger <Klaus.Schmidinger@cadsoft.de> Acked-by: Steven Toth <stoth@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2009-01-02 17:15:01 -02:00
Hans Verkuil	aecde8b53b	V4L/DVB (10141): v4l2: debugging API changed to match against driver name instead of ID. Since the i2c driver ID will be removed in the near future we have to modify the v4l2 debugging API to use the driver name instead of driver ID. Note that this API is not used in applications other than v4l2-dbg.cpp as it is for debugging and testing only. Should anyone use the old VIDIOC_G_CHIP_IDENT, then this will be logged with a warning that it is deprecated and will be removed in 2.6.30. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2009-01-02 17:11:52 -02:00
Linus Torvalds	2640c9a90f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (32 commits) ide-atapi: start dma in a drive-specific way ide-atapi: put the rest of non-ide-cd code into the else-clause of ide_transfer_pc ide-atapi: remove timeout arg to ide_issue_pc ide-cd: remove handler wrappers ide-cd: remove xferlen arg to cdrom_start_packet_command ide-atapi: split drive-specific functionality in ide_issue_pc ide-atapi: assign expiry and timeout based on device type ide-atapi: compute cmd_len based on device type in ide_transfer_pc ide: remove the last ide-scsi remnants ide-atapi: remove ide-scsi remnants from ide_pc_intr() ide-atapi: remove ide-scsi remnants from ide_transfer_pc() ide-atapi: remove ide-scsi remnants from ide_issue_pc ide-cd: move cdrom_timer_expiry to ide-atapi.c ide-atapi: teach ide atapi about drive->waiting_for_dma ide-atapi: accomodate transfer length calculation for ide-cd ide-atapi: setup dma for ide-cd ide-atapi: combine drive-specific assignments ide-atapi: add a dev_is_idecd-inline remove ide-scsi ide-floppy: allocate only toplevel packet commands ...	2009-01-02 10:32:18 -08:00
Linus Torvalds	80618fa83a	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb * 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb: (31 commits) uwb: remove beacon cache entry after calling uwb_notify() uwb: remove unused include/linux/uwb/debug.h uwb: use print_hex_dump() uwb: use dev_dbg() for debug messages uwb: fix memory leak in uwb_rc_notif() wusb: fix oops when terminating a non-existant reservation uwb: fix oops when terminating an already terminated reservation uwb: improved MAS allocator and reservation conflict handling wusb: add debug files for ASL, PZL and DI to the whci-hcd driver uwb: fix oops in debug PAL's reservation callback uwb: clean up whci_wait_for() timeout error message wusb: whci-hcd shouldn't do ASL/PZL updates while channel is inactive uwb: remove unused beacon group join/leave events wlp: start/stop radio on network interface up/down uwb: add basic radio manager uwb: add pal parameter to new reservation callback uwb: fix races between events and neh timers uwb: don't unbind the radio controller driver when resetting uwb: per-radio controller event thread and beacon cache uwb: add commands to add/remove IEs to the debug interface ...	2009-01-02 10:31:04 -08:00
Linus Torvalds	d2fde28ce7	Merge branch 'tty-updates' from Alan * tty-updates: (75 commits) serial_8250: support for Sealevel Systems Model 7803 COMM+8 hso maintainers update patch hso modem detect fix patch against Alan Cox'es tty tree tty: Fix an ircomm warning and note another bug drivers/char/cyclades.c: cy_pci_probe: fix error path Serial: UART driver changes for Cavium OCTEON. Serial: Allow port type to be specified when calling serial8250_register_port. 8250: Serial driver changes to support future Cavium OCTEON serial patches. 8250: Don't clobber spinlocks. fix for tty-serial-move-port tty: We want the port object to be persistent __FUNCTION__ is gcc-specific, use __func__ serial: RS485 ioctl structure uses __u32 include linux/types.h tty: Drop the lock_kernel in the private ioctl hook synclink_cs: Convert to tty_port tty: use port methods for the rocket driver tty: kref the rocket driver tty: make rocketport use standard port->flags tty: Redo the rocket driver locking tty: Make epca use the port helpers ...	2009-01-02 10:26:01 -08:00
Flavio Leitner	e65f0f8271	serial_8250: support for Sealevel Systems Model 7803 COMM+8 Add support for Sealevel Systems Model 7803 COMM+8 Signed-off-by: Flavio Leitner <fleitner@redhat.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:44 -08:00
David Daney	6b06f19151	Serial: UART driver changes for Cavium OCTEON. Cavium UART implementation is not covered by existing uart_configS. Define a new uart_config (PORT_OCTEON) which is specified by OCTEON platform device registration code. Signed-off-by: Tomaso Paoletti <tpaoletti@caviumnetworks.com> Signed-off-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:43 -08:00
David Daney	8e23fcc89c	Serial: Allow port type to be specified when calling serial8250_register_port. Add flag value UPF_FIXED_TYPE which specifies that the UART type is known and should not be probed. For this case the UARTs properties are just copied out of the uart_config entry. This allows us to keep SOC specific 8250 probe code out of 8250.c. In this case we know the serial hardware will not be changing as it is on the same silicon as the CPU, and we can specify it with certainty in the board/cpu setup code. The alternative is to load up 8250.c with a bunch of OCTEON specific special cases in the probing code. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:43 -08:00
David Daney	7d6a07d123	8250: Serial driver changes to support future Cavium OCTEON serial patches. In order to use Cavium OCTEON specific serial i/o drivers, we first patch the 8250 driver to use replaceable I/O functions. Compatible I/O functions are added for existing iotypeS. An added benefit of this change is that it makes it easy to factor some of the existing special cases out to board/SOC specific support code. The alternative is to load up 8250.c with a bunch of OCTEON specific iotype code and bug work-arounds. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Tomaso Paoletti <tpaoletti@caviumnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:43 -08:00
Alan Cox	f751928e0d	tty: We want the port object to be persistent Move the tty_port and uart_info bits around a little. By embedding the uart_info into the uart_port we get rid of lots of corner case testing and also get the ability to go port<->state<->info which is a bit more elegant than the current data structures. Downsides - we allocate a tiny bit more memory for unused ports, upside we've removed as much code as it saved for most users.. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:42 -08:00
Andy Whitcroft	60c20fb8c0	serial: RS485 ioctl structure uses __u32 include linux/types.h In the commit below a new struct serial_rs485 was introduced for a new ioctl: commit `c26c56c0f4` Author: Alan Cox <alan@redhat.com> Date: Mon Oct 13 10:37:48 2008 +0100 tty: Cris has a nice RS485 ioctl so we should steal it This structure uses the __u32 types for some of its members, which leads to the following compile error: $ cc -I.../include -c X.c In file included from X.c:2: .../include/linux/serial.h:185: error: expected specifier-qualifier-list before ‘__u32’ $ It seems that these types are appropriate for this structure as it is to be exposed to userspace. These types are available via linux/types.h so move the include of that outside the __KERNEL__ section. Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:42 -08:00
Niels de Vos	39aced68d6	serial: set correct baud_base for Oxford Semiconductor Ltd EXSYS EX-41092 Dual 16950 Serial adapter The PCI-card identified as "Oxford Semiconductor Ltd EXSYS EX-41092 Dual 16950 Serial adapter" is only usable with other devices (i.e. not the same card) after doing a "setserial /dev/ttyS<n> baud_base 115200". This baud_base should be default for this card. Signed-off-by: Niels de Vos <niels.devos@wincor-nixdorf.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:40 -08:00
Alan Cox	a6614999e8	tty: Introduce some close helpers for ports Again this is a lot of common code we can unify Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:40 -08:00
Alan Cox	2a6eadbd5a	tty: Rework istallion to use the tty port changes Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:39 -08:00
Alan Cox	36c621d82b	tty: Introduce a tty_port generic block_til_ready Start sucking more commonality out of the drivers into a single piece of core code. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:39 -08:00
Alan Cox	3e61696bdc	isicom: redo locking to use tty port locks This helps set the basis for moving block_til_ready into common code. We also introduce a tty_port_hangup helper as this will also be generally needed. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:38 -08:00
Alan Cox	5d951fb458	tty: Pull the dtr raise into tty port This moves another per device special out of what should be shared open wait paths into private methods Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:38 -08:00
Alan Cox	31f35939d1	tty_port: Add a port level carrier detect operation This is the first step to generalising the various pieces of waiting logic duplicated in all sorts of serial drivers. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:38 -08:00
Alan Cox	c9b3976e3f	tty: Fix PPP hang under load Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:38 -08:00
Russell King	975a1a7d88	And here's a patch (to be applied on top of the last) which prevents this happening again by making use of 'const'. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:37 -08:00
Alan Cox	fc6f623822	pty: simplify resize We have special case logic for resizing pty/tty pairs. We also have a per driver resize method so for the pty case we should use it. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:36 -08:00
Joe Peterson	a88a69c912	n_tty: Fix loss of echoed characters and remove bkl from n_tty Fixes the loss of echoed (and other ldisc-generated characters) when the tty is stopped or when the driver output buffer is full (happens frequently for input during continuous program output, such as ^C) and removes the Big Kernel Lock from the N_TTY line discipline. Adds an "echo buffer" to the N_TTY line discipline that handles all ldisc-generated output (including echoed characters). Along with the loss of characters, this also fixes the associated loss of sync between tty output and the ldisc state when characters cannot be immediately written to the tty driver. The echo buffer stores (in addition to characters) state operations that need to be done at the time of character output (like management of the column position). This allows echo to cooperate correctly with program output, since the ldisc state remains consistent with actual characters written. Since the echo buffer code now isolates the tty column state code to the process_out* and process_echoes functions, we can remove the Big Kernel Lock (BKL) and replace it with mutex locks. Highlights are: * Handles echo (and other ldisc output) when tty driver buffer is full - continuous program output can block echo * Saves echo when tty is in stopped state (e.g. ^S) - (e.g.: ^Q will correctly cause held characters to be released for output) * Control character pairs (e.g. "^C") are treated atomically and not split up by interleaved program output * Line discipline state is kept consistent with characters sent to the tty driver * Remove the big kernel lock (BKL) from N_TTY line discipline Signed-off-by: Joe Peterson <joe@skyrush.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 10:19:35 -08:00
Linus Torvalds	f9d1425007	Disallow gcc versions 4.1.{0,1} These compiler versions are known to miscompile __weak functions and thus generate kernels that don't necessarily work correctly. If a weak function is int he same compilation unit as a caller, gcc may end up inlining it, and thus binding the weak function too early. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27781 for details. Cc: Adrian Bunk <bunk@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 09:29:43 -08:00
Linus Torvalds	f153b82121	Sanitize gcc version header includes - include the gcc version-dependent header files from the generic gcc header file, rather than the other way around (iow: don't make the non-gcc header file have to know about gcc versions) - don't include compiler-gcc4.h for gcc 5 (for whenever it gets released). That's just confusing and made us do odd things in the gcc4 header file (testing that we really had version 4!) - generate the name from the __GNUC__ version directly, rather than having a mess of #if conditionals. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-02 09:23:03 -08:00
FUJITA Tomonori	97ae77a1cd	[SCSI] block: make blk_rq_map_user take a NULL user-space buffer for WRITE The commit `818827669d` (block: make blk_rq_map_user take a NULL user-space buffer) extended blk_rq_map_user to accept a NULL user-space buffer with a READ command. It was necessary to convert sg to use the block layer mapping API. This patch extends blk_rq_map_user again for a WRITE command. It is necessary to convert st and osst drivers to use the block layer apping API. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-01-02 11:10:35 -06:00
FUJITA Tomonori	56c451f4b5	[SCSI] block: fix the partial mappings with struct rq_map_data This fixes bio_copy_user_iov to properly handle the partial mappings with struct rq_map_data (which only sg uses for now but st and osst will shortly). It adds the offset member to struct rq_map_data and changes blk_rq_map_user to update it so that bio_copy_user_iov can add an appropriate page frame via bio_add_pc_page(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-01-02 11:10:08 -06:00
Borislav Petkov	28ad91db77	ide-atapi: remove timeout arg to ide_issue_pc There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:56 +01:00
Borislav Petkov	5317464dcc	ide: remove the last ide-scsi remnants Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:54 +01:00
Borislav Petkov	5d655a03b8	ide-atapi: remove ide-scsi remnants from ide_pc_intr() As a result, remove now unused ide_scsi_get_timeout and ide_scsi_expiry. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:54 +01:00
Borislav Petkov	4cad085efb	ide-cd: move cdrom_timer_expiry to ide-atapi.c - cdrom_timer_expiry -> ide_cd_expiry - remove expiry-arg to ide_issue_pc as it is redundant now - ide_debug_log -> debug_log Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:53 +01:00
Borislav Petkov	392de1d53d	ide-atapi: accomodate transfer length calculation for ide-cd ... by factoring it out of ide_cd_do_request() into a helper, as suggested by Bart. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> [bart: BLK_DEV_IDECD needs to select IDE_ATAPI now] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:52 +01:00
Borislav Petkov	bf64741fe8	ide: make IDE_AFLAG_.. numbering continuous again Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:50 +01:00
Bartlomiej Zolnierkiewicz	201bffa464	ide: use per-device request queue locks (v2) * Move hack for flush requests from choose_drive() to do_ide_request(). * Add ide_plug_device() helper and convert core IDE code from using per-hwgroup lock as a request lock to use the ->queue_lock instead. * Remove no longer needed: - choose_drive() function - WAKEUP() macro - 'sleeping' flag from ide_hwif_t - 'service_{start,time}' fields from ide_drive_t This patch results in much simpler and more maintainable code (besides being a scalability improvement). v2: * Fixes/improvements based on review from Elias: - take as many requests off the queue as possible - remove now redundant BUG_ON() Cc: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:50 +01:00
Bartlomiej Zolnierkiewicz	631de3708d	ide: add ide_[un]lock_hwgroup() helpers Add ide_[un]lock_hwgroup() inline helpers for obtaining exclusive access to the given hwgroup and update the core code accordingly. [ This change besides making code saner results in more efficient use of ide_{get,release}_lock(). ] Cc: Michael Schmitz <schmitz@biophys.uni-duesseldorf.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:50 +01:00
Bartlomiej Zolnierkiewicz	295f00042a	ide: don't execute the next queued command from the hard-IRQ context (v2) * Tell the block layer that we are not done handling requests by using blk_plug_device() in ide_do_request() (request handling function) and ide_timer_expiry() (timeout handler) if the queue is not empty. * Remove optimization which directly calls ide_do_request() for the next queued command from the ide_intr() (IRQ handler) and ide_timer_expiry(). * Remove no longer needed IRQ masking from ide_do_request() - in case of IDE ports needing serialization disable_irq_nosync()/enable_irq() was used for the (possibly shared) IRQ of the other IDE port. * Put the misplaced comment in the right place in ide_do_request(). * Drop no longer needed 'int masked_irq' argument from ide_do_request(). * Merge ide_do_request() into do_ide_request(). * Remove no longer needed IDE_NO_IRQ define. While at it: * Don't use HWGROUP() macro in do_ide_request(). * Use __func__ in ide_intr(). This patch reduces IRQ hadling latency for IDE and improves the system-wide handling of shared IRQs (which should result in more timeout resistant and stable IDE systems). It also makes it possible to do some further changes later (i.e. replace some busy-waiting delays with sleeping equivalents). v2: Changes per review from Elias Oltmanns: - fix wrong goto statement in 'if (startstop == ide_stopped)' block - use spin_unlock_irq() - don't use obsolete HWIF() macro Cc: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:48 +01:00
Bartlomiej Zolnierkiewicz	ebdab07dad	ide: move sysfs support to ide-sysfs.c While at it: - media_string() -> ide_media_string() There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-02 16:12:48 +01:00
David Vrabel	b21a207141	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-upstream Conflicts: drivers/uwb/wlp/eda.c	2009-01-02 13:17:13 +00:00
Linus Torvalds	b58602a4ba	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (34 commits) nfsd race fixes: jfs nfsd race fixes: reiserfs nfsd race fixes: ext4 nfsd race fixes: ext3 nfsd race fixes: ext2 nfsd/create race fixes, infrastructure filesystem notification: create fs/notify to contain all fs notification fs/block_dev.c: __read_mostly improvement and sb_is_blkdev_sb utilization kill ->dir_notify() filp_cachep can be static in fs/file_table.c fix f_count description in Documentation/filesystems/files.txt make INIT_FS use the __RW_LOCK_UNLOCKED initialization take init_fs to saner place kill vfs_permission pass a struct path * to may_open kill walk_init_root remove incorrect comment in inode_permission expand some comments (d_path / seq_path) correct wrong function name of d_put in kernel document and source comment fix switch_names() breakage in short-to-short case ...	2008-12-31 15:57:56 -08:00
Rusty Russell	8c384cdee3	cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS Impact: new debug CONFIG options This helps find unconverted code. It currently breaks compile horribly, but we never wanted a flag day so that's expected. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-01-01 10:12:30 +10:30
Rusty Russell	41c7bb9588	cpumask: convert rest of files in kernel/ Impact: Reduce stack usage, use new cpumask API. Mainly changing cpumask_t to 'struct cpumask' and similar simple API conversion. Two conversions worth mentioning: 1) we use cpumask_any_but to avoid a temporary in kernel/softlockup.c, 2) Use cpumask_var_t in taskstats_user_cmd(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com>	2009-01-01 10:12:28 +10:30
Rusty Russell	bd232f97b3	cpumask: convert RCU implementations Impact: use new cpumask API. rcu_ctrlblk contains a cpumask, and it's highly optimized so I don't want a cpumask_var_t (ie. a pointer) for the CONFIG_CPUMASK_OFFSTACK case. It could use a dangling bitmap, and be allocated in __rcu_init to save memory, but for the moment we use a bitmap. (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK, so we use a bitmap here to show we really mean it). We remove on-stack cpumasks, using cpumask_var_t for rcu_torture_shuffle_tasks() and for_each_cpu_and in force_quiescent_state(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-01-01 10:12:26 +10:30
Rusty Russell	d036e67b40	cpumask: convert kernel/irq Impact: Reduce stack usage, use new cpumask API. ALPHA mod! Main change is that irq_default_affinity becomes a cpumask_var_t, so treat it as a pointer (this effects alpha). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-01-01 10:12:26 +10:30
Rusty Russell	6b954823c2	cpumask: convert kernel time functions Impact: Use new APIs Convert kernel/time functions to use struct cpumask *. Note the ugly bitmap declarations in tick-broadcast.c. These should be cpumask_var_t, but there was no obvious initialization function to put the alloc_cpumask_var() calls in. This was safe. (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK, so we use a bitmap here to show we really mean it). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>	2009-01-01 10:12:25 +10:30
Rusty Russell	ab53d472e7	bitmap: find_last_bit() Impact: New API As the name suggests. For the moment everyone uses the generic one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-01-01 10:12:19 +10:30
Al Viro	261bca86ed	nfsd/create race fixes, infrastructure new helpers - insert_inode_locked() and insert_inode_locked4(). Hash new inode, making sure that there's no such inode in icache already. If there is and it does not end up unhashed (as would happen if we have nfsd trying to resolve a bogus fhandle), fail. Otherwise insert our inode into hash and succeed. In either case have i_state set to new+locked; cleanup ends up being simpler with such calling conventions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:43 -05:00
Al Viro	6badd79bd0	kill ->dir_notify() Remove the hopelessly misguided ->dir_notify(). The only instance (cifs) has been broken by design from the very beginning; the objects it creates are never destroyed, keep references to struct file they can outlive, nothing that could possibly evict them exists on close(2) path and no locking whatsoever is done to prevent races with close(), should the previous, er, deficiencies someday be dealt with. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:43 -05:00
Eric Dumazet	b6b3fdead2	filp_cachep can be static in fs/file_table.c Instead of creating the "filp" kmem_cache in vfs_caches_init(), we can do it a litle be later in files_init(), so that filp_cachep is static to fs/file_table.c Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:42 -05:00
Al Viro	18d8fda7c3	take init_fs to saner place Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:42 -05:00
Christoph Hellwig	cb23beb551	kill vfs_permission With all the nameidata removal there's no point anymore for this helper. Of the three callers left two will go away with the next lookup series anyway. Also add proper kerneldoc to inode_permission as this is the main permission check routine now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:41 -05:00
Christoph Hellwig	3fb64190aa	pass a struct path * to may_open No need for the nameidata in may_open - a struct path is enough. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:41 -05:00
Duane Griffin	035146851c	vfs: introduce helper function to safely NUL-terminate symlinks A number of filesystems were potentially triggering kernel bugs due to corrupted symlink names on disk. This function helps safely terminate the names. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:38 -05:00
Jan Engelhardt	dded4f4d50	include: linux/fs.h: put declarations in __KERNEL__ include/linux/fs.h contains externs for a bunch of variables. That obviously belongs under ifdef __KERNEL__. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:38 -05:00
Nick Piggin	c2452f3278	shrink struct dentry struct dentry is one of the most critical structures in the kernel. So it's sad to see it going neglected. With CONFIG_PROFILING turned on (which is probably the common case at least for distros and kernel developers), sizeof(struct dcache) == 208 here (64-bit). This gives 19 objects per slab. I packed d_mounted into a hole, and took another 4 bytes off the inline name length to take the padding out from the end of the structure. This shinks it to 200 bytes. I could have gone the other way and increased the length to 40, but I'm aiming for a magic number, read on... I then got rid of the d_cookie pointer. This shrinks it to 192 bytes. Rant: why was this ever a good idea? The cookie system should increase its hash size or use a tree or something if lookups are a problem. Also the "fast dcookie lookups" in oprofile should be moved into the dcookie code -- how can oprofile possibly care about the dcookie_mutex? It gets dropped after get_dcookie() returns so it can't be providing any sort of protection. At 192 bytes, 21 objects fit into a 4K page, saving about 3MB on my system with ~140 000 entries allocated. 192 is also a multiple of 64, so we get nice cacheline alignment on 64 and 32 byte line systems -- any given dentry will now require 3 cachelines to touch all fields wheras previously it would require 4. I know the inline name size was chosen quite carefully, however with the reduction in cacheline footprint, it should actually be just about as fast to do a name lookup for a 36 character name as it was before the patch (and faster for other sizes). The memory footprint savings for names which are <= 32 or > 36 bytes long should more than make up for the memory cost for 33-36 byte names. Performance is a feature... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:38 -05:00
Kentaro Takeda	be6d3e56a6	introduce new LSM hooks where vfsmount is available. Add new LSM hooks for path-based checks. Call them on directory-modifying operations at the points where we still know the vfsmount involved. Signed-off-by: Kentaro Takeda <takedakn@nttdata.co.jp> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Toshiharu Harada <haradats@nttdata.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:37 -05:00
Anton Vorontsov	9c43df5791	mmc_spi: Add support for OpenFirmware bindings The support is implemented via platform data accessors, new module (of_mmc_spi) will be created automatically when the driver compiles on OpenFirmware platforms. Link-time dependency will load the module automatically. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-12-31 19:01:55 +01:00
Anton Vorontsov	86e8286a0e	mmc: Add mmc_vddrange_to_ocrmask() helper function This function sets the OCR mask bits according to provided voltage ranges. Will be used by the mmc_spi OpenFirmware bindings. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-12-31 18:18:13 +01:00
Jarkko Lavinen	b30f8af335	mmc: Add 8-bit bus width support Signed-off-by: Jarkko Lavinen <jarkko.lavinen@nokia.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-12-31 18:18:12 +01:00
Linus Torvalds	db200df0b3	Merge branch 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sparseirq: move __weak symbols into separate compilation unit sparseirq: work around __weak alias bug sparseirq: fix hang with !SPARSE_IRQ sparseirq: set lock_class for legacy irq when sparse_irq is selected sparseirq: work around compiler optimizing away __weak functions sparseirq: fix desc->lock init sparseirq: do not printk when migrating IRQ descriptors sparseirq: remove duplicated arch_early_irq_init() irq: simplify for_each_irq_desc() usage proc: remove ifdef CONFIG_SPARSE_IRQ from stat.c irq: for_each_irq_desc() move to irqnr.h hrtimer: remove #include <linux/irq.h>	2008-12-31 09:00:59 -08:00
Jan Kiszka	4531220b71	KVM: x86: Rework user space NMI injection as KVM_CAP_USER_NMI There is no point in doing the ready_for_nmi_injection/ request_nmi_window dance with user space. First, we don't do this for in-kernel irqchip anyway, while the code path is the same as for user space irqchip mode. And second, there is nothing to loose if a pending NMI is overwritten by another one (in contrast to IRQs where we have to save the number). Actually, there is even the risk of raising spurious NMIs this way because the reason for the held-back NMI might already be handled while processing the first one. Therefore this patch creates a simplified user space NMI injection interface, exporting it under KVM_CAP_USER_NMI and dropping the old KVM_CAP_NMI capability. And this time we also take care to provide the interface only on archs supporting NMIs via KVM (right now only x86). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:47 +02:00
Mark McLoughlin	defaf1587c	KVM: fix handling of ACK from shared guest IRQ If an assigned device shares a guest irq with an emulated device then we currently interpret an ack generated by the emulated device as originating from the assigned device leading to e.g. "Unbalanced enable for IRQ 4347" from the enable_irq() in kvm_assigned_dev_ack_irq(). The fix is fairly simple - don't enable the physical device irq unless it was previously disabled. Of course, this can still lead to a situation where a non-assigned device ACK can cause the physical device irq to be reenabled before the device was serviced. However, being level sensitive, the interrupt will merely be regenerated. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:47 +02:00
Avi Kivity	1a811b6167	KVM: Advertise the bug in memory region destruction as fixed Userspace might need to act differently. Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:46 +02:00
Sheng Yang	6b9cc7fd46	KVM: Enable MSI for device assignment We enable guest MSI and host MSI support in this patch. The userspace want to enable MSI should set KVM_DEV_IRQ_ASSIGN_ENABLE_MSI in the assigned_irq's flag. Function would return -ENOTTY if can't enable MSI, userspace shouldn't set MSI Enable bit when KVM_ASSIGN_IRQ return -ENOTTY with KVM_DEV_IRQ_ASSIGN_ENABLE_MSI. Userspace can tell the support of MSI device from #ifdef KVM_CAP_DEVICE_MSI. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:02 +02:00
Sheng Yang	0937c48d07	KVM: Add fields for MSI device assignment Prepared for kvm_arch_assigned_device_msi_dispatch(). Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:01 +02:00
Sheng Yang	4f906c19ae	KVM: Replace irq_requested with more generic irq_requested_type Separate guest irq type and host irq type, for we can support guest using INTx with host using MSI (but not opposite combination). Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:01 +02:00
Sheng Yang	e19e30effa	KVM: IRQ ACK notifier should be used with in-kernel irqchip Also remove unnecessary parameter of unregister irq ack notifier. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:51:47 +02:00
Jan Kiszka	c4abb7c9cd	KVM: x86: Support for user space injected NMIs Introduces the KVM_NMI IOCTL to the generic x86 part of KVM for injecting NMIs from user space and also extends the statistic report accordingly. Based on the original patch by Sheng Yang. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:51:42 +02:00
Martin Schwidefsky	79741dd357	[PATCH] idle cputime accounting The cpu time spent by the idle process actually doing something is currently accounted as idle time. This is plain wrong, the architectures that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the time spent doing nothing and the time spent by idle doing work. The first is accounted with account_idle_time and the second with account_system_time. The architectures that use the account_xxx_time interface directly and not the account_xxx_ticks interface now need to do the check for the idle process in their arch code. In particular to improve the system vs true idle time accounting the arch code needs to measure the true idle time instead of just testing for the idle process. To improve the tick based accounting as well we would need an architecture primitive that can tell us if the pt_regs of the interrupted context points to the magic instruction that halts the cpu. In addition idle time is no more added to the stime of the idle process. This field now contains the system time of the idle process as it should be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as every tick that occurs while idle is running will be accounted as idle time. This patch contains the necessary common code changes to be able to distinguish idle system time and true idle time. The architectures with support for VIRT_CPU_ACCOUNTING need some changes to exploit this. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-12-31 15:11:46 +01:00
Martin Schwidefsky	457533a7d3	[PATCH] fix scaled & unscaled cputime accounting The utimescaled / stimescaled fields in the task structure and the global cpustat should be set on all architectures. On s390 the calls to account_user_time_scaled and account_system_time_scaled never have been added. In addition system time that is accounted as guest time to the user time of a process is accounted to the scaled system time instead of the scaled user time. To fix the bugs and to prevent future forgetfulness this patch merges account_system_time_scaled into account_system_time and account_user_time_scaled into account_user_time. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Michael Neuling <mikey@neuling.org> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-12-31 15:11:46 +01:00
Rusty Russell	2ca1a61583	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: arch/x86/kernel/io_apic.c	2008-12-31 23:05:57 +10:30
Thomas Gleixner	1c5745aa38	sched_clock: prevent scd->clock from moving backwards, take #2 Redo: `5b7dba4`: sched_clock: prevent scd->clock from moving backwards which had to be reverted due to s2ram hangs: `ca7e716`: Revert "sched_clock: prevent scd->clock from moving backwards" ... this time with resume restoring GTOD later in the sequence taken into account as well. The "timekeeping_suspended" flag is not very nice but we cannot call into GTOD before it has been properly resumed and the scheduler will run very early in the resume sequence. Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-31 09:53:21 +01:00
Ingo Molnar	a9de18eb76	Merge branch 'linus' into stackprotector Conflicts: arch/x86/include/asm/pda.h kernel/fork.c	2008-12-31 08:31:57 +01:00
Ingo Molnar	818fa7f390	Merge branch 'tracing/kmemtrace' into tracing/kmemtrace2	2008-12-31 08:19:48 +01:00
Ingo Molnar	5fdf7e5975	Merge branch 'linus' into tracing/kmemtrace Conflicts: mm/slub.c	2008-12-31 08:14:29 +01:00
Lin Ming	ea7e96e0f2	ACPI: remove private acpica headers from driver files External driver files should not include any private acpica headers. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-12-31 01:15:22 -05:00
Bjorn Helgaas	f748bafa3c	ACPI: PCI: move struct acpi_prt_entry declaration out of public header file The struct acpi_prt_entry is used only in pci_irq.c, so there's no need for the declaration to be public. This patch moves it into pci_irq.c. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-12-30 21:20:23 -05:00
Linus Torvalds	6a94cb7306	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: (184 commits) [XFS] Fix race in xfs_write() between direct and buffered I/O with DMAPI [XFS] handle unaligned data in xfs_bmbt_disk_get_all [XFS] avoid memory allocations in xfs_fs_vcmn_err [XFS] Fix speculative allocation beyond eof [XFS] Remove XFS_BUF_SHUT() and friends [XFS] Use the incore inode size in xfs_file_readdir() [XFS] set b_error from bio error in xfs_buf_bio_end_io [XFS] use inode_change_ok for setattr permission checking [XFS] add a FMODE flag to make XFS invisible I/O less hacky [XFS] resync headers with libxfs [XFS] simplify projid check in xfs_rename [XFS] replace b_fspriv with b_mount [XFS] Remove unused tracing code [XFS] Remove unnecessary assertion [XFS] Remove unused variable in ktrace_free() [XFS] Check return value of xfs_buf_get_noaddr() [XFS] Fix hang after disallowed rename across directory quota domains [XFS] Fix compile with CONFIG_COMPAT enabled move inode tracing out of xfs_vnode. move vn_iowait / vn_iowake into xfs_aops.c ...	2008-12-30 17:48:25 -08:00
Linus Torvalds	f57fa1d6a6	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (70 commits) fs/nfs/nfs4proc.c: make nfs4_map_errors() static rpc: add service field to new upcall rpc: add target field to new upcall nfsd: support callbacks with gss flavors rpc: allow gss callbacks to client rpc: pass target name down to rpc level on callbacks nfsd: pass client principal name in rsc downcall rpc: implement new upcall rpc: store pointer to pipe inode in gss upcall message rpc: use count of pipe openers to wait for first open rpc: track number of users of the gss upcall pipe rpc: call release_pipe only on last close rpc: add an rpc_pipe_open method rpc: minor gss_alloc_msg cleanup rpc: factor out warning code from gss_pipe_destroy_msg rpc: remove unnecessary assignment NFS: remove unused status from encode routines NFS: increment number of operations in each encode routine NFS: fix comment placement in nfs4xdr.c NFS: fix tabs in nfs4xdr.c ...	2008-12-30 17:45:45 -08:00
Linus Torvalds	f54a6ec0fd	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (583 commits) V4L/DVB (10130): use USB API functions rather than constants V4L/DVB (10129): dvb: remove deprecated use of RW_LOCK_UNLOCKED in frontends V4L/DVB (10128): modify V4L documentation to be a valid XHTML V4L/DVB (10127): stv06xx: Avoid having y unitialized V4L/DVB (10125): em28xx: Don't do AC97 vendor detection for i2s audio devices V4L/DVB (10124): em28xx: expand output formats available V4L/DVB (10123): em28xx: fix reversed definitions of I2S audio modes V4L/DVB (10122): em28xx: don't load em28xx-alsa for em2870 based devices V4L/DVB (10121): em28xx: remove worthless Pinnacle PCTV HD Mini 80e device profile V4L/DVB (10120): em28xx: remove redundant Pinnacle Dazzle DVC 100 profile V4L/DVB (10119): em28xx: fix corrupted XCLK value V4L/DVB (10118): zoran: fix warning for a variable not used V4L/DVB (10116): af9013: Fix gcc false warnings V4L/DVB (10111a): usbvideo.h: remove an useless blank line V4L/DVB (10111): quickcam_messenger.c: fix a warning V4L/DVB (10110): v4l2-ioctl: Fix warnings when using .unlocked_ioctl = __video_ioctl2 V4L/DVB (10109): anysee: Fix usage of an unitialized function V4L/DVB (10104): uvcvideo: Add support for video output devices V4L/DVB (10102): uvcvideo: Ignore interrupt endpoint for built-in iSight webcams. V4L/DVB (10101): uvcvideo: Fix bulk URB processing when the header is erroneous ...	2008-12-30 17:41:32 -08:00
Linus Torvalds	ab70537c32	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: lguest: struct device - replace bus_id with dev_name() lguest: move the initial guest page table creation code to the host kvm-s390: implement config_changed for virtio on s390 virtio_console: support console resizing virtio: add PCI device release() function virtio_blk: fix type warning virtio: block: dynamic maximum segments virtio: set max_segment_size and max_sectors to infinite. virtio: avoid implicit use of Linux page size in balloon interface virtio: hand virtio ring alignment as argument to vring_new_virtqueue virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci. virtio: rename 'pagesize' arg to vring_init/vring_size virtio: Don't use PAGE_SIZE in virtio_pci.c virtio: struct device - replace bus_id with dev_name(), dev_set_name() virtio-pci queue allocation not page-aligned	2008-12-30 17:37:25 -08:00
Linus Torvalds	14a3c4ab0e	Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm * 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (407 commits) [ARM] pxafb: add support for overlay1 and overlay2 as framebuffer devices [ARM] pxafb: cleanup of the timing checking code [ARM] pxafb: cleanup of the color format manipulation code [ARM] pxafb: add palette format support for LCCR4_PAL_FOR_3 [ARM] pxafb: add support for FBIOPAN_DISPLAY by dma braching [ARM] pxafb: allow pxafb_set_par() to start from arbitrary yoffset [ARM] pxafb: allow video memory size to be configurable [ARM] pxa: add document on the MFP design and how to use it [ARM] sa1100_wdt: don't assume CLOCK_TICK_RATE to be a constant [ARM] rtc-sa1100: don't assume CLOCK_TICK_RATE to be a constant [ARM] pxa/tavorevb: update board support (smartpanel LCD + keypad) [ARM] pxa: Update eseries defconfig [ARM] 5352/1: add w90p910-plat config file [ARM] s3c: S3C options should depend on PLAT_S3C [ARM] mv78xx0: implement GPIO and GPIO interrupt support [ARM] Kirkwood: implement GPIO and GPIO interrupt support [ARM] Orion: share GPIO IRQ handling code [ARM] Orion: share GPIO handling code [ARM] s3c: define __io using the typesafe version [ARM] S3C64XX: Ensure CPU_V6 is selected ...	2008-12-30 17:36:49 -08:00
Linus Torvalds	74a6d0f064	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (33 commits) ide-cd: remove dead dsc_overlap setting ide: push local_irq_{save,restore}() to do_identify() ide: remove superfluous local_irq_{save,restore}() from ide_dump_status() ide: move legacy ISA/VLB ports handling to ide-legacy.c (v2) ide: move Power Management support to ide-pm.c ide: use ATA_DMA_* defines in ide-dma-sff.c ide: checkpatch.pl fixes for ide-lib.c ide: remove inline tags from ide-probe.c ide: remove redundant code from ide_end_drive_cmd() ide: struct device - replace bus_id with dev_name(), dev_set_name() ide: rework handling of serialized ports (v2) cy82c693: remove superfluous ide_cy82c693 chipset type trm290: add IDE_HFLAG_TRM290 host flag ide: add ->max_sectors field to struct ide_port_info rz1000: apply chipset quirks early (v2) ide: always set nIEN on idle devices ide: fix ->quirk_list checking in ide_do_request() gayle: set IDE_HFLAG_SERIALIZE explictly cmd64x: set IDE_HFLAG_SERIALIZE explictly for CMD646 ali14xx: doesn't use shared IRQs ...	2008-12-30 17:34:37 -08:00
Linus Torvalds	5b8f258758	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: sata_sil: add Large Block Transfer support [libata] ata_piix: cleanup dmi strings checking DMI: add dmi_match libata: blacklist NCQ on OCZ CORE 2 SSD (resend) [libata] Update kernel-doc comments to match source code libata: perform port detach in EH libata: when restoring SControl during detach do the PMP links first libata: beef up iterators	2008-12-30 17:32:25 -08:00
Linus Torvalds	526ea064f9	Merge branch 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: oprofile: select RING_BUFFER ring_buffer: adding EXPORT_SYMBOLs oprofile: fix lost sample counter oprofile: remove nr_available_slots() oprofile: port to the new ring_buffer ring_buffer: add remaining cpu functions to ring_buffer.h oprofile: moving cpu_buffer_reset() to cpu_buffer.h oprofile: adding cpu_buffer_entries() oprofile: adding cpu_buffer_write_commit() oprofile: adding cpu buffer r/w access functions ftrace: remove unused function arg in trace_iterator_increment() ring_buffer: update description for ring_buffer_alloc() oprofile: set values to default when creating oprofilefs oprofile: implement switch/case in buffer_sync.c x86/oprofile: cleanup IBS init/exit functions in op_model_amd.c x86/oprofile: reordering IBS code in op_model_amd.c oprofile: fix typo oprofile: whitspace changes only oprofile: update comment for oprofile_add_sample() oprofile: comment cleanup	2008-12-30 17:31:25 -08:00
Linus Torvalds	db5e53fbf0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: slub: avoid leaking caches or refcounts on sysfs error slab: Fix comment on #endif slab: remove GFP_THISNODE clearing from alloc_slabmgmt() slub: Add might_sleep_if() to slab_alloc() SLUB: failslab support slub: Fix incorrect use of loose slab: Update the kmem_cache_create documentation regarding the name parameter slub: make early_kmem_cache_node_alloc void slab: unsigned slabp->inuse cannot be less than 0 slub - fix get_object_page comment SLUB: Replace __builtin_return_address(0) with _RET_IP_. SLUB: cleanup - define macros instead of hardcoded numbers	2008-12-30 17:28:09 -08:00
Linus Torvalds	3f4b5c5d27	Merge branch 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (37 commits) drm/i915: fix modeset devname allocation + agp init return check. drm/i915: Remove redundant test in error path. drm: Add a debug node for vblank state. drm: Avoid use-before-null-test on dev in drm_cleanup(). drm/i915: Don't print to dmesg when taking signal during object_pin. drm: pin new and unpin old buffer when setting a mode. drm/i915: un-EXPORT and make 'intelfb_panic' static drm/i915: Delete unused, pointless i915_driver_firstopen. drm/i915: fix sparse warnings: returning void-valued expression drm/i915: fix sparse warnings: move 'extern' decls to header file drm/i915: fix sparse warnings: make symbols static drm/i915: fix sparse warnings: declare one-bit bitfield as unsigned drm/i915: Don't double-unpin buffers if we take a signal in evict_everything(). drm/i915: Fix fbcon setup to align display pitch to 64b. drm/i915: Add missing userland definitions for gem init/execbuffer. i915/drm: provide compat defines for userspace for certain struct members. drm: drop DRM_IOCTL_MODE_REPLACEFB, add+remove works just as well. drm: sanitise drm modesetting API + remove unused hotplug drm: fix allowing master ioctls on non-master fds. drm/radeon: use locked rmmap to remove sarea mapping. ...	2008-12-30 17:25:49 -08:00
Linus Torvalds	6de71484cf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (98 commits) sparc: move select of ARCH_SUPPORTS_MSI sparc: drop SUN_IO sparc: unify sections.h sparc: use .data.init_task section for init_thread_union sparc: fix array overrun check in of_device_64.c sparc: unify module.c sparc64: prepare module_64.c for unification sparc64: use bit neutral Elf symbols sparc: unify module.h sparc: introduce CONFIG_BITS sparc: fix hardirq.h removal fallout sparc64: do not export pus_fs_struct sparc: use sparc64 version of scatterlist.h sparc: Commonize memcmp assembler. sparc: Unify strlen assembler. sparc: Add asm/asm.h sparc: Kill memcmp_32.S code which has been ifdef'd out for centuries. sparc: replace for_each_cpu_mask_nr with for_each_cpu sparc: fix sparse warnings in irq_32.c sparc: add include guards to kernel.h ...	2008-12-30 17:23:31 -08:00
Linus Torvalds	1dff81f20c	Merge branch 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: (43 commits) bio: get rid of bio_vec clearing bounce: don't rely on a zeroed bio_vec list cciss: simplify parameters to deregister_disk function cfq-iosched: fix race between exiting queue and exiting task loop: Do not call loop_unplug for not configured loop device. loop: Flush possible running bios when loop device is released. alpha: remove dead BIO_VMERGE_BOUNDARY Get rid of CONFIG_LSF block: make blk_softirq_init() static block: use min_not_zero in blk_queue_stack_limits block: add one-hit cache for disk partition lookup cfq-iosched: remove limit of dispatch depth of max 4 times quantum nbd: tell the block layer that it is not a rotational device block: get rid of elevator_t typedef aio: make the lookup_ioctx() lockless bio: add support for inlining a number of bio_vecs inside the bio bio: allow individual slabs in the bio_set bio: move the slab pointer inside the bio_set bio: only mempool back the largest bio_vec slab cache block: don't use plugging on SSD devices ...	2008-12-30 17:20:05 -08:00
Linus Torvalds	179475a3b4	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, sparseirq: clean up Kconfig entry x86: turn CONFIG_SPARSE_IRQ off by default sparseirq: fix numa_migrate_irq_desc dependency and comments sparseirq: add kernel-doc notation for new member in irq_desc, -v2 locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP sparseirq, xen: make sure irq_desc is allocated for interrupts sparseirq: fix !SMP building, #2 x86, sparseirq: move irq_desc according to smp_affinity, v7 proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ sparse irqs: add irqnr.h to the user headers list sparse irqs: handle !GENIRQ platforms sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build sparseirq: fix Alpha build failure sparseirq: fix typo in !CONFIG_IO_APIC case x86, MSI: pass irq_cfg and irq_desc x86: MSI start irq numbering from nr_irqs_gsi x86: use NR_IRQS_LEGACY sparse irq_desc[] array: core kernel and x86 changes genirq: record IRQ_LEVEL in irq_desc[] irq.h: remove padding from irq_desc on 64bits	2008-12-30 16:20:19 -08:00
Linus Torvalds	bb758e9637	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimers: fix warning in kernel/hrtimer.c x86: make sure we really have an hpet mapping before using it x86: enable HPET on Fujitsu u9200 linux/timex.h: cleanup for userspace posix-timers: simplify de_thread()->exit_itimers() path posix-timers: check ->it_signal instead of ->it_pid to validate the timer posix-timers: use "struct pid" instead of "struct task_struct" nohz: suppress needless timer reprogramming clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI nohz: no softirq pending warnings for offline cpus hrtimer: removing all ur callback modes, fix hrtimer: removing all ur callback modes, fix hotplug hrtimer: removing all ur callback modes x86: correct link to HPET timer specification rtc-cmos: export second NVRAM bank Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c manually.	2008-12-30 16:16:21 -08:00
Linus Torvalds	5f34fe1cfc	Merge branch 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits) stacktrace: provide save_stack_trace_tsk() weak alias rcu: provide RCU options on non-preempt architectures too printk: fix discarding message when recursion_bug futex: clean up futex_(un)lock_pi fault handling "Tree RCU": scalable classic RCU implementation futex: rename field in futex_q to clarify single waiter semantics x86/swiotlb: add default swiotlb_arch_range_needs_mapping x86/swiotlb: add default phys<->bus conversion x86: unify pci iommu setup and allow swiotlb to compile for 32 bit x86: add swiotlb allocation functions swiotlb: consolidate swiotlb info message printing swiotlb: support bouncing of HighMem pages swiotlb: factor out copy to/from device swiotlb: add arch hook to force mapping swiotlb: allow architectures to override phys<->bus<->phys conversions swiotlb: add comment where we handle the overflow of a dma mask on 32 bit rcu: fix rcutorture behavior during reboot resources: skip sanity check of busy resources swiotlb: move some definitions to header swiotlb: allow architectures to override swiotlb pool allocation ... Fix up trivial conflicts in arch/x86/kernel/Makefile arch/x86/mm/init_32.c include/linux/hardirq.h as per Ingo's suggestions.	2008-12-30 16:10:19 -08:00
Trond Myklebust	08cc36cbd1	Merge branch 'devel' into next	2008-12-30 16:51:43 -05:00
Magnus Damm	91962fa713	V4L/DVB (10078): video: add NV16 and NV61 pixel formats This patch adds support for NV16 and NV61 pixel formats. These pixel formats use two planes; one for 8-bit Y values and one for interleaved 8-bit U and V values. NV16/NV61 formats are very similar to NV12/NV21 with the exception that NV16/NV61 are using the same number of lines for both planes. The difference between NV16 and NV61 is the U and V byte order. The fourcc values are extrapolated from the NV12/NV21 case. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-30 09:40:20 -02:00
Mauro Carvalho Chehab	531c98e718	V4L/DVB (9953): em28xx: Add suport for debugging AC97 anciliary chips The em28xx driver can be coupled to an anciliary AC97 chip. This patch allows read/write AC97 registers directly. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-30 09:39:27 -02:00
Hans Verkuil	4b00eb2534	V4L/DVB (9944): videodev2.h: fix typo. The comment said CX2584X instead of CX2341X. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-30 09:39:27 -02:00
Hans Verkuil	92f45badbb	V4L/DVB (9932): v4l2-compat32: fix 32-64 compatibility module Added all missing v4l1/2 ioctls and fix several broken conversions. Partially based on work done by Cody Pisto <cpisto@gmail.com>. Tested-by: Brandon Jenkins <bcjenkins@tvwhere.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-30 09:39:22 -02:00
Laurent Pinchart	046425f8c4	V4L/DVB (9898): v4l2: Add privacy control The privacy control prevents video from being acquired by the camera. A true value indicates that no image can be captured. Devices that implement the privacy control must support read access and may support write access. Signed-off-by: Laurent Pinchart <laurent.pinchart@skynet.be> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-30 09:39:10 -02:00
Laurent Pinchart	0877258d98	V4L/DVB (9897): v4l2: Add camera zoom controls The zoom controls move the zoom lens group to a an absolute position, as a relative displacement or at a given speed until reaching physical device limits. Positive values move the zoom lens group towards the telephoto direction, negative values towards the wide-angle direction. Signed-off-by: Laurent Pinchart <laurent.pinchart@skynet.be> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-30 09:39:09 -02:00
Frederic Weisbecker	36994e58a4	tracing/kmemtrace: normalize the raw tracer event to the unified tracing API Impact: new tracer plugin This patch adapts kmemtrace raw events tracing to the unified tracing API. To enable and use this tracer, just do the following: echo kmemtrace > /debugfs/tracing/current_tracer cat /debugfs/tracing/trace You will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE \| \| \| \| \| \| \| \| # \| type_id 1 call_site 18446744071565527833 ptr 18446612134395152256 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 That was to stay backward compatible with the format output produced in inux/tracepoint.h. This is the default ouput, but note that I tried something else. If you change an option: echo kmem_minimalistic > /debugfs/trace_options and then cat /debugfs/trace, you will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE \| \| \| \| \| \| \| \| # \| - C 0xffff88007c088780 file_free_rcu + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc780 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc870 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc960 -1 d_alloc + K 1304 1312 000000d0 0xffff8800791d7340 -1 reiserfs_alloc_inode - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 992 1000 000000d0 0xffff880079045b58 -1 alloc_inode + K 768 1024 000080d0 0xffff88007c096400 -1 alloc_pipe_info + K 240 240 000000d0 0xffff8800790dca50 -1 d_alloc + K 272 320 000080d0 0xffff88007c088780 -1 get_empty_filp + K 272 320 000080d0 0xffff88007c088000 -1 get_empty_filp Yeah I shall confess kmem_minimalistic should be: kmem_alternative. Whatever, I find it more readable but this a personal opinion of course. We can drop it if you want. On the ALLOC/FREE column, + means an allocation and - a free. On the type column, you have K = kmalloc, C = cache, P = page I would like the flags to be GFP_* strings but that would not be easy to not break the column with strings.... About the node...it seems to always be -1. I don't know why but that shouldn't be difficult to find. I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would be more easy to find the tracer headers if they are all in their common directory. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-30 09:36:13 +01:00
Jaswinder Singh Rajput	47fea2adfc	sched: sched.c declare variables before they get used Impact: cleanup, avoid sparse warnings In linux/sched.h moved out sysctl_sched_latency, sysctl_sched_min_granularity, sysctl_sched_wakeup_granularity, sysctl_sched_shares_ratelimit and sysctl_sched_shares_thresh from #ifdef CONFIG_SCHED_DEBUG as these variables are common for both. Fixes these sparse warnings: kernel/sched.c:825:14: warning: symbol 'sysctl_sched_shares_ratelimit' was not declared. Should it be static? kernel/sched.c:832:14: warning: symbol 'sysctl_sched_shares_thresh' was not declared. Should it be static? kernel/sched_fair.c:37:14: warning: symbol 'sysctl_sched_latency' was not declared. Should it be static? kernel/sched_fair.c:43:14: warning: symbol 'sysctl_sched_min_granularity' was not declared. Should it be static? kernel/sched_fair.c:72:14: warning: symbol 'sysctl_sched_wakeup_granularity' was not declared. Should it be static? Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-30 07:42:34 +01:00
Matias Zabaljauregui	58a2456644	lguest: move the initial guest page table creation code to the host This patch moves the initial guest page table creation code to the host, so the launcher keeps working with PAE enabled configs. Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:11 +10:30
Christian Borntraeger	c29834584e	virtio_console: support console resizing this patch uses the new hvc callback hvc_resize to set the window size which allows to change the tty size of hvc_console via a hvc_resize function. I have added a new feature bit VIRTIO_CONSOLE_F_SIZE. The driver will change the window size on tty open and via the config_changed callback of the transport. Currently lguest and kvm_s390 have not implemented this callback, but the callback can be implemented at a later point in time. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:10 +10:30
Hollis Blanchard	1b4aa2faec	virtio: avoid implicit use of Linux page size in balloon interface Make the balloon interface always use 4K pages, and convert Linux pfns if necessary. This patch assumes that Linux's PAGE_SHIFT will never be less than 12. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (modified)	2008-12-30 09:26:04 +10:30
Rusty Russell	87c7d57c17	virtio: hand virtio ring alignment as argument to vring_new_virtqueue This allows each virtio user to hand in the alignment appropriate to their virtio_ring structures. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2008-12-30 09:26:03 +10:30
Rusty Russell	2966af73e7	virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize This doesn't really matter, since lguest is i386 only at the moment, but we could actually choose a different value. (lguest doesn't have a guarenteed ABI). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:02 +10:30
Rusty Russell	498af14783	virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci. That doesn't work for non-4k guests which are now appearing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:25:58 +10:30
Rusty Russell	5f0d1d7f22	virtio: rename 'pagesize' arg to vring_init/vring_size It's really the alignment desired for consumer/producer separation; historically this x86 pagesize, but with PowerPC it'll still be x86 pagesize. And in theory lguest could choose a different value. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:25:57 +10:30
Rusty Russell	480daab42c	virtio: Don't use PAGE_SIZE in virtio_pci.c The virtio PCI devices don't depend on the guest page size. This matters now PowerPC virtio is gaining ground (they like 64k pages). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:25:57 +10:30
Rusty Russell	e12f0102ac	cpumask: Use nr_cpu_ids in seq_cpumask Impact: cleanup, futureproof nr_cpu_ids is the (badly named) runtime limit on possible CPU numbers; ie. the variable version of NR_CPUS. With the new cpumask operators, only bits less than this are defined. So we should use it everywhere, rather than NR_CPUS. Eventually this will make it possible to allocate cpumasks of the minimal length at runtime. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu>	2008-12-30 09:05:19 +10:30
Rusty Russell	54b11e6d57	cpumask: smp_call_function_many() Impact: Implementation change to remove cpumask_t from stack. Actually change smp_call_function_mask() to smp_call_function_many(). We avoid cpumasks on the stack in this version. (S390 has its own version, but that's going away apparently). We have to do some dancing to figure out if 0 or 1 other cpus are in the mask supplied and the online mask without allocating a tmp cpumask. It's still fairly cheap. We allocate the cpumask at the end of the call_function_data structure: if allocation fails we fallback to smp_call_function_single rather than using the baroque quiescing code (which needs a cpumask on stack). (Thanks to Hiroshi Shimamoto for spotting several bugs in previous versions!) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: npiggin@suse.de Cc: axboe@kernel.dk	2008-12-30 09:05:16 +10:30
Rusty Russell	3fa4152069	cpumask: make set_cpu_/init_cpu_ out-of-line They're only for use in boot/cpu hotplug code anyway, and this avoids the use of deprecated cpu_*_map. Stephen Rothwell points out that gcc 4.2.4 (on powerpc at least) didn't like the cast away of const anyway: include/linux/cpumask.h: In function 'set_cpu_possible': include/linux/cpumask.h:1052: warning: passing argument 2 of 'cpumask_set_cpu' discards qualifiers from pointer target type So this kills two birds with one stone. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:16 +10:30
Rusty Russell	ae7a47e72e	cpumask: make cpumask.h eat its own dogfood. Changes: 1) cpumask_t to struct cpumask, 2) cpus_weight_nr to cpumask_weight, 3) cpu_isset to cpumask_test_cpu, 4) ->bits to cpumask_bits() 5) cpu__map to cpu__mask. 6) for_each_cpu_mask_nr to for_each_cpu Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:15 +10:30
Rusty Russell	b3199c025d	cpumask: switch over to cpu_online/possible/active/present_mask: core Impact: cleanup This implements the obsolescent cpu_online_map in terms of cpu_online_mask, rather than the other way around. Same for the other maps. The documentation comments are also updated to refer to _mask rather than _map. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>	2008-12-30 09:05:14 +10:30
Rusty Russell	cb78a0ce69	bitmap: fix seq_bitmap and seq_cpumask to take const pointer Impact: cleanup seq_bitmap just calls bitmap_scnprintf on the bits: that arg can be const. Similarly, seq_cpumask just calls seq_bitmap. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:14 +10:30
Rusty Russell	4b0bc0bca8	bitmap: test for constant as well as small size for inline versions Impact: reduce text size bitmap_zero et al have a fastpath for nbits <= BITS_PER_LONG, but this should really only apply where the nbits is known at compile time. This only saves about 1200 bytes on an allyesconfig kernel, but with cpumasks going variable that number will increase. text data bss dec hex filename 35327852 5035607 6782976 47146435 2cf65c3 vmlinux-before 35326640 5035607 6782976 47145223 2cf6107 vmlinux-after Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:13 +10:30
Rusty Russell	278d1ed65e	cpumask: make CONFIG_NR_CPUS always valid. Impact: cleanup Currently we have NR_CPUS, which is 1 on UP, and CONFIG_NR_CPUS on SMP. If we make CONFIG_NR_CPUS always valid (and always 1 on !SMP), we can skip the middleman. This also allows us to find and check all the unaudited NR_CPUS usage as we prepare for v. large NR_CPUS. To avoid breaking every arch, we cheat and do this for the moment in the header if the arch doesn't. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>	2008-12-30 09:05:12 +10:30
Rusty Russell	33edcf133b	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-12-30 08:02:35 +10:30
Robert Jarzmik	69acdf1e5a	V4L/DVB (9530): Add new pixel format VYUY 16 bits wide. There were already 3 YUV formats defined : - YUYV - YVYU - UYVY The only left combination is VYUY, which is added in this patch. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-12-29 17:53:27 -02:00
Bartlomiej Zolnierkiewicz	e2984c628c	ide: move Power Management support to ide-pm.c There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:37 +01:00
Bartlomiej Zolnierkiewicz	702c026be8	ide: rework handling of serialized ports (v2) * hpt366: set IDE_HFLAG_SERIALIZE in ->host_flags if needed in init_hwif_hpt366(). Remove HPT_SERIALIZE_IO while at it. * Set IDE_HFLAG_SERIALIZE in ->host_flags if needed in ide_init_port(). * Convert init_irq() to use IDE_HFLAG_SERIALIZE together with hwif->host to find out ports which need to be serialized. * Remove no longer needed save_match() and ide_hwif_t.serialized. v2: * Set host's ->host_flags field instead of port's copy. This patch should fix the incorrect grouping of port(s) from host(s) that need serialization with port(s) that happen to use the same IRQ(s) but are from the host(s) that don't need it. Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:36 +01:00
Bartlomiej Zolnierkiewicz	b7876a6fb6	cy82c693: remove superfluous ide_cy82c693 chipset type Since CY82C693 doesn't require serialization we may as well use the default ide_pci chipset type. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:34 +01:00
Bartlomiej Zolnierkiewicz	1f66019bdf	trm290: add IDE_HFLAG_TRM290 host flag * Add IDE_HFLAG_TRM290 host flag and use it in ide_build_dmatable(). * Remove no longer needed ide_trm290 chipset type. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:34 +01:00
Bartlomiej Zolnierkiewicz	6b4924962c	ide: add ->max_sectors field to struct ide_port_info * Add ->max_sectors field to struct ide_port_info to allow host drivers to specify value used for hwif->rqsize (if smaller than the default). * Convert pdc202xx_old to use ->max_sectors and remove no longer needed IDE_HFLAG_RQSIZE_256 flag. There should be no functional changes caused by this patch. Acked-by: Sergei Shtyltov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:34 +01:00
Bartlomiej Zolnierkiewicz	7f1ac8c4b9	rz1000: apply chipset quirks early (v2) * Use pci_name(dev) instead of hwif->name in init_hwif_rz1000(). * init_hwif_rz1000() -> rz1000_init_chipset(). Update rz1000_init_one() to use rz1000_init_chipset() and add now required rz1000_remove(). * Remove superfluous ide_rz1000 chipset type. v2: * unsigned int rz1000_init_chipset() -> int rz1000_disable_readahead() per Sergei's suggestion. Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:33 +01:00
Bartlomiej Zolnierkiewicz	f58c1ab8de	ide: always set nIEN on idle devices * Set nIEN for previous port/device in ide_do_request() also if port uses a non-shared IRQ. * Remove no longer needed ide_hwif_t.sharing_irq. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:33 +01:00
Bartlomiej Zolnierkiewicz	6b5cde3629	cmd64x: set IDE_HFLAG_SERIALIZE explictly for CMD646 * Set IDE_HFLAG_SERIALIZE explictly for CMD646. * Remove no longer needed ide_cmd646 chipset type (which has a nice side-effect of fixing handling of unexpected IRQs). Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:32 +01:00
Bartlomiej Zolnierkiewicz	27c01c2db0	ide-cd: remove obsolete seek optimization It doesn't make much sense nowadays and is problematic on some drives. Cc: Borislav Petkov <petkovbb@googlemail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:32 +01:00
Bartlomiej Zolnierkiewicz	2a2ca6a961	ide: replace the global ide_lock spinlock by per-hwgroup spinlocks (v2) Now that (almost) all host drivers have been fixed not to abuse ide_lock and core code usage of ide_lock has been sanitized we may safely replace ide_lock by per-hwgroup locks. This patch is partially based on earlier patch from Ravikiran G Thirumalai. While at it: - don't use deprecated HWIF() and HWGROUP() macros - update locking documentation in ide.h v2: Add missing spin_lock_init(&hwgroup->lock). (Noticed by Elias Oltmanns) Cc: Vaibhav V. Nivargi <vaibhav.nivargi@gmail.com> Cc: Alok N. Kataria <alokk@calsoftinc.com> Cc: Shai Fultheim <shai@scalex86.org> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Cc: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-29 20:27:31 +01:00
Gregory Haskins	917b627d4d	sched: create "pushable_tasks" list to limit pushing to one attempt The RT scheduler employs a "push/pull" design to actively balance tasks within the system (on a per disjoint cpuset basis). When a task is awoken, it is immediately determined if there are any lower priority cpus which should be preempted. This is opposed to the way normal SCHED_OTHER tasks behave, which will wait for a periodic rebalancing operation to occur before spreading out load. When a particular RQ has more than 1 active RT task, it is said to be in an "overloaded" state. Once this occurs, the system enters the active balancing mode, where it will try to push the task away, or persuade a different cpu to pull it over. The system will stay in this state until the system falls back below the <= 1 queued RT task per RQ. However, the current implementation suffers from a limitation in the push logic. Once overloaded, all tasks (other than current) on the RQ are analyzed on every push operation, even if it was previously unpushable (due to affinity, etc). Whats more, the operation stops at the first task that is unpushable and will not look at items lower in the queue. This causes two problems: 1) We can have the same tasks analyzed over and over again during each push, which extends out the fast path in the scheduler for no gain. Consider a RQ that has dozens of tasks that are bound to a core. Each one of those tasks will be encountered and skipped for each push operation while they are queued. 2) There may be lower-priority tasks under the unpushable task that could have been successfully pushed, but will never be considered until either the unpushable task is cleared, or a pull operation succeeds. The net result is a potential latency source for mid priority tasks. This patch aims to rectify these two conditions by introducing a new priority sorted list: "pushable_tasks". A task is added to the list each time a task is activated or preempted. It is removed from the list any time it is deactivated, made current, or fails to push. This works because a task only needs to be attempted to push once. After an initial failure to push, the other cpus will eventually try to pull the task when the conditions are proper. This also solves the problem that we don't completely analyze all tasks due to encountering an unpushable tasks. Now every task will have a push attempted (when appropriate). This reduces latency both by shorting the critical section of the rq->lock for certain workloads, and by making sure the algorithm considers all eligible tasks in the system. [ rostedt: added a couple more BUG_ONs ] Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Steven Rostedt <srostedt@redhat.com>	2008-12-29 09:39:53 -05:00
Gregory Haskins	4075134e40	plist: fix PLIST_NODE_INIT to work with debug enabled It seems that PLIST_NODE_INIT breaks if used and DEBUG_PI_LIST is defined. Since there are no current users of PLIST_NODE_INIT, this has gone undetected. This patch fixes the build issue that enables the DEBUG_PI_LIST later in the series when we use it in init_task.h Signed-off-by: Gregory Haskins <ghaskins@novell.com>	2008-12-29 09:39:53 -05:00
Gregory Haskins	967fc04671	sched: add sched_class->needs_post_schedule() member We currently run class->post_schedule() outside of the rq->lock, which means that we need to test for the need to post_schedule outside of the lock to avoid a forced reacquistion. This is currently not a problem as we only look at rq->rt.overloaded. However, we want to enhance this going forward to look at more state to reduce the need to post_schedule to a bare minimum set. Therefore, we introduce a new member-func called needs_post_schedule() which tests for the post_schedule condtion without actually performing the work. Therefore it is safe to call this function before the rq->lock is released, because we are guaranteed not to drop the lock at an intermediate point (such as what post_schedule() may do). We will use this later in the series [ rostedt: removed paranoid BUG_ON ] Signed-off-by: Gregory Haskins <ghaskins@novell.com>	2008-12-29 09:39:52 -05:00
Ingo Molnar	2ff9f9d962	Merge branch 'topic/kmemtrace' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 into tracing/kmemtrace	2008-12-29 15:16:24 +01:00
Eduard - Gabriel Munteanu	73cd6af041	kmemtrace: Better alternative to "kmemtrace: fix printk format warnings". Fix the problem "kmemtrace: fix printk format warnings" attempted to fix, but resulted in marker-probe format mismatch warnings. Instead of carrying size_t into probes, we get rid of it by casting to unsigned long, just as we did with gfp_t. This way, we don't need to change marker format strings and we don't have to rely on other format specifiers like "%zu", making for consistent use of more generic data types (since there are no format specifiers for gfp_t, for example). Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 15:34:11 +02:00
Eduard - Gabriel Munteanu	5b882be4e0	kmemtrace: SLUB hooks. This adds hooks for the SLUB allocator, to allow tracing with kmemtrace. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 15:34:07 +02:00
Eduard - Gabriel Munteanu	3eae2cb24a	kmemtrace: SLOB hooks. This adds hooks for the SLOB allocator, to allow tracing with kmemtrace. We also convert some inline functions to __always_inline to make sure _RET_IP_, which expands to __builtin_return_address(0), always works as expected. Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 15:34:05 +02:00
Eduard - Gabriel Munteanu	36555751c6	kmemtrace: SLAB hooks. This adds hooks for the SLAB allocator, to allow tracing with kmemtrace. We also convert some inline functions to __always_inline to make sure _RET_IP_, which expands to __builtin_return_address(0), always works as expected. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 15:34:04 +02:00
Eduard - Gabriel Munteanu	b9ce08c010	kmemtrace: Core implementation. kmemtrace provides tracing for slab allocator functions, such as kmalloc, kfree, kmem_cache_alloc, kmem_cache_free etc.. Collected data is then fed to the userspace application in order to analyse allocation hotspots, internal fragmentation and so on, making it possible to see how well an allocator performs, as well as debug and profile kernel code. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 15:34:01 +02:00
Eduard - Gabriel Munteanu	35995a4d81	SLUB: Replace __builtin_return_address(0) with _RET_IP_. This patch replaces __builtin_return_address(0) with _RET_IP_, since a previous patch moved _RET_IP_ and _THIS_IP_ to include/linux/kernel.h and they're widely available now. This makes for shorter and easier to read code. [penberg@cs.helsinki.fi: remove _RET_IP_ casts to void pointer] Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 15:33:59 +02:00
Mike Frysinger	34a4c5eb42	Input: map_to_7segment.h - convert to __inline__ for userspace Use __inline__ rather than inline for map_to_seg7() since it is exported to userspace. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-12-29 04:59:31 -08:00
Peter Zijlstra	ea319518ba	locking, percpu counters: introduce separate lock classes Impact: fix lockdep false positives Classify percpu_counter instances similar to regular lock objects -- that is, per instantiation site. The networking code has increased its use of percpu_counters, which leads to false positives if they are treated as a single class. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-29 13:43:00 +01:00
Jiri Slaby	d61c72e52b	DMI: add dmi_match Add a wrapper for testing system_info which will handle also NULL system infos. This will be used by the ata PIIX driver. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk> Cc: Tejun Heo <tj@kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-12-29 07:39:34 -05:00
Yinghai Lu	43a256322a	sparseirq: move __weak symbols into separate compilation unit GCC has a bug with __weak alias functions: if the functions are in the same compilation unit as their call site, GCC can decide to inline them - and thus rob the linker of the opportunity to override the weak alias with the real thing. So move all the IRQ handling related __weak symbols to kernel/irq/chip.c. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-29 12:15:49 +01:00
Pekka Enberg	3c506efd7e	Merge branch 'topic/failslab' into for-linus Conflicts: mm/slub.c Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 11:47:05 +02:00
Pekka Enberg	fd37617e69	Merge branches 'topic/fixes', 'topic/cleanups' and 'topic/documentation' into for-linus	2008-12-29 11:45:47 +02:00
Pascal Terjan	dfcd361028	slab: Fix comment on #endif This #endif in slab.h is described as closing the inner block while it's for the big CONFIG_NUMA one. That makes reading the code a bit harder. This trivial patch fixes the comment. Signed-off-by: Pascal Terjan <pterjan@mandriva.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 11:40:56 +02:00
Akinobu Mita	773ff60e84	SLUB: failslab support Currently fault-injection capability for SLAB allocator is only available to SLAB. This patch makes it available to SLUB, too. [penberg@cs.helsinki.fi: unify slab and slub implementations] Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-12-29 11:27:46 +02:00
Dave Airlie	f453ba0460	DRM: add mode setting support Add mode setting support to the DRM layer. This is a fairly big chunk of work that allows DRM drivers to provide full output control and configuration capabilities to userspace. It was motivated by several factors: - the fb layer's APIs aren't suited for anything but simple configurations - coordination between the fb layer, DRM layer, and various userspace drivers is poor to non-existent (radeonfb excepted) - user level mode setting drivers makes displaying panic & oops messages more difficult - suspend/resume of graphics state is possible in many more configurations with kernel level support This commit just adds the core DRM part of the mode setting APIs. Driver specific commits using these new structure and APIs will follow. Co-authors: Jesse Barnes <jbarnes@virtuousgeek.org>, Jakob Bornecrantz <jakob@tungstengraphics.com> Contributors: Alan Hourihane <alanh@tungstengraphics.com>, Maarten Maathuis <madman2003@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-29 17:47:23 +10:00
Jens Axboe	b3a6ffe16b	Get rid of CONFIG_LSF We have two seperate config entries for large devices/files. One is CONFIG_LBD that guards just the devices, the other is CONFIG_LSF that handles large files. This doesn't make a lot of sense, you typically want both or none. So get rid of CONFIG_LSF and change CONFIG_LBD wording to indicate that it covers both. Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:51 +01:00
Jens Axboe	a6f23657d3	block: add one-hit cache for disk partition lookup disk_map_sector_rcu() returns a partition from a sector offset, which we use for IO statistics on a per-partition basis. The lookup itself is an O(N) list lookup, where N is the number of partitions. This actually hurts performance quite a bit, even on the lower end partitions. On higher numbered partitions, it can get pretty bad. Solve this by adding a one-hit cache for partition lookup. This makes the lookup O(1) for the case where we do most IO to one partition. Even for mixed partition workloads, amortized cost is pretty close to O(1) since the natural IO batching makes the one-hit cache last for lots of IOs. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:51 +01:00
Jens Axboe	b374d18a4b	block: get rid of elevator_t typedef Just use struct elevator_queue everywhere instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	abf137dd77	aio: make the lookup_ioctx() lockless The mm->ioctx_list is currently protected by a reader-writer lock, so we always grab that lock on the read side for doing ioctx lookups. As the workload is extremely reader biased, turn this into an rcu hlist so we can make lookup_ioctx() lockless. Get rid of the rwlock and use a spinlock for providing update side exclusion. There's usually only 1 entry on this list, so it doesn't make sense to look into fancier data structures. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	392ddc3298	bio: add support for inlining a number of bio_vecs inside the bio When we go and allocate a bio for IO, we actually do two allocations. One for the bio itself, and one for the bi_io_vec that holds the actual pages we are interested in. This feature inlines a definable amount of io vecs inside the bio itself, so we eliminate the bio_vec array allocation for IO's up to a certain size. It defaults to 4 vecs, which is typically 16k of IO. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	bb799ca020	bio: allow individual slabs in the bio_set Instead of having a global bio slab cache, add a reference to one in each bio_set that is created. This allows for personalized slabs in each bio_set, so that they can have bios of different sizes. This means we can personalize the bios we return. File systems may want to embed the bio inside another structure, to avoid allocation more items (and stuffing them in ->bi_private) after the get a bio. Or we may want to embed a number of bio_vecs directly at the end of a bio, to avoid doing two allocations to return a bio. This is now possible. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:23 +01:00
Jens Axboe	1b43449869	bio: move the slab pointer inside the bio_set In preparation for adding differently sized bios. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:46 +01:00
Jens Axboe	7ff9345ffa	bio: only mempool back the largest bio_vec slab cache We only very rarely need the mempool backing, so it makes sense to get rid of all but one of the mempool in a bio_set. So keep the largest bio_vec count mempool so we can always honor the largest allocation, and "upgrade" callers that fail. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:46 +01:00
Tejun Heo	58eea927d2	block: simplify empty barrier implementation Empty barrier required special handling in __elv_next_request() to complete it without letting the low level driver see it. With previous changes, barrier code is now flexible enough to skip the BAR step using the same barrier sequence selection mechanism. Drop the special handling and mask off q->ordered from start_ordered(). Remove blk_empty_barrier() test which now has no user. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:45 +01:00
Tejun Heo	8f11b3e99a	block: make barrier completion more robust Barrier completion had the following assumptions. * start_ordered() couldn't finish the whole sequence properly. If all actions are to be skipped, q->ordseq is set correctly but the actual completion was never triggered thus hanging the barrier request. * Drain completion in elv_complete_request() assumed that there's always at least one request in the queue when drain completes. Both assumptions are true but these assumptions need to be removed to improve empty barrier implementation. This patch makes the following changes. * Make start_ordered() use blk_ordered_complete_seq() to mark skipped steps complete and notify __elv_next_request() that it should fetch the next request if the whole barrier has completed inside start_ordered(). * Make drain completion path in elv_complete_request() check whether the queue is empty. Empty queue also indicates drain completion. * While at it, convert 0/1 return from blk_do_ordered() to false/true. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:45 +01:00
Tejun Heo	f671620e7d	block: make every barrier action optional In all barrier sequences, the barrier write itself was always assumed to be issued and thus didn't have corresponding control flag. This patch adds QUEUE_ORDERED_DO_BAR and unify action mask handling in start_ordered() such that any barrier action can be skipped. This patch doesn't introduce any visible behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:45 +01:00
Tejun Heo	313e42999d	block: reorganize QUEUE_ORDERED_* constants Separate out ordering type (drain,) and action masks (preflush, postflush, fua) from visible ordering mode selectors (QUEUE_ORDERED_). Ordering types are now named QUEUE_ORDERED_BY_ while action masks are named QUEUE_ORDERED_DO_*. This change is necessary to add QUEUE_ORDERED_DO_BAR and make it optional to improve empty barrier implementation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Richard Kennedy	ba744d5e29	block: reorder struct bio to remove padding on 64bit Remove 8 bytes of padding from struct bio which also removes 16 bytes from struct bio_pair to make it 248 bytes. bio_pair then fits into one fewer cache lines & into a smaller slab. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Cheng Renquan	64d01dc9e1	block: use cancel_work_sync() instead of kblockd_flush_work() After many improvements on kblockd_flush_work, it is now identical to cancel_work_sync, so a direct call to cancel_work_sync is suggested. The only difference is that cancel_work_sync is a GPL symbol, so no non-GPL modules anymore. Signed-off-by: Cheng Renquan <crquan@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Keith Mannthey	08bafc0341	block: Supress Buffer I/O errors when SCSI REQ_QUIET flag set Allow the scsi request REQ_QUIET flag to be propagated to the buffer file system layer. The basic ideas is to pass the flag from the scsi request to the bio (block IO) and then to the buffer layer. The buffer layer can then suppress needless printks. This patch declutters the kernel log by removed the 40-50 (per lun) buffer io error messages seen during a boot in my multipath setup . It is a good chance any real errors will be missed in the "noise" it the logs without this patch. During boot I see blocks of messages like " __ratelimit: 211 callbacks suppressed Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242847 Buffer I/O error on device sdm, logical block 1 Buffer I/O error on device sdm, logical block 5242878 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242872 " in my logs. My disk environment is multipath fiber channel using the SCSI_DH_RDAC code and multipathd. This topology includes an "active" and "ghost" path for each lun. IO's to the "ghost" path will never complete and the SCSI layer, via the scsi device handler rdac code, quick returns the IOs to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi layer messages. I am wanting to extend the QUIET behavior to include the buffer file system layer to deal with these errors as well. I have been running this patch for a while now on several boxes without issue. A few runs of bonnie++ show no noticeable difference in performance in my setup. Thanks for John Stultz for the quiet_error finalization. Submitted-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Fernando Luis Vázquez Cao	88e740f165	block: add queue flag for paravirt frontend drivers As is the case with SSD devices, we do not want to idle in AS/CFQ when the block device is a paravirt front-end driver. This patch adds a flag (QUEUE_FLAG_VIRT) which should be used by front-end drivers such as virtio_blk and xen-blkfront to indicate a paravirtualized device. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:41 +01:00
Lachlan McIlroy	0a8c5395f9	[XFS] Fix merge failures Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: fs/xfs/linux-2.6/xfs_cred.h fs/xfs/linux-2.6/xfs_globals.h fs/xfs/linux-2.6/xfs_ioctl.c fs/xfs/xfs_vnodeops.h Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-29 16:47:18 +11:00
David S. Miller	e3c6d4ee54	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: arch/sparc64/kernel/idprom.c	2008-12-28 20:19:47 -08:00
Tejun Heo	ece180d1cf	libata: perform port detach in EH ata_port_detach() first made sure EH saw ATA_PFLAG_UNLOADING and then assumed EH context belongs to it and performed detach operation itself. However, UNLOADING doesn't disable all of EH and this could lead to problems including triggering WARN_ON()'s in EH path. This patch makes port detach behave more like other EH actions such that ata_port_detach() requests EH to detach and waits for completion. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-12-28 22:43:21 -05:00
Tejun Heo	1eca4365be	libata: beef up iterators There currently are the following looping constructs. * __ata_port_for_each_link() for all available links * ata_port_for_each_link() for edge links * ata_link_for_each_dev() for all devices * ata_link_for_each_dev_reverse() for all devices in reverse order Now there's a need for looping construct which is similar to __ata_port_for_each_link() but iterates over PMP links before the host link. Instead of adding another one with long name, do the following cleanup. * Implement and export ata_link_next() and ata_dev_next() which take @mode parameter and can be used to build custom loop. * Implement ata_for_each_link() and ata_for_each_dev() which take looping mode explicitly. The following iteration modes are implemented. * ATA_LITER_EDGE : loop over edge links * ATA_LITER_HOST_FIRST : loop over all links, host link first * ATA_LITER_PMP_FIRST : loop over all links, PMP links first * ATA_DITER_ENABLED : loop over enabled devices * ATA_DITER_ENABLED_REVERSE : loop over enabled devices in reverse order * ATA_DITER_ALL : loop over all devices * ATA_DITER_ALL_REVERSE : loop over all devices in reverse order This change removes exlicit device enabledness checks from many loops and makes it clear which ones are iterated over in which direction. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-12-28 22:43:20 -05:00
Linus Torvalds	3c92ec8ae9	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits) powerpc/44x: Support 16K/64K base page sizes on 44x powerpc: Force memory size to be a multiple of PAGE_SIZE powerpc/32: Wire up the trampoline code for kdump powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M powerpc/32: Allow __ioremap on RAM addresses for kdump kernel powerpc/32: Setup OF properties for kdump powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs() powerpc: Prepare xmon_save_regs for use with kdump powerpc: Remove default kexec/crash_kernel ops assignments powerpc: Make default kexec/crash_kernel ops implicit powerpc: Setup OF properties for ppc32 kexec powerpc/pseries: Fix cpu hotplug powerpc: Fix KVM build on ppc440 powerpc/cell: add QPACE as a separate Cell platform powerpc/cell: fix build breakage with CONFIG_SPUFS disabled powerpc/mpc5200: fix error paths in PSC UART probe function powerpc/mpc5200: add rts/cts handling in PSC UART driver powerpc/mpc5200: Make PSC UART driver update serial errors counters powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver ... Fix trivial conflict in drivers/char/Makefile as per Paul's directions	2008-12-28 16:54:33 -08:00
Linus Torvalds	0191b625ca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).	2008-12-28 12:49:40 -08:00
Linus Torvalds	1d248b2593	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits) IB/mlx4: Set ownership bit correctly when copying CQEs during CQ resize RDMA/nes: Remove tx_free_list RDMA/cma: Add IPv6 support RDMA/addr: Add support for translating IPv6 addresses mlx4_core: Delete incorrect comment mlx4_core: Add support for multiple completion event vectors IB/iser: Avoid recv buffer exhaustion caused by unexpected PDUs IB/ehca: Remove redundant test of vpage IB/ehca: Replace modulus operations in flush error completion path IB/ipath: Add locking for interrupt use of ipath_pd contexts vs free IB/ipath: Fix spi_pioindex value IB/ipath: Only do 1X workaround on rev1 chips IB/ipath: Don't count IB symbol and link errors unless link is UP IB/ipath: Check return value of dma_map_single() IB/ipath: Fix PSN of send WQEs after an RDMA read resend RDMA/nes: Cleanup warnings RDMA/nes: Add loopback check to make_cm_node() RDMA/nes: Check cqp_avail_reqs is empty after locking the list RDMA/nes: Fix TCP compliance test failures RDMA/nes: Forward packets for a new connection with stale APBVT entry ...	2008-12-28 12:33:59 -08:00
Linus Torvalds	a39b863342	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) sched: fix warning in fs/proc/base.c schedstat: consolidate per-task cpu runtime stats sched: use RCU variant of list traversal in for_each_leaf_rt_rq() sched, cpuacct: export percpu cpuacct cgroup stats sched, cpuacct: refactoring cpuusage_read / cpuusage_write sched: optimize update_curr() sched: fix wakeup preemption clock sched: add missing arch_update_cpu_topology() call sched: let arch_update_cpu_topology indicate if topology changed sched: idle_balance() does not call load_balance_newidle() sched: fix sd_parent_degenerate on non-numa smp machine sched: add uid information to sched_debug for CONFIG_USER_SCHED sched: move double_unlock_balance() higher sched: update comment for move_task_off_dead_cpu sched: fix inconsistency when redistribute per-cpu tg->cfs_rq shares sched/rt: removed unneeded defintion sched: add hierarchical accounting to cpu accounting controller sched: include group statistics in /proc/sched_debug sched: rename SCHED_NO_NO_OMIT_FRAME_POINTER => SCHED_OMIT_FRAME_POINTER sched: clean up SCHED_CPUMASK_ALLOC ...	2008-12-28 12:27:58 -08:00
Linus Torvalds	b0f4b285d7	Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits) sched, trace: update trace_sched_wakeup() tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3 Revert "x86: disable X86_PTRACE_BTS" ring-buffer: prevent false positive warning ring-buffer: fix dangling commit race ftrace: enable format arguments checking x86, bts: memory accounting x86, bts: add fork and exit handling ftrace: introduce tracing_reset_online_cpus() helper tracing: fix warnings in kernel/trace/trace_sched_switch.c tracing: fix warning in kernel/trace/trace.c tracing/ring-buffer: remove unused ring_buffer size trace: fix task state printout ftrace: add not to regex on filtering functions trace: better use of stack_trace_enabled for boot up code trace: add a way to enable or disable the stack tracer x86: entry_64 - introduce FTRACE_ frame macro v2 tracing/ftrace: add the printk-msg-only option tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp() x86, bts: correctly report invalid bts records ... Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits being already partly merged by the SH merge.	2008-12-28 12:21:10 -08:00
Linus Torvalds	be9c5ae4ee	Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (246 commits) x86: traps.c replace #if CONFIG_X86_32 with #ifdef CONFIG_X86_32 x86: PAT: fix address types in track_pfn_vma_new() x86: prioritize the FPU traps for the error code x86: PAT: pfnmap documentation update changes x86: PAT: move track untrack pfnmap stubs to asm-generic x86: PAT: remove follow_pfnmap_pte in favor of follow_phys x86: PAT: modify follow_phys to return phys_addr prot and return value x86: PAT: clarify is_linear_pfn_mapping() interface x86: ia32_signal: remove unnecessary declaration x86: common.c boot_cpu_stack and boot_exception_stacks should be static x86: fix intel x86_64 llc_shared_map/cpu_llc_id anomolies x86: fix warning in arch/x86/kernel/microcode_amd.c x86: ia32.h: remove unused struct sigfram32 and rt_sigframe32 x86: asm-offset_64: use rt_sigframe_ia32 x86: sigframe.h: include headers for dependency x86: traps.c declare functions before they get used x86: PAT: update documentation to cover pgprot and remap_pfn related changes - v3 x86: PAT: add pgprot_writecombine() interface for drivers - v3 x86: PAT: change pgprot_noncached to uc_minus instead of strong uc - v3 x86: PAT: implement track/untrack of pfnmap regions for x86 - v3 ...	2008-12-28 12:07:57 -08:00
Linus Torvalds	bb26c6c29b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (105 commits) SELinux: don't check permissions for kernel mounts security: pass mount flags to security_sb_kern_mount() SELinux: correctly detect proc filesystems of the form "proc/foo" Audit: Log TIOCSTI user namespaces: document CFS behavior user namespaces: require cap_set{ug}id for CLONE_NEWUSER user namespaces: let user_ns be cloned with fairsched CRED: fix sparse warnings User namespaces: use the current_user_ns() macro User namespaces: set of cleanups (v2) nfsctl: add headers for credentials coda: fix creds reference capabilities: define get_vfs_caps_from_disk when file caps are not enabled CRED: Allow kernel services to override LSM settings for task actions CRED: Add a kernel_service object class to SELinux CRED: Differentiate objective and effective subjective credentials on a task CRED: Documentation CRED: Use creds in file structs CRED: Prettify commoncap.c CRED: Make execve() take advantage of copy-on-write credentials ...	2008-12-28 11:43:54 -08:00
Linus Torvalds	e14e61e967	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (57 commits) crypto: aes - Precompute tables crypto: talitos - Ack done interrupt in isr instead of tasklet crypto: testmgr - Correct comment about deflate parameters crypto: salsa20 - Remove private wrappers around various operations crypto: des3_ede - permit weak keys unless REQ_WEAK_KEY set crypto: sha512 - Switch to shash crypto: sha512 - Move message schedule W[80] to static percpu area crypto: michael_mic - Switch to shash crypto: wp512 - Switch to shash crypto: tgr192 - Switch to shash crypto: sha256 - Switch to shash crypto: md5 - Switch to shash crypto: md4 - Switch to shash crypto: sha1 - Switch to shash crypto: rmd320 - Switch to shash crypto: rmd256 - Switch to shash crypto: rmd160 - Switch to shash crypto: rmd128 - Switch to shash crypto: null - Switch to shash crypto: hash - Make setkey optional ...	2008-12-28 11:43:22 -08:00
Linus Torvalds	cb10ea549f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (367 commits) ALSA: ASoC: fix a typo in omp-pcm.c ASoC: Fix DSP formats in SSM2602 audio codec ASoC: Fix incorrect DSP format in OMAP McBSP DAI and affected drivers ALSA: hda: fix incorrect mixer index values for 92hd83xx ALSA: hda: dinput_mux check ALSA: hda - Add quirk for another HP dv7 ALSA: ASoC - Add missing __devexit annotation to wm8350.c ALSA: ASoc: DaVinci: davinci-evm use dsp_b mode ALSA: ASoC: DaVinci: i2s, evm, pass same value to codec and cpu_dai ALSA: ASoC: tlv320aic3x add dsp_a ALSA: ASoC: DaVinci: document I2S limitations ALSA: ASoC: DaVinci: davinci-i2s clean up ALSA: ASoC: DaVinci: davinci-i2s clean up ALSA: ASoC: DaVinci: davinci-i2s add comments to explain polarity ALSA: ASoC: DaVinci: davinvi-evm, make requests explicit ALSA: ca0106 - disable 44.1kHz capture ALSA: ca0106 - Add missing card->private_data initialization ALSA: ca0106 - Check ac97 availability at PM ALSA: hda - Power up always when no jack detection is available ALSA: hda - Fix unused variable warnings in patch_sigmatel.c ...	2008-12-28 11:41:32 -08:00
Jeremy Fitzhardinge	70a7d3cc13	swiotlb: add hwdev to swiotlb_phys_to_bus() / swiotlb_sg_to_bus() Impact: extend functions with a (yet unused) parameter, update callsites Some architectures need it - in preparation for highmem swiotlb. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-28 09:54:52 +01:00
Yinghai Lu	13a0c3c269	sparseirq: work around compiler optimizing away __weak functions Impact: fix panic on null pointer with sparseirq Some GCC versions seem to inline the weak global function, when that function is empty. Work it around, by making the functions return a (dummy) integer. Signed-off-by: Yinghai <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-27 13:24:00 +01:00
KOSAKI Motohiro	18eefedfe8	irq: simplify for_each_irq_desc() usage Impact: cleanup all for_each_irq_desc() usage point have !desc check. then its check can move into for_each_irq_desc() macro. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-26 09:48:18 +01:00
KOSAKI Motohiro	f9af0e7091	irq: for_each_irq_desc() move to irqnr.h Impact: cleanup before CONFIG_SPARSE_IRQ age, for_each_irq_desc() sat in irqnr.h and could be called from generic code. CONFIG_SPARSE_IRQ breaks this assumption, but SPARSE_IRQ version for_each_irq_desc() also can move into irqnr.h easily. Also, this patch unifies CONFIG_SPARSE_IRQ and !CONFIG_SPARSE_IRQ for_each_irq_desc(). Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-26 09:48:17 +01:00
Ingo Molnar	32e8d18683	Merge branches 'timers/clocksource', 'timers/hpet', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/rtc' into timers/core	2008-12-25 18:02:25 +01:00
Ingo Molnar	860cf8894b	Merge branches 'irq/sparseirq', 'irq/genirq' and 'irq/urgent'; commit 'v2.6.28' into irq/core	2008-12-25 16:27:54 +01:00
Ingo Molnar	6638101c11	Merge branches 'core/debugobjects', 'core/iommu', 'core/locking', 'core/printk', 'core/rcu', 'core/resources', 'core/softirq' and 'core/stacktrace' into core/core	2008-12-25 14:06:29 +01:00
Ingo Molnar	cc37d3d206	Merge branch 'core/futexes' into core/core	2008-12-25 13:54:14 +01:00
Ingo Molnar	0b271ef452	Merge commit 'v2.6.28' into core/core	2008-12-25 13:51:46 +01:00
Ingo Molnar	4e202284e6	Merge branch 'sched/urgent'; commit 'v2.6.28' into sched/core	2008-12-25 13:42:23 +01:00
Ingo Molnar	5250d329e3	Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'tracing/ring-buffer'; commit 'v2.6.28' into tracing/core	2008-12-25 13:11:00 +01:00
Takashi Iwai	a802269781	Merge branch 'topic/jack-mechanical' into to-push	2008-12-25 11:40:29 +01:00
Takashi Iwai	a65056205c	Merge branch 'topic/hda' into to-push	2008-12-25 11:40:28 +01:00
Takashi Iwai	5c8261e44e	Merge branch 'topic/asoc' into to-push	2008-12-25 11:40:25 +01:00
James Morris	cbacc2c7f0	Merge branch 'next' into for-linus	2008-12-25 11:40:09 +11:00
Herbert Xu	0426c16642	libcrc32c: Add crc32c_le macro The bnx2x driver actually uses the crc32c_le name so this patch restores the crc32c_le symbol through a macro. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:43 +11:00
Herbert Xu	69c35efcf1	libcrc32c: Move implementation to crypto crc32c This patch swaps the role of libcrc32c and crc32c. Previously the implementation was in libcrc32c and crc32c was a wrapper. Now the code is in crc32c and libcrc32c just calls the crypto layer. The reason for the change is to tap into the algorithm selection capability of the crypto API so that optimised implementations such as the one utilising Intel's CRC32C instruction can be used where available. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:40 +11:00
Herbert Xu	5f7082ed4f	crypto: hash - Export shash through hash This patch allows shash algorithms to be used through the old hash interface. This is a transitional measure so we can convert the underlying algorithms to shash before converting the users across. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:33 +11:00
Herbert Xu	dec8b78606	crypto: hash - Add import/export interface It is often useful to save the partial state of a hash function so that it can be used as a base for two or more computations. The most prominent example is HMAC where all hashes start from a base determined by the key. Having an import/export interface means that we only have to compute that base once rather than for each message. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:30 +11:00
Herbert Xu	3b2f6df082	crypto: hash - Export shash through ahash This patch allows shash algorithms to be used through the ahash interface. This is required before we can convert digest algorithms over to shash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:28 +11:00
Herbert Xu	7b5a080b3c	crypto: hash - Add shash interface The shash interface replaces the current synchronous hash interface. It improves over hash in two ways. Firstly shash is reentrant, meaning that the same tfm may be used by two threads simultaneously as all hashing state is stored in a local descriptor. The other enhancement is that shash no longer takes scatter list entries. This is because shash is specifically designed for synchronous algorithms and as such scatter lists are unnecessary. All existing hash users will be converted to shash once the algorithms have been completely converted. There is also a new finup function that combines update with final. This will be extended to ahash once the algorithm conversion is done. This is also the first time that an algorithm type has their own registration function. Existing algorithm types will be converted to this way in due course. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:26 +11:00
Herbert Xu	7b0bac64cd	crypto: api - Rebirth of crypto_alloc_tfm This patch reintroduces a completely revamped crypto_alloc_tfm. The biggest change is that we now take two crypto_type objects when allocating a tfm, a frontend and a backend. In fact this simply formalises what we've been doing behind the API's back. For example, as it stands crypto_alloc_ahash may use an actual ahash algorithm or a crypto_hash algorithm. Putting this in the API allows us to do this much more cleanly. The existing types will be converted across gradually. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:24 +11:00
Herbert Xu	4a7794860b	crypto: api - Move type exit function into crypto_tfm The type exit function needs to undo any allocations done by the type init function. However, the type init function may differ depending on the upper-level type of the transform (e.g., a crypto_blkcipher instantiated as a crypto_ablkcipher). So we need to move the exit function out of the lower-level structure and into crypto_tfm itself. As it stands this is a no-op since nobody uses exit functions at all. However, all cases where a lower-level type is instantiated as a different upper-level type (such as blkcipher as ablkcipher) will be converted such that they allocate the underlying transform and use that instead of casting (e.g., crypto_ablkcipher casted into crypto_blkcipher). That will need to use a different exit function depending on the upper-level type. This patch also allows the type init/exit functions to call (or not) cra_init/cra_exit instead of always calling them from the top level. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:01:23 +11:00
Ingo Molnar	db8862eafe	Merge branch 'linus' into tracing/hw-branch-tracing	2008-12-24 21:08:26 +01:00
Takashi Iwai	7645c4bfbb	Merge branch 'fix/hda' into topic/hda	2008-12-24 11:04:08 +01:00
Olga Kornievskaia	61054b14d5	nfsd: support callbacks with gss flavors This patch adds server-side support for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:00 -05:00
Olga Kornievskaia	608207e888	rpc: pass target name down to rpc level on callbacks The rpc client needs to know the principal that the setclientid was done as, so it can tell gssd who to authenticate to. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:40 -05:00
Olga Kornievskaia	68e76ad0ba	nfsd: pass client principal name in rsc downcall Two principals are involved in krb5 authentication: the target, who we authenticate to (normally the name of the server, like nfs/server.citi.umich.edu@CITI.UMICH.EDU), and the source, we we authenticate as (normally a user, like bfields@UMICH.EDU) In the case of NFSv4 callbacks, the target of the callback should be the source of the client's setclientid call, and the source should be the nfs server's own principal. Therefore we allow svcgssd to pass down the name of the principal that just authenticated, so that on setclientid we can store that principal name with the new client, to be used later on callbacks. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:15 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	c381060869	rpc: add an rpc_pipe_open method We want to transition to a new gssd upcall which is text-based and more easily extensible. To simplify upgrades, as well as testing and debugging, it will help if we can upgrade gssd (to a version which understands the new upcall) without having to choose at boot (or module-load) time whether we want the new or the old upcall. We will do this by providing two different pipes: one named, as currently, after the mechanism (normally "krb5"), and supporting the old upcall. One named "gssd" and supporting the new upcall version. We allow gssd to indicate which version it supports by its choice of which pipe to open. As we have no interest in supporting simultaneous use of both versions, we'll forbid opening both pipes at the same time. So, add a new pipe_open callback to the rpc_pipefs api, which the gss code can use to track which pipes have been open, and to refuse opens of incompatible pipes. We only need this to be called on the first open of a given pipe. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:08:32 -05:00
Benny Halevy	c977a2ef40	sunrpc: get rid of rpc_rqst.rq_bufsize rq_bufsize is not used. Signed-off-by: Mike Sager <Mike.Sager@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:13 -05:00
Peter Staubach	64672d55d9	optimize attribute timeouts for "noac" and "actimeo=0" Hi. I've been looking at a bugzilla which describes a problem where a customer was advised to use either the "noac" or "actimeo=0" mount options to solve a consistency problem that they were seeing in the file attributes. It turned out that this solution did not work reliably for them because sometimes, the local attribute cache was believed to be valid and not timed out. (With an attribute cache timeout of 0, the cache should always appear to be timed out.) In looking at this situation, it appears to me that the problem is that the attribute cache timeout code has an off-by-one error in it. It is assuming that the cache is valid in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The cache should be considered valid only in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo). With this change, the options, "noac" and "actimeo=0", work as originally expected. This problem was previously addressed by special casing the attrtimeo == 0 case. However, since the problem is only an off- by-one error, the cleaner solution is address the off-by-one error and thus, not require the special case. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	dc0b027dfa	NFSv4: Convert the open and close ops to use fmode Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	bd7bf9d540	NFSv4: Convert delegation->type field to fmode_t Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:53 -05:00
Trond Myklebust	95d35cb4c4	NFSv4: Remove nfs_client->cl_sem Now that we're using the flags to indicate state that needs to be recovered, as well as having implemented proper refcounting and spinlocking on the state and open_owners, we can get rid of nfs_client->cl_sem. The only remaining case that was dubious was the file locking, and that case is now covered by the nfsi->rwsem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:45 -05:00
Chuck Lever	0cb2659b81	NLM: allow lockd requests from an unprivileged port If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's lockd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:38 -05:00
Chuck Lever	d740351bf0	NFS: add "[no]resvport" mount option The standard default security setting for NFS is AUTH_SYS. An NFS client connects to NFS servers via a privileged source port and a fixed standard destination port (2049). The client sends raw uid and gid numbers to identify users making NFS requests, and the server assumes an appropriate authority on the client has vetted these values because the source port is privileged. On Linux, by default in-kernel RPC services use a privileged port in the range between 650 and 1023 to avoid using source ports of well- known IP services. Using such a small range limits the number of NFS mount points and the number of unique NFS servers to which a client can connect concurrently. An NFS client can use unprivileged source ports to expand the range of source port numbers, allowing more concurrent server connections and more NFS mount points. Servers must explicitly allow NFS connections from unprivileged ports for this to work. In the past, bumping the value of the sunrpc.max_resvport sysctl on the client would permit the NFS client to use unprivileged ports. Bumping this setting also changes the maximum port number used by other in-kernel RPC services, some of which still required a port number less than 1023. This is exacerbated by the way source port numbers are chosen by the Linux RPC client, which starts at the top of the range and works downwards. It means that bumping the maximum means all RPC services requesting a source port will likely get an unprivileged port instead of a privileged one. Changing this setting effects all NFS mount points on a client. A sysadmin could not selectively choose which mount points would use non-privileged ports and which could not. Lastly, this mechanism of expanding the limit on the number of NFS mount points was entirely undocumented. To address the need for the NFS client to use a large range of source ports without interfering with the activity of other in-kernel RPC services, we introduce a new NFS mount option. This option explicitly tells only the NFS client to use a non-privileged source port when communicating with the NFS server for one specific mount point. This new mount option is called "resvport," like the similar NFS mount option on FreeBSD and Mac OS X. A sister patch for nfs-utils will be submitted that documents this new option in nfs(5). The default setting for this new mount option requires the NFS client to use a privileged port, as before. Explicitly specifying the "noresvport" mount option allows the NFS client to use an unprivileged source port for this mount point when connecting to the NFS server port. This mount option is supported only for text-based NFS mounts. [ Sidebar: it is widely known that security mechanisms based on the use of privileged source ports are ineffective. However, the NFS client can combine the use of unprivileged ports with the use of secure authentication mechanisms, such as Kerberos. This allows a large number of connections and mount points while ensuring a useful level of security. Eventually we may change the default setting for this option depending on the security flavor used for the mount. For example, if the mount is using only AUTH_SYS, then the default setting will be "resvport;" if the mount is using a strong security flavor such as krb5, the default setting will be "noresvport." ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Trond.Myklebust@netapp.com: Fixed a bug whereby nfs4_init_client() was being called with incorrect arguments.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:37 -05:00
Chuck Lever	146ec944bb	NFS: Move declaration of nfs_mount() to fs/nfs/internal.h Clean up: The nfs_mount() function is not to be used outside of the NFS client. Move its public declaration to fs/nfs/internal.h. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:34 -05:00
Trond Myklebust	88a9fe8cae	SUNRPC: Remove the last remnant of the BKL... Somehow, this escaped the previous purge. There should be no need to keep any extra locks in the XDR callbacks. The NFS client XDR code only writes into private objects, whereas all reads of shared objects are confined to fields that do not change, such as filehandles... Ditto for lockd, the NFSv2/v3 client mount code, and rpcbind. The nfsd XDR code may require the BKL, but since it does a synchronous RPC call from a thread that already holds the lock, that issue is moot. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:31 -05:00
Ingo Molnar	bed4f13065	Merge branch 'x86/irq' into x86/core	2008-12-23 16:30:31 +01:00
Ingo Molnar	fa623d1b02	Merge branches 'x86/apic', 'x86/cleanups', 'x86/cpufeature', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core	2008-12-23 16:27:23 +01:00
Ingo Molnar	bf8bd66d05	Merge branch 'x86/apic' into x86/irq Conflicts: arch/x86/kernel/apic.c	2008-12-23 16:24:15 +01:00
Kay Sievers	160bbab300	[MTD] struct device - replace bus_id with dev_name(), dev_set_name() Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-12-23 10:00:14 +00:00
Neil Horman	908a7a16b8	net: Remove unused netdev arg from some NAPI interfaces. When the napi api was changed to separate its 1:1 binding to the net_device struct, the netif_rx_[prep\|schedule\|complete] api failed to remove the now vestigual net_device structure parameter. This patch cleans up that api by properly removing it.. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-22 20:43:12 -08:00
David Vrabel	a01777ecf2	uwb: remove unused include/linux/uwb/debug.h Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-12-22 18:30:29 +00:00
Yevgeny Petrilin	b8dd786f94	mlx4_core: Add support for multiple completion event vectors When using MSI-X mode, create a completion event queue for each CPU. Report the number of completion EQs in a new struct mlx4_caps member, num_comp_vectors, and extend the mlx4_cq_alloc() interface with a vector parameter so that consumers can specify which completion EQ should be used to report events for the CQ being created. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-12-22 07:15:03 -08:00
Paul Mundt	209aa4fdc3	fb: SH-5 uses __raw I/O accessors now also, drop the special casing. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-12-22 18:44:05 +09:00
Lachlan McIlroy	27a0464a6c	[XFS] Fix merge conflict in fs/xfs/xfs_rename.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: fs/xfs/xfs_rename.c Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-22 17:34:26 +11:00
Don Skidmore	f4314e815e	net: add DCNA attribute to the BCN interface for DCB Adds the Backward Congestion Notification Address (BCNA) attribute to the Backward Congestion Notification (BCN) interface for Data Center Bridging (DCB), which was missing. Receive the BCNA attribute in the ixgbe driver. The BCNA attribute is for a switch to inform the endstation about the physical port identification in order to support BCN on aggregated links. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Eric W Multanen <eric.w.multanen@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2008-12-21 20:10:29 -08:00
Lai Jiangshan	3ddeb912f4	ftrace: enable format arguments checking Impact: broaden gcc printf format checks for ftrace_printk() format arguments checking for ftrace_printk() is __printf(1, 2), not __printf(1, 0). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-21 09:46:45 +01:00
Anton Vorontsov	749820928a	of/gpio: Implement of_gpio_count() This function is used to count how many GPIOs are specified for a device node. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:14 +11:00
Kwangwoo Lee	50b6f1f4a4	Input: add tsc2007 based touchscreen driver This drive has been tested on ARM9 based SoC - MV86XX. Signed-off-by: Kwangwoo Lee <kwangwoo.lee@gmail.com> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-12-20 05:00:43 -05:00
Dmitry Torokhov	93b8eef1c0	Merge commit 'v2.6.28-rc9' into next	2008-12-20 04:54:54 -05:00
Markus Metzger	c5dee6177f	x86, bts: memory accounting Impact: move the BTS buffer accounting to the mlock bucket Add alloc_locked_buffer() and free_locked_buffer() functions to mm/mlock.c to kalloc a buffer and account the locked memory to current. Account the memory for the BTS buffer to the tracer. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-20 09:15:47 +01:00
Markus Metzger	bf53de907d	x86, bts: add fork and exit handling Impact: introduce new ptrace facility Add arch_ptrace_untrace() function that is called when the tracer detaches (either voluntarily or when the tracing task dies); ptrace_disable() is only called on a voluntary detach. Add ptrace_fork() and arch_ptrace_fork(). They are called when a traced task is forked. Clear DS and BTS related fields on fork. Release DS resources and reclaim memory in ptrace_untrace(). This releases resources already when the tracing task dies. We used to do that when the traced task dies. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-20 09:15:46 +01:00
venkatesh.pallipadi@intel.com	34801ba9bf	x86: PAT: move track untrack pfnmap stubs to asm-generic Impact: Cleanup and branch hints only. Move the track and untrack pfn stub routines from memory.c to asm-generic. Also add unlikely to pfnmap related calls in fork and exit path. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
venkatesh.pallipadi@intel.com	982d789ab7	x86: PAT: remove follow_pfnmap_pte in favor of follow_phys Impact: Cleanup - removes a new function in favor of a recently modified older one. Replace follow_pfnmap_pte in pat code with follow_phys. follow_phys lso returns protection eliminating the need of pte_pgprot call. Using follow_phys also eliminates the need for pte_pa. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
venkatesh.pallipadi@intel.com	d87fe6607c	x86: PAT: modify follow_phys to return phys_addr prot and return value Impact: Changes and globalizes an existing static interface. Follow_phys does similar things as follow_pfnmap_pte. Make a minor change to follow_phys so that it can be used in place of follow_pfnmap_pte. Physical address return value with 0 as error return does not work in follow_phys as the actual physical address 0 mapping may exist in pte. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
venkatesh.pallipadi@intel.com	6bd9cd50c8	x86: PAT: clarify is_linear_pfn_mapping() interface Impact: Documentation only Incremental patches to address the review comments from Nick Piggin for v3 version of x86 PAT pfnmap changes patchset here http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/01330.html This patch: Clarify is_linear_pfn_mapping() and its usage. It is used by x86 PAT code for performance reasons. Identifying pfnmap as linear over entire vma helps speedup reserve and free of memtype for the region. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-19 15:40:30 -08:00
James Morris	12204e24b1	security: pass mount flags to security_sb_kern_mount() Pass mount flags to security_sb_kern_mount(), so security modules can determine if a mount operation is being performed by the kernel. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>	2008-12-20 09:02:39 +11:00
Sujith	094d05dc32	mac80211: Fix HT channel selection HT management is done differently for AP and STA modes, unify to just the ->config() callback since HT is fundamentally a PHY property and cannot be per-BSS. Rename enum nl80211_sec_chan_offset as nl80211_channel_type to denote the channel type ( NO_HT, HT20, HT40+, HT40- ). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-19 15:22:54 -05:00
Henning Rogge	420e7fabd9	nl80211: Add signal strength and bandwith to nl80211station info This patch adds signal strength and transmission bitrate to the station_info of nl80211. Signed-off-by: Henning Rogge <rogge@fgan.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-19 15:04:54 -05:00
Rafael J. Wysocki	ba84ed9546	ACPI hibernate: Introduce new kernel parameter acpi_sleep=s4_nonvs On some machines it may be necessary to disable the saving/restoring of the ACPI NVS memory region during hibernation/resume. For this purpose, introduce new ACPI kernel command line option acpi_sleep=s4_nonvs. Based on a patch by Zhang Rui. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Nigel Cunningham <nigel@tuxonice.net> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>	2008-12-19 04:40:35 -05:00
Rafael J. Wysocki	3f4b0ef7f2	ACPI hibernate: Add a mechanism to save/restore ACPI NVS memory According to the ACPI Specification 3.0b, Section 15.3.2, "OSPM will call the _PTS control method some time before entering a sleeping state, to allow the platform's AML code to update this memory image before entering the sleeping state. After the system awakes from an S4 state, OSPM will restore this memory area and call the _WAK control method to enable the BIOS to reclaim its memory image." For this reason, implement a mechanism allowing us to save the NVS memory during hibernation and to restore it during the subsequent resume. Based on a patch by Zhang Rui. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Nigel Cunningham <nigel@tuxonice.net> Cc: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-12-19 04:40:34 -05:00
Ingo Molnar	30cd324e97	Merge branches 'tracing/ftrace', 'tracing/ring-buffer' and 'tracing/urgent' into tracing/core Conflicts: include/linux/ftrace.h	2008-12-19 09:42:40 +01:00
Ingo Molnar	06aaf76a7e	sched: move test_sd_parent() to an SMP section of sched.h Impact: build fix Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:56 +01:00
Vaidyanathan Srinivasan	100fdaee70	sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0 Impact: change task balancing to save power more agressively Add SD_BALANCE_NEWIDLE flag at MC level and CPU level if sched_mc is set. This helps power savings and will not affect performance when sched_mc=0 Ingo and Mike Galbraith have optimised the SD flags by removing SD_BALANCE_NEWIDLE at MC and CPU level. This helps performance but hurts power savings since this slows down task consolidation by reducing the number of times load_balance is run. sched: fine-tune SD_MC_INIT commit `1480098470` Author: Mike Galbraith <efault@gmx.de> Date: Fri Nov 7 15:26:50 2008 +0100 sched: re-tune balancing -- revert commit `9fcd18c9e6` Author: Ingo Molnar <mingo@elte.hu> Date: Wed Nov 5 16:52:08 2008 +0100 This patch selectively enables SD_BALANCE_NEWIDLE flag only when sched_mc is set to 1 or 2. This helps power savings by task consolidation and also does not hurt performance at sched_mc=0 where all power saving optimisations are turned off. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:55 +01:00
Gautham R Shenoy	afb8a9b70b	sched: framework for sched_mc/smt_power_savings=N Impact: extend range of /sys/devices/system/cpu/sched_mc_power_savings Currently the sched_mc/smt_power_savings variable is a boolean, which either enables or disables topology based power savings. This patch extends the behaviour of the variable from boolean to multivalued, such that based on the value, we decide how aggressively do we want to perform powersavings balance at appropriate sched domain based on topology. Variable levels of power saving tunable would benefit end user to match the required level of power savings vs performance trade-off depending on the system configuration and workloads. This version makes the sched_mc_power_savings global variable to take more values (0,1,2). Later versions can have a single tunable called sched_power_savings instead of sched_{mc,smt}_power_savings. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:46 +01:00
Vaidyanathan Srinivasan	716707b299	sched: convert BALANCE_FOR_xx_POWER to inline functions Impact: cleanup BALANCE_FOR_MC_POWER and similar macros defined in sched.h are not constants and have various condition checks and significant amount of code that is not suitable to be contain in a macro. Also there could be side effects on the expressions passed to some of them like test_sd_parent(). This patch converts all complex macros related to power savings balance to inline functions. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 09:21:45 +01:00
Takashi Iwai	0ff555192a	Merge branch 'fix/hda' into topic/hda	2008-12-19 08:22:57 +01:00
Mike Travis	e057d7aea9	cpumask: add sysfs displays for configured and disabled cpu maps Impact: add new sysfs files. Add sysfs files "kernel_max" and "offline" to display the max CPU index allowed (NR_CPUS-1), and the map of cpus that are offline. Cpus can be offlined via HOTPLUG, disabled by the BIOS ACPI tables, or if they exceed the number of cpus allowed by the NR_CPUS config option, or the "maxcpus=NUM" kernel start parameter. The "possible_cpus=NUM" parameter can also extend the number of possible cpus allowed, in which case the cpus not present at startup will be in the offline state. (These cpus can be HOTPLUGGED ON after system startup [pending a follow-on patch to provide the capability via the /sys/devices/sys/cpu/cpuN/online mechanism to bring them online.]) By design, the "offlined cpus > possible cpus" display will always use the following formats: * all possible cpus online: "x$" or "x-y$" * some possible cpus offline: ".,x$" or ".,x-y$" where: x == number of possible cpus (nr_cpu_ids); and y == number of cpus >= NR_CPUS or maxcpus (if y > x). One use of this feature is for distros to select (or configure) the appropriate kernel to install for the resident system. Notes: * cpus offlined <= possible cpus will be printed for all architectures. * cpus offlined > possible cpus will only be printed for arches that set 'total_cpus' [X86 only in this patch]. Based on tip/cpus4096 + .../rusty/linux-2.6-for-ingo.git/master + x86-only-patches sent 12/15. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-19 17:47:04 +10:30
Mike Travis	7b4967c532	cpumask: Add alloc_cpumask_var_node() Impact: New API This will be needed in x86 code to allocate the domain and old_domain cpumasks on the same node as where the containing irq_cfg struct is allocated. (Also fixes double-dump_stack on rare CONFIG_DEBUG_PER_CPU_MAPS case) Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (re-impl alloc_cpumask_var)	2008-12-19 16:56:37 +10:30
Yinghai Lu	078a55db07	sparseirq: add kernel-doc notation for new member in irq_desc, -v2 Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-19 02:06:53 +01:00
Russell King	fdb0a1a67e	Merge branch 'next-merged' of git://aeryn.fluff.org.uk/bjdooks/linux into devel	2008-12-18 22:15:30 +00:00
venkatesh.pallipadi@intel.com	2ab640379a	x86: PAT: hooks in generic vm code to help archs to track pfnmap regions - v3 Impact: Introduces new hooks, which are currently null. Introduce generic hooks in remap_pfn_range and vm_insert_pfn and corresponding copy and free routines with reserve and free tracking. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-18 13:30:15 -08:00
venkatesh.pallipadi@intel.com	e121e41844	x86: PAT: add follow_pfnmp_pte routine to help tracking pfnmap pages - v3 Impact: New currently unused interface. Add a generic interface to follow pfn in a pfnmap vma range. This is used by one of the subsequent x86 PAT related patch to keep track of memory types for vma regions across vma copy and free. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-18 13:30:15 -08:00
venkatesh.pallipadi@intel.com	3c8bb73ace	x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3 Impact: Code transformation, new functions added should have no effect. Drivers use mmap followed by pgprot_* and remap_pfn_range or vm_insert_pfn, in order to export reserved memory to userspace. Currently, such mappings are not tracked and hence not kept consistent with other mappings (/dev/mem, pci resource, ioremap) for the sme memory, that may exist in the system. The following patchset adds x86 PAT attribute tracking and untracking for pfnmap related APIs. First three patches in the patchset are changing the generic mm code to fit in this tracking. Last four patches are x86 specific to make things work with x86 PAT code. The patchset aso introduces pgprot_writecombine interface, which gives writecombine mapping when enabled, falling back to pgprot_noncached otherwise. This patch: While working on x86 PAT, we faced some hurdles with trackking remap_pfn_range() regions, as we do not have any information to say whether that PFNMAP mapping is linear for the entire vma range or it is smaller granularity regions within the vma. A simple solution to this is to use vm_pgoff as an indicator for linear mapping over the vma region. Currently, remap_pfn_range only sets vm_pgoff for COW mappings. Below patch changes the logic and sets the vm_pgoff irrespective of COW. This will still not be enough for the case where pfn is zero (vma region mapped to physical address zero). But, for all the other cases, we can look at pfnmap VMAs and say whether the mappng is for the entire vma region or not. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-12-18 13:30:15 -08:00
Paul E. McKenney	64db4cfff9	"Tree RCU": scalable classic RCU implementation This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 21:56:04 +01:00
Ingo Molnar	d110ec3a1e	Merge branch 'linus' into core/rcu	2008-12-18 21:54:49 +01:00
Linus Torvalds	b3806c3b94	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: bnx2: Fix bug in bnx2_free_rx_mem(). irda: Add irda_skb_cb qdisc related padding jme: Fixed a typo net: kernel BUG at drivers/net/phy/mdio_bus.c:165! drivers/net: starfire: Fix napi ->poll() weight handling tlan: Fix pci memory unmapping enc28j60: use netif_rx_ni() to deliver RX packets tlan: Fix small (< 64 bytes) datagram transmissions netfilter: ctnetlink: fix missing CTA_NAT_SEQ_UNSPEC	2008-12-18 12:00:46 -08:00
Mark Brown	40aa4a30d0	ASoC: Add WM8350 AudioPlus codec driver The WM8350 is an integrated audio and power management subsystem which provides a single-chip solution for portable audio and multimedia systems. The integrated audio CODEC provides all the necessary functions for high-quality stereo recording and playback. Programmable on-chip amplifiers allow for the direct connection of headphones and microphones with a minimum of external components. A programmable low-noise bias voltage is available to feed one or more electret microphones. Additional audio features include programmable high-pass filter in the ADC input path. This driver was originally written by Liam Girdwood with further updates from me. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2008-12-18 17:21:07 +00:00
KOSAKI Motohiro	74c8a61304	locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP Impact: simplify code commit "08678b0: generic: sparse irqs: use irq_desc() [...]" introduced the irq_desc_lock_class variable. But it is used only if CONFIG_SPARSE_IRQ=Y or CONFIG_TRACE_IRQFLAGS=Y. Otherwise, following warnings happen: CC kernel/irq/handle.o kernel/irq/handle.c:26: warning: 'irq_desc_lock_class' defined but not used Actually, current early_init_irq_lock_class has a bit strange and messy ifdef. In addition, it is not valueable. 1. this function is protected by !CONFIG_SPARSE_IRQ, but that is not necessary. if CONFIG_SPARSE_IRQ=Y, desc of all irq number are initialized by NULL at first - then this function calling is safe. 2. this function protected by CONFIG_TRACE_IRQFLAGS too. but it is not necessary either, because lockdep_set_class() doesn't have bad side effect even if CONFIG_TRACE_IRQFLAGS=n. This patch bloat kernel size a bit on CONFIG_TRACE_IRQFLAGS=n and CONFIG_SPARSE_IRQ=Y - but that's ok. early_init_irq_lock_class() is not a fastpatch at all. To avoid messy ifdefs is more important than a few bytes diet. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 14:35:53 +01:00
Ken Chen	9c2c48020e	schedstat: consolidate per-task cpu runtime stats Impact: simplify code When we turn on CONFIG_SCHEDSTATS, per-task cpu runtime is accumulated twice. Once in task->se.sum_exec_runtime and once in sched_info.cpu_time. These two stats are exactly the same. Given that task->se.sum_exec_runtime is always accumulated by the core scheduler, sched_info can reuse that data instead of duplicate the accounting. Signed-off-by: Ken Chen <kenchen@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 13:54:01 +01:00
Steven Rostedt	f38f1d2aa5	trace: add a way to enable or disable the stack tracer Impact: enhancement to stack tracer The stack tracer currently is either on when configured in or off when it is not. It can not be disabled when it is configured on. (besides disabling the function tracer that it uses) This patch adds a way to enable or disable the stack tracer at run time. It defaults off on bootup, but a kernel parameter 'stacktrace' has been added to enable it on bootup. A new sysctl has been added "kernel.stack_tracer_enabled" to let the user enable or disable the stack tracer at run time. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 12:56:24 +01:00
Ingo Molnar	b9974dc6bd	Merge branch 'linus' into cpus4096	2008-12-18 11:48:30 +01:00
Paul Mackerras	c280266a32	Merge branch 'linux-2.6' into next	2008-12-18 11:06:12 +11:00
Rémi Denis-Courmont	57c81fffc8	Phonet: allocate separate ARP type for GPRS over a Phonet pipe A separate xmit lock class supports GPRS over a Phonet pipe over a TUN device (type ARPHRD_NONE). Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-17 15:47:48 -08:00
Rémi Denis-Courmont	2d91d78b68	Phonet: allocate a non-Ethernet ARP type Also leave some room for more 802.11 types. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-17 15:47:29 -08:00
Phil Endecott	9a9fafb894	USB: fix comment about endianness of descriptors This patch fixes a comment and clarifies the documentation about the endianness of descriptors. The current policy is that descriptors will be little-endian at the API even on big-endian systems; however the /proc/bus/usb API predates this policy and presents descriptors with some multibyte fields byte-swapped. Signed-off-by: Phil Endecott <usb_endian_patch@chezphil.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-12-17 10:49:14 -08:00
Ian Campbell	b81ea27b23	swiotlb: add arch hook to force mapping Impact: generalize the sw-IOTLB range checks Some architectures require special rules to determine whether a range needs mapping or not. This adds a weak function for architectures to override. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-17 18:58:13 +01:00
Ian Campbell	e08e1f7adb	swiotlb: allow architectures to override phys<->bus<->phys conversions Impact: generalize phys<->bus<->phys conversions in the swiotlb code Architectures may need to override these conversions. Implement a __weak hook point containing the default implementation. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-17 18:58:09 +01:00
Ingo Molnar	855caa37b9	Merge branch 'x86/crashdump' into cpus4096 Conflicts: arch/x86/kernel/crash.c Merged for semantic conflict: arch/x86/kernel/reboot.c	2008-12-17 13:24:52 +01:00
Ingo Molnar	948a7b2b5e	Merge branch 'irq/sparseirq' into cpus4096 Conflicts: arch/x86/kernel/io_apic.c Merge irq/sparseirq here, to resolve conflicts.	2008-12-17 13:16:08 +01:00
Ingo Molnar	1f3f424a6b	Merge branch 'linus' into cpus4096	2008-12-17 13:07:48 +01:00
Andy Fleming	b31a1d8b41	gianfar: Convert gianfar to an of_platform_driver Does the same for the accompanying MDIO driver, and then modifies the TBI configuration method. The old way used fields in einfo, which no longer exists. The new way is to create an MDIO device-tree node for each instance of gianfar, and create a tbi-handle property to associate ethernet controllers with the TBI PHYs they are connected to. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 15:29:15 -08:00
David S. Miller	354ade9058	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/enc28j60.c	2008-12-16 15:23:54 -08:00
Yinghai Lu	48a1b10aff	x86, sparseirq: move irq_desc according to smp_affinity, v7 Impact: improve NUMA handling by migrating irq_desc on smp_affinity changes if CONFIG_NUMA_MIGRATE_IRQ_DESC is set: - make irq_desc to go with affinity aka irq_desc moving etc - call move_irq_desc in irq_complete_move() - legacy irq_desc is not moved, because they are allocated via static array for logical apic mode, need to add move_desc_in_progress_in_same_domain, otherwise it will not be moved ==> also could need two phases to get irq_desc moved. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-17 00:14:01 +01:00
Ian Campbell	0016fdee92	swiotlb: move some definitions to header Impact: cleanup Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-16 21:31:40 +01:00
Jeremy Fitzhardinge	8c5df16bec	swiotlb: allow architectures to override swiotlb pool allocation Impact: generalize swiotlb allocation code Architectures may need to allocate memory specially for use with the swiotlb. Create the weak function swiotlb_alloc_boot() and swiotlb_alloc() defaulting to the current behaviour. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-16 21:31:38 +01:00
Ingo Molnar	9dfc3bc7d2	Merge branches 'tracing/fastboot', 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/hw-branch-tracing' into tracing/core	2008-12-16 12:03:38 +01:00
Yang Hongyang	b24a2516d1	ipv6: Add IPV6_PKTINFO sticky option support to setsockopt() There are three reasons for me to add this support: 1.When no interface is specified in an IPV6_PKTINFO ancillary data item, the interface specified in an IPV6_PKTINFO sticky optionis is used. RFC3542: 6.7. Summary of Outgoing Interface Selection This document and [RFC-3493] specify various methods that affect the selection of the packet's outgoing interface. This subsection summarizes the ordering among those in order to ensure deterministic behavior. For a given outgoing packet on a given socket, the outgoing interface is determined in the following order: 1. if an interface is specified in an IPV6_PKTINFO ancillary data item, the interface is used. 2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky option, the interface is used. 2.When no IPV6_PKTINFO ancillary data is received,getsockopt() should return the sticky option value which set with setsockopt(). RFC 3542: Issuing getsockopt() for the above options will return the sticky option value i.e., the value set with setsockopt(). If no sticky option value has been set getsockopt() will return the following values: 3.Make the setsockopt implementation POSIX compliant. Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 02:06:23 -08:00
Steve Glendinning	bc02ff95fe	net: Refactor full duplex flow control resolution These 4 drivers have identical full duplex flow control resolution functions. This patch changes them all to use one common function. The function in question decides whether a device should enable TX and RX flow control in a standard way (IEEE 802.3-2005 table 28B-3), so this should also be useful for other drivers. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 02:00:48 -08:00
Steve Glendinning	e18ce34654	net: Move flow control definitions to mii.h flags used within drivers for indicating tx and rx flow control are defined in 4 drivers (and probably more), move these constants to mii.h. The 3 SMSC drivers use the same constants (FLOW_CTRL_TX), but TG3 uses TG3_FLOW_CTRL_TX, so this patch also renames the constants within TG3. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 02:00:00 -08:00
Pablo Neira Ayuso	092cab7e2c	netfilter: ctnetlink: fix missing CTA_NAT_SEQ_UNSPEC This patch fixes an inconsistency in nfnetlink_conntrack.h that I introduced myself. The problem is that CTA_NAT_SEQ_UNSPEC is missing from enum ctattr_natseq. This inconsistency may lead to problems in the message parsing in userspace (if the message contains the CTA_NAT_SEQ_* attributes, of course). This patch breaks backward compatibility, however, the only known client of this code is libnetfilter_conntrack which indeed crashes because it assumes the existence of CTA_NAT_SEQ_UNSPEC to do the parsing. The CTA_NAT_SEQ_* attributes were introduced in 2.6.25. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-16 01:19:41 -08:00
Herbert Xu	b240a0e564	ethtool: Add GGRO and SGRO ops This patch adds the ethtool ops to enable and disable GRO. It also makes GRO depend on RX checksum offload much the same as how TSO depends on SG support. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:44:31 -08:00
Herbert Xu	71d93b39e5	net: Add skb_gro_receive This patch adds the helper skb_gro_receive to merge packets for GRO. The current method is to allocate a new header skb and then chain the original packets to its frag_list. This is done to make it easier to integrate into the existing GSO framework. In future as GSO is moved into the drivers, we can undo this and simply chain the original packets together. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:42:33 -08:00
Herbert Xu	d565b0a1a9	net: Add Generic Receive Offload infrastructure This patch adds the top-level GRO (Generic Receive Offload) infrastructure. This is pretty similar to LRO except that this is protocol-independent. Instead of holding packets in an lro_mgr structure, they're now held in napi_struct. For drivers that intend to use this, they can set the NETIF_F_GRO bit and call napi_gro_receive instead of netif_receive_skb or just call netif_rx. The latter will call napi_receive_skb automatically. When napi_gro_receive is used, the driver must either call napi_complete/napi_rx_complete, or call napi_gro_flush in softirq context if the driver uses the primitives __napi_complete/__napi_rx_complete. Protocols will set the gro_receive and gro_complete function pointers in order to participate in this scheme. In addition to the packet, gro_receive will get a list of currently held packets. Each packet in the list has a same_flow field which is non-zero if it is a potential match for the new packet. For each packet that may match, they also have a flush field which is non-zero if the held packet must not be merged with the new packet. Once gro_receive has determined that the new skb matches a held packet, the held packet may be processed immediately if the new skb cannot be merged with it. In this case gro_receive should return the pointer to the existing skb in gro_list. Otherwise the new skb should be merged into the existing packet and NULL should be returned, unless the new skb makes it impossible for any further merges to be made (e.g., FIN packet) where the merged skb should be returned. Whenever the skb is merged into an existing entry, the gro_receive function should set NAPI_GRO_CB(skb)->same_flow. Note that if an skb merely matches an existing entry but can't be merged with it, then this shouldn't be set. If gro_receive finds it pointless to hold the new skb for future merging, it should set NAPI_GRO_CB(skb)->flush. Held packets will be flushed by napi_gro_flush which is called by napi_complete and napi_rx_complete. Currently held packets are stored in a singly liked list just like LRO. The list is limited to a maximum of 8 entries. In future, this may be expanded to use a hash table to allow more flows to be held for merging. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:38:52 -08:00
Herbert Xu	1a881f27c5	net: Add frag_list support to GSO This patch allows GSO to handle frag_list in a limited way for the purposes of allowing packets merged by GRO to be refragmented on output. Most hardware won't (and aren't expected to) support handling GRO frag_list packets directly. Therefore we will perform GSO in software for those cases. However, for drivers that can support it (such as virtual NICs) we may not have to segment the packets at all. Whether the added overhead of GRO/GSO is worthwhile for bridges and routers when weighed against the benefit of potentially increasing the MTU within the host is still an open question. However, for the case of host nodes this is undoubtedly a win. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 23:27:47 -08:00
Kay Sievers	b53c7583e2	rapidio: struct device - replace bus_id with dev_name(), dev_set_name() Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-16 15:53:41 +11:00
David S. Miller	eb14f01959	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c	2008-12-15 20:03:50 -08:00
Paul Mackerras	1e1c568d6c	Merge branch 'merge' into next	2008-12-16 14:38:58 +11:00
Linus Torvalds	7004405cb8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: Phonet: keep TX queue disabled when the device is off SCHED: netem: Correct documentation comment in code. netfilter: update rwlock initialization for nat_table netlabel: Compiler warning and NULL pointer dereference fix e1000e: fix double release of mutex IA64: HP_SIMETH needs to depend upon NET netpoll: fix race on poll_list resulting in garbage entry ipv6: silence log messages for locally generated multicast sungem: improve ethtool output with internal pcs and serdes tcp: tcp_vegas cong avoid fix sungem: Make PCS PHY support partially work again.	2008-12-15 16:30:22 -08:00
Rusty Russell	d2ff911882	Define smp_call_function_many for UP Otherwise those using it in transition patches (eg. kvm) can't compile with CONFIG_SMP=n: arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function 'make_all_cpus_request': arch/x86/kvm/../../../virt/kvm/kvm_main.c:380: error: implicit declaration of function 'smp_call_function_many' Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-15 16:28:57 -08:00
Ben Dooks	b690ace50b	[ARM] S3C6400: serial support for S3C6400 and S3C6410 SoCs Add support to the Samsung serial driver for the S3C6400 and S3C6410 serial ports. Signed-off-by: Ben Dooks <ben-linux@fluff.org>	2008-12-15 21:58:11 +00:00
Rusty Russell	968ea6d80e	Merge ../linux-2.6-x86 Conflicts: arch/x86/kernel/io_apic.c kernel/sched.c kernel/sched_stats.h	2008-12-13 21:55:51 +10:30
Rusty Russell	7be7585393	cpumask: Use all NR_CPUS bits unless CONFIG_CPUMASK_OFFSTACK Impact: futureproof as we convert more code to new APIs The old cpumask operators treat all NR_CPUS bits as relevent, the new ones use nr_cpumask_bits. For large NR_CPUS and small nr_cpu_ids, this makes a difference. However, mixing the two can cause problems with undefined bits. An arch which sets CONFIG_CPUMASK_OFFSTACK should have converted across to the new operators, so it's safe in that case. (Thanks to Stephen Rothwell for bisecting the initial unused-bits bug, and Mike Travis for this solution). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Mike Travis <travis@sgi.com>	2008-12-13 21:20:28 +10:30
Rusty Russell	320ab2b0b1	cpumask: convert struct clock_event_device to cpumask pointers. Impact: change calling convention of existing clock_event APIs struct clock_event_timer's cpumask field gets changed to take pointer, as does the ->broadcast function. Another single-patch change. For safety, we BUG_ON() in clockevents_register_device() if it's not set. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu>	2008-12-13 21:20:26 +10:30
Rusty Russell	0de26520c7	cpumask: make irq_set_affinity() take a const struct cpumask Impact: change existing irq_chip API Not much point with gentle transition here: the struct irq_chip's setaffinity method signature needs to change. Fortunately, not widely used code, but hits a few architectures. Note: In irq_select_affinity() I save a temporary in by mangling irq_desc[irq].affinity directly. Ingo, does this break anything? (Folded in fix from KOSAKI Motohiro) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Grant Grundler <grundler@parisc-linux.org> Acked-by: Ingo Molnar <mingo@redhat.com> Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: jeremy@xensource.com Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>	2008-12-13 21:20:26 +10:30
Rusty Russell	29c0177e6a	cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers. Impact: change calling convention of existing cpumask APIs Most cpumask functions started with cpus_: these have been replaced by cpumask_ ones which take struct cpumask pointers as expected. These four functions don't have good replacement names; fortunately they're rarely used, so we just change them over. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: paulus@samba.org Cc: mingo@redhat.com Cc: tony.luck@intel.com Cc: ralf@linux-mips.org Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: cl@linux-foundation.org Cc: srostedt@redhat.com	2008-12-13 21:20:25 +10:30
Johannes Berg	4dec9b807b	rfkill: strip pointless notifier chain No users, so no reason to have it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-12 14:45:25 -05:00
Senthil Balasubramanian	bb608e9db7	wireless: Incorrect LEAP authentication algorithm identifier. This patch fixes a regression introduced by "wireless: avoid some net/ieee80211.h vs. linux/ieee80211.h conflicts" LEAP authentication algorithm identifier should be 128. Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-12 13:48:20 -05:00
Mike Frysinger	c29541b24f	linux/timex.h: cleanup for userspace Impact: fix user-space exported use Move all the kernel-specific defines and includes into the __KERNEL__ section so that they don't get leaked into userspace. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-12-12 17:01:38 +01:00
Oleg Nesterov	27af4245b6	posix-timers: use "struct pid" instead of "struct task_struct" Impact: restructure, clean up code k_itimer holds the ref to the ->it_process until sys_timer_delete(). This allows to pin up to RLIMIT_SIGPENDING dead task_struct's. Change the code to use "struct pid *" instead. The patch doesn't kill ->it_process, it places ->it_pid into the union. ->it_process is still used by do_cpu_nanosleep() as before. It would be trivial to change the nanosleep code as well, but since it uses it_process in a special way I think it is better to keep this field for grep. The patch bloats the kernel by 104 bytes and it also adds the new pointer, ->it_signal, to k_itimer. It is used by lock_timer() to verify that the found timer was not created by another process. It is not clear why do we use the global database (and thus the global idr_lock) for posix timers. We still need the signal_struct->posix_timers which contains all useable timers, perhaps it is better to use some form of per-process array instead. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-12-12 17:00:07 +01:00
Stefano Panella	5b37717a23	uwb: improved MAS allocator and reservation conflict handling Greatly enhance the MAS allocator: - Handle row and column reservations. - Permit all the available MAS to be allocated. - Follows the WiMedia rules on MAS selection. Take appropriate action when reservation conflicts are detected. - Correctly identify which reservation wins the conflict. - Protect alien BP reservations. - If an owned reservation loses, resize/move it. - Follow the backoff procedure before requesting additional MAS. When reservations are terminated, move the remaining reservations (if necessary) so they keep following the MAS allocation rules. Signed-off-by: Stefano Panella <stefano.panella@csr.com> Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-12-12 13:00:06 +00:00
Ingo Molnar	8299608f14	Merge branches 'irq/sparseirq', 'x86/quirks' and 'x86/reboot' into cpus4096 We merge the irq/sparseirq, x86/quirks and x86/reboot trees into the cpus4096 tree because the io-apic changes in the sparseirq change conflict with the cpumask changes in the cpumask tree, and we want to resolve those.	2008-12-12 13:49:24 +01:00
Ingo Molnar	45ab6b0c76	Merge branch 'sched/core' into cpus4096 Conflicts: include/linux/ftrace.h kernel/sched.c	2008-12-12 13:48:57 +01:00
Heiko Carstens	ee79d1bdb6	sched: let arch_update_cpu_topology indicate if topology changed Change arch_update_cpu_topology so it returns 1 if the cpu topology changed and 0 if it didn't change. This will be useful for the next patch which adds a call to this function in partition_sched_domains. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 13:47:21 +01:00
Ingo Molnar	81444a7995	Merge branch 'tracing/fastboot' into cpus4096	2008-12-12 12:43:05 +01:00
Ingo Molnar	30cb367ea2	sparse irqs: add irqnr.h to the user headers list Impact: fix build error /home/mingo/tip/usr/include/linux/random.h:11: included file 'linux/irqnr.h' is not exported Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 12:29:10 +01:00
Ingo Molnar	0ebb26e7a4	sparse irqs: handle !GENIRQ platforms Impact: build fix fix: In file included from /home/mingo/tip/arch/m68k/amiga/amiints.c:39: /home/mingo/tip/include/linux/interrupt.h:21: error: expected identifier or '(' /home/mingo/tip/arch/m68k/amiga/amiints.c: In function 'amiga_init_IRQ': Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 12:28:50 +01:00
Ingo Molnar	fd10902797	Merge commit 'v2.6.28-rc8' into x86/irq	2008-12-12 11:59:39 +01:00
Frederic Weisbecker	bcbc4f20b5	tracing/function-graph-tracer: annotate do_IRQ and smp_apic_timer_interrupt Impact: move most important x86 irq entry-points to a separate subsection Annotate do_IRQ and smp_apic_timer_interrupt to put them into the .irqentry.text subsection. These function will so be recognized as hardirq entrypoints for the function-graph-tracer. We could also annotate other irq entries but the others are far less important but they can be added on request. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 11:14:08 +01:00
Ingo Molnar	c1dfdc7597	Merge commit 'v2.6.28-rc8' into sched/core	2008-12-12 10:29:35 +01:00
Markus Metzger	c2724775ce	x86, bts: provide in-kernel branch-trace interface Impact: cleanup Move the BTS bits from ptrace.c into ds.c. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-12 08:08:12 +01:00
Ingo Molnar	f3134de606	Merge branches 'tracing/function-graph-tracer' and 'tracing/ring-buffer' into tracing/core	2008-12-12 07:40:08 +01:00
Christoph Hellwig	4d4be482a4	[XFS] add a FMODE flag to make XFS invisible I/O less hacky XFS has a mode called invisble I/O that doesn't update any of the timestamps. It's used for HSM-style applications and exposed through the nasty open by handle ioctl. Instead of doing directly assignment of file operations that set an internal flag for it add a new FMODE_NOCMTIME flag that we can check in the normal file operations. (addition of the generic VFS flag has been ACKed by Al as an interims solution) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-11 13:14:41 +11:00
Benjamin Thery	8229efdaef	netns: ip6mr: enable namespace support in ipv6 multicast forwarding code This last patch makes the appropriate changes to use and propagate the network namespace where needed in IPv6 multicast forwarding code. This consists mainly in replacing all the remaining init_net occurences with current netns pointer retrieved from sockets, net devices or mfc6_caches depending on the routines' contexts. Some routines receive a new 'struct net' parameter to propagate the current netns: * ip6mr_get_route * ip6mr_cache_report * ip6mr_cache_find * ip6mr_cache_unresolved * mif6_add/mif6_delete * ip6mr_mfc_add/ip6mr_mfc_delete * ip6mr_reg_vif All the IPv6 multicast forwarding variables moved to struct netns_ipv6 by the previous patches are now referenced in the correct namespace. Changelog: ========== * Take into account the net associated to mfc6_cache when matching entries in mfc_unres_queue list. * Call mroute_clean_tables() in ip6mr_net_exit() to free memory allocated per-namespace. * Call dev_net_set() in ip6mr_reg_vif() to initialize dev->nd_net correctly. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 16:30:15 -08:00
Benjamin Thery	58701ad411	netns: ip6mr: store netns in struct mfc6_cache This patch stores into struct mfc6_cache the network namespace each mfc6_cache belongs to. The new member is mfc6_net. mfc6_net is assigned at cache allocation and doesn't change during the rest of the cache entry life. This will help to retrieve the current netns around the IPv6 multicast forwarding code. At the moment, all mfc6_cache are allocated in init_net. Changelog: ========== * Use write_pnet()/read_pnet() to set and get mfc6_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 16:22:34 -08:00
Benjamin Thery	bd91b8bf37	netns: ip6mr: allocate mroute6_socket per-namespace. Preliminary work to make IPv6 multicast forwarding netns-aware. Make IPv6 multicast forwarding mroute6_socket per-namespace, moves it into struct netns_ipv6. At the moment, mroute6_socket is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 16:07:08 -08:00
Steve Glendinning	2107fb8b5b	smsc911x: add dynamic bus configuration Convert the driver to select 16-bit or 32-bit bus access at runtime, at a small performance cost. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 15:12:45 -08:00
Randy Dunlap	d3af0f048c	[MTD] [NAND] remove excess kernel-doc notation Delete extra kernel-doc notation for struct fields and function parameters that don't exist: Warning(include/linux/mtd/nand.h:428): Excess struct/union/enum/typedef member 'wq' description in 'nand_chip' Warning(include/linux/mtd/nand.h:428): Excess struct/union/enum/typedef member 'datbuf' description in 'nand_chip' Warning(include/linux/mtd/nand.h:428): Excess struct/union/enum/typedef member 'oobbuf' description in 'nand_chip' Warning(include/linux/mtd/nand.h:428): Excess struct/union/enum/typedef member 'oobdirty' description in 'nand_chip' Warning(include/linux/mtd/nand.h:428): Excess struct/union/enum/typedef member 'data_poi' description in 'nand_chip' Warning(drivers/mtd/nand/nand_base.c:2527): Excess function parameter 'maxchips' description in 'nand_scan_tail' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-12-10 16:10:40 +00:00
Hugh Dickins	9c24624727	KSYM_SYMBOL_LEN fixes Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked to my `966c8c12dc` sprint_symbol(): use less stack exposing a bug in slub's list_locations() - kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was beyond the end of page provided. The 100 slop which list_locations() allows at end of page looks roughly enough for all the other stuff it might print after the symbol before it checks again: break out KSYM_SYMBOL_LEN earlier than before. Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies them. [akpm@linux-foundation.org: ftrace.h needs module.h] Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc Miles Lane <miles.lane@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:54 -08:00
Andrew Morton	02d2116887	revert "percpu_counter: new function percpu_counter_sum_and_set" Revert commit `e8ced39d5e` Author: Mingming Cao <cmm@us.ibm.com> Date: Fri Jul 11 19:27:31 2008 -0400 percpu_counter: new function percpu_counter_sum_and_set As described in revert "percpu counter: clean up percpu_counter_sum_and_set()" the new percpu_counter_sum_and_set() is racy against updates to the cpu-local accumulators on other CPUs. Revert that change. This means that ext4 will be slow again. But correct. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:52 -08:00
Andrew Morton	71c5576fbd	revert "percpu counter: clean up percpu_counter_sum_and_set()" Revert commit `1f7c14c62c` Author: Mingming Cao <cmm@us.ibm.com> Date: Thu Oct 9 12:50:59 2008 -0400 percpu counter: clean up percpu_counter_sum_and_set() Before this patch we had the following: percpu_counter_sum(): return the percpu_counter's value percpu_counter_sum_and_set(): return the percpu_counter's value, copying that value into the central value and zeroing the per-cpu counters before returning. After this patch, percpu_counter_sum_and_set() has gone, and percpu_counter_sum() gets the old percpu_counter_sum_and_set() functionality. Problem is, as Eric points out, the old percpu_counter_sum_and_set() functionality was racy and wrong. It zeroes out counters on "other" cpus, without holding any locks which will prevent races agaist updates from those other CPUS. This patch reverts `1f7c14c62c`. This means that percpu_counter_sum_and_set() still has the race, but percpu_counter_sum() does not. Note that this is not a simple revert - ext4 has since started using percpu_counter_sum() for its dirty_blocks counter as well. Note that this revert patch changes percpu_counter_sum() semantics. Before the patch, a call to percpu_counter_sum() will bring the counter's central counter mostly up-to-date, so a following percpu_counter_read() will return a close value. After this patch, a call to percpu_counter_sum() will leave the counter's central accumulator unaltered, so a subsequent call to percpu_counter_read() can now return a significantly inaccurate result. If there is any code in the tree which was introduced after `e8ced39d5e` was merged, and which depends upon the new percpu_counter_sum() semantics, that code will break. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:52 -08:00
David Woodhouse	c4956ed6fa	Merge branch 'misc/mtd/sharpsl-nand' of git://git.kernel.org/pub/scm/linux/kernel/git/lumag/tosa-2.6	2008-12-10 15:49:12 +00:00
Mark Brown	cdc6936432	ALSA: Add support for mechanical jack insertion Some systems support both mechanical and electrical jack detection, allowing them to report that a jack is physically present but does not have any functioning connections. Add a new jack type for these, allowing user space to report faulty connections. Thanks to Guillem Jover for the suggestion. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2008-12-10 15:10:44 +01:00
David Woodhouse	26cdb67c74	[MTD] Remove more strange u_intxx_t types Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-12-10 14:08:56 +00:00
David Woodhouse	3854be7712	[MTD] Remove strange u_int32_t types from FTL Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-12-10 14:06:42 +00:00
Adrian Hunter	69423d99fc	[MTD] update internal API to support 64-bit device size MTD internal API presently uses 32-bit values to represent device size. This patch updates them to 64-bits but leaves the external API unchanged. Extending the external API is a separate issue for several reasons. First, no one needs it at the moment. Secondly, whether the implementation is done with IOCTLs, sysfs or both is still debated. Thirdly external API changes require the internal API to be accepted first. Note that although the MTD API will be able to support 64-bit device sizes, existing drivers do not and are not required to do so, although NAND base has been updated. In general, changing from 32-bit to 64-bit values cause little or no changes to the majority of the code with the following exceptions: - printk message formats - division and modulus of 64-bit values - NAND base support - 32-bit local variables used by mtdpart and mtdconcat - naughtily assuming one structure maps to another in MEMERASE ioctl Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-12-10 13:37:21 +00:00
Robert Richter	e09373f22e	ring_buffer: add remaining cpu functions to ring_buffer.h These functions are not yet in ring_buffer.h though they seems to be part of the API. Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Robert Richter <robert.richter@amd.com>	2008-12-10 14:20:17 +01:00
Robert Richter	5849448758	oprofile: update comment for oprofile_add_sample() The cpu argument is no longer part of the parameter list. Signed-off-by: Robert Richter <robert.richter@amd.com>	2008-12-10 14:20:03 +01:00
Neil Horman	7b363e4400	netpoll: fix race on poll_list resulting in garbage entry A few months back a race was discused between the netpoll napi service path, and the fast path through net_rx_action: http://kerneltrap.org/mailarchive/linux-netdev/2007/10/16/345470 A patch was submitted for that bug, but I think we missed a case. Consider the following scenario: INITIAL STATE CPU0 has one napi_struct A on its poll_list CPU1 is calling netpoll_send_skb and needs to call poll_napi on the same napi_struct A that CPU0 has on its list CPU0 CPU1 net_rx_action poll_napi !list_empty (returns true) locks poll_lock for A poll_one_napi napi->poll netif_rx_complete __napi_complete (removes A from poll_list) list_entry(list->next) In the above scenario, net_rx_action assumes that the per-cpu poll_list is exclusive to that cpu. netpoll of course violates that, and because the netpoll path can dequeue from the poll list, its possible for CPU0 to detect a non-empty list at the top of the while loop in net_rx_action, but have it become empty by the time it calls list_entry. Since the poll_list isn't surrounded by any other structure, the returned data from that list_entry call in this situation is garbage, and any number of crashes can result based on what exactly that garbage is. Given that its not fasible for performance reasons to place exclusive locks arround each cpus poll list to provide that mutal exclusion, I think the best solution is modify the netpoll path in such a way that we continue to guarantee that the poll_list for a cpu is in fact exclusive to that cpu. To do this I've implemented the patch below. It adds an additional bit to the state field in the napi_struct. When executing napi->poll from the netpoll_path, this bit will be set. When a driver calls netif_rx_complete, if that bit is set, it will not remove the napi_struct from the poll_list. That work will be saved for the next iteration of net_rx_action. I've tested this and it seems to work well. About the biggest drawback I can see to it is the fact that it might result in an extra loop through net_rx_action in the event that the device is actually contended for (i.e. the netpoll path actually preforms all the needed work no the device, and the call to net_rx_action winds up doing nothing, except removing the napi_struct from the poll_list. However I think this is probably a small price to pay, given that the alternative is a crash. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-09 23:22:26 -08:00
Linus Torvalds	b749e3f8d7	Merge branch 'audit.b59' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b59' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] fix broken timestamps in AVC generated by kernel threads [patch 1/1] audit: remove excess kernel-doc [PATCH] asm/generic: fix bug - kernel fails to build when enable some common audit code on Blackfin [PATCH] return records for fork() both to child and parent [PATCH] Audit: make audit=0 actually turn off audit	2008-12-09 08:28:13 -08:00
Al Viro	1e641743f0	Audit: Log TIOCSTI AUDIT_TTY records currently log all data read by processes marked for TTY input auditing, even if the data was "pushed back" using the TIOCSTI ioctl, not typed by the user. This patch records all TIOCSTI calls to disambiguate the input. It generates one audit message per character pushed back; considering TIOCSTI is used very rarely, this simple solution is probably good enough. (The only program I could find that uses TIOCSTI is mailx/nail in "header editing" mode, e.g. using the ~h escape. mailx is used very rarely, and the escapes are used even rarer.) Signed-Off-By: Miloslav Trmac <mitr@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Morris <jmorris@namei.org>	2008-12-09 20:32:06 +11:00
Al Viro	48887e63d6	[PATCH] fix broken timestamps in AVC generated by kernel threads Timestamp in audit_context is valid only if ->in_syscall is set. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-09 02:27:41 -05:00
Al Viro	a64e64944f	[PATCH] return records for fork() both to child and parent Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-09 02:27:38 -05:00
Linus Torvalds	f7a8db89c1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: tproxy: fixe a possible read from an invalid location in the socket match zd1211rw: use unaligned safe memcmp() in-place of compare_ether_addr() mac80211: use unaligned safe memcmp() in-place of compare_ether_addr() ipw2200: fix netif_*_queue() removal regression iwlwifi: clean key table in iwl_clear_stations_table function tcp: tcp_vegas ssthresh bug fix can: omit received RTR frames for single ID filter lists ATM: CVE-2008-5079: duplicate listen() on socket corrupts the vcc table netx-eth: initialize per device spinlock tcp: make urg+gso work for real this time enc28j60: Fix sporadic packet loss (corrected again) hysdn: fix writing outside the field on 64 bits b1isa: fix b1isa_exit() to really remove registered capi controllers can: Fix CAN_(EFF\|RTR)_FLAG handling in can_filter Phonet: do not dump addresses from other namespaces netlabel: Fix a potential NULL pointer dereference bnx2: Add workaround to handle missed MSI. xfrm: Fix kernel panic when flush and dump SPD entries	2008-12-08 19:52:43 -08:00
Yinghai Lu	240d367b4e	sparseirq: fix Alpha build failure Impact: build fix on Alpha -tip testing found this build failure on the Alpha defconfig: /home/mingo/tip/fs/proc/stat.c: In function 'show_stat': /home/mingo/tip/fs/proc/stat.c:48: error: implicit declaration of function 'for_each_irq_desc' /home/mingo/tip/fs/proc/stat.c:48: error: expected ';' before '{' token can not use irq_desc() in stat.c on older architectures. Signed-off-by: Yinghai Lu <yinghai@kernel.orgg> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-09 04:16:54 +01:00
David Vrabel	c35fa3ea1a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-upstream	2008-12-08 16:18:47 +00:00
Frederic Weisbecker	380c4b1411	tracing/function-graph-tracer: append the tracing_graph_flag Impact: Provide a way to pause the function graph tracer As suggested by Steven Rostedt, the previous patch that prevented from spinlock function tracing shouldn't use the raw_spinlock to fix it. It's much better to follow lockdep with normal spinlock, so this patch adds a new flag for each task to make the function graph tracer able to be paused. We also can send an ftrace_printk whithout worrying of the irrelevant traced spinlock during insertion. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 15:11:45 +01:00
Frederic Weisbecker	8b96f01198	tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions Impact: trace more functions When the function graph tracer is configured, three more files are not traced to prevent only four functions to be traced. And this impacts the normal function tracer too. arch/x86/kernel/process_64/32.c: I had crashes when I let this file traced. After some debugging, I saw that the "current" task point was changed inside__swtich_to(), ie: "write_pda(pcurrent, next_p);" inside process_64.c Since the tracer store the original return address of the function inside current, we had crashes. Only __switch_to() has to be excluded from tracing. kernel/module.c and kernel/extable.c: Because of a function used internally by the function graph tracer: __kernel_text_address() To let the other functions inside these files to be traced, this patch introduces the __notrace_funcgraph function prefix which is __notrace if function graph tracer is configured and nothing if not. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 15:11:44 +01:00
Yinghai Lu	3145e941fc	x86, MSI: pass irq_cfg and irq_desc Impact: simplify code Pass irq_desc and cfg around, instead of raw IRQ numbers - this way we dont have to look it up again and again. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 14:31:59 +01:00
Yinghai Lu	0b8f1efad3	sparse irq_desc[] array: core kernel and x86 changes Impact: new feature Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with NR_CPUS set to large values. The goal is to be able to scale up to much larger NR_IRQS value without impacting the (important) common case. To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of irq_desc pointers. When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc, this also makes the IRQ descriptors NUMA-local (to the site that calls request_irq()). This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now uses desc->chip_data for x86 to store irq_cfg. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 14:31:51 +01:00
Lai Jiangshan	361b73d5c3	ring_buffer: fix comments Impact: comments cleanup fix incorrect comments for enum ring_buffer_type Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 13:54:05 +01:00
Ingo Molnar	4d117c5c6b	Merge branch 'sched/urgent' into sched/core	2008-12-08 13:52:00 +01:00
Gerrit Renker	6fdd34d43b	dccp ccid-2: Phase out the use of boolean Ack Vector sysctl This removes the use of the sysctl and the minisock variable for the Send Ack Vector feature, as it now is handled fully dynamically via feature negotiation (i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per RFC 4341, 4.). Using a sysctl in parallel to this implementation would open the door to crashes, since much of the code relies on tests of the boolean minisock / sysctl variable. Thus, this patch replaces all tests of type if (dccp_msk(sk)->dccpms_send_ack_vector) /* ... / with if (dp->dccps_hc_rx_ackvec != NULL) / ... / The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature negotiation concluded that Ack Vectors are to be used on the half-connection. Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child), so that the test is a valid one. The activation handler for Ack Vectors is called as soon as the feature negotiation has concluded at the server when the Ack marking the transition RESPOND => OPEN arrives; * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN. Adding the sequence number of the Response packet to the Ack Vector has been removed, since (a) connection establishment implies that the Response has been received; (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e. this entry will always be ignored; (c) it can not be used for anything useful - to detect loss for instance, only packets received after the loss can serve as pseudo-dupacks. There was a FIXME to change the error code when dccp_ackvec_add() fails. I removed this after finding out that: * the check whether ackno < ISN is already made earlier, * this Response is likely the 1st packet with an Ackno that the client gets, * so when dccp_ackvec_add() fails, the reason is likely not a packet error. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:19:06 -08:00
Gerrit Renker	4098dce5be	dccp: Remove manual influence on NDP Count feature Updating the NDP count feature is handled automatically now: * for CCID-2 it is disabled, since the code does not use NDP counts; * for CCID-3 it is enabled, as NDP counts are used to determine loss lengths. Allowing the user to change NDP values leads to unpredictable and failing behaviour, since it is then possible to disable NDP counts even when they are needed (e.g. in CCID-3). This means that only those user settings are sensible that agree with the values for Send NDP Count implied by the choice of CCID. But those settings are already activated by the feature negotiation (CCID dependency tracking), hence this form of support is redundant. At startup the initialisation of the NDP count feature uses the default value of 0, which is done implicitly by the zeroing-out of the socket when it is allocated. If the choice of CCID or feature negotiation enables NDP count, this will then be updated via the NDP activation handler. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:18:37 -08:00
Gerrit Renker	0049bab5e7	dccp: Remove obsolete parts of the old CCID interface The TX/RX CCIDs of the minisock are now redundant: similar to the Ack Vector case, their value equals initially that of the sysctl, but at the end of feature negotiation may be something different. The old interface removed by this patch thus has been replaced by the newer interface to dynamically query the currently loaded CCIDs. Also removed are the constructors for the TX CCID and the RX CCID, since the switch "rx <-> non-rx" is done by the handler in minisocks.c (and the handler is the only place in the code where CCIDs are loaded). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:18:05 -08:00
Wang Chen	b74ca3a896	netdevice: Kill netdev->priv This is the last shoot of this series. After I removing all directly reference of netdev->priv, I am killing "priv" of "struct net_device" and fixing relative comments/docs. Anyone will not be allowed to reference netdev->priv directly. If you want to reference the memory of private data, use netdev_priv() instead. If the private data is not allocted when alloc_netdev(), use netdev->ml_priv to point that memory after you creating that private data. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-08 01:14:16 -08:00
David S. Miller	730c30ec64	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-core.c drivers/net/wireless/iwlwifi/iwl-sta.c	2008-12-05 22:54:40 -08:00
Linus Torvalds	f2f1fa78a1	Enforce a minimum SG_IO timeout There's no point in having too short SG_IO timeouts, since if the command does end up timing out, we'll end up through the reset sequence that is several seconds long in order to abort the command that timed out. As a result, shorter timeouts than a few seconds simply do not make sense, as the recovery would be longer than the timeout itself. Add a BLK_MIN_SG_TIMEOUT to match the existign BLK_DEFAULT_SG_TIMEOUT. Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-05 14:49:18 -08:00
Matthew Garrett	e088e4c9cd	[CPUFREQ] Disable sysfs ui for p4-clockmod. p4-clockmod has a long history of abuse. It pretends to be a CPU frequency scaling driver, even though it doesn't actually change the CPU frequency, but instead just modulates the frequency with wait-states. The biggest misconception is that when running at the lower 'frequency' p4-clockmod is saving power. This isn't the case, as workloads running slower take longer to complete, preventing the CPU from entering deep C states. However p4-clockmod does have a purpose. It can prevent overheating. Having it hooked up to the cpufreq interfaces is the wrong way to achieve cooling however. It should instead be hooked up to ACPI. This diff introduces a means for a cpufreq driver to register with the cpufreq core, but not present a sysfs interface. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-12-05 15:20:10 -05:00
Luis R. Rodriguez	10ec4f1d08	nl80211: relicense nl80211.h under the ISC We have a few BSD/ISC licensed userspace applications which include nl80211.h from the kernel. To avoid legal ambiguity for usage of the header file in these projects we rather simply relicense the header file under the ISC. We've received consent from all contributors to it. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Acked-by: Luis Carlos Cobo <luisca@cozybit.com> Acked-by: Michael Buesch <mb@bu3sch.de> Acked-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Colin McCabe <colin@cozybit.com> Acked-by: Javier Cardona <javier@cozybit.com> Cc: johannes@sipsolutions.net Cc: altape@eden.rutgers.edu Cc: luisca@cozybit.com Cc: mb@bu3sch.de Cc: jouni.malinen@atheros.com Cc: colin@cozybit.com Cc: javier@cozybit.com Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-05 09:32:12 -05:00
Jouni Malinen	72bdcf3438	nl80211: Add frequency configuration (including HT40) This patch adds new NL80211_CMD_SET_WIPHY attributes NL80211_ATTR_WIPHY_FREQ and NL80211_ATTR_WIPHY_SEC_CHAN_OFFSET to allow userspace to set the operating channel (e.g., hostapd for AP mode). Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-12-05 09:32:11 -05:00
Frederic Weisbecker	21a8c466f9	tracing/ftrace: provide the macro task_curr_ret_stack() Impact: cleanup As suggested by Steven Rostedt, this patch provide a new macro task_curr_ret_stack() to move the cpp conditionnal CONFIG into the linux/ftrace.h headers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-05 14:47:44 +01:00
Ingo Molnar	970987beb9	Merge branches 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/urgent' into tracing/core	2008-12-05 14:45:22 +01:00
Lachlan McIlroy	14d676f56f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-12-05 15:27:43 +11:00
David S. Miller	c9bb6003dd	of: Fix comment, sparc no longer uses of_device objects on special busses. It only uses of_platform_bus_type. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-04 09:16:45 -08:00
Christoph Hellwig	fc9161e54d	[PATCH 2/2] documnt FMODE_ constants Make sure all FMODE_ constants are documents, and ensure a coherent style for the already existing comments. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-04 04:22:58 -05:00
Christoph Hellwig	fd4ce1acd0	[PATCH 1/2] kill FMODE_NDELAY_NOW Update FMODE_NDELAY before each ioctl call so that we can kill the magic FMODE_NDELAY_NOW. It would be even better to do this directly in setfl(), but for that we'd need to have FMODE_NDELAY for all files, not just block special files. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-04 04:22:57 -05:00
Peter Zijlstra	00ef9f7348	lockdep: change a held lock's class Impact: introduce new lockdep API Allow to change a held lock's class. Basically the same as the existing code to change a subclass therefore reuse all that. The XFS code will be able to use this to annotate their inode locking. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-04 10:08:18 +01:00
Steven Rostedt	5ef6476190	pid: fix the do_each_pid_task() macro Impact: macro side-effects fix This patch adds parenthesis around 'pid' in the do_each_pid_task macro to allow callers to pass in more complex parameters. e.g. do_each_pid_task(*pid, type, task) Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-04 09:09:37 +01:00
Steven Rostedt	ea4e2bc4d9	ftrace: graph of a single function This patch adds the file: /debugfs/tracing/set_graph_function which can be used along with the function graph tracer. When this file is empty, the function graph tracer will act as usual. When the file has a function in it, the function graph tracer will only trace that function. For example: # echo blk_unplug > /debugfs/tracing/set_graph_function # cat /debugfs/tracing/trace [...] ------------------------------------------ \| 2) make-19003 => kjournald-2219 ------------------------------------------ 2) \| blk_unplug() { 2) \| dm_unplug_all() { 2) \| dm_get_table() { 2) 1.381 us \| _read_lock(); 2) 0.911 us \| dm_table_get(); 2) 1. 76 us \| _read_unlock(); 2) + 12.912 us \| } 2) \| dm_table_unplug_all() { 2) \| blk_unplug() { 2) 0.778 us \| generic_unplug_device(); 2) 2.409 us \| } 2) 5.992 us \| } 2) 0.813 us \| dm_table_put(); 2) + 29. 90 us \| } 2) + 34.532 us \| } You can add up to 32 functions into this file. Currently we limit it to 32, but this may change with later improvements. To add another function, use the append '>>': # echo sys_read >> /debugfs/tracing/set_graph_function # cat /debugfs/tracing/set_graph_function blk_unplug sys_read Using the '>' will clear out the function and write anew: # echo sys_write > /debug/tracing/set_graph_function # cat /debug/tracing/set_graph_function sys_write Note, if you have function graph running while doing this, the small time between clearing it and updating it will cause the graph to record all functions. This should not be an issue because after it sets the filter, only those functions will be recorded from then on. If you need to only record a particular function then set this file first before starting the function graph tracer. In the future this side effect may be corrected. The set_graph_function file is similar to the set_ftrace_filter but it does not take wild cards nor does it allow for more than one function to be set with a single write. There is no technical reason why this is the case, I just do not have the time yet to implement that. Note, dynamic ftrace must be enabled for this to appear because it uses the dynamic ftrace records to match the name to the mcount call sites. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-04 09:09:34 +01:00
Ingo Molnar	cb9c34e6d0	Merge commit 'v2.6.28-rc7' into core/locking	2008-12-04 08:52:14 +01:00
James Morris	ec98ce480a	Merge branch 'master' into next Conflicts: fs/nfsd/nfs4recover.c Manually fixed above to use new creds API functions, e.g. nfs4_save_creds(). Signed-off-by: James Morris <jmorris@namei.org>	2008-12-04 17:16:36 +11:00
David Woodhouse	8865c418ca	atm: 32-bit ioctl compatibility We lack compat ioctl support through most of the ATM code. This patch deals with most of it, and I can now at least use BR2684 and PPPoATM with 32-bit userspace. I haven't added a .compat_ioctl method to struct atm_ioctl, because AFAICT none of the current users need any conversion -- so we can just call the ->ioctl() method in every case. I looked at br2684, clip, lec, mpc, pppoatm and atmtcp. In svc_compat_ioctl() the only mangling which is needed is to change COMPAT_ATM_ADDPARTY to ATM_ADDPARTY. Although it's defined as _IOW('a', ATMIOC_SPECIAL+4,struct atm_iobuf) it doesn't actually _take_ a struct atm_iobuf as an argument -- it takes a struct sockaddr_atmsvc, which _is_ the same between 32-bit and 64-bit code, so doesn't need conversion. Almost all of vcc_ioctl() would have been identical, so I converted that into a core do_vcc_ioctl() function with an 'int compat' argument. I've done the same with atm_dev_ioctl(), where there _are_ a few differences, but still it's relatively contained and there would otherwise have been a lot of duplication. I haven't done any of the actual device-specific ioctls, although I've added a compat_ioctl method to struct atmdev_ops. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-03 22:12:38 -08:00
Oliver Hartkopp	d253eee201	can: Fix CAN_(EFF\|RTR)_FLAG handling in can_filter Due to a wrong safety check in af_can.c it was not possible to filter for SFF frames with a specific CAN identifier without getting the same selected CAN identifier from a received EFF frame also. This fix has a minimum (but user visible) impact on the CAN filter API and therefore the CAN version is set to a new date. Indeed the 'old' API is still working as-is. But when now setting CAN_(EFF\|RTR)_FLAG in can_filter.can_mask you might get less traffic than before - but still the stuff that you expected to get for your defined filter ... Thanks to Kurt Van Dijck for pointing at this issue and for the review. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Acked-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-03 15:52:35 -08:00
Milan Broz	0e435ac26e	block: fix setting of max_segment_size and seg_boundary mask Fix setting of max_segment_size and seg_boundary mask for stacked md/dm devices. When stacking devices (LVM over MD over SCSI) some of the request queue parameters are not set up correctly in some cases by default, namely max_segment_size and and seg_boundary mask. If you create MD device over SCSI, these attributes are zeroed. Problem become when there is over this mapping next device-mapper mapping - queue attributes are set in DM this way: request_queue max_segment_size seg_boundary_mask SCSI 65536 0xffffffff MD RAID1 0 0 LVM 65536 -1 (64bit) Unfortunately bio_add_page (resp. bio_phys_segments) calculates number of physical segments according to these parameters. During the generic_make_request() is segment cout recalculated and can increase bio->bi_phys_segments count over the allowed limit. (After bio_clone() in stack operation.) Thi is specially problem in CCISS driver, where it produce OOPS here BUG_ON(creq->nr_phys_segments > MAXSGENTRIES); (MAXSEGENTRIES is 31 by default.) Sometimes even this command is enough to cause oops: dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10 This command generates bios with 250 sectors, allocated in 32 4k-pages (last page uses only 1024 bytes). For LVM layer, it allocates bio with 31 segments (still OK for CCISS), unfortunatelly on lower layer it is recalculated to 32 segments and this violates CCISS restriction and triggers BUG_ON(). The patch tries to fix it by: * initializing attributes above in queue request constructor blk_queue_make_request() * make sure that blk_queue_stack_limits() inherits setting (DM uses its own function to set the limits because it blk_queue_stack_limits() was introduced later. It should probably switch to use generic stack limit function too.) * sets the default seg_boundary value in one place (blkdev.h) * use this mask as default in DM (instead of -1, which differs in 64bit) Bugs related to this: https://bugzilla.redhat.com/show_bug.cgi?id=471639 http://bugzilla.kernel.org/show_bug.cgi?id=8672 Signed-off-by: Milan Broz <mbroz@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-03 12:55:55 +01:00
Tejun Heo	53a08807c0	block: internal dequeue shouldn't start timer blkdev_dequeue_request() and elv_dequeue_request() are equivalent and both start the timeout timer. Barrier code dequeues the original barrier request but doesn't passes the request itself to lower level driver, only broken down proxy requests; however, as the original barrier code goes through the same dequeue path and timeout timer is started on it. If barrier sequence takes long enough, this timer expires but the low level driver has no idea about this request and oops follows. Timeout timer shouldn't have been started on the original barrier request as it never goes through actual IO. This patch unexports elv_dequeue_request(), which has no external user anyway, and makes it operate on elevator proper w/o adding the timer and make blkdev_dequeue_request() call elv_dequeue_request() and add timer. Internal users which don't pass the request to driver - barrier code and end_that_request_last() - are converted to use elv_dequeue_request(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-03 12:41:26 +01:00
Anton Vorontsov	b908b53d58	of/gpio: Implement of_get_gpio_flags() This adds a new function, of_get_gpio_flags, which is like of_get_gpio(), but accepts a new "flags" argument. This new function will be used by the drivers that need to retrieve additional GPIO information, such as active-low flag. Also, this changes the default ("simple") .xlate routine to warn about bogus (< 2) #gpio-cells usage: the second cell should always be present for GPIO flags. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-03 21:04:05 +11:00
Paul Mackerras	5274918855	Merge branch 'merge'	2008-12-03 20:11:06 +11:00
Steven Rostedt	e49dc19c6a	ftrace: function graph return for function entry Impact: feature, let entry function decide to trace or not This patch lets the graph tracer entry function decide if the tracing should be done at the end as well. This requires all function graph entry functions return 1 if it should trace, or 0 if the return should not be traced. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-03 08:56:26 +01:00
Steven Rostedt	14a866c567	ftrace: add ftrace_graph_stop() Impact: new ftrace_graph_stop function While developing more features of function graph, I hit a bug that caused the WARN_ON to trigger in the prepare_ftrace_return function. Well, it was hard for me to find out that was happening because the bug would not print, it would just cause a hard lockup or reboot. The reason is that it is not safe to call printk from this function. Looking further, I also found that it calls unregister_ftrace_graph, which grabs a mutex and calls kstop machine. This would definitely lock the box up if it were to trigger. This patch adds a fast and safe ftrace_graph_stop() which will stop the function tracer. Then it is safe to call the WARN ON. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-03 08:56:23 +01:00
Steven Rostedt	8789a9e7df	ring-buffer: read page interface Impact: new API to ring buffer This patch adds a new interface into the ring buffer that allows a page to be read from the ring buffer on a given CPU. For every page read, one must also be given to allow for a "swap" of the pages. rpage = ring_buffer_alloc_read_page(buffer); if (!rpage) goto err; ret = ring_buffer_read_page(buffer, &rpage, cpu, full); if (!ret) goto empty; process_page(rpage); ring_buffer_free_read_page(rpage); The caller of these functions must handle any waits that are needed to wait for new data. The ring_buffer_read_page will simply return 0 if there is no data, or if "full" is set and the writer is still on the current page. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-03 08:56:21 +01:00
Ingo Molnar	dfdc5437bd	Merge commit 'v2.6.28-rc7'; branch 'x86/dumpstack' into tracing/ftrace Merge x86/dumpstack into tracing/ftrace because upcoming ftrace changes depend on cleanups already in x86/dumpstack. Also merge to latest upstream -rc.	2008-12-03 08:55:34 +01:00
David S. Miller	aa2ba5f108	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c drivers/net/smc91x.c	2008-12-02 19:50:27 -08:00
Linus Torvalds	e1825e7515	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) MAINTAINERS: add netdev to ATM ATM: horizon, fix hrz_probe fail path pppol2tp: Add missing sock_put() in pppol2tp_release() net: Fix soft lockups/OOM issues w/ unix garbage collector macvlan: don't broadcast PAUSE frames to macvlan devices Phonet: fix oops in phonet_address_del() on non-Phonet device netfilter: ctnetlink: fix GFP_KERNEL allocation under spinlock sungem: Fix PCS_MIICTRL register write in gem_init_phy(). net: make skb_truesize_bug() call WARN() net: hp-plus uses eip_poll net/wireless/reg.c: fix bad WARN_ON in if statement ath5k: disable beacon filter when station is not associated ath5k: fix Security issue in DebugFS part of ath5k ath9k: correct expected max RX buffer size ath9k: Fix SW-IOMMU bounce buffer starvation mac80211 : Fix setting ad-hoc mode and non-ibss channel iwlagn: fix DMA sync phylib: Add Vitesse VSC8221 SGMII PHY rose: zero length frame filtering in af_rose.c bridge: netfilter: fix update_pmtu crash with GRE ...	2008-12-02 15:55:05 -08:00
Linus Torvalds	e2e29831cc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: alim15x3: fix sparse warning ide: remove dead code from drive_is_ready() ide: fix build for DEBUG_PM ide: respect current DMA setting during resume ide: add SAMSUNG SP0822N with firmware WA100-10 to ivb_list[] amd74xx: workaround unreliable AltStatus register for nVidia controllers ide: fix the ide_release_lock imbalance	2008-12-02 15:53:10 -08:00
Junjiro R. Okajima	1b79cd04fa	nfsd: fix vm overcommit crash fix #2 The previous patch from Alan Cox ("nfsd: fix vm overcommit crash", commit `731572d39f`) fixed the problem where knfsd crashes on exported shmemfs objects and strict overcommit is set. But the patch forgot supporting the case when CONFIG_SECURITY is disabled. This patch copies a part of his fix which is mainly for detecting a bug earlier. Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-02 15:50:40 -08:00
Bartlomiej Zolnierkiewicz	6636487e8d	amd74xx: workaround unreliable AltStatus register for nVidia controllers It seems that on some nVidia controllers using AltStatus register can be unreliable so default to Status register if the PCI device is in Compatibility Mode. In order to achieve this: * Add ide_pci_is_in_compatibility_mode() inline helper to <linux/ide.h>. * Add IDE_HFLAG_BROKEN_ALTSTATUS host flag and set it in amd74xx host driver for nVidia controllers in Compatibility Mode. * Teach actual_try_to_identify() and drive_is_ready() about the new flag. This fixes the regression caused by removal of CONFIG_IDEPCI_SHARE_IRQ config option in 2.6.25 and using AltStatus register unconditionally when available (kernel.org bugs #11659 and #10216). [ Moreover for CONFIG_IDEPCI_SHARE_IRQ=y (which is what most people and distributions use) it never worked correctly. ] Thanks to Remy LABENE and Lars Winterfeld for help with debugging the problem. More info at: http://bugzilla.kernel.org/show_bug.cgi?id=11659 http://bugzilla.kernel.org/show_bug.cgi?id=10216 Reported-by: Remy LABENE <remy.labene@free.fr> Tested-by: Remy LABENE <remy.labene@free.fr> Tested-by: Lars Winterfeld <lars.winterfeld@tu-ilmenau.de> Acked-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-12-02 20:40:03 +01:00
Ingo Molnar	a64d31baed	Merge branch 'linus' into cpus4096 Conflicts: kernel/trace/ring_buffer.c	2008-12-02 20:09:50 +01:00
Manfred Spraul	6ff2d39b91	lib/idr.c: fix rcu related race with idr_find 2nd part of the fixes needed for http://bugzilla.kernel.org/show_bug.cgi?id=11796. When the idr tree is either grown or shrunk, then the update to the number of layers and the top pointer were not atomic. This race caused crashes. The attached patch fixes that by replicating the layers counter in each layer, thus idr_find doesn't need idp->layers anymore. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Clement Calmels <cboulte@gmail.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-01 19:55:25 -08:00
Davide Libenzi	7ef9964e6d	epoll: introduce resource usage limits It has been thought that the per-user file descriptors limit would also limit the resources that a normal user can request via the epoll interface. Vegard Nossum reported a very simple program (a modified version attached) that can make a normal user to request a pretty large amount of kernel memory, well within the its maximum number of fds. To solve such problem, default limits are now imposed, and /proc based configuration has been introduced. A new directory has been created, named /proc/sys/fs/epoll/ and inside there, there are two configuration points: max_user_instances = Maximum number of devices - per user max_user_watches = Maximum number of "watched" fds - per user The current default for "max_user_watches" limits the memory used by epoll to store "watches", to 1/32 of the amount of the low RAM. As example, a 256MB 32bit machine, will have "max_user_watches" set to roughly 90000. That should be enough to not break existing heavy epoll users. The default value for "max_user_instances" is set to 128, that should be enough too. This also changes the userspace, because a new error code can now come out from EPOLL_CTL_ADD (-ENOSPC). The EMFILE from epoll_create() was already listed, so that should be ok. [akpm@linux-foundation.org: use get_current_user()] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: <stable@kernel.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Reported-by: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-01 19:55:24 -08:00
Arun R Bharadwaj	6c415b9234	sched: add uid information to sched_debug for CONFIG_USER_SCHED Impact: extend information in /proc/sched_debug This patch adds uid information in sched_debug for CONFIG_USER_SCHED Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-01 20:39:50 +01:00
Linus Torvalds	7ac01108e7	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: blacklist Seagate drives which time out FLUSH_CACHE when used with NCQ [libata] pata_rb532_cf: fix signature of the xfer function [libata] pata_rb532_cf: fix and rename register definitions ata_piix: add borked Tecra M4 to broken suspend list	2008-12-01 11:23:33 -08:00
Linus Torvalds	4bc2a9bf8c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Fix MTT leakage in resize CQ IB/ehca: Fix problem with generated flush work completions IB/ehca: Change misleading error message on memory hotplug mlx4_core: Save/restore default port IB capability mask	2008-12-01 11:01:54 -08:00
Tejun Heo	ac70a964b0	libata: blacklist Seagate drives which time out FLUSH_CACHE when used with NCQ Some recent Seagate harddrives have firmware bug which causes FLUSH CACHE to timeout under certain circumstances if NCQ is being used. This can be worked around by disabling NCQ and fixed by updating the firmware. Implement ATA_HORKAGE_FIRMWARE_UPDATE and blacklist these devices. The wiki page has been updated to contain information on this issue. http://ata.wiki.kernel.org/index.php/Known_issues Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-12-01 13:49:27 -05:00
Miklos Szeredi	1f55ed06cf	fuse: update interface version Change interface version to 7.11 after adding the IOCTL and POLL messages. Also clean up the <linux/fuse.h> header a bit: - update copyright date to 2008 - fix checkpatch warning: WARNING: Use #include <linux/types.h> instead of <asm/types.h> - remove FUSE_MAJOR define, which is not being used any more Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-12-01 19:14:02 +01:00
Linus Torvalds	4ec8f077e4	Merge master.kernel.org:/home/rmk/linux-2.6-arm * master.kernel.org:/home/rmk/linux-2.6-arm: Allow architectures to override copy_user_highpage() [ARM] pxa/palmtx: misc fixes to use generic GPIO API ARM: OMAP: Fixes for suspend / resume GPIO wake-up handling [ARM] pxa/corgi: update default config to exclude tosa from being built [ARM] pxa/pcm990: use negative number for an invalid GPIO in camera data ARM: OMAP: Typo fix for clock_allow_idle ARM: OMAP: Remove broken LCD driver for SX1 [ARM] 5335/1: pxa25x_udc: Fix is_vbus_present to return 1 or 0 [ARM] pxa/MioA701: bluetooth resume fix [ARM] pxa/MioA701: fix memory corruption.	2008-11-30 16:39:06 -08:00
Linus Torvalds	72244c0e68	Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq.h: fix missing/extra kernel-doc genirq: __irq_set_trigger: change pr_warning to pr_debug irq: fix typo x86: apic honour irq affinity which was set in early boot genirq: fix the affinity setting in setup_irq genirq: keep affinities set from userspace across free/request_irq()	2008-11-30 13:06:20 -08:00
Christoph Hellwig	96b8936a9e	remove __ARCH_WANT_COMPAT_SYS_PTRACE All architectures now use the generic compat_sys_ptrace, as should every new architecture that needs 32bit compat (if we'll ever get another). Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also kill a comment about __ARCH_SYS_PTRACE that was added after __ARCH_SYS_PTRACE was already gone. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-30 11:00:15 -08:00
Al Viro	02d0e6753d	hotplug_memory_notifier section annotation Same as for hotplug_cpu - we want static notifier_block in there in meminitdata, to avoid false positives whenever it's used. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-30 10:03:38 -08:00
Al Viro	31168481c3	meminit section warnings Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-30 10:03:35 -08:00
Jack Morgenstein	9a5aa622dd	mlx4_core: Save/restore default port IB capability mask Commit `7ff93f8b` ("mlx4_core: Multiple port type support") introduced support for different port types. As part of that support, SET_PORT is invoked to set the port type during driver startup. However, as a side-effect, for IB ports the invocation of this command also sets the port's capability mask to zero (losing the default value set by FW). To fix this, get the default ib port capabilities (via a MAD_IFC Port Info query) during driver startup, and save them for use in the mlx4_SET_PORT command when setting the port-type to Infiniband. This patch fixes problems with subnet manager (SM) failover such as <https://bugs.openfabrics.org/show_bug.cgi?id=1183>, which occurred because the IsTrapSupported bit in the capability mask was zeroed. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-11-28 21:29:46 -08:00
Giuseppe Cavallaro	0f0ca340e5	phy: power management support This patch adds the power management support into the physical abstraction layer. Suspend and resume functions respectively turns on/off the bit 11 into the PHY Basic mode control register. Generic PHY device starts supporting PM. In order to support the wake-on LAN and avoid to put in power down the PHY device, the MDIO is aware of what the Ethernet device wants to do. Voluntary, no CONFIG_PM defines were added into the sources. Also generic suspend/resume functions are exported to allow other drivers use them (such as genphy_config_aneg etc.). Within the phy_driver_register function, we need to remove the memset. It overrides the device driver owner and it is not good. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-28 16:24:56 -08:00
Wu Fengguang	a838c2ec6e	markers: comment marker_synchronize_unregister() on data dependency Add document and comments on marker_synchronize_unregister(): it should be called before freeing resources that the probes depend on. Based on comments from Lai Jiangshan and Mathieu Desnoyers. Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-28 16:47:41 +01:00
Ingo Molnar	3bdae4f464	Merge branch 'x86/debug' into x86/irq We merge this branch because x86/debug touches code that we started cleaning up in x86/irq. The two branches started out independent, but as unexpected amount of activity went into x86/irq, they became dependent. Resolve that by this cross-merge.	2008-11-28 15:00:48 +01:00
Liming Wang	8b752e3ef6	softirq: remove useless function __local_bh_enable Impact: remove unused code __local_bh_enable has been replaced with _local_bh_enable. As comments says "it always nests inside local_bh_enable() sections" has not been valid now. Also there is no reason to use __local_bh_enable anywhere, so we can remove this useless function. Signed-off-by: Liming Wang <liming.wang@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-28 12:38:38 +01:00
David S. Miller	ed77a89c30	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 Conflicts: net/netfilter/nf_conntrack_netlink.c	2008-11-28 02:19:15 -08:00
Lachlan McIlroy	b5a20aa265	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-11-28 15:23:52 +11:00
Russell King	487ff32082	Allow architectures to override copy_user_highpage() With aliasing VIPT cache support, the ARM implementation of clear_user_page() and copy_user_page() sets up a temporary kernel space mapping such that we have the same cache colour as the userspace page. This avoids having to consider any userspace aliases from this operation. However, when highmem is enabled, kmap_atomic() have to setup mappings. The copy_user_highpage() and clear_user_highpage() call these functions before delegating the copies to copy_user_page() and clear_user_page(). The effect of this is that each of the _user_highpage() functions setup their own kmap mapping, followed by the _user_page() functions setting up another mapping. This is rather wasteful. Thankfully, copy_user_highpage() can be overriden by architectures by defining __HAVE_ARCH_COPY_USER_HIGHPAGE. However, replacement of clear_user_highpage() is more difficult because its inline definition is not conditional. It seems that you're expected to define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE and provide a replacement __alloc_zeroed_user_highpage() implementation instead. The allocation itself is fine, so we don't want to override that. What we really want to do is to override clear_user_highpage() with our own version which doesn't kmap_atomic() unnecessarily. Other VIPT architectures (PARISC and SH) would also like to override this function as well. Acked-by: Hugh Dickins <hugh@veritas.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-11-27 23:39:48 +00:00
Alexander van Heukelum	d211af055d	i386: get rid of the use of KPROBE_ENTRY / KPROBE_END entry_32.S is now the only user of KPROBE_ENTRY / KPROBE_END, treewide. This patch reorders entry_64.S and explicitly generates a separate section for functions that need the protection. The generated code before and after the patch is equal. The KPROBE_ENTRY and KPROBE_END macro's are removed too. Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-27 12:37:54 +01:00
Ingo Molnar	c7cc773076	Merge branches 'tracing/blktrace', 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/power-tracer' into tracing/core	2008-11-27 10:56:13 +01:00
David S. Miller	5b9ab2ec04	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/hp-plus.c drivers/net/wireless/ath5k/base.c drivers/net/wireless/ath9k/recv.c net/wireless/reg.c	2008-11-26 23:48:40 -08:00
Jouni Malinen	bf8c1ac6d8	nl80211: Change max TX power to be in mBm instead of dBm In order to be consistent with NL80211_ATTR_POWER_RULE_MAX_EIRP, change NL80211_FREQUENCY_ATTR_MAX_TX_POWER to use mBm and U32 instead of dBm and U8. This is a userspace interface change, but the previous version had not yet been pushed upstream and there are no userspace programs using this yet, so there is justification to get this change in as long as it goes in before the previous version gets out. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-26 09:47:48 -05:00
Henrique de Moraes Holschuh	f80b5e99c7	rfkill: preserve state across suspend The rfkill class API requires that the driver connected to a class call rfkill_force_state() on resume to update the real state of the rfkill controller, OR that it provides a get_state() hook. This means there is potentially a hidden call in the resume code flow that changes rfkill->state (i.e. rfkill_force_state()), so the previous state of the transmitter was being lost. The simplest and most future-proof way to fix this is to explicitly store the pre-sleep state on the rfkill structure, and restore from that on resume. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-26 09:47:43 -05:00
Jouni Malinen	e2f367f269	nl80211: Report max TX power in NL80211_BAND_ATTR_FREQS This is useful information to provide for userspace (e.g., hostapd needs this to generate Country IE). Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-26 09:47:41 -05:00
Eduard - Gabriel Munteanu	ce71e27c6f	SLUB: Replace __builtin_return_address(0) with _RET_IP_. This patch replaces __builtin_return_address(0) with _RET_IP_, since a previous patch moved _RET_IP_ and _THIS_IP_ to include/linux/kernel.h and they're widely available now. This makes for shorter and easier to read code. [penberg@cs.helsinki.fi: remove _RET_IP_ casts to void pointer] Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-11-26 16:47:25 +02:00
David Vrabel	dcc7461eef	wusb: add debug files for ASL, PZL and DI to the whci-hcd driver Add asl, pzl and di debugfs files to uwb/uwbN/wusbhc for WHCI host controller. These dump the current ASL, PZL and DI buffer. Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-26 13:36:59 +00:00
Arnaldo Carvalho de Melo	5f3ea37c77	blktrace: port to tracepoints This was a forward port of work done by Mathieu Desnoyers, I changed it to encode the 'what' parameter on the tracepoint name, so that one can register interest in specific events and not on classes of events to then check the 'what' parameter. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-26 12:13:34 +01:00
Tejun Heo	95668a69a4	fuse: implement poll support Implement poll support. Polled files are indexed using kh in a RB tree rooted at fuse_conn->polled_files. Client should send FUSE_NOTIFY_POLL notification once after processing FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending notification unconditionally after the latest poll or everytime file content might have changed is inefficient but won't cause malfunction. fuse_file_poll() can sleep and requires patches from the following thread which allows f_op->poll() to sleep. http://thread.gmane.org/gmane.linux.kernel/726176 Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	8599396b50	fuse: implement unsolicited notification Clients always used to write only in response to read requests. To implement poll efficiently, clients should be able to issue unsolicited notifications. This patch implements basic notification support. Zero fuse_out_header.unique is now accepted and considered unsolicited notification and the error field contains notification code. This patch doesn't implement any actual notification. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	59efec7b90	fuse: implement ioctl support Generic ioctl support is tricky to implement because only the ioctl implementation itself knows which memory regions need to be read and/or written. To support this, fuse client can request retry of ioctl specifying memory regions to read and write. Deep copying (nested pointers) can be implemented by retrying multiple times resolving one depth of dereference at a time. For security and cleanliness considerations, ioctl implementation has restricted mode where the kernel determines data transfer directions and sizes using the _IOC_*() macros on the ioctl command. In this mode, retry is not allowed. For all FUSE servers, restricted mode is enforced. Unrestricted ioctl will be used by CUSE. Plese read the comment on top of fs/fuse/file.c::fuse_file_do_ioctl() for more information. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	193da60927	fuse: move FUSE_MINOR to miscdevice.h Move FUSE_MINOR to miscdevice.h. While at it, de-uglify the file. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:54 +01:00
Arjan van de Ven	f3f47a6768	tracing: add "power-tracer": C/P state tracer to help power optimization Impact: new "power-tracer" ftrace plugin This patch adds a C/P-state ftrace plugin that will generate detailed statistics about the C/P-states that are being used, so that we can look at detailed decisions that the C/P-state code is making, rather than the too high level "average" that we have today. An example way of using this is: mount -t debugfs none /sys/kernel/debug echo cstate > /sys/kernel/debug/tracing/current_tracer echo 1 > /sys/kernel/debug/tracing/tracing_enabled sleep 1 echo 0 > /sys/kernel/debug/tracing/tracing_enabled cat /sys/kernel/debug/tracing/trace \| perl scripts/trace/cstate.pl > out.svg Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-26 08:29:32 +01:00
Steven Rostedt	5a45cfe1c6	ftrace: use code patching for ftrace graph tracer Impact: more efficient code for ftrace graph tracer This patch uses the dynamic patching, when available, to patch the function graph code into the kernel. This patch will ease the way for letting both function tracing and function graph tracing run together. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-26 06:52:54 +01:00
Frederic Weisbecker	287b6e68ca	tracing/function-return-tracer: set a more human readable output Impact: feature This patch sets a C-like output for the function graph tracing. For this aim, we now call two handler for each function: one on the entry and one other on return. This way we can draw a well-ordered call stack. The pid of the previous trace is loosely stored to be compared against the one of the current trace to see if there were a context switch. Without this little feature, the call tree would seem broken at some locations. We could use the sched_tracer to capture these sched_events but this way of processing is much more simpler. 2 spaces have been chosen for indentation to fit the screen while deep calls. The time of execution in nanosecs is printed just after closed braces, it seems more easy this way to find the corresponding function. If the time was printed as a first column, it would be not so easy to find the corresponding function if it is called on a deep depth. I plan to output the return value but on 32 bits CPU, the return value can be 32 or 64, and its difficult to guess on which case we are. I don't know what would be the better solution on X86-32: only print eax (low-part) or even edx (high-part). Actually it's thee same problem when a function return a 8 bits value, the high part of eax could contain junk values... Here is an example of trace: sys_read() { fget_light() { } 526 vfs_read() { rw_verify_area() { security_file_permission() { cap_file_permission() { } 519 } 1564 } 2640 do_sync_read() { pipe_read() { __might_sleep() { } 511 pipe_wait() { prepare_to_wait() { } 760 deactivate_task() { dequeue_task() { dequeue_task_fair() { dequeue_entity() { update_curr() { update_min_vruntime() { } 504 } 1587 clear_buddies() { } 512 add_cfs_task_weight() { } 519 update_min_vruntime() { } 511 } 5602 dequeue_entity() { update_curr() { update_min_vruntime() { } 496 } 1631 clear_buddies() { } 496 update_min_vruntime() { } 527 } 4580 hrtick_update() { hrtick_start_fair() { } 488 } 1489 } 13700 } 14949 } 16016 msecs_to_jiffies() { } 496 put_prev_task_fair() { } 504 pick_next_task_fair() { } 489 pick_next_task_rt() { } 496 pick_next_task_fair() { } 489 pick_next_task_idle() { } 489 ------------8<---------- thread 4 ------------8<---------- finish_task_switch() { } 1203 do_softirq() { __do_softirq() { __local_bh_disable() { } 669 rcu_process_callbacks() { __rcu_process_callbacks() { cpu_quiet() { rcu_start_batch() { } 503 } 1647 } 3128 __rcu_process_callbacks() { } 542 } 5362 _local_bh_enable() { } 587 } 8880 } 9986 kthread_should_stop() { } 669 deactivate_task() { dequeue_task() { dequeue_task_fair() { dequeue_entity() { update_curr() { calc_delta_mine() { } 511 update_min_vruntime() { } 511 } 2813 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-26 01:59:45 +01:00
Frederic Weisbecker	fb52607afc	tracing/function-return-tracer: change the name into function-graph-tracer Impact: cleanup This patch changes the name of the "return function tracer" into function-graph-tracer which is a more suitable name for a tracing which makes one able to retrieve the ordered call stack during the code flow. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-26 01:59:45 +01:00
Ingo Molnar	509dceef64	Merge branches 'tracing/hw-branch-tracing' and 'tracing/branch-tracer' into tracing/core	2008-11-26 01:58:05 +01:00
Luis R. Rodriguez	3f2355cb91	cfg80211/mac80211: Add 802.11d support This adds country IE parsing to mac80211 and enables its usage within the new regulatory infrastructure in cfg80211. We parse the country IEs only on management beacons for the BSSID you are associated to and disregard the IEs when the country and environment (indoor, outdoor, any) matches the already processed country IE. To avoid following misinformed or outdated APs we build and use a regulatory domain out of the intersection between what the AP provides us on the country IE and what CRDA is aware is allowed on the same country. A secondary device is allowed to follow only the same country IE as it make no sense for two devices on a system to be in two different countries. In the case the AP is using country IEs for an incorrect country the user may help compliance further by setting the regulatory domain before or after the IE is parsed and in that case another intersection will be performed. CONFIG_WIRELESS_OLD_REGULATORY is supported but requires CRDA present. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-25 16:41:26 -05:00
Markus Metzger	6abb11aecd	x86, bts, ptrace: move BTS buffer allocation from ds.c into ptrace.c Impact: restructure DS memory allocation to be done by the usage site of DS Require pre-allocated buffers in ds.h. Move the BTS buffer allocation for ptrace into ptrace.c. The pointer to the allocated buffer is stored in the traced task's task_struct together with the handle returned by ds_request_bts(). Removes memory accounting code. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-25 17:31:12 +01:00
Markus Metzger	ca0002a179	x86, bts: base in-kernel ds interface on handles Impact: generalize the DS code to shared buffers Change the in-kernel ds.h interface to identify the tracer via a handle returned on ds_request_~(). Tracers used to be identified via their task_struct. The changes are required to allow DS to be shared between different tasks, which is needed for perfmon2 and for ftrace. For ptrace, the handle is stored in the traced task's task_struct. This should probably go into a (arch-specific) ptrace context some time. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-25 17:31:11 +01:00
Peter Zijlstra	ca109491f6	hrtimer: removing all ur callback modes Impact: cleanup, move all hrtimer processing into hardirq context This is an attempt at removing some of the hrtimer complexity by reducing the number of callback modes to 1. This means that all hrtimer callback functions will be ran from HARD-irq context. I went through all the 30 odd hrtimer callback functions in the kernel and saw only one that I'm not quite sure of, which is the one in net/can/bcm.c - hence I'm CC-ing the folks responsible for that code. Furthermore, the hrtimer core now calls callbacks directly with IRQs disabled in case you try to enqueue an expired timer. If this timer is a periodic timer (which should use hrtimer_forward() to advance its time) then it might be possible to end up in an inf. recursive loop due to the fact that hrtimer_forward() doesn't round up to the next timer granularity, and therefore keeps on calling the callback - obviously this needs a fix. Aside from that, this seems to compile and actually boot on my dual core test box - although I'm sure there are some bugs in, me not hitting any makes me certain :-) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-25 15:45:46 +01:00
David Vrabel	65d76f3682	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-upstream	2008-11-25 13:52:56 +00:00
Jeff Kirsher	7a6b6f515f	DCB: fix kconfig option Since the netlink option for DCB is necessary to actually be useful, simplified the Kconfig option. In addition, added useful help text for the Kconfig option. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-25 01:02:08 -08:00
Stephen Hemminger	47fd5b8373	netdev: add HAVE_NET_DEVICE_OPS As a concession to vendors who have to deal with one source for different kernel versions, add a HAVE_NET_DEVICE_OPS so they don't end up hard coding ifdef against kernel version. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-25 00:20:43 -08:00
Ingo Molnar	14bfc987e3	tracing, tty: fix warnings caused by branch tracing and tty_kref_get() Stephen Rothwell reported tht this warning started triggering in linux-next: In file included from init/main.c:27: include/linux/tty.h: In function ‘tty_kref_get’: include/linux/tty.h:330: warning: ‘______f’ is static but declared in inline function ‘tty_kref_get’ which is not static Which gcc emits for 'extern inline' functions that nevertheless define static variables. Change it to 'static inline', which is the norm in the kernel anyway. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-25 08:59:44 +01:00
Ilpo Järvinen	111cc8b913	tcp: add some mibs to track collapsing Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-24 21:27:22 -08:00
Ilpo Järvinen	832d11c5cd	tcp: Try to restore large SKBs while SACK processing During SACK processing, most of the benefits of TSO are eaten by the SACK blocks that one-by-one fragment SKBs to MSS sized chunks. Then we're in problems when cleanup work for them has to be done when a large cumulative ACK comes. Try to return back to pre-split state already while more and more SACK info gets discovered by combining newly discovered SACK areas with the previous skb if that's SACKed as well. This approach has a number of benefits: 1) The processing overhead is spread more equally over the RTT 2) Write queue has less skbs to process (affect everything which has to walk in the queue past the sacked areas) 3) Write queue is consistent whole the time, so no other parts of TCP has to be aware of this (this was not the case with some other approach that was, well, quite intrusive all around). 4) Clean_rtx_queue can release most of the pages using single put_page instead of previous PAGE_SIZE/mss+1 calls In case a hole is fully filled by the new SACK block, we attempt to combine the next skb too which allows construction of skbs that are even larger than what tso split them to and it handles hole per on every nth patterns that often occur during slow start overshoot pretty nicely. Though this to be really useful also a retransmission would have to get lost since cumulative ACKs advance one hole at a time in the most typical case. TODO: handle upwards only merging. That should be rather easy when segment is fully sacked but I'm leaving that as future work item (it won't make very large difference anyway since this current approach already covers quite a lot of normal cases). I was earlier thinking of some sophisticated way of tracking timestamps of the first and the last segment but later on realized that it won't be that necessary at all to store the timestamp of the last segment. The cases that can occur are basically either: 1) ambiguous => no sensible measurement can be taken anyway 2) non-ambiguous is due to reordering => having the timestamp of the last segment there is just skewing things more off than does some good since the ack got triggered by one of the holes (besides some substle issues that would make determining right hole/skb even harder problem). Anyway, it has nothing to do with this change then. I choose to route some abnormal looking cases with goto noop, some could be handled differently (eg., by stopping the walking at that skb but again). In general, they either shouldn't happen at all or are rare enough to make no difference in practice. In theory this change (as whole) could cause some macroscale regression (global) because of cache misses that are taken over the round-trip time but it gets very likely better because of much less (local) cache misses per other write queue walkers and the big recovery clearing cumulative ack. Worth to note that these benefits would be very easy to get also without TSO/GSO being on as long as the data is in pages so that we can merge them. Currently I won't let that happen because DSACK splitting at fragment that would mess up pcounts due to sk_can_gso in tcp_set_skb_tso_segs. Once DSACKs fragments gets avoided, we have some conditions that can be made less strict. TODO: I will probably have to convert the excessive pointer passing to struct sacktag_state... :-) My testing revealed that considerable amount of skbs couldn't be shifted because they were cloned (most likely still awaiting tx reclaim)... [The rest is considering future work instead since I got repeatably EFAULT to tcpdump's recvfrom when I added pskb_expand_head to deal with clones, so I separated that into another, later patch] ...To counter that, I gave up on the fifth advantage: 5) When growing previous SACK block, less allocs for new skbs are done, basically a new alloc is needed only when new hole is detected and when the previous skb runs out of frags space ...which now only happens of if reclaim is fast enough to dispose the clone before the SACK block comes in (the window is RTT long), otherwise we'll have to alloc some. With clones being handled I got these numbers (will be somewhat worse without that), taken with fine-grained mibs: TCPSackShifted 398 TCPSackMerged 877 TCPSackShiftFallback 320 TCPSACKCOLLAPSEFALLBACKGSO 0 TCPSACKCOLLAPSEFALLBACKSKBBITS 0 TCPSACKCOLLAPSEFALLBACKSKBDATA 0 TCPSACKCOLLAPSEFALLBACKBELOW 0 TCPSACKCOLLAPSEFALLBACKFIRST 1 TCPSACKCOLLAPSEFALLBACKPREVBITS 318 TCPSACKCOLLAPSEFALLBACKMSS 1 TCPSACKCOLLAPSEFALLBACKNOHEAD 0 TCPSACKCOLLAPSEFALLBACKSHIFT 0 TCPSACKCOLLAPSENOOPSEQ 0 TCPSACKCOLLAPSENOOPSMALLPCOUNT 0 TCPSACKCOLLAPSENOOPSMALLLEN 0 TCPSACKCOLLAPSEHOLE 12 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-24 21:20:15 -08:00
Jan Engelhardt	f79fca55f9	netfilter: xtables: add missing const qualifier to xt_tgchk_param When entryinfo was a standalone parameter to functions, it used to be "const void *". Put the const back in. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-24 16:06:17 -08:00
Serge Hallyn	18b6e0414e	User namespaces: set of cleanups (v2) The user_ns is moved from nsproxy to user_struct, so that a struct cred by itself is sufficient to determine access (which it otherwise would not be). Corresponding ecryptfs fixes (by David Howells) are here as well. Fix refcounting. The following rules now apply: 1. The task pins the user struct. 2. The user struct pins its user namespace. 3. The user namespace pins the struct user which created it. User namespaces are cloned during copy_creds(). Unsharing a new user_ns is no longer possible. (We could re-add that, but it'll cause code duplication and doesn't seem useful if PAM doesn't need to clone user namespaces). When a user namespace is created, its first user (uid 0) gets empty keyrings and a clean group_info. This incorporates a previous patch by David Howells. Here is his original patch description: >I suggest adding the attached incremental patch. It makes the following >changes: > > (1) Provides a current_user_ns() macro to wrap accesses to current's user > namespace. > > (2) Fixes eCryptFS. > > (3) Renames create_new_userns() to create_user_ns() to be more consistent > with the other associated functions and because the 'new' in the name is > superfluous. > > (4) Moves the argument and permission checks made for CLONE_NEWUSER to the > beginning of do_fork() so that they're done prior to making any attempts > at allocation. > > (5) Calls create_user_ns() after prepare_creds(), and gives it the new creds > to fill in rather than have it return the new root user. I don't imagine > the new root user being used for anything other than filling in a cred > struct. > > This also permits me to get rid of a get_uid() and a free_uid(), as the > reference the creds were holding on the old user_struct can just be > transferred to the new namespace's creator pointer. > > (6) Makes create_user_ns() reset the UIDs and GIDs of the creds under > preparation rather than doing it in copy_creds(). > >David >Signed-off-by: David Howells <dhowells@redhat.com> Changelog: Oct 20: integrate dhowells comments 1. leave thread_keyring alone 2. use current_user_ns() in set_user() Signed-off-by: Serge Hallyn <serue@us.ibm.com>	2008-11-24 18:57:41 -05:00
Thomas Gleixner	1acdac1046	futex: make clock selectable for FUTEX_WAIT_BITSET FUTEX_WAIT_BITSET could be used instead of FUTEX_WAIT by setting the bit set to FUTEX_BITSET_MATCH_ANY, but FUTEX_WAIT uses CLOCK_REALTIME while FUTEX_WAIT_BITSET uses CLOCK_MONOTONIC. Add a flag to select CLOCK_REALTIME for FUTEX_WAIT_BITSET so glibc can replace the FUTEX_WAIT logic which needs to do gettimeofday() calls before and after the syscall to convert the absolute timeout to a relative timeout for FUTEX_WAIT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ulrich Drepper <drepper@redhat.com>	2008-11-24 20:00:40 +01:00
Thomas Gleixner	3e1d7a6219	Merge branch 'linus' into core/futexes	2008-11-24 19:54:37 +01:00
Rusty Russell	96f874e264	sched: convert remaining old-style cpumask operators Impact: Trivial API conversion NR_CPUS -> nr_cpu_ids cpumask_t -> struct cpumask sizeof(cpumask_t) -> cpumask_size() cpumask_a = cpumask_b -> cpumask_copy(&cpumask_a, &cpumask_b) cpu_set() -> cpumask_set_cpu() first_cpu() -> cpumask_first() cpumask_of_cpu() -> cpumask_of() cpus_* -> cpumask_* There are some FIXMEs where we all archs to complete infrastructure (patches have been sent): cpu_coregroup_map -> cpu_coregroup_mask node_to_cpumask* -> cpumask_of_node There is also one FIXME where we pass an array of cpumasks to partition_sched_domains(): this implies knowing the definition of 'struct cpumask' and the size of a cpumask. This will be fixed in a future patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-24 17:52:42 +01:00
Rusty Russell	6a7b3dc344	sched: convert nohz_cpu_mask to cpumask_var_t. Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-24 17:51:10 +01:00
Rusty Russell	6c99e9ad47	sched: convert struct sched_group/sched_domain cpumask_ts to variable bitmaps Impact: (future) size reduction for large NR_CPUS. We move the 'cpumask' member of sched_group to the end, so when we kmalloc it we can do a minimal allocation: saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. Similar trick for 'span' in sched_domain. This isn't quite as good as converting to a cpumask_var_t, as some sched_groups are actually static, but it's safer: we don't have to figure out where to call alloc_cpumask_var/free_cpumask_var. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-24 17:50:57 +01:00
Rusty Russell	758b2cdc6f	sched: wrap sched_group and sched_domain cpumask accesses. Impact: trivial wrap of member accesses This eases the transition in the next patch. We also get rid of a temporary cpumask in find_idlest_cpu() thanks to for_each_cpu_and, and sched_balance_self() due to getting weight before setting sd to NULL. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-24 17:50:45 +01:00
Ingo Molnar	943f3d0300	Merge branches 'sched/core', 'core/core' and 'tracing/core' into cpus4096	2008-11-24 17:46:57 +01:00
Ingo Molnar	6f893fb2e8	Merge branches 'tracing/branch-tracer', 'tracing/fastboot', 'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/power-tracer', 'tracing/powerpc', 'tracing/ring-buffer', 'tracing/stack-tracer' and 'tracing/urgent' into tracing/core	2008-11-24 17:46:24 +01:00
Ingo Molnar	b19b3c74c7	Merge branches 'core/debug', 'core/futexes', 'core/locking', 'core/rcu', 'core/signal', 'core/urgent' and 'core/xen' into core/core	2008-11-24 17:44:55 +01:00
Dmitry Torokhov	a2d781fc8d	Input: libps2 - handle 0xfc responses from devices Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-11-24 11:43:21 -05:00
Jaya Kumar	3eb1aa43ef	Input: add support for Wacom W8001 penabled serial touchscreen The Wacom W8001 sensor is a sensor device (uses electromagnetic resonance) and it is interfaced via its serial microcontroller to the host. Signed-off-by: Jaya Kumar <jayakumar.lkml@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-11-24 11:41:38 -05:00
Ingo Molnar	64b7482de2	Merge branch 'sched/rt' into sched/core	2008-11-24 17:37:12 +01:00
Eric Dumazet	1f87e235e6	eth: Declare an optimized compare_ether_addr_64bits() function Linus mentioned we could try to perform long word operations, even on potentially unaligned addresses, on x86 at least. David mentioned the HAVE_EFFICIENT_UNALIGNED_ACCESS test to handle this on all arches that have efficient unailgned accesses. I tried this idea and got nice assembly on 32 bits: 158: 33 82 38 01 00 00 xor 0x138(%edx),%eax 15e: 33 8a 34 01 00 00 xor 0x134(%edx),%ecx 164: c1 e0 10 shl $0x10,%eax 167: 09 c1 or %eax,%ecx 169: 74 0b je 176 <eth_type_trans+0x87> And very nice assembly on 64 bits of course (one xor, one shl) Nice oprofile improvement in eth_type_trans(), 0.17 % instead of 0.41 %, expected since we remove 8 instructions on a fast path. This patch implements a compare_ether_addr_64bits() function, that uses the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS ifdef to efficiently perform the 6 bytes comparison on all capable arches. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-23 23:24:32 -08:00
Gerrit Renker	b20a9c24d5	dccp: Set per-connection CCIDs via socket options With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which overrides the defaults set by the global sysctl variables for TX/RX CCIDs. To make full use of this facility, the remaining patches of this patch set are needed, which track dependencies and activate negotiated feature values. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-23 16:02:31 -08:00
Richard Kennedy	e262a7ba31	irq.h: remove padding from irq_desc on 64bits Impact: reduce struct irq_desc size struct irq_desc: reorder to remove padding on 64bits shrinks irq_desc to 128 bytes which saves data space & cache lines On a generic x86_64/SMP build this reduces the reported data size by 64k. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 16:15:00 +01:00
Török Edwin	8d26487fd4	tracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORT Impact: cleanup User stack tracing is just implemented for x86, but it is not x86 specific. Introduce a generic config flag, that is currently enabled only for x86. When other arches implement it, they will have to SELECT USER_STACKTRACE_SUPPORT. Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 11:53:50 +01:00
Török Edwin	8d7c6a9616	tracing/stack-tracer: fix style issues Impact: cleanup Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 11:53:48 +01:00
Steven Rostedt	69bb54ec05	ftrace: add ftrace_off_permanent Impact: add new API to disable all of ftrace on anomalies It case of a serious anomaly being detected (like something caught by lockdep) it is a good idea to disable all tracing immediately, without grabing any locks. This patch adds ftrace_off_permanent that disables the tracers, function tracing and ring buffers without a way to enable them again. This should only be used when something serious has been detected. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 11:45:34 +01:00
Steven Rostedt	033601a32b	ring-buffer: add tracing_off_permanent Impact: feature to permanently disable ring buffer This patch adds a API to the ring buffer code that will permanently disable the ring buffer from ever recording. This should only be called when some serious anomaly is detected, and the system may be in an unstable state. When that happens, shutting down the recording to the ring buffers may be appropriate. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 11:44:37 +01:00
Steven Rostedt	2bcd521a68	trace: profile all if conditionals Impact: feature to profile if statements This patch adds a branch profiler for all if () statements. The results will be found in: /debugfs/tracing/profile_branch For example: miss hit % Function File Line ------- --------- - -------- ---- ---- 0 1 100 x86_64_start_reservations head64.c 127 0 1 100 copy_bootdata head64.c 69 1 0 0 x86_64_start_kernel head64.c 111 32 0 0 set_intr_gate desc.h 319 1 0 0 reserve_ebda_region head.c 51 1 0 0 reserve_ebda_region head.c 47 0 1 100 reserve_ebda_region head.c 42 0 0 X maxcpus main.c 165 Miss means the branch was not taken. Hit means the branch was taken. The percent is the percentage the branch was taken. This adds a significant amount of overhead and should only be used by those analyzing their system. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 11:41:01 +01:00
Steven Rostedt	45b797492a	trace: consolidate unlikely and likely profiler Impact: clean up to make one profiler of like and unlikely tracer The likely and unlikely profiler prints out the file and line numbers of the annotated branches that it is profiling. It shows the number of times it was correct or incorrect in its guess. Having two different files or sections for that matter to tell us if it was a likely or unlikely is pretty pointless. We really only care if it was correct or not. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 11:39:56 +01:00
Steven Rostedt	42f565e116	trace: remove extra assign in branch check Impact: clean up of branch check The unlikely/likely profiler does an extra assign of the f.line. This is not needed since it is already calculated at compile time. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 11:39:28 +01:00
Randy Dunlap	2ed1cdcf9a	irq.h: fix missing/extra kernel-doc Impact: fix kernel-doc build Fix missing & excess irq.h kernel-doc: Warning(include/linux/irq.h:182): No description found for parameter 'irq' Warning(include/linux/irq.h:182): Excess struct/union/enum/typedef member 'affinity_entry' description in 'irq_desc' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 10:52:45 +01:00
Ingo Molnar	9f14416442	Merge commit 'v2.6.28-rc6' into irq/urgent	2008-11-23 10:52:33 +01:00
Török Edwin	74e2f334f4	vfs, seqfile: make mangle_path() global Impact: expose new VFS API make mangle_path() available, as per the suggestions of Christoph Hellwig and Al Viro: http://lkml.org/lkml/2008/11/4/338 Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 09:45:39 +01:00
Török Edwin	02b67518e2	tracing: add support for userspace stacktraces in tracing/iter_ctrl Impact: add new (default-off) tracing visualization feature Usage example: mount -t debugfs nodev /sys/kernel/debug cd /sys/kernel/debug/tracing echo userstacktrace >iter_ctrl echo sched_switch >current_tracer echo 1 >tracing_enabled .... run application ... echo 0 >tracing_enabled Then read one of 'trace','latency_trace','trace_pipe'. To get the best output you can compile your userspace programs with frame pointers (at least glibc + the app you are tracing). Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 09:25:15 +01:00
Ingo Molnar	82f60f0bc8	tracing/function-return-tracer: clean up task start/exit callbacks Impact: cleanup Eliminate #ifdefs in core code by using empty inline functions. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 09:19:35 +01:00
Frederic Weisbecker	f201ae2356	tracing/function-return-tracer: store return stack into task_struct and allocate it dynamically Impact: use deeper function tracing depth safely Some tests showed that function return tracing needed a more deeper depth of function calls. But it could be unsafe to store these return addresses to the stack. So these arrays will now be allocated dynamically into task_struct of current only when the tracer is activated. Typical scheme when tracer is activated: - allocate a return stack for each task in global list. - fork: allocate the return stack for the newly created task - exit: free return stack of current - idle init: same as fork I chose a default depth of 50. I don't have overruns anymore. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-23 09:17:26 +01:00
Ingo Molnar	a0a70c735e	Merge branches 'tracing/profiling', 'tracing/options' and 'tracing/urgent' into tracing/core	2008-11-23 09:10:32 +01:00
Wang Chen	2baf8a2daa	netdevice hdlc: Convert directly reference of netdev->priv For killing directly reference of netdev->priv, use netdev->ml_priv to replace it. Because the private pvc data comes from add_pvc() and can't be allocated in alloc_netdev(). Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-21 16:34:18 -08:00
Alexander Duyck	859ee3c438	DCB: Add support for DCB BCN Adds an interface to configure the Backward Congestion Notification (BCN) feature. In a BCN capabale network, congestion notifications from congested points out in the network can cause the end station limit the rate of a given traffic flow. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 21:10:23 -08:00
Alexander Duyck	0eb3aa9bab	DCB: Add interface to query the state of PFC feature. Adds a netlink interface for Data Center Bridging (DCB) to get and set the enable state of the Priority Flow Control (PFC) feature. Primarily, this is a way to turn off PFC in the driver while DCB remains enabled. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 21:09:23 -08:00
Alexander Duyck	33dbabc4a7	DCB: Add interface to query # of TCs supported by device Adds interface for Data Center Bridging (DCB) to query (and set if supported) the number of traffic classes currently supported by the device for the two (DCB) features: priority groups (PG) and priority flow control (PFC). Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 21:08:19 -08:00
Alexander Duyck	46132188bf	DCB: Add interface to query for the DCB capabilities of an device. Adds to the netlink interface for Data Center Bridging (DCB), allowing the DCB capabilities supported by a device to be queried. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 21:05:08 -08:00
Alexander Duyck	2f90b8657e	ixgbe: this patch adds support for DCB to the kernel and ixgbe driver This adds support for Data Center Bridging (DCB) features in the ixgbe driver and adds an rtnetlink interface for configuring DCB to the kernel. The DCB feature support included are Priority Grouping (PG) - which allows bandwidth guarantees to be allocated to groups to traffic based on the 802.1q priority, and Priority Based Flow Control (PFC) - which introduces a new MAC control PAUSE frame which works at granularity of the 802.1p priority instead of the link (IEEE 802.3x). Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 20:52:10 -08:00
Stephen Hemminger	748ff68fad	hippi: convert driver to net_device_ops Convert the HIPPI infrastructure for use with net_device_ops. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 20:32:15 -08:00
Stephen Hemminger	145186a395	fddi: convert to new network device ops Similar to ethernet. Convert infrastructure and the one lone FDDI driver (for the one lone user of that hardware??). Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 20:29:48 -08:00
Stephen Hemminger	008298231a	netdev: add more functions to netdevice ops This patch moves neigh_setup and hard_start_xmit into the network device ops structure. For bisection, fix all the previously converted drivers as well. Bonding driver took the biggest hit on this. Added a prefetch of the hard_start_xmit in the fast path to try and reduce any impact this would have. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 20:14:53 -08:00
David S. Miller	6ab33d5171	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c include/net/mac80211.h net/phonet/af_phonet.c	2008-11-20 16:44:00 -08:00
Andy Whitcroft	018a7bf1e5	netfilter: ip{,6}t_policy.h should include xp_policy.h It seems that all of the include/netfilter_{ipv4,ipv6}/{ipt,ip6t}_.h which share constants include the corresponding include/netfilter/xp_.h files. Neither ipt_policy.h not ip6t_policy.h do. Make these consistant with the norm. Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-11-20 15:59:56 +01:00
Patrick McHardy	13d2a1d2b0	pkt_sched: add DRR scheduler Add classful DRR scheduler as a more flexible replacement for SFQ. The main difference to the algorithm described in "Efficient Fair Queueing using Deficit Round Robin" is that this implementation doesn't drop packets from the longest queue on overrun because its classful and limits are handled by each individual child qdisc. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 04:10:00 -08:00
Patrick McHardy	0c19b0adb8	netlink: avoid memset of 0 bytes sparse warning A netlink attribute padding of zero triggers this sparse warning: include/linux/netlink.h:245:8: warning: memset with byte count of 0 Avoid the memset when the size parameter is constant and requires no padding. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 04:08:29 -08:00
Pablo Neira Ayuso	d214c7537b	filter: add SKF_AD_NLATTR_NEST to look for nested attributes SKF_AD_NLATTR allows us to find the first matching attribute in a stream of netlink attributes from one offset to the end of the netlink message. This is not suitable to look for a specific matching inside a set of nested attributes. For example, in ctnetlink messages, if we look for the CTA_V6_SRC attribute in a message that talks about an IPv4 connection, SKF_AD_NLATTR returns the offset of CTA_STATUS which has the same value of CTA_V6_SRC but outside the nest. To differenciate CTA_STATUS and CTA_V6_SRC, we would have to make assumptions on the size of the attribute and the usual offset, resulting in horrible BSF code. This patch adds SKF_AD_NLATTR_NEST, which is a variant of SKF_AD_NLATTR, that looks for an attribute inside the limits of a nested attributes, but not further. This patch validates that we have enough room to look for the nested attributes - based on a suggestion from Patrick McHardy. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 00:49:27 -08:00
Stephen Hemminger	ccad637b0c	netdev: expose ethernet address primitives When ethernet devices are converted, the function pointer setup by eth_setup() need to be done during intialization. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-19 22:42:31 -08:00
Stephen Hemminger	eeda3fd64f	netdev: introduce dev_get_stats() In order for the network device ops get_stats call to be immutable, the handling of the default internal network device stats block has to be changed. Add a new helper function which replaces the old use of internal_get_stats. Note: change return code to make it clear that the caller should not go changing the returned statistics. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-19 21:40:23 -08:00
Stephen Hemminger	d314774cf2	netdev: network device operations infrastructure This patch changes the network device internal API to move adminstrative operations out of the network device structure and into a separate structure. This patch involves some hackery to maintain compatablity between the new and old model, so all 300+ drivers don't have to be changed at once. For drivers that aren't converted yet, the netdevice_ops virt function list still resides in the net_device structure. For old protocols, the new net_device_ops are copied out to the old net_device pointers. After the transistion is completed the nag message can be changed to an WARN_ON, and the compatiablity code can be made configurable. Some function pointers aren't moved: * destructor can't be in net_device_ops because it may need to be referenced after the module is unloaded. * neighbor setup is manipulated in a couple of places that need special consideration * hard_start_xmit is in the fast path for transmit. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-19 21:32:24 -08:00
Miao Xie	f481891fdc	cpuset: update top cpuset's mems after adding a node After adding a node into the machine, top cpuset's mems isn't updated. By reviewing the code, we found that the update function cpuset_track_online_nodes() was invoked after node_states[N_ONLINE] changes. It is wrong because N_ONLINE just means node has pgdat, and if node has/added memory, we use N_HIGH_MEMORY. So, We should invoke the update function after node_states[N_HIGH_MEMORY] changes, just like its commit says. This patch fixes it. And we use notifier of memory hotplug instead of direct calling of cpuset_track_online_nodes(). Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Paul Menage <menage@google.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-19 18:49:58 -08:00
Ulrich Drepper	de11defebf	reintroduce accept4 Introduce a new accept4() system call. The addition of this system call matches analogous changes in 2.6.27 (dup3(), evenfd2(), signalfd4(), inotify_init1(), epoll_create1(), pipe2()) which added new system calls that differed from analogous traditional system calls in adding a flags argument that can be used to access additional functionality. The accept4() system call is exactly the same as accept(), except that it adds a flags bit-mask argument. Two flags are initially implemented. (Most of the new system calls in 2.6.27 also had both of these flags.) SOCK_CLOEXEC causes the close-on-exec (FD_CLOEXEC) flag to be enabled for the new file descriptor returned by accept4(). This is a useful security feature to avoid leaking information in a multithreaded program where one thread is doing an accept() at the same time as another thread is doing a fork() plus exec(). More details here: http://udrepper.livejournal.com/20407.html "Secure File Descriptor Handling", Ulrich Drepper). The other flag is SOCK_NONBLOCK, which causes the O_NONBLOCK flag to be enabled on the new open file description created by accept4(). (This flag is merely a convenience, saving the use of additional calls fcntl(F_GETFL) and fcntl (F_SETFL) to achieve the same result. Here's a test program. Works on x86-32. Should work on x86-64, but I (mtk) don't have a system to hand to test with. It tests accept4() with each of the four possible combinations of SOCK_CLOEXEC and SOCK_NONBLOCK set/clear in 'flags', and verifies that the appropriate flags are set on the file descriptor/open file description returned by accept4(). I tested Ulrich's patch in this thread by applying against 2.6.28-rc2, and it passes according to my test program. /* test_accept4.c Copyright (C) 2008, Linux Foundation, written by Michael Kerrisk <mtk.manpages@gmail.com> Licensed under the GNU GPLv2 or later. / #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/socket.h> #include <netinet/in.h> #include <stdlib.h> #include <fcntl.h> #include <stdio.h> #include <string.h> #define PORT_NUM 33333 #define die(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0) /********************************************************************/ / The following is what we need until glibc gets a wrapper for accept4() / / Flags for socket(), socketpair(), accept4() / #ifndef SOCK_CLOEXEC #define SOCK_CLOEXEC O_CLOEXEC #endif #ifndef SOCK_NONBLOCK #define SOCK_NONBLOCK O_NONBLOCK #endif #ifdef __x86_64__ #define SYS_accept4 288 #elif __i386__ #define USE_SOCKETCALL 1 #define SYS_ACCEPT4 18 #else #error "Sorry -- don't know the syscall # on this architecture" #endif static int accept4(int fd, struct sockaddr sockaddr, socklen_t addrlen, int flags) { printf("Calling accept4(): flags = %x", flags); if (flags != 0) { printf(" ("); if (flags & SOCK_CLOEXEC) printf("SOCK_CLOEXEC"); if ((flags & SOCK_CLOEXEC) && (flags & SOCK_NONBLOCK)) printf(" "); if (flags & SOCK_NONBLOCK) printf("SOCK_NONBLOCK"); printf(")"); } printf("\n"); #if USE_SOCKETCALL long args[6]; args[0] = fd; args[1] = (long) sockaddr; args[2] = (long) addrlen; args[3] = flags; return syscall(SYS_socketcall, SYS_ACCEPT4, args); #else return syscall(SYS_accept4, fd, sockaddr, addrlen, flags); #endif } /********************************************************************/ static int do_test(int lfd, struct sockaddr_in conn_addr, int closeonexec_flag, int nonblock_flag) { int connfd, acceptfd; int fdf, flf, fdf_pass, flf_pass; struct sockaddr_in claddr; socklen_t addrlen; printf("=======================================\n"); connfd = socket(AF_INET, SOCK_STREAM, 0); if (connfd == -1) die("socket"); if (connect(connfd, (struct sockaddr ) conn_addr, sizeof(struct sockaddr_in)) == -1) die("connect"); addrlen = sizeof(struct sockaddr_in); acceptfd = accept4(lfd, (struct sockaddr ) &claddr, &addrlen, closeonexec_flag \| nonblock_flag); if (acceptfd == -1) { perror("accept4()"); close(connfd); return 0; } fdf = fcntl(acceptfd, F_GETFD); if (fdf == -1) die("fcntl:F_GETFD"); fdf_pass = ((fdf & FD_CLOEXEC) != 0) == ((closeonexec_flag & SOCK_CLOEXEC) != 0); printf("Close-on-exec flag is %sset (%s); ", (fdf & FD_CLOEXEC) ? "" : "not ", fdf_pass ? "OK" : "failed"); flf = fcntl(acceptfd, F_GETFL); if (flf == -1) die("fcntl:F_GETFD"); flf_pass = ((flf & O_NONBLOCK) != 0) == ((nonblock_flag & SOCK_NONBLOCK) !=0); printf("nonblock flag is %sset (%s)\n", (flf & O_NONBLOCK) ? "" : "not ", flf_pass ? "OK" : "failed"); close(acceptfd); close(connfd); printf("Test result: %s\n", (fdf_pass && flf_pass) ? "PASS" : "FAIL"); return fdf_pass && flf_pass; } static int create_listening_socket(int port_num) { struct sockaddr_in svaddr; int lfd; int optval; memset(&svaddr, 0, sizeof(struct sockaddr_in)); svaddr.sin_family = AF_INET; svaddr.sin_addr.s_addr = htonl(INADDR_ANY); svaddr.sin_port = htons(port_num); lfd = socket(AF_INET, SOCK_STREAM, 0); if (lfd == -1) die("socket"); optval = 1; if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)) == -1) die("setsockopt"); if (bind(lfd, (struct sockaddr ) &svaddr, sizeof(struct sockaddr_in)) == -1) die("bind"); if (listen(lfd, 5) == -1) die("listen"); return lfd; } int main(int argc, char argv[]) { struct sockaddr_in conn_addr; int lfd; int port_num; int passed; passed = 1; port_num = (argc > 1) ? atoi(argv[1]) : PORT_NUM; memset(&conn_addr, 0, sizeof(struct sockaddr_in)); conn_addr.sin_family = AF_INET; conn_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); conn_addr.sin_port = htons(port_num); lfd = create_listening_socket(port_num); if (!do_test(lfd, &conn_addr, 0, 0)) passed = 0; if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, 0)) passed = 0; if (!do_test(lfd, &conn_addr, 0, SOCK_NONBLOCK)) passed = 0; if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, SOCK_NONBLOCK)) passed = 0; close(lfd); exit(passed ? EXIT_SUCCESS : EXIT_FAILURE); } [mtk.manpages@gmail.com: rewrote changelog, updated test program] Signed-off-by: Ulrich Drepper <drepper@redhat.com> Tested-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-19 18:49:57 -08:00
David Vrabel	dba0a91872	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-upstream	2008-11-19 14:48:07 +00:00
David Vrabel	0996e63824	uwb: remove unused beacon group join/leave events The UWB_NOTIF_BG_JOIN/UWB_NOTIF_BG_LEAVE events have been superceeded by the channel_changed callback in struct uwb_pal. Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-19 14:47:16 +00:00
David Vrabel	e8e1594c81	wlp: start/stop radio on network interface up/down Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-19 14:47:04 +00:00
David Vrabel	6fae35f9ce	uwb: add basic radio manager The UWB radio manager coordinates the use of the radio between the PALs that may be using it. PALs request use of the radio with uwb_radio_start() and the radio manager will start beaconing if its not already doing so. When the last PAL has called uwb_radio_stop() beaconing will be stopped. In the future, the radio manager will have a more sophisticated channel selection algorithm, probably following the Channel Selection Policy from the WiMedia Alliance when it is finalized. For now, channel 9 (BG1, TFC1) is selected. The user may override the channel selected by the radio manager and may force the radio to stop beaconing. The WUSB Host Controller PAL makes use of this and there are two new debug PAL commands that can be used for testing. Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-19 14:46:33 +00:00
Ingo Molnar	9676e73a9e	Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core Conflicts: kernel/trace/ftrace.c [ We conflicted here because we backported a few fixes to tracing/urgent - which has different internal APIs. ]	2008-11-19 10:04:25 +01:00
David S. Miller	198d6ba4d7	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/isdn/i4l/isdn_net.c fs/cifs/connect.c	2008-11-18 23:38:23 -08:00
Paul Mackerras	cea555d384	Merge branch 'linux-2.6' into next	2008-11-19 16:10:32 +11:00
Michael Ellerman	1e291b14c8	of: Add helpers for finding device nodes which have a given property This commit adds a routine for finding a device node which has a certain property. The contents of the property are not taken into account, merely the presence or absence of the property. Based on that routine, we add a for_each_ macro for iterating over all nodes that have a certain property. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-19 16:05:00 +11:00
Linus Torvalds	7f0f598a00	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: hold extra reference to bio in blk_rq_map_user_iov() relay: fix cpu offline problem Release old elevator on change elevator block: fix boot failure with CONFIG_DEBUG_BLOCK_EXT_DEVT=y and nash block/md: fix md autodetection block: make add_partition() return pointer to hd_struct block: fix add_partition() error path	2008-11-18 08:07:51 -08:00
Linus Torvalds	72b51a6b4d	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: kernel/profile.c: fix section mismatch warning function tracing: fix wrong pos computing when read buffer has been fulfilled tracing: fix mmiotrace resizing crash ring-buffer: no preempt for sched_clock() ring-buffer: buffer record on/off switch	2008-11-18 08:06:35 -08:00
Tejun Heo	ba32929a91	block: make add_partition() return pointer to hd_struct Make add_partition() return pointer to the new hd_struct on success and ERR_PTR() value on failure. This change will be used to fix md autodetection bug. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-11-18 15:08:56 +01:00
Frederic Weisbecker	0231022cc3	tracing/function-return-tracer: add the overrun field Impact: help to find the better depth of trace We decided to arbitrary define the depth of function return trace as "20". Perhaps this is not enough. To help finding an optimal depth, we measure now the overrun: the number of functions that have been missed for the current thread. By default this is not displayed, we have to do set a particular flag on the return tracer: echo overrun > /debug/tracing/trace_options And the overrun will be printed on the right. As the trace shows below, the current 20 depth is not enough. update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838) update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838) do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838) tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838) tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838) vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274) vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274) set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274) con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274) release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274) release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274) con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274) con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274) n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274) smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274) irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274) smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274) ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274) ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274) ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274) hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-18 11:11:00 +01:00
James Morris	f3a5c54701	Merge branch 'master' into next Conflicts: fs/cifs/misc.c Merge to resolve above, per the patch below. Signed-off-by: James Morris <jmorris@namei.org> diff --cc fs/cifs/misc.c index ec36410,addd1dc..0000000 --- a/fs/cifs/misc.c +++ b/fs/cifs/misc.c @@@ -347,13 -338,13 +338,13 @@@ header_assemble(struct smb_hdr buffer / BB Add support for establishing new tCon and SMB Session / / with userid/password pairs found on the smb session / / for other target tcp/ip addresses BB */ - if (current->fsuid != treeCon->ses->linux_uid) { + if (current_fsuid() != treeCon->ses->linux_uid) { cFYI(1, ("Multiuser mode and UID " "did not match tcon uid")); - read_lock(&GlobalSMBSeslock); - list_for_each(temp_item, &GlobalSMBSessionList) { - ses = list_entry(temp_item, struct cifsSesInfo, cifsSessionList); + read_lock(&cifs_tcp_ses_lock); + list_for_each(temp_item, &treeCon->ses->server->smb_ses_list) { + ses = list_entry(temp_item, struct cifsSesInfo, smb_ses_list); - if (ses->linux_uid == current->fsuid) { + if (ses->linux_uid == current_fsuid()) { if (ses->server == treeCon->ses->server) { cFYI(1, ("found matching uid substitute right smb_uid")); buffer->Uid = ses->Suid;	2008-11-18 18:52:37 +11:00
Linus Torvalds	847e9170c7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) rtnetlink: propagate error from dev_change_flags in do_setlink() isdn: remove extra byteswap in isdn_net_ciscohdlck_slarp_send_reply Phonet: refuse to send bigger than MTU packets e1000e: fix IPMI traffic e1000e: fix warn_on reload after phy_id error phy: fix phy address bug e100: fix dma error in direction for mapping igb: use dev_printk instead of printk qla3xxx: Cleanup: Fix link print statements. igb: Use device_set_wakeup_enable e1000: Use device_set_wakeup_enable e1000e: Use device_set_wakeup_enable via-velocity: enable perfect filtering for multicast packets phy: Add support for Marvell 88E1118 PHY mlx4_en: Pause parameters per port phylib: fix premature freeing of struct mii_bus atl1: Do not enumerate options unsupported by chip atl1e: fix broken multicast by removing unnecessary crc inversion gianfar: Fix DMA unmap invocations net/ucc_geth: Fix oops in uec_get_ethtool_stats() ...	2008-11-17 07:53:25 -08:00
David Vrabel	e17be2b2a9	uwb: add pal parameter to new reservation callback The pal parameter allows PALs to retrieve their PAL-specific data structure. Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-17 15:24:14 +00:00
Ingo Molnar	3f8e402f34	Merge branches 'tracing/branch-tracer', 'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/tracepoints' and 'tracing/urgent' into tracing/core	2008-11-17 09:36:22 +01:00
Gerrit Renker	dd9c0e363c	dccp: Deprecate Ack Ratio sysctl This patch deprecates the Ack Ratio sysctl, since * Ack Ratio is entirely ignored by CCID-3 and CCID-4, * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1); * even if it would work in CCID-2, there is no point for a user to change it: - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2), - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts (since waiting for Acks which will never arrive in this window), - cwnd is not a user-configurable value. The only reasonable place for Ack Ratio is to print it for debugging. It is planned to do this later on, as part of e.g. dccp_probe. With this patch Ack Ratio is now under full control of feature negotiation: * Ack Ratio is resolved as a dependency of the selected CCID; * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to the default of 2, following RFC 4340, 11.3 - "New connections start with Ack Ratio 2 for both endpoints"; * what happens then is part of another patch set, since it concerns the dynamic update of Ack Ratio while the connection is in full flight. Thanks to Tomasz Grobelny for discussion leading up to this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 22:55:08 -08:00
Gerrit Renker	2945055984	dccp: Feature negotiation for minimum-checksum-coverage This provides feature negotiation for server minimum checksum coverage which so far has been missing. Since sender/receiver coverage values range only from 0...15, their type has also been reduced in size from u16 to u4. Feature-negotiation options are now generated for both sender and receiver coverage, i.e. when the peer has `forgotten' to enable partial coverage then feature negotiation will automatically enable (negotiate) the partial coverage value for this connection. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 22:53:48 -08:00
Gerrit Renker	49aebc66d6	dccp: Deprecate old setsockopt framework The previous setsockopt interface, which passed socket options via struct dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to ugly code since the old approach did not distinguish between NN and SP values. This patch removes the old setsockopt interface and replaces it with two new functions to register NN/SP values for feature negotiation. These are essentially wrappers around the internal __feat_register functions, with checking added to avoid * wrong usage (type); * changing values while the connection is in progress. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 22:51:23 -08:00
Mark McLoughlin	3f2c31d903	virtio_net: VIRTIO_NET_F_MSG_RXBUF (imprive rcv buffer allocation) If segmentation offload is enabled by the host, we currently allocate maximum sized packet buffers and pass them to the host. This uses up 20 ring entries, allowing us to supply only 20 packet buffers to the host with a 256 entry ring. This is a huge overhead when receiving small packets, and is most keenly felt when receiving MTU sized packets from off-host. The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support using receive buffers which are smaller than the maximum packet size. In order to transfer large packets to the guest, the host merges together multiple receive buffers to form a larger logical buffer. The number of merged buffers is returned to the guest via a field in the virtio_net_hdr. Make use of this support by supplying single page receive buffers to the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of the payload to the skb's linear data buffer and adjust the fragment offset to point to the remaining data. This ensures proper alignment and allows us to not use any paged data for small packets. If the payload occupies multiple pages, we simply append those pages as fragments and free the associated skbs. This scheme allows us to be efficient in our use of ring entries while still supporting large packets. Benchmarking using netperf from an external machine to a guest over a 10Gb/s network shows a 100% improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark with GSO disabled on the host side, throughput was seen to increase from 700Mb/s to 1.7Gb/s. Based on a patch from Herbert Xu. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 22:41:34 -08:00
Eric Dumazet	88ab1932ea	udp: Use hlist_nulls in UDP RCU code This is a straightforward patch, using hlist_nulls infrastructure. RCUification already done on UDP two weeks ago. Using hlist_nulls permits us to avoid some memory barriers, both at lookup time and delete time. Patch is large because it adds new macros to include/net/sock.h. These macros will be used by TCP & DCCP in next patch. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:39:21 -08:00
Eric Dumazet	bbaffaca48	rcu: Introduce hlist_nulls variant of hlist hlist uses NULL value to finish a chain. hlist_nulls variant use the low order bit set to 1 to signal an end-of-list marker. This allows to store many different end markers, so that some RCU lockless algos (used in TCP/UDP stack for example) can save some memory barriers in fast paths. Two new files are added : include/linux/list_nulls.h - mimics hlist part of include/linux/list.h, derived to hlist_nulls variant include/linux/rculist_nulls.h - mimics hlist part of include/linux/rculist.h, derived to hlist_nulls variant Only four helpers are declared for the moment : hlist_nulls_del_init_rcu(), hlist_nulls_del_rcu(), hlist_nulls_add_head_rcu() and hlist_nulls_for_each_entry_rcu() prefetches() were removed, since an end of list is not anymore NULL value. prefetches() could trigger useless (and possibly dangerous) memory transactions. Example of use (extracted from __udp4_lib_lookup()) struct sock sk, result; struct hlist_nulls_node node; unsigned short hnum = ntohs(dport); unsigned int hash = udp_hashfn(net, hnum); struct udp_hslot hslot = &udptable->hash[hash]; int score, badness; rcu_read_lock(); begin: result = NULL; badness = -1; sk_nulls_for_each_rcu(sk, node, &hslot->head) { score = compute_score(sk, net, saddr, hnum, sport, daddr, dport, dif); if (score > badness) { result = sk; badness = score; } } /* * if the nulls value we got at the end of this lookup is * not the expected one, we must restart lookup. * We probably met an item that was moved to another chain. */ if (get_nulls_value(node) != hash) goto begin; if (result) { if (unlikely(!atomic_inc_not_zero(&result->sk_refcnt))) result = NULL; else if (unlikely(compute_score(result, net, saddr, hnum, sport, daddr, dport, dif) < badness)) { sock_put(result); goto begin; } } rcu_read_unlock(); return result; Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:37:55 -08:00
Balazs Scheidler	e8b2dfe9b4	TPROXY: implemented IP_RECVORIGDSTADDR socket option In case UDP traffic is redirected to a local UDP socket, the originally addressed destination address/port cannot be recovered with the in-kernel tproxy. This patch adds an IP_RECVORIGDSTADDR sockopt that enables a IP_ORIGDSTADDR ancillary message in recvmsg(). This ancillary message contains the original destination address/port of the packet being received. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:32:39 -08:00
Paulius Zaleckas	f004f3ea34	phylib: make mdio-gpio work without OF (v4) make mdio-gpio work with non OpenFirmware gpio implementation. Aditional changes to mdio-gpio: - use gpio_request() and gpio_free() - place irq[] array in struct mdio_gpio_info - add module description, author and license - add note about compiling this driver as module - rename mdc and mdio function (were ugly names) - change MII to MDIO in bus name - add __init __exit to module (un)loading functions - probe fails if no phys added to the bus - kzalloc bitbang with sizeof(*bitbang) Changes since v3: - keep bus naming "%x" to be compatible with existing drivers. Changes since v2: - more #ifdefs reduction - platform driver will be registered on OF platforms also - unified platform and OF bus_id to phy%i Changes since v1: - removed NO_IRQ - reduced #idefs Laurent, please test this driver under OF. Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 18:59:45 -08:00
Mathieu Desnoyers	7e066fb870	tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() Impact: API CHANGE. Must update all tracepoint users. Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint structure in a single spot for all the kernel. It helps reducing memory consumption, especially when declaring a lot of tracepoints, e.g. for kmalloc tracing. API CHANGE WARNING: now, DECLARE_TRACE() must be used in headers for tracepoint declarations rather than DEFINE_TRACE(). This is the sane way to do it. The name previously used was misleading. Updates scheduler instrumentation to follow this API change. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:36 +01:00
Mathieu Desnoyers	5f382671de	tracepoints: do not put arguments in name Impact: cleanup That's overkill, takes space. We have a global tracepoint registery in header files anyway. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:34 +01:00
Mathieu Desnoyers	c420970ef4	tracepoints: use unregister return value Impact: bugfix. Unregistering a tracepoint can fail. Return the error value. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:33 +01:00
Mathieu Desnoyers	da7b3eab16	tracepoints: use rcu_*_sched_notrace Make sure tracepoints can be called within ftrace callbacks. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:32 +01:00
Mathieu Desnoyers	a0bca6a59e	markers: create DEFINE_MARKER and GET_MARKER (new API) Impact: new API. Allow markers to be used only for declaration, without function call associated. Useful to create specialized probes. The problem we had is that two function calls were required when one wanted to put a marker in a tracepoint probe. Now the marker can be used simply for trace data type declaration, leaving the trace write work within the tracepoint probe without any additional function call. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:30 +01:00
Mathieu Desnoyers	c1df1bd2c4	markers: auto enable tracepoints (new API : trace_mark_tp()) Impact: new API Add a new API trace_mark_tp(), which declares a marker within a tracepoint probe. When the marker is activated, the tracepoint is automatically enabled. No branch test is used at the marker site, because it would be a duplicate of the branch already present in the tracepoint. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:29 +01:00
Arnaldo Carvalho de Melo	e3f8c4b911	markers: add missing stdargs.h include, needed due to va_list usage Impact: build fix (for future changes) That seemed to cause built issue when marker.h is included early, even though stdargs.h is included in kernel.h. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:27 +01:00
Mathieu Desnoyers	954e100d22	rcu: add rcu_read__sched_notrace() Impact: new API, useful for tracepoints and markers. Add _notrace version to rcu_read__sched(). Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Reviewed-by: Paul E McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 09:01:25 +01:00
Frederic Weisbecker	e7d3737ea1	tracing/function-return-tracer: support for dynamic ftrace on function return tracer This patch adds the support for dynamic tracing on the function return tracer. The whole difference with normal dynamic function tracing is that we don't need to hook on a particular callback. The only pro that we want is to nop or set dynamically the calls to ftrace_caller (which is ftrace_return_caller here). Some security checks ensure that we are not trying to launch dynamic tracing for return tracing while normal function tracing is already running. An example of trace with getnstimeofday set as a filter: ktime_get_ts+0x22/0x50 -> getnstimeofday (2283 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1396 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1825 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1426 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1524 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1434 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1502 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1404 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1397 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1051 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1314 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1344 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1163 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1390 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1374 ns) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 07:57:38 +01:00
Steven Rostedt	31e889098a	ftrace: pass module struct to arch dynamic ftrace functions Impact: allow archs more flexibility on dynamic ftrace implementations Dynamic ftrace has largly been developed on x86. Since x86 does not have the same limitations as other architectures, the ftrace interaction between the generic code and the architecture specific code was not flexible enough to handle some of the issues that other architectures have. Most notably, module trampolines. Due to the limited branch distance that archs make in calling kernel core code from modules, the module load code must create a trampoline to jump to what will make the larger jump into core kernel code. The problem arises when this happens to a call to mcount. Ftrace checks all code before modifying it and makes sure the current code is what it expects. Right now, there is not enough information to handle modifying module trampolines. This patch changes the API between generic dynamic ftrace code and the arch dependent code. There is now two functions for modifying code: ftrace_make_nop(mod, rec, addr) - convert the code at rec->ip into a nop, where the original text is calling addr. (mod is the module struct if called by module init) ftrace_make_caller(rec, addr) - convert the code rec->ip that should be a nop into a caller to addr. The record "rec" now has a new field called "arch" where the architecture can add any special attributes to each call site record. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-16 07:36:02 +01:00
Linus Torvalds	b42ccbc521	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: don't grab devices with no input HID: fix radio-mr800 hidquirks HID: fix kworld fm700 radio hidquirks HID: fix start/stop cycle in usbhid driver HID: use single threaded work queue for hid_compat HID: map macbook keys for "Expose" and "Dashboard" HID: support for new unibody macbooks HID: fix locking in hidraw_open()	2008-11-15 19:02:48 -08:00
Al Viro	8f7b0ba1c8	Fix inotify watch removal/umount races Inotify watch removals suck violently. To kick the watch out we need (in this order) inode->inotify_mutex and ih->mutex. That's fine if we have a hold on inode; however, for all other cases we need to make damn sure we don't race with umount. We can NOT just grab a reference to a watch - inotify_unmount_inodes() will happily sail past it and we'll end with reference to inode potentially outliving its superblock. Ideally we just want to grab an active reference to superblock if we can; that will make sure we won't go into inotify_umount_inodes() until we are done. Cleanup is just deactivate_super(). However, that leaves a messy case - what if we are racing with umount() and active references to superblock can't be acquired anymore? We can bump ->s_count, grab ->s_umount, which will almost certainly wait until the superblock is shut down and the watch in question is pining for fjords. That's fine, but there is a problem - we might have hit the window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e. the moment when superblock is past the point of no return and is heading for shutdown) and the moment when deactivate_super() acquires ->s_umount. We could just do drop_super() yield() and retry, but that's rather antisocial and this stuff is luser-triggerable. OTOH, having grabbed ->s_umount and having found that we'd got there first (i.e. that ->s_root is non-NULL) we know that we won't race with inotify_umount_inodes(). So we could grab a reference to watch and do the rest as above, just with drop_super() instead of deactivate_super(), right? Wrong. We had to drop ih->mutex before we could grab ->s_umount. So the watch could've been gone already. That still can be dealt with - we need to save watch->wd, do idr_find() and compare its result with our pointer. If they match, we either have the damn thing still alive or we'd lost not one but two races at once, the watch had been killed and a new one got created with the same ->wd at the same address. That couldn't have happened in inotify_destroy(), but inotify_rm_wd() could run into that. Still, "new one got created" is not a problem - we have every right to kill it or leave it alone, whatever's more convenient. So we can use idr_find(...) == watch && watch->inode->i_sb == sb as "grab it and kill it" check. If it's been our original watch, we are fine, if it's a newcomer - nevermind, just pretend that we'd won the race and kill the fscker anyway; we are safe since we know that its superblock won't be going away. And yes, this is far beyond mere "not very pretty"; so's the entire concept of inotify to start with. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-15 12:26:44 -08:00
Linus Torvalds	537a2f889a	Merge branch 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: serial: sh-sci: Reorder the SCxTDR write after the TDxE clear. sh: __copy_user function can corrupt the stack in case of exception sh: Fixed the TMU0 reload value on resume sh: Don't factor in PAGE_OFFSET for valid_phys_addr_range() check. sh: early printk port type fix i2c: fix i2c-sh_mobile rx underrun sh: Provide a sane valid_phys_addr_range() to prevent TLB reset with PMB. usb: r8a66597-hcd: fix wrong data access in SuperH on-chip USB fix sci type for SH7723 serial: sh-sci: fix cannot work SH7723 SCIFA sh: Handle fixmap TLB eviction more coherently.	2008-11-15 12:10:32 -08:00
Martin Schwidefsky	d091c2f58b	Add 'pr_fmt()' format modifier to pr_xyz macros. A common reason for device drivers to implement their own printk macros is the lack of a printk prefix with the standard pr_xyz macros. Introduce a pr_fmt() macro that is applied for every pr_xyz macro to the format string. The most common use of the pr_fmt macro would be to add the name of the device driver to all pr_xyz messages in a source file. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-15 11:43:37 -08:00
Ingo Molnar	e8f6fbf62d	lockdep: include/linux/lockdep.h - fix warning in net/bluetooth/af_bluetooth.c fix this warning: net/bluetooth/af_bluetooth.c:60: warning: ‘bt_key_strings’ defined but not used net/bluetooth/af_bluetooth.c:71: warning: ‘bt_slock_key_strings’ defined but not used this is a lockdep macro problem in the !LOCKDEP case. We cannot convert it to an inline because the macro works on multiple types, but we can mark the parameter used. [ also clean up a misaligned tab in sock_lock_init_class_and_name() ] [ also remove #ifdefs from around af_family_clock_key strings - which were certainly added to get rid of the ugly build warnings. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-13 23:19:10 -08:00
James Morris	2b82892565	Merge branch 'master' into next Conflicts: security/keys/internal.h security/keys/process_keys.c security/keys/request_key.c Fixed conflicts above by using the non 'tsk' versions. Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 11:29:12 +11:00
David Howells	3a3b7ce933	CRED: Allow kernel services to override LSM settings for task actions Allow kernel services to override LSM settings appropriate to the actions performed by a task by duplicating a set of credentials, modifying it and then using task_struct::cred to point to it when performing operations on behalf of a task. This is used, for example, by CacheFiles which has to transparently access the cache on behalf of a process that thinks it is doing, say, NFS accesses with a potentially inappropriate (with respect to accessing the cache) set of credentials. This patch provides two LSM hooks for modifying a task security record: () security_kernel_act_as() which allows modification of the security datum with which a task acts on other objects (most notably files). () security_kernel_create_files_as() which allows modification of the security datum that is used to initialise the security data on a file that a task creates. The patch also provides four new credentials handling functions, which wrap the LSM functions: (1) prepare_kernel_cred() Prepare a set of credentials for a kernel service to use, based either on a daemon's credentials or on init_cred. All the keyrings are cleared. (2) set_security_override() Set the LSM security ID in a set of credentials to a specific security context, assuming permission from the LSM policy. (3) set_security_override_from_ctx() As (2), but takes the security context as a string. (4) set_create_files_as() Set the file creation LSM security ID in a set of credentials to be the same as that on a particular inode. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> [Smack changes] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:28 +11:00
David Howells	3b11a1dece	CRED: Differentiate objective and effective subjective credentials on a task Differentiate the objective and real subjective credentials from the effective subjective credentials on a task by introducing a second credentials pointer into the task_struct. task_struct::real_cred then refers to the objective and apparent real subjective credentials of a task, as perceived by the other tasks in the system. task_struct::cred then refers to the effective subjective credentials of a task, as used by that task when it's actually running. These are not visible to the other tasks in the system. __task_cred(task) then refers to the objective/real credentials of the task in question. current_cred() refers to the effective subjective credentials of the current task. prepare_creds() uses the objective creds as a base and commit_creds() changes both pointers in the task_struct (indeed commit_creds() requires them to be the same). override_creds() and revert_creds() change the subjective creds pointer only, and the former returns the old subjective creds. These are used by NFSD, faccessat() and do_coredump(), and will by used by CacheFiles. In SELinux, current_has_perm() is provided as an alternative to task_has_perm(). This uses the effective subjective context of current, whereas task_has_perm() uses the objective/real context of the subject. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:26 +11:00
David Howells	98870ab0a5	CRED: Documentation Document credentials and the new credentials API. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:26 +11:00
David Howells	d76b0d9b2d	CRED: Use creds in file structs Attach creds to file structs and discard f_uid/f_gid. file_operations::open() methods (such as hppfs_open()) should use file->f_cred rather than current_cred(). At the moment file->f_cred will be current_cred() at this point. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:25 +11:00
David Howells	a6f76f23d2	CRED: Make execve() take advantage of copy-on-write credentials Make execve() take advantage of copy-on-write credentials, allowing it to set up the credentials in advance, and then commit the whole lot after the point of no return. This patch and the preceding patches have been tested with the LTP SELinux testsuite. This patch makes several logical sets of alteration: (1) execve(). The credential bits from struct linux_binprm are, for the most part, replaced with a single credentials pointer (bprm->cred). This means that all the creds can be calculated in advance and then applied at the point of no return with no possibility of failure. I would like to replace bprm->cap_effective with: cap_isclear(bprm->cap_effective) but this seems impossible due to special behaviour for processes of pid 1 (they always retain their parent's capability masks where normally they'd be changed - see cap_bprm_set_creds()). The following sequence of events now happens: (a) At the start of do_execve, the current task's cred_exec_mutex is locked to prevent PTRACE_ATTACH from obsoleting the calculation of creds that we make. (a) prepare_exec_creds() is then called to make a copy of the current task's credentials and prepare it. This copy is then assigned to bprm->cred. This renders security_bprm_alloc() and security_bprm_free() unnecessary, and so they've been removed. (b) The determination of unsafe execution is now performed immediately after (a) rather than later on in the code. The result is stored in bprm->unsafe for future reference. (c) prepare_binprm() is called, possibly multiple times. (i) This applies the result of set[ug]id binaries to the new creds attached to bprm->cred. Personality bit clearance is recorded, but now deferred on the basis that the exec procedure may yet fail. (ii) This then calls the new security_bprm_set_creds(). This should calculate the new LSM and capability credentials into bprm->cred. This folds together security_bprm_set() and parts of security_bprm_apply_creds() (these two have been removed). Anything that might fail must be done at this point. (iii) bprm->cred_prepared is set to 1. bprm->cred_prepared is 0 on the first pass of the security calculations, and 1 on all subsequent passes. This allows SELinux in (ii) to base its calculations only on the initial script and not on the interpreter. (d) flush_old_exec() is called to commit the task to execution. This performs the following steps with regard to credentials: (i) Clear pdeath_signal and set dumpable on certain circumstances that may not be covered by commit_creds(). (ii) Clear any bits in current->personality that were deferred from (c.i). (e) install_exec_creds() [compute_creds() as was] is called to install the new credentials. This performs the following steps with regard to credentials: (i) Calls security_bprm_committing_creds() to apply any security requirements, such as flushing unauthorised files in SELinux, that must be done before the credentials are changed. This is made up of bits of security_bprm_apply_creds() and security_bprm_post_apply_creds(), both of which have been removed. This function is not allowed to fail; anything that might fail must have been done in (c.ii). (ii) Calls commit_creds() to apply the new credentials in a single assignment (more or less). Possibly pdeath_signal and dumpable should be part of struct creds. (iii) Unlocks the task's cred_replace_mutex, thus allowing PTRACE_ATTACH to take place. (iv) Clears The bprm->cred pointer as the credentials it was holding are now immutable. (v) Calls security_bprm_committed_creds() to apply any security alterations that must be done after the creds have been changed. SELinux uses this to flush signals and signal handlers. (f) If an error occurs before (d.i), bprm_free() will call abort_creds() to destroy the proposed new credentials and will then unlock cred_replace_mutex. No changes to the credentials will have been made. (2) LSM interface. A number of functions have been changed, added or removed: () security_bprm_alloc(), ->bprm_alloc_security() () security_bprm_free(), ->bprm_free_security() Removed in favour of preparing new credentials and modifying those. () security_bprm_apply_creds(), ->bprm_apply_creds() () security_bprm_post_apply_creds(), ->bprm_post_apply_creds() Removed; split between security_bprm_set_creds(), security_bprm_committing_creds() and security_bprm_committed_creds(). () security_bprm_set(), ->bprm_set_security() Removed; folded into security_bprm_set_creds(). () security_bprm_set_creds(), ->bprm_set_creds() New. The new credentials in bprm->creds should be checked and set up as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the second and subsequent calls. () security_bprm_committing_creds(), ->bprm_committing_creds() (*) security_bprm_committed_creds(), ->bprm_committed_creds() New. Apply the security effects of the new credentials. This includes closing unauthorised files in SELinux. This function may not fail. When the former is called, the creds haven't yet been applied to the process; when the latter is called, they have. The former may access bprm->cred, the latter may not. (3) SELinux. SELinux has a number of changes, in addition to those to support the LSM interface changes mentioned above: (a) The bprm_security_struct struct has been removed in favour of using the credentials-under-construction approach. (c) flush_unauthorized_files() now takes a cred pointer and passes it on to inode_has_perm(), file_has_perm() and dentry_open(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:24 +11:00
David Howells	d84f4f992c	CRED: Inaugurate COW credentials Inaugurate copy-on-write credentials management. This uses RCU to manage the credentials pointer in the task_struct with respect to accesses by other tasks. A process may only modify its own credentials, and so does not need locking to access or modify its own credentials. A mutex (cred_replace_mutex) is added to the task_struct to control the effect of PTRACE_ATTACHED on credential calculations, particularly with respect to execve(). With this patch, the contents of an active credentials struct may not be changed directly; rather a new set of credentials must be prepared, modified and committed using something like the following sequence of events: struct cred new = prepare_creds(); int ret = blah(new); if (ret < 0) { abort_creds(new); return ret; } return commit_creds(new); There are some exceptions to this rule: the keyrings pointed to by the active credentials may be instantiated - keyrings violate the COW rule as managing COW keyrings is tricky, given that it is possible for a task to directly alter the keys in a keyring in use by another task. To help enforce this, various pointers to sets of credentials, such as those in the task_struct, are declared const. The purpose of this is compile-time discouragement of altering credentials through those pointers. Once a set of credentials has been made public through one of these pointers, it may not be modified, except under special circumstances: (1) Its reference count may incremented and decremented. (2) The keyrings to which it points may be modified, but not replaced. The only safe way to modify anything else is to create a replacement and commit using the functions described in Documentation/credentials.txt (which will be added by a later patch). This patch and the preceding patches have been tested with the LTP SELinux testsuite. This patch makes several logical sets of alteration: (1) execve(). This now prepares and commits credentials in various places in the security code rather than altering the current creds directly. (2) Temporary credential overrides. do_coredump() and sys_faccessat() now prepare their own credentials and temporarily override the ones currently on the acting thread, whilst preventing interference from other threads by holding cred_replace_mutex on the thread being dumped. This will be replaced in a future patch by something that hands down the credentials directly to the functions being called, rather than altering the task's objective credentials. (3) LSM interface. A number of functions have been changed, added or removed: () security_capset_check(), ->capset_check() () security_capset_set(), ->capset_set() Removed in favour of security_capset(). () security_capset(), ->capset() New. This is passed a pointer to the new creds, a pointer to the old creds and the proposed capability sets. It should fill in the new creds or return an error. All pointers, barring the pointer to the new creds, are now const. () security_bprm_apply_creds(), ->bprm_apply_creds() Changed; now returns a value, which will cause the process to be killed if it's an error. () security_task_alloc(), ->task_alloc_security() Removed in favour of security_prepare_creds(). () security_cred_free(), ->cred_free() New. Free security data attached to cred->security. () security_prepare_creds(), ->cred_prepare() New. Duplicate any security data attached to cred->security. () security_commit_creds(), ->cred_commit() New. Apply any security effects for the upcoming installation of new security by commit_creds(). () security_task_post_setuid(), ->task_post_setuid() Removed in favour of security_task_fix_setuid(). () security_task_fix_setuid(), ->task_fix_setuid() Fix up the proposed new credentials for setuid(). This is used by cap_set_fix_setuid() to implicitly adjust capabilities in line with setuid() changes. Changes are made to the new credentials, rather than the task itself as in security_task_post_setuid(). () security_task_reparent_to_init(), ->task_reparent_to_init() Removed. Instead the task being reparented to init is referred directly to init's credentials. NOTE! This results in the loss of some state: SELinux's osid no longer records the sid of the thread that forked it. () security_key_alloc(), ->key_alloc() () security_key_permission(), ->key_permission() Changed. These now take cred pointers rather than task pointers to refer to the security context. (4) sys_capset(). This has been simplified and uses less locking. The LSM functions it calls have been merged. (5) reparent_to_kthreadd(). This gives the current thread the same credentials as init by simply using commit_thread() to point that way. (6) __sigqueue_alloc() and switch_uid() __sigqueue_alloc() can't stop the target task from changing its creds beneath it, so this function gets a reference to the currently applicable user_struct which it then passes into the sigqueue struct it returns if successful. switch_uid() is now called from commit_creds(), and possibly should be folded into that. commit_creds() should take care of protecting __sigqueue_alloc(). (7) [sg]et[ug]id() and co and [sg]et_current_groups. The set functions now all use prepare_creds(), commit_creds() and abort_creds() to build and check a new set of credentials before applying it. security_task_set[ug]id() is called inside the prepared section. This guarantees that nothing else will affect the creds until we've finished. The calling of set_dumpable() has been moved into commit_creds(). Much of the functionality of set_user() has been moved into commit_creds(). The get functions all simply access the data directly. (8) security_task_prctl() and cap_task_prctl(). security_task_prctl() has been modified to return -ENOSYS if it doesn't want to handle a function, or otherwise return the return value directly rather than through an argument. Additionally, cap_task_prctl() now prepares a new set of credentials, even if it doesn't end up using it. (9) Keyrings. A number of changes have been made to the keyrings code: (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have all been dropped and built in to the credentials functions directly. They may want separating out again later. (b) key_alloc() and search_process_keyrings() now take a cred pointer rather than a task pointer to specify the security context. (c) copy_creds() gives a new thread within the same thread group a new thread keyring if its parent had one, otherwise it discards the thread keyring. (d) The authorisation key now points directly to the credentials to extend the search into rather pointing to the task that carries them. (e) Installing thread, process or session keyrings causes a new set of credentials to be created, even though it's not strictly necessary for process or session keyrings (they're shared). (10) Usermode helper. The usermode helper code now carries a cred struct pointer in its subprocess_info struct instead of a new session keyring pointer. This set of credentials is derived from init_cred and installed on the new process after it has been cloned. call_usermodehelper_setup() allocates the new credentials and call_usermodehelper_freeinfo() discards them if they haven't been used. A special cred function (prepare_usermodeinfo_creds()) is provided specifically for call_usermodehelper_setup() to call. call_usermodehelper_setkeys() adjusts the credentials to sport the supplied keyring as the new session keyring. (11) SELinux. SELinux has a number of changes, in addition to those to support the LSM interface changes mentioned above: (a) selinux_setprocattr() no longer does its check for whether the current ptracer can access processes with the new SID inside the lock that covers getting the ptracer's SID. Whilst this lock ensures that the check is done with the ptracer pinned, the result is only valid until the lock is released, so there's no point doing it inside the lock. (12) is_single_threaded(). This function has been extracted from selinux_setprocattr() and put into a file of its own in the lib/ directory as join_session_keyring() now wants to use it too. The code in SELinux just checked to see whether a task shared mm_structs with other tasks (CLONE_VM), but that isn't good enough. We really want to know if they're part of the same thread group (CLONE_THREAD). (13) nfsd. The NFS server daemon now has to use the COW credentials to set the credentials it is going to use. It really needs to pass the credentials down to the functions it calls, but it can't do that until other patches in this series have been applied. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:23 +11:00
David Howells	745ca2475a	CRED: Pass credentials through dentry_open() Pass credentials through dentry_open() so that the COW creds patch can have SELinux's flush_unauthorized_files() pass the appropriate creds back to itself when it opens its null chardev. The security_dentry_open() call also now takes a creds pointer, as does the dentry_open hook in struct security_operations. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:22 +11:00
David Howells	bb952bb98a	CRED: Separate per-task-group keyrings from signal_struct Separate per-task-group keyrings from signal_struct and dangle their anchor from the cred struct rather than the signal_struct. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:20 +11:00
David Howells	c69e8d9c01	CRED: Use RCU to access another task's creds and to release a task's own creds Use RCU to access another task's creds and to release a task's own creds. This means that it will be possible for the credentials of a task to be replaced without another task (a) requiring a full lock to read them, and (b) seeing deallocated memory. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:19 +11:00
David Howells	86a264abe5	CRED: Wrap current->cred and a few other accessors Wrap current->cred and a few other accessors to hide their actual implementation. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:18 +11:00
David Howells	f1752eec61	CRED: Detach the credentials from task_struct Detach the credentials from task_struct, duplicating them in copy_process() and releasing them in __put_task_struct(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:17 +11:00
David Howells	b6dff3ec5e	CRED: Separate task security context from task_struct Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:16 +11:00
David Howells	15a2460ed0	CRED: Constify the kernel_cap_t arguments to the capset LSM hooks Constify the kernel_cap_t arguments to the capset LSM hooks. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:15 +11:00
David Howells	1cdcbec1a3	CRED: Neuter sys_capset() Take away the ability for sys_capset() to affect processes other than current. This means that current will not need to lock its own credentials when reading them against interference by other processes. This has effectively been the case for a while anyway, since: (1) Without LSM enabled, sys_capset() is disallowed. (2) With file-based capabilities, sys_capset() is neutered. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:14 +11:00
David Howells	8bbf4976b5	KEYS: Alter use of key instantiation link-to-keyring argument Alter the use of the key instantiation and negation functions' link-to-keyring arguments. Currently this specifies a keyring in the target process to link the key into, creating the keyring if it doesn't exist. This, however, can be a problem for copy-on-write credentials as it means that the instantiating process can alter the credentials of the requesting process. This patch alters the behaviour such that: (1) If keyctl_instantiate_key() or keyctl_negate_key() are given a specific keyring by ID (ringid >= 0), then that keyring will be used. (2) If keyctl_instantiate_key() or keyctl_negate_key() are given one of the special constants that refer to the requesting process's keyrings (KEY_SPEC_*_KEYRING, all <= 0), then: (a) If sys_request_key() was given a keyring to use (destringid) then the key will be attached to that keyring. (b) If sys_request_key() was given a NULL keyring, then the key being instantiated will be attached to the default keyring as set by keyctl_set_reqkey_keyring(). (3) No extra link will be made. Decision point (1) follows current behaviour, and allows those instantiators who've searched for a specifically named keyring in the requestor's keyring so as to partition the keys by type to still have their named keyrings. Decision point (2) allows the requestor to make sure that the key or keys that get produced by request_key() go where they want, whilst allowing the instantiator to request that the key is retained. This is mainly useful for situations where the instantiator makes a secondary request, the key for which should be retained by the initial requestor: +-----------+ +--------------+ +--------------+ \| \| \| \| \| \| \| Requestor \|------->\| Instantiator \|------->\| Instantiator \| \| \| \| \| \| \| +-----------+ +--------------+ +--------------+ request_key() request_key() This might be useful, for example, in Kerberos, where the requestor requests a ticket, and then the ticket instantiator requests the TGT, which someone else then has to go and fetch. The TGT, however, should be retained in the keyrings of the requestor, not the first instantiator. To make this explict an extra special keyring constant is also added. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:14 +11:00
David Howells	e9e349b051	KEYS: Disperse linux/key_ui.h Disperse the bits of linux/key_ui.h as the reason they were put here (keyfs) didn't get in. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:13 +11:00
David Howells	da9592edeb	CRED: Wrap task credential accesses in the filesystem subsystem Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(\|e\|s\|fs)[ug]id to current_(\|e\|s\|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:05 +11:00
Alan Stern	352d026338	USB: don't register endpoints for interfaces that are going away This patch (as1155) fixes a bug in usbcore. When interfaces are deleted, either because the device was disconnected or because of a configuration change, the extra attribute files and child endpoint devices may get left behind. This is because the core removes them before calling device_del(). But during device_del(), after the driver is unbound the core will reinstall altsetting 0 and recreate those extra attributes and children. The patch prevents this by adding a flag to record when the interface is in the midst of being unregistered. When the flag is set, the attribute files and child devices will not be created. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@kernel.org> [2.6.27, 2.6.26, 2.6.25] Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-11-13 14:45:00 -08:00
Peter Zijlstra	d7de4c1dc3	slab: document SLAB_DESTROY_BY_RCU Explain this SLAB_DESTROY_BY_RCU thing... [hugh@veritas.com: add a pointer to comment in mm/slab.c] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-11-13 20:49:02 +02:00
Henrik Rydberg	437184ae8b	HID: map macbook keys for "Expose" and "Dashboard" On macbooks there are specific keys for the user-space functions Expose and Dashboard, which currently has no counterpart in input.h. This patch adds KEY_SCALE and KEY_DASHBOARD, and maps the keyboard accordingly. Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-11-13 10:31:36 +01:00
Rodolfo Giometti	4e17e1db96	Add c2 port support C2port implements a two wire serial communication protocol (bit banging) designed to enable in-system programming, debugging, and boundary-scan testing on low pin-count Silicon Labs devices. Currently this code supports only flash programming through sysfs interface but extensions shoud be easy to add. Signed-off-by: Rodolfo Giometti <giometti@linux.it> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-12 17:17:18 -08:00
Mark Brown	077eaf5b40	rtc: rtc-wm8350: add support for WM8350 RTC This adds support for the RTC provided by the Wolfson Microelectronics WM8350. This driver was originally written by Graeme Gregory and Liam Girdwood, though it has been modified since then to update it to current mainline coding standards and for API completeness. [akpm@linux-foundation.org: s/schedule_timeout_interruptible/schedule_timeout_uninterruptible/ to prevent bogus timeout when signal_pending()] Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: Liam Girdwood <linux@wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-12 17:17:18 -08:00
Andrew Morton	b76f90b526	remove ratelimt() It mistakenly assumes that a static local in an inlined function is a kernel-wide singleton. It also has no callers, so let's remove it. Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-12 17:17:17 -08:00
Steven Rostedt	2ed84eeb88	trace: rename unlikely profiler to branch profiler Impact: name change of unlikely tracer and profiler Ingo Molnar suggested changing the config from UNLIKELY_PROFILE to BRANCH_PROFILING. I never did like the "unlikely" name so I went one step farther, and renamed all the unlikely configurations to a "BRANCH" variant. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-12 22:27:58 +01:00
Linus Torvalds	45a9524a61	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimer: clean up unused callback modes	2008-11-12 12:00:15 -08:00
Linus Torvalds	08c1184fa2	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (47 commits) ACPI: pci_link: remove acpi_irq_balance_set() interface fujitsu-laptop: Add DMI callback for Lifebook S6420 ACPI: EC: Don't do transaction from GPE handler in poll mode. ACPI: EC: lower interrupt storm treshold ACPICA: Use spinlock for acpi_{en\|dis}able_gpe ACPI: EC: restart failed command ACPI: EC: wait for last write gpe ACPI: EC: make kernel messages more useful when GPE storm is detected ACPI: EC: revert msleep patch thinkpad_acpi: fingers off backlight if video.ko is serving this functionality sony-laptop: fingers off backlight if video.ko is serving this functionality msi-laptop: fingers off backlight if video.ko is serving this functionality fujitsu-laptop: fingers off backlight if video.ko is serving this functionality eeepc-laptop: fingers off backlight if video.ko is serving this functionality compal: fingers off backlight if video.ko is serving this functionality asus-acpi: fingers off backlight if video.ko is serving this functionality Acer-WMI: fingers off backlight if video.ko is serving this functionality ACPI video: if no ACPI backlight support, use vendor drivers ACPI: video: Ignore devices that aren't present in hardware Delete an unwanted return statement at evgpe.c ...	2008-11-12 10:24:46 -08:00
Ingo Molnar	eb42c75878	Merge branch 'linus' into x86/crashdump	2008-11-12 15:43:39 +01:00
Ingo Molnar	2b7d0390a6	tracing: branch tracer, fix vdso crash Impact: fix bootup crash the branch tracer missed arch/x86/vdso/vclock_gettime.c from disabling tracing, which caused such bootup crashes: [ 201.840097] init[1]: segfault at 7fffed3fe7c0 ip 00007fffed3fea2e sp 000077 also clean up the ugly ifdefs in arch/x86/kernel/vsyscall_64.c by creating DISABLE_UNLIKELY_PROFILE facility for code to turn off instrumentation on a per file basis. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-12 13:26:38 +01:00
Ingo Molnar	e25cf3db56	lockdep: include/linux/lockdep.h - fix warning in net/bluetooth/af_bluetooth.c fix this warning: net/bluetooth/af_bluetooth.c:60: warning: ‘bt_key_strings’ defined but not used net/bluetooth/af_bluetooth.c:71: warning: ‘bt_slock_key_strings’ defined but not used this is a lockdep macro problem in the !LOCKDEP case. We cannot convert it to an inline because the macro works on multiple types, but we can mark the parameter used. [ also clean up a misaligned tab in sock_lock_init_class_and_name() ] [ also remove #ifdefs from around af_family_clock_key strings - which were certainly added to get rid of the ugly build warnings. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-12 12:39:40 +01:00
Ingo Molnar	708b8eae0f	Merge branch 'linus' into core/locking	2008-11-12 12:39:21 +01:00
Steven Rostedt	1f0d69a9fc	tracing: profile likely and unlikely annotations Impact: new unlikely/likely profiler Andrew Morton recently suggested having an in-kernel way to profile likely and unlikely macros. This patch achieves that goal. When configured, every() likely and unlikely macro gets a counter attached to it. When the condition is hit, the hit and misses of that condition are recorded. These numbers can later be retrieved by: /debugfs/tracing/profile_likely - All likely markers /debugfs/tracing/profile_unlikely - All unlikely markers. # cat /debug/tracing/profile_unlikely \| head correct incorrect % Function File Line ------- --------- - -------- ---- ---- 2167 0 0 do_arch_prctl process_64.c 832 0 0 0 do_arch_prctl process_64.c 804 2670 0 0 IS_ERR err.h 34 71230 5693 7 __switch_to process_64.c 673 76919 0 0 __switch_to process_64.c 639 43184 33743 43 __switch_to process_64.c 624 12740 64181 83 __switch_to process_64.c 594 12740 64174 83 __switch_to process_64.c 590 # cat /debug/tracing/profile_unlikely \| \ awk '{ if ($3 > 25) print $0; }' \|head -20 44963 35259 43 __switch_to process_64.c 624 12762 67454 84 __switch_to process_64.c 594 12762 67447 84 __switch_to process_64.c 590 1478 595 28 syscall_get_error syscall.h 51 0 2821 100 syscall_trace_leave ptrace.c 1567 0 1 100 native_smp_prepare_cpus smpboot.c 1237 86338 265881 75 calc_delta_fair sched_fair.c 408 210410 108540 34 calc_delta_mine sched.c 1267 0 54550 100 sched_info_queued sched_stats.h 222 51899 66435 56 pick_next_task_fair sched_fair.c 1422 6 10 62 yield_task_fair sched_fair.c 982 7325 2692 26 rt_policy sched.c 144 0 1270 100 pre_schedule_rt sched_rt.c 1261 1268 48073 97 pick_next_task_rt sched_rt.c 884 0 45181 100 sched_info_dequeued sched_stats.h 177 0 15 100 sched_move_task sched.c 8700 0 15 100 sched_move_task sched.c 8690 53167 33217 38 schedule sched.c 4457 0 80208 100 sched_info_switch sched_stats.h 270 30585 49631 61 context_switch sched.c 2619 # cat /debug/tracing/profile_likely \| awk '{ if ($3 > 25) print $0; }' 39900 36577 47 pick_next_task sched.c 4397 20824 15233 42 switch_mm mmu_context_64.h 18 0 7 100 __cancel_work_timer workqueue.c 560 617 66484 99 clocksource_adjust timekeeping.c 456 0 346340 100 audit_syscall_exit auditsc.c 1570 38 347350 99 audit_get_context auditsc.c 732 0 345244 100 audit_syscall_entry auditsc.c 1541 38 1017 96 audit_free auditsc.c 1446 0 1090 100 audit_alloc auditsc.c 862 2618 1090 29 audit_alloc auditsc.c 858 0 6 100 move_masked_irq migration.c 9 1 198 99 probe_sched_wakeup trace_sched_switch.c 58 2 2 50 probe_wakeup trace_sched_wakeup.c 227 0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144 4514 2090 31 __grab_cache_page filemap.c 2149 12882 228786 94 mapping_unevictable pagemap.h 50 4 11 73 __flush_cpu_slab slub.c 1466 627757 330451 34 slab_free slub.c 1731 2959 61245 95 dentry_lru_del_init dcache.c 153 946 1217 56 load_elf_binary binfmt_elf.c 904 102 82 44 disk_put_part genhd.h 206 1 1 50 dst_gc_task dst.c 82 0 19 100 tcp_mss_split_point tcp_output.c 1126 As you can see by the above, there's a bit of work to do in rethinking the use of some unlikelys and likelys. Note: the unlikely case had 71 hits that were more than 25%. Note: After submitting my first version of this patch, Andrew Morton showed me a version written by Daniel Walker, where I picked up the following ideas from: 1) Using __builtin_constant_p to avoid profiling fixed values. 2) Using __FILE__ instead of instruction pointers. 3) Using the preprocessor to stop all profiling of likely annotations from vsyscall_64.c. Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar for their feed back on this patch. () Not ever unlikely is recorded, those that are used by vsyscalls (a few of them) had to have profiling disabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-12 11:52:02 +01:00
James Morris	92a77aac98	security: remove broken and useless declarations Remove broken declarations for security_capable* functions, which were not needed anyway. Signed-off-by: James Morris <jmorris@namei.org>	2008-11-12 21:20:00 +11:00
Frederic Weisbecker	3f5ec13696	tracing/fastboot: move boot tracer structs and funcs into their own header. Impact: Cleanups on the boot tracer and ftrace This patch bring some cleanups about the boot tracer headers. The functions and structures of this tracer have nothing related to ftrace and should have so their own header file. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-12 10:17:18 +01:00
Ingo Molnar	60a011c736	Merge branch 'tracing/function-return-tracer' into tracing/fastboot	2008-11-12 10:17:09 +01:00
Ingo Molnar	d06bbd6695	Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core Conflicts: kernel/trace/ring_buffer.c	2008-11-12 10:11:37 +01:00
Peter Zijlstra	621a0d5207	hrtimer: clean up unused callback modes Impact: cleanup git grep HRTIMER_CB_IRQSAFE revealed half the callback modes are actually unused. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-12 09:54:40 +01:00
Gerrit Renker	d90ebcbfa7	dccp: Query supported CCIDs This provides a data structure to record which CCIDs are locally supported and three accessor functions: - a test function for internal use which is used to validate CCID requests made by the user; - a copy function so that the list can be used for feature-negotiation; - documented getsockopt() support so that the user can query capabilities. The data structure is a table which is filled in at compile-time with the list of available CCIDs (which in turn depends on the Kconfig choices). Using the copy function for cloning the list of supported CCIDs is useful for feature negotiation, since the negotiation is now with the full list of available CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation will not fail if the peer requests to use CCID3 instead of CCID2. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:47:26 -08:00
Yoshihiro Shimoda	1a22f08dbd	serial: sh-sci: fix cannot work SH7723 SCIFA SH7723 has SCIFA. This module is similer SCI register map, but it has FIFO. So this patch adds new type(PORT_SCIFA) and change some type checking. Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-11-12 12:29:56 +09:00
Len Brown	f398778aa3	Merge branch 'video' into release Conflicts: Documentation/kernel-parameters.txt Signed-off-by: Len Brown <len.brown@intel.com>	2008-11-11 21:15:50 -05:00
Len Brown	3e0fe36483	Merge branch 'misc' into release	2008-11-11 21:14:11 -05:00
David S. Miller	7e452baf6b	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/message/fusion/mptlan.c drivers/net/sfc/ethtool.c net/mac80211/debugfs_sta.c	2008-11-11 15:43:02 -08:00
Ingo Molnar	c1e7abbc7a	Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent	2008-11-11 21:34:07 +01:00
Steven Rostedt	a358324466	ring-buffer: buffer record on/off switch Impact: enable/disable ring buffer recording API added Several kernel developers have requested that there be a way to stop recording into the ring buffers with a simple switch that can also be enabled from userspace. This patch addes a new kernel API to the ring buffers called: tracing_on() tracing_off() When tracing_off() is called, all ring buffers will not be able to record into their buffers. tracing_on() will enable the ring buffers again. These two act like an on/off switch. That is, there is no counting of the number of times tracing_off or tracing_on has been called. A new file is added to the debugfs/tracing directory called tracing_on This allows for userspace applications to also flip the switch. echo 0 > debugfs/tracing/tracing_on disables the tracing. echo 1 > /debugfs/tracing/tracing_on enables it. Note, this does not disable or enable any tracers. It only sets or clears a flag that needs to be set in order for the ring buffers to write to their buffers. It is a global flag, and affects all ring buffers. The buffers start out with tracing_on enabled. There are now three flags that control recording into the buffers: tracing_on: which affects all ring buffer tracers. buffer->record_disabled: which affects an allocated buffer, which may be set if an anomaly is detected, and tracing is disabled. cpu_buffer->record_disabled: which is set by tracing_stop() or if an anomaly is detected. tracing_start can not reenable this if an anomaly occurred. The userspace debugfs/tracing/tracing_enabled is implemented with tracing_stop() but the user space code can not enable it if the kernel called tracing_stop(). Userspace can enable the tracing_on even if the kernel disabled it. It is just a switch used to stop tracing if a condition was hit. tracing_on is not for protecting critical areas in the kernel nor is it for stopping tracing if an anomaly occurred. This is because userspace can reenable it at any time. Side effect: With this patch, I discovered a dead variable in ftrace.c called tracing_on. This patch removes it. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2008-11-11 15:02:04 -05:00
Linus Torvalds	2f96cb57cd	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: release buddies on yield fix for account_group_exec_runtime(), make sure ->signal can't be freed under rq->lock sched: clean up debug info	2008-11-11 10:52:25 -08:00
Alan Cox	0906dd9df2	telephony: trivial: fix up email address Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-11 09:30:23 -08:00
Linus Torvalds	0a4cf2c878	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: dsa: fix master interface allmulti/promisc handling dsa: fix skb->pkt_type when mac address of slave interface differs net: fix setting of skb->tail in skb_recycle_check() net: fix /proc/net/snmp as memory corruptor mac80211: fix a buffer overrun in station debug code netfilter: payload_len is be16, add size of struct rather than size of pointer ipv6: fix ip6_mr_init error path [4/4] dca: fixup initialization dependency [3/4] I/OAT: fix async_tx.callback checking [2/4] I/OAT: fix dma_pin_iovec_pages() error handling [1/4] I/OAT: fix channel resources free for not allocated channels ssb: Fix DMA-API compilation for non-PCI systems SSB: hide empty sub menu vlan: Fix typos in proc output string [netdrvr] usb/hso: Cleanup rfkill error handling sfc: Correct address of gPXE boot configuration in EEPROM el3_common_init() should be __devinit, not __init hso: rfkill type should be WWAN mlx4_en: Start port error flow bug fix af_key: mark policy as dead before destroying	2008-11-11 09:20:29 -08:00
Dhaval Giani	50ee91765e	sched/rt: removed unneeded defintion Impact: cleanup This function no longer exists, so remove the defintion. Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-11 13:53:13 +01:00
Eric Paris	06112163f5	Add a new capable interface that will be used by systems that use audit to make an A or B type decision instead of a security decision. Currently this is the case at least for filesystems when deciding if a process can use the reserved 'root' blocks and for the case of things like the oom algorithm determining if processes are root processes and should be less likely to be killed. These types of security system requests should not be audited or logged since they are not really security decisions. It would be possible to solve this problem like the vm_enough_memory security check did by creating a new LSM interface and moving all of the policy into that interface but proves the needlessly bloat the LSM and provide complex indirection. This merely allows those decisions to be made where they belong and to not flood logs or printk with denials for thing that are not security decisions. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-11 22:02:50 +11:00
Eric Paris	e68b75a027	When the capset syscall is used it is not possible for audit to record the actual capbilities being added/removed. This patch adds a new record type which emits the target pid and the eff, inh, and perm cap sets. example output if you audit capset syscalls would be: type=SYSCALL msg=audit(1225743140.465:76): arch=c000003e syscall=126 success=yes exit=0 a0=17f2014 a1=17f201c a2=80000000 a3=7fff2ab7f060 items=0 ppid=2160 pid=2223 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="setcap" exe="/usr/sbin/setcap" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=UNKNOWN[1322] msg=audit(1225743140.465:76): pid=0 cap_pi=ffffffffffffffff cap_pp=ffffffffffffffff cap_pe=ffffffffffffffff Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-11 21:48:22 +11:00
Eric Paris	3fc689e96c	Any time fcaps or a setuid app under SECURE_NOROOT is used to result in a non-zero pE we will crate a new audit record which contains the entire set of known information about the executable in question, fP, fI, fE, fversion and includes the process's pE, pI, pP. Before and after the bprm capability are applied. This record type will only be emitted from execve syscalls. an example of making ping use fcaps instead of setuid: setcap "cat_net_raw+pe" /bin/ping type=SYSCALL msg=audit(1225742021.015:236): arch=c000003e syscall=59 success=yes exit=0 a0=1457f30 a1=14606b0 a2=1463940 a3=321b770a70 items=2 ppid=2929 pid=2963 auid=0 uid=500 gid=500 euid=500 suid=500 fsuid=500 egid=500 sgid=500 fsgid=500 tty=pts0 ses=3 comm="ping" exe="/bin/ping" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=UNKNOWN[1321] msg=audit(1225742021.015:236): fver=2 fp=0000000000002000 fi=0000000000000000 fe=1 old_pp=0000000000000000 old_pi=0000000000000000 old_pe=0000000000000000 new_pp=0000000000002000 new_pi=0000000000000000 new_pe=0000000000002000 type=EXECVE msg=audit(1225742021.015:236): argc=2 a0="ping" a1="127.0.0.1" type=CWD msg=audit(1225742021.015:236): cwd="/home/test" type=PATH msg=audit(1225742021.015:236): item=0 name="/bin/ping" inode=49256 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:ping_exec_t:s0 cap_fp=0000000000002000 cap_fe=1 cap_fver=2 type=PATH msg=audit(1225742021.015:236): item=1 name=(null) inode=507915 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:ld_so_t:s0 Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-11 21:48:18 +11:00
Eric Paris	851f7ff56d	This patch will print cap_permitted and cap_inheritable data in the PATH records of any file that has file capabilities set. Files which do not have fcaps set will not have different PATH records. An example audit record if you run: setcap "cap_net_admin+pie" /bin/bash /bin/bash type=SYSCALL msg=audit(1225741937.363:230): arch=c000003e syscall=59 success=yes exit=0 a0=2119230 a1=210da30 a2=20ee290 a3=8 items=2 ppid=2149 pid=2923 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=3 comm="ping" exe="/bin/ping" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=EXECVE msg=audit(1225741937.363:230): argc=2 a0="ping" a1="www.google.com" type=CWD msg=audit(1225741937.363:230): cwd="/root" type=PATH msg=audit(1225741937.363:230): item=0 name="/bin/ping" inode=49256 dev=fd:00 mode=0104755 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:ping_exec_t:s0 cap_fp=0000000000002000 cap_fi=0000000000002000 cap_fe=1 cap_fver=2 type=PATH msg=audit(1225741937.363:230): item=1 name=(null) inode=507915 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:ld_so_t:s0 Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-11 21:48:14 +11:00
Eric Paris	c0b004413a	This patch add a generic cpu endian caps structure and externally available functions which retrieve fcaps information from disk. This information is necessary so fcaps information can be collected and recorded by the audit system. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-11 21:48:10 +11:00
Eric Paris	9d36be76c5	Document the order of arguments for cap_issubset. It's not instantly clear which order the argument should be in. So give an example. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-11 21:48:07 +11:00
Frederic Weisbecker	caf4b323b0	tracing, x86: add low level support for ftrace return tracing Impact: add infrastructure for function-return tracing Add low level support for ftrace return tracing. This plug-in stores return addresses on the thread_info structure of the current task. The index of the current return address is initialized when the task is the first one (init) and when a process forks (the child). It is not needed when a task does a sys_execve because after this syscall, it still needs to return on the kernel functions it called. Note that the code of return_to_handler has been suggested by Steven Rostedt as almost all of the ideas of improvements in this V3. For purpose of security, arch/x86/kernel/process_32.c is not traced because __switch_to() changes the current task during its execution. That could cause inconsistency in the stored return address of this function even if I didn't have any crash after testing with tracing on this function enabled. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-11 10:29:11 +01:00
Ingo Molnar	e0cb4ebcd9	Merge branch 'tracing/urgent' into tracing/ftrace Conflicts: kernel/trace/trace.c	2008-11-11 09:40:18 +01:00
Oleg Nesterov	ad474caca3	fix for account_group_exec_runtime(), make sure ->signal can't be freed under rq->lock Impact: fix hang/crash on ia64 under high load This is ugly, but the simplest patch by far. Unlike other similar routines, account_group_exec_runtime() could be called "implicitly" from within scheduler after exit_notify(). This means we can race with the parent doing release_task(), we can't just check ->signal != NULL. Change __exit_signal() to do spin_unlock_wait(&task_rq(tsk)->lock) before __cleanup_signal() to make sure ->signal can't be freed under task_rq(tsk)->lock. Note that task_rq_unlock_wait() doesn't care about the case when tsk changes cpu/rq under us, this should be OK. Thanks to Ingo who nacked my previous buggy patch. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Doug Chapman <doug.chapman@hp.com>	2008-11-11 08:01:43 +01:00
Michael Buesch	fd0fcf5c29	ssb: Fix DMA-API compilation for non-PCI systems This fixes compilation of the SSB DMA-API code on non-PCI platforms. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-10 13:50:19 -08:00
Jouni Malinen	fc6971d491	mac80211_hwsim: Add support for client PS mode This introduces a debugfs file (ieee80211/phy#/hwsim/ps) that can be used to force a simulated radio into power save mode. Following values can be written into this file to change PS mode: 0 = power save disabled (constantly awake) 1 = power save enabled (drop all frames; do not send PS-Poll) 2 = power save enabled (send PS-Poll frames automatically to receive buffered unicast frames); not yet fully implemented 3 = manual PS-Poll trigger (send a single PS-Poll frame) Two different behavior for power save mode processing can be tested: - move between modes 1 and 0 (i.e., receive all buffered frames at a time) - move to mode 1 and use manual PS-Poll frames (write 3 to the 'ps' debugfs file) to fetch power save buffered frames one at a time Mode 2 (automatic PS-Poll) does not yet parse Beacon frames, but eventually, it should take a look at TIM IE and send PS-Poll if a traffic bit is set for our AID. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-10 15:17:41 -05:00
Jouni Malinen	318884875b	nl80211: Add TX queue parameter configuration Add a new attribute, NL80211_ATTR_WIPHY_TXQ_PARAMS, that can be used with NL80211_CMD_SET_WIPHY for userspace (e.g., hostapd) to set TX queue parameters (txop, cwmin, cwmax, aifs). Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-10 15:17:40 -05:00
Jouni Malinen	90c97a040d	nl80211: Add basic rate configuration for AP mode Add a new attribute, NL80211_ATTR_BSS_BASIC_RATES, that can be used with NL80211_CMD_SET_BSS for userspace (e.g., hostapd) to set which rates are in the basic rate set. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-10 15:17:39 -05:00
Johannes Berg	1239cd58d2	wireless: move mesh config length constant This is a constant from the 802.11 specification. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Javier Cardona <javier@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-10 15:10:16 -05:00
Tejun Heo	8a8bc22332	libata: revert convert-to-block-tagging patches This patch reverts the following three commits which convert libata to use block layer tagging. `43a49cbdf3` `e013e13bf6` `2fca5ccf97` Although using block layer tagging is the right direction, due to the tight coupling among tag number, data structure allocation and hardware command slot allocation, libata doesn't work correctly with the current conversion. The biggest problem is guaranteeing that tag 0 is always used for non-NCQ commands. Due to the way blk-tag is implemented and how SCSI starts and finishes requests, such guarantee can't be made. I'm not sure whether this would actually break any low level driver but it doesn't look like a good idea to break such assumption given the frailty of ATA controllers. So, for the time being, keep using the old dumb in-libata qc allocation. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axobe <jens.axboe@oracle.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-10 08:04:47 -08:00
Thomas Gleixner	f6d87f4bd2	genirq: keep affinities set from userspace across free/request_irq() Impact: preserve user-modified affinities on interrupts Kumar Galak noticed that commit `1840475676` (genirq: Expose default irq affinity mask (take 3)) overrides an already set affinity setting across a free / request_irq(). Happens e.g. with ifdown/ifup of a network device. Change the logic to mark the affinities as set and keep them intact. This also fixes the unlocked access to irq_desc in irq_select_affinity() when called from irq_affinity_proc_write() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-09 22:23:49 +01:00
Linus Torvalds	cb56d98e2a	Merge branch 'cpus4096' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: cpumask: introduce new API, without changing anything, v3 cpumask: new API, v2 cpumask: introduce new API, without changing anything	2008-11-09 12:20:56 -08:00
Rusty Russell	984f2f377f	cpumask: introduce new API, without changing anything, v3 Impact: cleanup Clean up based on feedback from Andrew Morton and others: - change to inline functions instead of macros - add __init to bootmem method - add a missing debug check Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-09 21:09:54 +01:00
Nicolas Pitre	058e3739f6	clarify usage expectations for cnt32_to_63() Currently, all existing users of cnt32_to_63() are fine since the CPU architectures where it is used don't do read access reordering, and user mode preemption is disabled already. It is nevertheless a good idea to better elaborate usage requirements wrt preemption, and use an explicit memory barrier on SMP to avoid different CPUs accessing the counter value in the wrong order. On UP a simple compiler barrier is sufficient. Signed-off-by: Nicolas Pitre <nico@marvell.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-09 11:17:33 -08:00
Kay Sievers	d1b2686308	mmc: struct device - replace bus_id with dev_name(), dev_set_name() Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2008-11-08 21:37:46 +01:00
Ingo Molnar	a6b0786f7f	Merge branches 'tracing/ftrace', 'tracing/fastboot', 'tracing/nmisafe' and 'tracing/urgent' into tracing/core	2008-11-08 09:34:35 +01:00
Thomas Graf	f400923735	pkt_sched: Control group classifier The classifier should cover the most common use case and will work without any special configuration. The principle of the classifier is to directly access the task_struct via get_current(). In order for this to work, classification requests from softirqs must be ignored. This is not a problem because the vast majority of packets in softirq context are not assigned to a task anyway. For this to work, a mechanism is needed to trace softirq context. This repost goes back to the method of relying on the number of nested bh disable calls for the sake of not adding too much complexity and the option to come up with something more reliable if actually needed. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-07 22:56:00 -08:00
Eric W. Biederman	505d4f73dd	net: Guaranetee the proper ordering of the loopback device. v2 I was recently hunting a bug that occurred in network namespace cleanup. In looking at the code it became apparrent that we have and will continue to have cases where if we have anything going on in a network namespace there will be assumptions that the loopback device is present. Things like sending igmp unsubscribe messages when we bring down network devices invokes the routing code which assumes that at least the loopback driver is present. Therefore to avoid magic initcall ordering hackery that is hard to follow and hard to get right insert a call to register the loopback device directly from net_dev_init(). This guarantes that the loopback device is the first device registered and the last network device to go away. But do it carefully so we register the loopback device after we clear dev_boot_phase. Signed-off-by: Eric W. Biederman <ebiederm@maxwell.aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-07 22:54:20 -08:00
David S. Miller	3d8160b149	Revert "net: Guaranetee the proper ordering of the loopback device." This reverts commit `ae33bc40c0`.	2008-11-07 22:52:14 -08:00
Thomas Renninger	c3d6de698c	ACPI video: if no ACPI backlight support, use vendor drivers If an ACPI graphics device supports backlight brightness functions (cmp. with latest ACPI spec Appendix B), let the ACPI video driver control backlight and switch backlight control off in vendor specific ACPI drivers (asus_acpi, thinkpad_acpi, eeepc, fujitsu_laptop, msi_laptop, sony_laptop, acer-wmi). Currently it is possible to load above drivers and let both poke on the brightness HW registers, the video and vendor specific ACPI drivers -> bad. This patch provides the basic support to check for BIOS capabilities before driver loading time. Driver specific modifications are in separate follow up patches. "acpi_backlight=vendor" Prever vendor driver over ACPI driver for backlight. "acpi_backlight=video" (default) Prever ACPI driver over vendor driver for backlight. Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-11-07 23:57:55 -05:00
David Vrabel	307ba6dd73	uwb: don't unbind the radio controller driver when resetting Use pre_reset and post_reset methods to avoid unbinding the radio controller driver after a uwb_rc_reset_all() call. This avoids a deadlock in uwb_rc_rm() when waiting for the uwb event thread to stop. Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-07 17:37:33 +00:00
Linus Torvalds	8ec96e7bba	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: fix range check on mmapped sysfs resource files PCI: remove excess kernel-doc notation PCI: annotate return value of pci_ioremap_bar with __iomem PCI: fix VPD limit quirk for Broadcom 5708S	2008-11-07 09:18:14 -08:00
Ingo Molnar	52c642f33b	sched: fine-tune SD_SIBLING_INIT fine-tune the HT sched-domains parameters as well. On a HT capable box, this increases lat_ctx performance from 23.87 usecs to 1.49 usecs: # before $ ./lat_ctx -s 0 2 "size=0k ovr=1.89 2 23.87 # after $ ./lat_ctx -s 0 2 "size=0k ovr=1.84 2 1.49 Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-07 16:09:23 +01:00
Mike Galbraith	1480098470	sched: fine-tune SD_MC_INIT Tune SD_MC_INIT the same way as SD_CPU_INIT: unset SD_BALANCE_NEWIDLE, and set SD_WAKE_BALANCE. This improves vmark by 5%: vmark 132102 125968 125497 messages/sec avg 127855.66 .984 vmark 139404 131719 131272 messages/sec avg 134131.66 1.033 Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> # DOCUMENTATION	2008-11-07 15:35:11 +01:00
Rusty Russell	cd83e42c6b	cpumask: new API, v2 - add cpumask_of() - add free_bootmem_cpumask_var() Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-07 12:52:30 +01:00
David S. Miller	167c6274c3	Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-11-07 01:37:16 -08:00
David S. Miller	9eeda9abd1	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath5k/base.c net/8021q/vlan_core.c	2008-11-06 22:43:03 -08:00
Niv Sardi	dcd7b4e5c0	Merge branch 'master' of git://oss.sgi.com:8090/xfs/linux-2.6	2008-11-07 15:07:12 +11:00
Linus Torvalds	4bab0ea1d4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: Fix recursive descent in __scm_destroy(). iwl3945: fix deadlock on suspend iwl3945: do not send scan command if channel count zero iwl3945: clear scanning bits upon failure ath5k: correct handling of rx status fields zd1211rw: Add 2 device IDs Fix logic error in rfkill_check_duplicity iwlagn: avoid sleep in softirq context iwlwifi: clear scanning bits upon failure Revert "ath5k: honor FIF_BCN_PRBRESP_PROMISC in STA mode" tcp: Fix recvmsg MSG_PEEK influence of blocking behavior. netfilter: netns ct: walk netns list under RTNL ipv6: fix run pending DAD when interface becomes ready net/9p: fix printk format warnings net: fix packet socket delivery in rx irq handler xfrm: Have af-specific init_tempsel() initialize family field of temporary selector	2008-11-06 16:44:23 -08:00
Linus Torvalds	e252f4db18	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Block: use round_jiffies_up() Add round_jiffies_up and related routines block: fix __blkdev_get() for removable devices generic-ipi: fix the smp_mb() placement blk: move blk_delete_timer call in end_that_request_last block: add timer on blkdev_dequeue_request() not elv_next_request() bio: define __BIOVEC_PHYS_MERGEABLE block: remove unused ll_new_mergeable()	2008-11-06 15:53:47 -08:00
Linus Torvalds	067ab19923	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: re-tune balancing sched: fix buddies for group scheduling sched: backward looking buddy sched: fix fair preempt check sched: cleanup fair task selection	2008-11-06 15:45:40 -08:00
David S. Miller	3b53fbf431	net: Fix recursive descent in __scm_destroy(). __scm_destroy() walks the list of file descriptors in the scm_fp_list pointed to by the scm_cookie argument. Those, in turn, can close sockets and invoke __scm_destroy() again. There is nothing which limits how deeply this can occur. The idea for how to fix this is from Linus. Basically, we do all of the fput()s at the top level by collecting all of the scm_fp_list objects hit by an fput(). Inside of the initial __scm_destroy() we keep running the list until it is empty. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-06 15:45:32 -08:00
David Howells	7597bc94d6	Fix accidental implicit cast in HR-timer conversion Fix the hrtimer_add_expires_ns() function. It should take a 'u64 ns' argument, but rather takes an 'unsigned long ns' argument - which might only be 32-bits. On FRV, this results in the kernel locking up because hrtimer_forward() passes the result of a 64-bit multiplication to this function, for which the compiler discards the top 32-bits - something that didn't happen when ktime_add_ns() was called directly. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-06 15:44:19 -08:00
Linus Torvalds	c361948712	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: [JFFS2] fix race condition in jffs2_lzo_compress() [MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4) [JFFS2] Fix lack of locking in thread_should_wake() [JFFS2] Fix build failure with !CONFIG_JFFS2_FS_WRITEBUFFER [MTD] [NAND] OMAP2: remove duplicated #include	2008-11-06 15:43:13 -08:00
OGAWA Hirofumi	9c0aa1b87b	fat: Cleanup FAT attribute stuff This adds three helpers: fat_make_attrs() - makes FAT attributes from inode. fat_make_mode() - makes mode_t from FAT attributes. fat_save_attrs() - saves FAT attributes to inode. Then this replaces: MSDOS_MKMODE() by fat_make_mode(), fat_attr() by fat_make_attrs(), ->i_attrs = attr & ATTR_UNUSED by fat_save_attrs(). And for root inode, those is used with ATTR_DIR instead of bogus ATTR_NONE. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-06 15:41:21 -08:00
OGAWA Hirofumi	9e975dae29	fat: split include/msdos_fs.h This splits __KERNEL__ stuff in include/msdos_fs.h into fs/fat/fat.h. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-06 15:41:20 -08:00
Aneesh Kumar K.V	fb68407b0d	jbd2: Call journal commit callback without holding j_list_lock Avoid freeing the transaction in __jbd2_journal_drop_transaction() so the journal commit callback can run without holding j_list_lock, to avoid lock contention on this spinlock. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-11-06 17:50:21 -05:00
David Miller	f8d570a474	net: Fix recursive descent in __scm_destroy(). __scm_destroy() walks the list of file descriptors in the scm_fp_list pointed to by the scm_cookie argument. Those, in turn, can close sockets and invoke __scm_destroy() again. There is nothing which limits how deeply this can occur. The idea for how to fix this is from Linus. Basically, we do all of the fput()s at the top level by collecting all of the scm_fp_list objects hit by an fput(). Inside of the initial __scm_destroy() we keep running the list until it is empty. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-06 13:51:50 -08:00
Steven Rostedt	6a60dd121c	ftrace: split out hardirq ftrace code into own header Impact: moving of function prototypes into own header file ftrace.h is too big of a file for hardirq.h, and some archs will fail to build because of the include dependencies not being met. This patch pulls out the required prototypes for hardirq.h into a smaller and safer ftrace_irq.h file. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-06 22:20:46 +01:00
Bjorn Helgaas	8950d89aca	ACPI: remove CONFIG_ACPI_EC Remove CONFIG_ACPI_EC. It was always set the same as CONFIG_ACPI, and it had no menu label, so there was no way to set it to anything other than "y". Per section 6.5.4 of the ACPI 3.0b specification, OSPM must make Embedded Controller operation regions, accessed via the Embedded Controllers described in ECDT, available before executing any control method. The ECDT table is optional, but if it is present, the above text means that the EC it describes is a required part of the ACPI subsystem, so CONFIG_ACPI_EC=n wouldn't make sense. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Alexey Starikovskiy <astarikovskiy@suse.de> Signed-off-by: Len Brown <len.brown@intel.com>	2008-11-06 15:52:28 -05:00
Rusty Russell	2d3854a37e	cpumask: introduce new API, without changing anything Impact: introduce new APIs We want to deprecate cpumasks on the stack, as we are headed for gynormous numbers of CPUs. Eventually, we want to head towards an undefined 'struct cpumask' so they can never be declared on stack. 1) New cpumask functions which take pointers instead of copies. (cpus_* -> cpumask_*) 2) Several new helpers to reduce requirements for temporary cpumasks (cpumask_first_and, cpumask_next_and, cpumask_any_and) 3) Helpers for declaring cpumasks on or offstack for large NR_CPUS (cpumask_var_t, alloc_cpumask_var and free_cpumask_var) 4) 'struct cpumask' for explicitness and to mark new-style code. 5) Make iterator functions stop at nr_cpu_ids (a runtime constant), not NR_CPUS for time efficiency and for smaller dynamic allocations in future. 6) cpumask_copy() so we can allocate less than a full cpumask eventually (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask' definition eventually. 7) work_on_cpu() helper for doing task on a CPU, rather than saving old cpumask for current thread and manipulating it. 8) smp_call_function_many() which is smp_call_function_mask() except taking a cpumask pointer. Note that this patch simply introduces the new functions and leaves the obsolescent ones in place. This is to simplify the transition patches. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-06 09:05:33 +01:00
Alan Stern	9c133c469d	Add round_jiffies_up and related routines This patch (as1158b) adds round_jiffies_up() and friends. These routines work like the analogous round_jiffies() functions, except that they will never round down. The new routines will be useful for timeouts where we don't care exactly when the timer expires, provided it doesn't expire too soon. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-11-06 08:42:48 +01:00
Jeremy Fitzhardinge	f92131c3dd	bio: define __BIOVEC_PHYS_MERGEABLE Define __BIOVEC_PHYS_MERGEABLE as the default implementation of BIOVEC_PHYS_MERGEABLE, so that its available for reuse within an arch-specific definition of BIOVEC_PHYS_MERGEABLE. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-11-06 08:41:55 +01:00
Steven Rostedt	0f04870148	ftrace: soft tracing stop and start Impact: add way to quickly start stop tracing from the kernel This patch adds a soft stop and start to the trace. This simply disables function tracing via the ftrace_disabled flag, and disables the trace buffers to prevent recording. The tracing code may still be executed, but the trace will not be recorded. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-06 07:50:57 +01:00
Steven Rostedt	60a7ecf426	ftrace: add quick function trace stop Impact: quick start and stop of function tracer This patch adds a way to disable the function tracer quickly without the need to run kstop_machine. It adds a new variable called function_trace_stop which will stop the calls to functions from mcount when set. This is just an on/off switch and does not handle recursion like preempt_disable(). It's main purpose is to help other tracers/debuggers start and stop tracing fuctions without the need to call kstop_machine. The config option HAVE_FUNCTION_TRACE_MCOUNT_TEST is added for archs that implement the testing of the function_trace_stop in the mcount arch dependent code. Otherwise, the test is done in the C code. x86 is the only arch at the moment that supports this. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-06 07:50:51 +01:00
Steve Glendinning	fd9abb3d97	SMSC LAN911x and LAN921x vendor driver Attached is a driver for SMSC's LAN911x and LAN921x families of embedded ethernet controllers. There is an existing smc911x driver in the tree; this is intended to replace it. Dustin McIntire (the author of the smc911x driver) has expressed his support for switching to this driver. This driver contains workarounds for all known hardware issues, and has been tested on all flavours of the chip on multiple architectures. This driver now uses phylib, so this patch also adds support for the device's internal phy Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: Bahadir Balban <Bahadir.Balban@arm.com> Signed-off-by: Dustin Mcintire <dustin@sensoria.com> Signed-off-by: Bill Gatliff <bgat@billgatliff.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-06 00:58:40 -05:00
Eric W. Biederman	ae33bc40c0	net: Guaranetee the proper ordering of the loopback device. I was recently hunting a bug that occurred in network namespace cleanup. In looking at the code it became apparrent that we have and will continue to have cases where if we have anything going on in a network namespace there will be assumptions that the loopback device is present. Things like sending igmp unsubscribe messages when we bring down network devices invokes the routing code which assumes that at least the loopback driver is present. Therefore to avoid magic initcall ordering hackery that is hard to follow and hard to get right insert a call to register the loopback device directly from net_dev_init(). This guarantes that the loopback device is the first device registered and the last network device to go away. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 16:00:02 -08:00
Serge E. Hallyn	1f29fae297	file capabilities: add no_file_caps switch (v4) Add a no_file_caps boot option when file capabilities are compiled into the kernel (CONFIG_SECURITY_FILE_CAPABILITIES=y). This allows distributions to ship a kernel with file capabilities compiled in, without forcing users to use (and understand and trust) them. When no_file_caps is specified at boot, then when a process executes a file, any file capabilities stored with that file will not be used in the calculation of the process' new capability sets. This means that booting with the no_file_caps boot option will not be the same as booting a kernel with file capabilities compiled out - in particular a task with CAP_SETPCAP will not have any chance of passing capabilities to another task (which isn't "really" possible anyway, and which may soon by killed altogether by David Howells in any case), and it will instead be able to put new capabilities in its pI. However since fI will always be empty and pI is masked with fI, it gains the task nothing. We also support the extra prctl options, setting securebits and dropping capabilities from the per-process bounding set. The other remaining difference is that killpriv, task_setscheduler, setioprio, and setnice will continue to be hooked. That will be noticable in the case where a root task changed its uid while keeping some caps, and another task owned by the new uid tries to change settings for the more privileged task. Changelog: Nov 05 2008: (v4) trivial port on top of always-start-\ with-clear-caps patch Sep 23 2008: nixed file_caps_enabled when file caps are not compiled in as it isn't used. Document no_file_caps in kernel-parameters.txt. Signed-off-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-06 07:14:51 +08:00
Ingo Molnar	9fcd18c9e6	sched: re-tune balancing Impact: improve wakeup affinity on NUMA systems, tweak SMP systems Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain balancing defaults on NUMA and SMP systems. Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason why we would not want to have wakeup affinity across nodes as well. (we already do this in the standard NUMA template.) lat_ctx on a NUMA box is particularly happy about this change: before: \| phoenix:~/l> ./lat_ctx -s 0 2 \| "size=0k ovr=2.60 \| 2 5.70 after: \| phoenix:~/l> ./lat_ctx -s 0 2 \| "size=0k ovr=2.65 \| 2 2.07 a 2.75x speedup. pipe-test is similarly happy about it too: \| phoenix:~/sched-tests> ./pipe-test \| 18.26 usecs/loop. \| 14.70 usecs/loop. \| 14.38 usecs/loop. \| 10.55 usecs/loop. # +WAKE_AFFINE on domain0+domain1 \| 8.63 usecs/loop. \| 8.59 usecs/loop. \| 9.03 usecs/loop. \| 8.94 usecs/loop. \| 8.96 usecs/loop. \| 8.63 usecs/loop. Also: - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings) - enable SD_WAKE_BALANCE on SMP domains Sysbench+postgresql improves all around the board, quite significantly: .28-rc3-11474e2c .28-rc3-11474e2c-tune ------------------------------------------------- 1: 571 688 +17.08% 2: 1236 1206 -2.55% 4: 2381 2642 +9.89% 8: 4958 5164 +3.99% 16: 9580 9574 -0.07% 32: 7128 8118 +12.20% 64: 7342 8266 +11.18% 128: 7342 8064 +8.95% 256: 7519 7884 +4.62% 512: 7350 7731 +4.93% ------------------------------------------------- SUM: 55412 59341 +6.62% So it's a win both for the runup portion, the peak area and the tail. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-05 18:04:38 +01:00
Eric W. Biederman	467622ef2a	[MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4) For "unlock" cycles to 16bit devices in 8bit compatibility mode we need to use the byte addresses 0xaaa and 0x555. These effectively match the word address 0x555 and 0x2aa, except the latter has its low bit set. Most chips don't care about the value of the 'A-1' pin in x8 mode, but some -- like the ST M29W320D -- do. So we need to be careful to set it where appropriate. cfi_send_gen_cmd is only ever passed addresses where the low byte is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are affected by this patch, by masking in the extra low bit when the device is known to be in compatibility mode. [dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa] v4: Fix stupid typo in cfi_build_cmd_addr that failed to compile I'm writing this patch way to late at night. v3: Bring all of the work back into cfi_build_cmd_addr including calling of map_bankwidth(map) and cfi_interleave(cfi) So every caller doesn't need to. v2: Only modified the address if we our device_type is larger than our bus width. Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-11-05 14:40:25 +01:00
Gerrit Renker	ac75773c27	dccp: Per-socket initialisation of feature negotiation This provides feature-negotiation initialisation for both DCCP sockets and DCCP request_sockets, to support feature negotiation during connection setup. It also resolves a FIXME regarding the congestion control initialisation. Thanks to Wei Yongjun for help with the IPv6 side of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:55:49 -08:00
Gerrit Renker	7d43d1a0f2	dccp: Implement lookup table for feature-negotiation information A lookup table for feature-negotiation information, extracted from RFC 4340/42, is provided by this patch. All currently known features can be found in this table, along with their feature location, their default value, and type. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:43:47 -08:00
Theodore Ts'o	1a0d3786dd	jbd2: Remove a large array of bh's from the stack of the checkpoint routine jbd2_log_do_checkpoint()n is one of the kernel's largest stack users. Move the array of buffer head's from the stack of jbd2_log_do_checkpoint() to the in-core journal structure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-11-05 00:09:22 -05:00
Theodore Ts'o	30773840c1	ext4: add fsync batch tuning knobs Add new mount options, min_batch_time and max_batch_time, which controls how long the jbd2 layer should wait for additional filesystem operations to get batched with a synchronous write transaction. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-01-03 20:27:38 -05:00
Josef Bacik	e07f7183a4	jbd2: improve jbd2 fsync batching This patch removes the static sleep time in favor of a more self optimizing approach where we measure the average amount of time it takes to commit a transaction to disk and the ammount of time a transaction has been running. If somebody does a sync write or an fsync() traditionally we would sleep for 1 jiffies, which depending on the value of HZ could be a significant amount of time compared to how long it takes to commit a transaction to the underlying storage. With this patch instead of sleeping for a jiffie, we check to see if the amount of time this transaction has been running is less than the average commit time, and if it is we sleep for the delta using schedule_hrtimeout to give us a higher precision sleep time. This greatly benefits high end storage where you could end up sleeping for longer than it takes to commit the transaction and therefore sitting idle instead of allowing the transaction to be committed by keeping the sleep time to a minimum so you are sure to always be doing something. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-11-26 01:14:26 -05:00
Mark Fasheh	171bbfbeab	jbd2: Add BH_JBDPrivateStart Add this so that file systems using JBD2 can safely allocate unused b_state bits. In this case, we add it so that Ocfs2 can define a single bit for tracking the validation state of a buffer. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-11-25 17:42:31 -05:00
Patrick McHardy	9b22ea5609	net: fix packet socket delivery in rx irq handler The changes to deliver hardware accelerated VLAN packets to packet sockets (commit `bc1d0411`) caused a warning for non-NAPI drivers. The __vlan_hwaccel_rx() function is called directly from the drivers RX function, for non-NAPI drivers that means its still in RX IRQ context: [ 27.779463] ------------[ cut here ]------------ [ 27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81() ... [ 27.782520] [<c0264755>] netif_nit_deliver+0x5b/0x75 [ 27.782590] [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162 [ 27.782664] [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1] [ 27.782738] [<c0155b17>] handle_IRQ_event+0x23/0x51 [ 27.782808] [<c015692e>] handle_edge_irq+0xc2/0x102 [ 27.782878] [<c0105fd5>] do_IRQ+0x4d/0x64 Split hardware accelerated VLAN reception into two parts to fix this: - __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN device lookup, then calls netif_receive_skb()/netif_rx() - vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb() in softirq context, performs the real reception and delivery to packet sockets. Reported-and-tested-by: Ramon Casellas <ramon.casellas@cttc.es> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 14:49:57 -08:00
Alok Kataria	fd8cd7e191	x86: vmware: look for DMI string in the product serial key Impact: Should permit VMware detection on older platforms where the vendor is changed. Could theoretically cause a regression if some weird serial number scheme contains the string "VMware" by pure chance. Seems unlikely, especially with the mixed case. In some user configured cases, VMware may choose not to put a VMware specific DMI string, but the product serial key is always there and is VMware specific. Add a interface to check the serial key, when checking for VMware in the DMI information. Signed-off-by: Alok N Kataria <akataria@vmware.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-11-04 13:59:00 -08:00
Frederic Weisbecker	71566a0d16	tracing/fastboot: Enable boot tracing only during initcalls Impact: modify boot tracer We used to disable the initcall tracing at a specified time (IE: end of builtin initcalls). But we don't need it anymore. It will be stopped when initcalls are finished. However we want two things: _Start this tracing only after pre-smp initcalls are finished. _Since we are planning to trace sched_switches at the same time, we want to enable them only during the initcall execution. For this purpose, this patch introduce two functions to enable/disable the sched_switch tracing during boot. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-04 17:14:02 +01:00
Stefano Panella	fec1a5932f	uwb: per-radio controller event thread and beacon cache Use an event thread per-radio controller so processing events from one radio controller doesn't delay another. A radio controller shouldn't have information on devices seen by a different radio controller (they may be on different channels) so make the beacon cache per-radio controller. Signed-off-by: Stefano Panella <stefano.panella@csr.com> Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-04 15:55:26 +00:00
Stefano Panella	6d5a681dfb	uwb: add commands to add/remove IEs to the debug interface Add the commands UWB_DBG_CMD_IE_ADD and UWB_DBG_CMD_IE_RM to the debug interface and make them call uwb_rc_ie_add() and uwb_rc_ie_rm(). Signed-off-by: Stefano Panella <stefano.panella@csr.com> Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-04 15:54:30 +00:00
Stefano Panella	c5995bd281	uwb: infrastructure for handling Relinquish Request IEs The structures and event handler needed to handle Relinish Request IEs received from neighbors. Nothing is done with these IEs yet. Signed-off-by: Stefano Panella <stefano.panella@csr.com> Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-11-04 15:53:29 +00:00
Alexey Dobriyan	6beceee5aa	netfilter: netns ebtables: part 2 * return ebt_table from ebt_register_table(), module code will save it into per-netns data for unregistration * duplicate ebt_table at the very beginning of registration -- it's added into list, so one ebt_table wouldn't end up in many lists (and each netns has different one) * introduce underscored tables in individial modules, this is temporary to not break bisection. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-11-04 14:27:15 +01:00
Alexey Dobriyan	511061e2dd	netfilter: netns ebtables: part 1 * propagate netns from userspace, register table in passed netns * remporarily register every ebt_table in init_net P. S.: one needs to add ".netns_ok = 1" to igmp_protocol to test with ebtables(8) in netns. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-11-04 14:22:55 +01:00
Tejun Heo	6a87e42e95	libata: implement ATA_HORKAGE_ATAPI_MOD16_DMA and apply it libata always uses PIO for ATAPI commands when the number of bytes to transfer isn't multiple of 16 but quantum DAT72 chokes on odd bytes PIO transfers. Implement a horkage to skip the mod16 check and apply it to the quantum device. This is reported by John Clark in the following thread. http://thread.gmane.org/gmane.linux.ide/34748 Signed-off-by: Tejun Heo <tj@kernel.org> Cc: John Clark <clarkjc@runbox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-04 01:08:27 -05:00
Jay Vosburgh	6cf3f41e6c	bonding, net: Move last_rx update into bonding recv logic The only user of the net_device->last_rx field is bonding. This patch adds a conditional update of last_rx to the bonding special logic in skb_bond_should_drop, causing last_rx to only be updated when the ARP monitor is running. This frees network device drivers from the necessity of updating last_rx, which can have cache line thrash issues. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-03 18:16:50 -08:00
Harvey Harrison	a7b930cdf8	PCI: annotate return value of pci_ioremap_bar with __iomem Was missing from the initial patch. Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-11-03 14:31:18 -08:00
Linus Torvalds	da4a22cba7	Merge branch 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: io mapping: clean up #ifdefs io mapping: improve documentation i915: use io-mapping interfaces instead of a variety of mapping kludges resources: add io-mapping functions to dynamically map large device apertures x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps	2008-11-03 10:15:40 -08:00
Paul E. McKenney	29cbda77a6	rcu: increase RCU stall-check timeouts Impact: increase timeout of debug check feature Increase RCU stall period timeouts to reduce the likelyhood of false positives. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-03 18:36:48 +01:00
Keith Packard	e5beae1690	io mapping: clean up #ifdefs Impact: cleanup clean up ifdefs: change #ifdef CONFIG_X86_32/64 to CONFIG_HAVE_ATOMIC_IOMAP. flip around the #ifdef sections to clean up the structure. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-03 18:21:45 +01:00
Steven Rostedt	7e5e26a3d8	ftrace: fix hardirq header for non ftrace archs Impact: build fix for non-ftrace architectures Not all archs implement ftrace, and therefore do not have an asm/ftrace.h. This patch corrects the problem. The ftrace_nmi_enter/exit now must be defined for all archs that implement dynamic ftrace. Currently, only x86 does. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-03 11:03:43 +01:00
Ingo Molnar	7a895f53cd	Merge branches 'tracing/ftrace', 'tracing/markers', 'tracing/mmiotrace', 'tracing/nmisafe', 'tracing/tracepoints' and 'tracing/urgent' into tracing/core	2008-11-03 10:34:23 +01:00
Lai Jiangshan	127cafbb27	tracepoint: introduce _noupdate APIs. Impact: add new tracepoint APIs to allow the batched registration of probes new APIs separate tracepoint_probe_register(), tracepoint_probe_unregister() into 2 steps. The first step of them is just update tracepoint_entry, not connect or disconnect. this patch introduces tracepoint_probe_update_all() for update all. these APIs are very useful for registering lots of probes but just updating once. Another very important thing is that _noupdate APIs do not require module_mutex. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-03 10:28:52 +01:00
Ingo Molnar	36609469c8	Merge commit 'v2.6.28-rc3' into tracing/ftrace	2008-11-03 09:11:13 +01:00
Linus Torvalds	391e572cd1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits) af_unix: netns: fix problem of return value IRDA: remove double inclusion of module.h udp: multicast packets need to check namespace net: add documentation for skb recycling key: fix setkey(8) policy set breakage bpa10x: free sk_buff with kfree_skb xfrm: do not leak ESRCH to user space net: Really remove all of LOOPBACK_TSO code. netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys() netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys net: delete excess kernel-doc notation pppoe: Fix socket leak. gianfar: Don't reset TBI<->SerDes link if it's already up gianfar: Fix race in TBI/SerDes configuration at91_ether: request/free GPIO for PHY interrupt amd8111e: fix dma_free_coherent context atl1: fix vlan tag regression SMC91x: delete unused local variable "lp" myri10ge: fix stop/go mmio ordering bonding: fix panic when taking bond interface down before removing module ...	2008-11-02 10:15:52 -08:00
Jeff Garzik	4ac96572f1	linux/string.h: fix comment typo s/user/used/ Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-02 10:15:07 -08:00
Sujith	8b30b1fe36	mac80211: Re-enable aggregation Wireless HW without any dedicated queues for aggregation do not need the ampdu_queues mechanism present right now in mac80211. Since mac80211 is still incomplete wrt TX MQ changes, do not allow aggregation sessions for drivers that set ampdu_queues. This is only an interim hack until Intel fixes the requeue issue. Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: Luis Rodriguez <Luis.Rodriguez@Atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-10-31 19:02:14 -04:00
John W. Linville	7211801527	wireless: avoid some net/ieee80211.h vs. linux/ieee80211.h conflicts There is quite a lot of overlap in definitions between these headers... Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-10-31 19:00:50 -04:00
John W. Linville	9387b7caf3	wireless: use individual buffers for printing ssid values Also change escape_ssid to print_ssid to match print_mac semantics. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-10-31 19:00:50 -04:00
colin@cozybit.com	93da9cc17c	Add nl80211 commands to get and set o11s mesh networking parameters The two new commands are NL80211_CMD_GET_MESH_PARAMS and NL80211_CMD_SET_MESH_PARAMS. There is a new attribute enum, NL80211_ATTR_MESH_PARAMS, which enumerates the various mesh configuration parameters. Moved struct mesh_config from mac80211/ieee80211_i.h to net/cfg80211.h. nl80211_get_mesh_params and nl80211_set_mesh_params unpack the netlink messages and ask the driver to get or set the configuration. This is done via two new function stubs, get_mesh_params and set_mesh_params, in struct cfg80211_ops. Signed-off-by: Colin McCabe <colin@cozybit.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-10-31 19:00:39 -04:00
Johannes Berg	d51626df57	nl80211: export HT capabilities This exports the local HT capabilities in nl80211. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-10-31 19:00:13 -04:00
Johannes Berg	d9fe60dea7	802.11: clean up/fix HT support This patch cleans up a number of things: * the unusable definition of the HT capabilities/HT information information elements * variable names that are hard to understand * mac80211: move ieee80211_handle_ht to ht.c and remove the unused enable_ht parameter * mac80211: fix bug with MCS rate 32 in ieee80211_handle_ht * mac80211: fix bug with casting the result of ieee80211_bss_get_ie to an information element _contents_ rather than the whole element, add size checking (another out-of-bounds access bug fixed!) * mac80211: remove some unused return values in favour of BUG_ON checking * a few minor other things Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-10-31 19:00:06 -04:00
Kay Sievers	ae9eba0e27	uwb: struct device - replace bus_id with dev_name(), dev_set_name() Cc: David Vrabel <david.vrabel@csr.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-10-31 15:07:06 +00:00
Steven Rostedt	a26a2a2739	ftrace: nmi safe code clean ups Impact: cleanup This patch cleans up the NMI safe code for dynamic ftrace as suggested by Andrew Morton. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-31 10:29:17 +01:00
Keith Packard	9663f2e6a6	resources: add io-mapping functions to dynamically map large device apertures Impact: add new generic io_map_*() APIs Graphics devices have large PCI apertures which would consume a significant fraction of a 32-bit address space if mapped during driver initialization. Using ioremap at runtime is impractical as it is too slow. This new set of interfaces uses atomic mappings on 32-bit processors and a large static mapping on 64-bit processors to provide reasonable 32-bit performance and optimal 64-bit performance. The current implementation sits atop the io_map_atomic fixmap-based mechanism for 32-bit processors. This includes some editorial suggestions from Randy Dunlap for Documentation/io-mapping.txt Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-31 10:12:39 +01:00
Huang Ying	92be3d6bdf	kexec/i386: allocate page table pages dynamically Impact: save .text size when kexec is built in but not loaded This patch adds an architecture specific struct kimage_arch into struct kimage. The pointers to page table pages used by kexec are added to struct kimage_arch. The page tables pages are dynamically allocated in machine_kexec_prepare instead of statically from BSS segment. This will save up to 20k memory when kexec image is not loaded. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-31 10:01:56 +01:00
Harvey Harrison	3685f25de1	misc: replace NIPQUAD() Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:56:49 -07:00
David S. Miller	a1744d3bee	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/p54/p54common.c	2008-10-31 00:17:34 -07:00
Randy Dunlap	ad1d967c88	net: delete excess kernel-doc notation Remove excess kernel-doc function parameters from networking header & driver files: Warning(include/net/sock.h:946): Excess function parameter or struct member 'sk' description in 'sk_filter_release' Warning(include/linux/netdevice.h:1545): Excess function parameter or struct member 'cpu' description in 'netif_tx_lock' Warning(drivers/net/wan/z85230.c:712): Excess function parameter or struct member 'regs' description in 'z8530_interrupt' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-30 23:54:35 -07:00
David S. Miller	194dcdba5a	Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2008-10-30 23:50:18 -07:00
Jens Axboe	9ce8e3073d	libata: add whitelist for devices with known good pata-sata bridges libata currently imposes a UDMA5 max transfer rate and 200 sector max transfer size for SATA devices that sit behind a pata-sata bridge. Lots of devices have known good bridges that don't need this limit applied. The MTRON SSD disks are such devices. Transfer rates are increased by 20-30% with the restriction removed. So add a "blacklist" entry for the MTRON devices, with a flag indicating that the bridge is known good. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-10-31 01:45:06 -04:00
Trent Piepho	c132419e56	gianfar: Fix race in TBI/SerDes configuration The init_phy() function attaches to the PHY, then configures the SerDes<->TBI link (in SGMII mode). The TBI is on the MDIO bus with the PHY (sort of) and is accessed via the gianfar's MDIO registers, using the functions gfar_local_mdio_read/write(), which don't do any locking. The previously attached PHY will start a work-queue on a timer, and probably an irq handler as well, which will talk to the PHY and thus use the MDIO bus. This uses phy_read/write(), which have locking, but not against the gfar_local_mdio versions. The result is that PHY code will try to use the MDIO bus at the same time as the SerDes setup code, corrupting the transfers. Setting up the SerDes before attaching to the PHY will insure that there is no race between the SerDes code and our PHY, but doesn't fix everything. Typically the PHYs for all gianfar devices are on the same MDIO bus, which is associated with the first gianfar device. This means that the first gianfar's SerDes code could corrupt the MDIO transfers for a different gianfar's PHY. The lock used by phy_read/write() is contained in the mii_bus structure, which is pointed to by the PHY. This is difficult to access from the gianfar drivers, as there is no link between a gianfar device and the mii_bus which shares the same MDIO registers. As far as the device layer and drivers are concerned they are two unrelated devices (which happen to share registers). Generally all gianfar devices' PHYs will be on the bus associated with the first gianfar. But this might not be the case, so simply locking the gianfar's PHY's mii bus might not lock the mii bus that the SerDes setup code is going to use. We solve this by having the code that creates the gianfar platform device look in the device tree for an mdio device that shares the gianfar's registers. If one is found the ID of its platform device is saved in the gianfar's platform data. A new function in the gianfar mii code, gfar_get_miibus(), can use the bus ID to search through the platform devices for a gianfar_mdio device with the right ID. The platform device's driver data is the mii_bus structure, which the SerDes setup code can use to lock the current bus. Signed-off-by: Trent Piepho <tpiepho@freescale.com> CC: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-10-31 00:59:46 -04:00
Ingo Molnar	e1e302d8a9	Merge branch 'linus' into tracing/ftrace	2008-10-31 00:38:21 +01:00
Steven Rostedt	17666f02b1	ftrace: nmi safe code modification Impact: fix crashes that can occur in NMI handlers, if their code is modified Modifying code is something that needs special care. On SMP boxes, if code that is being modified is also being executed on another CPU, that CPU will have undefined results. The dynamic ftrace uses kstop_machine to make the system act like a uniprocessor system. But this does not address NMIs, that can still run on other CPUs. One approach to handle this is to make all code that are used by NMIs not be traced. But NMIs can call notifiers that spread throughout the kernel and this will be very hard to maintain, and the chance of missing a function is very high. The approach that this patch takes is to have the NMIs modify the code if the modification is taking place. The way this works is that just writing to code executing on another CPU is not harmful if what is written is the same as what exists. Two buffers are used: an IP buffer and a "code" buffer. The steps that the patcher takes are: 1) Put in the instruction pointer into the IP buffer and the new code into the "code" buffer. 2) Set a flag that says we are modifying code 3) Wait for any running NMIs to finish. 4) Write the code 5) clear the flag. 6) Wait for any running NMIs to finish. If an NMI is executed, it will also write the pending code. Multiple writes are OK, because what is being written is the same. Then the patcher must wait for all running NMIs to finish before going to the next line that must be patched. This is basically the RCU approach to code modification. Thanks to Ingo Molnar for suggesting the idea, and to Arjan van de Ven for his guidence on what is safe and what is not. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-30 21:30:08 +01:00
Linus Torvalds	65fc716fa6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes: Fix incompatibility with versions of Perl less than 5.6.0 kbuild: do not include arch/<ARCH>/include/asm in find-sources twice. kbuild: tag with git revision when git describe is missing kbuild: prevent modpost from looking for a .cmd file for a static library linked into a module kbuild: fix KBUILD_EXTRA_SYMBOLS adjust init section definitions scripts/checksyscalls.sh: fix for non-gnu sed scripts/package: don't break if %{_smp_mflags} isn't set kbuild: setlocalversion: dont include svn change count kbuild: improve check-symlink kbuild: mkspec - fix build rpm	2008-10-30 12:55:49 -07:00
Arjan van de Ven	d98d38f201	mutex: improve header comment to be actually informative about the API Impact: improve documentation It's nice to say that mutex_trylock follows the spin_trylock convention. It's a lot nicer if the comment also says which that is... make it so. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-10-30 19:55:00 +01:00
Linus Torvalds	cdcba02a5f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: add quirk entry for no-name keyboard (0x13ba/0x0017) HID: fix hid_device_id for cross compiling HID: sync on deleted io_retry timer in usbhid driver HID: fix oops during suspend of unbound HID devices	2008-10-30 11:51:43 -07:00
Fernando Luis Vazquez Cao	effdb9492d	spi: fix compile error Fix compile error below: LD drivers/spi/built-in.o CC [M] drivers/spi/spi_gpio.o In file included from drivers/spi/spi_gpio.c:26: include/linux/spi/spi_bitbang.h:23: error: field `work' has incomplete type make[2]: * [drivers/spi/spi_gpio.o] Error 1 make[1]: * [drivers/spi] Error 2 make: *** [drivers] Error 2 Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:47 -07:00
Alan Cox	731572d39f	nfsd: fix vm overcommit crash Junjiro R. Okajima reported a problem where knfsd crashes if you are using it to export shmemfs objects and run strict overcommit. In this situation the current->mm based modifier to the overcommit goes through a NULL pointer. We could simply check for NULL and skip the modifier but we've caught other real bugs in the past from mm being NULL here - cases where we did need a valid mm set up (eg the exec bug about a year ago). To preserve the checks and get the logic we want shuffle the checking around and add a new helper to the vm_ security wrappers Also fix a current->mm reference in nommu that should use the passed mm [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix build] Reported-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:47 -07:00
Randy Dunlap	7106a27b52	kernel.h: fix might_sleep kernel-doc Put the kernel-doc for might_sleep() _immediately_ before the macro (no intervening lines). Otherwise kernel-doc complains like so: Warning(linux-2.6.27-rc3-git2//include/linux/kernel.h:129): No description found for parameter 'file' Warning(linux-2.6.27-rc3-git2//include/linux/kernel.h:129): No description found for parameter 'line' because kernel-doc is looking at the wrong function prototype (i.e., __might_sleep). [Yes, I have a todo note to myself to check/warn for that inconsistency in scripts/kernel-doc.] Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: <Uwe.Kleine-Koenig@digi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:47 -07:00
Nick Piggin	4e02ed4b4a	fs: remove prepare_write/commit_write Nothing uses prepare_write or commit_write. Remove them from the tree completely. [akpm@linux-foundation.org: schedule simple_prepare_write() for unexporting] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:45 -07:00
Li Zefan	9b913735e5	cgroups: tiny cleanups - remove 'private' field from struct subsys - remove cgroup_init_smp() Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:45 -07:00
Li Zefan	00c2e63c31	freezer_cg: use thaw_process() in unfreeze_cgroup() Don't duplicate the implementation of thaw_process(). [akpm@linux-foundation.org: make __thaw_process() static] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Acked-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:45 -07:00
Kurt Garloff	0833422274	mm: increase the default mlock limit from 32k to 64k By default, non-privileged tasks can only mlock() a small amount of memory to avoid a DoS attack by ordinary users. The Linux kernel defaulted to 32k (on a 4k page size system) to accommodate the needs of gpg. However, newer gpg2 needs 64k in various circumstances and otherwise fails miserably, see bnc#329675. Change the default to 64k, and make it more agnostic to PAGE_SIZE. Signed-off-by: Kurt Garloff <garloff@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:45 -07:00
David Chinner	8290c35f87	Inode: Allow external list initialisation To allow XFS to combine the XFS and linux inodes into a single structure, we need to drive inode lookup from the XFS inode cache, not the generic inode cache. This means that we need initialise a struct inode from a context outside alloc_inode() as it is no longer used by XFS. After inode allocation and initialisation, we need to add the inode to the superblock list, the in-use list, hash it and do some accounting. This all needs to be done with the inode_lock held and there are already several places in fs/inode.c that do this list manipulation. Factor out the common code, add a locking wrapper and export the function so ti can be called from XFS. Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-10-30 17:35:24 +11:00
David Chinner	2cb1599f9b	Inode: Allow external initialisers To allow XFS to combine the XFS and linux inodes into a single structure, we need to drive inode lookup from the XFS inode cache, not the generic inode cache. This means that we need initialise a struct inode from a context outside alloc_inode() as it is no longer used by XFS. Factor and export the struct inode initialisation code from alloc_inode() to inode_init_always() as a counterpart to inode_init_once(). i.e. we have to call this init function for each inode instantiation (always), as opposed inode_init_once() which is only called on slab object instantiation (once). Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-10-30 17:32:23 +11:00
Jan Beulich	3f5e26cee4	adjust init section definitions Add rodata equivalents for assembly use, and fix the section attributes used by __REFCONST. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2008-10-29 22:02:09 +01:00
Harvey Harrison	5b095d9892	net: replace %p6 with %pI6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:52:50 -07:00
Eric Dumazet	96631ed16c	udp: introduce sk_for_each_rcu_safenext() Corey Minyard found a race added in commit `271b72c7fa` (udp: RCU handling for Unicast packets.) "If the socket is moved from one list to another list in-between the time the hash is calculated and the next field is accessed, and the socket has moved to the end of the new list, the traversal will not complete properly on the list it should have, since the socket will be on the end of the new list and there's not a way to tell it's on a new list and restart the list traversal. I think that this can be solved by pre-fetching the "next" field (with proper barriers) before checking the hash." This patch corrects this problem, introducing a new sk_for_each_rcu_safenext() macro. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 11:19:58 -07:00
Dmitry Baryshkov	a20c7ab570	[MTD] sharpsl-nand: use platform_data for model-specific values Add platform_data which holds all model-specific values, like badblocks pattern, oobinfo, partitions. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>	2008-10-29 21:06:38 +03:00
Harvey Harrison	b189db5d29	net: remove NIP6(), NIP6_FMT, NIP6_SEQFMT and final users Open code NIP6_FMT in the one call inside sscanf and one user of NIP6() that could use %p6 in the netfilter code. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:38 -07:00
Harvey Harrison	0c6ce78abf	net: replace uses of NIP6_FMT with %p6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:31 -07:00
Andreas Schwab	8175fe2dda	HID: fix hid_device_id for cross compiling struct hid_device_id contains hidden padding which is bad for cross compiling. Make the padding explicit and consistent across architectures. Signed-off-by: Andreas Schwab <schwab@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-10-29 00:51:25 +01:00
Martin Willi	3a2dfbe8ac	xfrm: Notify changes in UDP encapsulation via netlink Add new_mapping() implementation to the netlink xfrm_mgr to notify address/port changes detected in UDP encapsulated ESP packets. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 16:01:07 -07:00
Alexey Dobriyan	def8b4faff	net: reduce structures when XFRM=n ifdef out * struct sk_buff::sp (pointer) * struct dst_entry::xfrm (pointer) * struct sock::sk_policy (2 pointers) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 13:24:06 -07:00
Theodore Ts'o	5e1f8c9e20	ext3: Add support for non-native signed/unsigned htree hash algorithms The original ext3 hash algorithms assumed that variables of type char were signed, as God and K&R intended. Unfortunately, this assumption is not true on some architectures. Userspace support for marking filesystems with non-native signed/unsigned chars was added two years ago, but the kernel-side support was never added (until now). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org	2008-10-28 13:21:55 -04:00
Linus Torvalds	e946217e4f	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) ftrace: fix current_tracer error return tracing: fix a build error on alpha ftrace: use a real variable for ftrace_nop in x86 tracing/ftrace: make boot tracer select the sched_switch tracer tracepoint: check if the probe has been registered asm-generic: define DIE_OOPS in asm-generic trace: fix printk warning for u64 ftrace: warning in kernel/trace/ftrace.c ftrace: fix build failure ftrace, powerpc, sparc64, x86: remove notrace from arch ftrace file ftrace: remove ftrace hash ftrace: remove mcount set ftrace: remove daemon ftrace: disable dynamic ftrace for all archs that use daemon ftrace: add ftrace warn on to disable ftrace ftrace: only have ftrace_kill atomic ftrace: use probe_kernel ftrace: comment arch ftrace code ftrace: return error on failed modified text. ftrace: dynamic ftrace process only text section ...	2008-10-28 09:52:25 -07:00
Linus Torvalds	a186576925	Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: ia64: Makefile fix for forcing to re-generate asm-offsets.h KVM: Future-proof device assignment ABI KVM: ia64: Fix halt emulation logic KVM: Fix guest shared interrupt with in-kernel irqchip KVM: MMU: sync root on paravirt TLB flush	2008-10-28 09:50:11 -07:00
Linus Torvalds	8ca6215502	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: fix documentation reference for sched_min_granularity_ns sched: virtual time buddy preemption sched: re-instate vruntime based wakeup preemption sched: weaken sync hint sched: more accurate min_vruntime accounting sched: fix a find_busiest_group buglet sched: add CONFIG_SMP consistency	2008-10-28 09:46:20 -07:00
Ingo Molnar	d1a76187a5	Merge commit 'v2.6.28-rc2' into core/locking Conflicts: arch/um/include/asm/system.h	2008-10-28 16:54:49 +01:00
Ingo Molnar	7a9787e1eb	Merge commit 'v2.6.28-rc2' into x86/pci-ioapic-boot-irq-quirks	2008-10-28 16:26:12 +01:00
Avi Kivity	bb45e202e6	KVM: Future-proof device assignment ABI Reserve some space so we can add more data. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-28 14:22:15 +02:00
Sheng Yang	5550af4df1	KVM: Fix guest shared interrupt with in-kernel irqchip Every call of kvm_set_irq() should offer an irq_source_id, which is allocated by kvm_request_irq_source_id(). Based on irq_source_id, we identify the irq source and implement logical OR for shared level interrupts. The allocated irq_source_id can be freed by kvm_free_irq_source_id(). Currently, we support at most sizeof(unsigned long) different irq sources. [Amit: - rebase to kvm.git HEAD - move definition of KVM_USERSPACE_IRQ_SOURCE_ID to common file - move kvm_request_irq_source_id to the update_irq ioctl] [Xiantao: - Add kvm/ia64 stuff and make it work for kvm/ia64 guests] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-28 14:21:34 +02:00
David Vrabel	1cde7f68ce	uwb: order IEs by element ID ECMA-368 requires that IEs in a beacon must be sorted by element ID. Most hardware uses the ordering in the Set IE URC command so get the ordering right on the host. Also refactor the IE management code: - use uwb_ie_next() instead of uwb_ie_for_each(). - remove unnecessary functions. - API is now only uwb_rc_ie_add() and uwb_rc_ie_rm(). Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-10-28 12:09:17 +00:00
David Vrabel	4d2bea4ca0	wusb: do a proper channel stop When stopping the WUSB channel the host should send Channel Stop IEs giving the WUSB Channel Time of the last MMC. Both WHCI and HWA hosts provide a channel stop command for this. Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-10-28 12:08:46 +00:00
David Vrabel	cae1c11414	uwb: reference count reservations Reference counting the struct uwb_rsv's is safer and easier to get right than the transferring ownership of the structures from the PAL to reservation manager. This fixes an oops in the debug PAL after a reservation timed out. Signed-off-by: David Vrabel <david.vrabel@csr.com>	2008-10-28 12:07:23 +00:00
Dominic Curran	b67b4b1177	Input: gpio-keys - add flag to allow auto repeat This patch adds a flag to gpio-key driver to turn on the input subsystems auto repeat feature if needed. Signed-off-by: Dominic Curran <dcurran@ti.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-10-27 22:33:04 -04:00
Jiri Slaby	3d5afd324a	HID: fix oops during suspend of unbound HID devices Usbhid structure is allocated on start invoked only from probe of some driver. When there is no driver, the structure is null and causes null-dereference oopses. Fix it by allocating the structure on probe and disconnect of the device itself. Also make sure we won't race between start and resume or stop and suspend respectively. References: http://bugzilla.kernel.org/show_bug.cgi?id=11827 Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Andreas Schwab <schwab@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-10-27 15:06:01 +01:00
Steven Rostedt	944ac4259e	ftrace: ftrace dump on oops control Impact: add (default-off) dump-trace-on-oops flag Currently, ftrace is set up to dump its contents to the console if the kernel panics or oops. This can be annoying if you have trace data in the buffers and you experience an oops, but the trace data is old or static. Usually when you want ftrace to dump its contents is when you are debugging your system and you have set up ftrace to trace the events leading to an oops. This patch adds a control variable called "ftrace_dump_on_oops" that will enable the ftrace dump to console on oops. This variable is default off but a developer can enable it either through the kernel command line by adding "ftrace_dump_on_oops" or at run time by setting (or disabling) /proc/sys/kernel/ftrace_dump_on_oops. v2: Replaced /** with /* as Randy explained that kernel-doc does not yet handle variables. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-27 15:03:15 +01:00
Lai Jiangshan	505e371da1	markers: remove exported symbol marker_probe_cb_noarg() marker_probe_cb_noarg() should not be seen by outer code. this patch remove it. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-27 13:02:23 +01:00
Ingo Molnar	4944dd62de	Merge commit 'v2.6.28-rc2' into tracing/urgent	2008-10-27 10:50:54 +01:00
Matthew Ranostay	a53ccab3cc	ALSA: jack: lineout support to jack abstraction layer This patch introduces support for reporting SW_LINEOUT_INSERT detection events via the jack abstraction layer. Also adds a SND_JACK_LINEOUT define to the input system header. Signed-off-by: Matthew Ranostay <mranostay@embeddedalley.com> Cc: Dmitry Torokhov <dtor@mail.ru> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2008-10-27 08:15:14 +01:00
Remi Denis-Courmont	c3a90c788b	Phonet: do not reply to indication reset packets This fixes a potential error packet loop. Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-26 23:07:25 -07:00
Linus Torvalds	f8d56f1771	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: md: allow extended partitions on md devices. md: use sysfs_notify_dirent to notify changes to md/dev-xxx/state md: use sysfs_notify_dirent to notify changes to md/array_state	2008-10-26 16:42:18 -07:00
Linus Torvalds	ecc96e7920	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: Add support for Sony Vaio VGX-TP1E HID: fix lock imbalance in hiddev HID: fix lock imbalance in hidraw HID: fix hidbus/appletouch device binding regression HID: add hid_type to general hid struct HID: quirk for OLED devices present in ASUS G50/G70/G71 HID: Remove "default m" for Thrustmaster and Zeroplus HID: fix hidraw_exit section mismatch HID: add support for another Gyration remote control Revert "HID: Invert HWHEEL mappings for some Logitech mice"	2008-10-26 16:34:14 -07:00
Len Brown	9fb3c5ca3d	Merge branch 'i7300_idle' into release	2008-10-25 04:07:44 -04:00
Venki Pallipadi	3ad0b02e4c	i7300_idle: Disable ioat channel only on platforms where ile driver can load Based on input from Andi Kleen: share the platform detection code with ioat_dma and disable the channel in dma engine only for specific platforms. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-10-24 12:54:18 -04:00
Ingo Molnar	8c82a17e9c	Merge commit 'v2.6.28-rc1' into sched/urgent	2008-10-24 12:48:46 +02:00
Linus Torvalds	3a2c5dad6c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: remove unused resource assignment in pci_read_bridge_bases() PCI hotplug: shpchp: message refinement PCI hotplug: shpchp: replace printk with dev_printk PCI: add routines for debugging and handling lost interrupts PCI hotplug: pciehp: message refinement PCI: fix ARI code to be compatible with mixed ARI/non-ARI systems PCI hotplug: cpqphp: fix kernel NULL pointer dereference	2008-10-23 19:31:34 -07:00
Linus Torvalds	2242d5eff1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits) tcp: Restore ordering of TCP options for the sake of inter-operability net: Fix disjunct computation of netdev features sctp: Fix to handle SHUTDOWN in SHUTDOWN_RECEIVED state sctp: Fix to handle SHUTDOWN in SHUTDOWN-PENDING state sctp: Add check for the TSN field of the SHUTDOWN chunk sctp: Drop ICMP packet too big message with MTU larger than current PMTU p54: enable 2.4/5GHz spectrum by eeprom bits. orinoco: reduce stack usage in firmware download path ath5k: fix suspend-related oops on rmmod [netdrvr] fec_mpc52xx: Implement polling, to make netconsole work. qlge: Fix MSI/legacy single interrupt bug. smc911x: Make the driver safer on SMP smc911x: Add IRQ polarity configuration smc911x: Allow Kconfig dependency on ARM sis190: add identifier for Atheros AR8021 PHY 8139x: reduce message severity on driver overlap igb: add IGB_DCA instead of selecting INTEL_IOATDMA igb: fix tx data corruption with transition to L0s on 82575 ehea: Fix memory hotplug support netdev: DM9000: remove BLACKFIN hacking in DM9000 netdev driver ...	2008-10-23 19:19:54 -07:00
Linus Torvalds	ea541686d8	Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds * 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds/acpi: Fix merge fallout from acpi_driver_data change leds: Simplify logic in leds-ams-delta leds: Fix trigger registration race leds: Fix leds-class.c comment leds: Add driver for HP harddisk protection LEDs leds: leds-pca955x - Mark pca955x_led_set() static leds: Remove uneeded leds-cm-x270 driver leds: Remove uneeded strlen calls leds: Add leds-wrap default-trigger leds: Make default trigger fields const leds: Add backlight LED trigger leds: da903x: Add support for LEDs found on DA9030/DA9034	2008-10-23 16:07:32 -07:00
Jens Axboe	2fca5ccf97	libata: switch to using block layer tagging support libata currently has a pretty dumb ATA_MAX_QUEUE loop for finding a free tag to use. Instead of fixing that up, convert libata to using block layer tagging - gets rid of code in libata, and is also much faster. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-23 16:05:26 -07:00
James Bottomley	388c8c16ab	PCI: add routines for debugging and handling lost interrupts We're getting a lot of storage drivers blamed for interrupt misrouting issues. This patch provides a standard way of reporting the problem ... and, if possible, correcting it. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-23 14:54:18 -07:00
Linus Torvalds	88ed86fee6	Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc * 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits) proc: remove fs/proc/proc_misc.c proc: move /proc/vmcore creation to fs/proc/vmcore.c proc: move pagecount stuff to fs/proc/page.c proc: move all /proc/kcore stuff to fs/proc/kcore.c proc: move /proc/schedstat boilerplate to kernel/sched_stats.h proc: move /proc/modules boilerplate to kernel/module.c proc: move /proc/diskstats boilerplate to block/genhd.c proc: move /proc/zoneinfo boilerplate to mm/vmstat.c proc: move /proc/vmstat boilerplate to mm/vmstat.c proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c proc: move /proc/buddyinfo boilerplate to mm/vmstat.c proc: move /proc/vmallocinfo to mm/vmalloc.c proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c proc: move /proc/slab_allocators boilerplate to mm/slab.c proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c proc: move /proc/stat to fs/proc/stat.c proc: move rest of /proc/partitions code to block/genhd.c proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c proc: move /proc/devices code to fs/proc/devices.c proc: move rest of /proc/locks to fs/locks.c ...	2008-10-23 12:04:37 -07:00
Linus Torvalds	1f6d6e8ebe	Merge branch 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits) hrtimers: add missing docbook comments to struct hrtimer hrtimers: simplify hrtimer_peek_ahead_timers() hrtimers: fix docbook comments DECLARE_PER_CPU needs linux/percpu.h hrtimers: fix typo rangetimers: fix the bug reported by Ingo for real rangetimer: fix BUG_ON reported by Ingo rangetimer: fix x86 build failure for the !HRTIMERS case select: fix alpha OSF wrapper select: fix alpha OSF wrapper hrtimer: peek at the timer queue just before going idle hrtimer: make the futex() system call use the per process slack value hrtimer: make the nanosleep() syscall use the per process slack hrtimer: fix signed/unsigned bug in slack estimator hrtimer: show the timer ranges in /proc/timer_list hrtimer: incorporate feedback from Peter Zijlstra hrtimer: add a hrtimer_start_range() function hrtimer: another build fix hrtimer: fix build bug found by Ingo hrtimer: make select() and poll() use the hrtimer range feature ...	2008-10-23 10:53:02 -07:00
Linus Torvalds	2248485640	Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev * git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits) [PATCH] kill the rest of struct file propagation in block ioctls [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET [PATCH] get rid of blkdev_locked_ioctl() [PATCH] get rid of blkdev_driver_ioctl() [PATCH] sanitize blkdev_get() and friends [PATCH] remember mode of reiserfs journal [PATCH] propagate mode through swsusp_close() [PATCH] propagate mode through open_bdev_excl/close_bdev_excl [PATCH] pass fmode_t to blkdev_put() [PATCH] kill the unused bsize on the send side of /dev/loop [PATCH] trim file propagation in block/compat_ioctl.c [PATCH] end of methods switch: remove the old ones [PATCH] switch sr [PATCH] switch sd [PATCH] switch ide-scsi [PATCH] switch tape_block [PATCH] switch dcssblk [PATCH] switch dasd [PATCH] switch mtd_blkdevs [PATCH] switch mmc ...	2008-10-23 10:23:07 -07:00
Linus Torvalds	5ed487bc2c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (46 commits) [PATCH] fs: add a sanity check in d_free [PATCH] i_version: remount support [patch] vfs: make security_inode_setattr() calling consistent [patch 1/3] FS_MBCACHE: don't needlessly make it built-in [PATCH] move executable checking into ->permission() [PATCH] fs/dcache.c: update comment of d_validate() [RFC PATCH] touch_mnt_namespace when the mount flags change [PATCH] reiserfs: add missing llseek method [PATCH] fix ->llseek for more directories [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup [PATCH vfs-2.6 4/6] vfs: remove unnecessary fsnotify_d_instantiate() [PATCH vfs-2.6 3/6] vfs: add __d_instantiate() helper [PATCH vfs-2.6 2/6] vfs: add d_ancestor() [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT() [PATCH] get rid of on-stack dentry in udf [PATCH 2/2] anondev: switch to IDA [PATCH 1/2] anondev: init IDR statically [JFFS2] Use d_splice_alias() not d_add() in jffs2_lookup() [PATCH] Optimise NFS readdir hack slightly. ...	2008-10-23 10:22:40 -07:00
Linus Torvalds	765426e8ee	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (123 commits) dock: make dock driver not a module ACPI: fix ia64 build warning ACPI: hack around sysfs warning with link order ACPI suspend: fix build warning when CONFIG_ACPI_SLEEP=n intel_menlo: fix build warning panasonic-laptop: fix build ACPICA: Update version to 20080926 ACPICA: Add support for zero-length buffer-to-string conversions ACPICA: New: Validation for predefined ACPI methods/objects ACPICA: Fix for implicit return compatibility ACPICA: Fixed a couple memory leaks associated with "implicit return" ACPICA: Optimize buffer allocation procedure ACPICA: Fix possible memory leak, error exit path ACPICA: Fix fault after mem allocation failure in AML parser ACPICA: Remove unused ACPI register bit definition ACPICA: Update version to 20080829 ACPICA: Fix possible memory leak in acpi_ns_get_external_pathname ACPICA: Cleanup for internal Reference Object ACPICA: Update comments - no functional changes ACPICA: Update for Reference ACPI_OPERAND_OBJECT ...	2008-10-23 10:20:36 -07:00
Linus Torvalds	a3415dc34f	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (32 commits) PCI hotplug: fix logic in Compaq hotplug controller bus speed setup PCI: don't export linux/io.h from pci.h PCI: PCI_QUIRKS depends on PCI PCI hotplug: pciehp: poll data link layer link active PCI hotplug: pciehp: fix possible memory leak in pcie_init PCI: Workaround invalid P2P bridge bus numbers PCI Hotplug: fakephp: add duplicate slot name debugging PCI: Hotplug core: remove 'name' PCI: shcphp: remove 'name' parameter PCI: SGI Hotplug: stop managing bss_hotplug_slot->name PCI: rpaphp: kmalloc/kfree slot->name directly PCI: pciehp: remove 'name' parameter PCI: ibmphp: stop managing hotplug_slot->name PCI: fakephp: remove 'name' parameter PCI, PCI Hotplug: introduce slot_name helpers PCI: cpqphp: stop managing hotplug_slot->name PCI: cpci_hotplug: stop managing hotplug_slot->name PCI: acpiphp: remove 'name' parameter PCI: prevent duplicate slot names PCI Hotplug: serialize pci_hp_register and pci_hp_deregister ...	2008-10-23 10:16:53 -07:00
Linus Torvalds	feeedc6c82	Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6 * 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: i2c: Add info->archdata field i2c: Inform about deprecated chips directory i2c: Use pci_ioremap_bar() Schedule removal of the legacy i2c device driver binding model i2c: Clean up <linux/i2c.h> i2c: Update and clean up writing-clients document i2c: Drop 2-byte address block transfer defines i2c: Delete legacy model documentation i2c: Constify i2c_get_clientdata's parameter i2c: Delete outdated client porting guide i2c: Make clear what the class field of i2c_adapter is good for i2c-algo-pcf: Fix typo in debugging log message i2c-algo-pcf: Add adapter hooks around xfer begin and end i2c-algo-pcf: Pass adapter data into ->waitforpin() method i2c-i801: Add support for Intel Ibex Peak	2008-10-23 10:10:25 -07:00
Linus Torvalds	a534487606	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: stop_machine: fix error code handling on multiple cpus stop_machine: use workqueues instead of kernel threads workqueue: introduce create_rt_workqueue Call init_workqueues before pre smp initcalls. Make panic= and panic_on_oops into core_params Make initcall_debug a core_param core_param() for genuinely core kernel parameters param: Fix duplicate module prefixes module: check kernel param length at compile time, not runtime Remove stop_machine during module load v2 module: simplify load_module.	2008-10-23 10:00:14 -07:00
Linus Torvalds	296e1ce0dc	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (36 commits) V4L/DVB (9336): cx88: always de-alloc frontends on fault condition V4L/DVB (9335): videobuf: split unregister bus creating self-contained frontend de-allocator V4L/DVB (9334): cx88: dvb_remove debug output V4L/DVB (9333): cx88: Not all boards that requires cx88-mpeg has frontends V4L/DVB (9332): cx88: initial fix for analogue only compilation V4L/DVB (9331): Remove unused inode parameter from video_ioctl2 V4L/DVB (9330): Get rid of inode parameter at v4l_compat_translate_ioctl() V4L/DVB (9328): ivtvfb: FB_BLANK_POWERDOWN turns off video output V4L/DVB (9327): v4l: use video_device.num instead of minor in video%d V4L/DVB (9326): ivtv: avoid green flashing when loading ivtv V4L/DVB (9325): ivtv: switch to unlocked_ioctl. V4L/DVB (9324): v4l2: add video_ioctl2_unlocked for unlocked_ioctl support. V4L/DVB (9323): v4l2-int-if: Add enum_framesizes and enum_frameintervals ioctls. V4L/DVB (9322): v4l2-int-if: Export more interfaces to modules V4L/DVB (9321): v4l2-int-if: Define new power state changes V4L/DVB (9320): v4l2: Add 10-bit RAW Bayer formats V4L/DVB (9319): v4l2-int-if: Add cropcap, g_crop and s_crop commands. V4L/DVB (9318): v4l2-int-if: Add command to get slave private data. V4L/DVB (9316): s5h1411: Power down s5h1411 when not in use V4L/DVB (9315): s5h1411: Skip reconfiguring demod modulation if already at the desired modulation ...	2008-10-23 09:59:29 -07:00
Linus Torvalds	6770ab5cf5	Merge git://git.infradead.org/iommu-2.6 * git://git.infradead.org/iommu-2.6: Admit to maintaining VT-d, for my sins. dmar: fix uninitialised 'ret' variable in dmar_parse_dev() intel-iommu: use coherent_dma_mask in alloc_coherent amd_iommu: fix nasty bug that caused ILLEGAL_DEVICE_TABLE_ENTRY errors intel-iommu: IA64 support dmar: remove the quirk which disables dma-remapping when intr-remapping enabled dmar: Use queued invalidation interface for IOTLB and context invalidation dmar: context cache and IOTLB invalidation using queued invalidation dmar: use spin_lock_irqsave() in qi_submit_sync()	2008-10-23 09:53:14 -07:00
Linus Torvalds	3e5cce627c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm: tidy local_init dm: remove unused flush_all dm raid1: separate region_hash interface part1 dm: mark split bio as cloned dm crypt: remove waitqueue dm crypt: fix async split dm crypt: tidy sector dm: remove dm header from targets dm: publish array_too_big dm exception store: fix misordered writes dm exception store: refactor zero_area dm snapshot: drop unused last_percent dm snapshot: fix primary_pe race dm kcopyd: avoid queue shuffle	2008-10-23 09:50:12 -07:00
Linus Torvalds	133e887f90	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: disable the hrtick for now sched: revert back to per-rq vruntime sched: fair scheduler should not resched rt tasks sched: optimize group load balancer sched: minor fast-path overhead reduction sched: fix the wrong mask_len, cleanup sched: kill unused scheduler decl. sched: fix the wrong mask_len sched: only update rq->clock while holding rq->lock	2008-10-23 09:37:16 -07:00
Linus Torvalds	e82cff752f	Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: NULL struct irq_desc's member 'name' in dynamic_irq_cleanup() genirq: fix off by one and coding style genirq: fix set_irq_type() when recording trigger type	2008-10-23 09:36:55 -07:00
Lee Howard	7106b4e333	8250: Oxford Semiconductor Devices Add support for the OxSemi 'Tornado' devices. Reformatted and reworked a bit by Alan Cox Signed-off-by: Lee Howard <lee.howard@mainpine.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-23 09:31:09 -07:00
KAMEZAWA Hiroyuki	94b6da5ab8	memcg: fix page_cgroup allocation page_cgroup_init() is called from mem_cgroup_init(). But at this point, we cannot call alloc_bootmem(). (and this caused panic at boot.) This patch moves page_cgroup_init() to init/main.c. Time table is following: == parse_args(). # we can trust mem_cgroup_subsys.disabled bit after this. .... cgroup_init_early() # "early" init of cgroup. .... setup_arch() # memmap is allocated. ... page_cgroup_init(); mem_init(); # we cannot call alloc_bootmem after this. .... cgroup_init() # mem_cgroup is initialized. == Before page_cgroup_init(), mem_map must be initialized. So, I added page_cgroup_init() to init/main.c directly. (*) maybe this is not very clean but - cgroup_init_early() is too early - in cgroup_init(), we have to use vmalloc instead of alloc_bootmem(). use of vmalloc area in x86-32 is important and we should avoid very large vmalloc() in x86-32. So, we want to use alloc_bootmem() and added page_cgroup_init() directly to init/main.c [akpm@linux-foundation.org: remove unneeded/bad mem_cgroup_subsys declaration] [akpm@linux-foundation.org: fix build] Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Tested-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-23 08:55:02 -07:00
Hidehiro Kawai	4afe978530	jbd: fix error handling for checkpoint io When a checkpointing IO fails, current JBD code doesn't check the error and continue journaling. This means latest metadata can be lost from both the journal and filesystem. This patch leaves the failed metadata blocks in the journal space and aborts journaling in the case of log_do_checkpoint(). To achieve this, we need to do: 1. don't remove the failed buffer from the checkpoint list where in the case of __try_to_free_cp_buf() because it may be released or overwritten by a later transaction 2. log_do_checkpoint() is the last chance, remove the failed buffer from the checkpoint list and abort the journal 3. when checkpointing fails, don't update the journal super block to prevent the journaled contents from being cleaned. For safety, don't update j_tail and j_tail_sequence either 4. when checkpointing fails, notify this error to the ext3 layer so that ext3 don't clear the needs_recovery flag, otherwise the journaled contents are ignored and cleaned in the recovery phase 5. if the recovery fails, keep the needs_recovery flag 6. prevent cleanup_journal_tail() from being called between __journal_drop_transaction() and journal_abort() (a race issue between journal_flush() and __log_wait_for_space() Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Acked-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-23 08:55:01 -07:00
Paul Mundt	66f50ee3ce	profiling: fix up CONFIG_PROC_FS=n build In the case where procfs is disabled, create_proc_profile() does not exist. Stub it in with the others. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-23 08:55:01 -07:00
Linus Torvalds	9bf9b2f3ad	Merge git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (53 commits) powerpc: Support for relocatable kdump kernel powerpc: Don't use a 16G page if beyond mem= limits powerpc: Add del_node() for early boot code to prune inapplicable devices. powerpc: Further compile fixup for STRICT_MM_TYPECHECKS powerpc: Remove empty #else from signal_64.c powerpc: Move memory size print into common show_cpuinfo for 32-bit hvc_console: Remove __devexit annotation of hvc_remove() hvc_console: Add support for tty window resizing hvc_console: Fix loop if put_char() returns 0 hvc_console: Add tty driver flag TTY_DRIVER_RESET_TERMIOS hvc_console: Add a hangup notifier for backends powerpc/83xx: Add DS1339 RTC support for MPC8349E-mITX boards .dts powerpc/83xx: Add support for MCU microcontroller in .dts files powerpc/85xx: Move mpc8572ds.dts to address-cells/size-cells = <2> of/spi: Support specifying chip select as active high via device tree powerpc: Remove device_type = "board_control" properties in .dts files i2c-cpm: Suppress autoprobing for devices powerpc/85xx: Fix mpc8536ds dma interrupt numbers powerpc/85xx: Enable enhanced functions for 8536 TSEC powerpc: Delete unused prom_strtoul and prom_memparse ...	2008-10-23 08:28:25 -07:00
Linus Torvalds	9779a8325a	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb * 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb: (47 commits) uwb: wrong sizeof argument in mac address compare uwb: don't use printk_ratelimit() so often uwb: use kcalloc where appropriate uwb: use time_after() when purging stale beacons uwb: add credits for the original developers of the UWB/WUSB/WLP subsystems uwb: add entries in the MAINTAINERS file uwb: depend on EXPERIMENTAL wusb: wusb-cbaf (CBA driver) sysfs ABI simplification uwb: document UWB and WUSB sysfs files uwb: add symlinks in sysfs between radio controllers and PALs uwb: dont tranmit identification IEs uwb: i1480/GUWA100U: fix firmware download issues uwb: i1480: remove MAC/PHY information checking function uwb: add Intel i1480 HWA to the UWB RC quirk table uwb: disable command/event filtering for D-Link DUB-1210 uwb: initialize the debug sub-system uwb: Fix handling IEs with empty IE data in uwb_est_get_size() wusb: fix bmRequestType for Abort RPipe request wusb: fix error path for wusb_set_dev_addr() wusb: add HWA host controller driver ...	2008-10-23 08:20:34 -07:00
Linus Torvalds	309e1e4240	Merge branch 'for-next' of git://git.o-hand.com/linux-mfd * 'for-next' of git://git.o-hand.com/linux-mfd: mfd: check for platform_get_irq() return value in sm501 mfd: use pci_ioremap_bar() in sm501 mfd: Don't store volatile bits in WM8350 register cache mfd: don't export wm3850 static functions mfd: twl4030-gpio driver mfd: rtc-twl4030 driver mfd: twl4030 IRQ handling update	2008-10-23 08:18:33 -07:00
Linus Torvalds	724bdd097e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/ehca: Reject dynamic memory add/remove when ehca adapter is present IB/ehca: Fix reported max number of QPs and CQs in systems with >1 adapter IPoIB: Set netdev offload features properly for child (VLAN) interfaces IPoIB: Clean up ethtool support mlx4_core: Add Ethernet PCI device IDs mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC mlx4_core: Multiple port type support mlx4_core: Ethernet MAC/VLAN management mlx4_core: Get ethernet MTU and default address from firmware mlx4_core: Support multiple pre-reserved QP regions Update NetEffect maintainer emails to Intel emails RDMA/cxgb3: Remove cmid reference on tid allocation failures IB/mad: Use krealloc() to resize snoop table IPoIB: Always initialize poll_timer to avoid crash on unload IB/ehca: Don't allow creating UC QP with SRQ mlx4_core: Add QP range reservation support RDMA/ucma: Test ucma_alloc_multicast() return against NULL, not with IS_ERR()	2008-10-23 08:16:03 -07:00
Alexey Dobriyan	59c7572e82	proc: remove fs/proc/proc_misc.c Now that everything was moved to their more or less expected places, apply rm(1). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 18:54:05 +04:00
Alexey Dobriyan	5aa140c2de	proc: move /proc/vmcore creation to fs/proc/vmcore.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 18:51:22 +04:00
Alexey Dobriyan	97ce5d6dcb	proc: move all /proc/kcore stuff to fs/proc/kcore.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 18:32:38 +04:00
Alexey Dobriyan	b5aadf7f14	proc: move /proc/schedstat boilerplate to kernel/sched_stats.h Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 18:06:12 +04:00
Steven Rostedt	08f5ac906d	ftrace: remove ftrace hash The ftrace hash was used by the ftrace_daemon code. The record ip function would place the calling address (ip) into the hash. The daemon would later read the hash and modify that code. The hash complicates the code. This patch removes it. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-23 16:00:24 +02:00
Steven Rostedt	4d296c2432	ftrace: remove mcount set The arch dependent function ftrace_mcount_set was only used by the daemon start up code. This patch removes it. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-23 16:00:23 +02:00
Steven Rostedt	81adbdc029	ftrace: only have ftrace_kill atomic When an anomaly is detected, we need a way to completely disable ftrace. Right now we have two functions: ftrace_kill and ftrace_kill_atomic. The ftrace_kill tries to do it in a "nice" way by converting everything back to a nop. The "nice" way is dangerous itself, so this patch removes it and only has the "atomic" version, which is all that is needed. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-23 16:00:19 +02:00
Steven Rostedt	593eb8a2d6	ftrace: return error on failed modified text. Have the ftrace_modify_code return error values: -EFAULT on error of reading the address -EINVAL if what is read does not match what it expected -EPERM if the write fails to update after a successful match. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-23 16:00:13 +02:00
Alexey Dobriyan	31d85ab28e	proc: move /proc/diskstats boilerplate to block/genhd.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-23 17:57:37 +04:00
Alexey Dobriyan	5c9fe6281b	proc: move /proc/zoneinfo boilerplate to mm/vmstat.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Christoph Lameter <cl@linux-foundation.org>	2008-10-23 17:35:04 +04:00
Alexey Dobriyan	b6aa44ab69	proc: move /proc/vmstat boilerplate to mm/vmstat.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Christoph Lameter <cl@linux-foundation.org>	2008-10-23 17:12:51 +04:00
Alexey Dobriyan	74e2e8e8ce	proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 16:33:29 +04:00
Alexey Dobriyan	8f32f7e5ac	proc: move /proc/buddyinfo boilerplate to mm/vmstat.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 16:12:04 +04:00
Alexey Dobriyan	5f6a6a9c4e	proc: move /proc/vmallocinfo to mm/vmalloc.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Christoph Lameter <cl@linux-foundation.org>	2008-10-23 15:48:28 +04:00
Alexey Dobriyan	7b3c3a50a3	proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c Lose dummy ->write hook in case of SLUB, it's possible now. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>	2008-10-23 15:20:06 +04:00
Alexey Dobriyan	f500975a3f	proc: move rest of /proc/partitions code to block/genhd.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-23 15:07:31 +04:00
Alexey Dobriyan	d8ba7a3633	proc: move rest of /proc/locks to fs/locks.c Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 14:37:00 +04:00
Alexey Dobriyan	e1759c215b	proc: switch /proc/meminfo to seq_file and move it to fs/proc/meminfo.c while I'm at it. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 13:52:40 +04:00
Mimi Zohar	08b9fe6b12	[PATCH] i_version: remount support Add support for remounting a filesystem with the i_version option. Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2008-10-23 05:13:28 -04:00
Miklos Szeredi	f696a3659f	[PATCH] move executable checking into ->permission() For execute permission on a regular files we need to check if file has any execute bits at all, regardless of capabilites. This check is normally performed by generic_permission() but was also added to the case when the filesystem defines its own ->permission() method. In the latter case the filesystem should be responsible for performing this check. Move the check from inode_permission() inside filesystems which are not calling generic_permission(). Create a helper function execute_ok() that returns true if the inode is a directory or if any execute bits are present in i_mode. Also fix up the following code: - coda control file is never executable - sysctl files are never executable - hfs_permission seems broken on MAY_EXEC, remove - hfsplus_permission is eqivalent to generic_permission(), remove Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-23 05:13:25 -04:00
OGAWA Hirofumi	4e9ed2f85a	[PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent This adds LOOKUP_RENAME_TARGET intent for lookup of rename destination. LOOKUP_RENAME_TARGET is going to be used like LOOKUP_CREATE. But since the destination of rename() can be existing directory entry, so it has a difference. Although that difference doesn't matter in my usage, this tells it to user of this intent. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>	2008-10-23 05:13:20 -04:00
OGAWA Hirofumi	e2761a1167	[PATCH vfs-2.6 2/6] vfs: add d_ancestor() This adds d_ancestor() instead of d_isparent(), then use it. If new_dentry == old_dentry, is_subdir() returns 1, looks strange. "new_dentry == old_dentry" is not subdir obviously. But I'm not checking callers for now, so this keeps current behavior. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>	2008-10-23 05:13:16 -04:00
Alexey Dobriyan	6de24f0ed0	[PATCH 1/2] anondev: init IDR statically Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2008-10-23 05:13:13 -04:00
Christoph Hellwig	9308a6128d	[PATCH] kill d_alloc_anon Remove d_alloc_anon now that no users are left. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-23 05:13:02 -04:00
Christoph Hellwig	4ea3ada295	[PATCH] new helper: d_obtain_alias The calling conventions of d_alloc_anon are rather unfortunate for all users, and it's name is not very descriptive either. Add d_obtain_alias as a new exported helper that drops the inode reference in the failure case, too and allows to pass-through NULL pointers and inodes to allow for tail-calls in the export operations. Incidentally this helper already existed as a private function in libfs.c as exportfs_d_alloc so kill that one and switch the callers to d_obtain_alias. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-23 05:13:00 -04:00
Al Viro	3516586a42	[PATCH] make O_EXCL in nd->intent.flags visible in nd->flags New flag: LOOKUP_EXCL. Set before doing the final step of pathname resolution on the paths that have LOOKUP_CREATE and O_EXCL. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-23 05:12:56 -04:00
Herbert Xu	b63365a2d6	net: Fix disjunct computation of netdev features My change commit `e2a6b85247` net: Enable TSO if supported by at least one device didn't do what was intended because the netdev_compute_features function was designed for conjunctions. So what happened was that it would simply take the TSO status of the last constituent device. This patch extends it to support both conjunctions and disjunctions under the new name of netdev_increment_features. It also adds a new function netdev_fix_features which does the sanity checking that usually occurs upon registration. This ensures that the computation doesn't result in an illegal combination since this checking is absent when the change is initiated via ethtool. The two users of netdev_compute_features have been converted. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-23 01:11:29 -07:00
Al Viro	d181146572	[PATCH] new helper - kern_path() Analog of lookup_path(), takes struct path *. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-23 03:34:19 -04:00
Len Brown	057316cc6a	Merge branch 'linus' into test Conflicts: MAINTAINERS arch/x86/kernel/acpi/boot.c arch/x86/kernel/acpi/sleep.c drivers/acpi/Kconfig drivers/pnp/Makefile drivers/pnp/quirks.c Signed-off-by: Len Brown <len.brown@intel.com>	2008-10-23 00:11:07 -04:00
Len Brown	1b79b27da1	Merge branch 'yinghai' into test	2008-10-22 23:35:56 -04:00
Len Brown	4dff4e7f6c	Merge branch 'pnp-debug' into test	2008-10-22 23:28:43 -04:00
Len Brown	530bc23bfe	Merge branch 'i7300_idle' into test	2008-10-22 23:28:36 -04:00
Len Brown	6b3c4f8b9c	Merge branch 'FW_BUG' into test	2008-10-22 23:19:45 -04:00
Tejun Heo	848e4c68c4	libata: transfer EHI control flags to slave ehc.i ATA_EHI_NO_AUTOPSY and ATA_EHI_QUIET are used to control the behavior of EH. As only the master link is visible outside EH, these flags are set only for the master link although they should also apply to the slave link, which causes spurious EH messages during probe and suspend/resume. This patch transfers those two flags to slave ehc.i before performing slave autopsy and reporting. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-10-22 20:40:19 -04:00
Stephen Rothwell	1388cc964e	PCI: don't export linux/io.h from pci.h Move the include of io.h down into the #ifdef __KERNEL__ protected region. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:46 -07:00
Alex Chiang	58319b802a	PCI: Hotplug core: remove 'name' Now that the PCI core manages the 'name' for each individual hotplug driver, and all drivers (except rpaphp) have been converted to use hotplug_slot_name(), there is no need for the PCI hotplug core to drag around its own copy of name either. Cc: kristen.c.accardi@intel.com Cc: matthew@wil.cx Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:43 -07:00
Alex Chiang	0ad772ec46	PCI, PCI Hotplug: introduce slot_name helpers In preparation for cleaning up the various hotplug drivers such that they don't have to manage their own 'name' parameters anymore, we provide the following convenience functions: pci_slot_name() hotplug_slot_name() These helpers will be used by individual hotplug drivers. Cc: kristen.c.accardi@intel.com Cc: matthew@wil.cx Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:40 -07:00
Alex Chiang	828f37683e	PCI: update pci_create_slot() to take a 'hotplug' param Slot detection drivers can co-exist with hotplug drivers. The names of the detected/claimed slots may be different depending on module load order. For legacy reasons, we need to allow hotplug drivers to override the slot name if a detection driver is loaded first (and they find the same slots). Creating and overriding slot names should be an atomic operation, otherwise you get a locking nightmare as various drivers race to call pci_create_slot(). pci_create_slot() is already serialized by grabbing the pci_bus_sem. We update the API and add a 'hotplug' param, which is: set if the caller is a hotplug driver NULL if the caller is a detection driver pci_create_slot() does not actually use the 'hotplug' parameter in this patch. A later patch will add the logic that uses it. Cc: kristen.c.accardi@intel.com Cc: matthew@wil.cx Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:38 -07:00
Alex Chiang	d25b7c8d6b	PCI: rename pci_update_slot_number to pci_renumber_slot The GPL exported symbol pci_update_slot_number has been renamed to pci_renumber_slot. Some of the safety checks were unnecessary and were removed. Cc: kristen.c.accardi@intel.com Cc: matthew@wil.cx Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:37 -07:00
Alex Chiang	1359f2701b	PCI Hotplug core: add 'name' param pci_hp_register interface Update pci_hp_register() to take a const char *name parameter. The motivation for this is to clean up the individual hotplug drivers so that each one does not have to manage its own name. The PCI core should be the place where we manage the name. We update the interface and all callsites first, in a "no functional change" manner, and clean up the drivers later. Cc: kristen.c.accardi@intel.com Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:37 -07:00
Jesse Barnes	64c7f63c1b	PCI: include io.h in pci.h so that ioremap_nocache is defined Ingo pointed out that the m32r build was broken by pci_ioremap. It looks like some files include pci.h w/o including io.h. The latter defines ioremap_* if present, so it makes sense to include it in pci.h now that we have pci_ioremap there. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:36 -07:00
Sheng Yang	8dd7f8036c	PCI: add support for function level reset Sometimes, it's necessary to enable software's ability to quiesce and reset endpoint hardware with function-level granularity, so provide support for it. The patch implement Function Level Reset(FLR) feature following PCI-e spec. And this is the first step. We would add more generic method, like D0/D3, to allow more devices support this function. The patch contains two functions. pcie_reset_function() is the new driver API, and, contains some action to quiesce a device. The other function is a helper: pcie_execute_reset_function() just executes the reset for a particular device function. Current the usage model is in KVM. Function reset is necessary for assigning device to a guest, or moving it between partitions. For Function Level Reset(FLR), please refer to PCI Express spec chapter 6.6.2. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-22 16:42:35 -07:00
Yevgeny Petrilin	7ff93f8b7e	mlx4_core: Multiple port type support Multi-protocol adapters support different port types. Each consumer of mlx4_core queries for supported port types; in particular mlx4_ib can no longer assume that all physical ports belong to it. Port type is configured through a sysfs interface. When the type of a port is changed, all mlx4 interfaces are unregistered, and then registered again with the new port types. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-22 15:38:42 -07:00
Yevgeny Petrilin	2a2336f822	mlx4_core: Ethernet MAC/VLAN management Add support for managing MAC and VLAN filters for each port. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Oren Duer <oren@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-22 11:44:46 -07:00
Anton Vorontsov	11f1f2afd6	i2c: Add info->archdata field If present the info->archdata is copied into the dev->archdata. Some (OpenFirmware) platforms need it. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-10-22 20:21:33 +02:00
Jean Delvare	3ae70deef0	i2c: Clean up <linux/i2c.h> Fix most checkpatch.pl errors and warnings. This includes replacing spaces with tabs in many places, adding and removing spaces, and folding long lines. Also complete a couple prototypes to make it clearer what the parameters represent. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-10-22 20:21:32 +02:00
Jean Delvare	c0589d4bc1	i2c: Drop 2-byte address block transfer defines We have no users and no implementers for these transfer types so it makes little sense to define functionality bits for them. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-10-22 20:21:31 +02:00
Jean Delvare	7d1d8999b4	i2c: Constify i2c_get_clientdata's parameter i2c_get_clientdata doesn't change the i2c_client it is passed as a parameter, so it can be constified. Same for i2c_get_adapdata. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-10-22 20:21:31 +02:00
Wolfram Sang	14f55f7a03	i2c: Make clear what the class field of i2c_adapter is good for Make clear what the class field of i2c_adapter is good for. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-10-22 20:21:30 +02:00
David Miller	30091404af	i2c-algo-pcf: Add adapter hooks around xfer begin and end Some I2C bus implementations need to synchronize with external entities, such as system firmware, which might also be programming the same I2C bus. In order to facilitate this add ->xfer_begin() and ->xfer_end() hooks which are invoked around pcf_xfer(). [JD: Make these hooks optional.] Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-10-22 20:21:30 +02:00
David Miller	08e5338d11	i2c-algo-pcf: Pass adapter data into ->waitforpin() method Pass adapter data into ->waitforpin() method. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-10-22 20:21:29 +02:00
Yevgeny Petrilin	b79acb49de	mlx4_core: Get ethernet MTU and default address from firmware Get maximum ethernet MTU and default MAC address from the firmware QUERY_DEV_CAP command. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-22 10:56:48 -07:00
Yevgeny Petrilin	93fc9e1bb6	mlx4_core: Support multiple pre-reserved QP regions For ethernet support, we need to reserve QPs for the ethernet and fibre channel driver. The QPs are reserved at the end of the QP table. (This way we assure that they are aligned to their size) We need to consider these reserved ranges in bitmap creation, so we extend the mlx4 bitmap utility functions to allow reserved ranges at both the bottom and the top of the range. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-22 10:25:29 -07:00
Jiri Slaby	a73a63701f	HID: add hid_type to general hid struct Add type to the hid structure to distinguish to which device type (now only mouse) we are talking to. Needed for per device type ignore list support. Note: this patch leaves the type as unknown for bluetooth devices, there is not support for this in the hidp code. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-10-22 14:45:11 +02:00
Catalin Marinas	319edafef6	smc911x: Add IRQ polarity configuration Platforms like ARM Ltd's RealView require the IRQ polarity bit to be set for the SMC9118 chip. This patch allows the dynamic configuration via the smc911x_platdata structure. This patch also changes the smc91x_platdata structure name to the correct smc911x_platdata in the smc911x_drv_probe() function. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-10-22 07:00:38 -04:00
Li Zefan	4ce72a2c06	sched: add CONFIG_SMP consistency a patch from Henrik Austad did this: >> Do not declare select_task_rq as part of sched_class when CONFIG_SMP is >> not set. Peter observed: > While a proper cleanup, could you do it by re-arranging the methods so > as to not create an additional ifdef? Do not declare select_task_rq and some other methods as part of sched_class when CONFIG_SMP is not set. Also gather those methods to avoid CONFIG_SMP mess. Idea-by: Henrik Austad <henrik.austad@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Henrik Austad <henrik@austad.us> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-22 10:01:52 +02:00
Thomas Gleixner	268a3dcfea	Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 Conflicts: kernel/time/tick-sched.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-22 09:48:06 +02:00
Ingo Molnar	debfcaf93e	Merge branch 'tracing/ftrace' into tracing/urgent	2008-10-22 09:08:14 +02:00
Andy Henroid	27471fdb32	i7300_idle driver v1.55 The Intel 7300 Memory Controller supports dynamic throttling of memory which can be used to save power when system is idle. This driver does the memory throttling when all CPUs are idle on such a system. Refer to "Intel 7300 Memory Controller Hub (MCH)" datasheet for the config space description. Signed-off-by: Andy Henroid <andrew.d.henroid@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>	2008-10-21 23:58:41 -04:00
David Brownell	a30d46c042	mfd: twl4030 IRQ handling update - Move it into a separate file; clean and streamline it - Restructure the init code for reuse during secondary dispatch - Support both levels (primary, secondary) of IRQ dispatch - Use a workqueue for irq mask/unmask and trigger configuration Code for two subchips currently share that secondary handler code. One is the power subchip; its IRQs are now handled by this core, courtesy of this patch. The other is the GPIO module, which will be supported through a later patch. There are also minor changes to the header file, mostly related to GPIO support; nothing yet in mainline cares about those. A few references to OMAP-specific symbols are disabled; when they can all be removed, the TWL4030 support ceases being OMAP-specific. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-22 01:19:37 +02:00
Heiko Carstens	0d557dc97f	workqueue: introduce create_rt_workqueue create_rt_workqueue will create a real time prioritized workqueue. This is needed for the conversion of stop_machine to a workqueue based implementation. This patch adds yet another parameter to __create_workqueue_key to tell it that we want an rt workqueue. However it looks like we rather should have something like "int type" instead of singlethread, freezable and rt. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu>	2008-10-22 10:00:25 +11:00
Rusty Russell	67e67ceaac	core_param() for genuinely core kernel parameters There are a lot of one-liner uses of __setup() in the kernel: they're cumbersome and not queryable (definitely not settable) via /sys. Yet it's ugly to simplify them to module_param(), because by default that inserts a prefix of the module name (usually filename). So, introduce a "core_param". The parameter gets no prefix, but appears in /sys/module/kernel/parameters/ (if non-zero perms arg). I thought about using the name "core", but that's more common than "kernel". And if you create a module called "kernel", you will die a horrible death. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-10-22 10:00:23 +11:00
Rusty Russell	9b473de872	param: Fix duplicate module prefixes Instead of insisting each new module_param sysfs entry is unique, handle the case where it already exists (for builtin modules). The current code assumes that all identical prefixes are together in the section: true for normal uses, but not necessarily so if someone overrides MODULE_PARAM_PREFIX. More importantly, it's not true with the new "core_param()" code which uses "kernel" as a prefix. This simplifies the caller for the builtin case, at a slight loss of efficiency (we do the lookup every time to see if the directory exists). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-22 10:00:23 +11:00
Rusty Russell	730b69d225	module: check kernel param length at compile time, not runtime The kparam code tries to handle over-length parameter prefixes at runtime. Not only would I bet this has never been tested, it's not clear that truncating names is a good idea either. So let's check at compile time. We need to move the #define to moduleparam.h to do this, though. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-10-22 10:00:22 +11:00
Rusty Russell	5e458cc0f4	module: simplify load_module. Linus' recent catch of stack overflow in load_module lead me to look at the code. A couple of helpers to get a section address and get objects from a section can help clean things up a little. (And in case you're wondering, the stack size also dropped from 328 to 284 bytes). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-10-22 10:00:15 +11:00
David Woodhouse	b876d08f81	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/pci/dmar.c	2008-10-21 19:42:20 +01:00
Heinz Mauelshagen	1f965b1943	dm raid1: separate region_hash interface part1 Separate the region hash code from raid1 so it can be shared by forthcoming targets. Use BUG_ON() for failed async dm_io() calls. Signed-off-by: Heinz Mauelshagen <hjm@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-21 17:45:06 +01:00
Mikulas Patocka	d63a5ce3c0	dm: publish array_too_big Move array_too_big to include/linux/device-mapper.h because it is used by targets. Remove the test from dm-raid1 as the number of mirror legs is limited such that it can never fail. (Even for stripes it seems rather unlikely.) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-10-21 17:44:57 +01:00
Sergio Aguirre	733d710b09	V4L/DVB (9320): v4l2: Add 10-bit RAW Bayer formats Add 10-bit raw bayer format expanded to 16 bits. Adds also definition for 10-bit raw bayer format dpcm-compressed to 8 bits. Signed-off-by: Sergio Aguirre <saaguirre@ti.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-10-21 14:31:15 -02:00
Ingo Molnar	e9f95e6373	genirq: fix off by one and coding style Fix off-by-one in for_each_irq_desc_reverse(). Impact is near zero in practice, because nothing substantial wants to iterate down to IRQ#0 - but fix it nevertheless. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-21 15:54:40 +02:00
Al Viro	56b26add02	[PATCH] kill the rest of struct file propagation in block ioctls Now we can switch blkdev_ioctl() block_device/mode Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:49:14 -04:00
Al Viro	e436fdae70	[PATCH] get rid of blkdev_driver_ioctl() convert remaining callers to __blkdev_driver_ioctl() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:49:08 -04:00
Al Viro	572c489215	[PATCH] sanitize blkdev_get() and friends * get rid of fake struct file/struct dentry in __blkdev_get() * merge __blkdev_get() and do_open() * get rid of flags argument of blkdev_get() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:49:06 -04:00
Al Viro	e5eb8caa83	[PATCH] remember mode of reiserfs journal Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:49:04 -04:00
Al Viro	30c40d2c01	[PATCH] propagate mode through open_bdev_excl/close_bdev_excl replace open_bdev_excl/close_bdev_excl with variants taking fmode_t. superblock gets the value used to mount it stored in sb->s_mode Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:49:00 -04:00
Al Viro	9a1c354276	[PATCH] pass fmode_t to blkdev_put() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:48:58 -04:00
Al Viro	90b8f2824c	[PATCH] end of methods switch: remove the old ones Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:48:52 -04:00
Al Viro	d4430d62fa	[PATCH] beginning of methods conversion To keep the size of changesets sane we split the switch by drivers; to keep the damn thing bisectable we do the following: 1) rename the affected methods, add ones with correct prototypes, make (few) callers handle both. That's this changeset. 2) for each driver convert to new methods. ALL drivers are converted in this series. 3) kill the old (renamed) methods. Note that it _is_ a flagday; all in-tree drivers are converted and by the end of this series no trace of old methods remain. The only reason why we do that this way is to keep the damn thing bisectable and allow per-driver debugging if anything goes wrong. New methods: open(bdev, mode) release(disk, mode) ioctl(bdev, mode, cmd, arg) /* Called without BKL / compat_ioctl(bdev, mode, cmd, arg) locked_ioctl(bdev, mode, cmd, arg) / Called with BKL, legacy */ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:32 -04:00
Al Viro	badf8082c3	[PATCH] switch ide_disk_ops ->ioctl() to sane prototype Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:30 -04:00
Al Viro	633a08b812	[PATCH] introduce __blkdev_driver_ioctl() Analog of blkdev_driver_ioctl() with sane arguments. For now uses fake struct file, by the end of the series it won't and blkdev_driver_ioctl() will become a wrapper around it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:26 -04:00
Al Viro	bbc1cc9784	[PATCH] switch cdrom_{open,release,ioctl} to sane APIs ... convert to it in callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:22 -04:00
Al Viro	08f8585121	[PATCH] move block_device_operations to blkdev.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:20 -04:00
Al Viro	647b3d0084	[PATCH] lose unused arguments in dm ioctl callbacks Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:18 -04:00
Al Viro	1bddd9e645	[PATCH] lose the unused file argument in generic_ide_ioctl() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:16 -04:00
Al Viro	74f3c8aff3	[PATCH] switch scsi_cmd_ioctl() to passing fmode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:14 -04:00
Al Viro	e915e872ed	[PATCH] switch sg_scsi_ioctl() to passing fmode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:12 -04:00
Al Viro	86d434dede	[PATCH] eliminate use of ->f_flags in block methods store needed information in f_mode Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:08 -04:00
Al Viro	aeb5d72706	[PATCH] introduce fmode_t, do annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:06 -04:00
Benjamin Herrenschmidt	a02efb906d	Merge commit 'origin' into master Manual merge of: arch/powerpc/Kconfig arch/powerpc/include/asm/page.h	2008-10-21 15:52:04 +11:00
Carl Love	a5598ca0d4	powerpc/oprofile: Fix mutex locking for cell spu-oprofile The issue is the SPU code is not holding the kernel mutex lock while adding samples to the kernel buffer. This patch creates per SPU buffers to hold the data. Data is added to the buffers from in interrupt context. The data is periodically pushed to the kernel buffer via a new Oprofile function oprofile_put_buff(). The oprofile_put_buff() function is called via a work queue enabling the funtion to acquire the mutex lock. The existing user controls for adjusting the per CPU buffer size is used to control the size of the per SPU buffers. Similarly, overflows of the SPU buffers are reported by incrementing the per CPU buffer stats. This eliminates the need to have architecture specific controls for the per SPU buffers which is not acceptable to the OProfile user tool maintainer. The export of the oprofile add_event_entry() is removed as it is no longer needed given this patch. Note, this patch has not addressed the issue of indexing arrays by the spu number. This still needs to be fixed as the spu numbering is not guarenteed to be 0 to max_num_spus-1. Signed-off-by: Carl Love <carll@us.ibm.com> Signed-off-by: Maynard Johnson <maynardj@us.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Acked-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-10-21 15:17:48 +11:00
NeilBrown	3c0ee63a64	md: use sysfs_notify_dirent to notify changes to md/dev-xxx/state The 'state' file for a device reports, for example, when the device has failed. Changes should be reported to userspace ASAP without the possibility of blocking on low-memory. sysfs_notify does have that possibility (as it takes a mutex which can be held across a kmalloc) so use sysfs_notify_dirent instead. Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-21 13:25:28 +11:00
NeilBrown	b62b75905d	md: use sysfs_notify_dirent to notify changes to md/array_state Now that we have sysfs_notify_dirent, use it to notify changes to md/array_state. As sysfs_notify_dirent can be called in atomic context, we can remove the delayed notify and the MD_NOTIFY_ARRAY_STATE flag. Signed-off-by: NeilBrown <neilb@suse.de>	2008-10-21 13:25:21 +11:00
Trent Piepho	326bb8a5a1	leds: Make default trigger fields const The default_trigger fields of struct gpio_led and thus struct led_classdev are pretty much always assigned from a string literal, which means the string can't be modified. Which is fine, since there is no reason to modify the string and in fact it never is. But they should be marked const to prevent such code from being added, to prevent warnings if -Wwrite-strings is used, when assigned from a constant string other than a string literal (which produces a warning under current kernel compiler flags), and for general good coding practices. Signed-off-by: Trent Piepho <tpiepho@freescale.com> Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>	2008-10-20 22:34:12 +01:00
Linus Torvalds	a0bfb673dc	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (41 commits) PCI: fix pci_ioremap_bar() on s390 PCI: fix AER capability check PCI: use pci_find_ext_capability everywhere PCI: remove #ifdef DEBUG around dev_dbg call PCI hotplug: fix get_##name return value problem PCI: document the pcie_aspm kernel parameter PCI: introduce an pci_ioremap(pdev, barnr) function powerpc/PCI: Add legacy PCI access via sysfs PCI: Add ability to mmap legacy_io on some platforms PCI: probing debug message uniformization PCI: support PCIe ARI capability PCI: centralize the capabilities code in probe.c PCI: centralize the capabilities code in pci-sysfs.c PCI: fix 64-vbit prefetchable memory resource BARs PCI: replace cfg space size (256/4096) by macros. PCI: use resource_size() everywhere. PCI: use same arg names in PCI_VDEVICE comment PCI hotplug: rpaphp: make debug var unique PCI: use %pF instead of print_fn_descriptor_symbol() in quirks.c PCI: fix hotplug get_##name return value problem ...	2008-10-20 13:40:47 -07:00
Linus Torvalds	92b29b86fe	Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (131 commits) tracing/fastboot: improve help text tracing/stacktrace: improve help text tracing/fastboot: fix initcalls disposition in bootgraph.pl tracing/fastboot: fix bootgraph.pl initcall name regexp tracing/fastboot: fix issues and improve output of bootgraph.pl tracepoints: synchronize unregister static inline tracepoints: tracepoint_synchronize_unregister() ftrace: make ftrace_test_p6nop disassembler-friendly markers: fix synchronize marker unregister static inline tracing/fastboot: add better resolution to initcall debug/tracing trace: add build-time check to avoid overrunning hex buffer ftrace: fix hex output mode of ftrace tracing/fastboot: fix initcalls disposition in bootgraph.pl tracing/fastboot: fix printk format typo in boot tracer ftrace: return an error when setting a nonexistent tracer ftrace: make some tracers reentrant ring-buffer: make reentrant ring-buffer: move page indexes into page headers tracing/fastboot: only trace non-module initcalls ftrace: move pc counter in irqtrace ... Manually fix conflicts: - init/main.c: initcall tracing - kernel/module.c: verbose level vs tracepoints - scripts/bootgraph.pl: fallout from cherry-picking commits.	2008-10-20 13:35:07 -07:00
Linus Torvalds	9301975ec2	Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip This merges branches irq/genirq, irq/sparseirq-v4, timers/hpet-percpu and x86/uv. The sparseirq branch is just preliminary groundwork: no sparse IRQs are actually implemented by this tree anymore - just the new APIs are added while keeping the old way intact as well (the new APIs map 1:1 to irq_desc[]). The 'real' sparse IRQ support will then be a relatively small patch ontop of this - with a v2.6.29 merge target. * 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (178 commits) genirq: improve include files intr_remapping: fix typo io_apic: make irq_mis_count available on 64-bit too genirq: fix name space collisions of nr_irqs in arch/* genirq: fix name space collision of nr_irqs in autoprobe.c genirq: use iterators for irq_desc loops proc: fixup irq iterator genirq: add reverse iterator for irq_desc x86: move ack_bad_irq() to irq.c x86: unify show_interrupts() and proc helpers x86: cleanup show_interrupts genirq: cleanup the sparseirq modifications genirq: remove artifacts from sparseirq removal genirq: revert dynarray genirq: remove irq_to_desc_alloc genirq: remove sparse irq code genirq: use inline function for irq_to_desc genirq: consolidate nr_irqs and for_each_irq_desc() x86: remove sparse irq from Kconfig genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n ...	2008-10-20 13:23:01 -07:00
Linus Torvalds	99ebcf8285	Merge branch 'v28-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'v28-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits) fix documentation of sysrq-q really Fix documentation of sysrq-q timer_list: add base address to clock base timer_list: print cpu number of clockevents device timer_list: print real timer address NOHZ: restart tick device from irq_enter() NOHZ: split tick_nohz_restart_sched_tick() NOHZ: unify the nohz function calls in irq_enter() timers: fix itimer/many thread hang, fix timers: fix itimer/many thread hang, v3 ntp: improve adjtimex frequency rounding timekeeping: fix rounding problem during clock update ntp: let update_persistent_clock() sleep hrtimer: reorder struct hrtimer to save 8 bytes on 64bit builds posix-timers: lock_timer: make it readable posix-timers: lock_timer: kill the bogus ->it_id check posix-timers: kill ->it_sigev_signo and ->it_sigev_value posix-timers: sys_timer_create: cleanup the error handling posix-timers: move the initialization of timer->sigq from send to create path posix-timers: sys_timer_create: simplify and s/tasklist/rcu/ ... Fix trivial conflicts due to sysrq-q description clahes in Documentation/sysrq.txt and drivers/char/sysrq.c	2008-10-20 13:19:56 -07:00
Linus Torvalds	72558dde73	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (36 commits) ide: re-add TRM290 fix lost during ide_build_dmatable() cleanup scc_pata: kill unused variables sgiioc4: kill duplicate ioremap() sgiioc4: kill useless address checks delkin_cb: add PM support ide: remove broken hpt34x driver ide-floppy: remove idefloppy_floppy_t typedef sgiioc4: remove maskproc() method hpt366: cleanup maskproc() method ide: mask interrupt in ide_config_drive_speed() hpt366: fix compile warning ide: remove unused macros from <asm-parisc/ide.h> ide: remove M68K_IDE_SWAPW define from <asm-m68k/ide.h> ide: remove dead <asm-arm/arch-sa1100/ide.h> ide: fix support for IDE PCI controllers using MMIO on frv ide-cd: remove stale comment ide-cd: small drive type print fix ide-cd: debug log enhancements ide: add generic ATA/ATAPI disk driver ide: allow device drivers to specify per-device type /proc settings ...	2008-10-20 13:12:39 -07:00
Linus Torvalds	1d9a8a47d6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: implement nonseekable open fuse: add include protectors fuse: config description improvement fuse: add missing fuse_request_free fuse: fix SEEK_END incorrectness	2008-10-20 12:53:27 -07:00
David Woodhouse	b364776ad1	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/pci/intel-iommu.c	2008-10-20 20:19:36 +01:00
Heiko Carstens	96499871f4	PCI: fix pci_ioremap_bar() on s390 s390 doesn't have ioremap_*, so protect the definition of the new pci_ioremap_bar function with CONFIG_HAS_IOMEM to avoid build breakage. Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 11:28:34 -07:00
Yu Zhao	270c66be9b	PCI: fix AER capability check The 'use pci_find_ext_capability everywhere' cleanup brought a new bug, which makes the AER stop working. Fix it by actually using find_ext_cap instead of just find_cap. Drop the unused config space size define while we're at it. Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 11:01:52 -07:00
Jesse Barnes	0927678f55	PCI: use pci_find_ext_capability everywhere Remove some open coded (and buggy) versions of pci_find_ext_capability in favor of the real routine in the PCI core. Tested-by: Tomasz Czernecki <czernecki@gmail.com> Acked-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 11:01:51 -07:00
Arjan van de Ven	aa42d7c613	PCI: introduce an pci_ioremap(pdev, barnr) function A common thing in many PCI drivers is to ioremap() an entire bar. This is a slightly fragile thing right now, needing both an address and a size, and many driver writers do.. various things there. This patch introduces an pci_ioremap() function taking just a PCI device struct and the bar number as arguments, and figures this all out itself, in one place. In addition, we can add various sanity checks to this function (the patch already checks to make sure that the bar in question really is a MEM bar; few to no drivers do that sort of thing). Hopefully with this type of API we get less chance of mistakes in drivers with ioremap() operations. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 11:01:48 -07:00
Yu Zhao	58c3a727cb	PCI: support PCIe ARI capability This patch adds support for PCI Express Alternative Routing-ID Interpretation (ARI) capability. The ARI capability extends the Function Number field of the PCI Express Endpoint by reusing the Device Number which is otherwise hardwired to 0. With ARI, an Endpoint can have up to 256 functions. Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 10:54:32 -07:00
Zhao, Yu	c322b28a04	PCI: use same arg names in PCI_VDEVICE comment This cleanup makes the argument names in PCI_VDEVICE comment consistent with those used in its definition. Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 10:54:28 -07:00
Seth Heasley	37a84ec668	x86/PCI: irq and pci_ids patch for Intel Ibex Peak DeviceIDs This patch updates the Intel Ibex Peak (PCH) LPC and SMBus Controller DeviceIDs. The LPC Controller ID is set by Firmware within the range of 0x3b00-3b1f. This range is included in pci_ids.h using min and max values, and irq.c now has code to handle the range (in lieu of 32 additions to a SWITCH statement). The SMBus Controller ID is a fixed-value and will not change. Signed-off-by: Seth Heasley <seth.heasley@intel.com> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 10:53:48 -07:00
Yinghai Lu	16dbef4a83	PCI: change MSI-x vector to 32bit We are using 28bit pci (bus/dev/fn + 12 bits) as irq number, so the cache for irq number should be 32 bit too. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 10:53:42 -07:00
Rafael J. Wysocki	0235c4fc7f	PCI PM: Introduce function pci_wake_from_d3 Many device drivers use the following sequence of statements to enable the device to wake up the system while being in the D3_hot or D3_cold low power state: pci_enable_wake(pdev, PCI_D3hot, 1); pci_enable_wake(pdev, PCI_D3cold, 1); However, the second call is not necessary if the first one succeeds (the ordering of the statements above doesn't matter here) and it may even be harmful, because we are not supposed to enable PME# after the wake-up power has been enabled for the device. To allow drivers to overcome this problem, introduce function pci_wake_from_d3() that will enable the device to wake up the system from any of D3_hot and D3_cold as long as the wake-up from at least one of them is supported. Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 10:53:41 -07:00
Milton Miller	edbc25caaa	PCI: remove dynids.use_driver_data The driver flag dynids.use_driver_data is almost consistently not set, and causes more problems than it solves. It was initially intended as a flag to indicate whether a driver's usage of driver_data had been carefully inspected and was ready for values from userspace. That audit was never done, so most drivers just get a 0 for driver_data when new IDs are added from userspace via sysfs. So remove the flag, allowing drivers to see the data directly (a followon patch validates the passed driver_data value against what the drivers expect). Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-10-20 10:48:34 -07:00
Linus Torvalds	7d67474e50	Merge git://git.infradead.org/battery-2.6 * git://git.infradead.org/battery-2.6: bq27x00_battery: use unaligned access helper power_supply: fix dependency of tosa_battery power_supply: Support for Texas Instruments BQ27200 battery managers power_supply: Add function to return system-wide power state pda_power: Check and handle return value of set_irq_wake	2008-10-20 09:44:30 -07:00
Linus Torvalds	52c6738b7f	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: use correct fs type for v4 submounts and referrals Make nfs_file_cred more robust. NFS: Enable NFSv4 callback server to listen on AF_INET6 sockets	2008-10-20 09:39:20 -07:00
Steven Rostedt	606576ce81	ftrace: rename FTRACE to FUNCTION_TRACER Due to confusion between the ftrace infrastructure and the gcc profiling tracer "ftrace", this patch renames the config options from FTRACE to FUNCTION_TRACER. The other two names that are offspring from FTRACE DYNAMIC_FTRACE and FTRACE_MCOUNT_RECORD will stay the same. This patch was generated mostly by script, and partially by hand. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-20 18:27:03 +02:00
Linus Torvalds	3b72e44154	Merge branch 'for-next' of git://git.o-hand.com/linux-mfd * 'for-next' of git://git.o-hand.com/linux-mfd: mfd: further unbork the ucb1400 ac97_bus dependencies mfd: ucb1400 needs GPIO mfd: ucb1400 sound driver uses/depends on AC97_BUS: mfd: Don't use NO_IRQ in WM8350 mfd: update TMIO drivers to use the clock API mfd: twl4030-core irq simplification mfd: add base support for Dialog DA9030/DA9034 PMICs mfd: TWL4030 core driver mfd: support tmiofb cell on tc6393xb mfd: add OHCI cell to tc6393xb mfd: Fix htc-egpio compile warning mfd: do tcb6393xb state restore on resume only if requested mfd: provide and use setup hook for tc6393xb mfd: update sm501 debugging/low information messages mfd: reduce stack usage in mfd-core.c	2008-10-20 09:22:47 -07:00
Linus Torvalds	ed402af3c2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (112 commits) sh: Move SH-4 CPU headers down one more level. sh: Only build in gpio.o when CONFIG_GENERIC_GPIO is selected. sh: Migrate common board headers to mach-common/. sh: Move the CPU definition headers from asm/ to cpu/. serial: sh-sci: Add support SCIF of SH7723 video: add sh_mobile_lcdc platform flags video: remove unused sh_mobile_lcdc platform data sh: remove consistent alloc cruft sh: add dynamic crash base address support sh: reduce Migo-R smc91x overruns sh: Fix up some merge damage. Fix debugfs_create_file's error checking method for arch/sh/mm/ Fix debugfs_create_dir's error checking method for arch/sh/kernel/ sh: ap325rxa: Add support RTC RX-8564LC in AP325RXA board sh: Use sh7720 GPIO on magicpanelr2 board sh: Add sh7720 pinmux code sh: Use sh7203 GPIO on rsk7203 board sh: Add sh7203 pinmux code sh: Use sh7723 GPIO on AP325RXA board sh: Add sh7723 pinmux code ...	2008-10-20 09:13:34 -07:00
Linus Torvalds	2be508d847	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (69 commits) Revert "[MTD] m25p80.c code cleanup" [MTD] [NAND] GPIO driver depends on ARM... for now. [MTD] [NAND] sh_flctl: fix compile error [MTD] [NOR] AT49BV6416 has swapped erase regions [MTD] [NAND] GPIO NAND flash driver [MTD] cmdlineparts documentation change - explain where mtd-id comes from [MTD] cfi_cmdset_0002.c: Add Macronix CFI V1.0 TopBottom detection [MTD] [NAND] Fix compilation warnings in drivers/mtd/nand/cs553x_nand.c [JFFS2] Write buffer offset adjustment for NOR-ECC (Sibley) flash [MTD] mtdoops: Fix a bug where block may not be erased [MTD] mtdoops: Add a magic number to logged kernel oops [MTD] mtdoops: Fix an off by one error [JFFS2] Correct parameter names of jffs2_compress() in comments [MTD] [NAND] sh_flctl: add support for Renesas SuperH FLCTL [MTD] [NAND] Bug on atmel_nand HW ECC : OOB info not correctly written [MTD] [MAPS] Remove unused variable after ROM API cleanup. [MTD] m25p80.c extended jedec support (v2) [MTD] remove unused mtd parameter in of_mtd_parse_partitions() [MTD] [NAND] remove dead Kconfig associated with !CONFIG_PPC_MERGE [MTD] [NAND] driver extension to support NAND on TQM85xx modules ...	2008-10-20 09:03:12 -07:00
Parag Warudkar	01e8ef11bc	x86: sysfs: kill owner field from attribute Tejun's commit `7b595756ec` made sysfs attribute->owner unnecessary. But the field was left in the structure to ease the merge. It's been over a year since that change and it is now time to start killing attribute->owner along with its users - one arch at a time! This patch is attempt #1 to get rid of attribute->owner only for CONFIG_X86_64 or CONFIG_X86_32 . We will deal with other arches later on as and when possible - avr32 will be the next since that is something I can test. Compile (make allyesconfig / make allmodconfig / custom config) and boot tested. akpm: the idea is that we put the declaration of sttribute.owner inside `#ifndef CONFIG_X86'. But that proved to be too ambitious for now because new usages kept on turning up in subsystem trees. [akpm: remove the ifdef for now] Signed-off-by: Parag Warudkar <parag.lkml@gmail.com> Cc: Greg KH <greg@kroah.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tejun Heo <htejun@gmail.com> Cc: Len Brown <lenb@kernel.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: David Brownell <david-b@pacbell.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:42 -07:00
Adrian Bunk	5a85a7dda1	include/linux/bcd.h: remove comments - the macros are gone - there's no more code in this file, LGPL + GPL = GPL, and the code that was moved to lib/bcd.c is anyway trivial Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:42 -07:00
Adrian Bunk	a0098efd6e	remove the obsolete BCDBIN/BINBCD macros Remove the following obsolete macros: - BCD2BIN - BIN2BCD - BCD_TO_BIN - BIN_TO_BCD Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:41 -07:00
Adrian Bunk	fdd2e5f88a	make mm/rmap.c:anon_vma_cachep static This patch makes the needlessly global anon_vma_cachep static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:40 -07:00
Harvey Harrison	1d8cca44b6	byteorder: provide swabb.h generically in asm/byteorder.h This is needed during the transition to the new byteorder headers as the swabb.h functionality will be provided from asm/byteorder.h in the new version. To avoid breakage on arches still using the old implementation, provide swabb.h from asm/byteorder.h as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:40 -07:00
Harvey Harrison	acf0108a84	byteorder: use generic C version for value byteswapping This makes the new implementation of the byteorder helpers match the old in how it degraded when an arch-defined version was not available: 1) swab() - look for arch defined - if not, use generic c version 2) swabp() - look for arch-defined - if not, deref pointer and use swab() 3) swabs() - look for arch defined - if not, use swabp Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:40 -07:00
Harvey Harrison	b8e465f494	byteorder: add new headers for make headers-install Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:40 -07:00
Simon Horman	85a0ee342e	kdump: add is_vmcore_usable() and vmcore_unusable() The usage of elfcorehdr_addr has changed recently such that being set to ELFCORE_ADDR_MAX is used by is_kdump_kernel() to indicate if the code is executing in a kernel executed as a crash kernel. However, arch/ia64/kernel/setup.c:reserve_elfcorehdr will rest elfcorehdr_addr to ELFCORE_ADDR_MAX on error, which means any subsequent calls to is_kdump_kernel() will return 0, even though they should return 1. Ok, at this point in time there are no subsequent calls, but I think its fair to say that there is ample scope for error or at the very least confusion. This patch add an extra state, ELFCORE_ADDR_ERR, which indicates that elfcorehdr_addr was passed on the command line, and thus execution is taking place in a crashdump kernel, but vmcore can't be used for some reason. This is tested for using is_vmcore_usable() and set using vmcore_unusable(). A subsequent patch makes use of this new code. To summarise, the states that elfcorehdr_addr can now be in are as follows: ELFCORE_ADDR_MAX: not a crashdump kernel ELFCORE_ADDR_ERR: crashdump kernel but vmcore is unusable any other value: crash dump kernel and vmcore is usable Signed-off-by: Simon Horman <horms@verge.net.au> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:40 -07:00
Vivek Goyal	57cac4d188	kdump: make elfcorehdr_addr independent of CONFIG_PROC_VMCORE o elfcorehdr_addr is used by not only the code under CONFIG_PROC_VMCORE but also by the code which is not inside CONFIG_PROC_VMCORE. For example, is_kdump_kernel() is used by powerpc code to determine if kernel is booting after a panic then use previous kernel's TCE table. So even if CONFIG_PROC_VMCORE is not set in second kernel, one should be able to correctly determine that we are booting after a panic and setup calgary iommu accordingly. o So remove the assumption that elfcorehdr_addr is under CONFIG_PROC_VMCORE. o Move definition of elfcorehdr_addr to arch dependent crash files. (Unfortunately crash dump does not have an arch independent file otherwise that would have been the best place). o kexec.c is not the right place as one can Have CRASH_DUMP enabled in second kernel without KEXEC being enabled. o I don't see sh setup code parsing the command line for elfcorehdr_addr. I am wondering how does vmcore interface work on sh. Anyway, I am atleast defining elfcoredhr_addr so that compilation is not broken on sh. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Simon Horman <horms@verge.net.au> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:39 -07:00
Roland McGrath	656eb2cd5d	add CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS This adds a kconfig option to change the /proc/PID/coredump_filter default. Fedora has been carrying a trivial patch to change the hard-wired value for this default, since Fedora 8. The default default can't change safely because there are old GDB versions out there (all before 6.7) that are confused by the core dump files created by the MMF_DUMP_ELF_HEADERS setting. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kawai Hidehiro <hidehiro.kawai.ez@hitachi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:39 -07:00
Adrian Bunk	b747c8c102	make ptrace_untrace() static ptrace_untrace() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:39 -07:00
Lai Jiangshan	c459643540	bitmask: remove bitmap_scnprintf_len() bitmap_scnprintf_len() is not used now, so we remove it. Otherwise we have to maintain it and make its return value always equal to bitmap_scnprintf()'s return value. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:39 -07:00
Lai Jiangshan	3eda201180	seq_file: add seq_cpumask_list(), seq_nodemask_list() seq_cpumask_list(), seq_nodemask_list() are very like seq_cpumask(), seq_nodemask(), but they print human readable string. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:39 -07:00
KAMEZAWA Hiroyuki	52d4b9ac0b	memcg: allocate all page_cgroup at boot Allocate all page_cgroup at boot and remove page_cgroup poitner from struct page. This patch adds an interface as struct page_cgroup lookup_page_cgroup(struct page) All FLATMEM/DISCONTIGMEM/SPARSEMEM and MEMORY_HOTPLUG is supported. Remove page_cgroup pointer reduces the amount of memory by - 4 bytes per PAGE_SIZE. - 8 bytes per PAGE_SIZE if memory controller is disabled. (even if configured.) On usual 8GB x86-32 server, this saves 8MB of NORMAL_ZONE memory. On my x86-64 server with 48GB of memory, this saves 96MB of memory. I think this reduction makes sense. By pre-allocation, kmalloc/kfree in charge/uncharge are removed. This means - we're not necessary to be afraid of kmalloc faiulre. (this can happen because of gfp_mask type.) - we can avoid calling kmalloc/kfree. - we can avoid allocating tons of small objects which can be fragmented. - we can know what amount of memory will be used for this extra-lru handling. I added printk message as "allocated %ld bytes of page_cgroup" "please try cgroup_disable=memory option if you don't want" maybe enough informative for users. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:39 -07:00
Paul Menage	886465f407	cgroups: fix declaration of cgroup_mm_owner_callbacks The choice of real/dummy declaration for cgroup_mm_owner_callbacks() shouldn't be based on CONFIG_MM_OWNER, but on CONFIG_CGROUPS. Otherwise kernel/exit.c fails to compile when something other than a cgroups controller selects CONFIG_MM_OWNER Signed-off-by: Paul Menage <menage@google.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:38 -07:00
Paul Menage	cc31edceee	cgroups: convert tasks file to use a seq_file with shared pid array Rather than pre-generating the entire text for the "tasks" file each time the file is opened, we instead just generate/update the array of process ids and use a seq_file to report these to userspace. All open file handles on the same "tasks" file can share a pid array, which may be updated any time that no thread is actively reading the array. By sharing the array, the potential for userspace to DoS the system by opening many handles on the same "tasks" file is removed. [Based on a patch by Lai Jiangshan, extended to use seq_file] Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:38 -07:00
Lai Jiangshan	146aa1bd05	cgroups: fix probable race with put_css_set[_taskexit] and find_css_set put_css_set_taskexit may be called when find_css_set is called on other cpu. And the race will occur: put_css_set_taskexit side find_css_set side \| atomic_dec_and_test(&kref->refcount) \| /* kref->refcount = 0 / \| .................................................................... \| read_lock(&css_set_lock) \| find_existing_css_set \| get_css_set \| read_unlock(&css_set_lock); .................................................................... __release_css_set \| .................................................................... \| / use a released css_set */ \| [put_css_set is the same. But in the current code, all put_css_set are put into cgroup mutex critical region as the same as find_css_set.] [akpm@linux-foundation.org: repair comments] [menage@google.com: eliminate race in css_set refcounting] Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:38 -07:00
Hidehiro Kawai	0e4fb5e283	ext3: add an option to control error handling on file data If the journal doesn't abort when it gets an IO error in file data blocks, the file data corruption will spread silently. Because most of applications and commands do buffered writes without fsync(), they don't notice the IO error. It's scary for mission critical systems. On the other hand, if the journal aborts whenever it gets an IO error in file data blocks, the system will easily become inoperable. So this patch introduces a filesystem option to determine whether it aborts the journal or just call printk() when it gets an IO error in file data. If you mount a ext3 fs with data_err=abort option, it aborts on file data write error. If you mount it with data_err=ignore, it doesn't abort, just call printk(). data_err=ignore is the default. Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:37 -07:00
Krzysztof Helt	3e680aae4e	fb: convert lock/unlock_kernel() into local fb mutex Change lock_kernel()/unlock_kernel() to local fb mutex. Each frame buffer instance has its own mutex. The one line try_to_load() function is unrolled to request_module() in two places for readability. [righi.andrea@gmail.com: fb: fix NULL pointer BUG dereference in fb_open()] Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:36 -07:00
Matt Helsley	dc52ddc0e6	container freezer: implement freezer cgroup subsystem This patch implements a new freezer subsystem in the control groups framework. It provides a way to stop and resume execution of all tasks in a cgroup by writing in the cgroup filesystem. The freezer subsystem in the container filesystem defines a file named freezer.state. Writing "FROZEN" to the state file will freeze all tasks in the cgroup. Subsequently writing "RUNNING" will unfreeze the tasks in the cgroup. Reading will return the current state. * Examples of usage : # mkdir /containers/freezer # mount -t cgroup -ofreezer freezer /containers # mkdir /containers/0 # echo $some_pid > /containers/0/tasks to get status of the freezer subsystem : # cat /containers/0/freezer.state RUNNING to freeze all tasks in the container : # echo FROZEN > /containers/0/freezer.state # cat /containers/0/freezer.state FREEZING # cat /containers/0/freezer.state FROZEN to unfreeze all tasks in the container : # echo RUNNING > /containers/0/freezer.state # cat /containers/0/freezer.state RUNNING This is the basic mechanism which should do the right thing for user space task in a simple scenario. It's important to note that freezing can be incomplete. In that case we return EBUSY. This means that some tasks in the cgroup are busy doing something that prevents us from completely freezing the cgroup at this time. After EBUSY, the cgroup will remain partially frozen -- reflected by freezer.state reporting "FREEZING" when read. The state will remain "FREEZING" until one of these things happens: 1) Userspace cancels the freezing operation by writing "RUNNING" to the freezer.state file 2) Userspace retries the freezing operation by writing "FROZEN" to the freezer.state file (writing "FREEZING" is not legal and returns EIO) 3) The tasks that blocked the cgroup from entering the "FROZEN" state disappear from the cgroup's set of tasks. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: export thaw_process] Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:34 -07:00
Matt Helsley	8174f1503f	container freezer: make refrigerator always available Now that the TIF_FREEZE flag is available in all architectures, extract the refrigerator() and freeze_task() from kernel/power/process.c and make it available to all. The refrigerator() can now be used in a control group subsystem implementing a control group freezer. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Matt Helsley <matthltc@us.ibm.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:33 -07:00
KOSAKI Motohiro	e575f111dc	coredump_filter: add hugepage dumping Presently hugepage's vma has a VM_RESERVED flag in order not to be swapped. But a VM_RESERVED vma isn't core dumped because this flag is often used for some kernel vmas (e.g. vmalloc, sound related). Thus hugepages are never dumped and it can't be debugged easily. Many developers want hugepages to be included into core-dump. However, We can't read generic VM_RESERVED area because this area is often IO mapping area. then these area reading may change device state. it is definitly undesiable side-effect. So adding a hugepage specific bit to the coredump filter is better. It will be able to hugepage core dumping and doesn't cause any side-effect to any i/o devices. In additional, libhugetlb use hugetlb private mapping pages as anonymous page. Then, hugepage private mapping pages should be core dumped by default. Then, /proc/[pid]/core_dump_filter has two new bits. - bit 5 mean hugetlb private mapping pages are dumped or not. (default: yes) - bit 6 mean hugetlb shared mapping pages are dumped or not. (default: no) I tested by following method. % ulimit -c unlimited % ./crash_hugepage 50 % ./crash_hugepage 50 -p % ls -lh % gdb ./crash_hugepage core % % echo 0x43 > /proc/self/coredump_filter % ./crash_hugepage 50 % ./crash_hugepage 50 -p % ls -lh % gdb ./crash_hugepage core #include <stdlib.h> #include <stdio.h> #include <unistd.h> #include <sys/mman.h> #include <string.h> #include "hugetlbfs.h" int main(int argc, char** argv){ char* p; int ch; int mmap_flags = MAP_SHARED; int fd; int nr_pages; while((ch = getopt(argc, argv, "p")) != -1) { switch (ch) { case 'p': mmap_flags &= ~MAP_SHARED; mmap_flags \|= MAP_PRIVATE; break; default: /* nothing/ break; } } argc -= optind; argv += optind; if (argc == 0){ printf("need # of pages\n"); exit(1); } nr_pages = atoi(argv[0]); if (nr_pages < 2) { printf("nr_pages must >2\n"); exit(1); } fd = hugetlbfs_unlinked_fd(); p = mmap(NULL, nr_pages gethugepagesize(), PROT_READ\|PROT_WRITE, mmap_flags, fd, 0); sleep(2); (p + gethugepagesize()) = 1; / COW / sleep(2); / crash! / (int*)0 = 1; return 0; } Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Kawai Hidehiro <hidehiro.kawai.ez@hitachi.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: William Irwin <wli@holomorphy.com> Cc: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:32 -07:00
Nick Piggin	db64fe0225	mm: rewrite vmap layer Rewrite the vmap allocator to use rbtrees and lazy tlb flushing, and provide a fast, scalable percpu frontend for small vmaps (requires a slightly different API, though). The biggest problem with vmap is actually vunmap. Presently this requires a global kernel TLB flush, which on most architectures is a broadcast IPI to all CPUs to flush the cache. This is all done under a global lock. As the number of CPUs increases, so will the number of vunmaps a scaled workload will want to perform, and so will the cost of a global TLB flush. This gives terrible quadratic scalability characteristics. Another problem is that the entire vmap subsystem works under a single lock. It is a rwlock, but it is actually taken for write in all the fast paths, and the read locking would likely never be run concurrently anyway, so it's just pointless. This is a rewrite of vmap subsystem to solve those problems. The existing vmalloc API is implemented on top of the rewritten subsystem. The TLB flushing problem is solved by using lazy TLB unmapping. vmap addresses do not have to be flushed immediately when they are vunmapped, because the kernel will not reuse them again (would be a use-after-free) until they are reallocated. So the addresses aren't allocated again until a subsequent TLB flush. A single TLB flush then can flush multiple vunmaps from each CPU. XEN and PAT and such do not like deferred TLB flushing because they can't always handle multiple aliasing virtual addresses to a physical address. They now call vm_unmap_aliases() in order to flush any deferred mappings. That call is very expensive (well, actually not a lot more expensive than a single vunmap under the old scheme), however it should be OK if not called too often. The virtual memory extent information is stored in an rbtree rather than a linked list to improve the algorithmic scalability. There is a per-CPU allocator for small vmaps, which amortizes or avoids global locking. To use the per-CPU interface, the vm_map_ram / vm_unmap_ram interfaces must be used in place of vmap and vunmap. Vmalloc does not use these interfaces at the moment, so it will not be quite so scalable (although it will use lazy TLB flushing). As a quick test of performance, I ran a test that loops in the kernel, linearly mapping then touching then unmapping 4 pages. Different numbers of tests were run in parallel on an 4 core, 2 socket opteron. Results are in nanoseconds per map+touch+unmap. threads vanilla vmap rewrite 1 14700 2900 2 33600 3000 4 49500 2800 8 70631 2900 So with a 8 cores, the rewritten version is already 25x faster. In a slightly more realistic test (although with an older and less scalable version of the patch), I ripped the not-very-good vunmap batching code out of XFS, and implemented the large buffer mapping with vm_map_ram and vm_unmap_ram... along with a couple of other tricks, I was able to speed up a large directory workload by 20x on a 64 CPU system. I believe vmap/vunmap is actually sped up a lot more than 20x on such a system, but I'm running into other locks now. vmap is pretty well blown off the profiles. Before: 1352059 total 0.1401 798784 _write_lock 8320.6667 <- vmlist_lock 529313 default_idle 1181.5022 15242 smp_call_function 15.8771 <- vmap tlb flushing 2472 __get_vm_area_node 1.9312 <- vmap 1762 remove_vm_area 4.5885 <- vunmap 316 map_vm_area 0.2297 <- vmap 312 kfree 0.1950 300 _spin_lock 3.1250 252 sn_send_IPI_phys 0.4375 <- tlb flushing 238 vmap 0.8264 <- vmap 216 find_lock_page 0.5192 196 find_next_bit 0.3603 136 sn2_send_IPI 0.2024 130 pio_phys_write_mmr 2.0312 118 unmap_kernel_range 0.1229 After: 78406 total 0.0081 40053 default_idle 89.4040 33576 ia64_spinlock_contention 349.7500 1650 _spin_lock 17.1875 319 __reg_op 0.5538 281 _atomic_dec_and_lock 1.0977 153 mutex_unlock 1.5938 123 iget_locked 0.1671 117 xfs_dir_lookup 0.1662 117 dput 0.1406 114 xfs_iget_core 0.0268 92 xfs_da_hashname 0.1917 75 d_alloc 0.0670 68 vmap_page_range 0.0462 <- vmap 58 kmem_cache_alloc 0.0604 57 memset 0.0540 52 rb_next 0.1625 50 __copy_user 0.0208 49 bitmap_find_free_region 0.2188 <- vmap 46 ia64_sn_udelay 0.1106 45 find_inode_fast 0.1406 42 memcmp 0.2188 42 finish_task_switch 0.1094 42 __d_lookup 0.0410 40 radix_tree_lookup_slot 0.1250 37 _spin_unlock_irqrestore 0.3854 36 xfs_bmapi 0.0050 36 kmem_cache_free 0.0256 35 xfs_vn_getattr 0.0322 34 radix_tree_lookup 0.1062 33 __link_path_walk 0.0035 31 xfs_da_do_buf 0.0091 30 _xfs_buf_find 0.0204 28 find_get_page 0.0875 27 xfs_iread 0.0241 27 __strncpy_from_user 0.2812 26 _xfs_buf_initialize 0.0406 24 _xfs_buf_lookup_pages 0.0179 24 vunmap_page_range 0.0250 <- vunmap 23 find_lock_page 0.0799 22 vm_map_ram 0.0087 <- vmap 20 kfree 0.0125 19 put_page 0.0330 18 __kmalloc 0.0176 17 xfs_da_node_lookup_int 0.0086 17 _read_lock 0.0885 17 page_waitqueue 0.0664 vmap has gone from being the top 5 on the profiles and flushing the crap out of all TLBs, to using less than 1% of kernel time. [akpm@linux-foundation.org: cleanups, section fix] [akpm@linux-foundation.org: fix build on alpha] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:32 -07:00
Nick Piggin	51b07fc3c5	fs: buffer lock use lock bitops trylock_buffer and unlock_buffer open and close a critical section. Hence, we can use the lock bitops to get the desired memory ordering. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:32 -07:00
Nick Piggin	8413ac9d8c	mm: page lock use lock bitops trylock_page, unlock_page open and close a critical section. Hence, we can use the lock bitops to get the desired memory ordering. Also, mark trylock as likely to succeed (and remove the annotation from callers). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:32 -07:00
Nick Piggin	f45840b5c1	mm: pagecache insertion fewer atomics Setting and clearing the page locked when inserting it into swapcache / pagecache when it has no other references can use non-atomic page flags operations because no other CPU may be operating on it at this time. This saves one atomic operation when inserting a page into pagecache. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:31 -07:00
KOSAKI Motohiro	902d2e8ae0	vmscan: kill unused lru functions Several LRU manupuration function are not used now. So they can be removed. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:31 -07:00
Lee Schermerhorn	985737cf2e	mlock: count attempts to free mlocked page Allow free of mlock()ed pages. This shouldn't happen, but during developement, it occasionally did. This patch allows us to survive that condition, while keeping the statistics and events correct for debug. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:31 -07:00
Lee Schermerhorn	af936a1606	vmscan: unevictable LRU scan sysctl This patch adds a function to scan individual or all zones' unevictable lists and move any pages that have become evictable onto the respective zone's inactive list, where shrink_inactive_list() will deal with them. Adds sysctl to scan all nodes, and per node attributes to individual nodes' zones. Kosaki: If evictable page found in unevictable lru when write /proc/sys/vm/scan_unevictable_pages, print filename and file offset of these pages. [akpm@linux-foundation.org: fix one CONFIG_MMU=n build error] [kosaki.motohiro@jp.fujitsu.com: adapt vmscan-unevictable-lru-scan-sysctl.patch to new sysfs API] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:31 -07:00
Lee Schermerhorn	64d6519dda	swap: cull unevictable pages in fault path In the fault paths that install new anonymous pages, check whether the page is evictable or not using lru_cache_add_active_or_unevictable(). If the page is evictable, just add it to the active lru list [via the pagevec cache], else add it to the unevictable list. This "proactive" culling in the fault path mimics the handling of mlocked pages in Nick Piggin's series to keep mlocked pages off the lru lists. Notes: 1) This patch is optional--e.g., if one is concerned about the additional test in the fault path. We can defer the moving of nonreclaimable pages until when vmscan [shrink_*_list()] encounters them. Vmscan will only need to handle such pages once, but if there are a lot of them it could impact system performance. 2) The 'vma' argument to page_evictable() is require to notice that we're faulting a page into an mlock()ed vma w/o having to scan the page's rmap in the fault path. Culling mlock()ed anon pages is currently the only reason for this patch. 3) We can't cull swap pages in read_swap_cache_async() because the vma argument doesn't necessarily correspond to the swap cache offset passed in by swapin_readahead(). This could [did!] result in mlocking pages in non-VM_LOCKED vmas if [when] we tried to cull in this path. 4) Move set_pte_at() to after where we add page to lru to keep it hidden from other tasks that might walk the page table. We already do it in this order in do_anonymous() page. And, these are COW'd anon pages. Is this safe? [riel@redhat.com: undo an overzealous code cleanup] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:31 -07:00
Nick Piggin	5344b7e648	vmstat: mlocked pages statistics Add NR_MLOCK zone page state, which provides a (conservative) count of mlocked pages (actually, the number of mlocked pages moved off the LRU). Reworked by lts to fit in with the modified mlock page support in the Reclaim Scalability series. [kosaki.motohiro@jp.fujitsu.com: fix incorrect Mlocked field of /proc/meminfo] [lee.schermerhorn@hp.com: mlocked-pages: add event counting with statistics] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:31 -07:00
Nick Piggin	b291f00039	mlock: mlocked pages are unevictable Make sure that mlocked pages also live on the unevictable LRU, so kswapd will not scan them over and over again. This is achieved through various strategies: 1) add yet another page flag--PG_mlocked--to indicate that the page is locked for efficient testing in vmscan and, optionally, fault path. This allows early culling of unevictable pages, preventing them from getting to page_referenced()/try_to_unmap(). Also allows separate accounting of mlock'd pages, as Nick's original patch did. Note: Nick's original mlock patch used a PG_mlocked flag. I had removed this in favor of the PG_unevictable flag + an mlock_count [new page struct member]. I restored the PG_mlocked flag to eliminate the new count field. 2) add the mlock/unevictable infrastructure to mm/mlock.c, with internal APIs in mm/internal.h. This is a rework of Nick's original patch to these files, taking into account that mlocked pages are now kept on unevictable LRU list. 3) update vmscan.c:page_evictable() to check PageMlocked() and, if vma passed in, the vm_flags. Note that the vma will only be passed in for new pages in the fault path; and then only if the "cull unevictable pages in fault path" patch is included. 4) add try_to_unlock() to rmap.c to walk a page's rmap and ClearPageMlocked() if no other vmas have it mlocked. Reuses as much of try_to_unmap() as possible. This effectively replaces the use of one of the lru list links as an mlock count. If this mechanism let's pages in mlocked vmas leak through w/o PG_mlocked set [I don't know that it does], we should catch them later in try_to_unmap(). One hopes this will be rare, as it will be relatively expensive. Original mm/internal.h, mm/rmap.c and mm/mlock.c changes: Signed-off-by: Nick Piggin <npiggin@suse.de> splitlru: introduce __get_user_pages(): New munlock processing need to GUP_FLAGS_IGNORE_VMA_PERMISSIONS. because current get_user_pages() can't grab PROT_NONE pages theresore it cause PROT_NONE pages can't munlock. [akpm@linux-foundation.org: fix this for pagemap-pass-mm-into-pagewalkers.patch] [akpm@linux-foundation.org: untangle patch interdependencies] [akpm@linux-foundation.org: fix things after out-of-order merging] [hugh@veritas.com: fix page-flags mess] [lee.schermerhorn@hp.com: fix munlock page table walk - now requires 'mm'] [kosaki.motohiro@jp.fujitsu.com: build fix] [kosaki.motohiro@jp.fujitsu.com: fix truncate race and sevaral comments] [kosaki.motohiro@jp.fujitsu.com: splitlru: introduce __get_user_pages()] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:30 -07:00
Lee Schermerhorn	89e004ea55	SHM_LOCKED pages are unevictable Shmem segments locked into memory via shmctl(SHM_LOCKED) should not be kept on the normal LRU, since scanning them is a waste of time and might throw off kswapd's balancing algorithms. Place them on the unevictable LRU list instead. Use the AS_UNEVICTABLE flag to mark address_space of SHM_LOCKed shared memory regions as unevictable. Then these pages will be culled off the normal LRU lists during vmscan. Add new wrapper function to clear the mapping's unevictable state when/if shared memory segment is munlocked. Add 'scan_mapping_unevictable_page()' to mm/vmscan.c to scan all pages in the shmem segment's mapping [struct address_space] for evictability now that they're no longer locked. If so, move them to the appropriate zone lru list. Changes depend on [CONFIG_]UNEVICTABLE_LRU. [kosaki.motohiro@jp.fujitsu.com: revert shm change] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:26 -07:00
Lee Schermerhorn	ba9ddf4939	Ramfs and Ram Disk pages are unevictable Christoph Lameter pointed out that ram disk pages also clutter the LRU lists. When vmscan finds them dirty and tries to clean them, the ram disk writeback function just redirties the page so that it goes back onto the active list. Round and round she goes... With the ram disk driver [rd.c] replaced by the newer 'brd.c', this is no longer the case, as ram disk pages are no longer maintained on the lru. [This makes them unmigratable for defrag or memory hot remove, but that can be addressed by a separate patch series.] However, the ramfs pages behave like ram disk pages used to, so: Define new address_space flag [shares address_space flags member with mapping's gfp mask] to indicate that the address space contains all unevictable pages. This will provide for efficient testing of ramfs pages in page_evictable(). Also provide wrapper functions to set/test the unevictable state to minimize #ifdefs in ramfs driver and any other users of this facility. Set the unevictable state on address_space structures for new ramfs inodes. Test the unevictable state in page_evictable() to cull unevictable pages. These changes depend on [CONFIG_]UNEVICTABLE_LRU. [riel@redhat.com: undo the brd.c part] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Debugged-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:26 -07:00
Lee Schermerhorn	bbfd28eee9	unevictable lru: add event counting with statistics Fix to unevictable-lru-page-statistics.patch Add unevictable lru infrastructure vm events to the statistics patch. Rename the "NORECL_" and "noreclaim_" symbols and text strings to "UNEVICTABLE_" and "unevictable_", respectively. Currently, both the infrastructure and the mlocked pages event are added by a single patch later in the series. This makes it difficult to add or rework the incremental patches. The events actually "belong" with the stats, so pull them up to here. Also, restore the event counting to putback_lru_page(). This was removed from previous patch in series where it was "misplaced". The actual events weren't defined that early. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:26 -07:00
Lee Schermerhorn	894bc31041	Unevictable LRU Infrastructure When the system contains lots of mlocked or otherwise unevictable pages, the pageout code (kswapd) can spend lots of time scanning over these pages. Worse still, the presence of lots of unevictable pages can confuse kswapd into thinking that more aggressive pageout modes are required, resulting in all kinds of bad behaviour. Infrastructure to manage pages excluded from reclaim--i.e., hidden from vmscan. Based on a patch by Larry Woodman of Red Hat. Reworked to maintain "unevictable" pages on a separate per-zone LRU list, to "hide" them from vmscan. Kosaki Motohiro added the support for the memory controller unevictable lru list. Pages on the unevictable list have both PG_unevictable and PG_lru set. Thus, PG_unevictable is analogous to and mutually exclusive with PG_active--it specifies which LRU list the page is on. The unevictable infrastructure is enabled by a new mm Kconfig option [CONFIG_]UNEVICTABLE_LRU. A new function 'page_evictable(page, vma)' in vmscan.c tests whether or not a page may be evictable. Subsequent patches will add the various !evictable tests. We'll want to keep these tests light-weight for use in shrink_active_list() and, possibly, the fault path. To avoid races between tasks putting pages [back] onto an LRU list and tasks that might be moving the page from non-evictable to evictable state, the new function 'putback_lru_page()' -- inverse to 'isolate_lru_page()' -- tests the "evictability" of a page after placing it on the LRU, before dropping the reference. If the page has become unevictable, putback_lru_page() will redo the 'putback', thus moving the page to the unevictable list. This way, we avoid "stranding" evictable pages on the unevictable list. [akpm@linux-foundation.org: fix fallout from out-of-order merge] [riel@redhat.com: fix UNEVICTABLE_LRU and !PROC_PAGE_MONITOR build] [nishimura@mxp.nes.nec.co.jp: remove redundant mapping check] [kosaki.motohiro@jp.fujitsu.com: unevictable-lru-infrastructure: putback_lru_page()/unevictable page handling rework] [kosaki.motohiro@jp.fujitsu.com: kill unnecessary lock_page() in vmscan.c] [kosaki.motohiro@jp.fujitsu.com: revert migration change of unevictable lru infrastructure] [kosaki.motohiro@jp.fujitsu.com: revert to unevictable-lru-infrastructure-kconfig-fix.patch] [kosaki.motohiro@jp.fujitsu.com: restore patch failure of vmstat-unevictable-and-mlocked-pages-vm-events.patch] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Debugged-by: Benjamin Kidwell <benjkidwell@yahoo.com> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:26 -07:00
Lee Schermerhorn	8a7a8544a4	pageflag helpers for configed-out flags Define proper false/noop inline functions for noreclaim page flags when !defined(CONFIG_UNEVICTABLE_LRU) Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:26 -07:00
Rik van Riel	556adecba1	vmscan: second chance replacement for anonymous pages We avoid evicting and scanning anonymous pages for the most part, but under some workloads we can end up with most of memory filled with anonymous pages. At that point, we suddenly need to clear the referenced bits on all of memory, which can take ages on very large memory systems. We can reduce the maximum number of pages that need to be scanned by not taking the referenced state into account when deactivating an anonymous page. After all, every anonymous page starts out referenced, so why check? If an anonymous page gets referenced again before it reaches the end of the inactive list, we move it back to the active list. To keep the maximum amount of necessary work reasonable, we scale the active to inactive ratio with the size of memory, using the formula active:inactive ratio = sqrt(memory in GB * 10). Kswapd CPU use now seems to scale by the amount of pageout bandwidth, instead of by the amount of memory present in the system. [kamezawa.hiroyu@jp.fujitsu.com: fix OOM with memcg] [kamezawa.hiroyu@jp.fujitsu.com: memcg: lru scan fix] Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
Rik van Riel	4f98a2fee8	vmscan: split LRU lists into anon & file sets Split the LRU lists in two, one set for pages that are backed by real file systems ("file") and one for pages that are backed by memory and swap ("anon"). The latter includes tmpfs. The advantage of doing this is that the VM will not have to scan over lots of anonymous pages (which we generally do not want to swap out), just to find the page cache pages that it should evict. This patch has the infrastructure and a basic policy to balance how much we scan the anon lists and how much we scan the file lists. The big policy changes are in separate patches. [lee.schermerhorn@hp.com: collect lru meminfo statistics from correct offset] [kosaki.motohiro@jp.fujitsu.com: prevent incorrect oom under split_lru] [kosaki.motohiro@jp.fujitsu.com: fix pagevec_move_tail() doesn't treat unevictable page] [hugh@veritas.com: memcg swapbacked pages active] [hugh@veritas.com: splitlru: BDI_CAP_SWAP_BACKED] [akpm@linux-foundation.org: fix /proc/vmstat units] [nishimura@mxp.nes.nec.co.jp: memcg: fix handling of shmem migration] [kosaki.motohiro@jp.fujitsu.com: adjust Quicklists field of /proc/meminfo] [kosaki.motohiro@jp.fujitsu.com: fix style issue of get_scan_ratio()] Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
Rik van Riel	b2e185384f	define page_file_cache() function Define page_file_cache() function to answer the question: is page backed by a file? Originally part of Rik van Riel's split-lru patch. Extracted to make available for other, independent reclaim patches. Moved inline function to linux/mm_inline.h where it will be needed by subsequent "split LRU" and "noreclaim" patches. Unfortunately this needs to use a page flag, since the PG_swapbacked state needs to be preserved all the way to the point where the page is last removed from the LRU. Trying to derive the status from other info in the page resulted in wrong VM statistics in earlier split VM patchsets. The total number of page flags in use on a 32 bit machine after this patch is 19. [akpm@linux-foundation.org: fix up out-of-order merge fallout] [hugh@veritas.com: splitlru: shmem_getpage SetPageSwapBacked sooner[ Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: MinChan Kim <minchan.kim@gmail.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
Rik van Riel	68a22394c2	vmscan: free swap space on swap-in/activation If vm_swap_full() (swap space more than 50% full), the system will free swap space at swapin time. With this patch, the system will also free the swap space in the pageout code, when we decide that the page is not a candidate for swapout (and just wasting swap space). Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: MinChan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
KOSAKI Motohiro	f04e9ebbe4	swap: use an array for the LRU pagevecs Turn the pagevecs into an array just like the LRUs. This significantly cleans up the source code and reduces the size of the kernel by about 13kB after all the LRU lists have been created further down in the split VM patch series. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
Christoph Lameter	b69408e88b	vmscan: Use an indexed array for LRU variables Currently we are defining explicit variables for the inactive and active list. An indexed array can be more generic and avoid repeating similar code in several places in the reclaim code. We are saving a few bytes in terms of code size: Before: text data bss dec hex filename 4097753 573120 4092484 8763357 85b7dd vmlinux After: text data bss dec hex filename 4097729 573120 4092484 8763333 85b7c5 vmlinux Having an easy way to add new lru lists may ease future work on the reclaim code. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
Nick Piggin	62695a84eb	vmscan: move isolate_lru_page() to vmscan.c On large memory systems, the VM can spend way too much time scanning through pages that it cannot (or should not) evict from memory. Not only does it use up CPU time, but it also provokes lock contention and can leave large systems under memory presure in a catatonic state. This patch series improves VM scalability by: 1) putting filesystem backed, swap backed and unevictable pages onto their own LRUs, so the system only scans the pages that it can/should evict from memory 2) switching to two handed clock replacement for the anonymous LRUs, so the number of pages that need to be scanned when the system starts swapping is bound to a reasonable number 3) keeping unevictable pages off the LRU completely, so the VM does not waste CPU time scanning them. ramfs, ramdisk, SHM_LOCKED shared memory segments and mlock()ed VMA pages are keept on the unevictable list. This patch: isolate_lru_page logically belongs to be in vmscan.c than migrate.c. It is tough, because we don't need that function without memory migration so there is a valid argument to have it in migrate.c. However a subsequent patch needs to make use of it in the core mm, so we can happily move it to vmscan.c. Also, make the function a little more generic by not requiring that it adds an isolated page to a given list. Callers can do that. Note that we now have '__isolate_lru_page()', that does something quite different, visible outside of vmscan.c for use with memory controller. Methinks we need to rationalize these names/purposes. --lts [akpm@linux-foundation.org: fix mm/memory_hotplug.c build] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
David Vrabel	61e0e79ee3	Merge branch 'master' into for-upstream Conflicts: Documentation/ABI/testing/sysfs-bus-usb drivers/Makefile	2008-10-20 16:07:19 +01:00
Thomas Gleixner	592aa999d6	hrtimers: add missing docbook comments to struct hrtimer Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-20 16:38:19 +02:00
Peter Zijlstra	c7e78cff6b	lockstat: contend with points We currently only provide points that have to wait on contention, also lists the points we have to wait for. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-20 15:43:10 +02:00
Peter Zijlstra	ffda12a17a	sched: optimize group load balancer I noticed that tg_shares_up() unconditionally takes rq-locks for all cpus in the sched_domain. This hurts. We need the rq-locks whenever we change the weight of the per-cpu group sched entities. To allevate this a little, only change the weight when the new weight is at least shares_thresh away from the old value. This avoids the rq-lock for the top level entries, since those will never be re-weighted, and fuzzes the lower level entries a little to gain performance in semi-stable situations. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-20 14:05:02 +02:00
Thomas Gleixner	c465a76af6	Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus	2008-10-20 13:14:06 +02:00
Paul Mundt	4cb40f795a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/kernel-parameters.txt arch/sh/include/asm/elf.h	2008-10-20 11:17:52 +09:00
Ian Molton	7acb706ca9	mfd: update TMIO drivers to use the clock API This patch updates the remaining two TMIO drivers to use the clock API rather than callback hooks into platform code. Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-19 22:54:12 +02:00
Eric Miao	26b8f5e1e2	mfd: add base support for Dialog DA9030/DA9034 PMICs DA9030 (a.k.a ARAVA) and DA9034 (a.k.a MICCO) are PMICs designed by Dialog Semiconductor, usually found on PXA-based platforms. These PMICs are I2C-based, multi-function devices, usually with LEDs, PWMs for backlight, BUCKs and LDOs, ADCs and touchscreen controller (on DA9034). This is the base support for the I2C operations, event registration and handling, sub-devices management. Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Liam Girdwood <lrg@kernel.org> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-19 22:54:11 +02:00
David Brownell	a603a7fa87	mfd: TWL4030 core driver This patch adds the core of the TWL4030 driver, which supports chips including the TPS65950. These chips are multi-function; see http://focus.ti.com/docs/prod/folders/print/tps65950.html Public specs are in the works. For now, the block diagram on the second page of the datasheet is fairly informative. There are some known issues with this core code. Most notably, the IRQ dispatching needs simplification (to use more of genirq), generalization (integrating support for secondary IRQ dispatch as well as primary, and removing the build dependency on OMAP), and then probably updating to leverage threaded IRQ support (expected to arrive in mainline "soon"). Once the core is in mainline, drivers for other parts of this chip can follow its lead and start swimming upstream too. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-19 22:54:11 +02:00
Dmitry Baryshkov	9e78cfe53f	mfd: support tmiofb cell on tc6393xb Add support for tmiofb cell found in tc6393xb chip. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Cc: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-19 22:54:11 +02:00
Dmitry Baryshkov	51a5562356	mfd: add OHCI cell to tc6393xb Add information regarding OHCI cell of the tc6393xb Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Acked-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-19 22:54:10 +02:00
Dmitry Baryshkov	f98a0bd0e4	mfd: do tcb6393xb state restore on resume only if requested As requested by Ian make state restore only if it's requested by platform data: some platforms do correctly save the state of the chip during suspend/resume, but some (like tosa) incorrectly power off the chip at suspend, so the driver supports restoring some bits of the tc6393xb state (not full, merely enough to support resume on tosa). With this patch this code is disabled by default. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Acked-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-19 22:54:10 +02:00
Dmitry Baryshkov	1c1b6ffce5	mfd: provide and use setup hook for tc6393xb Instead of using bitfields for initial gpio setup, provide generic setup/teardown hooks that can be used to set the gpio states, register child devices, etc. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>	2008-10-19 22:54:09 +02:00
Ingo Molnar	3e10e879a8	Merge branch 'linus' into tracing-v28-for-linus-v3 Conflicts: init/main.c kernel/module.c scripts/bootgraph.pl	2008-10-19 19:04:47 +02:00
Anton Vorontsov	ed8c3174dd	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/power/Makefile	2008-10-18 20:28:24 +04:00
Fenghua Yu	5b6985ce8e	intel-iommu: IA64 support The current Intel IOMMU code assumes that both host page size and Intel IOMMU page size are 4KiB. The first patch supports variable page size. This provides support for IA64 which has multiple page sizes. This patch also adds some other code hooks for IA64 platform including DMAR_OPERATION_TIMEOUT definition. [dwmw2: some cleanup] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-10-18 14:29:15 +01:00
Thomas Gleixner	dd3a1db900	genirq: improve include files Move the irq_desc related iterators out of irq.h, into irqnr.h, also available via interrupt.h. This way non-genirq (and even non-hardirq) architectures get the common definitions and iterators. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-18 14:05:18 +02:00
Mike Rapoport	aaf7ea2000	[MTD] [NAND] GPIO NAND flash driver The patch adds support for NAND flashes connected to GPIOs. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-10-18 12:48:42 +01:00
Linus Torvalds	0cfd81031a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (94 commits) USB: remove err() macro from more usb drivers USB: remove err() macro from usb misc drivers USB: remove err() macro from usb core code USB: remove err() macro from usb class drivers USB: remove use of err() in drivers/usb/serial USB: remove info() macro from usb mtd drivers USB: remove info() macro from usb input drivers USB: remove info() macro from usb network drivers USB: remove info() macro from remaining usb drivers USB: remove info() macro from usb/misc drivers USB: remove info() macro from usb/serial drivers USB: remove warn macro from HID core USB: remove warn() macro from usb drivers USB: remove warn() macro from usb net drivers USB: remove warn() macro from usb media drivers USB: remove warn() macro from usb input drivers usb/fsl_qe_udc: clear data toggle on clear halt request usb/fsl_qe_udc: fix response to get status request fsl_usb2_udc: Fix oops on probe failure. fsl_usb2_udc: Add a wmb before priming endpoint. ...	2008-10-17 15:43:52 -07:00
Linus Torvalds	5564da7e9d	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (95 commits) V4L/DVB (9296): Patch to remove warning message during cx88-dvb compilation V4L/DVB (9294): gspca: Add a stop sequence in t613. V4L/DVB (9293): gspca: Separate and fix the sensor dependant sequences in t613. V4L/DVB (9292): gspca: Call the control setting functions at init time in t613. V4L/DVB (9291): gspca: Do not set the white balance temperature by default in t613. V4L/DVB (9290): gspca: Adjust the sensor init sequences in t613. V4L/DVB (9289): gspca: Other sensor identified as om6802 in t613. V4L/DVB (9288): gspca: Write to the USB device and not USB interface in t613. V4L/DVB (9287): gspca: Change the name of the multi bytes write function in t613. V4L/DVB (9286): gspca: Compilation problem of gspca.c and the kernel version. V4L/DVB (9283): Correct typo and enable setting the gain on the mt9m111 sensor V4L/DVB (9282): Properly iterate the urbs when destroying them. V4L/DVB (9281): gspca: Add hflip and vflip to the po1030 sensor V4L/DVB (9280): gspca: Use the gspca debug macros V4L/DVB (9279): gspca: Correct some copyright headers V4L/DVB (9278): gspca: Remove the m5602_debug variable V4L/DVB (9277): gspca: propagate an error in m5602_start_transfer() V4L/DVB (9276): videobuf-dvb: two functions are now static V4L/DVB (9275): dvb: input data pointer of cx24116_writeregN() should be const V4L/DVB (9274): Remove spurious messages and turn into debug. ...	2008-10-17 15:08:47 -07:00
Linus Torvalds	58617d5e59	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Remove automatic enabling of the HUGE_FILE feature flag ext4: Replace hackish ext4_mb_poll_new_transaction with commit callback ext4: Update Documentation/filesystems/ext4.txt ext4: Remove unused mount options: nomballoc, mballoc, nocheck ext4: Remove compile warnings when building w/o CONFIG_PROC_FS ext4: Add missing newlines to printk messages ext4: Fix file fragmentation during large file write. vfs: Add no_nrwrite_index_update writeback control flag vfs: Remove the range_cont writeback mode. ext4: Use tag dirty lookup during mpage_da_submit_io ext4: let the block device know when unused blocks can be discarded ext4: Don't reuse released data blocks until transaction commits ext4: Use an rbtree for tracking blocks freed during transaction. ext4: Do mballoc init before doing filesystem recovery ext4: Free ext4_prealloc_space using kmem_cache_free ext4: Fix Kconfig typo for ext4dev ext4: Remove an old reference to ext4dev in Makefile comment	2008-10-17 15:08:11 -07:00
Oliver Neukum	1987625226	USB: anchor API changes needed for btusb This extends the anchor API as btusb needs for autosuspend. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:41:02 -07:00
Stephen Ware	cbc30118d7	usb: vstusb.c : new driver for spectrometers used by Vernier Software & Technology, Inc. This patch adds the vstusb driver to the drivers/usb/misc directory. This driver provides support for Vernier Software & Technology spectrometers, all made by Ocean Optics. The driver provides both IOCTL and read()/write() methods for sending raw data to spectrometers across the bulk channel. Each method allows for a configured timeout. From: Stephen Ware <stephen.ware@eqware.net> Signed-off-by: Dennis O'Brien <dennis.obrien@eqware.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:41:01 -07:00
Geoff Levand	0b14c3881d	USB: Fix spelling in usb/serial.h Fixes a minor typo in the comments for usb_set_serial_data. Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:40:56 -07:00
Felipe Balbi	3086775a49	usb gadget: cdc obex glue The following patch introduces a new f_obex.c function driver. It allows userspace obex servers to use usb as transport layer for their messages. [ dbrownell@users.sourceforge.net: various fixes and cleanups ] Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:40:53 -07:00
David Brownell	60beed95e3	usb gadget: function activation/deactivation Add a new mechanism to the composite gadget framework, letting functions deactivate (and reactivate) themselves. Think of it as a refcounted wrapper for the software pullup control. A key example of why to use this mechanism involves functions that require a userspace daemon. Those functions shuld use this new mechanism to prevent the gadget from enumerating until those daemons are activated. Without this mechanism, hosts would see devices that malfunction until the relevant daemons start. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:40:53 -07:00
Oliver Neukum	6a2839bedc	USB: extend poisoning to anchors this extends the poisoning concept to anchors. This way poisoning will work with fire and forget drivers. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:40:51 -07:00
Oliver Neukum	55b447bf79	USB: kill URBs permanently looking at usb_kill_urb() it seems to me that it is unnecessarily lenient. In the use case of disconnect() you never want to use the URB again (for the same device) But leaving urb->reject elevated will make it easier to avoid races between read/write and disconnect. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:40:51 -07:00
Greg Kroah-Hartman	5b775f672c	USB: add USB test and measurement class driver This driver was originaly written by Stefan Kopp, but massively reworked by Greg for submission. Thanks to Felipe Balbi <me@felipebalbi.com> for lots of work in cleaning up this driver. Thanks to Oliver Neukum <oliver@neukum.org> for reviewing previous versions and pointing out problems. Cc: Stefan Kopp <stefan_kopp@agilent.com> Cc: Marcel Janssen <korgull@home.nl> Cc: Felipe Balbi <me@felipebalbi.com> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-17 14:40:51 -07:00
Jean Delvare	2a1d245b70	V4L/DVB (9240): saa7127: Fix two typos Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-10-17 17:25:11 -03:00
Manu Abraham	5ba4ecc8b0	V4L/DVB (9196): Add support for DSS delivery Signed-off-by: Manu Abraham <manu@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-10-17 17:15:43 -03:00
Manu Abraham	97854829b9	V4L/DVB (9195): Frontend API Fix: 32APSK is a valid modulation for the DVB-S2 delivery Signed-off-by: Manu Abraham <manu@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2008-10-17 17:15:37 -03:00
Neil Brown	504e518953	Make nfs_file_cred more robust. As not all files have an associated open_context (e.g. device special files), it is safest to test for the existence of the open context before de-referencing it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-17 13:06:45 -04:00
Linus Torvalds	26e9a39777	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (25 commits) staging: at76_usb wireless driver Staging: workaround build system bug Staging: Lindent sxg.c Staging: SLICOSS: Call pci_release_regions at driver exit Staging: SLICOSS: Fix remaining type names Staging: SLICOSS: Fix warnings due to static usage Staging: SLICOSS: lots of checkpatch fixes Staging: go7007 v4l fixes Staging: Fix gcc warnings in sxg Staging: add echo cancelation module Staging: add wlan-ng prism2 usb driver Staging: add w35und wifi driver Staging: USB/IP: add host driver Staging: USB/IP: add client driver Staging: USB/IP: add common functions needed Staging: add the go7007 video driver Staging: add me4000 pci data collection driver Staging: add me4000 firmware files Staging: add sxg network driver Staging: add Alacritech slicoss network driver ... Fixed up conflicts due to taint flags changes and MAINTAINERS cleanup in MAINTAINERS, include/linux/kernel.h and kernel/panic.c.	2008-10-17 09:50:12 -07:00
Linus Torvalds	c53dbf5486	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: remove __generic_unplug_device() from exports block: move q->unplug_work initialization blktrace: pass zfcp driver data blktrace: add support for driver data block: fix current kernel-doc warnings block: only call ->request_fn when the queue is not stopped block: simplify string handling in elv_iosched_store() block: fix kernel-doc for blk_alloc_devt() block: fix nr_phys_segments miscalculation bug block: add partition attribute for partition number block: add BIG FAT WARNING to CONFIG_DEBUG_BLOCK_EXT_DEVT softirq: Add support for triggering softirq work on softirqs.	2008-10-17 09:29:55 -07:00
Arjan van de Ven	651dab4264	Merge commit 'linus/master' into merge-linus Conflicts: arch/x86/kvm/i8254.c	2008-10-17 09:20:26 -07:00
Thomas Gleixner	719254faa1	NOHZ: unify the nohz function calls in irq_enter() We have two separate nohz function calls in irq_enter() for no good reason. Just call a single NOHZ function from irq_enter() and call the bits in the tick code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-17 18:13:38 +02:00
Bartlomiej Zolnierkiewicz	806f80a6fc	ide: add generic ATA/ATAPI disk driver * Add struct ide_disk_ops containing protocol specific methods. * Add 'struct ide_disk_ops ' to ide_drive_t. Convert ide-{disk,floppy} drivers to use struct ide_disk_ops. * Merge ide-{disk,floppy} drivers into generic ide-gd driver. While at it: - ide_disk_init_capacity() -> ide_disk_get_capacity() Acked-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-10-17 18:09:14 +02:00
Bartlomiej Zolnierkiewicz	79cb380397	ide: allow device drivers to specify per-device type /proc settings Turn ide_driver_t's 'proc' field into ->proc_entries method (and also 'settings' field into ->proc_devsets method). Then update all device drivers accordingly. There should be no functional changes caused by this patch. Acked-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-10-17 18:09:13 +02:00
Bartlomiej Zolnierkiewicz	42619d35c7	ide: remove IDE_AFLAG_NO_DOORLOCKING Just use IDE_DFLAG_DOORLOCKING instead. There should be no functional changes caused by this patch. Acked-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-10-17 18:09:11 +02:00
Bartlomiej Zolnierkiewicz	e01286282e	ide: IDE_AFLAG_FORMAT_IN_PROGRESS -> IDE_DFLAG_FORMAT_IN_PROGRESS There should be no functional changes caused by this patch. Acked-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-10-17 18:09:11 +02:00
Bartlomiej Zolnierkiewicz	da167876bd	ide: IDE_AFLAG_WP -> IDE_DFLAG_WP There should be no functional changes caused by this patch. Acked-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-10-17 18:09:11 +02:00
Bartlomiej Zolnierkiewicz	fe11edfaab	ide: IDE_AFLAG_MEDIA_CHANGED -> IDE_DFLAG_MEDIA_CHANGED There should be no functional changes caused by this patch. Acked-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-10-17 18:09:11 +02:00
Linus Torvalds	ed09441dac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (39 commits) [SCSI] sd: fix compile failure with CONFIG_BLK_DEV_INTEGRITY=n libiscsi: fix locking in iscsi_eh_device_reset libiscsi: check reason why we are stopping iscsi session to determine error value [SCSI] iscsi_tcp: return a descriptive error value during connection errors [SCSI] libiscsi: rename host reset to target reset [SCSI] iscsi class: fix endpoint id handling [SCSI] libiscsi: Support drivers initiating session removal [SCSI] libiscsi: fix data corruption when target has to resend data-in packets [SCSI] sd: Switch kernel printing level for DIF messages [SCSI] sd: Correctly handle all combinations of DIF and DIX [SCSI] sd: Always print actual protection_type [SCSI] sd: Issue correct protection operation [SCSI] scsi_error: fix target reset handling [SCSI] lpfc 8.2.8 v2 : Add statistical reporting control and additional fc vendor events [SCSI] lpfc 8.2.8 v2 : Add sysfs control of target queue depth handling [SCSI] lpfc 8.2.8 v2 : Revert target busy in favor of transport disrupted [SCSI] scsi_dh_alua: remove REQ_NOMERGE [SCSI] lpfc 8.2.8 : update driver version to 8.2.8 [SCSI] lpfc 8.2.8 : Add MSI-X support [SCSI] lpfc 8.2.8 : Update driver to use new Host byte error code DID_TRANSPORT_DISRUPTED ...	2008-10-17 09:00:23 -07:00
Jens Axboe	f73e2d13a1	block: remove __generic_unplug_device() from exports The only out-of-core user is IDE, and that should be using blk_start_queueing() instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-17 14:03:08 +02:00
David Miller	e62b485398	sched: kill unused scheduler decl. I noticed this while making investigations into the tbench regressions. Please apply. sched: Remove hrtick_resched() extern decl. This function was removed by `31656519e1` ("sched, x86: clean up hrtick implementation"). Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-17 13:02:24 +02:00
Youquan Song	a77b67d402	dmar: Use queued invalidation interface for IOTLB and context invalidation If queued invalidation interface is available and enabled, queued invalidation interface will be used instead of the register based interface. According to Vt-d2 specification, when queued invalidation is enabled, invalidation command submit works only through invalidation queue and not through the command registers interface. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-10-17 08:05:01 +01:00
Youquan Song	3481f21097	dmar: context cache and IOTLB invalidation using queued invalidation Implement context cache invalidate and IOTLB invalidation using queued invalidation interface. This interface will be used by DMA remapping, when queued invalidation is supported. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-10-17 08:03:14 +01:00
Stefan Raspl	756f824318	blktrace: add support for driver data This patch adds the new api call blk_add_driver_data() to blktrace. It allows to trace device driver-specific binary data. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-17 08:46:57 +02:00
FUJITA Tomonori	8677142710	block: fix nr_phys_segments miscalculation bug This fixes the bug reported by Nikanth Karthikesan <knikanth@suse.de>: http://lkml.org/lkml/2008/10/2/203 The root cause of the bug is that blk_phys_contig_segment miscalculates q->max_segment_size. blk_phys_contig_segment checks: req->biotail->bi_size + next_req->bio->bi_size > q->max_segment_size But blk_recalc_rq_segments might expect that req->biotail and the previous bio in the req are supposed be merged into one segment. blk_recalc_rq_segments might also expect that next_req->bio and the next bio in the next_req are supposed be merged into one segment. In such case, we merge two requests that can't be merged here. Later, blk_rq_map_sg gives more segments than it should. We need to keep track of segment size in blk_recalc_rq_segments and use it to see if two requests can be merged. This patch implements it in the similar way that we used to do for hw merging (virtual merging). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-17 08:46:56 +02:00
David S. Miller	54514a70ad	softirq: Add support for triggering softirq work on softirqs. This is basically a genericization of Jens Axboe's block layer remote softirq changes. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-17 08:46:56 +02:00
Theodore Ts'o	3e624fc72f	ext4: Replace hackish ext4_mb_poll_new_transaction with commit callback The multiblock allocator needs to be able to release blocks (and issue a blkdev discard request) when the transaction which freed those blocks is committed. Previously this was done via a polling mechanism when blocks are allocated or freed. A much better way of doing things is to create a jbd2 callback function and attaching the list of blocks to be freed directly to the transaction structure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-16 20:00:24 -04:00
Linus Torvalds	52ad096465	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (53 commits) NFS: Fix a resolution problem with nfs_inode->cache_change_attribute NFS: Fix the resolution problem with nfs_inode_attrs_need_update() NFS: Changes to inode->i_nlinks must set the NFS_INO_INVALID_ATTR flag RPC/RDMA: ensure connection attempt is complete before signalling. RPC/RDMA: correct the reconnect timer backoff RPC/RDMA: optionally emit useful transport info upon connect/disconnect. RPC/RDMA: reformat a debug printk to keep lines together. RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls. RPC/RDMA: fix connect/reconnect resource leak. RPC/RDMA: return a consistent error, when connect fails. RPC/RDMA: adhere to protocol for unpadded client trailing write chunks. RPC/RDMA: avoid an oops due to disconnect racing with async upcalls. RPC/RDMA: maintain the RPC task bytes-sent statistic. RPC/RDMA: suppress retransmit on RPC/RDMA clients. RPC/RDMA: fix connection IRD/ORD setting RPC/RDMA: support FRMR client memory registration. RPC/RDMA: check selected memory registration mode at runtime. RPC/RDMA: add data types and new FRMR memory registration enum. RPC/RDMA: refactor the inline memory registration code. NFS: fix nfs_parse_ip_address() corner case ...	2008-10-16 15:39:20 -07:00
Linus Torvalds	08d19f51f0	Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (134 commits) KVM: ia64: Add intel iommu support for guests. KVM: ia64: add directed mmio range support for kvm guests KVM: ia64: Make pmt table be able to hold physical mmio entries. KVM: Move irqchip_in_kernel() from ioapic.h to irq.h KVM: Separate irq ack notification out of arch/x86/kvm/irq.c KVM: Change is_mmio_pfn to kvm_is_mmio_pfn, and make it common for all archs KVM: Move device assignment logic to common code KVM: Device Assignment: Move vtd.c from arch/x86/kvm/ to virt/kvm/ KVM: VMX: enable invlpg exiting if EPT is disabled KVM: x86: Silence various LAPIC-related host kernel messages KVM: Device Assignment: Map mmio pages into VT-d page table KVM: PIC: enhance IPI avoidance KVM: MMU: add "oos_shadow" parameter to disable oos KVM: MMU: speed up mmu_unsync_walk KVM: MMU: out of sync shadow core KVM: MMU: mmu_convert_notrap helper KVM: MMU: awareness of new kvm_mmu_zap_page behaviour KVM: MMU: mmu_parent_walk KVM: x86: trap invlpg KVM: MMU: sync roots on mmu reload ...	2008-10-16 15:36:00 -07:00
Linus Torvalds	e533b22705	Merge branch 'core-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: do_generic_file_read: s/EINTR/EIO/ if lock_page_killable() fails softirq, warning fix: correct a format to avoid a warning softirqs, debug: preemption check x86, pci-hotplug, calgary / rio: fix EBDA ioremap() IO resources, x86: ioremap sanity check to catch mapping requests exceeding, fix IO resources, x86: ioremap sanity check to catch mapping requests exceeding the BAR sizes softlockup: Documentation/sysctl/kernel.txt: fix softlockup_thresh description dmi scan: warn about too early calls to dmi_check_system() generic: redefine resource_size_t as phys_addr_t generic: make PFN_PHYS explicitly return phys_addr_t generic: add phys_addr_t for holding physical addresses softirq: allocate less vectors IO resources: fix/remove printk printk: robustify printk, update comment printk: robustify printk, fix #2 printk: robustify printk, fix printk: robustify printk Fixed up conflicts in: arch/powerpc/include/asm/types.h arch/powerpc/platforms/Kconfig.cputype manually.	2008-10-16 15:17:40 -07:00
Linus Torvalds	1eee21abaf	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: Add more documentation to firewire-cdev.h firewire: fix ioctl() return code firewire: fix setting tag and sy in iso transmission firewire: fw-sbp2: fix another small generation access bug firewire: fw-sbp2: enforce s/g segment size limit firewire: fw_send_request_sync() ieee1394: survive a few seconds connection loss ieee1394: nodemgr clean up class iterators ieee1394: dv1394, video1394: remove unnecessary expressions ieee1394: raw1394: make write() thread-safe ieee1394: raw1394: narrow down the state_mutex protected region ieee1394: raw1394: replace BKL by local mutex, make ioctl() and mmap() thread-safe ieee1394: sbp2: enforce s/g segment size limit ieee1394: sbp2: check for DMA mapping failures ieee1394: sbp2: stricter dma_sync ieee1394: Use DIV_ROUND_UP	2008-10-16 15:02:24 -07:00
Linus Torvalds	c813b4e16e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits) UIO: Fix mapping of logical and virtual memory UIO: add automata sercos3 pci card support UIO: Change driver name of uio_pdrv UIO: Add alignment warnings for uio-mem Driver core: add bus_sort_breadthfirst() function NET: convert the phy_device file to use bus_find_device_by_name kobject: Cleanup kobject_rename and !CONFIG_SYSFS kobject: Fix kobject_rename and !CONFIG_SYSFS sysfs: Make dir and name args to sysfs_notify() const platform: add new device registration helper sysfs: use ilookup5() instead of ilookup5_nowait() PNP: create device attributes via default device attributes Driver core: make bus_find_device_by_name() more robust usb: turn dev_warn+WARN_ON combos into dev_WARN debug: use dev_WARN() rather than WARN_ON() in device_pm_add() debug: Introduce a dev_WARN() function sysfs: fix deadlock device model: Do a quickcheck for driver binding before doing an expensive check Driver core: Fix cleanup in device_create_vargs(). Driver core: Clarify device cleanup. ...	2008-10-16 12:40:26 -07:00
Linus Torvalds	c472273f86	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: md: fix input truncation in safe_delay_store() md: check for memory allocation failure in faulty personality md: build failure due to missing delay.h md: Relax minimum size restrictions on chunk_size. md: remove space after function name in declaration and call. md: Remove unnecessary #includes, #defines, and function declarations. md: Convert remaining 1k representations in linear.c to sectors. md: linear.c: Make two local variables sector-based. md: linear: Represent dev_info->size and dev_info->offset in sectors. md: linear.c: Remove broken debug code. md: linear.c: Remove pointless initialization of curr_offset. md: linear.c: Fix typo in comment. md: Don't try to set an array to 'read-auto' if it is already in that state. md: Allow metadata_version to be updated for externally managed metadata. md: Fix rdev_size_store with size == 0	2008-10-16 11:55:11 -07:00
Linus Torvalds	36ac1d2f32	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (32 commits) Input: wm97xx - update email address for Liam Girdwood Input: i8042 - add Thinkpad R31 to nomux list Input: move map_to_7segment.h to include/linux Input: ads7846 - fix cache line sharing issue Input: cm109 - add missing newlines to messages Input: document i8042.debug in kernel-parameters.txt Input: keyboard - fix potential out of bound access to key_map Input: psmouse - add OLPC touchpad driver Input: psmouse - tweak PSMOUSE_DEFINE_ATTR to support raw set callbacks Input: psmouse - add psmouse_queue_work() for ps/2 extension to make use of Input: psmouse - export psmouse_set_state for ps/2 extensions to use Input: ads7846 - introduce .gpio_pendown to get pendown state Input: ALPS - add signature for DualPoint found in Dell Latitude E6500 Input: serio_raw - allow attaching to translated (SERIO_I8042XL) ports Input: cm109 - don't use obsolete logging macros Input: atkbd - expand Latitude's force release quirk to other Dells Input: bf54x-keys - add power management support Input: atmel_tsadcc - improve accuracy Input: convert drivers to use strict_strtoul() Input: appletouch - handle geyser 3/4 status bits ...	2008-10-16 11:52:08 -07:00
Linus Torvalds	cb23832e39	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits) decnet: Fix compiler warning in dn_dev.c IPV6: Fix default gateway criteria wrt. HIGH/LOW preference radv option net/802/fc.c: Fix compilation warnings netns: correct mib stats in ip6_route_me_harder() netns: fix net_generic array leak rt2x00: fix regression introduced by "mac80211: free up 2 bytes in skb->cb" rtl8187: Add USB ID for Belkin F5D7050 with RTL8187B chip p54usb: Device ID updates mac80211: fixme for kernel-doc ath9k/mac80211: disallow fragmentation in ath9k, report to userspace libertas : Remove unused variable warning for "old_channel" from cmd.c mac80211: Fix scan RX processing oops orinoco: fix unsafe locking in spectrum_cs_suspend orinoco: fix unsafe locking in orinoco_cs_resume cfg80211: fix debugfs error handling mac80211: fix debugfs netdev rename iwlwifi: fix ct kill configuration for 5350 mac80211: fix HT information element parsing p54: Fix compilation problem on PPC mac80211: fix debugfs lockup ...	2008-10-16 11:26:26 -07:00
Magnus Damm	c9f66169f1	resource: add resource_type() and IORESOURCE_TYPE_BITS Add resource_type() and IORESOURCE_TYPE_BITS. They make it easier to add more resource types without having to rewrite tons of code. Signed-off-by: Magnus Damm <damm@igel.co.jp> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:51 -07:00
Thomas Petazzoni	ebf3f09c63	Configure out AIO support This patchs adds the CONFIG_AIO option which allows to remove support for asynchronous I/O operations, that are not necessarly used by applications, particularly on embedded devices. As this is a size-reduction option, it depends on CONFIG_EMBEDDED. It allows to save ~7 kilobytes of kernel code/data: text data bss dec hex filename 1115067 119180 217088 1451335 162547 vmlinux 1108025 119048 217088 1444161 160941 vmlinux.new -7042 -132 0 -7174 -1C06 +/- This patch has been originally written by Matt Mackall <mpm@selenic.com>, and is part of the Linux Tiny project. [randy.dunlap@oracle.com: build fix] Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Zach Brown <zach.brown@oracle.com> Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:51 -07:00
Adrian Bunk	612de10db0	parport: remove CVS keywords Remove CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:49 -07:00
Oleg Nesterov	25cbe53ef1	pid_ns: kill the now unused task_child_reaper() task_child_reaper() has no callers anymore, kill it. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:48 -07:00
Alexey Dobriyan	f221e726bf	sysctl: simplify ->strategy name and nlen parameters passed to ->strategy hook are unused, remove them. In general ->strategy hook should know what it's doing, and don't do something tricky for which, say, pointer to original userspace array may be needed (name). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ] Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Matt Mackall <mpm@selenic.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:47 -07:00
Adrian Bunk	b73c29f6b0	quota: remove CVS keywords Remove CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:46 -07:00
Adrian Bunk	60836eb63b	telephony: remove CVS keywords Remove CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:45 -07:00
Rene Herman	b563cf59c4	pnp: make the resource type an unsigned long PnP encodes the resource type directly as its struct resource->flags value which is an unsigned long. Make it so... Signed-off-by: Rene Herman <rene.herman@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:45 -07:00
Dmitry Baryshkov	b53cde3557	fbdev: add new TMIO framebuffer driver Add driver for TMIO framebuffer cells as found e.g. in Toshiba TC6393XB chips. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Cc: Ian Molton <spyro@f2s.com> Acked-by: Samuel Ortiz <sameo@openedhand.com> Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:45 -07:00
Darrick J. Wong	e3a1938805	matroxfb: support G200eV chip Support the Matrox G200eV chip, based on timings that I found in the X.org matrox driver. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: Petr Vandrovec <vandrove@vc.cvut.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:45 -07:00
Uwe Kleine-König	3d599d1ca5	gpio_free might sleep, generic part According to the documentation gpio_free should only be called from task context only. To make this more explicit add a might sleep to all implementations. This is the generic part which changes gpiolib and the fallback implementation only. Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:40 -07:00
Ian Kent	8d7b48e0bc	autofs4: add miscellaneous device for ioctls Add a miscellaneous device to the autofs4 module for routing ioctls. This provides the ability to obtain an ioctl file handle for an autofs mount point that is possibly covered by another mount. The actual problem with autofs is that it can't reconnect to existing mounts. Immediately one things of just adding the ability to remount autofs file systems would solve it, but alas, that can't work. This is because autofs direct mounts and the implementation of "on demand mount and expire" of nested mount trees have the file system mounted on top of the mount trigger dentry. To resolve this a miscellaneous device node for routing ioctl commands to these mount points has been implemented in the autofs4 kernel module and a library added to autofs. This provides the ability to open a file descriptor for these over mounted autofs mount points. Please refer to Documentation/filesystems/autofs4-mount-control.txt for a discussion of the problem, implementation alternatives considered and a description of the interface. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:39 -07:00
Ian Kent	bb979d7fc3	autofs4: cleanup autofs mount type usage Usage of the AUTOFS_TYPE_* defines is a little confusing and appears inconsistent. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:39 -07:00
Alan Cox	9d793b0bcb	i2o: Fix 32/64bit DMA locking The I2O ioctls assume 32bits. In itself that is fine as they are old cards and nobody uses 64bit. However on LKML it was noted this assumption is also made for allocated memory and is unsafe on 64bit systems. Fixing this is a mess. It turns out there is tons of crap buried in a header file that does racy 32/64bit filtering on the masks. So we: - Verify all callers of the racy code can sleep (i2o_dma_[re]alloc) - Move the code into a new i2o/memory.c file - Remove the gfp_mask argument so nobody can try and misuse the function - Wrap a mutex around the problem area (a single mutex is easy to do and none of this is performance relevant) - Switch the remaining problem kmalloc holdout to use i2o_dma_alloc Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Vasily Averin <vvs@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:38 -07:00
Lennert Buytenhek	2bec19feab	orion_spi: handle 88F6183 erratum Add support to orion_spi for the 88F6183 ARM SoC by adding code to work around a 6183-specific erratum. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:38 -07:00
Kirill A. Shutemov	bf2a9a3963	Allow recursion in binfmt_script and binfmt_misc binfmt_script and binfmt_misc disallow recursion to avoid stack overflow using sh_bang and misc_bang. It causes problem in some cases: $ echo '#!/bin/ls' > /tmp/t0 $ echo '#!/tmp/t0' > /tmp/t1 $ echo '#!/tmp/t1' > /tmp/t2 $ chmod +x /tmp/t* $ /tmp/t2 zsh: exec format error: /tmp/t2 Similar problem with binfmt_misc. This patch introduces field 'recursion_depth' into struct linux_binprm to track recursion level in binfmt_misc and binfmt_script. If recursion level more then BINPRM_MAX_RECURSION it generates -ENOEXEC. [akpm@linux-foundation.org: make linux_binprm.recursion_depth a uint] Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:38 -07:00

... 28 29 30 31 32 ...

15731 Commits