Commit Graph

1268 Commits

Author SHA1 Message Date
Andi Kleen
5679af4c16 x86, mce: fix boot logging logic
The earlier patch to change the poller to a separate function subtly
broke the boot logging logic. This could lead to machine checks
getting logged at boot even when disabled or defaulting to off
on some systems. Fix that.

[ Impact: bug fix - avoid spurious MCE in log ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-04-22 13:56:25 -07:00
Andi Kleen
6298c512bc x86, mce: make polling timer interval per CPU
The polling timer while running per CPU still uses a global next_interval
variable, which lead to some CPUs either polling too fast or too slow.   
This was not a serious problem because all errors get picked up eventually,
but it's still better to avoid it. Turn next_interval into a per cpu variable.

v2: Fix check_interval == 0 case (Hidetoshi Seto)

[ Impact: minor bug fix ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-04-22 13:54:37 -07:00
Thomas Renninger
d876dfbbf5 acpi-cpufreq: Do not let get_measured perf depend on internal variable
Take already available policy->cpuinfo.max_freq and get rid of acpi-cpufreq
specific max_freq variable.

This implies that P0 is always the highest frequency which should always
be true as ACPI spec says:
As a result, the zeroth entry describes the highest performance state

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:21 -04:00
Thomas Renninger
d91758f5dd acpi-cpufreq: style-only: add parens to math expression
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:21 -04:00
Thomas Renninger
e0e8c4e512 acpi-cpufreq: Cleanup: Use printk_once
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:20 -04:00
Pallipadi, Venkatesh
093f13e231 x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf
Fix for a regression that was introduced by earlier commit
18b2646fe3 on Mon Apr 6 11:26:08 2009

Regression resulted in the below error happened on systems with
software coordination where per_cpu acpi data will not be initiated for
secondary CPUs in a P-state domain.

On Tue, 2009-04-14 at 23:01 -0700, Zhang, Yanmin wrote:
 My machine hanged with kernel 2.6.30-rc2 when script read
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
>
> opps happens in get_measured_perf:
>
>         cur.aperf.whole = readin.aperf.whole -
>                                 per_cpu(drv_data, cpu)->saved_aperf;
>
> Because per_cpu(drv_data, cpu)=NULL.
>
> So function get_measured_perf should check if (per_cpu(drv_data,
> cpu)==NULL)
> and return 0 if it's NULL.

--------------sys log------------------

BUG: unable to handle kernel NULL pointer dereference at
0000000000000020
IP: [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
PGD a7dd88067 PUD a7ccf5067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
CPU 0
Modules linked in: video output
Pid: 2091, comm: kondemand/0 Not tainted 2.6.30-rc2 #1 MP Server
RIP: 0010:[<ffffffff8021af75>]  [<ffffffff8021af75>]
get_measured_perf+0x4a/0xf9
RSP: 0018:ffff880a7d56de20  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000046241a42b6 RCX: ffff88004d219000
RDX: 000000000000b660 RSI: 0000000000000020 RDI: 0000000000000001
RBP: ffff880a7f052000 R08: 00000046241a42b6 R09: ffffffff807639f0
R10: 00000000ffffffea R11: ffffffff802207f4 R12: ffff880a7f052000
R13: ffff88004d20e460 R14: 0000000000ddd5a6 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff88004d200000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000a7f1bf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kondemand/0 (pid: 2091, threadinfo ffff880a7d56c000, task
ffff880a7d4d18c0)
Stack:
 ffff880a7f052078 ffffffff803efd54 00000046241a42b6 000000462ffa9e95
 0000000000000001 0000000000000001 00000000ffffffea ffffffff8064f41a
 0000000000000012 0000000000000012 ffff880a7f052000 ffffffff80650547
Call Trace:
 [<ffffffff803efd54>] ? kobject_get+0x12/0x17
 [<ffffffff8064f41a>] ? __cpufreq_driver_getavg+0x42/0x57
 [<ffffffff80650547>] ? do_dbs_timer+0x147/0x272
 [<ffffffff80650400>] ? do_dbs_timer+0x0/0x272
 [<ffffffff802474ca>] ? worker_thread+0x15b/0x1f5
 [<ffffffff8024a02c>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffff8024736f>] ? worker_thread+0x0/0x1f5
 [<ffffffff80249f0d>] ? kthread+0x54/0x83
 [<ffffffff8020c87a>] ? child_rip+0xa/0x20
 [<ffffffff80249eb9>] ? kthread+0x0/0x83
 [<ffffffff8020c870>] ? child_rip+0x0/0x20
Code: 99 a6 03 00 31 c9 85 c0 0f 85 c3 00 00 00 89 df 4c 8b 44 24 10 48
c7 c2 60 b6 00 00 48 8b 0c fd e0 30 a5 80 4c 89 c3 48 8b 04 0a <48> 2b
58 20 48 8b 44 24 18 48 89 1c 24 48 8b 34 0a 48 2b 46 28
RIP  [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
 RSP <ffff880a7d56de20>
CR2: 0000000000000020
---[ end trace 2b8fac9a49e19ad4 ]---

Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:20 -04:00
Linus Torvalds
ea34f43a07 acpi-cpufreq: fix 'smp_call_function_many()' confusion
It turns out that 'smp_call_function_many()' doesn't work at all like
'smp_call_function_single()', and my change to Andrew's patch to use it
rather than a loop over all CPU's acpi-cpufreq doesn't work.

My bad.

'smp_call_function_many()' has two "features" (aka "documented bugs"):

 (a) it needs to be called with preemption disabled, because it uses
     smp_processor_id() without guarding the CPU lookup with 'get_cpu()'
     and 'put_cpu()' like the 'single' variant does.

 (b) even if the current CPU is part of the CPU mask, it won't do the
     call on that CPU.

Still, we're better off trying to use 'smp_call_function_many()' than
looping over CPU's, since it at least in theory allows us to use a
broadcast IPI and do it all in parallel.  So let's just work around the
silly semantic bugs in that function.

Reported-and-tested-by: Ali Gholami Rudi <ali@rudi.ir>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15 08:41:16 -07:00
Linus Torvalds
1c98aa7424 Fix quilt merge error in acpi-cpufreq.c
We ended up incorrectly using '&cur' instead of '&readin' in the
work_on_cpu() -> smp_call_function_single() transformation in commit
01599fca67 ("cpufreq: use
smp_call_function_[single|many]() in acpi-cpufreq.c").

Andrew explains:
 "OK, the acpi tree went and had conflicting changes merged into it after
  I'd written the patch and it appears that I incorrectly reverted part
  of 18b2646fe3 while fixing the resulting
  rejects.

  Switching it to `readin' looks correct."

Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 18:09:20 -07:00
Andrew Morton
01599fca67 cpufreq: use smp_call_function_[single|many]() in acpi-cpufreq.c
Atttempting to rid us of the problematic work_on_cpu().  Just use
smp_call_fuction_single() here.

This repairs a 10% sysbench(oltp)+mysql regression which Mike reported,
due to

  commit 6b44003e5c
  Author: Andrew Morton <akpm@linux-foundation.org>
  Date:   Thu Apr 9 09:50:37 2009 -0600

      work_on_cpu(): rewrite it to create a kernel thread on demand

It seems that the kernel calls these acpi-cpufreq functions at a quite
high frequency.

Valdis Kletnieks also reports that this causes 70-90 forks per second on
his hardware.

Cc: Valdis.Kletnieks@vt.edu
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mike Galbraith <efault@gmx.de>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
[ Made it use smp_call_function_many() instead of looping over cpu's
  with smp_call_function_single()    - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 11:09:46 -07:00
Andreas Herrmann
6265ff19ca x86: cacheinfo: complete L2/L3 Cache and TLB associativity field definitions
See "CPUID Specification" (AMD Publication #: 25481, Rev. 2.28, April 2008)

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Mark Langsdorf <mark.langsdorf@amd.com>
LKML-Reference: <20090409134710.GA8026@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 15:41:18 +02:00
Mark Langsdorf
ba518bea2d x86: cacheinfo: disable L3 ECC scrubbing when L3 cache index is disabled
(Use correct mask to zero out bits 24-28 by Andreas)

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20090409132406.GK31527@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 14:22:34 +02:00
Mark Langsdorf
f8b201fc71 x86: cacheinfo: replace sysfs interface for cache_disable feature
Impact: replace sysfs attribute

Current interface violates against "one-value-per-sysfs-attribute
rule". This patch replaces current attribute with two attributes --
one for each L3 Cache Index Disable register.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20090409131849.GJ31527@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 14:21:53 +02:00
Andreas Herrmann
afd9fceec5 x86: cacheinfo: use cached K8 NB_MISC devices instead of scanning for it
Impact: avoid code duplication

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Langsdorf <mark.langsdorf@amd.com>
LKML-Reference: <20090409131617.GI31527@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 14:21:49 +02:00
Andreas Herrmann
845d8c761e x86: cacheinfo: correct return value when cache_disable feature is not active
Impact: bug fix

If user writes to "cache_disable" attribute on a CPU that does not support
this feature, the process hangs due to an invalid return value in
store_cache_disable().

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Langsdorf <mark.langsdorf@amd.com>
LKML-Reference: <20090409130729.GH31527@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 14:21:42 +02:00
Andreas Herrmann
bda869c614 x86: cacheinfo: use L3 cache index disable feature only for CPUs that support it
AMD family 0x11 CPU doesn't support the feature.

Some AMD family 0x10 CPUs do not support it or have an erratum, see
erratum #382 in "Revision Guide for AMD Family 10h Processors, 41322
Rev. 3.40 February 2009".

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
CC: Mark Langsdorf <mark.langsdorf@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20090409130510.GG31527@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 14:21:40 +02:00
Linus Torvalds
e66dd19092 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: cpu_debug remove execute permission
  x86: smarten /proc/interrupts output for new counters
  x86: DMI match for the Dell DXP061 as it needs BIOS reboot
  x86: make 64 bit to use default_inquire_remote_apic
  x86, setup: un-resequence mode setting for VGA 80x34 and 80x60 modes
  x86, intel-iommu: fix X2APIC && !ACPI build failure
2009-04-09 10:38:23 -07:00
Jaswinder Singh Rajput
f20ab9c38f x86: cpu_debug remove execute permission
It seems by mistake these files got execute permissions so removing it.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
LKML-Reference: <1239211186.9037.2.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-09 06:34:02 +02:00
Peter Zijlstra
78f13e9525 perf_counter: allow for data addresses to be recorded
Paul suggested we allow for data addresses to be recorded along with
the traditional IPs as power can provide these.

For now, only the software pagefault events provide data addresses,
but in the future power might as well for some events.

x86 doesn't seem capable of providing this atm.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.394816925@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 19:05:56 +02:00
Len Brown
8897c18595 Merge branches 'release', 'APERF', 'ARAT', 'misc', 'kelvin', 'device-lock' and 'bjorn.notify' into release 2009-04-07 18:18:42 -04:00
Venkatesh Pallipadi
db954b5898 x86 ACPI: Add support for Always Running APIC timer
Add support for Always Running APIC timer, CPUID_0x6_EAX_Bit2.
This bit means the APIC timer continues to run even when CPU is
in deep C-states.

The advantage is that we can use LAPIC timer on these CPUs
always, and there is no need for "slow to read and program"
external timers (HPET/PIT) and the timer broadcast logic
and related code in C-state entry and exit.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 18:17:51 -04:00
Venkatesh Pallipadi
18b2646fe3 ACPI x86: Make aperf/mperf MSR access in acpi_cpufreq read_only
Do not write zeroes to APERF and MPERF by ondemand governor. With this
change, other users can share these MSRs for reads.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 18:15:05 -04:00
Venkatesh Pallipadi
e4f6937222 ACPI x86: Cleanup acpi_cpufreq structures related to aperf/mperf
Change structure name to make the code cleaner and simpler. No
functionality change in this patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 18:15:05 -04:00
Peter Zijlstra
f6c7d5fe58 perf_counter: theres more to overflow than writing events
Prepare for more generic overflow handling. The new perf_counter_overflow()
method will handle the generic bits of the counter overflow, and can return
a !0 return value, in which case the counter should be (soft) disabled, so
that it won't count until it's properly disabled.

XXX: do powerpc and swcounter

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.812109629@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-07 10:48:56 +02:00
Peter Zijlstra
b6276f353b perf_counter: x86: self-IPI for pending work
Implement set_perf_counter_pending() with a self-IPI so that it will
run ASAP in a usable context.

For now use a second IRQ vector, because the primary vector pokes
the apic in funny ways that seem to confuse things.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.724626696@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-07 10:48:56 +02:00
Huang Weiyi
d226169428 ACPI: cpufreq: remove dupilcated #include
Remove dupilicated #include in arch/x86/kernel/cpu/cpufreq/longhaul.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 01:39:14 -04:00
Peter Zijlstra
5872bdb88a perf_counter: add more context information
Put in counts to tell which ips belong to what context.

  -----
   | |  hv
   | --
nr | |  kernel
   | --
   | |  user
  -----

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Orig-LKML-Reference: <20090402091319.493101305@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:46 +02:00
Peter Zijlstra
4e935e4717 perf_counter: pmc arbitration
Follow the example set by powerpc and try to play nice with oprofile
and the nmi watchdog.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171024.459968444@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:44 +02:00
Peter Zijlstra
d7d59fb323 perf_counter: x86: callchain support
Provide the x86 perf_callchain() implementation.

Code based on the ftrace/sysprof code from Soeren Sandmann Pedersen.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Soeren Sandmann Pedersen <sandmann@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Orig-LKML-Reference: <20090330171024.341993293@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:43 +02:00
Peter Zijlstra
9ea98e1912 perf_counter: x86: proper error propagation for the x86 hw_perf_counter_init()
Now that Paul cleaned up the error propagation paths, pass down the
x86 error as well.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171023.792822360@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:40 +02:00
Peter Zijlstra
925d519ab8 perf_counter: unify and fix delayed counter wakeup
While going over the wakeup code I noticed delayed wakeups only work
for hardware counters but basically all software counters rely on
them.

This patch unifies and generalizes the delayed wakeup to fix this
issue.

Since we're dealing with NMI context bits here, use a cmpxchg() based
single link list implementation to track counters that have pending
wakeups.

[ This should really be generic code for delayed wakeups, but since we
  cannot use cmpxchg()/xchg() in generic code, I've let it live in the
  perf_counter code. -- Eric Dumazet could use it to aggregate the
  network wakeups. ]

Furthermore, the x86 method of using TIF flags was flawed in that its
quite possible to end up setting the bit on the idle task, loosing the
wakeup.

The powerpc method uses per-cpu storage and does appear to be
sufficient.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:36 +02:00
Peter Zijlstra
f4a2deb486 perf_counter: remove the event config bitfields
Since the bitfields turned into a bit of a mess, remove them and rely on
good old masks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.059499915@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:25 +02:00
Peter Zijlstra
0322cd6ec5 perf_counter: unify irq output code
Impact: cleanup

Having 3 slightly different copies of the same code around does nobody
any good. First step in revamping the output format.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.929962222@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:17 +02:00
Peter Zijlstra
b8e83514b6 perf_counter: revamp syscall input ABI
Impact: modify ABI

The hardware/software classification in hw_event->type became a little
strained due to the addition of tracepoint tracing.

Instead split up the field and provide a type field to explicitly specify
the counter type, while using the event_id field to specify which event to
use.

Raw counters still work as before, only the raw config now goes into
raw_event.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.836807573@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:17 +02:00
Ingo Molnar
7bb497bd88 perf_counter: fix crash on perfmon v1 systems
Impact: fix boot crash on Intel Perfmon Version 1 systems

Intel Perfmon v1 does not support the global MSRs, nor does
it offer the generalized MSR ranges. So support v2 and later
CPUs only.

Also mark pmc_ops as read-mostly - to avoid false cacheline
sharing.

Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:14 +02:00
Peter Zijlstra
82bae4f8c2 perf_counter: x86: use ULL postfix for 64bit constants
Fix a build warning on 32bit machines by explicitly marking the
constants as 64-bit.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:29:34 +02:00
Peter Zijlstra
60b3df9c1e perf_counter: add comment to barrier
We need to ensure the enabled=0 write happens before we
start disabling the actual counters, so that a pcm_amd_enable()
will not enable one underneath us.

I think the race is impossible anyway, we always balance the
ops within any one context and perform enable() with IRQs disabled.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:29:32 +02:00
Peter Zijlstra
595258aaea perf_counter: x86: fix 32-bit irq_period assumption
No need to assume the irq_period is 32bit.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:29:29 +02:00
Ingo Molnar
f541ae326f Merge branch 'linus' into perfcounters/core-v2
Merge reason: we have gathered quite a few conflicts, need to merge upstream

Conflicts:
	arch/powerpc/kernel/Makefile
	arch/x86/ia32/ia32entry.S
	arch/x86/include/asm/hardirq.h
	arch/x86/include/asm/unistd_32.h
	arch/x86/include/asm/unistd_64.h
	arch/x86/kernel/cpu/common.c
	arch/x86/kernel/irq.c
	arch/x86/kernel/syscall_table_32.S
	arch/x86/mm/iomap_32.c
	include/linux/sched.h
	kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:02:57 +02:00
Linus Torvalds
32fb6c1756 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (140 commits)
  ACPI: processor: use .notify method instead of installing handler directly
  ACPI: button: use .notify method instead of installing handler directly
  ACPI: support acpi_device_ops .notify methods
  toshiba-acpi: remove MAINTAINERS entry
  ACPI: battery: asynchronous init
  acer-wmi: Update copyright notice & documentation
  acer-wmi: Cleanup the failure cleanup handling
  acer-wmi: Blacklist Acer Aspire One
  video: build fix
  thinkpad-acpi: rework brightness support
  thinkpad-acpi: enhanced debugging messages for the fan subdriver
  thinkpad-acpi: enhanced debugging messages for the hotkey subdriver
  thinkpad-acpi: enhanced debugging messages for rfkill subdrivers
  thinkpad-acpi: restrict access to some firmware LEDs
  thinkpad-acpi: remove HKEY disable functionality
  thinkpad-acpi: add new debug helpers and warn of deprecated atts
  thinkpad-acpi: add missing log levels
  thinkpad-acpi: cleanup debug helpers
  thinkpad-acpi: documentation cleanup
  thinkpad-acpi: drop ibm-acpi alias
  ...
2009-04-05 11:16:25 -07:00
Linus Torvalds
714f83d5d9 Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
  tracing, net: fix net tree and tracing tree merge interaction
  tracing, powerpc: fix powerpc tree and tracing tree interaction
  ring-buffer: do not remove reader page from list on ring buffer free
  function-graph: allow unregistering twice
  trace: make argument 'mem' of trace_seq_putmem() const
  tracing: add missing 'extern' keywords to trace_output.h
  tracing: provide trace_seq_reserve()
  blktrace: print out BLK_TN_MESSAGE properly
  blktrace: extract duplidate code
  blktrace: fix memory leak when freeing struct blk_io_trace
  blktrace: fix blk_probes_ref chaos
  blktrace: make classic output more classic
  blktrace: fix off-by-one bug
  blktrace: fix the original blktrace
  blktrace: fix a race when creating blk_tree_root in debugfs
  blktrace: fix timestamp in binary output
  tracing, Text Edit Lock: cleanup
  tracing: filter fix for TRACE_EVENT_FORMAT events
  ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
  x86: kretprobe-booster interrupt emulation code fix
  ...

Fix up trivial conflicts in
 arch/parisc/include/asm/ftrace.h
 include/linux/memory.h
 kernel/extable.c
 kernel/module.c
2009-04-05 11:04:19 -07:00
Linus Torvalds
90975ef712 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
  cpumask: remove cpumask allocation from idle_balance, fix
  numa, cpumask: move numa_node_id default implementation to topology.h, fix
  cpumask: remove cpumask allocation from idle_balance
  x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
  x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
  x86: microcode: cleanup
  x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
  cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
  numa, cpumask: move numa_node_id default implementation to topology.h
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  cpumask: remove x86 cpumask_t uses.
  cpumask: use cpumask_var_t in uv_flush_tlb_others.
  cpumask: remove cpumask_t assignment from vector_allocation_domain()
  cpumask: make Xen use the new operators.
  cpumask: clean up summit's send_IPI functions
  cpumask: use new cpumask functions throughout x86
  x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
  cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  x86: unify 32 and 64-bit node_to_cpumask_map
  ...
2009-04-05 10:33:07 -07:00
Len Brown
478c6a43fc Merge branch 'linus' into release
Conflicts:
	arch/x86/kernel/cpu/cpufreq/longhaul.c

Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-05 02:14:15 -04:00
Len Brown
8a3f257c70 Merge branch 'misc' into release 2009-04-05 01:52:07 -04:00
Ingo Molnar
c5c67c7cba x86, mtrr: remove debug message
The MTRR code grew a new debug message which triggers commonly:

[   40.142276]   get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
[   40.142280]   get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
[   40.142284]   get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back
[   40.142311]   get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
[   40.142314]   get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
[   40.142317]   get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back

Remove this annoyance.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-04 00:31:02 +02:00
Ingo Molnar
8302294f43 Merge branch 'tracing/core-v2' into tracing-for-linus
Conflicts:
	include/linux/slub_def.h
	lib/Kconfig.debug
	mm/slob.c
	mm/slub.c
2009-04-02 00:49:02 +02:00
Rusty Russell
558f6ab910 Merge branch 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
Conflicts:

	arch/x86/include/asm/topology.h
	drivers/oprofile/buffer_sync.c
(Both cases: changed in Linus' tree, removed in Ingo's).
2009-03-31 13:33:50 +10:30
Ingo Molnar
65fb0d23fc Merge branch 'linus' into cpumask-for-linus
Conflicts:
	arch/x86/kernel/cpu/common.c
2009-03-30 23:53:32 +02:00
Alexey Dobriyan
99b7623380 proc 2/2: remove struct proc_dir_entry::owner
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:44 +04:00
Ingo Molnar
3fab191002 Merge branch 'linus' into x86/core 2009-03-28 22:27:45 +01:00
Pallipadi, Venkatesh
a59d1637eb ACPI: cap off P-state transition latency from buggy BIOSes
Some BIOSes report very high frequency transition latency which are plainly
wrong on CPus that can change frequency using native MSR interface.

One such system is IBM T42 (2327-8ZU) as reported by Owen Taylor and
Rik van Riel.

cpufreq_ondemand driver uses this transition latency to come up with a
reasonable sampling interval to sample CPU usage and with such high
latency value, ondemand sampling interval ends up being very high
(0.5 sec, in this particular case), resulting in performance impact due to
slow response to increasing frequency.

Fix it by capping-off the transition latency to 20uS for native MSR based
frequency transitions.

mjg: We've confirmed that this also helps on the X31

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-27 21:21:09 -04:00
Lin Ming
fb318cbff4 ACPI: cpufreq: use new bit register access function
> arch/x86/kernel/cpu/cpufreq/longhaul.c: In function 'longhaul_setstate':
> arch/x86/kernel/cpu/cpufreq/longhaul.c:308: error: implicit declaration of function 'acpi_set_register'

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Compile-tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-27 16:49:53 -04:00
Ingo Molnar
6e15cf0486 Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2
Conflicts:
	arch/parisc/kernel/irq.c
	arch/x86/include/asm/fixmap_64.h
	arch/x86/include/asm/setup.h
	kernel/irq/handle.c

Semantic merge:
        arch/x86/include/asm/fixmap.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-27 17:28:43 +01:00
Linus Torvalds
831576fe40 Merge branch 'sched-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (46 commits)
  sched: Add comments to find_busiest_group() function
  sched: Refactor the power savings balance code
  sched: Optimize the !power_savings_balance during fbg()
  sched: Create a helper function to calculate imbalance
  sched: Create helper to calculate small_imbalance in fbg()
  sched: Create a helper function to calculate sched_domain stats for fbg()
  sched: Define structure to store the sched_domain statistics for fbg()
  sched: Create a helper function to calculate sched_group stats for fbg()
  sched: Define structure to store the sched_group statistics for fbg()
  sched: Fix indentations in find_busiest_group() using gotos
  sched: Simple helper functions for find_busiest_group()
  sched: remove unused fields from struct rq
  sched: jiffies not printed per CPU
  sched: small optimisation of can_migrate_task()
  sched: fix typos in documentation
  sched: add avg_overlap decay
  x86, sched_clock(): mark variables read-mostly
  sched: optimize ttwu vs group scheduling
  sched: TIF_NEED_RESCHED -> need_reshed() cleanup
  sched: don't rebalance if attached on NULL domain
  ...
2009-03-26 16:05:01 -07:00
Linus Torvalds
ada19a31a9 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (35 commits)
  [CPUFREQ] Prevent p4-clockmod from auto-binding to the ondemand governor.
  [CPUFREQ] Make cpufreq-nforce2 less obnoxious
  [CPUFREQ] p4-clockmod reports wrong frequency.
  [CPUFREQ] powernow-k8: Use a common exit path.
  [CPUFREQ] Change link order of x86 cpufreq modules
  [CPUFREQ] conservative: remove 10x from def_sampling_rate
  [CPUFREQ] conservative: fixup governor to function more like ondemand logic
  [CPUFREQ] conservative: fix dbs_cpufreq_notifier so freq is not locked
  [CPUFREQ] conservative: amend author's email address
  [CPUFREQ] Use swap() in longhaul.c
  [CPUFREQ] checkpatch cleanups for acpi-cpufreq
  [CPUFREQ] powernow-k8: Only print error message once, not per core.
  [CPUFREQ] ondemand/conservative: sanitize sampling_rate restrictions
  [CPUFREQ] ondemand/conservative: deprecate sampling_rate{min,max}
  [CPUFREQ] powernow-k8: Always compile powernow-k8 driver with ACPI support
  [CPUFREQ] Introduce /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_transition_latency
  [CPUFREQ] checkpatch cleanups for powernow-k8
  [CPUFREQ] checkpatch cleanups for ondemand governor.
  [CPUFREQ] checkpatch cleanups for powernow-k7
  [CPUFREQ] checkpatch cleanups for speedstep related drivers.
  ...
2009-03-26 11:04:08 -07:00
Jaswinder Singh Rajput
f2362e6f1b x86: cpu/cpu.h cleanup
Impact: cleanup

 - Fix various style issues

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-23 02:06:51 +05:30
Ingo Molnar
705bb9dc72 Merge branches 'x86/cleanups', 'x86/cpu', 'x86/debug', 'x86/mce2', 'x86/mm', 'x86/mtrr', 'x86/setup', 'x86/setup-memory', 'x86/urgent', 'x86/uv', 'x86/x2apic' and 'linus' into x86/core
Conflicts:
	arch/parisc/kernel/irq.c
2009-03-18 13:19:49 +01:00
Jaswinder Singh Rajput
4e16c88875 x86: cpu/mttr/cleanup.c fix compilation warning
arch/x86/kernel/cpu/mtrr/cleanup.c:197: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long unsigned int’

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1237378015.13488.1.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 13:14:31 +01:00
Rusty Russell
30e1e6d1af cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
Impact: Fix cpu offline when CONFIG_MAXSMP=y

Changeset bc9b83dd1f "cpumask: convert
c1e_mask in arch/x86/kernel/process.c to cpumask_var_t" contained a
bug: c1e_mask is manipulated even if C1E isn't detected (and hence
not allocated).

This is simply fixed by checking for NULL (which gcc optimizes out
anyway of CONFIG_CPUMASK_OFFSTACK=n, since it knows ce1_mask can never
be NULL).

In addition, fix a leak where select_idle_routine re-allocates
(and re-clears) c1e_mask on every cpu init.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Mike Travis <travis@sgi.com>
LKML-Reference: <200903171450.34549.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 09:40:35 +01:00
Andrew Morton
a6b6a14e0c x86: use smp_call_function_single() in arch/x86/kernel/cpu/mcheck/mce_amd_64.c
Attempting to rid us of the problematic work_on_cpu().  Just use
smp_call_function_single() here.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <20090318042217.EF3F1DDF39@ozlabs.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 07:03:12 +01:00
Yinghai Lu
f0348c438c x86: MTRR workaround for system with stange var MTRRs
Impact: don't trim e820 according to wrong mtrr

Ozan reports that his server emits strange warning.
it turns out the BIOS sets the MTRRs incorrectly.

Ignore those strange ranges, and don't trim e820,
just emit one warning about BIOS

Reported-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49BEE1E7.7020706@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-17 10:47:47 +01:00
Hidetoshi Seto
514ec49a5f x86, mce: remove incorrect __cpuinit for intel_init_cmci()
Impact: Bug fix on UP

Referring commit cc3ca22063,
Peter removed __cpuinit annotations for mce_cpu_features()
and its successor functions, which caused troubles on UP
configurations.

However the intel_init_cmci() was introduced after that and
it also has __cpuinit annotation even though it is called from
mce_cpu_features(). Remove the annotation from that function
too.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-16 09:15:32 +01:00
Jaswinder Singh Rajput
f4c3c4cdb1 x86: cpu_debug add support for various AMD CPUs
Impact: Added AMD CPUs support

Added flags for various AMD CPUs.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 18:07:58 +01:00
Sebastian Andrzej Siewior
48f4c485c2 x86/centaur: merge 32 & 64 bit version
there should be no difference, except:

 * the 64bit variant now also initializes the padlock unit.
 * ->c_early_init() is executed again from ->c_init()
 * the 64bit fixups made into 32bit path.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: herbert@gondor.apana.org.au
LKML-Reference: <1237029843-28076-2-git-send-email-sebastian@breakpoint.cc>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 16:27:29 +01:00
Ingo Molnar
0ca0f16fd1 Merge branches 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/debug', 'x86/kconfig', 'x86/mm', 'x86/ptrace', 'x86/setup' and 'x86/urgent'; commit 'v2.6.29-rc8' into x86/core 2009-03-14 16:25:40 +01:00
Yinghai Lu
d4c90e37a2 x86: print the continous part of fixed mtrrs together
Impact: print out fewer lines

 1. print continuous range with same type together
 2. change _INFO to _DEBUG

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49BACB61.8000302@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 12:27:06 +01:00
Yinghai Lu
63516ef6d6 x86: fix get_mtrr() warning about smp_processor_id() with CONFIG_PREEMPT=y
Impact: fix debug warning

Jaswinder noticed that there is a warning about smp_processor_id()
in get_mtrr().

Fix it by wrapping the printout into a get/put_cpu() pair.

Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <49BAB7FF.4030107@kernel.org>
[ changed to get/put_cpu(), cleaned up surrounding code a it. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 12:27:06 +01:00
Ingo Molnar
0f3fa48a7e x86: cpu/common.c more cleanups
Complete/fix the cleanups of cpu/common.c:

 - fix ugly warning due to asm/topology.h -> linux/topology.h change
 - standardize the style across the file
 - simplify/refactor the code flow where possible

Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 10:37:34 +01:00
Ingo Molnar
62395efdb0 Merge branch 'x86/asm' into tracing/syscalls
We need the wider TIF work-mask checks in entry_32.S.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 09:44:08 +01:00
Jaswinder Singh Rajput
9766cdbcb2 x86: cpu/common.c cleanups
- fix various style problems
 - declare varibles before they get used
 - introduced clear_all_debug_regs
 - fix header files issues

LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 08:59:50 +01:00
Andreas Herrmann
3ff42da504 x86: mtrr: don't modify RdDram/WrDram bits of fixed MTRRs
Impact: bug fix + BIOS workaround

BIOS is expected to clear the SYSCFG[MtrrFixDramModEn] on AMD CPUs
after fixed MTRRs are configured.

Some BIOSes do not clear SYSCFG[MtrrFixDramModEn] on BP (and on APs).

This can lead to obfuscation in Linux when this bit is not cleared on
BP but cleared on APs. A consequence of this is that the saved
fixed-MTRR state (from BP) differs from the fixed-MTRRs of APs --
because RdDram/WrDram bits are read as zero when
SYSCFG[MtrrFixDramModEn] is cleared -- and Linux tries to sync
fixed-MTRR state from BP to AP. This implies that Linux sets
SYSCFG[MtrrFixDramEn] and activates those bits.

More important is that (some) systems change these bits in SMM when
ACPI is enabled. Hence it is racy if Linux modifies RdMem/WrMem bits,
too.

(1) The patch modifies an old fix from Bernhard Kaindl to get
    suspend/resume working on some Acer Laptops. Bernhard's patch
    tried to sync RdMem/WrMem bits of fixed MTRR registers and that
    helped on those old Laptops. (Don't ask me why -- can't test it
    myself). But this old problem was not the motivation for the
    patch. (See http://lkml.org/lkml/2007/4/3/110)

(2) The more important effect is to fix issues on some more current systems.

    On those systems Linux panics or just freezes, see

    http://bugzilla.kernel.org/show_bug.cgi?id=11541
    (and also duplicates of this bug:
    http://bugzilla.kernel.org/show_bug.cgi?id=11737
    http://bugzilla.kernel.org/show_bug.cgi?id=11714)

    The affected systems boot only using acpi=ht, acpi=off or
    when the kernel is built with CONFIG_MTRR=n.

    The acpi options prevent full enablement of ACPI.  Obviously when
    ACPI is enabled the BIOS/SMM modfies RdMem/WrMem bits.  When
    CONFIG_MTRR=y Linux also accesses and modifies those bits when it
    needs to sync fixed-MTRRs across cores (Bernhard's fix, see (1)).
    How do you synchronize that? You can't. As a consequence Linux
    shouldn't touch those bits at all (Rationale are AMD's BKDGs which
    recommend to clear the bit that makes RdMem/WrMem accessible).
    This is the purpose of this patch. And (so far) this suffices to
    fix (1) and (2).

I suggest not to touch RdDram/WrDram bits of fixed-MTRRs and
SYSCFG[MtrrFixDramEn] and to clear SYSCFG[MtrrFixDramModEn] as
suggested by AMD K8, and AMD family 10h/11h BKDGs.
BIOS is expected to do this anyway. This should avoid that
Linux and SMM tread on each other's toes ...

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: trenn@suse.de
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090312163937.GH20716@alberich.amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 10:19:27 +01:00
Rusty Russell
4f0628963c cpumask: use new cpumask functions throughout x86
Impact: cleanup

1) &cpu_online_map -> cpu_online_mask
2) first_cpu/next_cpu_nr -> cpumask_first/cpumask_next
3) cpu_*_map manipulation -> init_cpu_* / set_cpu_*

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:54 +10:30
Rusty Russell
3f76a183de x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
Impact: cleanup

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:54 +10:30
Rusty Russell
996867d096 cpumask: convert arch/x86/kernel/cpu/mcheck/mce_64.c
Impact: reduce kernel memory usage when CONFIG_CPUMASK_OFFSTACK=y

Simple conversion of mce_device_initialized to cpumask_var_t.  We don't
check the alloc_cpumask_var() return since it's boot-time only, and
the misc_register() in that same function isn't checked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:51 +10:30
Rusty Russell
7ad728f981 cpumask: x86: convert cpu_sibling_map/cpu_core_map to cpumask_var_t
Impact: reduce per-cpu size for CONFIG_CPUMASK_OFFSTACK=y

In most places it's cleaner to use the accessors cpu_sibling_mask()
and cpu_core_mask() wrappers which already exist.

I couldn't avoid cleaning up the access in oprofile, either.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:50 +10:30
Ingo Molnar
f6411fe7e0 Merge branches 'sched/clock', 'sched/urgent' and 'linus' into sched/core 2009-03-13 04:50:44 +01:00
Jaswinder Singh Rajput
91219bcbdc x86: cpu_debug add write support for MSRs
Supported write flag for registers.
currently write is enabled only for PMC MSR.

[root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
0x0

[root@ht]# echo 1234 > /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
[root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
0x4d2

[root@ht]# echo 0x1234 > /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
[root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value
0x1234

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 03:02:45 +01:00
Yinghai Lu
0d890355bf x86: separate mtrr cleanup/mtrr_e820 trim to separate file
Impact: cleanup

mtrr main.c is too big, seperate mtrr cleanup and mtrr e820 trim
code to another file.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B87C7B.80809@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 02:52:19 +01:00
Yinghai Lu
c1ab7e93c6 x86: print out mtrr_range_state when user specify size
Impact: print more debug info

Keep it consistent with autodetect version.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B87C0A.4010105@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 02:52:18 +01:00
Yinghai Lu
8ad9790588 x86: more MTRR debug printouts
Impact: improve MTRR debugging messages

There's still inefficiencies suspected with the MTRR sanitizing
code, so make sure we get all the info we need from a dmesg.

- Remove unneeded mtrr_show

 (It will only printout one time by first cpu, so it is no big deal.)

- Also print out directly from get_mtrr, because it doesn't update mtrr_state.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B9BA5A.40108@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 02:52:18 +01:00
Jan Beulich
13c6c53282 x86, 32-bit: also use cpuinfo_x86's x86_{phys,virt}_bits members
Impact: 32/64-bit consolidation

In a first step, this allows fixing phys_addr_valid() for PAE (which
until now reported all addresses to be valid). Subsequently, this will
also allow simplifying some MTRR handling code.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B9101E.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 02:37:17 +01:00
Jan Beulich
02dde8b45c x86: move various CPU initialization objects into .cpuinit.rodata
Impact: debuggability and micro-optimization

Putting whatever is possible into the (final) .rodata section increases
the likelihood of catching memory corruption bugs early, and reduces
false cache line sharing.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B90961.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-12 13:13:07 +01:00
Ingo Molnar
a98fe7f342 Merge branches 'x86/asm', 'x86/debug', 'x86/mm', 'x86/setup', 'x86/urgent' and 'linus' into x86/core 2009-03-12 11:50:15 +01:00
Jaswinder Singh Rajput
8229d75438 x86: cpu architecture debug code, build fix, cleanup
move store_ldt outside the CONFIG_PARAVIRT section and
also clean up the code a bit.

Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-11 14:52:03 +01:00
Ingo Molnar
d95c357812 Merge branch 'x86/core' into cpus4096 2009-03-11 10:49:34 +01:00
Ingo Molnar
78b020d035 Merge branches 'x86/cleanups', 'x86/kexec', 'x86/mce2' and 'linus' into x86/core 2009-03-11 10:49:15 +01:00
KOSAKI Motohiro
5490fa9673 x86, mce: use round_jiffies() instead round_jiffies_relative()
Impact: saving power _very_ little

round_jiffies() round up absolute jiffies to full second.
round_jiffies_relative() round up relative jiffies to full second.

The "t->expires" is absolute jiffies. Then, round_jiffies() should be
used instead round_jiffies_relative().

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-10 22:33:06 -07:00
Ingo Molnar
4dd163a051 Merge branches 'tracing/ftrace', 'tracing/textedit' and 'linus' into tracing/core 2009-03-10 22:54:23 +01:00
Jaswinder Singh Rajput
9b779edf4b x86: cpu architecture debug code
Introduce:

 cat /sys/kernel/debug/x86/cpu/*

for Intel and AMD processors to view / debug the state of each CPU.

By using this we can debug whole range of registers and other
cpu information for debugging purpose and monitor how things
are changing.

This can be useful for developers as well as for users.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
LKML-Reference: <1236701373.3387.4.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-10 18:39:45 +01:00
Ingo Molnar
8293dd6f86 Merge branch 'x86/core' into tracing/ftrace
Semantic merge:

  kernel/trace/trace_functions_graph.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-10 10:17:48 +01:00
Stoyan Gaydarov
8c5dfd2551 x86: BUG to BUG_ON changes
Impact: cleanup

Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com>
LKML-Reference: <1236661850-8237-8-git-send-email-stoyboyker@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-10 09:55:18 +01:00
Dave Jones
129f8ae9b1 Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."
This reverts commit e088e4c9cd.

Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.

Course of action:
 - Find out the remaining causes of overheating, and fix them
   if possible. ACPI should be doing the right thing automatically.
   If it isn't, we need to fix that.
 - mark p4-clockmod ui as deprecated
 - try again with the removal in six months.

It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.

Signed-off-by: Dave Jones <davej@redhat.com>
2009-03-09 15:07:33 -04:00
Jaswinder Singh Rajput
e255357764 x86: perf_counter cleanup
Remove unused variables and duplicate header file.

Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-08 16:26:50 +01:00
Peter Zijlstra
184fe4ab1f x86: perf_counter cleanup
Use and actual unsigned long bitmap instead of casting our way around.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
LKML-Reference: <1236508459.22914.3645.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-08 16:24:49 +01:00
Yinghai Lu
1f442d70c8 x86: remove smp_apply_quirks()/smp_checks()
Impact: cleanup and code size reduction on 64-bit

This code is only applied to Intel Pentium and AMD K7 32-bit cpus.

Move those checks to intel_init()/amd_init() for 32-bit
so 64-bit will not build this code.

Also change to use cpu_index check to see if we need to emit warning.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B377D2.8030108@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-08 16:22:56 +01:00
Ingo Molnar
f0ef039851 Merge branch 'x86/core' into tracing/textedit
Conflicts:
	arch/x86/Kconfig
	block/blktrace.c
	kernel/irq/handle.c

Semantic conflict:
	kernel/trace/blktrace.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 16:45:01 +01:00
Ingo Molnar
caab36b593 Merge branch 'x86/mce2' into x86/core 2009-03-05 21:49:25 +01:00
Peter Zijlstra
b5e8acf66f perfcounters: IRQ and NMI support on AMD CPUs, fix
The BKGD suggests that counter width on AMD CPUs is 48 for all
existing models (it certainly is for mine).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-05 20:37:21 +01:00
Peter Zijlstra
b0f3f28e0f perfcounters: IRQ and NMI support on AMD CPUs
The below completes the K7+ performance counter support:

 - IRQ support
 - NMI support

KernelTop output works now as well.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1236273633.5187.286.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-05 18:25:16 +01:00
Dave Jones
36e8abf3ed [CPUFREQ] Prevent p4-clockmod from auto-binding to the ondemand governor.
The latency of p4-clockmod sucks so hard that scaling on a regular
basis with ondemand is a really bad idea.

Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-03-05 00:16:26 -05:00
Ingo Molnar
73af76dfd1 x86, mce: fix build failure in arch/x86/kernel/cpu/mcheck/threshold.c
Impact: build fix

The APIC code rewrite in the x86 tree broke the x86/mce branch:

 arch/x86/kernel/cpu/mcheck/threshold.c: In function ‘mce_threshold_interrupt’:
 arch/x86/kernel/cpu/mcheck/threshold.c:24: error: implicit declaration of function ‘ack_APIC_irq’

Also tidy up the file a bit while at it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-04 11:48:28 +01:00
Ingo Molnar
91d75e209b Merge branch 'x86/core' into core/percpu 2009-03-04 02:29:19 +01:00
Ingo Molnar
8b0e5860cb Merge branches 'x86/apic', 'x86/cpu', 'x86/fixmap', 'x86/mm', 'x86/sched', 'x86/setup-lzma', 'x86/signal' and 'x86/urgent' into x86/core 2009-03-04 02:22:31 +01:00
Jaswinder Singh Rajput
a1ef58f442 x86: use pr_info in perf_counter.c
Impact: cleanup

using pr_info in perf_counter.c fixes various 80 characters warnings and
also indenting for conditional statement

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-02 11:31:44 +01:00
Jaswinder Singh Rajput
169e41eb7f x86: decent declarations in perf_counter.c
Impact: cleanup

making decent declrations for struct pmc_x86_ops and
fix checkpatch error:
 ERROR: Macros with complex values should be enclosed in parenthesis

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-02 11:31:06 +01:00
Jaswinder Singh Rajput
327f4387e3 x86: remove double copy of show_cpuinfo_core for 32 and 64 bit
Impact: unification

show_cpuinfo_core is identical for 32 and 64 bit and can be unified,
and CONFIG_X86_HT inherently depends on CONFIG_X86_SMP.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-28 19:26:33 -08:00
Jaswinder Singh Rajput
f87ad35d37 x86: AMD Support for perf_counter
Supported basic performance counter for AMD K7 and later:

$ perfstat -e 0,1,2,3,4,5,-1,-2,-3,-4,-5 ls > /dev/null

 Performance counter stats for 'ls':

      12.298610  task clock ticks     (msecs)

        3298477  CPU cycles           (events)
        1406354  instructions         (events)
         749035  cache references     (events)
          16939  cache misses         (events)
         100589  branches             (events)
          11159  branch misses        (events)
       7.627540  cpu clock ticks      (msecs)
      12.298610  task clock ticks     (msecs)
            500  pagefaults           (events)
              6  context switches     (events)
              3  CPU migrations       (events)

 Wall-clock time elapsed:     8.672290 msecs

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-28 10:38:32 +01:00
Jaswinder Singh Rajput
b56a3802dc x86: prepare perf_counter to add more cpus
Introduced  struct pmc_x86_ops to add more cpus.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-28 10:38:27 +01:00
Ingo Molnar
1b49061d40 Merge branch 'sched/clock' into tracing/ftrace
Conflicts:
	kernel/sched_clock.c
2009-02-27 08:35:19 +01:00
Ingo Molnar
ba1d755a36 fix warning in arch/x86/kernel/cpu/intel_cacheinfo.c
fix this warning:

  arch/x86/kernel/cpu/intel_cacheinfo.c:139: warning: ‘k8_nb_id’ defined but not used
  arch/x86/kernel/cpu/intel_cacheinfo.c:527: warning: ‘free_cache_attributes’ defined but not used
  arch/x86/kernel/cpu/intel_cacheinfo.c:538: warning: ‘detect_cache_attributes’ defined but not used

Unused variables in the !CONFIG_SYSCTL case.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 22:39:12 +01:00
Ingo Molnar
83ce400928 x86: set X86_FEATURE_TSC_RELIABLE
If the TSC is constant and non-stop, also set it reliable.

(We will turn this off in DMI quirks for multi-chassis systems)

The performance number on a 16-way Nehalem system running
32 tasks that context-switch between each other is significant:

   sched_clock_stable=0		sched_clock_stable=1
   ....................         ....................
   22.456925 million/sec        24.306972 million/sec   [+8.2%]

lmbench's "lat_ctx -s 0 2" goes from 0.63 microseconds to
0.59 microseconds - a 6.7% increase in context-switching
performance.

Perfstat of 1 million pipe context switches between two tasks:

 Performance counter stats for './pipe-test-1m':

       [before]           [after]
   ............      ............
   37621.421089      36436.848378    task clock ticks     (msecs)

              0                 0    CPU migrations       (events)
        2000274           2000189    context switches     (events)
            194               193    pagefaults           (events)
     8433799643        8171016416    CPU cycles           (events) -3.21%
     8370133368        8180999694    instructions         (events) -2.31%
        4158565           3895941    cache references     (events) -6.74%
          44312             46264    cache misses         (events)

    2349.287976       2279.362465    wall-time            (msecs)  -3.06%

The speedup comes straight from the reduction in the instruction
count. sched_clock_cpu() got simpler and the whole workload thus
executes faster.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 21:20:25 +01:00
Ingo Molnar
8e818179eb Merge branch 'x86/core' into perfcounters/core
Conflicts:
	arch/x86/kernel/apic/apic.c
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 13:02:23 +01:00
Matthew Garrett
eb3092cee7 [CPUFREQ] Make cpufreq-nforce2 less obnoxious
Not owning an nforce2 is a sign of good taste, not an error.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:32 -05:00
Matthias-Christian Ott
199785eac8 [CPUFREQ] p4-clockmod reports wrong frequency.
http://bugzilla.kernel.org/show_bug.cgi?id=10968

[ Updated for current tree, and fixed compile failure
  when p4-clockmod was built modular -- davej]

From: Matthias-Christian Ott <ott@mirix.org>
Signed-off-by: Dominik Brodowski <linux@brodo.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:32 -05:00
Dave Jones
0cb8bc2560 [CPUFREQ] powernow-k8: Use a common exit path.
a0abd520fd introduced a slew of
extra kfree/return -ENODEV pairs. This replaces them all
with gotos.

Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:32 -05:00
Matthew Garrett
de3ed81d74 [CPUFREQ] Change link order of x86 cpufreq modules
Change the link order of the cpufreq modules to ensure that they're
probed in the preferred order when statically linked in.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:32 -05:00
Dave Jones
91420220d2 [CPUFREQ] Use swap() in longhaul.c
Remove hand-coded implementation of swap()

Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:31 -05:00
Dave Jones
3a58df35a6 [CPUFREQ] checkpatch cleanups for acpi-cpufreq
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:31 -05:00
Thomas Renninger
79cc56af9f [CPUFREQ] powernow-k8: Only print error message once, not per core.
This is the typical message you get if you plug in a CPU
which is newer than your BIOS. It's annoying seeing this
message for each core.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:31 -05:00
Thomas Renninger
57f4fa6991 [CPUFREQ] powernow-k8: Always compile powernow-k8 driver with ACPI support
powernow-k8 driver should always try to get cpufreq info from ACPI.
Otherwise it will not be able to detect the transition latency correctly
which results in ondemand governor taking a wrong sampling rate which will
then result in sever performance loss.

Let the user not shoot himself in the foot and always compile in ACPI
support for powernow-k8.

This also fixes a wrong message if ACPI_PROCESSOR is compiled as a module and
#ifndef CONFIG_ACPI_PROCESSOR
path is chosen.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:31 -05:00
Dave Jones
0e64a0c982 [CPUFREQ] checkpatch cleanups for powernow-k8
This driver has so many long function names, and deep nested if's
The remaining warnings will need some code restructuring to clean up.

Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:30 -05:00
Dave Jones
b9e7638a30 [CPUFREQ] checkpatch cleanups for powernow-k7
The asm/timer.h warning can be ignored, it's needed for
recalibrate_cpu_khz()

Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:30 -05:00
Dave Jones
bbfebd6655 [CPUFREQ] checkpatch cleanups for speedstep related drivers.
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:30 -05:00
Dave Jones
6072ace436 [CPUFREQ] checkpatch cleanups for sc520
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:29 -05:00
Dave Jones
14a6650f13 [CPUFREQ] checkpatch cleanups for powernow-k6
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:29 -05:00
Dave Jones
48ee923a66 [CPUFREQ] checkpatch cleanups for longrun
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:29 -05:00
Dave Jones
ac617bd0f7 [CPUFREQ] checkpatch cleanups for longhaul
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:29 -05:00
Dave Jones
00f6a235bf [CPUFREQ] checkpatch cleanups for gx-suspmod
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:29 -05:00
Dave Jones
c9b8c87152 [CPUFREQ] checkpatch cleanups for e_powersaver
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:28 -05:00
Dave Jones
04cd1a99dc [CPUFREQ] checkpatch cleanups for elanfreq
The remaining warning about the simple_strtoul conversion
to strict_strtoul seems kind of pointless to me.

Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:28 -05:00
Dave Jones
20174b65d9 [CPUFREQ] nforce2: Use driver prefix, not cpufreq prefix.
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:28 -05:00
Dave Jones
b5c9166662 [CPUFREQ] checkpatch cleanups for cpufreq-nforce2
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:28 -05:00
Dave Jones
fff78ad5ce [CPUFREQ] Stupidly trivial CodingStyle fix
GNU indent complains about this being ambiguous, because it's dumb.
One of my automated tests relies on the output of indent, so this shuts
it up.

Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-24 22:47:28 -05:00
H. Peter Anvin
638bee71c8 Merge branch 'x86/core' into x86/mce2 2009-02-24 16:11:51 -08:00
H. Peter Anvin
df20e2eb3e x86, mce, cmci: remove incorrect __cpuinit/__cpuexit annotations
Impact: Bug fix on UP

The MCE code is reinitialized from resume, so we can't use
__cpuinit/__cpuexit for most of the code.  Remove those annotations
for anything downstream of mce_init().

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-24 13:41:01 -08:00
Andi Kleen
88ccbedd9c x86, mce, cmci: add CMCI support
Impact: Major new feature

Intel CMCI (Corrected Machine Check Interrupt) is a new
feature on Nehalem CPUs. It allows the CPU to trigger
interrupts on corrected events, which allows faster
reaction to them instead of with the traditional
polling timer.

Also use CMCI to discover shared banks. Machine check banks
can be shared by CPU threads or even cores. Using the CMCI enable
bit it is possible to detect the fact that another CPU already
saw a specific bank. Use this to assign shared banks only
to one CPU to avoid reporting duplicated events.

On CPU hot unplug bank sharing is re discovered. This is done
using a thread that cycles through all the CPUs.

To avoid races between the poller and CMCI we only poll
for banks that are not CMCI capable and only check CMCI
owned banks on a interrupt.

The shared banks ownership information is currently only used for
CMCI interrupts, not polled banks.

The sharing discovery code follows the algorithm recommended in the
IA32 SDM Vol3a 14.5.2.1

The CMCI interrupt handler just calls the machine check poller to
pick up the machine check event that caused the interrupt.

I decided not to implement a separate threshold event like
the AMD version has, because the threshold is always one currently
and adding another event didn't seem to add any value.

Some code inspired by Yunhong Jiang's Xen implementation,
which was in term inspired by a earlier CMCI implementation
by me.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-24 13:41:00 -08:00
Andi Kleen
ee031c31d6 x86, mce, cmci: use polled banks bitmap in machine check poller
Define a per cpu bitmap that contains the banks polled by the machine
check poller. This is needed for the CMCI code in the next patches
to be able to disable polling on specific banks.

The bank by default contains all banks, so there is no behaviour
change. Only future code will remove some banks from the polling
set.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-24 13:26:05 -08:00
Andi Kleen
8457c84d68 x86, mce: replace machine check events logged interval with ratelimit
Impact: behavior change, use common code

Use a standard leaky bucket ratelimit for the machine check
warning print interval instead of waiting every check_interval.
Also decrease the limit to twice per minute.
This interacts better with threshold interrupts because
they can happen more often than check_interval.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-24 13:25:53 -08:00
Andi Kleen
f9695df42c x86, mce, cmci: avoid potential reentry of threshold interrupt
Impact: minor bugfix

The threshold handler on AMD (and soon on Intel) could be theoretically
reentered by the hardware. This could lead to corrupted events
because the machine check poll code assumes it is not reentered.

Move the APIC ACK to the end of the interrupt handler to let
the hardware avoid that.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-24 13:24:42 -08:00
Andi Kleen
b276268631 x86, mce, cmci: factor out threshold interrupt handler
Impact: cleanup; preparation for feature

The mce_amd_64 code has an own private MC threshold vector with an own
interrupt handler. Since Intel needs a similar handler
it makes sense to share the vector because both can not
be active at the same time.

I factored the common APIC handler code into a separate file which can
be used by both the Intel or AMD MC code.

This is needed for the next patch which adds an Intel specific
CMCI handler.

This patch should be a nop for AMD, it just moves some code
around.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-24 13:24:42 -08:00
Andi Kleen
41fdff322e x86, mce, cmci: export MAX_NR_BANKS
Impact: Cleanup (code movement)

Move MAX_NR_BANKS into mce.h because it's needed there
for followup patches.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-24 13:24:42 -08:00
Ingo Molnar
0edcf8d692 Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu
Conflicts:
	arch/x86/include/asm/pgtable.h
2009-02-24 21:52:45 +01:00
Ingo Molnar
a852cbfaaf Merge branches 'x86/acpi', 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/mm', 'x86/signal' and 'x86/urgent'; commit 'v2.6.29-rc6' into x86/core 2009-02-24 21:50:43 +01:00
Ingo Molnar
a7f4463e03 Merge branch 'tracing/ftrace'; commit 'v2.6.29-rc6' into tracing/core 2009-02-24 18:22:39 +01:00
H. Peter Anvin
dc731ca609 Merge branch 'x86/urgent' into x86/mce2 2009-02-23 14:05:56 -08:00
H. Peter Anvin
ec5b3d3243 x86, mce: remove invalid __cpuinit/__cpuexit annotations
Impact: Bug fix when CPU hotplug is disabled

Correct the following broken __cpuinit/__cpuexit annotations:

- mce_cpu_features() is called from mce_resume(), and so cannot be
  __cpuinit.
- mce_disable_cpu() and mce_reenable_cpu() are called from
  mce_cpu_callback(), and so cannot be __cpuexit().

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-23 14:01:04 -08:00
Ingo Molnar
fc6fc7f1b1 Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic conflict resolution:
	arch/x86/kernel/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 20:05:19 +01:00
H. Peter Anvin
cc3ca22063 x86, mce: remove incorrect __cpuinit for mce_cpu_features()
Impact: Bug fix on UP

Checkin 6ec68bff3c:
    x86, mce: reinitialize per cpu features on resume

introduced a call to mce_cpu_features() in the resume path, in order
for the MCE machinery to get properly reinitialized after a resume.
However, this function (and its successors) was flagged __cpuinit,
which becomes __init on UP configurations (on SMP suspend/resume
requires CPU hotplug and so this would not be seen.)

Remove the offending __cpuinit annotations for mce_cpu_features() and
its successor functions.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-20 23:40:40 -08:00
Ingo Molnar
609162850d Merge branches 'x86/asm', 'x86/cleanups' and 'x86/headers' into x86/core 2009-02-20 17:40:50 +01:00
Ingo Molnar
3b6f7b9beb Merge branch 'x86/urgent' into x86/core 2009-02-20 17:40:43 +01:00
Vegard Nossum
ecab22aa6d x86: use symbolic constants for MSR_IA32_MISC_ENABLE bits
Impact: Cleanup. No functional changes.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 12:07:43 +01:00
Ingo Molnar
64b36ca7f4 Merge branches 'tracing/function-graph-tracer' and 'linus' into tracing/core 2009-02-20 11:35:57 +01:00
Rusty Russell
b36128c830 alloc_percpu: change percpu_ptr to per_cpu_ptr
Impact: cleanup

There are two allocated per-cpu accessor macros with almost identical
spelling.  The original and far more popular is per_cpu_ptr (44
files), so change over the other 4 files.

tj: kill percpu_ptr() and update UP too

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: mingo@redhat.com
Cc: lenb@kernel.org
Cc: cpufreq@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
H. Peter Anvin
f6d1826dfa x86, mce: use %ll instead of %L for 64-bit numbers
Impact: Cleanup

The standard spelling of a printf pattern for long long is "ll", not
"L", which is for long double.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 15:44:58 -08:00
Andi Kleen
b79109c3bb x86, mce: separate correct machine check poller and fatal exception handler
Impact: cleanup, performance enhancement

The machine check poller is diverging more and more from the fatal
exception handler. Instead of adding more special cases separate the code
paths completely. The corrected poll path is actually quite simple,
and this doesn't result in much code duplication.

This makes both handlers much easier to read and results in
cleaner code flow.  The exception handler now only needs to care
about uncorrected errors, which also simplifies the handling of multiple
errors. The corrected poller also now always runs in standard interrupt
context and does not need to do anything special to handle NMI context.

Minor behaviour changes:
- MCG status is now not cleared on polling.
- Only the banks which had corrected errors get cleared on polling
- The exception handler only clears banks with errors now

v2: Forward port to new patch order. Add "uc" argument.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 14:52:20 -08:00
Andi Kleen
b5f2fa4ea0 x86, mce: factor out duplicated struct mce setup into one function
Impact: cleanup

This merely factors out duplicated code to set up
the initial struct mce state into a single function.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 14:51:39 -08:00
Andi Kleen
0d7482e3d7 x86, mce: implement dynamic machine check banks support
Impact: cleanup; making code future proof; memory saving on small systems

This patch replaces the hardcoded max number of machine check banks with 
dynamic allocation depending on what the CPU reports. The sysfs
data structures and the banks array are dynamically allocated.

There is still a hard bank limit (128) because the mcelog protocol uses
banks >= 128 as pseudo banks to escape other events. But we expect
that 128 banks is beyond any reasonable CPU for now.

This supersedes an earlier patch by Venki, but it solves the problem
more completely by making the limit fully dynamic (up to the 128
boundary).

This saves some memory on machines with less than 6 banks because
they won't need sysdevs for unused ones and also allows to 
use sysfs to control these banks on possible future CPUs with
more than 6 banks.

This is an updated patch addressing Venki's comments.  I also added in
another patch from Thomas which fixed the error allocation path (that
patch was previously separated)

Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 14:50:58 -08:00
Linus Torvalds
bcf8951fc2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
  x86, mce: use force_sig_info to kill process in machine check
  x86, mce: reinitialize per cpu features on resume
  x86, rcu: fix strange load average and ksoftirqd behavior
2009-02-19 09:14:35 -08:00
Ingo Molnar
72c26c9a26 Merge branch 'linus' into tracing/blktrace
Conflicts:
	block/blktrace.c

Semantic merge:
	kernel/trace/blktrace.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 09:00:35 +01:00
Huang Ying
ef41df4344 x86, mce: fix a race condition in mce_read()
Impact: bugfix

Considering the situation as follow:

before: mcelog.next == 1, mcelog.entry[0].finished = 1

+--------------------------------------------------------------------------
R                   W1                  W2                  W3

read mcelog.next (1)
                    mcelog.next++ (2)
                    (working on entry 1,
                    finished == 0)

mcelog.next = 0
                                        mcelog.next++ (1)
                                        (working on entry 0)
                                                           mcelog.next++ (2)
                                                           (working on entry 1)
                        <----------------- race ---------------->
                    (done on entry 1,
                    finished = 1)
                                                           (done on entry 1,
                                                           finished = 1)

To fix the race condition, a cmpxchg loop is added to mce_read() to
ensure no new MCE record can be added between mcelog.next reading and
mcelog.next = 0.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:33:05 -08:00
Andi Kleen
d6b75584a3 x86, mce: disable machine checks on offlined CPUs
Impact: Lower priority bug fix

Offlined CPUs could still get machine checks, but the machine check handler
cannot handle them properly, leading to an unconditional crash. Disable
machine checks on CPUs that are going down.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:56 -08:00
Andi Kleen
5b4408fdaa x86, mce: don't set up mce sysdev devices with mce=off
Impact: bug fix, in this case the resume handler shouldn't run which
	avoids incorrectly reenabling machine checks on resume

When MCEs are completely disabled on the command line don't set
up the sysdev devices for them either.

Includes a comment fix from Thomas Gleixner.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:50 -08:00
Andi Kleen
52d168e28b x86, mce: switch machine check polling to per CPU timer
Impact: Higher priority bug fix

The machine check poller runs a single timer and then broadcasted an
IPI to all CPUs to check them. This leads to unnecessary
synchronization between CPUs. The original CPU running the timer has
to wait potentially a long time for all other CPUs answering. This is
also real time unfriendly and in general inefficient.

This was especially a problem on systems with a lot of events where
the poller run with a higher frequency after processing some events.
There could be more and more CPU time wasted with this, to
the point of significantly slowing down machines.

The machine check polling is actually fully independent per CPU, so
there's no reason to not just do this all with per CPU timers.  This
patch implements that.

Also switch the poller also to use standard timers instead of work
queues. It was using work queues to be able to execute a user program
on a event, but mce_notify_user() handles this case now with a
separate callback. So instead always run the poll code in in a
standard per CPU timer, which means that in the common case of not
having to execute a trigger there will be less overhead.

This allows to clean up the initialization significantly, because
standard timers are already up when machine checks get init'ed.  No
multiple initialization functions.

Thanks to Thomas Gleixner for some help.

Cc: thockin@google.com
v2: Use del_timer_sync() on cpu shutdown and don't try to handle
migrated timers.
v3: Add WARN_ON for timer running on unexpected CPU

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:44 -08:00
Andi Kleen
9bd9840580 x86, mce: always use separate work queue to run trigger
Impact: Needed for bug fix in next patch

This relaxes the requirement that mce_notify_user has to run in process
context. Useful for future changes, but also leads to cleaner
behaviour now. Now instead mce_notify_user can be called directly
from interrupt (but not NMI) context.

The work queue only uses a single global work struct, which can be done safely
because it is always free to reuse before the trigger function is executed.
This way no events can be lost.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:41 -08:00
Andi Kleen
123aa76ec0 x86, mce: don't disable machine checks during code patching
Impact: low priority bug fix

This removes part of a a patch I added myself some time ago. After some
consideration the patch was a bad idea. In particular it stopped machine check
exceptions during code patching.

To quote the comment:

        * MCEs only happen when something got corrupted and in this
        * case we must do something about the corruption.
        * Ignoring it is worse than a unlikely patching race.
        * Also machine checks tend to be broadcast and if one CPU
        * goes into machine check the others follow quickly, so we don't
        * expect a machine check to cause undue problems during to code
        * patching.

So undo the machine check related parts of
8f4e956b31 NMIs are still disabled.

This only removes code, the only additions are a new comment.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:38 -08:00
Andi Kleen
973a2dd1d5 x86, mce: disable machine checks on suspend
Impact: Bug fix

During suspend it is not reliable to process machine check
exceptions, because CPUs disappear but can still get machine check
broadcasts.  Also the system is slightly more likely to
machine check them, but the handler is typically not a position
to handle them in a meaningfull way.

So disable them during suspend and enable them during resume.

Also make sure they are always disabled on hot-unplugged CPUs.

This new code assumes that suspend always hotunplugs all
non BP CPUs.

v2: Remove the WARN_ONs Thomas objected to.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:14 -08:00
Andi Kleen
380851bc6b x86, mce: use force_sig_info to kill process in machine check
Impact: bug fix (with tolerant == 3)

do_exit cannot be called directly from the exception handler because
it can sleep and the exception handler runs on the exception stack.
Use force_sig() instead.

Based on a earlier patch by Ying Huang who debugged the problem.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:31 -08:00
Andi Kleen
6ec68bff3c x86, mce: reinitialize per cpu features on resume
Impact: Bug fix

This fixes a long standing bug in the machine check code. On resume the
boot CPU wouldn't get its vendor specific state like thermal handling
reinitialized. This means the boot cpu wouldn't ever get any thermal
events reported again.

Call the respective initialization functions on resume

v2: Remove ancient init because they don't have a resume device anyways.
    Pointed out by Thomas Gleixner.
v3: Now fix the Subject too to reflect v2 change

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:28 -08:00
Ingo Molnar
e641f5f525 x86, apic: remove duplicate asm/apic.h inclusions
Impact: cleanup

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:44 +01:00
Ingo Molnar
7b6aa335ca x86, apic: remove genapic.h
Impact: cleanup

Remove genapic.h and remove all references to it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:44 +01:00
Ingo Molnar
0b6de00922 Merge branch 'x86/apic' into perfcounters/core
Conflicts:
	arch/x86/kernel/cpu/perfctr-watchdog.c
2009-02-17 17:20:11 +01:00
Ingo Molnar
7d01d32d3b x86, apic: fix build fallout of genapic changes
- make oprofile build
- select X86_X2APIC from X86_UV - it relies on it
- export genapic for oprofile modular build

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 13:13:25 +01:00
Yinghai Lu
c1eeb2de41 x86: fold apic_ops into genapic
Impact: cleanup

make it simpler, don't need have one extra struct.

v2: fix the sgi_uv build

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 12:22:20 +01:00
Yinghai Lu
06cd9a7dc8 x86: add x2apic config
Impact: cleanup

so could deselect x2apic
and INTR_REMAP will select x2apic

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 12:22:20 +01:00
Ingo Molnar
494df596f9 Merge branches 'x86/acpi', 'x86/apic', 'x86/cpudetect', 'x86/headers', 'x86/paravirt', 'x86/urgent' and 'x86/xen'; commit 'v2.6.29-rc5' into x86/core 2009-02-17 12:07:00 +01:00
Rusty Russell
a0abd520fd cpumask: fix powernow-k8: partial revert of 2fdf66b491
Impact: fix powernow-k8 when acpi=off (or other error).

There was a spurious change introduced into powernow-k8 in this patch:
so that we try to "restore" the cpus_allowed we never saved.  We revert
that file.

See lkml "[PATCH] x86/powernow: fix cpus_allowed brokage when
acpi=off" from Yinghai for the bug report.

Cc: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 17:31:59 +10:30
Ingo Molnar
72b623c736 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/power-tracer 2009-02-15 20:43:03 +01:00
Yinghai Lu
f6db44df5b x86: fix typo in filter_cpuid_features()
Impact: fix wrong disabling of cpu features

an amd system got this strange output:

 CPU: CPU feature monitor disabled due to lack of CPUID level 0x5

but in /proc/cpuinfo I have:

 cpuid level	: 5

on intel system:

 CPU: CPU feature monitor disabled due to lack of CPUID level 0x5
 CPU: CPU feature dca disabled due to lack of CPUID level 0x9

but in /proc/cpuinfo i have:

 cpuid level     : 11

Tt turns out there is a typo, and we should use level member in df.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 09:03:29 +01:00
Jason Baron
b5f9fd0f8a tracing: convert c/p state power tracer to use tracepoints
Convert the c/p state "power" tracer to use tracepoints. Avoids a
function call when the tracer is disabled.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-13 09:06:18 -05:00
Ingo Molnar
1c511f740f Merge branches 'tracing/ftrace', 'tracing/ring-buffer', 'tracing/sysprof', 'tracing/urgent' and 'linus' into tracing/core 2009-02-13 10:25:18 +01:00
Ingo Molnar
b1864e9a1a Merge branch 'x86/core' into perfcounters/core
Conflicts:
	arch/x86/Kconfig
	arch/x86/kernel/apic.c
	arch/x86/kernel/setup_percpu.c
2009-02-13 09:49:38 +01:00
Ingo Molnar
ab639f3593 Merge branch 'core/percpu' into x86/core 2009-02-13 09:45:09 +01:00
Ingo Molnar
f8a6b2b9ce Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/kernel/acpi/boot.c
	arch/x86/mm/fault.c
2009-02-13 09:44:22 +01:00
Ingo Molnar
e9c4ffb11f Merge branch 'linus' into perfcounters/core
Conflicts:
	arch/x86/kernel/acpi/boot.c
2009-02-13 09:34:07 +01:00
Linus Torvalds
9ce04f9238 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ptrace, x86: fix the usage of ptrace_fork()
  i8327: fix outb() parameter order
  x86: fix math_emu register frame access
  x86: math_emu info cleanup
  x86: include correct %gs in a.out core dump
  x86, vmi: put a missing paravirt_release_pmd in pgd_dtor
  x86: find nr_irqs_gsi with mp_ioapic_routing
  x86: add clflush before monitor for Intel 7400 series
  x86: disable intel_iommu support by default
  x86: don't apply __supported_pte_mask to non-present ptes
  x86: fix grammar in user-visible BIOS warning
  x86/Kconfig.cpu: make Kconfig help readable in the console
  x86, 64-bit: print DMI info in the oops trace
2009-02-11 08:23:22 -08:00
Ingo Molnar
ffc0467293 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters into perfcounters/core 2009-02-11 09:22:14 +01:00
Ingo Molnar
95fd4845ed Merge commit 'v2.6.29-rc4' into perfcounters/core
Conflicts:
	arch/x86/kernel/setup_percpu.c
	arch/x86/mm/fault.c
	drivers/acpi/processor_idle.c
	kernel/irq/handle.c
2009-02-11 09:22:04 +01:00
Paul Mackerras
0475f9ea8e perf_counters: allow users to count user, kernel and/or hypervisor events
Impact: new perf_counter feature

This extends the perf_counter_hw_event struct with bits that specify
that events in user, kernel and/or hypervisor mode should not be
counted (i.e. should be excluded), and adds code to program the PMU
mode selection bits accordingly on x86 and powerpc.

For software counters, we don't currently have the infrastructure to
distinguish which mode an event occurs in, so we currently fail the
counter initialization if the setting of the hw_event.exclude_* bits
would require us to distinguish.  Context switches and CPU migrations
are currently considered to occur in kernel mode.

On x86, this changes the previous policy that only root can count
kernel events.  Now non-root users can count kernel events or exclude
them.  Non-root users still can't use NMI events, though.  On x86 we
don't appear to have any way to control whether hypervisor events are
counted or not, so hw_event.exclude_hv is ignored.

On powerpc, the selection of whether to count events in user, kernel
and/or hypervisor mode is PMU-wide, not per-counter, so this adds a
check that the hw_event.exclude_* settings are the same as other events
on the PMU.  Counters being added to a group have to have the same
settings as the other hardware counters in the group.  Counters and
groups can only be enabled in hw_perf_group_sched_in or power_perf_enable
if they have the same settings as any other counters already on the
PMU.  If we are not running on a hypervisor, the exclude_hv setting
is ignored (by forcing it to 0) since we can't ever get any
hypervisor events.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-11 15:06:59 +11:00
Ingo Molnar
f9915bfef3 Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core 2009-02-10 13:25:42 +01:00
Tejun Heo
60a5317ff0 x86: implement x86_32 stack protector
Impact: stack protector for x86_32

Implement stack protector for x86_32.  GDT entry 28 is used for it.
It's set to point to stack_canary-20 and have the length of 24 bytes.
CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs
to the stack canary segment on entry.  As %gs is otherwise unused by
the kernel, the canary can be anywhere.  It's defined as a percpu
variable.

x86_32 exception handlers take register frame on stack directly as
struct pt_regs.  With -fstack-protector turned on, gcc copies the
whole structure after the stack canary and (of course) doesn't copy
back on return thus losing all changed.  For now, -fno-stack-protector
is added to all files which contain those functions.  We definitely
need something better.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:42:01 +01:00
Ingo Molnar
92e2d50846 Merge branch 'x86/urgent' into core/percpu
Conflicts:
	arch/x86/kernel/acpi/boot.c
2009-02-10 00:41:02 +01:00
Linus Torvalds
6707fbb56c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] powernow-k8: Get transition latency from ACPI _PSS table
  [CPUFREQ] Make ignore_nice_load setting of ondemand work as expected.
2009-02-09 13:58:22 -08:00
Ingo Molnar
249d51b53a Merge commit 'v2.6.29-rc4' into core/percpu
Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
	arch/x86/mm/fault.c
2009-02-09 14:58:11 +01:00
Mike Galbraith
d278c48435 perf_counters: account NMI interrupts
I noticed that kerneltop interrupts were accounted as NMI, but not their
perf counter origin.

Account NMI performance counter interrupts.

Signed-off-by: Mike Galbraith  <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

 arch/x86/kernel/cpu/perf_counter.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
2009-02-09 13:03:38 +01:00
Ingo Molnar
eca217b36e Merge branch 'x86/paravirt' into x86/apic
Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
2009-02-09 12:16:59 +01:00
Pallipadi, Venkatesh
e736ad548d x86: add clflush before monitor for Intel 7400 series
For Intel 7400 series CPUs, the recommendation is to use a clflush on the
monitored address just before monitor and mwait pair [1].

This clflush makes sure that there are no false wakeups from mwait when the
monitored address was recently written to.

[1] "MONITOR/MWAIT Recommendations for Intel Xeon Processor 7400 series"
    section in specification update document of 7400 series
    http://download.intel.com/design/xeon/specupdt/32033601.pdf

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 11:15:15 +01:00
Frederic Weisbecker
1292211058 tracing/power: move the power trace headers to a dedicated file
Impact: cleanup

Move the power tracer headers to trace/power.h to keep ftrace.h and power bits
more easy to maintain as separated topics.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:51:38 +01:00
Brian Gerst
2add8e235c x86: use linker to offset symbols by __per_cpu_load
Impact: cleanup and bug fix

Use the linker to create symbols for certain per-cpu variables
that are offset by __per_cpu_load.  This allows the removal of
the runtime fixup of the GDT pointer, which fixes a bug with
resume reported by Jiri Slaby.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:30:30 +01:00
Len Brown
2d29c6a075 Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', 'ec', 'misc', 'printk' and 'processor' into release 2009-02-07 01:34:56 -05:00
Ingo Molnar
9d45cf9e36 Merge branch 'x86/urgent' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic merge:
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:30:01 +01:00
Mark Langsdorf
732553e567 [CPUFREQ] powernow-k8: Get transition latency from ACPI _PSS table
At this time, the PowerNow! driver for K8 uses an experimentally
derived formula to calculate transition latency.  The value it
provides is orders of magnitude too large on modern systems.
This patch replaces the formula with ACPI _PSS latency values
for more accuracy and better performance.

I've tested it on two 2nd generation Opteron systems, a 3rd
generation Operton system, and a Turion X2 without seeing any
stability problems.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-05 12:25:26 -05:00
Mike Galbraith
5b75af0a02 perfcounters: fix "perf counters kill oprofile" bug
With oprofile as a module, and unloaded by profiling script,
both oprofile and kerneltop work fine.. unless you leave kerneltop
running when you start profiling, then you may see badness.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-04 17:36:18 +01:00
Thomas Renninger
62663ea822 ACPI: cpufreq: Remove deprecated /proc/acpi/processor/../performance proc entries
They were long enough set deprecated...

Update Documentation/cpu-freq/users-guide.txt:
The deprecated files listed there seen not to exist for some time anymore
already.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-04 00:12:24 -05:00
Dave Jones
9a8ecae87a x86: add cache descriptors for Intel Core i7
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-01 11:06:50 +01:00
Jeremy Fitzhardinge
11e3a840cd x86: split loading percpu segments from loading gdt
Impact: split out a function, no functional change

Xen needs to be able to access percpu data from very early on.  For
various reasons, it cannot also load the gdt at that time.   It does,
however, have a pefectly functional gdt at that point, so there's no
pressing need to reload the gdt.

Split the function to load the segment registers off, so Xen can call
it directly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-31 14:28:54 +09:00
Brian Gerst
552be871e6 x86: pass in cpu number to switch_to_new_gdt()
Impact: cleanup, prepare for xen boot fix.

Xen needs to call this function very early to setup the GDT and
per-cpu segments.  Remove the call to smp_processor_id() and just
pass in the cpu number.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-31 14:28:50 +09:00
Ingo Molnar
3e5095d152 x86: replace CONFIG_X86_SMP with CONFIG_SMP
The x86/Voyager subarch used to have this distinction between
 'x86 SMP support' and 'Voyager SMP support':

 config X86_SMP
	bool
	depends on SMP && ((X86_32 && !X86_VOYAGER) || X86_64)

This is a pointless distinction - Voyager can (and already does) use
smp_ops to implement various SMP quirks it has - and it can be extended
more to cover all the specialities of Voyager.

So remove this complication in the Kconfig space.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:17:00 +01:00
Ingo Molnar
1dcdd3d15e x86: remove mach_apic.h
Spread mach_apic.h definitions into genapic.h. (with some knock-on effects
on smp.h and apic.h.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:42 +01:00
Ingo Molnar
bf3647c44b x86: tone down mtrr_trim_uncached_memory() warning
kerneloops.org is reporting a lot of these warnings that come due to
vmware not setting up any MTRRs for emulated CPUs:

| Reported 709 times (14696 total reports)
| BIOS bug (often in VMWare) where the MTRR's are set up incorrectly
| or not at all
|
| This warning was last seen in version 2.6.29-rc2-git1, and first
| seen in 2.6.24.
|
| More info:
|   http://www.kerneloops.org/searchweek.php?search=mtrr_trim_uncached_memory

Keep a one-liner KERN_INFO about it - so that we have so notice if empty
MTRRs are caused by native hardware/BIOS weirdness.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 11:45:35 +01:00
Ingo Molnar
cb8cc442dc x86, apic: refactor ->phys_pkg_id()
Refactor the ->phys_pkg_id() methods:

 - namespace separation

 - macro wrapper removal

 - open-coded calls to the methods in the generic code

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:27 +01:00
Ingo Molnar
d4c9a9f3d4 x86, apic: unify phys_pkg_id()
- unify the call signature of 64-bit to that of 32-bit

 - clean up the types all around

 - clean up namespace contamination

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:26 +01:00
Ingo Molnar
74b6eb6b93 Merge branches 'x86/asm', 'x86/cleanups', 'x86/cpudetect', 'x86/debug', 'x86/doc', 'x86/header-fixes', 'x86/mm', 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core 2009-01-28 23:13:53 +01:00
Peter Zijlstra
8f6d86dc41 x86: cpu_init(): remove ugly #ifdef construct around debug register clear
Impact: Cleanup

While I was looking through the new and improved bootstrap code - great
work that, thanks! I found the below a slight improvement.

Remove unnecessary ugly #ifdef construct around debug register clear.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-27 14:54:44 -08:00
Hiroshi Shimamoto
bd0838fc48 x86: intel_cacheinfo: fix compiler warning
fix the following warning:

  CC      arch/x86/kernel/cpu/intel_cacheinfo.o
  arch/x86/kernel/cpu/intel_cacheinfo.c:314: warning: 'cpuid4_cache_lookup' defined but not used

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-27 13:57:41 +01:00
Ingo Molnar
4369f1fb7c Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu
Conflicts:
	arch/x86/kernel/setup_percpu.c

Semantic conflict:

	arch/x86/kernel/cpu/common.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-27 12:03:24 +01:00
Ingo Molnar
3ddeb51d9c Merge branch 'linus' into core/percpu
Conflicts:
	arch/x86/kernel/setup_percpu.c
2009-01-27 12:01:51 +01:00
Brian Gerst
2697fbd5fa x86: load new GDT after setting up boot cpu per-cpu area
Impact: sync 32 and 64-bit code

Merge load_gs_base() into switch_to_new_gdt().  Load the GDT and
per-cpu state for the boot cpu when its new area is set up.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
2f2f52bad7 x86: move setup_cpu_local_masks()
Impact: Code movement, no functional change.

Move setup_cpu_local_masks() to kernel/cpu/common.c, where the
masks are defined.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
H. Peter Anvin
30a0fb947a x86: correct the CPUID pattern for MSR_IA32_MISC_ENABLE availability
Impact: re-enable CPUID unmasking on affected processors

As far as I am capable of discerning from the documentation,
MSR_IA32_MISC_ENABLE should be available for all family 0xf CPUs, as
well as family 6 for model >= 0xd (newer Pentium M).

The documentation on this isn't ideal, so we need to be on the lookout
for errors, still.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-26 09:40:58 -08:00
Ingo Molnar
99fb4d349d x86: unmask CPUID levels on Intel CPUs, fix
Impact: fix boot hang on pre-model-15 Intel CPUs

rdmsrl_safe() does not work in very early bootup code yet, because we
dont have the pagefault handler installed yet so exception section
does not get parsed. rdmsr_safe() will just crash and hang the bootup.

So limit the MSR_IA32_MISC_ENABLE MSR read to those CPU types that
support it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 12:36:24 +01:00
H. Peter Anvin
b38b066590 x86: filter CPU features dependent on unavailable CPUID levels
Impact: Fixes potential crashes on misconfigured systems.

Some CPU features require specific CPUID levels to be available in
order to function, as they contain information about the operation of
a specific feature.  However, some BIOSes and virtualization software
provide the ability to mask CPUID levels in order to support legacy
operating systems.  We try to enable such CPUID levels when we know
how to do it, but for the remaining cases, filter out such CPU
features when there is no way for us to support them.

Do this in one place, in the CPUID code, with a table-driven approach.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 18:08:05 -08:00
H. Peter Anvin
75a048119e x86: handle PAT more like other CPU features
Impact: Cleanup

When PAT was originally introduced, it was handled specially for a few
reasons:

- PAT bugs are hard to track down, so we wanted to maintain a
  whitelist of CPUs.
- The i386 and x86-64 CPUID code was not yet unified.

Both of these are now obsolete, so handle PAT like any other features,
including ordinary feature blacklisting due to known bugs.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 18:07:45 -08:00
Mike Galbraith
3415dd9146 perfcounters fix section mismatch warning in perf_counter.c::perf_counters_lapic_init()
Fix:

WARNING: arch/x86/kernel/built-in.o(.text+0xdd0f): Section mismatch in reference from the function pmc_generic_enable() to the function .cpuinit.text:perf_counters_lapic_init()
The function pmc_generic_enable() references
the function __cpuinit perf_counters_lapic_init().
This is often because pmc_generic_enable lacks a __cpuinit
annotation or the annotation of perf_counters_lapic_init is wrong.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23 14:51:22 +01:00
Mike Galbraith
4b39fd9685 perfcounters: ratelimit performance counter interrupts
Ratelimit performance counter interrupts to 100KHz per CPU.

This replaces the irq-delta-time based method.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23 14:50:02 +01:00
Mike Galbraith
1b023a96d9 perfcounters: throttle on too high IRQ rates
Starting kerneltop with only -c 100 seems to be a bad idea, it can
easily lock the system due to perfcounter IRQ overload.

So add throttling: if a new IRQ arrives in a shorter than
PERFMON_MIN_PERIOD_NS time, turn off perfcounters and untrottle them
from the next timer tick.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23 11:33:18 +01:00
Ingo Molnar
bfe2a3c3b5 Merge branch 'core/percpu' into perfcounters/core
Conflicts:
	arch/x86/include/asm/hardirq_32.h
	arch/x86/include/asm/hardirq_64.h

Semantic merge:
	arch/x86/include/asm/hardirq.h
	[ added apic_perf_irqs field. ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23 10:20:15 +01:00
Brian Gerst
3819cd489e x86: remove include of apic.h from hardirq_64.h
Impact: cleanup

APIC definitions aren't needed here.  Remove the include and fix
up the fallout.

tj: added include to mce_intel_64.c.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-23 11:03:29 +09:00
H. Peter Anvin
066941bd4e x86: unmask CPUID levels on Intel CPUs
Impact: Fixes crashes with misconfigured BIOSes on XSAVE hardware

Avuton Olrich reported early boot crashes with v2.6.28 and
bisected it down to dc1e35c6e9
("x86, xsave: enable xsave/xrstor on cpus with xsave support").

If the CPUID limit bit in MSR_IA32_MISC_ENABLE is set, clear it to
make all CPUID information available.  This is required for some
features to work, in particular XSAVE.

Reported-and-bisected-by: Avuton Olrich <avuton@gmail.com>
Tested-by: Avuton Olrich <avuton@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-22 09:24:02 +01:00
Thomas Renninger
731f1872f4 x86: mtrr fix debug boot parameter
while looking at:

  http://bugzilla.kernel.org/show_bug.cgi?id=11541

I realized that the mtrr.show param cannot work, because
the code is processed much too early.

This patch:
 - Declares mtrr.show as early_param
 - Stays consistent with the previous param (which I doubt
   that it ever worked), so mtrr.show=1 would still work
 - Declares mtrr_show as initdata

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 12:26:42 +01:00
Ingo Molnar
3eb3963fd1 Merge branch 'cpus4096' into core/percpu
Conflicts:
	arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
	arch/x86/kernel/tlb_32.c

Merge it here because both the cpumask changes and the ongoing percpu
work is touching the TLB code. The percpu changes take precedence, as
they eliminate tlb_32.c altogether.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 10:14:17 +01:00
Tejun Heo
bdbcdd4888 x86: uv cleanup
Impact: cleanup

Make the following uv related cleanups.

* collect visible uv related definitions and interfaces into uv/uv.h
  and use it.  this cleans up the messy situation where on 64bit, uv
  is defined properly, on 32bit generic it's dummy and on the rest
  undefined.  after this clean up, uv is defined on 64 and dummy on
  32.

* update uv_flush_tlb_others() such that it takes cpumask of
  to-be-flushed cpus as argument, instead of that minus self, and
  returns yet-to-be-flushed cpumask, instead of modifying the passed
  in parameter.  this interface change will ease dummy implementation
  of uv_flush_tlb_others() and makes uv tlb flush related stuff
  defined in tlb_uv proper.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Brian Gerst
0dd76d736e x86: set %fs to __KERNEL_PERCPU unconditionally for x86_32
Impact: cleanup

%fs is currently set to __KERNEL_DS at boot, and conditionally
switched to __KERNEL_PERCPU for secondary cpus.  Instead, initialize
GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and
set %fs to __KERNEL_PERCPU unconditionally.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:05 +09:00
Brian Gerst
06deef892c x86: clean up gdt_page definition
Impact: cleanup && more compact percpu area layout with future changes

Move 64-bit GDT to page-aligned section and clean up comment
formatting.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:05 +09:00
Brian Gerst
0d974d4592 x86: remove pda.h
Impact: cleanup

Signed-off-by: Brian Gerst <brgerst@gmail.com>
2009-01-20 12:29:20 +09:00
Brian Gerst
947e76cdc3 x86: move stack_canary into irq_stack
Impact: x86_64 percpu area layout change, irq_stack now at the beginning

Now that the PDA is empty except for the stack canary, it can be removed.
The irqstack is moved to the start of the per-cpu section.  If the stack
protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.

tj: * updated subject
    * dropped asm relocation of irq_stack_ptr
    * updated comments a bit
    * rebased on top of stack canary changes

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:20 +09:00
Brian Gerst
8ce031972b x86: remove pda_init()
Impact: cleanup

Copy the code to cpu_init() to satisfy the requirement that the cpu
be reinitialized.  Remove all other calls, since the segments are
already initialized in head_64.S.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:19 +09:00
Ingo Molnar
bfa318ad52 fix: crash: IP: __bitmap_intersects+0x48/0x73
-tip testing found this crash:

> [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
> [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
> [   35.267554] PGD 0
> [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
allocation of the variable mask, so we pass in an uninitialized cmd.mask
field to drv_read(), which then passes it to the scheduler which then
crashes ...

Switch it over to the much simpler constant-cpumask-pointers approach.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 00:17:01 +01:00
Mike Travis
7285908185 cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new work_on_cpu function to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19 22:36:13 +01:00
Ingo Molnar
af37501c79 Merge branch 'core/percpu' into perfcounters/core
Conflicts:
	arch/x86/include/asm/pda.h

We merge tip/core/percpu into tip/perfcounters/core because of a
semantic and contextual conflict: the former eliminates the PDA,
while the latter extends it with apic_perf_irqs field.

Resolve the conflict by moving the new field to the irq_cpustat
structure on 64-bit too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 18:15:49 +01:00
Brian Gerst
e7a22c1ebc x86-64: Move nodenumber from PDA to per-cpu.
tj: * s/nodenumber/node_number/
    * removed now unused pda variable from pda_init()

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:59 +09:00
Brian Gerst
5689553076 x86-64: Move irqcount from PDA to per-cpu.
tj: s/irqcount/irq_count/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
9af45651f1 x86-64: Move kernelstack from PDA to per-cpu.
Also clean up PER_CPU_VAR usage in xen-asm_64.S

tj: * remove now unused stack_thread_info()
    * s/kernelstack/kernel_stack/
    * added FIXME comment in xen-asm_64.S

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
c6f5e0acd5 x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
ea9279066d x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit.
tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
92d65b2371 x86-64: Convert exception stacks to per-cpu
Move the exception stacks to per-cpu, removing specific allocation code.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
26f80bd6a9 x86-64: Convert irqstacks to per-cpu
Move the irqstackptr variable from the PDA to per-cpu.  Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a seperate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.

tj: * sprinkle some underbars around.

    * irq_stack_ptr is not used till traps_init(), no reason to
      initialize it early.  On SMP, just leaving it NULL till proper
      initialization in setup_per_cpu_areas() works.  Dropped
      is_boot_cpu and early irq_stack_ptr initialization.

    * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
      instead of (char, irq_stack[IRQ_STACK_SIZE]).

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
9eb912d1aa x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:57 +09:00
Mike Travis
6eb714c63e cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new work_on_cpu function to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>
2009-01-16 15:31:15 -08:00
Tejun Heo
b12d8db8fb x86: make pda a percpu variable
[ Based on original patch from Christoph Lameter and Mike Travis. ]

As pda is now allocated in percpu area, it can easily be made a proper
percpu variable.  Make it so by defining per cpu symbol from linker
script and declaring it in C code for SMP and simply defining it for
UP.  This change cleans up code and brings SMP and UP closer a bit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:20:03 +01:00
Tejun Heo
1a51e3a0ae x86: fold pda into percpu area on SMP
[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately.  %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40.  To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.

After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda.  This means that setup_per_cpu_areas() need
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().

This patch also removes now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:46 +01:00
Tejun Heo
c8f3329a0d x86: use static _cpu_pda array
_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:40 +01:00
Ingo Molnar
5cd7376200 fix: crash: IP: __bitmap_intersects+0x48/0x73
-tip testing found this crash:

> [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
> [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
> [   35.267554] PGD 0
> [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
allocation of the variable mask, so we pass in an uninitialized cmd.mask
field to drv_read(), which then passes it to the scheduler which then
crashes ...

Switch it over to the much simpler constant-cpumask-pointers approach.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 15:46:12 +01:00
Ingo Molnar
49a93bc978 Merge branch 'linus' into cpus4096 2009-01-15 15:45:31 +01:00
Ingo Molnar
7f268f4352 Merge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu 2009-01-15 13:18:57 +01:00
Ingo Molnar
4a922a969c x86, cpufreq: remove leftover copymask_copy()
Impact: fix potential boot crash on MAXSMP

Remove code left over by:

  50c668d: Revert "cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read

That cmd.cpumask is not allocated anymore. No impact on default !MAXSMP
kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 16:11:00 +01:00
Ingo Molnar
50c668d678 Revert "cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write"
This reverts commit 7503bfbae8.

Dieter Ries reported bootup soft-hangs and bisected it back to
this commit, and reverting this commit gave him a working system.

The commit introduces work_on_cpu() use into the cpufreq code,
but that is subtly problematic from a lock hierarchy POV: the
hotplug-cpu lock is an highlevel lock that is taken before
lowlevel locks, and in this codepath we are called with the
policy lock taken.

Dieter did not have lockdep enabled so we dont have a nice stack
trace proof for this, but using work_on_cpu() in such a lowlevel
place certainly looks wrong, so we revert the patch.

work_on_cpu() needs to be reworked to be more generally usable.

Reported-by: Dieter Ries <clip2@gmx.de>
Tested-by: Dieter Ries <clip2@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:23 +01:00
Mike Travis
f9b90566cd x86: reduce stack usage in init_intel_cacheinfo
Impact: reduce stack usage.

init_intel_cacheinfo() does not use the cpumask so define a subset
of struct _cpuid4_info (_cpuid4_info_regs) that can be used instead.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:16 +01:00
Mike Travis
a1c33bbeb7 x86: cleanup remaining cpumask_t code in mce_amd_64.c
Impact: Reduce memory usage, use new cpumask API.

Use cpumask_var_t for 'cpus' cpumask in struct threshold_bank and update
remaining old cpumask_t functions to new cpumask API.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:12 +01:00
Ingo Molnar
506c10f26c Merge commit 'v2.6.29-rc1' into perfcounters/core
Conflicts:
	include/linux/kernel_stat.h
2009-01-11 02:42:53 +01:00
Jaswinder Singh Rajput
068790334c x86: smp.h move cpu_callin_mask and cpu_callin_map declartion to cpumask.h
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-10 23:57:09 +01:00
Ingo Molnar
1de8cd3cb9 Merge branch 'linus' into x86/cleanups 2009-01-10 23:56:42 +01:00
Linus Torvalds
3d14bdad40 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits)
  x86: fix section mismatch warnings in mcheck/mce_amd_64.c
  x86: offer frame pointers in all build modes
  x86: remove duplicated #include's
  x86: k8 numa register active regions later
  x86: update Alan Cox's email addresses
  x86: rename all fields of mpc_table mpc_X to X
  x86: rename all fields of mpc_oemtable oem_X to X
  x86: rename all fields of mpc_bus mpc_X to X
  x86: rename all fields of mpc_cpu mpc_X to X
  x86: rename all fields of mpc_intsrc mpc_X to X
  x86: rename all fields of mpc_lintsrc mpc_X to X
  x86: rename all fields of mpc_iopic mpc_X to X
  x86: irqinit_64.c init_ISA_irqs should be static
  Documentation/x86/boot.txt: payload length was changed to payload_length
  x86: setup_percpu.c fix style problems
  x86: irqinit_64.c fix style problems
  x86: irqinit_32.c fix style problems
  x86: i8259.c fix style problems
  x86: irq_32.c fix style problems
  x86: ioport.c fix style problems
  ...
2009-01-10 06:13:09 -08:00
Linus Torvalds
4e9b1c184c Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  [IA64] fix typo in cpumask_of_pcibus()
  x86: fix x86_32 builds for summit and es7000 arch's
  cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs
  cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
  cpumask: use cpumask_var_t in acpi-cpufreq.c
  cpumask: use work_on_cpu in acpi/cstate.c
  cpumask: convert struct cpufreq_policy to cpumask_var_t
  cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t
  x86: cleanup remaining cpumask_t ops in smpboot code
  cpumask: update pci_bus_show_cpuaffinity to use new cpumask API
  cpumask: update local_cpus_show to use new cpumask API
  ia64: cpumask fix for is_affinity_mask_valid()
2009-01-10 06:12:18 -08:00
Fernando Carrijo
c19a28e119 remove lots of double-semicolons
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:14 -08:00
Jaswinder Singh Rajput
f472cdba84 x86: smp.h move stack_processor_id declartion to cpu.h
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-07 21:48:25 +01:00
Leonardo Potenza
51d7a1398d x86: fix section mismatch warnings in mcheck/mce_amd_64.c
Mark the function local_allocate_threshold_blocks() with __cpuinit,
in order to remove the following section mismatch messages:

WARNING: arch/x86/kernel/cpu/mcheck/built-in.o(.text+0x1363): Section mismatch in reference from the function local_allocate_threshold_blocks() to the function .cpuinit.text:allocate_threshold_blocks()
The function local_allocate_threshold_blocks() references
the function __cpuinit allocate_threshold_blocks().
This is often because local_allocate_threshold_blocks lacks a __cpuinit
annotation or the annotation of allocate_threshold_blocks is wrong.

WARNING: arch/x86/kernel/cpu/built-in.o(.text+0x1def): Section mismatch in reference from the function local_allocate_threshold_blocks() to the function .cpuinit.text:allocate_threshold_blocks()
The function local_allocate_threshold_blocks() references
the function __cpuinit allocate_threshold_blocks().
This is often because local_allocate_threshold_blocks lacks a __cpuinit
annotation or the annotation of allocate_threshold_blocks is wrong.

WARNING: arch/x86/kernel/built-in.o(.text+0xef2b): Section mismatch in reference from the function local_allocate_threshold_blocks() to the function .cpuinit.text:allocate_threshold_blocks()
The function local_allocate_threshold_blocks() references
the function __cpuinit allocate_threshold_blocks().
This is often because local_allocate_threshold_blocks lacks a __cpuinit
annotation or the annotation of allocate_threshold_blocks is wrong.

All the callsites of this function are __cpuinit already, and all the
functions it calls are __cpuinit as well.

Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-07 12:23:35 +01:00
Ingo Molnar
0936912274 Merge branches 'x86/cleanups', 'x86/mpparse', 'x86/numa' and 'x86/uv' into x86/urgent 2009-01-06 17:39:52 +01:00
Mike Travis
e39ad415ac cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs
Impact: use new cpumask API to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for read_measured_perf_ctrs().

Basically splits off the work function from get_measured_perf which is
run on the designated cpu.  Moves definition of struct perf_cur out of
function local namespace, and is used as the work function argument.
References in get_measured_perf use values in the perf_cur struct.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-06 09:05:43 +01:00
Mike Travis
7503bfbae8 cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new cpumask API to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-06 09:05:40 +01:00
Mike Travis
4d8bb53749 cpumask: use cpumask_var_t in acpi-cpufreq.c
Impact: cleanup, reduce stack usage, use new cpumask API.

Replace the cpumask_t in struct drv_cmd with a cpumask_var_t.  Remove unneeded
online_policy_cpus cpumask_t in acpi_cpufreq_target.  Update refs to use
new cpumask API.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-06 09:05:37 +01:00
Rusty Russell
835481d9bc cpumask: convert struct cpufreq_policy to cpumask_var_t
Impact: use new cpumask API to reduce memory usage

This is part of an effort to reduce structure sizes for machines
configured with large NR_CPUS.  cpumask_t gets replaced by
cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or
struct cpumask * (large NR_CPUS).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-06 09:05:31 +01:00
Rusty Russell
5cb0535f17 cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t
Impact: cleanup

There's only one user, and it's a fairly easy conversion.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-06 09:05:28 +01:00
Ingo Molnar
d12418fdea Merge branch 'linus' into cpus4096 2009-01-06 09:04:32 +01:00
Linus Torvalds
e9af797d75 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Fix on resume, now preserves user policy min/max.
  [CPUFREQ] Add Celeron Core support to p4-clockmod.
  [CPUFREQ] add to speedstep-lib additional fsb values for core processors
  [CPUFREQ] Disable sysfs ui for p4-clockmod.
  [CPUFREQ] p4-clockmod: reduce noise
  [CPUFREQ] clean up speedstep-centrino and reduce cpumask_t usage
2009-01-05 18:33:38 -08:00
Alan Cox
87c6fe2618 x86: update Alan Cox's email addresses
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-05 15:19:16 +01:00
Mike Travis
c2d1cec1c7 x86: cleanup remaining cpumask_t ops in smpboot code
Impact: use new cpumask API to reduce memory and stack usage

Allocate the following local cpumasks based on the number of cpus that
are present.  References will use new cpumask API.  (Currently only
modified for x86_64, x86_32 continues to use the *_map variants.)

    cpu_callin_mask
    cpu_callout_mask
    cpu_initialized_mask
    cpu_sibling_setup_mask

Provide the following accessor functions:

    struct cpumask *cpu_sibling_mask(int cpu)
    struct cpumask *cpu_core_mask(int cpu)

Other changes are when setting or clearing the cpu online, possible
or present maps, use the accessor functions.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-04 15:39:26 +01:00
Linus Torvalds
7d3b56ba37 Merge branch 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits)
  x86: setup_per_cpu_areas() cleanup
  cpumask: fix compile error when CONFIG_NR_CPUS is not defined
  cpumask: use alloc_cpumask_var_node where appropriate
  cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t
  x86: use cpumask_var_t in acpi/boot.c
  x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids
  sched: put back some stack hog changes that were undone in kernel/sched.c
  x86: enable cpus display of kernel_max and offlined cpus
  ia64: cpumask fix for is_affinity_mask_valid()
  cpumask: convert RCU implementations, fix
  xtensa: define __fls
  mn10300: define __fls
  m32r: define __fls
  h8300: define __fls
  frv: define __fls
  cris: define __fls
  cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS
  cpumask: zero extra bits in alloc_cpumask_var_node
  cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/
  cpumask: convert mm/
  ...
2009-01-03 12:04:39 -08:00
Mike Travis
80855f7361 cpumask: use alloc_cpumask_var_node where appropriate
Impact: Reduce inter-node memory traffic.

Reduces inter-node memory traffic (offloading the global system bus)
by allocating referenced struct cpumasks on the same node as the
referring struct.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-03 19:15:40 +01:00
Rusty Russell
2fdf66b491 cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t
Impact: Reduce memory usage, use new API.

This is part of an effort to reduce structure sizes for machines
configured with large NR_CPUS.  cpumask_t gets replaced by
cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or
struct cpumask * (large NR_CPUS).

(Changes to powernow-k* by <travis>.)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-03 19:15:40 +01:00
Mike Travis
9628937d5b x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids
Impact: Reduce future system panics due to cpumask operations using NR_CPUS

Insure that code does not look at bits >= nr_cpu_ids as when cpumasks are
allocated based on nr_cpu_ids, these extra bits will not be defined.

Also some other minor updates:

   * change in to use cpu accessor function set_cpu_present() instead of
     directly accessing cpu_present_map w/cpu_clear() [arch/x86/kernel/reboot.c]

   * use cpumask_of() instead of &cpumask_of_cpu() [arch/x86/kernel/reboot.c]

   * optimize some cpu_mask_to_apicid_and functions.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-03 19:00:55 +01:00
Mike Travis
7eb1955336 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask into merge-rr-cpumask
Conflicts:
	arch/x86/kernel/io_apic.c
	kernel/rcuclassic.c
	kernel/sched.c
	kernel/time/tick-sched.c

Signed-off-by: Mike Travis <travis@sgi.com>
[ mingo@elte.hu: backmerged typo fix for io_apic.c ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-03 18:53:31 +01:00
Ingo Molnar
923a789b49 Merge branch 'linus' into x86/cleanups
Conflicts:
	arch/x86/kernel/reboot.c
2009-01-02 22:41:36 +01:00
Linus Torvalds
b840d79631 Merge branch 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
  x86: export vector_used_by_percpu_irq
  x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
  sched: nominate preferred wakeup cpu, fix
  x86: fix lguest used_vectors breakage, -v2
  x86: fix warning in arch/x86/kernel/io_apic.c
  sched: fix warning in kernel/sched.c
  sched: move test_sd_parent() to an SMP section of sched.h
  sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
  sched: activate active load balancing in new idle cpus
  sched: bias task wakeups to preferred semi-idle packages
  sched: nominate preferred wakeup cpu
  sched: favour lower logical cpu number for sched_mc balance
  sched: framework for sched_mc/smt_power_savings=N
  sched: convert BALANCE_FOR_xx_POWER to inline functions
  x86: use possible_cpus=NUM to extend the possible cpus allowed
  x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
  x86: update io_apic.c to the new cpumask code
  x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
  x86: xen: use smp_call_function_many()
  x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
  ...

Fixed up trivial conflict in kernel/time/tick-sched.c manually
2009-01-02 11:44:09 -08:00
Sheng Yang
932d27a791 x86: Export some definition of MTRR
For KVM can reuse the type define, and need them to support shadow MTRR.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:44 +02:00
Sheng Yang
b558bc0a25 x86: Rename mtrr_state struct and macro names
Prepare for exporting them.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:44 +02:00
Rusty Russell
33edcf133b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-12-30 08:02:35 +10:30
Sergio Luis
6092848a2a x86: mark get_cpu_leaves() with __cpuinit annotation
Impact: fix section mismatch warning

Commit b2bb855491 ("x86: Remove cpumask games
in x86/kernel/cpu/intel_cacheinfo.c") introduced get_cpu_leaves(), which
references __cpuinit cpuid4_cache_lookup().

Mark get_cpu_leaves() with a __cpuinit annotation.

Signed-off-by: Sergio Luis <sergio@larces.uece.br>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 13:11:18 +01:00
Ingo Molnar
75329f1f0c Merge branch 'linus' into x86/cleanups 2008-12-29 13:09:40 +01:00
Ingo Molnar
e1df957670 Merge branch 'linus' into perfcounters/core
Conflicts:
	fs/exec.c
	include/linux/init_task.h

Simple context conflicts.
2008-12-29 09:45:15 +01:00
Linus Torvalds
b0f4b285d7 Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
  sched, trace: update trace_sched_wakeup()
  tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
  Revert "x86: disable X86_PTRACE_BTS"
  ring-buffer: prevent false positive warning
  ring-buffer: fix dangling commit race
  ftrace: enable format arguments checking
  x86, bts: memory accounting
  x86, bts: add fork and exit handling
  ftrace: introduce tracing_reset_online_cpus() helper
  tracing: fix warnings in kernel/trace/trace_sched_switch.c
  tracing: fix warning in kernel/trace/trace.c
  tracing/ring-buffer: remove unused ring_buffer size
  trace: fix task state printout
  ftrace: add not to regex on filtering functions
  trace: better use of stack_trace_enabled for boot up code
  trace: add a way to enable or disable the stack tracer
  x86: entry_64 - introduce FTRACE_ frame macro v2
  tracing/ftrace: add the printk-msg-only option
  tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
  x86, bts: correctly report invalid bts records
  ...

Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
being already partly merged by the SH merge.
2008-12-28 12:21:10 -08:00
Jaswinder Singh Rajput
2b583d8bc8 x86: perf_counter remove unwanted hw_perf_enable_all
Impact: clean, reduce kernel size a bit, avoid sparse warnings

Fixes sparse warnings:

 arch/x86/kernel/cpu/perf_counter.c:153:6: warning: symbol 'hw_perf_enable_all' was not declared. Should it be static?
 arch/x86/kernel/cpu/perf_counter.c:279:3: warning: returning void-valued expression
 arch/x86/kernel/cpu/perf_counter.c:206:3: warning: returning void-valued expression
 arch/x86/kernel/cpu/perf_counter.c:206:3: warning: returning void-valued expression

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-27 16:00:51 +01:00
Ingo Molnar
34bf5d0ff5 Merge branch 'x86/core' into x86/cleanups 2008-12-27 11:30:05 +01:00
Rusty Russell
71ab6b245f x86: remove impossible test in mtrr/main.c
Impact: cleanup

enable_mtrr_cleanup is static, and is never set to anything but 0 or 1.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-25 12:46:06 -08:00
Ingo Molnar
5250d329e3 Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'tracing/ring-buffer'; commit 'v2.6.28' into tracing/core 2008-12-25 13:11:00 +01:00
Ingo Molnar
a3eeeefbf1 Merge branch 'x86/tsc' into tracing/core
Merge it to resolve this incidental conflict between the BTS fixes/cleanups
and changes in x86/tsc:

Conflicts:
	arch/x86/kernel/cpu/intel.c
2008-12-25 12:48:18 +01:00
Frederic Weisbecker
0ca59dd948 tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
Impact: fix a crash/hard-reboot on certain configs while enabling cpu runtime

On some archs, the boot of a secondary cpu can have an early fragile state.
On x86-64, the pda is not initialized on the first stage of a cpu boot but
it is needed to get the cpu number and the current task pointer. This data
is needed during tracing. As they were dereferenced at this stage, we got a
crash while tracing a cpu being enabled at runtime.

Some other archs like ia64 can have such kind of issue too.

Changes on v2:

We dropped the previous solution of a per-arch called function to guess the
current state of a cpu. That could slow down the tracing.

This patch removes the -pg flag on arch/x86/kernel/cpu/common.c where
the low level cpu boot functions exist, on start_secondary() and a helper
function used at this stage.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-25 09:39:22 +01:00
Ingo Molnar
db8862eafe Merge branch 'linus' into tracing/hw-branch-tracing 2008-12-24 21:08:26 +01:00
Ingo Molnar
bed4f13065 Merge branch 'x86/irq' into x86/core 2008-12-23 16:30:31 +01:00
Ingo Molnar
be9a1d3c2e Merge branch 'x86/tsc' into x86/core 2008-12-23 16:30:20 +01:00
Ingo Molnar
7e3cbc3f77 Merge branch 'x86/ptrace' into x86/tsc
Conflicts:
	arch/x86/kernel/cpu/intel.c
2008-12-23 16:29:31 +01:00
Ingo Molnar
fa623d1b02 Merge branches 'x86/apic', 'x86/cleanups', 'x86/cpufeature', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core 2008-12-23 16:27:23 +01:00
Ingo Molnar
2f18d1e8d0 x86, perfcounters: add support for fixed-function pmcs
Impact: extend performance counter support on x86 Intel CPUs

Modern Intel CPUs have 3 "fixed-function" performance counters, which
count these hardware events:

    Instr_Retired.Any
    CPU_CLK_Unhalted.Core
    CPU_CLK_Unhalted.Ref

Add support for them to the performance counters subsystem.

Their use is transparent to user-space: the counter scheduler is
extended to automatically recognize the cases where a fixed-function
PMC can be utilized instead of a generic PMC. In such cases the
generic PMC is kept available for more counters.

The above fixed-function events map to these generic counter hw events:

        PERF_COUNT_INSTRUCTIONS
        PERF_COUNT_CPU_CYCLES
        PERF_COUNT_BUS_CYCLES

(The 'bus' cycles are in reality often CPU-ish cycles, just with a fixed
 frequency.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:25 +01:00
Ingo Molnar
f650a67235 perfcounters: add PERF_COUNT_BUS_CYCLES
Generalize "bus cycles" hw events - and map them to CPU_CLK_Unhalted.Ref
on x86. (which is a good enough approximation)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:24 +01:00
Ingo Molnar
0dff86aa7b x86, perfcounters: print out the ->used bitmask
Impact: extend debug printouts

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:20 +01:00
Ingo Molnar
95cdd2e785 perfcounters: enable lowlevel pmc code to schedule counters
Allow lowlevel ->enable() op to return an error if a counter can not be
added. This can be used to handle counter constraints.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:19 +01:00
Ingo Molnar
7671581f16 perfcounters: hw ops rename
Impact: rename field names

Shorten them.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:13 +01:00
Ingo Molnar
862a1a5f34 x86, perfcounters: refactor code for fixed-function PMCs
Impact: clean up

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:12 +01:00
Ingo Molnar
703e937c83 perfcounters: add fixed-mode PMC enumeration
Enumerate fixed-mode PMCs based on CPUID, and feed that into the
perfcounter code.

Does not use fixed-mode PMCs yet.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:11 +01:00
Ingo Molnar
eb2b861810 x86, perfcounters: prepare for fixed-mode PMCs
Impact: refactor the x86 code for fixed-mode PMCs

Extend the data structures and rename the existing facilities
to allow for a 'generic' versus 'fixed' counter distinction.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:10 +01:00
Ingo Molnar
5c167b8585 x86, perfcounters: rename intel_arch_perfmon.h => perf_counter.h
Impact: rename include file

We'll be providing an asm/perf_counter.h to the generic perfcounter code,
so use the already existing x86 file for this purpose and rename it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:09 +01:00
Ingo Molnar
8fb9331391 perfcounters: remove warnings
Impact: remove debug checks

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 12:45:08 +01:00
Jaswinder Singh
34945ede31 x86: common.c boot_cpu_stack and boot_exception_stacks should be static
Impact: cleanup, avoid sparse warnings, reduce kernel size a bit

Fixes these sparse warnings:

 arch/x86/kernel/cpu/common.c:869:6: warning: symbol 'boot_cpu_stack' was not declared. Should it be static?
 arch/x86/kernel/cpu/common.c:910:6: warning: symbol 'boot_exception_stacks' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-19 23:16:08 +01:00
Jaswinder Singh
94c46572a6 x86: perf_counter.c intel_perfmon_event_map and max_intel_perfmon_events should be static
Impact: cleanup, avoid sparse warnings, reduce kernel size a bit

Fixes these sparse warnings:
 arch/x86/kernel/cpu/perf_counter.c:44:11: warning: symbol 'intel_perfmon_event_map' was not declared. Should it be static?
 arch/x86/kernel/cpu/perf_counter.c:54:11: warning: symbol 'max_intel_perfmon_events' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-19 23:15:10 +01:00
Suresh Siddha
345077cd98 x86: fix intel x86_64 llc_shared_map/cpu_llc_id anomolies
Impact: fix wrong cache sharing detection on platforms supporting > 8 bit apicid's

In the presence of extended topology eumeration leaf 0xb provided
by cpuid, 32bit extended initial_apicid in cpuinfo_x86 struct will be
updated by detect_extended_topology(). At this instance, we should also
reinit the apicid (which could also potentially be extended to 32bit).

With out this there will potentially be duplicate apicid's populated in the
per cpu's cpuinfo_x86 struct, resulting in wrong cache sharing topology etc
detected by init_intel_cacheinfo().

Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
2008-12-19 09:13:50 +01:00
Mike Travis
4cd4601d59 x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
Impact: Remove cpumask_t's from stack.

Simple transition to work_on_cpu(), rather than cpumask games.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Robert Richter <robert.richter@amd.com>
Cc: jacob.shin@amd.com
2008-12-16 17:40:59 -08:00
Mike Travis
b2bb855491 x86: Remove cpumask games in x86/kernel/cpu/intel_cacheinfo.c
Impact: remove cpumask_t from stack.

We should not try to save and restore cpus_allowed on current.

We can't use work_on_cpu() here, since it's in the hotplug cpu path
(if anyone else tries to get the hotplug lock from a workqueue we
could deadlock against them).

Fortunately, we can just use smp_call_function_single() since the
function can run from an interrupt.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
2008-12-16 17:40:58 -08:00
Andi Kleen
cf9b303e55 x86: re-enable MCE on secondary CPUS after suspend/resume
Impact: fix disabled MCE after resume

Don't prevent multiple initialization of MCEs.

Back from early prehistory mcheck_init() has a reentry check. Presumably
that was needed in very old kernels to prevent it entering twice.

But as Andreas points out this prevents CPU hotplug (and therefore resume)
to correctly reinitialize MCEs when a AP boots again after being
offlined.

Just drop the check.

Reported-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-16 23:03:02 +01:00
Hiroshi Shimamoto
8ae9366909 x86: hardirq: use inc_irq_stat() in non-unified functions
Impact: cleanup

Replace incrementing irq stat with inc_irq_stat() in non-unified functions.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-16 22:30:19 +01:00
Venki Pallipadi
40fb17152c x86: support always running TSC on Intel CPUs
Impact: reward non-stop TSCs with good TSC-based clocksources, etc.

Add support for CPUID_0x80000007_Bit8 on Intel CPUs as well. This bit means
that the TSC is invariant with C/P/T states and always runs at constant
frequency.

With Intel CPUs, we have 3 classes
* CPUs where TSC runs at constant rate and does not stop n C-states
* CPUs where TSC runs at constant rate, but will stop in deep C-states
* CPUs where TSC rate will vary based on P/T-states and TSC will stop in deep
  C-states.

To cover these 3, one feature bit (CONSTANT_TSC) is not enough. So, add a
second bit (NONSTOP_TSC). CONSTANT_TSC indicates that the TSC runs at
constant frequency irrespective of P/T-states, and NONSTOP_TSC indicates
that TSC does not stop in deep C-states.

CPUID_0x8000000_Bit8 indicates both these feature bit can be set.
We still have CONSTANT_TSC _set_ and NONSTOP_TSC _not_set_ on some older Intel
CPUs, based on model checks. We can use TSC on such CPUs for time, as long as
those CPUs do not support/enter deep C-states.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-16 21:02:50 +01:00
Ingo Molnar
75f224cf77 perfcounters: fix lapic initialization
Fix non-working NMI sampling in certain bootup scenarios.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-14 22:00:31 +01:00
Ingo Molnar
2b9ff0db19 perfcounters: fix non-intel-perfmon CPUs
Do not write MSR_CORE_PERF_GLOBAL_CTRL on CPUs where it does not exist.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-14 20:31:28 +01:00
Ingo Molnar
ee06094f82 perfcounters: restructure x86 counter math
Impact: restructure code

Change counter math from absolute values to clear delta logic.

We try to extract elapsed deltas from the raw hw counter - and put
that into the generic counter.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-14 20:30:48 +01:00
Rusty Russell
968ea6d80e Merge ../linux-2.6-x86
Conflicts:

	arch/x86/kernel/io_apic.c
	kernel/sched.c
	kernel/sched_stats.h
2008-12-13 21:55:51 +10:30
Rusty Russell
29c0177e6a cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers.
Impact: change calling convention of existing cpumask APIs

Most cpumask functions started with cpus_: these have been replaced by
cpumask_ ones which take struct cpumask pointers as expected.

These four functions don't have good replacement names; fortunately
they're rarely used, so we just change them over.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
2008-12-13 21:20:25 +10:30
Ingo Molnar
92bf73e90a Merge branch 'x86/irq' into perfcounters/core
( with manual semantic merge of arch/x86/kernel/cpu/perf_counter.c )
2008-12-12 12:00:14 +01:00
Markus Metzger
c2724775ce x86, bts: provide in-kernel branch-trace interface
Impact: cleanup

Move the BTS bits from ptrace.c into ds.c.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-12 08:08:12 +01:00
Ingo Molnar
6a930700c8 perf counters: clean up state transitions
Impact: cleanup

Introduce a proper enum for the 3 states of a counter:

	PERF_COUNTER_STATE_OFF		= -1
	PERF_COUNTER_STATE_INACTIVE	=  0
	PERF_COUNTER_STATE_ACTIVE	=  1

and rename counter->active to counter->state and propagate the
changes everywhere.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:56 +01:00
Ingo Molnar
01b2838c42 perf counters: consolidate hw_perf save/restore APIs
Impact: cleanup

Rename them to better match up the usual IRQ disable/enable APIs:

 hw_perf_disable_all()  => hw_perf_save_disable()
 hw_perf_restore_ctrl() => hw_perf_restore()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:53 +01:00
Ingo Molnar
5c92d12411 perf counters: implement PERF_COUNT_CPU_CLOCK
Impact: add new perf-counter type

The 'CPU clock' counter counts the amount of CPU clock time that is
elapsing, in nanoseconds. (regardless of how much of it the task is
spending on a CPU executing)

This counter type is a Linux kernel based abstraction, it is available
even if the hardware does not support native hardware performance counters.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:52 +01:00
Ingo Molnar
621a01eac8 perf counters: hw driver API
Impact: restructure code, introduce hw_ops driver abstraction

Introduce this abstraction to handle counter details:

 struct hw_perf_counter_ops {
	void (*hw_perf_counter_enable)	(struct perf_counter *counter);
	void (*hw_perf_counter_disable)	(struct perf_counter *counter);
	void (*hw_perf_counter_read)	(struct perf_counter *counter);
 };

This will be useful to support assymetric hw details, and it will also
be useful to implement "software counters". (Counters that count kernel
managed sw events such as pagefaults, context-switches, wall-clock time
or task-local time.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:51 +01:00
Ingo Molnar
04289bb989 perf counters: add support for group counters
Impact: add group counters

This patch adds the "counter groups" abstraction.

Groups of counters behave much like normal 'single' counters, with a
few semantic and behavioral extensions on top of that.

A counter group is created by creating a new counter with the open()
syscall's group-leader group_fd file descriptor parameter pointing
to another, already existing counter.

Groups of counters are scheduled in and out in one atomic group, and
they are also roundrobin-scheduled atomically.

Counters that are member of a group can also record events with an
(atomic) extended timestamp that extends to all members of the group,
if the record type is set to PERF_RECORD_GROUP.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:49 +01:00
Ingo Molnar
9f66a3810f perf counters: restructure the API
Impact: clean up new API

Thorough cleanup of the new perf counters API, we now get clean separation
of the various concepts:

 - introduce perf_counter_hw_event to separate out the event source details

 - move special type flags into separate attributes: PERF_COUNT_NMI,
   PERF_COUNT_RAW

 - extend the type to u64 and reserve it fully to the architecture in the
   raw type case.

And make use of all these changes in the core and x86 perfcounters code.

Also change the syscall signature to:

  asmlinkage int sys_perf_counter_open(

	struct perf_counter_hw_event	*hw_event_uptr		__user,
	pid_t				pid,
	int				cpu,
	int				group_fd);

( Note that group_fd is unused for now - it's reserved for the counter
  groups abstraction. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:48 +01:00
Thomas Gleixner
dfa7c899b4 perf counters: expand use of counter->event
Impact: change syscall, cleanup

Make use of the new perf_counters event type.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:47 +01:00
Thomas Gleixner
4ac13294e4 perf counters: protect them against CSTATE transitions
Impact: fix rare lost events problem

There are CPUs whose performance counters misbehave on CSTATE transitions,
so provide a way to just disable/enable them around deep idle methods.

(hw_perf_enable_all() is cheap on x86.)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-11 15:45:45 +01:00
Ingo Molnar
43874d238d perfcounters: consolidate global-disable codepaths
Impact: cleanup

Simplify global disable handling.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-09 19:28:50 +01:00
Ingo Molnar
1e12567678 perfcounters, x86: clean up debug code
Impact: cleanup

Get rid of unused debug code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-09 19:28:49 +01:00
Ingo Molnar
7e2ae34749 perfcounters, x86: simplify disable/enable of counters
Impact: fix spurious missed counter wakeups

In the case of NMI events, close a race window that can occur if an NMI
hits counter code that temporarily disables+enables a counter, and the NMI
leaks into the disabled section.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-09 19:28:48 +01:00
Ingo Molnar
87b9cf4623 x86, perfcounters: read out MSR_CORE_PERF_GLOBAL_STATUS with counters disabled
Impact: make perfcounter NMI and IRQ sequence more robust

Make __smp_perf_counter_interrupt() a bit more conservative: first disable
all counters, then read out the status. Most invocations are because there
are real events, so there's no performance impact.

Code flow gets a bit simpler as well this way.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-08 15:56:42 +01:00
Ingo Molnar
241771ef01 performance counters: x86 support
Implement performance counters for x86 Intel CPUs.

It's simplified right now: the PERFMON CPU feature is assumed,
which is available in Core2 and later Intel CPUs.

The design is flexible to be extended to more CPU types as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-08 15:47:15 +01:00
Herton Ronaldo Krzesinski
8529154ec3 [CPUFREQ] Add Celeron Core support to p4-clockmod.
Add Celeron Core support to p4-clockmod.

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-12-05 15:20:11 -05:00
Herton Ronaldo Krzesinski
c60e19eb21 [CPUFREQ] add to speedstep-lib additional fsb values for core processors
Add additional fsb values to pentium_core_get_frequency, from latest edition
(September 2008) of Intel 64 and IA-32 Architectures Software Develper's Manual,
Volume 3B: System Programming Guide, Part 2. Values added are to detect 800,
1067 and 1333 FSB types.

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-12-05 15:20:11 -05:00
Matthew Garrett
e088e4c9cd [CPUFREQ] Disable sysfs ui for p4-clockmod.
p4-clockmod has a long history of abuse.   It pretends to be a CPU
frequency scaling driver, even though it doesn't actually change
the CPU frequency, but instead just modulates the frequency with
wait-states.
The biggest misconception is that when running at the lower 'frequency'
p4-clockmod is saving power.  This isn't the case, as workloads running
slower take longer to complete, preventing the CPU from entering deep C states.

However p4-clockmod does have a purpose.  It can prevent overheating.
Having it hooked up to the cpufreq interfaces is the wrong way to achieve
cooling however. It should instead be hooked up to ACPI.

This diff introduces a means for a cpufreq driver to register with the
cpufreq core, but not present a sysfs interface.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-12-05 15:20:10 -05:00
Dominik Brodowski
10db2e5cbd [CPUFREQ] p4-clockmod: reduce noise
On those CPUs which are SpeedStep (EST) capable, we do not care at all if
p4-clockmod does not work, since a technically superior CPU frequency
management technology is to be used.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-12-05 15:20:10 -05:00
Rusty Russell
9963d1aad4 [CPUFREQ] clean up speedstep-centrino and reduce cpumask_t usage
Impact: cleanup

1) The #ifdef CONFIG_HOTPLUG_CPU seems unnecessary these days.
2) The loop can simply skip over offline cpus, rather than creating a tmp mask.
3) set_mask is set to either a single cpu or all online cpus in a policy.
   Since it's just used for set_cpus_allowed(), any offline cpus in a policy
   don't matter, so we can just use cpumask_of_cpu() or the policy->cpus.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-12-05 15:20:10 -05:00
Ingo Molnar
c0515566f3 Merge commit 'v2.6.28-rc7' into x86/cleanups 2008-12-04 11:05:26 +01:00
Ingo Molnar
b8307db247 Merge commit 'v2.6.28-rc7' into tracing/core 2008-12-04 09:07:19 +01:00
Jiri Slaby
4385cecf1f x86: intel_cacheinfo, minor show_type cleanup
Impact: cleanup

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-30 07:46:31 +01:00
Arjan van de Ven
f3f47a6768 tracing: add "power-tracer": C/P state tracer to help power optimization
Impact: new "power-tracer" ftrace plugin

This patch adds a C/P-state ftrace plugin that will generate
detailed statistics about the C/P-states that are being used,
so that we can look at detailed decisions that the C/P-state
code is making, rather than the too high level "average"
that we have today.

An example way of using this is:

 mount -t debugfs none /sys/kernel/debug
 echo cstate > /sys/kernel/debug/tracing/current_tracer
 echo 1 > /sys/kernel/debug/tracing/tracing_enabled
 sleep 1
 echo 0 > /sys/kernel/debug/tracing/tracing_enabled
 cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 08:29:32 +01:00
Andreas Herrmann
a266d9f125 [CPUFREQ] powernow-k8: ignore out-of-range PstateStatus value
A workaround for AMD CPU family 11h erratum 311 might cause that the
P-state Status Register shows a "current P-state" which is larger than
the "current P-state limit" in P-state Current Limit Register. For the
wrong P-state value there is no ACPI _PSS object defined and
powernow-k8/cpufreq can't determine the proper CPU frequency for that
state.

As a consequence this can cause a panic during boot (potentially with
all recent kernel versions -- at least I have reproduced it with
various 2.6.27 kernels and with the current .28 series), as an
example:

powernow-k8: Found 1 AMD Turion(tm)X2 Ultra DualCore Mobile ZM-82 processors (2 \
)
powernow-k8:    0 : pstate 0 (2200 MHz)
powernow-k8:    1 : pstate 1 (1100 MHz)
powernow-k8:    2 : pstate 2 (600 MHz)
BUG: unable to handle kernel paging request at ffff88086e7528b8
IP: [<ffffffff80486361>] cpufreq_stats_update+0x4a/0x5f
PGD 202063 PUD 0
Oops: 0002 [#1] SMP
last sysfs file:
CPU 1
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.28-rc3-dirty #16
RIP: 0010:[<ffffffff80486361>]  [<ffffffff80486361>] cpufreq_stats_update+0x4a/0\
f
Synaptics claims to have extended capabilities, but I'm not able to read them.<6\
6
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88006e7528c0
RDX: 00000000ffffffff RSI: ffff88006e54af00 RDI: ffffffff808f056c
RBP: 00000000fffee697 R08: 0000000000000003 R09: ffff88006e73f080
R10: 0000000000000001 R11: 00000000002191c0 R12: ffff88006fb83c10
R13: 00000000ffffffff R14: 0000000000000001 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88006fb50740(0000) knlGS:0000000000000000
Unable to initialize Synaptics hardware.
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffff88086e7528b8 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88006fb82000, task ffff88006fb816d0)
Stack:
 ffff88006e74da50 0000000000000000 ffff88006e54af00 ffffffff804863c7
 ffff88006e74da50 0000000000000000 00000000ffffffff 0000000000000000
 ffff88006fb83c10 ffffffff8024b46c ffffffff808f0560 ffff88006fb83c10
Call Trace:
 [<ffffffff804863c7>] ? cpufreq_stat_notifier_trans+0x51/0x83
 [<ffffffff8024b46c>] ? notifier_call_chain+0x29/0x4c
 [<ffffffff8024b561>] ? __srcu_notifier_call_chain+0x46/0x61
 [<ffffffff8048496d>] ? cpufreq_notify_transition+0x93/0xa9
 [<ffffffff8021ab8d>] ? powernowk8_target+0x1e8/0x5f3
 [<ffffffff80486687>] ? cpufreq_governor_performance+0x1b/0x20
 [<ffffffff80484886>] ? __cpufreq_governor+0x71/0xa8
 [<ffffffff80484b21>] ? __cpufreq_set_policy+0x101/0x13e
 [<ffffffff80485bcd>] ? cpufreq_add_dev+0x3f0/0x4cd
 [<ffffffff8048577a>] ? handle_update+0x0/0x8
 [<ffffffff803c2062>] ? sysdev_driver_register+0xb6/0x10d
 [<ffffffff8056592c>] ? powernowk8_init+0x0/0x7e
 [<ffffffff8048604c>] ? cpufreq_register_driver+0x8f/0x140
 [<ffffffff80209056>] ? _stext+0x56/0x14f
 [<ffffffff802c2234>] ? proc_register+0x122/0x17d
 [<ffffffff802c23a0>] ? create_proc_entry+0x73/0x8a
 [<ffffffff8025c259>] ? register_irq_proc+0x92/0xaa
 [<ffffffff8025c2c8>] ? init_irq_proc+0x57/0x69
 [<ffffffff807fc85f>] ? kernel_init+0x116/0x169
 [<ffffffff8020cc79>] ? child_rip+0xa/0x11
 [<ffffffff807fc749>] ? kernel_init+0x0/0x169
 [<ffffffff8020cc6f>] ? child_rip+0x0/0x11
Code: 05 c5 83 36 00 48 c7 c2 48 5d 86 80 48 8b 04 d8 48 8b 40 08 48 8b 34 02 48\

RIP  [<ffffffff80486361>] cpufreq_stats_update+0x4a/0x5f
 RSP <ffff88006fb83b20>
CR2: ffff88086e7528b8
---[ end trace 0678bac75e67a2f7 ]---
Kernel panic - not syncing: Attempted to kill init!

In short, aftereffect of the wrong P-state is that
cpufreq_stats_update() uses "-1" as index for some array in

cpufreq_stats_update (unsigned int cpu)
{
...
     if (stat->time_in_state)
                stat->time_in_state[stat->last_index] =
                        cputime64_add(stat->time_in_state[stat->last_index],
                                      cputime_sub(cur_time, stat->last_time));
...
}

Fortunately, the wrong P-state value is returned only if the core is
in P-state 0. This fix solves the problem by detecting the
out-of-range P-state, ignoring it, and using "0" instead.

Cc: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-11-25 13:38:29 -05:00
Hannes Eder
4e42ebd57b x86: hypervisor - fix sparse warnings
Impact: fix sparse build warning

Fix the following sparse warnings:

  arch/x86/kernel/cpu/hypervisor.c:37:15: warning: symbol
  'get_hypervisor_tsc_freq' was not declared. Should it be static?
  arch/x86/kernel/cpu/hypervisor.c:53:16: warning: symbol
  'init_hypervisor' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Cc: "Alok N Kataria" <akataria@vmware.com>
Cc: "Dan Hecht" <dhecht@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 11:11:52 +01:00
Hannes Eder
c450d7805b x86: vmware - fix sparse warnings
Impact: fix sparse build warning

Fix the following sparse warnings:

arch/x86/kernel/cpu/vmware.c:69:5: warning: symbol 'vmware_platform'
was not declared. Should it be static?
arch/x86/kernel/cpu/vmware.c:89:15: warning: symbol
'vmware_get_tsc_khz' was not declared. Should it be static?
arch/x86/kernel/cpu/vmware.c:107:16: warning: symbol
'vmware_set_feature_bits' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Cc: "Alok N Kataria" <akataria@vmware.com>
Cc: "Dan Hecht" <dhecht@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 11:02:36 +01:00
Markus Metzger
f4166c54bf x86, bts: DS and BTS initialization
Impact: widen BTS/PEBS ptrace enablement to more CPU models

Move BTS initialisation out of an #ifdef CONFIG_X86_64 guard.

Assume core2 BTS and DS layout for future models of family 6 processors.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-10 08:50:32 +01:00
Alok Kataria
fd8cd7e191 x86: vmware: look for DMI string in the product serial key
Impact: Should permit VMware detection on older platforms where the
vendor is changed.  Could theoretically cause a regression if some
weird serial number scheme contains the string "VMware" by pure
chance.  Seems unlikely, especially with the mixed case.

In some user configured cases, VMware may choose not to put a VMware specific
DMI string, but the product serial key is always there and is VMware specific.
Add a interface to check the serial key, when checking for VMware in the DMI
information.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-11-04 13:59:00 -08:00
Alok Kataria
6bdbfe9991 x86: VMware: Fix vmware_get_tsc code
Impact: Fix possible failure to calibrate the TSC on Vmware near 4 GHz

The current version of the code to get the tsc frequency from
the VMware hypervisor, will be broken on processor with frequency
(4G-1) HZ, because on such processors eax will have UINT_MAX
and that would be legitimate.
We instead check that EBX did change to decide if we were able to
read the frequency from the hypervisor.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-11-03 11:35:57 -08:00
Alok Kataria
eca0cd028b x86: Add a synthetic TSC_RELIABLE feature bit.
Impact: Changes timebase calibration on Vmware.

Use the synthetic TSC_RELIABLE bit to workaround virtualization anomalies.

Virtual TSCs can be kept nearly in sync, but because the virtual TSC
offset is set by software, it's not perfect.  So, the TSC
synchronization test can fail. Even then the TSC can be used as a
clocksource since the VMware platform exports a reliable TSC to the
guest for timekeeping purposes. Use this bit to check if we need to
skip the TSC sync checks.

Along with this also set the CONSTANT_TSC bit when on VMware, since we
still want to use TSC as clocksource on VM running over hardware which
has unsynchronized TSC's (opteron's), since the hypervisor will take
care of providing consistent TSC to the guest.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-11-01 18:58:01 -07:00
Alok Kataria
88b094fb8d x86: Hypervisor detection and get tsc_freq from hypervisor
Impact: Changes timebase calibration on Vmware.

v3->v2 : Abstract the hypervisor detection and feature (tsc_freq) request
	 behind a hypervisor.c file
v2->v1 : Add a x86_hyper_vendor field to the cpuinfo_x86 structure.
	 This avoids multiple calls to the hypervisor detection function.

This patch adds function to detect if we are running under VMware.
The current way to check if we are on VMware is following,
#  check if "hypervisor present bit" is set, if so read the 0x40000000
   cpuid leaf and check for "VMwareVMware" signature.
#  if the above fails, check the DMI vendors name for "VMware" string
   if we find one we query the VMware hypervisor port to check if we are
   under VMware.

The DMI + "VMware hypervisor port check" is needed for older VMware products,
which don't implement the hypervisor signature cpuid leaf.
Also note that since we are checking for the DMI signature the hypervisor
port should never be accessed on native hardware.

This patch also adds a hypervisor_get_tsc_freq function, instead of
calibrating the frequency which can be error prone in virtualized
environment, we ask the hypervisor for it. We get the frequency from
the hypervisor by accessing the hypervisor port if we are running on VMware.
Other hypervisors too can add code to the generic routine to get frequency on
their platform.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-11-01 18:57:08 -07:00
Ingo Molnar
b342797c1e x86: build fix
Impact: build fix on certain UP configs

fix:

 arch/x86/kernel/cpu/common.c: In function 'cpu_init':
 arch/x86/kernel/cpu/common.c:1141: error: 'boot_cpu_id' undeclared (first use in this function)
 arch/x86/kernel/cpu/common.c:1141: error: (Each undeclared identifier is reported only once
 arch/x86/kernel/cpu/common.c:1141: error: for each function it appears in.)

Pull in asm/smp.h on UP, so that we get the definition of
boot_cpu_id.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 09:31:38 +01:00
Ingo Molnar
1c4acdb467 x86: cpu_index build fix
fix:

 arch/x86/kernel/cpu/common.c: In function 'early_identify_cpu':
 arch/x86/kernel/cpu/common.c:553: error: 'struct cpuinfo_x86' has no member named 'cpu_index'

as cpu_index is only available on SMP.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:43:03 +01:00
James Bottomley
bfcb4c1bec x86/voyager: fix missing cpu_index initialisation
Impact: fix /proc/cpuinfo output on x86/Voyager

Ever since

| commit 92cb7612ae
| Author: Mike Travis <travis@sgi.com>
| Date:   Fri Oct 19 20:35:04 2007 +0200
|
|     x86: convert cpuinfo_x86 array to a per_cpu array

We've had an extra field in cpuinfo_x86 which is cpu_index.
Unfortunately, voyager has never initialised this, although the only
noticeable impact seems to be that /proc/cpuinfo shows all zeros for
the processor ids.

Anyway, fix this by initialising the boot CPU properly and setting the
index when the secondaries update.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:19:37 +01:00
James Bottomley
b3572e361b x86/voyager: fix compile breakage caused by dc1e35c6e9
Impact: build fix on x86/Voyager

Given commits like this:

| Author: Suresh Siddha <suresh.b.siddha@intel.com>
| Date:   Tue Jul 29 10:29:19 2008 -0700
|
|     x86, xsave: enable xsave/xrstor on cpus with xsave support

Which deliberately expose boot cpu dependence to pieces of the system,
I think it's time to explicitly have a variable for it to prevent this
continual misassumption that the boot CPU is zero.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:19:33 +01:00
James Bottomley
017d9d20d8 x86: use CONFIG_X86_SMP instead of CONFIG_SMP
Impact: fix x86/Voyager boot

CONFIG_SMP is used for features which work on *all* x86 boxes.
CONFIG_X86_SMP is used for standard PC like x86 boxes (for things like
multi core and apics)

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 22:53:10 +01:00
Yinghai Lu
30604bb410 x86: break up mtrr_cleanup() into several small functions.
Ingo said mtrr_cleanup() is big and ugly.

so break it up into more functions and make it more readable.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:30:57 +01:00
Linus Torvalds
c3c9897c63 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix section mismatch warning - apic_x2apic_phys
  x86: fix section mismatch warning - apic_x2apic_cluster
  x86: fix section mismatch warning - apic_x2apic_uv_x
  x86: fix section mismatch warning - apic_physflat
  x86: fix section mismatch warning - apic_flat
  x86: memtest fix use of reserve_early()
  x86 syscall.h: fix argument order
  x86/tlb_uv: remove strange mc146818rtc include
  x86: remove redundant KERN_DEBUG on pr_debug
  x86: do_boot_cpu - check if we have ESR register
  x86: MAINTAINERS change for AMD microcode patch loader
  x86/proc: fix /proc/cpuinfo cpu offline bug
  x86: call dmi-quirks for HP Laptops after early-quirks are executed
  x86, kexec: fix hang on i386 when panic occurs while console_sem is held
  MCE: Don't run 32bit machine checks with interrupts on
  x86: SB600: skip IRQ0 override if it is not routed to INT2 of IOAPIC
  x86: make variables static
2008-10-23 12:38:39 -07:00
Linus Torvalds
5b34653963 Merge branch 'x86/um-header' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/um-header' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  x86: canonicalize remaining header guards
  x86: drop double underscores from header guards
  x86: Fix ASM_X86__ header guards
  x86, um: get rid of uml-config.h
  x86, um: get rid of arch/um/Kconfig.arch
  x86, um: get rid of arch/um/os symlink
  x86, um: get rid of excessive includes of uml-config.h
  x86, um: get rid of header symlinks
  x86, um: merge Kconfig.i386 and Kconfig.x86_64
  x86, um: get rid of sysdep symlink
  x86, um: trim the junk from uml ptrace-*.h
  x86, um: take vm-flags.h to sysdep
  x86, um: get rid of uml asm/arch
  x86, um: get rid of uml highmem.h
  x86, um: get rid of uml unistd.h
  x86, um: get rid of system.h -> system.h include
  x86, um: uml atomic.h is not needed anymore
  x86, um: untangle uml ldt.h
  x86, um: get rid of more uml asm/arch uses
  x86, um: remove dead header (uml module-generic.h; never used these days)
  ...
2008-10-23 10:22:01 -07:00
Al Viro
bb8985586b x86, um: ... and asm-x86 move
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-22 22:55:20 -07:00
Len Brown
057316cc6a Merge branch 'linus' into test
Conflicts:
	MAINTAINERS
	arch/x86/kernel/acpi/boot.c
	arch/x86/kernel/acpi/sleep.c
	drivers/acpi/Kconfig
	drivers/pnp/Makefile
	drivers/pnp/quirks.c

Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-23 00:11:07 -04:00
Len Brown
1b79b27da1 Merge branch 'yinghai' into test 2008-10-22 23:35:56 -04:00
Len Brown
6b3c4f8b9c Merge branch 'FW_BUG' into test 2008-10-22 23:19:45 -04:00
Lai Jiangshan
bc8bcc79ea x86/proc: fix /proc/cpuinfo cpu offline bug
Impact: fix missing CPUs in /proc/cpuinfo after CPU hotunplug/hotreplug

In my test, I found that if a cpu has been offline,
the next cpus may not be shown in the /proc/cpuinfo.

if one read() cannot consume the whole /proc/cpuinfo,
c_start() will be called again in the next read() calls.
And *pos has been increased by 1 by the caller(seq_read()).
if this time the cpu#*pos is offline, c_start() will return
NULL, and the next cpus can not be shown.

this fix use next_cpu_nr(*pos - 1, cpu_online_map) to
search the next unshown cpu.

the most easy way to reproduce this bug:
1) offline cpu#1             (cpu#0 is online)
2) dd ibs=2 if=/proc/cpuinfo
   the result is that only cpu#0 is shown.
   cpu#2 and cpu#3 .... cannot be shown in /proc/cpuinfo
   it's bug.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22 14:29:37 +02:00
Linus Torvalds
92b29b86fe Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (131 commits)
  tracing/fastboot: improve help text
  tracing/stacktrace: improve help text
  tracing/fastboot: fix initcalls disposition in bootgraph.pl
  tracing/fastboot: fix bootgraph.pl initcall name regexp
  tracing/fastboot: fix issues and improve output of bootgraph.pl
  tracepoints: synchronize unregister static inline
  tracepoints: tracepoint_synchronize_unregister()
  ftrace: make ftrace_test_p6nop disassembler-friendly
  markers: fix synchronize marker unregister static inline
  tracing/fastboot: add better resolution to initcall debug/tracing
  trace: add build-time check to avoid overrunning hex buffer
  ftrace: fix hex output mode of ftrace
  tracing/fastboot: fix initcalls disposition in bootgraph.pl
  tracing/fastboot: fix printk format typo in boot tracer
  ftrace: return an error when setting a nonexistent tracer
  ftrace: make some tracers reentrant
  ring-buffer: make reentrant
  ring-buffer: move page indexes into page headers
  tracing/fastboot: only trace non-module initcalls
  ftrace: move pc counter in irqtrace
  ...

Manually fix conflicts:
 - init/main.c: initcall tracing
 - kernel/module.c: verbose level vs tracepoints
 - scripts/bootgraph.pl: fallout from cherry-picking commits.
2008-10-20 13:35:07 -07:00
Linus Torvalds
9301975ec2 Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
This merges branches irq/genirq, irq/sparseirq-v4, timers/hpet-percpu
and x86/uv.

The sparseirq branch is just preliminary groundwork: no sparse IRQs are
actually implemented by this tree anymore - just the new APIs are added
while keeping the old way intact as well (the new APIs map 1:1 to
irq_desc[]).  The 'real' sparse IRQ support will then be a relatively
small patch ontop of this - with a v2.6.29 merge target.

* 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (178 commits)
  genirq: improve include files
  intr_remapping: fix typo
  io_apic: make irq_mis_count available on 64-bit too
  genirq: fix name space collisions of nr_irqs in arch/*
  genirq: fix name space collision of nr_irqs in autoprobe.c
  genirq: use iterators for irq_desc loops
  proc: fixup irq iterator
  genirq: add reverse iterator for irq_desc
  x86: move ack_bad_irq() to irq.c
  x86: unify show_interrupts() and proc helpers
  x86: cleanup show_interrupts
  genirq: cleanup the sparseirq modifications
  genirq: remove artifacts from sparseirq removal
  genirq: revert dynarray
  genirq: remove irq_to_desc_alloc
  genirq: remove sparse irq code
  genirq: use inline function for irq_to_desc
  genirq: consolidate nr_irqs and for_each_irq_desc()
  x86: remove sparse irq from Kconfig
  genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n
  ...
2008-10-20 13:23:01 -07:00
Dave Jones
f4432c5cae Update email addresses.
Update assorted email addresses and related info to point
to a single current, valid address.

additionally
- trivial CREDITS entry updates. (Not that this file means much any more)
- remove arjans dead redhat.com address from powernow driver

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 12:50:03 -07:00
Linus Torvalds
db7a6d8d01 Update .gitignore files for generated targets
The generated 'capflags.c' file wasn't properly ignored, and the list of
files in scripts/basic/ wasn't up-to-date.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 11:24:31 -07:00
Yinghai Lu
823b259b80 x86: print out apic id in hex format
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-16 16:53:09 +02:00
Ingo Molnar
8b1fa1d7b2 ftrace: mark lapic_wd_event() notrace
it can be called in the NMI path:

[    0.645999] calling  ftrace_dynamic_init+0x0/0xd6
[    0.647521] ------------[ cut here ]------------
[    0.647521] WARNING: at kernel/trace/ftrace.c:348 ftrace_record_ip+0x4e/0x252()
[    0.647521] Modules linked in:
[    0.647521] Pid: 15, comm: kstop1 Not tainted 2.6.27-rc1-tip #22686
[    0.647521]
[    0.647521] Call Trace:
[    0.647521]  <NMI>  [<ffffffff8024593f>] warn_on_slowpath+0x5d/0x84
[    0.647521]  [<ffffffff80220b99>] ? lapic_wd_event+0xb/0x5c
[    0.647521]  [<ffffffff80287b3b>] ftrace_record_ip+0x4e/0x252
[    0.647521]  [<ffffffff80211274>] mcount_call+0x5/0x31
[    0.647521]  [<ffffffff80220b9e>] ? lapic_wd_event+0x10/0x5c
[    0.647521]  [<ffffffff8083f3ec>] nmi_watchdog_tick+0x19d/0x1ad
[    0.647521]  [<ffffffff8083e875>] default_do_nmi+0x75/0x1e3
[    0.647521]  [<ffffffff8083f0b3>] do_nmi+0x5d/0x94
[    0.647521]  [<ffffffff8083e2d2>] nmi+0xa2/0xc2
[    0.647521]  [<ffffffff802b48c3>] ? check_bytes_and_report+0x11/0xcc
[    0.647521]  <<EOE>>  [<ffffffff80211274>] ? mcount_call+0x5/0x31
[    0.647521]  [<ffffffff802b49df>] check_object+0x61/0x1b0
[    0.647521]  [<ffffffff802b502a>] __slab_free+0x169/0x2ae
[    0.647521]  [<ffffffff80242dbf>] ? __cleanup_sighand+0x25/0x27
[    0.647521]  [<ffffffff80242dbf>] ? __cleanup_sighand+0x25/0x27
[    0.647521]  [<ffffffff802b60cd>] kmem_cache_free+0x85/0xb9
[    0.647521]  [<ffffffff80242dbf>] __cleanup_sighand+0x25/0x27
[    0.647521]  [<ffffffff80247b3d>] release_task+0x256/0x339
[    0.647521]  [<ffffffff802490b4>] do_exit+0x764/0x7ef
[    0.647521]  [<ffffffff8027624c>] __xchg+0x0/0x38
[    0.647521]  [<ffffffff8027619a>] ? stop_cpu+0x0/0xb2
[    0.647521]  [<ffffffff8027619a>] ? stop_cpu+0x0/0xb2
[    0.647521]  [<ffffffff8025922f>] kthread+0x4e/0x7b
[    0.647521]  [<ffffffff80212979>] child_rip+0xa/0x11
[    0.647521]  [<ffffffff80211c17>] ? restore_args+0x0/0x30
[    0.647521]  [<ffffffff802283a5>] ? native_load_tls+0x14/0x2e
[    0.647521]  [<ffffffff802591e1>] ? kthread+0x0/0x7b
[    0.647521]  [<ffffffff8021296f>] ? child_rip+0x0/0x11
[    0.647521]
[    0.647521] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.672032] initcall ftrace_dynamic_init+0x0/0xd6 returned 0 after 19 msecs

also mark it no-kprobes while at it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14 10:34:36 +02:00
Krzysztof Helt
94f6bac105 x86: do not allow to optimize flag_is_changeable_p() (rev. 2)
The flag_is_changeable_p() is used by
has_cpuid_p() which can return different results
in the code sequence below:

 if (!have_cpuid_p())
      identify_cpu_without_cpuid(c);

  /* cyrix could have cpuid enabled via c_identify()*/
  if (!have_cpuid_p())
      return;

Otherwise, the gcc 3.4.6 optimizes these two calls
into one which make the code not working correctly.

Cyrix cpus have the CPUID instruction enabled before
the second call to the have_cpuid_p() but
it is not detected due to the gcc optimization.
Thus the ARR registers (mtrr like) are not detected
on such a cpu.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-13 10:33:13 +02:00
Glauber Costa
e04d645f32 x86: move vgetcpu mode probing to cpu detection
Take it out of time initialization and move it to
cpu detection time.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-13 10:21:49 +02:00
Yinghai Lu
bd32a8cfa8 x86: cpu don't print duplicated vendor string
Some CPUs have vendor string in the middle of model_id instead of beginning

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-13 10:21:21 +02:00
Ingo Molnar
365d46dc9b Merge branch 'linus' into x86/xen
Conflicts:
	arch/x86/kernel/cpu/common.c
	arch/x86/kernel/process_64.c
	arch/x86/xen/enlighten.c
2008-10-12 12:37:32 +02:00
Linus Torvalds
ead9d23d80 Merge phase #4 (X2APIC, APIC unification, CPU identification unification) of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-v28-for-linus-phase4-D' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (186 commits)
  x86, debug: print more information about unknown CPUs
  x86 setup: handle more than 8 CPU flag words
  x86: cpuid, fix typo
  x86: move transmeta cap read to early_init_transmeta()
  x86: identify_cpu_without_cpuid v2
  x86: extended "flags" to show virtualization HW feature in /proc/cpuinfo
  x86: move VMX MSRs to msr-index.h
  x86: centaur_64.c remove duplicated setting of CONSTANT_TSC
  x86: intel.c put workaround for old cpus together
  x86: let intel 64-bit use intel.c
  x86: make intel_64.c the same as intel.c
  x86: make intel.c have 64-bit support code
  x86: little clean up of intel.c/intel_64.c
  x86: make 64 bit to use amd.c
  x86: make amd_64 have 32 bit code
  x86: make amd.c have 64bit support code
  x86: merge header in amd_64.c
  x86: add srat_detect_node for amd64
  x86: remove duplicated force_mwait
  x86: cpu make amd.c more like amd_64.c v2
  ...
2008-10-11 11:51:16 -07:00
Ingo Molnar
0afe2db213 Merge branch 'x86/unify-cpu-detect' into x86-v28-for-linus-phase4-D
Conflicts:
	arch/x86/kernel/cpu/common.c
	arch/x86/kernel/signal_64.c
	include/asm-x86/cpufeature.h
2008-10-11 20:23:20 +02:00
Ingo Molnar
d84705969f Merge branch 'x86/apic' into x86-v28-for-linus-phase4-B
Conflicts:
	arch/x86/kernel/apic_32.c
	arch/x86/kernel/apic_64.c
	arch/x86/kernel/setup.c
	drivers/pci/intel-iommu.c
	include/asm-x86/cpufeature.h
	include/asm-x86/dma-mapping.h
2008-10-11 20:17:36 +02:00
Linus Torvalds
098ef215b1 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Fix BUG: using smp_processor_id() in preemptible code
  [CPUFREQ] Don't export governors for default governor
  [CPUFREQ][6/6] cpufreq: Add idle microaccounting in ondemand governor
  [CPUFREQ][5/6] cpufreq: Changes to get_cpu_idle_time_us(), used by ondemand governor
  [CPUFREQ][4/6] cpufreq_ondemand: Parameterize down differential
  [CPUFREQ][3/6] cpufreq: get_cpu_idle_time() changes in ondemand for idle-microaccounting
  [CPUFREQ][2/6] cpufreq: Change load calculation in ondemand for software coordination
  [CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg()
  [CPUFREQ] use deferrable delayed work init in conservative governor
  [CPUFREQ] drivers/cpufreq/cpufreq.c: Adjust error handling code involving cpufreq_cpu_put
  [CPUFREQ] add error handling for cpufreq_register_governor() error
  [CPUFREQ] acpi-cpufreq: add error handling for cpufreq_register_driver() error
  [CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/powernow-k6.c
  [CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/elanfreq.c
2008-10-11 08:49:34 -07:00
Yinghai Lu
ee29753327 ACPI: don't load acpi_cpufreq if acpi=off
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-10 23:45:38 -04:00
Linus Torvalds
d403a6484f Merge phase #1 of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
This merges phase 1 of the x86 tree, which is a collection of branches:

  x86/alternatives, x86/cleanups, x86/commandline, x86/crashdump,
  x86/debug, x86/defconfig, x86/doc, x86/exports, x86/fpu, x86/gart,
  x86/idle, x86/mm, x86/mtrr, x86/nmi-watchdog, x86/oprofile,
  x86/paravirt, x86/reboot, x86/sparse-fixes, x86/tsc, x86/urgent and
  x86/vmalloc

and as Ingo says: "these are the easiest, purely independent x86 topics
with no conflicts, in one nice Octopus merge".

* 'x86-v28-for-linus-phase1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (147 commits)
  x86: mtrr_cleanup: treat WRPROT as UNCACHEABLE
  x86: mtrr_cleanup: first 1M may be covered in var mtrrs
  x86: mtrr_cleanup: print out correct type v2
  x86: trivial printk fix in efi.c
  x86, debug: mtrr_cleanup print out var mtrr before change it
  x86: mtrr_cleanup try gran_size to less than 1M, v3
  x86: mtrr_cleanup try gran_size to less than 1M, cleanup
  x86: change MTRR_SANITIZER to def_bool y
  x86, debug printouts: IOMMU setup failures should not be KERN_ERR
  x86: export set_memory_ro and set_memory_rw
  x86: mtrr_cleanup try gran_size to less than 1M
  x86: mtrr_cleanup prepare to make gran_size to less 1M
  x86: mtrr_cleanup safe to get more spare regs now
  x86_64: be less annoying on boot, v2
  x86: mtrr_cleanup hole size should be less than half of chunk_size, v2
  x86: add mtrr_cleanup_debug command line
  x86: mtrr_cleanup optimization, v2
  x86: don't need to go to chunksize to 4G
  x86_64: be less annoying on boot
  x86, olpc: fix endian bug in openfirmware workaround
  ...
2008-10-10 08:28:58 -07:00
Hans Schou
43603c8df9 x86, debug: print more information about unknown CPUs
Write the name of the unknown vendor_id to output instead of just
"unknown".

Tag changed to 'vendor_id' as used in /proc/cpuinfo

Signed-off-by: Hans Schou <linux@schou.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 17:03:59 +02:00
venkatesh.pallipadi@intel.com
bf0b90e357 [CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg()
Add a cpu parameter to __cpufreq_driver_getavg(). This is needed for software
cpufreq coordination where policy->cpu may not be same as the CPU on which we
want to getavg frequency.

A follow-on patch will use this parameter to getavg freq from all cpus
in policy->cpus.

Change since last patch. Fix the offline/online and suspend/resume
oops reported by Youquan Song <youquan.song@intel.com>

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09 13:52:43 -04:00
Akinobu Mita
847aef6ffd [CPUFREQ] acpi-cpufreq: add error handling for cpufreq_register_driver() error
add error handling for cpufreq_register_driver() error

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: cpufreq@lists.linux.org.uk
Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09 13:52:42 -04:00
Paolo Ciarrocchi
8d2d2051e5 [CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/powernow-k6.c
Before:
total: 11 errors, 15 warnings, 255 lines checked

After:
total: 0 errors, 6 warnings, 254 lines checked

paolo@paolo-desktop:~/linux.trees.git$ md5sum /tmp/powernow-k6.o.*
476932f5e1ffe365db9d1dfb3f860369  /tmp/powernow-k6.o.after
476932f5e1ffe365db9d1dfb3f860369  /tmp/powernow-k6.o.before

Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09 13:52:42 -04:00
Paolo Ciarrocchi
18c6faa962 [CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/elanfreq.c
Before:
total: 15 errors, 10 warnings, 308 lines checked

After:
total: 0 errors, 4 warnings, 308 lines checked

paolo@paolo-desktop:~/linux.trees.git$ md5sum /tmp/elafreq.o.*
add1d36c2f077c5aab7682e8642a9f34  /tmp/elafreq.o.after
add1d36c2f077c5aab7682e8642a9f34  /tmp/elafreq.o.before

paolo@paolo-desktop:~/linux.trees.git$ size /tmp/elafreq.o.*
   text    data     bss     dec     hex filename
    934     270       4    1208     4b8 /tmp/elafreq.o.after
    934     270       4    1208     4b8 /tmp/elafreq.o.before

Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09 13:52:42 -04:00
Németh Márton
8d59225720 [CPUFREQ] correct broken links and email addresses
Replace the no longer working links and email address in the
documentation and in source code.

Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09 13:52:40 -04:00
Ingo Molnar
e496e3d645 Merge branches 'x86/alternatives', 'x86/cleanups', 'x86/commandline', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/exports', 'x86/fpu', 'x86/gart', 'x86/idle', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/oprofile', 'x86/paravirt', 'x86/reboot', 'x86/sparse-fixes', 'x86/tsc', 'x86/urgent' and 'x86/vmalloc' into x86-v28-for-linus-phase1 2008-10-06 18:17:07 +02:00
Ingo Molnar
0962f402af Merge branch 'x86/prototypes' into x86-v28-for-linus-phase1
Conflicts:
	arch/x86/kernel/process_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-06 18:06:53 +02:00
Ingo Molnar
19268ed744 Merge branch 'x86/pebs' into x86-v28-for-linus-phase1
Conflicts:
	include/asm-x86/ds.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-06 16:17:23 +02:00
Yinghai Lu
dd5523552c x86: mtrr_cleanup: treat WRPROT as UNCACHEABLE
For the purpose of MTRR canonicalization, treat WRPROT as UNCACHEABLE.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-04 20:10:22 -07:00
Yinghai Lu
99e1aa17ce x86: mtrr_cleanup: first 1M may be covered in var mtrrs
The first 1M is don't care when it comes to the variables MTRRs.
Cover it as WB as a heuristic approximation; this is generally what we
want to minimize the number of registers.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-04 20:09:14 -07:00
Yinghai Lu
42fde7a05c x86: mtrr_cleanup: print out correct type v2
Print out the correct type when the Write Protected (WP) type is seen.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-04 20:07:14 -07:00
Yinghai Lu
8bb39311bf x86, debug: mtrr_cleanup print out var mtrr before change it
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-03 09:50:04 +02:00
Yinghai Lu
834836ee6a x86: mtrr_cleanup try gran_size to less than 1M, v3
J.A. Magallón reported:

 >> Also, on a 64 bit box with 4Gb, it gives this:
 >>
 >> cicely:~# cat /proc/mtrr
 >> reg00: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
 >> reg01: base=0x100000000 (4096MB), size=1024MB: write-back, count=1
 >> reg02: base=0x140000000 (5120MB), size= 512MB: write-back, count=1
 >> reg03: base=0x160000000 (5632MB), size= 256MB: write-back, count=1
 >> reg04: base=0x80000000 (2048MB), size=2048MB: uncachable, count=1

boundary handling has a problem ... fix it.

Reported-by: J.A. Magallón <jamagallon@ono.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-03 09:41:55 +02:00
J.A. Magallón
136c82c6f7 x86: mtrr_cleanup try gran_size to less than 1M, cleanup
Patch below cleans up formatting, with space for big bases and sizes (64 Gb).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-03 09:41:42 +02:00
J.A. Magallón
3dcd7e269d x86: fix typo in enable_mtrr_cleanup early parameter
Correct typo for 'enable_mtrr_cleanup' early boot param name.

Signed-off-by: J.A. Magallon <jamagallon@ono.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-30 10:06:09 +02:00
Yinghai Lu
4624065731 x86: mtrr_cleanup try gran_size to less than 1M
one have gran < 1M

reg00: base=0xd8000000 (3456MB), size= 128MB: uncachable, count=1
reg01: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
reg02: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
reg03: base=0x100000000 (4096MB), size= 512MB: write-back, count=1
reg04: base=0x120000000 (4608MB), size= 128MB: write-back, count=1
reg05: base=0xd7f80000 (3455MB), size= 512KB: uncachable, count=1

will get

Found optimal setting for mtrr clean up
gran_size: 512K         chunk_size: 2M  num_reg: 7      lose RAM: 0G
range0: 0000000000000000 - 00000000d8000000
Setting variable MTRR 0, base: 0GB, range: 2GB, type WB
Setting variable MTRR 1, base: 2GB, range: 1GB, type WB
Setting variable MTRR 2, base: 3GB, range: 256MB, type WB
Setting variable MTRR 3, base: 3328MB, range: 128MB, type WB
hole: 00000000d7f00000 - 00000000d7f80000
Setting variable MTRR 4, base: 3455MB, range: 512KB, type UC
rangeX: 0000000100000000 - 0000000128000000
Setting variable MTRR 5, base: 4GB, range: 512MB, type WB
Setting variable MTRR 6, base: 4608MB, range: 128MB, type WB

so start from 64k instead of 1M

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-29 20:50:11 -07:00
Yinghai Lu
dd7e52224f x86: mtrr_cleanup prepare to make gran_size to less 1M
make the print out right with size < 1M

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-29 20:50:04 -07:00
Yinghai Lu
73436a1d25 x86: mtrr_cleanup safe to get more spare regs now
Delay exit to make sure we can actually get the optimal result in as
many cases as possible.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-29 18:34:08 -07:00
Yinghai Lu
8f0afaa58e x86: mtrr_cleanup hole size should be less than half of chunk_size, v2
v2: should check with half of range0 size instead of chunk_size

So don't have silly big hole.

in hpa's case we could auto detect instead of adding mtrr_chunk_size in
command line.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-28 09:12:27 +02:00
Yinghai Lu
54d45ff420 x86: add mtrr_cleanup_debug command line
add mtrr_cleanup_debug to print out more info about layout

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-27 18:09:35 +02:00
Yinghai Lu
2313c2793d x86: mtrr_cleanup optimization, v2
fix hpa's t61 with 4g ram:

   change layout from
	(n - 1)*chunksize + chunk_size - NC
   to
	n*chunksize - NC

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-27 18:09:34 +02:00
Yinghai Lu
7fc2368d1d x86: don't need to go to chunksize to 4G
change back chunksize max to 2g
   otherwise will get strange layout in 2G ram system like
     0 - 4g WB, 2040M - 2048M UC, 2048M -  4G NC
   instead of
     0 - 2g WB, 2040M - 2048M UC

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-27 18:09:33 +02:00
Ingo Molnar
ebdd90a8cb Merge commit 'v2.6.27-rc7' into x86/pebs 2008-09-24 09:56:20 +02:00
Ingo Molnar
07bbc16a86 Merge branch 'timers/urgent' into x86/xen
Conflicts:
	arch/x86/kernel/process_32.c
	arch/x86/kernel/process_64.c

Manual merge:

	arch/x86/kernel/smpboot.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-23 23:26:42 +02:00
Thomas Renninger
2fd47094f9 CPUFREQ: powernow-k8: Try to detect old BIOS, not supporting CPU freq on a recent AMD CPUs.
Make use of FW_BUG interface to give vendors and users the ability to
automatically check for powernow-k8 related BIOS bugs by:
dmesg |grep "Firmware Bug"

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-09-22 18:50:52 -04:00
Aristeu Rozanski
b3e15bdef6 x86, NMI watchdog: setup before enabling NMI watchdog
There's a small window when NMI watchdog is being set up that if any NMIs
are triggered, the NMI code will make make use of not initalized wd_ops
elements:
	void setup_apic_nmi_watchdog(void *unused)
	{
		if (__get_cpu_var(wd_enabled))
			return;

		/* cheap hack to support suspend/resume */
		/* if cpu0 is not active neither should the other cpus */
		if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
			return;

		switch (nmi_watchdog) {
		case NMI_LOCAL_APIC:
			/* enable it before to avoid race with handler */
-->			__get_cpu_var(wd_enabled) = 1;
-->			if (lapic_watchdog_init(nmi_hz) < 0) {
(...)
	asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
	{
	(...)
			if (nmi_watchdog_tick(regs, reason))
				return;
(...)
	notrace __kprobes int
	nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
	{
	(...)
		if (!__get_cpu_var(wd_enabled))
			return rc;
		switch (nmi_watchdog) {
		case NMI_LOCAL_APIC:
			rc |= lapic_wd_event(nmi_hz);
(...)
int lapic_wd_event(unsigned nmi_hz)
{
	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
	u64 ctr;

-->	rdmsrl(wd->perfctr_msr, ctr);

and wd->*_msr will be initialized on each processor type specific setup, after
enabling NMIs for PMIs. Since the counter was just set, the chances of an
performance counter generated NMI is minimal, but any other unknown NMI would
trigger the problem. This patch fixes the problem by setting everything up
before enabling performance counter generated NMIs and will set wd_enabled
using a callback function.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-22 19:48:19 +02:00
Aristeu Rozanski
28b166a700 x86, NMI watchdog: when booting with reset_devices, clear the performance counters
P4s have a quirk that makes necessary to clear P4_CCCR_OVF bit on the CCCR
everytime the PMI is triggered. When booting the kernel with reset_devices
(more specific kdump case), the counters reach zero and the PMI will be
generated. This is not a problem on other processors but on P4s, it'll
continue to generate NMIs until that bit is cleared. Since there may be
other users of the performance counters, clear and disable all of them
when booting with reset_devices option.

We have a P4 box here that crashes because of this problem. Since the kdump
kernel usually boots with only one processor active, the second logical
unit won't be set up, therefore, MSR_P4_IQ_CCCR1 (and other performance
counter registers) won't be cleared and P4_CCCR_OVF may be still set because
the previous kernel was using this register. An NMI is triggered because of
the MSR_P4_IQ_CCCR1 right after the NMI delivery is enabled, triggering the
race fixed on my previous email.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-22 19:48:18 +02:00
Yinghai Lu
16dc552f35 x86: use WARN_ONCE in workaround for mtrr mask
so could help catch attention about bug in bios about mtrr mask setting.

WARN_ONCE got into mainline already, lets use it.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-22 13:09:56 +02:00
Ingo Molnar
0b88641f1b Merge commit 'v2.6.27-rc7' into x86/debug 2008-09-22 13:08:57 +02:00
Yinghai Lu
279b0bbba2 x86: fix arch/x86/kernel/cpu/mtrr/main.c warning
fix this warning reported by Andrew Morton:

> arch/x86/kernel/cpu/mtrr/main.c: In function 'mtrr_bp_init':
> arch/x86/kernel/cpu/mtrr/main.c:1170: warning: 'extra_remove_base' may be used uninitialized in this function

the warning is bogus but the logic that prevents uninitialized use
is a bit convoluted so simplify it all.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-19 09:16:06 +02:00
H. Peter Anvin
ba0593bf55 x86: completely disable NOPL on 32 bits
Completely disable NOPL on 32 bits.  It turns out that Microsoft
Virtual PC is so broken it can't even reliably *fail* in the presence
of NOPL.

This leaves the infrastructure in place but disables it
unconditionally.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-16 09:33:57 -07:00
Ingo Molnar
a9853dd6d2 x86: cpuid, fix typo
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 14:46:58 +02:00
Yinghai Lu
afae865613 x86: move transmeta cap read to early_init_transmeta()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 14:09:14 +02:00
Yinghai Lu
aef93c8bd5 x86: identify_cpu_without_cpuid v2
Krzysztof found some old cyrix cpu where an mtrr-alike cpu feature was
not detected properly.

this one is based on Krzysztof' patch, and we call ->c_identify() in
early_identify_cpu.

need to call c_identify() for cpus without cpuid even earlier ...

v2: Krzysztof point out need to give cyrix another chance about cpuid
    checking again, after ->c_identify() enables cpuid for it

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 14:09:13 +02:00
Ingo Molnar
3ce9bcb583 Merge branch 'core/xen' into x86/xen 2008-09-10 14:05:45 +02:00
Sheng Yang
e38e05a858 x86: extended "flags" to show virtualization HW feature in /proc/cpuinfo
The hardware virtualization technology evolves very fast. But currently
it's hard to tell if your CPU support a certain kind of HW technology
without digging into the source code.

The patch add a new catagory in "flags" under /proc/cpuinfo. Now "flags"
can indicate the (important) HW virtulization features the CPU supported
as well.

Current implementation just cover Intel VMX side.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 14:00:56 +02:00
Yinghai Lu
ec70cae869 x86: centaur_64.c remove duplicated setting of CONSTANT_TSC
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 08:21:06 +02:00
Yinghai Lu
4052704d92 x86: intel.c put workaround for old cpus together
consolidate the code some more.

No change in functionality intended.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 08:21:06 +02:00
Yinghai Lu
879d792b66 x86: let intel 64-bit use intel.c
now that arch/x86/kernel/cpu/intel_64.c and
arch/x86/kernel/cpu/intel.c are equal, drop
arch/x86/kernel/cpu/intel_64.c and fix up
the glue.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 08:21:05 +02:00
Yinghai Lu
58602c1681 x86: make intel_64.c the same as intel.c
No change in functionality intended - this only adds the 32-bit side.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 08:21:04 +02:00
Yinghai Lu
185f3b9da2 x86: make intel.c have 64-bit support code
prepare for unification.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 08:21:03 +02:00
Ingo Molnar
81faaae457 Merge branch 'x86/pebs' into x86/unify-cpu-detect
Conflicts:
	arch/x86/Kconfig.cpu
	include/asm-x86/ds.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-10 08:20:51 +02:00
Yinghai Lu
f69feff720 x86: little clean up of intel.c/intel_64.c
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:46:03 +02:00
Yinghai Lu
ff73152ced x86: make 64 bit to use amd.c
arch/x86/kernel/cpu/amd.c is now 100% identical to
arch/x86/kernel/cpu/amd_64.c, so use amd.c on 64-bit too
and fix up the namespace impact.

Simplify the Kconfig glue as well.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:32:06 +02:00
Yinghai Lu
2a02505055 x86: make amd_64 have 32 bit code
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:32:03 +02:00
Yinghai Lu
6c62aa4a3c x86: make amd.c have 64bit support code
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:32:02 +02:00
Yinghai Lu
8d71a2ea0a x86: merge header in amd_64.c
Singed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:32:01 +02:00
Yinghai Lu
2b86473604 x86: add srat_detect_node for amd64
separate that from amd_detect_cmp()

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:32:00 +02:00
Yinghai Lu
c58606ad55 x86: remove duplicated force_mwait
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:31:59 +02:00
Yinghai Lu
11fdd252bb x86: cpu make amd.c more like amd_64.c v2
1. make 32bit have early_init_amd_mc and amd_detect_cmp
2. seperate init_amd_k5/k6/k7 ...

v2: fix compiling for !CONFIG_SMP

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 15:31:58 +02:00
Andreas Herrmann
23952a96ae x86: cpu_init(): fix memory leak when using CPU hotplug
Exception stacks are allocated each time a CPU is set online.
But the allocated space is never freed. Thus with one CPU hotplug
offline/online cycle there is a memory leak of 24K (6 pages) for
a CPU.

Fix is to allocate exception stacks only once -- when the CPU is
set online for the first time.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:48:16 +02:00
Andreas Herrmann
d04ec773d7 x86: pda_init(): fix memory leak when using CPU hotplug
pda->irqstackptr is allocated whenever a CPU is set online.
But it is never freed. This results in a memory leak of 16K
for each CPU offline/online cycle.

Fix is to allocate pda->irqstackptr only once.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 20:48:02 +02:00
Jan Beulich
2d9cd6c27f x86-64: add two __cpuinit annotations
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 19:50:41 +02:00
Yinghai Lu
dd786dd12c x86: move mtrr cpu cap setting early in early_init_xxxx
Krzysztof Helt found MTRR is not detected on k6-2

root cause:
	we moved mtrr_bp_init() early for mtrr trimming,
and in early_detect we only read the CPU capability from cpuid,
so some cpu doesn't have that bit in cpuid.

So we need to add early_init_xxxx to preset those bit before mtrr_bp_init
for those earlier cpus.

this patch is for v2.6.27

Reported-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 17:50:55 +02:00
Krzysztof Helt
12cf105cd6 x86: delay early cpu initialization until cpuid is done
Move early cpu initialization after cpu early get cap so the
early cpu initialization can fix up cpu caps.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 17:50:38 +02:00
Yinghai Lu
e322423471 x86, cpu init: call early_init_xxx in init_xxx
so we:

 1. could set some cap to ap
 2. restore some cap after memset in identify_cpu for boot cpu

esp for CONSTANT_TSC this matters, as:

before this patch:
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow rep_good nopl pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs

after this patch:
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs

so constant_tsc is back...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 14:09:14 +02:00
Yinghai Lu
1b05d60d60 x86: remove duplicated get_model_name() calling
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-06 14:09:12 +02:00
H. Peter Anvin
b6734c35af x86: add NOPL as a synthetic CPU feature bit
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum.  Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-09-05 16:13:52 -07:00
Alex Nixon
913da64b54 x86: build fix for !CONFIG_SMP
Move reset_lazy_tlbstate into tlb_32.c, and define noop versions of
play_dead() in process_{32,64}.c when !CONFIG_SMP.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:44:08 +02:00
Yinghai Lu
bd220a24a9 x86: move nonx_setup etc from common.c to init_64.c
like 32 bit put it in init_32.c

Signed-off-by: Yinghai <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 10:23:47 +02:00
Yinghai Lu
f5017cfa35 x86: use cpu/common.c on 64 bit
Use cpu/common.c on both 64-bit and 32-bit and remove cpu/common_64.c.

We started out with this linecount:

  816  arch/x86/kernel/cpu/common_64.c
  805  arch/x86/kernel/cpu/common.c

and the resulting common.c is 1197 lines long, so there's already
424 lines of code eliminated in this phase of the unification.

Signed-off-by: Yinghai <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:57 +02:00
Ingo Molnar
143b604a2d x86: cpu/common*.c, merge whitespaces
Merge leftover whitespaces, to make arch/x86/kernel/cpu/common_64.c
exactly identical to arch/x86/kernel/cpu/common.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:56 +02:00
Yinghai Lu
102bbe3ab8 x86: cpu/common*.c, merge identify_cpu()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:56 +02:00
Yinghai Lu
b89d3b3e2c x86: cpu/common*.c, merge generic_identify()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:55 +02:00
Yinghai Lu
56f0d033be x86: cpu/common*.c: merge print_cpu_info()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:54 +02:00
Yinghai Lu
6627d24230 x86: cpu/common*.c, merge early_identify_cpu()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:54 +02:00
Yinghai Lu
5122c890ba x86: cpu/common.c: merge get_cpu_cap()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:53 +02:00
Yinghai Lu
1cd78776c7 x86: cpu/common*.c, merge detect_ht()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:52 +02:00
Yinghai Lu
140fc72709 x86: cpu/common*.c, merge display_cacheinfo()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:51 +02:00
Yinghai Lu
b9e67f0042 x86: cpu/common.c, merge default_init()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:50 +02:00
Yinghai Lu
fab334c1d5 x86: cpu/common*.c, merge switch_to_new_gdt()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:50 +02:00
Yinghai Lu
1ba76586f7 x86: cpu/common*.c have same cpu_init(), with copying and #ifdef
hard to merge by lines... (as here we have material differences between
32-bit and 64-bit mode) - will try to do it later.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:49 +02:00
Yinghai Lu
d5494d4f51 x86: cpu/common*.c, make 32-bit have 64-bit only functions
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:48 +02:00
Yinghai Lu
ba51dced0b x86: cpu/common.c, let 64-bit code have 32-bit only functions
No effect on 64-bit.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:47 +02:00
Yinghai Lu
950ad7ff6e x86: same gdt_page with macro
Move the 32-bit and 64-bit gdt_page definitions next to each
other, separated with an #ifdef.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:47 +02:00
Yinghai Lu
f0fc4aff1f x86: make header file the same in arch/x86/kernel/cpu/common_xx.c
Make the files more similar in preparation to unification, no
code changed.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:46 +02:00
Yinghai Lu
97e4db7c87 x86: make detect_ht depend on CONFIG_X86_HT
64-bit has X86_HT set too, so use that instead of SMP.

This also removes a include/asm-x86/processor.h ifdef.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:45 +02:00
Ingo Molnar
d3d0ba7b8f Merge commit '63cc8c75156462d4b42cbdd76c293b7eee7ddbfe':
"percpu: introduce DEFINE_PER_CPU_PAGE_ALIGNED() macro"

into x86/core

Conflicts:
	arch/x86/kernel/cpu/common.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:24:30 +02:00
Ingo Molnar
9042763808 Merge branch 'x86/x2apic' into x86/core
Conflicts:
	arch/x86/kernel/cpu/common_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:21:21 +02:00
Ingo Molnar
446d27338d Merge branch 'x86/cpu' into x86/core 2008-09-05 09:19:50 +02:00
Ingo Molnar
accf0fa697 Merge branch 'x86/xsave' into x86/core 2008-09-05 09:18:39 +02:00
Yinghai Lu
0a488a53d7 x86: move 32bit related functions together
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:47 +02:00
Yinghai Lu
01b2e16a7a x86: make get_mode_name of 64bit the same as 32bit
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:46 +02:00
Yinghai Lu
a0854a46c5 x86: make 32bit support show_msr like 64 bit
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:46 +02:00
Yinghai Lu
10a434fcb2 x86: remove cpu_vendor_dev
1. add c_x86_vendor into cpu_dev
2. change cpu_devs to static
3. check c_x86_vendor before put that cpu_dev into array
4. remove alignment for 64bit
5. order the sequence in cpu_devs according to link sequence...
   so could put intel at first, then amd...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:45 +02:00
Yinghai Lu
9d31d35b5f x86: order functions in cpu/common.c and cpu/common_64.c v2
v2: make 64 bit get c->x86_cache_alignment = c->x86_clfush_size

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:44 +02:00
Yinghai Lu
3da99c9776 x86: make (early)_identify_cpu more the same between 32bit and 64 bit
1. add extended_cpuid_level for 32bit
 2. add generic_identify for 64bit
 3. add early_identify_cpu for 32bit
 4. early_identify_cpu not be called by identify_cpu
 5. remove early in get_cpu_vendor for 32bit
 6. add get_cpu_cap
 7. add cpu_detect for 64bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:44 +02:00
Krzysztof Helt
5031088dbc x86: delay early cpu initialization until cpuid is done
Move early cpu initialization after cpu early get cap so the
early cpu initialization can fix up cpu caps.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:43 +02:00
Yinghai Lu
5fef55fddb x86: move mtrr cpu cap setting early in early_init_xxxx
Krzysztof Helt found MTRR is not detected on k6-2

root cause:
	we moved mtrr_bp_init() early for mtrr trimming,
and in early_detect we only read the CPU capability from cpuid,
so some cpu doesn't have that bit in cpuid.

So we need to add early_init_xxxx to preset those bit before mtrr_bp_init
for those earlier cpus.

this patch is for v2.6.27

Reported-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:43 +02:00
Ingo Molnar
62b3f98188 Merge branch 'x86/debug' into x86/cpu 2008-09-04 21:08:09 +02:00
H. Peter Anvin
aa3341a168 Merge branch 'x86/cpu' into x86/x2apic
Conflicts:

	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h
2008-09-04 09:21:21 -07:00
H. Peter Anvin
fe47784ba5 Merge branch 'x86/cpu' into x86/xsave
Conflicts:

	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h
2008-09-04 09:04:45 -07:00
H. Peter Anvin
7203781c98 Merge branch 'x86/cpu' into x86/core
Conflicts:

	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h
2008-09-04 08:08:42 -07:00
Ingo Molnar
42390cdec5 Merge branch 'linus' into x86/x2apic
Conflicts:
	arch/x86/kernel/cpu/cyrix.c
	include/asm-x86/cpufeature.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 13:02:35 +02:00
H. Peter Anvin
7414aa41a6 x86: generate names for /proc/cpuinfo from <asm/cpufeature.h>
We have had a number of cases where <asm/cpufeature.h> (and its
predecessors) have diverged substantially from the names list in
/proc/cpuinfo.  This patch generates the latter from the former.

It retains the option for explicitly overriding the strings, but by
making that require a separate action it should at least be less
likely to happen.

It would be good to do a future pass and rename strings that are
gratuituously different in the kernel (/proc/cpuinfo is a userspace
interface and must remain constant.)

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-27 19:23:22 -07:00
H. Peter Anvin
b30a72a7ed Merge branch 'x86/urgent' into x86/cpu
Conflicts:

	arch/x86/kernel/cpu/cyrix.c
2008-08-27 19:17:07 -07:00
Suresh Siddha
11c231a962 x86: use x2apic id reported by cpuid during topology discovery, fix
v2: Fix for !SMP build

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-27 09:02:19 +02:00
Ingo Molnar
f58899bb02 Merge branch 'linus' into x86/urgent 2008-08-25 14:39:12 +02:00
Alex Nixon
3790025863 x86_32: clean up play_dead
The removal of the CPU from the various maps was redundant as it already
happened in cpu_disable.

After cleaning this up, cpu_uninit only resets the tlb state, so rename
it and create a noop version for the X86_64 case (so the two play_deads
can be unified later).

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:59:18 +02:00
Rafael J. Wysocki
8735728ef8 x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs
During CPU hot-remove the sysfs directory created by
threshold_create_bank(), defined in
arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before
its parent directory, created by mce_create_device(), defined in
arch/x86/kernel/cpu/mcheck/mce_64.c .  Moreover, when the CPU in
question is hotplugged again, obviously the latter has to be created
before the former.  At present, the right ordering is not enforced,
because all of these operations are carried out by CPU hotplug
notifiers which are not appropriately ordered with respect to each
other.  This leads to serious problems on systems with two or more
multicore AMD CPUs, among other things during suspend and hibernation.

Fix the problem by placing threshold bank CPU hotplug callbacks in
mce_cpu_callback(), so that they are invoked at the right places,
if defined.  Additionally, use kobject_del() to remove the sysfs
directory associated with the kobject created by
kobject_create_and_add() in threshold_create_bank(), to prevent the
kernel from crashing during CPU hotplug operations on systems with
two or more multicore AMD CPUs.

This patch fixes bug #11337.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-23 17:49:19 +02:00
Suresh Siddha
bbb65d2d36 x86: use cpuid vector 0xb when available for detecting cpu topology
cpuid leaf 0xb provides extended topology enumeration. This interface provides
the 32-bit x2APIC id of the logical processor and it also provides a new
mechanism to detect SMT and core siblings (which provides increased
addressability).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-23 17:47:10 +02:00
Ingo Molnar
87ce786ae5 Merge branch 'x86/cpu' into x86/x2apic 2008-08-23 17:46:59 +02:00
Linus Torvalds
358c323c17 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: work around MTRR mask setting, v2
  x86: fix section mismatch warning - uv_cpu_init
  x86: fix VMI for early params
  x86: fix two modpost warnings in mm/init_64.c
  x86: fix 1:1 mapping init on 64-bit (memory hotplug case)
  x86: work around MTRR mask setting
  x86: PAT Update validate_pat_support for intel CPUs
  devmem, x86: PAT Change /dev/mem mmap with O_SYNC to use UC_MINUS
  x86: PAT proper tracking of set_memory_uc and friends
  x86: fix BUG: unable to handle kernel paging request (numaq_tsc_disable)
  x86: export pv_lock_ops non-GPL
  x86, mmiotrace: silence section mismatch warning - leave_uniprocessor
  x86: use WARN() in arch/x86/kernel
  x86: use WARN() in arch/x86/mm/ioremap.c
  werror: fix pci calgary
  x86: fix oprofile + hibernation badness
  x86, SGI UV: hardcode the TLB flush interrupt system vector
  x86: fix Xorg startup/shutdown slowdown with PAT
  x86: fix "kernel won't boot on a Cyrix MediaGXm (Geode)"
  x86 iommu: remove unneeded parenthesis
2008-08-22 08:23:53 -07:00
Ingo Molnar
9754a5b840 x86: work around MTRR mask setting, v2
improve the debug printout:

- make it actually display something
- print it only once

would be nice to have a WARN_ONCE() facility, to feed such things to
kerneloops.org.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 14:12:31 +02:00
Yinghai Lu
b05f78f5c7 x86_64: printout msr -v2
commandline show_msr=1 for bsp, show_msr=32 for all 32 cpus.

[ mingo@elte.hu: added documentation ]

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 10:43:21 +02:00
Yinghai Lu
38cc1c3df7 x86: work around MTRR mask setting
Joshua Hoblitt reported that only 3 GB of his 16 GB of RAM is
usable. Booting with mtrr_show showed us the BIOS-initialized
MTRR settings - which are all wrong.

So the root cause is that the BIOS has not set the mask correctly:

>               [    0.429971]  MSR00000200: 00000000d0000000
>               [    0.433305]  MSR00000201: 0000000ff0000800
> should be ==> [    0.433305]  MSR00000201: 0000003ff0000800
>
>               [    0.436638]  MSR00000202: 00000000e0000000
>               [    0.439971]  MSR00000203: 0000000fe0000800
> should be ==> [    0.439971]  MSR00000203: 0000003fe0000800
>
>               [    0.443304]  MSR00000204: 0000000000000006
>               [    0.446637]  MSR00000205: 0000000c00000800
> should be ==> [    0.446637]  MSR00000205: 0000003c00000800
>
>               [    0.449970]  MSR00000206: 0000000400000006
>               [    0.453303]  MSR00000207: 0000000fe0000800
> should be ==> [    0.453303]  MSR00000207: 0000003fe0000800
>
>               [    0.456636]  MSR00000208: 0000000420000006
>               [    0.459970]  MSR00000209: 0000000ff0000800
> should be ==> [    0.459970]  MSR00000209: 0000003ff0000800

So detect this borkage and add the prefix 111.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 05:49:35 +02:00
venkatesh.pallipadi@intel.com
8323444b5d x86: PAT Update validate_pat_support for intel CPUs
Pentium III and Core Solo/Duo CPUs have an erratum
" Page with PAT set to WC while associated MTRR is UC may consolidate to UC "
which can result in WC setting in PAT to be ineffective. We will disable
PAT on such CPUs, so that we can continue to use MTRR WC setting.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 13:27:34 +02:00
Arjan van de Ven
bde78a79a6 x86: use WARN() in arch/x86/kernel
Use WARN() instead of a printk+WARN_ON() pair; this way the message
becomes part of the warning section for better reporting/collection.
This also allowed the folding of some if()'s into the WARN()

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 10:01:52 +02:00
Samuel Sieb
c6744955d0 x86: fix "kernel won't boot on a Cyrix MediaGXm (Geode)"
Cyrix MediaGXm/Cx5530 Unicorn Revision 1.19.3B has stopped
booting starting at v2.6.22.

The reason is this commit:

> commit f25f64ed5b
> Author: Juergen Beisert <juergen@kreuzholzen.de>
> Date:   Sun Jul 22 11:12:38 2007 +0200
>
>     x86: Replace NSC/Cyrix specific chipset access macros by inlined functions.

this commit activated a macro which was dormant before due to (buggy)
macro side-effects.

I've looked through various datasheets and found that the GXm and GXLV
Geode processors don't have an incrementor.

Remove the incrementor setup entirely.  As the incrementor value
differs according to clock speed and we would hope that the BIOS
configures it correctly, it is probably the right solution.

Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 11:31:00 +02:00
Linus Torvalds
f607e3a03c Revert "[CPUFREQ][2/2] preregister support for powernow-k8"
This reverts commit 34ae7f35a2, which has
been reported to cause a number of problems.  During suspend and resume,
it apparently causes a crash in a CPU hotplug notifier to happen,
although the exact details are sketchy because of the inability to get
good traces during the suspend sequence.

See buzilla entries

	http://bugzilla.kernel.org/show_bug.cgi?id=11296
	http://bugzilla.kernel.org/show_bug.cgi?id=11339

for more examples and details.

[ Mark: "Revert the patch for now.  I'm still looking into getting a
  reliable reproduction and I do not have a fix at this time." ]

Requested-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@inux-foundation.org>
2008-08-19 13:34:59 -07:00
H. Peter Anvin
7e00df5818 x86: add NOPL as a synthetic CPU feature bit
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum.  Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-18 18:22:17 -07:00
Thomas Petazzoni
8d02c2110b x86: configuration options to compile out x86 CPU support code
This patch adds some configuration options that allow to compile out
CPU vendor-specific code in x86 kernels (in arch/x86/kernel/cpu). The
new configuration options are only visible when CONFIG_EMBEDDED is
selected, as they are mostly interesting for space savings reasons.

An example of size saving, on x86 with only Intel CPU support:

   text	   data	    bss	    dec	    hex	filename
1125479	 118760	 212992	1457231	 163c4f	vmlinux.old
1121355	 116536	 212992	1450883	 162383	vmlinux
  -4124   -2224       0   -6348   -18CC +/-

However, I'm not exactly sure that the Kconfig wording is correct with
regard to !64BIT / 64BIT.

[ mingo@elte.hu: convert macro to inline ]

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 16:05:48 +02:00
Thomas Petazzoni
774400a3ba x86: move cmpxchg fallbacks to a generic place
arch/x86/kernel/cpu/intel.c defines a few fallback functions
(cmpxchg_*()) that are used when the CPU doesn't support cmpxchg
and/or cmpxchg64 natively. However, while defined in an Intel-specific
file, these functions are also used for CPUs from other vendors when
they don't support cmpxchg and/or cmpxchg64. This breaks the
compilation when support for Intel CPUs is disabled.

This patch moves these functions to a new
arch/x86/kernel/cpu/cmpxchg.c file, unconditionally compiled when
X86_32 is enabled.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 16:05:47 +02:00
Thomas Petazzoni
8bfcb3960f x86: make movsl_mask definition non-CPU specific
movsl_mask is currently defined in arch/x86/kernel/cpu/intel.c, which
contains code specific to Intel CPUs. However, movsl_mask is used in
the non-CPU specific code in arch/x86/lib/usercopy_32.c, which breaks
the compilation when support for Intel CPUs is compiled out.

This patch solves this problem by moving movsl_mask's definition close
to its users in arch/x86/lib/usercopy_32.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 16:05:45 +02:00
Ingo Molnar
1a10390708 Merge branch 'linus' into x86/cpu 2008-08-15 16:16:15 +02:00