* [PATCH for review] [0/48] Second batch of x86 patches for .23
@ 2007-07-19 13:48 Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [1/48] i386: pgd_{c,d}tor() static Andi Kleen
` (37 more replies)
0 siblings, 38 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: linux-kernel
[repost to linux-kernel because I messed up the email address first time]
- Controversial change: extend the unhandled signals logging to i386
* all flames to Masoud Asgharifard Sharbiani <masouds@google.com> please
- NUMA fallback handling for AMD Fam10h/11h (Joachim Deguara)
- Second try at registering HPET in resources
- Various AMD Geode fixes
- Better OOM handling -- kill all threads
- Better HPET sanity checking
- Fix PCI mmconfig on some systems
- kprobes/alternative handling now works with DEBUG_RODATA again
- Allow KVM on non PAE kernels again
- Mostly various small cleanups and fixes
Please review. I plan to send them off relatively quickly
because I'm very late with this merge.
-Andi
* [PATCH for review] [1/48] i386: pgd_{c,d}tor() static
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [2/48] i386: fix section mismatch warning in intel_cacheinfo Andi Kleen
` (36 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: bunk, linux-kernel
From: Adrian Bunk <bunk@stusta.de>
pgd_{c,d}tor() can now become static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/i386/mm/pgtable.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Index: linux/arch/i386/mm/pgtable.c
===================================================================
--- linux.orig/arch/i386/mm/pgtable.c
+++ linux/arch/i386/mm/pgtable.c
@@ -235,7 +235,7 @@ static inline void pgd_list_del(pgd_t *p
#if (PTRS_PER_PMD == 1)
/* Non-PAE pgd constructor */
-void pgd_ctor(void *pgd)
+static void pgd_ctor(void *pgd)
{
unsigned long flags;
@@ -257,7 +257,7 @@ void pgd_ctor(void *pgd)
}
#else /* PTRS_PER_PMD > 1 */
/* PAE pgd constructor */
-void pgd_ctor(void *pgd)
+static void pgd_ctor(void *pgd)
{
/* PAE, kernel PMD may be shared */
@@ -276,7 +276,7 @@ void pgd_ctor(void *pgd)
}
#endif /* PTRS_PER_PMD */
-void pgd_dtor(void *pgd)
+static void pgd_dtor(void *pgd)
{
unsigned long flags; /* can be called from interrupt context */
* [PATCH for review] [2/48] i386: fix section mismatch warning in intel_cacheinfo
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [1/48] i386: pgd_{c,d}tor() static Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [3/48] i386: do not restore reserved memory after hibernation Andi Kleen
` (35 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: sam, linux-kernel
From: Sam Ravnborg <sam@ravnborg.org>
Fix the following warning:
WARNING: arch/i386/kernel/built-in.o(.init.text+0x3818): Section mismatch: reference to .exit.text:cache_remove_dev (between 'cacheinfo_cpu_callback' and 'cache_sysfs_init')
It points out that a function marked __cpuinit is calling a function marked
__cpuexit => oops.
The call happens only in an error condition, which may explain why we have
not seen it before.
The offending function is not used anywhere else, so mark it __cpuinit.
Note: This warning triggers only with a local copy of modpost
but that version will soon be pushed out.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/i386/kernel/cpu/intel_cacheinfo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/i386/kernel/cpu/intel_cacheinfo.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/intel_cacheinfo.c
+++ linux/arch/i386/kernel/cpu/intel_cacheinfo.c
@@ -746,7 +746,7 @@ static int __cpuinit cache_add_dev(struc
return retval;
}
-static void __cpuexit cache_remove_dev(struct sys_device * sys_dev)
+static void __cpuinit cache_remove_dev(struct sys_device * sys_dev)
{
unsigned int cpu = sys_dev->id;
unsigned long i;
* [PATCH for review] [3/48] i386: do not restore reserved memory after hibernation
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [1/48] i386: pgd_{c,d}tor() static Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [2/48] i386: fix section mismatch warning in intel_cacheinfo Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [4/48] i386: DMI_MATCH patch in reboot.c for SFF Dell OptiPlex 745 - fixes hang on reboot Andi Kleen
` (34 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: rjw, ak, rientjes, linux-kernel
From: Rafael J. Wysocki <rjw@sisk.pl>
On some systems the ACPI NVS area is located in the first 1 MB of RAM and
it is overwritten by the i386 code during the restore after hibernation.
This confuses the ACPI platform firmware, which then fails to update the
AC adapter status properly
(http://bugzilla.kernel.org/show_bug.cgi?id=7995).
The solution is to register the reserved memory in the first 1 MB as
'nosave', so that swsusp doesn't touch it during the restore. Also, this
has been done on x86_64 for a long time now, so this patch makes the i386
restore code behave like the x86_64 one.
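The region walk described above can be sketched in user space as follows. This is only an illustration, not the kernel code: struct region, struct pfn_range and collect_nosave() are hypothetical names, the page size is assumed to be 4 KB, and the ranges are collected into an array instead of being passed to register_nosave_region(). The "start < pfn" guard is an addition of this sketch to avoid recording empty ranges.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KB pages */
#define PFN_UP(x)   ((unsigned long)(((x) + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT))
#define PFN_DOWN(x) ((unsigned long)((x) >> PAGE_SHIFT))

/* Hypothetical stand-ins for the e820 map entries and types. */
enum region_type { REGION_RAM = 1, REGION_RESERVED = 2 };

struct region {
	unsigned long long addr;
	unsigned long long size;
	enum region_type type;
};

struct pfn_range {
	unsigned long start;	/* first excluded page frame */
	unsigned long end;	/* one past the last excluded page frame */
};

/*
 * Walk a sorted, non-overlapping map whose first entry is RAM and collect
 * every page-frame range that is either a hole between entries or a
 * non-RAM entry. Returns the number of ranges stored in out[].
 */
static size_t collect_nosave(const struct region *map, size_t nr,
			     unsigned long max_low_pfn,
			     struct pfn_range *out, size_t cap)
{
	size_t count = 0;
	unsigned long pfn;
	size_t i;

	assert(map != NULL && out != NULL);
	if (nr == 0)
		return 0;

	/* End of the first (RAM) entry, rounded down to a page frame. */
	pfn = PFN_DOWN(map[0].addr + map[0].size);
	for (i = 1; i < nr; i++) {
		unsigned long start = PFN_UP(map[i].addr);

		/* Hole between the previous entry and this one. */
		if (pfn < start && count < cap)
			out[count++] = (struct pfn_range){ pfn, start };

		pfn = PFN_DOWN(map[i].addr + map[i].size);

		/* The entry itself, if it is not usable RAM. */
		if (map[i].type != REGION_RAM && start < pfn && count < cap)
			out[count++] = (struct pfn_range){ start, pfn };

		/* Only low memory matters here. */
		if (pfn >= max_low_pfn)
			break;
	}
	return count;
}
```

For a made-up map with RAM at 0-0x9f000, a reserved page at 0x9f000 and RAM again from 0x100000, the sketch records the reserved page and the hole up to 1 MB as two separate ranges.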
[akpm@linux-foundation.org: build fix]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@suse.de>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/e820.c | 32 ++++++++++++++++++++++++++++++++
arch/i386/kernel/setup.c | 1 +
include/asm-i386/e820.h | 8 ++++++++
3 files changed, 41 insertions(+)
Index: linux/arch/i386/kernel/e820.c
===================================================================
--- linux.orig/arch/i386/kernel/e820.c
+++ linux/arch/i386/kernel/e820.c
@@ -10,6 +10,7 @@
#include <linux/efi.h>
#include <linux/pfn.h>
#include <linux/uaccess.h>
+#include <linux/suspend.h>
#include <asm/pgtable.h>
#include <asm/page.h>
@@ -320,6 +321,37 @@ static int __init request_standard_resou
subsys_initcall(request_standard_resources);
+#if defined(CONFIG_PM) && defined(CONFIG_SOFTWARE_SUSPEND)
+/**
+ * e820_mark_nosave_regions - Find the ranges of physical addresses that do not
+ * correspond to e820 RAM areas and mark the corresponding pages as nosave for
+ * hibernation.
+ *
+ * This function requires the e820 map to be sorted and without any
+ * overlapping entries and assumes the first e820 area to be RAM.
+ */
+void __init e820_mark_nosave_regions(void)
+{
+ int i;
+ unsigned long pfn;
+
+ pfn = PFN_DOWN(e820.map[0].addr + e820.map[0].size);
+ for (i = 1; i < e820.nr_map; i++) {
+ struct e820entry *ei = &e820.map[i];
+
+ if (pfn < PFN_UP(ei->addr))
+ register_nosave_region(pfn, PFN_UP(ei->addr));
+
+ pfn = PFN_DOWN(ei->addr + ei->size);
+ if (ei->type != E820_RAM)
+ register_nosave_region(PFN_UP(ei->addr), pfn);
+
+ if (pfn >= max_low_pfn)
+ break;
+ }
+}
+#endif
+
void __init add_memory_region(unsigned long long start,
unsigned long long size, int type)
{
Index: linux/arch/i386/kernel/setup.c
===================================================================
--- linux.orig/arch/i386/kernel/setup.c
+++ linux/arch/i386/kernel/setup.c
@@ -640,6 +640,7 @@ void __init setup_arch(char **cmdline_p)
#endif
e820_register_memory();
+ e820_mark_nosave_regions();
#ifdef CONFIG_VT
#if defined(CONFIG_VGA_CONSOLE)
Index: linux/include/asm-i386/e820.h
===================================================================
--- linux.orig/include/asm-i386/e820.h
+++ linux/include/asm-i386/e820.h
@@ -47,6 +47,14 @@ extern void e820_register_memory(void);
extern void limit_regions(unsigned long long size);
extern void print_memory_map(char *who);
+#if defined(CONFIG_PM) && defined(CONFIG_SOFTWARE_SUSPEND)
+extern void e820_mark_nosave_regions(void);
+#else
+static inline void e820_mark_nosave_regions(void)
+{
+}
+#endif
+
#endif/*!__ASSEMBLY__*/
#endif/*__E820_HEADER*/
* [PATCH for review] [4/48] i386: DMI_MATCH patch in reboot.c for SFF Dell OptiPlex 745 - fixes hang on reboot
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (2 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [3/48] i386: do not restore reserved memory after hibernation Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [5/48] i386: HPET, check if the counter works Andi Kleen
` (33 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: James.Jarvis, ak, linux-kernel
From: James Jarvis <James.Jarvis@ed.ac.uk>
The following patch enables reboot through BIOS on the Dell Optiplex 745
Small Form Factor base, on which reboot hangs. The larger form factor does
not require this, hence the match on DMI_BOARD_NAME.
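Why the extra board-name match is needed can be illustrated with a small user-space sketch. It assumes that a quirk entry applies only when every listed field matches, and that each match is a substring comparison. All names below (struct quirk, quirk_applies() and so on) are hypothetical and are not the kernel's DMI API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field slots, loosely modelled on the DMI fields in the patch. */
enum field { SYS_VENDOR, PRODUCT_NAME, BOARD_NAME, FIELD_COUNT };

struct match {
	enum field slot;
	const char *substr;
};

struct quirk {
	const char *ident;
	struct match matches[3];
	size_t nmatches;
};

/* A quirk applies only if every listed field contains its substring. */
static bool quirk_applies(const struct quirk *q, const char *const *machine)
{
	size_t i;

	assert(q != NULL && machine != NULL);
	for (i = 0; i < q->nmatches; i++) {
		const char *value = machine[q->matches[i].slot];

		if (value == NULL || strstr(value, q->matches[i].substr) == NULL)
			return false;
	}
	return true;
}
```

With only the vendor and product name listed, both form factors would match; adding the board name restricts the quirk to the one board that hangs.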
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/i386/kernel/reboot.c | 9 +++++++++
1 file changed, 9 insertions(+)
Index: linux/arch/i386/kernel/reboot.c
===================================================================
--- linux.orig/arch/i386/kernel/reboot.c
+++ linux/arch/i386/kernel/reboot.c
@@ -113,6 +113,15 @@ static struct dmi_system_id __initdata r
DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge 300/"),
},
},
+ { /* Handle problems with rebooting on Dell Optiplex 745's SFF*/
+ .callback = set_bios_reboot,
+ .ident = "Dell OptiPlex 745",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "OptiPlex 745"),
+ DMI_MATCH(DMI_BOARD_NAME, "0WF810"),
+ },
+ },
{ /* Handle problems with rebooting on Dell 2400's */
.callback = set_bios_reboot,
.ident = "Dell PowerEdge 2400",
* [PATCH for review] [5/48] i386: HPET, check if the counter works
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (3 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [4/48] i386: DMI_MATCH patch in reboot.c for SFF Dell OptiPlex 745 - fixes hang on reboot Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [6/48] x86: trim memory not covered by WB MTRRs Andi Kleen
` (32 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: tglx, ak, johnstul, stable, linux-kernel
From: Thomas Gleixner <tglx@linutronix.de>
Some systems have an HPET which is not incrementing, which leads to a
complete hang. Detect it during HPET setup.
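The detection idea (read the counter, busy-wait against an independent cycle source, read the counter again) can be sketched outside the kernel like this. The callbacks and struct fake_hw are hypothetical and only simulate hardware; the real patch uses read_hpet() and rdtscll().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*read_counter_fn)(void *ctx);
typedef uint64_t (*read_cycles_fn)(void *ctx);

/*
 * Return true if the counter changed while we spun for at least
 * spin_cycles ticks of the independent cycle source.
 */
static bool counter_is_ticking(read_counter_fn read_counter,
			       read_cycles_fn read_cycles,
			       void *ctx, uint64_t spin_cycles)
{
	uint32_t first;
	uint64_t start;

	assert(read_counter != NULL && read_cycles != NULL);
	first = read_counter(ctx);
	start = read_cycles(ctx);

	while (read_cycles(ctx) - start < spin_cycles)
		;	/* busy-wait; the kernel uses rep_nop() here */

	return read_counter(ctx) != first;
}

/* Simulated hardware, purely for illustration. */
struct fake_hw {
	uint64_t cycles;
	uint32_t counter;
	bool counter_stuck;
};

static uint64_t fake_cycles(void *ctx)
{
	struct fake_hw *hw = ctx;

	hw->cycles += 1000;		/* time passes on every read */
	if (!hw->counter_stuck)
		hw->counter++;		/* a healthy counter advances too */
	return hw->cycles;
}

static uint32_t fake_counter(void *ctx)
{
	struct fake_hw *hw = ctx;

	return hw->counter;
}
```

The wait is expressed in cycles of the second clock rather than in time because, as in the patch, the frequency of that clock may not be known yet; 200000 cycles is 50us at 4 GHz and 200us at 1 GHz.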
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/hpet.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
Index: linux/arch/i386/kernel/hpet.c
===================================================================
--- linux.orig/arch/i386/kernel/hpet.c
+++ linux/arch/i386/kernel/hpet.c
@@ -226,7 +226,8 @@ int __init hpet_enable(void)
{
unsigned long id;
uint64_t hpet_freq;
- u64 tmp;
+ u64 tmp, start, now;
+ cycle_t t1;
if (!is_hpet_capable())
return 0;
@@ -273,6 +274,27 @@ int __init hpet_enable(void)
/* Start the counter */
hpet_start_counter();
+ /* Verify whether hpet counter works */
+ t1 = read_hpet();
+ rdtscll(start);
+
+ /*
+ * We don't know the TSC frequency yet, but waiting for
+ * 200000 TSC cycles is safe:
+ * 4 GHz == 50us
+ * 1 GHz == 200us
+ */
+ do {
+ rep_nop();
+ rdtscll(now);
+ } while ((now - start) < 200000UL);
+
+ if (t1 == read_hpet()) {
+ printk(KERN_WARNING
+ "HPET counter not counting. HPET disabled\n");
+ goto out_nohpet;
+ }
+
/* Initialize and register HPET clocksource
*
* hpet period is in femto seconds per cycle
* [PATCH for review] [6/48] x86: trim memory not covered by WB MTRRs
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (4 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [5/48] i386: HPET, check if the counter works Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G Andi Kleen
` (31 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: jesse.barnes, andi, ebiederm, yhlu.kernel, linux-kernel
From: Jesse Barnes <jesse.barnes@intel.com>
On some machines, buggy BIOSes don't properly set up WB MTRRs to cover all
available RAM, meaning the last few megs (or even gigs) of memory will be
marked uncached. Since Linux tends to allocate from high memory addresses
first, this causes the machine to be unusably slow as soon as the kernel
starts really using memory (i.e. right around init time).
This patch works around the problem by scanning the MTRRs at boot and
figuring out whether the current end_pfn value (set up by early e820 code)
goes beyond the highest WB MTRR range, and if so, trimming it to match. A
fairly obnoxious KERN_WARNING is printed too, letting the user know that
not all of their memory is available due to a likely BIOS bug.
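The trimming rule can be sketched as a pure function. This is an illustration under assumptions: struct var_range and trimmed_end_pfn() are hypothetical names, the ranges are given in page frames rather than bytes, and, like the patch, the sketch does not special-case a machine with no write-back range at all.

```c
#include <assert.h>
#include <stddef.h>

/* Memory type encodings as used by the MTRR registers. */
enum { TYPE_UNCACHABLE = 0, TYPE_WRBACK = 6 };

/* Hypothetical view of one variable-range MTRR, in page frames. */
struct var_range {
	unsigned long long base_pfn;
	unsigned long long size_pfn;
	int type;
};

/*
 * Find the end of the highest write-back range and return the smaller of
 * that value and the current end_pfn.
 */
static unsigned long long trimmed_end_pfn(const struct var_range *ranges,
					  size_t nr,
					  unsigned long long end_pfn)
{
	unsigned long long highest_pfn = 0;
	size_t i;

	assert(ranges != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		unsigned long long top;

		if (ranges[i].type != TYPE_WRBACK)
			continue;
		top = ranges[i].base_pfn + ranges[i].size_pfn;
		if (highest_pfn < top)
			highest_pfn = top;
	}

	/* Memory above the highest write-back range would run uncached. */
	return highest_pfn < end_pfn ? highest_pfn : end_pfn;
}
```

If the write-back ranges end at 3 GB while e820 reports RAM up to 3.25 GB, the function returns the 3 GB page frame and the remaining pages are dropped; if end_pfn is already below the highest write-back address, it is returned unchanged.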
Something similar could be done on i386 if needed, but the boot ordering
would be slightly different, since the MTRR code on i386 depends on the
boot_cpu_data structure being set up.
This patch fixes a bug in the last patch that caused the code to run on
non-Intel machines (AMD machines apparently don't need it and it's untested
on other non-Intel machines, so best keep it off).
Signed-off-by: Jesse Barnes <jesse.barnes@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Tested-by: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/kernel-parameters.txt | 6 ++
arch/i386/kernel/cpu/mtrr/generic.c | 8 ---
arch/i386/kernel/cpu/mtrr/if.c | 8 ---
arch/i386/kernel/cpu/mtrr/main.c | 85 +++++++++++++++++++++++++++---------
arch/i386/kernel/cpu/mtrr/mtrr.h | 3 +
arch/x86_64/kernel/bugs.c | 1
arch/x86_64/kernel/setup.c | 4 +
include/asm-x86_64/mtrr.h | 5 +-
8 files changed, 84 insertions(+), 36 deletions(-)
Index: linux/Documentation/kernel-parameters.txt
===================================================================
--- linux.orig/Documentation/kernel-parameters.txt
+++ linux/Documentation/kernel-parameters.txt
@@ -537,6 +537,12 @@ and is between 256 and 4096 characters.
See drivers/char/README.epca and
Documentation/digiepca.txt.
+ disable_mtrr_trim [X86-64, Intel only]
+ By default the kernel will trim any uncacheable
+ memory out of your available memory pool based on
+ MTRR settings. This parameter disables that behavior,
+ possibly causing your machine to run very slowly.
+
dmascc= [HW,AX25,SERIAL] AX.25 Z80SCC driver with DMA
support available.
Format: <io_dev0>[,<io_dev1>[,..<io_dev32>]]
Index: linux/arch/i386/kernel/cpu/mtrr/generic.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mtrr/generic.c
+++ linux/arch/i386/kernel/cpu/mtrr/generic.c
@@ -13,7 +13,7 @@
#include "mtrr.h"
struct mtrr_state {
- struct mtrr_var_range *var_ranges;
+ struct mtrr_var_range var_ranges[MAX_VAR_RANGES];
mtrr_type fixed_ranges[NUM_FIXED_RANGES];
unsigned char enabled;
unsigned char have_fixed;
@@ -85,12 +85,6 @@ void __init get_mtrr_state(void)
struct mtrr_var_range *vrs;
unsigned lo, dummy;
- if (!mtrr_state.var_ranges) {
- mtrr_state.var_ranges = kmalloc(num_var_ranges * sizeof (struct mtrr_var_range),
- GFP_KERNEL);
- if (!mtrr_state.var_ranges)
- return;
- }
vrs = mtrr_state.var_ranges;
rdmsr(MTRRcap_MSR, lo, dummy);
Index: linux/arch/i386/kernel/cpu/mtrr/if.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mtrr/if.c
+++ linux/arch/i386/kernel/cpu/mtrr/if.c
@@ -11,10 +11,6 @@
#include <asm/mtrr.h>
#include "mtrr.h"
-/* RED-PEN: this is accessed without any locking */
-extern unsigned int *usage_table;
-
-
#define FILE_FCOUNT(f) (((struct seq_file *)((f)->private_data))->private)
static const char *const mtrr_strings[MTRR_NUM_TYPES] =
@@ -396,7 +392,7 @@ static int mtrr_seq_show(struct seq_file
for (i = 0; i < max; i++) {
mtrr_if->get(i, &base, &size, &type);
if (size == 0)
- usage_table[i] = 0;
+ mtrr_usage_table[i] = 0;
else {
if (size < (0x100000 >> PAGE_SHIFT)) {
/* less than 1MB */
@@ -410,7 +406,7 @@ static int mtrr_seq_show(struct seq_file
len += seq_printf(seq,
"reg%02i: base=0x%05lx000 (%4luMB), size=%4lu%cB: %s, count=%d\n",
i, base, base >> (20 - PAGE_SHIFT), size, factor,
- mtrr_attrib_to_str(type), usage_table[i]);
+ mtrr_attrib_to_str(type), mtrr_usage_table[i]);
}
}
return 0;
Index: linux/arch/i386/kernel/cpu/mtrr/main.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mtrr/main.c
+++ linux/arch/i386/kernel/cpu/mtrr/main.c
@@ -38,8 +38,8 @@
#include <linux/cpu.h>
#include <linux/mutex.h>
+#include <asm/e820.h>
#include <asm/mtrr.h>
-
#include <asm/uaccess.h>
#include <asm/processor.h>
#include <asm/msr.h>
@@ -47,7 +47,7 @@
u32 num_var_ranges = 0;
-unsigned int *usage_table;
+unsigned int mtrr_usage_table[MAX_VAR_RANGES];
static DEFINE_MUTEX(mtrr_mutex);
u64 size_or_mask, size_and_mask;
@@ -121,13 +121,8 @@ static void __init init_table(void)
int i, max;
max = num_var_ranges;
- if ((usage_table = kmalloc(max * sizeof *usage_table, GFP_KERNEL))
- == NULL) {
- printk(KERN_ERR "mtrr: could not allocate\n");
- return;
- }
for (i = 0; i < max; i++)
- usage_table[i] = 1;
+ mtrr_usage_table[i] = 1;
}
struct set_mtrr_data {
@@ -385,7 +380,7 @@ int mtrr_add_page(unsigned long base, un
goto out;
}
if (increment)
- ++usage_table[i];
+ ++mtrr_usage_table[i];
error = i;
goto out;
}
@@ -394,12 +389,13 @@ int mtrr_add_page(unsigned long base, un
if (i >= 0) {
set_mtrr(i, base, size, type);
if (likely(replace < 0))
- usage_table[i] = 1;
+ mtrr_usage_table[i] = 1;
else {
- usage_table[i] = usage_table[replace] + !!increment;
+ mtrr_usage_table[i] = mtrr_usage_table[replace] +
+ !!increment;
if (unlikely(replace != i)) {
set_mtrr(replace, 0, 0, 0);
- usage_table[replace] = 0;
+ mtrr_usage_table[replace] = 0;
}
}
} else
@@ -529,11 +525,11 @@ int mtrr_del_page(int reg, unsigned long
printk(KERN_WARNING "mtrr: MTRR %d not used\n", reg);
goto out;
}
- if (usage_table[reg] < 1) {
+ if (mtrr_usage_table[reg] < 1) {
printk(KERN_WARNING "mtrr: reg: %d has count=0\n", reg);
goto out;
}
- if (--usage_table[reg] < 1)
+ if (--mtrr_usage_table[reg] < 1)
set_mtrr(reg, 0, 0, 0);
error = reg;
out:
@@ -593,16 +589,11 @@ struct mtrr_value {
unsigned long lsize;
};
-static struct mtrr_value * mtrr_state;
+static struct mtrr_value mtrr_state[MAX_VAR_RANGES];
static int mtrr_save(struct sys_device * sysdev, pm_message_t state)
{
int i;
- int size = num_var_ranges * sizeof(struct mtrr_value);
-
- mtrr_state = kzalloc(size,GFP_ATOMIC);
- if (!mtrr_state)
- return -ENOMEM;
for (i = 0; i < num_var_ranges; i++) {
mtrr_if->get(i,
@@ -624,7 +615,6 @@ static int mtrr_restore(struct sys_devic
mtrr_state[i].lsize,
mtrr_state[i].ltype);
}
- kfree(mtrr_state);
return 0;
}
@@ -635,6 +625,59 @@ static struct sysdev_driver mtrr_sysdev_
.resume = mtrr_restore,
};
+static int disable_mtrr_trim;
+
+static int __init disable_mtrr_trim_setup(char *str)
+{
+ disable_mtrr_trim = 1;
+ return 0;
+}
+early_param("disable_mtrr_trim", disable_mtrr_trim_setup);
+
+#ifdef CONFIG_X86_64
+/**
+ * mtrr_trim_uncached_memory - trim RAM not covered by MTRRs
+ *
+ * Some buggy BIOSes don't setup the MTRRs properly for systems with certain
+ * memory configurations. This routine checks to make sure the MTRRs having
+ * a write back type cover all of the memory the kernel is intending to use.
+ * If not, it'll trim any memory off the end by adjusting end_pfn, removing
+ * it from the kernel's allocation pools, warning the user with an obnoxious
+ * message.
+ */
+void __init mtrr_trim_uncached_memory(void)
+{
+ unsigned long i, base, size, highest_addr = 0, def, dummy;
+ mtrr_type type;
+
+ /* Make sure we only trim uncachable memory on Intel machines */
+ rdmsr(MTRRdefType_MSR, def, dummy);
+ def &= 0xff;
+ if (!is_cpu(INTEL) || disable_mtrr_trim || def != MTRR_TYPE_UNCACHABLE)
+ return;
+
+ /* Find highest cached pfn */
+ for (i = 0; i < num_var_ranges; i++) {
+ mtrr_if->get(i, &base, &size, &type);
+ if (type != MTRR_TYPE_WRBACK)
+ continue;
+ base <<= PAGE_SHIFT;
+ size <<= PAGE_SHIFT;
+ if (highest_addr < base + size)
+ highest_addr = base + size;
+ }
+
+ if ((highest_addr >> PAGE_SHIFT) < end_pfn) {
+ printk(KERN_WARNING "***************\n");
+ printk(KERN_WARNING "**** WARNING: likely BIOS bug\n");
+ printk(KERN_WARNING "**** MTRRs don't cover all of "
+ "memory, trimmed %ld pages\n", end_pfn -
+ (highest_addr >> PAGE_SHIFT));
+ printk(KERN_WARNING "***************\n");
+ end_pfn = highest_addr >> PAGE_SHIFT;
+ }
+}
+#endif
/**
* mtrr_bp_init - initialize mtrrs on the boot CPU
Index: linux/arch/i386/kernel/cpu/mtrr/mtrr.h
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mtrr/mtrr.h
+++ linux/arch/i386/kernel/cpu/mtrr/mtrr.h
@@ -14,6 +14,7 @@
#define MTRRphysMask_MSR(reg) (0x200 + 2 * (reg) + 1)
#define NUM_FIXED_RANGES 88
+#define MAX_VAR_RANGES 256
#define MTRRfix64K_00000_MSR 0x250
#define MTRRfix16K_80000_MSR 0x258
#define MTRRfix16K_A0000_MSR 0x259
@@ -34,6 +35,8 @@
an 8 bit field: */
typedef u8 mtrr_type;
+extern unsigned int mtrr_usage_table[MAX_VAR_RANGES];
+
struct mtrr_ops {
u32 vendor;
u32 use_intel_if;
Index: linux/arch/x86_64/kernel/bugs.c
===================================================================
--- linux.orig/arch/x86_64/kernel/bugs.c
+++ linux/arch/x86_64/kernel/bugs.c
@@ -14,7 +14,6 @@
void __init check_bugs(void)
{
identify_cpu(&boot_cpu_data);
- mtrr_bp_init();
#if !defined(CONFIG_SMP)
printk("CPU: ");
print_cpu_info(&boot_cpu_data);
Index: linux/arch/x86_64/kernel/setup.c
===================================================================
--- linux.orig/arch/x86_64/kernel/setup.c
+++ linux/arch/x86_64/kernel/setup.c
@@ -266,6 +266,10 @@ void __init setup_arch(char **cmdline_p)
* we are rounding upwards:
*/
end_pfn = e820_end_of_ram();
+ /* Trim memory not covered by WB MTRRs */
+ mtrr_bp_init();
+ mtrr_trim_uncached_memory();
+
num_physpages = end_pfn;
check_efer();
Index: linux/include/asm-x86_64/mtrr.h
===================================================================
--- linux.orig/include/asm-x86_64/mtrr.h
+++ linux/include/asm-x86_64/mtrr.h
@@ -78,6 +78,7 @@ extern int mtrr_add_page (unsigned long
unsigned int type, char increment);
extern int mtrr_del (int reg, unsigned long base, unsigned long size);
extern int mtrr_del_page (int reg, unsigned long base, unsigned long size);
+extern void mtrr_trim_uncached_memory(void);
# else
static __inline__ int mtrr_add (unsigned long base, unsigned long size,
unsigned int type, char increment)
@@ -99,7 +100,9 @@ static __inline__ int mtrr_del_page (int
{
return -ENODEV;
}
-
+static __inline__ void mtrr_trim_uncached_memory(void)
+{
+}
#endif /* CONFIG_MTRR */
#ifdef CONFIG_COMPAT
* [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (5 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [6/48] x86: trim memory not covered by WB MTRRs Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 14:46 ` Dave Jones
2007-07-19 14:52 ` Christoph Hellwig
2007-07-19 13:48 ` [PATCH for review] [8/48] i386: Remove unneeded test of 'task' in dump_trace() Andi Kleen
` (30 subsequent siblings)
37 siblings, 2 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: wli, lkml, ak, linux-kernel
From: William Lee Irwin III <wli@holomorphy.com>
PAE is useful for more than just addressing over 4GB of RAM. It supports
expanded swapspace and NX executable protections. Some users may want NX
or expanded swapspace support without the overhead or instability of
highmem. For these reasons, the following patch divorces CONFIG_X86_PAE
from CONFIG_HIGHMEM64G.
Cc: Mark Lord <lkml@rtr.ca>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/i386/Kconfig | 16 +++++++++++-----
arch/i386/kernel/setup.c | 8 ++++----
2 files changed, 15 insertions(+), 9 deletions(-)
Index: linux/arch/i386/Kconfig
===================================================================
--- linux.orig/arch/i386/Kconfig
+++ linux/arch/i386/Kconfig
@@ -544,6 +544,7 @@ config HIGHMEM4G
config HIGHMEM64G
bool "64GB"
depends on !M386 && !M486
+ select X86_PAE
help
Select this if you have a 32-bit processor and more than 4
gigabytes of physical RAM.
@@ -573,12 +574,12 @@ choice
config VMSPLIT_3G
bool "3G/1G user/kernel split"
config VMSPLIT_3G_OPT
- depends on !HIGHMEM
+ depends on !X86_PAE
bool "3G/1G user/kernel split (for full 1G low memory)"
config VMSPLIT_2G
bool "2G/2G user/kernel split"
config VMSPLIT_2G_OPT
- depends on !HIGHMEM
+ depends on !X86_PAE
bool "2G/2G user/kernel split (for full 2G low memory)"
config VMSPLIT_1G
bool "1G/3G user/kernel split"
@@ -598,10 +599,15 @@ config HIGHMEM
default y
config X86_PAE
- bool
- depends on HIGHMEM64G
- default y
+ bool "PAE (Physical Address Extension) Support"
+ default n
+ depends on !HIGHMEM4G
select RESOURCES_64BIT
+ help
+ PAE is required for NX support, and furthermore enables
+ larger swapspace support for non-overcommit purposes. It
+ has the cost of more pagetable lookup overhead, and also
+ consumes more pagetable space per process.
# Common NUMA Features
config NUMA
Index: linux/arch/i386/kernel/setup.c
===================================================================
--- linux.orig/arch/i386/kernel/setup.c
+++ linux/arch/i386/kernel/setup.c
@@ -273,18 +273,18 @@ unsigned long __init find_max_low_pfn(vo
printk(KERN_WARNING "Warning only %ldMB will be used.\n",
MAXMEM>>20);
if (max_pfn > MAX_NONPAE_PFN)
- printk(KERN_WARNING "Use a PAE enabled kernel.\n");
+ printk(KERN_WARNING "Use a HIGHMEM64G enabled kernel.\n");
else
printk(KERN_WARNING "Use a HIGHMEM enabled kernel.\n");
max_pfn = MAXMEM_PFN;
#else /* !CONFIG_HIGHMEM */
-#ifndef CONFIG_X86_PAE
+#ifndef CONFIG_HIGHMEM64G
if (max_pfn > MAX_NONPAE_PFN) {
max_pfn = MAX_NONPAE_PFN;
printk(KERN_WARNING "Warning only 4GB will be used.\n");
- printk(KERN_WARNING "Use a PAE enabled kernel.\n");
+ printk(KERN_WARNING "Use a HIGHMEM64G enabled kernel.\n");
}
-#endif /* !CONFIG_X86_PAE */
+#endif /* !CONFIG_HIGHMEM64G */
#endif /* !CONFIG_HIGHMEM */
} else {
if (highmem_pages == -1)
* [PATCH for review] [8/48] i386: Remove unneeded test of 'task' in dump_trace()
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (6 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [9/48] i386: move the kernel to 16MB for NUMA-Q Andi Kleen
` (29 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: jesper.juhl, linux-kernel
From: Jesper Juhl <jesper.juhl@gmail.com>
Remove unneeded test of task != NULL from
arch/i386/kernel/traps.c::dump_trace()
At the start of the function we have this test:
if (!task)
task = current;
so further down there's no need to test 'task'.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/i386/kernel/traps.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/i386/kernel/traps.c
===================================================================
--- linux.orig/arch/i386/kernel/traps.c
+++ linux/arch/i386/kernel/traps.c
@@ -148,7 +148,7 @@ void dump_trace(struct task_struct *task
if (!stack) {
unsigned long dummy;
stack = &dummy;
- if (task && task != current)
+ if (task != current)
stack = (unsigned long *)task->thread.esp;
}
* [PATCH for review] [9/48] i386: move the kernel to 16MB for NUMA-Q
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (7 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [8/48] i386: Remove unneeded test of 'task' in dump_trace() Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [10/48] x86_64: Move functions declarations to header file Andi Kleen
` (28 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: apw, linux-kernel
From: Andy Whitcroft <apw@shadowen.org>
We are seeing corruption of the decompressed kernel. It is suspected that
this is platform specific as it has yet to be seen on any other x86. Move
the kernel to the 16MB boundary.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/i386/Kconfig | 1 +
1 file changed, 1 insertion(+)
Index: linux/arch/i386/Kconfig
===================================================================
--- linux.orig/arch/i386/Kconfig
+++ linux/arch/i386/Kconfig
@@ -823,6 +823,7 @@ config CRASH_DUMP
config PHYSICAL_START
hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP)
+ default "0x1000000" if X86_NUMAQ
default "0x100000"
help
This gives the physical address where the kernel is loaded.
* [PATCH for review] [10/48] x86_64: Move functions declarations to header file
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (8 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [9/48] i386: move the kernel to 16MB for NUMA-Q Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [11/48] x86_64: During VM oom condition, kill all threads in process group Andi Kleen
` (27 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: gcosta, ak, linux-kernel
From: Glauber de Oliveira Costa <gcosta@redhat.com>
Some interrupt entry points are currently declared in i8259.c. They probably
belong in a header. Right now, their only user is init_IRQ, justifying
their declaration in-file. But when virtualization comes in, we may be
interested in using those functions in late initializations.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86_64/kernel/i8259.c | 18 ------------------
include/asm-x86_64/hw_irq.h | 20 ++++++++++++++++++++
2 files changed, 20 insertions(+), 18 deletions(-)
Index: linux/arch/x86_64/kernel/i8259.c
===================================================================
--- linux.orig/arch/x86_64/kernel/i8259.c
+++ linux/arch/x86_64/kernel/i8259.c
@@ -444,24 +444,6 @@ void __init init_ISA_irqs (void)
}
}
-void apic_timer_interrupt(void);
-void spurious_interrupt(void);
-void error_interrupt(void);
-void reschedule_interrupt(void);
-void call_function_interrupt(void);
-void irq_move_cleanup_interrupt(void);
-void invalidate_interrupt0(void);
-void invalidate_interrupt1(void);
-void invalidate_interrupt2(void);
-void invalidate_interrupt3(void);
-void invalidate_interrupt4(void);
-void invalidate_interrupt5(void);
-void invalidate_interrupt6(void);
-void invalidate_interrupt7(void);
-void thermal_interrupt(void);
-void threshold_interrupt(void);
-void i8254_timer_resume(void);
-
static void setup_timer_hardware(void)
{
outb_p(0x34,0x43); /* binary, mode 2, LSB/MSB, ch 0 */
Index: linux/include/asm-x86_64/hw_irq.h
===================================================================
--- linux.orig/include/asm-x86_64/hw_irq.h
+++ linux/include/asm-x86_64/hw_irq.h
@@ -95,6 +95,26 @@
#ifndef __ASSEMBLY__
+
+/* Interrupt handlers registered during init_IRQ */
+void apic_timer_interrupt(void);
+void spurious_interrupt(void);
+void error_interrupt(void);
+void reschedule_interrupt(void);
+void call_function_interrupt(void);
+void irq_move_cleanup_interrupt(void);
+void invalidate_interrupt0(void);
+void invalidate_interrupt1(void);
+void invalidate_interrupt2(void);
+void invalidate_interrupt3(void);
+void invalidate_interrupt4(void);
+void invalidate_interrupt5(void);
+void invalidate_interrupt6(void);
+void invalidate_interrupt7(void);
+void thermal_interrupt(void);
+void threshold_interrupt(void);
+void i8254_timer_resume(void);
+
typedef int vector_irq_t[NR_VECTORS];
DECLARE_PER_CPU(vector_irq_t, vector_irq);
extern void __setup_vector_irq(int cpu);
* [PATCH for review] [11/48] x86_64: During VM oom condition, kill all threads in process group
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (9 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [10/48] x86_64: Move functions declarations to header file Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 14:04 ` Christoph Hellwig
2007-07-19 13:48 ` [PATCH for review] [12/48] x86_64: use the global PIT lock Andi Kleen
` (26 subsequent siblings)
37 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: will_schmidt, ak, linux-kernel
From: Will Schmidt <will_schmidt@vnet.ibm.com>
During a VM oom condition, kill all threads in the process group.
We have had complaints where a threaded application is left in a bad state
after one of its threads is killed when we hit a VM: out_of_memory condition.
Killing just one of the process threads can leave the application in a bad
state, whereas killing the entire process group would allow the
application to restart, or otherwise be handled, and makes it very obvious that
something has gone wrong.
This change allows the entire process group to be taken down, rather than just
the one thread.
Signed-off-by: Will <will_schmidt@vnet.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86_64/mm/fault.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/x86_64/mm/fault.c
===================================================================
--- linux.orig/arch/x86_64/mm/fault.c
+++ linux/arch/x86_64/mm/fault.c
@@ -569,7 +569,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", tsk->comm);
if (error_code & 4)
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
* [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (10 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [11/48] x86_64: During VM oom condition, kill all threads in process group Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 15:22 ` Dmitry Torokhov
2007-07-19 13:48 ` [PATCH for review] [13/48] x86_64: fix typo in acpi_pm.c Andi Kleen
` (25 subsequent siblings)
37 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: tglx, mingo, dtor, linux-kernel
From: Thomas Gleixner <tglx@linutronix.de>
Replace the pcspkr private PIT lock with the global PIT lock so that all
PIT accesses are serialized.
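To illustrate why one shared lock matters, here is a minimal user-space
sketch. It is not the driver code: struct fake_pit, pit_lock(), pit_unlock(),
beep_count() and pit_program_channel2() are hypothetical names, the atomic_flag
stands in for the kernel spinlock, and both the 1193182 Hz tick rate and the
0xB6 command byte are assumptions on my part.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed PIT input clock in Hz; the kernel takes this from its own headers. */
#define PIT_TICK_RATE 1193182u

/* Stand-in for the kernel's global i8253_lock (a spinlock_t there). */
static atomic_flag i8253_lock = ATOMIC_FLAG_INIT;

/* Hypothetical model of the registers a real driver reaches via outb(). */
struct fake_pit {
	uint8_t command;
	uint8_t lsb;
	uint8_t msb;
};
static struct fake_pit pit;

static void pit_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&i8253_lock,
						 memory_order_acquire))
		;	/* spin until the current holder releases the lock */
}

static void pit_unlock(void)
{
	atomic_flag_clear_explicit(&i8253_lock, memory_order_release);
}

/* Same range check and division as the pcspkr_event() hunk in this patch. */
static unsigned int beep_count(unsigned int hz)
{
	if (hz > 20 && hz < 32767)
		return PIT_TICK_RATE / hz;
	return 0;
}

/* The command byte and both data bytes must not interleave with other users. */
static void pit_program_channel2(unsigned int count)
{
	pit_lock();
	pit.command = 0xB6;		/* assumed: channel 2, LSB/MSB, mode 3 */
	pit.lsb = count & 0xff;
	pit.msb = (count >> 8) & 0xff;
	pit_unlock();
}
```

Any other code that touches the same chip would take pit_lock() as well. With
a private lock per driver, two writers could each hold "their" lock and still
interleave their byte sequences.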
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86_64/kernel/time.c | 2 ++
drivers/input/misc/pcspkr.c | 11 ++++++++---
include/asm-x86_64/i8253.h | 6 ++++++
3 files changed, 16 insertions(+), 3 deletions(-)
Index: linux/arch/x86_64/kernel/time.c
===================================================================
--- linux.orig/arch/x86_64/kernel/time.c
+++ linux/arch/x86_64/kernel/time.c
@@ -33,6 +33,7 @@
#include <acpi/acpi_bus.h>
#endif
#include <asm/8253pit.h>
+#include <asm/i8253.h>
#include <asm/pgtable.h>
#include <asm/vsyscall.h>
#include <asm/timex.h>
@@ -51,6 +52,7 @@ static char *timename = NULL;
DEFINE_SPINLOCK(rtc_lock);
EXPORT_SYMBOL(rtc_lock);
DEFINE_SPINLOCK(i8253_lock);
+EXPORT_SYMBOL(i8253_lock);
volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
Index: linux/drivers/input/misc/pcspkr.c
===================================================================
--- linux.orig/drivers/input/misc/pcspkr.c
+++ linux/drivers/input/misc/pcspkr.c
@@ -24,7 +24,12 @@ MODULE_AUTHOR("Vojtech Pavlik <vojtech@u
MODULE_DESCRIPTION("PC Speaker beeper driver");
MODULE_LICENSE("GPL");
-static DEFINE_SPINLOCK(i8253_beep_lock);
+#ifdef CONFIG_X86
+/* Use the global PIT lock ! */
+#include <asm/i8253.h>
+#else
+static DEFINE_SPINLOCK(i8253_lock);
+#endif
static int pcspkr_event(struct input_dev *dev, unsigned int type, unsigned int code, int value)
{
@@ -43,7 +48,7 @@ static int pcspkr_event(struct input_dev
if (value > 20 && value < 32767)
count = PIT_TICK_RATE / value;
- spin_lock_irqsave(&i8253_beep_lock, flags);
+ spin_lock_irqsave(&i8253_lock, flags);
if (count) {
/* enable counter 2 */
@@ -58,7 +63,7 @@ static int pcspkr_event(struct input_dev
outb(inb_p(0x61) & 0xFC, 0x61);
}
- spin_unlock_irqrestore(&i8253_beep_lock, flags);
+ spin_unlock_irqrestore(&i8253_lock, flags);
return 0;
}
Index: linux/include/asm-x86_64/i8253.h
===================================================================
--- /dev/null
+++ linux/include/asm-x86_64/i8253.h
@@ -0,0 +1,6 @@
+#ifndef __ASM_I8253_H__
+#define __ASM_I8253_H__
+
+extern spinlock_t i8253_lock;
+
+#endif /* __ASM_I8253_H__ */
* [PATCH for review] [13/48] x86_64: fix typo in acpi_pm.c
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (11 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [12/48] x86_64: use the global PIT lock Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [14/48] x86_64: lower printk severity Andi Kleen
` (24 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: abogani, johnstul, linux-kernel
From: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/clocksource/acpi_pm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/drivers/clocksource/acpi_pm.c
===================================================================
--- linux.orig/drivers/clocksource/acpi_pm.c
+++ linux/drivers/clocksource/acpi_pm.c
@@ -71,7 +71,7 @@ static struct clocksource clocksource_ac
.rating = 200,
.read = acpi_pm_read,
.mask = (cycle_t)ACPI_PM_MASK,
- .mult = 0, /*to be caluclated*/
+ .mult = 0, /*to be calculated*/
.shift = 22,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
* [PATCH for review] [14/48] x86_64: lower printk severity
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (12 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [13/48] x86_64: fix typo in acpi_pm.c Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [15/48] x86_64: fix wrong comment regarding set_fixmap() Andi Kleen
` (23 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: da-x, linux-kernel
From: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86_64/kernel/process.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/x86_64/kernel/process.c
===================================================================
--- linux.orig/arch/x86_64/kernel/process.c
+++ linux/arch/x86_64/kernel/process.c
@@ -279,7 +279,7 @@ void __cpuinit select_idle_routine(const
*/
if (!pm_idle) {
if (!printed) {
- printk("using mwait in idle threads.\n");
+ printk(KERN_INFO "using mwait in idle threads.\n");
printed = 1;
}
pm_idle = mwait_idle;
* [PATCH for review] [15/48] x86_64: fix wrong comment regarding set_fixmap()
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (13 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [14/48] x86_64: lower printk severity Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [16/48] x86_64: Geode HW Random Number Generator depends on X86_32 Andi Kleen
` (22 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: jkosina, ak, linux-kernel
From: Jiri Kosina <jkosina@suse.cz>
The function name is set_fixmap(), not fixmap_set() as stated in the comment.
Also fix a typo, punctuation and lower/uppercase a bit.
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
include/asm-x86_64/fixmap.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Index: linux/include/asm-x86_64/fixmap.h
===================================================================
--- linux.orig/include/asm-x86_64/fixmap.h
+++ linux/include/asm-x86_64/fixmap.h
@@ -22,9 +22,9 @@
* compile time, but to set the physical address only
* in the boot process.
*
- * these 'compile-time allocated' memory buffers are
- * fixed-size 4k pages. (or larger if used with an increment
- * highger than 1) use fixmap_set(idx,phys) to associate
+ * These 'compile-time allocated' memory buffers are
+ * fixed-size 4k pages (or larger if used with an increment
+ * higher than 1). Use set_fixmap(idx,phys) to associate
* physical memory with fixmap indices.
*
* TLB entries of such buffers will not be flushed across
* [PATCH for review] [16/48] x86_64: Geode HW Random Number Generator depends on X86_32
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (14 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [15/48] x86_64: fix wrong comment regarding set_fixmap() Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [17/48] x86_64: change _map_single to static in pci_gart.c etc Andi Kleen
` (21 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: Yinghai.Lu, linux-kernel
From: Yinghai Lu <Yinghai.Lu@Sun.COM>
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/char/hw_random/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/drivers/char/hw_random/Kconfig
===================================================================
--- linux.orig/drivers/char/hw_random/Kconfig
+++ linux/drivers/char/hw_random/Kconfig
@@ -41,7 +41,7 @@ config HW_RANDOM_AMD
config HW_RANDOM_GEODE
tristate "AMD Geode HW Random Number Generator support"
- depends on HW_RANDOM && X86 && PCI
+ depends on HW_RANDOM && X86_32 && PCI
default HW_RANDOM
---help---
This driver provides kernel-side support for the Random Number
* [PATCH for review] [17/48] x86_64: change _map_single to static in pci_gart.c etc
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (15 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [16/48] x86_64: Geode HW Random Number Generator depends on X86_32 Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [18/48] x86_64: flush_tlb_kernel_range() warning fix Andi Kleen
` (20 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: Yinghai.Lu, linux-kernel
From: Yinghai Lu <Yinghai.Lu@Sun.COM>
These functions are only called via dma_ops->..., so make them static.
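The pattern can be sketched outside the kernel as follows. This is an
illustration only: struct mapping_ops is a hypothetical, trimmed-down stand-in
for the real ops structure, and every demo_ name is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for a table of DMA mapping callbacks. */
struct mapping_ops {
	uintptr_t (*map_single)(void *addr, size_t size);
	void (*unmap_single)(uintptr_t handle, size_t size);
};

static int live_mappings;

/* static: no other translation unit can call these by name. */
static uintptr_t demo_map_single(void *addr, size_t size)
{
	(void)size;
	live_mappings++;
	return (uintptr_t)addr;	/* identity "bus address" for the sketch */
}

static void demo_unmap_single(uintptr_t handle, size_t size)
{
	(void)handle;
	(void)size;
	live_mappings--;
}

/* The table is the only exported way to reach the functions above. */
const struct mapping_ops demo_ops = {
	.map_single = demo_map_single,
	.unmap_single = demo_unmap_single,
};
```

Callers go through demo_ops.map_single(...), so the function symbols
themselves need no external linkage.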
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86_64/kernel/pci-gart.c | 6 +++---
arch/x86_64/kernel/pci-nommu.c | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)
Index: linux/arch/x86_64/kernel/pci-gart.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-gart.c
+++ linux/arch/x86_64/kernel/pci-gart.c
@@ -235,7 +235,7 @@ static dma_addr_t gart_map_simple(struct
}
/* Map a single area into the IOMMU */
-dma_addr_t gart_map_single(struct device *dev, void *addr, size_t size, int dir)
+static dma_addr_t gart_map_single(struct device *dev, void *addr, size_t size, int dir)
{
unsigned long phys_mem, bus;
@@ -253,7 +253,7 @@ dma_addr_t gart_map_single(struct device
/*
* Free a DMA mapping.
*/
-void gart_unmap_single(struct device *dev, dma_addr_t dma_addr,
+static void gart_unmap_single(struct device *dev, dma_addr_t dma_addr,
size_t size, int direction)
{
unsigned long iommu_page;
@@ -275,7 +275,7 @@ void gart_unmap_single(struct device *de
/*
* Wrapper for pci_unmap_single working with scatterlists.
*/
-void gart_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, int dir)
+static void gart_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, int dir)
{
int i;
Index: linux/arch/x86_64/kernel/pci-nommu.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-nommu.c
+++ linux/arch/x86_64/kernel/pci-nommu.c
@@ -34,7 +34,7 @@ nommu_map_single(struct device *hwdev, v
return bus;
}
-void nommu_unmap_single(struct device *dev, dma_addr_t addr,size_t size,
+static void nommu_unmap_single(struct device *dev, dma_addr_t addr,size_t size,
int direction)
{
}
@@ -54,7 +54,7 @@ void nommu_unmap_single(struct device *d
* Device ownership issues as mentioned above for pci_map_single are
* the same here.
*/
-int nommu_map_sg(struct device *hwdev, struct scatterlist *sg,
+static int nommu_map_sg(struct device *hwdev, struct scatterlist *sg,
int nents, int direction)
{
int i;
@@ -74,7 +74,7 @@ int nommu_map_sg(struct device *hwdev, s
* Again, cpu read rules concerning calls here are the same as for
* pci_unmap_single() above.
*/
-void nommu_unmap_sg(struct device *dev, struct scatterlist *sg,
+static void nommu_unmap_sg(struct device *dev, struct scatterlist *sg,
int nents, int dir)
{
}
* [PATCH for review] [18/48] x86_64: flush_tlb_kernel_range() warning fix
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (16 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [17/48] x86_64: change _map_single to static in pci_gart.c etc Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [19/48] i386: add cpu_relax() to cmos_lock() Andi Kleen
` (19 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: akpm, ak, linux-kernel
From: Andrew Morton <akpm@linux-foundation.org>
mm/vmalloc.c: In function 'unmap_kernel_range':
mm/vmalloc.c:75: warning: unused variable 'start'
Make it a C function so that the compiler sees its arguments as used.
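The difference between the two forms can be shown with a small user-space
sketch. All names here (flush_all, flush_range_macro, flush_range_inline,
unmap_range) are hypothetical and only mirror the shape of the kernel code.

```c
#include <assert.h>

static int flush_count;

static void flush_all(void)
{
	flush_count++;
}

/*
 * Macro form: the arguments vanish during preprocessing, so a variable
 * passed only here is never referenced in the code the compiler sees.
 */
#define flush_range_macro(start, end) flush_all()

/*
 * Function form: the arguments are real parameters, so passing a
 * variable counts as a use even though the body ignores it.
 * The (void) casts only keep stricter warning levels quiet.
 */
static inline void flush_range_inline(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	flush_all();
}

static int unmap_range(unsigned long addr, unsigned long size)
{
	unsigned long start = addr;
	unsigned long end = addr + size;

	/*
	 * With flush_range_macro(start, end) here instead, neither local
	 * would be referenced after preprocessing, which is what triggers
	 * an "unused variable" warning.
	 */
	flush_range_inline(start, end);
	return flush_count;
}
```

Both forms do the same work at run time; the inline function only changes
what the compiler can see.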
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
include/asm-x86_64/tlbflush.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Index: linux/include/asm-x86_64/tlbflush.h
===================================================================
--- linux.orig/include/asm-x86_64/tlbflush.h
+++ linux/include/asm-x86_64/tlbflush.h
@@ -92,7 +92,11 @@ static inline void flush_tlb_range(struc
#endif
-#define flush_tlb_kernel_range(start, end) flush_tlb_all()
+static inline void flush_tlb_kernel_range(unsigned long start,
+ unsigned long end)
+{
+ flush_tlb_all();
+}
static inline void flush_tlb_pgtables(struct mm_struct *mm,
unsigned long start, unsigned long end)
* [PATCH for review] [19/48] i386: add cpu_relax() to cmos_lock()
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (17 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [18/48] x86_64: flush_tlb_kernel_range() warning fix Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [20/48] i386: replace hard-coded constant with appropriate macro from kernel.h Andi Kleen
` (18 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: andi, ak, linux-kernel
From: Andreas Mohr <andi@lisas.de>
Add cpu_relax() to cmos_lock() inline function for faster operation on SMT
CPUs and less power consumption on others in case of lock contention (which
probably doesn't happen too often, so admittedly this patch is not too
exciting).
[akpm@linux-foundation.org: Include the header file for cpu_relax()]
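As a rough user-space model of the loop being patched, assuming C11 atomics
in place of __cmpxchg(): cmos_lock_word, relax_hint(), demo_lock_cmos() and
demo_unlock_cmos() are hypothetical names, and relax_hint() is only a
placeholder for the architecture-specific cpu_relax().

```c
#include <assert.h>
#include <stdatomic.h>

/* 0 means unlocked; otherwise ((owner + 1) << 8) | reg, as in lock_cmos(). */
static atomic_ulong cmos_lock_word;

/*
 * Hypothetical stand-in for cpu_relax(). The kernel version is
 * architecture specific; this one is only a compiler-level fence.
 */
static inline void relax_hint(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

static void demo_lock_cmos(unsigned int owner, unsigned char reg)
{
	unsigned long newval = ((unsigned long)(owner + 1) << 8) | reg;

	for (;;) {
		unsigned long expected = 0;

		if (atomic_load(&cmos_lock_word)) {
			relax_hint();	/* back off while the lock is held */
			continue;
		}
		if (atomic_compare_exchange_strong(&cmos_lock_word,
						   &expected, newval))
			return;
	}
}

static void demo_unlock_cmos(void)
{
	atomic_store(&cmos_lock_word, 0);
}
```

The hint sits on the "lock is busy" path only, so the uncontended case is
unchanged.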
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/asm-i386/mc146818rtc.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Index: linux/include/asm-i386/mc146818rtc.h
===================================================================
--- linux.orig/include/asm-i386/mc146818rtc.h
+++ linux/include/asm-i386/mc146818rtc.h
@@ -6,6 +6,7 @@
#include <asm/io.h>
#include <asm/system.h>
+#include <asm/processor.h>
#include <linux/mc146818rtc.h>
#ifndef RTC_PORT
@@ -43,8 +44,10 @@ static inline void lock_cmos(unsigned ch
unsigned long new;
new = ((smp_processor_id()+1) << 8) | reg;
for (;;) {
- if (cmos_lock)
+ if (cmos_lock) {
+ cpu_relax();
continue;
+ }
if (__cmpxchg(&cmos_lock, 0, new, sizeof(cmos_lock)) == 0)
return;
}
* [PATCH for review] [20/48] i386: replace hard-coded constant with appropriate macro from kernel.h
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (18 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [19/48] i386: add cpu_relax() to cmos_lock() Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [21/48] x86_64: disable the GART in shutdown Andi Kleen
` (17 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: rpjday, linux-kernel
From: "Robert P. J. Day" <rpjday@mindspring.com>
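For reference, the equivalence behind this substitution can be checked in
user space. A minimal sketch, assuming long and unsigned long have the same
width with no padding bits, and using LONG_MAX from <limits.h> rather than
the kernel.h definition; old_limit() and new_limit() are hypothetical helper
names.

```c
#include <assert.h>
#include <limits.h>

/* The hard-coded constant the patch removes. */
static unsigned long old_limit(void)
{
	return ~0UL >> 1;
}

/* The macro the patch substitutes, taken here from <limits.h>. */
static unsigned long new_limit(void)
{
	return (unsigned long)LONG_MAX;
}
```

Both express "all bits set except the sign bit", so the change is meant to
be purely cosmetic.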
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
include/asm-i386/uaccess.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/include/asm-i386/uaccess.h
===================================================================
--- linux.orig/include/asm-i386/uaccess.h
+++ linux/include/asm-i386/uaccess.h
@@ -581,7 +581,7 @@ long __must_check __strncpy_from_user(ch
* If there is a limit on the length of a valid string, you may wish to
* consider using strnlen_user() instead.
*/
-#define strlen_user(str) strnlen_user(str, ~0UL >> 1)
+#define strlen_user(str) strnlen_user(str, LONG_MAX)
long strnlen_user(const char __user *str, long n);
unsigned long __must_check clear_user(void __user *mem, unsigned long len);
* [PATCH for review] [21/48] x86_64: disable the GART in shutdown
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (19 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [20/48] i386: replace hard-coded constant with appropriate macro from kernel.h Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [22/48] x86_64: fix e820_hole_size based on address ranges Andi Kleen
` (16 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: Yinghai.Lu, ak, alan, ebiederm, muli, vgoyal, davej, linux-kernel
From: Yinghai Lu <Yinghai.Lu@Sun.COM>
On K8 systems with 4G of RAM and memory hole remapping enabled, or with more
than 4G of RAM installed, loading a second kernel with kexec can cause a
restart. When the second kernel allocates memory for the GART, it clears
that memory with memset, and this causes the restart because some device is
still using the GART for DMA. There are two possible solutions:
- in the second kernel, disable the GART first, before trying to allocate
  memory for it, or
- in the first kernel, disable the GART before shutdown.
Andi/Eric/Alan prefer the second one, for a clean shutdown in the first
kernel. Andi also pointed out that the case of AGP enabled with less than
4G of memory needs to be considered too.
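The core of the shutdown step is a read-modify-write that clears one bit
per northbridge. A minimal sketch with a fake config space follows;
struct fake_northbridge, nb_read(), nb_write(), demo_gart_shutdown() and the
DEMO_ macro names are all hypothetical, and only the values 0x90 and bit 0
come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; the patch uses the bare values 0x90 and 1. */
#define DEMO_GART_CTL_REG	0x90
#define DEMO_GART_ENABLE	(1u << 0)

/* Hypothetical stand-in for one northbridge's config space. */
struct fake_northbridge {
	uint32_t config[64];	/* indexed by register offset / 4 */
};

static uint32_t nb_read(const struct fake_northbridge *nb, unsigned int reg)
{
	return nb->config[reg / 4];
}

static void nb_write(struct fake_northbridge *nb, unsigned int reg,
		     uint32_t val)
{
	nb->config[reg / 4] = val;
}

/* Clear only bit 0 on every node and leave the other bits untouched. */
static void demo_gart_shutdown(struct fake_northbridge *nbs, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		uint32_t ctl = nb_read(&nbs[i], DEMO_GART_CTL_REG);

		ctl &= ~DEMO_GART_ENABLE;
		nb_write(&nbs[i], DEMO_GART_CTL_REG, ctl);
	}
}
```

Preserving the remaining bits matters because the same register carries
other settings that the next kernel may want to inspect.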
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86_64/kernel/pci-dma.c | 5 +++++
arch/x86_64/kernel/pci-gart.c | 20 ++++++++++++++++++++
arch/x86_64/kernel/reboot.c | 4 ++++
include/asm-x86_64/proto.h | 7 +++++++
4 files changed, 36 insertions(+)
Index: linux/arch/x86_64/kernel/pci-dma.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-dma.c
+++ linux/arch/x86_64/kernel/pci-dma.c
@@ -321,6 +321,11 @@ static int __init pci_iommu_init(void)
return 0;
}
+void pci_iommu_shutdown(void)
+{
+ gart_iommu_shutdown();
+}
+
#ifdef CONFIG_PCI
/* Many VIA bridges seem to corrupt data for DAC. Disable it here */
Index: linux/arch/x86_64/kernel/pci-gart.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-gart.c
+++ linux/arch/x86_64/kernel/pci-gart.c
@@ -571,6 +571,26 @@ static const struct dma_mapping_ops gart
.unmap_sg = gart_unmap_sg,
};
+void gart_iommu_shutdown(void)
+{
+ struct pci_dev *dev;
+ int i;
+
+ if (no_agp && (dma_ops != &gart_dma_ops))
+ return;
+
+ for (i = 0; i < num_k8_northbridges; i++) {
+ u32 ctl;
+
+ dev = k8_northbridges[i];
+ pci_read_config_dword(dev, 0x90, &ctl);
+
+ ctl &= ~1;
+
+ pci_write_config_dword(dev, 0x90, ctl);
+ }
+}
+
void __init gart_iommu_init(void)
{
struct agp_kern_info info;
Index: linux/arch/x86_64/kernel/reboot.c
===================================================================
--- linux.orig/arch/x86_64/kernel/reboot.c
+++ linux/arch/x86_64/kernel/reboot.c
@@ -16,6 +16,7 @@
#include <asm/pgtable.h>
#include <asm/tlbflush.h>
#include <asm/apic.h>
+#include <asm/proto.h>
/*
* Power off function, if any
@@ -81,6 +82,7 @@ static inline void kb_wait(void)
void machine_shutdown(void)
{
unsigned long flags;
+
/* Stop the cpus and apics */
#ifdef CONFIG_SMP
int reboot_cpu_id;
@@ -111,6 +113,8 @@ void machine_shutdown(void)
disable_IO_APIC();
local_irq_restore(flags);
+
+ pci_iommu_shutdown();
}
void machine_emergency_restart(void)
Index: linux/include/asm-x86_64/proto.h
===================================================================
--- linux.orig/include/asm-x86_64/proto.h
+++ linux/include/asm-x86_64/proto.h
@@ -85,11 +85,13 @@ extern int exception_trace;
extern unsigned cpu_khz;
extern unsigned tsc_khz;
+extern void pci_iommu_shutdown(void);
extern void no_iommu_init(void);
extern int force_iommu, no_iommu;
extern int iommu_detected;
#ifdef CONFIG_IOMMU
extern void gart_iommu_init(void);
+extern void gart_iommu_shutdown(void);
extern void __init gart_parse_options(char *);
extern void iommu_hole_init(void);
extern int fallback_aper_order;
@@ -101,6 +103,11 @@ extern int fix_aperture;
#else
#define iommu_aperture 0
#define iommu_aperture_allowed 0
+
+static inline void gart_iommu_shutdown(void)
+{
+}
+
#endif
extern int reboot_force;
* [PATCH for review] [22/48] x86_64: fix e820_hole_size based on address ranges
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (20 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [21/48] x86_64: disable the GART in shutdown Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [23/48] x86_64: disable srat when numa emulation succeeds Andi Kleen
` (15 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: rientjes, ak, linux-kernel
From: David Rientjes <rientjes@google.com>
e820_hole_size() now uses the newly extracted helper function,
e820_find_active_region(), to determine the size of usable RAM in a range of
PFNs.
This was previously broken for two reasons:
- The start and end PFNs of each e820 entry were not properly rounded
prior to excluding those entries in the range, and
- Entries smaller than a page were not properly excluded from being
accumulated.
This resulted in emulated nodes being incorrectly mapped to ranges that
were completely reserved and not candidates for being registered as
active ranges.
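The two fixes can be seen in a small user-space model. This is a sketch
under assumptions, not the kernel code: demo_find_active_region() is a
simplified guess at what the helper does based on the description above, a
4 KiB page size is assumed, and every demo_/DEMO_ name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(1ULL << DEMO_PAGE_SHIFT)
#define DEMO_E820_RAM	1

struct demo_e820entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/*
 * Hypothetical simplification of the helper: clamp one entry to
 * [start_pfn, end_pfn) and report whether any whole RAM page is left.
 */
static int demo_find_active_region(const struct demo_e820entry *ei,
				   uint64_t start_pfn, uint64_t end_pfn,
				   uint64_t *ei_startpfn, uint64_t *ei_endpfn)
{
	/* Round the start up and the end down to whole pages. */
	*ei_startpfn = (ei->addr + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
	*ei_endpfn = (ei->addr + ei->size) >> DEMO_PAGE_SHIFT;

	/* Entries that do not cover a full page contribute nothing. */
	if (*ei_startpfn >= *ei_endpfn)
		return 0;

	/* Skip non-RAM entries and entries outside the requested range. */
	if (ei->type != DEMO_E820_RAM ||
	    *ei_endpfn <= start_pfn || *ei_startpfn >= end_pfn)
		return 0;

	if (*ei_startpfn < start_pfn)
		*ei_startpfn = start_pfn;
	if (*ei_endpfn > end_pfn)
		*ei_endpfn = end_pfn;
	return 1;
}

/* Bytes in [start, end) that are not usable RAM. */
static uint64_t demo_hole_size(const struct demo_e820entry *map, int nr,
			       uint64_t start, uint64_t end)
{
	uint64_t start_pfn = start >> DEMO_PAGE_SHIFT;
	uint64_t end_pfn = end >> DEMO_PAGE_SHIFT;
	uint64_t s, e, ram = 0;
	int i;

	for (i = 0; i < nr; i++)
		if (demo_find_active_region(&map[i], start_pfn, end_pfn,
					    &s, &e))
			ram += e - s;
	return end - start - (ram << DEMO_PAGE_SHIFT);
}
```

With a map holding RAM at 0x0-0x9F800, a reserved 0x800-byte entry, a
sub-page RAM entry at 0x100000 and 1 MiB of RAM at 0x200000, the sub-page
and reserved entries add nothing to the RAM total, so they end up counted
as hole.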
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86_64/kernel/e820.c | 54 +++++++++++++++++++---------------------------
arch/x86_64/mm/numa.c | 8 +-----
2 files changed, 25 insertions(+), 37 deletions(-)
Index: linux/arch/x86_64/kernel/e820.c
===================================================================
--- linux.orig/arch/x86_64/kernel/e820.c
+++ linux/arch/x86_64/kernel/e820.c
@@ -194,37 +194,6 @@ unsigned long __init e820_end_of_ram(voi
}
/*
- * Find the hole size in the range.
- */
-unsigned long __init e820_hole_size(unsigned long start, unsigned long end)
-{
- unsigned long ram = 0;
- int i;
-
- for (i = 0; i < e820.nr_map; i++) {
- struct e820entry *ei = &e820.map[i];
- unsigned long last, addr;
-
- if (ei->type != E820_RAM ||
- ei->addr+ei->size <= start ||
- ei->addr >= end)
- continue;
-
- addr = round_up(ei->addr, PAGE_SIZE);
- if (addr < start)
- addr = start;
-
- last = round_down(ei->addr + ei->size, PAGE_SIZE);
- if (last >= end)
- last = end;
-
- if (last > addr)
- ram += last - addr;
- }
- return ((end - start) - ram);
-}
-
-/*
* Mark e820 reserved areas as busy for the resource manager.
*/
void __init e820_reserve_resources(void)
@@ -364,6 +333,29 @@ void __init add_memory_region(unsigned l
e820.nr_map++;
}
+/*
+ * Find the hole size (in bytes) in the memory range.
+ * @start: starting address of the memory range to scan
+ * @end: ending address of the memory range to scan
+ */
+unsigned long __init e820_hole_size(unsigned long start, unsigned long end)
+{
+ unsigned long start_pfn = start >> PAGE_SHIFT;
+ unsigned long end_pfn = end >> PAGE_SHIFT;
+ unsigned long ei_startpfn;
+ unsigned long ei_endpfn;
+ unsigned long ram = 0;
+ int i;
+
+ for (i = 0; i < e820.nr_map; i++) {
+ if (e820_find_active_region(&e820.map[i],
+ start_pfn, end_pfn,
+ &ei_startpfn, &ei_endpfn))
+ ram += ei_endpfn - ei_startpfn;
+ }
+ return end - start - (ram << PAGE_SHIFT);
+}
+
void __init e820_print_map(char *who)
{
int i;
Index: linux/arch/x86_64/mm/numa.c
===================================================================
--- linux.orig/arch/x86_64/mm/numa.c
+++ linux/arch/x86_64/mm/numa.c
@@ -273,9 +273,6 @@ void __init numa_init_array(void)
#ifdef CONFIG_NUMA_EMU
/* Numa emulation */
-#define E820_ADDR_HOLE_SIZE(start, end) \
- (e820_hole_size((start) >> PAGE_SHIFT, (end) >> PAGE_SHIFT) << \
- PAGE_SHIFT)
char *cmdline __initdata;
/*
@@ -319,7 +316,7 @@ static int __init split_nodes_equally(st
return -1;
if (num_nodes > MAX_NUMNODES)
num_nodes = MAX_NUMNODES;
- size = (max_addr - *addr - E820_ADDR_HOLE_SIZE(*addr, max_addr)) /
+ size = (max_addr - *addr - e820_hole_size(*addr, max_addr)) /
num_nodes;
/*
* Calculate the number of big nodes that can be allocated as a result
@@ -347,7 +344,7 @@ static int __init split_nodes_equally(st
if (i == num_nodes + node_start - 1)
end = max_addr;
else
- while (end - *addr - E820_ADDR_HOLE_SIZE(*addr, end) <
+ while (end - *addr - e820_hole_size(*addr, end) <
size) {
end += FAKE_NODE_MIN_SIZE;
if (end > max_addr) {
@@ -488,7 +485,6 @@ out:
numa_init_array();
return 0;
}
-#undef E820_ADDR_HOLE_SIZE
#endif /* CONFIG_NUMA_EMU */
void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
* [PATCH for review] [23/48] x86_64: disable srat when numa emulation succeeds
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (21 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [22/48] x86_64: fix e820_hole_size based on address ranges Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [24/48] x86_64: move iommu declaration from proto to iommu.h Andi Kleen
` (14 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: rientjes, ak, lenb, linux-kernel
From: David Rientjes <rientjes@google.com>
When NUMA emulation succeeds, acpi_numa needs to be set to -1 so that
srat_disabled() will always return true. We won't be calling
acpi_scan_nodes() or registering the true nodes we've found.
[hugh@veritas.com: Fix x86_64 CONFIG_NUMA_EMU build: acpi_numa needs CONFIG_ACPI_NUMA]
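A sketch of the interaction described above. The body of
demo_srat_disabled() is my assumption about what the real check looks like
(SRAT is ignored when NUMA is off or acpi_numa is negative); the variables
are plain user-space stand-ins for the kernel globals, and both function
names are hypothetical.

```c
#include <assert.h>

/* Stand-ins for the kernel globals of the same names. */
static int numa_off;
static int acpi_numa;

/* Assumed shape of the check: a negative acpi_numa disables SRAT. */
static int demo_srat_disabled(void)
{
	return numa_off || acpi_numa < 0;
}

/* What the patch adds once the emulated nodes have been set up. */
static void demo_emulation_succeeded(void)
{
	acpi_numa = -1;
}
```

Once the flag is negative, later code that consults the check treats the
firmware-provided topology as absent and keeps the emulated nodes.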
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86_64/mm/numa.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Index: linux/arch/x86_64/mm/numa.c
===================================================================
--- linux.orig/arch/x86_64/mm/numa.c
+++ linux/arch/x86_64/mm/numa.c
@@ -473,9 +473,13 @@ out:
/*
* We need to vacate all active ranges that may have been registered by
- * SRAT.
+ * SRAT and set acpi_numa to -1 so that srat_disabled() always returns
+ * true. NUMA emulation has succeeded so we will not scan ACPI nodes.
*/
remove_all_active_ranges();
+#ifdef CONFIG_ACPI_NUMA
+ acpi_numa = -1;
+#endif
for_each_node_mask(i, node_possible_map) {
e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
nodes[i].end >> PAGE_SHIFT);
* [PATCH for review] [24/48] x86_64: move iommu declaration from proto to iommu.h
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (22 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [23/48] x86_64: disable srat when numa emulation succeeds Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [25/48] i386: remove volatile in apic.c Andi Kleen
` (13 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: Yinghai.Lu, ak, alan, ebiederm, muli, vgoyal, davej, linux-kernel
From: Yinghai Lu <Yinghai.Lu@Sun.COM>
[akpm@linux-foundation.org: build fix]
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/x86_64/kernel/aperture.c | 2 +-
arch/x86_64/kernel/early-quirks.c | 1 +
arch/x86_64/kernel/pci-calgary.c | 2 +-
arch/x86_64/kernel/pci-dma.c | 2 +-
arch/x86_64/kernel/pci-gart.c | 1 +
arch/x86_64/kernel/pci-nommu.c | 2 +-
arch/x86_64/kernel/pci-swiotlb.c | 2 +-
arch/x86_64/kernel/reboot.c | 2 +-
include/asm-x86_64/iommu.h | 29 +++++++++++++++++++++++++++++
include/asm-x86_64/proto.h | 25 -------------------------
10 files changed, 37 insertions(+), 31 deletions(-)
Index: linux/arch/x86_64/kernel/aperture.c
===================================================================
--- linux.orig/arch/x86_64/kernel/aperture.c
+++ linux/arch/x86_64/kernel/aperture.c
@@ -20,7 +20,7 @@
#include <linux/ioport.h>
#include <asm/e820.h>
#include <asm/io.h>
-#include <asm/proto.h>
+#include <asm/iommu.h>
#include <asm/pci-direct.h>
#include <asm/dma.h>
#include <asm/k8.h>
Index: linux/arch/x86_64/kernel/early-quirks.c
===================================================================
--- linux.orig/arch/x86_64/kernel/early-quirks.c
+++ linux/arch/x86_64/kernel/early-quirks.c
@@ -14,6 +14,7 @@
#include <linux/pci_ids.h>
#include <asm/pci-direct.h>
#include <asm/proto.h>
+#include <asm/iommu.h>
#include <asm/dma.h>
static void __init via_bugs(void)
Index: linux/arch/x86_64/kernel/pci-calgary.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-calgary.c
+++ linux/arch/x86_64/kernel/pci-calgary.c
@@ -35,7 +35,7 @@
#include <linux/pci_ids.h>
#include <linux/pci.h>
#include <linux/delay.h>
-#include <asm/proto.h>
+#include <asm/iommu.h>
#include <asm/calgary.h>
#include <asm/tce.h>
#include <asm/pci-direct.h>
Index: linux/arch/x86_64/kernel/pci-dma.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-dma.c
+++ linux/arch/x86_64/kernel/pci-dma.c
@@ -8,7 +8,7 @@
#include <linux/pci.h>
#include <linux/module.h>
#include <asm/io.h>
-#include <asm/proto.h>
+#include <asm/iommu.h>
#include <asm/calgary.h>
int iommu_merge __read_mostly = 0;
Index: linux/arch/x86_64/kernel/pci-gart.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-gart.c
+++ linux/arch/x86_64/kernel/pci-gart.c
@@ -28,6 +28,7 @@
#include <asm/mtrr.h>
#include <asm/pgtable.h>
#include <asm/proto.h>
+#include <asm/iommu.h>
#include <asm/cacheflush.h>
#include <asm/swiotlb.h>
#include <asm/dma.h>
Index: linux/arch/x86_64/kernel/pci-nommu.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-nommu.c
+++ linux/arch/x86_64/kernel/pci-nommu.c
@@ -6,7 +6,7 @@
#include <linux/string.h>
#include <linux/dma-mapping.h>
-#include <asm/proto.h>
+#include <asm/iommu.h>
#include <asm/processor.h>
#include <asm/dma.h>
Index: linux/arch/x86_64/kernel/pci-swiotlb.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-swiotlb.c
+++ linux/arch/x86_64/kernel/pci-swiotlb.c
@@ -5,7 +5,7 @@
#include <linux/module.h>
#include <linux/dma-mapping.h>
-#include <asm/proto.h>
+#include <asm/iommu.h>
#include <asm/swiotlb.h>
#include <asm/dma.h>
Index: linux/arch/x86_64/kernel/reboot.c
===================================================================
--- linux.orig/arch/x86_64/kernel/reboot.c
+++ linux/arch/x86_64/kernel/reboot.c
@@ -16,7 +16,7 @@
#include <asm/pgtable.h>
#include <asm/tlbflush.h>
#include <asm/apic.h>
-#include <asm/proto.h>
+#include <asm/iommu.h>
/*
* Power off function, if any
Index: linux/include/asm-x86_64/iommu.h
===================================================================
--- /dev/null
+++ linux/include/asm-x86_64/iommu.h
@@ -0,0 +1,29 @@
+#ifndef _ASM_X8664_IOMMU_H
+#define _ASM_X8664_IOMMU_H 1
+
+extern void pci_iommu_shutdown(void);
+extern void no_iommu_init(void);
+extern int force_iommu, no_iommu;
+extern int iommu_detected;
+#ifdef CONFIG_IOMMU
+extern void gart_iommu_init(void);
+extern void gart_iommu_shutdown(void);
+extern void __init gart_parse_options(char *);
+extern void iommu_hole_init(void);
+extern int fallback_aper_order;
+extern int fallback_aper_force;
+extern int iommu_aperture;
+extern int iommu_aperture_allowed;
+extern int iommu_aperture_disabled;
+extern int fix_aperture;
+#else
+#define iommu_aperture 0
+#define iommu_aperture_allowed 0
+
+static inline void gart_iommu_shutdown(void)
+{
+}
+
+#endif
+
+#endif
Index: linux/include/asm-x86_64/proto.h
===================================================================
--- linux.orig/include/asm-x86_64/proto.h
+++ linux/include/asm-x86_64/proto.h
@@ -85,31 +85,6 @@ extern int exception_trace;
extern unsigned cpu_khz;
extern unsigned tsc_khz;
-extern void pci_iommu_shutdown(void);
-extern void no_iommu_init(void);
-extern int force_iommu, no_iommu;
-extern int iommu_detected;
-#ifdef CONFIG_IOMMU
-extern void gart_iommu_init(void);
-extern void gart_iommu_shutdown(void);
-extern void __init gart_parse_options(char *);
-extern void iommu_hole_init(void);
-extern int fallback_aper_order;
-extern int fallback_aper_force;
-extern int iommu_aperture;
-extern int iommu_aperture_allowed;
-extern int iommu_aperture_disabled;
-extern int fix_aperture;
-#else
-#define iommu_aperture 0
-#define iommu_aperture_allowed 0
-
-static inline void gart_iommu_shutdown(void)
-{
-}
-
-#endif
-
extern int reboot_force;
extern int notsc_setup(char *);
* [PATCH for review] [25/48] i386: remove volatile in apic.c
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (23 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [24/48] x86_64: move iommu declaration from proto to iommu.h Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [26/48] i386: hpet assumes boot cpu is 0 Andi Kleen
` (12 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: tglx, ak, mingo, linux-kernel
From: Thomas Gleixner <tglx@linutronix.de>
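Remove the volatile qualifier from lapic_cal_loops in apic.c. The wait
loop already calls cpu_relax(), which acts as a compiler barrier, so the
variable is re-read on every iteration. Fix a coding style issue while
at it.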
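A rough userspace sketch of why the plain int is sufficient, assuming a
GCC or Clang style compiler barrier stands in for cpu_relax(). The names
relax(), fake_timer_tick() and wait_for_calibration() are hypothetical;
the tick is called inline only so that the single-threaded sketch
terminates, whereas in the kernel the counter is advanced by the timer
interrupt:

```c
/* Plain int, no volatile, as in the patched lapic_cal_loops. */
static int cal_loops = -1;

/* Compiler barrier: memory must be re-read after this point. */
static inline void relax(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/* Stand-in for the calibration interrupt handler. */
static void fake_timer_tick(void)
{
	cal_loops++;
}

/* Spin until the counter passes 'limit'; returns the number of spins. */
static int wait_for_calibration(int limit)
{
	int spins = 0;

	while (cal_loops <= limit) {
		fake_timer_tick();	/* asynchronous in the real kernel */
		relax();		/* forces cal_loops to be reloaded */
		spins++;
	}
	return spins;
}
```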
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/apic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux/arch/i386/kernel/apic.c
===================================================================
--- linux.orig/arch/i386/kernel/apic.c
+++ linux/arch/i386/kernel/apic.c
@@ -315,7 +315,7 @@ static void __devinit setup_APIC_timer(v
#define LAPIC_CAL_LOOPS (HZ/10)
-static __initdata volatile int lapic_cal_loops = -1;
+static __initdata int lapic_cal_loops = -1;
static __initdata long lapic_cal_t1, lapic_cal_t2;
static __initdata unsigned long long lapic_cal_tsc1, lapic_cal_tsc2;
static __initdata unsigned long lapic_cal_pm1, lapic_cal_pm2;
@@ -485,7 +485,7 @@ void __init setup_boot_APIC_clock(void)
/* Let the interrupts run */
local_irq_enable();
- while(lapic_cal_loops <= LAPIC_CAL_LOOPS)
+ while (lapic_cal_loops <= LAPIC_CAL_LOOPS)
cpu_relax();
local_irq_disable();
* [PATCH for review] [26/48] i386: hpet assumes boot cpu is 0
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (24 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [25/48] i386: remove volatile in apic.c Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 16:23 ` Jeremy Fitzhardinge
2007-07-19 13:48 ` [PATCH for review] [27/48] i386: move PIT function declarations and constants to correct header file Andi Kleen
` (11 subsequent siblings)
37 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: chrisw, mingo, johnstul, ak, linux-kernel
From: Chris Wright <chrisw@sous-sol.org>
hpet_enable() assumes that the boot CPU is CPU 0. I fixed this in
x86_64; it looks like the kind of thing that will break Voyager on i386.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/hpet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/i386/kernel/hpet.c
===================================================================
--- linux.orig/arch/i386/kernel/hpet.c
+++ linux/arch/i386/kernel/hpet.c
@@ -321,7 +321,7 @@ int __init hpet_enable(void)
* Start hpet with the boot cpu mask and make it
* global after the IO_APIC has been initialized.
*/
- hpet_clockevent.cpumask =cpumask_of_cpu(0);
+ hpet_clockevent.cpumask = cpumask_of_cpu(smp_processor_id());
clockevents_register_device(&hpet_clockevent);
global_clock_event = &hpet_clockevent;
return 1;
* [PATCH for review] [27/48] i386: move PIT function declarations and constants to correct header file
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (25 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [26/48] i386: hpet assumes boot cpu is 0 Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [28/48] i386: fix iounmap's use of vm_struct's size field Andi Kleen
` (10 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: tglx, ak, mingo, johnstul, ak, linux-kernel
From: Thomas Gleixner <tglx@linutronix.de>
setup_pit_timer() is declared in asm-i386/timer.h. Move the declaration
to the PIT header file (asm-i386/i8253.h) so that x86_64 can use it as
well. Also move the PIT constants there.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/i8253.c | 2 --
arch/i386/kernel/vmiclock.c | 1 +
include/asm-i386/i8253.h | 7 +++++++
include/asm-i386/mach-default/io_ports.h | 5 -----
include/asm-i386/timer.h | 1 -
5 files changed, 8 insertions(+), 8 deletions(-)
Index: linux/arch/i386/kernel/i8253.c
===================================================================
--- linux.orig/arch/i386/kernel/i8253.c
+++ linux/arch/i386/kernel/i8253.c
@@ -15,8 +15,6 @@
#include <asm/io.h>
#include <asm/timer.h>
-#include "io_ports.h"
-
DEFINE_SPINLOCK(i8253_lock);
EXPORT_SYMBOL(i8253_lock);
Index: linux/arch/i386/kernel/vmiclock.c
===================================================================
--- linux.orig/arch/i386/kernel/vmiclock.c
+++ linux/arch/i386/kernel/vmiclock.c
@@ -32,6 +32,7 @@
#include <asm/apicdef.h>
#include <asm/apic.h>
#include <asm/timer.h>
+#include <asm/i8253.h>
#include <irq_vectors.h>
#include "io_ports.h"
Index: linux/include/asm-i386/i8253.h
===================================================================
--- linux.orig/include/asm-i386/i8253.h
+++ linux/include/asm-i386/i8253.h
@@ -3,8 +3,15 @@
#include <linux/clockchips.h>
+/* i8253A PIT registers */
+#define PIT_MODE 0x43
+#define PIT_CH0 0x40
+#define PIT_CH2 0x42
+
extern spinlock_t i8253_lock;
extern struct clock_event_device *global_clock_event;
+extern void setup_pit_timer(void);
+
#endif /* __ASM_I8253_H__ */
Index: linux/include/asm-i386/mach-default/io_ports.h
===================================================================
--- linux.orig/include/asm-i386/mach-default/io_ports.h
+++ linux/include/asm-i386/mach-default/io_ports.h
@@ -7,11 +7,6 @@
#ifndef _MACH_IO_PORTS_H
#define _MACH_IO_PORTS_H
-/* i8253A PIT registers */
-#define PIT_MODE 0x43
-#define PIT_CH0 0x40
-#define PIT_CH2 0x42
-
/* i8259A PIC registers */
#define PIC_MASTER_CMD 0x20
#define PIC_MASTER_IMR 0x21
Index: linux/include/asm-i386/timer.h
===================================================================
--- linux.orig/include/asm-i386/timer.h
+++ linux/include/asm-i386/timer.h
@@ -5,7 +5,6 @@
#define TICK_SIZE (tick_nsec / 1000)
-void setup_pit_timer(void);
unsigned long native_calculate_cpu_khz(void);
extern int timer_ack;
* [PATCH for review] [28/48] i386: fix iounmap's use of vm_struct's size field
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (26 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [27/48] i386: move PIT function declarations and constants to correct header file Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:48 ` [PATCH for review] [29/48] x86_64: arch/x86_64/kernel/aperture.c lower printk severity Andi Kleen
` (9 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: jeremy, hidave.darkstar, cebbert, ak, linux-kernel
From: Jeremy Fitzhardinge <jeremy@goop.org>
get_vm_area always returns an area with an adjacent guard page. That guard
page is included in vm_struct.size. iounmap uses vm_struct.size to
determine how much address space needs to have change_page_attr applied to
it, which will BUG if applied to the guard page.
This patch adds a helper function - get_vm_area_size() in linux/vmalloc.h -
to return the actual size of a vm area, and uses it to make iounmap do the
right thing. There are probably other places which should be using
get_vm_area_size().
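The arithmetic can be sketched in userspace as follows, assuming 4 KiB
pages and exactly one trailing guard page per area. struct
vm_area_sketch and both helpers are hypothetical stand-ins, not the
kernel's vm_struct API:

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)

/* Only the field that matters here; size includes the guard page. */
struct vm_area_sketch {
	size_t size;
};

/* Mirrors the idea of get_vm_area_size(): drop the guard page. */
static size_t area_size_without_guard(const struct vm_area_sketch *area)
{
	return area->size - SKETCH_PAGE_SIZE;
}

/* Number of pages whose attributes should be reset on unmap. */
static size_t pages_to_reset(const struct vm_area_sketch *area)
{
	return area_size_without_guard(area) >> SKETCH_PAGE_SHIFT;
}
```

For an area backing a two-page mapping, size covers three pages, so
shifting size directly would touch one page too many, while
pages_to_reset() yields two.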
Thanks to Dave Young <hidave.darkstar@gmail.com> for debugging the
problem.
[ Andi, it wasn't clear to me whether x86_64 needs the same fix. ]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/mm/ioremap.c | 2 +-
include/linux/vmalloc.h | 7 +++++++
2 files changed, 8 insertions(+), 1 deletion(-)
Index: linux/arch/i386/mm/ioremap.c
===================================================================
--- linux.orig/arch/i386/mm/ioremap.c
+++ linux/arch/i386/mm/ioremap.c
@@ -196,7 +196,7 @@ void iounmap(volatile void __iomem *addr
/* Reset the direct mapping. Can block */
if ((p->flags >> 20) && p->phys_addr < virt_to_phys(high_memory) - 1) {
change_page_attr(virt_to_page(__va(p->phys_addr)),
- p->size >> PAGE_SHIFT,
+ get_vm_area_size(p) >> PAGE_SHIFT,
PAGE_KERNEL);
global_flush_tlb();
}
Index: linux/include/linux/vmalloc.h
===================================================================
--- linux.orig/include/linux/vmalloc.h
+++ linux/include/linux/vmalloc.h
@@ -58,6 +58,13 @@ void vmalloc_sync_all(void);
/*
* Lowlevel-APIs (not for driver use!)
*/
+
+static inline size_t get_vm_area_size(const struct vm_struct *area)
+{
+ /* return actual size without guard page */
+ return area->size - PAGE_SIZE;
+}
+
extern struct vm_struct *get_vm_area(unsigned long size, unsigned long flags);
extern struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
unsigned long start, unsigned long end);
* [PATCH for review] [29/48] x86_64: arch/x86_64/kernel/aperture.c lower printk severity
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (27 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [28/48] i386: fix iounmap's use of vm_struct's size field Andi Kleen
@ 2007-07-19 13:48 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [30/48] x86_64: arch/x86_64/kernel/e820.c " Andi Kleen
` (8 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:48 UTC (permalink / raw)
To: da-x, linux-kernel
From: Dan Aloni <da-x@monatomic.org>
Users who filter the kernel log (e.g. via syslogd or a proprietary method)
do not want to see messages logged at warning level that are not really
warnings.
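A small sketch of the filtering problem, assuming the "<N>" level prefix
that KERN_INFO expands to in this kernel generation ("<6>") and a default
level of 4 (warning) for messages without a prefix. message_level() and
SKETCH_DEFAULT_LEVEL are hypothetical names:

```c
/* Assumed default level for messages without a prefix (warning). */
#define SKETCH_DEFAULT_LEVEL	4

/* Return the severity a log filter would see for a raw printk string. */
static int message_level(const char *msg)
{
	if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
		return msg[1] - '0';

	return SKETCH_DEFAULT_LEVEL;
}
```

Under these assumptions the unprefixed "Checking aperture...\n" is
classified as a warning, while the prefixed form is classified as
informational and can be filtered out.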
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86_64/kernel/aperture.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/x86_64/kernel/aperture.c
===================================================================
--- linux.orig/arch/x86_64/kernel/aperture.c
+++ linux/arch/x86_64/kernel/aperture.c
@@ -214,7 +214,7 @@ void __init iommu_hole_init(void)
if (iommu_aperture_disabled || !fix_aperture || !early_pci_allowed())
return;
- printk("Checking aperture...\n");
+ printk(KERN_INFO "Checking aperture...\n");
fix = 0;
for (num = 24; num < 32; num++) {
* [PATCH for review] [30/48] x86_64: arch/x86_64/kernel/e820.c lower printk severity
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (28 preceding siblings ...)
2007-07-19 13:48 ` [PATCH for review] [29/48] x86_64: arch/x86_64/kernel/aperture.c lower printk severity Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [31/48] i386: basic infrastructure support for AMD geode-class machines Andi Kleen
` (7 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: da-x, linux-kernel
From: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86_64/kernel/e820.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/x86_64/kernel/e820.c
===================================================================
--- linux.orig/arch/x86_64/kernel/e820.c
+++ linux/arch/x86_64/kernel/e820.c
@@ -361,7 +361,7 @@ void __init e820_print_map(char *who)
int i;
for (i = 0; i < e820.nr_map; i++) {
- printk(" %s: %016Lx - %016Lx ", who,
+ printk(KERN_INFO " %s: %016Lx - %016Lx ", who,
(unsigned long long) e820.map[i].addr,
(unsigned long long) (e820.map[i].addr + e820.map[i].size));
switch (e820.map[i].type) {
* [PATCH for review] [31/48] i386: basic infrastructure support for AMD geode-class machines
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (29 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [30/48] x86_64: arch/x86_64/kernel/e820.c " Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [32/48] i386: insert HPET firmware resource after PCI enumeration has completed Andi Kleen
` (6 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: dilinger, ak, alan, david-b, linux-kernel
From: Andres Salomon <dilinger@queued.net>
This builds upon the existing geode infrastructure, but adds southbridge
support, some GPIO functions, and a header file (asm-i386/geode.h) with some
useful GX/LX detection tests.
The majority of this code was written by Jordan Crouse.
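The bit layout used by the GPIO set and clear helpers in this patch can
be sketched without touching hardware. The sketch below only computes
the register offset and the value that would be written. struct
gpio_write and the two functions are hypothetical names, and the layout
(low 16 bits set, high 16 bits clear, second bank at offset 0x80) is
read off the patch code rather than from a data sheet:

```c
#include <stdint.h>

/* One pending register write: offset from the GPIO base, and the value. */
struct gpio_write {
	uint32_t offset;
	uint32_t value;
};

/* Write that sets 'gpio' (0..31) in register 'reg'. */
static struct gpio_write gpio_set_write(unsigned int gpio, unsigned int reg)
{
	struct gpio_write w;

	if (gpio < 16) {
		w.offset = reg;
		w.value = 1u << gpio;
	} else {
		w.offset = 0x80 + reg;		/* high bank */
		w.value = 1u << (gpio - 16);
	}
	return w;
}

/* Write that clears 'gpio' (0..31): the matching bit in the upper half. */
static struct gpio_write gpio_clear_write(unsigned int gpio, unsigned int reg)
{
	struct gpio_write w;

	if (gpio < 16) {
		w.offset = reg;
		w.value = 1u << (gpio + 16);
	} else {
		w.offset = 0x80 + reg;		/* high bank */
		w.value = 1u << gpio;		/* (gpio - 16) + 16 */
	}
	return w;
}
```

Because set and clear use separate bits of the same register, a single
write changes one pin without a read-modify-write cycle.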
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/Makefile | 1
arch/i386/kernel/geode.c | 155 ++++++++++++++++++++++++++++++++++++++++++++
include/asm-i386/geode.h | 159 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 315 insertions(+)
Index: linux/arch/i386/kernel/Makefile
===================================================================
--- linux.orig/arch/i386/kernel/Makefile
+++ linux/arch/i386/kernel/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_VM86) += vm86.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-$(CONFIG_HPET_TIMER) += hpet.o
obj-$(CONFIG_K8_NB) += k8.o
+obj-$(CONFIG_MGEODE_LX) += geode.o
obj-$(CONFIG_VMI) += vmi.o vmiclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o
Index: linux/arch/i386/kernel/geode.c
===================================================================
--- /dev/null
+++ linux/arch/i386/kernel/geode.c
@@ -0,0 +1,155 @@
+/*
+ * AMD Geode southbridge support code
+ * Copyright (C) 2006, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/ioport.h>
+#include <linux/io.h>
+#include <asm/msr.h>
+#include <asm/geode.h>
+
+static struct {
+ char *name;
+ u32 msr;
+ int size;
+ u32 base;
+} lbars[] = {
+ { "geode-pms", MSR_LBAR_PMS, LBAR_PMS_SIZE, 0 },
+ { "geode-acpi", MSR_LBAR_ACPI, LBAR_ACPI_SIZE, 0 },
+ { "geode-gpio", MSR_LBAR_GPIO, LBAR_GPIO_SIZE, 0 },
+ { "geode-mfgpt", MSR_LBAR_MFGPT, LBAR_MFGPT_SIZE, 0 }
+};
+
+static void __init init_lbars(void)
+{
+ u32 lo, hi;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(lbars); i++) {
+ rdmsr(lbars[i].msr, lo, hi);
+ if (hi & 0x01)
+ lbars[i].base = lo & 0x0000ffff;
+
+ if (lbars[i].base == 0)
+ printk(KERN_ERR "geode: Couldn't initialize '%s'\n",
+ lbars[i].name);
+ }
+}
+
+int geode_get_dev_base(unsigned int dev)
+{
+ BUG_ON(dev >= ARRAY_SIZE(lbars));
+ return lbars[dev].base;
+}
+EXPORT_SYMBOL_GPL(geode_get_dev_base);
+
+/* === GPIO API === */
+
+void geode_gpio_set(unsigned int gpio, unsigned int reg)
+{
+ u32 base = geode_get_dev_base(GEODE_DEV_GPIO);
+
+ if (!base)
+ return;
+
+ if (gpio < 16)
+ outl(1 << gpio, base + reg);
+ else
+ outl(1 << (gpio - 16), base + 0x80 + reg);
+}
+EXPORT_SYMBOL_GPL(geode_gpio_set);
+
+void geode_gpio_clear(unsigned int gpio, unsigned int reg)
+{
+ u32 base = geode_get_dev_base(GEODE_DEV_GPIO);
+
+ if (!base)
+ return;
+
+ if (gpio < 16)
+ outl(1 << (gpio + 16), base + reg);
+ else
+ outl(1 << gpio, base + 0x80 + reg);
+}
+EXPORT_SYMBOL_GPL(geode_gpio_clear);
+
+int geode_gpio_isset(unsigned int gpio, unsigned int reg)
+{
+ u32 base = geode_get_dev_base(GEODE_DEV_GPIO);
+
+ if (!base)
+ return 0;
+
+ if (gpio < 16)
+ return (inl(base + reg) & (1 << gpio)) ? 1 : 0;
+ else
+ return (inl(base + 0x80 + reg) & (1 << (gpio - 16))) ? 1 : 0;
+}
+EXPORT_SYMBOL_GPL(geode_gpio_isset);
+
+void geode_gpio_set_irq(unsigned int group, unsigned int irq)
+{
+ u32 lo, hi;
+
+ if (group > 7 || irq > 15)
+ return;
+
+ rdmsr(MSR_PIC_ZSEL_HIGH, lo, hi);
+
+ lo &= ~(0xF << (group * 4));
+ lo |= (irq & 0xF) << (group * 4);
+
+ wrmsr(MSR_PIC_ZSEL_HIGH, lo, hi);
+}
+EXPORT_SYMBOL_GPL(geode_gpio_set_irq);
+
+void geode_gpio_setup_event(unsigned int gpio, int pair, int pme)
+{
+ u32 base = geode_get_dev_base(GEODE_DEV_GPIO);
+ u32 offset, shift, val;
+
+ if (gpio >= 24)
+ offset = GPIO_MAP_W;
+ else if (gpio >= 16)
+ offset = GPIO_MAP_Z;
+ else if (gpio >= 8)
+ offset = GPIO_MAP_Y;
+ else
+ offset = GPIO_MAP_X;
+
+ shift = (gpio % 8) * 4;
+
+ val = inl(base + offset);
+
+ /* Clear whatever was there before */
+ val &= ~(0xF << shift);
+
+ /* And set the new value */
+
+ val |= ((pair & 7) << shift);
+
+ /* Set the PME bit if this is a PME event */
+
+ if (pme)
+ val |= (1 << (shift + 3));
+
+ outl(val, base + offset);
+}
+EXPORT_SYMBOL_GPL(geode_gpio_setup_event);
+
+static int __init geode_southbridge_init(void)
+{
+ if (!is_geode())
+ return -ENODEV;
+
+ init_lbars();
+ return 0;
+}
+
+postcore_initcall(geode_southbridge_init);
Index: linux/include/asm-i386/geode.h
===================================================================
--- /dev/null
+++ linux/include/asm-i386/geode.h
@@ -0,0 +1,159 @@
+/*
+ * AMD Geode definitions
+ * Copyright (C) 2006, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef _ASM_GEODE_H_
+#define _ASM_GEODE_H_
+
+#include <asm/processor.h>
+#include <linux/io.h>
+
+/* Generic southbridge functions */
+
+#define GEODE_DEV_PMS 0
+#define GEODE_DEV_ACPI 1
+#define GEODE_DEV_GPIO 2
+#define GEODE_DEV_MFGPT 3
+
+extern int geode_get_dev_base(unsigned int dev);
+
+/* Useful macros */
+#define geode_pms_base() geode_get_dev_base(GEODE_DEV_PMS)
+#define geode_acpi_base() geode_get_dev_base(GEODE_DEV_ACPI)
+#define geode_gpio_base() geode_get_dev_base(GEODE_DEV_GPIO)
+#define geode_mfgpt_base() geode_get_dev_base(GEODE_DEV_MFGPT)
+
+/* MSRS */
+
+#define GX_GLCP_SYS_RSTPLL 0x4C000014
+
+#define MSR_LBAR_SMB 0x5140000B
+#define MSR_LBAR_GPIO 0x5140000C
+#define MSR_LBAR_MFGPT 0x5140000D
+#define MSR_LBAR_ACPI 0x5140000E
+#define MSR_LBAR_PMS 0x5140000F
+
+#define MSR_PIC_YSEL_LOW 0x51400020
+#define MSR_PIC_YSEL_HIGH 0x51400021
+#define MSR_PIC_ZSEL_LOW 0x51400022
+#define MSR_PIC_ZSEL_HIGH 0x51400023
+
+#define MFGPT_IRQ_MSR 0x51400028
+#define MFGPT_NR_MSR 0x51400029
+
+/* Resource Sizes */
+
+#define LBAR_GPIO_SIZE 0xFF
+#define LBAR_MFGPT_SIZE 0x40
+#define LBAR_ACPI_SIZE 0x40
+#define LBAR_PMS_SIZE 0x80
+
+/* ACPI registers (PMS block) */
+
+/*
+ * PM1_EN is only valid when VSA is enabled for 16 bit reads.
+ * When VSA is not enabled, *always* read both PM1_STS and PM1_EN
+ * with a 32 bit read at offset 0x0
+ */
+
+#define PM1_STS 0x00
+#define PM1_EN 0x02
+#define PM1_CNT 0x08
+#define PM2_CNT 0x0C
+#define PM_TMR 0x10
+#define PM_GPE0_STS 0x18
+#define PM_GPE0_EN 0x1C
+
+/* PMC registers (PMS block) */
+
+#define PM_SSD 0x00
+#define PM_SCXA 0x04
+#define PM_SCYA 0x08
+#define PM_OUT_SLPCTL 0x0C
+#define PM_SCLK 0x10
+#define PM_SED 0x14
+#define PM_SCXD 0x18
+#define PM_SCYD 0x1C
+#define PM_IN_SLPCTL 0x20
+#define PM_WKD 0x30
+#define PM_WKXD 0x34
+#define PM_RD 0x38
+#define PM_WKXA 0x3C
+#define PM_FSD 0x40
+#define PM_TSD 0x44
+#define PM_PSD 0x48
+#define PM_NWKD 0x4C
+#define PM_AWKD 0x50
+#define PM_SSC 0x54
+
+/* GPIO */
+
+#define GPIO_OUTPUT_VAL 0x00
+#define GPIO_OUTPUT_ENABLE 0x04
+#define GPIO_OUTPUT_OPEN_DRAIN 0x08
+#define GPIO_OUTPUT_INVERT 0x0C
+#define GPIO_OUTPUT_AUX1 0x10
+#define GPIO_OUTPUT_AUX2 0x14
+#define GPIO_PULL_UP 0x18
+#define GPIO_PULL_DOWN 0x1C
+#define GPIO_INPUT_ENABLE 0x20
+#define GPIO_INPUT_INVERT 0x24
+#define GPIO_INPUT_FILTER 0x28
+#define GPIO_INPUT_EVENT_COUNT 0x2C
+#define GPIO_READ_BACK 0x30
+#define GPIO_INPUT_AUX1 0x34
+#define GPIO_EVENTS_ENABLE 0x38
+#define GPIO_LOCK_ENABLE 0x3C
+#define GPIO_POSITIVE_EDGE_EN 0x40
+#define GPIO_NEGATIVE_EDGE_EN 0x44
+#define GPIO_POSITIVE_EDGE_STS 0x48
+#define GPIO_NEGATIVE_EDGE_STS 0x4C
+
+#define GPIO_MAP_X 0xE0
+#define GPIO_MAP_Y 0xE4
+#define GPIO_MAP_Z 0xE8
+#define GPIO_MAP_W 0xEC
+
+extern void geode_gpio_set(unsigned int, unsigned int);
+extern void geode_gpio_clear(unsigned int, unsigned int);
+extern int geode_gpio_isset(unsigned int, unsigned int);
+extern void geode_gpio_setup_event(unsigned int, int, int);
+extern void geode_gpio_set_irq(unsigned int, unsigned int);
+
+static inline void geode_gpio_event_irq(unsigned int gpio, int pair)
+{
+ geode_gpio_setup_event(gpio, pair, 0);
+}
+
+static inline void geode_gpio_event_pme(unsigned int gpio, int pair)
+{
+ geode_gpio_setup_event(gpio, pair, 1);
+}
+
+/* Specific geode tests */
+
+static inline int is_geode_gx(void)
+{
+ return ((boot_cpu_data.x86_vendor == X86_VENDOR_NSC) &&
+ (boot_cpu_data.x86 == 5) &&
+ (boot_cpu_data.x86_model == 5));
+}
+
+static inline int is_geode_lx(void)
+{
+ return ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
+ (boot_cpu_data.x86 == 5) &&
+ (boot_cpu_data.x86_model == 10));
+}
+
+static inline int is_geode(void)
+{
+ return (is_geode_gx() || is_geode_lx());
+}
+
+#endif
* [PATCH for review] [32/48] i386: insert HPET firmware resource after PCI enumeration has completed
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (30 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [31/48] i386: basic infrastructure support for AMD geode-class machines Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [33/48] i386: remove old IRQ balancing debug cruft Andi Kleen
` (5 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: adurbin, johnstul, tglx, ak, lenb, linux-kernel
From: Aaron Durbin <adurbin@google.com>
Insert HPET resources after PCI probing has completed in order to
avoid resource conflicts with PCI resource reservation. With this change
the HPET firmware resources are still identified, but they should no
longer cause issues when the HPET address falls on a BAR in a PCI device,
a case in which PCI enumeration could not reserve the resources.
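The conflict can be illustrated with plain range arithmetic. This is a
sketch only: struct range, hpet_range() and ranges_overlap() are
hypothetical names, and the 1 KiB window size is taken from the patch
code (hpet_address + 1024 - 1):

```c
/* Inclusive address range, like the start/end pair of a resource. */
struct range {
	unsigned long start;
	unsigned long end;
};

/* The 1 KiB window the patch registers for the HPET. */
static struct range hpet_range(unsigned long hpet_address)
{
	struct range r;

	r.start = hpet_address;
	r.end = hpet_address + (1 * 1024) - 1;
	return r;
}

/* True if the two inclusive ranges share at least one address. */
static int ranges_overlap(struct range a, struct range b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

If the HPET window is claimed first and a BAR overlaps it, the BAR's
reservation fails. Registering the HPET window late lets PCI claim its
range first.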
Signed-off-by: Aaron Durbin <adurbin@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/acpi/boot.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
Index: linux/arch/i386/kernel/acpi/boot.c
===================================================================
--- linux.orig/arch/i386/kernel/acpi/boot.c
+++ linux/arch/i386/kernel/acpi/boot.c
@@ -618,6 +618,8 @@ static int __init acpi_parse_sbf(struct
#ifdef CONFIG_HPET_TIMER
#include <asm/hpet.h>
+static struct __initdata resource *hpet_res;
+
static int __init acpi_parse_hpet(struct acpi_table_header *table)
{
struct acpi_table_hpet *hpet_tbl;
@@ -638,8 +640,42 @@ static int __init acpi_parse_hpet(struct
printk(KERN_INFO PREFIX "HPET id: %#x base: %#lx\n",
hpet_tbl->id, hpet_address);
+ /*
+ * Allocate and initialize the HPET firmware resource for adding into
+ * the resource tree during the lateinit timeframe.
+ */
+#define HPET_RESOURCE_NAME_SIZE 9
+ hpet_res = alloc_bootmem(sizeof(*hpet_res) + HPET_RESOURCE_NAME_SIZE);
+
+ if (!hpet_res)
+ return 0;
+
+ memset(hpet_res, 0, sizeof(*hpet_res));
+ hpet_res->name = (void *)&hpet_res[1];
+ hpet_res->flags = IORESOURCE_MEM;
+ snprintf((char *)hpet_res->name, HPET_RESOURCE_NAME_SIZE, "HPET %u",
+ hpet_tbl->sequence);
+
+ hpet_res->start = hpet_address;
+ hpet_res->end = hpet_address + (1 * 1024) - 1;
+
return 0;
}
+
+/*
+ * hpet_insert_resource inserts the HPET resources used into the resource
+ * tree.
+ */
+static __init int hpet_insert_resource(void)
+{
+ if (!hpet_res)
+ return 1;
+
+ return insert_resource(&iomem_resource, hpet_res);
+}
+
+late_initcall(hpet_insert_resource);
+
#else
#define acpi_parse_hpet NULL
#endif
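
For reviewers following the allocation in acpi_parse_hpet(): the patch
places the resource name directly behind the struct in a single allocation
and points ->name at it. Below is a minimal userspace sketch of that layout,
not the kernel code itself. struct fake_resource, make_hpet_resource() and
NAME_SIZE are hypothetical stand-ins, calloc() replaces alloc_bootmem(), and
the sequence number is assumed to be an 8-bit value, which is what the
9-byte name buffer suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct resource. */
struct fake_resource {
	unsigned long start;
	unsigned long end;
	const char *name;
};

#define NAME_SIZE 9	/* "HPET 255" is 8 characters plus the NUL */

/* One allocation holds the struct and, right behind it, the name buffer. */
static struct fake_resource *make_hpet_resource(unsigned long base,
						unsigned char sequence)
{
	struct fake_resource *res;
	char *name;

	/* The longest possible name must fit, terminator included. */
	assert(strlen("HPET 255") + 1 == NAME_SIZE);

	res = calloc(1, sizeof(*res) + NAME_SIZE);
	if (!res)
		return NULL;

	name = (char *)&res[1];		/* first byte past the struct */
	snprintf(name, NAME_SIZE, "HPET %u", (unsigned int)sequence);
	res->name = name;

	res->start = base;
	res->end = base + 1024 - 1;	/* inclusive end of a 1 KiB window */
	return res;
}
```

Because ->end is inclusive, the window size is end - start + 1, which is why
the patch subtracts 1 from the 1 KiB length.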
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH for review] [33/48] i386: remove old IRQ balancing debug cruft
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (31 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [32/48] i386: insert HPET firmware resource after PCI enumeration has completed Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [34/48] i386: Update alignment when 4K stacks are used Andi Kleen
` (4 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: stefanr, mingo, ak, rpjday, linux-kernel
From: Stefan Richter <stefanr@s5r6.in-berlin.de>
Dead or misnamed CONFIG_BALANCED_IRQ_DEBUG found by Robert P. J. Day.
It's not a Kconfig variable.
Since this debug code is ancient, I suggest getting rid of this
misleading CONFIG_ macro by deleting all of this debug code.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: "Robert P. J. Day" <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
arch/i386/kernel/io_apic.c | 24 ++----------------------
1 file changed, 2 insertions(+), 22 deletions(-)
Index: linux/arch/i386/kernel/io_apic.c
===================================================================
--- linux.orig/arch/i386/kernel/io_apic.c
+++ linux/arch/i386/kernel/io_apic.c
@@ -353,14 +353,6 @@ static void set_ioapic_affinity_irq(unsi
# include <linux/slab.h> /* kmalloc() */
# include <linux/timer.h> /* time_after() */
-#ifdef CONFIG_BALANCED_IRQ_DEBUG
-# define TDprintk(x...) do { printk("<%ld:%s:%d>: ", jiffies, __FILE__, __LINE__); printk(x); } while (0)
-# define Dprintk(x...) do { TDprintk(x); } while (0)
-# else
-# define TDprintk(x...)
-# define Dprintk(x...)
-# endif
-
#define IRQBALANCE_CHECK_ARCH -999
#define MAX_BALANCED_IRQ_INTERVAL (5*HZ)
#define MIN_BALANCED_IRQ_INTERVAL (HZ/2)
@@ -443,7 +435,7 @@ static inline void balance_irq(int cpu,
static inline void rotate_irqs_among_cpus(unsigned long useful_load_threshold)
{
int i, j;
- Dprintk("Rotating IRQs among CPUs.\n");
+
for_each_online_cpu(i) {
for (j = 0; j < NR_IRQS; j++) {
if (!irq_desc[j].action)
@@ -560,19 +552,11 @@ tryanothercpu:
max_loaded = tmp_loaded; /* processor */
imbalance = (max_cpu_irq - min_cpu_irq) / 2;
- Dprintk("max_loaded cpu = %d\n", max_loaded);
- Dprintk("min_loaded cpu = %d\n", min_loaded);
- Dprintk("max_cpu_irq load = %ld\n", max_cpu_irq);
- Dprintk("min_cpu_irq load = %ld\n", min_cpu_irq);
- Dprintk("load imbalance = %lu\n", imbalance);
-
/* if imbalance is less than approx 10% of max load, then
* observe diminishing returns action. - quit
*/
- if (imbalance < (max_cpu_irq >> 3)) {
- Dprintk("Imbalance too trivial\n");
+ if (imbalance < (max_cpu_irq >> 3))
goto not_worth_the_effort;
- }
tryanotherirq:
/* if we select an IRQ to move that can't go where we want, then
@@ -629,9 +613,6 @@ tryanotherirq:
cpus_and(tmp, target_cpu_mask, allowed_mask);
if (!cpus_empty(tmp)) {
-
- Dprintk("irq = %d moved to cpu = %d\n",
- selected_irq, min_loaded);
/* mark for change destination */
set_pending_irq(selected_irq, cpumask_of_cpu(min_loaded));
@@ -651,7 +632,6 @@ not_worth_the_effort:
*/
balanced_irq_interval = min((long)MAX_BALANCED_IRQ_INTERVAL,
balanced_irq_interval + BALANCED_IRQ_MORE_DELTA);
- Dprintk("IRQ worth rotating not found\n");
return;
}
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH for review] [34/48] i386: Update alignment when 4K stacks are used.
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (32 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [33/48] i386: remove old IRQ balancing debug cruft Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [35/48] x86_64: remove __smp_alt* sections Andi Kleen
` (3 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: rpjday, linux-kernel
From: "Robert P. J. Day" <rpjday@mindspring.com>
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andi Kleen <ak@suse.de>
---
it's not clear from MAINTAINERS who's responsible for something this
generic.
---
arch/i386/kernel/irq.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
Index: linux/arch/i386/kernel/irq.c
===================================================================
--- linux.orig/arch/i386/kernel/irq.c
+++ linux/arch/i386/kernel/irq.c
@@ -149,15 +149,11 @@ fastcall unsigned int do_IRQ(struct pt_r
#ifdef CONFIG_4KSTACKS
-/*
- * These should really be __section__(".bss.page_aligned") as well, but
- * gcc's 3.0 and earlier don't handle that correctly.
- */
static char softirq_stack[NR_CPUS * THREAD_SIZE]
- __attribute__((__aligned__(THREAD_SIZE)));
+ __attribute__((__section__(".bss.page_aligned")));
static char hardirq_stack[NR_CPUS * THREAD_SIZE]
- __attribute__((__aligned__(THREAD_SIZE)));
+ __attribute__((__section__(".bss.page_aligned")));
/*
* allocate per-cpu stacks for hardirq and for softirq processing
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH for review] [35/48] x86_64: remove __smp_alt* sections
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (33 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [34/48] i386: Update alignment when 4K stacks are used Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [36/48] x86_64: k8topology add family 10h and 11h PCI IDs Andi Kleen
` (2 subsequent siblings)
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: jbeulich, linux-kernel
From: "Jan Beulich" <jbeulich@novell.com>
Leftovers from the removal of the more general (but abandoned) SMP
alternatives.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
arch/x86_64/kernel/vmlinux.lds.S | 9 ---------
1 file changed, 9 deletions(-)
Index: linux/arch/x86_64/kernel/vmlinux.lds.S
===================================================================
--- linux.orig/arch/x86_64/kernel/vmlinux.lds.S
+++ linux/arch/x86_64/kernel/vmlinux.lds.S
@@ -141,20 +141,11 @@ SECTIONS
/* might get freed after init */
. = ALIGN(4096);
__smp_alt_begin = .;
- __smp_alt_instructions = .;
- .smp_altinstructions : AT(ADDR(.smp_altinstructions) - LOAD_OFFSET) {
- *(.smp_altinstructions)
- }
- __smp_alt_instructions_end = .;
- . = ALIGN(8);
__smp_locks = .;
.smp_locks : AT(ADDR(.smp_locks) - LOAD_OFFSET) {
*(.smp_locks)
}
__smp_locks_end = .;
- .smp_altinstr_replacement : AT(ADDR(.smp_altinstr_replacement) - LOAD_OFFSET) {
- *(.smp_altinstr_replacement)
- }
. = ALIGN(4096);
__smp_alt_end = .;
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH for review] [36/48] x86_64: k8topology add family 10h and 11h PCI IDs
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (34 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [35/48] x86_64: remove __smp_alt* sections Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [37/48] x86_64: make k8topology multi-core aware Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [38/48] x86_64: Put allocated ELF notes in read-only data segment Andi Kleen
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: joachim.deguara, linux-kernel
From: "Joachim Deguara" <joachim.deguara@amd.com>
This just adds the PCI IDs of the northbridges of AMD's family 10h and 11h
CPUs to k8topology discovery.
Signed-off-by: Joachim Deguara <joachim.deguara@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
--
---
arch/x86_64/mm/k8topology.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
Index: linux/arch/x86_64/mm/k8topology.c
===================================================================
--- linux.orig/arch/x86_64/mm/k8topology.c
+++ linux/arch/x86_64/mm/k8topology.c
@@ -28,11 +28,15 @@ static __init int find_northbridge(void)
u32 header;
header = read_pci_config(0, num, 0, 0x00);
- if (header != (PCI_VENDOR_ID_AMD | (0x1100<<16)))
+ if (header != (PCI_VENDOR_ID_AMD | (0x1100<<16)) &&
+ header != (PCI_VENDOR_ID_AMD | (0x1200<<16)) &&
+ header != (PCI_VENDOR_ID_AMD | (0x1300<<16)) )
continue;
header = read_pci_config(0, num, 1, 0x00);
- if (header != (PCI_VENDOR_ID_AMD | (0x1101<<16)))
+ if (header != (PCI_VENDOR_ID_AMD | (0x1101<<16)) &&
+ header != (PCI_VENDOR_ID_AMD | (0x1201<<16)) &&
+ header != (PCI_VENDOR_ID_AMD | (0x1301<<16)) )
continue;
return num;
}
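
The comparison in find_northbridge() relies on the first dword of PCI
configuration space holding the vendor ID in the low 16 bits and the device
ID in the high 16 bits. The following is a minimal userspace sketch of that
check, not the kernel code. pci_id_dword() and is_amd_nb_function() are
hypothetical helper names; 0x1022 is AMD's PCI vendor ID, and the three
device-ID bases are the ones that appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_AMD 0x1022u	/* AMD's PCI vendor ID */

/* Build config dword 0: vendor in bits 15:0, device in bits 31:16. */
static uint32_t pci_id_dword(uint16_t vendor, uint16_t device)
{
	return (uint32_t)vendor | ((uint32_t)device << 16);
}

/*
 * Return true if 'header' identifies function 'func' (0 or 1) of one of
 * the northbridges the patch accepts: device IDs 0x1100, 0x1200, 0x1300
 * for function 0 and 0x1101, 0x1201, 0x1301 for function 1.
 */
static bool is_amd_nb_function(uint32_t header, unsigned int func)
{
	static const uint16_t base_ids[] = { 0x1100, 0x1200, 0x1300 };
	size_t i;

	assert(func <= 1);

	for (i = 0; i < sizeof(base_ids) / sizeof(base_ids[0]); i++) {
		uint16_t device = (uint16_t)(base_ids[i] + func);

		if (header == pci_id_dword(VENDOR_AMD, device))
			return true;
	}
	return false;
}
```

A table like base_ids[] might also be a tidier way to extend the check the
next time a family is added, but that is only a suggestion.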
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH for review] [37/48] x86_64: make k8topology multi-core aware
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (35 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [36/48] x86_64: k8topology add family 10h and 11h PCI IDs Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
2007-07-19 13:49 ` [PATCH for review] [38/48] x86_64: Put allocated ELF notes in read-only data segment Andi Kleen
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: joachim.deguara, linux-kernel
From: "Joachim Deguara" <joachim.deguara@amd.com>
This makes k8topology multicore aware instead of limited to single- and
dual-core CPUs. It uses CPUID to be more future-proof.
Signed-off-by: Joachim Deguara <joachim.deguara@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
--
---
arch/x86_64/mm/k8topology.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
Index: linux/arch/x86_64/mm/k8topology.c
===================================================================
--- linux.orig/arch/x86_64/mm/k8topology.c
+++ linux/arch/x86_64/mm/k8topology.c
@@ -48,12 +48,12 @@ int __init k8_scan_nodes(unsigned long s
{
unsigned long prevbase;
struct bootnode nodes[8];
- int nodeid, i, nb;
+ int nodeid, i, j, nb;
unsigned char nodeids[8];
int found = 0;
u32 reg;
unsigned numnodes;
- unsigned dualcore = 0;
+ unsigned num_cores;
if (!early_pci_allowed())
return -1;
@@ -64,6 +64,9 @@ int __init k8_scan_nodes(unsigned long s
printk(KERN_INFO "Scanning NUMA topology in Northbridge %d\n", nb);
+ num_cores = (cpuid_ecx(0x80000008) & 0xff) + 1;
+ printk(KERN_INFO "CPU has %d num_cores\n", num_cores);
+
reg = read_pci_config(0, nb, 0, 0x60);
numnodes = ((reg >> 4) & 0xF) + 1;
if (numnodes <= 1)
@@ -77,8 +80,6 @@ int __init k8_scan_nodes(unsigned long s
unsigned long base,limit;
u32 nodeid;
- /* Undefined before E stepping, but hopefully 0 */
- dualcore |= ((read_pci_config(0, nb, 3, 0xe8) >> 12) & 3) == 1;
base = read_pci_config(0, nb, 1, 0x40 + i*8);
limit = read_pci_config(0, nb, 1, 0x44 + i*8);
@@ -174,8 +175,8 @@ int __init k8_scan_nodes(unsigned long s
for (i = 0; i < 8; i++) {
if (nodes[i].start != nodes[i].end) {
nodeid = nodeids[i];
- apicid_to_node[nodeid << dualcore] = i;
- apicid_to_node[(nodeid << dualcore) + dualcore] = i;
+ for (j = 0; j < num_cores; j++)
+ apicid_to_node[(nodeid * num_cores) + j] = i;
setup_node_bootmem(i, nodes[i].start, nodes[i].end);
}
}
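
To illustrate the new mapping: the core count comes from the low byte of
ECX returned by CPUID leaf 0x80000008 plus one, and each node then owns a
contiguous block of num_cores APIC IDs starting at nodeid * num_cores. The
userspace sketch below mirrors that arithmetic only. cores_from_ecx(),
map_node(), MAX_APICS and NO_NODE are hypothetical names, and the ECX value
is passed in rather than read from the CPU.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_APICS 256
#define NO_NODE   0xff	/* marker for "no node assigned" in this sketch */

/* ECX[7:0] of CPUID 0x80000008 holds the number of cores minus one. */
static unsigned int cores_from_ecx(uint32_t ecx)
{
	return (ecx & 0xff) + 1;
}

/* Assign the block of APIC IDs belonging to 'nodeid' to 'node'. */
static void map_node(uint8_t *apicid_to_node, unsigned int nodeid,
		     unsigned int num_cores, uint8_t node)
{
	unsigned int j;

	for (j = 0; j < num_cores; j++) {
		unsigned int apicid = nodeid * num_cores + j;

		assert(apicid < MAX_APICS);	/* stay inside the table */
		apicid_to_node[apicid] = node;
	}
}
```

With four cores per node, node ID 1 covers APIC IDs 4 through 7. The old
code's shift by dualcore gives the same result only for one or two cores.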
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH for review] [38/48] x86_64: Put allocated ELF notes in read-only data segment
2007-07-19 13:48 [PATCH for review] [0/48] Second batch of x86 patches for .23 Andi Kleen
` (36 preceding siblings ...)
2007-07-19 13:49 ` [PATCH for review] [37/48] x86_64: make k8topology multi-core aware Andi Kleen
@ 2007-07-19 13:49 ` Andi Kleen
37 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 13:49 UTC (permalink / raw)
To: roland, linux-kernel
From: Roland McGrath <roland@redhat.com>
This changes the x86_64 linker script to use the asm-generic NOTES macro so
that ELF note sections with SHF_ALLOC set are linked into the kernel image
along with other read-only data. The PT_NOTE also points to their location.
This paves the way for putting useful build-time information into ELF notes
that can be found easily later in a kernel memory dump.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86_64/kernel/vmlinux.lds.S | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Index: linux/arch/x86_64/kernel/vmlinux.lds.S
===================================================================
--- linux.orig/arch/x86_64/kernel/vmlinux.lds.S
+++ linux/arch/x86_64/kernel/vmlinux.lds.S
@@ -48,7 +48,9 @@ SECTIONS
__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) { *(__ex_table) }
__stop___ex_table = .;
- BUG_TABLE
+ NOTES :text :note
+
+ BUG_TABLE :text
RODATA
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [11/48] x86_64: During VM oom condition, kill all threads in process group
2007-07-19 13:48 ` [PATCH for review] [11/48] x86_64: During VM oom condition, kill all threads in process group Andi Kleen
@ 2007-07-19 14:04 ` Christoph Hellwig
2007-07-19 14:14 ` Geert Uytterhoeven
0 siblings, 1 reply; 60+ messages in thread
From: Christoph Hellwig @ 2007-07-19 14:04 UTC (permalink / raw)
To: Andi Kleen; +Cc: will_schmidt, linux-kernel, linux-arch
On Thu, Jul 19, 2007 at 03:48:40PM +0200, Andi Kleen wrote:
>
> From: Will Schmidt <will_schmidt@vnet.ibm.com>
>
> During a VM oom condition, kill all threads in the process group.
>
> We have had complaints where a threaded application is left in a bad state
> after one of its threads is killed when we hit a VM: out_of_memory condition.
>
> Killing just one of the process threads can leave the application in a bad
> state, whereas killing the entire process group would allow for the
> application to restart, or otherwise be handled, and makes it very obvious that
> something has gone wrong.
>
> This change allows the entire process group to be taken down, rather than just
> the one thread.
Shouldn't we have one patch that does this for every architecture instead
of going through arch maintainers and probably losing half of them?
> Index: linux/arch/x86_64/mm/fault.c
> ===================================================================
> --- linux.orig/arch/x86_64/mm/fault.c
> +++ linux/arch/x86_64/mm/fault.c
> @@ -569,7 +569,7 @@ out_of_memory:
> }
> printk("VM: killing process %s\n", tsk->comm);
> if (error_code & 4)
> - do_exit(SIGKILL);
> + do_group_exit(SIGKILL);
> goto no_context;
>
> do_sigbus:
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [11/48] x86_64: During VM oom condition, kill all threads in process group
2007-07-19 14:04 ` Christoph Hellwig
@ 2007-07-19 14:14 ` Geert Uytterhoeven
2007-07-19 15:03 ` Will Schmidt
2007-07-23 18:09 ` [PATCH respin, was PATCH for review] " Will Schmidt
0 siblings, 2 replies; 60+ messages in thread
From: Geert Uytterhoeven @ 2007-07-19 14:14 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Andi Kleen, will_schmidt, linux-kernel, linux-arch
On Thu, 19 Jul 2007, Christoph Hellwig wrote:
> On Thu, Jul 19, 2007 at 03:48:40PM +0200, Andi Kleen wrote:
> > From: Will Schmidt <will_schmidt@vnet.ibm.com>
> >
> > During a VM oom condition, kill all threads in the process group.
> >
> > We have had complaints where a threaded application is left in a bad state
> > after one of its threads is killed when we hit a VM: out_of_memory condition.
> >
> > Killing just one of the process threads can leave the application in a bad
> > state, whereas killing the entire process group would allow for the
> > application to restart, or otherwise be handled, and makes it very obvious that
> > something has gone wrong.
> >
> > This change allows the entire process group to be taken down, rather than just
> > the one thread.
>
> Shouldn't we have one patch that does this for every architecture instead
> of going through arch maintainers and probably losing half of them?
Yes, please ;-)
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
return -EMAINTAINER_TOO_BUSY
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G
2007-07-19 13:48 ` [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G Andi Kleen
@ 2007-07-19 14:46 ` Dave Jones
2007-07-19 17:06 ` Andi Kleen
2007-07-19 14:52 ` Christoph Hellwig
1 sibling, 1 reply; 60+ messages in thread
From: Dave Jones @ 2007-07-19 14:46 UTC (permalink / raw)
To: Andi Kleen; +Cc: wli, lkml, linux-kernel
On Thu, Jul 19, 2007 at 03:48:36PM +0200, Andi Kleen wrote:
> Some users may want NX
> or expanded swapspace support without the overhead or instability of
> highmem.
NX is still going to need the larger PTEs, so I don't see how this
change removes any 'overhead' or potential 'instability'.
I'm not necessarily disagreeing with the change, but this part
of the changelog seems to be false advertising.
Dave
--
http://www.codemonkey.org.uk
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G
2007-07-19 13:48 ` [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G Andi Kleen
2007-07-19 14:46 ` Dave Jones
@ 2007-07-19 14:52 ` Christoph Hellwig
2007-07-20 0:45 ` William Lee Irwin III
1 sibling, 1 reply; 60+ messages in thread
From: Christoph Hellwig @ 2007-07-19 14:52 UTC (permalink / raw)
To: Andi Kleen; +Cc: wli, lkml, linux-kernel
On Thu, Jul 19, 2007 at 03:48:36PM +0200, Andi Kleen wrote:
>
> From: William Lee Irwin III <wli@holomorphy.com>
>
> PAE is useful for more than supporting more than 4GB RAM. It supports
> expanded swapspace and NX executable protections. Some users may want NX
> or expanded swapspace support without the overhead or instability of
> highmem. For these reasons, the following patch divorces CONFIG_X86_PAE
> from CONFIG_HIGHMEM64G.
What overhead or instability of highmem? Sorry folks but this is utter
bollocks. Back in the Caldera days we did a lot of measurement on highmem
overhead, and CONFIG_HIGHMEM has no measurable overhead at all on a system
that doesn't use it. CONFIG_HIGHMEM64G on the other hand has
a quite visible overhead on small systems, but that's entirely due to the
bigger page table entries that you need for NX.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [11/48] x86_64: During VM oom condition, kill all threads in process group
2007-07-19 14:14 ` Geert Uytterhoeven
@ 2007-07-19 15:03 ` Will Schmidt
2007-07-23 18:09 ` [PATCH respin, was PATCH for review] " Will Schmidt
1 sibling, 0 replies; 60+ messages in thread
From: Will Schmidt @ 2007-07-19 15:03 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Christoph Hellwig, Andi Kleen, linux-kernel, linux-arch
On Thu, 2007-07-19 at 16:14 +0200, Geert Uytterhoeven wrote:
> On Thu, 19 Jul 2007, Christoph Hellwig wrote:
> > On Thu, Jul 19, 2007 at 03:48:40PM +0200, Andi Kleen wrote:
> > > From: Will Schmidt <will_schmidt@vnet.ibm.com>
> > >
> > > During a VM oom condition, kill all threads in the process group.
> > >
> > > We have had complaints where a threaded application is left in a bad state
> > > after one of its threads is killed when we hit a VM: out_of_memory condition.
> > >
> > > Killing just one of the process threads can leave the application in a bad
> > > state, whereas killing the entire process group would allow for the
> > > application to restart, or otherwise be handled, and makes it very obvious that
> > > something has gone wrong.
> > >
> > > This change allows the entire process group to be taken down, rather than just
> > > the one thread.
> >
> > Shouldn't we have one patch that does this for every architecture instead
> > of going through arch maintainers and probably losing half of them?
>
> Yes, please ;-)
OK, I'm convinced. :-) I'll spin up an all-arch encompassing patch
in the next day or so.
-Will
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> return -EMAINTAINER_TOO_BUSY
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-19 13:48 ` [PATCH for review] [12/48] x86_64: use the global PIT lock Andi Kleen
@ 2007-07-19 15:22 ` Dmitry Torokhov
2007-07-19 17:29 ` Andi Kleen
0 siblings, 1 reply; 60+ messages in thread
From: Dmitry Torokhov @ 2007-07-19 15:22 UTC (permalink / raw)
To: Andi Kleen; +Cc: tglx, mingo, linux-kernel
Hi Andi,
On 7/19/07, Andi Kleen <ak@suse.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Replace the pcspkr private PIT lock by the global PIT lock to serialize the
> PIT access all over the place.
>
Like I said before, I'd be happier if the spinlock were attached to a
platform device that pcspkr binds to, so the arch code would control
whether we use a private spinlock or a global one (I sent a patch to
that effect earlier). However, I am OK with Thomas's patch as well.
--
Dmitry
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [26/48] i386: hpet assumes boot cpu is 0
2007-07-19 13:48 ` [PATCH for review] [26/48] i386: hpet assumes boot cpu is 0 Andi Kleen
@ 2007-07-19 16:23 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 60+ messages in thread
From: Jeremy Fitzhardinge @ 2007-07-19 16:23 UTC (permalink / raw)
To: chrisw; +Cc: Andi Kleen, mingo, johnstul, linux-kernel
Andi Kleen wrote:
> From: Chris Wright <chrisw@sous-sol.org>
>
> I fixed this in x86_64. Looks like the kind of thing that will break voyager
> on i386.
>
All those Voyagers with hpet?
J
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G
2007-07-19 14:46 ` Dave Jones
@ 2007-07-19 17:06 ` Andi Kleen
0 siblings, 0 replies; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 17:06 UTC (permalink / raw)
To: Dave Jones; +Cc: wli, lkml, linux-kernel
On Thursday 19 July 2007 16:46:31 Dave Jones wrote:
> On Thu, Jul 19, 2007 at 03:48:36PM +0200, Andi Kleen wrote:
I didn't. If you guys really want to have a thread to criticize the changelog
(which seems quite bogus, after all we're interested in code here,
not changelogs) then please at least get your attributions right.
Regarding highmem instability: there are still known ways to drive
systems with larger highmem ratios to early OOM or deadlock, although they
are relatively obscure.
-Andi
>
> > Some users may want NX
> > or expanded swapspace support without the overhead or instability of
> > highmem.
>
> NX is still going to need the larger PTEs, so I don't see how this
> change removes any 'overhead' or potential 'instability'.
> I'm not necessarily disagreeing with the change, but this part
> of the changelog seems to be false advertising.
>
> Dave
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-19 15:22 ` Dmitry Torokhov
@ 2007-07-19 17:29 ` Andi Kleen
2007-07-19 19:23 ` Dmitry Torokhov
0 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 17:29 UTC (permalink / raw)
To: Dmitry Torokhov; +Cc: tglx, mingo, linux-kernel
On Thursday 19 July 2007 17:22:38 Dmitry Torokhov wrote:
> Hi Andi,
>
> On 7/19/07, Andi Kleen <ak@suse.de> wrote:
> >
> > From: Thomas Gleixner <tglx@linutronix.de>
> >
> > Replace the pcspkr private PIT lock by the global PIT lock to serialize the
> > PIT access all over the place.
> >
>
> Like I said before, I'd be happier if the spinlock were attached to a
> platform device that pcspkr binds to, so the arch code would control
> whether we use a private spinlock or a global one (I sent a patch to
> that effect earlier).
Not sure that flexibility is needed. Why would an architecture ever want
to have more than one lock for this? And we normally don't need sysdevs
for locks, they seem to be quite unrelated.
AFAIK sysdevs are just for suspend/resume, and even for that they seem
to get obsoleted now.
-Andi
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-19 17:29 ` Andi Kleen
@ 2007-07-19 19:23 ` Dmitry Torokhov
2007-07-19 19:52 ` Andi Kleen
0 siblings, 1 reply; 60+ messages in thread
From: Dmitry Torokhov @ 2007-07-19 19:23 UTC (permalink / raw)
To: Andi Kleen; +Cc: tglx, mingo, linux-kernel
On 7/19/07, Andi Kleen <ak@suse.de> wrote:
> On Thursday 19 July 2007 17:22:38 Dmitry Torokhov wrote:
> > Hi Andi,
> >
> > On 7/19/07, Andi Kleen <ak@suse.de> wrote:
> > >
> > > From: Thomas Gleixner <tglx@linutronix.de>
> > >
> > > Replace the pcspkr private PIT lock by the global PIT lock to serialize the
> > > PIT access all over the place.
> > >
> >
> > Like I said before, I'd be happier if the spinlock were attached to a
> > platform device that pcspkr binds to, so the arch code would control
> > whether we use a private spinlock or a global one (I sent a patch to
> > that effect earlier).
>
> Not sure that flexibility is needed. Why would an architecture ever want
> to have more than one lock for this? And we normally don't need sysdevs
> for locks, they seem to be quite unrelated.
>
I was not talking about sysdevs. I was talking about platform devices
that are already being created for pcspkr by arch code. Now I want
arch code to provide a spinlock for the pcspkr driver to use when
accessing the PIT. This allows removing arch-specific knowledge
(i.e. #ifdef CONFIG_X86...) from the pcspkr driver.
--
Dmitry
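
The arrangement described above can be sketched in userspace as follows.
This is an illustration under stated assumptions, not the proposed kernel
patch: struct fake_lock and struct fake_device are hypothetical stand-ins
for spinlock_t and struct platform_device, and arch_add_pcspkr() and
driver_beep() are made-up names. The point is only that the "arch" side
owns the lock and hands it over through platform_data, so the "driver"
side needs no arch-specific #ifdef.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock; counts acquisitions for testing. */
struct fake_lock {
	int held;
	unsigned long acquisitions;
};

/* Hypothetical stand-in for a platform device. */
struct fake_device {
	const char *name;
	void *platform_data;
};

static void fake_lock_acquire(struct fake_lock *lock)
{
	assert(!lock->held);	/* no recursion in this sketch */
	lock->held = 1;
	lock->acquisitions++;
}

static void fake_lock_release(struct fake_lock *lock)
{
	assert(lock->held);
	lock->held = 0;
}

/* "Arch" side: owns the PIT lock and attaches it to the device. */
static struct fake_lock arch_pit_lock;

static void arch_add_pcspkr(struct fake_device *dev)
{
	dev->name = "pcspkr";
	dev->platform_data = &arch_pit_lock;
}

/* "Driver" side: uses whatever lock the device carries. */
static int driver_beep(struct fake_device *dev)
{
	struct fake_lock *lock = dev->platform_data;

	if (!lock)
		return -1;	/* no lock supplied: refuse to touch the PIT */

	fake_lock_acquire(lock);
	/* PIT port writes would go here. */
	fake_lock_release(lock);
	return 0;
}
```

Whether the driver should fail or fall back to a private lock when no lock
is supplied is a design choice this sketch does not settle.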
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-19 19:23 ` Dmitry Torokhov
@ 2007-07-19 19:52 ` Andi Kleen
2007-07-20 4:24 ` Dmitry Torokhov
0 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2007-07-19 19:52 UTC (permalink / raw)
To: Dmitry Torokhov; +Cc: tglx, mingo, linux-kernel
>
> I was not talking about sysdevs. I was talking about platform devices
> that are already being created for pcspkr by arch code. Now I want
> arch code to provide a spinlock for the pcspkr driver to use when
> accessing the PIT. This allows removing arch-specific knowledge
> (i.e. #ifdef CONFIG_X86...) from the pcspkr driver.
Ok please send a patch.
-Andi
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [7/48] i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G
2007-07-19 14:52 ` Christoph Hellwig
@ 2007-07-20 0:45 ` William Lee Irwin III
0 siblings, 0 replies; 60+ messages in thread
From: William Lee Irwin III @ 2007-07-20 0:45 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Andi Kleen, lkml, linux-kernel
From: William Lee Irwin III <wli@holomorphy.com>
>> PAE is useful for more than supporting more than 4GB RAM. It supports
>> expanded swapspace and NX executable protections. Some users may want NX
>> or expanded swapspace support without the overhead or instability of
>> highmem. For these reasons, the following patch divorces CONFIG_X86_PAE
>> from CONFIG_HIGHMEM64G.
On Thu, Jul 19, 2007 at 03:52:29PM +0100, Christoph Hellwig wrote:
> What overhead or instability of highmem? Sorry folks but this is utter
> bollocks. Back in the Caldera days we did a lot of measurement on highmem
> overhead, and CONFIG_HIGHMEM has no measurable overhead at all on a system
> that doesn't use it. CONFIG_HIGHMEM64G on the other hand has
> a quite visible overhead on small systems, but that's entirely due to the
> bigger page table entries that you need for NX.
The missing context here is CONFIG_VMSPLIT on laptops.
Laptop users, who frequently use CONFIG_VMSPLIT options to avoid
highmem, wanted to turn on NX. Prior to the patch, those options were
barred for all highmem configurations. In response to those requests,
I produced the patch.
The overhead and instability derived from tiny zones as opposed to
kmap()/kunmap(), or at least such was the case historically.
-- wli
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-19 19:52 ` Andi Kleen
@ 2007-07-20 4:24 ` Dmitry Torokhov
2007-07-20 8:25 ` Andi Kleen
0 siblings, 1 reply; 60+ messages in thread
From: Dmitry Torokhov @ 2007-07-20 4:24 UTC (permalink / raw)
To: Andi Kleen; +Cc: tglx, mingo, linux-kernel
On Thursday 19 July 2007 15:52, Andi Kleen wrote:
>
> >
> > I was not talking about sysdevs. I was talking about platform devices
> > that are already being created for pcspkr by arch code. Now I want
> > arch code to provide a spinlock for the pcspkr driver to use when
> > accessing the PIT. This allows removing arch-specific knowledge
> > (i.e. #ifdef CONFIG_X86...) from the pcspkr driver.
>
> Ok please send a patch.
>
> -Andi
>
Here it is...
--
Dmitry
Subject: Input: pcspkr - use proper lock
From: Dmitry Torokhov <dtor@insightbb.com>
On i386 and x86_64 the access to the PIT is serialized by a lock
in the architecture code. The separate locking in the PC-speaker
code ignores the global lock and creates a nasty race between the
PC-speaker and the PIT clock source/events code on SMP machines.
To fix this, the architecture code attaches the proper lock to the
pcspkr platform device and the driver uses it instead of its
own private lock.
Noticed by Thomas Gleixner <tglx@linutronix.de>
Also restore uevent generation for pcspkr devices so that the
driver can be loaded automatically by udev.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
---
arch/i386/kernel/pcspeaker.c | 20 -------------------
arch/alpha/kernel/setup.c | 9 +++++++-
arch/i386/kernel/Makefile | 1
arch/i386/kernel/i8253.c | 23 ++++++++++++++++++++++
arch/mips/kernel/pcspeaker.c | 9 +++++++-
arch/powerpc/kernel/setup-common.c | 9 +++++++-
arch/x86_64/kernel/Makefile | 2 -
arch/x86_64/kernel/time.c | 38 +++++++++++++++++++++++++++++--------
drivers/input/misc/pcspkr.c | 10 +++++----
9 files changed, 83 insertions(+), 38 deletions(-)
Index: work/arch/alpha/kernel/setup.c
===================================================================
--- work.orig/arch/alpha/kernel/setup.c
+++ work/arch/alpha/kernel/setup.c
@@ -1492,6 +1492,8 @@ alpha_panic_event(struct notifier_block
return NOTIFY_DONE;
}
+static DEFINE_SPINLOCK(i8253_lock);
+
static __init int add_pcspkr(void)
{
struct platform_device *pd;
@@ -1501,9 +1503,14 @@ static __init int add_pcspkr(void)
if (!pd)
return -ENOMEM;
+ pd->dev.platform_data = &i8253_lock;
+ pd->dev.uevent_suppress = 0;
+
ret = platform_device_add(pd);
- if (ret)
+ if (ret) {
+ pd->dev.platform_data = NULL; /* so we don't try to free it */
platform_device_put(pd);
+ }
return ret;
}
Index: work/arch/mips/kernel/pcspeaker.c
===================================================================
--- work.orig/arch/mips/kernel/pcspeaker.c
+++ work/arch/mips/kernel/pcspeaker.c
@@ -10,6 +10,8 @@
#include <linux/platform_device.h>
+static DEFINE_SPINLOCK(i8253_lock);
+
static __init int add_pcspkr(void)
{
struct platform_device *pd;
@@ -19,9 +21,14 @@ static __init int add_pcspkr(void)
if (!pd)
return -ENOMEM;
+ pd->dev.platform_data = &i8253_lock;
+ pd->dev.uevent_suppress = 0;
+
ret = platform_device_add(pd);
- if (ret)
+ if (ret) {
+ pd->dev.platform_data = NULL; /* so we don't try to free it */
platform_device_put(pd);
+ }
return ret;
}
Index: work/arch/x86_64/kernel/time.c
===================================================================
--- work.orig/arch/x86_64/kernel/time.c
+++ work/arch/x86_64/kernel/time.c
@@ -23,6 +23,7 @@
#include <linux/module.h>
#include <linux/device.h>
#include <linux/sysdev.h>
+#include <linux/platform_device.h>
#include <linux/bcd.h>
#include <linux/notifier.h>
#include <linux/cpu.h>
@@ -185,7 +186,7 @@ void main_timer_handler(void)
set_rtc_mmss(xtime.tv_sec);
rtc_update = xtime.tv_sec + 660;
}
-
+
write_sequnlock(&xtime_lock);
}
@@ -226,7 +227,7 @@ static unsigned long get_cmos_time(void)
/*
* We know that x86-64 always uses BCD format, no need to check the
* config register.
- */
+ */
BCD_TO_BIN(sec);
BCD_TO_BIN(min);
@@ -239,11 +240,11 @@ static unsigned long get_cmos_time(void)
BCD_TO_BIN(century);
year += century * 100;
printk(KERN_INFO "Extended CMOS year: %d\n", century * 100);
- } else {
+ } else {
/*
* x86-64 systems only exists since 2002.
* This will work up to Dec 31, 2100
- */
+ */
year += 2000;
}
@@ -321,7 +322,7 @@ static unsigned int __init pit_calibrate
end = get_cycles_sync();
spin_unlock_irqrestore(&i8253_lock, flags);
-
+
return (end - start) / 50;
}
@@ -366,7 +367,7 @@ static struct irqaction irq0 = {
.handler = timer_interrupt,
.flags = IRQF_DISABLED | IRQF_IRQPOLL,
.mask = CPU_MASK_NONE,
- .name = "timer"
+ .name = "timer"
};
void __init time_init(void)
@@ -384,7 +385,7 @@ void __init time_init(void)
if (hpet_use_timer) {
/* set tick_nsec to use the proper rate for HPET */
- tick_nsec = TICK_NSEC_HPET;
+ tick_nsec = TICK_NSEC_HPET;
tsc_khz = hpet_calibrate_tsc();
timename = "HPET";
} else {
@@ -485,5 +486,26 @@ static int time_init_device(void)
error = sysdev_register(&device_timer);
return error;
}
-
device_initcall(time_init_device);
+
+static __init int add_pcspkr(void)
+{
+ struct platform_device *pd;
+ int ret;
+
+ pd = platform_device_alloc("pcspkr", -1);
+ if (!pd)
+ return -ENOMEM;
+
+ pd->dev.platform_data = &i8253_lock;
+ pd->dev.uevent_suppress = 0;
+
+ ret = platform_device_add(pd);
+ if (ret) {
+ pd->dev.platform_data = NULL; /* so we don't try to free it */
+ platform_device_put(pd);
+ }
+
+ return ret;
+}
+device_initcall(add_pcspkr);
Index: work/arch/i386/kernel/i8253.c
===================================================================
--- work.orig/arch/i386/kernel/i8253.c
+++ work/arch/i386/kernel/i8253.c
@@ -6,6 +6,7 @@
#include <linux/spinlock.h>
#include <linux/jiffies.h>
#include <linux/sysdev.h>
+#include <linux/platform_device.h>
#include <linux/module.h>
#include <linux/init.h>
@@ -204,3 +205,25 @@ static int __init init_pit_clocksource(v
return clocksource_register(&clocksource_pit);
}
arch_initcall(init_pit_clocksource);
+
+static __init int add_pcspkr(void)
+{
+ struct platform_device *pd;
+ int ret;
+
+ pd = platform_device_alloc("pcspkr", -1);
+ if (!pd)
+ return -ENOMEM;
+
+ pd->dev.platform_data = &i8253_lock;
+ pd->dev.uevent_suppress = 0;
+
+ ret = platform_device_add(pd);
+ if (ret) {
+ pd->dev.platform_data = NULL; /* so we don't try to free it */
+ platform_device_put(pd);
+ }
+
+ return ret;
+}
+device_initcall(add_pcspkr);
Index: work/arch/x86_64/kernel/Makefile
===================================================================
--- work.orig/arch/x86_64/kernel/Makefile
+++ work/arch/x86_64/kernel/Makefile
@@ -45,7 +45,6 @@ obj-$(CONFIG_PCI) += early-quirks.o
obj-y += topology.o
obj-y += intel_cacheinfo.o
obj-y += addon_cpuid_features.o
-obj-y += pcspeaker.o
CFLAGS_vsyscall.o := $(PROFILING) -g0
@@ -61,5 +60,4 @@ quirks-y += ../../i386/kernel/quirks.o
i8237-y += ../../i386/kernel/i8237.o
msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o
alternative-y += ../../i386/kernel/alternative.o
-pcspeaker-y += ../../i386/kernel/pcspeaker.o
perfctr-watchdog-y += ../../i386/kernel/cpu/perfctr-watchdog.o
Index: work/arch/i386/kernel/Makefile
===================================================================
--- work.orig/arch/i386/kernel/Makefile
+++ work/arch/i386/kernel/Makefile
@@ -43,7 +43,6 @@ obj-$(CONFIG_K8_NB) += k8.o
obj-$(CONFIG_VMI) += vmi.o vmiclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o
-obj-y += pcspeaker.o
obj-$(CONFIG_SCx200) += scx200.o
Index: work/arch/i386/kernel/pcspeaker.c
===================================================================
--- work.orig/arch/i386/kernel/pcspeaker.c
+++ /dev/null
@@ -1,20 +0,0 @@
-#include <linux/platform_device.h>
-#include <linux/errno.h>
-#include <linux/init.h>
-
-static __init int add_pcspkr(void)
-{
- struct platform_device *pd;
- int ret;
-
- pd = platform_device_alloc("pcspkr", -1);
- if (!pd)
- return -ENOMEM;
-
- ret = platform_device_add(pd);
- if (ret)
- platform_device_put(pd);
-
- return ret;
-}
-device_initcall(add_pcspkr);
Index: work/drivers/input/misc/pcspkr.c
===================================================================
--- work.orig/drivers/input/misc/pcspkr.c
+++ work/drivers/input/misc/pcspkr.c
@@ -24,10 +24,10 @@ MODULE_AUTHOR("Vojtech Pavlik <vojtech@u
MODULE_DESCRIPTION("PC Speaker beeper driver");
MODULE_LICENSE("GPL");
-static DEFINE_SPINLOCK(i8253_beep_lock);
-
static int pcspkr_event(struct input_dev *dev, unsigned int type, unsigned int code, int value)
{
+ struct platform_device *pdev = input_get_drvdata(dev);
+ spinlock_t *i8253_lock = pdev->dev.platform_data;
unsigned int count = 0;
unsigned long flags;
@@ -43,7 +43,7 @@ static int pcspkr_event(struct input_dev
if (value > 20 && value < 32767)
count = PIT_TICK_RATE / value;
- spin_lock_irqsave(&i8253_beep_lock, flags);
+ spin_lock_irqsave(i8253_lock, flags);
if (count) {
/* enable counter 2 */
@@ -58,7 +58,7 @@ static int pcspkr_event(struct input_dev
outb(inb_p(0x61) & 0xFC, 0x61);
}
- spin_unlock_irqrestore(&i8253_beep_lock, flags);
+ spin_unlock_irqrestore(i8253_lock, flags);
return 0;
}
@@ -84,6 +84,8 @@ static int __devinit pcspkr_probe(struct
pcspkr_dev->sndbit[0] = BIT(SND_BELL) | BIT(SND_TONE);
pcspkr_dev->event = pcspkr_event;
+ input_set_drvdata(pcspkr_dev, dev);
+
err = input_register_device(pcspkr_dev);
if (err) {
input_free_device(pcspkr_dev);
Index: work/arch/powerpc/kernel/setup-common.c
===================================================================
--- work.orig/arch/powerpc/kernel/setup-common.c
+++ work/arch/powerpc/kernel/setup-common.c
@@ -425,6 +425,8 @@ void __init smp_setup_cpu_maps(void)
}
#endif /* CONFIG_SMP */
+static DEFINE_SPINLOCK(i8253_lock);
+
static __init int add_pcspkr(void)
{
struct device_node *np;
@@ -440,9 +442,14 @@ static __init int add_pcspkr(void)
if (!pd)
return -ENOMEM;
+ pd->dev.platform_data = &i8253_lock;
+ pd->dev.uevent_suppress = 0;
+
ret = platform_device_add(pd);
- if (ret)
+ if (ret) {
+ pd->dev.platform_data = NULL; /* so we don't try to free it */
platform_device_put(pd);
+ }
return ret;
}
* Re: [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-20 4:24 ` Dmitry Torokhov
@ 2007-07-20 8:25 ` Andi Kleen
2007-07-20 12:50 ` Dmitry Torokhov
0 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2007-07-20 8:25 UTC (permalink / raw)
To: Dmitry Torokhov; +Cc: tglx, mingo, linux-kernel
> +static DEFINE_SPINLOCK(i8253_lock);
> +
> static __init int add_pcspkr(void)
> {
> struct platform_device *pd;
> @@ -1501,9 +1503,14 @@ static __init int add_pcspkr(void)
> if (!pd)
> return -ENOMEM;
>
> + pd->dev.platform_data = &i8253_lock;
It seems pretty ugly to pass spinlocks around in void * pointers. Also,
for general memory bloat reasons, I don't like allocating big data structures
just for this.
Wouldn't it be better to just define i8253_lock weakly in the pcspkr code and let
the architecture override it?
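A minimal sketch of the weak-definition idea, written as a userspace analogy rather than kernel code. The assumptions: a pthread mutex stands in for the spinlock, and pcspkr_set_count(), pcspkr_get_count() and speaker_count are hypothetical names that do not appear in the patch. It also ignores the case where the driver is built as a module.

```c
#include <pthread.h>

/*
 * Userspace analogy only: a pthread mutex stands in for the kernel
 * spinlock. __attribute__((weak)) is the GCC/Clang extension that the
 * kernel wraps as __weak.
 *
 * This is the default definition on the driver side. If another object
 * file in the same link defines a non-weak i8253_lock, the linker uses
 * that definition instead of this one.
 */
__attribute__((weak)) pthread_mutex_t i8253_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the PIT counter 2 reload value. */
static unsigned int speaker_count;

/* The driver only ever names i8253_lock; link time decides which one. */
int pcspkr_set_count(unsigned int count)
{
	if (pthread_mutex_lock(&i8253_lock) != 0)
		return -1;
	speaker_count = count;
	pthread_mutex_unlock(&i8253_lock);
	return 0;
}

unsigned int pcspkr_get_count(void)
{
	unsigned int count;

	pthread_mutex_lock(&i8253_lock);
	count = speaker_count;
	pthread_mutex_unlock(&i8253_lock);
	return count;
}
```

An architecture that owns the PIT would supply the strong definition in its own file, and the driver code above would need no change.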
> Index: work/arch/x86_64/kernel/time.c
> ===================================================================
> --- work.orig/arch/x86_64/kernel/time.c
> +++ work/arch/x86_64/kernel/time.c
> @@ -23,6 +23,7 @@
> #include <linux/module.h>
> #include <linux/device.h>
> #include <linux/sysdev.h>
> +#include <linux/platform_device.h>
> #include <linux/bcd.h>
> #include <linux/notifier.h>
> #include <linux/cpu.h>
> @@ -185,7 +186,7 @@ void main_timer_handler(void)
> set_rtc_mmss(xtime.tv_sec);
> rtc_update = xtime.tv_sec + 660;
> }
> -
> +
> write_sequnlock(&xtime_lock);
> }
No random whitespace changes in patches, please. There are multiple occurrences.
-Andi
* Re: [PATCH for review] [12/48] x86_64: use the global PIT lock
2007-07-20 8:25 ` Andi Kleen
@ 2007-07-20 12:50 ` Dmitry Torokhov
0 siblings, 0 replies; 60+ messages in thread
From: Dmitry Torokhov @ 2007-07-20 12:50 UTC (permalink / raw)
To: Andi Kleen; +Cc: tglx, mingo, linux-kernel
On 7/20/07, Andi Kleen <ak@suse.de> wrote:
>
> > +static DEFINE_SPINLOCK(i8253_lock);
> > +
> > static __init int add_pcspkr(void)
> > {
> > struct platform_device *pd;
> > @@ -1501,9 +1503,14 @@ static __init int add_pcspkr(void)
> > if (!pd)
> > return -ENOMEM;
> >
> > +pd->dev.platform_data = &i8253_lock;
>
> That seems pretty ugly to pass spinlocks around in void * pointers.
That spinlock _is_ platform data. We could define
struct pcspkr_platform_data {
spinlock_t *lock;
};
and pass this around as the rest of the platform code does, but then we'd
need a header file and it would add a level of indirection. If you
like this better I can change it. Otherwise, a spinlock is just another data
structure, and we pass them around all the time.
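A rough sketch of what the typed variant could look like, again as a userspace analogy. The assumptions: struct fake_device is a hypothetical stand-in for struct device, a pthread mutex stands in for spinlock_t, and arch_add_pcspkr() and pcspkr_lock() are illustrative names, not functions from the patch.

```c
#include <stddef.h>
#include <pthread.h>

/* Typed wrapper, following the struct proposed above. */
struct pcspkr_platform_data {
	pthread_mutex_t *lock;	/* spinlock_t * in the kernel version */
};

/* Hypothetical stand-in for the platform_data field of struct device. */
struct fake_device {
	void *platform_data;
};

static pthread_mutex_t i8253_lock = PTHREAD_MUTEX_INITIALIZER;

/* Static storage: nothing is allocated at registration time. */
static struct pcspkr_platform_data pcspkr_pdata = {
	.lock = &i8253_lock,
};

/* Architecture side: attach the typed data before adding the device. */
void arch_add_pcspkr(struct fake_device *dev)
{
	dev->platform_data = &pcspkr_pdata;
}

/* Driver side: one extra dereference compared with a bare lock pointer. */
pthread_mutex_t *pcspkr_lock(const struct fake_device *dev)
{
	const struct pcspkr_platform_data *pdata = dev->platform_data;

	return pdata ? pdata->lock : NULL;
}
```

The cost is the extra indirection and the shared header mentioned above; the gain is that the void * no longer points straight at a lock.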
> Also
> out of general memory bloat reasons i don't like allocating big data structures
> just for this.
>
I am not sure where you see a new data structure allocation... If you
look at your box you should see that the /sys/bus/platform/devices/pcspkr
device is already there. We already create it so that the pcspkr driver
can bind to it.
> Wouldn't it be better to just define i8253_lock weakly in the pcspkr code and let
> the architecture override it?
Yes, it probably is better.
>
> > Index: work/arch/x86_64/kernel/time.c
> > ===================================================================
> > --- work.orig/arch/x86_64/kernel/time.c
> > +++ work/arch/x86_64/kernel/time.c
> > @@ -23,6 +23,7 @@
> > #include <linux/module.h>
> > #include <linux/device.h>
> > #include <linux/sysdev.h>
> > +#include <linux/platform_device.h>
> > #include <linux/bcd.h>
> > #include <linux/notifier.h>
> > #include <linux/cpu.h>
> > @@ -185,7 +186,7 @@ void main_timer_handler(void)
> > set_rtc_mmss(xtime.tv_sec);
> > rtc_update = xtime.tv_sec + 660;
> > }
> > -
> > +
> > write_sequnlock(&xtime_lock);
> > }
>
> No random white space changes in patches, multiple occurrences ?!?
>
My bad, sorry.
--
Dmitry
* [PATCH respin, was PATCH for review] During VM oom condition, kill all threads in process group
2007-07-19 14:14 ` Geert Uytterhoeven
2007-07-19 15:03 ` Will Schmidt
@ 2007-07-23 18:09 ` Will Schmidt
2007-07-23 21:16 ` Andrew Morton
2007-07-31 9:31 ` Pavel Machek
1 sibling, 2 replies; 60+ messages in thread
From: Will Schmidt @ 2007-07-23 18:09 UTC (permalink / raw)
To: Geert Uytterhoeven, Andrew Morton
Cc: Christoph Hellwig, Andi Kleen, linux-kernel, linux-arch
During VM oom condition, kill all threads in process group.
We have had complaints where a threaded application is left in a bad
state after one of its threads is killed when we hit a VM: out_of_memory
condition.
Killing just one of the process threads can leave the application in a
bad state, whereas killing the entire process group would allow for
the application to restart, or be otherwise handled, and makes it very
obvious that something has gone wrong.
This change allows the entire process group to be taken down, rather
than just the one thread.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
---
This patch hits all arches except for x86_64 and powerpc; those arches
have already been updated with this change.
diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index a0e18da..25154df 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -197,7 +197,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
current->comm, current->pid);
if (!user_mode(regs))
goto no_context;
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
do_sigbus:
/* Send a sigbus, regardless of whether we were in kernel
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 846cce4..59ed1d0 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -266,7 +266,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
* the page fault gracefully.
*/
printk("VM: killing process %s\n", tsk->comm);
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
return 0;
}
if (fault & VM_FAULT_SIGBUS) {
diff --git a/arch/arm26/mm/fault.c b/arch/arm26/mm/fault.c
index dec638a..df14681 100644
--- a/arch/arm26/mm/fault.c
+++ b/arch/arm26/mm/fault.c
@@ -246,7 +246,7 @@ int do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
* us that made us unable to handle the page fault gracefully.
*/
printk("VM: killing process %s\n", tsk->comm);
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
}
else{
__do_user_fault(tsk, addr, fsr, fault == -1 ? SEGV_ACCERR : SEGV_MAPERR, regs);
diff --git a/arch/avr32/mm/fault.c b/arch/avr32/mm/fault.c
index ae2d2c5..11472f8 100644
--- a/arch/avr32/mm/fault.c
+++ b/arch/avr32/mm/fault.c
@@ -216,7 +216,7 @@ out_of_memory:
}
printk("VM: Killing process %s\n", tsk->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/cris/mm/fault.c b/arch/cris/mm/fault.c
index 8672ab7..8aab814 100644
--- a/arch/cris/mm/fault.c
+++ b/arch/cris/mm/fault.c
@@ -360,7 +360,7 @@ do_page_fault(unsigned long address, struct pt_regs *regs,
up_read(&mm->mmap_sem);
printk("VM: killing process %s\n", tsk->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/frv/mm/fault.c b/arch/frv/mm/fault.c
index 6798fa0..05093d4 100644
--- a/arch/frv/mm/fault.c
+++ b/arch/frv/mm/fault.c
@@ -259,7 +259,7 @@ asmlinkage void do_page_fault(int datammu, unsigned long esr0, unsigned long ear
up_read(&mm->mmap_sem);
printk("VM: killing process %s\n", current->comm);
if (user_mode(__frame))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
index 01ffdd4..2bc2592 100644
--- a/arch/i386/mm/fault.c
+++ b/arch/i386/mm/fault.c
@@ -597,7 +597,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", tsk->comm);
if (error_code & 4)
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 73ccb60..320edf8 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -273,6 +273,6 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
}
printk(KERN_CRIT "VM: killing process %s\n", current->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
}
diff --git a/arch/m32r/mm/fault.c b/arch/m32r/mm/fault.c
index 676a1c4..70a766a 100644
--- a/arch/m32r/mm/fault.c
+++ b/arch/m32r/mm/fault.c
@@ -278,7 +278,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", tsk->comm);
if (error_code & ACE_USERMODE)
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 578b48f..eaa6186 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -188,7 +188,7 @@ out_of_memory:
printk("VM: killing process %s\n", current->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
no_context:
current->thread.signo = SIGBUS;
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 521771b..5699c77 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -180,7 +180,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", tsk->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index 7899ab8..1c091b4 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -263,6 +263,6 @@ no_context:
up_read(&mm->mmap_sem);
printk(KERN_CRIT "VM: killing process %s\n", current->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
}
diff --git a/arch/ppc/mm/fault.c b/arch/ppc/mm/fault.c
index b98244e..94913dd 100644
--- a/arch/ppc/mm/fault.c
+++ b/arch/ppc/mm/fault.c
@@ -297,7 +297,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", current->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
return SIGKILL;
do_sigbus:
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 5405519..f5bd497 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -218,7 +218,7 @@ static int do_out_of_memory(struct pt_regs *regs, unsigned long error_code,
}
printk("VM: killing process %s\n", tsk->comm);
if (regs->psw.mask & PSW_MASK_PSTATE)
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
do_no_context(regs, error_code, address);
return 0;
}
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 964c676..b0d5170 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -215,7 +215,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", tsk->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/sh64/mm/fault.c b/arch/sh64/mm/fault.c
index 0d069d8..dd81c66 100644
--- a/arch/sh64/mm/fault.c
+++ b/arch/sh64/mm/fault.c
@@ -334,7 +334,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", tsk->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/sparc/mm/fault.c b/arch/sparc/mm/fault.c
index 50747fe..e4d9c8e 100644
--- a/arch/sparc/mm/fault.c
+++ b/arch/sparc/mm/fault.c
@@ -369,7 +369,7 @@ out_of_memory:
up_read(&mm->mmap_sem);
printk("VM: killing process %s\n", tsk->comm);
if (from_user)
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto no_context;
do_sigbus:
diff --git a/arch/sparc64/mm/fault.c b/arch/sparc64/mm/fault.c
index 17123e9..13fdfa3 100644
--- a/arch/sparc64/mm/fault.c
+++ b/arch/sparc64/mm/fault.c
@@ -466,7 +466,7 @@ out_of_memory:
up_read(&mm->mmap_sem);
printk("VM: killing process %s\n", current->comm);
if (!(regs->tstate & TSTATE_PRIV))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
goto handle_kernel_fault;
intr_or_no_mm:
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index 1600406..399df5c 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -150,7 +150,7 @@ out_of_memory:
}
printk("VM: killing process %s\n", current->comm);
if (user_mode(regs))
- do_exit(SIGKILL);
+ do_group_exit(SIGKILL);
bad_page_fault(regs, address, SIGKILL);
return;
* Re: [PATCH respin, was PATCH for review] During VM oom condition, kill all threads in process group
2007-07-23 18:09 ` [PATCH respin, was PATCH for review] " Will Schmidt
@ 2007-07-23 21:16 ` Andrew Morton
2007-07-24 14:28 ` Will Schmidt
2007-07-31 9:31 ` Pavel Machek
1 sibling, 1 reply; 60+ messages in thread
From: Andrew Morton @ 2007-07-23 21:16 UTC (permalink / raw)
To: will_schmidt
Cc: Geert Uytterhoeven, Christoph Hellwig, Andi Kleen, linux-kernel,
linux-arch
On Mon, 23 Jul 2007 13:09:45 -0500
Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
> During VM oom condition, kill all threads in process group.
>
> We have had complaints where a threaded application is left in a bad
> state after one of its threads is killed when we hit a VM: out_of_memory
> condition.
> Killing just one of the process threads can leave the application in a
> bad state, whereas killing the entire process group would allow for
> the application to restart, or be otherwise handled, and makes it very
> obvious that something has gone wrong.
>
> This change allows the entire process group to be taken down, rather
> than just the one thread.
Just checking...
blackfin
h8300
m68knommu
uml
v850
were not changed. Intentional?
* Re: [PATCH respin, was PATCH for review] During VM oom condition, kill all threads in process group
2007-07-23 21:16 ` Andrew Morton
@ 2007-07-24 14:28 ` Will Schmidt
2007-07-24 14:31 ` Christoph Hellwig
0 siblings, 1 reply; 60+ messages in thread
From: Will Schmidt @ 2007-07-24 14:28 UTC (permalink / raw)
To: Andrew Morton
Cc: Geert Uytterhoeven, Christoph Hellwig, Andi Kleen, linux-kernel,
linux-arch
On Mon, 2007-07-23 at 14:16 -0700, Andrew Morton wrote:
> On Mon, 23 Jul 2007 13:09:45 -0500
> Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
>
> > During VM oom condition, kill all threads in process group.
> >
> > We have had complaints where a threaded application is left in a bad
> > state after one of its threads is killed when we hit a VM: out_of_memory
> > condition.
> > Killing just one of the process threads can leave the application in a
> > bad state, whereas killing the entire process group would allow for
> > the application to restart, or be otherwise handled, and makes it very
> > obvious that something has gone wrong.
> >
> > This change allows the entire process group to be taken down, rather
> > than just the one thread.
>
> Just checking...
>
> blackfin
> h8300
> m68knommu
> uml
> v850
>
> were not changed. Intentional?
Yes. Those arches don't have the VM OOM code, as far as I could see. There
is an occasional do_exit() reference elsewhere in the fault handler
code, for reasons other than VM OOM, so I deliberately didn't touch
those either.
* Re: [PATCH respin, was PATCH for review] During VM oom condition, kill all threads in process group
2007-07-24 14:28 ` Will Schmidt
@ 2007-07-24 14:31 ` Christoph Hellwig
0 siblings, 0 replies; 60+ messages in thread
From: Christoph Hellwig @ 2007-07-24 14:31 UTC (permalink / raw)
To: Will Schmidt
Cc: Andrew Morton, Geert Uytterhoeven, Christoph Hellwig, Andi Kleen,
linux-kernel, linux-arch
On Tue, Jul 24, 2007 at 09:28:44AM -0500, Will Schmidt wrote:
> > Just checking...
> >
> > blackfin
> > h8300
> > m68knommu
> > uml
> > v850
> >
> > were not changed. Intentional?
>
> Yes. Those arches don't have the VM oom code that I could see. There
> is an occasional do_exit() reference elsewhere in the fault handler
> code, for reasons other than VM oom, so I deliberately didn't touch
> those either.
Except for uml, these are the !CONFIG_MMU-only architectures, so that makes
a lot of sense.
* Re: [PATCH respin, was PATCH for review] During VM oom condition, kill all threads in process group
2007-07-23 18:09 ` [PATCH respin, was PATCH for review] " Will Schmidt
2007-07-23 21:16 ` Andrew Morton
@ 2007-07-31 9:31 ` Pavel Machek
2007-07-31 14:55 ` Will Schmidt
1 sibling, 1 reply; 60+ messages in thread
From: Pavel Machek @ 2007-07-31 9:31 UTC (permalink / raw)
To: Will Schmidt
Cc: Geert Uytterhoeven, Andrew Morton, Christoph Hellwig, Andi Kleen,
linux-kernel, linux-arch
Hi!
>
> During VM oom condition, kill all threads in process group.
>
> We have had complaints where a threaded application is left in a bad
> state after one of its threads is killed when we hit a VM: out_of_memory
> condition.
> Killing just one of the process threads can leave the application in a
> bad state, whereas killing the entire process group would allow for
> the application to restart, or be otherwise handled, and makes it very
> obvious that something has gone wrong.
>
> This change allows the entire process group to be taken down, rather
> than just the one thread.
>
> Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
> diff --git a/arch/sparc64/mm/fault.c b/arch/sparc64/mm/fault.c
> index 17123e9..13fdfa3 100644
> --- a/arch/sparc64/mm/fault.c
> +++ b/arch/sparc64/mm/fault.c
> @@ -466,7 +466,7 @@ out_of_memory:
> up_read(&mm->mmap_sem);
> printk("VM: killing process %s\n", current->comm);
> if (!(regs->tstate & TSTATE_PRIV))
> - do_exit(SIGKILL);
> + do_group_exit(SIGKILL);
> goto handle_kernel_fault;
>
> intr_or_no_mm:
Is the printk still accurate (does it kill more than one process now)?
Why does it print when it will not really kill the process?
I see similar code across all the archs... would it make sense to
create a common helper... or is the helper too trivial?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [PATCH respin, was PATCH for review] During VM oom condition, kill all threads in process group
2007-07-31 9:31 ` Pavel Machek
@ 2007-07-31 14:55 ` Will Schmidt
0 siblings, 0 replies; 60+ messages in thread
From: Will Schmidt @ 2007-07-31 14:55 UTC (permalink / raw)
To: Pavel Machek
Cc: Geert Uytterhoeven, Andrew Morton, Christoph Hellwig, Andi Kleen,
linux-kernel, linux-arch
On Tue, 2007-07-31 at 11:31 +0200, Pavel Machek wrote:
> Hi!
>
> >
> > During VM oom condition, kill all threads in process group.
> >
> > We have had complaints where a threaded application is left in a bad
> > state after one of its threads is killed when we hit a VM: out_of_memory
> > condition.
> > Killing just one of the process threads can leave the application in a
> > bad state, whereas killing the entire process group would allow for
> > the application to restart, or be otherwise handled, and makes it very
> > obvious that something has gone wrong.
> >
> > This change allows the entire process group to be taken down, rather
> > than just the one thread.
> >
> > Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
>
> > diff --git a/arch/sparc64/mm/fault.c b/arch/sparc64/mm/fault.c
> > index 17123e9..13fdfa3 100644
> > --- a/arch/sparc64/mm/fault.c
> > +++ b/arch/sparc64/mm/fault.c
> > @@ -466,7 +466,7 @@ out_of_memory:
> > up_read(&mm->mmap_sem);
> > printk("VM: killing process %s\n", current->comm);
> > if (!(regs->tstate & TSTATE_PRIV))
> > - do_exit(SIGKILL);
> > + do_group_exit(SIGKILL);
> > goto handle_kernel_fault;
> >
> > intr_or_no_mm:
>
> is the printk still accurate (does it kill more than one process now)?
I was going to double-check this morning, but I don't see where
current->comm is copied into a new task_struct. I thought that all
processes within the group had the same current->comm value, so I figure
this is OK.
> Why does it print when it will not really kill the process?
No idea.
> I see similar code across all the archs... would it make sense to
> create common helper... or is the helper too trivial?
The checks blocking flow into do_group_exit, like (regs->tstate &
TSTATE_PRIV) for sparc64, or (user_mode(regs)) for powerpc, do vary
across the arches. The code could be rearranged to have a helper
containing just the printk and the do_group_exit() call, but I'm not
sure that would be an improvement.
Maybe a do_group_sigkill_if(condition); helper :-)
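For illustration only, such a helper might be shaped roughly like this. It is a userspace sketch under loud assumptions: do_group_exit_stub() is a recording stub (the real do_group_exit() does not return), printf stands in for printk, and vm_oom_kill_if() and stub_last_signal() are hypothetical names. Each arch would pass its own condition, such as user_mode(regs) or !(regs->tstate & TSTATE_PRIV).

```c
#include <signal.h>
#include <stdio.h>

/* Records the last signal handed to the stub; 0 means "never called". */
static int last_group_exit_signal;

/* Stub: the real do_group_exit() takes the thread group down and never returns. */
static void do_group_exit_stub(int sig)
{
	last_group_exit_signal = sig;
}

int stub_last_signal(void)
{
	return last_group_exit_signal;
}

/*
 * Common tail of the out_of_memory path: print, then kill only if the
 * arch-specific condition says the fault came from user mode.
 * Returns 1 if the kill path was taken, 0 if the caller should go on
 * to its own kernel-fault handling (no_context and friends).
 */
int vm_oom_kill_if(int from_user, const char *comm)
{
	/* Same order as the existing code: the message is printed either way. */
	printf("VM: killing process %s\n", comm);
	if (from_user) {
		do_group_exit_stub(SIGKILL);
		return 1;
	}
	return 0;
}
```

The arch-specific part shrinks to computing the condition, but the fall-through target still differs per arch, which is part of why the gain looks small.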
-Will
> Pavel