* [PATCH 01/12] x86_64: Allow fixmaps to be used with the initial page table.
[not found] <m1d51l6f1y.fsf@ebiederm.dsl.xmission.com>
@ 2007-04-30 15:48 ` Eric W. Biederman
[not found] ` <m18xc96eyq.fsf@ebiederm.dsl.xmission.com>
1 sibling, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 15:48 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, virtualization, H. Peter Anvin, Andrew Morton
This patch preallocates the intermediate page table entries so that
all that is needed to set up a fixmap is to fill in the appropriate
pte.
By doing this, modern hardware that uses memory mapped access can be
talked to early in boot through a fixmap, allowing USB debugging and
the like.
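The index arithmetic behind the preallocated entries can be checked with a small user-space sketch. It assumes the 2MB fixmap window sits directly below the 8MB vsyscall area and the 2MB hole mentioned in the patch comment, i.e. 12MB below the top of the address space; the helper name `level_index` and the constants are illustrative, not kernel identifiers.

```c
#include <stdint.h>

/* Each x86_64 paging level is indexed by 9 bits (512 entries). */
#define SK_PTRS_PER_LEVEL 512u
#define SK_PGD_SHIFT 39   /* top level: 512GB per entry */
#define SK_PUD_SHIFT 30   /* level 3:   1GB per entry   */
#define SK_PMD_SHIFT 21   /* level 2:   2MB per entry   */

/* Assumption: the window starts 12MB below the top of the address space
 * (8MB vsyscalls + 2MB hole + the 2MB window itself). */
#define SK_FIXMAP_WINDOW ((uint64_t)0 - ((uint64_t)12 << 20))

/* Extract the table index used at one paging level for a virtual address. */
static unsigned level_index(uint64_t vaddr, unsigned shift)
{
    return (unsigned)((vaddr >> shift) & (SK_PTRS_PER_LEVEL - 1));
}
```

Under that assumption the window resolves to slot 511 at the top two levels and slot 506 at level 2, which matches the `511*8` and `506*8` offsets used in the diff.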
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/x86_64/kernel/head.S | 14 +++++++++++++-
1 files changed, 13 insertions(+), 1 deletions(-)
diff --git a/arch/x86_64/kernel/head.S b/arch/x86_64/kernel/head.S
index 1fab487..1b704c4 100644
--- a/arch/x86_64/kernel/head.S
+++ b/arch/x86_64/kernel/head.S
@@ -74,6 +74,9 @@ startup_64:
addq %rbp, level3_ident_pgt + 0(%rip)
addq %rbp, level3_kernel_pgt + (510*8)(%rip)
+ addq %rbp, level3_kernel_pgt + (511*8)(%rip)
+
+ addq %rbp, level2_fixmap_pgt + (506*8)(%rip)
/* Add an Identity mapping if I am above 1G */
leaq _text(%rip), %rdi
@@ -314,7 +317,16 @@ NEXT_PAGE(level3_kernel_pgt)
.fill 510,8,0
/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
.quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
- .fill 1,8,0
+ .quad level2_fixmap_pgt - __START_KERNEL_map + _KERNPG_TABLE + _PAGE_USER
+
+NEXT_PAGE(level2_fixmap_pgt)
+ .fill 506,8,0
+ .quad level1_fixmap_pgt - __START_KERNEL_map + _KERNPG_TABLE + _PAGE_USER
+ /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */
+ .fill 5,8,0
+
+NEXT_PAGE(level1_fixmap_pgt)
+ .fill 512,8,0
NEXT_PAGE(level2_ident_pgt)
/* Since I easily can, map the first 1G.
--
1.5.1.1.181.g2de0
* [PATCH 02/12] i386 head.S: Remove unnecessary use of %ebx as the boot cpu flag
[not found] ` <m18xc96eyq.fsf@ebiederm.dsl.xmission.com>
@ 2007-04-30 15:49 ` Eric W. Biederman
[not found] ` <m14pmx6ewk.fsf_-_@ebiederm.dsl.xmission.com>
1 sibling, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 15:49 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, virtualization, H. Peter Anvin, Andrew Morton
Currently in head.S there are two ways we test to see if we
are the boot cpu: by looking at %ebx and by looking at the
static variable ready. When changing things around I have
found that it gets tricky to preserve %ebx. So this
patch just switches head.S over to the more reliable
test of always using ready.
Hopefully later we can kill these tests entirely.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/kernel/head.S | 7 +------
1 files changed, 1 insertions(+), 6 deletions(-)
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index 9b10af6..aa9e28e 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -157,7 +157,6 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
jb 10b
movl %edi,(init_pg_tables_end - __PAGE_OFFSET)
- xorl %ebx,%ebx /* This is the boot CPU (BSP) */
jmp 3f
/*
* Non-boot CPU entry point; entered from trampoline.S
@@ -228,10 +227,6 @@ ENTRY(startup_32_smp)
wrmsr
6:
- /* This is a secondary processor (AP) */
- xorl %ebx,%ebx
- incl %ebx
-
#endif /* CONFIG_SMP */
3:
@@ -257,7 +252,7 @@ ENTRY(startup_32_smp)
popfl
#ifdef CONFIG_SMP
- andl %ebx,%ebx
+ cmpb $0, ready
jz 1f /* Initial CPU cleans BSS */
jmp checkCPUtype
1:
--
1.5.1.1.181.g2de0
* [PATCH 03/12] i386 head.S: Always run the full set of paging state
[not found] ` <m14pmx6ewk.fsf_-_@ebiederm.dsl.xmission.com>
@ 2007-04-30 15:51 ` Eric W. Biederman
[not found] ` <m1zm4p509a.fsf_-_@ebiederm.dsl.xmission.com>
1 sibling, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 15:51 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, virtualization, H. Peter Anvin, Andrew Morton
I am preparing to convert the boot time page table to the kernel's
native format. To achieve that I need to enable PAE. Enabling PSE
and the no execute bit would not hurt. So this patch modifies
the boot cpu path to execute all of the kernel's enable code
if and only if we have the proper bits set in mmu_cr4_features.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/kernel/head.S | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index aa9e28e..248d44c 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -181,6 +181,8 @@ ENTRY(startup_32_smp)
movl %eax,%es
movl %eax,%fs
movl %eax,%gs
+#endif /* CONFIG_SMP */
+3:
/*
* New page tables may be in 4Mbyte page mode and may
@@ -227,8 +229,6 @@ ENTRY(startup_32_smp)
wrmsr
6:
-#endif /* CONFIG_SMP */
-3:
/*
* Enable paging
--
1.5.1.1.181.g2de0
* [PATCH 04/12] i386 voyager: Use modern techniques to setup and teardown low identiy mappings.
[not found] ` <m1zm4p509a.fsf_-_@ebiederm.dsl.xmission.com>
@ 2007-04-30 15:57 ` Eric W. Biederman
2007-04-30 16:03 ` [PATCH 05/12] i386: During page table initialization always set the leaf page table entries Eric W. Biederman
` (2 more replies)
0 siblings, 3 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 15:57 UTC (permalink / raw)
To: Andi Kleen
Cc: linux-kernel, virtualization, James Bottomley, H. Peter Anvin,
Andrew Morton
This is a trivial and hopefully obviously correct patch to set up
and tear down the identity mappings the way the rest of arch/i386
does.
My new page table setup code will break some assumptions below, so
this is my attempt to keep voyager working.
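As a rough user-space model of what the new setup and teardown calls do to the top-level table: the kernel half of the page directory is copied over the low slots, and later the low slots are cleared again. All `model_` names are hypothetical, the 3G/1G non-PAE layout (768 user slots, 256 kernel slots) is an assumption, and TLB flushing is left out entirely.

```c
#include <string.h>

#define MODEL_PTRS_PER_PGD    1024  /* non-PAE i386: 1024 top-level entries */
#define MODEL_USER_PGD_PTRS   768   /* assumed 3G/1G user/kernel split */
#define MODEL_KERNEL_PGD_PTRS (MODEL_PTRS_PER_PGD - MODEL_USER_PGD_PTRS)

typedef unsigned long model_pgd_t;

/* Copy `count` top-level entries from src to dst. */
static void model_clone_pgd_range(model_pgd_t *dst, const model_pgd_t *src,
                                  int count)
{
    memcpy(dst, src, (size_t)count * sizeof(model_pgd_t));
}

/* Mirror the kernel mappings at virtual address 0 (low identity mapping). */
static void model_init_low_mappings(model_pgd_t *pgd)
{
    int count = MODEL_KERNEL_PGD_PTRS < MODEL_USER_PGD_PTRS
                    ? MODEL_KERNEL_PGD_PTRS
                    : MODEL_USER_PGD_PTRS;

    model_clone_pgd_range(pgd, pgd + MODEL_USER_PGD_PTRS, count);
}

/* Remove every low (user range) mapping again; kernel slots are untouched. */
static void model_zap_low_mappings(model_pgd_t *pgd)
{
    memset(pgd, 0, MODEL_USER_PGD_PTRS * sizeof(model_pgd_t));
}
```

The point of the model is that no per-CPU-type special case is needed: whatever the kernel slots contain (4MB pages or pointers to pte pages) is simply aliased at the bottom of the address space while the secondary CPU starts.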
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/mach-voyager/voyager_smp.c | 38 +++++----------------------------
1 files changed, 6 insertions(+), 32 deletions(-)
diff --git a/arch/i386/mach-voyager/voyager_smp.c b/arch/i386/mach-voyager/voyager_smp.c
index b9ce33c..5e98ae7 100644
--- a/arch/i386/mach-voyager/voyager_smp.c
+++ b/arch/i386/mach-voyager/voyager_smp.c
@@ -536,15 +536,6 @@ do_boot_cpu(__u8 cpu)
& ~( voyager_extended_vic_processors
& voyager_allowed_boot_processors);
- /* For the 486, we can't use the 4Mb page table trick, so
- * must map a region of memory */
-#ifdef CONFIG_M486
- int i;
- unsigned long *page_table_copies = (unsigned long *)
- __get_free_page(GFP_KERNEL);
-#endif
- pgd_t orig_swapper_pg_dir0;
-
/* This is an area in head.S which was used to set up the
* initial kernel stack. We need to alter this to give the
* booting CPU a new stack (taken from its idle process) */
@@ -587,24 +578,11 @@ do_boot_cpu(__u8 cpu)
VDEBUG(("VOYAGER SMP: Booting CPU%d at 0x%lx[%x:%x], stack %p\n", cpu,
(unsigned long)hijack_source.val, hijack_source.idt.Segment,
hijack_source.idt.Offset, stack_start.esp));
- /* set the original swapper_pg_dir[0] to map 0 to 4Mb transparently
- * (so that the booting CPU can find start_32 */
- orig_swapper_pg_dir0 = swapper_pg_dir[0];
-#ifdef CONFIG_M486
- if(page_table_copies == NULL)
- panic("No free memory for 486 page tables\n");
- for(i = 0; i < PAGE_SIZE/sizeof(unsigned long); i++)
- page_table_copies[i] = (i * PAGE_SIZE)
- | _PAGE_RW | _PAGE_USER | _PAGE_PRESENT;
-
- ((unsigned long *)swapper_pg_dir)[0] =
- ((virt_to_phys(page_table_copies)) & PAGE_MASK)
- | _PAGE_RW | _PAGE_USER | _PAGE_PRESENT;
-#else
- ((unsigned long *)swapper_pg_dir)[0] =
- (virt_to_phys(pg0) & PAGE_MASK)
- | _PAGE_RW | _PAGE_USER | _PAGE_PRESENT;
-#endif
+
+ /* init lowmem identity mapping */
+ clone_pgd_range(swapper_pg_dir, swapper_pg_dir + USER_PGD_PTRS,
+ min_t(unsigned long, KERNEL_PGD_PTRS, USER_PGD_PTRS));
+ flush_tlb_all();
if(quad_boot) {
printk("CPU %d: non extended Quad boot\n", cpu);
@@ -647,11 +625,7 @@ do_boot_cpu(__u8 cpu)
udelay(100);
}
/* reset the page table */
- swapper_pg_dir[0] = orig_swapper_pg_dir0;
- local_flush_tlb();
-#ifdef CONFIG_M486
- free_page((unsigned long)page_table_copies);
-#endif
+ zap_low_mappings();
if (cpu_booted_map) {
VDEBUG(("CPU%d: Booted successfully, back in CPU %d\n",
--
1.5.1.1.181.g2de0
* [PATCH 05/12] i386: During page table initialization always set the leaf page table entries.
2007-04-30 15:57 ` [PATCH 04/12] i386 voyager: Use modern techniques to setup and teardown low identiy mappings Eric W. Biederman
@ 2007-04-30 16:03 ` Eric W. Biederman
[not found] ` <m1r6q14zow.fsf_-_@ebiederm.dsl.xmission.com>
2007-04-30 17:06 ` [PATCH 04/12] i386 voyager: Use modern techniques to setup and teardown low identiy mappings James Bottomley
2 siblings, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:03 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, virtualization, H. Peter Anvin, Andrew Morton
If we don't set the leaf page table entries it is quite possible that
we will inherit an incorrect page table entry from the initial boot
page table setup in head.S. So we need to redo the effort here,
so we pick up PSE, PGE and the like.
Hypervisors like Xen require that their page tables be read-only,
which is slightly incompatible with our low identity mappings. However,
I discussed this with Jeremy and he has modified the Xen early set_pte
function to avoid problems in this area.
Andi, I sent this once as part of the discussion on this issue, so
you may already have this patch in your queue.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/mm/init.c | 52 +++++++++++++++++++-------------------------------
1 files changed, 20 insertions(+), 32 deletions(-)
diff --git a/arch/i386/mm/init.c b/arch/i386/mm/init.c
index b77a43c..dbe16f6 100644
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -63,18 +63,18 @@ static pmd_t * __init one_md_table_init(pgd_t *pgd)
pmd_t *pmd_table;
#ifdef CONFIG_X86_PAE
- pmd_table = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
-
- paravirt_alloc_pd(__pa(pmd_table) >> PAGE_SHIFT);
- set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT));
- pud = pud_offset(pgd, 0);
- if (pmd_table != pmd_offset(pud, 0))
- BUG();
-#else
+ if (!(pgd_val(*pgd) & _PAGE_PRESENT)) {
+ pmd_table = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
+
+ paravirt_alloc_pd(__pa(pmd_table) >> PAGE_SHIFT);
+ set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT));
+ pud = pud_offset(pgd, 0);
+ if (pmd_table != pmd_offset(pud, 0))
+ BUG();
+ }
+#endif
pud = pud_offset(pgd, 0);
pmd_table = pmd_offset(pud, 0);
-#endif
-
return pmd_table;
}
@@ -84,7 +84,7 @@ static pmd_t * __init one_md_table_init(pgd_t *pgd)
*/
static pte_t * __init one_page_table_init(pmd_t *pmd)
{
- if (pmd_none(*pmd)) {
+ if (!(pmd_val(*pmd) & _PAGE_PRESENT)) {
pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
paravirt_alloc_pt(__pa(page_table) >> PAGE_SHIFT);
@@ -109,7 +109,6 @@ static pte_t * __init one_page_table_init(pmd_t *pmd)
static void __init page_table_range_init (unsigned long start, unsigned long end, pgd_t *pgd_base)
{
pgd_t *pgd;
- pud_t *pud;
pmd_t *pmd;
int pgd_idx, pmd_idx;
unsigned long vaddr;
@@ -120,13 +119,10 @@ static void __init page_table_range_init (unsigned long start, unsigned long end
pgd = pgd_base + pgd_idx;
for ( ; (pgd_idx < PTRS_PER_PGD) && (vaddr != end); pgd++, pgd_idx++) {
- if (!(pgd_val(*pgd) & _PAGE_PRESENT))
- one_md_table_init(pgd);
- pud = pud_offset(pgd, vaddr);
- pmd = pmd_offset(pud, vaddr);
+ pmd = one_md_table_init(pgd);
+ pmd = pmd + pmd_index(vaddr);
for (; (pmd_idx < PTRS_PER_PMD) && (vaddr != end); pmd++, pmd_idx++) {
- if (pmd_none(*pmd))
- one_page_table_init(pmd);
+ one_page_table_init(pmd);
vaddr += PMD_SIZE;
}
@@ -159,11 +155,7 @@ static void __init kernel_physical_mapping_init(pgd_t *pgd_base)
pfn = 0;
for (; pgd_idx < PTRS_PER_PGD; pgd++, pgd_idx++) {
- if (!(pgd_val(*pgd) & _PAGE_PRESENT))
- pmd = one_md_table_init(pgd);
- else
- pmd = pmd_offset(pud_offset(pgd, PAGE_OFFSET), PAGE_OFFSET);
-
+ pmd = one_md_table_init(pgd);
if (pfn >= max_low_pfn)
continue;
for (pmd_idx = 0; pmd_idx < PTRS_PER_PMD && pfn < max_low_pfn; pmd++, pmd_idx++) {
@@ -172,12 +164,11 @@ static void __init kernel_physical_mapping_init(pgd_t *pgd_base)
/* Map with big pages if possible, otherwise create normal page tables. */
if (cpu_has_pse) {
unsigned int address2 = (pfn + PTRS_PER_PTE - 1) * PAGE_SIZE + PAGE_OFFSET + PAGE_SIZE-1;
- if (!pmd_present(*pmd)) {
- if (is_kernel_text(address) || is_kernel_text(address2))
- set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE_EXEC));
- else
- set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE));
- }
+ if (is_kernel_text(address) || is_kernel_text(address2))
+ set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE_EXEC));
+ else
+ set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE));
+
pfn += PTRS_PER_PTE;
} else {
pte = one_page_table_init(pmd);
@@ -185,9 +176,6 @@ static void __init kernel_physical_mapping_init(pgd_t *pgd_base)
for (pte_ofs = 0;
pte_ofs < PTRS_PER_PTE && pfn < max_low_pfn;
pte++, pfn++, pte_ofs++, address += PAGE_SIZE) {
- if (pte_present(*pte))
- continue;
-
if (is_kernel_text(address))
set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
else
--
1.5.1.1.181.g2de0
* [PATCH 06/12] i386: Minimum cpu detection cleanups.
[not found] ` <m1r6q14zow.fsf_-_@ebiederm.dsl.xmission.com>
@ 2007-04-30 16:09 ` Eric W. Biederman
2007-04-30 16:10 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split Eric W. Biederman
2007-04-30 16:13 ` [PATCH 06/12] i386: Minimum cpu detection cleanups H. Peter Anvin
2007-04-30 16:34 ` [PATCH 05/12] i386: During page table initialization always set the leaf page table entries Jeremy Fitzhardinge
1 sibling, 2 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:09 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, virtualization, H. Peter Anvin, Andrew Morton
This patch modifies verify_cpu to check for a 486 even when
REQUIRED_MASK1 == 0.
This patch also modifies REQUIRED_MASK1 to require PAE when
CONFIG_X86_PAE is set rather than when HIGHMEM64G is set. There is
no functional difference, but it seems an obvious fix.
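The mask test at the end of verify_cpu (the andl/xorl/jnz sequence) amounts to the following, shown here as a small C sketch. The function name `missing_features` is made up for illustration; bit 6 is the PAE bit in the CPUID feature word.

```c
#include <stdint.h>

#define SK_FEATURE_PAE_BIT 6u  /* PAE bit position in the CPUID feature word */

/*
 * Return the set of required feature bits that the CPU does not report.
 * A nonzero result corresponds to taking the "bad" branch.
 */
static uint32_t missing_features(uint32_t cpu_features, uint32_t required_mask)
{
    return (cpu_features & required_mask) ^ required_mask;
}
```

With an empty mask the result is always zero, which is why the CPUID part can be compiled out when REQUIRED_MASK1 == 0 while the 486 check still runs.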
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/kernel/verify_cpu.S | 7 ++-----
include/asm-i386/required-features.h | 2 +-
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/i386/kernel/verify_cpu.S b/arch/i386/kernel/verify_cpu.S
index ba9e03e..e51a869 100644
--- a/arch/i386/kernel/verify_cpu.S
+++ b/arch/i386/kernel/verify_cpu.S
@@ -4,10 +4,6 @@
#include <asm/cpufeature.h>
verify_cpu:
-#if REQUIRED_MASK1 == 0
- xorl %eax,%eax
- ret
-#endif
pushfl # Save caller passed flags
pushl $0 # Kill any dangerous flags
popfl
@@ -21,7 +17,7 @@ verify_cpu:
testl $(1<<18),%eax
jz bad
#endif
-
+#if REQUIRED_MASK1 != 0
pushfl # standard way to check for cpuid
popl %eax
movl %eax,%ebx
@@ -57,6 +53,7 @@ verify_cpu:
andl $REQUIRED_MASK1,%edx
xorl $REQUIRED_MASK1,%edx
jnz bad
+#endif /* REQUIRED_MASK1 */
popfl
xor %eax,%eax
diff --git a/include/asm-i386/required-features.h b/include/asm-i386/required-features.h
index 062407e..9db866c 100644
--- a/include/asm-i386/required-features.h
+++ b/include/asm-i386/required-features.h
@@ -11,7 +11,7 @@
The real information is in arch/i386/Kconfig.cpu, this just converts
the CONFIGs into a bitmask */
-#ifdef CONFIG_HIGHMEM64G
+#ifdef CONFIG_X86_PAE
#define NEED_PAE (1<<X86_FEATURE_PAE)
#else
#define NEED_PAE 0
--
1.5.1.1.181.g2de0
* [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
2007-04-30 16:09 ` [PATCH 06/12] i386: Minimum cpu detection cleanups Eric W. Biederman
@ 2007-04-30 16:10 ` Eric W. Biederman
2007-04-30 16:15 ` [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format Eric W. Biederman
2007-04-30 16:16 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split H. Peter Anvin
2007-04-30 16:13 ` [PATCH 06/12] i386: Minimum cpu detection cleanups H. Peter Anvin
1 sibling, 2 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:10 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, virtualization, H. Peter Anvin, Andrew Morton
When in PAE mode we require the user/kernel divide to be
on a 1G boundary. The 2G/2G split does not have that property,
so require !X86_PAE.
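The 1G requirement follows from PAE having only four top-level entries, each covering 1GB, so the user/kernel divide has to fall between two of them. A minimal check, with a hypothetical helper name; the example value 0x78000000 for the 2G/2G option is an assumption about the Kconfig defaults of this era and is not taken from this patch.

```c
#include <stdint.h>

/* Each of the 4 PAE top-level entries spans 1GB of virtual address space. */
#define SK_PAE_PGD_SPAN (1u << 30)

/* Nonzero when page_offset falls exactly on a top-level entry boundary. */
static int split_ok_for_pae(uint32_t page_offset)
{
    return (page_offset & (SK_PAE_PGD_SPAN - 1)) == 0;
}
```

For example, split_ok_for_pae(0xC0000000u) holds for the default 3G/1G split, while a divide at 0x78000000 would not pass.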
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/Kconfig | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/i386/Kconfig b/arch/i386/Kconfig
index 1a94a73..80003de 100644
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -570,6 +570,7 @@ choice
depends on !HIGHMEM
bool "3G/1G user/kernel split (for full 1G low memory)"
config VMSPLIT_2G
+ depends on !X86_PAE
bool "2G/2G user/kernel split"
config VMSPLIT_1G
bool "1G/3G user/kernel split"
--
1.5.1.1.181.g2de0
* Re: [PATCH 06/12] i386: Minimum cpu detection cleanups.
2007-04-30 16:09 ` [PATCH 06/12] i386: Minimum cpu detection cleanups Eric W. Biederman
2007-04-30 16:10 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split Eric W. Biederman
@ 2007-04-30 16:13 ` H. Peter Anvin
1 sibling, 0 replies; 21+ messages in thread
From: H. Peter Anvin @ 2007-04-30 16:13 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel, Andrew Morton, virtualization
Eric W. Biederman wrote:
>
> diff --git a/arch/i386/kernel/verify_cpu.S b/arch/i386/kernel/verify_cpu.S
> index ba9e03e..e51a869 100644
> --- a/arch/i386/kernel/verify_cpu.S
> +++ b/arch/i386/kernel/verify_cpu.S
> @@ -4,10 +4,6 @@
> #include <asm/cpufeature.h>
>
> verify_cpu:
> -#if REQUIRED_MASK1 == 0
> - xorl %eax,%eax
> - ret
> -#endif
> pushfl # Save caller passed flags
> pushl $0 # Kill any dangerous flags
> popfl
> @@ -21,7 +17,7 @@ verify_cpu:
> testl $(1<<18),%eax
^^^^^^^
> jz bad
While you're in there can you change references to the flags to use the
<asm/cpuflags.h> constants?
-hpa
* [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format.
2007-04-30 16:10 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split Eric W. Biederman
@ 2007-04-30 16:15 ` Eric W. Biederman
2007-04-30 16:26 ` Andi Kleen
` (2 more replies)
2007-04-30 16:16 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split H. Peter Anvin
1 sibling, 3 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:15 UTC (permalink / raw)
To: Andi Kleen
Cc: linux-kernel, virtualization, James Bottomley, H. Peter Anvin,
Andrew Morton
Currently we have a lot of special case code and a lot of limitations
because we cannot count on the initial boot time page tables being in
the format our page table handling routines know how to manipulate.
So this patch rewrites the code that initializes our boot time page
tables.
The error message for running on cpus that don't support PAE mode is
removed from mm/init.c. We are already checking and printing an error
message to the user in setup.S.
boot_ioremap is removed and replaced with the already existing and
semantically equivalent bt_ioremap. The practical difference was that
bt_ioremap used fixmaps where boot_ioremap hacked the boot time page
table. Since the boot time page table is now in the native format,
fixmaps work as soon as we hit C code.
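Whichever early mapping helper is used, the caller's physical address and size first have to be rounded out to whole pages. The sketch below shows only that rounding step; the `sk_` names and the struct are hypothetical, 4KB pages are assumed, and this is not the kernel's bt_ioremap code.

```c
#define SK_PAGE_SHIFT 12
#define SK_PAGE_SIZE  (1ul << SK_PAGE_SHIFT)
#define SK_PAGE_MASK  (~(SK_PAGE_SIZE - 1))

struct sk_map_request {
    unsigned long base;     /* page-aligned physical start address */
    unsigned long offset;   /* caller's offset within the first page */
    unsigned long nrpages;  /* number of pages that must be mapped */
};

/* Round [phys_addr, phys_addr + size) out to whole pages. */
static struct sk_map_request sk_round_request(unsigned long phys_addr,
                                              unsigned long size)
{
    struct sk_map_request req;
    unsigned long end = phys_addr + size;  /* one past the last byte */

    req.offset  = phys_addr & ~SK_PAGE_MASK;
    req.base    = phys_addr & SK_PAGE_MASK;
    req.nrpages = (((end + SK_PAGE_SIZE - 1) & SK_PAGE_MASK) - req.base)
                  >> SK_PAGE_SHIFT;
    return req;
}
```

A caller would map `nrpages` pages starting at `base` into consecutive fixmap slots and hand back the mapped virtual address plus `offset`.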
The voyager code is modified to use bt_ioremap, instead of hacking the
page tables in an ugly way. We now have a good universal replacement
and changing voyager means nothing cares which order we store the
pages in pg0.
The meat of the change is the introduction of the function
early_pgtable_init written in C to build the initial page tables. By
writing in C, I only have about 6 lines of code that are different
between the PAE and the non-PAE case, and the code is a little more
approachable. Unfortunately we still have to contend with the fact
that the code is running in a very weird environment and so cannot
use virtual addresses, which limits the opportunities for code reuse.
The page table entry flags now use the standard defines from pgtable.h
instead of a hard coded 0x007, which, while not wrong, was not really
correct either.
The big change is that the boot time page tables now add an extra
pte page to support the kernel fixmap entries. In addition to being
very useful, this was necessary so that boot_ioremap could be replaced
with something that works in the presence of PAE page tables.
The net result is a simpler early boot environment that is easier to
work in.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/Kconfig | 7 --
arch/i386/kernel/efi.c | 15 +--
arch/i386/kernel/head.S | 66 +------------
arch/i386/kernel/setup.c | 4 +
arch/i386/kernel/srat.c | 8 +-
arch/i386/mach-voyager/voyager_basic.c | 16 +--
arch/i386/mm/Makefile | 1 -
arch/i386/mm/boot_ioremap.c | 100 --------------------
arch/i386/mm/init.c | 159 +++++++++++++++++++++++---------
include/asm-i386/page.h | 1 -
include/asm-i386/pgtable.h | 4 -
11 files changed, 137 insertions(+), 244 deletions(-)
delete mode 100644 arch/i386/mm/boot_ioremap.c
diff --git a/arch/i386/Kconfig b/arch/i386/Kconfig
index 80003de..ae00af8 100644
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -750,13 +750,6 @@ config IRQBALANCE
The default yes will allow the kernel to do irq load balancing.
Saying no will keep the kernel from doing irq load balancing.
-# turning this on wastes a bunch of space.
-# Summit needs it only when NUMA is on
-config BOOT_IOREMAP
- bool
- depends on (((X86_SUMMIT || X86_GENERICARCH) && NUMA) || (X86 && EFI))
- default y
-
config SECCOMP
bool "Enable seccomp to safely compute untrusted bytecode"
depends on PROC_FS
diff --git a/arch/i386/kernel/efi.c b/arch/i386/kernel/efi.c
index dd9e7fa..1319bae 100644
--- a/arch/i386/kernel/efi.c
+++ b/arch/i386/kernel/efi.c
@@ -50,11 +50,6 @@ static struct efi efi_phys;
struct efi_memory_map memmap;
/*
- * We require an early boot_ioremap mapping mechanism initially
- */
-extern void * boot_ioremap(unsigned long, unsigned long);
-
-/*
* To make EFI call EFI runtime service in physical addressing mode we need
* prelog/epilog before/after the invocation to disable interrupt, to
* claim EFI runtime service handler exclusively and to duplicate a memory in
@@ -338,7 +333,7 @@ void __init efi_init(void)
memmap.desc_size = EFI_MEMDESC_SIZE;
efi.systab = (efi_system_table_t *)
- boot_ioremap((unsigned long) efi_phys.systab,
+ bt_ioremap((unsigned long) efi_phys.systab,
sizeof(efi_system_table_t));
/*
* Verify the EFI Table
@@ -365,7 +360,7 @@ void __init efi_init(void)
/*
* Show what we know for posterity
*/
- c16 = (efi_char16_t *) boot_ioremap(efi.systab->fw_vendor, 2);
+ c16 = (efi_char16_t *) bt_ioremap(efi.systab->fw_vendor, 2);
if (c16) {
for (i = 0; i < (sizeof(vendor) - 1) && *c16; ++i)
vendor[i] = *c16++;
@@ -381,7 +376,7 @@ void __init efi_init(void)
* Let's see what config tables the firmware passed to us.
*/
config_tables = (efi_config_table_t *)
- boot_ioremap((unsigned long) config_tables,
+ bt_ioremap((unsigned long) config_tables,
num_config_tables * sizeof(efi_config_table_t));
if (config_tables == NULL)
@@ -431,7 +426,7 @@ void __init efi_init(void)
* set the firmware into virtual mode.
*/
- runtime = (efi_runtime_services_t *) boot_ioremap((unsigned long)
+ runtime = (efi_runtime_services_t *) bt_ioremap((unsigned long)
runtime,
sizeof(efi_runtime_services_t));
if (runtime != NULL) {
@@ -448,7 +443,7 @@ void __init efi_init(void)
printk(KERN_ERR PFX "Could not map the runtime service table!\n");
/* Map the EFI memory map for use until paging_init() */
- memmap.map = boot_ioremap((unsigned long) EFI_MEMMAP, EFI_MEMMAP_SIZE);
+ memmap.map = bt_ioremap((unsigned long) EFI_MEMMAP, EFI_MEMMAP_SIZE);
if (memmap.map == NULL)
printk(KERN_ERR PFX "Could not map the EFI memory map!\n");
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index 248d44c..de65f45 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -33,35 +33,6 @@
#define X86_VENDOR_ID new_cpu_data+CPUINFO_x86_vendor_id
/*
- * This is how much memory *in addition to the memory covered up to
- * and including _end* we need mapped initially.
- * We need:
- * - one bit for each possible page, but only in low memory, which means
- * 2^32/4096/8 = 128K worst case (4G/4G split.)
- * - enough space to map all low memory, which means
- * (2^32/4096) / 1024 pages (worst case, non PAE)
- * (2^32/4096) / 512 + 4 pages (worst case for PAE)
- * - a few pages for allocator use before the kernel pagetable has
- * been set up
- *
- * Modulo rounding, each megabyte assigned here requires a kilobyte of
- * memory, which is currently unreclaimed.
- *
- * This should be a multiple of a page.
- */
-LOW_PAGES = 1<<(32-PAGE_SHIFT_asm)
-
-#if PTRS_PER_PMD > 1
-PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PMD) + PTRS_PER_PGD
-#else
-PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PGD)
-#endif
-BOOTBITMAP_SIZE = LOW_PAGES / 8
-ALLOCATOR_SLOP = 4
-
-INIT_MAP_BEYOND_END = BOOTBITMAP_SIZE + (PAGE_TABLE_SIZE + ALLOCATOR_SLOP)*PAGE_SIZE_asm
-
-/*
* 32-bit kernel entrypoint; only used by the boot CPU. On entry,
* %esi points to the real-mode code as a 32-bit pointer.
* CS and DS must be 4 GB flat segments, but we don't depend on
@@ -125,37 +96,12 @@ ENTRY(startup_32)
movsl
1:
-/*
- * Initialize page tables. This creates a PDE and a set of page
- * tables, which are located immediately beyond _end. The variable
- * init_pg_tables_end is set up to point to the first "safe" location.
- * Mappings are created both at virtual address 0 (identity mapping)
- * and PAGE_OFFSET for up to _end+sizeof(page tables)+INIT_MAP_BEYOND_END.
- *
- * Warning: don't use %esi or the stack in this code. However, %esp
- * can be used as a GPR if you really need it...
- */
-page_pde_offset = (__PAGE_OFFSET >> 20);
-
- movl $(pg0 - __PAGE_OFFSET), %edi
- movl $(swapper_pg_dir - __PAGE_OFFSET), %edx
- movl $0x007, %eax /* 0x007 = PRESENT+RW+USER */
-10:
- leal 0x007(%edi),%ecx /* Create PDE entry */
- movl %ecx,(%edx) /* Store identity PDE entry */
- movl %ecx,page_pde_offset(%edx) /* Store kernel PDE entry */
- addl $4,%edx
- movl $1024, %ecx
-11:
- stosl
- addl $0x1000,%eax
- loop 11b
- /* End condition: we must map up to and including INIT_MAP_BEYOND_END */
- /* bytes beyond the end of our own page tables; the +0x007 is the attribute bits */
- leal (INIT_MAP_BEYOND_END+0x007)(%edi),%ebp
- cmpl %ebp,%eax
- jb 10b
- movl %edi,(init_pg_tables_end - __PAGE_OFFSET)
+ /* Setup the stack */
+ lss stack_start - __PAGE_OFFSET, %esp
+ subl $__PAGE_OFFSET, %esp
+
+ /* Initialize the boot page tables */
+ call early_pgtable_init
jmp 3f
/*
diff --git a/arch/i386/kernel/setup.c b/arch/i386/kernel/setup.c
index 698c24f..3e31591 100644
--- a/arch/i386/kernel/setup.c
+++ b/arch/i386/kernel/setup.c
@@ -82,7 +82,11 @@ struct cpuinfo_x86 new_cpu_data __cpuinitdata = { 0, 0, 0, 0, -1, 1, 0, 0, -1 };
struct cpuinfo_x86 boot_cpu_data __read_mostly = { 0, 0, 0, 0, -1, 1, 0, 0, -1 };
EXPORT_SYMBOL(boot_cpu_data);
+#ifndef CONFIG_X86_PAE
unsigned long mmu_cr4_features;
+#else
+unsigned long mmu_cr4_features = X86_CR4_PAE;
+#endif
/* for MCA, but anyone else can use it if they want */
unsigned int machine_id;
diff --git a/arch/i386/kernel/srat.c b/arch/i386/kernel/srat.c
index 2a8713e..24bfd4a 100644
--- a/arch/i386/kernel/srat.c
+++ b/arch/i386/kernel/srat.c
@@ -57,8 +57,6 @@ static struct node_memory_chunk_s node_memory_chunk[MAXCHUNKS];
static int num_memory_chunks; /* total number of memory chunks */
static u8 __initdata apicid_to_pxm[MAX_APICID];
-extern void * boot_ioremap(unsigned long, unsigned long);
-
/* Identify CPU proximity domains */
static void __init parse_cpu_affinity_structure(char *p)
{
@@ -299,7 +297,7 @@ int __init get_memcfg_from_srat(void)
}
rsdt = (struct acpi_table_rsdt *)
- boot_ioremap(rsdp->rsdt_physical_address, sizeof(struct acpi_table_rsdt));
+ bt_ioremap(rsdp->rsdt_physical_address, sizeof(struct acpi_table_rsdt));
if (!rsdt) {
printk(KERN_WARNING
@@ -339,11 +337,11 @@ int __init get_memcfg_from_srat(void)
for (i = 0; i < tables; i++) {
/* Map in header, then map in full table length. */
header = (struct acpi_table_header *)
- boot_ioremap(saved_rsdt.table.table_offset_entry[i], sizeof(struct acpi_table_header));
+ bt_ioremap(saved_rsdt.table.table_offset_entry[i], sizeof(struct acpi_table_header));
if (!header)
break;
header = (struct acpi_table_header *)
- boot_ioremap(saved_rsdt.table.table_offset_entry[i], header->length);
+ bt_ioremap(saved_rsdt.table.table_offset_entry[i], header->length);
if (!header)
break;
diff --git a/arch/i386/mach-voyager/voyager_basic.c b/arch/i386/mach-voyager/voyager_basic.c
index 8fe7e45..144e235 100644
--- a/arch/i386/mach-voyager/voyager_basic.c
+++ b/arch/i386/mach-voyager/voyager_basic.c
@@ -114,8 +114,7 @@ typedef struct ClickMap {
} ClickMap_t;
-/* This routine is pretty much an awful hack to read the bios clickmap by
- * mapping it into page 0. There are usually three regions in the map:
+/* This routine reads the bios clickmap. There are usually three regions in the map:
* Base Memory
* Extended Memory
* zero length marker for end of map
@@ -142,12 +141,8 @@ voyager_memory_detect(int region, __u32 *start, __u32 *length)
map_addr = *(unsigned long *)cmos;
- /* steal page 0 for this */
- old = pg0[0];
- pg0[0] = ((map_addr & PAGE_MASK) | _PAGE_RW | _PAGE_PRESENT);
- local_flush_tlb();
- /* now clear everything out but page 0 */
- map = (ClickMap_t *)(map_addr & (~PAGE_MASK));
+ /* Setup a temporary mapping for the clickmap */
+ map = bt_ioremap(map_addr, sizeof(*map));
/* zero length is the end of the clickmap */
if(map->Entry[region].Length != 0) {
@@ -156,9 +151,8 @@ voyager_memory_detect(int region, __u32 *start, __u32 *length)
retval = 1;
}
- /* replace the mapping */
- pg0[0] = old;
- local_flush_tlb();
+ /* undo the mapping */
+ bt_iounmap(map, sizeof(*map));
return retval;
}
diff --git a/arch/i386/mm/Makefile b/arch/i386/mm/Makefile
index 80908b5..215eed5 100644
--- a/arch/i386/mm/Makefile
+++ b/arch/i386/mm/Makefile
@@ -7,4 +7,3 @@ obj-y := init.o pgtable.o fault.o ioremap.o extable.o pageattr.o mmap.o
obj-$(CONFIG_NUMA) += discontig.o
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_HIGHMEM) += highmem.o
-obj-$(CONFIG_BOOT_IOREMAP) += boot_ioremap.o
diff --git a/arch/i386/mm/boot_ioremap.c b/arch/i386/mm/boot_ioremap.c
deleted file mode 100644
index 4de95a1..0000000
--- a/arch/i386/mm/boot_ioremap.c
+++ /dev/null
@@ -1,100 +0,0 @@
-/*
- * arch/i386/mm/boot_ioremap.c
- *
- * Re-map functions for early boot-time before paging_init() when the
- * boot-time pagetables are still in use
- *
- * Written by Dave Hansen <haveblue@us.ibm.com>
- */
-
-
-/*
- * We need to use the 2-level pagetable functions, but CONFIG_X86_PAE
- * keeps that from happenning. If anyone has a better way, I'm listening.
- *
- * boot_pte_t is defined only if this all works correctly
- */
-
-#undef CONFIG_X86_PAE
-#undef CONFIG_PARAVIRT
-#include <asm/page.h>
-#include <asm/pgtable.h>
-#include <asm/tlbflush.h>
-#include <linux/init.h>
-#include <linux/stddef.h>
-
-/*
- * I'm cheating here. It is known that the two boot PTE pages are
- * allocated next to each other. I'm pretending that they're just
- * one big array.
- */
-
-#define BOOT_PTE_PTRS (PTRS_PER_PTE*2)
-
-static unsigned long boot_pte_index(unsigned long vaddr)
-{
- return __pa(vaddr) >> PAGE_SHIFT;
-}
-
-static inline boot_pte_t* boot_vaddr_to_pte(void *address)
-{
- boot_pte_t* boot_pg = (boot_pte_t*)pg0;
- return &boot_pg[boot_pte_index((unsigned long)address)];
-}
-
-/*
- * This is only for a caller who is clever enough to page-align
- * phys_addr and virtual_source, and who also has a preference
- * about which virtual address from which to steal ptes
- */
-static void __boot_ioremap(unsigned long phys_addr, unsigned long nrpages,
- void* virtual_source)
-{
- boot_pte_t* pte;
- int i;
- char *vaddr = virtual_source;
-
- pte = boot_vaddr_to_pte(virtual_source);
- for (i=0; i < nrpages; i++, phys_addr += PAGE_SIZE, pte++) {
- set_pte(pte, pfn_pte(phys_addr>>PAGE_SHIFT, PAGE_KERNEL));
- __flush_tlb_one(&vaddr[i*PAGE_SIZE]);
- }
-}
-
-/* the virtual space we're going to remap comes from this array */
-#define BOOT_IOREMAP_PAGES 4
-#define BOOT_IOREMAP_SIZE (BOOT_IOREMAP_PAGES*PAGE_SIZE)
-static __initdata char boot_ioremap_space[BOOT_IOREMAP_SIZE]
- __attribute__ ((aligned (PAGE_SIZE)));
-
-/*
- * This only applies to things which need to ioremap before paging_init()
- * bt_ioremap() and plain ioremap() are both useless at this point.
- *
- * When used, we're still using the boot-time pagetables, which only
- * have 2 PTE pages mapping the first 8MB
- *
- * There is no unmap. The boot-time PTE pages aren't used after boot.
- * If you really want the space back, just remap it yourself.
- * boot_ioremap(&ioremap_space-PAGE_OFFSET, BOOT_IOREMAP_SIZE)
- */
-__init void* boot_ioremap(unsigned long phys_addr, unsigned long size)
-{
- unsigned long last_addr, offset;
- unsigned int nrpages;
-
- last_addr = phys_addr + size - 1;
-
- /* page align the requested address */
- offset = phys_addr & ~PAGE_MASK;
- phys_addr &= PAGE_MASK;
- size = PAGE_ALIGN(last_addr) - phys_addr;
-
- nrpages = size >> PAGE_SHIFT;
- if (nrpages > BOOT_IOREMAP_PAGES)
- return NULL;
-
- __boot_ioremap(phys_addr, nrpages, boot_ioremap_space);
-
- return &boot_ioremap_space[offset];
-}
diff --git a/arch/i386/mm/init.c b/arch/i386/mm/init.c
index dbe16f6..78f03b1 100644
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -44,6 +44,7 @@
#include <asm/tlbflush.h>
#include <asm/sections.h>
#include <asm/paravirt.h>
+#include <asm/setup.h>
unsigned int __VMALLOC_RESERVE = 128 << 20;
@@ -53,6 +54,119 @@ unsigned long highstart_pfn, highend_pfn;
static int noinline do_test_wp_bit(void);
/*
+ * This is how much memory *in addition to the memory covered up to
+ * and including _end* we need mapped initially. We need one bit for
+ * each possible page, but only in low memory, which means
+ * 2^32/4096/8 = 128K worst case (4G/4G split.)
+ *
+ * Modulo rounding, each megabyte assigned here requires a kilobyte of
+ * memory, which is currently unreclaimed.
+ *
+ * This should be a multiple of a page.
+ */
+#define INIT_MAP_BEYOND_END (128*1024)
+
+/*
+ * Initialize page tables. This creates a PDE and a set of page
+ * tables, which are located immediately beyond _end. The variable
+ * init_pg_tables_end is set up to point to the first "safe" location.
+ * Mappings are created both at virtual address 0 (identity mapping)
+ * and PAGE_OFFSET for up to _end+sizeof(page tables)+INIT_MAP_BEYOND_END.
+ *
+ * WARNING: This code runs at its physical address not its virtual address,
+ * with all physical everything identity mapped, and nothing else mapped.
+ * This means global variables must be accessed very carefully.
+ */
+#define __pavar(X) (*(__typeof__(X) *)__pa_symbol(&(X)))
+
+static __init inline pud_t *early_pud_offset(pgd_t *pgd, unsigned long vaddr)
+{
+ return (pud_t *)pgd;
+}
+
+static __init inline pmd_t *early_pmd_offset(pud_t *pud, unsigned long vaddr)
+{
+#ifndef CONFIG_X86_PAE
+ return (pmd_t *)pud;
+#else
+ return ((pmd_t *)(u32)(pud_val(*pud) & PAGE_MASK)) + pmd_index(vaddr);
+#endif
+}
+
+static __init inline pte_t *early_pte_offset(pmd_t *pmd, unsigned long vaddr)
+{
+ return ((pte_t *)(u32)(pmd_val(*pmd) & PAGE_MASK)) + pte_index(vaddr);
+}
+
+static __init pmd_t *
+early_pmd_alloc(pgd_t *pgd_base, unsigned long vaddr, unsigned long *end)
+{
+ pgd_t *pgd;
+ pud_t *pud;
+ pgd = pgd_base + pgd_index(vaddr);
+ pud = early_pud_offset(pgd, vaddr);
+
+#ifdef CONFIG_X86_PAE
+ if (!(pud_val(*pud) & _PAGE_PRESENT)) {
+ unsigned long phys = *end;
+ memset((void *)phys, 0, PAGE_SIZE);
+ native_set_pud(pud, __pud(phys | _PAGE_PRESENT));
+ *end += PAGE_SIZE;
+ }
+#endif
+ return early_pmd_offset(pud, vaddr);
+}
+
+static __init pte_t *
+early_pte_alloc(pgd_t *pgd_base, unsigned long vaddr, unsigned long *end)
+{
+ pmd_t *pmd;
+
+ pmd = early_pmd_alloc(pgd_base, vaddr, end);
+ if (!(pmd_val(*pmd) & _PAGE_PRESENT)) {
+ unsigned long phys = *end;
+ memset((void *)phys, 0, PAGE_SIZE);
+ native_set_pmd(pmd, __pmd(phys | _PAGE_TABLE));
+ *end += PAGE_SIZE;
+ }
+ return early_pte_offset(pmd, vaddr);
+}
+
+static __init void early_set_pte_phys(pgd_t *pgd_base, unsigned long vaddr,
+ unsigned long phys, unsigned long *end)
+{
+ pte_t *pte;
+ pte = early_pte_alloc(pgd_base, vaddr, end);
+ native_set_pte(pte, __pte(phys | _PAGE_KERNEL_EXEC));
+}
+
+void __init early_pgtable_init(void)
+{
+ unsigned long addr, end;
+ pgd_t *pgd_base;
+
+ pgd_base = __pavar(swapper_pg_dir);
+ end = __pa_symbol(pg0);
+
+ /* Initialize the directory page */
+ memset(pgd_base, 0, PAGE_SIZE);
+
+ /* Set up the fixmap page table */
+ early_pte_alloc(pgd_base, __pavar(__FIXADDR_TOP), &end);
+
+ /* Set up the initial kernel mapping */
+ for (addr = 0; addr < (end + INIT_MAP_BEYOND_END); addr += PAGE_SIZE)
+ early_set_pte_phys(pgd_base, addr + PAGE_OFFSET, addr, &end);
+
+ /* Set up the low identity mappings */
+ clone_pgd_range(pgd_base, pgd_base + USER_PTRS_PER_PGD,
+ min_t(unsigned long, KERNEL_PGD_PTRS, USER_PGD_PTRS));
+
+ __pavar(init_pg_tables_end) = end;
+}
+#undef __pavar
+
+/*
* Creates a middle page table and puts a pointer to it in the
* given global directory entry. This only returns the gd entry
* in non-PAE compilation mode, since the middle layer is folded.
@@ -338,44 +452,11 @@ extern void __init remap_numa_kva(void);
void __init native_pagetable_setup_start(pgd_t *base)
{
-#ifdef CONFIG_X86_PAE
- int i;
-
- /*
- * Init entries of the first-level page table to the
- * zero page, if they haven't already been set up.
- *
- * In a normal native boot, we'll be running on a
- * pagetable rooted in swapper_pg_dir, but not in PAE
- * mode, so this will end up clobbering the mappings
- * for the lower 24Mbytes of the address space,
- * without affecting the kernel address space.
- */
- for (i = 0; i < USER_PTRS_PER_PGD; i++)
- set_pgd(&base[i],
- __pgd(__pa(empty_zero_page) | _PAGE_PRESENT));
-
- /* Make sure kernel address space is empty so that a pagetable
- will be allocated for it. */
- memset(&base[USER_PTRS_PER_PGD], 0,
- KERNEL_PGD_PTRS * sizeof(pgd_t));
-#else
paravirt_alloc_pd(__pa(swapper_pg_dir) >> PAGE_SHIFT);
-#endif
}
void __init native_pagetable_setup_done(pgd_t *base)
{
-#ifdef CONFIG_X86_PAE
- /*
- * Add low memory identity-mappings - SMP needs it when
- * starting up on an AP from real-mode. In the non-PAE
- * case we already have these mappings through head.S.
- * All user-space mappings are explicitly cleared after
- * SMP startup.
- */
- set_pgd(&base[0], base[USER_PTRS_PER_PGD]);
-#endif
}
/*
@@ -567,14 +648,6 @@ void __init paging_init(void)
load_cr3(swapper_pg_dir);
-#ifdef CONFIG_X86_PAE
- /*
- * We will bail out later - printk doesn't work right now so
- * the user would just see a hanging kernel.
- */
- if (cpu_has_pae)
- set_in_cr4(X86_CR4_PAE);
-#endif
__flush_tlb_all();
kmap_init();
@@ -704,10 +777,6 @@ void __init mem_init(void)
BUG_ON((unsigned long)high_memory > VMALLOC_START);
#endif /* double-sanity-check paranoia */
-#ifdef CONFIG_X86_PAE
- if (!cpu_has_pae)
- panic("cannot execute a PAE-enabled kernel on a PAE-less CPU!");
-#endif
if (boot_cpu_data.wp_works_ok < 0)
test_wp_bit();
diff --git a/include/asm-i386/page.h b/include/asm-i386/page.h
index 818ac8b..bb29e82 100644
--- a/include/asm-i386/page.h
+++ b/include/asm-i386/page.h
@@ -90,7 +90,6 @@ static inline pte_t native_make_pte(unsigned long long val)
typedef struct { unsigned long pte_low; } pte_t;
typedef struct { unsigned long pgd; } pgd_t;
typedef struct { unsigned long pgprot; } pgprot_t;
-#define boot_pte_t pte_t /* or would you rather have a typedef */
static inline unsigned long native_pgd_val(pgd_t pgd)
{
diff --git a/include/asm-i386/pgtable.h b/include/asm-i386/pgtable.h
index c6b8b94..7159f12 100644
--- a/include/asm-i386/pgtable.h
+++ b/include/asm-i386/pgtable.h
@@ -68,10 +68,6 @@ void paging_init(void);
#define USER_PGD_PTRS (PAGE_OFFSET >> PGDIR_SHIFT)
#define KERNEL_PGD_PTRS (PTRS_PER_PGD-USER_PGD_PTRS)
-#define TWOLEVEL_PGDIR_SHIFT 22
-#define BOOT_USER_PGD_PTRS (__PAGE_OFFSET >> TWOLEVEL_PGDIR_SHIFT)
-#define BOOT_KERNEL_PGD_PTRS (1024-BOOT_USER_PGD_PTRS)
-
/* Just any arbitrary offset to the start of the vmalloc VM area: the
* current 8MB value just means that there will be a 8MB "hole" after the
* physical memory until the kernel virtual memory starts. That means that
--
1.5.1.1.181.g2de0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
2007-04-30 16:10 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split Eric W. Biederman
2007-04-30 16:15 ` [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format Eric W. Biederman
@ 2007-04-30 16:16 ` H. Peter Anvin
2007-04-30 16:39 ` Eric W. Biederman
1 sibling, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2007-04-30 16:16 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel, Andrew Morton, virtualization
Eric W. Biederman wrote:
> When in PAE mode we require that the user kernel divide to be
> on a 1G boundary. The 2G/2G split does not have that property
> so require !X86_PAE
?????
-hpa
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format.
2007-04-30 16:15 ` [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format Eric W. Biederman
@ 2007-04-30 16:26 ` Andi Kleen
2007-04-30 16:32 ` [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support Eric W. Biederman
[not found] ` <200704301826.57920.ak@suse.de>
2 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2007-04-30 16:26 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-kernel, virtualization, James Bottomley, H. Peter Anvin,
Andrew Morton
On Monday 30 April 2007 18:15:08 Eric W. Biederman wrote:
>
> Currently we have a lot of special case code and a lot of limitations
> because we cannot count on the initial boot time page tables being in
> the format our page table handling routines know how to manipulate.
> So this patch rewrites the code that initializes our boot time page
> tables.
Sounds reasonable, but for post .22
-Andi
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support.
2007-04-30 16:15 ` [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format Eric W. Biederman
2007-04-30 16:26 ` Andi Kleen
@ 2007-04-30 16:32 ` Eric W. Biederman
2007-04-30 16:32 ` [PATCH 10/12] i386: Introduce head32.c Eric W. Biederman
` (2 more replies)
[not found] ` <200704301826.57920.ak@suse.de>
2 siblings, 3 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:32 UTC (permalink / raw)
To: Andi Kleen
Cc: Greg Kroah-Hartman, linux-kernel, virtualization, H. Peter Anvin,
Andrew Morton
With legacy-free systems serial ports have stopped being an option
to get early boot traces and other debug information out of a machine.
EHCI USB controllers provide a relatively simple debug interface
that can control port 1 of the root hub. This interface is limited
to 8-byte packets so it cannot be used with most USB devices. But
with a USB debug device this is sufficient to talk to another machine.
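To make the 8-byte limit concrete, here is a small user-space sketch (not part of the patch). It packs up to 8 bytes into two 32-bit words the same way dbgp_set_data() in the patch below does before writing them to the data03 and data47 registers, and it counts how many transactions a longer message needs. The names struct dbgp_words, dbgp_pack() and dbgp_packets_needed() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DBGP_MAX_PACKET 8	/* the debug port moves at most 8 bytes at a time */

/* Hypothetical stand-in for the two 32-bit data registers. */
struct dbgp_words {
	uint32_t lo;	/* bytes 0-3, byte 0 in the low 8 bits */
	uint32_t hi;	/* bytes 4-7, byte 4 in the low 8 bits */
};

/* Pack up to 8 bytes, least significant byte first, into two words. */
static struct dbgp_words dbgp_pack(const unsigned char *bytes, size_t size)
{
	struct dbgp_words w = { 0, 0 };
	size_t i;

	assert(size <= DBGP_MAX_PACKET);
	for (i = 0; i < 4 && i < size; i++)
		w.lo |= (uint32_t)bytes[i] << (8 * i);
	for (; i < 8 && i < size; i++)
		w.hi |= (uint32_t)bytes[i] << (8 * (i - 4));
	return w;
}

/* Number of debug port transactions needed to send n bytes. */
static size_t dbgp_packets_needed(size_t n)
{
	return (n + DBGP_MAX_PACKET - 1) / DBGP_MAX_PACKET;
}
```

A console write of any length therefore has to be cut into chunks of at most 8 bytes, which is what early_dbgp_write() in the patch does.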
When the debug feature of the EHCI is not enabled, port 1 of the
root hub acts just like any other USB port, so machines
with the necessary support are widely available.
This debug device can be used to replace serial ports for
kgdb, kdb, and console support. And gregkh has a simple usb
serial driver for it so user space applications that control
serial ports should work unmodified.
Currently there appears to be only one manufacturer of debug
devices, see:
http://www.plxtech.com/products/NET2000/NET20DC/default.asp
I think simple RS232 serial ports provide a nicer and simpler
interface but the usb debug port looks like a functional alternative
when you don't have that.
My code likely doesn't handle all of the corner cases yet. But this
is getting it out there so other people can start using it and help
make clean drivers. When writing a polling driver you do have to be
careful with your logic, because if you do things like reset a usb
device at the wrong time you can completely confuse various EHCI
controllers.
My driver should be sufficient to work with any EHCI in a relatively
clean state, and needs no special BIOS support, just the hardware.
This appears to be different from the way the Windows drivers are
using these debug devices.
The big dependency that was holding this code back was the requirement
for early fixmap support and that is now present (early in this patchset)
so things should work relatively cleanly.
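As a rough illustration of why the fixmap matters here (a user-space sketch, not code from the patch): a fixmap slot is a virtual page at a compile-time-known address counted down from the top of the fixmap area, so mapping a device only requires filling in one pte. The sketch shows the address arithmetic that early_dbgp_init() relies on when it maps the page containing the EHCI BAR and then adds the in-page offset back. EXAMPLE_FIXADDR_TOP, EXAMPLE_FIX_SLOTS and the example_* names are assumptions chosen for illustration; the real values are architecture specific.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Assumed values, for illustration only. */
#define EXAMPLE_FIXADDR_TOP 0xfffff000UL
#define EXAMPLE_FIX_SLOTS   64

/* Each fixmap index names one page, counting down from the top. */
static unsigned long example_fix_to_virt(unsigned int idx)
{
	assert(idx < EXAMPLE_FIX_SLOTS);
	return EXAMPLE_FIXADDR_TOP - ((unsigned long)idx << PAGE_SHIFT);
}

/*
 * Virtual address of a register block at physical address phys once the
 * page containing it (phys & PAGE_MASK) has been mapped at slot idx.
 */
static unsigned long example_bar_virt(unsigned int idx, unsigned long phys)
{
	unsigned long offset = phys & ~PAGE_MASK;	/* position inside the page */

	return example_fix_to_virt(idx) + offset;
}
```

With these assumed values, a BAR at physical 0xfebff400 mapped at slot 2 would be reachable at 0xffffd400.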
There is currently a conflict between the early debug code and
ehci-hcd: if you leave the early debug printk enabled after the
normal console hand off point ("earlyprintk=dbgp,keep"), the ehci-hcd
driver will hang the kernel while initializing, despite detecting
that the early debug port is in use.
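For reference, a user-space sketch of how an option string such as "dbgp1,keep" could be taken apart. The "dbgp" prefix test and the decimal controller number follow setup_early_printk() and early_dbgp_init() in the patch below; treating any occurrence of "keep" as the keep flag is an assumption about the surrounding code, and struct dbgp_opts and parse_earlyprintk() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct dbgp_opts {
	int is_dbgp;		/* option selects the EHCI debug port */
	unsigned long num;	/* which debug-capable EHCI controller, 0 based */
	int keep;		/* keep the early console after the hand off */
};

static struct dbgp_opts parse_earlyprintk(const char *buf)
{
	struct dbgp_opts o = { 0, 0, 0 };

	assert(buf != NULL);
	if (strncmp(buf, "dbgp", 4) != 0)
		return o;
	o.is_dbgp = 1;
	/* No digits after "dbgp" yields 0, i.e. the first controller. */
	o.num = strtoul(buf + 4, NULL, 10);
	/* Assumption: "keep" anywhere in the string sets the flag. */
	o.keep = strstr(buf, "keep") != NULL;
	return o;
}
```

Until the ehci-hcd conflict is resolved, booting without ",keep" avoids the hang described above.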
For users the hard part looks like it will be finding cables,
finding which port is usb debug port 1, and realizing that there is
flow control, so the kernel boot will not proceed if no one is
reading the serial console data.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/x86_64/kernel/early_printk.c | 571 +++++++++++++++++++++++++++++++++++++
drivers/usb/host/ehci.h | 8 +
include/asm-i386/fixmap.h | 1 +
include/asm-x86_64/fixmap.h | 1 +
4 files changed, 581 insertions(+), 0 deletions(-)
diff --git a/arch/x86_64/kernel/early_printk.c b/arch/x86_64/kernel/early_printk.c
index 92213d2..dc097aa 100644
--- a/arch/x86_64/kernel/early_printk.c
+++ b/arch/x86_64/kernel/early_printk.c
@@ -3,9 +3,19 @@
#include <linux/init.h>
#include <linux/string.h>
#include <linux/screen_info.h>
+#include <linux/usb/ch9.h>
+#include <linux/pci_regs.h>
+#include <linux/pci_ids.h>
+#include <linux/errno.h>
#include <asm/io.h>
#include <asm/processor.h>
#include <asm/fcntl.h>
+#include <asm/pci-direct.h>
+#include <asm/pgtable.h>
+#include <asm/fixmap.h>
+#define EARLY_PRINTK
+#include "../../../drivers/usb/host/ehci.h"
+
/* Simple VGA output */
@@ -155,6 +165,555 @@ static struct console early_serial_console = {
.index = -1,
};
+
+static struct ehci_caps __iomem *ehci_caps;
+static struct ehci_regs __iomem *ehci_regs;
+static struct ehci_dbg_port __iomem *ehci_debug;
+static unsigned dbgp_endpoint_out;
+
+#define USB_DEBUG_DEVNUM 127
+
+#define DBGP_DATA_TOGGLE 0x8800
+#define DBGP_PID_UPDATE(x, tok) \
+ ((((x) ^ DBGP_DATA_TOGGLE) & 0xffff00) | ((tok) & 0xff))
+
+#define DBGP_LEN_UPDATE(x, len) (((x) & ~0x0f) | ((len) & 0x0f))
+/*
+ * USB Packet IDs (PIDs)
+ */
+
+/* token */
+#define USB_PID_OUT 0xe1
+#define USB_PID_IN 0x69
+#define USB_PID_SOF 0xa5
+#define USB_PID_SETUP 0x2d
+/* handshake */
+#define USB_PID_ACK 0xd2
+#define USB_PID_NAK 0x5a
+#define USB_PID_STALL 0x1e
+#define USB_PID_NYET 0x96
+/* data */
+#define USB_PID_DATA0 0xc3
+#define USB_PID_DATA1 0x4b
+#define USB_PID_DATA2 0x87
+#define USB_PID_MDATA 0x0f
+/* Special */
+#define USB_PID_PREAMBLE 0x3c
+#define USB_PID_ERR 0x3c
+#define USB_PID_SPLIT 0x78
+#define USB_PID_PING 0xb4
+#define USB_PID_UNDEF_0 0xf0
+
+#define USB_PID_DATA_TOGGLE 0x88
+#define DBGP_CLAIM (DBGP_OWNER | DBGP_ENABLED | DBGP_INUSE)
+
+#define PCI_CAP_ID_EHCI_DEBUG 0xa
+
+#define HUB_ROOT_RESET_TIME 50 /* times are in msec */
+#define HUB_SHORT_RESET_TIME 10
+#define HUB_LONG_RESET_TIME 200
+#define HUB_RESET_TIMEOUT 500
+
+#define DBGP_MAX_PACKET 8
+
+static int dbgp_wait_until_complete(void)
+{
+ unsigned ctrl;
+ for (;;) {
+ ctrl = readl(&ehci_debug->control);
+ /* Stop when the transaction is finished */
+ if (ctrl & DBGP_DONE)
+ break;
+ }
+ /* Now that we have observed the completed transaction,
+ * clear the done bit.
+ */
+ writel(ctrl | DBGP_DONE, &ehci_debug->control);
+ return (ctrl & DBGP_ERROR) ? -DBGP_ERRCODE(ctrl) : DBGP_LEN(ctrl);
+}
+
+static void dbgp_mdelay(int ms)
+{
+ int i;
+ while (ms--) {
+ for (i = 0; i < 1000; i++)
+ outb(0x1, 0x80);
+ }
+}
+
+static void dbgp_breath(void)
+{
+ /* Sleep to give the debug port a chance to breathe */
+}
+
+static int dbgp_wait_until_done(unsigned ctrl)
+{
+ unsigned pids, lpid;
+ int ret;
+
+retry:
+ writel(ctrl | DBGP_GO, &ehci_debug->control);
+ ret = dbgp_wait_until_complete();
+ pids = readl(&ehci_debug->pids);
+ lpid = DBGP_PID_GET(pids);
+
+ if (ret < 0)
+ return ret;
+
+ /* If the port is getting full or it has dropped data
+ * start pacing ourselves, not necessary but it's friendly.
+ */
+ if ((lpid == USB_PID_NAK) || (lpid == USB_PID_NYET))
+ dbgp_breath();
+
+ /* If I get a NAK, reissue the transmission */
+ if (lpid == USB_PID_NAK)
+ goto retry;
+
+ return ret;
+}
+
+static void dbgp_set_data(const void *buf, int size)
+{
+ const unsigned char *bytes = buf;
+ unsigned lo, hi;
+ int i;
+ lo = hi = 0;
+ for (i = 0; i < 4 && i < size; i++)
+ lo |= bytes[i] << (8*i);
+ for (; i < 8 && i < size; i++)
+ hi |= bytes[i] << (8*(i - 4));
+ writel(lo, &ehci_debug->data03);
+ writel(hi, &ehci_debug->data47);
+}
+
+static void dbgp_get_data(void *buf, int size)
+{
+ unsigned char *bytes = buf;
+ unsigned lo, hi;
+ int i;
+ lo = readl(&ehci_debug->data03);
+ hi = readl(&ehci_debug->data47);
+ for (i = 0; i < 4 && i < size; i++)
+ bytes[i] = (lo >> (8*i)) & 0xff;
+ for (; i < 8 && i < size; i++)
+ bytes[i] = (hi >> (8*(i - 4))) & 0xff;
+}
+
+static int dbgp_bulk_write(unsigned devnum, unsigned endpoint, const char *bytes, int size)
+{
+ unsigned pids, addr, ctrl;
+ int ret;
+ if (size > DBGP_MAX_PACKET)
+ return -1;
+
+ addr = DBGP_EPADDR(devnum, endpoint);
+
+ pids = readl(&ehci_debug->pids);
+ pids = DBGP_PID_UPDATE(pids, USB_PID_OUT);
+
+ ctrl = readl(&ehci_debug->control);
+ ctrl = DBGP_LEN_UPDATE(ctrl, size);
+ ctrl |= DBGP_OUT;
+ ctrl |= DBGP_GO;
+
+ dbgp_set_data(bytes, size);
+ writel(addr, &ehci_debug->address);
+ writel(pids, &ehci_debug->pids);
+
+ ret = dbgp_wait_until_done(ctrl);
+ if (ret < 0) {
+ return ret;
+ }
+ return ret;
+}
+
+static int dbgp_bulk_read(unsigned devnum, unsigned endpoint, void *data, int size)
+{
+ unsigned pids, addr, ctrl;
+ int ret;
+
+ if (size > DBGP_MAX_PACKET)
+ return -1;
+
+ addr = DBGP_EPADDR(devnum, endpoint);
+
+ pids = readl(&ehci_debug->pids);
+ pids = DBGP_PID_UPDATE(pids, USB_PID_IN);
+
+ ctrl = readl(&ehci_debug->control);
+ ctrl = DBGP_LEN_UPDATE(ctrl, size);
+ ctrl &= ~DBGP_OUT;
+ ctrl |= DBGP_GO;
+
+ writel(addr, &ehci_debug->address);
+ writel(pids, &ehci_debug->pids);
+ ret = dbgp_wait_until_done(ctrl);
+ if (ret < 0)
+ return ret;
+ if (size > ret)
+ size = ret;
+ dbgp_get_data(data, size);
+ return ret;
+}
+
+static int dbgp_control_msg(unsigned devnum, int requesttype, int request,
+ int value, int index, void *data, int size)
+{
+ unsigned pids, addr, ctrl;
+ struct usb_ctrlrequest req;
+ int read;
+ int ret;
+
+ read = (requesttype & USB_DIR_IN) != 0;
+ if (size > (read?DBGP_MAX_PACKET:0))
+ return -1;
+
+ /* Compute the control message */
+ req.bRequestType = requesttype;
+ req.bRequest = request;
+ req.wValue = value;
+ req.wIndex = index;
+ req.wLength = size;
+
+ pids = DBGP_PID_SET(USB_PID_DATA0, USB_PID_SETUP);
+ addr = DBGP_EPADDR(devnum, 0);
+
+ ctrl = readl(&ehci_debug->control);
+ ctrl = DBGP_LEN_UPDATE(ctrl, sizeof(req));
+ ctrl |= DBGP_OUT;
+ ctrl |= DBGP_GO;
+
+ /* Send the setup message */
+ dbgp_set_data(&req, sizeof(req));
+ writel(addr, &ehci_debug->address);
+ writel(pids, &ehci_debug->pids);
+ ret = dbgp_wait_until_done(ctrl);
+ if (ret < 0)
+ return ret;
+
+
+ /* Read the result */
+ ret = dbgp_bulk_read(devnum, 0, data, size);
+ return ret;
+}
+
+
+/* Find a PCI capability */
+static __u32 __init find_cap(int num, int slot, int func, int cap)
+{
+ u8 pos;
+ int bytes;
+ if (!(read_pci_config_16(num,slot,func,PCI_STATUS) & PCI_STATUS_CAP_LIST))
+ return 0;
+ pos = read_pci_config_byte(num,slot,func,PCI_CAPABILITY_LIST);
+ for (bytes = 0; bytes < 48 && pos >= 0x40; bytes++) {
+ u8 id;
+ pos &= ~3;
+ id = read_pci_config_byte(num,slot,func,pos+PCI_CAP_LIST_ID);
+ if (id == 0xff)
+ break;
+ if (id == cap)
+ return pos;
+ pos = read_pci_config_byte(num,slot,func,pos+PCI_CAP_LIST_NEXT);
+ }
+ return 0;
+}
+
+static __u32 __init find_dbgp(int ehci_num, unsigned *rbus, unsigned *rslot, unsigned *rfunc)
+{
+ unsigned bus, slot, func;
+
+ for (bus = 0; bus < 256; bus++) {
+ for (slot = 0; slot < 32; slot++) {
+ for (func = 0; func < 8; func++) {
+ u32 class;
+ unsigned cap;
+ class = read_pci_config(bus, slot, func, PCI_CLASS_REVISION);
+ if ((class >> 8) != PCI_CLASS_SERIAL_USB_EHCI)
+ continue;
+ cap = find_cap(bus, slot, func, PCI_CAP_ID_EHCI_DEBUG);
+ if (!cap)
+ continue;
+ if (ehci_num-- != 0)
+ continue;
+ *rbus = bus;
+ *rslot = slot;
+ *rfunc = func;
+ return cap;
+ }
+ }
+ }
+ return 0;
+}
+
+static int ehci_reset_port(int port)
+{
+ unsigned portsc;
+ unsigned delay_time, delay;
+
+ /* Reset the usb debug port */
+ portsc = readl(&ehci_regs->port_status[port - 1]);
+ portsc &= ~PORT_PE;
+ portsc |= PORT_RESET;
+ writel(portsc, &ehci_regs->port_status[port - 1]);
+
+ delay = HUB_ROOT_RESET_TIME;
+ for (delay_time = 0; delay_time < HUB_RESET_TIMEOUT;
+ delay_time += delay) {
+ dbgp_mdelay(delay);
+
+ portsc = readl(&ehci_regs->port_status[port - 1]);
+ if (portsc & PORT_RESET) {
+ /* force reset to complete */
+ writel(portsc & ~(PORT_RWC_BITS | PORT_RESET),
+ &ehci_regs->port_status[port - 1]);
+ while (portsc & PORT_RESET)
+ portsc = readl(&ehci_regs->port_status[port - 1]);
+ }
+
+ /* Device went away? */
+ if (!(portsc & PORT_CONNECT))
+ return -ENOTCONN;
+
+ /* bomb out completely if something weird happened */
+ if ((portsc & PORT_CSC))
+ return -EINVAL;
+
+ /* If we've finished resetting, then break out of the loop */
+ if (!(portsc & PORT_RESET) && (portsc & PORT_PE))
+ return 0;
+ }
+ return -EBUSY;
+}
+
+static int ehci_wait_for_port(int port)
+{
+ unsigned status;
+ int ret, reps;
+ for (reps = 0; reps >= 0; reps++) {
+ status = readl(&ehci_regs->status);
+ if (status & STS_PCD) {
+ ret = ehci_reset_port(port);
+ if (ret == 0)
+ return 0;
+ }
+ }
+ return -ENOTCONN;
+}
+
+
+#define DBGP_DEBUG 0
+#if DBGP_DEBUG
+void early_printk(const char *fmt, ...);
+# define dbgp_printk early_printk
+#else
+static inline void dbgp_printk(const char *fmt, ...) { }
+#endif
+
+static int ehci_setup(void)
+{
+ unsigned cmd, ctrl, status, portsc, hcs_params, debug_port, n_ports;
+ int ret;
+
+ hcs_params = readl(&ehci_caps->hcs_params);
+ debug_port = HCS_DEBUG_PORT(hcs_params);
+ n_ports = HCS_N_PORTS(hcs_params);
+
+ dbgp_printk("debug_port: %d\n", debug_port);
+ dbgp_printk("n_ports: %d\n", n_ports);
+
+ /* Reset the EHCI controller */
+ cmd = readl(&ehci_regs->command);
+ cmd |=CMD_RESET;
+ writel(cmd, &ehci_regs->command);
+ while (cmd & CMD_RESET)
+ cmd = readl(&ehci_regs->command);
+
+ /* Claim ownership, but do not enable yet */
+ ctrl = readl(&ehci_debug->control);
+ ctrl |= DBGP_OWNER;
+ ctrl &= ~(DBGP_ENABLED | DBGP_INUSE);
+ writel(ctrl, &ehci_debug->control);
+
+ /* Start the ehci running */
+ cmd = readl(&ehci_regs->command);
+ cmd &= ~(CMD_LRESET | CMD_IAAD | CMD_PSE | CMD_ASE | CMD_RESET);
+ cmd |= CMD_RUN;
+ writel(cmd, &ehci_regs->command);
+
+ /* Ensure everything is routed to the EHCI */
+ writel(FLAG_CF, &ehci_regs->configured_flag);
+
+ /* Wait until the controller is no longer halted */
+ do {
+ status = readl(&ehci_regs->status);
+ } while (status & STS_HALT);
+
+ /* Wait for a device to show up in the debug port */
+ ret = ehci_wait_for_port(debug_port);
+ if (ret < 0) {
+ dbgp_printk("No device found in debug port\n");
+ return -1;
+ }
+
+ /* Enable the debug port */
+ ctrl = readl(&ehci_debug->control);
+ ctrl |= DBGP_CLAIM;
+ writel(ctrl, &ehci_debug->control);
+ ctrl = readl(&ehci_debug->control);
+ if ((ctrl & DBGP_CLAIM) != DBGP_CLAIM) {
+ dbgp_printk("No device in debug port\n");
+ writel(ctrl & ~DBGP_CLAIM, &ehci_debug->control);
+ return -1;
+
+ }
+
+ /* Completely transfer the debug device to the debug controller */
+ portsc = readl(&ehci_regs->port_status[debug_port - 1]);
+ portsc &= ~PORT_PE;
+ writel(portsc, &ehci_regs->port_status[debug_port - 1]);
+
+ return 0;
+}
+
+static __init void early_dbgp_init(char *s)
+{
+ struct usb_debug_descriptor dbgp_desc;
+ void __iomem *ehci_bar;
+ unsigned ctrl, devnum;
+ unsigned bus, slot, func, cap;
+ unsigned debug_port, bar, offset;
+ unsigned bar_val;
+ unsigned dbgp_num;
+ char *e;
+ int ret;
+
+ if (!early_pci_allowed())
+ return;
+
+ dbgp_num = 0;
+ if (*s) {
+ dbgp_num = simple_strtoul(s, &e, 10);
+ }
+ dbgp_printk("dbgp_num: %d\n", dbgp_num);
+ cap = find_dbgp(dbgp_num, &bus, &slot, &func);
+ if (!cap)
+ return;
+
+ dbgp_printk("Found EHCI debug port\n");
+
+ debug_port = read_pci_config(bus, slot, func, cap);
+ bar = (debug_port >> 29) & 0x7;
+ bar = (bar * 4) + 0xc;
+ offset = (debug_port >> 16) & 0xfff;
+ if (bar != PCI_BASE_ADDRESS_0) {
+ dbgp_printk("only debug ports on bar 1 handled.\n");
+ return;
+ }
+
+ /* FIXME this assumes the bar is a 32bit mmio bar */
+ bar_val = read_pci_config(bus, slot, func, PCI_BASE_ADDRESS_0);
+
+ /* FIXME I don't have the bar size so just guess PAGE_SIZE is more
+ * than enough. 1K is the biggest I have seen.
+ */
+ set_fixmap_nocache(FIX_DBGP_BASE, bar_val & PAGE_MASK);
+ ehci_bar = (void __iomem *)fix_to_virt(FIX_DBGP_BASE);
+ ehci_bar += bar_val & ~PAGE_MASK;
+
+ ehci_caps = ehci_bar;
+ ehci_regs = ehci_bar + HC_LENGTH(readl(&ehci_caps->hc_capbase));
+ ehci_debug = ehci_bar + offset;
+
+ ret = ehci_setup();
+ if (ret < 0) {
+ dbgp_printk("ehci_setup failed\n");
+ return;
+ }
+
+ /* Find the debug device and make it device number 127 */
+ for (devnum = 0; devnum <= 127; devnum++) {
+ ret = dbgp_control_msg(devnum,
+ USB_DIR_IN | USB_TYPE_STANDARD | USB_RECIP_DEVICE,
+ USB_REQ_GET_DESCRIPTOR, (USB_DT_DEBUG << 8), 0,
+ &dbgp_desc, sizeof(dbgp_desc));
+ if (ret > 0)
+ break;
+ }
+ if (devnum > 127) {
+ dbgp_printk("Could not find attached debug device\n");
+ goto err;
+ }
+ if (ret < 0) {
+ dbgp_printk("Attached device is not a debug device\n");
+ goto err;
+ }
+ dbgp_endpoint_out = dbgp_desc.bDebugOutEndpoint;
+
+ /* Move the device to 127 if it isn't already there */
+ if (devnum != USB_DEBUG_DEVNUM) {
+ ret = dbgp_control_msg(devnum,
+ USB_DIR_OUT | USB_TYPE_STANDARD | USB_RECIP_DEVICE,
+ USB_REQ_SET_ADDRESS, USB_DEBUG_DEVNUM, 0, NULL, 0);
+ if (ret < 0) {
+ dbgp_printk("Could not move attached device to %d\n",
+ USB_DEBUG_DEVNUM);
+ goto err;
+ }
+ devnum = USB_DEBUG_DEVNUM;
+ }
+
+ /* Enable the debug interface */
+ ret = dbgp_control_msg(USB_DEBUG_DEVNUM,
+ USB_DIR_OUT | USB_TYPE_STANDARD | USB_RECIP_DEVICE,
+ USB_REQ_SET_FEATURE, USB_DEVICE_DEBUG_MODE, 0, NULL, 0);
+ if (ret < 0) {
+ dbgp_printk(" Could not enable the debug device\n");
+ goto err;
+ }
+
+ /* Perform a small write to get the even/odd data state in sync
+ */
+ ret = dbgp_bulk_write(USB_DEBUG_DEVNUM, dbgp_endpoint_out, " ",1);
+ if (ret < 0) {
+ dbgp_printk("dbgp_bulk_write failed: %d\n", ret);
+ goto err;
+ }
+
+
+ return;
+err:
+ /* Things didn't work so remove my claim */
+ ctrl = readl(&ehci_debug->control);
+ ctrl &= ~(DBGP_CLAIM | DBGP_OUT);
+ writel(ctrl, &ehci_debug->control);
+ return;
+}
+
+static void early_dbgp_write(struct console *con, const char *str, unsigned n)
+{
+ int chunk, ret;
+ if (!ehci_debug)
+ return;
+ while (n > 0) {
+ chunk = n;
+ if (chunk > DBGP_MAX_PACKET)
+ chunk = DBGP_MAX_PACKET;
+ ret = dbgp_bulk_write(USB_DEBUG_DEVNUM,
+ dbgp_endpoint_out, str, chunk);
+ str += chunk;
+ n -= chunk;
+ }
+}
+
+static struct console early_dbgp_console = {
+ .name = "earlydbg",
+ .write = early_dbgp_write,
+ .flags = CON_PRINTBUFFER,
+ .index = -1,
+};
+
/* Console interface to a host file on AMD's SimNow! */
static int simnow_fd;
@@ -242,8 +801,20 @@ static int __init setup_early_printk(char *buf)
simnow_init(buf + 6);
early_console = &simnow_console;
keep_early = 1;
+ } else if (!strncmp(buf, "dbgp", 4)) {
+ early_dbgp_init(buf + 4);
+ early_console = &early_dbgp_console;
}
register_console(early_console);
+#if DBGP_DEBUG
+ {
+ static const char dbgp_test_str[] =
+ "The quick brown fox jumped over the lazy dog!\n";
+ early_dbgp_init("");
+ early_dbgp_write(&early_dbgp_console,
+ dbgp_test_str, sizeof(dbgp_test_str) - 1);
+ }
+#endif
return 0;
}
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index 46fa57a..8455f13 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -46,6 +46,7 @@ struct ehci_stats {
#define EHCI_MAX_ROOT_PORTS 15 /* see HCS_N_PORTS */
+#ifndef EARLY_PRINTK
struct ehci_hcd { /* one per controller */
/* glue to PCI and HCD framework */
struct ehci_caps __iomem *caps;
@@ -166,6 +167,7 @@ timer_action (struct ehci_hcd *ehci, enum ehci_timer_action action)
mod_timer (&ehci->watchdog, t);
}
}
+#endif /* EARLY_PRINTK */
/*-------------------------------------------------------------------------*/
@@ -390,6 +392,7 @@ union ehci_shadow {
* These appear in both the async and (for interrupt) periodic schedules.
*/
+#ifndef EARLY_PRINTK
struct ehci_qh {
/* first part defined by EHCI spec */
__le32 hw_next; /* see EHCI 3.6.1 */
@@ -438,6 +441,7 @@ struct ehci_qh {
#define NO_FRAME ((unsigned short)~0) /* pick new start */
struct usb_device *dev; /* access to TT */
} __attribute__ ((aligned (32)));
+#endif /* EARLY_PRINTK */
/*-------------------------------------------------------------------------*/
@@ -607,6 +611,8 @@ struct ehci_fstn {
union ehci_shadow fstn_next; /* ptr to periodic q entry */
} __attribute__ ((aligned (32)));
+#ifndef EARLY_PRINTK
+
/*-------------------------------------------------------------------------*/
#ifdef CONFIG_USB_EHCI_ROOT_HUB_TT
@@ -704,4 +710,6 @@ static inline void ehci_writel (const struct ehci_hcd *ehci,
/*-------------------------------------------------------------------------*/
+#endif /* EARLY_PRINTK */
+
#endif /* __LINUX_EHCI_HCD_H */
diff --git a/include/asm-i386/fixmap.h b/include/asm-i386/fixmap.h
index 80ea052..c644922 100644
--- a/include/asm-i386/fixmap.h
+++ b/include/asm-i386/fixmap.h
@@ -54,6 +54,7 @@ extern unsigned long __FIXADDR_TOP;
enum fixed_addresses {
FIX_HOLE,
FIX_VDSO,
+ FIX_DBGP_BASE,
#ifdef CONFIG_X86_LOCAL_APIC
FIX_APIC_BASE, /* local (CPU) APIC) -- required for SMP or not */
#endif
diff --git a/include/asm-x86_64/fixmap.h b/include/asm-x86_64/fixmap.h
index e90e167..3e716d9 100644
--- a/include/asm-x86_64/fixmap.h
+++ b/include/asm-x86_64/fixmap.h
@@ -35,6 +35,7 @@ enum fixed_addresses {
VSYSCALL_LAST_PAGE,
VSYSCALL_FIRST_PAGE = VSYSCALL_LAST_PAGE + ((VSYSCALL_END-VSYSCALL_START) >> PAGE_SHIFT) - 1,
VSYSCALL_HPET,
+ FIX_DBGP_BASE,
FIX_HPET_BASE,
FIX_APIC_BASE, /* local (CPU) APIC) -- required for SMP or not */
FIX_IO_APIC_BASE_0,
--
1.5.1.1.181.g2de0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 10/12] i386: Introduce head32.c
2007-04-30 16:32 ` [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support Eric W. Biederman
@ 2007-04-30 16:32 ` Eric W. Biederman
2007-04-30 16:33 ` [PATCH 11/12] i386: Move setup_idt from head.S to head32.c Eric W. Biederman
2007-04-30 17:56 ` [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support Andi Kleen
[not found] ` <20070430175607.GD25929@bingen.suse.de>
2 siblings, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:32 UTC (permalink / raw)
To: Andi Kleen
Cc: Greg Kroah-Hartman, linux-kernel, virtualization, H. Peter Anvin,
Andrew Morton
Copy x86_64 and add a head32.c so we can start moving early
architecture initialization out of assembly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/Makefile | 2 +-
arch/i386/kernel/Makefile | 2 +-
arch/i386/kernel/head.S | 2 +-
arch/i386/kernel/head32.c | 14 ++++++++++++++
4 files changed, 17 insertions(+), 3 deletions(-)
create mode 100644 arch/i386/kernel/head32.c
diff --git a/arch/i386/Makefile b/arch/i386/Makefile
index 6dc5e5d..89e6f6f 100644
--- a/arch/i386/Makefile
+++ b/arch/i386/Makefile
@@ -96,7 +96,7 @@ core-$(CONFIG_X86_ES7000) := arch/i386/mach-es7000/
# default subarch .h files
mflags-y += -Iinclude/asm-i386/mach-default
-head-y := arch/i386/kernel/head.o arch/i386/kernel/init_task.o
+head-y := arch/i386/kernel/head.o arch/i386/kernel/head32.o arch/i386/kernel/init_task.o
libs-y += arch/i386/lib/
core-y += arch/i386/kernel/ \
diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 4c96141..530ccaa 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -2,7 +2,7 @@
# Makefile for the linux kernel.
#
-extra-y := head.o init_task.o vmlinux.lds
+extra-y := head.o head32.o init_task.o vmlinux.lds
obj-y := process.o signal.o entry.o traps.o irq.o \
ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_i386.o \
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index de65f45..22ddb3f 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -300,7 +300,7 @@ is386: movl $2,%ecx # set MP
jmp initialize_secondary # all other CPUs call initialize_secondary
1:
#endif /* CONFIG_SMP */
- jmp start_kernel
+ jmp i386_start_kernel
/*
* We depend on ET to be correct. This checks for 287/387.
diff --git a/arch/i386/kernel/head32.c b/arch/i386/kernel/head32.c
new file mode 100644
index 0000000..3db0590
--- /dev/null
+++ b/arch/i386/kernel/head32.c
@@ -0,0 +1,14 @@
+/*
+ * linux/arch/i386/kernel/head32.c -- prepare to run common code
+ *
+ * Copyright (C) 2000 Andrea Arcangeli <andrea@suse.de> SuSE
+ * Copyright (C) 2007 Eric Biederman <ebiederm@xmission.com>
+ */
+
+#include <linux/init.h>
+#include <linux/start_kernel.h>
+
+void __init i386_start_kernel(void)
+{
+ start_kernel();
+}
--
1.5.1.1.181.g2de0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 11/12] i386: Move setup_idt from head.S to head32.c
2007-04-30 16:32 ` [PATCH 10/12] i386: Introduce head32.c Eric W. Biederman
@ 2007-04-30 16:33 ` Eric W. Biederman
2007-04-30 16:35 ` [PATCH 12/12] i386: remove cpuid checking in head.S Eric W. Biederman
0 siblings, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:33 UTC (permalink / raw)
To: Andi Kleen
Cc: Greg Kroah-Hartman, linux-kernel, virtualization, H. Peter Anvin,
Andrew Morton
This slightly delays when we set up the IDT, but doing it in C
makes things noticeably simpler.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/kernel/head.S | 68 +++-----------------------------------------
arch/i386/kernel/head32.c | 26 +++++++++++++++++
2 files changed, 31 insertions(+), 63 deletions(-)
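
For readers following the removed assembly, here is a minimal user-space
sketch of how a 32-bit interrupt gate is packed into two 32-bit words,
mirroring the %eax/%edx values the old setup_idt built. It is not the
kernel's set_intr_gate(); the names sketch_gate, sketch_pack_intr_gate and
sketch_fill_idt are hypothetical, and the selector is passed in by the
caller rather than taken from __KERNEL_CS.

```c
#include <stdint.h>

/* Type/attribute word used by the removed assembly:
 * present, DPL 0, 32-bit interrupt gate. */
#define SKETCH_INTR_GATE 0x8E00u

/* Hypothetical stand-in for one 8-byte IDT entry. */
struct sketch_gate {
	uint32_t lo;	/* selector in bits 31..16, handler offset 15..0 */
	uint32_t hi;	/* handler offset 31..16, type/attributes 15..0 */
};

/* Pack one gate the same way the assembly did with %eax and %edx. */
struct sketch_gate sketch_pack_intr_gate(uint32_t handler, uint16_t selector)
{
	struct sketch_gate g;

	g.lo = ((uint32_t)selector << 16) | (handler & 0xFFFFu);
	g.hi = (handler & 0xFFFF0000u) | SKETCH_INTR_GATE;
	return g;
}

/* Point every entry at a default handler; callers override a few after. */
void sketch_fill_idt(struct sketch_gate *idt, unsigned int entries,
		     uint32_t default_handler, uint16_t selector)
{
	unsigned int i;

	for (i = 0; i < entries; i++)
		idt[i] = sketch_pack_intr_gate(default_handler, selector);
}
```

Filling the whole table first and then overwriting vectors 0, 6, 13 and 14
is the same order the C setup_idt in this patch uses.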
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index 22ddb3f..0ee615b 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -197,19 +197,6 @@ ENTRY(startup_32_smp)
pushl $0
popfl
-#ifdef CONFIG_SMP
- cmpb $0, ready
- jz 1f /* Initial CPU cleans BSS */
- jmp checkCPUtype
-1:
-#endif /* CONFIG_SMP */
-
-/*
- * start system 32-bit setup. We need to re-do some of the things done
- * in 16-bit mode for the "real" operations.
- */
- call setup_idt
-
checkCPUtype:
movl $-1,X86_CPUID # -1 for no CPUID initially
@@ -274,7 +261,6 @@ is386: movl $2,%ecx # set MP
call check_x87
lgdt early_gdt_descr
- lidt idt_descr
ljmp $(__KERNEL_CS),$1f
1: movl $(__KERNEL_DS),%eax # reload all the segment registers
movl %eax,%ss # after changing gdt.
@@ -321,65 +307,21 @@ check_x87:
.byte 0xDB,0xE4 /* fsetpm for 287, ignored by 387 */
ret
-/*
- * setup_idt
- *
- * sets up a idt with 256 entries pointing to
- * ignore_int, interrupt gates. It doesn't actually load
- * idt - that can be done only after paging has been enabled
- * and the kernel moved to PAGE_OFFSET. Interrupts
- * are enabled elsewhere, when we can be relatively
- * sure everything is ok.
- *
- * Warning: %esi is live across this function.
- */
-setup_idt:
- lea ignore_int,%edx
- movl $(__KERNEL_CS << 16),%eax
- movw %dx,%ax /* selector = 0x0010 = cs */
- movw $0x8E00,%dx /* interrupt gate - dpl=0, present */
-
- lea idt_table,%edi
- mov $256,%ecx
-rp_sidt:
- movl %eax,(%edi)
- movl %edx,4(%edi)
- addl $8,%edi
- dec %ecx
- jne rp_sidt
-
-.macro set_early_handler handler,trapno
- lea \handler,%edx
- movl $(__KERNEL_CS << 16),%eax
- movw %dx,%ax
- movw $0x8E00,%dx /* interrupt gate - dpl=0, present */
- lea idt_table,%edi
- movl %eax,8*\trapno(%edi)
- movl %edx,8*\trapno+4(%edi)
-.endm
-
- set_early_handler handler=early_divide_err,trapno=0
- set_early_handler handler=early_illegal_opcode,trapno=6
- set_early_handler handler=early_protection_fault,trapno=13
- set_early_handler handler=early_page_fault,trapno=14
-
- ret
-
-early_divide_err:
+ENTRY(early_divide_err)
xor %edx,%edx
pushl $0 /* fake errcode */
jmp early_fault
-early_illegal_opcode:
+ENTRY(early_illegal_opcode)
movl $6,%edx
pushl $0 /* fake errcode */
jmp early_fault
-early_protection_fault:
+ENTRY(early_protection_fault)
movl $13,%edx
jmp early_fault
-early_page_fault:
+ENTRY(early_page_fault)
movl $14,%edx
jmp early_fault
@@ -408,7 +350,7 @@ hlt_loop:
/* This is the default interrupt "handler" :-) */
ALIGN
-ignore_int:
+ENTRY(ignore_int)
cld
#ifdef CONFIG_PRINTK
pushl %eax
diff --git a/arch/i386/kernel/head32.c b/arch/i386/kernel/head32.c
index 3db0590..d2f85d5 100644
--- a/arch/i386/kernel/head32.c
+++ b/arch/i386/kernel/head32.c
@@ -7,8 +7,34 @@
#include <linux/init.h>
#include <linux/start_kernel.h>
+#include <asm/desc.h>
+
+extern void ignore_int(void);
+extern void early_divide_err(void);
+extern void early_illegal_opcode(void);
+extern void early_protection_fault(void);
+extern void early_page_fault(void);
+
+/*
+ * setup_idt
+ *
+ * sets up a idt with 256 entries pointing to ignore_int. Interrupts
+ * are enabled elsewhere, when we can be relatively sure everything is ok.
+ */
+static void __init setup_idt(void)
+{
+ int i;
+ for (i = 0; i < IDT_ENTRIES; i++)
+ set_intr_gate(i, ignore_int);
+ set_intr_gate(0, early_divide_err);
+ set_intr_gate(6, early_illegal_opcode);
+ set_intr_gate(13, early_protection_fault);
+ set_intr_gate(14, early_page_fault);
+ load_idt(&idt_descr);
+}
void __init i386_start_kernel(void)
{
+ setup_idt();
start_kernel();
}
--
1.5.1.1.181.g2de0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 05/12] i386: During page table initialization always set the leaf page table entries.
[not found] ` <m1r6q14zow.fsf_-_@ebiederm.dsl.xmission.com>
2007-04-30 16:09 ` [PATCH 06/12] i386: Minimum cpu detection cleanups Eric W. Biederman
@ 2007-04-30 16:34 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 21+ messages in thread
From: Jeremy Fitzhardinge @ 2007-04-30 16:34 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-kernel, H. Peter Anvin, Andrew Morton, virtualization
Eric W. Biederman wrote:
> If we don't set the leaf page table entries it is quite possible that
> we will inherit an incorrect page table entry from the initial boot
> page table setup in head.S. So we need to redo the effort here,
> so we pick up PSE, PGE and the like.
>
> Hypervisors like Xen require that their page tables be read-only,
> which is slightly incompatible with our low identity mappings; however,
> I discussed this with Jeremy and he has modified the Xen early set_pte
> function to avoid problems in this area.
>
> Andi, I sent this once as part of the discussion on this issue, so
> you may already have this patch in your queue.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
J
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 12/12] i386: remove cpuid checking in head.S
2007-04-30 16:33 ` [PATCH 11/12] i386: Move setup_idt from head.S to head32.c Eric W. Biederman
@ 2007-04-30 16:35 ` Eric W. Biederman
0 siblings, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:35 UTC (permalink / raw)
To: Andi Kleen
Cc: Greg Kroah-Hartman, linux-kernel, virtualization, H. Peter Anvin,
Andrew Morton
This patch augments the existing cpu initialization code in C
with the few missing pieces that were performed in assembly,
allowing us to remove cpu initialization from head.S completely.
This should also remove the need to call cpu_detect in
the paravirt initialization code paths.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
arch/i386/kernel/cpu/common.c | 55 +++++++++++++++++++----
arch/i386/kernel/head.S | 96 +----------------------------------------
arch/i386/kernel/setup.c | 3 -
3 files changed, 47 insertions(+), 107 deletions(-)
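
As a cross-check of the constants involved, here is a small sketch of the
CR0 computation as a pure function. It assumes the architectural CR0 bit
positions; the SK_ names and sketch_cr0_value are hypothetical, and the
real init_cr0 in this patch reads and writes the register and probes the
FPU between the two writes rather than taking flags as arguments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR0 bits; the SK_ prefix marks them as local to this sketch. */
#define SK_CR0_PE 0x00000001u	/* protection enable */
#define SK_CR0_MP 0x00000002u	/* monitor coprocessor */
#define SK_CR0_EM 0x00000004u	/* x87 emulation */
#define SK_CR0_ET 0x00000010u	/* extension type */
#define SK_CR0_NE 0x00000020u	/* numeric error */
#define SK_CR0_WP 0x00010000u	/* write protect */
#define SK_CR0_AM 0x00040000u	/* alignment mask */
#define SK_CR0_PG 0x80000000u	/* paging */

/* Compute the CR0 value the boot path aims for, given what was detected. */
uint32_t sketch_cr0_value(uint32_t cr0, bool is_486_or_later, bool has_x87)
{
	/* Keep only PG, ET and PE, as "andl $0x80000011" did. */
	cr0 &= SK_CR0_PG | SK_CR0_ET | SK_CR0_PE;
	cr0 |= SK_CR0_MP;
	if (is_486_or_later)
		cr0 |= SK_CR0_AM | SK_CR0_WP | SK_CR0_NE;
	if (!has_x87)
		cr0 |= SK_CR0_EM;	/* no coprocessor: enable emulation */
	return cr0;
}
```

With these bit values, MP|AM|WP|NE comes to 0x50022, which matches the
constant the removed is486 path loaded into %ecx.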
diff --git a/arch/i386/kernel/cpu/common.c b/arch/i386/kernel/cpu/common.c
index 794d593..1a7b48a 100644
--- a/arch/i386/kernel/cpu/common.c
+++ b/arch/i386/kernel/cpu/common.c
@@ -251,6 +251,16 @@ static inline int flag_is_changeable_p(u32 flag)
return ((f1^f2) & flag) != 0;
}
+static inline int has_x87(void)
+{
+ unsigned result;
+ asm("clts\n\t"
+ "fninit\n\t"
+ "fstsw %%ax\n\t"
+ : "=a"(result));
+ return (result & 0xf) == 0;
+}
+
/* Probe for the CPUID instruction */
static int __cpuinit have_cpuid_p(void)
@@ -258,6 +268,23 @@ static int __cpuinit have_cpuid_p(void)
return flag_is_changeable_p(X86_EFLAGS_ID);
}
+static int __cpuinit init_cr0(void)
+{
+ unsigned long cr0;
+ int hard_math;
+ cr0 = read_cr0();
+ cr0 &= X86_CR0_PG | X86_CR0_ET | X86_CR0_PE;
+ cr0 |= X86_CR0_MP;
+ if (flag_is_changeable_p(X86_EFLAGS_AC))
+ cr0 |= X86_CR0_AM | X86_CR0_WP | X86_CR0_NE;
+ write_cr0(cr0);
+ hard_math = has_x87();
+ if (!hard_math)
+ /* No coprocessor: Enable emulation */
+ write_cr0(cr0 | X86_CR0_EM);
+ return hard_math;
+}
+
void __init cpu_detect(struct cpuinfo_x86 *c)
{
/* Get vendor name */
@@ -268,8 +295,10 @@ void __init cpu_detect(struct cpuinfo_x86 *c)
c->x86 = 4;
if (c->cpuid_level >= 0x00000001) {
- u32 junk, tfms, cap0, misc;
- cpuid(0x00000001, &tfms, &misc, &junk, &cap0);
+ u32 tfms, misc, excap, capability;
+ cpuid(0x00000001, &tfms, &misc, &excap, &capability);
+ c->x86_capability[0] = capability;
+ c->x86_capability[4] = excap;
c->x86 = (tfms >> 8) & 15;
c->x86_model = (tfms >> 4) & 15;
if (c->x86 == 0xf)
@@ -277,7 +306,7 @@ void __init cpu_detect(struct cpuinfo_x86 *c)
if (c->x86 >= 0x6)
c->x86_model += ((tfms >> 16) & 0xF) << 4;
c->x86_mask = tfms & 15;
- if (cap0 & (1<<19))
+ if (capability & (1<<19))
c->x86_cache_alignment = ((misc >> 8) & 0xff) * 8;
}
}
@@ -292,14 +321,21 @@ static void __init early_cpu_detect(void)
{
struct cpuinfo_x86 *c = &boot_cpu_data;
+ c->cpuid_level = -1;
c->x86_cache_alignment = 32;
- if (!have_cpuid_p())
- return;
-
- cpu_detect(c);
-
- get_cpu_vendor(c, 1);
+ if (!have_cpuid_p()) {
+ /* First of all, decide if this is a 486 or higher */
+ /* It's a 486 if we can modify the AC flag */
+ if ( flag_is_changeable_p(X86_EFLAGS_AC) )
+ c->x86 = 4;
+ else
+ c->x86 = 3;
+ } else {
+ cpu_detect(c);
+ get_cpu_vendor(c, 1);
+ }
+ c->hard_math = init_cr0();
}
static void __cpuinit generic_identify(struct cpuinfo_x86 * c)
@@ -670,6 +706,7 @@ void __cpuinit cpu_init(void)
printk(KERN_INFO "Initializing CPU#%d\n", cpu);
+ init_cr0();
if (cpu_has_vme || cpu_has_tsc || cpu_has_de)
clear_in_cr4(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE);
if (tsc_disable && cpu_has_tsc) {
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index 0ee615b..5e3478b 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -20,19 +20,6 @@
#include <asm/setup.h>
/*
- * References to members of the new_cpu_data structure.
- */
-
-#define X86 new_cpu_data+CPUINFO_x86
-#define X86_VENDOR new_cpu_data+CPUINFO_x86_vendor
-#define X86_MODEL new_cpu_data+CPUINFO_x86_model
-#define X86_MASK new_cpu_data+CPUINFO_x86_mask
-#define X86_HARD_MATH new_cpu_data+CPUINFO_hard_math
-#define X86_CPUID new_cpu_data+CPUINFO_cpuid_level
-#define X86_CAPABILITY new_cpu_data+CPUINFO_x86_capability
-#define X86_VENDOR_ID new_cpu_data+CPUINFO_x86_vendor_id
-
-/*
* 32-bit kernel entrypoint; only used by the boot CPU. On entry,
* %esi points to the real-mode code as a 32-bit pointer.
* CS and DS must be 4 GB flat segments, but we don't depend on
@@ -182,6 +169,7 @@ ENTRY(startup_32_smp)
movl $swapper_pg_dir-__PAGE_OFFSET,%eax
movl %eax,%cr3 /* set the page table pointer.. */
movl %cr0,%eax
+ andl $0x0000011,%eax /* Save PE,ET */
orl $0x80000000,%eax
movl %eax,%cr0 /* ..and set paging (PG) bit */
ljmp $__BOOT_CS,$1f /* Clear prefetch and normalize %eip */
@@ -197,69 +185,6 @@ ENTRY(startup_32_smp)
pushl $0
popfl
-checkCPUtype:
-
- movl $-1,X86_CPUID # -1 for no CPUID initially
-
-/* check if it is 486 or 386. */
-/*
- * XXX - this does a lot of unnecessary setup. Alignment checks don't
- * apply at our cpl of 0 and the stack ought to be aligned already, and
- * we don't need to preserve eflags.
- */
-
- movb $3,X86 # at least 386
- pushfl # push EFLAGS
- popl %eax # get EFLAGS
- movl %eax,%ecx # save original EFLAGS
- xorl $0x240000,%eax # flip AC and ID bits in EFLAGS
- pushl %eax # copy to EFLAGS
- popfl # set EFLAGS
- pushfl # get new EFLAGS
- popl %eax # put it in eax
- xorl %ecx,%eax # change in flags
- pushl %ecx # restore original EFLAGS
- popfl
- testl $0x40000,%eax # check if AC bit changed
- je is386
-
- movb $4,X86 # at least 486
- testl $0x200000,%eax # check if ID bit changed
- je is486
-
- /* get vendor info */
- xorl %eax,%eax # call CPUID with 0 -> return vendor ID
- cpuid
- movl %eax,X86_CPUID # save CPUID level
- movl %ebx,X86_VENDOR_ID # lo 4 chars
- movl %edx,X86_VENDOR_ID+4 # next 4 chars
- movl %ecx,X86_VENDOR_ID+8 # last 4 chars
-
- orl %eax,%eax # do we have processor info as well?
- je is486
-
- movl $1,%eax # Use the CPUID instruction to get CPU type
- cpuid
- movb %al,%cl # save reg for future use
- andb $0x0f,%ah # mask processor family
- movb %ah,X86
- andb $0xf0,%al # mask model
- shrb $4,%al
- movb %al,X86_MODEL
- andb $0x0f,%cl # mask mask revision
- movb %cl,X86_MASK
- movl %edx,X86_CAPABILITY
-
-is486: movl $0x50022,%ecx # set AM, WP, NE and MP
- jmp 2f
-
-is386: movl $2,%ecx # set MP
-2: movl %cr0,%eax
- andl $0x80000011,%eax # Save PG,PE,ET
- orl %ecx,%eax
- movl %eax,%cr0
-
- call check_x87
lgdt early_gdt_descr
ljmp $(__KERNEL_CS),$1f
1: movl $(__KERNEL_DS),%eax # reload all the segment registers
@@ -288,25 +213,6 @@ is386: movl $2,%ecx # set MP
#endif /* CONFIG_SMP */
jmp i386_start_kernel
-/*
- * We depend on ET to be correct. This checks for 287/387.
- */
-check_x87:
- movb $0,X86_HARD_MATH
- clts
- fninit
- fstsw %ax
- cmpb $0,%al
- je 1f
- movl %cr0,%eax /* no coprocessor: have to set bits */
- xorl $4,%eax /* set EM */
- movl %eax,%cr0
- ret
- ALIGN
-1: movb $1,X86_HARD_MATH
- .byte 0xDB,0xE4 /* fsetpm for 287, ignored by 387 */
- ret
-
ENTRY(early_divide_err)
xor %edx,%edx
pushl $0 /* fake errcode */
diff --git a/arch/i386/kernel/setup.c b/arch/i386/kernel/setup.c
index 3e31591..401e48a 100644
--- a/arch/i386/kernel/setup.c
+++ b/arch/i386/kernel/setup.c
@@ -76,8 +76,6 @@ int disable_pse __devinitdata = 0;
extern struct resource code_resource;
extern struct resource data_resource;
-/* cpu data as detected by the assembly code in head.S */
-struct cpuinfo_x86 new_cpu_data __cpuinitdata = { 0, 0, 0, 0, -1, 1, 0, 0, -1 };
/* common cpu data for all cpus */
struct cpuinfo_x86 boot_cpu_data __read_mostly = { 0, 0, 0, 0, -1, 1, 0, 0, -1 };
EXPORT_SYMBOL(boot_cpu_data);
@@ -515,7 +513,6 @@ void __init setup_arch(char **cmdline_p)
{
unsigned long max_low_pfn;
- memcpy(&boot_cpu_data, &new_cpu_data, sizeof(new_cpu_data));
pre_setup_arch_hook();
early_cpu_init();
--
1.5.1.1.181.g2de0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
2007-04-30 16:16 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split H. Peter Anvin
@ 2007-04-30 16:39 ` Eric W. Biederman
0 siblings, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:39 UTC (permalink / raw)
To: H. Peter Anvin
Cc: linux-kernel, virtualization, Eric W. Biederman, Andrew Morton
"H. Peter Anvin" <hpa@zytor.com> writes:
> Eric W. Biederman wrote:
>> When in PAE mode we require that the user kernel divide to be
>> on a 1G boundary. The 2G/2G split does not have that property
>> so require !X86_PAE
>
> ?????
>
> -hpa
From arch/i386/Kconfig:
>
> choice
> depends on EXPERIMENTAL
> prompt "Memory split" if EMBEDDED
> default VMSPLIT_3G
> help
> Select the desired split between kernel and user memory.
>
> If the address range available to the kernel is less than the
> physical memory installed, the remaining memory will be available
> as "high memory". Accessing high memory is a little more costly
> than low memory, as it needs to be mapped into the kernel first.
> Note that increasing the kernel address space limits the range
> available to user programs, making the address space there
> tighter. Selecting anything other than the default 3G/1G split
> will also likely make your kernel incompatible with binary-only
> kernel modules.
>
> If you are not absolutely sure what you are doing, leave this
> option alone!
>
> config VMSPLIT_3G
> bool "3G/1G user/kernel split"
> config VMSPLIT_3G_OPT
> depends on !HIGHMEM
> bool "3G/1G user/kernel split (for full 1G low memory)"
> config VMSPLIT_2G
> depends on !X86_PAE
> bool "2G/2G user/kernel split"
> config VMSPLIT_1G
> bool "1G/3G user/kernel split"
> endchoice
>
> config PAGE_OFFSET
> hex
> default 0xB0000000 if VMSPLIT_3G_OPT
> default 0x78000000 if VMSPLIT_2G
> default 0x40000000 if VMSPLIT_1G
> default 0xC0000000
The default PAGE_OFFSET is 0x78000000 for the 2G/2G split, which is
not on a 1G boundary.
All of these options were originally !X86_PAE; I assume
the intention was to be able to have 2G of RAM without
needing high memory.
I don't really care. I just saw the problem and decided to prevent
people from trying a combination that simply doesn't work.
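
A minimal sketch of the alignment rule, assuming only that each of the
four top-level PAE entries covers 1G. The helper name
sketch_split_ok_for_pae is hypothetical and is not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Each of the four top-level PAE page directory pointers maps 1G. */
#define SKETCH_PAE_PGD_SPAN (UINT32_C(1) << 30)

/* True when the user/kernel divide falls on a 1G boundary. */
bool sketch_split_ok_for_pae(uint32_t page_offset)
{
	return (page_offset & (SKETCH_PAE_PGD_SPAN - 1)) == 0;
}
```

Fed the PAGE_OFFSET defaults from the Kconfig excerpt, this accepts
0xC0000000 and 0x40000000 and rejects 0x78000000. It also rejects
0xB0000000, but VMSPLIT_3G_OPT already depends on !HIGHMEM.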
Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format.
[not found] ` <200704301826.57920.ak@suse.de>
@ 2007-04-30 16:42 ` Eric W. Biederman
0 siblings, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 16:42 UTC (permalink / raw)
To: Andi Kleen
Cc: linux-kernel, virtualization, James Bottomley, H. Peter Anvin,
Andrew Morton
Andi Kleen <ak@suse.de> writes:
> On Monday 30 April 2007 18:15:08 Eric W. Biederman wrote:
>>
>> Currently we have a lot of special case code and a lot of limitations
>> because we cannot count on the initial boot time page tables being in
>> the format our page table handling routines know how to manipulate.
>> So this patch rewrites the code that initializes our boot time page
>> tables.
>
> Sounds reasonable, but for post .22
Sure. A little aging to gain confidence with it is reasonable. At
this point I finally have the glitches worked out so these patches
needed to get out of my private queue.
Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 04/12] i386 voyager: Use modern techniques to setup and teardown low identiy mappings.
2007-04-30 15:57 ` [PATCH 04/12] i386 voyager: Use modern techniques to setup and teardown low identiy mappings Eric W. Biederman
2007-04-30 16:03 ` [PATCH 05/12] i386: During page table initialization always set the leaf page table entries Eric W. Biederman
[not found] ` <m1r6q14zow.fsf_-_@ebiederm.dsl.xmission.com>
@ 2007-04-30 17:06 ` James Bottomley
2 siblings, 0 replies; 21+ messages in thread
From: James Bottomley @ 2007-04-30 17:06 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-kernel, H. Peter Anvin, Andrew Morton, virtualization
On Mon, 2007-04-30 at 09:57 -0600, Eric W. Biederman wrote:
> This is a trivial and hopefully obviously correct patch to setup
> and teardown the identity mappings the way the rest of arch/i386
> does.
>
> My new page table setup code will break some assumptions below so
> this is my attempt to keep voyager working.
It works for the 586 beasts ... I'll dig out some 486 cards and test
them later. Thanks,
James
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support.
2007-04-30 16:32 ` [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support Eric W. Biederman
2007-04-30 16:32 ` [PATCH 10/12] i386: Introduce head32.c Eric W. Biederman
@ 2007-04-30 17:56 ` Andi Kleen
[not found] ` <20070430175607.GD25929@bingen.suse.de>
2 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2007-04-30 17:56 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Greg Kroah-Hartman, linux-kernel, H. Peter Anvin, Andrew Morton,
virtualization
Thanks for writing that code. It should be an interesting alternative
on boxes where firescope doesn't work.
I hope I can eventually merge early firewire support code too.
On Mon, Apr 30, 2007 at 10:32:02AM -0600, Eric W. Biederman wrote:
>
> With legacy free systems serial ports have stopped being an option
> to get early boot traces and other debug information out of a machine.
This needs a CONFIG_* at least. And some documentation on how to set it
up on both sides.
>
> This debug device can be used to replace serial ports for
> kgdb, kdb, and console support. And gregkh has a simple usb
> serial driver for it so user space applications that control
> serial ports should work unmodified.
But not merged yet, right? I was hoping it could be done from
user space anyways.
> For users the hard part looks like it will be finding cables and
> finding which is usb debug port 1 and realizing that there is
> flow control so the kernel boot will not happen if someone is not
> reading the serial console data.
That's nasty. Any way to work around that?
> index 92213d2..dc097aa 100644
> --- a/arch/x86_64/kernel/early_printk.c
> +++ b/arch/x86_64/kernel/early_printk.c
> @@ -3,9 +3,19 @@
> #include <linux/init.h>
> #include <linux/string.h>
> #include <linux/screen_info.h>
> +#include <linux/usb/ch9.h>
> +#include <linux/pci_regs.h>
> +#include <linux/pci_ids.h>
> +#include <linux/errno.h>
Can you put it in a separate file please?
Perhaps with a little abstraction in drivers/usb ?
> +static void dbgp_breath(void)
> +{
> + /* Sleep to give the debug port a chance to breathe */
But you don't?
> +static __u32 __init find_dbgp(int ehci_num, unsigned *rbus, unsigned *rslot, unsigned *rfunc)
This should probably be merged into the early quirks loop
> early_console = &simnow_console;
> keep_early = 1;
> + } else if (!strncmp(buf, "dbgp", 4)) {
usb would seem to be more intuitive
-Andi
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support.
[not found] ` <20070430175607.GD25929@bingen.suse.de>
@ 2007-04-30 20:54 ` Eric W. Biederman
0 siblings, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2007-04-30 20:54 UTC (permalink / raw)
To: Andi Kleen
Cc: Greg Kroah-Hartman, linux-kernel, virtualization, H. Peter Anvin,
Andrew Morton
Andi Kleen <ak@suse.de> writes:
> Thanks for writing that code. It should be an interesting alternative
> on boxes where firescope doesn't work.
>
> I hope I can eventually merge early firewire support code too.
>
> On Mon, Apr 30, 2007 at 10:32:02AM -0600, Eric W. Biederman wrote:
>>
>> With legacy free systems serial ports have stopped being an option
>> to get early boot traces and other debug information out of a machine.
>
> This needs a CONFIG_* at least. And some documentation on how to set it
> up on both sides.
Besides CONFIG_EARLY_PRINTK, I assume. It is enough code that there is a
case for it.
>> This debug device can be used to replace serial ports for
>> kgdb, kdb, and console support. And gregkh has a simple usb
>> serial driver for it so user space applications that control
>> serial ports should work unmodified.
>
> But not merged yet, right? I was hoping it could be done from
> user space anyways.
Sorry, old comment; that piece has been merged for a while.
It is the usb_debug module. It creates a tty device that you
can just cat to get the usb debug output.
>> For users the hard part looks like it will be finding cables and
>> finding which is usb debug port 1 and realizing that there is
>> flow control so the kernel boot will not happen if someone is not
>> reading the serial console data.
>
> That's nasty. Any way to work around that?
Maybe. It has been long enough since I wrote the code that
I need to go back and look.
>> index 92213d2..dc097aa 100644
>> --- a/arch/x86_64/kernel/early_printk.c
>> +++ b/arch/x86_64/kernel/early_printk.c
>> @@ -3,9 +3,19 @@
>> #include <linux/init.h>
>> #include <linux/string.h>
>> #include <linux/screen_info.h>
>> +#include <linux/usb/ch9.h>
>> +#include <linux/pci_regs.h>
>> +#include <linux/pci_ids.h>
>> +#include <linux/errno.h>
>
> Can you put it in a separate file please?
> Perhaps with a little abstraction in drivers/usb ?
>
>> +static void dbgp_breath(void)
>> +{
>> + /* Sleep to give the debug port a chance to breathe */
>
> But you don't?
Good point. At least this early I'm not certain there is any
way I can productively do that. This is before we have calibrated
the TSCs and the like, so timeouts are difficult, and as I
recall our default guess isn't.
This lack of a good timeout looks to be the reason ehci_wait_for_port
doesn't time out in a timely fashion: I don't time out until
I have wrapped a 32-bit number.
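
One possible shape for a tighter bound, as a sketch only: cap the number
of polls explicitly instead of relying on counter wrap-around. All names
here (sketch_wait_ready, sketch_fake_port) are hypothetical, and an
iteration cap is still not a real time-based timeout, since how long it
lasts depends on CPU speed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Callback that reports whether the device is ready; hypothetical. */
typedef bool (*sketch_ready_fn)(void *ctx);

/* Returns 0 once ready() reports true, or -1 after max_polls attempts. */
int sketch_wait_ready(sketch_ready_fn ready, void *ctx, uint32_t max_polls)
{
	uint32_t i;

	for (i = 0; i < max_polls; i++) {
		if (ready(ctx))
			return 0;
	}
	return -1;	/* gave up: caller can fall back or skip the port */
}

/* Toy stand-in for a port that becomes ready after a number of polls. */
struct sketch_fake_port {
	uint32_t polls_until_ready;
};

bool sketch_fake_port_ready(void *ctx)
{
	struct sketch_fake_port *port = ctx;

	if (port->polls_until_ready == 0)
		return true;
	port->polls_until_ready--;
	return false;
}
```

The cap would have to be chosen conservatively for the fastest machine
this is expected to run on.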
>> +static __u32 __init find_dbgp(int ehci_num, unsigned *rbus, unsigned *rslot,
> unsigned *rfunc)
>
> This should probably be merged into the early quirks loop
>
>> early_console = &simnow_console;
>> keep_early = 1;
>> + } else if (!strncmp(buf, "dbgp", 4)) {
>
> usb would seem to be more intuitive
Could be. I was thinking usb debug port. dbgp is at least unique,
and unfortunately it doesn't look like any old usb cable will
do, so a plain "usb" would, I expect, be very misleading.
The truth is I don't have a big need for this. I put it together as
a proof of concept to see how hard it would be, etc. I can clean
it up a little, but I'm really hoping I can get this into one of
the development trees so that people who have more use for it than I
do can play with it and improve things.
One person's experience on two machines probably isn't quite enough
of a sample to document how to use this, at least beyond what I
did in my changelog.
Eric
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2007-04-30 20:54 UTC | newest]
Thread overview: 21+ messages
[not found] <m1d51l6f1y.fsf@ebiederm.dsl.xmission.com>
2007-04-30 15:48 ` [PATCH 01/12] x86_64: Allow fixmaps to be used with the initial page table Eric W. Biederman
[not found] ` <m18xc96eyq.fsf@ebiederm.dsl.xmission.com>
2007-04-30 15:49 ` [PATCH 02/12] i386 head.S: Remove unnecessary use of %ebx as the boot cpu flag Eric W. Biederman
[not found] ` <m14pmx6ewk.fsf_-_@ebiederm.dsl.xmission.com>
2007-04-30 15:51 ` [PATCH 03/12] i386 head.S: Always run the full set of paging state Eric W. Biederman
[not found] ` <m1zm4p509a.fsf_-_@ebiederm.dsl.xmission.com>
2007-04-30 15:57 ` [PATCH 04/12] i386 voyager: Use modern techniques to setup and teardown low identiy mappings Eric W. Biederman
2007-04-30 16:03 ` [PATCH 05/12] i386: During page table initialization always set the leaf page table entries Eric W. Biederman
[not found] ` <m1r6q14zow.fsf_-_@ebiederm.dsl.xmission.com>
2007-04-30 16:09 ` [PATCH 06/12] i386: Minimum cpu detection cleanups Eric W. Biederman
2007-04-30 16:10 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split Eric W. Biederman
2007-04-30 16:15 ` [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format Eric W. Biederman
2007-04-30 16:26 ` Andi Kleen
2007-04-30 16:32 ` [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support Eric W. Biederman
2007-04-30 16:32 ` [PATCH 10/12] i386: Introduce head32.c Eric W. Biederman
2007-04-30 16:33 ` [PATCH 11/12] i386: Move setup_idt from head.S to head32.c Eric W. Biederman
2007-04-30 16:35 ` [PATCH 12/12] i386: remove cpuid checking in head.S Eric W. Biederman
2007-04-30 17:56 ` [PATCH 09/12] i386/x86_64: EHCI usb debug port early printk support Andi Kleen
[not found] ` <20070430175607.GD25929@bingen.suse.de>
2007-04-30 20:54 ` Eric W. Biederman
[not found] ` <200704301826.57920.ak@suse.de>
2007-04-30 16:42 ` [PATCH 08/12] i386: Convert the boot time page tables to the kernels native format Eric W. Biederman
2007-04-30 16:16 ` [PATCH 07/12] i386: Add missing !X86_PAE dependincy to the 2G/2G split H. Peter Anvin
2007-04-30 16:39 ` Eric W. Biederman
2007-04-30 16:13 ` [PATCH 06/12] i386: Minimum cpu detection cleanups H. Peter Anvin
2007-04-30 16:34 ` [PATCH 05/12] i386: During page table initialization always set the leaf page table entries Jeremy Fitzhardinge
2007-04-30 17:06 ` [PATCH 04/12] i386 voyager: Use modern techniques to setup and teardown low identiy mappings James Bottomley