* COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 @ 2005-01-19 23:13 Janos Farkas 2005-01-20 4:21 ` Chris Bruner 2005-01-20 16:28 ` Adrian Bunk 0 siblings, 2 replies; 24+ messages in thread From: Janos Farkas @ 2005-01-19 23:13 UTC (permalink / raw) To: linux-kernel; +Cc: Andi Kleen, Chris Bruner Hi Andi! I had difficulties booting recent rc1-bkN kernels on at least two Athlon machines (but somehow, on an *old* Pentium laptop booted with the a very similar system just fine). The kernel just hung very early, just after displaying "BIOS data check successful" by lilo (22.6.1). Ctrl-Alt-Del worked to reboot, but nothing else was shown. It is a similar experience to Chris Bruner's post here: > http://article.gmane.org/gmane.linux.kernel/271352 I also recall someone having similar problem with Opterons too, but can't find just now.. rc1-bk6 didn't boot, and thus I started checking revisions: rc1-bk3 did boot (as well as plain rc1) rc1-bk4 didn't boot rc1-bk7 booted *after* reverting the patch below: > 4 days ak 1.2329.1.38 [PATCH] x86_64/i386: increase command line size > Enlarge i386/x86-64 kernel command line to 2k > This is useful when the kernel command line is used to pass other > information to initrds or installers. > On i386 it was duplicated for unknown reasons. > Signed-off-by: Andi Kleen > Signed-off-by: Andrew Morton > Signed-off-by: Linus Torvalds While arguably it's not a completely scientific approach (no plain bk7, and no bk6 reverted was tested), I'm inclined to say this was my problem... Isn't this define a lilo dependence? -- Janos | romfs is at http://romfs.sourceforge.net/ | Don't talk about silence. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-19 23:13 COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 Janos Farkas @ 2005-01-20 4:21 ` Chris Bruner 2005-01-20 16:28 ` Adrian Bunk 1 sibling, 0 replies; 24+ messages in thread From: Chris Bruner @ 2005-01-20 4:21 UTC (permalink / raw) To: Janos Farkas, linux-kernel, Andi Kleen FYI, I found that the problem I was having was caused by the "BIOS Enhanced Disk Drives" turned on. It was on in previous versions as well, and they worked ok, so I assume that something has changed. In anycase turning it off fixed my problem. Chris Bruner On Wed January 19 2005 06:13 pm, Janos Farkas wrote: > Hi Andi! > > I had difficulties booting recent rc1-bkN kernels on at least two > Athlon machines (but somehow, on an *old* Pentium laptop booted with the > a very similar system just fine). > > The kernel just hung very early, just after displaying "BIOS data check > successful" by lilo (22.6.1). Ctrl-Alt-Del worked to reboot, but > nothing else was shown. > > It is a similar experience to Chris Bruner's post here: > > http://article.gmane.org/gmane.linux.kernel/271352 > > I also recall someone having similar problem with Opterons too, but > can't find just now.. > > rc1-bk6 didn't boot, and thus I started checking revisions: > rc1-bk3 did boot (as well as plain rc1) > rc1-bk4 didn't boot > > rc1-bk7 booted *after* reverting the patch below: > > 4 days ak 1.2329.1.38 [PATCH] x86_64/i386: increase command line size > > Enlarge i386/x86-64 kernel command line to 2k > > This is useful when the kernel command line is used to pass other > > information to initrds or installers. > > On i386 it was duplicated for unknown reasons. > > Signed-off-by: Andi Kleen > > Signed-off-by: Andrew Morton > > Signed-off-by: Linus Torvalds > > While arguably it's not a completely scientific approach (no plain bk7, > and no bk6 reverted was tested), I'm inclined to say this was my > problem... > > Isn't this define a lilo dependence? 
-- I say, if your knees aren't green by the end of the day, you ought to seriously re-examine your life. -- Calvin ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-19 23:13 COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 Janos Farkas 2005-01-20 4:21 ` Chris Bruner @ 2005-01-20 16:28 ` Adrian Bunk 2005-01-20 16:40 ` Linus Torvalds 2005-01-20 16:48 ` Andi Kleen 1 sibling, 2 replies; 24+ messages in thread From: Adrian Bunk @ 2005-01-20 16:28 UTC (permalink / raw) To: Janos Farkas, linux-kernel, Andi Kleen, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch On Thu, Jan 20, 2005 at 12:13:22AM +0100, Janos Farkas wrote: > Hi Andi! > > I had difficulties booting recent rc1-bkN kernels on at least two > Athlon machines (but somehow, on an *old* Pentium laptop booted with the > a very similar system just fine). > > The kernel just hung very early, just after displaying "BIOS data check > successful" by lilo (22.6.1). Ctrl-Alt-Del worked to reboot, but > nothing else was shown. > > It is a similar experience to Chris Bruner's post here: > > http://article.gmane.org/gmane.linux.kernel/271352 > > I also recall someone having similar problem with Opterons too, but > can't find just now.. > > rc1-bk6 didn't boot, and thus I started checking revisions: > rc1-bk3 did boot (as well as plain rc1) > rc1-bk4 didn't boot > rc1-bk7 booted *after* reverting the patch below: > > > 4 days ak 1.2329.1.38 [PATCH] x86_64/i386: increase command line size > > Enlarge i386/x86-64 kernel command line to 2k > > This is useful when the kernel command line is used to pass other > > information to initrds or installers. > > On i386 it was duplicated for unknown reasons. > > Signed-off-by: Andi Kleen > > Signed-off-by: Andrew Morton > > Signed-off-by: Linus Torvalds > > While arguably it's not a completely scientific approach (no plain bk7, > and no bk6 reverted was tested), I'm inclined to say this was my > problem... > > Isn't this define a lilo dependence? 
AOL:
- lilo 22.6.1
- CONFIG_EDD=y
- 2.6.10-mm1 and 2.6.11-rc1 did boot
- 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot
- 2.6.11-rc1-mm2 with this ChangeSet reverted boots.

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-20 16:28 ` Adrian Bunk @ 2005-01-20 16:40 ` Linus Torvalds 2005-01-20 16:48 ` Andi Kleen 1 sibling, 0 replies; 24+ messages in thread From: Linus Torvalds @ 2005-01-20 16:40 UTC (permalink / raw) To: Adrian Bunk Cc: Janos Farkas, linux-kernel, Andi Kleen, Chris Bruner, Andrew Morton, Matt Domsch On Thu, 20 Jan 2005, Adrian Bunk wrote: > > On Thu, Jan 20, 2005 at 12:13:22AM +0100, Janos Farkas wrote: > > > > Isn't this define a lilo dependence? > > AOL: > - lilo 22.6.1 > - CONFIG_EDD=y > - 2.6.10-mm1 and 2.6.11-rc1 did boot > - 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot > - 2.6.11-rc1-mm2 with this ChangeSet reverted boots. Thanks. Reverted. Linus ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-20 16:28 ` Adrian Bunk 2005-01-20 16:40 ` Linus Torvalds @ 2005-01-20 16:48 ` Andi Kleen 2005-01-20 20:53 ` Something very strange on x86_64 2.6.X kernels Eric Dumazet 2005-01-21 6:58 ` COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 Catalin(ux aka Dino) BOIE 1 sibling, 2 replies; 24+ messages in thread From: Andi Kleen @ 2005-01-20 16:48 UTC (permalink / raw) To: Adrian Bunk Cc: Janos Farkas, linux-kernel, Andi Kleen, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch > AOL: > - lilo 22.6.1 > - CONFIG_EDD=y > - 2.6.10-mm1 and 2.6.11-rc1 did boot > - 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot > - 2.6.11-rc1-mm2 with this ChangeSet reverted boots. What I gather so far the problem seems to only happen with lilo and EDID together. grub appears to work. Or did anyone see problems with grub too? I'll dig a bit, but reverting for now is probably best. Thanks Linus. -Andi ^ permalink raw reply [flat|nested] 24+ messages in thread
* Something very strange on x86_64 2.6.X kernels 2005-01-20 16:48 ` Andi Kleen @ 2005-01-20 20:53 ` Eric Dumazet 2005-01-20 21:08 ` Andrew Morton 2005-01-21 16:26 ` Petr Vandrovec 2005-01-21 6:58 ` COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 Catalin(ux aka Dino) BOIE 1 sibling, 2 replies; 24+ messages in thread From: Eric Dumazet @ 2005-01-20 20:53 UTC (permalink / raw) To: Andi Kleen, linux-kernel Cc: Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch Hi Andi I have very strange coredumps happening on a big 64bits program. Some background : - This program is multi-threaded - Machine is a dual Opteron 248 machine, 12GB ram. - Kernel 2.6.6 (tried 2.6.10 too but problems too) - The program uses hugetlb pages. - The program uses prefetchnta - The program uses about 8GB of ram. After numerous differents core dumps of this program, and gdb debugging I found : Every time the crash occurs when one thread is using some ram located at virtual address 0xffffe6xx When examining the core image, the data saved on this page seems correct (ie countains coherent user data). But one register (%rbx) is usually corrupted and contains a small value (like 0x3c) The last instruction using this register is : prefetchnta 0x18(,%rbx,4) Examining linux sources, I found that 0xffffe000 is 'special' (ia 32 vsyscall) and 0xffffe600 is about sigreturn subsection of this special area. Is it possible some vm trick just kicks in and corrupts my true 64bits program ? Thank you Eric Dumazet ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Something very strange on x86_64 2.6.X kernels 2005-01-20 20:53 ` Something very strange on x86_64 2.6.X kernels Eric Dumazet @ 2005-01-20 21:08 ` Andrew Morton 2005-01-20 21:19 ` Eric Dumazet 2005-01-21 16:26 ` Petr Vandrovec 1 sibling, 1 reply; 24+ messages in thread From: Andrew Morton @ 2005-01-20 21:08 UTC (permalink / raw) To: Eric Dumazet; +Cc: ak, linux-kernel, cryst, torvalds, Matt_Domsch Eric Dumazet <dada1@cosmosbay.com> wrote: > > Hi Andi > > I have very strange coredumps happening on a big 64bits program. > > Some background : > - This program is multi-threaded > - Machine is a dual Opteron 248 machine, 12GB ram. > - Kernel 2.6.6 (tried 2.6.10 too but problems too) > - The program uses hugetlb pages. > - The program uses prefetchnta > - The program uses about 8GB of ram. > > After numerous differents core dumps of this program, and gdb debugging > I found : > > Every time the crash occurs when one thread is using some ram located at > virtual address 0xffffe6xx What does "using" mean? Is the program executing from that location? > When examining the core image, the data saved on this page seems correct > (ie countains coherent user data). But one register (%rbx) is usually > corrupted and contains a small value (like 0x3c) > > The last instruction using this register is : > prefetchnta 0x18(,%rbx,4) > > > Examining linux sources, I found that 0xffffe000 is 'special' (ia 32 > vsyscall) and 0xffffe600 is about sigreturn subsection of this special area. > > Is it possible some vm trick just kicks in and corrupts my true 64bits > program ? > Interesting. IIRC, opterons will very occasionally (and incorrectly) take a fault when performing a prefetch against a dud pointer. The kernel will fix that up. At a guess, I'd say tha the fixup code isn't doing the right thing when the faulting EIP is in the vsyscall page. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Something very strange on x86_64 2.6.X kernels 2005-01-20 21:08 ` Andrew Morton @ 2005-01-20 21:19 ` Eric Dumazet 0 siblings, 0 replies; 24+ messages in thread From: Eric Dumazet @ 2005-01-20 21:19 UTC (permalink / raw) To: Andrew Morton; +Cc: ak, linux-kernel Andrew Morton wrote: > Eric Dumazet <dada1@cosmosbay.com> wrote: >> >>Every time the crash occurs when one thread is using some ram located at >>virtual address 0xffffe6xx > > > What does "using" mean? Is the program executing from that location? No, the program text is located between 0x00100000 and 0x001c6000 (no shared libs) 0xffffe6xx is READ|WRITE data, mapped on Hugetlb fs extract from /proc/pid/maps ff400000-100400000 rw-s 82000000 00:0b 12960938 /huge/file > > Interesting. IIRC, opterons will very occasionally (and incorrectly) take > a fault when performing a prefetch against a dud pointer. The kernel will > fix that up. At a guess, I'd say tha the fixup code isn't doing the right > thing when the faulting EIP is in the vsyscall page. Maybe, but I want to say that in this case, the address 'prefetched' is valid (ie mapped read/write by the program, on a huge page too) Thanks Eric Dumazet ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Something very strange on x86_64 2.6.X kernels 2005-01-20 20:53 ` Something very strange on x86_64 2.6.X kernels Eric Dumazet 2005-01-20 21:08 ` Andrew Morton @ 2005-01-21 16:26 ` Petr Vandrovec 2005-01-21 16:49 ` Eric Dumazet 2005-01-22 1:54 ` Andi Kleen 1 sibling, 2 replies; 24+ messages in thread From: Petr Vandrovec @ 2005-01-21 16:26 UTC (permalink / raw) To: Eric Dumazet Cc: Andi Kleen, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch On Thu, Jan 20, 2005 at 09:53:36PM +0100, Eric Dumazet wrote: > > Examining linux sources, I found that 0xffffe000 is 'special' (ia 32 > vsyscall) and 0xffffe600 is about sigreturn subsection of this special area. > > Is it possible some vm trick just kicks in and corrupts my true 64bits > program ? Maybe I already missed answer, but try patch below. It is definitely bad to mark syscall page as global one... When you build program below, once as 64bit and once as 32bit, 32bit one should print 464C457F and 64bit one should die with SIGSEGV. But when you run both in parallel, 64bit one sometime gets SIGSEGV as it should, sometime it gets 464C457F. (actually results below are from SMP system; I believe that on UP you'll get reproducible 464C457F on UP system...) 
vana:~/64bit-test# ./tpg32
Memory at ffffe000 is 464C457F
vana:~/64bit-test# ./tpg
Segmentation fault
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8450
Memory at ffffe000 is 464C457F
Memory at ffffe000 is 464C457F
[1]+  Exit 31                 ./tpg32
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8454
Memory at ffffe000 is 464C457F
[1]+  Exit 31                 ./tpg32
Segmentation fault
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8456
Memory at ffffe000 is 464C457F
Memory at ffffe000 is 464C457F
[1]+  Exit 31                 ./tpg32
vana:~/64bit-test# ./tpg32 & ./tpg
[1] 8458
Memory at ffffe000 is 464C457F
Memory at ffffe000 is 464C457F
[1]+  Exit 31                 ./tpg32
vana:~/64bit-test#


void main(void) {
	int acc;
	int i;

	for (i = 0; i < 100000000; i++) ;
	acc = *(volatile unsigned long*)(0xffffe000);
	printf("Memory at ffffe000 is %08X\n", acc);
}

					Petr


diff -urdN linux/arch/x86_64/ia32/syscall32.c linux/arch/x86_64/ia32/syscall32.c
--- linux/arch/x86_64/ia32/syscall32.c	2005-01-17 12:29:05.000000000 +0000
+++ linux/arch/x86_64/ia32/syscall32.c	2005-01-21 16:15:04.000000000 +0000
@@ -55,7 +55,7 @@
 		if (pte_none(*pte)) {
 			set_pte(pte,
 				mk_pte(virt_to_page(syscall32_page),
-				       PAGE_KERNEL_VSYSCALL));
+				       PAGE_KERNEL_VSYSCALL32));
 		}
 		/* Flush only the local CPU. Other CPUs taking a fault
 		   will just end up here again
diff -urdN linux/include/asm-x86_64/pgtable.h linux/include/asm-x86_64/pgtable.h
--- linux/include/asm-x86_64/pgtable.h	2005-01-17 12:29:11.000000000 +0000
+++ linux/include/asm-x86_64/pgtable.h	2005-01-21 16:14:44.000000000 +0000
@@ -182,6 +182,7 @@
 #define PAGE_KERNEL_EXEC MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
 #define PAGE_KERNEL_RO MAKE_GLOBAL(__PAGE_KERNEL_RO)
 #define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
+#define PAGE_KERNEL_VSYSCALL32 __pgprot(__PAGE_KERNEL_VSYSCALL)
 #define PAGE_KERNEL_VSYSCALL MAKE_GLOBAL(__PAGE_KERNEL_VSYSCALL)
 #define PAGE_KERNEL_LARGE MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
 #define PAGE_KERNEL_VSYSCALL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_VSYSCALL_NOCACHE)

^ permalink raw reply [flat|nested] 24+ messages in thread
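For readers who want to see what the one-line change amounts to, here is a small user-space sketch. It is not the kernel's code. The bit positions are the architectural x86 page-table flag bits; the names PTE_*, VSYSCALL32_FLAGS, make_global() and survives_address_space_switch() are invented for this illustration, and the choice of flags for the vsyscall page is only an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-table entry flag bits. */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_USER     (UINT64_C(1) << 2)
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_GLOBAL   (UINT64_C(1) << 8)

/* Assumed, simplified flags for the 32-bit vsyscall page (illustration only). */
#define VSYSCALL32_FLAGS (PTE_PRESENT | PTE_USER | PTE_ACCESSED)

/* Roughly what a MAKE_GLOBAL-style wrapper does: set the global bit. */
static inline uint64_t make_global(uint64_t flags)
{
    assert(flags & PTE_PRESENT); /* the global bit only matters for present entries */
    return flags | PTE_GLOBAL;
}

/*
 * With global pages enabled on the CPU, a TLB entry whose PTE has the
 * global bit set is kept across an address-space switch; all other
 * entries are flushed.
 */
static inline bool survives_address_space_switch(uint64_t flags)
{
    return (flags & PTE_GLOBAL) != 0;
}
```

In these terms, the old mapping behaves like make_global(VSYSCALL32_FLAGS) and the patched one like plain VSYSCALL32_FLAGS, so only the former can leak into the next process's view of 0xffffe000.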
* Re: Something very strange on x86_64 2.6.X kernels 2005-01-21 16:26 ` Petr Vandrovec @ 2005-01-21 16:49 ` Eric Dumazet 2005-01-21 18:30 ` Petr Vandrovec 2005-01-22 1:54 ` Andi Kleen 1 sibling, 1 reply; 24+ messages in thread From: Eric Dumazet @ 2005-01-21 16:49 UTC (permalink / raw) To: Petr Vandrovec; +Cc: Andi Kleen, linux-kernel, Andrew Morton Petr Vandrovec wrote: > > Maybe I already missed answer, but try patch below. It is definitely bad > to mark syscall page as global one... > Hi Petr If I follow you, any 64 bits program is corrupted as soon one 32bits program using sysenter starts ? Thank you for the patch, I will try it as soon as possible. I tried your tpg program and had the same behavior you describe. I confirm that avoiding the 0xFFFFE000 - 0x100000000 VM ranges is also OK , the program never crash... Eric > When you build program below, once as 64bit and once as 32bit, 32bit one > should print 464C457F and 64bit one should die with SIGSEGV. But when > you run both in parallel, 64bit one sometime gets SIGSEGV as it should, > sometime it gets 464C457F. (actually results below are from SMP system; > I believe that on UP you'll get reproducible 464C457F on UP system...) 
>
> vana:~/64bit-test# ./tpg32
> Memory at ffffe000 is 464C457F
> vana:~/64bit-test# ./tpg
> Segmentation fault
> vana:~/64bit-test# ./tpg32 & ./tpg
> [1] 8450
> Memory at ffffe000 is 464C457F
> Memory at ffffe000 is 464C457F
> [1]+  Exit 31                 ./tpg32
> vana:~/64bit-test# ./tpg32 & ./tpg
> [1] 8454
> Memory at ffffe000 is 464C457F
> [1]+  Exit 31                 ./tpg32
> Segmentation fault
> vana:~/64bit-test# ./tpg32 & ./tpg
> [1] 8456
> Memory at ffffe000 is 464C457F
> Memory at ffffe000 is 464C457F
> [1]+  Exit 31                 ./tpg32
> vana:~/64bit-test# ./tpg32 & ./tpg
> [1] 8458
> Memory at ffffe000 is 464C457F
> Memory at ffffe000 is 464C457F
> [1]+  Exit 31                 ./tpg32
> vana:~/64bit-test#
>
>
> void main(void) {
> 	int acc;
> 	int i;
>
> 	for (i = 0; i < 100000000; i++) ;
> 	acc = *(volatile unsigned long*)(0xffffe000);
> 	printf("Memory at ffffe000 is %08X\n", acc);
> }
>
> 					Petr
>
>
> diff -urdN linux/arch/x86_64/ia32/syscall32.c linux/arch/x86_64/ia32/syscall32.c
> --- linux/arch/x86_64/ia32/syscall32.c	2005-01-17 12:29:05.000000000 +0000
> +++ linux/arch/x86_64/ia32/syscall32.c	2005-01-21 16:15:04.000000000 +0000
> @@ -55,7 +55,7 @@
>  		if (pte_none(*pte)) {
>  			set_pte(pte,
>  				mk_pte(virt_to_page(syscall32_page),
> -				       PAGE_KERNEL_VSYSCALL));
> +				       PAGE_KERNEL_VSYSCALL32));
>  		}
>  		/* Flush only the local CPU. Other CPUs taking a fault
>  		   will just end up here again
> diff -urdN linux/include/asm-x86_64/pgtable.h linux/include/asm-x86_64/pgtable.h
> --- linux/include/asm-x86_64/pgtable.h	2005-01-17 12:29:11.000000000 +0000
> +++ linux/include/asm-x86_64/pgtable.h	2005-01-21 16:14:44.000000000 +0000
> @@ -182,6 +182,7 @@
>  #define PAGE_KERNEL_EXEC MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
>  #define PAGE_KERNEL_RO MAKE_GLOBAL(__PAGE_KERNEL_RO)
>  #define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
> +#define PAGE_KERNEL_VSYSCALL32 __pgprot(__PAGE_KERNEL_VSYSCALL)
>  #define PAGE_KERNEL_VSYSCALL MAKE_GLOBAL(__PAGE_KERNEL_VSYSCALL)
>  #define PAGE_KERNEL_LARGE MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
>  #define PAGE_KERNEL_VSYSCALL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_VSYSCALL_NOCACHE)
>
>

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Something very strange on x86_64 2.6.X kernels 2005-01-21 16:49 ` Eric Dumazet @ 2005-01-21 18:30 ` Petr Vandrovec 0 siblings, 0 replies; 24+ messages in thread From: Petr Vandrovec @ 2005-01-21 18:30 UTC (permalink / raw) To: Eric Dumazet; +Cc: Andi Kleen, linux-kernel, Andrew Morton On Fri, Jan 21, 2005 at 05:49:25PM +0100, Eric Dumazet wrote: > Petr Vandrovec wrote: > > > > >Maybe I already missed answer, but try patch below. It is definitely bad > >to mark syscall page as global one... > > > > Hi Petr > > If I follow you, any 64 bits program is corrupted as soon one 32bits > program using sysenter starts ? Yes. As soon as 32bit app touches sysenter page (execution, read, whatever), it is loaded to the processor's TLB, and as page is marked global it is not flushed when kernel switches address space to another app - like 64bit one. Fortunately TLB is not that big, so for most of real-world workloads you'll not notice, but if you are doing context switches really often, sooner or later you'll hit vsyscall page instead of data page your process has mapped, and bad things happen. To get your app (or any other 64bit app...) to work reliably on unpatched kernels you should mmap one page at 0xffffe000 and forget about that page forever... Petr ^ permalink raw reply [flat|nested] 24+ messages in thread
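The workaround Petr describes (map one page at 0xffffe000 and never use it) could look roughly like the sketch below. This is only one way to do it, under stated assumptions: reserve_range() and IA32_VSYSCALL_ADDR are names made up here, and mapping the page PROT_NONE and passing the address as a hint instead of using MAP_FIXED are choices of this example, not something the message prescribes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Address of the 32-bit vsyscall page named in the thread. */
#define IA32_VSYSCALL_ADDR ((uintptr_t)0xffffe000u)

/*
 * Try to reserve `len` bytes at exactly `addr` as an inaccessible
 * anonymous mapping, so that no later allocation of the process can
 * land there. Returns 0 on success, -1 if that exact range could not
 * be obtained. The address is only a hint (no MAP_FIXED), so an
 * existing mapping is never silently replaced.
 */
static int reserve_range(uintptr_t addr, size_t len)
{
    assert(len > 0);

    void *p = mmap((void *)addr, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if ((uintptr_t)p != addr) {
        /* The kernel placed the mapping elsewhere: undo and report failure. */
        munmap(p, len);
        return -1;
    }
    return 0;
}
```

A 64-bit program would call reserve_range(IA32_VSYSCALL_ADDR, 4096) once, early in startup and before creating large mappings, and then simply leave the page alone.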
* Re: Something very strange on x86_64 2.6.X kernels 2005-01-21 16:26 ` Petr Vandrovec 2005-01-21 16:49 ` Eric Dumazet @ 2005-01-22 1:54 ` Andi Kleen 2005-01-22 2:14 ` Linus Torvalds 1 sibling, 1 reply; 24+ messages in thread From: Andi Kleen @ 2005-01-22 1:54 UTC (permalink / raw) To: Petr Vandrovec Cc: Eric Dumazet, Andi Kleen, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch On Fri, Jan 21, 2005 at 05:26:01PM +0100, Petr Vandrovec wrote: > On Thu, Jan 20, 2005 at 09:53:36PM +0100, Eric Dumazet wrote: > > > > Examining linux sources, I found that 0xffffe000 is 'special' (ia 32 > > vsyscall) and 0xffffe600 is about sigreturn subsection of this special area. > > > > Is it possible some vm trick just kicks in and corrupts my true 64bits > > program ? > > Maybe I already missed answer, but try patch below. It is definitely bad > to mark syscall page as global one... Patch looks good thanks. Ugh, what a stupid bug. I applied the patch to my tree. -Andi ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Something very strange on x86_64 2.6.X kernels 2005-01-22 1:54 ` Andi Kleen @ 2005-01-22 2:14 ` Linus Torvalds 0 siblings, 0 replies; 24+ messages in thread From: Linus Torvalds @ 2005-01-22 2:14 UTC (permalink / raw) To: Andi Kleen Cc: Petr Vandrovec, Eric Dumazet, linux-kernel, Chris Bruner, Andrew Morton, Matt Domsch On Sat, 22 Jan 2005, Andi Kleen wrote: > > I applied the patch to my tree. I already applied it as obvious ;) Linus ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-20 16:48 ` Andi Kleen 2005-01-20 20:53 ` Something very strange on x86_64 2.6.X kernels Eric Dumazet @ 2005-01-21 6:58 ` Catalin(ux aka Dino) BOIE 2005-01-21 7:11 ` Andi Kleen 2005-02-14 6:15 ` Adam Sulmicki 1 sibling, 2 replies; 24+ messages in thread From: Catalin(ux aka Dino) BOIE @ 2005-01-21 6:58 UTC (permalink / raw) To: Andi Kleen Cc: Adrian Bunk, Janos Farkas, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch On Thu, 20 Jan 2005, Andi Kleen wrote: >> AOL: >> - lilo 22.6.1 >> - CONFIG_EDD=y >> - 2.6.10-mm1 and 2.6.11-rc1 did boot >> - 2.6.11-rc1-mm1 and 2.6.11-rc1-mm2 didn't boot >> - 2.6.11-rc1-mm2 with this ChangeSet reverted boots. > > What I gather so far the problem seems to only happen with lilo > and EDID together. grub appears to work. Or did anyone > see problems with grub too? > > I'll dig a bit, but reverting for now is probably best. > Thanks Linus. I really suggest to push this limit to 4k. My reason is that under UML I need to put a lot of stuff in command line and uml crash if I not extend this limit. Can we make it depend on arhitecture? Thanks. --- Catalin(ux aka Dino) BOIE catab at deuroconsult.ro http://kernel.umbrella.ro/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-21 6:58 ` COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 Catalin(ux aka Dino) BOIE @ 2005-01-21 7:11 ` Andi Kleen 2005-01-21 17:46 ` Matt Domsch 2005-02-07 6:57 ` Werner Almesberger 2005-02-14 6:15 ` Adam Sulmicki 1 sibling, 2 replies; 24+ messages in thread From: Andi Kleen @ 2005-01-21 7:11 UTC (permalink / raw) To: Catalin(ux aka Dino) BOIE Cc: Andi Kleen, Adrian Bunk, Janos Farkas, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch > I really suggest to push this limit to 4k. My reason is that under UML I > need to put a lot of stuff in command line and uml crash if I not extend > this limit. Can we make it depend on arhitecture? It's dependent on the architecture already. I would like to enable it on i386/x86-64 because the kernel command line is often used to pass parameters to installers, and having a small limit there can be awkward. But first need to figure out what went wrong with EDD. Matt D., do you have thoughts on this? -Andi ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-21 7:11 ` Andi Kleen @ 2005-01-21 17:46 ` Matt Domsch 2005-01-21 19:05 ` H. Peter Anvin 2005-02-07 6:57 ` Werner Almesberger 1 sibling, 1 reply; 24+ messages in thread From: Matt Domsch @ 2005-01-21 17:46 UTC (permalink / raw) To: Andi Kleen, hpa Cc: Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds On Fri, Jan 21, 2005 at 08:11:44AM +0100, Andi Kleen wrote: > > I really suggest to push this limit to 4k. My reason is that under UML I > > need to put a lot of stuff in command line and uml crash if I not extend > > this limit. Can we make it depend on arhitecture? > > It's dependent on the architecture already. I would like to enable > it on i386/x86-64 because the kernel command line is often used > to pass parameters to installers, and having a small limit there > can be awkward. > > But first need to figure out what went wrong with EDD. > > Matt D., do you have thoughts on this? It is definitely boot-loader dependent. Simply changing COMMAND_LINE_SIZE from 256 to 2048 in the kernel isn't enough. There are 2 ways the command line is passed from the boot loader into the kernel. Boot loader version <= 0x0201 (which LILO uses) I believe the command line is located at the end of what was known as the 'empty zero page', now known as the boot parameters. This part is black magic to me. Boot loader version >= 0x0202 (which GRUB uses) command line can be essentially any size, located anywhere in memory, and the boot loader tells the kernel where to find it. The EDD real mode code uses only this case for parsing the command line, and if an older loader is used, EDD skips parsing the command line looking for its options. There's little space left in the boot parameters block, my EDD code uses nearly all that was remaining, and could use some more if it were available. Having a longer command line would be nice too. 
I spoke with hpa at OLS last summer about this, and he offered to help. Peter? Thanks, Matt -- Matt Domsch Software Architect Dell Linux Solutions linux.dell.com & www.dell.com/linux Linux on Dell mailing lists @ http://lists.us.dell.com ^ permalink raw reply [flat|nested] 24+ messages in thread
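The two cases Matt describes can be summarized in a few lines of C. This is a sketch under assumptions, not the kernel's or any boot loader's code: struct boot_info and its field names are hypothetical and trimmed down, and the legacy path is modelled simply as "an offset from the base of the real-mode data", which glosses over the details Matt calls black magic. Only the 0x0202 threshold comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First boot protocol version with a pointer-based command line (per the thread). */
#define BOOT_PROTO_CMDLINE_PTR 0x0202

enum cmdline_source {
    CMDLINE_LEGACY_OFFSET, /* protocol <= 0x0201: lives inside the boot parameter area */
    CMDLINE_POINTER        /* protocol >= 0x0202: loader passes an address */
};

/* Hypothetical, simplified view of what the loader hands to the kernel. */
struct boot_info {
    uint16_t version;       /* boot protocol version */
    uint32_t cmd_line_ptr;  /* linear address of the command line (new style) */
    uint16_t legacy_offset; /* offset from the real-mode data base (old style) */
};

static enum cmdline_source pick_cmdline_source(const struct boot_info *bi)
{
    assert(bi != NULL);
    return bi->version >= BOOT_PROTO_CMDLINE_PTR ? CMDLINE_POINTER
                                                 : CMDLINE_LEGACY_OFFSET;
}

/* Resolve where the command line starts, given the base of the real-mode data. */
static uint32_t cmdline_address(const struct boot_info *bi, uint32_t real_mode_base)
{
    if (pick_cmdline_source(bi) == CMDLINE_POINTER)
        return bi->cmd_line_ptr;
    return real_mode_base + bi->legacy_offset;
}
```

The point of the split is that only in the pointer case can the loader place an arbitrarily long string somewhere else in memory; in the legacy case the string has to fit in whatever room the boot parameter area leaves.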
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-21 17:46 ` Matt Domsch @ 2005-01-21 19:05 ` H. Peter Anvin 0 siblings, 0 replies; 24+ messages in thread From: H. Peter Anvin @ 2005-01-21 19:05 UTC (permalink / raw) To: Matt Domsch Cc: Andi Kleen, Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds Matt Domsch wrote: > On Fri, Jan 21, 2005 at 08:11:44AM +0100, Andi Kleen wrote: > >>>I really suggest to push this limit to 4k. My reason is that under UML I >>>need to put a lot of stuff in command line and uml crash if I not extend >>>this limit. Can we make it depend on arhitecture? >> >>It's dependent on the architecture already. I would like to enable >>it on i386/x86-64 because the kernel command line is often used >>to pass parameters to installers, and having a small limit there >>can be awkward. >> >>But first need to figure out what went wrong with EDD. >> >>Matt D., do you have thoughts on this? > > > It is definitely boot-loader dependent. Simply changing > COMMAND_LINE_SIZE from 256 to 2048 in the kernel isn't enough. > > There are 2 ways the command line is passed from the boot loader into > the kernel. > > Boot loader version <= 0x0201 (which LILO uses) > I believe the command line is located at the end of what was known as > the 'empty zero page', now known as the boot parameters. This part is > black magic to me. > > Boot loader version >= 0x0202 (which GRUB uses) > command line can be essentially any size, located anywhere in memory, > and the boot loader tells the kernel where to find it. The EDD real > mode code uses only this case for parsing the command line, and if an > older loader is used, EDD skips parsing the command line looking > for its options. > > > There's little space left in the boot parameters block, my EDD code > uses nearly all that was remaining, and could use some more if it were > available. Having a longer command line would be nice too. 
I spoke > with hpa at OLS last summer about this, and he offered to help. > Peter? > The protocol itself doesn't encode it, but before we extend it for protocol >= 0x0202 we need to make sure that older kernels don't break if they get a very long command line (truncation is OK, crashing is not.) If they do crash, we need to add a field in the header. I don't see any reason why the boot parameter block can't be more than one page long. I think today that it's just a static structure. -hpa ^ permalink raw reply [flat|nested] 24+ messages in thread
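hpa's requirement ("truncation is OK, crashing is not") boils down to a bounded copy that always terminates the destination. A minimal sketch follows, with the caveat that copy_cmdline() is a name invented here and this is not how any particular kernel version does it; the value 256 is the old i386 limit mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define COMMAND_LINE_SIZE 256 /* pre-change i386 value quoted in the thread */

/*
 * Copy a NUL-terminated command line into a fixed-size buffer.
 * At most dst_size - 1 characters are copied and dst is always
 * NUL-terminated. Returns 1 if the source had to be truncated,
 * 0 if it fit completely.
 */
static int copy_cmdline(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t n = 0;
    while (n < dst_size - 1 && src[n] != '\0') {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';

    /* If we stopped before reaching the terminator, something was cut off. */
    return src[n] != '\0';
}
```

A kernel with `static char saved[COMMAND_LINE_SIZE];` that copies this way keeps working when a newer loader hands it a 2k string: it sees only the first 255 characters, but nothing past the buffer is written.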
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-01-21 7:11 ` Andi Kleen 2005-01-21 17:46 ` Matt Domsch @ 2005-02-07 6:57 ` Werner Almesberger 2005-02-12 13:54 ` Eric W. Biederman 1 sibling, 1 reply; 24+ messages in thread From: Werner Almesberger @ 2005-02-07 6:57 UTC (permalink / raw) To: Andi Kleen Cc: Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch Andi Kleen wrote: > It's dependent on the architecture already. I would like to enable > it on i386/x86-64 because the kernel command line is often used > to pass parameters to installers, and having a small limit there > can be awkward. Something to keep in mind when extending the command line is that we'll probably need a mechanism for passing additional (and possibly large) data blocks from the boot loader soon. The reason for this is that, if booting through kexec, it would be attractive to pass device scan results, so that the second kernel doesn't have to repeat the work. As an obvious extension, anyone who wants to boot *quickly* could also pass such data from persistent storage without actually performing the device scan at all when the machine is booted. The command line may be suitable for this, but to allow for passing a lot of data, its place in memory should perhaps just be reserved, at least until the system has passed initialization, without trying to copy it to a "safe" place early in kernel startup. - Werner -- _________________________________________________________________________ / Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net / /_http://www.almesberger.net/____________________________________________/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 2005-02-07 6:57 ` Werner Almesberger @ 2005-02-12 13:54 ` Eric W. Biederman 2005-02-12 14:51 ` Werner Almesberger 0 siblings, 1 reply; 24+ messages in thread From: Eric W. Biederman @ 2005-02-12 13:54 UTC (permalink / raw) To: Werner Almesberger Cc: Andi Kleen, Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas, linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch Werner Almesberger <wa@almesberger.net> writes: > Andi Kleen wrote: > > It's dependent on the architecture already. I would like to enable > > it on i386/x86-64 because the kernel command line is often used > > to pass parameters to installers, and having a small limit there > > can be awkward. > > Something to keep in mind when extending the command line is that > we'll probably need a mechanism for passing additional (and > possibly large) data blocks from the boot loader soon. > > The reason for this is that, if booting through kexec, it would be > attractive to pass device scan results, so that the second kernel > doesn't have to repeat the work. As an obvious extension, anyone > who wants to boot *quickly* could also pass such data from > persistent storage without actually performing the device scan at > all when the machine is booted. > > The command line may be suitable for this, but to allow for passing > a lot of data, its place in memory should perhaps just be reserved, > at least until the system has passed initialization, without trying > to copy it to a "safe" place early in kernel startup. Actually this is trivial to do by using a file in initramfs. If we need something in a well defined format anyway. Eric ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6
  2005-02-12 13:54 ` Eric W. Biederman
@ 2005-02-12 14:51   ` Werner Almesberger
  2005-02-12 15:17     ` Eric W. Biederman
  0 siblings, 1 reply; 24+ messages in thread
From: Werner Almesberger @ 2005-02-12 14:51 UTC
To: Eric W. Biederman
Cc: Andi Kleen, Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas,
    linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch

Eric W. Biederman wrote:
> Actually this is trivial to do by using a file in initramfs.
> If we need something in a well defined format anyway.

Yes, constructing an additional initramfs, or modifying an existing
one to hold such data is certainly a possibility.

I think there are mainly three choices:
1) the command line
2) an initramfs
3) some other, yet to be defined data structure

1) is relatively easy to do, but leads to more little parsers and
doesn't scale too well. 2) scales well but has a relatively high
overhead (constructing/scanning a cpio archive, etc., particularly
for items needed early in the boot process), and does not work too
well for discontiguous data structures.

3) is of course what we should try to avoid :-)

So far, I also think that using an initramfs, or at least
something that looks like one, even if not normally used as such,
is the thing to try first.

- Werner

--
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6
  2005-02-12 14:51 ` Werner Almesberger
@ 2005-02-12 15:17   ` Eric W. Biederman
  2005-02-14  5:49     ` Werner Almesberger
  0 siblings, 1 reply; 24+ messages in thread
From: Eric W. Biederman @ 2005-02-12 15:17 UTC
To: Werner Almesberger
Cc: Andi Kleen, Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas,
    linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch

Werner Almesberger <wa@almesberger.net> writes:

> Eric W. Biederman wrote:
> > Actually this is trivial to do by using a file in initramfs.
> > If we need something in a well defined format anyway.
>
> Yes, constructing an additional initramfs, or modifying an existing
> one to hold such data is certainly a possibility.
>
> I think there are mainly three choices:
> 1) the command line
> 2) an initramfs
> 3) some other, yet to be defined data structure
>
> 1) is relatively easy to do, but leads to more little parsers and
> doesn't scale too well. 2) scales well but has a relatively high
> overhead (constructing/scanning a cpio archive, etc., particularly
> for items needed early in the boot process), and does not work too
> well for discontiguous data structures.

There is certainly an issue with reading it early. But constructing
an additional cpio and sticking it into the initrd block is fairly
simple.

For detecting devices, especially in the case that takes
a while, that isn't something we need to do early
in the boot process.

> 3) is of course what we should try to avoid :-)

Well the data structure is still yet to be defined. The
question you raised is how to pass it.

> So far, I also think that using an initramfs, or at least
> something that looks like one, even if not normally used as such,
> is the thing to try first.

Something like that. I have yet to see even a proof of concept
of the idea of passing device information, to clean up probes.
Nor am I quite certain if it is really useful.

But when it happens I am sure we can cope.

Eric
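The "additional cpio stuck onto the initrd" idea can be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes the "newc" cpio layout (a 6-byte magic followed by thirteen 8-digit hex fields, with names and data padded to 4 bytes) and assumes the kernel accepts several concatenated archives in the initramfs buffer. The file name devscan.txt and the helper names newc_entry, build_archive and append_to_initrd are hypothetical.

```python
def _pad4(n: int) -> bytes:
    """Return the NUL bytes needed to round n up to a multiple of 4."""
    return b"\0" * ((4 - n % 4) % 4)


def newc_entry(name: str, data: bytes = b"", mode: int = 0o100644,
               ino: int = 0, nlink: int = 1) -> bytes:
    """Encode one cpio "newc" member: header, name, then file data."""
    name_bytes = name.encode("ascii") + b"\0"
    # Field order: ino, mode, uid, gid, nlink, mtime, filesize,
    # devmajor, devminor, rdevmajor, rdevminor, namesize, check
    fields = (ino, mode, 0, 0, nlink, 0, len(data),
              0, 0, 0, 0, len(name_bytes), 0)
    header = b"070701" + b"".join(b"%08X" % f for f in fields)
    out = header + name_bytes
    out += _pad4(len(out))            # header + name padded to 4 bytes
    out += data + _pad4(len(data))    # data padded to 4 bytes
    return out


def build_archive(files: dict) -> bytes:
    """Build a complete archive from {name: bytes}, ending with the trailer."""
    members = [newc_entry(name, data, ino=i)
               for i, (name, data) in enumerate(files.items(), start=1)]
    members.append(newc_entry("TRAILER!!!", mode=0))
    return b"".join(members)


def append_to_initrd(initrd: bytes, extra: bytes) -> bytes:
    """Concatenate an extra archive after an existing initrd image.

    Assumption: zero padding between archives is tolerated by the unpacker.
    """
    return initrd + _pad4(len(initrd)) + extra


# Hypothetical payload: device scan results saved by the first kernel.
archive = build_archive({"devscan.txt": b"hello\n"})
```

Whether the early-boot cost Werner mentions is acceptable is a separate question; the sketch only shows that producing the archive itself takes very little code.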
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6
  2005-02-12 15:17 ` Eric W. Biederman
@ 2005-02-14  5:49   ` Werner Almesberger
  2005-02-14  7:36     ` Eric W. Biederman
  0 siblings, 1 reply; 24+ messages in thread
From: Werner Almesberger @ 2005-02-14  5:49 UTC
To: Eric W. Biederman
Cc: Andi Kleen, Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas,
    linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch

Eric W. Biederman wrote:
> For detecting devices especially in the case that takes
> a while that isn't something we need to do early
> in the boot process.

Yes, but I'd rather have a generic mechanism that works in all
reasonable cases. Things have a tendency of growing in the
oddest directions. E.g. when introducing the boot command line,
all I had in mind was to have a way to boot single-user mode :-)

> Well the data structure is still yet to be defined. The
> question you raised is how to pass it.

Err yes, that's what I wanted to say :) Some new mechanism to
pass the data, or a weird data structure instead of (as opposed
to being on) initrd/initramfs.

> Something like that. I have yet to see even a proof of concept
> of the idea of passing device information, to clean up probes.

Yes, the kexec-based boot loader first, then this. For a
kexec-based boot loader, passing device scan results will be
very useful, plus it's a good environment for experimenting
with such a feature.

- Werner

--
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6
  2005-02-14  5:49 ` Werner Almesberger
@ 2005-02-14  7:36   ` Eric W. Biederman
  0 siblings, 0 replies; 24+ messages in thread
From: Eric W. Biederman @ 2005-02-14  7:36 UTC
To: Werner Almesberger
Cc: Andi Kleen, Catalin(ux aka Dino) BOIE, Adrian Bunk, Janos Farkas,
    linux-kernel, Chris Bruner, Andrew Morton, Linus Torvalds, Matt Domsch,
    fastboot

Werner Almesberger <wa@almesberger.net> writes:

> Eric W. Biederman wrote:
> > Something like that. I have yet to see even a proof of concept
> > of the idea of passing device information, to clean up probes.
>
> Yes, the kexec-based boot loader first, then this. For a
> kexec-based boot loader, passing device scan results will be
> very useful, plus it's a good environment for experimenting
> with such a feature.

And from another perspective what drives things are practical
requirements. Boot speed, while nice, does not yet seem to be a driver,
and that is all I have seen proposed with passing the list of hardware.

What is currently a driver in the kexec scenario is booting a kernel
without firmware calls, and in the kexec-on-panic case booting a kernel
without a kernel where the hardware is in a known messed up state.

So far I have seen nothing that even resembles an architecture
independent solution to avoiding firmware calls. And right now
I'm not even certain I expect to see it become architecture
independent. At the very least we need some clean architecture
specific support first, so we can have a clue what needs to be
generalized. ia64 and ppc are coming...

At any rate I see the problem of which hardware devices are present
as a subset of the problem of booting without firmware. So I suspect
we are going to get some pretty weird architecture specific
implementations at least in the first go round.

Eric
* Re: COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6
  2005-01-21  6:58 ` COMMAND_LINE_SIZE increasing in 2.6.11-rc1-bk6 Catalin(ux aka Dino) BOIE
  2005-01-21  7:11   ` Andi Kleen
@ 2005-02-14  6:15   ` Adam Sulmicki
  1 sibling, 0 replies; 24+ messages in thread
From: Adam Sulmicki @ 2005-02-14  6:15 UTC
To: Catalin(ux aka Dino) BOIE
Cc: Andi Kleen, Adrian Bunk, Janos Farkas, linux-kernel, Chris Bruner,
    Andrew Morton, Linus Torvalds, Matt Domsch

On Fri, 21 Jan 2005, Catalin(ux aka Dino) BOIE wrote:

> I really suggest to push this limit to 4k. My reason is that under UML I need
> to put a lot of stuff in command line and uml crash if I not extend this
> limit. Can we make it depend on architecture?

Another nice feature would be the kernel ignoring any "\n" in the
command line. Currently, if you accidentally pass a "\n" in the command
line, the weirdest things happen.

For example, type the following

mkelfImage /boot/vmlinuz-2.6.11-rc2-mm1 /boot/vmlinuz-2.6.11-rc2-mm1.elf \
--command-line="console=ttyS0,19200 root=/dev/nfs nfsroot=/ ip=any
init=/usr/src/cm/files/init.kexec.sh"

and watch the kernel say that it does not get any DHCP replies, while
the real problem is that there's a "\n" before the init= part.
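Until the kernel ignores stray newlines, the same effect can be had on the tool side before the string ever reaches the boot loader. A minimal sketch, assuming a wrapper script that prepares the command line: the helper name sanitize_cmdline is hypothetical, the 2048-byte limit is taken from the "2k" figure in the patch under discussion, and nothing here reflects what mkelfImage or the kernel actually does.

```python
def sanitize_cmdline(raw: str, limit: int = 2048) -> str:
    """Replace line breaks with spaces and check the result against a limit.

    Only CR and LF are touched, so spacing inside quoted parameter
    values is left alone. One byte of the limit is kept for the
    terminating NUL (assumption: the limit counts the whole buffer).
    """
    cleaned = raw.replace("\r\n", " ").replace("\n", " ").replace("\r", " ")
    cleaned = cleaned.strip()
    if len(cleaned.encode("ascii")) > limit - 1:
        raise ValueError("command line exceeds %d bytes" % (limit - 1))
    return cleaned


# The command line from the example above, with the accidental newline.
broken = ("console=ttyS0,19200 root=/dev/nfs nfsroot=/ ip=any\n"
          "init=/usr/src/cm/files/init.kexec.sh")
fixed = sanitize_cmdline(broken)
```

With the newline replaced by a space, ip=any and init= arrive as two separate parameters, which is what the command was meant to pass in the first place.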