* Re: Linux v2.5.62 --- spontaneous reboots
[not found] ` <fa.d672u14.1gk8ea4@ifi.uio.no>
@ 2003-02-18 23:48 ` walt
0 siblings, 0 replies; 18+ messages in thread
From: walt @ 2003-02-18 23:48 UTC (permalink / raw)
To: linux-kernel

Chris Wedgwood wrote:

> ...I'd suspect it was an Athlon or chipset problem if it weren't for the
> fact 2.4.x is stable for 8+ hours doing the same exact thing[1].

Unfortunately this is not proof :-(

I can tell you from personal experience that the BSD kernels are much
more sensitive to overheating hardware than linux is, for example -- so
one linux kernel could just as easily be more sensitive to overheating
than another linux kernel.

I've never found out why this is, but I know it's true. When I try to
run a BSD kernel on a dust-covered motherboard I'll get random crashes
all over the place even though a linux kernel will run just fine on the
same machine.

All I do is blow the dust off the motherboard and both kernels run again
without problem. Absolutely for sure. I'd love to know what makes the
difference.
[parent not found: <fa.du861p4.qi0a2o@ifi.uio.no>]
[parent not found: <fa.m7uie32.15048ou@ifi.uio.no>]
* Re: Linux v2.5.62 --- spontaneous reboots
[not found] ` <fa.m7uie32.15048ou@ifi.uio.no>
@ 2003-02-18 13:07 ` Ed Tomlinson
0 siblings, 0 replies; 18+ messages in thread
From: Ed Tomlinson @ 2003-02-18 13:07 UTC (permalink / raw)
To: Linus Torvalds, linux-kernel

Linus Torvalds wrote:

> A lot of people seem to be using gcc-3.2 these days, since it's what RH-8
> comes with as standard. I don't think there are any known problems with
> that compiler, at least on x86.

Not so. See the lkml thread

Re: [BUG] link error in usbserial with gcc3.2

Ed Tomlinson
* Linux v2.5.62
@ 2003-02-17 23:18 Linus Torvalds
2003-02-18 0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2003-02-17 23:18 UTC (permalink / raw)
To: Kernel Mailing List
Hmm.. Mostly lots of small updates, although the merge with Andrew
included the RCU dcache patches from IBM that he has carried along for a
while (ie fairly fundamental, but also very well tested).
ARM, PPC, PPC64, alpha, kbuild.
Oh, and as a sign that 2.6.x really _is_ approaching, people have started
sending me spelling fixes. Kernel coders are apparently all atrocious
spellers, and for some reason the spelling police always comes out of the
woodwork when stable releases get closer.
Linus
---
Summary of changes from v2.5.61 to v2.5.62
============================================
<d.mueller@elsoft.ch>:
o PPC32: Export additional symbols for CONFIG_4xx
<tinglett@vnet.ibm.com>:
o ppc64: revised machine check exception handler
o ppc64: new scanlog interface
Adrian Bunk <bunk@fs.tum.de>:
o [netdrvr] make CONFIG_MII one-line desc more pretty
Alan Cox <alan@lxorguk.ukuu.org.uk>:
o Add printk levels to mtrr, also clarify
o merge the NEC98 parsing code
o make the io-apic printk generate less junk mail
o printk levels for mpparse
o remove bogowarning
o itanic people cant spell either
o nor PPC people ;)
o specialix fix from 2.4 missing in 2.5
o bring 2.5 arcnet into line with 2.4
o Fix aha1542
o mca 53c9x also needs mca-legacy
o another ia64 typo
o header update for arcnet updates (again to match 2.4)
Andrew Morton <akpm@digeo.com>:
o ppc64: kill ppc64 unused var warning
o ppc64: fix warning in smp_prepare_cpus
o JFS build fix with gcc-2.95.3
o flush_tlb_all is not preempt safe
o move fault_in_pages_readable/writeable to header
o separate checks from generic_file_aio_write
o fix ext3 BUG due to race with truncate
o crc32 improvements
o dcache_rcu: revert fast_walk code
o dcache_rcu
o error checking in ext3 xattr code
o xattr: listxattr fix
o xattr: infrastructure for permission overrides
o xattr: allow kernel code to override EA permissions
o xattr: trusted extended attributes
o blk_congestion_wait tuning and lockup fix
o cciss driver update
o cciss, fix array bounds overrun
o direct-io return value fix
o direct-io: allow reading of the part-filled EOF block
o Fix ext3 build when EXT3_DEBUG is defined
o Make the world safe for -Wundef
o fix compile breakage on drivers/scsi/NCR53C9x.c
o Use table lookup for radix_tree_maxindex()
o elv_former_request reversion
Andries E. Brouwer <andries.brouwer@cwi.nl>:
o add static, fix typo
Anton Blanchard <anton@samba.org>:
o ppc64: add TCSBRKP
o ppc64: Remove sys32_mremap, not required on ppc64 since we alter
TASK_SIZE
o ppc64: fix compile warnings
o ppc64: clean up some of big bad sys_ppc32.c
o ppc64: always compile in 32bit ELF support
o ppc64: Never call event-scan faster than once per second, required
on some machines
o ppc64: dont attempt a traceback table lookup for userspace
addresses
o ppc64: warning fix, caused by me
o ppc64: use get_user in alignment exception handler
o ppc64: ptrace signal fix
o ppc64: make sure socketcall_table is 8 byte aligned
o ppc64: add set_tid_address and fadvise64
o disable printout of interrupts in /proc/stat on ppc64
o enable OFFB on ppc64
o remove stale comment
o compat futex fix
Art Haas <ahaas@airmail.net>:
o C99 initializers for drivers/net/aironet4500_proc.c
o C99 initializers for drivers/char/rtc.c
o C99 initializers for drivers/cdrom/cdrom.c
o C99 initializers for drivers/net/arlan-proc.c
Ben Collins <bcollins@debian.org>:
o IEEE-1394 Updates
Brian Gerst <bgerst@didntduck.org>:
o remove .mod.c files in make clean
Daniel Jacobowitz <drow@nevyn.them.org>:
o Clean up ptrace_setoptions and PT_* constants
o Set ptrace_message before PT_TRACE_EXIT
Dave Kleikamp <shaggy@shaggy.austin.ibm.com>:
o JFS: Fix jfs_sync_fs
Dominik Brodowski <linux@brodo.de>:
o pcmcia: add device_class pcmcia_socket, update devices & drivers
o pcmcia: use device_class->add_device/remove_device
o cpufreq: move frequency table helpers to extra module
o cpufreq: move /proc/cpufreq interface code to extra module
o cpufreq: fix compilation of ACPI if !CPU_FREQ
o pcmcia: small bugfix & cleanup
François Romieu <romieu@fr.zoreil.com>:
o [netdrvr rrunner] small fixes and cleanups
Jaroslav Kysela <perex@suse.cz>:
o ALSA update
Jeff Wiedemeier <jeff.wiedemeier@hp.com>:
o alpha numa setup_memory leaves meaningless {min,max}_low_pfn
o delay marvel agp printk until after !hose check
Jens Axboe <axboe@suse.de>:
o deadline ioscheduler bug fixes
o fix request-to-request front merging
o missing lock in get_request_wait()
o front merge fix (really!)
Kai Germaschewski <kai@tp1.ruhr-uni-bochum.de>:
o kbuild: Always postprocess modules
o kbuild: Move the version magic generation into module
postprocessing
o kbuild: Use list of modules for "make modules_install"
o kbuild: Do module post processing in C
o kbuild: Add dependency info to modules
o kbuild: Add dependency info to modules
o kbuild: Figure endianness / word size at compile time
o kbuild: Merge file2alias into scripts/modpost.c
o kbuild: Rename some module postprocessing stuff
o kbuild: scripts/elfconfig.h is generated
o kbuild: Warn on undefined exported symbols
o kbuild: Fix modules_install w/o modules error
o kbuild: Fix a 64-bit issue in scripts/modpost.c
o kbuild: Fix a "make -j" bug
Linus Torvalds <torvalds@home.transmeta.com>:
o Fix futex compile breakage introduced by the compat code
o Clean up and fix locking around signal rendering
o Do proper signal locking for the old-style /proc/stat too
o It's usually considered stupid to lock the same spinlock twice in
close succession. However, for this once we'll just call it
"inspired".
o Fix locking for "send_sig_info()", to avoid possible races with
signal state changes due to execve() and exit(). We need to hold
the tasklist lock to guarantee stability of "task->sighand".
Marc Zyngier <mzyngier@freesurf.fr>:
o EISA/sysfs updates
Matthew Wilcox <willy@debian.org>:
o Fix mandatory locking
Paul Mackerras <paulus@samba.org>:
o PPC32: Changes to accommodate recent signal changes
(current->sighand)
o PPC32: Fix compile warnings in some programs used in the build
process
o PPC32: Add set_tid_address and fadvise64 system calls
o PPC32: declare pm_power_off
o PPC32: use ptrace_notify
Randy Dunlap <rddunlap@osdl.org>:
o fix Documentation/cli-sti-removal.txt thinko
Richard Henderson <rth@are.twiddle.net>:
o [ALPHA] Add missing sighand bits
o [ALPHA] Add isa_eth_io_copy_and_sum
o [ALPHA] Add fadvise64
Rob Weryk <rjweryk@uwo.ca>:
o Fix small typo
Robert Love <rml@tech9.net>:
o trivial: unused var in sunrpc
Roger Luethi <rl@hellgate.ch>:
o [netdrvr via-rhine] trivial bits
o [netdrvr via-rhine] fix broken tx-underrun handling
o [netdrvr via-rhine] various duplex-related fixes
o [netdrvr via-rhine] reset function rewrite
o [netdrvr via-rhine] bump version, use constant instead of magic
number
o Fix 8139too device close
Russell King <rmk@flint.arm.linux.org.uk>:
o [ARM] Fix resource initialisation for IOP310
o [ARM] Miscellaneous cleanups
o [ARM] Reduce scope of "safe_buffers"
o [ARM PATCH] 1372/1: EPXA10DB: Add missing include files to irq.c
for 2.5.59
o [ARM PATCH] 1373/1: EPXA10DB: Update def-config file
o [ARM PATCH] 1376/1: Use #defines for iq80310 serial port
o [ARM PATCH] 1377/1: Retain endianess state on XScale CPUs during
boot
o [ARM PATCH] 1368/1: Fix some typos in proc-armv/system.h
o [ARM] Better handling of bad IRQ implementations
o [ARM PATCH] 1380/1: Big-Endian support for jiffies
o [ARM] Add init_sighand for 2.5.60
o [ARM] Ensure backtrace terminates on corrupted frame pointers
o [ARM] Update Acorn SCSI drivers
o [ARM] Update wdt285 and wdt977 watchdog drivers
o [ARM] Add input_devclass support to SA1111 PS/2 port driver
o [ARM PATCH] 1099/4: trizeps MTD support
o [ARM] Update signal handling for ARM
Rusty Russell <rusty@rustcorp.com.au>:
o kbuild: Module alias and device table support
o kbuild: Do modversions checks on module structure
o get rid of exec_usermodehelper, replace with call_usermodehelper
o kbuild: Fix non-verbose make modules_install output
Sam Ravnborg <sam@ravnborg.org>:
o fix warning in kernel/dma.c
o char/drivers/random.c - fix warning
Scott Anderson <scott_anderson@mvista.com>:
o PPC32: Invalidate the icache before use on PPC40x
Stephen Rothwell <sfr@canb.auug.org.au>:
o compat_sys_futex 1/3 generic, parisc, ppc64, s390x and x86_64
Steve French <stevef@smfhome1.austin.rr.com>:
o Merge in fixes from version 0.6.5 of the CIFS VFS. Greatly
improved performance including improved distributed caching support
and support for readpages and larger read sizes. Cache data now
flushed properly at file close time. Socket and memory leak fixed.
Fix two oops. Fix error logging and made more consistent. Generic
sendfile added
Steven Cole <elenstev@mesatop.com>:
o [tokenring proteon] trivial, spelling fix
o high pedantry in ppc spelling
o alpha typo fix
o 2.5.61 fix erroneous spellings of error
o 2.5.61 Reduce the number of "nuber" by four
o 2.5.61 fix spelling of necessary in 11 files
o fix different spellings of different and differences
o correct the spelling of correction and correctly
o more accurate spelling of accuracy
o yet more pedantry: complement vs compliment
Tom Rini <trini@kernel.crashing.org>:
o PPC32: Fix some license drain bamage. Noticed by Christoph Hellwig
* Linux v2.5.62 --- spontaneous reboots
2003-02-17 23:18 Linux v2.5.62 Linus Torvalds
@ 2003-02-18 0:03 ` Chris Wedgwood
2003-02-18 0:44 ` Jeff Garzik
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Chris Wedgwood @ 2003-02-18 0:03 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List

On Mon, Feb 17, 2003 at 03:18:43PM -0800, Linus Torvalds wrote:

> Oh, and as a sign that 2.6.x really _is_ approaching, people have
> started sending me spelling fixes.

FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me
without spontaneous rebooting under load (kernel compile in a loop).

I wondered if it was specific to my system here except a few other
people have reported this on *very* different hardware (I have a UP
Athlon with IDE, they have 8-way P4 with SCSI).

Is anyone else seeing this? Might there be some bogon causing triple
faults or similar lurking that I'm just unlucky enough to hit often?

I note the 2.5.59-mjb4 seems pretty reliable and doesn't have this
problem...

--cw
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
@ 2003-02-18 0:44 ` Jeff Garzik
2003-02-18 0:46 ` Chris Wedgwood
2003-02-18 1:42 ` Linus Torvalds
2003-02-18 12:13 ` Pavel Machek
2 siblings, 1 reply; 18+ messages in thread
From: Jeff Garzik @ 2003-02-18 0:44 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: Linus Torvalds, Kernel Mailing List

Chris Wedgwood wrote:
> On Mon, Feb 17, 2003 at 03:18:43PM -0800, Linus Torvalds wrote:
>
>
>>Oh, and as a sign that 2.6.x really _is_ approaching, people have
>>started sending me spelling fixes.
>
>
> FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me
> without spontaneous rebooting under load (kernel compile in a loop).

ACPI, or no?

highmem, or no?

Are you running your UP Athlon with CONFIG_X86_UP_APIC?

Jeff
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 0:44 ` Jeff Garzik
@ 2003-02-18 0:46 ` Chris Wedgwood
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wedgwood @ 2003-02-18 0:46 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linus Torvalds, Kernel Mailing List

On Mon, Feb 17, 2003 at 07:44:08PM -0500, Jeff Garzik wrote:

> ACPI, or no?

nope

> highmem, or no?

no for me --- yes for them I assume (8-way P4)

> Are you running your UP Athlon with CONFIG_X86_UP_APIC?

I was... I wondered if that might do it, so I tried without. Still
reboots. Built kernel as 486 kernel with no IO-APIC too, still
reboots.

Nothing is logged (serial console).

Tried gcc-2.95 and gcc-3.2.

--cw
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
2003-02-18 0:44 ` Jeff Garzik
@ 2003-02-18 1:42 ` Linus Torvalds
2003-02-18 1:53 ` Chris Wedgwood
2003-02-18 21:44 ` Chris Wedgwood
2003-02-18 12:13 ` Pavel Machek
2 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2003-02-18 1:42 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: Kernel Mailing List

On Mon, 17 Feb 2003, Chris Wedgwood wrote:
>
> FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me
> without spontaneous rebooting under load (kernel compile in a loop).
>
> I note the 2.5.59-mjb4 seems pretty reliable and doesn't have this
> problem...

It would be interesting to hear exactly when the trouble started. And if
plain 2.5.59 does it (which is unclear from your description), but
59-mjb4 doesn't, then that's an interesting data point.

Linus
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 1:42 ` Linus Torvalds
@ 2003-02-18 1:53 ` Chris Wedgwood
2003-02-18 2:02 ` Linus Torvalds
2003-02-18 21:44 ` Chris Wedgwood
1 sibling, 1 reply; 18+ messages in thread
From: Chris Wedgwood @ 2003-02-18 1:53 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List

On Mon, Feb 17, 2003 at 05:42:38PM -0800, Linus Torvalds wrote:

> It would be interesting to hear exactly when the trouble
> started. And if plain 2.5.59 does it (which is unclear from your
> description), but 59-mjb4 doesn't, then that's an interesting data
> point.

plain 2.5.59 does

59-mjb4 does NOT

I tested 59-mjb4 at the suggestion of mbligh after hearing that other
people had discovered the same bug and were now using 59-mjb4

--cw
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 1:53 ` Chris Wedgwood
@ 2003-02-18 2:02 ` Linus Torvalds
2003-02-18 2:16 ` Chris Wedgwood
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Linus Torvalds @ 2003-02-18 2:02 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, 17 Feb 2003, Chris Wedgwood wrote:
>
> plain 2.5.59 does
>
> 59-mjb4 does NOT

Can you check mjb 1-3 too? The better it gets pinpointed, the easier it's
going to be to find.

Also, if you can figure out _which_ part of the patch makes a difference,
that would obviously be even better. Part of the stuff in mjb is already
merged in later kernels (ie things like using sequence locks for xtime is
already there in 2.5.60, so clearly that doesn't seem to be the thing that
helps your situation).

Martin cc'd, in case he has suggestions on how/what to split up the patch.

Do you use the starfire driver? That's a big part of the patch, for
example.. And part of the patch just makes the timer interrupt happen much
less often, if you haven't configured for 1000Hz - and it may well be that
small perturbations like that are the things that matter to you.

Linus
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 2:02 ` Linus Torvalds
@ 2003-02-18 2:16 ` Chris Wedgwood
2003-02-18 2:33 ` Linus Torvalds
2003-02-18 3:21 ` Martin J. Bligh
2003-02-19 11:02 ` David Ford
2 siblings, 1 reply; 18+ messages in thread
From: Chris Wedgwood @ 2003-02-18 2:16 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, Feb 17, 2003 at 06:02:03PM -0800, Linus Torvalds wrote:

> Can you check mjb 1-3 too? The better it gets pinpointed, the easier
> it's going to be to find.

Sure... I'll test them later on.

> Also, if you can figure out _which_ part of the patch makes a
> difference, that would obviously be even better.

I'll try to narrow this down.

> Part of the stuff in mjb is already merged in later kernels (ie
> things like using sequence locks for xtime is already there in
> 2.5.60, so clearly that doesn't seem to be the thing that helps your
> situation).

I don't think it's anything really obvious.

If the problem I'm seeing is the same as the one showing up on *some*
IBM NUMA-Q (or whatever they are) boxen then it's probably not a
driver or fs thing --- as we have nothing in common.

Now... it could be two different problems, except the same kernel
which the IBM people found works for them also works for me.

Oddly, wli has not seen this problem and he's using similar hardware
(I think) to the other IBM people and the same compiler as me.

> Do you use the starfire driver?

Nope. A stripped down kernel, compiled for a 486 with no IO-APIC
support (in an attempt to slow things down and hopefully avoid
possible hardware problems such as overheating) still reboots on me.

The only thing I can think of is a triple-fault... I'm wondering
about using gcc-3.2 instead of 2.95.4 (Debian blah blort blem) on the
off chance it's a weird compiler problem.

--cw
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 2:16 ` Chris Wedgwood
@ 2003-02-18 2:33 ` Linus Torvalds
0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2003-02-18 2:33 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, 17 Feb 2003, Chris Wedgwood wrote:
>
> The only thing I can think of is a triple-fault... I'm wondering
> about using gcc-3.2 instead of 2.95.4 (Debian blah blort blem) on the
> off chance it's a weird compiler problem.

A lot of people seem to be using gcc-3.2 these days, since it's what RH-8
comes with as standard. I don't think there are any _known_ problems with
that compiler, at least on x86.

Now, interestingly enough, the mjb patch _does_ contain a change to
mm/memory.c that really makes no sense _except_ in the case of a compiler
bug. So you could check whether that (small) mm/memory.c patch is the
thing that makes a difference for you..

It would also be interesting to see if you can check just the scheduler
part of the mjb patch. On the whole the mjb patch looks like it should be
fairly easy to cut into specific parts, and Martin may actually have it
somewhere as separate patches.

Linus
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 2:02 ` Linus Torvalds
2003-02-18 2:16 ` Chris Wedgwood
@ 2003-02-18 3:21 ` Martin J. Bligh
2003-02-19 11:02 ` David Ford
2 siblings, 0 replies; 18+ messages in thread
From: Martin J. Bligh @ 2003-02-18 3:21 UTC (permalink / raw)
To: Linus Torvalds, Chris Wedgwood; +Cc: Kernel Mailing List

>> plain 2.5.59 does
>>
>> 59-mjb4 does NOT
>
> Can you check mjb 1-3 too? The better it gets pinpointed, the easier it's
> going to be to find.

I should note that our performance team also has triple-faults on some
database app on a 8x machine ... that goes away with mjb4, not sure why
as yet. There's nothing in there that I can think of that would fix a
triple fault, so it may well be something annoyingly subtle.

Try -mjb1 first, if that still fixes it, then I'll start hacking off
chunks for you to test. Try 62 as well ... that has dcache_rcu merged,
which is another major chunk of the patch. kgdb is also big, and may
well change timings ...

> Also, if you can figure out _which_ part of the patch makes a difference,
> that would obviously be even better. Part of the stuff in mjb is already
> merged in later kernels (ie things like using sequence locks for xtime is
> already there in 2.5.60, so clearly that doesn't seem to be the thing that
> helps your situation).

Yup, a lot of it is designed to give our performance team a stable base
to work from - so minimal changes to a 59 base. I use gcc-2.95.4 (Debian)
as Chris does and have found that extremely stable, not sure what the
perf team were using, I'll find out.

> Now, interestingly enough, the mjb patch _does_ contain a change to
> mm/memory.c that really makes no sense _except_ in the case of a compiler
> bug. So you could check whether that (small) mm/memory.c patch is the
> thing that makes a difference for you..

That's the config_page_offset patch, which Dave ported forward from
Andrea's tree ... I've split that out below:

diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/arch/i386/Kconfig 22-config_page_offset/arch/i386/Kconfig
--- 21-config_hz/arch/i386/Kconfig	Wed Feb 5 22:22:59 2003
+++ 22-config_page_offset/arch/i386/Kconfig	Wed Feb 5 22:23:00 2003
@@ -660,6 +660,44 @@ config HIGHMEM64G
 
 endchoice
 
+choice
+	help
+	  On i386, a process can only virtually address 4GB of memory. This
+	  lets you select how much of that virtual space you would like to
+	  devoted to userspace, and how much to the kernel.
+
+	  Some userspace programs would like to address as much as possible and
+	  have few demands of the kernel other than it get out of the way. These
+	  users may opt to use the 3.5GB option to give their userspace program
+	  as much room as possible. Due to alignment issues imposed by PAE,
+	  the "3.5GB" option is unavailable if "64GB" high memory support is
+	  enabled.
+
+	  Other users (especially those who use PAE) may be running out of
+	  ZONE_NORMAL memory. Those users may benefit from increasing the
+	  kernel's virtual address space size by taking it away from userspace,
+	  which may not need all of its space. An indicator that this is
+	  happening is when /proc/Meminfo's "LowFree:" is a small percentage of
+	  "LowTotal:" while "HighFree:" is very large.
+
+	  If unsure, say "3GB"
+	prompt "User address space size"
+	default 1GB
+
+config 05GB
+	bool "3.5 GB"
+	depends on !HIGHMEM64G
+
+config 1GB
+	bool "3 GB"
+
+config 2GB
+	bool "2 GB"
+
+config 3GB
+	bool "1 GB"
+endchoice
+
 config HIGHMEM
 	bool
 	depends on HIGHMEM64G || HIGHMEM4G
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/arch/i386/Makefile 22-config_page_offset/arch/i386/Makefile
--- 21-config_hz/arch/i386/Makefile	Fri Jan 17 09:18:19 2003
+++ 22-config_page_offset/arch/i386/Makefile	Wed Feb 5 22:23:00 2003
@@ -89,6 +89,7 @@ drivers-$(CONFIG_OPROFILE) += arch/i386
 
 CFLAGS += $(mflags-y)
 AFLAGS += $(mflags-y)
+AFLAGS_vmlinux.lds.o += -imacros $(TOPDIR)/include/asm-i386/page.h
 
 boot := arch/i386/boot
 
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/arch/i386/vmlinux.lds.S 22-config_page_offset/arch/i386/vmlinux.lds.S
--- 21-config_hz/arch/i386/vmlinux.lds.S	Fri Jan 17 09:18:20 2003
+++ 22-config_page_offset/arch/i386/vmlinux.lds.S	Wed Feb 5 22:23:00 2003
@@ -10,7 +10,7 @@ ENTRY(_start)
 jiffies = jiffies_64;
 SECTIONS
 {
-  . = 0xC0000000 + 0x100000;
+  . = __PAGE_OFFSET + 0x100000;
   /* read-only */
   _text = .;			/* Text and read-only data */
   .text : {
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/include/asm-i386/page.h 22-config_page_offset/include/asm-i386/page.h
--- 21-config_hz/include/asm-i386/page.h	Tue Jan 14 10:06:18 2003
+++ 22-config_page_offset/include/asm-i386/page.h	Wed Feb 5 22:23:00 2003
@@ -89,7 +89,16 @@ typedef struct { unsigned long pgprot; }
  * and CONFIG_HIGHMEM64G options in the kernel configuration.
  */
 
-#define __PAGE_OFFSET		(0xC0000000)
+#include <linux/config.h>
+#ifdef CONFIG_05GB
+#define __PAGE_OFFSET		(0xE0000000)
+#elif defined(CONFIG_1GB)
+#define __PAGE_OFFSET		(0xC0000000)
+#elif defined(CONFIG_2GB)
+#define __PAGE_OFFSET		(0x80000000)
+#elif defined(CONFIG_3GB)
+#define __PAGE_OFFSET		(0x40000000)
+#endif
 
 /*
  * This much address space is reserved for vmalloc() and iomap()
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/include/asm-i386/processor.h 22-config_page_offset/include/asm-i386/processor.h
--- 21-config_hz/include/asm-i386/processor.h	Thu Jan 2 22:05:15 2003
+++ 22-config_page_offset/include/asm-i386/processor.h	Wed Feb 5 22:23:00 2003
@@ -279,7 +279,11 @@ extern unsigned int mca_pentium_flag;
 /* This decides where the kernel will search for a free chunk of vm
  * space during mmap's.
  */
+#ifdef CONFIG_05GB
+#define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 16))
+#else
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 3))
+#endif
 
 /*
  * Size of io_bitmap in longwords: 32 is ports 0-0x3ff.
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/mm/memory.c 22-config_page_offset/mm/memory.c
--- 21-config_hz/mm/memory.c	Mon Jan 13 21:09:28 2003
+++ 22-config_page_offset/mm/memory.c	Wed Feb 5 22:23:00 2003
@@ -101,8 +101,7 @@ static inline void free_one_pmd(struct m
 
 static inline void free_one_pgd(struct mmu_gather *tlb, pgd_t * dir)
 {
-	int j;
-	pmd_t * pmd;
+	pmd_t * pmd, * md, * emd;
 
 	if (pgd_none(*dir))
 		return;
@@ -113,8 +112,21 @@ static inline void free_one_pgd(struct m
 	}
 	pmd = pmd_offset(dir, 0);
 	pgd_clear(dir);
-	for (j = 0; j < PTRS_PER_PMD ; j++)
-		free_one_pmd(tlb, pmd+j);
+	/*
+	 * Beware if changing the loop below. It once used int j,
+	 *	for (j = 0; j < PTRS_PER_PMD; j++)
+	 *		free_one_pmd(pmd+j);
+	 * but some older i386 compilers (e.g. egcs-2.91.66, gcc-2.95.3)
+	 * terminated the loop with a _signed_ address comparison
+	 * using "jle", when configured for HIGHMEM64GB (X86_PAE).
+	 * If also configured for 3GB of kernel virtual address space,
+	 * if page at physical 0x3ffff000 virtual 0x7ffff000 is used as
+	 * a pmd, when that mm exits the loop goes on to free "entries"
+	 * found at 0x80000000 onwards. The loop below compiles instead
+	 * to be terminated by unsigned address comparison using "jb".
+	 */
+	for (md = pmd, emd = pmd + PTRS_PER_PMD; md < emd; md++)
+		free_one_pmd(tlb,md);
 	pmd_free_tlb(tlb, pmd);
 }
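[Editorial sketch, not part of the original message.] The comment in the mm/memory.c hunk describes a loop whose end test was compiled as a signed address comparison. The user-space model below is only an illustration of that arithmetic, under stated assumptions: addresses are plain 32-bit integers, a PAE pmd page holds 512 entries of 8 bytes, and the compiler is assumed to have rewritten the index loop into an "address <= last entry" test. The names `walk_pmd_page`, `ITER_CAP`, and `PMD_ENTRY_SIZE` are hypothetical and do not come from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PTRS_PER_PMD   512u   /* entries in one pmd page under PAE */
#define PMD_ENTRY_SIZE 8u     /* bytes per PAE pmd entry (assumed) */
#define ITER_CAP       1024u  /* safety cap so the runaway case halts */

/*
 * Count how many entries a loop would visit when walking one pmd page
 * that starts at virtual address `base`, with the end test done either
 * as an unsigned ("jb"-style) or a signed ("jle"-style) comparison.
 */
static unsigned walk_pmd_page(uint32_t base, int use_signed)
{
	uint32_t last = base + (PTRS_PER_PMD - 1u) * PMD_ENTRY_SIZE;
	uint32_t addr = base;
	unsigned visited = 0;

	assert(base % PMD_ENTRY_SIZE == 0);

	while (visited < ITER_CAP) {
		int in_range;

		if (use_signed) {
			/*
			 * Converting an out-of-range value to int32_t is
			 * implementation-defined; on gcc and clang it wraps,
			 * so 0x80000000 becomes the most negative value.
			 */
			in_range = (int32_t)addr <= (int32_t)last;
		} else {
			in_range = addr <= last;
		}
		if (!in_range)
			break;

		visited++;
		addr += PMD_ENTRY_SIZE;
	}
	return visited;
}
```

For a page at 0x7ffff000 the unsigned test stops after 512 entries, while the signed test sees 0x80000000 as negative, still "less than" the last entry, and keeps going until the cap. For a page that does not end at the 2GB boundary, both variants agree.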
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 2:02 ` Linus Torvalds
2003-02-18 2:16 ` Chris Wedgwood
2003-02-18 3:21 ` Martin J. Bligh
@ 2003-02-19 11:02 ` David Ford
2 siblings, 0 replies; 18+ messages in thread
From: David Ford @ 2003-02-19 11:02 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Chris Wedgwood, Kernel Mailing List, Martin J. Bligh

I have a 2.5.58 box that's a simple firewall/router w/ iptables running
on it. It crashes and reboots automatically roughly every other day.
It's been doing that for a long time and I never had the time to debug
it. I'll put .62 on it with a serial console and see what it comes up
with.

It runs two PPPoE channels over ethX. PPPoE is known to blow up (OOPS)
on pppd hangup/restarts.

David
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 1:42 ` Linus Torvalds
2003-02-18 1:53 ` Chris Wedgwood
@ 2003-02-18 21:44 ` Chris Wedgwood
2003-02-18 21:59 ` Chris Wedgwood
1 sibling, 1 reply; 18+ messages in thread
From: Chris Wedgwood @ 2003-02-18 21:44 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, Feb 17, 2003 at 05:42:38PM -0800, Linus Torvalds wrote:

> It would be interesting to hear exactly when the trouble
> started. And if plain 2.5.59 does it (which is unclear from your
> description), but 59-mjb4 doesn't, then that's an interesting data
> point.

After much testing, which is still in progress it would seem that
*maybe* mjb4 does have the problem too, although it's much harder to
hit. Please note that this is a single data point where for other
kernels I have two or more occurrences of spontaneous reboots.

I've been checking older kernels... it would seem the problem first
occurs in 2.5.53 (that is 2.5.53 through 2.5.62-bk all reboot for me).
2.5.51 doesn't appear to and thus far neither does 2.5.52.

I say thus far, because the problem usually appears after about 15
minutes of compiling, but it sometimes takes a little longer. I'm
running 2.5.52 now and after 45 minutes it's still going.

As to what difference it might be between '52 and '53 I have no idea.
I had a quick look and the changes there are considerable.

I've tried different compiles, with and without preempt, with and
without IO-APIC, and trimming down the kernel...

--cw
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 21:44 ` Chris Wedgwood
@ 2003-02-18 21:59 ` Chris Wedgwood
2003-02-18 22:13 ` Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Chris Wedgwood @ 2003-02-18 21:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh

On Tue, Feb 18, 2003 at 01:44:31PM -0800, Chris Wedgwood wrote:

> I say thus far, because the problem usually appears after about 15
> minutes of compiling, but it sometimes takes a little longer. I'm
> running 2.5.52 now and after 45 minutes it's still going.

Of course, Murphy being the optimist he is; about two minutes after I
make a claim that 2.5.52 does NOT spontaneously reboot --- it *DOES*.

I'm back to 2.5.51 and I'll beat it hard and see what happens. I
guess until I (or someone else who sees this) can get some concrete
data points you'll have to ignore this.

--cw
* Re: Linux v2.5.62 --- spontaneous reboots
2003-02-18 21:59 ` Chris Wedgwood
@ 2003-02-18 22:13 ` Linus Torvalds
2003-02-18 22:34 ` Linus Torvalds
2003-02-18 23:01 ` Chris Wedgwood
0 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2003-02-18 22:13 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh

On Tue, 18 Feb 2003, Chris Wedgwood wrote:
>
> Of course, Murphy being the optimist he is; about two minutes after I
> make a claim that 2.5.52 does NOT spontaneously reboot --- it *DOES*.
>
> I'm back to 2.5.51 and I'll beat it hard and see what happens. I
> guess until I (or someone else who sees this) can get some concrete
> data points you'll have to ignore this.

Ok. Especially if it seems that -mjb4 also potentially does it (just
harder to trigger), I don't see many other alternatives than just going
back in time to see when it started.

But if it was getting hard to trigger with 2.5.52 too, things might be
getting hairier and hairier.. If it becomes hard enough to trigger as to
be practically nondeterministic, a better approach might be to just go
back to -mjb4, and even if it is still there in -mjb4 try to see which
part of the patch seems to be making it more stable.

That might give us more clues, and it's a much smaller problem set than
going arbitrarily far back in the 2.5.x series.

Linus
* Re: Linux v2.5.62 --- spontaneous reboots 2003-02-18 22:13 ` Linus Torvalds @ 2003-02-18 22:34 ` Linus Torvalds 2003-02-18 23:01 ` Chris Wedgwood 1 sibling, 0 replies; 18+ messages in thread From: Linus Torvalds @ 2003-02-18 22:34 UTC (permalink / raw) To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh On Tue, 18 Feb 2003, Linus Torvalds wrote: > > But if it was getting hard to trigger with 2.5.52 too, things might be > getting hairier and hairier.. If it becomes hard enough to trigger as to > be practically nondeterministic, a better approach might be to just go > back to -mjb4, and even if it is still there in -mjb4 try to see which > part of the patch seems to be making it more stable. Btw, this is particularly true if it takes you potentially hours to test something like 2.5.51 for stability, but you can reboot 2.5.59 at will in ten minutes. In that case, you can test several versions of "2.5.59 + partial -mjb patches" much more quickly than you can walk backwards in 2.5.x, and try to pinpoint the "this part of -mjb makes it much less likely to reboot". Also, with the -mjb patch there are some new configuration options. For example, CONFIG_100HZ on -mjb has very different behaviour than a plain 2.5.59 kernel that defaults to 1kHz timer clock, and maybe the reason -mjb seems more stable is that you may have selected a configuration option that made -mjb act differently. Regardless, it would be very interesting to hear what the -mjb split-down results would be. Even if the answer might be "at 1kHz timer it is unstable, at 100Hz it is stable" (and if that were to be it, then you'd have to walk backwards to 2.5.24 to find the old 2.5.x kernel that had a slow tick rate). Linus ^ permalink raw reply [flat|nested] 18+ messages in thread
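[Editorial sketch, not part of the original message.] The "2.5.59 + partial -mjb patches" split-down described above amounts to a binary search over an ordered patch series. The following is a minimal Python sketch under stated assumptions: the patch names, the function name `first_stabilizing_patch`, and the `is_stable` callback are all hypothetical; plain 2.5.59 (no patches applied) is assumed unstable; and stability is assumed monotone, i.e. once the stabilizing patch is in the applied prefix, every longer prefix is also stable. In practice `is_stable` would mean building the kernel and running the compile loop long enough to trust the result, which the thread shows is the hard part.

```python
from typing import Callable, Optional, Sequence


def first_stabilizing_patch(
    patches: Sequence[str],
    is_stable: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Return the patch whose inclusion first makes the kernel stable.

    Assumptions (hypothetical, see lead-in): the base kernel with no
    patches applied is unstable, and stability is monotone in the
    length of the applied prefix.
    """
    # If even the full series is unstable, there is nothing to find.
    if not patches or not is_stable(patches):
        return None

    # Invariant: the prefix of length `lo` is unstable,
    # the prefix of length `hi` is stable.
    lo, hi = 0, len(patches)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_stable(patches[:mid]):
            hi = mid
        else:
            lo = mid
    # The patch at index hi - 1 is the one that tipped the prefix to stable.
    return patches[hi - 1]
```

With n patches this needs roughly log2(n) test runs after the initial full-series check, which is why it can beat walking backwards release by release when each test takes up to an hour.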
* Re: Linux v2.5.62 --- spontaneous reboots 2003-02-18 22:13 ` Linus Torvalds 2003-02-18 22:34 ` Linus Torvalds @ 2003-02-18 23:01 ` Chris Wedgwood 1 sibling, 0 replies; 18+ messages in thread From: Chris Wedgwood @ 2003-02-18 23:01 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh On Tue, Feb 18, 2003 at 02:13:00PM -0800, Linus Torvalds wrote: > > I'm back to 2.5.51 and I'll beat it hard and see what happens. I > > guess until I (or someone else who sees this) can get some > > concrete data points you'll have to ignore this. > > Ok. Especially if it seems that -mjb4 also potentially does it (just > harder to trigger), I don't see many other alternatives than just > going back in time to see when it started. It seems 2.5.51 *does* also show this... but it took nearly an hour this time. > But if it was getting hard to trigger with 2.5.52 too, things might > be getting hairier and hairier... If it becomes hard enough to > trigger as to be practically nondeterministic, a better approach > might be to just go back to -mjb4, and even if it is still there in > -mjb4 try to see which part of the patch seems to be making it more > stable. I may have to do that... it seems older kernels do have this problem, it's just harder to hit for some reason. I'd suspect it was an Athlon or chipset problem if it weren't for the fact 2.4.x is stable for 8+ hours doing the same exact thing[1]. > That might give us more clues, and it's a much smaller problem set > than going arbitrarily far back in the 2.5.x series. Sure thing. --cw ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Linux v2.5.62 --- spontaneous reboots 2003-02-18 0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood 2003-02-18 0:44 ` Jeff Garzik 2003-02-18 1:42 ` Linus Torvalds @ 2003-02-18 12:13 ` Pavel Machek 2 siblings, 0 replies; 18+ messages in thread From: Pavel Machek @ 2003-02-18 12:13 UTC (permalink / raw) To: Chris Wedgwood; +Cc: Linus Torvalds, Kernel Mailing List Hi! > > Oh, and as a sign that 2.6.x really _is_ approaching, people have > > started sending me spelling fixes. > > FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me > without spontaneous rebooting under load (kernel compile in a loop). > > I wondered if it was specific to my system here except a few other > people have reported this on *very* different hardware (I have a UP > Athlon with IDE, they have 8-way P4 with SCSI). > > Is anyone else seeing this? Might there be some bogon causing triple > faults or similar lurking that I'm just unlucky enough to hit often? I'm seeing loop-related problems around 2.5.60+... Pavel -- Casualties in World Trade Center: ~3k dead inside the building, cryptography in U.S.A. and free speech in Czech Republic. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2003-02-19 10:58 UTC | newest]
Thread overview: 18+ messages
[not found] <fa.oa9dc7e.jk65re@ifi.uio.no>
[not found] ` <fa.d672u14.1gk8ea4@ifi.uio.no>
2003-02-18 23:48 ` Linux v2.5.62 --- spontaneous reboots walt
[not found] <fa.du861p4.qi0a2o@ifi.uio.no>
[not found] ` <fa.m7uie32.15048ou@ifi.uio.no>
2003-02-18 13:07 ` Ed Tomlinson
2003-02-17 23:18 Linux v2.5.62 Linus Torvalds
2003-02-18 0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
2003-02-18 0:44 ` Jeff Garzik
2003-02-18 0:46 ` Chris Wedgwood
2003-02-18 1:42 ` Linus Torvalds
2003-02-18 1:53 ` Chris Wedgwood
2003-02-18 2:02 ` Linus Torvalds
2003-02-18 2:16 ` Chris Wedgwood
2003-02-18 2:33 ` Linus Torvalds
2003-02-18 3:21 ` Martin J. Bligh
2003-02-19 11:02 ` David Ford
2003-02-18 21:44 ` Chris Wedgwood
2003-02-18 21:59 ` Chris Wedgwood
2003-02-18 22:13 ` Linus Torvalds
2003-02-18 22:34 ` Linus Torvalds
2003-02-18 23:01 ` Chris Wedgwood
2003-02-18 12:13 ` Pavel Machek