Linux v2.5.62

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* Linux v2.5.62
@ 2003-02-17 23:18 Linus Torvalds
  2003-02-18  0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
  2003-02-19 10:53 ` Linux v2.5.62 David Ford
  0 siblings, 2 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-17 23:18 UTC (permalink / raw)
  To: Kernel Mailing List


Hmm.. Mostly lots of small updates, although the merge with Andrew
included the RCU dcache patches from IBM that he has carried along for a
while (ie fairly fundamnetal, but also very well tested).

ARM, PPC, PPC64, alpha, kbuild.

Oh, and as a sign that 2.6.x really _is_ approaching, people have started 
sending me spelling fixes. Kernel coders are apparently all atrocious 
spellers, and for some reason the spelling police always comes out of the 
woodwork when stable releases get closer.

		Linus

---

Summary of changes from v2.5.61 to v2.5.62
============================================

<d.mueller@elsoft.ch>:
  o PPC32: Export additional symbols for CONFIG_4xx

<tinglett@vnet.ibm.com>:
  o ppc64: revised machine check exception handler
  o ppc64: new scanlog interface

Adrian Bunk <bunk@fs.tum.de>:
  o [netdrvr] make CONFIG_MII one-line desc more pretty

Alan Cox <alan@lxorguk.ukuu.org.uk>:
  o Add printk levels to mtrr, also clarify
  o merge the NEC98 parsing code
  o make the io-apic printk generate less junk mail
  o printk levels for mpparse
  o remove bogowarning
  o itanic people cant spell either
  o nor PPC people ;)
  o specialix fix from 2.4 missing in 2.5
  o bring 2.5 arcnet into line with 2.4
  o Fix aha1542
  o mca 53c9x also needs mca-legacy
  o another ia64 typo
  o header update for arcnet updates (again to match 2.4)

Andrew Morton <akpm@digeo.com>:
  o ppc64: kill ppc64 unused var warning
  o ppc64: fix warning in smp_prepare_cpus
  o JFS build fix with gcc-2.95.3
  o flush_tlb_all is not preempt safe
  o move fault_in_pages_readable/writeable to header
  o separate checks from generic_file_aio_write
  o fix ext3 BUG due to race with truncate
  o crc32 improvements
  o dcache_rcu: revert fast_walk code
  o dcache_rcu
  o error checking in ext3 xattr code
  o xattr: listxattr fix
  o xattr: infrastructure for permission overrides
  o xattr: allow kernel code to override EA permissions
  o xattr: trusted extended attributes
  o blk_congestion_wait tuning and lockup fix
  o cciss driver update
  o cciss, fix array bounds overrun
  o direct-io return value fix
  o direct-io: allow reading of the part-filled EOF block
  o Fix ext3 build when EXT3_DEBUG is defined
  o Make the world safe for -Wundef
  o fix compile breakage on drivers/scsi/NCR53C9x.c
  o Use table lookup for radix_tree_maxindex()
  o elv_former_request reversion

Andries E. Brouwer <andries.brouwer@cwi.nl>:
  o add static, fix typo

Anton Blanchard <anton@samba.org>:
  o ppc64: add TCSBRKP
  o ppc64: Remove sys32_mremap, not required on ppc64 since we alter
    TASK_SIZE
  o ppc64: fix compile warnings
  o ppc64: clean up some of big bad sys_ppc32.c
  o ppc64: always compile in 32bit ELF support
  o ppc64: Never call event-scan faster than once per second, required
    on some machines
  o ppc64: dont attempt a traceback table lookup for userspace
    addresses
  o ppc64: warning fix, caused by me
  o ppc64: use get_user in alignment exception handler
  o ppc64: ptrace signal fix
  o ppc64: make sure socketcall_table is 8 byte aligned
  o ppc64: add set_tid_address and fadvise64
  o disable printout of interrupts in /proc/stat on ppc64
  o enable OFFB on ppc64
  o remove stale comment
  o compat futex fix

Art Haas <ahaas@airmail.net>:
  o C99 initializers for drivers/net/aironet4500_proc.c
  o C99 initializers for drivers/char/rtc.c
  o C99 initializers for drivers/cdrom/cdrom.c
  o C99 initializers for drivers/net/arlan-proc.c

Ben Collins <bcollins@debian.org>:
  o IEEE-1394 Updates

Brian Gerst <bgerst@didntduck.org>:
  o remove .mod.c files in make clean

Daniel Jacobowitz <drow@nevyn.them.org>:
  o Clean up ptrace_setoptions and PT_* constants
  o Set ptrace_message before PT_TRACE_EXIT

Dave Kleikamp <shaggy@shaggy.austin.ibm.com>:
  o JFS: Fix jfs_sync_fs

Dominik Brodowski <linux@brodo.de>:
  o pcmcia: add device_class pcmcia_socket, update devices & drivers
  o pcmcia: use device_class->add_device/remove_device
  o cpufreq: move frequency table helpers to extra module
  o cpufreq: move /proc/cpufreq interface code to extra module
  o cpufreq: fix compilation of ACPI if !CPU_FREQ
  o pcmcia: small bugfix & cleanup

François Romieu <romieu@fr.zoreil.com>:
  o [netdrvr rrunner] small fixes and cleanups

Jaroslav Kysela <perex@suse.cz>:
  o ALSA update

Jeff Wiedemeier <jeff.wiedemeier@hp.com>:
  o alpha numa setup_memory leaves meaningless {min,max}_low_pfn
  o delay marvel agp printk until after !hose check

Jens Axboe <axboe@suse.de>:
  o deadline ioscheduler bug fixes
  o fix request-to-request front merging
  o missing lock in get_request_wait()
  o front merge fix (really!)

Kai Germaschewski <kai@tp1.ruhr-uni-bochum.de>:
  o kbuild: Always postprocess modules
  o kbuild: Move the version magic generation into module
    postprocessing
  o kbuild: Use list of modules for "make modules_install"
  o kbuild: Do module post processing in C
  o kbuild: Add dependency info to modules
  o kbuild: Add dependency info to modules
  o kbuild: Figure endianness / word size at compile time
  o kbuild: Merge file2alias into scripts/modpost.c
  o kbuild: Rename some module postprocessing stuff
  o kbuild: scripts/elfconfig.h is generated
  o kbuild: Warn on undefined exported symbols
  o kbuild: Fix modules_install w/o modules error
  o kbuild: Fix a 64-bit issue in scripts/modpost.c
  o kbuild: Fix a "make -j" bug

Linus Torvalds <torvalds@home.transmeta.com>:
  o Fix futex compile breakage introduced by the compat code
  o Clean up and fix locking around signal rendering
  o Do proper signal locking for the old-style /proc/stat too
  o It's usually considered stupid to lock the same spinlock twice in
    close succession. However, for this once we'll just call it
    "inspired".
  o Fix locking for "send_sig_info()", to avoid possible races with
    signal state changes due to execve() and exit(). We need to hold
    the tasklist lock to guarantee stability of "task->sighand".

Marc Zyngier <mzyngier@freesurf.fr>:
  o EISA/sysfs updates

Matthew Wilcox <willy@debian.org>:
  o Fix mandatory locking

Paul Mackerras <paulus@samba.org>:
  o PPC32: Changes to accommodate recent signal changes
    (current->sighand)
  o PPC32: Fix compile warnings in some programs used in the build
    process
  o PPC32: Add set_tid_address and fadvise64 system calls
  o PPC32: declare pm_power_off
  o PPC32: use ptrace_notify

Randy Dunlap <rddunlap@osdl.org>:
  o fix Documentation/cli-sti-removal.txt thinko

Richard Henderson <rth@are.twiddle.net>:
  o [ALPHA] Add missing sighand bits
  o [ALPHA] Add isa_eth_io_copy_and_sum
  o [ALPHA] Add fadvise64

Rob Weryk <rjweryk@uwo.ca>:
  o Fix small typo

Robert Love <rml@tech9.net>:
  o trivial: unused var in sunrpc

Roger Luethi <rl@hellgate.ch>:
  o [netdrvr via-rhine] trivial bits
  o [netdrvr via-rhine] fix broken tx-underrun handling
  o [netdrvr via-rhine] various duplex-related fixes
  o [netdrvr via-rhine] reset function rewrite
  o [netdrvr via-rhine] bump version, use constant instead of magic
    number
  o Fix 8139too device close

Russell King <rmk@flint.arm.linux.org.uk>:
  o [ARM] Fix resource initialisation for IOP310
  o [ARM] Miscellaneous cleanups
  o [ARM] Reduce scope of "safe_buffers"
  o [ARM PATCH] 1372/1: EPXA10DB: Add missing include files to irq.c
    for 2.5.59
  o [ARM PATCH] 1373/1: EPXA10DB: Update def-config file
  o [ARM PATCH] 1376/1: Use #defines for iq80310 serial port
  o [ARM PATCH] 1377/1: Retain endianess state on XScale CPUs during
    boot
  o [ARM PATCH] 1368/1: Fix some typos in proc-armv/system.h
  o [ARM] Better handling of bad IRQ implementations
  o [ARM PATCH] 1380/1: Big-Endian support for jiffies
  o [ARM] Add init_sighand for 2.5.60
  o [ARM] Ensure backtrace terminates on corrupted frame pointers
  o [ARM] Update Acorn SCSI drivers
  o [ARM] Update wdt285 and wdt977 watchdog drivers
  o [ARM] Add input_devclass support to SA1111 PS/2 port driver
  o [ARM PATCH] 1099/4: trizeps MTD support
  o [ARM] Update signal handling for ARM

Rusty Russell <rusty@rustcorp.com.au>:
  o kbuild: Module alias and device table support
  o kbuild: Do modversions checks on module structure
  o get rid of exec_usermodehelper, replace with call_usermodehelper
  o kbuild: Fix non-verbose make modules_install output

Sam Ravnborg <sam@ravnborg.org>:
  o fix warning in kernel/dma.c
  o char/drivers/random.c - fix warning

Scott Anderson <scott_anderson@mvista.com>:
  o PPC32: Invalidate the icache before use on PPC40x

Stephen Rothwell <sfr@canb.auug.org.au>:
  o compat_sys_futex 1/3 generic, parisc, ppc64, s390x and x86_64

Steve French <stevef@smfhome1.austin.rr.com>:
  o Merge in fixes from version 0.6.5 of the CIFS VFS.  Greatly
    improved performance including improved distributed caching support
    and support for readpages and larger read sizes. Cache data now
    flushed properly at file close time. Socket and memory leak fixed. 
    Fix two oops. Fix error logging and made more consistent.  Generic
    sendfile added

Steven Cole <elenstev@mesatop.com>:
  o [tokenring proteon] trivial, spelling fix
  o high pedantry in ppc spelling
  o alpha typo fix
  o 2.5.61 fix erroneous spellings of error
  o 2.5.61 Reduce the number of "nuber" by four
  o 2.5.61 fix spelling of necessary in 11 files
  o fix different spellings of different and differences
  o correct the spelling of correction and correctly
  o more accurate spelling of accuracy
  o yet more pedantry: complement vs compliment

Tom Rini <trini@kernel.crashing.org>:
  o PPC32: Fix some license drain bamage.  Noticed by Christoph Hellwig



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Linux v2.5.62 --- spontaneous reboots
  2003-02-17 23:18 Linux v2.5.62 Linus Torvalds
@ 2003-02-18  0:03 ` Chris Wedgwood
  2003-02-18  0:44   ` Jeff Garzik
                     ` (2 more replies)
  2003-02-19 10:53 ` Linux v2.5.62 David Ford
  1 sibling, 3 replies; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-18  0:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

On Mon, Feb 17, 2003 at 03:18:43PM -0800, Linus Torvalds wrote:

> Oh, and as a sign that 2.6.x really _is_ approaching, people have
> started sending me spelling fixes.

FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me
without spontaneous rebooting under load (kernel compile in a loop).

I wondered if it was specific to my system here except a few other
people have reported this on *very* different hardware (I'm have UP
Athlon with IDE, they have 8-way P4 with SCSI).

Is anyone else seeing this?  Might there be some bogon causing triple
faults or similar lurking that I'm just unlucky enough to hit often?

I note the 2.5.59-mjb4 seems pretty reliable and doesn't have this
problem...

  --cw

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
@ 2003-02-18  0:44   ` Jeff Garzik
  2003-02-18  0:46     ` Chris Wedgwood
  2003-02-18  1:42   ` Linus Torvalds
  2003-02-18 12:13   ` Linux v2.5.62 --- spontaneous reboots Pavel Machek
  2 siblings, 1 reply; 72+ messages in thread
From: Jeff Garzik @ 2003-02-18  0:44 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Linus Torvalds, Kernel Mailing List

Chris Wedgwood wrote:
> On Mon, Feb 17, 2003 at 03:18:43PM -0800, Linus Torvalds wrote:
> 
> 
>>Oh, and as a sign that 2.6.x really _is_ approaching, people have
>>started sending me spelling fixes.
> 
> 
> FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me
> without spontaneous rebooting under load (kernel compile in a loop).


ACPI, or no?

highmem, or no?

Are you running your UP Athlon with CONFIG_X86_UP_APIC?

	Jeff




^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  0:44   ` Jeff Garzik
@ 2003-02-18  0:46     ` Chris Wedgwood
  0 siblings, 0 replies; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-18  0:46 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linus Torvalds, Kernel Mailing List

On Mon, Feb 17, 2003 at 07:44:08PM -0500, Jeff Garzik wrote:

> ACPI, or no?

nope

> highmem, or no?

no for me --- yes for them I assume (8-way P4)

> Are you running your UP Athlon with CONFIG_X86_UP_APIC?

I was... I wondered if that might do it, so I tried without.  Still
reboots.  Built kernel as 486 kernel with no IO-APIC too, still
reboots.

Nothing is logged (serial console).

Tried gcc-2.95 and gcc-3.2.



  --cw

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
  2003-02-18  0:44   ` Jeff Garzik
@ 2003-02-18  1:42   ` Linus Torvalds
  2003-02-18  1:53     ` Chris Wedgwood
  2003-02-18 21:44     ` Chris Wedgwood
  2003-02-18 12:13   ` Linux v2.5.62 --- spontaneous reboots Pavel Machek
  2 siblings, 2 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-18  1:42 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Kernel Mailing List


On Mon, 17 Feb 2003, Chris Wedgwood wrote:
> 
> FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me
> without spontaneous rebooting under load (kernel compile in a loop).
> 
> I note the 2.5.59-mjb4 seems pretty reliable and doesn't have this
> problem...

It would be interesting to hear exactly when the trouble started. And if
plain 2.5.59 does it (which is unclear from your description), but 59-mjb4
doesn't, then that's an interesting data point.

		Linus


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  1:42   ` Linus Torvalds
@ 2003-02-18  1:53     ` Chris Wedgwood
  2003-02-18  2:02       ` Linus Torvalds
  2003-02-18 21:44     ` Chris Wedgwood
  1 sibling, 1 reply; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-18  1:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

On Mon, Feb 17, 2003 at 05:42:38PM -0800, Linus Torvalds wrote:

> It would be interesting to hear exactly when the trouble
> started. And if plain 2.5.59 does it (which is unclear from your
> description), but 59-mjb4 doesn't, then that's an interesting data
> point.

  plain 2.5.59 does

  59-mjb4 does NOT

I tested 59-mjb4 at the suggest of mbligh after hearing that other
people had discovered the same bug and were now using 59-mjb4


  --cw


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  1:53     ` Chris Wedgwood
@ 2003-02-18  2:02       ` Linus Torvalds
  2003-02-18  2:16         ` Chris Wedgwood
                           ` (2 more replies)
  0 siblings, 3 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-18  2:02 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, 17 Feb 2003, Chris Wedgwood wrote:
> 
>   plain 2.5.59 does
> 
>   59-mjb4 does NOT

Can you check mjb 1-3 too? The better it gets pinpointed, the easier it's 
going to be to find.

Also, if you can figure out _which_ part of the patch makes a difference,
that would obviously be even better.  Part of the stuff in mjb is already
merged in later kernels (ie things like using sequence locks for xtime is
already there in 2.5.60, so clearly that doesn't seem to be the thing that
helps your situation).

Martin cc'd, in case he has suggestions on how/what to split up the patch.

Do you use the starfire driver? That's a big part of the patch, for
example.. And part of the patch just makes the timer interrupt happen much
less often, if you havn't configured for 1000Hz - and it may well be that
small perturbations like that are the things that matter to you.

		Linus

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  2:02       ` Linus Torvalds
@ 2003-02-18  2:16         ` Chris Wedgwood
  2003-02-18  2:33           ` Linus Torvalds
  2003-02-18  3:21         ` Martin J. Bligh
  2003-02-19 11:02         ` David Ford
  2 siblings, 1 reply; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-18  2:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, Feb 17, 2003 at 06:02:03PM -0800, Linus Torvalds wrote:

> Can you check mjb 1-3 too? The better it gets pinpointed, the easier
> it's going to be to find.

Sure... I'll test them later on.

> Also, if you can figure out _which_ part of the patch makes a
> difference, that would obviously be even better.

I'll try to narrow this down.

> Part of the stuff in mjb is already merged in later kernels (ie
> things like using sequence locks for xtime is already there in
> 2.5.60, so clearly that doesn't seem to be the thing that helps your
> situation).

I don't think it's anything really obvious.  If the problem I'm seeing
is the same as the one showing up on *some* IBM NUMA-Q (or whatever
they are) boxen then it's probably not a driver or fs thing --- as we
have nothing in common.

Now... it could be two different problems, except the same kernel
which the IBM people found works for them also works for me.

Oddly, wli has not seen this problem and he's using similar hardware
(I think) to the other IBM people and the same compiler as me.

> Do you use the starfire driver?

Nope.

A stripped down kernel, compile for a 486 with no IO-APIC support (in
an attempt to slow things down and hopefully avoid possible hardware
problems such as overheating) still reboots on me.

The only thing I can think of is a triple-fault...  I'm wondering
about using gcc-3.2 instead of 2.95.4 (Debian blah blort blem) on the
off chance it's a weird compiler problem.

  --cw

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  2:16         ` Chris Wedgwood
@ 2003-02-18  2:33           ` Linus Torvalds
  0 siblings, 0 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-18  2:33 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, 17 Feb 2003, Chris Wedgwood wrote:
> 
> The only thing I can think of is a triple-fault...  I'm wondering
> about using gcc-3.2 instead of 2.95.4 (Debian blah blort blem) on the
> off chance it's a weird compiler problem.

A lot of people seem to be using gcc-3.2 these days, since it's what RH-8 
comes with as standard. I don't think there are any _known_ problems with 
that compiler, at least on x86.

Now, interestingly enough, the mjb patch _does_ contain a change to 
mm/memory.c that really makes no sense _except_ in the case of a compiler 
bug. So you could check whether that (small) mm/memory.c patch is the 
thing that makes a difference for you..

It would also be interesting to see if you can check just the scheduler 
part of the mjb patch. On the whole the mjb patch looks like it should be 
fairly easy to cut into specific parts, and Martin may actually have it 
somewhere as separate patches.

		Linus

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  2:02       ` Linus Torvalds
  2003-02-18  2:16         ` Chris Wedgwood
@ 2003-02-18  3:21         ` Martin J. Bligh
  2003-02-19 11:02         ` David Ford
  2 siblings, 0 replies; 72+ messages in thread
From: Martin J. Bligh @ 2003-02-18  3:21 UTC (permalink / raw)
  To: Linus Torvalds, Chris Wedgwood; +Cc: Kernel Mailing List

>>   plain 2.5.59 does
>> 
>>   59-mjb4 does NOT
> 
> Can you check mjb 1-3 too? The better it gets pinpointed, the easier it's 
> going to be to find.

I should note that our performance team also has triple-faults on some 
database app on a 8x machine ... that goes away with mjb4, not sure why 
as yet. There's nothing in there that I can think of that would fix
a triple fault, so it may well be something annoyingly subtle.

Try -mjb1 first, if that still fixes it, then I'll start hacking off 
chunks for you to test. Try 62 as well ... that has dcache_rcu merged,
which is another major chunk of the patch. kgdb is also big, and may 
well change timings ...
 
> Also, if you can figure out _which_ part of the patch makes a difference,
> that would obviously be even better.  Part of the stuff in mjb is already
> merged in later kernels (ie things like using sequence locks for xtime is
> already there in 2.5.60, so clearly that doesn't seem to be the thing that
> helps your situation).

Yup, a lot of it is designed to give our performance team a stable base
to work from - so minimal changes to a 59 base.

I use gcc-2.95.4 (Debian) as Chris does and have found that extremely 
stable, not sure what the perf team were using, I'll find out.

> Now, interestingly enough, the mjb patch _does_ contain a change to 
> mm/memory.c that really makes no sense _except_ in the case of a compiler 
> bug. So you could check whether that (small) mm/memory.c patch is the 
> thing that makes a difference for you..

That's the config_page_offset patch, which Dave ported forward from 
Andrea's tree ... I've split that out below:

diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/arch/i386/Kconfig 22-config_page_offset/arch/i386/Kconfig
--- 21-config_hz/arch/i386/Kconfig	Wed Feb  5 22:22:59 2003
+++ 22-config_page_offset/arch/i386/Kconfig	Wed Feb  5 22:23:00 2003
@@ -660,6 +660,44 @@ config HIGHMEM64G
 
 endchoice
 
+choice
+	help
+	  On i386, a process can only virtually address 4GB of memory.  This
+	  lets you select how much of that virtual space you would like to 
+	  devoted to userspace, and how much to the kernel.
+
+	  Some userspace programs would like to address as much as possible and 
+	  have few demands of the kernel other than it get out of the way.  These
+	  users may opt to use the 3.5GB option to give their userspace program 
+	  as much room as possible.  Due to alignment issues imposed by PAE, 
+	  the "3.5GB" option is unavailable if "64GB" high memory support is 
+	  enabled.
+
+	  Other users (especially those who use PAE) may be running out of
+	  ZONE_NORMAL memory.  Those users may benefit from increasing the
+	  kernel's virtual address space size by taking it away from userspace, 
+	  which may not need all of its space.  An indicator that this is 
+	  happening is when /proc/Meminfo's "LowFree:" is a small percentage of
+	  "LowTotal:" while "HighFree:" is very large.
+
+	  If unsure, say "3GB"
+	prompt "User address space size"
+        default 1GB
+	
+config	05GB
+	bool "3.5 GB"
+	depends on !HIGHMEM64G
+	
+config	1GB
+	bool "3 GB"
+	
+config	2GB
+	bool "2 GB"
+	
+config	3GB
+	bool "1 GB"
+endchoice
+
 config HIGHMEM
 	bool
 	depends on HIGHMEM64G || HIGHMEM4G
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/arch/i386/Makefile 22-config_page_offset/arch/i386/Makefile
--- 21-config_hz/arch/i386/Makefile	Fri Jan 17 09:18:19 2003
+++ 22-config_page_offset/arch/i386/Makefile	Wed Feb  5 22:23:00 2003
@@ -89,6 +89,7 @@ drivers-$(CONFIG_OPROFILE)		+= arch/i386
 
 CFLAGS += $(mflags-y)
 AFLAGS += $(mflags-y)
+AFLAGS_vmlinux.lds.o += -imacros $(TOPDIR)/include/asm-i386/page.h
 
 boot := arch/i386/boot
 
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/arch/i386/vmlinux.lds.S 22-config_page_offset/arch/i386/vmlinux.lds.S
--- 21-config_hz/arch/i386/vmlinux.lds.S	Fri Jan 17 09:18:20 2003
+++ 22-config_page_offset/arch/i386/vmlinux.lds.S	Wed Feb  5 22:23:00 2003
@@ -10,7 +10,7 @@ ENTRY(_start)
 jiffies = jiffies_64;
 SECTIONS
 {
-  . = 0xC0000000 + 0x100000;
+  . = __PAGE_OFFSET + 0x100000;
   /* read-only */
   _text = .;			/* Text and read-only data */
   .text : {
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/include/asm-i386/page.h 22-config_page_offset/include/asm-i386/page.h
--- 21-config_hz/include/asm-i386/page.h	Tue Jan 14 10:06:18 2003
+++ 22-config_page_offset/include/asm-i386/page.h	Wed Feb  5 22:23:00 2003
@@ -89,7 +89,16 @@ typedef struct { unsigned long pgprot; }
  * and CONFIG_HIGHMEM64G options in the kernel configuration.
  */
 
-#define __PAGE_OFFSET		(0xC0000000)
+#include <linux/config.h>
+#ifdef CONFIG_05GB
+#define __PAGE_OFFSET          (0xE0000000)
+#elif defined(CONFIG_1GB)
+#define __PAGE_OFFSET          (0xC0000000)
+#elif defined(CONFIG_2GB)
+#define __PAGE_OFFSET          (0x80000000)
+#elif defined(CONFIG_3GB)
+#define __PAGE_OFFSET          (0x40000000)
+#endif
 
 /*
  * This much address space is reserved for vmalloc() and iomap()
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/include/asm-i386/processor.h 22-config_page_offset/include/asm-i386/processor.h
--- 21-config_hz/include/asm-i386/processor.h	Thu Jan  2 22:05:15 2003
+++ 22-config_page_offset/include/asm-i386/processor.h	Wed Feb  5 22:23:00 2003
@@ -279,7 +279,11 @@ extern unsigned int mca_pentium_flag;
 /* This decides where the kernel will search for a free chunk of vm
  * space during mmap's.
  */
+#ifdef CONFIG_05GB
+#define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 16))
+#else
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 3))
+#endif
 
 /*
  * Size of io_bitmap in longwords: 32 is ports 0-0x3ff.
diff -urpN -X /home/fletch/.diff.exclude 21-config_hz/mm/memory.c 22-config_page_offset/mm/memory.c
--- 21-config_hz/mm/memory.c	Mon Jan 13 21:09:28 2003
+++ 22-config_page_offset/mm/memory.c	Wed Feb  5 22:23:00 2003
@@ -101,8 +101,7 @@ static inline void free_one_pmd(struct m
 
 static inline void free_one_pgd(struct mmu_gather *tlb, pgd_t * dir)
 {
-	int j;
-	pmd_t * pmd;
+	pmd_t * pmd, * md, * emd;
 
 	if (pgd_none(*dir))
 		return;
@@ -113,8 +112,21 @@ static inline void free_one_pgd(struct m
 	}
 	pmd = pmd_offset(dir, 0);
 	pgd_clear(dir);
-	for (j = 0; j < PTRS_PER_PMD ; j++)
-		free_one_pmd(tlb, pmd+j);
+	/*
+	 * Beware if changing the loop below.  It once used int j,
+	 * 	for (j = 0; j < PTRS_PER_PMD; j++)
+	 * 		free_one_pmd(pmd+j);
+	 * but some older i386 compilers (e.g. egcs-2.91.66, gcc-2.95.3)
+	 * terminated the loop with a _signed_ address comparison
+	 * using "jle", when configured for HIGHMEM64GB (X86_PAE).
+	 * If also configured for 3GB of kernel virtual address space,
+	 * if page at physical 0x3ffff000 virtual 0x7ffff000 is used as
+	 * a pmd, when that mm exits the loop goes on to free "entries"
+	 * found at 0x80000000 onwards.  The loop below compiles instead
+	 * to be terminated by unsigned address comparison using "jb".
+	 */
+	for (md = pmd, emd = pmd + PTRS_PER_PMD; md < emd; md++)
+		free_one_pmd(tlb,md);
 	pmd_free_tlb(tlb, pmd);
 }
 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  2:02       ` Linus Torvalds
  2003-02-18  2:16         ` Chris Wedgwood
  2003-02-18  3:21         ` Martin J. Bligh
@ 2003-02-19 11:02         ` David Ford
  2 siblings, 0 replies; 72+ messages in thread
From: David Ford @ 2003-02-19 11:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Chris Wedgwood, Kernel Mailing List, Martin J. Bligh

I have a 2.5.58 box that's a simple firewall/router w/ iptables running 
on it.  It crashes and reboots automatically roughly every other day.  
It's been doing that for a  long time and I never had the time to debug 
it.  I'll put .62 on it with a serial console and see what it comes up 
with.  It runs two PPPoE channels over ethX.  PPPoE is known to blow up 
(OOPS) on pppd hangup/restarts.

David

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  1:42   ` Linus Torvalds
  2003-02-18  1:53     ` Chris Wedgwood
@ 2003-02-18 21:44     ` Chris Wedgwood
  2003-02-18 21:59       ` Chris Wedgwood
  1 sibling, 1 reply; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-18 21:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh

On Mon, Feb 17, 2003 at 05:42:38PM -0800, Linus Torvalds wrote:

> It would be interesting to hear exactly when the trouble
> started. And if plain 2.5.59 does it (which is unclear from your
> description), but 59-mjb4 doesn't, then that's an interesting data
> point.

After much testing, which is still in progress it would seem that
*maybe* mjb4 does have the problem too, although it's much harder to
hit.  Please note that this is a single data point where for other
kernels I have two or more occurrences of spontaneous reboots.

I've been checking older kernels...  it would seem the problem first
occurs in 2.5.53 (that is 2.5.53 through 2.5.62-bk all reboot for me).
2.5.51 doesn't appear to and thus far neither does 2.5.52.

I say thus far, because the problem usually appears after about 15
minutes of compiling, but it sometimes takes a little longer.  I'm
running 2.5.52 now and after 45 minutes it's still going.

As to what difference it might be between '52 and '53 I have no idea.
I had a quick look and the changes there are considerable.

I've tried different compiles, with and without preempt, and and
without IO-APIC and trimming down the kernel...

  --cw

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18 21:44     ` Chris Wedgwood
@ 2003-02-18 21:59       ` Chris Wedgwood
  2003-02-18 22:13         ` Linus Torvalds
  0 siblings, 1 reply; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-18 21:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh

On Tue, Feb 18, 2003 at 01:44:31PM -0800, Chris Wedgwood wrote:

> I say thus far, because the problem usually appears after about 15
> minutes of compiling, but it sometimes takes a little longer.  I'm
> running 2.5.52 now and after 45 minutes it's still going.

Of course, Murphy being the optimist he is; about two minutes after I
make a claim that 2.5.52 does NOT spontaneously reboot --- it *DOES*.

I'm back to 2.5.51 and I'll beat it hard and see what happens.  I
guess until I (or someone else who sees this) can get some concrete
data points you'll have to ignore this.

  --cw

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18 21:59       ` Chris Wedgwood
@ 2003-02-18 22:13         ` Linus Torvalds
  2003-02-18 22:34           ` Linus Torvalds
  2003-02-18 23:01           ` Chris Wedgwood
  0 siblings, 2 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-18 22:13 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh

On Tue, 18 Feb 2003, Chris Wedgwood wrote:
> 
> Of course, Murphy being the optimist he is; about two minutes after I
> make a claim that 2.5.52 does NOT spontaneously reboot --- it *DOES*.
> 
> I'm back to 2.5.51 and I'll beat it hard and see what happens.  I
> guess until I (or someone else who sees this) can get some concrete
> data points you'll have to ignore this.

Ok. Especially if it seems that -mjb4 also potentially does it (just
harder to trigger), I don't see many other alternatives than just going
back in time to see when it started.

But if it was getting hard to trigger with 2.5.52 too, things might be
getting hairier and hairier.. If it becomes hard enough to trigger as to
be practically nondeterministic, a better approach might be to just go
back to -mjb4, and even if it is still there in -mjb4 try to see which
part of the patch seems to be making it more stable. That might give us
more clues, and it's a much smaller problem set than going arbitrarily far
back in the 2.5.x series.

		Linus

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18 22:13         ` Linus Torvalds
@ 2003-02-18 22:34           ` Linus Torvalds
  2003-02-18 23:01           ` Chris Wedgwood
  1 sibling, 0 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-18 22:34 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh

On Tue, 18 Feb 2003, Linus Torvalds wrote:
> 
> But if it was getting hard to trigger with 2.5.52 too, things might be
> getting hairier and hairier.. If it becomes hard enough to trigger as to
> be practically nondeterministic, a better approach might be to just go
> back to -mjb4, and even if it is still there in -mjb4 try to see which
> part of the patch seems to be making it more stable.

Btw, this is particularly true if it takes you potentially hours to test 
something like 2.5.51 for stability, but you can reboot 2.5.59 at will in 
ten minutes. 

In that case, you can test several vrsions of "2.5.59 + partial -mjb
patches" much more quickly than you can walk backwards in 2.5.x, and try 
to pinpoint the "this part of -mjb makes it much less likely to reboot".

Also, with the -mjb patch there are some new configuration options. For 
example, CONFIG_100HZ on -mjb has very different behaviour than a plain 
2.5.59 kernel that defaults to 1kHz timer clock, and maybe the reason -mjb 
seems more stable is that you may have selected a configuration option 
that made -mjb act differently.

Regardless, it would be very interesting to hear what the -mjb split-down
results would be. Even if the answer might be "at 1kHz timer it is
unstable, at 100Hz it is stable" (and if that were to be it, then you'd
have to walk backwards to 2.5.24 to find the old 2.5.x kernel that had a
slow tick rate).

		Linus

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18 22:13         ` Linus Torvalds
  2003-02-18 22:34           ` Linus Torvalds
@ 2003-02-18 23:01           ` Chris Wedgwood
  2003-02-19 23:35             ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Linus Torvalds
  1 sibling, 1 reply; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-18 23:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List, Martin J. Bligh

On Tue, Feb 18, 2003 at 02:13:00PM -0800, Linus Torvalds wrote:

> > I'm back to 2.5.51 and I'll beat it hard and see what happens.  I
> > guess until I (or someone else who sees this) can get some
> > concrete data points you'll have to ignore this.
>
> Ok. Especially if it seems that -mjb4 also potentially does it (just
> harder to trigger), I don't see many other alternatives than just
> going back in time to see when it started.

It seems 2.5.51 *does* also show this... but it took nearly an hour
this time.

> But if it was getting hard to trigger with 2.5.52 too, things might
> be getting hairier and hairier... If it becomes hard enough to
> trigger as to be practically nondeterministic, a better approach
> might be to just go back to -mjb4, and even if it is still there in
> -mjb4 try to see which part of the patch seems to be making it more
> stable.

I may have to do that...  it seems older kernel do have this problem,
it's just harder to hit for some reason.

I'd suspect it was an Athlon or chipset problem if it weren't for the
fact 2.4.x is stable for 8+ hours doing doing the same exact thing[1].

> That might give us more clues, and it's a much smaller problem set
> than going arbitrarily far back in the 2.5.x series.

Sure thing.


  --cw

^ permalink raw reply	[flat|nested] 72+ messages in thread

* doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-18 23:01           ` Chris Wedgwood
@ 2003-02-19 23:35             ` Linus Torvalds
  2003-02-20  2:22               ` Zwane Mwaikambo
  0 siblings, 1 reply; 72+ messages in thread
From: Linus Torvalds @ 2003-02-19 23:35 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Kernel Mailing List, Martin J. Bligh, Ingo Molnar


Ok, I wrote up this doublefault task-gate handler which has gotten some
very very minimal testing, and which is probably totally buggered on SMP
machines etc, but which has caught at least one double-fault on one of my
test-machines (which I forced to double-fault by making %esp contain an
invalid value in kernel mode).

If the reboot is due to a triple-fault, this may give out some debugging 
information and then lock up hard instead of rebooting.

Change the "ptr_ok()" to match your hardware (or just make it do

	#define ptr_ok(x) (1)

since I only really wrote it that way due to debugging the damn thing).

Anyway, this patch should apply pretty directly on top of 2.5.62, and if 
you run UP it might even work. So apply this, and try to crash the 
machine, and see if it spits out any interesting information.

NOTE NOTE NOTE! When the double-fault happens, the machine as-is will be 
COMPLETELY DEAD! Don't try to access "current" or anything like that, 
since the stack is scrogged. That's why it gets the state by actually 
reading the current value of gdt, and following it to the TSS structure.

If this approach works, we can try to make the doublefault handling less 
prone to lock up the machine (ie kill the offending task and continuing),  
but in the meantime at least it should avoid having things like stack 
errors result in triple faults and reboots.

Improvements welcome (and boy was this a bitch to debug).

		Linus

-----
===== arch/i386/kernel/Makefile 1.35 vs edited =====
--- 1.35/arch/i386/kernel/Makefile	Tue Feb 18 18:59:01 2003
+++ edited/arch/i386/kernel/Makefile	Wed Feb 19 11:56:49 2003
@@ -6,7 +6,8 @@
 
 obj-y	:= process.o semaphore.o signal.o entry.o traps.o irq.o vm86.o \
 		ptrace.o i8259.o ioport.o ldt.o setup.o time.o sys_i386.o \
-		pci-dma.o i386_ksyms.o i387.o dmi_scan.o bootflag.o
+		pci-dma.o i386_ksyms.o i387.o dmi_scan.o bootflag.o \
+		doublefault.o
 
 obj-y				+= cpu/
 obj-y				+= timers/
===== arch/i386/kernel/head.S 1.24 vs edited =====
--- 1.24/arch/i386/kernel/head.S	Tue Feb 18 18:58:53 2003
+++ edited/arch/i386/kernel/head.S	Wed Feb 19 11:56:50 2003
@@ -476,6 +476,13 @@
 	.quad 0x00009a0000000000	/* 0xc0 APM CS 16 code (16 bit) */
 	.quad 0x0040920000000000	/* 0xc8 APM DS    data */
 
+	.quad 0x0000000000000000	/* 0xd0 - unused */
+	.quad 0x0000000000000000	/* 0xd8 - unused */
+	.quad 0x0000000000000000	/* 0xe0 - unused */
+	.quad 0x0000000000000000	/* 0xe8 - unused */
+	.quad 0x0000000000000000	/* 0xf0 - unused */
+	.quad 0x0000000000000000	/* 0xf8 - GDT entry 31: double-fault TSS */
+
 #if CONFIG_SMP
 	.fill (NR_CPUS-1)*GDT_ENTRIES,8,0 /* other CPU's GDT */
 #endif
===== arch/i386/kernel/traps.c 1.44 vs edited =====
--- 1.44/arch/i386/kernel/traps.c	Sat Feb 15 19:30:17 2003
+++ edited/arch/i386/kernel/traps.c	Wed Feb 19 11:56:50 2003
@@ -775,7 +775,7 @@
 }
 #endif
 
-#define _set_gate(gate_addr,type,dpl,addr) \
+#define _set_gate(gate_addr,type,dpl,addr,seg) \
 do { \
   int __d0, __d1; \
   __asm__ __volatile__ ("movw %%dx,%%ax\n\t" \
@@ -785,7 +785,7 @@
 	:"=m" (*((long *) (gate_addr))), \
 	 "=m" (*(1+(long *) (gate_addr))), "=&a" (__d0), "=&d" (__d1) \
 	:"i" ((short) (0x8000+(dpl<<13)+(type<<8))), \
-	 "3" ((char *) (addr)),"2" (__KERNEL_CS << 16)); \
+	 "3" ((char *) (addr)),"2" ((seg) << 16)); \
 } while (0)
 
 
@@ -797,22 +797,27 @@
  */
 void set_intr_gate(unsigned int n, void *addr)
 {
-	_set_gate(idt_table+n,14,0,addr);
+	_set_gate(idt_table+n,14,0,addr,__KERNEL_CS);
 }
 
 static void __init set_trap_gate(unsigned int n, void *addr)
 {
-	_set_gate(idt_table+n,15,0,addr);
+	_set_gate(idt_table+n,15,0,addr,__KERNEL_CS);
 }
 
 static void __init set_system_gate(unsigned int n, void *addr)
 {
-	_set_gate(idt_table+n,15,3,addr);
+	_set_gate(idt_table+n,15,3,addr,__KERNEL_CS);
 }
 
 static void __init set_call_gate(void *a, void *addr)
 {
-	_set_gate(a,12,3,addr);
+	_set_gate(a,12,3,addr,__KERNEL_CS);
+}
+
+static void __init set_task_gate(unsigned int n, unsigned int gdt_entry)
+{
+	_set_gate(idt_table+n,5,0,0,(gdt_entry<<3));
 }
 
 
@@ -843,7 +848,7 @@
 	set_system_gate(5,&bounds);
 	set_trap_gate(6,&invalid_op);
 	set_trap_gate(7,&device_not_available);
-	set_trap_gate(8,&double_fault);
+	set_task_gate(8,GDT_ENTRY_DOUBLEFAULT_TSS);
 	set_trap_gate(9,&coprocessor_segment_overrun);
 	set_trap_gate(10,&invalid_TSS);
 	set_trap_gate(11,&segment_not_present);
===== arch/i386/kernel/cpu/common.c 1.17 vs edited =====
--- 1.17/arch/i386/kernel/cpu/common.c	Sat Dec 28 09:17:17 2002
+++ edited/arch/i386/kernel/cpu/common.c	Wed Feb 19 11:56:50 2003
@@ -490,6 +490,10 @@
 	load_TR_desc();
 	load_LDT(&init_mm.context);
 
+	/* Set up doublefault TSS pointer in the GDT */
+	__set_tss_desc(cpu, GDT_ENTRY_DOUBLEFAULT_TSS, &doublefault_tss);
+	cpu_gdt_table[cpu][GDT_ENTRY_DOUBLEFAULT_TSS].b &= 0xfffffdff;
+
 	/* Clear %fs and %gs. */
 	asm volatile ("xorl %eax, %eax; movl %eax, %fs; movl %eax, %gs");
 
===== include/asm-i386/desc.h 1.12 vs edited =====
--- 1.12/include/asm-i386/desc.h	Sat Dec 28 09:18:49 2002
+++ edited/include/asm-i386/desc.h	Wed Feb 19 11:56:51 2003
@@ -42,10 +42,12 @@
 	"rorl $16,%%eax" \
 	: "=m"(*(n)) : "a" (addr), "r"(n), "ir"(limit), "i"(type))
 
-static inline void set_tss_desc(unsigned int cpu, void *addr)
+static inline void __set_tss_desc(unsigned int cpu, unsigned int entry, void *addr)
 {
-	_set_tssldt_desc(&cpu_gdt_table[cpu][GDT_ENTRY_TSS], (int)addr, 235, 0x89);
+	_set_tssldt_desc(&cpu_gdt_table[cpu][entry], (int)addr, 235, 0x89);
 }
+
+#define set_tss_desc(cpu,addr) __set_tss_desc(cpu, GDT_ENTRY_TSS, addr)
 
 static inline void set_ldt_desc(unsigned int cpu, void *addr, unsigned int size)
 {
===== include/asm-i386/processor.h 1.39 vs edited =====
--- 1.39/include/asm-i386/processor.h	Fri Feb 14 18:24:10 2003
+++ edited/include/asm-i386/processor.h	Wed Feb 19 11:56:51 2003
@@ -83,6 +83,7 @@
 extern struct cpuinfo_x86 boot_cpu_data;
 extern struct cpuinfo_x86 new_cpu_data;
 extern struct tss_struct init_tss[NR_CPUS];
+extern struct tss_struct doublefault_tss;
 
 #ifdef CONFIG_SMP
 extern struct cpuinfo_x86 cpu_data[];
===== include/asm-i386/segment.h 1.5 vs edited =====
--- 1.5/include/asm-i386/segment.h	Sat Dec 28 09:18:49 2002
+++ edited/include/asm-i386/segment.h	Wed Feb 19 11:56:52 2003
@@ -37,6 +37,13 @@
  *  23 - APM BIOS support
  *  24 - APM BIOS support
  *  25 - APM BIOS support 
+ *
+ *  26 - unused
+ *  27 - unused
+ *  28 - unused
+ *  29 - unused
+ *  30 - unused
+ *  31 - TSS for double fault handler
  */
 #define GDT_ENTRY_TLS_ENTRIES	3
 #define GDT_ENTRY_TLS_MIN	6
@@ -64,10 +71,12 @@
 #define GDT_ENTRY_PNPBIOS_BASE		(GDT_ENTRY_KERNEL_BASE + 6)
 #define GDT_ENTRY_APMBIOS_BASE		(GDT_ENTRY_KERNEL_BASE + 11)
 
+#define GDT_ENTRY_DOUBLEFAULT_TSS	31
+
 /*
- * The GDT has 25 entries but we pad it to cacheline boundary:
+ * The GDT has 32 entries
  */
-#define GDT_ENTRIES 28
+#define GDT_ENTRIES 32
 
 #define GDT_SIZE (GDT_ENTRIES * 8)
 
--- /dev/null	2002-08-30 16:31:37.000000000 -0700
+++ ./arch/i386/kernel/doublefault.c	2003-02-19 15:26:44.000000000 -0800
@@ -0,0 +1,65 @@
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/init_task.h>
+#include <linux/fs.h>
+
+#include <asm/uaccess.h>
+#include <asm/pgtable.h>
+#include <asm/desc.h>
+
+#define DOUBLEFAULT_STACKSIZE (1024)
+static unsigned long doublefault_stack[DOUBLEFAULT_STACKSIZE];
+#define STACK_START (unsigned long)(doublefault_stack+DOUBLEFAULT_STACKSIZE)
+
+#define ptr_ok(x) ((x) > 0xc0000000 && (x) < 0xc1000000)
+
+static void doublefault_fn(void)
+{
+	struct Xgt_desc_struct gdt_desc = {0, 0};
+	unsigned long gdt, tss;
+
+	__asm__ __volatile__("sgdt %0": "=m" (gdt_desc): :"memory");
+	gdt = gdt_desc.address;
+
+	printk("double fault, gdt at %08lx [%d bytes]\n", gdt, gdt_desc.size);
+
+	if (ptr_ok(gdt)) {
+		gdt += GDT_ENTRY_TSS << 3;
+		tss = *(u16 *)(gdt+2);
+		tss += *(u8 *)(gdt+4) << 16;
+		tss += *(u8 *)(gdt+7) << 24;
+		printk("double fault, tss at %08lx\n", tss);
+
+		if (ptr_ok(tss)) {
+			struct tss_struct *t = (struct tss_struct *)tss;
+
+			printk("eip = %08lx, esp = %08lx\n", t->eip, t->esp);
+
+			printk("eax = %08lx, ebx = %08lx, ecx = %08lx, edx = %08lx\n",
+				t->eax, t->ebx, t->ecx, t->edx);
+			printk("esi = %08lx, edi = %08lx\n",
+				t->esi, t->edi);
+		}
+	}
+
+	for (;;) /* nothing */;
+}
+
+struct tss_struct doublefault_tss __cacheline_aligned = {
+	.esp0		= STACK_START,
+	.ss0		= __KERNEL_DS,
+	.ldt		= 0,
+	.bitmap		= INVALID_IO_BITMAP_OFFSET,
+	.io_bitmap	= { [0 ... IO_BITMAP_SIZE ] = ~0 },
+
+	.eip		= (unsigned long) doublefault_fn,
+	.eflags		= 0x00000082,
+	.esp		= STACK_START,
+	.es		= __USER_DS,
+	.cs		= __KERNEL_CS,
+	.ss		= __KERNEL_DS,
+	.ds		= __USER_DS,
+
+	.__cr3		= __pa(swapper_pg_dir)
+};


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-19 23:35             ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Linus Torvalds
@ 2003-02-20  2:22               ` Zwane Mwaikambo
  2003-02-20  2:26                 ` William Lee Irwin III
  2003-02-20  4:52                 ` Linus Torvalds
  0 siblings, 2 replies; 72+ messages in thread
From: Zwane Mwaikambo @ 2003-02-20  2:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chris Wedgwood, Kernel Mailing List, Martin J. Bligh, Ingo Molnar,
	William Lee Irwin III

Thanks!
	Here is a triple fault case (2.5.62-pgcl) and since i'm not a Real 
Man i had to use a simulator ;) Unfortunately i can't unwind the stack.

Freeing unused kernel memory: 100k freed
double fault, gdt at c0268020 [255 bytes]
double fault, tss at c027d800
eip = c01181c4, esp = f7f9bf90
eax = c0003dfc, ebx = ffffffff, ecx = 0000007b, edx = f7f9c04c
esi = 00000003, edi = c01181b0

0xc01181c4 <do_page_fault+20>:  mov    %eax,0xc(%esp,1)

(0) [0x001139e4] 0060:c01139e4 (t doublefault_fn+c4): jmp c0113ae4  ; ebfe

eax            0x1f             31
ecx            0xc027d800       -1071130624
edx            0xc027d800       -1071130624
ebx            0xc027d800       -1071130624
esp            0xc029f7ec       0xc029f7ec
ebp            0x0              0x0
esi            0xffffffff       -1
edi            0x0              0
eip            0xc01139e4       0xc01139e4
eflags         0x4082           16514
cs             0x60             96
ss             0x68             104
ds             0x7b             123
es             0x7b             123
fs             0x0              0
gs             0x0              0

CR0=0x8005003b
    PG=paging=1
    CD=cache disable=0
    NW=not write through=0
    AM=alignment mask=1
    WP=write protect=1
    NE=numeric error=1
    ET=extension type=1
    TS=task switched=1
    EM=FPU emulation=0
    MP=monitor coprocessor=1
    PE=protection enable=1
CR2=page fault linear address=0xf7f9bf8c
CR3=0x00101000
    PCD=page-level cache disable=0
    PWT=page-level writes transparent=0
CR4=0x000000b0
    VME=virtual-8086 mode extensions=0
    PVI=protected-mode virtual interrupts=0
    TSD=time stamp disable=0
    DE=debugging extensions=0
    PSE=page size extensions=1
    PAE=physical address extension=1
    MCE=machine check enable=0
    PGE=page global enable=1
    PCE=performance-monitor counter enable=0
    OXFXSR=OS support for FXSAVE/FXRSTOR=0
    OSXMMEXCPT=OS support for unmasked SIMD FP exceptions=0

Global Descriptor Table (0xc0268020):
GDT[0x00]=??? descriptor hi=00000000, lo=00000000
GDT[0x01]=??? descriptor hi=00000000, lo=00000000
GDT[0x02]=??? descriptor hi=00000000, lo=00000000
GDT[0x03]=??? descriptor hi=00000000, lo=00000000
GDT[0x04]=??? descriptor hi=00000000, lo=00000000
GDT[0x05]=??? descriptor hi=00000000, lo=00000000
GDT[0x06]=??? descriptor hi=00000000, lo=00000000
GDT[0x07]=??? descriptor hi=00000000, lo=00000000
GDT[0x08]=??? descriptor hi=00000000, lo=00000000
GDT[0x09]=??? descriptor hi=00000000, lo=00000000
GDT[0x0a]=??? descriptor hi=00000000, lo=00000000
GDT[0x0b]=??? descriptor hi=00000000, lo=00000000
GDT[0x0c]=Code segment, linearaddr=00000000, len=fffff * 4Kbytes, Execute/Read, 32-bit addrs
GDT[0x0d]=Data segment, linearaddr=00000000, len=fffff * 4Kbytes, Read/Write, Accessed
GDT[0x0e]=Code segment, linearaddr=00000000, len=fffff * 4Kbytes, Execute/Read, 32-bit addrs
GDT[0x0f]=Data segment, linearaddr=00000000, len=fffff * 4Kbytes, Read/Write, Accessed
GDT[0x10]=32-Bit TSS (Busy) at c027d800, length 0x000eb
GDT[0x11]=LDT
GDT[0x12]=Code segment, linearaddr=00000000, len=00000 * 4Kbytes, Execute/Read, 32-bit addrs
GDT[0x13]=Code segment, linearaddr=00000000, len=00000 * 4Kbytes, Execute/Read, 16-bit addrs
GDT[0x14]=Data segment, linearaddr=00000000, len=00000 * 4Kbytes, Read/Write
GDT[0x15]=Data segment, linearaddr=00000000, len=00000 * 4Kbytes, Read/Write
GDT[0x16]=Data segment, linearaddr=00000000, len=00000 * 4Kbytes, Read/Write
GDT[0x17]=Code segment, linearaddr=00000000, len=00000 bytes, Execute/Read, 32-bit addrs
GDT[0x18]=Code segment, linearaddr=00000000, len=00000 bytes, Execute/Read, 16-bit addrs
GDT[0x19]=Data segment, linearaddr=00000000, len=00000 bytes, Read/Write
GDT[0x1a]=??? descriptor hi=00000000, lo=00000000
GDT[0x1b]=??? descriptor hi=00000000, lo=00000000
GDT[0x1c]=??? descriptor hi=00000000, lo=00000000
GDT[0x1d]=??? descriptor hi=00000000, lo=00000000
GDT[0x1e]=??? descriptor hi=00000000, lo=00000000
GDT[0x1f]=32-Bit TSS (Busy) at c027f500, length 0x000eb


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20  2:22               ` Zwane Mwaikambo
@ 2003-02-20  2:26                 ` William Lee Irwin III
  2003-02-20  2:55                   ` Zwane Mwaikambo
  2003-02-20  4:52                 ` Linus Torvalds
  1 sibling, 1 reply; 72+ messages in thread
From: William Lee Irwin III @ 2003-02-20  2:26 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Linus Torvalds, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, Ingo Molnar

On Wed, Feb 19, 2003 at 09:22:42PM -0500, Zwane Mwaikambo wrote:
> 	Here is a triple fault case (2.5.62-pgcl) and since i'm not a Real 
> Man i had to use a simulator ;) Unfortunately i can't unwind the stack.
> 
> CR2=page fault linear address=0xf7f9bf8c
> CR3=0x00101000
>     PCD=page-level cache disable=0
>     PWT=page-level writes transparent=0

Looks like either a pagetable or physmap/vmalloc/fixmap screwup.
What do the bootlogs have for those things?


-- wli

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20  2:26                 ` William Lee Irwin III
@ 2003-02-20  2:55                   ` Zwane Mwaikambo
  2003-02-20  3:15                     ` William Lee Irwin III
  0 siblings, 1 reply; 72+ messages in thread
From: Zwane Mwaikambo @ 2003-02-20  2:55 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Linus Torvalds, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, Ingo Molnar

On Wed, 19 Feb 2003, William Lee Irwin III wrote:

> On Wed, Feb 19, 2003 at 09:22:42PM -0500, Zwane Mwaikambo wrote:
> > 	Here is a triple fault case (2.5.62-pgcl) and since i'm not a Real 
> > Man i had to use a simulator ;) Unfortunately i can't unwind the stack.
> > 
> > CR2=page fault linear address=0xf7f9bf8c
> > CR3=0x00101000
> >     PCD=page-level cache disable=0
> >     PWT=page-level writes transparent=0
> 
> Looks like either a pagetable or physmap/vmalloc/fixmap screwup.
> What do the bootlogs have for those things?

Verified there were no overlapping regions. If you really really really 
want them i can put in some printks

	Zwane
-- 
function.linuxpower.ca

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20  2:55                   ` Zwane Mwaikambo
@ 2003-02-20  3:15                     ` William Lee Irwin III
  0 siblings, 0 replies; 72+ messages in thread
From: William Lee Irwin III @ 2003-02-20  3:15 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Linus Torvalds, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, Ingo Molnar

On Wed, 19 Feb 2003, William Lee Irwin III wrote:
>> Looks like either a pagetable or physmap/vmalloc/fixmap screwup.
>> What do the bootlogs have for those things?

On Wed, Feb 19, 2003 at 09:55:47PM -0500, Zwane Mwaikambo wrote:
> Verified there were no overlapping regions. If you really really really 
> want them i can put in some printks

The printk's should have come in with the pgcl patch. Did you keep the
bootlogs? I'm looking for rounding errors in my pagetable init stuff
to see if we're trying to use memory beyond the edge of a 2MB region
we didn't bother mapping or something but that only matters for phys
mappings and so on. If you hit vmallocspace or fixmapspace it's an
entirely different question. There are also small "holes"...

So it'd be very handy to figure out which of the three spaces the
address that turned up in %cr2 was supposed to be in. I can probably
guess a little better if you told me your PAGE_MMUSHIFT value also.

-- wli

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20  2:22               ` Zwane Mwaikambo
  2003-02-20  2:26                 ` William Lee Irwin III
@ 2003-02-20  4:52                 ` Linus Torvalds
  2003-02-20  5:07                   ` William Lee Irwin III
                                     ` (2 more replies)
  1 sibling, 3 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-20  4:52 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Chris Wedgwood, Kernel Mailing List, Martin J. Bligh, Ingo Molnar,
	William Lee Irwin III

On Wed, 19 Feb 2003, Zwane Mwaikambo wrote:
>
> 	Here is a triple fault case (2.5.62-pgcl) and since i'm not a Real 
> Man i had to use a simulator ;) Unfortunately i can't unwind the stack.

Well, the reason you can't unwind the stack is the same reason you got the 
double fault: the stack pointer is crap.

> Freeing unused kernel memory: 100k freed
> double fault, gdt at c0268020 [255 bytes]
> double fault, tss at c027d800
> eip = c01181c4, esp = f7f9bf90
> eax = c0003dfc, ebx = ffffffff, ecx = 0000007b, edx = f7f9c04c
> esi = 00000003, edi = c01181b0

Whee. So the double-fault patch actually ends up being useful? It didn't 
help with Chris' problem, but hey, if it helps with something else..

Anyway, that %esp is crap, which also explains this:

> 0xc01181c4 <do_page_fault+20>:  mov    %eax,0xc(%esp,1)

Took a page fault because 0xc(%esp) wasn't there, and the page fault 
couldn't write the fault trace to the stack (same reason), so you got a 
double fault.

Anyway, it's hard to try to re-create any state from the above. Very few 
clues about why the stack pointer is so messed up, but _usually_ a messed 
up stack pointer is because the stack itself got hammered, and then the 
stack pointer gets corrupted when somebody restores it off the stack (ie 
the normal 

	movl %ebp,%esp
	popl %ebp
	ret

kind of epilogue thing).

You could try to make the double-fault handler print out more information, 
suggested starting point something like the following: the stack pointer 
is corrupted, but we know what the original top-of-stack was (esp0), so we 
could print out part of that stack to get a guess about what it was doing 
when it all went south..

			Linus

------

===== arch/i386/kernel/doublefault.c 1.1 vs edited =====
--- 1.1/arch/i386/kernel/doublefault.c	Wed Feb 19 17:48:55 2003
+++ edited/arch/i386/kernel/doublefault.c	Wed Feb 19 20:50:47 2003
@@ -33,13 +33,26 @@

 		if (ptr_ok(tss)) {
 			struct tss_struct *t = (struct tss_struct *)tss;
+			unsigned long esp0 = t->esp0;

 			printk("eip = %08lx, esp = %08lx\n", t->eip, t->esp);

 			printk("eax = %08lx, ebx = %08lx, ecx = %08lx, edx = %08lx\n",
 				t->eax, t->ebx, t->ecx, t->edx);
-			printk("esi = %08lx, edi = %08lx\n",
-				t->esi, t->edi);
+			printk("esi = %08lx, edi = %08lx, %ebp = %08lx\n",
+				t->esi, t->edi, t->ebp);
+
+			/*
+			 * We could print out the stack contents here: esp0
+			 * is the beginning of the stack, we could print out
+			 * all the code points we can find underneath it or
+			 * something.. 
+			 */
+		
+			/* This might be a point to try to kill the process and clean up */
+			t->esp = esp0;
+			t->eip = (unsigned long) do_exit;
+			asm volatile("iret");
 		}
 	}

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20  4:52                 ` Linus Torvalds
@ 2003-02-20  5:07                   ` William Lee Irwin III
  2003-02-20  6:05                   ` Zwane Mwaikambo
  2003-02-20 11:46                   ` Ingo Molnar
  2 siblings, 0 replies; 72+ messages in thread
From: William Lee Irwin III @ 2003-02-20  5:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Zwane Mwaikambo, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, Ingo Molnar

On Wed, Feb 19, 2003 at 08:52:46PM -0800, Linus Torvalds wrote:
> Whee. So the double-fault patch actually ends up being useful? It didn't 
> help with Chris' problem, but hey, if it helps with something else..
> Anyway, that %esp is crap, which also explains this:
>> 0xc01181c4 <do_page_fault+20>:  mov    %eax,0xc(%esp,1)
> Took a page fault because 0xc(%esp) wasn't there, and the page fault 
> couldn't write the fault trace to the stack (same reason), so you got a 
> double fault.

Not sure where he got his %esp, but I extracted the following:

<zwane> MAXMEM=0x33e00000
<zwane> vmalloc: start = 0xf3e1f000, end = 0xfbe21000
<zwane> fixaddr: start = 0xfbe23000, end = 0xfffff000

which means somehow %esp landed in an unmapped tidbit in the middle of
of vmallocspace that isn't even mapped. I highly suspect rounding
errors of mine since I squished vmallocspace, fixmapspace, and the
physical mapping so close together they might share L3 pagetables, i.e.
they're separated by 2*MMUPAGE_SIZE instead of customary 8MB or so.


-- wli

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20  4:52                 ` Linus Torvalds
  2003-02-20  5:07                   ` William Lee Irwin III
@ 2003-02-20  6:05                   ` Zwane Mwaikambo
  2003-02-20 11:46                   ` Ingo Molnar
  2 siblings, 0 replies; 72+ messages in thread
From: Zwane Mwaikambo @ 2003-02-20  6:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chris Wedgwood, Kernel Mailing List, Martin J. Bligh, Ingo Molnar,
	William Lee Irwin III

On Wed, 19 Feb 2003, Linus Torvalds wrote:

> +			printk("esi = %08lx, edi = %08lx, %ebp = %08lx\n",
> +				t->esi, t->edi, t->ebp);

Too much AT&T for you ;) '%ebp'

> +			 * We could print out the stack contents here: esp0
> +			 * is the beginning of the stack, we could print out
> +			 * all the code points we can find underneath it or
> +			 * something.. 
> +			 */

Simulator managed to dump stack for me, nothing interesting though

  > +		
> +			/* This might be a point to try to kill the process and clean up */
> +			t->esp = esp0;
> +			t->eip = (unsigned long) do_exit;
> +			asm volatile("iret");
>  		}
>  	}
>  
> 
> 

Here is what i managed to fish out from the sim, not a real call trace, 
i just piped the stack contents through ksymoops.

Trace; c02b97ec <doublefault_stack+fec/1000>
Trace; c02b97ee <doublefault_stack+fee/1000>
Trace; c02b97f0 <doublefault_stack+ff0/1000>
Trace; c02b97f2 <doublefault_stack+ff2/1000>
Trace; c02b97f4 <doublefault_stack+ff4/1000>
Trace; c02b97f6 <doublefault_stack+ff6/1000>
Trace; c02b97f8 <doublefault_stack+ff8/1000>
Trace; c02b97fa <doublefault_stack+ffa/1000>
Trace; c02b97fc <doublefault_stack+ffc/1000>
Trace; c02b97fe <doublefault_stack+ffe/1000>
Trace; c02b9800 <use_tsc+0/4>
Trace; c02b9802 <use_tsc+2/4>
Trace; c02b9804 <delay_at_last_interrupt+0/4>
Trace; c02b9806 <delay_at_last_interrupt+2/4>
Trace; c02b9808 <last_tsc_low+0/4>
Trace; c02b980a <last_tsc_low+2/4>
Trace; c02b980c <fast_gettimeoffset_quotient+0/4>
Trace; c02b980e <fast_gettimeoffset_quotient+2/4>
Trace; c02b9810 <pm_power_off+0/4>
Trace; c02b9812 <pm_power_off+2/4>
Trace; c02b9814 <no_idt+0/8>
Trace; c02b9816 <no_idt+2/8>
Trace; c02b9818 <no_idt+4/8>
Trace; c02b981a <no_idt+6/8>
Trace; c02b981c <reboot_mode+0/4>
Trace; c02b981e <reboot_mode+2/4>
Trace; c02b9820 <reboot_thru_bios+0/4>
Trace; c02b9822 <reboot_thru_bios+2/4>
Trace; c02b9824 <flush_cpumask+0/4>
Trace; c02b9826 <flush_cpumask+2/4>
Trace; c02b9828 <flush_mm+0/4>
Trace; c02b982a <flush_mm+2/4>
Trace; c02b982c <flush_va+0/4>
Trace; c02b982e <flush_va+2/4>
Trace; c02b9830 <call_data+0/8>
Trace; c02b9832 <call_data+2/8>
Trace; c02b9834 <call_data+4/8>
Trace; c02b9836 <call_data+6/8>
Trace; c02b9838 <cacheflush_time+0/8>
Trace; c02b983a <cacheflush_time+2/8>
Trace; c02b983c <cacheflush_time+4/8>
Trace; c02b983e <cacheflush_time+6/8>
Trace; c02b9840 <cpu_online_map+0/4>
Trace; c02b9842 <cpu_online_map+2/4>
Trace; c02b9844 <cpu_callout_map+0/4>
Trace; c02b9846 <cpu_callout_map+2/4>
Trace; c02b9848 <smp_threads_ready+0/4>
Trace; c02b984a <smp_threads_ready+2/4>
Trace; c02b984c <cache_decay_ticks+0/4>
Trace; c02b984e <cache_decay_ticks+2/4>
Trace; c02b9850 <phys_proc_id+0/4>
Trace; c02b9852 <phys_proc_id+2/4>
Trace; c02b9854 <cpu_callin_map+0/4>
Trace; c02b9856 <cpu_callin_map+2/4>
Trace; c02b9858 <smp_commenced_mask+0/4>
Trace; c02b985a <smp_commenced_mask+2/4>
Trace; c02b985c <trampoline_base+0/4>
Trace; c02b985e <trampoline_base+2/4>
Trace; c02b9860 <tsc_values+0/8>
Trace; c02b9862 <tsc_values+2/8>
Trace; c02b9864 <tsc_values+4/8>
Trace; c02b9866 <tsc_values+6/8>
Trace; c02b9868 <init_deasserted+0/4>
Trace; c02b986a <init_deasserted+2/4>

-- 
function.linuxpower.ca

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20  4:52                 ` Linus Torvalds
  2003-02-20  5:07                   ` William Lee Irwin III
  2003-02-20  6:05                   ` Zwane Mwaikambo
@ 2003-02-20 11:46                   ` Ingo Molnar
  2003-02-20 12:12                     ` William Lee Irwin III
                                       ` (2 more replies)
  2 siblings, 3 replies; 72+ messages in thread
From: Ingo Molnar @ 2003-02-20 11:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Zwane Mwaikambo, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, William Lee Irwin III


i think i managed to trigger a potentially useful oops, with BK-curr:

Unable to handle kernel paging request at virtual address 6b6b6b8b
 printing eip:
c011944b
*pde = 00000000
Oops: 0002
CPU:    0
EIP:    0060:[<c011944b>]    Not tainted
EFLAGS: 00010046
EIP is at do_page_fault+0x7b/0x4e4
eax: 6b6b6b8b   ebx: 6b6b6b6b   ecx: 0000002b   edx: c02dd6ac
esi: 6b6b6b8b   edi: ca095320   ebp: ca092170   esp: ca0920c8
ds: 007b   es: 007b   ss: 0068
Process start-threads (pid: 21685, threadinfo=ca090000 task=ca094ce0)
Stack: c02dd6ac 0000002b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b8b 6b6b6b6b 6b6b6b6b
       6b6b6b6b 00030001 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b
       6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b
Call Trace:

 [tons of pagefault recursion]

 [<c01193d0>] do_page_fault+0x0/0x4e4
 [<c010a691>] error_code+0x2d/0x38
 [<c011944b>] do_page_fault+0x7b/0x4e4
 [<c01193d0>] do_page_fault+0x0/0x4e4
 [<c010a691>] error_code+0x2d/0x38
 [<c011944b>] do_page_fault+0x7b/0x4e4
 [<c01294f8>] do_timer+0xc8/0xd0
 [<c013330c>] rcu_process_callbacks+0x17c/0x1b0
 [<c011b4bf>] scheduler_tick+0x3ff/0x410
 [<c0125113>] tasklet_action+0x73/0xc0
 [<c01193d0>] do_page_fault+0x0/0x4e4
 [<c010a691>] error_code+0x2d/0x38
 [<c011b598>] schedule+0xb8/0x3d0
 [<c01219fd>] release_task+0x17d/0x200
 [<c011e70f>] mmput+0x1f/0xc0
 [<c0122cad>] do_exit+0x31d/0x3b0
 [<c010b328>] do_nmi+0x58/0x60
 [<c012a93e>] __dequeue_signal+0x6e/0xb0
 [<c0122ef0>] do_group_exit+0x110/0x140
 [<c012a9ae>] dequeue_signal+0x2e/0x60
 [<c012c2b1>] get_signal_to_deliver+0x2b1/0x440
 [<c01099a2>] do_signal+0xb2/0xf0
 [<c01296c4>] schedule_timeout+0x74/0xc0
 [<c012c4f9>] sigprocmask+0x89/0x140
 [<c0129640>] process_timeout+0x0/0x10
 [<c012c62d>] sys_rt_sigprocmask+0x7d/0x1a0
 [<c0129944>] sys_nanosleep+0x154/0x180
 [<c0109a3b>] do_notify_resume+0x5b/0x60
 [<c0109c72>] work_notifysig+0x13/0x15



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 11:46                   ` Ingo Molnar
@ 2003-02-20 12:12                     ` William Lee Irwin III
  2003-02-20 12:33                       ` Ingo Molnar
  2003-02-20 14:00                     ` Zwane Mwaikambo
  2003-02-20 15:43                     ` Linus Torvalds
  2 siblings, 1 reply; 72+ messages in thread
From: William Lee Irwin III @ 2003-02-20 12:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Zwane Mwaikambo, Chris Wedgwood,
	Kernel Mailing List, Martin J. Bligh

On Thu, Feb 20, 2003 at 12:46:51PM +0100, Ingo Molnar wrote:
> i think i managed to trigger a potentially useful oops, with BK-curr:
> Stack: c02dd6ac 0000002b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b8b 6b6b6b6b 6b6b6b6b
>        6b6b6b6b 00030001 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b
>        6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b

Looks like some kind of serious use-after-free slab issue. IF is clear,
so we aren't under spin_lock_irq(&rq->lock) on the initial fault. It
might be interesting to find a way to trap it earlier. Reproducible?
If so, how?


-- wli

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 12:12                     ` William Lee Irwin III
@ 2003-02-20 12:33                       ` Ingo Molnar
  2003-02-20 14:03                         ` Zwane Mwaikambo
  0 siblings, 1 reply; 72+ messages in thread
From: Ingo Molnar @ 2003-02-20 12:33 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Linus Torvalds, Zwane Mwaikambo, Chris Wedgwood,
	Kernel Mailing List, Martin J. Bligh


i had some other stuff in my tree as well, which could be the culprit. The
crash looked unrelated though. (procfs optimizations for the threaded
case.)

	Ingo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 12:33                       ` Ingo Molnar
@ 2003-02-20 14:03                         ` Zwane Mwaikambo
  0 siblings, 0 replies; 72+ messages in thread
From: Zwane Mwaikambo @ 2003-02-20 14:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: William Lee Irwin III, Linus Torvalds, Chris Wedgwood,
	Kernel Mailing List, Martin J. Bligh

On Thu, 20 Feb 2003, Ingo Molnar wrote:

> 
> i had some other stuff in my tree as well, which could be the culprit. The
> crash looked unrelated though. (procfs optimizations for the threaded
> case.)

I can provide more debug information when i get back from work later.

Cheers,
	Zwane
-- 
function.linuxpower.ca

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 11:46                   ` Ingo Molnar
  2003-02-20 12:12                     ` William Lee Irwin III
@ 2003-02-20 14:00                     ` Zwane Mwaikambo
  2003-02-20 15:43                     ` Linus Torvalds
  2 siblings, 0 replies; 72+ messages in thread
From: Zwane Mwaikambo @ 2003-02-20 14:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, William Lee Irwin III

On Thu, 20 Feb 2003, Ingo Molnar wrote:

> 
> i think i managed to trigger a potentially useful oops, with BK-curr:
> 
> Unable to handle kernel paging request at virtual address 6b6b6b8b
>  printing eip:
> c011944b
> *pde = 00000000
> Oops: 0002
> CPU:    0
> EIP:    0060:[<c011944b>]    Not tainted
> EFLAGS: 00010046
> EIP is at do_page_fault+0x7b/0x4e4
> eax: 6b6b6b8b   ebx: 6b6b6b6b   ecx: 0000002b   edx: c02dd6ac
> esi: 6b6b6b8b   edi: ca095320   ebp: ca092170   esp: ca0920c8
> ds: 007b   es: 007b   ss: 0068
> Process start-threads (pid: 21685, threadinfo=ca090000 task=ca094ce0)
> Stack: c02dd6ac 0000002b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b8b 6b6b6b6b 6b6b6b6b
>        6b6b6b6b 00030001 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b
>        6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b 6b6b6b6b
> Call Trace:

I've seen this with 2.5.62, it's here;

00407434086i[CPU0 ] task_switch: bad LDT segment at c0121a00
00407434086i[CPU0 ] task switch: posting exception 10 after commit point
00407434086p[CPU0 ] >>PANIC<< can_push(): SS invalidated.
00407434086i[SYS  ] Last time is 1045745354
00407434086i[XGUI ] Exit.
00407434086i[CPU0 ] protected mode
00407434086i[CPU0 ] CS.d_b = 32 bit
00407434086i[CPU0 ] SS.d_b = 32 bit
00407434086i[CPU0 ] | EAX=f7ffd6b4  EBX=ffffffff  ECX=0000007b  
EDX=f7f9c048
00407434086i[CPU0 ] | ESP=c02b97dc  EBP=00000001  ESI=00000000  
EDI=c0118250
00407434086i[CPU0 ] | IOPL=0 NV UP DI NG NZ NA PO NC
00407434086i[CPU0 ] | SEG selector     base    limit G D
00407434086i[CPU0 ] | SEG sltr(index|ti|rpl)     base    limit G D
00407434086i[CPU0 ] |  DS:007b( 000f| 0|  3) 00000000 000fffff 1 1
00407434086i[CPU0 ] |  ES:007b( 000f| 0|  3) 00000000 000fffff 1 1
00407434086i[CPU0 ] |  FS:0000( 0000| 0|  0) 00000000 000fffff 1 1
00407434086i[CPU0 ] |  GS:0000( 0000| 0|  0) 00000000 000fffff 1 1
00407434086i[CPU0 ] |  SS:0068( 000d| 0|  0) 00000000 000fffff 1 1
00407434086i[CPU0 ] |  CS:0060( 000c| 0|  0) 00000000 000fffff 1 1
00407434086i[CPU0 ] | EIP=c0121a00 (c0121a00)
00407434086i[CPU0 ] | CR0=0x8005003b CR1=0x00000000 CR2=0xf7f9bf88
00407434086i[CPU0 ] | CR3=0x00000000 CR4=0x000000b0
00407434086i[CPU0 ] >> 55
00407434086i[CPU0 ] >> : push EBP

(gdb) disassemble 0xc0121a00
Dump of assembler code for function do_exit:
0xc0121a00 <do_exit>:   push   %ebp
0xc0121a01 <do_exit+1>: push   %edi
0xc0121a02 <do_exit+2>: push   %esi
0xc0121a03 <do_exit+3>: push   %ebx

-- 
function.linuxpower.ca

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 11:46                   ` Ingo Molnar
  2003-02-20 12:12                     ` William Lee Irwin III
  2003-02-20 14:00                     ` Zwane Mwaikambo
@ 2003-02-20 15:43                     ` Linus Torvalds
  2003-02-20 15:52                       ` Ingo Molnar
                                         ` (3 more replies)
  2 siblings, 4 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-20 15:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Zwane Mwaikambo, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, William Lee Irwin III

On Thu, 20 Feb 2003, Ingo Molnar wrote:
> 
> i think i managed to trigger a potentially useful oops, with BK-curr:

Ok, this is definitely a stack overflow:

> EIP is at do_page_fault+0x7b/0x4e4
> eax: 6b6b6b8b   ebx: 6b6b6b6b   ecx: 0000002b   edx: c02dd6ac
> esi: 6b6b6b8b   edi: ca095320   ebp: ca092170   esp: ca0920c8
> ds: 007b   es: 007b   ss: 0068
> Process start-threads (pid: 21685, threadinfo=ca090000 task=ca094ce0)

Note the "threadinfo=ca090000" and "esp: ca0920c8".

If the threadinfo isn't on the same double-page as the stack, then you're 
screwed, and you've just overwritten the _real_ threadinfo, and the stack 
is probably screwed. In fact, any recursion on do_page_fault() is 
_probably_ due to the fact that you overwrote thread-info.

This could explain Chris' problems too - my doublefault thing won't help
much if recursion on the stack has clobbered a lot of kernel state (and
the doublefault will likely happen only after enough state is clobbered 
that even the doublefault handling might have trouble).

>  [tons of pagefault recursion]
> 
>  [<c01193d0>] do_page_fault+0x0/0x4e4
>  [<c010a691>] error_code+0x2d/0x38
>  [<c011944b>] do_page_fault+0x7b/0x4e4
>  [<c01193d0>] do_page_fault+0x0/0x4e4
>  [<c010a691>] error_code+0x2d/0x38
>  [<c011944b>] do_page_fault+0x7b/0x4e4
>  [<c01294f8>] do_timer+0xc8/0xd0
>  [<c013330c>] rcu_process_callbacks+0x17c/0x1b0
>  [<c011b4bf>] scheduler_tick+0x3ff/0x410
>  [<c0125113>] tasklet_action+0x73/0xc0
>  [<c01193d0>] do_page_fault+0x0/0x4e4
>  [<c010a691>] error_code+0x2d/0x38
>  [<c011b598>] schedule+0xb8/0x3d0
>  [<c01219fd>] release_task+0x17d/0x200
>  [<c011e70f>] mmput+0x1f/0xc0
>  [<c0122cad>] do_exit+0x31d/0x3b0
>  [<c010b328>] do_nmi+0x58/0x60
>  [<c012a93e>] __dequeue_signal+0x6e/0xb0
>  [<c0122ef0>] do_group_exit+0x110/0x140
>  [<c012a9ae>] dequeue_signal+0x2e/0x60
>  [<c012c2b1>] get_signal_to_deliver+0x2b1/0x440
>  [<c01099a2>] do_signal+0xb2/0xf0
>  [<c01296c4>] schedule_timeout+0x74/0xc0
>  [<c012c4f9>] sigprocmask+0x89/0x140
>  [<c0129640>] process_timeout+0x0/0x10
>  [<c012c62d>] sys_rt_sigprocmask+0x7d/0x1a0
>  [<c0129944>] sys_nanosleep+0x154/0x180
>  [<c0109a3b>] do_notify_resume+0x5b/0x60
>  [<c0109c72>] work_notifysig+0x13/0x15

I bet the doublefaults are on "tsk->mm" accesses (specifically, 
tsk->mm->mmap_sem", which should be the first of them).

That easily happens if "tsk" is crud (either because recursion has already 
overwritten it, _or_ because %esp has recursed so far down that the 
"current()" logic ends up hitting the next page.

The stack doesn't look _that_ deep to me, but if some of these functions
have a large local frame, then that would certainly do it.. At a guess, it
Looks like a fairly deep "schedule()" coupled with deep RCU processing.

And that RCU path is reasonably new. The infrastructure was put in 2.5.43, 
which might explain Chris' case too ("somewhere before 2.5.51").

Does anybody have an up-to-date "use -gp and a special 'mcount()' 
function to check stack depth" patch? The CONFIG_DEBUG_STACKOVERFLOW thing 
is quite possibly too stupid to find things like this (it only finds 
interrupts that overflow the stack, not deep call sequences).

Guys: you could try to enable CONFIG_DEBUG_STACKOVERFLOW, and then perhaps 
make it a bit more aggressive (rigth now it does:

                if (unlikely(esp < (sizeof(struct thread_info) + 1024))) {

and I'd suggest changing it to something more like

		/* Have we used up more than half the stack? */
		if (unlikely(esp < 4096)) {

and add a "for (;;)" after doing the dump_stack() because otherwise the 
machine may reboot before you get anywhere.

		Linus

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 15:43                     ` Linus Torvalds
@ 2003-02-20 15:52                       ` Ingo Molnar
  2003-02-20 16:11                       ` Martin J. Bligh
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 72+ messages in thread
From: Ingo Molnar @ 2003-02-20 15:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Zwane Mwaikambo, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, William Lee Irwin III


another datapoint: on SMP i can get various types of backtraces, on UP
it's the spontaneous reboot that triggers.

	Ingo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 15:43                     ` Linus Torvalds
  2003-02-20 15:52                       ` Ingo Molnar
@ 2003-02-20 16:11                       ` Martin J. Bligh
  2003-02-20 16:54                         ` Linus Torvalds
  2003-02-20 23:09                         ` Chris Wedgwood
  2003-02-20 16:44                       ` Ingo Molnar
  2003-02-20 20:13                       ` Chris Wedgwood
  3 siblings, 2 replies; 72+ messages in thread
From: Martin J. Bligh @ 2003-02-20 16:11 UTC (permalink / raw)
  To: Linus Torvalds, Ingo Molnar, Dave Hansen
  Cc: Zwane Mwaikambo, Chris Wedgwood, Kernel Mailing List,
	William Lee Irwin III

[-- Attachment #1: Type: text/plain, Size: 1140 bytes --]

> Does anybody have an up-to-date "use -gp and a special 'mcount()' 
> function to check stack depth" patch? The CONFIG_DEBUG_STACKOVERFLOW thing 
> is quite possibly too stupid to find things like this (it only finds 
> interrupts that overflow the stack, not deep call sequences).
> 
> Guys: you could try to enable CONFIG_DEBUG_STACKOVERFLOW, and then perhaps 
> make it a bit more aggressive (rigth now it does:
> 
>                 if (unlikely(esp < (sizeof(struct thread_info) + 1024))) {
> 
> and I'd suggest changing it to something more like
> 
> 		/* Have we used up more than half the stack? */
> 		if (unlikely(esp < 4096)) {
> 
> and add a "for (;;)" after doing the dump_stack() because otherwise the 
> machine may reboot before you get anywhere.

There are patches in -mjb from Dave Hansen / Ben LaHaise to detect stack
overflow included with the stuff for the 4K stacks patch (intended for 
scaling to large numbers of tasks). I've split them out attatched, should 
apply to mainline reasonably easily.

M.

PS. Linus, I think the attatchments will work for you as they're text/plain,
if not, I'll resend them all inline.

[-- Attachment #2: 220-thread_info_cleanup --]
[-- Type: text/plain, Size: 4328 bytes --]

diff -urpN -X /home/fletch/.diff.exclude 211-shpte/arch/i386/kernel/entry.S 220-thread_info_cleanup/arch/i386/kernel/entry.S
--- 211-shpte/arch/i386/kernel/entry.S	Sun Feb 16 15:10:13 2003
+++ 220-thread_info_cleanup/arch/i386/kernel/entry.S	Mon Feb 17 10:57:56 2003
@@ -155,7 +155,7 @@ do_lcall:
 	movl %eax,EFLAGS(%ebp)	#
 	movl %edx,EIP(%ebp)	# Now we move them to their "normal" places
 	movl %ecx,CS(%ebp)	#
-	andl $-8192, %ebp	# GET_THREAD_INFO
+	GET_THREAD_INFO_WITH_ESP(%ebp)  # GET_THREAD_INFO
 	movl TI_EXEC_DOMAIN(%ebp), %edx	# Get the execution domain
 	call *4(%edx)		# Call the lcall7 handler for the domain
 	addl $4, %esp
diff -urpN -X /home/fletch/.diff.exclude 211-shpte/arch/i386/kernel/head.S 220-thread_info_cleanup/arch/i386/kernel/head.S
--- 211-shpte/arch/i386/kernel/head.S	Thu Jan  2 22:04:58 2003
+++ 220-thread_info_cleanup/arch/i386/kernel/head.S	Mon Feb 17 10:57:56 2003
@@ -16,6 +16,7 @@
 #include <asm/pgtable.h>
 #include <asm/desc.h>
 #include <asm/cache.h>
+#include <asm/thread_info.h>
 
 #define OLD_CL_MAGIC_ADDR	0x90020
 #define OLD_CL_MAGIC		0xA33F
@@ -309,7 +310,7 @@ rp_sidt:
 	ret
 
 ENTRY(stack_start)
-	.long init_thread_union+8192
+	.long init_thread_union+THREAD_SIZE
 	.long __BOOT_DS
 
 /* This is the default interrupt "handler" :-) */
diff -urpN -X /home/fletch/.diff.exclude 211-shpte/include/asm-i386/page.h 220-thread_info_cleanup/include/asm-i386/page.h
--- 211-shpte/include/asm-i386/page.h	Sun Feb 16 13:18:59 2003
+++ 220-thread_info_cleanup/include/asm-i386/page.h	Mon Feb 17 10:57:56 2003
@@ -3,7 +3,11 @@
 
 /* PAGE_SHIFT determines the page size */
 #define PAGE_SHIFT	12
-#define PAGE_SIZE	(1UL << PAGE_SHIFT)
+#ifndef __ASSEMBLY__
+#define PAGE_SIZE      (1UL << PAGE_SHIFT)
+#else
+#define PAGE_SIZE      (1 << PAGE_SHIFT)
+#endif
 #define PAGE_MASK	(~(PAGE_SIZE-1))
 
 #define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
diff -urpN -X /home/fletch/.diff.exclude 211-shpte/include/asm-i386/thread_info.h 220-thread_info_cleanup/include/asm-i386/thread_info.h
--- 211-shpte/include/asm-i386/thread_info.h	Thu Jan  9 19:16:11 2003
+++ 220-thread_info_cleanup/include/asm-i386/thread_info.h	Mon Feb 17 10:57:56 2003
@@ -9,6 +9,7 @@
 
 #ifdef __KERNEL__
 
+#include <asm/page.h>
 #ifndef __ASSEMBLY__
 #include <asm/processor.h>
 #endif
@@ -57,11 +58,14 @@ struct thread_info {
  *
  * preempt_count needs to be 1 initially, until the scheduler is functional.
  */
+#define THREAD_ORDER 1 
+#define INIT_THREAD_SIZE       THREAD_SIZE
+
 #ifndef __ASSEMBLY__
 
 #define INIT_THREAD_INFO(tsk)			\
 {						\
-	.task		= &tsk,			\
+	.task		= &tsk,         	\
 	.exec_domain	= &default_exec_domain,	\
 	.flags		= 0,			\
 	.cpu		= 0,			\
@@ -75,30 +79,36 @@ struct thread_info {
 #define init_thread_info	(init_thread_union.thread_info)
 #define init_stack		(init_thread_union.stack)
 
+/* thread information allocation */
+#define THREAD_SIZE (PAGE_SIZE << THREAD_ORDER)
+#define alloc_thread_info() ((struct thread_info *) __get_free_pages(GFP_KERNEL,THREAD_ORDER))
+#define free_thread_info(ti) free_pages((unsigned long) (ti), THREAD_ORDER)
+#define get_thread_info(ti) get_task_struct((ti)->task)
+#define put_thread_info(ti) put_task_struct((ti)->task)
+
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
 {
 	struct thread_info *ti;
-	__asm__("andl %%esp,%0; ":"=r" (ti) : "0" (~8191UL));
+	__asm__("andl %%esp,%0; ":"=r" (ti) : "0" (~(THREAD_SIZE - 1)));
 	return ti;
 }
 
-/* thread information allocation */
-#define THREAD_SIZE (2*PAGE_SIZE)
-#define alloc_thread_info() ((struct thread_info *) __get_free_pages(GFP_KERNEL,1))
-#define free_thread_info(ti) free_pages((unsigned long) (ti), 1)
-#define get_thread_info(ti) get_task_struct((ti)->task)
-#define put_thread_info(ti) put_task_struct((ti)->task)
-
 #else /* !__ASSEMBLY__ */
 
+#define THREAD_SIZE (PAGE_SIZE << THREAD_ORDER)
+
 /* how to get the thread information struct from ASM */
 #define GET_THREAD_INFO(reg) \
-	movl $-8192, reg; \
+	movl $-THREAD_SIZE, reg; \
 	andl %esp, reg
 
-#endif
+/* use this one if reg already contains %esp */
+#define GET_THREAD_INFO_WITH_ESP(reg) \
+	andl $-THREAD_SIZE, reg
 
+#endif
+	 
 /*
  * thread information flags
  * - these are process state flags that various assembly files may need to access

[-- Attachment #3: 221-interrupt_stacks --]
[-- Type: text/plain, Size: 13839 bytes --]

diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/Kconfig 221-interrupt_stacks/arch/i386/Kconfig
--- 220-thread_info_cleanup/arch/i386/Kconfig	Mon Feb 17 10:55:52 2003
+++ 221-interrupt_stacks/arch/i386/Kconfig	Mon Feb 17 10:57:57 2003
@@ -374,6 +374,11 @@ config X86_SSE2
 	depends on MK8 || MPENTIUM4
 	default y
 
+config X86_CMOV
+	bool
+	depends on M686 || MPENTIUMII || MPENTIUMIII || MPENTIUM4 || MK8 || MCRUSOE
+	default y
+
 config HUGETLB_PAGE
 	bool "Huge TLB Page Support"
 	help
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/apic.c 221-interrupt_stacks/arch/i386/kernel/apic.c
--- 220-thread_info_cleanup/arch/i386/kernel/apic.c	Sat Feb 15 16:11:40 2003
+++ 221-interrupt_stacks/arch/i386/kernel/apic.c	Mon Feb 17 10:57:57 2003
@@ -1040,7 +1040,8 @@ inline void smp_local_timer_interrupt(st
  *   interrupt as well. Thus we cannot inline the local irq ... ]
  */
 
-void smp_apic_timer_interrupt(struct pt_regs regs)
+struct pt_regs * IRQHANDLER(smp_apic_timer_interrupt(struct pt_regs* regs));
+struct pt_regs * smp_apic_timer_interrupt(struct pt_regs* regs)
 {
 	int cpu = smp_processor_id();
 
@@ -1060,14 +1061,16 @@ void smp_apic_timer_interrupt(struct pt_
 	 * interrupt lock, which is the WrongThing (tm) to do.
 	 */
 	irq_enter();
-	smp_local_timer_interrupt(&regs);
+	smp_local_timer_interrupt(regs);
 	irq_exit();
+	return regs;
 }
 
 /*
  * This interrupt should _never_ happen with our APIC/SMP architecture
  */
-asmlinkage void smp_spurious_interrupt(void)
+struct pt_regs * IRQHANDLER(smp_spurious_interrupt(struct pt_regs* regs));
+struct pt_regs * smp_spurious_interrupt(struct pt_regs* regs)
 {
 	unsigned long v;
 
@@ -1085,13 +1088,15 @@ asmlinkage void smp_spurious_interrupt(v
 	printk(KERN_INFO "spurious APIC interrupt on CPU#%d, should never happen.\n",
 			smp_processor_id());
 	irq_exit();
+	return regs;
 }
 
 /*
  * This interrupt should never happen with our APIC/SMP architecture
  */
 
-asmlinkage void smp_error_interrupt(void)
+struct pt_regs * IRQHANDLER(smp_error_interrupt(struct pt_regs* regs));
+struct pt_regs * smp_error_interrupt(struct pt_regs* regs)
 {
 	unsigned long v, v1;
 
@@ -1116,6 +1121,7 @@ asmlinkage void smp_error_interrupt(void
 	printk (KERN_INFO "APIC error on CPU%d: %02lx(%02lx)\n",
 	        smp_processor_id(), v , v1);
 	irq_exit();
+	return regs;
 }
 
 /*
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/cpu/mcheck/p4.c 221-interrupt_stacks/arch/i386/kernel/cpu/mcheck/p4.c
--- 220-thread_info_cleanup/arch/i386/kernel/cpu/mcheck/p4.c	Thu Jan  2 22:04:58 2003
+++ 221-interrupt_stacks/arch/i386/kernel/cpu/mcheck/p4.c	Mon Feb 17 10:57:57 2003
@@ -61,11 +61,13 @@ static void intel_thermal_interrupt(stru
 /* Thermal interrupt handler for this CPU setup */
 static void (*vendor_thermal_interrupt)(struct pt_regs *regs) = unexpected_thermal_interrupt;
 
-asmlinkage void smp_thermal_interrupt(struct pt_regs regs)
+struct pt_regs * IRQHANDLER(smp_thermal_interrupt(struct pt_regs* regs));
+struct pt_regs * smp_thermal_interrupt(struct pt_regs* regs)
 {
 	irq_enter();
 	vendor_thermal_interrupt(&regs);
 	irq_exit();
+	return regs;
 }
 
 /* P4/Xeon Thermal regulation detect and init */
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/entry.S 221-interrupt_stacks/arch/i386/kernel/entry.S
--- 220-thread_info_cleanup/arch/i386/kernel/entry.S	Mon Feb 17 10:57:56 2003
+++ 221-interrupt_stacks/arch/i386/kernel/entry.S	Mon Feb 17 10:57:57 2003
@@ -388,17 +388,78 @@ ENTRY(irq_entries_start)
 vector=vector+1
 .endr
 
+
+# lets play optimizing compiler...
+#ifdef CONFIG_X86_CMOV
+#define COND_MOVE	cmovnz %esi,%esp;
+#else
+#define COND_MOVE	\
+	jz 1f;		\
+	mov %esi,%esp;	\
+1:
+#endif
+
+# These macros will switch you to, and from a per-cpu interrupt stack
+# They take the pt_regs arg and move it from the normal place on the 
+# stack to %eax.  Any handler function can retrieve it using regparm(1). 
+# The handlers are expected to return the stack to switch back to in 
+# the same register. 
+#
+# This means that the irq handlers need to return their arg
+#
+# SWITCH_TO_IRQSTACK clobbers %ebx, %ecx, %edx, %esi
+# old stack gets put in %eax
+
+.macro SWITCH_TO_IRQSTACK 
+	GET_THREAD_INFO(%ebx);
+	movl TI_IRQ_STACK(%ebx),%ecx;
+	movl TI_TASK(%ebx),%edx;
+	movl %esp,%eax;
+
+	# %ecx+THREAD_SIZE is next stack -4 keeps us in the right one
+	leal (THREAD_SIZE-4)(%ecx),%esi; 
+
+	# is there a valid irq_stack?
+	testl %ecx,%ecx;
+	COND_MOVE;
+
+	# update the task pointer in the irq stack
+	GET_THREAD_INFO(%esi);
+	movl %edx,TI_TASK(%esi);
+
+	# update the preempt count in the irq stack
+	movl TI_PRE_COUNT(%ebx),%ecx;
+	movl %ecx,TI_PRE_COUNT(%esi);
+.endm
+
+# copy flags from the irq stack back into the task's thread_info
+# %esi is saved over the irq handler call and contains the irq stack's
+#      thread_info pointer
+# %eax was returned from the handler, as described above
+# %ebx contains the original thread_info pointer
+
+.macro RESTORE_FROM_IRQSTACK 
+	movl %eax,%esp;
+	movl TI_FLAGS(%esi),%eax;
+	movl $0,TI_FLAGS(%esi);
+	LOCK orl %eax,TI_FLAGS(%ebx);
+.endm
+
 	ALIGN
 common_interrupt:
 	SAVE_ALL
+	SWITCH_TO_IRQSTACK
 	call do_IRQ
+	RESTORE_FROM_IRQSTACK
 	jmp ret_from_intr
 
 #define BUILD_INTERRUPT(name, nr)	\
 ENTRY(name)				\
 	pushl $nr-256;			\
 	SAVE_ALL			\
-	call smp_/**/name;	\
+	SWITCH_TO_IRQSTACK;		\
+	call smp_/**/name;		\
+	RESTORE_FROM_IRQSTACK;		\
 	jmp ret_from_intr;
 
 /* The include is where all of the SMP etc. interrupts come from */
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/init_task.c 221-interrupt_stacks/arch/i386/kernel/init_task.c
--- 220-thread_info_cleanup/arch/i386/kernel/init_task.c	Thu Feb 13 11:08:02 2003
+++ 221-interrupt_stacks/arch/i386/kernel/init_task.c	Mon Feb 17 10:57:57 2003
@@ -14,6 +14,10 @@ static struct signal_struct init_signals
 static struct sighand_struct init_sighand = INIT_SIGHAND(init_sighand);
 struct mm_struct init_mm = INIT_MM(init_mm);
 
+union thread_union init_irq_union
+	__attribute__((__section__(".data.init_task")));
+
+
 /*
  * Initial thread structure.
  *
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/irq.c 221-interrupt_stacks/arch/i386/kernel/irq.c
--- 220-thread_info_cleanup/arch/i386/kernel/irq.c	Thu Feb 13 11:08:02 2003
+++ 221-interrupt_stacks/arch/i386/kernel/irq.c	Mon Feb 17 10:57:57 2003
@@ -311,7 +311,8 @@ void enable_irq(unsigned int irq)
  * SMP cross-CPU interrupts have their own specific
  * handlers).
  */
-asmlinkage unsigned int do_IRQ(struct pt_regs regs)
+struct pt_regs * IRQHANDLER(do_IRQ(struct pt_regs *regs));
+struct pt_regs * do_IRQ(struct pt_regs *regs)
 {	
 	/* 
 	 * We ack quickly, we don't want the irq controller
@@ -323,7 +324,7 @@ asmlinkage unsigned int do_IRQ(struct pt
 	 * 0 return value means that this irq is already being
 	 * handled by some other CPU. (or is disabled)
 	 */
-	int irq = regs.orig_eax & 0xff; /* high bits used in ret_from_ code  */
+	int irq = regs->orig_eax & 0xff; /* high bits used in ret_from_ code  */
 	int cpu = smp_processor_id();
 	irq_desc_t *desc = irq_desc + irq;
 	struct irqaction * action;
@@ -388,7 +389,7 @@ asmlinkage unsigned int do_IRQ(struct pt
 	 */
 	for (;;) {
 		spin_unlock(&desc->lock);
-		handle_IRQ_event(irq, &regs, action);
+		handle_IRQ_event(irq, regs, action);
 		spin_lock(&desc->lock);
 		
 		if (likely(!(desc->status & IRQ_PENDING)))
@@ -407,7 +408,7 @@ out:
 
 	irq_exit();
 
-	return 1;
+	return regs;
 }
 
 /**
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/process.c 221-interrupt_stacks/arch/i386/kernel/process.c
--- 220-thread_info_cleanup/arch/i386/kernel/process.c	Thu Feb 13 11:08:02 2003
+++ 221-interrupt_stacks/arch/i386/kernel/process.c	Mon Feb 17 10:57:57 2003
@@ -432,6 +432,7 @@ void __switch_to(struct task_struct *pre
 
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
+	next_p->thread_info->irq_stack = prev_p->thread_info->irq_stack;
 	unlazy_fpu(prev_p);
 
 	/*
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/smp.c 221-interrupt_stacks/arch/i386/kernel/smp.c
--- 220-thread_info_cleanup/arch/i386/kernel/smp.c	Sun Feb 16 13:22:10 2003
+++ 221-interrupt_stacks/arch/i386/kernel/smp.c	Mon Feb 17 10:57:57 2003
@@ -305,7 +305,8 @@ static inline void leave_mm (unsigned lo
  * 2) Leave the mm if we are in the lazy tlb mode.
  */
 
-asmlinkage void smp_invalidate_interrupt (void)
+struct pt_regs * IRQHANDLER(smp_invalidate_interrupt(struct pt_regs *regs));
+struct pt_regs * smp_invalidate_interrupt(struct pt_regs *regs)
 {
 	unsigned long cpu;
 
@@ -336,6 +337,7 @@ asmlinkage void smp_invalidate_interrupt
 
 out:
 	put_cpu_no_resched();
+	return regs;
 }
 
 static void flush_tlb_others (unsigned long cpumask, struct mm_struct *mm,
@@ -598,12 +600,15 @@ void smp_send_stop(void)
  * all the work is done automatically when
  * we return from the interrupt.
  */
-asmlinkage void smp_reschedule_interrupt(void)
+struct pt_regs * IRQHANDLER(smp_reschedule_interrupt(struct pt_regs *regs));
+struct pt_regs * smp_reschedule_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
+	return regs;
 }
 
-asmlinkage void smp_call_function_interrupt(struct pt_regs regs)
+struct pt_regs * IRQHANDLER(smp_call_function_interrupt(struct pt_regs *regs));
+struct pt_regs * smp_call_function_interrupt(struct pt_regs *regs)
 {
 	void (*func) (void *info, struct pt_regs *) = (void (*)(void *, struct pt_regs*))call_data->func;
 	void *info = call_data->info;
@@ -627,5 +632,6 @@ asmlinkage void smp_call_function_interr
 		mb();
 		atomic_inc(&call_data->finished);
 	}
+	return regs;
 }
 
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/arch/i386/kernel/smpboot.c 221-interrupt_stacks/arch/i386/kernel/smpboot.c
--- 220-thread_info_cleanup/arch/i386/kernel/smpboot.c	Sun Feb 16 13:18:39 2003
+++ 221-interrupt_stacks/arch/i386/kernel/smpboot.c	Mon Feb 17 10:57:57 2003
@@ -71,6 +71,11 @@ static unsigned long smp_commenced_mask;
 /* Per CPU bogomips and other parameters */
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
 
+/* Per CPU interrupt stacks */
+extern union thread_union init_irq_union;
+union thread_union *irq_stacks[NR_CPUS] __cacheline_aligned =
+	{ &init_irq_union, };
+
 /* Set when the idlers are all forked */
 int smp_threads_ready;
 
@@ -770,6 +775,28 @@ wakeup_secondary_cpu(int phys_apicid, un
 }
 #endif	/* WAKE_SECONDARY_VIA_INIT */
 
+static void __init setup_irq_stack(struct task_struct *p, int cpu)
+{
+	unsigned long stk;
+
+	stk = __get_free_pages(GFP_KERNEL, THREAD_ORDER);
+	if (!stk)
+		panic("I can't seem to allocate my irq stack.  Oh well, giving up.");
+
+	irq_stacks[cpu] = (void *)stk;
+	memset(irq_stacks[cpu], 0, THREAD_SIZE);
+	irq_stacks[cpu]->thread_info.cpu = cpu;
+	irq_stacks[cpu]->thread_info.preempt_count = 1;
+					/* interrupts are not preemptable */
+	p->thread_info->irq_stack = &irq_stacks[cpu]->thread_info;
+
+	/* If we want to make the irq stack more than one unit
+	 * deep, we can chain then off of the irq_stack pointer
+	 * here.
+	 */
+}
+
+
 extern unsigned long cpu_initialized;
 
 static int __init do_boot_cpu(int apicid)
@@ -793,6 +820,8 @@ static int __init do_boot_cpu(int apicid
 	idle = fork_by_hand();
 	if (IS_ERR(idle))
 		panic("failed fork for CPU %d", cpu);
+
+	setup_irq_stack(idle, cpu);
 
 	/*
 	 * We remove it from the pidhash and the runqueue
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/include/asm-i386/linkage.h 221-interrupt_stacks/include/asm-i386/linkage.h
--- 220-thread_info_cleanup/include/asm-i386/linkage.h	Sun Nov 17 20:29:46 2002
+++ 221-interrupt_stacks/include/asm-i386/linkage.h	Mon Feb 17 10:57:57 2003
@@ -3,6 +3,7 @@
 
 #define asmlinkage CPP_ASMLINKAGE __attribute__((regparm(0)))
 #define FASTCALL(x)	x __attribute__((regparm(3)))
+#define IRQHANDLER(x)	x __attribute__((regparm(1)))
 
 #ifdef CONFIG_X86_ALIGNMENT_16
 #define __ALIGN .align 16,0x90
diff -urpN -X /home/fletch/.diff.exclude 220-thread_info_cleanup/include/asm-i386/thread_info.h 221-interrupt_stacks/include/asm-i386/thread_info.h
--- 220-thread_info_cleanup/include/asm-i386/thread_info.h	Mon Feb 17 10:57:56 2003
+++ 221-interrupt_stacks/include/asm-i386/thread_info.h	Mon Feb 17 10:57:57 2003
@@ -30,9 +30,11 @@ struct thread_info {
 	__s32			preempt_count; /* 0 => preemptable, <0 => BUG */
 
 	mm_segment_t		addr_limit;	/* thread address space:
+						   0 for interrupts: illegal
 					 	   0-0xBFFFFFFF for user-thead
 						   0-0xFFFFFFFF for kernel-thread
 						*/
+	struct thread_info	*irq_stack;	/* pointer to cpu irq stack */
 	struct restart_block    restart_block;
 
 	__u8			supervisor_stack[0];
@@ -47,7 +49,8 @@ struct thread_info {
 #define TI_CPU		0x0000000C
 #define TI_PRE_COUNT	0x00000010
 #define TI_ADDR_LIMIT	0x00000014
-#define TI_RESTART_BLOCK 0x0000018
+#define TI_IRQ_STACK	0x00000018
+#define TI_RESTART_BLOCK 0x0000022
 
 #endif
 
@@ -63,17 +66,18 @@ struct thread_info {
 
 #ifndef __ASSEMBLY__
 
-#define INIT_THREAD_INFO(tsk)			\
-{						\
-	.task		= &tsk,         	\
-	.exec_domain	= &default_exec_domain,	\
-	.flags		= 0,			\
-	.cpu		= 0,			\
-	.preempt_count	= 1,			\
-	.addr_limit	= KERNEL_DS,		\
-	.restart_block = {			\
-		.fn = do_no_restart_syscall,	\
-	},					\
+#define INIT_THREAD_INFO(tsk)				\
+{							\
+	.task		= &tsk,         		\
+	.exec_domain	= &default_exec_domain,		\
+	.flags		= 0,				\
+	.cpu		= 0,				\
+	.preempt_count	= 1,				\
+	.addr_limit	= KERNEL_DS,			\
+	.irq_stack	= &init_irq_union.thread_info,	\
+	.restart_block = {				\
+		.fn = do_no_restart_syscall,		\
+	}						\
 }
 
 #define init_thread_info	(init_thread_union.thread_info)

[-- Attachment #4: 222-stack_usage_check --]
[-- Type: text/plain, Size: 6600 bytes --]

diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/arch/i386/Kconfig 222-stack_usage_check/arch/i386/Kconfig
--- 221-interrupt_stacks/arch/i386/Kconfig	Mon Feb 17 10:57:57 2003
+++ 222-stack_usage_check/arch/i386/Kconfig	Mon Feb 17 10:57:57 2003
@@ -1764,6 +1764,25 @@ config FRAME_POINTER
 	  If you don't debug the kernel, you can say N, but we may not be able
 	  to solve problems without frame pointers.
 
+config X86_STACK_CHECK
+	bool "Detect stack overflows"
+	depends on FRAME_POINTER
+	help
+	  Say Y here to have the kernel attempt to detect when the per-task
+	  kernel stack overflows.  This is much more robust checking than
+	  the above overflow check, which will only occasionally detect
+	  an overflow.  The level of guarantee here is much greater.
+	
+	  Some older versions of gcc don't handle the -p option correctly.  
+	  Kernprof is affected by the same problem, which is described here:
+	  http://oss.sgi.com/projects/kernprof/faq.html#Q9
+	
+	  Basically, if you get oopses in __free_pages_ok during boot when
+	  you have this turned on, you need to fix gcc.  The Redhat 2.96 
+	  version and gcc-3.x seem to work.  
+	
+	  If not debugging a stack overflow problem, say N
+
 config X86_EXTRA_IRQS
 	bool
 	depends on X86_LOCAL_APIC || X86_VOYAGER
diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/arch/i386/Makefile 222-stack_usage_check/arch/i386/Makefile
--- 221-interrupt_stacks/arch/i386/Makefile	Sun Feb 16 13:18:58 2003
+++ 222-stack_usage_check/arch/i386/Makefile	Mon Feb 17 10:57:57 2003
@@ -76,6 +76,10 @@ mcore-$(CONFIG_X86_SUMMIT)  := mach-defa
 # default subarch .h files
 mflags-y += -Iinclude/asm-i386/mach-default
 
+ifdef CONFIG_X86_STACK_CHECK
+CFLAGS += -p
+endif
+
 head-y := arch/i386/kernel/head.o arch/i386/kernel/init_task.o
 
 libs-y 					+= arch/i386/lib/
diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/arch/i386/boot/compressed/misc.c 222-stack_usage_check/arch/i386/boot/compressed/misc.c
--- 221-interrupt_stacks/arch/i386/boot/compressed/misc.c	Thu Jan  2 22:04:58 2003
+++ 222-stack_usage_check/arch/i386/boot/compressed/misc.c	Mon Feb 17 10:57:57 2003
@@ -377,3 +377,7 @@ asmlinkage int decompress_kernel(struct 
 	if (high_loaded) close_output_buffer_if_we_run_high(mv);
 	return high_loaded;
 }
+
+/* We don't actually check for stack overflows this early. */
+__asm__(".globl mcount ; mcount: ret\n");
+
diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/arch/i386/kernel/entry.S 222-stack_usage_check/arch/i386/kernel/entry.S
--- 221-interrupt_stacks/arch/i386/kernel/entry.S	Mon Feb 17 10:57:57 2003
+++ 222-stack_usage_check/arch/i386/kernel/entry.S	Mon Feb 17 10:57:57 2003
@@ -640,6 +640,61 @@ ENTRY(spurious_interrupt_bug)
 	pushl $do_spurious_interrupt_bug
 	jmp error_code
 
+
+#ifdef CONFIG_X86_STACK_CHECK
+.data
+	.globl	stack_overflowed
+stack_overflowed:
+	.long	0
+.text
+
+ENTRY(mcount)
+	push %eax
+	movl $(THREAD_SIZE - 1),%eax
+	andl %esp,%eax
+	cmpl $STACK_WARN,%eax	/* more than half the stack is used*/
+	jle 1f
+2:
+	popl %eax
+	ret
+1:	
+	lock;   btsl    $0,stack_overflowed
+	jc      2b
+	
+	# switch to overflow stack
+	movl	%esp,%eax
+	movl	$(stack_overflow_stack + THREAD_SIZE - 4),%esp
+
+	pushf
+	cli
+	pushl	%eax
+
+	# push eip then esp of error for stack_overflow_panic
+	pushl	4(%eax)
+	pushl	%eax
+
+	# update the task pointer and cpu in the overflow stack's thread_info.
+	GET_THREAD_INFO_WITH_ESP(%eax)
+	movl	TI_TASK(%eax),%ebx
+	movl	%ebx,stack_overflow_stack+TI_TASK
+	movl	TI_CPU(%eax),%ebx
+	movl	%ebx,stack_overflow_stack+TI_CPU
+
+	call	stack_overflow
+
+	# pop off call arguments
+	addl	$8,%esp 
+
+	popl	%eax
+	popf
+	movl	%eax,%esp
+	popl	%eax
+	movl	$0,stack_overflowed
+	ret
+
+#warning stack check enabled
+#endif
+
 .data
 ENTRY(sys_call_table)
 	.long sys_restart_syscall	/* 0 - old "setup()" system call, used for restarting */
diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/arch/i386/kernel/i386_ksyms.c 222-stack_usage_check/arch/i386/kernel/i386_ksyms.c
--- 221-interrupt_stacks/arch/i386/kernel/i386_ksyms.c	Sun Feb 16 15:10:06 2003
+++ 222-stack_usage_check/arch/i386/kernel/i386_ksyms.c	Mon Feb 17 10:57:57 2003
@@ -228,3 +228,8 @@ EXPORT_SYMBOL(kmap_atomic_to_page);
 EXPORT_SYMBOL(edd);
 EXPORT_SYMBOL(eddnr);
 #endif
+
+#ifdef CONFIG_X86_STACK_CHECK
+extern void mcount(void);
+EXPORT_SYMBOL(mcount);
+#endif
diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/arch/i386/kernel/init_task.c 222-stack_usage_check/arch/i386/kernel/init_task.c
--- 221-interrupt_stacks/arch/i386/kernel/init_task.c	Mon Feb 17 10:57:57 2003
+++ 222-stack_usage_check/arch/i386/kernel/init_task.c	Mon Feb 17 10:57:57 2003
@@ -17,6 +17,10 @@ struct mm_struct init_mm = INIT_MM(init_
 union thread_union init_irq_union
 	__attribute__((__section__(".data.init_task")));
 
+#ifdef CONFIG_X86_STACK_CHECK
+union thread_union stack_overflow_stack
+	__attribute__((__section__(".data.init_task")));
+#endif
 
 /*
  * Initial thread structure.
diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/arch/i386/kernel/process.c 222-stack_usage_check/arch/i386/kernel/process.c
--- 221-interrupt_stacks/arch/i386/kernel/process.c	Mon Feb 17 10:57:57 2003
+++ 222-stack_usage_check/arch/i386/kernel/process.c	Mon Feb 17 10:57:57 2003
@@ -159,7 +159,25 @@ static int __init idle_setup (char *str)
 
 __setup("idle=", idle_setup);
 
-void show_regs(struct pt_regs * regs)
+void stack_overflow(unsigned long esp, unsigned long eip)
+{
+	int panicing = ((esp&(THREAD_SIZE-1)) <= STACK_PANIC);
+
+	printk( "esp: 0x%lx masked: 0x%lx STACK_PANIC:0x%x %d %d\n", 
+		esp, (esp&(THREAD_SIZE-1)), STACK_PANIC, (((esp&(THREAD_SIZE-1)) <= STACK_PANIC)), panicing );
+	
+	if (panicing)
+		print_symbol("stack overflow from %s\n", eip);
+	else
+		print_symbol("excessive stack use from %s\n", eip);
+	printk("esp: %p\n", (void*)esp);
+	show_trace((void*)esp);
+	
+	if (panicing)
+		panic("stack overflow\n");
+}
+
+asmlinkage void show_regs(struct pt_regs * regs)
 {
 	unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L;
 
diff -urpN -X /home/fletch/.diff.exclude 221-interrupt_stacks/include/asm-i386/thread_info.h 222-stack_usage_check/include/asm-i386/thread_info.h
--- 221-interrupt_stacks/include/asm-i386/thread_info.h	Mon Feb 17 10:57:57 2003
+++ 222-stack_usage_check/include/asm-i386/thread_info.h	Mon Feb 17 10:57:57 2003
@@ -63,6 +63,8 @@ struct thread_info {
  */
 #define THREAD_ORDER 1 
 #define INIT_THREAD_SIZE       THREAD_SIZE
+#define STACK_PANIC		0x200ul
+#define STACK_WARN		((THREAD_SIZE)>>1)
 
 #ifndef __ASSEMBLY__
 

[-- Attachment #5: 223-4k_stacks --]
[-- Type: text/plain, Size: 1664 bytes --]

diff -urpN -X /home/fletch/.diff.exclude 222-stack_usage_check/arch/i386/Kconfig 223-4k_stacks/arch/i386/Kconfig
--- 222-stack_usage_check/arch/i386/Kconfig	Mon Feb 17 10:57:57 2003
+++ 223-4k_stacks/arch/i386/Kconfig	Mon Feb 17 10:57:58 2003
@@ -742,6 +742,16 @@ config SHAREPTE
 	  level of the page table between address spaces that are sharing data
 	  pages.
 
+config 4K_STACK
+	bool "Use smaller 4k per-task stacks"
+	help
+	  This option will shrink the kernel's per-task stack from 8k to
+	  4k.  This will greatly increase your chance of overflowing it.
+	  But, if you use the per-cpu interrupt stacks as well, your chances
+	  go way down.  Also try the CONFIG_X86_STACK_CHECK overflow
+	  detection.  It is much more reliable than the currently in-kernel
+	  version.
+
 config MATH_EMULATION
 	bool "Math emulation"
 	---help---
diff -urpN -X /home/fletch/.diff.exclude 222-stack_usage_check/include/asm-i386/thread_info.h 223-4k_stacks/include/asm-i386/thread_info.h
--- 222-stack_usage_check/include/asm-i386/thread_info.h	Mon Feb 17 10:57:57 2003
+++ 223-4k_stacks/include/asm-i386/thread_info.h	Mon Feb 17 10:57:58 2003
@@ -61,10 +61,16 @@ struct thread_info {
  *
  * preempt_count needs to be 1 initially, until the scheduler is functional.
  */
-#define THREAD_ORDER 1 
+#ifdef CONFIG_4K_STACK
+#define THREAD_ORDER 0
+#define STACK_WARN		0x200
+#define STACK_PANIC		0x100
+#else
+#define THREAD_ORDER 1
+#define STACK_WARN              ((THREAD_SIZE)>>1)
+#define STACK_PANIC             0x100
+#endif
 #define INIT_THREAD_SIZE       THREAD_SIZE
-#define STACK_PANIC		0x200ul
-#define STACK_WARN		((THREAD_SIZE)>>1)
 
 #ifndef __ASSEMBLY__
 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 16:11                       ` Martin J. Bligh
@ 2003-02-20 16:54                         ` Linus Torvalds
  2003-02-20 17:24                           ` Jeff Garzik
                                             ` (4 more replies)
  2003-02-20 23:09                         ` Chris Wedgwood
  1 sibling, 5 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-20 16:54 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Ingo Molnar, Dave Hansen, Zwane Mwaikambo, Chris Wedgwood,
	Kernel Mailing List, William Lee Irwin III


On Thu, 20 Feb 2003, Martin J. Bligh wrote:
> 
> There are patches in -mjb from Dave Hansen / Ben LaHaise to detect stack
> overflow included with the stuff for the 4K stacks patch (intended for 
> scaling to large numbers of tasks). I've split them out attatched, should 
> apply to mainline reasonably easily.

Ok, the 4kB stack definitely won't work in real life, but that's because 
we have some hopelessly bad stack users in the kernel. But the debugging 
part would be good to try (in fact, it might be a good idea to keep the 
8kB stack, but with rather anal debugging. Just the "mcount" part should 
do that).

A sorted list of bad stack users (more than 256 bytes) in my default build
follows. Anybody can create their own with something like

	objdump -d linux/vmlinux |
		grep 'sub.*$0x...,.*esp' |
		awk '{ print $9,$1 }' |
		sort > bigstack

and a script to look up the addresses.

That ide_unregister() thing uses up >2kB in just one call! And there are 
several in the 1.5kB range too, with a long list of ~500 byte offenders.

Yeah, and this assumes we don't have alloca() users or other dynamic 
stack allocators (non-constant-size automatic arrays). I hope we don't 
have that kind of crap anywhere..

			Linus

-----
0xc02ae062 <ide_unregister+8>:				sub    $0x8c4,%esp
0xc010535d <huft_build+9>:				sub    $0x5b0,%esp
0xc0326a53 <snd_pcm_oss_change_params+6>:		sub    $0x590,%esp
0xc0106156 <inflate_dynamic+6>:				sub    $0x554,%esp
0xc0176150 <elf_core_dump+13>:				sub    $0x4b4,%esp
0xc0105fb8 <inflate_fixed+7>:				sub    $0x4ac,%esp
0xc035935e <pci_sanity_check+6>:			sub    $0x398,%esp
0xc035986d <pcibios_fixup_peer_bridges+5>:		sub    $0x394,%esp
0xc0334b85 <snd_pcm_hw_params_old_user+8>:		sub    $0x37c,%esp
0xc0334a97 <snd_pcm_hw_refine_old_user+8>:		sub    $0x37c,%esp
0xc02fbc74 <cb_alloc+6>:				sub    $0x32c,%esp
0xc0211b2a <pci_do_scan_bus+14>:			sub    $0x314,%esp
0xc034be58 <snd_seq_midisynth_register_port+12>:	sub    $0x2f0,%esp
0xc0264406 <extract_entropy+6>:				sub    $0x2d8,%esp
0xc02fcdde <ds_ioctl+3>:				sub    $0x2c8,%esp
0xc01dbd6b <udf_load_pvoldesc+6>:			sub    $0x2bc,%esp
0xc0329c6e <snd_pcm_oss_proc_write+6>:			sub    $0x298,%esp
0xc02a218f <pcnet_config+6>:				sub    $0x294,%esp
0xc01c8457 <nlmclnt_proc+14>:				sub    $0x294,%esp
0xc0327ecc <snd_pcm_oss_get_formats+12>:		sub    $0x290,%esp
0xc01d781f <udf_add_entry+6>:				sub    $0x290,%esp
0xc01c8e56 <nlmclnt_reclaim+18>:			sub    $0x280,%esp
0xc0330802 <snd_pcm_hw_params_user+8>:			sub    $0x27c,%esp
0xc03304af <snd_pcm_hw_refine_user+8>:			sub    $0x27c,%esp
0xc01ea4c9 <reiserfs_rename+13>:			sub    $0x27c,%esp
0xc029b57c <e100_ethtool_eeprom+10>:			sub    $0x260,%esp
0xc020a9df <semctl_main+12>:				sub    $0x25c,%esp
0xc0267205 <do_kdgkb_ioctl+24>:				sub    $0x244,%esp
0xc01d0ac8 <do_udf_readdir+6>:				sub    $0x240,%esp
0xc01e137a <udf_get_filename+3>:			sub    $0x23c,%esp
0xc01bd38c <find_exported_dentry+8>:			sub    $0x234,%esp
0xc01a5fa4 <fat_readdirx+15>:				sub    $0x230,%esp
0xc01fe813 <reiserfs_delete_solid_item+6>:		sub    $0x22c,%esp
0xc031f24d <snd_iprintf+3>:				sub    $0x21c,%esp
0xc02b4d6f <cdrom_read_intr+8>:				sub    $0x21c,%esp
0xc024adfb <pnp_printf+3>:				sub    $0x218,%esp
0xc02b4cac <cdrom_buffer_sectors+11>:			sub    $0x210,%esp
0xc01ebf96 <reiserfs_get_block+8>:			sub    $0x210,%esp
0xc020b2f0 <sys_semtimedop+3>:				sub    $0x208,%esp
0xc01fe58d <reiserfs_delete_item+12>:			sub    $0x208,%esp
0xc0529e98 <snd_seq_oss_create_client+12>:		sub    $0x204,%esp
0xc038efed <tcp_check_req+6>:				sub    $0x1f8,%esp
0xc038b462 <tcp_v4_conn_request+6>:			sub    $0x1f8,%esp
0xc01fef81 <reiserfs_cut_from_item+6>:			sub    $0x1f8,%esp
0xc038df7f <tcp_timewait_state_process+8>:		sub    $0x1e4,%esp
0xc0325539 <snd_mixer_oss_build_input+3>:		sub    $0x1e0,%esp
0xc01d9328 <udf_symlink+13>:				sub    $0x1cc,%esp
0xc01ffb15 <reiserfs_insert_item+6>:			sub    $0x1c4,%esp
0xc01ffa03 <reiserfs_paste_into_item+6>:		sub    $0x1c4,%esp
0xc01c43b6 <svc_export_parse+3>:			sub    $0x1c4,%esp
0xc02f6770 <pcmcia_validate_cis+3>:			sub    $0x1c0,%esp
0xc052a2c7 <snd_seq_system_client_init+24>:		sub    $0x1bc,%esp
0xc03511c9 <snd_intel8x0_mixer+13>:			sub    $0x1bc,%esp
0xc01a54f8 <fat_search_long+6>:				sub    $0x1b4,%esp
0xc052a0a1 <snd_seq_oss_midi_lookup_ports+9>:		sub    $0x1ac,%esp
0xc02e99f5 <sg_ioctl+6>:				sub    $0x19c,%esp
0xc0320fb0 <snd_ctl_card_info+12>:			sub    $0x198,%esp
0xc0171860 <ep_send_events+8>:				sub    $0x198,%esp
0xc0155ad4 <blkdev_get+11>:				sub    $0x194,%esp
0xc01b3bea <nfs_symlink+6>:				sub    $0x18c,%esp
0xc01b2699 <nfs_readdir+9>:				sub    $0x18c,%esp
0xc01b347d <nfs_mknod+6>:				sub    $0x17c,%esp
0xc01d71e3 <udf_find_entry+6>:				sub    $0x178,%esp
0xc01b333d <nfs_create+6>:				sub    $0x178,%esp
0xc01b35ca <nfs_mkdir+6>:				sub    $0x174,%esp
0xc02873a3 <radeon_cp_vertex2+3>:			sub    $0x16c,%esp
0xc01583a5 <do_execve+3>:				sub    $0x158,%esp
0xc033e177 <snd_seq_oss_ioctl+3>:			sub    $0x154,%esp
0xc02f13d9 <mmc_ioctl+3>:				sub    $0x154,%esp
0xc017d267 <elf_kcore_store_hdr+6>:			sub    $0x150,%esp
0xc01f048d <reiserfs_readdir+6>:			sub    $0x148,%esp
0xc01b28aa <nfs_lookup_revalidate+11>:			sub    $0x148,%esp
0xc036d0e8 <rt_cache_seq_show+6>:			sub    $0x144,%esp
0xc01d4115 <udf_fill_inode+6>:				sub    $0x144,%esp
0xc032fec8 <snd_pcm_info_user+3>:			sub    $0x140,%esp
0xc0286167 <radeon_cp_clear+3>:				sub    $0x13c,%esp
0xc019608f <journal_commit_transaction+6>:		sub    $0x13c,%esp
0xc0174db5 <load_elf_binary+20>:			sub    $0x13c,%esp
0xc03b5ba4 <ip_map_parse+3>:				sub    $0x138,%esp
0xc035c698 <sys_sendmsg+8>:				sub    $0x134,%esp
0xc02f66fe <read_tuple+3>:				sub    $0x134,%esp
0xc01b2ed9 <nfs_lookup+6>:				sub    $0x134,%esp
0xc0172105 <aout_core_dump+21>:				sub    $0x134,%esp
0xc02df535 <ahc_linux_proc_info+11>:			sub    $0x130,%esp
0xc02d8097 <ahc_linux_info+16>:				sub    $0x130,%esp
0xc034d3db <snd_rawmidi_info_select_user+3>:		sub    $0x12c,%esp
0xc032e77a <snd_pcm_proc_info_read+4>:			sub    $0x12c,%esp
0xc0308874 <proc_getdriver+3>:				sub    $0x12c,%esp
0xc01d4c5c <udf_update_inode+6>:			sub    $0x12c,%esp
0xc034d2a5 <snd_rawmidi_info_user+3>:			sub    $0x128,%esp
0xc01e148f <udf_put_filename+3>:			sub    $0x128,%esp
0xc01d9c88 <udf_rename+6>:				sub    $0x128,%esp
0xc0325433 <snd_mixer_oss_build_test+3>:		sub    $0x124,%esp
0xc0321351 <snd_ctl_elem_info+11>:			sub    $0x124,%esp
0xc02f4c26 <verify_cis_cache+6>:			sub    $0x124,%esp
0xc0242307 <acpi_pci_bind+32>:				sub    $0x124,%esp
0xc01e8aff <reiserfs_add_entry+11>:			sub    $0x124,%esp
0xc01cc029 <nlmsvc_proc_granted_msg+3>:			sub    $0x124,%esp
0xc01cbfab <nlmsvc_proc_unlock_msg+3>:			sub    $0x124,%esp
0xc01cbf2d <nlmsvc_proc_cancel_msg+3>:			sub    $0x124,%esp
0xc01cbeaf <nlmsvc_proc_lock_msg+3>:			sub    $0x124,%esp
0xc01cbe31 <nlmsvc_proc_test_msg+3>:			sub    $0x124,%esp
0xc017c649 <meminfo_read_proc+15>:			sub    $0x124,%esp
0xc016a6b5 <setxattr+8>:				sub    $0x124,%esp
0xc01e37ef <autofs4_expire_run+12>:			sub    $0x120,%esp
0xc0198244 <log_do_checkpoint+6>:			sub    $0x120,%esp
0xc016a969 <getxattr+3>:				sub    $0x120,%esp
0xc0257e97 <parport_pc_probe_port+12>:			sub    $0x11c,%esp
0xc024263a <acpi_pci_bind_root+32>:			sub    $0x11c,%esp
0xc01e3118 <autofs4_notify_daemon+12>:			sub    $0x11c,%esp
0xc035c92a <sys_recvmsg+3>:				sub    $0x118,%esp
0xc031c91e <i8042_interrupt+8>:				sub    $0x118,%esp
0xc02ee068 <sg_proc_hoststrs_info+6>:			sub    $0x118,%esp
0xc02551f0 <do_autoprobe+3>:				sub    $0x118,%esp
0xc0241aab <acpi_pci_irq_add_prt+20>:			sub    $0x118,%esp
0xc016adc3 <removexattr+3>:				sub    $0x118,%esp
0xc02deecd <copy_info+3>:				sub    $0x114,%esp
0xc02c05c8 <scsi_request_sense+6>:			sub    $0x114,%esp
0xc020c619 <sys_shmctl+3>:				sub    $0x114,%esp
0xc0203c55 <reiserfs_breada+6>:				sub    $0x114,%esp
0xc012a88b <sys_reboot+10>:				sub    $0x114,%esp
0xc052aeea <pirq_peer_trick+13>:			sub    $0x110,%esp
0xc01a059f <ext2_get_parent+3>:				sub    $0x110,%esp
0xc01719cf <ep_events_transfer+11>:			sub    $0x110,%esp
0xc02efd9b <dvd_read_bca+3>:				sub    $0x10c,%esp
0xc02550e0 <do_active_device+8>:			sub    $0x10c,%esp
0xc01d2ab5 <inode_getblk+6>:				sub    $0x10c,%esp
0xc01898b5 <ext3_get_parent+12>:			sub    $0x10c,%esp
0xc024839d <acpi_bus_match+8>:				sub    $0x108,%esp
0xc029ac87 <e100_do_ethtool_ioctl+10>:			sub    $0x100,%esp
0xc01beba6 <write_filehandle+3>:			sub    $0x100,%esp


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 16:54                         ` Linus Torvalds
@ 2003-02-20 17:24                           ` Jeff Garzik
  2003-02-20 21:21                           ` Alan Cox
                                             ` (3 subsequent siblings)
  4 siblings, 0 replies; 72+ messages in thread
From: Jeff Garzik @ 2003-02-20 17:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Martin J. Bligh, Ingo Molnar, Dave Hansen, Zwane Mwaikambo,
	Chris Wedgwood, Kernel Mailing List, William Lee Irwin III

On Thu, Feb 20, 2003 at 08:54:55AM -0800, Linus Torvalds wrote:
> A sorted list of bad stack users (more than 256 bytes) in my default build
> follows. Anybody can create their own with something like
[...]

Yum.  Thanks for this list (and means to reproduce)...

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 16:54                         ` Linus Torvalds
  2003-02-20 17:24                           ` Jeff Garzik
@ 2003-02-20 21:21                           ` Alan Cox
  2003-02-20 20:20                             ` Linus Torvalds
  2003-02-20 20:23                             ` Martin J. Bligh
  2003-02-21  7:39                           ` [PATCH] snd_pcm_oss_change_params is a stack offender Muli Ben-Yehuda
                                             ` (2 subsequent siblings)
  4 siblings, 2 replies; 72+ messages in thread
From: Alan Cox @ 2003-02-20 21:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Martin J. Bligh, Ingo Molnar, Dave Hansen, Zwane Mwaikambo,
	Chris Wedgwood, Linux Kernel Mailing List, William Lee Irwin III

On Thu, 2003-02-20 at 16:54, Linus Torvalds wrote:
> Ok, the 4kB stack definitely won't work in real life, but that's because 
> we have some hopelessly bad stack users in the kernel. But the debugging 
> part would be good to try (in fact, it might be a good idea to keep the 
> 8kB stack, but with rather anal debugging. Just the "mcount" part should 
> do that).

You also need IRQ stacks to get down to 4K. The wrong pattern of ten
different IRQ handlers using a mere 200 bytes each will eventually
happen and eventually kill you otherwise.

> That ide_unregister() thing uses up >2kB in just one call! And there are 
> several in the 1.5kB range too, with a long list of ~500 byte offenders.

ide_unregister is a really stupid one. Its copying a struct mostly to
restore fields it shouldnt be restoring but should be setting in the 
allocator. I hadn't realised quite how bad it was. Added to the ide
shitlist



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 21:21                           ` Alan Cox
@ 2003-02-20 20:20                             ` Linus Torvalds
  2003-02-20 20:23                             ` Martin J. Bligh
  1 sibling, 0 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-20 20:20 UTC (permalink / raw)
  To: Alan Cox
  Cc: Martin J. Bligh, Ingo Molnar, Dave Hansen, Zwane Mwaikambo,
	Chris Wedgwood, Linux Kernel Mailing List, William Lee Irwin III


On 20 Feb 2003, Alan Cox wrote:
> On Thu, 2003-02-20 at 16:54, Linus Torvalds wrote:
> > Ok, the 4kB stack definitely won't work in real life, but that's because 
> > we have some hopelessly bad stack users in the kernel. But the debugging 
> > part would be good to try (in fact, it might be a good idea to keep the 
> > 8kB stack, but with rather anal debugging. Just the "mcount" part should 
> > do that).
> 
> You also need IRQ stacks to get down to 4K. The wrong pattern of ten
> different IRQ handlers using a mere 200 bytes each will eventually
> happen and eventually kill you otherwise.

Martin's patch set included the per-IRQ stacks, so that part should be ok. 
However, since even a single function will overflow the stack depth test 
of "half the stack", I'm just saying that right now the 4kB stacks 
obviously shouldn't be used for overflow testing (and the 8kB stack 
version right now is way too permissive).

> > That ide_unregister() thing uses up >2kB in just one call! And there are 
> > several in the 1.5kB range too, with a long list of ~500 byte offenders.
> 
> ide_unregister is a really stupid one. Its copying a struct mostly to
> restore fields it shouldnt be restoring but should be setting in the 
> allocator. I hadn't realised quite how bad it was. Added to the ide
> shitlist

Well, ide_unregister() was only the worst of a fairly large bunch of crap. 

Although I guess nobody is really surprised.

		Linus


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 21:21                           ` Alan Cox
  2003-02-20 20:20                             ` Linus Torvalds
@ 2003-02-20 20:23                             ` Martin J. Bligh
  2003-02-20 20:42                               ` William Lee Irwin III
  1 sibling, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2003-02-20 20:23 UTC (permalink / raw)
  To: Alan Cox, Linus Torvalds
  Cc: Ingo Molnar, Dave Hansen, Zwane Mwaikambo, Chris Wedgwood,
	Linux Kernel Mailing List, William Lee Irwin III

>> Ok, the 4kB stack definitely won't work in real life, but that's because 
>> we have some hopelessly bad stack users in the kernel. But the debugging 
>> part would be good to try (in fact, it might be a good idea to keep the 
>> 8kB stack, but with rather anal debugging. Just the "mcount" part should 
>> do that).
> 
> You also need IRQ stacks to get down to 4K. The wrong pattern of ten
> different IRQ handlers using a mere 200 bytes each will eventually
> happen and eventually kill you otherwise.

That's in Dave's patchset, and 4K stacks is a config option for now.

M.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 20:23                             ` Martin J. Bligh
@ 2003-02-20 20:42                               ` William Lee Irwin III
  2003-02-20 20:51                                 ` Linus Torvalds
  0 siblings, 1 reply; 72+ messages in thread
From: William Lee Irwin III @ 2003-02-20 20:42 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Alan Cox, Linus Torvalds, Ingo Molnar, Dave Hansen,
	Zwane Mwaikambo, Chris Wedgwood, Linux Kernel Mailing List

At some point in the past, _A_ wrote:
>> You also need IRQ stacks to get down to 4K. The wrong pattern of ten
>> different IRQ handlers using a mere 200 bytes each will eventually
>> happen and eventually kill you otherwise.

On Thu, Feb 20, 2003 at 12:23:49PM -0800, Martin J. Bligh wrote:
> That's in Dave's patchset, and 4K stacks is a config option for now.

You might want to grab aeb's fully non-recursive pathwalking if
you really want to cut back the stack to 4KB, as well as fixing
whatever stackblasting drivers are about.


-- wli

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 20:42                               ` William Lee Irwin III
@ 2003-02-20 20:51                                 ` Linus Torvalds
  0 siblings, 0 replies; 72+ messages in thread
From: Linus Torvalds @ 2003-02-20 20:51 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Martin J. Bligh, Alan Cox, Ingo Molnar, Dave Hansen,
	Zwane Mwaikambo, Chris Wedgwood, Linux Kernel Mailing List

On Thu, 20 Feb 2003, William Lee Irwin III wrote:
> 
> You might want to grab aeb's fully non-recursive pathwalking if
> you really want to cut back the stack to 4KB, as well as fixing
> whatever stackblasting drivers are about.

The path walking should really not be an issue. Each level of a symlink
takes something like 64 bytes of stack on x86 (I checked it some time ago,
maybe it's changed a bit), since the actual recursive part is very shallow
indeed.

And since we don't recurse deeper than 5 levels anyway, the symlink 
recursion ends up not being a real problem compared to a lot of other 
code (never mind the single functions with hundreds of bytes of stack 
space: just regular function calls 5 levels deep is quite normal).

That fs recursion was not the problem even back in the days when the max
stack depth was <3kB (4kB allocation, 1kB task_struct). It used to be 8
levels deep or something, it was changed to 5 not because we ran out on
x86, but because of those stupid sparc register windows (causing much
bigger minimum function stack requirements than on x86).

			Linus

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH] snd_pcm_oss_change_params is a stack offender
  2003-02-20 16:54                         ` Linus Torvalds
  2003-02-20 17:24                           ` Jeff Garzik
  2003-02-20 21:21                           ` Alan Cox
@ 2003-02-21  7:39                           ` Muli Ben-Yehuda
  2003-02-21  7:58                             ` Andreas Dilger
  2003-02-27 18:50                           ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Randy.Dunlap
  2003-02-27 23:32                           ` Randy.Dunlap
  4 siblings, 1 reply; 72+ messages in thread
From: Muli Ben-Yehuda @ 2003-02-21  7:39 UTC (permalink / raw)
  To: Jaroslav Kysela; +Cc: Kernel Mailing List

On Thu, Feb 20, 2003 at 08:54:55AM -0800, Linus Torvalds wrote:

> Ok, the 4kB stack definitely won't work in real life, but that's because 
> we have some hopelessly bad stack users in the kernel. But the debugging 
> part would be good to try (in fact, it might be a good idea to keep the 
> 8kB stack, but with rather anal debugging. Just the "mcount" part should 
> do that).
> 
> A sorted list of bad stack users (more than 256 bytes) in my default build
> follows. Anybody can create their own with something like
> 
> 	objdump -d linux/vmlinux |
> 		grep 'sub.*$0x...,.*esp' |
> 		awk '{ print $9,$1 }' |
> 		sort > bigstack
> 
> and a script to look up the addresses.
> 
[snipped] 

> 0xc02ae062 <ide_unregister+8>:				sub    $0x8c4,%esp
> 0xc010535d <huft_build+9>:				sub    $0x5b0,%esp
> 0xc0326a53 <snd_pcm_oss_change_params+6>:		sub    $0x590,%esp

Here's a quick patch to fix the third worst offender,
snd_pcm_oss_change_params. Compiles fine but not tested yet. 

#	sound/core/oss/pcm_oss.c	1.20    -> 1.21   
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/02/21	mulix@alhambra.mulix.org	1.1007
# snd_pcm_oss_change_params was a stack offender, having three large 
# structs on the stack. Allocate those structs on the heap and change
# the code accordingly. 
# --------------------------------------------
#
diff -Nru a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
--- a/sound/core/oss/pcm_oss.c	Fri Feb 21 09:35:24 2003
+++ b/sound/core/oss/pcm_oss.c	Fri Feb 21 09:35:24 2003
@@ -291,11 +291,51 @@
 	return snd_pcm_hw_param_near(substream, params, SNDRV_PCM_HW_PARAM_RATE, best_rate, 0);
 }
 
+static int alloc_param_structs(snd_pcm_hw_params_t** params, 
+			       snd_pcm_hw_params_t** sparams,
+			       snd_pcm_sw_params_t** sw_params)
+{
+	snd_pcm_hw_params_t* hwp; 
+	snd_pcm_sw_params_t* swp; 
+
+	if (!(hwp = kmalloc(sizeof(*hwp), GFP_KERNEL)))
+		goto out; 
+
+	memset(hwp, 0, sizeof(*hwp)); 
+	*params = hwp; 
+
+	if (!(hwp = kmalloc(sizeof(*hwp), GFP_KERNEL)))
+		goto free_params; 
+
+	memset(hwp, 0, sizeof(*hwp)); 
+	*sparams = hwp; 
+
+	if (!(swp = kmalloc(sizeof(*swp), GFP_KERNEL)))
+		goto free_sparams; 
+
+	memset(swp, 0, sizeof(*swp)); 
+	*sw_params = swp; 
+	
+	return 0; 
+
+ free_sparams:
+	kfree(*sparams); 
+	*sparams = NULL; 
+
+ free_params:
+	kfree(*params); 
+	*params = NULL; 
+
+ out: 
+	return -ENOMEM; 
+}
+
+
 static int snd_pcm_oss_change_params(snd_pcm_substream_t *substream)
 {
 	snd_pcm_runtime_t *runtime = substream->runtime;
-	snd_pcm_hw_params_t params, sparams;
-	snd_pcm_sw_params_t sw_params;
+	snd_pcm_hw_params_t *params, *sparams;
+	snd_pcm_sw_params_t *sw_params;
 	ssize_t oss_buffer_size, oss_period_size;
 	size_t oss_frame_size;
 	int err;
@@ -311,9 +351,14 @@
 		direct = (setup != NULL && setup->direct);
 	}
 
-	_snd_pcm_hw_params_any(&sparams);
-	_snd_pcm_hw_param_setinteger(&sparams, SNDRV_PCM_HW_PARAM_PERIODS);
-	_snd_pcm_hw_param_min(&sparams, SNDRV_PCM_HW_PARAM_PERIODS, 2, 0);
+	if ((err = alloc_param_structs(&params, &sparams, &sw_params))) {
+		snd_printd("out of memory\n"); 
+		return err; 
+	}
+
+	_snd_pcm_hw_params_any(sparams);
+	_snd_pcm_hw_param_setinteger(sparams, SNDRV_PCM_HW_PARAM_PERIODS);
+	_snd_pcm_hw_param_min(sparams, SNDRV_PCM_HW_PARAM_PERIODS, 2, 0);
 	snd_mask_none(&mask);
 	if (atomic_read(&runtime->mmap_count))
 		snd_mask_set(&mask, SNDRV_PCM_ACCESS_MMAP_INTERLEAVED);
@@ -322,17 +367,17 @@
 		if (!direct)
 			snd_mask_set(&mask, SNDRV_PCM_ACCESS_RW_NONINTERLEAVED);
 	}
-	err = snd_pcm_hw_param_mask(substream, &sparams, SNDRV_PCM_HW_PARAM_ACCESS, &mask);
+	err = snd_pcm_hw_param_mask(substream, sparams, SNDRV_PCM_HW_PARAM_ACCESS, &mask);
 	if (err < 0) {
 		snd_printd("No usable accesses\n");
 		return -EINVAL;
 	}
-	choose_rate(substream, &sparams, runtime->oss.rate);
-	snd_pcm_hw_param_near(substream, &sparams, SNDRV_PCM_HW_PARAM_CHANNELS, runtime->oss.channels, 0);
+	choose_rate(substream, sparams, runtime->oss.rate);
+	snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_CHANNELS, runtime->oss.channels, 0);
 
 	format = snd_pcm_oss_format_from(runtime->oss.format);
 
-	sformat_mask = *hw_param_mask(&sparams, SNDRV_PCM_HW_PARAM_FORMAT);
+	sformat_mask = *hw_param_mask(sparams, SNDRV_PCM_HW_PARAM_FORMAT);
 	if (direct)
 		sformat = format;
 	else
@@ -349,46 +394,46 @@
 			return -EINVAL;
 		}
 	}
-	err = _snd_pcm_hw_param_set(&sparams, SNDRV_PCM_HW_PARAM_FORMAT, sformat, 0);
+	err = _snd_pcm_hw_param_set(sparams, SNDRV_PCM_HW_PARAM_FORMAT, sformat, 0);
 	snd_assert(err >= 0, return err);
 
 	if (direct) {
-		params = sparams;
+		memcpy(params, sparams, sizeof(*params)); 
 	} else {
-		_snd_pcm_hw_params_any(&params);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_ACCESS,
+		_snd_pcm_hw_params_any(params);
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_ACCESS,
 				      SNDRV_PCM_ACCESS_RW_INTERLEAVED, 0);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_FORMAT,
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_FORMAT,
 				      snd_pcm_oss_format_from(runtime->oss.format), 0);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_CHANNELS,
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_CHANNELS,
 				      runtime->oss.channels, 0);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_RATE,
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_RATE,
 				      runtime->oss.rate, 0);
 		pdprintf("client: access = %i, format = %i, channels = %i, rate = %i\n",
-			 params_access(&params), params_format(&params),
-			 params_channels(&params), params_rate(&params));
+			 params_access(params), params_format(params),
+			 params_channels(params), params_rate(params));
 	}
 	pdprintf("slave: access = %i, format = %i, channels = %i, rate = %i\n",
-		 params_access(&sparams), params_format(&sparams),
-		 params_channels(&sparams), params_rate(&sparams));
+		 params_access(sparams), params_format(sparams),
+		 params_channels(sparams), params_rate(sparams));
 
-	oss_frame_size = snd_pcm_format_physical_width(params_format(&params)) *
-			 params_channels(&params) / 8;
+	oss_frame_size = snd_pcm_format_physical_width(params_format(params)) *
+			 params_channels(params) / 8;
 
 	snd_pcm_oss_plugin_clear(substream);
 	if (!direct) {
 		/* add necessary plugins */
 		snd_pcm_oss_plugin_clear(substream);
 		if ((err = snd_pcm_plug_format_plugins(substream,
-						       &params, 
-						       &sparams)) < 0) {
+						       params, 
+						       sparams)) < 0) {
 			snd_printd("snd_pcm_plug_format_plugins failed: %i\n", err);
 			snd_pcm_oss_plugin_clear(substream);
 			return err;
 		}
 		if (runtime->oss.plugin_first) {
 			snd_pcm_plugin_t *plugin;
-			if ((err = snd_pcm_plugin_build_io(substream, &sparams, &plugin)) < 0) {
+			if ((err = snd_pcm_plugin_build_io(substream, sparams, &plugin)) < 0) {
 				snd_printd("snd_pcm_plugin_build_io failed: %i\n", err);
 				snd_pcm_oss_plugin_clear(substream);
 				return err;
@@ -405,51 +450,50 @@
 		}
 	}
 
-	err = snd_pcm_oss_period_size(substream, &params, &sparams);
+	err = snd_pcm_oss_period_size(substream, params, sparams);
 	if (err < 0)
 		return err;
 
 	n = snd_pcm_plug_slave_size(substream, runtime->oss.period_bytes / oss_frame_size);
-	err = snd_pcm_hw_param_near(substream, &sparams, SNDRV_PCM_HW_PARAM_PERIOD_SIZE, n, 0);
+	err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIOD_SIZE, n, 0);
 	snd_assert(err >= 0, return err);
 
-	err = snd_pcm_hw_param_near(substream, &sparams, SNDRV_PCM_HW_PARAM_PERIODS,
+	err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIODS,
 				     runtime->oss.periods, 0);
 	snd_assert(err >= 0, return err);
 
 	snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_DROP, 0);
 
-	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_HW_PARAMS, &sparams)) < 0) {
+	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_HW_PARAMS, sparams)) < 0) {
 		snd_printd("HW_PARAMS failed: %i\n", err);
 		return err;
 	}
 
-	memset(&sw_params, 0, sizeof(sw_params));
 	if (runtime->oss.trigger) {
-		sw_params.start_threshold = 1;
+		sw_params->start_threshold = 1;
 	} else {
-		sw_params.start_threshold = runtime->boundary;
+		sw_params->start_threshold = runtime->boundary;
 	}
 	if (atomic_read(&runtime->mmap_count))
-		sw_params.stop_threshold = runtime->boundary;
+		sw_params->stop_threshold = runtime->boundary;
 	else
-		sw_params.stop_threshold = runtime->buffer_size;
-	sw_params.tstamp_mode = SNDRV_PCM_TSTAMP_NONE;
-	sw_params.period_step = 1;
-	sw_params.sleep_min = 0;
-	sw_params.avail_min = runtime->period_size;
-	sw_params.xfer_align = 1;
-	sw_params.silence_threshold = 0;
-	sw_params.silence_size = 0;
+		sw_params->stop_threshold = runtime->buffer_size;
+	sw_params->tstamp_mode = SNDRV_PCM_TSTAMP_NONE;
+	sw_params->period_step = 1;
+	sw_params->sleep_min = 0;
+	sw_params->avail_min = runtime->period_size;
+	sw_params->xfer_align = 1;
+	sw_params->silence_threshold = 0;
+	sw_params->silence_size = 0;
 
-	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_SW_PARAMS, &sw_params)) < 0) {
+	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_SW_PARAMS, sw_params)) < 0) {
 		snd_printd("SW_PARAMS failed: %i\n", err);
 		return err;
 	}
 	runtime->control->avail_min = runtime->period_size;
 
-	runtime->oss.periods = params_periods(&sparams);
-	oss_period_size = snd_pcm_plug_client_size(substream, params_period_size(&sparams));
+	runtime->oss.periods = params_periods(sparams);
+	oss_period_size = snd_pcm_plug_client_size(substream, params_period_size(sparams));
 	snd_assert(oss_period_size >= 0, return -EINVAL);
 	if (runtime->oss.plugin_first) {
 		err = snd_pcm_plug_alloc(substream, oss_period_size);
@@ -468,12 +512,12 @@
 		 runtime->oss.period_bytes,
 		 runtime->oss.buffer_bytes);
 	pdprintf("slave: period_size = %i, buffer_size = %i\n",
-		 params_period_size(&sparams),
-		 params_buffer_size(&sparams));
+		 params_period_size(sparams),
+		 params_buffer_size(sparams));
 
-	runtime->oss.format = snd_pcm_oss_format_to(params_format(&params));
-	runtime->oss.channels = params_channels(&params);
-	runtime->oss.rate = params_rate(&params);
+	runtime->oss.format = snd_pcm_oss_format_to(params_format(params));
+	runtime->oss.channels = params_channels(params);
+	runtime->oss.rate = params_rate(params);
 
 	runtime->oss.params = 0;
 	runtime->oss.prepare = 1;


-- 
Muli Ben-Yehuda
http://www.mulix.org


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] snd_pcm_oss_change_params is a stack offender
  2003-02-21  7:39                           ` [PATCH] snd_pcm_oss_change_params is a stack offender Muli Ben-Yehuda
@ 2003-02-21  7:58                             ` Andreas Dilger
  2003-02-21  8:20                               ` Muli Ben-Yehuda
  0 siblings, 1 reply; 72+ messages in thread
From: Andreas Dilger @ 2003-02-21  7:58 UTC (permalink / raw)
  To: Muli Ben-Yehuda; +Cc: Jaroslav Kysela, Kernel Mailing List

On Feb 21, 2003  09:39 +0200, Muli Ben-Yehuda wrote:
> +static int alloc_param_structs(snd_pcm_hw_params_t** params, 
> +			       snd_pcm_hw_params_t** sparams,
> +			       snd_pcm_sw_params_t** sw_params)

So, it looks like you've changed a large stack user into a leaker of
memory.  Nowhere is the allocated memory freed, AFAICS, not upon
successful completion, nor at any of the error exits.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] snd_pcm_oss_change_params is a stack offender
  2003-02-21  7:58                             ` Andreas Dilger
@ 2003-02-21  8:20                               ` Muli Ben-Yehuda
  0 siblings, 0 replies; 72+ messages in thread
From: Muli Ben-Yehuda @ 2003-02-21  8:20 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jaroslav Kysela, Kernel Mailing List

[again, with a real subject line this time. This just isn't my day]. 

On Fri, Feb 21, 2003 at 12:58:52AM -0700, Andreas Dilger wrote:
> On Feb 21, 2003  09:39 +0200, Muli Ben-Yehuda wrote:
> > +static int alloc_param_structs(snd_pcm_hw_params_t** params, 
> > +			       snd_pcm_hw_params_t** sparams,
> > +			       snd_pcm_sw_params_t** sw_params)
> 
> So, it looks like you've changed a large stack user into a leaker of
> memory.  Nowhere is the allocated memory freed, AFAICS, not upon
> successful completion, nor at any of the error exits.

Thanks for spotting. I can only claim not having woken up yet. 

Here's a fixed patch, which frees the allocations properly. I didn't
want to make more than the minimal changes necessary, but if it's ok
with the maintainer, it should be switched to the common "goto style",
and something should be done about those snd_asserts. Jaroslav, ok to
rewrite?

#	sound/core/oss/pcm_oss.c	1.20    -> 1.22   
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/02/21	mulix@alhambra.mulix.org	1.1007
# snd_pcm_oss_change_params was a stack offender, having three large 
# structs on the stack. Allocate those structs on the heap and change
# the code accordingly. 
# --------------------------------------------
# 03/02/21	mulix@alhambra.mulix.org	1.1008
# This time, also free the memory :-((
# Thanks to Andreas Dilger for spotting.
# --------------------------------------------
#
diff -Nru a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
--- a/sound/core/oss/pcm_oss.c	Fri Feb 21 10:15:10 2003
+++ b/sound/core/oss/pcm_oss.c	Fri Feb 21 10:15:10 2003
@@ -291,11 +291,58 @@
 	return snd_pcm_hw_param_near(substream, params, SNDRV_PCM_HW_PARAM_RATE, best_rate, 0);
 }
 
+static int alloc_param_structs(snd_pcm_hw_params_t** params, 
+			       snd_pcm_hw_params_t** sparams,
+			       snd_pcm_sw_params_t** sw_params)
+{
+	snd_pcm_hw_params_t* hwp; 
+	snd_pcm_sw_params_t* swp; 
+
+	if (!(hwp = kmalloc(sizeof(*hwp), GFP_KERNEL)))
+		goto out; 
+
+	memset(hwp, 0, sizeof(*hwp)); 
+	*params = hwp; 
+
+	if (!(hwp = kmalloc(sizeof(*hwp), GFP_KERNEL)))
+		goto free_params; 
+
+	memset(hwp, 0, sizeof(*hwp)); 
+	*sparams = hwp; 
+
+	if (!(swp = kmalloc(sizeof(*swp), GFP_KERNEL)))
+		goto free_sparams; 
+
+	memset(swp, 0, sizeof(*swp)); 
+	*sw_params = swp; 
+	
+	return 0; 
+
+ free_sparams:
+	kfree(*sparams); 
+	*sparams = NULL; 
+
+ free_params:
+	kfree(*params); 
+	*params = NULL; 
+
+ out: 
+	return -ENOMEM; 
+}
+
+static void free_param_structs(snd_pcm_hw_params_t* params, snd_pcm_hw_params_t* sparams,
+			       snd_pcm_sw_params_t* sw_params)
+{
+	kfree(params); 
+	kfree(sparams); 
+	kfree(sw_params); 
+}
+
 static int snd_pcm_oss_change_params(snd_pcm_substream_t *substream)
 {
 	snd_pcm_runtime_t *runtime = substream->runtime;
-	snd_pcm_hw_params_t params, sparams;
-	snd_pcm_sw_params_t sw_params;
+	snd_pcm_hw_params_t *params, *sparams;
+	snd_pcm_sw_params_t *sw_params;
 	ssize_t oss_buffer_size, oss_period_size;
 	size_t oss_frame_size;
 	int err;
@@ -311,9 +358,14 @@
 		direct = (setup != NULL && setup->direct);
 	}
 
-	_snd_pcm_hw_params_any(&sparams);
-	_snd_pcm_hw_param_setinteger(&sparams, SNDRV_PCM_HW_PARAM_PERIODS);
-	_snd_pcm_hw_param_min(&sparams, SNDRV_PCM_HW_PARAM_PERIODS, 2, 0);
+	if ((err = alloc_param_structs(&params, &sparams, &sw_params))) {
+		snd_printd("out of memory\n"); 
+		return err; 
+	}
+
+	_snd_pcm_hw_params_any(sparams);
+	_snd_pcm_hw_param_setinteger(sparams, SNDRV_PCM_HW_PARAM_PERIODS);
+	_snd_pcm_hw_param_min(sparams, SNDRV_PCM_HW_PARAM_PERIODS, 2, 0);
 	snd_mask_none(&mask);
 	if (atomic_read(&runtime->mmap_count))
 		snd_mask_set(&mask, SNDRV_PCM_ACCESS_MMAP_INTERLEAVED);
@@ -322,17 +374,18 @@
 		if (!direct)
 			snd_mask_set(&mask, SNDRV_PCM_ACCESS_RW_NONINTERLEAVED);
 	}
-	err = snd_pcm_hw_param_mask(substream, &sparams, SNDRV_PCM_HW_PARAM_ACCESS, &mask);
+	err = snd_pcm_hw_param_mask(substream, sparams, SNDRV_PCM_HW_PARAM_ACCESS, &mask);
 	if (err < 0) {
+		free_param_structs(params, sparams, sw_params); 
 		snd_printd("No usable accesses\n");
 		return -EINVAL;
 	}
-	choose_rate(substream, &sparams, runtime->oss.rate);
-	snd_pcm_hw_param_near(substream, &sparams, SNDRV_PCM_HW_PARAM_CHANNELS, runtime->oss.channels, 0);
+	choose_rate(substream, sparams, runtime->oss.rate);
+	snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_CHANNELS, runtime->oss.channels, 0);
 
 	format = snd_pcm_oss_format_from(runtime->oss.format);
 
-	sformat_mask = *hw_param_mask(&sparams, SNDRV_PCM_HW_PARAM_FORMAT);
+	sformat_mask = *hw_param_mask(sparams, SNDRV_PCM_HW_PARAM_FORMAT);
 	if (direct)
 		sformat = format;
 	else
@@ -345,50 +398,53 @@
 				break;
 		}
 		if (sformat > SNDRV_PCM_FORMAT_LAST) {
+			free_param_structs(params, sparams, sw_params); 
 			snd_printd("Cannot find a format!!!\n");
 			return -EINVAL;
 		}
 	}
-	err = _snd_pcm_hw_param_set(&sparams, SNDRV_PCM_HW_PARAM_FORMAT, sformat, 0);
-	snd_assert(err >= 0, return err);
+	err = _snd_pcm_hw_param_set(sparams, SNDRV_PCM_HW_PARAM_FORMAT, sformat, 0);
+	snd_assert(err >= 0, {free_param_structs(params, sparams, sw_params); return err});
 
 	if (direct) {
-		params = sparams;
+		memcpy(params, sparams, sizeof(*params)); 
 	} else {
-		_snd_pcm_hw_params_any(&params);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_ACCESS,
+		_snd_pcm_hw_params_any(params);
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_ACCESS,
 				      SNDRV_PCM_ACCESS_RW_INTERLEAVED, 0);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_FORMAT,
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_FORMAT,
 				      snd_pcm_oss_format_from(runtime->oss.format), 0);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_CHANNELS,
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_CHANNELS,
 				      runtime->oss.channels, 0);
-		_snd_pcm_hw_param_set(&params, SNDRV_PCM_HW_PARAM_RATE,
+		_snd_pcm_hw_param_set(params, SNDRV_PCM_HW_PARAM_RATE,
 				      runtime->oss.rate, 0);
 		pdprintf("client: access = %i, format = %i, channels = %i, rate = %i\n",
-			 params_access(&params), params_format(&params),
-			 params_channels(&params), params_rate(&params));
+			 params_access(params), params_format(params),
+			 params_channels(params), params_rate(params));
 	}
 	pdprintf("slave: access = %i, format = %i, channels = %i, rate = %i\n",
-		 params_access(&sparams), params_format(&sparams),
-		 params_channels(&sparams), params_rate(&sparams));
+		 params_access(sparams), params_format(sparams),
+		 params_channels(sparams), params_rate(sparams));
 
-	oss_frame_size = snd_pcm_format_physical_width(params_format(&params)) *
-			 params_channels(&params) / 8;
+	oss_frame_size = snd_pcm_format_physical_width(params_format(params)) *
+			 params_channels(params) / 8;
 
 	snd_pcm_oss_plugin_clear(substream);
 	if (!direct) {
 		/* add necessary plugins */
 		snd_pcm_oss_plugin_clear(substream);
 		if ((err = snd_pcm_plug_format_plugins(substream,
-						       &params, 
-						       &sparams)) < 0) {
+						       params, 
+						       sparams)) < 0) {
+			free_param_structs(params, sparams, sw_params); 
 			snd_printd("snd_pcm_plug_format_plugins failed: %i\n", err);
 			snd_pcm_oss_plugin_clear(substream);
 			return err;
 		}
 		if (runtime->oss.plugin_first) {
 			snd_pcm_plugin_t *plugin;
-			if ((err = snd_pcm_plugin_build_io(substream, &sparams, &plugin)) < 0) {
+			if ((err = snd_pcm_plugin_build_io(substream, sparams, &plugin)) < 0) {
+				free_param_structs(params, sparams, sw_params); 
 				snd_printd("snd_pcm_plugin_build_io failed: %i\n", err);
 				snd_pcm_oss_plugin_clear(substream);
 				return err;
@@ -399,67 +455,73 @@
 				err = snd_pcm_plugin_insert(plugin);
 			}
 			if (err < 0) {
+				free_param_structs(params, sparams, sw_params); 
 				snd_pcm_oss_plugin_clear(substream);
 				return err;
 			}
 		}
 	}
 
-	err = snd_pcm_oss_period_size(substream, &params, &sparams);
-	if (err < 0)
+	err = snd_pcm_oss_period_size(substream, params, sparams);
+	if (err < 0) {
+		free_param_structs(params, sparams, sw_params); 
 		return err;
+	}
 
 	n = snd_pcm_plug_slave_size(substream, runtime->oss.period_bytes / oss_frame_size);
-	err = snd_pcm_hw_param_near(substream, &sparams, SNDRV_PCM_HW_PARAM_PERIOD_SIZE, n, 0);
-	snd_assert(err >= 0, return err);
+	err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIOD_SIZE, n, 0);
+	snd_assert(err >= 0, {free_param_structs(params, sparams, sw_params); return err});
 
-	err = snd_pcm_hw_param_near(substream, &sparams, SNDRV_PCM_HW_PARAM_PERIODS,
+	err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIODS,
 				     runtime->oss.periods, 0);
-	snd_assert(err >= 0, return err);
+	snd_assert(err >= 0, {free_param_structs(params, sparams, sw_params); return err});
 
 	snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_DROP, 0);
 
-	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_HW_PARAMS, &sparams)) < 0) {
+	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_HW_PARAMS, sparams)) < 0) {
+		free_param_structs(params, sparams, sw_params); 
 		snd_printd("HW_PARAMS failed: %i\n", err);
 		return err;
 	}
 
-	memset(&sw_params, 0, sizeof(sw_params));
 	if (runtime->oss.trigger) {
-		sw_params.start_threshold = 1;
+		sw_params->start_threshold = 1;
 	} else {
-		sw_params.start_threshold = runtime->boundary;
+		sw_params->start_threshold = runtime->boundary;
 	}
 	if (atomic_read(&runtime->mmap_count))
-		sw_params.stop_threshold = runtime->boundary;
+		sw_params->stop_threshold = runtime->boundary;
 	else
-		sw_params.stop_threshold = runtime->buffer_size;
-	sw_params.tstamp_mode = SNDRV_PCM_TSTAMP_NONE;
-	sw_params.period_step = 1;
-	sw_params.sleep_min = 0;
-	sw_params.avail_min = runtime->period_size;
-	sw_params.xfer_align = 1;
-	sw_params.silence_threshold = 0;
-	sw_params.silence_size = 0;
+		sw_params->stop_threshold = runtime->buffer_size;
+	sw_params->tstamp_mode = SNDRV_PCM_TSTAMP_NONE;
+	sw_params->period_step = 1;
+	sw_params->sleep_min = 0;
+	sw_params->avail_min = runtime->period_size;
+	sw_params->xfer_align = 1;
+	sw_params->silence_threshold = 0;
+	sw_params->silence_size = 0;
 
-	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_SW_PARAMS, &sw_params)) < 0) {
+	if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_SW_PARAMS, sw_params)) < 0) {
+		free_param_structs(params, sparams, sw_params); 
 		snd_printd("SW_PARAMS failed: %i\n", err);
 		return err;
 	}
 	runtime->control->avail_min = runtime->period_size;
 
-	runtime->oss.periods = params_periods(&sparams);
-	oss_period_size = snd_pcm_plug_client_size(substream, params_period_size(&sparams));
-	snd_assert(oss_period_size >= 0, return -EINVAL);
+	runtime->oss.periods = params_periods(sparams);
+	oss_period_size = snd_pcm_plug_client_size(substream, params_period_size(sparams));
+	snd_assert(oss_period_size >= 0, {free_param_structs(params, sparams, sw_params); return -EINVAL});
 	if (runtime->oss.plugin_first) {
 		err = snd_pcm_plug_alloc(substream, oss_period_size);
-		if (err < 0)
+		if (err < 0) {
+			free_param_structs(params, sparams, sw_params); 
 			return err;
+		}
 	}
 	oss_period_size *= oss_frame_size;
 
 	oss_buffer_size = oss_period_size * runtime->oss.periods;
-	snd_assert(oss_buffer_size >= 0, return -EINVAL);
+	snd_assert(oss_buffer_size >= 0, {free_param_structs(params, sparams, sw_params); return -EINVAL});
 
 	runtime->oss.period_bytes = oss_period_size;
 	runtime->oss.buffer_bytes = oss_buffer_size;
@@ -468,12 +530,12 @@
 		 runtime->oss.period_bytes,
 		 runtime->oss.buffer_bytes);
 	pdprintf("slave: period_size = %i, buffer_size = %i\n",
-		 params_period_size(&sparams),
-		 params_buffer_size(&sparams));
+		 params_period_size(sparams),
+		 params_buffer_size(sparams));
 
-	runtime->oss.format = snd_pcm_oss_format_to(params_format(&params));
-	runtime->oss.channels = params_channels(&params);
-	runtime->oss.rate = params_rate(&params);
+	runtime->oss.format = snd_pcm_oss_format_to(params_format(params));
+	runtime->oss.channels = params_channels(params);
+	runtime->oss.rate = params_rate(params);
 
 	runtime->oss.params = 0;
 	runtime->oss.prepare = 1;
@@ -483,6 +545,8 @@
 	runtime->oss.buffer_used = 0;
 	if (runtime->dma_area)
 		snd_pcm_format_set_silence(runtime->format, runtime->dma_area, bytes_to_samples(runtime, runtime->dma_bytes));
+
+	free_param_structs(params, sparams, sw_params); 
 	return 0;
 }
 

-- 
Muli Ben-Yehuda
http://www.mulix.org



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 16:54                         ` Linus Torvalds
                                             ` (2 preceding siblings ...)
  2003-02-21  7:39                           ` [PATCH] snd_pcm_oss_change_params is a stack offender Muli Ben-Yehuda
@ 2003-02-27 18:50                           ` Randy.Dunlap
  2003-02-27 19:39                             ` Muli Ben-Yehuda
  2003-03-02  6:12                             ` Keith Owens
  2003-02-27 23:32                           ` Randy.Dunlap
  4 siblings, 2 replies; 72+ messages in thread
From: Randy.Dunlap @ 2003-02-27 18:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mbligh, zwane, cw, linux-kernel

On Thu, 20 Feb 2003 08:54:55 -0800 (PST)
Linus Torvalds <torvalds@transmeta.com> wrote:

| On Thu, 20 Feb 2003, Martin J. Bligh wrote:
| > 
| > There are patches in -mjb from Dave Hansen / Ben LaHaise to detect stack
| > overflow included with the stuff for the 4K stacks patch (intended for 
| > scaling to large numbers of tasks). I've split them out attatched, should 
| > apply to mainline reasonably easily.
| 
| Ok, the 4kB stack definitely won't work in real life, but that's because 
| we have some hopelessly bad stack users in the kernel. But the debugging 
| part would be good to try (in fact, it might be a good idea to keep the 
| 8kB stack, but with rather anal debugging. Just the "mcount" part should 
| do that).
| 
| A sorted list of bad stack users (more than 256 bytes) in my default build
| follows. Anybody can create their own with something like
| 
| 	objdump -d linux/vmlinux |
| 		grep 'sub.*$0x...,.*esp' |
| 		awk '{ print $9,$1 }' |
| 		sort > bigstack
| 
| and a script to look up the addresses.
| 
| That ide_unregister() thing uses up >2kB in just one call! And there are 
| several in the 1.5kB range too, with a long list of ~500 byte offenders.
| 
| Yeah, and this assumes we don't have alloca() users or other dynamic 
| stack allocators (non-constant-size automatic arrays). I hope we don't 
| have that kind of crap anywhere..

I don't get a nice listing from this script like you did.
Example of mine is below.  Do I just have a tools issue?

Thanks,
--
~Randy



$0x424,%esp c01f6bc0:
$0x490,%esp c0106010:
$0x4ac,%esp c016aec3:
$0x540,%esp c01061a6:
$0x5ac,%esp c010533e:
$0x798,%esp c02528b8:
$0x924,%esp c02484fb:

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-27 18:50                           ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Randy.Dunlap
@ 2003-02-27 19:39                             ` Muli Ben-Yehuda
  2003-02-27 19:47                               ` Randy.Dunlap
  2003-03-02  6:12                             ` Keith Owens
  1 sibling, 1 reply; 72+ messages in thread
From: Muli Ben-Yehuda @ 2003-02-27 19:39 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 873 bytes --]

On Thu, Feb 27, 2003 at 10:50:56AM -0800, Randy.Dunlap wrote:
> On Thu, 20 Feb 2003 08:54:55 -0800 (PST)
> Linus Torvalds <torvalds@transmeta.com> wrote:

[snipped] 

> | A sorted list of bad stack users (more than 256 bytes) in my default build
> | follows. Anybody can create their own with something like
> | 
> | 	objdump -d linux/vmlinux |
> | 		grep 'sub.*$0x...,.*esp' |
> | 		awk '{ print $9,$1 }' |
> | 		sort > bigstack
> | 
> | and a script to look up the addresses.

[snipped] 

> I don't get a nice listing from this script like you did.
> Example of mine is below.  Do I just have a tools issue?

See the part where Linus said "...and a script to look up the
addresses.". You can use 'ksymoops -v vmlinux -m System.map --no-ksyms
--no-lsmod -A 0xcodebabe' to translate address to symbol. 
-- 
Muli Ben-Yehuda
http://www.mulix.org


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-27 19:39                             ` Muli Ben-Yehuda
@ 2003-02-27 19:47                               ` Randy.Dunlap
  0 siblings, 0 replies; 72+ messages in thread
From: Randy.Dunlap @ 2003-02-27 19:47 UTC (permalink / raw)
  To: Muli Ben-Yehuda; +Cc: linux-kernel

On Thu, 27 Feb 2003 21:39:44 +0200
Muli Ben-Yehuda <mulix@mulix.org> wrote:

| On Thu, Feb 27, 2003 at 10:50:56AM -0800, Randy.Dunlap wrote:
| > On Thu, 20 Feb 2003 08:54:55 -0800 (PST)
| > Linus Torvalds <torvalds@transmeta.com> wrote:
| 
| [snipped] 
| 
| > | A sorted list of bad stack users (more than 256 bytes) in my default build
| > | follows. Anybody can create their own with something like
| > | 
| > | 	objdump -d linux/vmlinux |
| > | 		grep 'sub.*$0x...,.*esp' |
| > | 		awk '{ print $9,$1 }' |
| > | 		sort > bigstack
| > | 
| > | and a script to look up the addresses.
| 
| [snipped] 
| 
| > I don't get a nice listing from this script like you did.
| > Example of mine is below.  Do I just have a tools issue?
| 
| See the part where Linus said "...and a script to look up the
| addresses.". You can use 'ksymoops -v vmlinux -m System.map --no-ksyms
| --no-lsmod -A 0xcodebabe' to translate address to symbol. 

Yes, sorry about skimming over that.
And yes, I'm familiar with that option of ksymoops.*  :)

--
~Randy


*: since it's based on
   http://www.osdl.org/archive/rddunlap/scripts/ksysmap

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-27 18:50                           ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Randy.Dunlap
  2003-02-27 19:39                             ` Muli Ben-Yehuda
@ 2003-03-02  6:12                             ` Keith Owens
  1 sibling, 0 replies; 72+ messages in thread
From: Keith Owens @ 2003-03-02  6:12 UTC (permalink / raw)
  To: linux-kernel

Linus Torvalds <torvalds@transmeta.com> wrote:
> A sorted list of bad stack users (more than 256 bytes) in my default build
> follows. Anybody can create their own with something like
> 
> 	objdump -d linux/vmlinux |
> 		grep 'sub.*$0x...,.*esp' |
> 		awk '{ print $9,$1 }' |
> 		sort > bigstack
> 
> and a script to look up the addresses.
> 
> Yeah, and this assumes we don't have alloca() users or other dynamic 
> stack allocators (non-constant-size automatic arrays). I hope we don't 
> have that kind of crap anywhere..

We do.

kernel.stack identifies big offenders, dynamic stacks and tells you
which procedure is at fault.  This must be at least the fifth time I
have published this script.

#!/bin/bash
#
#	Run a compiled ix86 kernel and print large local stack usage.
#
#	/>:/{s/[<>:]*//g; h; }   On lines that contain '>:' (headings like
#	c0100000 <_stext>:), remove <, > and : and hold the line.  Identifies
#	the procedure and its start address.
#
#	/subl\?.*\$0x[^,][^,][^,].*,%esp/{    Select lines containing
#	subl\?...0x...,%esp but only if there are at least 3 digits between 0x and
#	,%esp.  These are local stacks of at least 0x100 bytes.
#
#	s/.*$0x\([^,]*\).*/\1/;   Extract just the stack adjustment
#	/^[89a-f].......$/d;   Ignore lines with 8 digit offsets that are 
#	negative.  Some compilers adjust the stack on exit, seems to be related
#	to goto statements
#	G;   Append the held line (procedure and start address).
#	s/\(.*\)\n.* \(.*\)/\1 \2/;  Remove the newline and procedure start 
#	address.  Leaves just stack size and procedure name.
#	p; };   Print stack size and procedure name.
#
#	/subl\?.*%.*,%esp/{   Selects adjustment of %esp by register, dynamic 
#	arrays on stack.
#	G;   Append the held line (procedure and start address).
#	s/\(.*\)\n\(.*\)/Dynamic \2 \1/;   Reformat to "Dynamic", procedure 
#	start address, procedure name and the instruction that adjusts the
#	stack, including its offset within the proc.
#	p; };   Print the dynamic line.
#
#
#	Leading spaces in the sed string are required.
#
objdump --disassemble "$@" | \
sed -ne '/>:/{s/[<>:]*//g; h; }
 /subl\?.*\$0x[^,][^,][^,].*,%esp/{
 s/.*\$0x\([^,]*\).*/\1/; /^[89a-f].......$/d; G; s/\(.*\)\n.* \(.*\)/\1 \2/; p; };
 /subl\?.*%.*,%esp/{ G; s/\(.*\)\n\(.*\)/Dynamic \2 \1/; p; }; ' | \
sort


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 16:54                         ` Linus Torvalds
                                             ` (3 preceding siblings ...)
  2003-02-27 18:50                           ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Randy.Dunlap
@ 2003-02-27 23:32                           ` Randy.Dunlap
  4 siblings, 0 replies; 72+ messages in thread
From: Randy.Dunlap @ 2003-02-27 23:32 UTC (permalink / raw)
  To: linux-kernel

| A sorted list of bad stack users (more than 256 bytes) in my default build
| follows. Anybody can create their own with something like
| 
| 	objdump -d linux/vmlinux |
| 		grep 'sub.*$0x...,.*esp' |
| 		awk '{ print $9,$1 }' |
| 		sort > bigstack
| 
| and a script to look up the addresses.
| 
| That ide_unregister() thing uses up >2kB in just one call! And there are 
| several in the 1.5kB range too, with a long list of ~500 byte offenders.
| 
| Yeah, and this assumes we don't have alloca() users or other dynamic 
| stack allocators (non-constant-size automatic arrays). I hope we don't 
| have that kind of crap anywhere..

Keith Owens did such a script over 1 year ago.  It's available from
  http://kernelnewbies.org/scripts/check-stack.sh
It also identifies (flags) dynamic stack allocation.
(course, I can't read Keith's as well as I can Linus's)

--
~Randy

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 16:11                       ` Martin J. Bligh
  2003-02-20 16:54                         ` Linus Torvalds
@ 2003-02-20 23:09                         ` Chris Wedgwood
  1 sibling, 0 replies; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-20 23:09 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Linus Torvalds, Ingo Molnar, Dave Hansen, Zwane Mwaikambo,
	Kernel Mailing List, William Lee Irwin III

On Thu, Feb 20, 2003 at 08:11:31AM -0800, Martin J. Bligh wrote:

> There are patches in -mjb from Dave Hansen / Ben LaHaise to detect
> stack overflow included with the stuff for the 4K stacks patch
> (intended for scaling to large numbers of tasks). I've split them
> out attatched, should apply to mainline reasonably easily.

I tried with these patches and also wli's sched deadlock fix to see if
that helps.

Sadly not,  I can still easily reproduce a reboot.


  --cw

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 15:43                     ` Linus Torvalds
  2003-02-20 15:52                       ` Ingo Molnar
  2003-02-20 16:11                       ` Martin J. Bligh
@ 2003-02-20 16:44                       ` Ingo Molnar
  2003-02-20 20:13                       ` Chris Wedgwood
  3 siblings, 0 replies; 72+ messages in thread
From: Ingo Molnar @ 2003-02-20 16:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Zwane Mwaikambo, Chris Wedgwood, Kernel Mailing List,
	Martin J. Bligh, William Lee Irwin III


On Thu, 20 Feb 2003, Linus Torvalds wrote:

> Ok, this is definitely a stack overflow:

> Does anybody have an up-to-date "use -gp and a special 'mcount()'
> function to check stack depth" patch? The CONFIG_DEBUG_STACKOVERFLOW
> thing is quite possibly too stupid to find things like this (it only
> finds interrupts that overflow the stack, not deep call sequences).

i had CONFIG_DEBUG_STACKOVERFLOW on, but i'll make it more agressive. It's
fairly easy to reproduce the oops. (at least it was when i was trying to
avoid them :-)

	Ing


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots)
  2003-02-20 15:43                     ` Linus Torvalds
                                         ` (2 preceding siblings ...)
  2003-02-20 16:44                       ` Ingo Molnar
@ 2003-02-20 20:13                       ` Chris Wedgwood
  3 siblings, 0 replies; 72+ messages in thread
From: Chris Wedgwood @ 2003-02-20 20:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Zwane Mwaikambo, Kernel Mailing List,
	Martin J. Bligh, William Lee Irwin III

On Thu, Feb 20, 2003 at 07:43:16AM -0800, Linus Torvalds wrote:

> This could explain Chris' problems too - my doublefault thing won't
> help much if recursion on the stack has clobbered a lot of kernel
> state (and the doublefault will likely happen only after enough
> state is clobbered that even the doublefault handling might have
> trouble).

An overflow *might* explain why

  - it never happens under 2.4.x

  - for some configurations of 2.5.x it never seems to happen either

  - for some configurations of 2.5.x it does happen, but it's very
    nebulous as to which options are required to make this happen;
    very few options seems table,  many options crashes quickly, and a
    in-between it lasts for what might be slightly longer periods of
    time

Now, one thing I'm using that many people may not be is XFS, ACLs &
quota.  Since IRIX has almost inifinite memory available in
kernel-space, I should check to make sure XFS isn't sucking too much
stack space somewhere...  it could be that it is, and depending on the
right magic internal XFS state and when an interrupt arrives or
similar, something goes splat.

I have the stack checking on, but as observed it may not suffice.  I
wonder if 16k stacks are possible for testing?



  --cw


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62 --- spontaneous reboots
  2003-02-18  0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
  2003-02-18  0:44   ` Jeff Garzik
  2003-02-18  1:42   ` Linus Torvalds
@ 2003-02-18 12:13   ` Pavel Machek
  2 siblings, 0 replies; 72+ messages in thread
From: Pavel Machek @ 2003-02-18 12:13 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Linus Torvalds, Kernel Mailing List

Hi!

> > Oh, and as a sign that 2.6.x really _is_ approaching, people have
> > started sending me spelling fixes.
> 
> FWIW, I can't get 2.5.59+ (maybe earlier) to run reliably for me
> without spontaneous rebooting under load (kernel compile in a loop).
> 
> I wondered if it was specific to my system here except a few other
> people have reported this on *very* different hardware (I'm have UP
> Athlon with IDE, they have 8-way P4 with SCSI).
> 
> Is anyone else seeing this?  Might there be some bogon causing triple
> faults or similar lurking that I'm just unlucky enough to hit often?

I'm seeing loop-related problems around 2.5.60+...
								Pavel

-- 
Casualities in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-17 23:18 Linux v2.5.62 Linus Torvalds
  2003-02-18  0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
@ 2003-02-19 10:53 ` David Ford
  2003-02-19  6:49   ` Thomas Molina
                     ` (3 more replies)
  1 sibling, 4 replies; 72+ messages in thread
From: David Ford @ 2003-02-19 10:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm 
careful and do very little in X, it seems to stay up for a few days.  If 
I do any sort of fast graphics or sound, etc, it'll die very quickly.  
'tis an instant death with no OOPS, nothing at all on screen, nothing on 
serial console.

Just an FYI, I'm trying to narrow it down.

David

Linus Torvalds wrote:

>Hmm.. Mostly lots of small updates, although the merge with Andrew
>included the RCU dcache patches from IBM that he has carried along for a
>while (ie fairly fundamnetal, but also very well tested).
>
>ARM, PPC, PPC64, alpha, kbuild.
>
>Oh, and as a sign that 2.6.x really _is_ approaching, people have started 
>sending me spelling fixes. Kernel coders are apparently all atrocious 
>spellers, and for some reason the spelling police always comes out of the 
>woodwork when stable releases get closer.
>  
>


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 10:53 ` Linux v2.5.62 David Ford
@ 2003-02-19  6:49   ` Thomas Molina
  2003-02-19 11:04   ` Duncan Sands
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 72+ messages in thread
From: Thomas Molina @ 2003-02-19  6:49 UTC (permalink / raw)
  To: David Ford; +Cc: Kernel Mailing List

On Wed, 19 Feb 2003, David Ford wrote:

> 2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm 
> careful and do very little in X, it seems to stay up for a few days.  If 
> I do any sort of fast graphics or sound, etc, it'll die very quickly.  
> 'tis an instant death with no OOPS, nothing at all on screen, nothing on 
> serial console.

2.5.60 is where it started getting the most stable for me with similar 
equipment.  My system is Athlon 1.3 GHz, ASUS A7V, RedHat 8.0, Yamaha 754 
soundcard.  

I'm not burning any DVDs, but I am making a few CDs, lots of kernel 
compiles, playing with modules, etc.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 10:53 ` Linux v2.5.62 David Ford
  2003-02-19  6:49   ` Thomas Molina
@ 2003-02-19 11:04   ` Duncan Sands
  2003-02-19 11:07     ` William Lee Irwin III
  2003-02-19 11:17   ` Hirling Endre
  2003-02-19 18:50   ` Zilvinas Valinskas
  3 siblings, 1 reply; 72+ messages in thread
From: Duncan Sands @ 2003-02-19 11:04 UTC (permalink / raw)
  To: David Ford; +Cc: Kernel Mailing List

David, sounds like what I described in the email
"2.5.6x hard freeze playing DVDs".  I have made
no progress because I don't know how to proceed.

All the best,

Duncan.

On Wednesday 19 February 2003 11:53, David Ford wrote:
> 2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm
> careful and do very little in X, it seems to stay up for a few days.  If
> I do any sort of fast graphics or sound, etc, it'll die very quickly.
> 'tis an instant death with no OOPS, nothing at all on screen, nothing on
> serial console.
>
> Just an FYI, I'm trying to narrow it down.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 11:04   ` Duncan Sands
@ 2003-02-19 11:07     ` William Lee Irwin III
  2003-02-19 11:58       ` Duncan Sands
  0 siblings, 1 reply; 72+ messages in thread
From: William Lee Irwin III @ 2003-02-19 11:07 UTC (permalink / raw)
  To: Duncan Sands; +Cc: David Ford, Kernel Mailing List

On Wed, Feb 19, 2003 at 12:04:55PM +0100, Duncan Sands wrote:
> David, sounds like what I described in the email
> "2.5.6x hard freeze playing DVDs".  I have made
> no progress because I don't know how to proceed.
> All the best,

Well, there's always the NMI oopser + serial console to log oopses.
X seems to make VGA console unavailable, plus it's not loggable.


-- wli

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 11:07     ` William Lee Irwin III
@ 2003-02-19 11:58       ` Duncan Sands
  2003-02-19 12:04         ` William Lee Irwin III
  0 siblings, 1 reply; 72+ messages in thread
From: Duncan Sands @ 2003-02-19 11:58 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: David Ford, Kernel Mailing List

On Wednesday 19 February 2003 12:07, William Lee Irwin III wrote:
> On Wed, Feb 19, 2003 at 12:04:55PM +0100, Duncan Sands wrote:
> > David, sounds like what I described in the email
> > "2.5.6x hard freeze playing DVDs".  I have made
> > no progress because I don't know how to proceed.
> > All the best,
>
> Well, there's always the NMI oopser + serial console to log oopses.
> X seems to make VGA console unavailable, plus it's not loggable.

AMD K6-2 w/o APIC...

Duncan.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 11:58       ` Duncan Sands
@ 2003-02-19 12:04         ` William Lee Irwin III
  0 siblings, 0 replies; 72+ messages in thread
From: William Lee Irwin III @ 2003-02-19 12:04 UTC (permalink / raw)
  To: Duncan Sands; +Cc: David Ford, Kernel Mailing List

On Wednesday 19 February 2003 12:07, William Lee Irwin III wrote:
>> Well, there's always the NMI oopser + serial console to log oopses.
>> X seems to make VGA console unavailable, plus it's not loggable.

On Wed, Feb 19, 2003 at 12:58:58PM +0100, Duncan Sands wrote:
> AMD K6-2 w/o APIC...

Hmm. Could you be convinced to upgrade to a machine with a real
interrupt controller?

Well, hook up serial anyway. Maybe it's oopsing and you just can't
see what it's trying to printk.


-- wli

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 10:53 ` Linux v2.5.62 David Ford
  2003-02-19  6:49   ` Thomas Molina
  2003-02-19 11:04   ` Duncan Sands
@ 2003-02-19 11:17   ` Hirling Endre
  2003-02-19 11:24     ` Duncan Sands
  2003-02-19 18:50   ` Zilvinas Valinskas
  3 siblings, 1 reply; 72+ messages in thread
From: Hirling Endre @ 2003-02-19 11:17 UTC (permalink / raw)
  To: Kernel Mailing List

On Wed, 2003-02-19 at 11:53, David Ford wrote:
> 2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm 
> careful and do very little in X, it seems to stay up for a few days.  If 
> I do any sort of fast graphics or sound, etc, it'll die very quickly.  
> 'tis an instant death with no OOPS, nothing at all on screen, nothing on 
> serial console.

You're lucky, for me 2.5.60+ freezes right after "uncompressing kernel".
Tried with and without ACPI, with and without 'noapic', with APIC
enabled and disabled in the BIOS. 

2.5.59 is just unstable.

(msi kt4 ultra MB, athlon xp 2200+, gcc 3.2.2)

I'll try a minimal 2.5.62 now.

endre


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 11:17   ` Hirling Endre
@ 2003-02-19 11:24     ` Duncan Sands
  2003-02-19 11:52       ` Hirling Endre
  0 siblings, 1 reply; 72+ messages in thread
From: Duncan Sands @ 2003-02-19 11:24 UTC (permalink / raw)
  To: Hirling Endre, Kernel Mailing List

On Wednesday 19 February 2003 12:17, Hirling Endre wrote:
> On Wed, 2003-02-19 at 11:53, David Ford wrote:
> > 2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm
> > careful and do very little in X, it seems to stay up for a few days.  If
> > I do any sort of fast graphics or sound, etc, it'll die very quickly.
> > 'tis an instant death with no OOPS, nothing at all on screen, nothing on
> > serial console.
>
> You're lucky, for me 2.5.60+ freezes right after "uncompressing kernel".
> Tried with and without ACPI, with and without 'noapic', with APIC
> enabled and disabled in the BIOS.
>
> 2.5.59 is just unstable.
>
> (msi kt4 ultra MB, athlon xp 2200+, gcc 3.2.2)
>
> I'll try a minimal 2.5.62 now.

Endre, check out this thread:

Re: 2.5.62 fails to boot, Uncompressing... and then nothing

Duncan.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 11:24     ` Duncan Sands
@ 2003-02-19 11:52       ` Hirling Endre
  0 siblings, 0 replies; 72+ messages in thread
From: Hirling Endre @ 2003-02-19 11:52 UTC (permalink / raw)
  To: lkml

On Wed, 2003-02-19 at 12:24, Duncan Sands wrote:

> Endre, check out this thread:
> 
> Re: 2.5.62 fails to boot, Uncompressing... and then nothing

Been there, done that. I tried without ACPI and I have VT console
enabled. Haven't tried early_printk, though.

endre


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 10:53 ` Linux v2.5.62 David Ford
                     ` (2 preceding siblings ...)
  2003-02-19 11:17   ` Hirling Endre
@ 2003-02-19 18:50   ` Zilvinas Valinskas
  2003-02-19 21:46     ` Remco Post
                       ` (2 more replies)
  3 siblings, 3 replies; 72+ messages in thread
From: Zilvinas Valinskas @ 2003-02-19 18:50 UTC (permalink / raw)
  To: Kernel Mailing List

On Wed, Feb 19, 2003 at 05:53:43AM -0500, David Ford wrote:
> 2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm 
> careful and do very little in X, it seems to stay up for a few days.  If 
> I do any sort of fast graphics or sound, etc, it'll die very quickly.  
> 'tis an instant death with no OOPS, nothing at all on screen, nothing on 
> serial console.
> 
> Just an FYI, I'm trying to narrow it down.

it might triple fault ? Who knows. One thing I am sure of, if I don't
load agpgart + intel-agp, laptop in questions, works flawlessly.
Otherwise first time I log of KDE trying to login as different user I
get instant reboot.

That's the clue.

ps.
Hardware :

Compaq EVO 800
Intel P4, 1.7GHz, 256MB RAM, ATI Radeon Mobility LY (something).

> 
> David

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 18:50   ` Zilvinas Valinskas
@ 2003-02-19 21:46     ` Remco Post
  2003-02-19 22:23       ` Remco Post
  2003-02-20 13:31     ` Dave Jones
  2003-02-21  3:58     ` Alexander Hoogerhuis
  2 siblings, 1 reply; 72+ messages in thread
From: Remco Post @ 2003-02-19 21:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: linuxppc-dev

Hi all,

just to let you all know, The linus 2.5.62 (plain as can be) just booted
on my motorola powerstack II system. No modules, but also, no oops on
boot, like 2.5.59 and allmost every other 2.5 before that....

-- Remco

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 21:46     ` Remco Post
@ 2003-02-19 22:23       ` Remco Post
  2003-02-20  1:13         ` Tom Rini
  0 siblings, 1 reply; 72+ messages in thread
From: Remco Post @ 2003-02-19 22:23 UTC (permalink / raw)
  To: Remco Post; +Cc: linux-kernel, linuxppc-dev

On Wed, 19 Feb 2003 22:46:27 +0100
Remco Post <r.post@sara.nl> wrote:

> Hi all,
> 
> just to let you all know, The linus 2.5.62 (plain as can be) just booted
> on my motorola powerstack II system. No modules, but also, no oops on
> boot, like 2.5.59 and allmost every other 2.5 before that....
> 
> -- Remco
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

and fortunately, I also have some use for booting this kernel:

When the ethernet link goed down on my on-board dec-tulip:

eth1: timeout expired stopping DMA                                              
kernel BUG at drivers/net/tulip/de2104x.c:925!                                  
Oops: Exception in kernel mode, sig: 4                                          
NIP: C0138248 LR: C0138248 SP: C0275E00 REGS: c0275d50 TRAP: 0700    
Not taintedMSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11                                 
TASK = c022f550[0] 'swapper' Last syscall: 120                                  
GPR00: C0138248 C0275E00 C022F550 0000002F 00000001 C0275CB8 C0271800 C02B0000  
GPR08: 0000161F 00000000 00000000 C0275D30 4000C088 00000000 00000000 00000000  
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000  
GPR24: 00000000 00000000 00000002 00001032 C03DD000 00009032 FFFFFFCE C03DD1C0  
Call trace: [c0138588]  [c002066c]  [c001b85c]  [c0007e80]  [c00061c4]  [c00039 
Kernel panic: Aiee, killing interrupt handler!                                  
In interrupt handler - not syncing                                             

Only after some traffic was supposed to leave the machine, not that it ever does 
with this kernel....


-- Remco

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 22:23       ` Remco Post
@ 2003-02-20  1:13         ` Tom Rini
  2003-02-20 19:42           ` Remco Post
  0 siblings, 1 reply; 72+ messages in thread
From: Tom Rini @ 2003-02-20  1:13 UTC (permalink / raw)
  To: Remco Post; +Cc: linux-kernel, linuxppc-dev

On Wed, Feb 19, 2003 at 11:23:44PM +0100, Remco Post wrote:
> 
> On Wed, 19 Feb 2003 22:46:27 +0100
> Remco Post <r.post@sara.nl> wrote:
> 
> > Hi all,
> >
> > just to let you all know, The linus 2.5.62 (plain as can be) just booted
> > on my motorola powerstack II system. No modules, but also, no oops on
> > boot, like 2.5.59 and allmost every other 2.5 before that....
> >
> > -- Remco
> 
> and fortunately, I also have some use for booting this kernel:
> 
> When the ethernet link goed down on my on-board dec-tulip:
> 
> eth1: timeout expired stopping DMA
> kernel BUG at drivers/net/tulip/de2104x.c:925!
> Oops: Exception in kernel mode, sig: 4
> NIP: C0138248 LR: C0138248 SP: C0275E00 REGS: c0275d50 TRAP: 0700
> Not taintedMSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
> TASK = c022f550[0] 'swapper' Last syscall: 120
> GPR00: C0138248 C0275E00 C022F550 0000002F 00000001 C0275CB8 C0271800 C02B0000
> GPR08: 0000161F 00000000 00000000 C0275D30 4000C088 00000000 00000000 00000000
> GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> GPR24: 00000000 00000000 00000002 00001032 C03DD000 00009032 FFFFFFCE C03DD1C0
> Call trace: [c0138588]  [c002066c]  [c001b85c]  [c0007e80]  [c00061c4]  [c00039
> Kernel panic: Aiee, killing interrupt handler!
> In interrupt handler - not syncing

What does that decode to?

-- 
Tom Rini
http://gate.crashing.org/~trini/

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-20  1:13         ` Tom Rini
@ 2003-02-20 19:42           ` Remco Post
  2003-02-20 19:46             ` Tom Rini
  0 siblings, 1 reply; 72+ messages in thread
From: Remco Post @ 2003-02-20 19:42 UTC (permalink / raw)
  To: Tom Rini; +Cc: linux-kernel, linuxppc-dev

On Wed, 19 Feb 2003 18:13:39 -0700
Tom Rini <trini@kernel.crashing.org> wrote:

> On Wed, Feb 19, 2003 at 11:23:44PM +0100, Remco Post wrote:
> > 
> > On Wed, 19 Feb 2003 22:46:27 +0100
> > Remco Post <r.post@sara.nl> wrote:
> > 
> > > Hi all,
> > >
> > > just to let you all know, The linus 2.5.62 (plain as can be) just booted
> > > on my motorola powerstack II system. No modules, but also, no oops on
> > > boot, like 2.5.59 and allmost every other 2.5 before that....
> > >
> > > -- Remco
> > 
> > and fortunately, I also have some use for booting this kernel:
> > 
> > When the ethernet link goed down on my on-board dec-tulip:
> > 
> > eth1: timeout expired stopping DMA
> > kernel BUG at drivers/net/tulip/de2104x.c:925!
> > Oops: Exception in kernel mode, sig: 4
> > NIP: C0138248 LR: C0138248 SP: C0275E00 REGS: c0275d50 TRAP: 0700
> > Not taintedMSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
> > TASK = c022f550[0] 'swapper' Last syscall: 120
> > GPR00: C0138248 C0275E00 C022F550 0000002F 00000001 C0275CB8 C0271800 C02B0000
> > GPR08: 0000161F 00000000 00000000 C0275D30 4000C088 00000000 00000000 00000000
> > GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > GPR24: 00000000 00000000 00000002 00001032 C03DD000 00009032 FFFFFFCE C03DD1C0
> > Call trace: [c0138588]  [c002066c]  [c001b85c]  [c0007e80]  [c00061c4]  [c00039
> > Kernel panic: Aiee, killing interrupt handler!
> > In interrupt handler - not syncing
> 
> What does that decode to?
> 

Well it doesn't, of course, relevant addresses close to the ones in the call trace:

c00061c4 T ret_from_except
c0003904 t setup_disp_bat
c0003950 T init_idle_6xx
c0003988 T ppc6xx_idle
c0007bfc T timer_interrupt
c0007e94 T do_gettimeofday
c001b7d4 T do_softirq
c001b8d8 T raise_softirq
c0020560 t run_timer_softirq
c00206c4 T run_local_timers
c0138460 t de21040_media_timer
c0138620 t de_ok_to_advertise


> -- 
> Tom Rini
> http://gate.crashing.org/~trini/

Hope this is about what you're looking for... If not, please let me know...


-- 

Remco

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-20 19:42           ` Remco Post
@ 2003-02-20 19:46             ` Tom Rini
  2003-02-20 20:05               ` Remco Post
  0 siblings, 1 reply; 72+ messages in thread
From: Tom Rini @ 2003-02-20 19:46 UTC (permalink / raw)
  To: Remco Post; +Cc: linux-kernel, linuxppc-dev

On Thu, Feb 20, 2003 at 08:42:36PM +0100, Remco Post wrote:
> On Wed, 19 Feb 2003 18:13:39 -0700
> Tom Rini <trini@kernel.crashing.org> wrote:
> 
> > On Wed, Feb 19, 2003 at 11:23:44PM +0100, Remco Post wrote:
> > > 
> > > On Wed, 19 Feb 2003 22:46:27 +0100
> > > Remco Post <r.post@sara.nl> wrote:
> > > 
> > > > Hi all,
> > > >
> > > > just to let you all know, The linus 2.5.62 (plain as can be) just booted
> > > > on my motorola powerstack II system. No modules, but also, no oops on
> > > > boot, like 2.5.59 and allmost every other 2.5 before that....
> > > >
> > > > -- Remco
> > > 
> > > and fortunately, I also have some use for booting this kernel:
> > > 
> > > When the ethernet link goed down on my on-board dec-tulip:
> > > 
> > > eth1: timeout expired stopping DMA
> > > kernel BUG at drivers/net/tulip/de2104x.c:925!
> > > Oops: Exception in kernel mode, sig: 4
> > > NIP: C0138248 LR: C0138248 SP: C0275E00 REGS: c0275d50 TRAP: 0700
> > > Not taintedMSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
> > > TASK = c022f550[0] 'swapper' Last syscall: 120
> > > GPR00: C0138248 C0275E00 C022F550 0000002F 00000001 C0275CB8 C0271800 C02B0000
> > > GPR08: 0000161F 00000000 00000000 C0275D30 4000C088 00000000 00000000 00000000
> > > GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > GPR24: 00000000 00000000 00000002 00001032 C03DD000 00009032 FFFFFFCE C03DD1C0
> > > Call trace: [c0138588]  [c002066c]  [c001b85c]  [c0007e80]  [c00061c4]  [c00039
> > > Kernel panic: Aiee, killing interrupt handler!
> > > In interrupt handler - not syncing
> > 
> > What does that decode to?
> > 
> 
> Well it doesn't, of course, relevant addresses close to the ones in the call trace:

Um, ksymoops should be able to decode that fine...

-- 
Tom Rini
http://gate.crashing.org/~trini/

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-20 19:46             ` Tom Rini
@ 2003-02-20 20:05               ` Remco Post
  0 siblings, 0 replies; 72+ messages in thread
From: Remco Post @ 2003-02-20 20:05 UTC (permalink / raw)
  To: Tom Rini; +Cc: linux-kernel, linuxppc-dev

On Thu, 20 Feb 2003 12:46:14 -0700
Tom Rini <trini@kernel.crashing.org> wrote:

> 
> On Thu, Feb 20, 2003 at 08:42:36PM +0100, Remco Post wrote:
> > On Wed, 19 Feb 2003 18:13:39 -0700
> > Tom Rini <trini@kernel.crashing.org> wrote:
> >
> > > On Wed, Feb 19, 2003 at 11:23:44PM +0100, Remco Post wrote:
> > > >
> > > > On Wed, 19 Feb 2003 22:46:27 +0100
> > > > Remco Post <r.post@sara.nl> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > just to let you all know, The linus 2.5.62 (plain as can be) just booted
> > > > > on my motorola powerstack II system. No modules, but also, no oops on
> > > > > boot, like 2.5.59 and allmost every other 2.5 before that....
> > > > >
> > > > > -- Remco
> > > >
> > > > and fortunately, I also have some use for booting this kernel:
> > > >
> > > > When the ethernet link goed down on my on-board dec-tulip:
> > > >
> > > > eth1: timeout expired stopping DMA
> > > > kernel BUG at drivers/net/tulip/de2104x.c:925!
> > > > Oops: Exception in kernel mode, sig: 4
> > > > NIP: C0138248 LR: C0138248 SP: C0275E00 REGS: c0275d50 TRAP: 0700
> > > > Not taintedMSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
> > > > TASK = c022f550[0] 'swapper' Last syscall: 120
> > > > GPR00: C0138248 C0275E00 C022F550 0000002F 00000001 C0275CB8 C0271800 C02B0000
> > > > GPR08: 0000161F 00000000 00000000 C0275D30 4000C088 00000000 00000000 00000000
> > > > GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > > GPR24: 00000000 00000000 00000002 00001032 C03DD000 00009032 FFFFFFCE C03DD1C0
> > > > Call trace: [c0138588]  [c002066c]  [c001b85c]  [c0007e80]  [c00061c4]  [c00039
> > > > Kernel panic: Aiee, killing interrupt handler!
> > > > In interrupt handler - not syncing
> > >
> > > What does that decode to?
> > >
> >
> > Well it doesn't, of course, relevant addresses close to the ones in the call trace:
> 
> Um, ksymoops should be able to decode that fine...
> 

That's the hint I needed:

$ ksymoops -v vmlinux -O -K -L -m System.map ~/oops.file 
ksymoops 2.4.5 on ppc 2.4.18-powerpc.  Options used
     -v vmlinux (specified)
     -K (specified)
     -L (specified)
     -O (specified)
     -m System.map (specified)

Oops: Exception in kernel mode, sig: 4                                          
NIP: C0138248 LR: C0138248 SP: C0275E00 REGS: c0275d50 TRAP: 0700    Not tainted
MSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11                                 
TASK = c022f550[0] 'swapper' Last syscall: 120                                  
GPR00: C0138248 C0275E00 C022F550 0000002F 00000001 C0275CB8 C0271800 C02B0000  
GPR08: 00001398 00000000 00000000 C0275D30 4000C088 00000000 00000000 00000000  
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000  
GPR24: 00000000 00000000 00000002 00001032 C03DD000 00009032 FFFFFFCE C03DD1C0  
Call trace: [c0138588]  [c002066c]  [c001b85c]  [c0007e80]  [c00061c4]  [c00039 
Kernel panic: Aiee, killing interrupt handler!                                  
In interrupt handler - not syncing                                             
Using defaults from ksymoops -t elf32-powerpc -a powerpc:common
Warning (Oops_read): Code line not seen, dumping what data is available


>>NIP; c0138248 <de_set_media+48/1f0>   <=====


1 warning issued.  Results may not be reliable.
$

> --
> Tom Rini
> http://gate.crashing.org/~trini/
> 
> ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
> 

-- 

Remco

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 18:50   ` Zilvinas Valinskas
  2003-02-19 21:46     ` Remco Post
@ 2003-02-20 13:31     ` Dave Jones
  2003-02-20 13:57       ` Zilvinas Valinskas
  2003-02-21  3:58     ` Alexander Hoogerhuis
  2 siblings, 1 reply; 72+ messages in thread
From: Dave Jones @ 2003-02-20 13:31 UTC (permalink / raw)
  To: Zilvinas Valinskas; +Cc: Kernel Mailing List

On Wed, Feb 19, 2003 at 08:50:17PM +0200, Zilvinas Valinskas wrote:
 > it might triple fault ? Who knows. One thing I am sure of, if I don't
 > load agpgart + intel-agp, laptop in questions, works flawlessly.
 > Otherwise first time I log of KDE trying to login as different user I
 > get instant reboot.

Ok, there were quite a few changes in that area in .61.
Can you check .60 was ok, and .61 crashes the same way ?
If .61 is ok, agp is a red-herring, as it didnt change in .62

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-20 13:31     ` Dave Jones
@ 2003-02-20 13:57       ` Zilvinas Valinskas
  2003-02-20 14:31         ` Dave Jones
  0 siblings, 1 reply; 72+ messages in thread
From: Zilvinas Valinskas @ 2003-02-20 13:57 UTC (permalink / raw)
  To: Dave Jones; +Cc: Kernel Mailing List

Hello Dave,

it was the same with 2.5.59,2.5.60 (not sure now, I will check that
later) and with 2.5.61 (and yesterdays most current bk snapshot as
well).

Can it be related to DRI ? (that might be my guess). Event though I
can't use DRI on debian unstable because libGL.so mistakenly recognizes
Pentium 4 as 3Dnow! capable and crashes immediately.

For some reasons always, once I log off - system reboots most of the
times when agpgart & agp-intel loaded (if these are not loaded) - DRI
can not be initialized and system is always stable during log off from 
KDE session.

On Thu, 2003-02-20 at 15:31, Dave Jones wrote:
> On Wed, Feb 19, 2003 at 08:50:17PM +0200, Zilvinas Valinskas wrote:
>  > it might triple fault ? Who knows. One thing I am sure of, if I don't
>  > load agpgart + intel-agp, laptop in questions, works flawlessly.
>  > Otherwise first time I log of KDE trying to login as different user I
>  > get instant reboot.
> 
> Ok, there were quite a few changes in that area in .61.
> Can you check .60 was ok, and .61 crashes the same way ?
> If .61 is ok, agp is a red-herring, as it didnt change in .62
> 
> 		Dave
-- 
Zilvinas Valinskas
Best regards

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-20 13:57       ` Zilvinas Valinskas
@ 2003-02-20 14:31         ` Dave Jones
  0 siblings, 0 replies; 72+ messages in thread
From: Dave Jones @ 2003-02-20 14:31 UTC (permalink / raw)
  To: Zilvinas Valinskas; +Cc: Kernel Mailing List

On Thu, Feb 20, 2003 at 03:57:05PM +0200, Zilvinas Valinskas wrote:

 > it was the same with 2.5.59,2.5.60 (not sure now, I will check that
 > later) and with 2.5.61 (and yesterdays most current bk snapshot as
 > well).

.59 ? Ugh, a load of stuff has changed in agpgart/ since then.
Can you recall when it last actually worked for you ? 

 > Can it be related to DRI ? (that might be my guess).

You can test basic GART functionality with testgart
(http://www.codemonkey.org.uk/cruft/testgart.c)

 > Event though I
 > can't use DRI on debian unstable because libGL.so mistakenly recognizes
 > Pentium 4 as 3Dnow! capable and crashes immediately.

If thats what I think it is, its not a bug. This has come up a number
of times on the dri-devel list.
libGL does a test which runs 3dnow instructions. Obviouslly it'll
crash on a non-3dnow capable box, but prior to the test it installs
an exception handler to fix things up if it all goes awry.

Whats the debian bugzilla number for this bug out of interest ?

 > For some reasons always, once I log off - system reboots most of the
 > times when agpgart & agp-intel loaded (if these are not loaded) - DRI
 > can not be initialized and system is always stable during log off from 
 > KDE session.

The latter is normal, the former isn't (obviously).
Does it reboot as soon as you modprobe them, or when X/DRI starts ?

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-19 18:50   ` Zilvinas Valinskas
  2003-02-19 21:46     ` Remco Post
  2003-02-20 13:31     ` Dave Jones
@ 2003-02-21  3:58     ` Alexander Hoogerhuis
  2003-02-22  5:34       ` Alexander Hoogerhuis
  2 siblings, 1 reply; 72+ messages in thread
From: Alexander Hoogerhuis @ 2003-02-21  3:58 UTC (permalink / raw)
  To: Zilvinas Valinskas; +Cc: Kernel Mailing List

Zilvinas Valinskas <zilvinas@gemtek.lt> writes:

> On Wed, Feb 19, 2003 at 05:53:43AM -0500, David Ford wrote:
> > 2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm 
> > careful and do very little in X, it seems to stay up for a few days.  If 
> > I do any sort of fast graphics or sound, etc, it'll die very quickly.  
> > 'tis an instant death with no OOPS, nothing at all on screen, nothing on 
> > serial console.
> > 
> > Just an FYI, I'm trying to narrow it down.
> 
> it might triple fault ? Who knows. One thing I am sure of, if I don't
> load agpgart + intel-agp, laptop in questions, works flawlessly.
> Otherwise first time I log of KDE trying to login as different user I
> get instant reboot.
> 

I'm seeing the same on my Evo800c, I think it's very much
ACPI-related, as logging out of gnome and back in worked before i got
a newer ACPI-patch on 2.4. Currently on 2.4.20 with ACPI patch from
early January.

Planning on testing out the latest ACPI-patch dates February 18th
along with 2.4.21-pre4 now; and tinker a bit with the DSDT to make it
usefull; I'll let you know how it works out.

> 
> Compaq EVO 800
> Intel P4, 1.7GHz, 256MB RAM, ATI Radeon Mobility LY (something).
> 

Got the same box, only 512Mb more RAM ;)

mvh,
A

-- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Linux v2.5.62
  2003-02-21  3:58     ` Alexander Hoogerhuis
@ 2003-02-22  5:34       ` Alexander Hoogerhuis
  0 siblings, 0 replies; 72+ messages in thread
From: Alexander Hoogerhuis @ 2003-02-22  5:34 UTC (permalink / raw)
  To: Zilvinas Valinskas; +Cc: Kernel Mailing List

Alexander Hoogerhuis <alexh@ihatent.com> writes:

> Zilvinas Valinskas <zilvinas@gemtek.lt> writes:
> 
> > On Wed, Feb 19, 2003 at 05:53:43AM -0500, David Ford wrote:
> > > 2.5.60+ is rather unstable for me on an Athlon CPU w/ gcc 3.2.2.  If I'm 
> > > careful and do very little in X, it seems to stay up for a few days.  If 
> > > I do any sort of fast graphics or sound, etc, it'll die very quickly.  
> > > 'tis an instant death with no OOPS, nothing at all on screen, nothing on 
> > > serial console.
> > > 
> > > Just an FYI, I'm trying to narrow it down.
> > 
> > it might triple fault ? Who knows. One thing I am sure of, if I don't
> > load agpgart + intel-agp, laptop in questions, works flawlessly.
> > Otherwise first time I log of KDE trying to login as different user I
> > get instant reboot.
> > 
> 
> I'm seeing the same on my Evo800c, I think it's very much
> ACPI-related, as logging out of gnome and back in worked before i got
> a newer ACPI-patch on 2.4. Currently on 2.4.20 with ACPI patch from
> early January.
> 
> Planning on testing out the latest ACPI-patch dates February 18th
> along with 2.4.21-pre4 now; and tinker a bit with the DSDT to make it
> usefull; I'll let you know how it works out.
>

Made a new kernel, 2.4.21-pre4 with ACPI form 0218 patched it, and
recompiled. Running with the builtin its fine, and my own supplied DSDT the
machine will instantly reboot when hitting the logout-button in Gnome
2.2.

How do I get a way of telling exactly what went pear shaped whe the
machine just reboots like that?

mvh,
A
-- 
Alexander Hoogerhuis                               | alexh@ihatent.com
CCNP - CCDP - MCNE - CCSE                          | +47 908 21 485
"You have zero privacy anyway. Get over it."  --Scott McNealy

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2003-03-02  6:02 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-02-17 23:18 Linux v2.5.62 Linus Torvalds
2003-02-18  0:03 ` Linux v2.5.62 --- spontaneous reboots Chris Wedgwood
2003-02-18  0:44   ` Jeff Garzik
2003-02-18  0:46     ` Chris Wedgwood
2003-02-18  1:42   ` Linus Torvalds
2003-02-18  1:53     ` Chris Wedgwood
2003-02-18  2:02       ` Linus Torvalds
2003-02-18  2:16         ` Chris Wedgwood
2003-02-18  2:33           ` Linus Torvalds
2003-02-18  3:21         ` Martin J. Bligh
2003-02-19 11:02         ` David Ford
2003-02-18 21:44     ` Chris Wedgwood
2003-02-18 21:59       ` Chris Wedgwood
2003-02-18 22:13         ` Linus Torvalds
2003-02-18 22:34           ` Linus Torvalds
2003-02-18 23:01           ` Chris Wedgwood
2003-02-19 23:35             ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Linus Torvalds
2003-02-20  2:22               ` Zwane Mwaikambo
2003-02-20  2:26                 ` William Lee Irwin III
2003-02-20  2:55                   ` Zwane Mwaikambo
2003-02-20  3:15                     ` William Lee Irwin III
2003-02-20  4:52                 ` Linus Torvalds
2003-02-20  5:07                   ` William Lee Irwin III
2003-02-20  6:05                   ` Zwane Mwaikambo
2003-02-20 11:46                   ` Ingo Molnar
2003-02-20 12:12                     ` William Lee Irwin III
2003-02-20 12:33                       ` Ingo Molnar
2003-02-20 14:03                         ` Zwane Mwaikambo
2003-02-20 14:00                     ` Zwane Mwaikambo
2003-02-20 15:43                     ` Linus Torvalds
2003-02-20 15:52                       ` Ingo Molnar
2003-02-20 16:11                       ` Martin J. Bligh
2003-02-20 16:54                         ` Linus Torvalds
2003-02-20 17:24                           ` Jeff Garzik
2003-02-20 21:21                           ` Alan Cox
2003-02-20 20:20                             ` Linus Torvalds
2003-02-20 20:23                             ` Martin J. Bligh
2003-02-20 20:42                               ` William Lee Irwin III
2003-02-20 20:51                                 ` Linus Torvalds
2003-02-21  7:39                           ` [PATCH] snd_pcm_oss_change_params is a stack offender Muli Ben-Yehuda
2003-02-21  7:58                             ` Andreas Dilger
2003-02-21  8:20                               ` Muli Ben-Yehuda
2003-02-27 18:50                           ` doublefault debugging (was Re: Linux v2.5.62 --- spontaneous reboots) Randy.Dunlap
2003-02-27 19:39                             ` Muli Ben-Yehuda
2003-02-27 19:47                               ` Randy.Dunlap
2003-03-02  6:12                             ` Keith Owens
2003-02-27 23:32                           ` Randy.Dunlap
2003-02-20 23:09                         ` Chris Wedgwood
2003-02-20 16:44                       ` Ingo Molnar
2003-02-20 20:13                       ` Chris Wedgwood
2003-02-18 12:13   ` Linux v2.5.62 --- spontaneous reboots Pavel Machek
2003-02-19 10:53 ` Linux v2.5.62 David Ford
2003-02-19  6:49   ` Thomas Molina
2003-02-19 11:04   ` Duncan Sands
2003-02-19 11:07     ` William Lee Irwin III
2003-02-19 11:58       ` Duncan Sands
2003-02-19 12:04         ` William Lee Irwin III
2003-02-19 11:17   ` Hirling Endre
2003-02-19 11:24     ` Duncan Sands
2003-02-19 11:52       ` Hirling Endre
2003-02-19 18:50   ` Zilvinas Valinskas
2003-02-19 21:46     ` Remco Post
2003-02-19 22:23       ` Remco Post
2003-02-20  1:13         ` Tom Rini
2003-02-20 19:42           ` Remco Post
2003-02-20 19:46             ` Tom Rini
2003-02-20 20:05               ` Remco Post
2003-02-20 13:31     ` Dave Jones
2003-02-20 13:57       ` Zilvinas Valinskas
2003-02-20 14:31         ` Dave Jones
2003-02-21  3:58     ` Alexander Hoogerhuis
2003-02-22  5:34       ` Alexander Hoogerhuis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox