From: Dave Hansen <haveblue@us.ibm.com>
To: Greg KH <greg@kroah.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: oops in 2.5.41-bk in tasklet_hi_action
Date: Tue, 08 Oct 2002 16:12:51 -0700
Message-ID: <3DA36673.30702@us.ibm.com>
In-Reply-To: 20021008223044.GB10837@kroah.com

[-- Attachment #1: Type: text/plain, Size: 1089 bytes --]

Greg KH wrote:
> I got the following oops on 2.5.41 + latest bk tree as of 3 pm PST under
> heavy ide load:
> 
> It's a UP box running with SMP and preempt enabled.
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000004
> *pde = 00000000
> Oops: 0002
> hid uhci-hcd usbcore  
> CPU:    0
> EIP:    0060:[<c0125292>]    Not tainted
> EFLAGS: 00010016
> EIP is at run_timer_tasklet+0x102/0x220
> <snip>
> Call Trace:
>  [<c01215c0>] tasklet_hi_action+0x80/0xd0
>  [<c01212bb>] do_softirq+0x5b/0xc0
>  [<c01114d1>] smp_apic_timer_interrupt+0x111/0x140
>  [<c0109b0f>] do_IRQ+0x18f/0x230
>  [<c0108256>] apic_timer_interrupt+0x1a/0x20

Notice that EIP is in run_timer_tasklet, and __run_timers is inlined
there.  You're actually crashing in __run_timers, in the same place I
was before Ingo's fix.  Try the attached patch, which I got from Ingo
this morning.  I still see some kind of problem, but it might be a
completely different one.
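
For anyone reading the oops header itself: below is a minimal sketch of
how the "Oops: 0002" code breaks down on i386.  The low three bits of
the page-fault error code are architectural; the names fault_info,
decode_oops_code and list_head32 are made up for illustration and are
not kernel identifiers.  The list_head32 struct only models two 32-bit
pointers, to show that a write at address 00000004 is consistent with
touching the second field of a NULL list pointer.  That is one
plausible reading, not a confirmed diagnosis.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Low three bits of the i386 page-fault error code shown after "Oops:". */
struct fault_info {
	bool protection; /* bit 0: 0 = page not present, 1 = protection fault */
	bool write;      /* bit 1: 0 = read access,      1 = write access */
	bool user;       /* bit 2: 0 = kernel mode,      1 = user mode */
};

/* Hypothetical helper: split the error code into its three flags. */
static struct fault_info decode_oops_code(unsigned int code)
{
	struct fault_info f = {
		.protection = (code & 1u) != 0,
		.write      = (code & 2u) != 0,
		.user       = (code & 4u) != 0,
	};
	return f;
}

/*
 * Illustrative model of a doubly linked list node with 32-bit pointers.
 * With this layout the second field sits 4 bytes past the node address,
 * so storing to it through a NULL node pointer faults at address 4.
 */
struct list_head32 {
	uint32_t next;
	uint32_t prev;
};
```

Calling decode_oops_code(0x0002) gives a kernel-mode write to a page
that is not present, which matches "NULL pointer dereference at
virtual address 00000004" above.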

My oops happens while running Specweb, so I'm under high interrupt load too.
-- 
Dave Hansen
haveblue@us.ibm.com

[-- Attachment #2: run_timers_fix-ingo-0.patch --]
[-- Type: text/plain, Size: 4796 bytes --]

Received: from localhost (nighthawk [127.0.0.1])
	by nighthawk.sr71.net (8.11.6/8.11.6) with ESMTP id g98AuLc12011
	for <dave@localhost>; Tue, 8 Oct 2002 03:56:24 -0700
Received: from imap.linux.ibm.com [9.27.103.44]
	by localhost with IMAP (fetchmail-5.9.0)
	for dave@localhost (multi-drop); Tue, 08 Oct 2002 03:56:24 -0700 (PDT)
Received: from localhost ([unix socket])
	by imap.linux.ibm.com (Cyrus v2.1.9) with LMTP; Tue, 08 Oct 2002 06:55:42 -0400
X-Sieve: CMU Sieve 2.2
Received: from smtp.linux.ibm.com (linux.ibm.com [9.26.4.197])
	by imap.linux.ibm.com (Postfix) with ESMTP id 401627C017
	for <haveblue@imap.linux.ibm.com>; Tue,  8 Oct 2002 06:55:42 -0400 (EDT)
Received: from northrelay03.pok.ibm.com (northrelay03.pok.ibm.com [9.56.224.151])
	by smtp.linux.ibm.com (Postfix) with ESMTP id 02E323FE06
	for <haveblue@linux.ibm.com>; Tue,  8 Oct 2002 06:55:34 -0400 (EDT)
Received: from e31.co.us.ibm.com (d03av01.boulder.ibm.com [9.17.193.81])
	by northrelay03.pok.ibm.com (8.12.3/NCO/VER6.4) with ESMTP id g98AtVqY080584
	for <haveblue@us.ibm.com>; Tue, 8 Oct 2002 06:55:32 -0400
Received: from mx1.elte.hu (mx1.elte.hu [157.181.1.137])
	by e31.co.us.ibm.com (8.12.2/8.12.2) with ESMTP id g98AtRAs034788
	for <haveblue@us.ibm.com>; Tue, 8 Oct 2002 06:55:31 -0400
Received: from chiara.elte.hu (chiara.elte.hu [157.181.150.200])
	by mx1.elte.hu (Postfix) with ESMTP
	id 08F0244732; Tue,  8 Oct 2002 12:55:11 +0200 (CEST)
Received: by chiara.elte.hu (Postfix, from userid 17806)
	id E70821FF1; Tue,  8 Oct 2002 12:54:44 +0200 (CEST)
Date: Tue, 8 Oct 2002 13:05:40 +0200 (CEST)
From: Ingo Molnar <mingo@elte.hu>
Sender: mingo@localhost.localdomain
Reply-To: Ingo Molnar <mingo@elte.hu>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: Andrew Morton <akpm@digeo.com>, lkml <linux-kernel@vger.kernel.org>,
   "linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: 2.5.40-mm2
In-Reply-To: <3DA0A144.8070301@us.ibm.com>
Message-ID: <Pine.LNX.4.44.0210081303090.29540-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-8.4 required=5.0
	tests=IN_REP_TO,UNIFIED_PATCH
	version=2.31
X-Spam-Level: 
X-Fetchmail-Warning: no recipient addresses matched declared local names


On Sun, 6 Oct 2002, Dave Hansen wrote:

> cc'ing Ingo, because I think this might be related to the timer bh
> removal.

could you try the attached patch against 2.5.41, does it help? It fixes
the bugs found so far plus makes del_timer_sync() a bit more robust by
re-checking timer pending-ness before exiting. There is one type of code
that might have relied on this kind of behavior of the old timer code.

	Ingo
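
The pending-ness re-check described above can be sketched as a small
single-threaded userspace model.  Everything prefixed toy_ is
hypothetical and only stands in for the kernel's types and functions;
toy_other_cpu_step() fakes another CPU finishing a handler that
re-arms its own timer, which is the case the re-check is assumed to
guard against.  The recheck flag exists only so the two behaviours can
be compared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are not the kernel's real types. */
struct toy_timer {
	bool pending;     /* queued and waiting to fire */
	int rearm_budget; /* how many times the handler re-queues itself */
};

struct toy_base {
	struct toy_timer *running_timer; /* handler running on the "other CPU" */
};

/* Dequeue the timer; returns 1 if it was pending, 0 otherwise. */
static int toy_del_timer(struct toy_timer *t)
{
	int was_pending = t->pending;

	t->pending = false;
	return was_pending;
}

/* Simulate the other CPU finishing the handler, which may re-arm the timer. */
static void toy_other_cpu_step(struct toy_base *base)
{
	struct toy_timer *t = base->running_timer;

	if (!t)
		return;
	if (t->rearm_budget > 0) {
		t->rearm_budget--;
		t->pending = true;
	}
	base->running_timer = NULL;
}

/* Delete, wait for a running handler, and optionally re-check pending-ness. */
static int toy_del_timer_sync(struct toy_base *base, struct toy_timer *t,
			      bool recheck)
{
	int ret = 0;

del_again:
	ret += toy_del_timer(t);
	while (base->running_timer == t)
		toy_other_cpu_step(base); /* stands in for spinning with cpu_relax() */
	if (recheck && t->pending)
		goto del_again;
	return ret;
}
```

In this model, a timer whose handler is running and re-arms once is
left pending when recheck is false, and is fully deleted when recheck
is true.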

--- linux/kernel/timer.c.orig	2002-10-08 12:39:46.000000000 +0200
+++ linux/kernel/timer.c	2002-10-08 12:49:50.000000000 +0200
@@ -266,29 +266,31 @@
 int del_timer_sync(timer_t *timer)
 {
 	tvec_base_t *base = tvec_bases;
-	int i, ret;
+	int i, ret = 0;
 
-	ret = del_timer(timer);
+del_again:
+	ret += del_timer(timer);
 
-	for (i = 0; i < NR_CPUS; i++) {
+	for (i = 0; i < NR_CPUS; i++, base++) {
 		if (!cpu_online(i))
 			continue;
 		if (base->running_timer == timer) {
 			while (base->running_timer == timer) {
 				cpu_relax();
-				preempt_disable();
-				preempt_enable();
+				preempt_check_resched();
 			}
 			break;
 		}
-		base++;
 	}
+	if (timer_pending(timer))
+		goto del_again;
+
 	return ret;
 }
 #endif
 
 
-static void cascade(tvec_base_t *base, tvec_t *tv)
+static int cascade(tvec_base_t *base, tvec_t *tv)
 {
 	/* cascade all the timers from tv up one level */
 	struct list_head *head, *curr, *next;
@@ -310,7 +312,8 @@
 		curr = next;
 	}
 	INIT_LIST_HEAD(head);
-	tv->index = (tv->index + 1) & TVN_MASK;
+
+	return tv->index = (tv->index + 1) & TVN_MASK;
 }
 
 /***
@@ -322,26 +325,18 @@
  */
 static inline void __run_timers(tvec_base_t *base)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&base->lock, flags);
+	spin_lock_irq(&base->lock);
 	while ((long)(jiffies - base->timer_jiffies) >= 0) {
 		struct list_head *head, *curr;
 
 		/*
 		 * Cascade timers:
 		 */
-		if (!base->tv1.index) {
-			cascade(base, &base->tv2);
-			if (base->tv2.index == 1) {
-				cascade(base, &base->tv3);
-				if (base->tv3.index == 1) {
-					cascade(base, &base->tv4);
-					if (base->tv4.index == 1)
-						cascade(base, &base->tv5);
-				}
-			}
-		}
+		if (!base->tv1.index &&
+			(cascade(base, &base->tv2) == 1) &&
+				(cascade(base, &base->tv3) == 1) &&
+					cascade(base, &base->tv4) == 1)
+			cascade(base, &base->tv5);
 repeat:
 		head = base->tv1.vec + base->tv1.index;
 		curr = head->next;
@@ -370,7 +365,7 @@
 #if CONFIG_SMP
 	base->running_timer = NULL;
 #endif
-	spin_unlock_irqrestore(&base->lock, flags);
+	spin_unlock_irq(&base->lock);
 }
 
 /******************************************************************/

