From: Dave Olien <dmo@osdl.org>
To: Nick Piggin <piggin@cyberone.com.au>
Cc: Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, maryedie@osdl.org
Subject: Re: 2.6.0-test5-mm3 as-iosched Oops running dbt2 workload
Date: Sun, 21 Sep 2003 22:38:18 -0700
Message-ID: <20030921223818.A7483@osdl.org>
In-Reply-To: <3F6BAC5F.503@cyberone.com.au>; from piggin@cyberone.com.au on Sat, Sep 20, 2003 at 11:24:47AM +1000


OK, we'll give it a shot Monday.  Thanks!

On Sat, Sep 20, 2003 at 11:24:47AM +1000, Nick Piggin wrote:
> Sigh. Sorry, I'm an idiot...
> 
> If a request is merged with another, it sometimes has to be repositioned
> on the rbtree - you just do a delete then an add. This is quite an
> uncommon case, though.
> 
> I changed the way adding works, so collisions must be handled by the
> caller instead of being dumbly fixed by the add routine. Unfortunately
> the uncommon callers weren't handling it properly. Try this please.
> 
> 
> 
> Dave Olien wrote:
> 
> >Andrew,
> >
> >Attached is console output containing a stack trace from an Oops, followed
> >by a Fatal exception, and LOTS of APIC errors.  The machine was hung,
> >printing APIC error messages forever.
> >
> >This looks like another as-iosched problem.  So, I'm copying Nick Piggin
> >on this email.  But the Fatal exception and APIC errors following
> >that are a mystery to me.
> >
> >Mary encountered this running the sapdb dbt2 cached database workload on her
> >project machine.  The project machine was running 2.6.0-test5-mm3.
> >This same test passes on the stp machines.  But Mary's project machine
> >has more processors, and more disks, and a different disk controller type.
> >
> >At this stage, the database has gotten past the database restore phase.
> >That's where it was failing prior to last night's mm3 patch.  Now, the
> >database itself has been running for about 30 minutes.  In the cached
> >case, much of that first 30 minutes is spent loading the cache.
> >
> >This Oops seems to have occurred at about the time the database is
> >transitioning to using its cache.  Most of the I/O after this point
> >is to the log, doing LOTS of sequential writes, with the occasional
> >random read/write.
> >
> >Since this machine has more processors, it's doing transactions
> >more quickly than the same workload on STP machines.  So the log write
> >traffic is probably a lot heavier.
> >
> >
> >------------[ cut here ]------------
> >kernel BUG at drivers/block/as-iosched.c:1230!
> >invalid operand: 0000 [#1]
> >SMP 
> >CPU:    3
> >EIP:    0060:[<c0228146>]    Not tainted VLI
> >EFLAGS: 00010046
> >EIP is at as_dispatch_request+0x236/0x2f0
> >eax: 00000000   ebx: f7a451a0   ecx: 00000000   edx: 00000000
> >esi: 00000000   edi: 00000001   ebp: 00000000   esp: f5f67ef8
> >ds: 007b   es: 007b   ss: 0068
> >Process kernel (pid: 2283, threadinfo=f5f66000 task=f62760a0)
> >Stack: f7a451a0 f5900820 f7a28000 f7a02600 c0235c23 f7a451a0 00000000 f7836000 
> >       f5f67fc4 c0228238 f7a451a0 f7a08420 f7836000 c021fbb6 f7836000 f77e0000 
> >       f7a28000 f7a08420 f7a28000 c023a128 f7836000 f5f02de0 f77e0090 0000000a 
> >Call Trace:
> > [<c0235c23>] DAC960_BA_QueueCommand+0x43/0xb0
> > [<c0228238>] as_next_request+0x38/0x50
> > [<c021fbb6>] elv_next_request+0x16/0x110
> > [<c023a128>] DAC960_ProcessRequest+0x38/0x190
> > [<c023cd40>] DAC960_BA_InterruptHandler+0x90/0xb0
> > [<c010e899>] handle_IRQ_event+0x49/0x80
> > [<c010ec0f>] do_IRQ+0x9f/0x150
> > [<c030cb0c>] common_interrupt+0x18/0x20
> > [<c030007b>] rpcauth_free_credcache+0xbb/0x100
> >
> >Code: 43 50 00 00 00 00 8b 43 54 8b 6b 1c c7 43 5c 00 00 00 00 89 43 58 e9 95 fe ff ff c7 43 48 01 00 00 00 c7 43 4c 00 00 00 00 eb d4 <0f> 0b ce 04 fd da 32 c0 eb c7 8b 43 50 e9 61 ff ff ff 0f 0b b6 
> > <0>Kernel panic: Fatal exception in interrupt
> >In interrupt handler - not syncing
> > <6>APIC error on CPU3: 00(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)
> >APIC error on CPU3: 08(08)

>  linux-2.6-npiggin/drivers/block/as-iosched.c |   21 +++++++++++++++------
>  1 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff -puN drivers/block/as-iosched.c~as-oops-fix drivers/block/as-iosched.c
> --- linux-2.6/drivers/block/as-iosched.c~as-oops-fix	2003-09-20 11:13:26.000000000 +1000
> +++ linux-2.6-npiggin/drivers/block/as-iosched.c	2003-09-20 11:22:55.000000000 +1000
> @@ -1303,7 +1303,7 @@ static struct request *as_next_request(r
>   * Add arq to a list behind alias
>   */
>  static inline void
> -as_add_aliased_request(struct as_rq *arq, struct as_rq *alias)
> +as_add_aliased_request(struct as_data *ad, struct as_rq *arq, struct as_rq *alias)
>  {
>  	/*
>  	 * Another request with the same start sector on the rbtree.
> @@ -1312,6 +1312,11 @@ as_add_aliased_request(struct as_rq *arq
>  	 */
>  	list_add_tail(&arq->request->queuelist,	&alias->request->queuelist);
>  
> +	/*
> +	 * Don't want to have to handle merges.
> +	 */
> +	as_remove_merge_hints(ad->q, arq);
> +
>  }
>  
>  /*
> @@ -1353,7 +1358,7 @@ static void as_add_request(struct as_dat
>  		as_update_arq(ad, arq); /* keep state machine up to date */
>  
>  	} else {
> -		as_add_aliased_request(arq, alias);
> +		as_add_aliased_request(ad, arq, alias);
>  		/*
>  		 * have we been anticipating this request?
>  		 * or does it come from the same process as the one we are
> @@ -1553,8 +1558,10 @@ static void as_merged_request(request_qu
>  		 * currently don't bother. Ditto the next function.
>  		 */
>  		as_del_arq_rb(ad, arq);
> -		if ((alias = as_add_arq_rb(ad, arq)) )
> -			as_add_aliased_request(arq, alias);
> +		if ((alias = as_add_arq_rb(ad, arq)) ) {
> +			list_del_init(&arq->fifo);
> +			as_add_aliased_request(ad, arq, alias);
> +		}
>  		/*
>  		 * Note! At this stage of this and the next function, our next
>  		 * request may not be optimal - eg the request may have "grown"
> @@ -1586,8 +1593,10 @@ as_merged_requests(request_queue_t *q, s
>  	if (rq_rb_key(req) != arq->rb_key) {
>  		struct as_rq *alias;
>  		as_del_arq_rb(ad, arq);
> -		if ((alias = as_add_arq_rb(ad, arq)) )
> -			as_add_aliased_request(arq, alias);
> +		if ((alias = as_add_arq_rb(ad, arq)) ) {
> +			list_del_init(&arq->fifo);
> +			as_add_aliased_request(ad, arq, alias);
> +		}
>  	}
>  
>  	/*
> 
> _
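
For readers following the fix, the delete-then-add repositioning Nick
describes, with the caller handling a key collision, can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the as-iosched code: a sorted singly linked list stands in for the
rbtree, and struct req, sorted_add(), sorted_del(), add_aliased() and
reposition() are hypothetical names. The on_fifo flag loosely mirrors the
list_del_init(&arq->fifo) call the patch adds; the merge-hint removal is
not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct as_rq; only what the sketch needs. */
struct req {
	unsigned long key;       /* start sector, used as the sort key */
	struct req *next;        /* sorted-list link (stand-in for the rbtree) */
	struct req *alias_next;  /* same-key requests chained behind this one */
	int on_fifo;             /* nonzero while the request sits on the FIFO */
};

/*
 * Insert r in key order. On a key collision nothing is inserted and the
 * existing entry is returned; the caller has to handle it.
 */
static struct req *sorted_add(struct req **head, struct req *r)
{
	struct req **p = head;

	while (*p && (*p)->key < r->key)
		p = &(*p)->next;
	if (*p && (*p)->key == r->key)
		return *p;           /* collision: hand the alias back */
	r->next = *p;
	*p = r;
	return NULL;
}

/* Unlink r from the sorted list; r is expected to be on it. */
static void sorted_del(struct req **head, struct req *r)
{
	struct req **p = head;

	while (*p && *p != r)
		p = &(*p)->next;
	assert(*p == r);
	*p = r->next;
	r->next = NULL;
}

/* Chain r at the tail of the requests already aliased to alias. */
static void add_aliased(struct req *r, struct req *alias)
{
	struct req *tail = alias;

	while (tail->alias_next)
		tail = tail->alias_next;
	tail->alias_next = r;
}

/* A merge moved r's start sector to new_key: delete, then add again. */
static void reposition(struct req **head, struct req *r,
		       unsigned long new_key)
{
	struct req *alias;

	sorted_del(head, r);
	r->key = new_key;
	alias = sorted_add(head, r);
	if (alias) {
		/* Caller-side collision handling, as in the patch. */
		r->on_fifo = 0;
		add_aliased(r, alias);
	}
}
```

In this sketch, a request whose key is changed to match an entry already
on the list ends up chained behind that entry and is no longer marked as
being on the FIFO, instead of being inserted as a duplicate key.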


Thread overview: 8+ messages
2003-09-19 18:56 2.6.0-test5-mm3 as-iosched Oops running dbt2 workload Dave Olien
2003-09-19 19:07 ` William Lee Irwin III
2003-09-19 20:45 ` Mary Edie Meredith
2003-09-20  1:24 ` Nick Piggin
2003-09-22  5:38   ` Dave Olien [this message]
2003-09-22 20:38   ` Mary Edie Meredith
2003-09-23  0:20   ` Mary Edie Meredith
2003-09-23 23:53     ` 2.6.0-test5-mm4 passes the dbt2 on STP Daniel McNeil
