From mboxrd@z Thu Jan 1 00:00:00 1970
From: Greg Harris
Subject: Re: Kernel Panic in xen-blkfront.c:blkif_queue_request under 2.6.28
Date: Tue, 3 Feb 2009 15:37:29 -0500 (EST)
Message-ID: <18316821.8169281233693449109.JavaMail.root@ouachita>
References: <4987821C.9040605@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
In-Reply-To: <4987821C.9040605@goop.org>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jeremy Fitzhardinge
Cc: Jens Axboe, xen-devel
List-Id: xen-devel@lists.xenproject.org

After applying the patch we were able to reproduce the panic and the additional debugging output is attached. The driver appears to re-try the request several times before dying:

Writing inode tables: ------------[ cut here ]------------
WARNING: at drivers/block/xen-blkfront.c:244 do_blkif_request+0x301/0x440()
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.28.2-metacarta-appliance-1 #2
Call Trace:
 [] warn_on_slowpath+0x64/0xa0
 [] enqueue_task+0x13/0x30
 [] _spin_unlock_irqrestore+0x14/0x20
 [] get_free_entries+0xbc/0x2a0
 [] do_blkif_request+0x301/0x440
 [] blk_invoke_request_fn+0xa5/0x110
 [] kick_pending_request_queues+0x18/0x30
 [] blkif_interrupt+0x197/0x1e0
 [] handle_IRQ_event+0x39/0x80
 [] handle_level_irq+0x96/0x120
 [] do_IRQ+0x85/0x110
 [] xen_evtchn_do_upcall+0xe5/0x130
 [] __do_softirq+0xe7/0x180
 [] xen_do_hypervisor_callback+0x1e/0x30
 [] _stext+0x3aa/0x1000
 [] _stext+0x3aa/0x1000
 [] xen_safe_halt+0xc/0x20
 [] xen_idle+0x2a/0x50
 [] cpu_idle+0x41/0x70
---[ end trace 107c74ebf2b50a63 ]---
METACARTA: too many segments for ring (11): req->nr_phys_segments = 11
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 1536 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 2048 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 2560 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 3072 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 3584 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 0 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 512 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 1024 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 1536 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 2048 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 2560 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 3072 len 512
------------[ cut here ]------------
WARNING: at drivers/block/xen-blkfront.c:244 do_blkif_request+0x301/0x440()
Modules linked in:
Pid: 0, comm: swapper Tainted: G W 2.6.28.2-metacarta-appliance-1 #2
Call Trace:
 [] warn_on_slowpath+0x64/0xa0
 [] _spin_unlock_irqrestore+0x14/0x20
 [] get_free_entries+0xbc/0x2a0
 [] do_blkif_request+0x301/0x440
 [] blkif_interrupt+0x197/0x1e0
 [] handle_IRQ_event+0x39/0x80
 [] handle_level_irq+0x96/0x120
 [] do_IRQ+0x85/0x110
 [] xen_evtchn_do_upcall+0xe5/0x130
 [] __do_softirq+0xe7/0x180
 [] xen_do_hypervisor_callback+0x1e/0x30
 [] _stext+0x3aa/0x1000
 [] _stext+0x3aa/0x1000
 [] xen_safe_halt+0xc/0x20
 [] xen_idle+0x2a/0x50
 [] cpu_idle+0x41/0x70
---[ end trace 107c74ebf2b50a63 ]---
METACARTA: too many segments for ring (11): req->nr_phys_segments = 11
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 1536 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 2048 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 2560 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 3072 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 3584 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 0 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 512 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 1024 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 1536 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 2048 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 2560 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 3072 len 512
------------[ cut here ]------------
WARNING: at drivers/block/xen-blkfront.c:244 do_blkif_request+0x301/0x440()
Modules linked in:
Pid: 0, comm: swapper Tainted: G W 2.6.28.2-metacarta-appliance-1 #2
Call Trace:
 [] warn_on_slowpath+0x64/0xa0
 [] enqueue_task+0x13/0x30
 [] _spin_unlock_irqrestore+0x14/0x20
 [] get_free_entries+0xbc/0x2a0
 [] do_blkif_request+0x301/0x440
 [] blk_invoke_request_fn+0xa5/0x110
 [] kick_pending_request_queues+0x18/0x30
 [] blkif_interrupt+0x197/0x1e0
 [] handle_IRQ_event+0x39/0x80
 [] handle_level_irq+0x96/0x120
 [] do_IRQ+0x85/0x110
 [] xen_evtchn_do_upcall+0xe5/0x130
 [] __do_softirq+0xe7/0x180
 [] xen_do_hypervisor_callback+0x1e/0x30
 [] _stext+0x3aa/0x1000
 [] _stext+0x3aa/0x1000
 [] xen_safe_halt+0xc/0x20
 [] xen_idle+0x2a/0x50
 [] cpu_idle+0x41/0x70
---[ end trace 107c74ebf2b50a63 ]---
METACARTA: too many segments for ring (11): req->nr_phys_segments = 11
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 1536 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 2048 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 2560 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 3072 len 512
METACARTA: 0: bio page ffffe2000c291d00 pfn 379760 off 3584 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 0 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 512 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 1024 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 1536 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 2048 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 2560 len 512
METACARTA: 0: bio page ffffe2000c291d38 pfn 379761 off 3072 len 512
------------[ cut here ]------------
WARNING: at drivers/block/xen-blkfront.c:244 do_blkif_request+0x301/0x440()
Modules linked in:
Pid: 0, comm: swapper Tainted: G W 2.6.28.2-metacarta-appliance-1 #2
Call Trace:
 [] warn_on_slowpath+0x64/0xa0
 [] _spin_unlock_irqrestore+0x14/0x20
 [] get_free_entries+0xbc/0x2a0
 [] do_blkif_request+0x301/0x440
 [] blkif_interrupt+0x197/0x1e0
 [] handle_IRQ_event+0x39/0x80
 [] handle_level_irq+0x96/0x120
 [] do_IRQ+0x85/0x110
 [] xen_evtchn_do_upcall+0xe5/0x13

We also attempted changing the blk_queue_max_*_segments calls to use BLKIF_MAX_SEGMENTS_PER_REQUEST - 1 and our spinner was able to run overnight without any panics...

---

Greg Harris
System Administrator
MetaCarta, Inc.

(O) +1 (617) 301-5530
(M) +1 (781) 258-4474

----- "Jeremy Fitzhardinge" wrote:

> Jens Axboe wrote:
> > To shed some more light on this, I'd suggest changing that BUG_ON() to
> > some code that simply dumps each segment (each bvec in the iterator
> > list) from start to finish along with values of
> > request->nr_phys_segments and size info.
> >
>
> OK, something like this?
>
>     J
>
> Subject: xen/blkfront: try to track down over-segment BUG_ON in blkfront
>
> Signed-off-by: Jeremy Fitzhardinge
> ---
>  drivers/block/xen-blkfront.c |   24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> ===================================================================
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -240,7 +240,10 @@
>
>  	ring_req->nr_segments = 0;
>  	rq_for_each_segment(bvec, req, iter) {
> -		BUG_ON(ring_req->nr_segments == BLKIF_MAX_SEGMENTS_PER_REQUEST);
> +		if (WARN_ON(ring_req->nr_segments >=
> +			    BLKIF_MAX_SEGMENTS_PER_REQUEST))
> +			goto dump_req;
> +
>  		buffer_mfn = pfn_to_mfn(page_to_pfn(bvec->bv_page));
>  		fsect = bvec->bv_offset >> 9;
>  		lsect = fsect + (bvec->bv_len >> 9) - 1;
> @@ -274,6 +277,25 @@
>  	gnttab_free_grant_references(gref_head);
>
>  	return 0;
> +
> +dump_req:
> +	{
> +		int i;
> +
> +		printk(KERN_DEBUG "too many segments for ring (%d): "
> +		       "req->nr_phys_segments = %d\n",
> +		       BLKIF_MAX_SEGMENTS_PER_REQUEST, req->nr_phys_segments);
> +
> +		i = 0;
> +		rq_for_each_segment(bvec, req, iter) {
> +			printk(KERN_DEBUG
> +			       " %d: bio page %p pfn %lx off %u len %u\n",
> +			       i++, bvec->bv_page, page_to_pfn(bvec->bv_page),
> +			       bvec->bv_offset, bvec->bv_len);
> +		}
> +	}
> +
> +	return 1;
>  }