From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756481Ab1K2USt (ORCPT ); Tue, 29 Nov 2011 15:18:49 -0500
Received: from mx1.redhat.com ([209.132.183.28]:15838 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755822Ab1K2USs (ORCPT ); Tue, 29 Nov 2011 15:18:48 -0500
Date: Tue, 29 Nov 2011 15:18:03 -0500
From: Mike Snitzer
To: Heiko Carstens
Cc: Hannes Reinecke, "Jun'ichi Nomura", James Bottomley, Steffen Maier,
	"linux-scsi@vger.kernel.org", Jens Axboe, Linux Kernel, Alan Stern,
	Thadeu Lima de Souza Cascardo, "Taraka R. Bodireddy",
	"Seshagiri N. Ippili", "Manvanthara B. Puttashankar", Jeff Moyer,
	Shaohua Li, gmuelas@de.ibm.com, dm-devel@redhat.com
Subject: Re: [GIT PULL] Queue free fix (was Re: [PATCH] block: Free queue
	resources at blk_release_queue())
Message-ID: <20111129201803.GB6827@redhat.com>
References: <20111031100557.GA2621@osiris.boeblingen.de.ibm.com>
	<1320057746.2964.1.camel@dabdike>
	<4EAE8A7E.8000504@ce.jp.nec.com>
	<20111031130004.GB4768@osiris.boeblingen.de.ibm.com>
	<20111103182548.GA12131@redhat.com>
	<20111104091936.GB2397@osiris.boeblingen.de.ibm.com>
	<4EBA49C2.1000704@suse.de>
	<20111110161008.GA15659@osiris.boeblingen.de.ibm.com>
	<20111117162919.GA3812@redhat.com>
	<20111129120047.GA2456@osiris.boeblingen.de.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20111129120047.GA2456@osiris.boeblingen.de.ibm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 29 2011 at  7:00am -0500,
Heiko Carstens wrote:

> > > > Hmm. Just to be on the safe side, could you try this one:
> > > >
> > > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> > > > index 5e0090e..e6fad46 100644
> > > > --- a/drivers/md/dm-mpath.c
> > > > +++ b/drivers/md/dm-mpath.c
> > > > @@ -920,8 +920,10 @@ static int multipath_map(struct dm_target *ti, struct request *clone,
> > > >  	map_context->ptr = mpio;
> > > >  	clone->cmd_flags |= REQ_FAILFAST_TRANSPORT;
> > > >  	r = map_io(m, clone, mpio, 0);
> > > > -	if (r < 0 || r == DM_MAPIO_REQUEUE)
> > > > +	if (r < 0 || r == DM_MAPIO_REQUEUE) {
> > > >  		mempool_free(mpio, m->mpio_pool);
> > > > +		map_context->ptr = NULL;
> > > > +	}
> > > >
> > > >  	return r;
> > > >  }
> > >
> > > With your patch we haven't been able to reproduce the kernel crash until now.
> > > Now we "only" run into I/O stalls, which before your patch we also did. But
> > > repeatedly rebooting and retrying and ignoring the I/O stalls always lead to
> > > a crash.
> > > Gonzalo will run a couple of extra rounds so we can have a feeling if at least
> > > one of the bugs could be fixed with your patch ;)
> >
> > Hi,
> >
> > Any update after further testing with Hannes' patch?
>
> Sorry for the late update, our internal IBM IMAP servers have been down
> for nearly a week :/
>
> So, we were unable to reproduce the original bug with the patch applied
> during various runs.

OK, so it seems to be a beneficial change (and obviously correct to me).

Hannes, care to formally post your fix to dm-devel so we can get it in
3.2-rc?

> However, we ran into this one instead, which is yet another use-after-free bug
> (I need to double check, but I'm quite sure that a freed struct scsi_cmnd
> caused this).

OK, yeah something is causing poisoned (POISON_FREE) memory to be used.

> [ 4906.683654] Unable to handle kernel pointer dereference at virtual kernel address 6b6b6b6b6b6b6000
...

> Gonzalo also tried 2.6.38.8 as suggested and ran into this one:
>
> [  292.877936] ------------[ cut here ]------------
> [  292.877939] Kernel BUG at 6b6b6b6b6b6b6b6d [verbose debug info unavailable]

Again, more poison.  Seems this test is causing us to fall on our face
no matter what.

Likely, best to leave this 2.6.38 blk_unplug crash to one side and
continue focusing on latest upstream.

Mike
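
For anyone reading the traces: both addresses quoted in this message are
built from 0x6b bytes, which is the POISON_FREE value from
include/linux/poison.h that the slab allocator writes over freed objects
when poisoning is enabled. Below is a minimal userspace sketch of that
check. The helper name looks_like_freed_pointer() is hypothetical (it is
not a kernel API), and ignoring the low two bytes is my own assumption to
tolerate page-aligned or small-offset fault addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as POISON_FREE in include/linux/poison.h. */
#define POISON_FREE 0x6b

/*
 * Hypothetical helper, not a kernel API: report whether a 64-bit
 * address looks like a pointer that was loaded from poisoned
 * (already freed) slab memory.
 *
 * Assumption: the low two bytes are skipped, because an oops may
 * report the address page-aligned or at a small offset from the
 * poisoned pointer value.
 */
static bool looks_like_freed_pointer(uint64_t addr)
{
	int shift;

	/* Check bytes 2..7; every one must be the poison byte. */
	for (shift = 16; shift < 64; shift += 8) {
		if (((addr >> shift) & 0xff) != POISON_FREE)
			return false;
	}
	return true;
}
```

Passing either 0x6b6b6b6b6b6b6000 or 0x6b6b6b6b6b6b6b6d from the traces
above returns true, while an ordinary kernel address does not match. It
is only a heuristic for spotting use-after-free candidates and does not
say which object was freed.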