From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751701AbXCSTHB (ORCPT );
	Mon, 19 Mar 2007 15:07:01 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751755AbXCSTHB (ORCPT );
	Mon, 19 Mar 2007 15:07:01 -0400
Received: from sabe.cs.wisc.edu ([128.105.6.20]:47811 "EHLO sabe.cs.wisc.edu"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751701AbXCSTHA (ORCPT );
	Mon, 19 Mar 2007 15:07:00 -0400
Message-ID: <45FEDF23.1060801@cs.wisc.edu>
Date: Mon, 19 Mar 2007 14:06:11 -0500
From: Mike Christie
User-Agent: Thunderbird 1.5 (X11/20060313)
MIME-Version: 1.0
To: James Bottomley
CC: Andreas Steinmetz , Linux Kernel Mailinglist ,
	linux-scsi@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: 2.6.20.3: kernel BUG at mm/slab.c:597 try#2
References: <45FDDA8E.8030100@domdv.de> <45FECD45.20705@cs.wisc.edu>
	<1174328987.3512.37.camel@mulgrave.il.steeleye.com>
In-Reply-To: <1174328987.3512.37.camel@mulgrave.il.steeleye.com>
X-Enigmail-Version: 0.94.0.0
Content-Type: multipart/mixed;
	boundary="------------090309010601060903010201"
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

This is a multi-part message in MIME format.
--------------090309010601060903010201
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

James Bottomley wrote:
> On Mon, 2007-03-19 at 12:49 -0500, Mike Christie wrote:
>>> I can't even say if the tapes are written correctly as I can't read them
>>> (one does not reboot production machines back to 2.4.x just to try to
>>> read a backup tape - I don't have 2.6.x older than 2.6.20 on these
>>> machines).
>> Could you try this patch
>> http://marc.info/?l=linux-scsi&m=116464965414878&w=2
>> I thought st was modified to not send offsets in the last elements but
>> it looks like it wasn't.
>
> Actually, there are two patches in the email referred to.
> If the analysis that we're passing NULL to mempool_free is correct, it
> should be the second one that fixes the problem (the one that checks
> bio->bi_io_vec before freeing it).  Which would mean we have a
> nr_vecs==0 bio generated by the tar somehow.
>

I think we might only need the first patch if the problem is similar to
what the lsi guys were seeing. I thought the problem is that we are not
estimating how large the transfer is correctly because we do not take
into account offsets at the end. This results in nr_vecs being zero when
it should be a valid value.

I thought Kai's patch:
http://bugzilla.kernel.org/show_bug.cgi?id=7919
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=9abe16c670bd3d4ab5519257514f9f291383d104
fixed the problem on st's side, but I guess not so you are probably right.

Here is a patch that dumps the sgl we are getting from st so we can see
for sure what we are getting and can decide if we need the first patch,
second patch or both.

--------------090309010601060903010201
Content-Type: text/x-patch;
	name="dump-sgl.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
	filename="dump-sgl.patch"

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 5f95570..81005aa 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -306,6 +306,10 @@ static int scsi_req_map_sg(struct reques
 	struct bio *bio = NULL;
 	int i, err, nr_vecs = 0;
 
+	for (i = 0; i < nsegs; i++)
+		printk(KERN_INFO "sg length %u offset %u\n", sgl[i].length,
+			sgl[i].offset);
+
 	for (i = 0; i < nsegs; i++) {
 		page = sgl[i].page;
 		off = sgl[i].offset;

--------------090309010601060903010201--
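
The estimation problem described above (offsets in later elements not
being counted) can be illustrated with a small userspace sketch. This is
not the code in scsi_lib.c: the struct, the function names and the
4096-byte page size are all hypothetical, and it assumes the page
estimate is derived from the total length plus the first element's
offset only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for one scatterlist element (not the kernel struct). */
struct sketch_seg {
	unsigned int length;	/* bytes in this element */
	unsigned int offset;	/* starting offset within its first page */
};

/*
 * Estimate that only looks at the total length and the first offset.
 * Assumption: this mirrors the kind of up-front estimate discussed in
 * the thread; it is not copied from the kernel.
 */
static unsigned int pages_estimated(const struct sketch_seg *segs, size_t n)
{
	unsigned int total = 0;
	size_t i;

	if (n == 0)
		return 0;
	for (i = 0; i < n; i++)
		total += segs[i].length;
	return (total + segs[0].offset + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Pages touched when every element's own offset is taken into account. */
static unsigned int pages_touched(const struct sketch_seg *segs, size_t n)
{
	unsigned int pages = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* An offset is expected to lie inside a single page. */
		assert(segs[i].offset < SKETCH_PAGE_SIZE);
		if (segs[i].length == 0)
			continue;
		pages += (segs[i].offset + segs[i].length +
			  SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	}
	return pages;
}
```

For two elements of 4096 bytes each, where only the second one starts at
offset 512, pages_estimated() gives 2 while pages_touched() gives 3. If
an allocation budget is drawn down from the smaller number, it can run
out before the last page is mapped, which would be consistent with a
later allocation being asked for zero vecs.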