public inbox for linux-kernel@vger.kernel.org
From: Jens Axboe <axboe@suse.de>
To: Neil Brown <neilb@suse.de>
Cc: Chase Venters <chase.venters@clientec.com>,
	linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	akpm@osdl.org, a.titov@host.bg, askernel2615@dsgml.com,
	jamie@audible.transient.net
Subject: Re: More information on scsi_cmd_cache leak... (bisect)
Date: Fri, 27 Jan 2006 12:28:37 +0100
Message-ID: <20060127112837.GG4311@suse.de>
In-Reply-To: <20060127112352.GF4311@suse.de>

On Fri, Jan 27 2006, Jens Axboe wrote:
> On Fri, Jan 27 2006, Neil Brown wrote:
> > On Friday January 27, chase.venters@clientec.com wrote:
> > > Greetings,
> > > 	Just a quick recap - there are at least 4 reports of 2.6.15 users 
> > > experiencing severe slab leaks with scsi_cmd_cache. It seems that a few of us 
> > > have a board (Asus P5GDC-V Deluxe) in common. We seem to have raid in common. 
> > > 	After dealing with this leak for a while, I decided to do some dancing around 
> > > with git bisect. I've landed on a possible point of regression:
> > > 
> > > commit: a9701a30470856408d08657eb1bd7ae29a146190
> > > [PATCH] md: support BIO_RW_BARRIER for md/raid1
> > > 
> > > 	I spent about an hour and a half reading through the patch, trying to see if 
> > > I could make sense of what might be wrong. The result (after I dug into the 
> > > code to make a change I foolishly thought made sense) was a hung kernel.
> > > 	This is important because when I rebooted into the kernel that had been 
> > > giving me trouble, it started an md resync and I'm now watching (at least 
> > > during this resync) the slab usage for scsi_cmd_cache stay sane:
> > > 
> > > turbotaz ~ # cat /proc/slabinfo | grep scsi_cmd_cache
> > > scsi_cmd_cache        30     30    384   10    1 : tunables   54   27    8 : slabdata      3      3      0
> > > 
> > 
> > This suggests that the problem happens when a BIO_RW_BARRIER write is
> > sent to the device.  With this patch, md flags all superblock writes
> > as BIO_RW_BARRIER.  However, md is not so likely to update the superblock often
> > during a resync.
> > 
> > There is a (rough) count of the number of superblock writes in the
> > "Events" counter which "mdadm -D" will display.
> > You could try collecting 'Events' counter together with the
> > 'active_objs' count from /proc/slabinfo and graph the pairs - see if
> > they are linear.
> > 
> > I believe a BIO_RW_BARRIER is likely to send some sort of 'flush'
> > command to the device, and the driver for your particular device may
> > well be losing scsi_cmd_cache allocation when doing that, but I leave
> > that to someone who knows more about that code.
> 
> I already checked up on that since I suspected barriers initially. The
> path there for scsi is sd.c:sd_issue_flush() which looks pretty
> straightforward. In the end it goes through the block layer and gets back to the
> SCSI layer as a regular REQ_BLOCK_PC request.

Sorry, that was for the ->issue_flush() path, which md also uses and was
already using before the barrier addition. Most of the barrier handling
is done in
the block layer, but it could show leaks in SCSI of course. FWIW, I
tested barriers with and without md on SCSI here a few days ago and
didn't see any leaks at all.

Chase, can you post full dmesg again? I don't have it, thanks.

-- 
Jens Axboe
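
For anyone who wants to try Neil's suggestion of pairing the md "Events"
counter with the 'active_objs' count, here is a minimal sketch in Python.
The function names are hypothetical, and the parsing rests on two
assumptions: that the second field of a /proc/slabinfo line is active_objs
(as in the output quoted above), and that "mdadm -D" prints a line of the
form "Events : N" or "Events : 0.N". A Pearson coefficient close to 1.0
would suggest the two grow together; it does not by itself prove which
code path leaks.

```python
import math
import re


def parse_active_objs(slabinfo_text, cache_name="scsi_cmd_cache"):
    """Return active_objs for cache_name, or None if the cache is absent.

    Assumes the second whitespace-separated field is active_objs.
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache_name:
            return int(fields[1])
    return None


def parse_md_events(mdadm_detail_text):
    """Return the trailing integer of the 'Events' line, or None.

    Assumes the line looks like 'Events : 123' or 'Events : 0.123'.
    """
    match = re.search(r"^\s*Events\s*:\s*(?:\d+\.)?(\d+)\s*$",
                      mdadm_detail_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def pearson(pairs):
    """Pearson correlation of (events, active_objs) pairs.

    Returns None when either series is constant.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)
```

To collect samples, one could read /proc/slabinfo and the output of
"mdadm -D" for the array in question at a fixed interval, append each
(events, active_objs) pair to a list, and pass the list to pearson().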

