From: Jens Axboe <axboe@suse.de>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>,
Chase Venters <chase.venters@clientec.com>,
linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
akpm@osdl.org, a.titov@host.bg, askernel2615@dsgml.com,
jamie@audible.transient.net
Subject: Re: More information on scsi_cmd_cache leak... (bisect)
Date: Fri, 27 Jan 2006 20:16:37 +0100
Message-ID: <20060127191637.GD6928@suse.de>
In-Reply-To: <43DA6F33.3070101@cs.wisc.edu>

On Fri, Jan 27 2006, Mike Christie wrote:
> Jens Axboe wrote:
> >On Fri, Jan 27 2006, Jens Axboe wrote:
> >
> >>On Fri, Jan 27 2006, Neil Brown wrote:
> >>
> >>>On Friday January 27, chase.venters@clientec.com wrote:
> >>>
> >>>>Greetings,
> >>>> Just a quick recap - there are at least 4 reports of 2.6.15 users
> >>>>experiencing severe slab leaks with scsi_cmd_cache. It seems that a few
> >>>>of us have a board (Asus P5GDC-V Deluxe) in common. We seem to have
> >>>>raid in common. After dealing with this leak for a while, I decided
> >>>> to do some dancing around with git bisect. I've landed on a possible
> >>>>point of regression:
> >>>>
> >>>>commit: a9701a30470856408d08657eb1bd7ae29a146190
> >>>>[PATCH] md: support BIO_RW_BARRIER for md/raid1
> >>>>
> >>>> I spent about an hour and a half reading through the patch, trying
> >>>> to see if I could make sense of what might be wrong. The result (after
> >>>>I dug into the code to make a change I foolishly thought made sense)
> >>>>was a hung kernel.
> >>>> This is important because when I rebooted into the kernel that had
> >>>> been giving me trouble, it started an md resync and I'm now watching
> >>>>(at least during this resync) the slab usage for scsi_cmd_cache stay
> >>>>sane:
> >>>>
> >>>>turbotaz ~ # cat /proc/slabinfo | grep scsi_cmd_cache
> >>>>scsi_cmd_cache 30 30 384 10 1 : tunables 54 27 8 : slabdata 3 3 0
> >>>>
> >>>
> >>>This suggests that the problem happens when a BIO_RW_BARRIER write is
> >>>sent to the device. With this patch, md flags all superblock writes
> >>>as BIO_RW_BARRIER. However, md is not so likely to update the superblock
> >>>often during a resync.
> >>>
> >>>There is a (rough) count of the number of superblock writes in the
> >>>"Events" counter which "mdadm -D" will display.
> >>>You could try collecting 'Events' counter together with the
> >>>'active_objs' count from /proc/slabinfo and graph the pairs - see if
> >>>they are linear.
> >>>
> >>>I believe a BIO_RW_BARRIER is likely to send some sort of 'flush'
> >>>command to the device, and the driver for your particular device may
> >>>well be losing scsi_cmd_cache allocation when doing that, but I leave
> >>>that to someone who knows more about that code.
> >>
> >>I already checked up on that since I suspected barriers initially. The
> >>path there for scsi is sd.c:sd_issue_flush() which looks pretty
> >>straightforward. In the end it goes through the block layer and gets back to the
> >>SCSI layer as a regular REQ_BLOCK_PC request.
> >
> >
> >Sorry, that was for the ->issue_flush() that md also does but did before
> >the barrier addition as well. Most of the barrier handling is done in
> >the block layer, but it could show leaks in SCSI of course. FWIW, I
> >tested barriers with and without md on SCSI here a few days ago and
> >didn't see any leaks at all.
> >
>
> It does not have anything to do with this in scsi_io_completion does it?
>
> 	if (blk_complete_barrier_rq(q, req, good_bytes >> 9))
> 		return;
>
> For that case the scsi_cmnd does not get freed. Does it come back around
> again and get released from a different path?

Certainly smells fishy. Unfortunately I cannot take a look at this until
Monday :/

But adding some tracing there might be really interesting. Since we are
not seeing bio and/or req leaks, this does look very promising.

--
Jens Axboe
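
Neil's suggestion quoted above (collect the md "Events" counter together with
active_objs from /proc/slabinfo and see whether the pairs are linear) could be
sketched roughly as below. This is a minimal illustration under stated
assumptions, not a tool from the thread: the function names are hypothetical,
the layout of the "Events" line printed by "mdadm -D" varies between mdadm
versions (the dotted form is assumed here to be a high.low pair), and the
linearity check is a plain least-squares fit with a correlation coefficient.

```python
import math
import re

# Sample line taken from the slabinfo output quoted in the thread.
SAMPLE_SLABINFO = (
    "scsi_cmd_cache 30 30 384 10 1 : tunables 54 27 8 : slabdata 3 3 0\n"
)


def parse_slab_active_objs(slabinfo_text, cache_name="scsi_cmd_cache"):
    """Return active_objs (the column right after the name), or None."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache_name:
            return int(fields[1])
    return None


def parse_md_events(mdadm_detail_text):
    """Return the Events counter from `mdadm -D` text, or None if not found.

    Assumption: the line looks like "Events : 1234" or "Events : 0.1234".
    The dotted form is treated as a 32-bit high part and a 32-bit low part.
    """
    match = re.search(
        r"^\s*Events\s*:\s*(\d+)(?:\.(\d+))?\s*$",
        mdadm_detail_text,
        re.MULTILINE,
    )
    if match is None:
        return None
    if match.group(2) is None:
        return int(match.group(1))
    return (int(match.group(1)) << 32) + int(match.group(2))


def linear_fit(pairs):
    """Least-squares fit y = slope * x + intercept over (x, y) pairs.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    Raises ValueError when either variable never changes across the samples.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in one of the variables")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r
```

In use, one would sample both values periodically (reading /proc/slabinfo and
the output of "mdadm -D" for the affected array), append each
(events, active_objs) pair to a list, and pass the list to linear_fit(). A
correlation close to 1 with a positive slope would be consistent with objects
being leaked per superblock write; it would not by itself prove where the leak
is.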