From: Jens Axboe <axboe@suse.de>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>,
Chase Venters <chase.venters@clientec.com>,
linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
akpm@osdl.org, a.titov@host.bg, askernel2615@dsgml.com,
jamie@audible.transient.net
Subject: Re: More information on scsi_cmd_cache leak... (bisect)
Date: Fri, 27 Jan 2006 20:16:37 +0100
Message-ID: <20060127191637.GD6928@suse.de>
In-Reply-To: <43DA6F33.3070101@cs.wisc.edu>

On Fri, Jan 27 2006, Mike Christie wrote:
> Jens Axboe wrote:
> >On Fri, Jan 27 2006, Jens Axboe wrote:
> >
> >>On Fri, Jan 27 2006, Neil Brown wrote:
> >>
> >>>On Friday January 27, chase.venters@clientec.com wrote:
> >>>
> >>>>Greetings,
> >>>> Just a quick recap - there are at least 4 reports of 2.6.15 users
> >>>>experiencing severe slab leaks with scsi_cmd_cache. It seems that a few
> >>>>of us have a board (Asus P5GDC-V Deluxe) in common. We seem to have
> >>>>raid in common. After dealing with this leak for a while, I decided
> >>>> to do some dancing around with git bisect. I've landed on a possible
> >>>>point of regression:
> >>>>
> >>>>commit: a9701a30470856408d08657eb1bd7ae29a146190
> >>>>[PATCH] md: support BIO_RW_BARRIER for md/raid1
> >>>>
> >>>> I spent about an hour and a half reading through the patch, trying
> >>>> to see if I could make sense of what might be wrong. The result (after
> >>>>I dug into the code to make a change I foolishly thought made sense)
> >>>>was a hung kernel.
> >>>> This is important because when I rebooted into the kernel that had
> >>>> been giving me trouble, it started an md resync and I'm now watching
> >>>>(at least during this resync) the slab usage for scsi_cmd_cache stay
> >>>>sane:
> >>>>
> >>>>turbotaz ~ # cat /proc/slabinfo | grep scsi_cmd_cache
> >>>>scsi_cmd_cache 30 30 384 10 1 : tunables 54 27 8 : slabdata 3 3 0
> >>>>
> >>>
> >>>This suggests that the problem happens when a BIO_RW_BARRIER write is
> >>>sent to the device. With this patch, md flags all superblock writes
> >>>as BIO_RW_BARRIER. However, md is not so likely to update the
> >>>superblock often during a resync.
> >>>
> >>>There is a (rough) count of the number of superblock writes in the
> >>>"Events" counter which "mdadm -D" will display.
> >>>You could try collecting 'Events' counter together with the
> >>>'active_objs' count from /proc/slabinfo and graph the pairs - see if
> >>>they are linear.
> >>>
> >>>I believe a BIO_RW_BARRIER is likely to send some sort of 'flush'
> >>>command to the device, and the driver for your particular device may
> >>>well be losing scsi_cmd_cache allocation when doing that, but I leave
> >>>that to someone who knows more about that code.
> >>
> >>I already checked up on that since I suspected barriers initially. The
> >>path there for scsi is sd.c:sd_issue_flush() which looks pretty
> >>straightforward. In the end it goes through the block layer and gets back to the
> >>SCSI layer as a regular REQ_BLOCK_PC request.
> >
> >
> >Sorry, that was for the ->issue_flush() that md also does but did before
> >the barrier addition as well. Most of the barrier handling is done in
> >the block layer, but it could show leaks in SCSI of course. FWIW, I
> >tested barriers with and without md on SCSI here a few days ago and
> >didn't see any leaks at all.
> >
>
> It does not have anything to do with this in scsi_io_completion, does it?
>
> 	if (blk_complete_barrier_rq(q, req, good_bytes >> 9))
> 		return;
>
> For that case the scsi_cmnd does not get freed. Does it come back around
> again and get released from a different path?

Certainly smells fishy. Unfortunately I cannot take a look at this until
Monday :/

But adding some tracing there might be really interesting. Since we are
not seeing bio and/or req leaks, this does look very promising.

--
Jens Axboe
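
Neil Brown's suggestion quoted above (pair the md "Events" counter with the
scsi_cmd_cache 'active_objs' count and see whether the pairs are linear) could
be sketched roughly as follows. This is only an illustration, not something
posted in the thread: the helper names slab_active_objs() and pearson() are
made up here, the parser assumes the /proc/slabinfo layout shown in the quoted
output (cache name first, active_objs second), and the "Events" values are
assumed to be read separately from "mdadm -D" and supplied by hand.

```python
import math


def slab_active_objs(slabinfo_text, cache="scsi_cmd_cache"):
    """Return the active_objs column for `cache`, or None if it is absent.

    Assumes the layout seen in the thread: "<name> <active_objs> <num_objs> ...".
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache:
            return int(fields[1])
    return None


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    A value close to 1.0 would be consistent with active_objs growing
    linearly with the number of superblock writes.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # one series is constant; correlation undefined
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Reading /proc/slabinfo may require root on some systems.
    try:
        with open("/proc/slabinfo") as f:
            print("scsi_cmd_cache active_objs:", slab_active_objs(f.read()))
    except OSError as exc:
        print("could not read /proc/slabinfo:", exc)
```

To use it, one would record an (Events, active_objs) pair at intervals while
the array is in normal use and pass the two columns to pearson(). A strong
correlation would only hint that the leak tracks barrier writes; it would not
by itself identify the leaking code path.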