public inbox for linux-kernel@vger.kernel.org
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	Jens Axboe <jens.axboe@oracle.com>,
	Christoph Lameter <clameter@sgi.com>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Peter Zijlstra <a.p.ziljstra@chello.nl>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] scsi: fix sense_slab/bio swapping livelock
Date: Sun, 06 Apr 2008 18:35:35 -0500
Message-ID: <1207524935.3223.39.camel@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.64.0804062348001.19446@blonde.site>

On Sun, 2008-04-06 at 23:56 +0100, Hugh Dickins wrote:
> Since 2.6.25-rc7, I've been seeing an occasional livelock on one
> x86_64 machine, copying kernel trees to tmpfs, paging out to swap.
> 
> Signature: 6000 pages under writeback but never getting written;
> most tasks of interest trying to reclaim, but each get_swap_bio
> waiting for a bio in mempool_alloc's io_schedule_timeout(5*HZ);
> every five seconds an atomic page allocation failure report from
> kblockd failing to allocate a sense_buffer in __scsi_get_command.
> 
> __scsi_get_command has a (one item) free_list to protect against
> this, but rc1's [SCSI] use dynamically allocated sense buffer
> de25deb18016f66dcdede165d07654559bb332bc upset that slightly.
> When it fails to allocate from the separate sense_slab, instead
> of giving up, it must fall back to the command free_list, which
> is sure to have a sense_buffer attached.
> 
> Either my earlier -rc testing missed this, or there's some recent
> contributory factor.  One very significant factor is SLUB, which
> merges slab caches when it can, and on 64-bit happens to merge
> both bio cache and sense_slab cache into kmalloc's 128-byte cache:
> so that under this swapping load, bios above are liable to gobble
> up all the slots needed for scsi_cmnd sense_buffers below.
> 
> That's disturbing behaviour, and I tried a few things to fix it.
> Adding a no-op constructor to the sense_slab inhibits SLUB from
> merging it, and stops all the allocation failures I was seeing;
> but it's rather a hack, and perhaps in different configurations
> we have other caches on the swapout path which are ill-merged.
> 
> Another alternative is to revert the separate sense_slab, using
> cache-line-aligned sense_buffer allocated beyond scsi_cmnd from
> the one kmem_cache; but that might waste more memory, and is
> only a way of diverting around the known problem.
> 
> While I don't like seeing the allocation failures, and hate the
> idea of all those bios piled up above a scsi host working one by
> one, it does seem to emerge fairly soon with the livelock fix.
> So lacking better ideas, stick with that one clear fix for now.
> 
> Signed-off-by: Hugh Dickins <hugh@veritas.com>
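
The fallback described in the quoted text can be sketched in userspace C
as follows. This is only an illustration under stated assumptions, not the
kernel code: struct cmd, struct host, host_init(), get_command(),
put_command() and the fail_* injection flags are hypothetical names,
calloc() stands in for the slab allocators, and the locking that the real
free_list needs is left out. The point it shows is that a failed
sense-buffer allocation falls through to the one-item reserve instead of
returning failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SENSE_BUFFERSIZE 96     /* assumed size, for illustration only */

struct cmd {
	unsigned char *sense_buffer;
};

struct host {
	struct cmd *free_cmd;   /* one-item reserve, sense buffer attached */
};

/* Hypothetical failure injection standing in for memory pressure. */
static int fail_cmd_alloc;
static int fail_sense_alloc;

static struct cmd *cmd_alloc(void)
{
	return fail_cmd_alloc ? NULL : calloc(1, sizeof(struct cmd));
}

static unsigned char *sense_alloc(void)
{
	return fail_sense_alloc ? NULL : calloc(1, SENSE_BUFFERSIZE);
}

/* Pre-allocate the reserve command together with its sense buffer. */
static int host_init(struct host *h)
{
	h->free_cmd = calloc(1, sizeof(*h->free_cmd));
	if (!h->free_cmd)
		return -1;
	h->free_cmd->sense_buffer = calloc(1, SENSE_BUFFERSIZE);
	if (!h->free_cmd->sense_buffer) {
		free(h->free_cmd);
		h->free_cmd = NULL;
		return -1;
	}
	return 0;
}

static struct cmd *get_command(struct host *h)
{
	struct cmd *c = cmd_alloc();

	if (c) {
		c->sense_buffer = sense_alloc();
		if (!c->sense_buffer) {
			/* Do not give up here: drop the half-built
			 * command and fall through to the reserve. */
			free(c);
			c = NULL;
		}
	}
	if (!c && h->free_cmd) {
		c = h->free_cmd;        /* reserve already has a buffer */
		h->free_cmd = NULL;
	}
	return c;
}

static void put_command(struct host *h, struct cmd *c)
{
	assert(c && c->sense_buffer);
	if (!h->free_cmd) {
		h->free_cmd = c;        /* refill the reserve first */
		return;
	}
	free(c->sense_buffer);
	free(c);
}
```

With fail_sense_alloc set, get_command() hands out the reserve command
once and returns NULL after that, until put_command() refills the reserve.
That matches the "working one by one" behaviour mentioned in the quoted
text.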

This was sort of accidentally fixed in scsi-misc by:

commit c5f73260b289cb974928eac05f2d84e58ddfc020
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Thu Mar 13 11:16:33 2008 -0500

    [SCSI] consolidate command allocation in a single place

Could you try:

master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git

and see if it alleviates the problem? If so, we can work out which
pieces to backport.

Thanks,

James




Thread overview: 30+ messages
2008-04-06 22:56 [PATCH] scsi: fix sense_slab/bio swapping livelock Hugh Dickins
2008-04-06 23:35 ` James Bottomley [this message]
2008-04-07  1:01   ` Hugh Dickins
2008-04-07 17:51     ` Hugh Dickins
2008-04-07 18:04       ` James Bottomley
2008-04-07 18:26         ` Hugh Dickins
2008-04-07  2:48 ` FUJITA Tomonori
2008-04-07 18:07   ` Hugh Dickins
2008-04-08 14:04     ` FUJITA Tomonori
2008-04-07  5:26 ` Christoph Lameter
2008-04-07 19:40   ` Hugh Dickins
2008-04-07 19:55     ` Peter Zijlstra
2008-04-07 20:31       ` Hugh Dickins
2008-04-07 20:47         ` Peter Zijlstra
2008-04-07 21:00         ` Pekka Enberg
2008-04-07 21:05           ` Pekka Enberg
2008-04-07 21:15             ` Linus Torvalds
2008-04-07 21:34               ` Pekka Enberg
2008-04-07 21:39                 ` Pekka Enberg
2008-04-07 22:05                   ` Pekka J Enberg
2008-04-07 22:17                     ` Linus Torvalds
2008-04-07 22:42                       ` Pekka Enberg
2008-04-08 20:42                       ` Pekka J Enberg
2008-04-08 20:44                         ` Pekka Enberg
2008-04-08 20:45                         ` Christoph Lameter
2008-04-08 21:11                           ` Pekka Enberg
2008-04-08 21:40                             ` Peter Zijlstra
2008-04-07 21:30             ` Hugh Dickins
2008-04-07 21:36               ` Pekka Enberg
2008-04-08 20:43     ` Christoph Lameter
