From: Lukas Kolbe <lkolbe@techfak.uni-bielefeld.de>
To: James Bottomley <James.Bottomley@suse.de>
Cc: "Kai Mäkisara" <kai.makisara@kolumbus.fi>,
	"FUJITA Tomonori" <fujita.tomonori@lab.ntt.co.jp>,
	linux-scsi@vger.kernel.org,
	"Kashyap Desai" <Kashyap.Desai@lsi.com>
Subject: Re: After memory pressure: can't read from tape anymore
Date: Sun, 05 Dec 2010 11:53:03 +0100
Message-ID: <1291546383.2814.2890.camel@larosa>
In-Reply-To: <1291399814.2881.66.camel@mulgrave.site>

On Friday, 03.12.2010, at 12:10 -0600, James Bottomley wrote:
> On Fri, 2010-12-03 at 18:03 +0100, Lukas Kolbe wrote:
> > On Friday, 03.12.2010, at 09:06 -0600, James Bottomley wrote:
> > > On Fri, 2010-12-03 at 16:59 +0200, Kai Mäkisara wrote:
> > > > On 12/03/2010 02:27 PM, FUJITA Tomonori wrote:
> > > > >
> > > > > Can we make enlarge_buffer a bit friendlier to the memory allocator?
> > > > >
> > > > > His problem is that the driver can't allocate 2 MB with the hardware
> > > > > limit of 128 segments.
> > > > >
> > > > > enlarge_buffer tries to use ST_MAX_ORDER, and if that allocation (a
> > > > > 256 kB page) fails, enlarge_buffer fails. Could it try a smaller
> > > > > order instead?
> > > > >
> > > > > Not tested at all.
> > > > >
> > > > >
> > > > > diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
> > > > > index 5b7388f..119544b 100644
> > > > > --- a/drivers/scsi/st.c
> > > > > +++ b/drivers/scsi/st.c
> > > > > @@ -3729,7 +3729,8 @@ static int enlarge_buffer(struct st_buffer * STbuffer, int new_size, int need_dm
> > > > > >   		b_size = PAGE_SIZE << order;
> > > > > >   	} else {
> > > > > >   		for (b_size = PAGE_SIZE, order = 0;
> > > > > > -		     order < ST_MAX_ORDER && b_size < new_size;
> > > > > > +		     order < ST_MAX_ORDER &&
> > > > > > +			     max_segs * (PAGE_SIZE << order) < new_size;
> > > > > >   		     order++, b_size *= 2)
> > > > > >   			;  /* empty */
> > > > > >   	}
> > > > 
> > > > You are correct. The loop does not work at all as it should. Years ago,
> > > > the strategy was to start with blocks as big as possible to minimize the
> > > > number of s/g segments. Nowadays the segments must be of the same size
> > > > and the old logic is not applicable.
> > > > 
> > > > I have not tested the patch either but it looks correct.
> > > > 
> > > > Thanks for noticing this bug. I hope this helps the users. The question
> > > > about the number of s/g segments is still valid for the direct i/o case,
> > > > but that is a matter of optimization, not of whether one can read/write.
> > > 
> > > Realistically, though, this will only increase the probability of making
> > > an allocation work; we can't get this to a certainty.
> > > 
> > > Since we fixed up the infrastructure to allow arbitrary length sg lists,
> > > perhaps we should document what cards can actually take advantage of
> > > this (and how to do so, since it's not set automatically on boot).  That
> > > way users wanting tapes at least know what the problems are likely to be
> > > and how to avoid them in their hardware purchasing decisions.  The
> > > corollary is that we should likely have a list of not recommended cards:
> > > if they can't go over 128 SG elements, then they're pretty much
> > > unsuitable for modern tapes.
> > 
> > Are you implying here that the LSI SAS1068E is unsuitable to drive two
> > LTO-4 tape drives? Or is it 'just' a problem with the driver?
> 
> The information points to the former.  There's no way the kernel can
> guarantee physical contiguity of memory as it operates.  We try to
> defrag, but it's probabilistic, not certain, so if we have to try to
> find a physically contiguous buffer to copy into for an operation like
> this, at some point that allocation is going to fail.
> 
> The only way to be certain you can get a 2MB block down to a tape device
> is to be able to transmit the whole thing as a SG list of fully
> discontiguous pages.  On a system with 4k pages, that requires 512 SG
> entries.  From what I've heard Kashyap say, that can't currently be done
> on the 1068 because of firmware limitations (I'm not entirely clear on
> this, but that's how it sounds to me ... if there is a way of making
> firmware accept more than 128 SG elements per SCSI command, then it is a
> fairly simple driver change).

Well, 2MB block sizes actually do work: bacula reports a block size of
~2MB for each drive while writing to it. Only after there has been memory
pressure and a new tape has been inserted is it *not* possible anymore to
write to the tape with these block sizes, and dmesg shows one of these
every time bacula tries to read from or write to a tape:

[101883.958351] st0: Can't allocate 2097152 byte tape buffer.
[103901.666608] st0: Can't allocate 10249541 byte tape buffer.

No idea why it's trying 10MB, though.
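
For scale, the arithmetic behind these two requests can be sketched in
userspace. This is only an illustration, not driver code: it assumes 4 KiB
pages and the 128-segment limit mentioned in this thread, and the helper
names sg_entries_for() and min_chunk_bytes() are made up for the example.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as in the quoted discussion. */
#define PAGE_SIZE_BYTES 4096UL

/*
 * Number of SG entries needed if every page of the buffer is
 * physically discontiguous (one entry per page, rounded up).
 */
static unsigned long sg_entries_for(unsigned long bytes)
{
	return (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/*
 * Smallest chunk size (page size times a power of two) such that
 * max_segs equally sized chunks cover the whole buffer. Every chunk
 * larger than one page must be physically contiguous.
 */
static unsigned long min_chunk_bytes(unsigned long bytes,
				     unsigned long max_segs)
{
	unsigned long chunk = PAGE_SIZE_BYTES;

	assert(max_segs > 0);
	while (chunk * max_segs < bytes)
		chunk *= 2;
	return chunk;
}
```

Under these assumptions, sg_entries_for(2097152) gives 512, while
min_chunk_bytes(2097152, 128) gives 16384: with only 128 segments, each
segment has to be a contiguous 16 KiB allocation. For the 10249541 byte
request the same limit forces 131072 byte (128 KiB) chunks, which are
harder to find once memory is fragmented.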

I tested with the patch from Fujita. These messages, which showed up
before applying the patch:

[158544.348411] st: append_to_buffer offset overflow.

no longer appear. The patch didn't help with not being able to write
after memory pressure, though.
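
To make the effect of the patch concrete, here is a userspace sketch of the
order selection before and after the change. It is not the driver code
itself: the two loop conditions are copied from the quoted diff, but the
function names old_order() and patched_order() are invented for the
example, PAGE_SIZE_BYTES assumes 4 KiB pages, and ST_MAX_ORDER is assumed
to be 6 because that matches the 256 kB figure given in the thread.

```c
#include <assert.h>

#define PAGE_SIZE_BYTES 4096	/* assumption: 4 KiB pages */
#define ST_MAX_ORDER    6	/* assumption: 4 KiB << 6 = 256 kB */

/* Loop condition before the patch: grow until one chunk covers new_size. */
static int old_order(int new_size)
{
	int order, b_size;

	for (b_size = PAGE_SIZE_BYTES, order = 0;
	     order < ST_MAX_ORDER && b_size < new_size;
	     order++, b_size *= 2)
		;  /* empty */
	return order;
}

/* Loop condition after the patch: grow until max_segs chunks cover new_size. */
static int patched_order(int new_size, int max_segs)
{
	int order;

	assert(max_segs > 0);
	for (order = 0;
	     order < ST_MAX_ORDER &&
		     (long)max_segs * (PAGE_SIZE_BYTES << order) < new_size;
	     order++)
		;  /* empty */
	return order;
}
```

With these assumptions, a 2097152 byte buffer gets order 6 (256 kB chunks)
from the old condition and order 2 (16 kB chunks) from the patched one
with 128 segments. The 10249541 byte request still needs order 5 (128 kB
chunks) after the patch, so it remains sensitive to fragmentation.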

>  This isn't something we can work around
> in the driver because the transaction can't be split ... it has to go
> down as a single WRITE command with a single output data buffer.
> 
> The LSI 1068 is an upgradeable firmware system, so it's always possible
> LSI can come up with a firmware update that increases the size (this
> would also require a corresponding driver change), but it doesn't sound
> like something that can be done in the driver alone.

If only LSI's website were a little clearer on where to find updated
firmware and what the latest version is :/.

-- 
Lukas


