From: Benny Halevy <bhalevy@panasas.com>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] remove use_sg_chaining
Date: Mon, 21 Jan 2008 13:32:33 +0200
Message-ID: <479482D1.8090904@panasas.com>
In-Reply-To: <20080121093112.GG6258@kernel.dk>

On Jan. 21, 2008, 11:31 +0200, Jens Axboe <jens.axboe@oracle.com> wrote:
> On Mon, Jan 21 2008, Boaz Harrosh wrote:
>> On Sun, Jan 20 2008 at 22:59 +0200, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>>> On Sun, 2008-01-20 at 21:01 +0100, Jens Axboe wrote:
>>>> On Sun, Jan 20 2008, Jens Axboe wrote:
>>>>> On Sun, Jan 20 2008, Boaz Harrosh wrote:
>>>>>> On Sun, Jan 20 2008 at 21:29 +0200, Jens Axboe <jens.axboe@oracle.com> wrote:
>>>>>>> On Sun, Jan 20 2008, James Bottomley wrote:
>>>>>>>> On Sun, 2008-01-20 at 21:18 +0200, Boaz Harrosh wrote:
>>>>>>>>> On Tue, Jan 15 2008 at 19:52 +0200, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>>>>>>>>>> this patch depends on the sg branch of the block tree
>>>>>>>>>>
>>>>>>>>>> James
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> From: James Bottomley <James.Bottomley@HansenPartnership.com>
>>>>>>>>>> Date: Tue, 15 Jan 2008 11:11:46 -0600
>>>>>>>>>> Subject: remove use_sg_chaining
>>>>>>>>>>
>>>>>>>>>> With the sg table code, every SCSI driver is now either chain capable
>>>>>>>>>> or broken, so there's no need to have a check in the host template.
>>>>>>>>>>
>>>>>>>>>> Also tidy up the code by moving the scatterlist size defines into the
>>>>>>>>>> SCSI includes and permit the last entry of the scatterlist pools not
>>>>>>>>>> to be a power of two.
>>>>>>>>>> ---
>>>>>>>>> I have a theoretical problem that BUGed me from the beginning.
>>>>>>>>>
>>>>>>>>> Could it happen that a memory critical IO, (that is needed to free
>>>>>>>>> memory), be collected into an sg-chained large IO, and the allocation 
>>>>>>>>> of the multiple sg-pool-allocations fail, thus deadlocking on
>>>>>>>>> out-of-memory? Is there a mechanism in place that will split large IO's 
>>>>>>>>> into smaller chunks in the event of out-of-memory condition in prep_fn?
>>>>>>>>>
>>>>>>>>> Is it possible to call blk_rq_map_sg() with less than what is present
>>>>>>>>> at request to only map the starting portion?
>>>>>>>> Obviously, that's why I was worrying about mempool size and default
>>>>>>>> blocks a while ago.
>>>>>>>>
>>>>>>>> However, the deadlock only occurs if the device is swap or backing a
>>>>>>>> filesystem with memory mapped files.  The use cases for this are really
>>>>>>>> tapes and other entities that need huge buffers.  That's why we're
>>>>>>>> keeping the system sector size at 1024 unless you alter it through sysfs
>>>>>>>> (here gun, there foot ...)
>>>>>>> Alternatively (and much safer, imho), we allow blk_rq_map_sg() return
>>>>>>> smaller than nr_phys_segments and just ensure that the request is
>>>>>>> continued nicely through the normal 'request if residual' logic.
>>>>>>>
>>>>>> That's a great idea. I will queue it on my todo list. Thanks
>>>>> ok good, thanks :-)
>>>> btw, the above is full of typos, my apologies. it should read "requeue
>>>> if residual", but I guess you already guessed as much.
>>> Something like ...
>>>
>>> It looks to me like it would make sense to have something like a
>>> BLKPREP_SGALLOCFAIL return so the block layer can do this for us ...
>>> Alternatively, we'll have to find a way of adjusting the sector count as
>>> it goes into the ULD prep functions.
>>>
>>> James
>> By luck this is no problem because it happens exactly before the ULD
>> actually prepares the command. sd and sr are already doing these
>> adjustments based on bufflen. For BLOCK_PC we will need to fail with
>> perhaps a new BLKPREP_SGALLOCFAIL, like you said, and let the
>> initiator take care of it.
> 
> Right, the scsi_init_io() takes care of it and adjusts the buflen as
> needed, no need to pass this "error" back. As far as I'm concerned,
> blocking for BLOCK_PC requests should be fine (is anyone using these for
> swap?).
> 

It could help the OSD I/O module, which will produce BLOCK_PC bidi CDBs, to get
feedback in the form of ENOMEM so that it can throttle down its I/O coalescing
sizes and generate smaller I/Os in the face of memory pressure, then gradually
throttle back up on success.  If the requests are instead held back and blocked
at the queue, I'm concerned that this could hurt performance under memory
pressure by not filling up the pipeline as much as we can.
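
To make the throttling idea concrete, here is a minimal sketch in plain C,
not taken from any existing OSD code.  The struct io_throttle, its fields,
io_throttle_update(), and the halve-on-ENOMEM / add-a-step-on-success policy
are all assumptions made for illustration; a real initiator could pick a
different back-off policy.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical throttle state; none of these names exist in the kernel. */
struct io_throttle {
	size_t cur;	/* current coalescing limit, in bytes */
	size_t min;	/* floor: never coalesce below this */
	size_t max;	/* ceiling: never coalesce above this */
	size_t step;	/* additive increase applied on each success */
};

/*
 * Adjust the coalescing limit from the result of the last submission.
 * err is 0 on success or a negative errno value; the new limit is returned.
 */
size_t io_throttle_update(struct io_throttle *t, int err)
{
	if (err == -ENOMEM) {
		/* Memory pressure: halve the limit, but respect the floor. */
		t->cur /= 2;
		if (t->cur < t->min)
			t->cur = t->min;
	} else if (err == 0) {
		/* Success: grow back gradually, capped at the ceiling. */
		if (t->max - t->cur < t->step)
			t->cur = t->max;
		else
			t->cur += t->step;
	}
	/* Any other error leaves the limit unchanged. */
	return t->cur;
}
```

The caller would consult the returned limit before coalescing the next I/O,
so the pipeline keeps being fed with smaller requests instead of stalling.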

Benny
