From: James Bottomley <James.Bottomley@steeleye.com>
To: Patrick Mansfield <patmans@us.ibm.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>,
linux-scsi@vger.kernel.org
Subject: Re: [PATCH 2.5.17] Making SCSI not copy the request structure
Date: Wed, 22 May 2002 20:06:24 -0400
Message-ID: <200205230006.g4N06PL03133@localhost.localdomain>
In-Reply-To: Message from Patrick Mansfield <patmans@us.ibm.com> of "Wed, 22 May 2002 15:44:06 PDT." <20020522154406.A17222@eng2.beaverton.ibm.com>
patmans@us.ibm.com said:
> I applied your patch and successfully ran with AIC + 2 disks (one the
> boot disk), plus with qla (modified v6.0b20 to remove io request lock)
> drivers attached to both Triton (disk array) and Seagate drives using
> block and raw io.
That's great, thanks for testing it.
> Do you think the queue depth on some of the adapters/devices should be
> shrunk or the request queue increased with your patch? Some adapters
> set device queue depths above 200 (for example, aic set mine to 253),
> this seems like overkill, but today it means they can have 200 more
> IOs on the request queue, whereas freeing the request only after the IO
> completes (with your patch) means we would sometimes have 200 fewer
> entries.
That's a tough one. There are differing schools of thought on queue depth. I
incline to the one that says that, for modern SCSI devices, 4-8 is probably a
good figure, but there are definitely people who disagree. The IDE code uses
32 as the queue depth. One of the things I hope to get from standardising the
TCQ interface is the ability to adjust the queue depth from user land.
To move to a standard implementation in the generic layer, I think that in
practice the queue depths have to be lower (certainly lower than 253). The
current TCQ generic code uses an arbitrary-length bitmap to track outstanding
tags, which means it would scale OK for high queue depths, but as you say, we
are limited by the number of available requests.
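To make the bitmap point concrete, here is a minimal userspace sketch of tag tracking with one bit per tag. It is not the generic TCQ code itself; the names tag_map, tag_map_init, tag_alloc and tag_free are hypothetical and chosen only for illustration. The map grows with the queue depth, so the depth is bounded by how many requests are available rather than by the map.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical tag map: one bit per tag, set while the tag is outstanding. */
struct tag_map {
    unsigned long *bits;
    unsigned int depth;   /* queue depth, i.e. number of usable tags */
};

/* Allocate a zeroed bitmap large enough for 'depth' tags. */
static int tag_map_init(struct tag_map *m, unsigned int depth)
{
    size_t words = (depth + BITS_PER_WORD - 1) / BITS_PER_WORD;

    m->bits = calloc(words, sizeof(unsigned long));
    m->depth = depth;
    return m->bits ? 0 : -1;
}

/* Return the lowest free tag, or -1 if all 'depth' tags are outstanding. */
static int tag_alloc(struct tag_map *m)
{
    for (unsigned int t = 0; t < m->depth; t++) {
        unsigned long mask = 1UL << (t % BITS_PER_WORD);
        unsigned long *word = &m->bits[t / BITS_PER_WORD];

        if (!(*word & mask)) {
            *word |= mask;
            return (int)t;
        }
    }
    return -1;
}

/* Mark a tag as free again once its command has completed. */
static void tag_free(struct tag_map *m, unsigned int tag)
{
    assert(tag < m->depth);
    m->bits[tag / BITS_PER_WORD] &= ~(1UL << (tag % BITS_PER_WORD));
}
```

The linear scan is chosen for clarity; a real implementation would more likely use a find-first-zero-bit primitive.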
> I don't understand how/why the journaling file systems want to use a
> barrier, and how it helps their IO.
> Are the request barriers needed to prevent earlier IO from completing
> before the barrier, or later IO from completing before the barrier, or
> both?
There were several discussion threads on the topic, but this is the only one I
can find:
http://marc.theaimsgroup.com/?t=101360488200004&r=1&w=2
Essentially, journalled fs can function more efficiently if they can rely on
transaction ordering (within ordering "barriers") making it all the way to the
medium. There was also a thought that this might speed up jfs operations, but
no conclusive data was produced.
The elevator is allowed to re-order and merge requests within the barrier, but
requests may not cross the barrier (REQ_BARRIER).
So, for instance, a jfs wants to write to a file, so it journals the write,
performs the write and erases the journal. The write cannot start until the
journal entry is committed for the fs to maintain integrity on recovery, so
currently you have to wait for the journal before beginning the fs write. In
the barrier abstraction, you simply send the journal entry and the write down
together with a barrier separating them. The transaction integrity is
maintained by the barrier ordering guarantee.
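As a rough illustration of the ordering rule, the sketch below sorts requests by sector only within the runs delimited by barrier requests, so nothing moves across a barrier. It is a toy model, not the block layer's elevator: struct io_req, by_sector and sort_between_barriers are hypothetical names, and merging is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request: a start sector plus a barrier flag. */
struct io_req {
    unsigned long sector;
    int barrier;   /* nonzero: nothing may be reordered across this request */
};

/* qsort comparator: ascending start sector. */
static int by_sector(const void *a, const void *b)
{
    const struct io_req *x = a, *y = b;

    return (x->sector > y->sector) - (x->sector < y->sector);
}

/*
 * Sort each run of requests between barriers by sector. Barrier requests
 * keep their position, so no request ever crosses one.
 */
static void sort_between_barriers(struct io_req *q, size_t n)
{
    size_t start = 0;

    assert(q != NULL || n == 0);
    for (size_t i = 0; i <= n; i++) {
        if (i == n || q[i].barrier) {
            if (i > start)
                qsort(q + start, i - start, sizeof(*q), by_sector);
            start = i + 1;
        }
    }
}
```

With the journal entry queued before the barrier and the fs write after it, sorting like this can still reorder within each side, but the journal entry always stays ahead of the write.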
The idea for SCSI was that we translate the barrier to an ordered queue tag.
There are, unfortunately, pathological error cases in SCSI where I/Os can
cross the barrier, but I'm hoping that "works right almost all the time" is
good enough.
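The translation itself can be sketched as picking the queue tag message type per command. The message codes below are the SCSI-2 values (simple 0x20, head of queue 0x21, ordered 0x22); the helper build_queue_tag_msg and its is_barrier argument are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* SCSI-2 queue tag message codes. */
enum {
    MSG_SIMPLE_QUEUE_TAG  = 0x20,
    MSG_HEAD_OF_QUEUE_TAG = 0x21,
    MSG_ORDERED_QUEUE_TAG = 0x22
};

/*
 * Hypothetical helper: fill in the two-byte queue tag message for a command.
 * A barrier request gets an ordered tag; everything else gets a simple tag
 * and may be reordered freely by the device.
 */
static void build_queue_tag_msg(unsigned char msg[2], int is_barrier,
                                unsigned char tag)
{
    assert(msg != NULL);
    msg[0] = is_barrier ? MSG_ORDERED_QUEUE_TAG : MSG_SIMPLE_QUEUE_TAG;
    msg[1] = tag;
}
```

An ordered tag asks the device to finish every previously queued command before this one and to start later ones only after it, which is the device-side analogue of the barrier. It does nothing about the error cases mentioned above.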
James