From mboxrd@z Thu Jan  1 00:00:00 1970
From: James Bottomley
Subject: Re: [PATCH 2.5.17] Making SCSI not copy the request structure
Date: Wed, 22 May 2002 20:06:24 -0400
Sender: linux-scsi-owner@vger.kernel.org
Message-ID: <200205230006.g4N06PL03133@localhost.localdomain>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: (from root@localhost) by pogo.mtv1.steeleye.com (8.9.3/8.9.3) id RAA21028 for ; Wed, 22 May 2002 17:06:32 -0700
In-Reply-To: Message from Patrick Mansfield of "Wed, 22 May 2002 15:44:06 PDT." <20020522154406.A17222@eng2.beaverton.ibm.com>
List-Id: linux-scsi@vger.kernel.org
To: Patrick Mansfield
Cc: James Bottomley , linux-scsi@vger.kernel.org

patmans@us.ibm.com said:
> I applied your patch and successfully ran with AIC + 2 disks (one the
> boot disk), plus with qla (modified v6.0b20 to remove the io request
> lock) drivers attached to both Triton (disk array) and Seagate drives,
> using block and raw io.

That's great, thanks for testing it.

> Do you think the queue depth on some of the adapters/devices should be
> shrunk or the request queue increased with your patch? Some adapters
> set device queue depths above 200 (for example, aic set mine to 253),
> which seems like overkill, but today it means they can have 200 more
> IOs on the request queue; since your patch frees the request only
> after the IO completes, the request queue would sometimes have 200
> fewer entries.

That's a tough one. There are differing schools of thought on queue depth. I incline to the one that says that for modern SCSI devices, 4-8 is probably a good figure, but there are definitely people who disagree. The IDE code uses 32 as the queue depth.

One of the things I hope to get from standardising the TCQ interface is the ability to adjust the queue depth from user land. To move to a standard implementation in the generic layer, I think the queue depths practically have to be lower (at least lower than 253).
The current TCQ generic code uses an arbitrary-length bitmap to track outstanding tags, which means it would scale OK for high queue depths; but, as you say, we are limited by the number of available requests.

> I don't understand how/why the journaling file systems want to use a
> barrier, and how it helps their IO.

> Are the request barriers needed to prevent earlier IO from completing
> before the barrier, or later IO from completing before the barrier, or
> both?

There were several discussion threads on the topic, but this is the only one I can find:

http://marc.theaimsgroup.com/?t=101360488200004&r=1&w=2

Essentially, a journalled fs can function more efficiently if it can rely on transaction ordering (within ordering "barriers") making it all the way to the medium. There was also a thought that this might speed up jfs operations, but no conclusive data was produced. The elevator is allowed to re-order and merge requests within the barrier, but requests may not cross the barrier (REQ_BARRIER) in either direction, so the answer to your question is "both".

So, for instance, a jfs wants to write to a file: it journals the write, performs the write, and erases the journal. For the fs to maintain integrity on recovery, the write cannot start until the journal entry is committed, so currently you have to wait for the journal before beginning the fs write. With the barrier abstraction, you simply send the journal entry and the write down together, with a barrier separating them. The transaction integrity is maintained by the barrier ordering guarantee.

The idea for SCSI was that we translate the barrier to an ordered queue tag. There are, unfortunately, pathological error cases in SCSI where I/Os can cross the barrier, but I'm hoping that "works right almost all the time" is good enough.

James