From mboxrd@z Thu Jan 1 00:00:00 1970
From: Luben Tuikov
Subject: Re: [patch 2.5] ips queue depths
Date: Tue, 15 Oct 2002 20:43:38 -0400
Sender: linux-scsi-owner@vger.kernel.org
Message-ID: <3DACB63A.B748A97@splentec.com>
References: <20021015194705.GD4391@redhat.com> <3DAC7A05.31B17A39@splentec.com> <20021015142733.A2611@eng2.beaverton.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Return-path:
Received: from splentec.com (canoe.splentec.com [209.47.35.250])
	by pepsi.splentec.com (8.11.6/8.11.0) with ESMTP id g9G0hcf29068
	for ; Tue, 15 Oct 2002 20:43:38 -0400
List-Id: linux-scsi@vger.kernel.org
To: linux-scsi

Patrick Mansfield wrote:
>
> I'm saying don't set the queue depth really high when it gives no or
> very little performance gain. If an adapter driver finds that a large
> queue depth helps more than it hurts for all IO loads (for sequential
> as well as random IO), go ahead, but I would guess that queue depths over
> 100 give zero or very little performance gain compared to a queue depth
> of say 50 for most devices. I was trying to run some tests on this
> in the past but never had time to get it working well, plus it would have
> been for only two different devices (disk and disk array), and the
> drives I have are not really fast (20 MB/sec for disk, about 50 MB/sec for
> the disk array).

Ok, this may work here and now, but it doesn't have to work
elsewhere or later.

Fixing on a number, say 100, is speculation at best.
What if the initiator is connected to fiber which is connected
to another, etc.? And what if /dev/sda is an iSCSI initiator, connected
to a bunch of targets, which are arrays on another fiber...

You see, 100 means nothing anymore. That is, 200 tagged commands
sent will NOT all go to the same ``device''... (your imagination here)

The SCSI LLDD, being the gate to the interconnect/transport, knows best,
and has at its disposal features/abilities not easily exportable to
ULP/userland.
Thus, it has the ability to at least hint at some number, being the
device queue depth.

> What is really needed are IO performance numbers for varying queue depths.

Yep, this is what you give your boss... (Essay topic for next Thursday :-))

But tomorrow someone decides to change just one little iota in the code
and those same numbers are out the window (just as has recently happened).

That is, this wouldn't work here. Those numbers would of course depend
on each subsystem getting it ``right'', and the dependent variables
become too many.

Thus, in my experience (and it is my opinion) it is best to approach
matters like this from an academic/research point of view -- that is,
we are speaking of a _general_ architecture, and not of a few empirical
tests hinting at 10 five-line patches.

> With 2.5, the number of commands outstanding to the device is not
> subtracted from the blk request queue size (we don't release a blk request
> until the IO is completed, there is no call to blkdev_release_request in
> scsi_request_fn) - this means large queue depths will cause the blk request
> queue to fill up and even be full without any available blk request queue
> commands to merge or sort with.

Yes, ok, so we are involving the block layer, which can/should/may
change tomorrow BUT the SCSI core should/may not have to -- this would
mean that it's doing a great job. (cont'd below)

> There are also issues like Andrew had with the read latency - although
> his benchmark is artificial, and has more to do with too many dirty
> pages, it still showed that higher queue depths can have an impact
> on interactive performance (i.e. read latencies).

Right! Meaning that the issue is/was elsewhere all along.

So if we involve the block layer too much, and tomorrow someone
finds out something was broken there, we SHOULD NOT HAVE TO change
the SCSI core. This would mean a fairly independent implementation
(being a subsystem), which implies general structure, which implies
research.
While it is good to look at who's below us and above us (SCSI core),
depending too much on their particulars is not generally a good
investment.

((All this of course implies that SCSI Core would be quite minimal.))

-- 
Luben
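
For readers following the thread, here is a rough sketch of the accounting Patrick describes in the quoted text above: if a blk request is not released until its IO completes, every command outstanding to the device still occupies a slot in the request pool, so a deep device queue leaves fewer (or no) requests behind to merge or sort. This is a toy model, not kernel code; the function name `requests_left_for_merging` and the pool size of 128 are hypothetical and chosen only for illustration.

```python
# Toy model only: the pool size and queue depths below are hypothetical
# numbers, not values taken from the 2.5 block layer or any driver.

def requests_left_for_merging(pool_size, queue_depth, submitted):
    """Return how many blk requests are still sitting in the request queue
    (and so can be merged or sorted) after the device takes its share.

    pool_size   -- total blk requests the queue may hold (hypothetical)
    queue_depth -- max commands the LLDD lets the device hold at once
    submitted   -- requests the upper layers have tried to submit
    """
    allocated = min(submitted, pool_size)    # anything beyond the pool must wait
    in_flight = min(allocated, queue_depth)  # slots held until the IO completes
    return allocated - in_flight             # only these can merge or sort


if __name__ == "__main__":
    for depth in (16, 50, 100, 128, 253):
        left = requests_left_for_merging(pool_size=128, queue_depth=depth,
                                         submitted=1000)
        print(f"queue depth {depth:3d}: {left:3d} requests left to merge/sort")
```

Under these assumed numbers, once the queue depth reaches the pool size nothing is left in the queue to merge or sort with. That is the effect Patrick points at, and it says nothing about what the right depth is for any given transport.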