From: James Bottomley
Subject: RE: PATCH [5/15] qla2xxx: SG tablesize update
Date: 15 Mar 2004 22:37:51 -0500
Message-ID: <1079408273.1804.399.camel@mulgrave>
List-Id: linux-scsi@vger.kernel.org
To: Andrew Vasquez
Cc: Jeff Garzik, Anton Blanchard, Jens Axboe, SCSI Mailing List

On Mon, 2004-03-15 at 18:43, Andrew Vasquez wrote:
> I'm curious then, how does this value (nr_hw_segments) differ from
> scsi_cmnd->use_sg?

OK, here's a potted version of what goes on: the block layer, when
merging, keeps two counters of request segments: nr_phys_segments and
nr_hw_segments.  The counts are slightly inaccurate for efficiency,
but are guaranteed not to go over max_phys_segments and
max_hw_segments respectively (the latter is what sg_tablesize
becomes).

max_phys_segments counts the size of the *input* sgtable (the one the
mid layer allocates).  The mid layer (in scsi_alloc_sgtable) allocates
enough space for a table of nr_phys_segments elements.  Then we map
the request from the block layer (blk_rq_map_sg), which fills the
table in with the physical pages.  blk_rq_map_sg() does an exact count
of physical segments and may find additional cases where pages are
actually adjacent and so can be merged, so the value it returns (which
is what is put in cmd->use_sg) is <= nr_phys_segments.  However, you
don't care about this part; the mid layer does it all for you (well,
except that max_phys_segments is fixed in the mid layer at 128, which
means that in practice your sgtable can never be more than 128
elements).

When you map the sgtable for use by the HBA, using dma_map_sg(), the
bus physical addresses are filled in.
If the platform has no IOMMU, these are usually simply the memory
physical addresses and nr_phys_segments == nr_hw_segments.  However,
if there is an IOMMU in the system, it may be able to take
non-adjacent pages in physical memory and remap them to be adjacent
in bus physical address space.  This is called virtual merging, and
when it happens, the size of the sgtable you get out of dma_map_sg()
shrinks.  The size it shrinks down to is always <= nr_hw_segments
(because the way the IOMMU does the mapping has been parametrised for
the block layer).  Thus, the number of elements the driver will have
to allocate is always <= nr_hw_segments.

> But from later emails...it's beginning to sound like a 'better' fix
> would be to use the midlayer's own queueing mechanisms and strip out
> the qla2xxx driver's legacy pending-queue infrastructure in favor of
> returning:

Yes.

> On Sunday, March 14, 2004 2:27 PM, James Bottomley wrote:
> > For dynamic resource situations we have the queuecommand return
> > codes
> >
> > SCSI_MLQUEUE_HOST_BUSY which means the entire host is temporarily
> > out of resources and causes the mid layer to hold off all commands
> > for that host until we get one back from any device on the host
> > and
>
> from queuecommand().  The 8.x series driver inherited a lot of the
> queuing baggage created during driver development of [567].x to
> address some deficiencies of earlier midlayer implementations (all
> of which have been addressed in recent kernels).  I'll start to take
> a look at tearing out the pending_q.

Thanks,

James