From mboxrd@z Thu Jan 1 00:00:00 1970
From: James Bottomley
Subject: Re: max_hw_segments vs. max_phys_segments and scsi_alloc_queue()
Date: 26 Feb 2004 09:02:57 -0600
Sender: linux-scsi-owner@vger.kernel.org
Message-ID: <1077807779.1756.9.camel@mulgrave>
References: <20040226071558.GA559837@sgi.com>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Return-path:
Received: from stat1.steeleye.com ([65.114.3.130]:49899 "EHLO
	hancock.sc.steeleye.com") by vger.kernel.org with ESMTP
	id S261854AbUBZPDU (ORCPT ); Thu, 26 Feb 2004 10:03:20 -0500
In-Reply-To: <20040226071558.GA559837@sgi.com>
List-Id: linux-scsi@vger.kernel.org
To: Jeremy Higdon
Cc: SCSI Mailing List , jbarnes@cthulhu.engr.sgi.com

On Thu, 2004-02-26 at 01:15, Jeremy Higdon wrote:
> That is, max_hw_segments is the number of discrete segments that
> a PCI device would see, while max_phys_segments would be the
> max s/g list size that a PCI device could handle.

No, max_hw_segments is the maximum size of the PCI device's sg table.
max_phys_segments is the size of the internal sg list before
dma_map_sg() gets its paws on it.

The only parameter the drivers care about is max_hw_segments. It's the
mid-layer that cares about max_phys_segments, because the mid-layer
provides the memory for the sg list that comes out of blk_rq_map_sg().

> It further seems as though when there is no IOMMU that the number
> of hw_segments will equal the number of phys_segments, at least if I
> understand the code in blk_recount_segments() and the comments
> around the definition of BIO_VMERGE_BOUNDARY in
> include/asm-ia64/io.h.

That's correct. No virtual merging => max_phys_segments ==
max_hw_segments.

> In particular, on ia64 machines, where BIO_VMERGE_BOUNDARY is
> currently 0, and thus, the number of hw_segments equals the number
> of phys_segments, we should be using the host's sg_tablesize to
> set the max number of phys segments (as well as the max number
> of hw segments).
> I see no reason why this wouldn't carry forward
> to the other architectures, though there may be limits to the
> total amount of data that could be mapped. This would have to
> be fed to the block layer from the arch layer, though, I think.
>
> Does this make sense, or have I completely missed something?

No, you can't do this (or at least, not simply like your patch).

Like I said, the mid-layer has to allocate the sg table coming out of
blk_rq_map_sg(). It does this in scsi_alloc_sgtable() using mempools,
and the maximum sg table size it's expecting is MAX_PHYS_SEGMENTS. If
you increase max_phys_segments beyond what the mid-layer can cope with,
you'll end up with a request we can never map.

We'd have to rejig the entire mempool setup to increase this (even
though it looks like it's nicely coded to be variable for
MAX_PHYS_SEGMENTS, in fact, the mempools are coded assuming that the
value is 128).

James
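
To make the constraint concrete, here is a minimal userspace sketch, not the actual mid-layer code. It assumes the sg mempools come in fixed sizes topping out at MAX_PHYS_SEGMENTS (128, per the above). The pool sizes 8/16/32/64 and the names sg_pool_index(), sketch_queue_limits() and struct queue_limits_sketch are made up for illustration. It shows why max_hw_segments can follow the host's sg_tablesize while max_phys_segments has to stay at what the pools can supply:

```c
#include <assert.h>
#include <stddef.h>

/* Value taken from the message above; everything else here is illustrative. */
#define MAX_PHYS_SEGMENTS 128

/* Assumed pool sizes: fixed buckets, the largest being MAX_PHYS_SEGMENTS. */
static const unsigned int sg_pool_sizes[] = { 8, 16, 32, 64, MAX_PHYS_SEGMENTS };
#define SG_POOL_COUNT (sizeof(sg_pool_sizes) / sizeof(sg_pool_sizes[0]))

/*
 * Return the index of the smallest pool that can hold nsegs entries,
 * or -1 if no pool is big enough (a request that can never be mapped).
 */
int sg_pool_index(unsigned int nsegs)
{
	size_t i;

	if (nsegs == 0)
		return -1;
	for (i = 0; i < SG_POOL_COUNT; i++)
		if (nsegs <= sg_pool_sizes[i])
			return (int)i;
	return -1;
}

/* Hypothetical stand-in for the two queue limits discussed in this thread. */
struct queue_limits_sketch {
	unsigned int max_hw_segments;   /* size of the device's sg table */
	unsigned int max_phys_segments; /* size of the mid-layer's sg list */
};

/* Set the limits the way the message describes for a given host. */
struct queue_limits_sketch sketch_queue_limits(unsigned int sg_tablesize)
{
	struct queue_limits_sketch lim;

	/* The driver's limit comes straight from the host. */
	lim.max_hw_segments = sg_tablesize;
	/* The mid-layer's limit is pinned to what the pools can allocate. */
	lim.max_phys_segments = MAX_PHYS_SEGMENTS;

	/* The largest permitted request must still fit in some pool. */
	assert(sg_pool_index(lim.max_phys_segments) >= 0);
	return lim;
}
```

In this sketch a host advertising an sg_tablesize of 255 still gets max_phys_segments of 128. Setting max_phys_segments to 255 instead would let through requests for which sg_pool_index() returns -1, which is the "request we can never map" case.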