From: Hannes Reinecke
Subject: Re: [LSF/VM TOPIC] Handling of invalid requests in virtual HBAs
Date: Thu, 08 Apr 2010 15:44:03 +0200
Message-ID: <4BBDDDA3.7000006@suse.de>
References: <4BB45632.5020700@suse.de> <1270186392.28897.76.camel@haakon2.linux-iscsi.org>
In-Reply-To: <1270186392.28897.76.camel@haakon2.linux-iscsi.org>
List-Id: linux-scsi@vger.kernel.org
To: linux-iscsi-target-dev@googlegroups.com
Cc: lsf10-pc@lists.linuxfoundation.org, SCSI Mailing List

Nicholas A. Bellinger wrote:
> On Thu, 2010-04-01 at 10:15 +0200, Hannes Reinecke wrote:
>> Hi all,
>>
>
> Greetings Hannes,
>
> Just a few comments on your proposal..
>
>> [Topic]
>> Handling of invalid requests in virtual HBAs
>>
>> [Abstract]
>> This discussion will focus on the problem of correct request handling with virtual HBAs.
>> For KVM I have implemented a 'megasas' HBA emulation which serves as a backend for the
>> megaraid_sas linux driver.
>> It is now possible to connect several disks from different (physical) HBAs to that
>> HBA emulation, each having different logical capabilities wrt transfersize,
>> sgl size, sgl length etc.
>>
>> The goal of this discussion is how to determine the 'best' capability setting for the
>> virtual HBA and how to handle hotplug scenarios, where a disk might be plugged in
>> which has incompatible settings from the one the virtual HBA is using currently.
>>
>
> Most of what you are describing here in terms of having a kernel target
> enforce underlying LLD limitations for LUNs is already available in TCM
> v3.x.
> Current TCM code will automatically handle the processing of a
> single DATA_SG_IO CDB generated by KVM Guest + megasas emulation that
> exceeds the underlying LLD max_sectors, and generate the multiple
> internal se_task_t's in order to complete the original I/O generated by
> KVM Guest + megasas.
>

Hmm, yes.

> This is one example but the main underlying question wrt TCM and
> interaction with Linux subsystems has historically been:
>
> What values should be enforced by TCM based on metadata presented by TCM
> subsystem plugins (pSCSI, IBLOCK, FILEIO) for struct block_device, and
> what is expected to be enforced by underlying Linux subsystems
> presenting struct block_device..?
>
> For the virtual TCM subsystem plugin cases (IBLOCK, FILEIO, RAMDISK) the
> can_queue is a completely arbitrary value and is enforced by the
> underlying Linux subsystem. There are a couple of special cases:
>
> *) For TCM/pSCSI, can_queue is enforced from struct scsi_device->queue_depth
>    and max_sectors from the smaller of the two values from struct Scsi_Host->max_sectors
>    and struct scsi_device->request_queue->limits.max_sectors.
>
> *) For TCM/IBLOCK, max_sectors is enforced based on struct request_queue->limits.max_sectors.
>
> *) For TCM/FILEIO and TCM/RAMDISK, both can_queue and max_sectors are
>    set to arbitrarily high values.
>
> Also I should mention that TCM_Loop code currently uses a hardcoded
> struct scsi_host_template->can_queue=1 and ->max_sectors=128, but will
> work fine with larger values. Being able to change the Linux/SCSI
> queue depth on the fly for TCM_Loop virtual SAS ports being used in KVM
> guest could be quite useful for managing KVM Guest megasas emulation I/O
> traffic on a larger scale..
>

And my question / topic here is how to handle a hotplug capability in these cases:
What happens if a device / HBA is plugged in with different / lower capabilities
than those announced?
Can we change the announced settings for the HBA on the fly?

> The other big advantage of using TCM_Loop with your megasas guest
> emulation means that existing TCM logic for >= SPC-3 T10 NAA naming, PR,
> and ALUA emulation is immediately available to KVM guest, and does not
> have to be reproduced in QEMU code.
>

I'm not doubting that using TCM_loop here would be advantageous.
But I have to find a solution for folks just wanting to run on plain /dev/sdX.
And I need to find a common ground here to argue with the KVM folks,
whose main objection against the megasas emulation is this issue.

Either way would be fine by me, I just think we should come to a common
understanding.

My initial idea here was to just pass the request back as partially completed;
that would solve the issue nicely.
Sadly the SCSI midlayer interprets partial completion always as an error :-(
Would've been really neat.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
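
For illustration only, the max_sectors handling described in the quoted text
(take the smaller of the host and queue limits, then split an oversized request
into several smaller pieces) could be sketched roughly as below. This is not
TCM or kernel code: struct io_chunk, effective_max_sectors() and
split_request() are hypothetical names made up for this sketch, and the real
se_task_t handling is certainly more involved (scatterlists, completion
tracking, error paths).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for illustration only; not a TCM or kernel structure. */
struct io_chunk {
	uint64_t lba;      /* first sector of this chunk */
	uint32_t sectors;  /* length of this chunk in sectors */
};

/*
 * pSCSI-style rule from the thread: the enforced max_sectors is the
 * smaller of the host limit and the request queue limit.
 */
static uint32_t effective_max_sectors(uint32_t host_max, uint32_t queue_max)
{
	return host_max < queue_max ? host_max : queue_max;
}

/*
 * Split one request into chunks of at most max_sectors each.
 * Returns the number of chunks the request needs. At most 'cap' chunks
 * are written to 'out'; pass cap == 0 to only count them.
 */
static size_t split_request(uint64_t lba, uint32_t nr_sectors,
			    uint32_t max_sectors,
			    struct io_chunk *out, size_t cap)
{
	size_t n = 0;

	assert(max_sectors > 0);

	while (nr_sectors > 0) {
		uint32_t len = nr_sectors < max_sectors ? nr_sectors
							: max_sectors;

		if (n < cap) {
			out[n].lba = lba;
			out[n].sectors = len;
		}
		lba += len;
		nr_sectors -= len;
		n++;
	}
	return n;
}
```

With a limit of 128 sectors, a 300-sector request starting at LBA 0 would be
split into three chunks of 128, 128 and 44 sectors. In the hotplug case raised
above, a device arriving with a lower limit would only change the max_sectors
argument used for later requests; whether the guest can be told about the new
limit is exactly the open question.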