From mboxrd@z Thu Jan 1 00:00:00 1970
From: Hannes Reinecke
Subject: Re: [PATCH 1/2 v1] blkdrv: Add queue limits parameters for sg block drive
Date: Fri, 24 Aug 2012 12:43:34 +0200
Message-ID: <50375AD6.8060203@suse.de>
References: <1345537427-21601-1-git-send-email-mc@linux.vnet.ibm.com>
 <50334B51.6050900@redhat.com> <503357B2.5040901@linux.vnet.ibm.com>
 <50335F78.1030005@redhat.com> <5034BCD1.9020603@linux.vnet.ibm.com>
 <5034CBF8.3050602@redhat.com> <20120822131348.GA3512@stefanha-thinkpad.localdomain>
 <5034E918.4030305@redhat.com> <5035F873.6090305@linux.vnet.ibm.com>
 <5035FFF4.4040603@redhat.com> <1345769101.10190.124.camel@haakon2.linux-iscsi.org>
 <503733A2.1050300@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <503733A2.1050300@redhat.com>
Sender: virtualization-bounces@lists.linux-foundation.org
Errors-To: virtualization-bounces@lists.linux-foundation.org
To: Paolo Bonzini
Cc: Stefan Hajnoczi, zwanp@cn.ibm.com, linuxram@us.ibm.com,
 qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,
 Cong Meng, Christoph Hellwig
List-Id: virtualization@lists.linuxfoundation.org

On 08/24/2012 09:56 AM, Paolo Bonzini wrote:
> On 24/08/2012 02:45, Nicholas A. Bellinger wrote:
>> So up until very recently, TCM would accept an I/O request for an DATA
>> I/O type CDB with a max_sectors larger than the reported max_sectors for
>> it's TCM backend (regardless of backend type), and silently generate N
>> backend 'tasks' to complete the single initiator generated command.
>
> This is what QEMU does if you use scsi-block, except for MMC devices
> (because of the insanity of the commands used for burning).
>
>> Also FYI for Paolo, for control type CDBs I've never actually seen an
>> allocation length exceed max_sectors, so in practice AFAIK this only
>> happens for DATA I/O type CDBs.
>
> Yes, that was my impression as well.
>
>> This was historically required by the pSCSI backend driver (using a
>> number of old SCSI passthrough interfaces) in order to support this very
>> type of case described above, but over the years the logic ended up
>> creeping into various other non-passthrough backend drivers like IBLOCK
>> +FILEIO. So for v3.6-rc1 code, hch ended up removing the 'task' logic
>> thus allowing backends (and the layers below) to the I/O sectors >
>> max_sectors handling work, allowing modern pSCSI using struct request to
>> do the same. (hch assured me this works now for pSCSI)
>
> So now LIO and QEMU work the same. (Did he test tapes too?)
>
>> Anyways, I think having the guest limit virtio-scsi DATA I/O to
>> max_sectors based upon the host accessible block limits is reasonable
>> approach to consider. Reducing this value even further based upon the
>> lowest max_sectors available amongst possible migration hosts would be a
>> good idea here to avoid having to reject any I/O's exceeding a new
>> host's device block queue limits.
>
> Yeah, it's reasonable _assuming it is needed at all_. For disks, it is
> not needed. For CD-ROMs it is, but right now we have only one report
> and it is using USB so we don't know if the problem is in the drive or
> rather in the USB bridge (whose quality usually leaves much to be desired).
>
> So in the only observed case, the fix would really be a workaround; the
> right thing to do with USB devices is to use USB passthrough.
>
Hehe. So finally someone else stumbled across this one.

All is fine and dandy as long as you're able to use scsi-disk.
As soon as you're forced to use scsi-generic we're in trouble.

With scsi-generic we actually have two problems:

1) scsi-generic just acts as a pass-through and passes the commands
   as-is, including the scatter-gather information as formatted by
   the guest. So the guest could easily format an SG_IO command which
   is not compatible with the host.
2) The host is not able to differentiate between a malformed SG_IO
   command and a real I/O error; in both cases it'll return -EIO.

So we can fix this by either

a) ignoring it (as we do nowadays :-)
b) fixing up scsi-generic to inspect and modify the SG_IO information
   to ensure the host limits are respected
c) fixing up the host to differentiate between a malformed SG_IO
   command and a real I/O error.

c) would only be feasible for Linux et al. Personally, I would prefer
that approach, as I fail to see why we cannot return a proper error code
here. But I can already hear the outraged cry 'POSIX! POSIX!', so I guess
it's not going to happen anytime soon.

So I would vote for b).
Yes, it's painful. But in the long run we'll have to do SG_IO
inspection anyway, otherwise we'll always be susceptible to malicious
SG_IO attacks.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
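
To make option b) a bit more concrete, here is a rough sketch of the
kind of check scsi-generic would need before forwarding a guest SG_IO
request. It is not QEMU code. The struct names, field names and the
function check_sg_req() are hypothetical; the two request fields only
mirror dxfer_len and iovec_count from the Linux sg_io_hdr. Returning
-EINVAL for a limit violation is likewise an assumption, chosen here
so that a malformed request stays distinguishable from a real -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical subset of the host block queue limits. */
struct host_limits {
    unsigned int max_sectors;   /* largest transfer, in 512-byte sectors */
    unsigned int max_segments;  /* longest scatter-gather list accepted */
};

/* Hypothetical view of the guest SG_IO fields that matter for the check. */
struct guest_sg_req {
    unsigned int dxfer_len;     /* total data transfer length in bytes */
    unsigned int iovec_count;   /* number of scatter-gather elements */
};

/*
 * Return 0 if the request fits within the host limits, or -EINVAL if it
 * would exceed them. A caller could then split or reject the request
 * instead of passing it through unchanged.
 */
static int check_sg_req(const struct guest_sg_req *req,
                        const struct host_limits *lim)
{
    unsigned long long max_bytes;

    assert(req != NULL && lim != NULL);

    /* Widen before multiplying so a large max_sectors cannot overflow. */
    max_bytes = (unsigned long long)lim->max_sectors * 512ULL;

    if (req->dxfer_len > max_bytes) {
        return -EINVAL;         /* transfer larger than the host allows */
    }
    if (req->iovec_count > lim->max_segments) {
        return -EINVAL;         /* too many scatter-gather elements */
    }
    return 0;
}
```

On a Linux host the limits would presumably be taken from the device's
queue attributes in sysfs (max_sectors_kb and max_segments); how they
reach the emulation layer is left open here. Rewriting an oversized
request into several smaller ones is the harder half of b) and is not
shown.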