From: Hannes Reinecke <hare@suse.de>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: kvm-devel <kvm@vger.kernel.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [QEMU-KVM]: Megasas + TCM_Loop + SG_IO into Windows XP guests
Date: Tue, 18 May 2010 11:43:04 +0200
Message-ID: <4BF26128.2030400@suse.de>
In-Reply-To: <1274130584.7348.83.camel@haakon2.linux-iscsi.org>

Nicholas A. Bellinger wrote:
> On Fri, 2010-05-14 at 02:42 -0700, Nicholas A. Bellinger wrote:
>> On Fri, 2010-05-14 at 09:22 +0200, Hannes Reinecke wrote:
>>> Nicholas A. Bellinger wrote:
>>>> Greetings Hannes and co,
>>>>
>> <SNIP>
>>> Let's see if I can find some time working on the megasas emulation.
>>> Maybe I find something.
>>> Last time I checked it was with a Windows7 build, but I didn't do
>>> any real tests there. Basically just checking if the system boots up :-)
>>>
>> Nothing fancy just yet.  This is involving a normal NTFS filesystem
>> format on a small TCM/FILEIO LUN using scsi-generic and a userspace
>> FILEIO with scsi-disk.
>>
>> This involves the XP guest waiting until the very last READ_10 once the
>> format has completed (eg: all WRITE and VERIFY CDBs complete with GOOD
>> status AFAICT) before announcing that mkfs.ntfs failed without any
>> helpful exception message (due to missing metadata of some sort I would
>> assume..?)
>>
>> So perhaps dumping QEMU and TCM_Loop SCSI payloads to determine if any
>> correct blocks from megasas_handle_io() are actually making it out to
>> KVM host is going to be my next option.  ;)
>>
> 
> Greetings Hannes,
> 
> So I spent some more time with XP guests this weekend, and I noticed two
> things immediately when using hw/lsi53c895a.c instead of hw/megasas.c
> with the same two TCM_Loop SAS LUNs via SG_IO from last week:
> 
> 1) With lsi53c895a, XP guests are able to boot successfully without the
> synchronous SG_IO hack that is currently required to get past the first
> 36-byte INQUIRY for megasas + XP SP2
> 
> 2) With lsi53c895a, XP is able to successfully create and mount an NTFS
> filesystem, reboot, and read blocks appear to be functioning properly.
> FYI I have not run any 'write known pattern then read-back and compare
> blocks' data integrity tests from within the XP guests just yet, but I
> am confident that TCM scatterlist -> se_mem_t mapping is working as
> expected on the KVM Host.
> 
> Furthermore, after formatting a 5 GB TCM/FILEIO LUN with lsi53c895a, and
> then rebooting with megasas with the same two configured TCM_Loop SG_IO
> devices, it appears to be able to mount and read blocks successfully.
> Attempting to write new blocks on the mounted filesystem also appears to
> work to some degree, but throughput slows down to a crawl during XP
> guest buffer cache flush, which is likely attributed to the use of my
> quick SYNC SG_IO hack.
> 
> So it appears that there are two separate issues here, and AFAICT they
> both look to be XP and megasas specific.  For #2, it may be something
> about the format of the incoming scatterlists generated during XP's
> mkfs.ntfs that is causing some issues.  While watching output during fs
> creation, I noticed the following WRITE_10s with a starting 4088 byte
> scatterlist and a trailing 8 byte scatterlist:
> 
> megasas: writel mmio 40: 2b0b003
> megasas: Found mapped frame 2 context 82b0b000 pa 2b0b000
> megasas: Enqueue frame context 82b0b000 tail 493 busy 1
> megasas: LD SCSI dev 2 lun 0 sdev 0xdc0230 xfer 16384
> scsi-generic: Using cur_addr: 0x000000000ff6c008 cur_len: 0x0000000000000ff8
> scsi-generic: Adding iovec for mem: 0x7f1783b96008 len: 0x0000000000000ff8
> scsi-generic: Using cur_addr: 0x000000000fd6e000 cur_len: 0x0000000000001000
> scsi-generic: Adding iovec for mem: 0x7f1783998000 len: 0x0000000000001000
> scsi-generic: Using cur_addr: 0x000000000fe2f000 cur_len: 0x0000000000001000
> scsi-generic: Adding iovec for mem: 0x7f1783a59000 len: 0x0000000000001000
> scsi-generic: Using cur_addr: 0x000000000fdf0000 cur_len: 0x0000000000001000
> scsi-generic: Adding iovec for mem: 0x7f1783a1a000 len: 0x0000000000001000
> scsi-generic: Using cur_addr: 0x000000000fded000 cur_len: 0x0000000000000008
> scsi-generic: Adding iovec for mem: 0x7f1783a17000 len: 0x0000000000000008
> scsi-generic: execute IOV: iovec_count: 5, dxferp: 0xd92420, dxfer_len: 16384
> scsi-generic: -----------------------> Issuing SG_IO CDB len 10: 0x2a 00 00 00 fa be 00 00 20 00 
> scsi-generic: scsi_write_complete() ret = 0
> scsi-generic: Command complete 0x0xd922c0 tag=0x82b0b000 status=0
> megasas: LD SCSI req 0xd922c0 cmd 0xda92c0 lun 0xdc0230 finished with status 0 len 16384
> megasas: Complete frame context 82b0b000 tail 493 busy 0 doorbell 0
> 
> Also, the final READ_10 that produces the 'could not create filesystem'
> exception is for LBA 63 and XP looking for the first FS blocks after
> GPT.
> 
> Could there be some breakage in megasas with a length < PAGE_SIZE for
> the scatterlist..?    As lsi53c895a seems to work OK for this case, is
> there something about the logic of parsing the incoming struct
> scatterlists that is different between the two HBA drivers..?  AFAICT
> both are using Gerd's common code in hw/scsi-bus.c, unless there is
> something about megasas_map_sgl() that is causing issues with the
> above..?
> 
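
As a cross-check of the numbers in the quoted log, the five iovec lengths and
the WRITE_10 CDB can be compared with a short script. This is only a sketch:
the constants are copied from the debug output above, while the helper names
(decode_rw10, page_offsets), the 4 KiB page size and the 512-byte block size
are assumptions made for illustration and are not taken from QEMU or TCM.

```python
# Values copied from the quoted scsi-generic debug output.
GUEST_ADDRS = [0x0FF6C008, 0x0FD6E000, 0x0FE2F000, 0x0FDF0000, 0x0FDED000]
IOVEC_LENS = [0x0FF8, 0x1000, 0x1000, 0x1000, 0x0008]
CDB = bytes([0x2A, 0x00, 0x00, 0x00, 0xFA, 0xBE, 0x00, 0x00, 0x20, 0x00])

PAGE_SIZE = 4096   # assumption: 4 KiB guest pages
BLOCK_SIZE = 512   # assumption: 512-byte logical blocks


def decode_rw10(cdb):
    """Return (opcode, lba, blocks) for a 10-byte READ/WRITE CDB."""
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcode = cdb[0]
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return opcode, lba, blocks


def page_offsets(addrs, page_size=PAGE_SIZE):
    """Offset of each segment's start address within its page."""
    return [addr % page_size for addr in addrs]


opcode, lba, blocks = decode_rw10(CDB)
total = sum(IOVEC_LENS)
offsets = page_offsets(GUEST_ADDRS)

print(f"opcode=0x{opcode:02x} lba={lba} blocks={blocks}")
print(f"iovec total={total} bytes, CDB expects {blocks * BLOCK_SIZE} bytes")
print(f"segment page offsets={offsets}")
# True when the first segment runs exactly up to the end of its page.
print("first segment ends on a page boundary:",
      offsets[0] + IOVEC_LENS[0] == PAGE_SIZE)
```

Under those assumptions the five segments add up to the 16384 bytes that the
CDB requests, and the 4088 + 8 byte split is what one would expect from a
16 KiB buffer that starts 8 bytes into a page. That suggests the segment list
itself is self-consistent, but it does not show whether megasas_map_sgl()
handles the sub-page segments correctly.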

The usual disclaimer here: I'm less than happy with the current SCSI disk handling.
Currently we have two options:
- Using 'scsi-disk', which will _emulate_ a SCSI disk internally, but allows
  asynchronous I/O using normal read/write syscalls
- Using 'scsi-generic', which lets you pass through any SCSI device, but
  disallows asynchronous I/O and requires you to use the SG_IO interface.
The latter also implies that the host will mark _all_ I/O commands as 'block_pc',
so the code path within the kernel is quite different from the one taken by I/Os
coming in via the 'scsi-disk' emulation.
Guess it's time to have a 'scsi-passthrough' device ...

Other than that: I think we have to investigate.
If you could send me a quick setup guide on how to configure TCM_Loop for an
existing device I'd give it a go ...

Thanks,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
