Re: [PATCH] scsi-generic: replace logical block count of response of READ CAPACITY

From: "Philippe Mathieu-Daudé" <philmd@redhat.com>
To: Claudio Fontana <cfontana@suse.de>, pbonzini@redhat.com, fam@euphon.net
Cc: Hannes Reinecke <hare@suse.com>, John Snow <jsnow@redhat.com>,
	qemu-devel@nongnu.org, Lin Ma <lma@suse.com>
Subject: Re: [PATCH] scsi-generic: replace logical block count of response of READ CAPACITY
Date: Tue, 21 Dec 2021 12:20:36 +0100
Message-ID: <f8d9d1d4-a53f-b891-0f5e-5b0d9a0ed82e@redhat.com>
In-Reply-To: <8ac2bf20-1dc4-7153-4287-ca01bd84d213@suse.de>

Hi Claudio,

On 12/21/21 10:48, Claudio Fontana wrote:
> Hi Paolo, Hannes,
> 
> any thoughts on the following issue?
> 
> Introduction:
> 
> When using SAN storage for providing block devices to guests, configured as SCSI-passthrough devices, increasing the space available in the VM is a use case.
> 
> To do it, it is currently necessary to:
> 
> 1) expand storage on the actual SAN,
> 2) run a "virsh blockresize" or equivalent command to make QEMU aware of the new size, and finally
> 3) do a "rescan-scsi-bus.sh" or equivalent operation in the guest to make the running guest aware of the increased disk size.
> 
> The problem:
> 
> As of now, the administrator needs to make sure that step 3 won't be done before step 2 has been executed, or the resulting state will be inconsistent.
> In practice this creates organizational issues to try to sync up host/storage admins and guest OS admin, and is therefore error prone (due to these human factors).
> 
> The proposal:
> 
> The patch I replied to here from Ma Lin tries to avoid the inconsistent state, by having "rescan-scsi-bus.sh" still report the old size in the guest until QEMU itself is aware of the new disk size.
> 
> The patch:
> 
> Before the patch, the SCSI READ_CAPACITY command in the guest os directly receives the unmodified response from the storage backend.
> 
> After the patch, QEMU intercepts the READ_CAPACITY response and replaces the maximum LBA with the information which is saved in QEMU.
> 
> This means: after resizing the storage on the SAN backend, the host administrator must explicitly notify QEMU that the capacity has changed by issuing a block resize command through QMP or libvirt,
> even for SCSI passthrough disks.
> 
> Any ideas on this patch or on possible alternatives?

I am not a SCSI expert, but Lin's description and yours both sound
coherent to me.

One minor nitpick below.

> On 11/20/21 11:15 AM, Lin Ma wrote:
>> While using SCSI passthrough, the following scenario leaves qemu
>> unaware of the capacity change of the remote scsi target:
>> 1. online resize the scsi target.
>> 2. issue 'rescan-scsi-bus.sh -s ...' in host.
>> 3. issue 'rescan-scsi-bus.sh -s ...' in vm.
>>
>> In the above scenario I used to experience errors while accessing the
>> additional disk space in the vm. I think the reasonable sequence of
>> operations should be:
>> 1. online resize the scsi target.
>> 2. issue 'rescan-scsi-bus.sh -s ...' in host.
>> 3. issue 'block_resize' via qmp to notify qemu.
>> 4. issue 'rescan-scsi-bus.sh -s ...' in vm.
>>
>> The errors disappear once I notify qemu by block_resize via qmp.
>>
>> So this patch replaces the number of logical blocks in the READ CAPACITY
>> response from the scsi target with qemu's bs->total_sectors. If the user in
>> the vm wants to access the additional disk space, the administrator of the
>> host must notify qemu after resizing the scsi target.
>>
>> A bonus is that libvirt's domblkinfo reports capacity information that is
>> consistent between host and vm when block_resize has not been issued in qemu.
>> E.g.:
>> ...
>>     <disk type='block' device='lun'>
>>       <driver name='qemu' type='raw'/>
>>       <source dev='/dev/sdc' index='1'/>
>>       <backingStore/>
>>       <target dev='sda' bus='scsi'/>
>>       <alias name='scsi0-0-0-0'/>
>>       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>>     </disk>
>> ...
>>
>> Before:
>> 1. online resize the scsi target.
>> 2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
>> 3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
>> 4. host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
>> Capacity:       4.000 GiB
>> Allocation:     0.000 B
>> Physical:       8.000 GiB
>>
>> 5. guest:~ # lsblk /dev/sda
>> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>> sda      8:0    0   8G  0 disk
>> └─sda1   8:1    0   2G  0 part
>>
>> After:
>> 1. online resize the scsi target.
>> 2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
>> 3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
>> 4. host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
>> Capacity:       4.000 GiB
>> Allocation:     0.000 B
>> Physical:       8.000 GiB
>>
>> 5. guest:~ # lsblk /dev/sda
>> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>> sda      8:0    0   4G  0 disk
>> └─sda1   8:1    0   2G  0 part
>>
>> Signed-off-by: Lin Ma <lma@suse.com>
>> ---
>>  hw/scsi/scsi-generic.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
>> index 0306ccc7b1..343b51c2c0 100644
>> --- a/hw/scsi/scsi-generic.c
>> +++ b/hw/scsi/scsi-generic.c
>> @@ -315,11 +315,17 @@ static void scsi_read_complete(void * opaque, int ret)
>>      if (r->req.cmd.buf[0] == READ_CAPACITY_10 &&
>>          (ldl_be_p(&r->buf[0]) != 0xffffffffU || s->max_lba == 0)) {
>>          s->blocksize = ldl_be_p(&r->buf[4]);
>> -        s->max_lba = ldl_be_p(&r->buf[0]) & 0xffffffffULL;
>> +        BlockBackend *blk = s->conf.blk;
>> +        BlockDriverState *bs = blk_bs(blk);
>> +        s->max_lba = bs->total_sectors - 1;

I'd add a refresh_max_lba() helper:

 static void refresh_max_lba(SCSIDevice *s)
 {
     BlockBackend *blk = s->conf.blk;
     BlockDriverState *bs = blk_bs(blk);
     uint64_t max_lba = bs->total_sectors - 1;

     if (max_lba != s->max_lba) {
         trace_scsi_generic_max_lba_refreshed(s->max_lba, max_lba);
         s->max_lba = max_lba;
     }
 }
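
To show the change-detection idea outside of the QEMU tree, here is a
minimal standalone sketch. MockDev and mock_refresh_max_lba() are
hypothetical stand-ins, not QEMU types or functions; the real helper
would read bs->total_sectors through blk_bs(), and the trace event
named above would have to be added as well. The boolean return marks
the point where that trace event would fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two pieces of state the helper touches. */
typedef struct MockDev {
    int64_t total_sectors;  /* stands in for bs->total_sectors */
    uint64_t max_lba;       /* stands in for s->max_lba */
} MockDev;

/*
 * Recompute max_lba from the sector count known to the block layer.
 * Returns true only when the cached value actually changed, which is
 * where a trace event could be emitted.
 */
static bool mock_refresh_max_lba(MockDev *d)
{
    assert(d != NULL && d->total_sectors > 0);

    uint64_t max_lba = (uint64_t)d->total_sectors - 1;

    if (max_lba == d->max_lba) {
        return false;
    }
    d->max_lba = max_lba;
    return true;
}
```

Calling it twice without a resize in between reports a change only the
first time, so the trace output would stay quiet across repeated
READ CAPACITY commands.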

Otherwise, to the best of my knowledge:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>> +        stl_be_p(&r->buf[0], s->max_lba);
>>      } else if (r->req.cmd.buf[0] == SERVICE_ACTION_IN_16 &&
>>                 (r->req.cmd.buf[1] & 31) == SAI_READ_CAPACITY_16) {
>>          s->blocksize = ldl_be_p(&r->buf[8]);
>> -        s->max_lba = ldq_be_p(&r->buf[0]);
>> +        BlockBackend *blk = s->conf.blk;
>> +        BlockDriverState *bs = blk_bs(blk);
>> +        s->max_lba = bs->total_sectors - 1;
>> +        stq_be_p(&r->buf[0], s->max_lba);
>>      }
>>      blk_set_guest_block_size(s->conf.blk, s->blocksize);
>>  
>>
> 
>
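
For reference, the buffer rewrite done by the stl_be_p()/stq_be_p()
calls in the patch can be sketched standalone as follows. The function
names here are hypothetical and only mimic the big-endian stores; they
are not QEMU APIs. The clamp to 0xffffffff in the READ CAPACITY (10)
case is my assumption, based on how that command reports capacities
that do not fit in 32 bits; the patch as posted does not do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Store a 64-bit value in big-endian byte order. */
static void put_be64(uint8_t *p, uint64_t v)
{
    put_be32(p, (uint32_t)(v >> 32));
    put_be32(p + 4, (uint32_t)v);
}

/*
 * READ CAPACITY (10) response: bytes 0-3 hold the last LBA, bytes 4-7
 * the block length. Only the LBA field is overwritten.
 * Assumption: values above 32 bits are clamped to 0xffffffff.
 */
static void patch_read_capacity_10(uint8_t *buf, uint64_t max_lba)
{
    assert(buf != NULL);
    put_be32(buf, max_lba > UINT32_MAX ? UINT32_MAX : (uint32_t)max_lba);
}

/*
 * READ CAPACITY (16) response: bytes 0-7 hold the last LBA, bytes 8-11
 * the block length. Only the LBA field is overwritten.
 */
static void patch_read_capacity_16(uint8_t *buf, uint64_t max_lba)
{
    assert(buf != NULL);
    put_be64(buf, max_lba);
}
```

In both cases the block length field is left untouched, matching the
patch, which still takes s->blocksize from the device's response.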
