qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] block: Expose host block dev I/O size limit in scsi-block/scsi-generic
@ 2016-05-26  6:15 Fam Zheng
  2016-05-26  6:15 ` [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs Fam Zheng
  2016-05-26  6:15 ` [Qemu-devel] [PATCH 2/2] scsi-generic: Merge block max xfer len in INQUIRY response Fam Zheng
  0 siblings, 2 replies; 8+ messages in thread
From: Fam Zheng @ 2016-05-26  6:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Max Reitz, Paolo Bonzini, qemu-block

Host devices passed through as scsi-block or scsi-generic may have a compound
maximum I/O limit (out of physical hardware limit, driver quirks and file
system configuration, etc) that is presented in the sysfs entry. SG_IOs we
issue should respect this. However the value is currently not discoverable by
guest, because the vHBA (virtio-scsi) would present an fixed emulated limit,
while the INQUIRY (vpd=0xb0, block limits page) response solely speaks for the
LUN itself, not the host kernel. The issue is observed on scsi passthrough of
host usb or dm-multipath disks, and it is not specific to certain device types.

The proposed solution is collecting the host sysfs limit in raw-posix driver
when a block device is used, and intercepting INQUIRY response to clip the
max xfer len field.

This fixes booting a SanDisk usb-key with an upstream kernel.  The usb disk
reports a nonsense large value in INQUIRY, while the host (usb quirk?) only
allows 120KB.


Fam Zheng (2):
  raw-posix: Fetch max sectors for host block device from sysfs
  scsi-generic: Merge block max xfer len in INQUIRY response

 block/raw-posix.c      | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/scsi/scsi-generic.c | 12 ++++++++++++
 2 files changed, 59 insertions(+)

-- 
2.8.2

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs
  2016-05-26  6:15 [Qemu-devel] [PATCH 0/2] block: Expose host block dev I/O size limit in scsi-block/scsi-generic Fam Zheng
@ 2016-05-26  6:15 ` Fam Zheng
  2016-06-02  6:52   ` Fam Zheng
  2016-06-02 12:30   ` Max Reitz
  2016-05-26  6:15 ` [Qemu-devel] [PATCH 2/2] scsi-generic: Merge block max xfer len in INQUIRY response Fam Zheng
  1 sibling, 2 replies; 8+ messages in thread
From: Fam Zheng @ 2016-05-26  6:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Max Reitz, Paolo Bonzini, qemu-block

This is sometimes a useful value we should count in.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/raw-posix.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a4f5a1b..d3796ad 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -729,9 +729,56 @@ static void raw_reopen_abort(BDRVReopenState *state)
     state->opaque = NULL;
 }
 
+static int hdev_get_max_transfer_length(dev_t dev)
+{
+    int ret;
+    int fd;
+    char *path;
+    const char *end;
+    char buf[32];
+    long len;
+
+    path = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_sectors_kb",
+                           major(dev), minor(dev));
+    fd = open(path, O_RDONLY);
+    if (fd < 0) {
+        ret = -errno;
+        goto out;
+    }
+    ret = read(fd, buf, sizeof(buf));
+    if (ret < 0) {
+        ret = -errno;
+        goto out;
+    } else if (ret == 0) {
+        ret = -EIO;
+        goto out;
+    }
+    buf[ret] = 0;
+    /* The file is ended with '\n', pass 'end' to accept that. */
+    ret = qemu_strtol(buf, &end, 10, &len);
+    if (ret == 0 && end && *end == '\n') {
+        ret = len * 1024 / BDRV_SECTOR_SIZE;
+    }
+
+    close(fd);
+out:
+    g_free(path);
+    return ret;
+}
+
 static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
+    struct stat st;
+
+    if (!fstat(s->fd, &st)) {
+        if (S_ISBLK(st.st_mode)) {
+            int ret = hdev_get_max_transfer_length(st.st_rdev);
+            if (ret >= 0) {
+                bs->bl.max_transfer_length = ret;
+            }
+        }
+    }
 
     raw_probe_alignment(bs, s->fd, errp);
     bs->bl.min_mem_alignment = s->buf_align;
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 2/2] scsi-generic: Merge block max xfer len in INQUIRY response
  2016-05-26  6:15 [Qemu-devel] [PATCH 0/2] block: Expose host block dev I/O size limit in scsi-block/scsi-generic Fam Zheng
  2016-05-26  6:15 ` [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs Fam Zheng
@ 2016-05-26  6:15 ` Fam Zheng
  2016-05-26  7:59   ` Paolo Bonzini
  1 sibling, 1 reply; 8+ messages in thread
From: Fam Zheng @ 2016-05-26  6:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Max Reitz, Paolo Bonzini, qemu-block

The rationale is similar to the above mode sense response interception:
this is practically the only channel to communicate restraints from
elsewhere such as host and block driver.

The scsi bus we attach onto can have a larger max xfer len than what is
accepted by the host file system (guarding between the host scsi LUN and
QEMU), in which case the SG_IO we generate would get -EINVAL.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/scsi/scsi-generic.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 7459465..71372a8 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -222,6 +222,18 @@ static void scsi_read_complete(void * opaque, int ret)
             r->buf[3] |= 0x80;
         }
     }
+    if (s->type == TYPE_DISK &&
+        r->req.cmd.buf[0] == INQUIRY &&
+        r->req.cmd.buf[2] == 0xb0) {
+        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk);
+        if (max_xfer_len) {
+            stl_be_p(&r->buf[8], max_xfer_len);
+            /* Also take care of the opt xfer len. */
+            if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
+                stl_be_p(&r->buf[12], max_xfer_len);
+            }
+        }
+    }
     scsi_req_data(&r->req, len);
     scsi_req_unref(&r->req);
 }
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] scsi-generic: Merge block max xfer len in INQUIRY response
  2016-05-26  6:15 ` [Qemu-devel] [PATCH 2/2] scsi-generic: Merge block max xfer len in INQUIRY response Fam Zheng
@ 2016-05-26  7:59   ` Paolo Bonzini
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2016-05-26  7:59 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel; +Cc: Kevin Wolf, qemu-block, Max Reitz



On 26/05/2016 08:15, Fam Zheng wrote:
> The rationale is similar to the above mode sense response interception:
> this is practically the only channel to communicate restraints from
> elsewhere such as host and block driver.
> 
> The scsi bus we attach onto can have a larger max xfer len than what is
> accepted by the host file system (guarding between the host scsi LUN and
> QEMU), in which case the SG_IO we generate would get -EINVAL.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

The two patches are pretty much separate, so I'm queuing this myself.

Thanks,

Paolo

> ---
>  hw/scsi/scsi-generic.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 7459465..71372a8 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -222,6 +222,18 @@ static void scsi_read_complete(void * opaque, int ret)
>              r->buf[3] |= 0x80;
>          }
>      }
> +    if (s->type == TYPE_DISK &&
> +        r->req.cmd.buf[0] == INQUIRY &&
> +        r->req.cmd.buf[2] == 0xb0) {
> +        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk);
> +        if (max_xfer_len) {
> +            stl_be_p(&r->buf[8], max_xfer_len);
> +            /* Also take care of the opt xfer len. */
> +            if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
> +                stl_be_p(&r->buf[12], max_xfer_len);
> +            }
> +        }
> +    }
>      scsi_req_data(&r->req, len);
>      scsi_req_unref(&r->req);
>  }
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs
  2016-05-26  6:15 ` [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs Fam Zheng
@ 2016-06-02  6:52   ` Fam Zheng
  2016-06-02 12:30   ` Max Reitz
  1 sibling, 0 replies; 8+ messages in thread
From: Fam Zheng @ 2016-06-02  6:52 UTC (permalink / raw)
  To: qemu-devel, kwolf, mreitz; +Cc: qemu-block

On Thu, 05/26 14:15, Fam Zheng wrote:
> This is sometimes a useful value we should count in.

Kevin, Max, could you review this please?

Fam

> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/raw-posix.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index a4f5a1b..d3796ad 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -729,9 +729,56 @@ static void raw_reopen_abort(BDRVReopenState *state)
>      state->opaque = NULL;
>  }
>  
> +static int hdev_get_max_transfer_length(dev_t dev)
> +{
> +    int ret;
> +    int fd;
> +    char *path;
> +    const char *end;
> +    char buf[32];
> +    long len;
> +
> +    path = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_sectors_kb",
> +                           major(dev), minor(dev));
> +    fd = open(path, O_RDONLY);
> +    if (fd < 0) {
> +        ret = -errno;
> +        goto out;
> +    }
> +    ret = read(fd, buf, sizeof(buf));
> +    if (ret < 0) {
> +        ret = -errno;
> +        goto out;
> +    } else if (ret == 0) {
> +        ret = -EIO;
> +        goto out;
> +    }
> +    buf[ret] = 0;
> +    /* The file is ended with '\n', pass 'end' to accept that. */
> +    ret = qemu_strtol(buf, &end, 10, &len);
> +    if (ret == 0 && end && *end == '\n') {
> +        ret = len * 1024 / BDRV_SECTOR_SIZE;
> +    }
> +
> +    close(fd);
> +out:
> +    g_free(path);
> +    return ret;
> +}
> +
>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>      BDRVRawState *s = bs->opaque;
> +    struct stat st;
> +
> +    if (!fstat(s->fd, &st)) {
> +        if (S_ISBLK(st.st_mode)) {
> +            int ret = hdev_get_max_transfer_length(st.st_rdev);
> +            if (ret >= 0) {
> +                bs->bl.max_transfer_length = ret;
> +            }
> +        }
> +    }
>  
>      raw_probe_alignment(bs, s->fd, errp);
>      bs->bl.min_mem_alignment = s->buf_align;
> -- 
> 2.8.2
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs
  2016-05-26  6:15 ` [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs Fam Zheng
  2016-06-02  6:52   ` Fam Zheng
@ 2016-06-02 12:30   ` Max Reitz
  2016-06-02 12:54     ` Kevin Wolf
  1 sibling, 1 reply; 8+ messages in thread
From: Max Reitz @ 2016-06-02 12:30 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, qemu-block

[-- Attachment #1: Type: text/plain, Size: 2396 bytes --]

On 26.05.2016 08:15, Fam Zheng wrote:
> This is sometimes a useful value we should count in.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/raw-posix.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index a4f5a1b..d3796ad 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -729,9 +729,56 @@ static void raw_reopen_abort(BDRVReopenState *state)
>      state->opaque = NULL;
>  }
>  
> +static int hdev_get_max_transfer_length(dev_t dev)
> +{
> +    int ret;
> +    int fd;
> +    char *path;
> +    const char *end;
> +    char buf[32];
> +    long len;
> +
> +    path = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_sectors_kb",
> +                           major(dev), minor(dev));

I can't say I like this very much, but well, it won't do any harm on any
systems that do not offer this path (i.e. any non-Linux system, I
suppose). So I'm fine with it.

> +    fd = open(path, O_RDONLY);
> +    if (fd < 0) {
> +        ret = -errno;
> +        goto out;
> +    }
> +    ret = read(fd, buf, sizeof(buf));
> +    if (ret < 0) {
> +        ret = -errno;
> +        goto out;
> +    } else if (ret == 0) {
> +        ret = -EIO;
> +        goto out;
> +    }
> +    buf[ret] = 0;

Potential buffer overflow if ret == sizeof(buf).

> +    /* The file is ended with '\n', pass 'end' to accept that. */
> +    ret = qemu_strtol(buf, &end, 10, &len);
> +    if (ret == 0 && end && *end == '\n') {
> +        ret = len * 1024 / BDRV_SECTOR_SIZE;

Maybe there should be an overflow check here.

> +    }
> +
> +    close(fd);

This belongs in some error path, because otherwise the FD is leaked if
the read() failed.

Max

> +out:
> +    g_free(path);
> +    return ret;
> +}
> +
>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>      BDRVRawState *s = bs->opaque;
> +    struct stat st;
> +
> +    if (!fstat(s->fd, &st)) {
> +        if (S_ISBLK(st.st_mode)) {
> +            int ret = hdev_get_max_transfer_length(st.st_rdev);
> +            if (ret >= 0) {
> +                bs->bl.max_transfer_length = ret;
> +            }
> +        }
> +    }
>  
>      raw_probe_alignment(bs, s->fd, errp);
>      bs->bl.min_mem_alignment = s->buf_align;
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs
  2016-06-02 12:30   ` Max Reitz
@ 2016-06-02 12:54     ` Kevin Wolf
  2016-06-03  1:46       ` Fam Zheng
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2016-06-02 12:54 UTC (permalink / raw)
  To: Max Reitz; +Cc: Fam Zheng, qemu-devel, Paolo Bonzini, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1434 bytes --]

Am 02.06.2016 um 14:30 hat Max Reitz geschrieben:
> On 26.05.2016 08:15, Fam Zheng wrote:
> > This is sometimes a useful value we should count in.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/raw-posix.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 47 insertions(+)
> > 
> > diff --git a/block/raw-posix.c b/block/raw-posix.c
> > index a4f5a1b..d3796ad 100644
> > --- a/block/raw-posix.c
> > +++ b/block/raw-posix.c
> > @@ -729,9 +729,56 @@ static void raw_reopen_abort(BDRVReopenState *state)
> >      state->opaque = NULL;
> >  }
> >  
> > +static int hdev_get_max_transfer_length(dev_t dev)
> > +{
> > +    int ret;
> > +    int fd;
> > +    char *path;
> > +    const char *end;
> > +    char buf[32];
> > +    long len;
> > +
> > +    path = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_sectors_kb",
> > +                           major(dev), minor(dev));
> 
> I can't say I like this very much, but well, it won't do any harm on any
> systems that do not offer this path (i.e. any non-Linux system, I
> suppose). So I'm fine with it.

Haven't looked at the patch in detail yet, so I didn't want to send a
comment yet, but I think this should be #ifdef-ed out for non-Linux.

Also a quick search on the internet suggests that the BLKSECTGET ioctl
is what we're looking for, so hopefully using sysfs is unnecessary
anyway.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs
  2016-06-02 12:54     ` Kevin Wolf
@ 2016-06-03  1:46       ` Fam Zheng
  0 siblings, 0 replies; 8+ messages in thread
From: Fam Zheng @ 2016-06-03  1:46 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Max Reitz, qemu-devel, Paolo Bonzini, qemu-block

On Thu, 06/02 14:54, Kevin Wolf wrote:
> Also a quick search on the internet suggests that the BLKSECTGET ioctl
> is what we're looking for, so hopefully using sysfs is unnecessary
> anyway.

Oops!  Looks like something went terribly wrong with my "quick search", will
post v2. Thanks!

Fam

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-06-03  1:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-26  6:15 [Qemu-devel] [PATCH 0/2] block: Expose host block dev I/O size limit in scsi-block/scsi-generic Fam Zheng
2016-05-26  6:15 ` [Qemu-devel] [PATCH 1/2] raw-posix: Fetch max sectors for host block device from sysfs Fam Zheng
2016-06-02  6:52   ` Fam Zheng
2016-06-02 12:30   ` Max Reitz
2016-06-02 12:54     ` Kevin Wolf
2016-06-03  1:46       ` Fam Zheng
2016-05-26  6:15 ` [Qemu-devel] [PATCH 2/2] scsi-generic: Merge block max xfer len in INQUIRY response Fam Zheng
2016-05-26  7:59   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).