Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Eric Blake <eblake@redhat.com>
To: Max Reitz <mreitz@redhat.com>, qemu-devel@nongnu.org
Cc: fullmanet@gmail.com, qemu-block@nongnu.org,
	Kevin Wolf <kwolf@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option
Date: Wed, 15 Aug 2018 21:39:02 -0500	[thread overview]
Message-ID: <4aa899d3-c622-6acf-128c-513f95c55ed5@redhat.com> (raw)
In-Reply-To: <48010463-405e-bb08-f9c5-8263dcf3bb48@redhat.com>

On 08/15/2018 09:20 PM, Max Reitz wrote:
> On 2018-08-15 04:56, Eric Blake wrote:
>> For feature parity with dd, we want to be able to specify
>> the offset within the output file, just as we can specify
>> the offset for the input (in particular, this makes copying
>> a subset range of guest-visible bytes from one file to
>> another much easier).
> 
> In my opinion, we do not want feature parity with dd.  What we do want
> is feature parity with convert.

Well, convert is lacking a way to specify a subset of one file to move 
to a (possibly different) subset of the other.  I'm fine if we want to 
enhance convert to do the things that right now require a dd-alike 
interface (namely, limiting the copying to less than the full file, and 
choosing the offset at which to start [before this patch] or write to 
[with this patch]).

If convert were more powerful, I'd be fine dropping 'qemu-img dd' after 
a proper deprecation period.

> 
>> The code style for 'qemu-img dd' was pretty hard to read;
>> unfortunately this patch focuses only on adding the new
>> feature in the existing style rather than trying to improve
>> the overall flow, other than switching octal constants to
>> hex.  Oh well.
> 
> No, the real issue is that dd is still not implemented just as a
> frontend to convert.  Which it should be.  I'm not sure dd was a very
> good idea from the start, and now it should ideally be a frontend to
> convert.
> 
> (My full opinion on the matter: dd has a horrible interface.  I don't
> quite see why we replicated that inside qemu-img.  Also, if you want to
> use dd, why not use qemu-nbd + Linux nbd device + real dd?)

Because of performance: qemu-nbd + Linux nbd device + real dd is one 
more layer of data copying (each write() from dd goes to kernel, then is 
sent to qemu-nbd in userspace as a socket message before being sent back 
to the kernel to actually write() to the final destination) compared to 
just doing it all in one process (write() lands in the final destination 
with no further user space bouncing).  And because the additional steps 
to set it up are awkward (see my other email where I rant about losing 
the better part of today to realizing that 'dd ...; qemu-nbd -d 
/dev/nbd1' loses data if you omit conv=fdatasync).

> 
> ((That gave me a good idea.  Actually, it's probably not such a good
> idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
> might be nice which represents an image as a raw image at some mount
> point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
> need to type modprobe nbd.))

So the kernel->userspace translation would be happening via the FUSE 
interface instead of the NBD interface.  Data still bounces around just 
as much, but it might be a fun project.  Does fuse behave well when 
serving exactly one file at the mountpoint, rather than the more typical 
file system rooted in a directory?  NBD at least has the benefit of 
claiming to be a block device all along, rather than complicating the 
user interface with POSIX file system rules (which you'll be bending, 
because you are serving exactly one file instead of a system).

> 
>> Also, switch the test to use an offset of 0 instead of 1,
>> to test skip= and seek= on their own; as it is, this is
>> effectively quadrupling the test runtime, which starts
>> to make this test borderline on whether it should still
>> belong to './check -g quick'.  And I didn't bother to
>> reindent the test shell code for the new nested loop.
> 
> In my opinion, it should no longer belong to quick.  It takes 8 s on my
> tmpfs.  My border is somewhere around 2 or 3; and I haven't yet decided
> whether that's on tmpfs or SSD.

I took 4 iterations pre-patch, to 8 iterations after patch 1, to 32 
iterations with this patch; my observed times went from 1s to 2s to 7s 
on SSD ext4. Yeah, for v2, I'll drop it from quick.

>> @@ -4574,7 +4592,14 @@ static int img_dd(int argc, char **argv)
>>           size = dd.count * in.bsz;
>>       }
>>
>> -    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
>> +    if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {
> 
> What about overflows in out.offset * out.bsz?

I've had enough of my eyes bleeding on all the code repeatedly scaling 
things. For v2, I'm strongly considering a cleanup patch that reads all 
input, then scales all values into bytes, and THEN performs any 
additional math in a single unit, just so the additions become easier to 
reason about.

> 
>> +        error_report("Seek too large for '%s'", out.filename);
>> +        ret = -1;
>> +        goto out;
> 
> Real dd doesn't seem to error out (it just reports an error).  I don't
> know whether that makes any difference, though.

But where does the data get written if you can't actually seek that far 
into the file?

> 
> The test looks good to me.

Other than my creative indentation levels ;)

> 
> Max
> 
>> +    }
>> +    seek = out.offset * out.bsz;
>> +
>> +    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size + seek, &error_abort);
>>       end = size + in_pos;
>>
>>       ret = bdrv_create(drv, out.filename, opts, &local_err);
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

next prev parent reply	other threads:[~2018-08-16  2:39 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-15  2:56 [Qemu-devel] [PATCH 0/2] Improve qemu-img dd Eric Blake
2018-08-15  2:56 ` [Qemu-devel] [PATCH 1/2] qemu-img: Fix dd with skip= and count= Eric Blake
2018-08-16  2:03   ` Max Reitz
2018-08-16  2:17     ` Eric Blake
2018-08-16  2:19       ` Max Reitz
2018-08-15  2:56 ` [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option Eric Blake
2018-08-16  2:20   ` Max Reitz
2018-08-16  2:39     ` Eric Blake [this message]
2018-08-16  2:49       ` Eric Blake
2018-08-16  2:49       ` Max Reitz
2018-08-16  2:57         ` Eric Blake
2018-08-16  3:00           ` Max Reitz
2018-08-16  7:15         ` Kevin Wolf
2018-08-17 19:22           ` Max Reitz
2018-08-20  2:07     ` Fam Zheng
2018-08-20 12:20       ` Max Reitz
2018-08-16  2:04 ` [Qemu-devel] [PATCH 0/2] Improve qemu-img dd Eric Blake
2018-08-16  2:12 ` Eric Blake
2018-08-16 19:39 ` no-reply
2018-08-16 20:00 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4aa899d3-c622-6acf-128c-513f95c55ed5@redhat.com \
    --to=eblake@redhat.com \
    --cc=fullmanet@gmail.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).