From: Nir Soffer <nsoffer@redhat.com>
To: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Nir Soffer <nirsof@gmail.com>,
qemu-block <qemu-block@nongnu.org>,
QEMU Developers <qemu-devel@nongnu.org>,
Max Reitz <mreitz@redhat.com>, Niels de Vos <ndevos@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] block: posix: Always allocate the first block
Date: Sat, 17 Aug 2019 01:45:14 +0300
Message-ID: <CAMRbyytThpP1KXPmJLpA_i3JLot7j9UshjcqRerkFtmN_T5Seg@mail.gmail.com>
In-Reply-To: <b24959b4-f2b2-d720-f8b5-4adc25b89278@redhat.com>
On Sat, Aug 17, 2019 at 12:57 AM John Snow <jsnow@redhat.com> wrote:
> On 8/16/19 5:21 PM, Nir Soffer wrote:
> > When creating an image with preallocation "off" or "falloc", the first
> > block of the image is typically not allocated. When using Gluster
> > storage backed by XFS filesystem, reading this block using direct I/O
> > succeeds regardless of request length, fooling alignment detection.
> >
> > In this case we fallback to a safe value (4096) instead of the optimal
> > value (512), which may lead to unneeded data copying when aligning
> > requests. Allocating the first block avoids the fallback.
> >
>
> Where does this detection/fallback happen? (Can it be improved?)
>
In raw_probe_alignment().
This patch explains the issues:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00568.html
Here Kevin and I discussed ways to improve it:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00426.html
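For readers without the source at hand, the probing idea can be sketched
roughly as below. This is not the code in raw_probe_alignment(); the probe
sizes, the constant names, and the read_ok callback are assumptions made for
illustration, based only on the behavior described in this thread.

```python
from typing import Callable

# Candidate sizes, smallest first. These values and the 4096 fallback are
# assumptions taken from the discussion, not copied from QEMU.
PROBE_SIZES = (1, 512, 1024, 2048, 4096)
SAFE_FALLBACK = 4096


def probe_request_alignment(read_ok: Callable[[int], bool]) -> int:
    """Return the smallest probe size for which a direct read succeeds.

    read_ok(size) is a hypothetical callback that attempts an O_DIRECT read
    of `size` bytes at offset 0 and reports whether it succeeded.
    """
    for size in PROBE_SIZES:
        if read_ok(size):
            # Success at size 1 means the storage accepted an unaligned
            # read, so the probe tells us nothing: use the safe value.
            return SAFE_FALLBACK if size == 1 else size
    # No probe succeeded at all: also use the safe value.
    return SAFE_FALLBACK


# Storage that enforces 512-byte alignment (first block allocated).
print(probe_request_alignment(lambda size: size % 512 == 0))  # -> 512
# Unallocated first block on Gluster/XFS: every read succeeds.
print(probe_request_alignment(lambda size: True))  # -> 4096
```

The second call models the case from the commit message: every read
succeeds, so detection is fooled and we end up with the safe value.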
> > When using preallocation=off, we always allocate at least one filesystem
> > block:
> >
> > $ ./qemu-img create -f raw test.raw 1g
> > Formatting 'test.raw', fmt=raw size=1073741824
> >
> > $ ls -lhs test.raw
> > 4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw
> >
> > I did quick performance tests for these flows:
> > - Provisioning a VM with a new raw image.
> > - Copying disks with qemu-img convert to new raw target image
> >
> > I installed Fedora 29 server on raw sparse image, measuring the time
> > from clicking "Begin installation" until the "Reboot" button appears:
> >
> > Before(s)   After(s)   Diff(%)
> > -------------------------------
> >       356        389      +8.4
> >
> > I ran this only once, so we cannot tell much from these results.
> >
>
> That seems like a pretty big difference for just having pre-allocated a
> single block. What was the actual command line / block graph for that test?
>
Having the first block allocated changes the detected alignment.
Before this patch, we detect request_alignment=1, so we fall back to 4096.
Then we detect buf_align=1, so we fall back to the value of request_alignment.
The guest sees a disk with:
    logical_block_size = 512
    physical_block_size = 512
But qemu uses:
    request_alignment = 4096
    buf_align = 4096
And the storage uses:
    logical_block_size = 512
    physical_block_size = 512
If the guest does direct I/O using 512-byte alignment, qemu has to copy
the buffers to align them to 4096 bytes.
After this patch, qemu detects the alignment correctly, so we have:
guest:
    logical_block_size = 512
    physical_block_size = 512
qemu:
    request_alignment = 512
    buf_align = 512
storage:
    logical_block_size = 512
    physical_block_size = 512
We expect this to be more efficient because qemu does not have to emulate
anything.
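To illustrate why a 4096-byte request_alignment costs extra work for a
512-byte guest request, here is a minimal sketch of the rounding involved.
The helper names are hypothetical and this is not how qemu implements it; it
only shows the arithmetic of widening a request to the alignment.

```python
from typing import Tuple


def aligned_span(offset: int, length: int, alignment: int) -> Tuple[int, int]:
    """Expand [offset, offset + length) outward to multiples of alignment.

    Returns the (start, size) of the request that must actually be issued.
    """
    start = offset - offset % alignment
    # Round the end of the request up to the next multiple of alignment.
    end = -(-(offset + length) // alignment) * alignment
    return start, end - start


def needs_copy(offset: int, length: int, alignment: int) -> bool:
    """True if the request must be widened, so data has to be copied."""
    return aligned_span(offset, length, alignment) != (offset, length)


# A 512-byte guest request at offset 1536:
print(aligned_span(1536, 512, 4096))  # -> (0, 4096)
print(aligned_span(1536, 512, 512))   # -> (1536, 512)
```

With request_alignment = 4096 the request grows to a full 4096 bytes and has
to go through an intermediate buffer; with 512 it passes through unchanged.
Buffer address alignment (buf_align) is a separate constraint not modeled
here.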
> Was this over a network that could explain the variance?
>
Maybe. This is a complete install of Fedora 29 Server, and I'm not sure
whether the installation accesses the network.
> > The second test was cloning the installation image with qemu-img
> > convert, doing 10 runs:
> >
> > for i in $(seq 10); do
> >     rm -f dst.raw
> >     sleep 10
> >     time ./qemu-img convert -f raw -O raw -t none -T none src.raw dst.raw
> > done
> >
> > Here is a table comparing the total time spent:
> >
> > Type    Before(s)   After(s)    Diff(%)
> > ---------------------------------------
> > real      530.028    469.123      -11.4
> > user       17.204     10.768      -37.4
> > sys        17.881      7.011      -60.7
> >
> > Here we see very clear improvement in CPU usage.
> >
>
> Hard to argue much with that. I feel a little strange trying to force
> the allocation of the first block, but I suppose in practice "almost no
> preallocation" is indistinguishable from "exactly no preallocation" if
> you squint.
>
Right.
The real issue is that filesystems and block devices do not expose the
alignment requirement for direct I/O, so we need to use these hacks and
assumptions.
With local XFS we use xfsctl(XFS_IOC_DIOINFO) to get request_alignment, but
this does not help for an XFS filesystem used by Gluster on the server side.
I hope that Niels is working on adding a similar ioctl for Gluster, so it can
expose the properties of the remote filesystem.
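For reference, querying XFS_IOC_DIOINFO on a local XFS file can be sketched
like this. It is a rough illustration rather than qemu's code: the request
number is computed assuming the generic Linux ioctl encoding (x86, arm64),
and DioAttr, decode_dioattr and query_dio_info are hypothetical helper names.
The call fails with OSError on filesystems that do not implement the ioctl.

```python
import fcntl
import struct
from typing import NamedTuple

# _IOR('X', 30, struct dioattr), where struct dioattr is three 32-bit
# unsigned fields (12 bytes). Assumes the generic Linux ioctl encoding;
# some architectures encode the direction bits differently.
XFS_IOC_DIOINFO = (2 << 30) | (12 << 16) | (ord("X") << 8) | 30


class DioAttr(NamedTuple):
    d_mem: int      # required memory (buffer) alignment
    d_miniosz: int  # minimum I/O size, also the offset alignment
    d_maxiosz: int  # maximum I/O size


def decode_dioattr(raw: bytes) -> DioAttr:
    """Unpack the 12-byte struct dioattr returned by the ioctl."""
    return DioAttr(*struct.unpack("=III", raw))


def query_dio_info(fd: int) -> DioAttr:
    """Ask the filesystem for direct I/O constraints of an open file."""
    raw = fcntl.ioctl(fd, XFS_IOC_DIOINFO, bytes(12))
    return decode_dioattr(raw)
```

On a Gluster mount this does not help, since the ioctl would have to reach
the XFS filesystem on the server side.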
Nir