From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: Kevin Wolf <kwolf@redhat.com>,
Anton Nefedov <anton.nefedov@virtuozzo.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS
Date: Fri, 1 Nov 2019 16:06:23 +0100 [thread overview]
Message-ID: <544d3cad-213b-752f-0831-075fd41a281d@redhat.com> (raw)
In-Reply-To: <4188555d-23e3-ef2d-133c-5826cf878d37@redhat.com>
[-- Attachment #1.1: Type: text/plain, Size: 4105 bytes --]
On 01.11.19 14:30, Max Reitz wrote:
[...]
> So unless there are realistic guest benchmarks for ext4 that say we
> should keep the patch, I’m going to queue the revert for now (“now” =
> 4.1.1 and 4.2.0).
I found one case where the performance is significantly improved by
c8bb23cbdbe: In the cases so far I had XFS in the guest, now I used
ext4, and with aio=native (on the ext4 host with 2 MB clusters), the
performance goes from 63.9 - 65.0 MB/s to 75.7 - 76.4 MB/s (so +18%).
The difference is smaller for 64 kB clusters, but still there at +13%.
That’s probably the more important fact, because these are the default
settings, and this is probably about what would happen with 2 MB
clusters + subclusters.
(Patch 4 in this series doesn’t decrease the performance.)
This is a tough decision for me because from some people tell me “Let’s
just revert it, there are problems with it and we don’t quite know what
good it does in practice”, and others say “We have (not really
practical) benchmarks that show it does something good for our specific
case”. And all that while those two groups never seem to talk to each
other directly.
So I suppose I’m going to have to make a decision. I now know a case
where c8bb23cbdbe is actually beneficial. I myself have never seen
c8bb23cbdbe decrease performance, but I know Laurent has seen a drastic
performance degradation, and he’s used it to bisect the XFS problem to
that commit, so it’s really real. But I haven’t seen it, and as far as
I know it really only happens on ppc64.
In light of this, I would prefer to revert c8bb23cbdbe for 4.1.1, and
keep it for 4.2.0. But I don’t know whether we can do that, all I know
is that I’m not going to find out in time for 4.1.1.
If we keep c8bb23cbdbe in any way, we need patches 2 through 4, that
much is clear.
I believe we can think about the performance problem after 4.2.0. I
would love to benchmark c8bb23cbdbe on a fixed kernel, but there just
isn’t time for that anymore.
I’m not a fan of keeping c8bb23cbdbe behind a configure switch. If it’s
beneficial, it should be there for everyone.
OK. Some may see this as a wrong decision, but someone needs to make
one now, so here goes: ext4 is the default Linux FS for most
distributions. As far as I can tell from my own benchmarks, c8bb23cbdbe
brings a significant performance improvement for qcow2 images with the
default configuration on this default filesystem with aio=native and
doesn’t change much in any other case.
What happens on ppc64 is a problem, but that’s a RHEL problem because
it’s specific to XFS (and also ppc64). It also won’t be a regression on
4.2 compared to 4.1.
Dave’s argument was good that fallocate() and AIO cannot mix (at least
on XFS), but I couldn’t see any impact of that in my benchmarks (maybe
my benchmarks were just wrong).
So I think for upstream it’ll be best if I send a v2 which doesn’t touch
handle_alloc_space(), and instead just consists of patches 2 through 4.
(And CC it all to stable.)
I think we still need to keep track of the XFS/ppc64 issue and do more
benchmarks especially with the fixed XFS driver.
tl;dr:
The main arguments for reverting c8bb23cbdbe were (AFAIU):
- a general uneasy feeling about it
- theoretical arguments that it must be bad on XFS
- actual problems on ppc64/XFS
- “what good does it do in practice?”
- that subclusters would make it obsolete anyway
What I could see is:
- no impact on XFS in practice
- significant practical benefit on ext4
- subclusters probably wouldn’t make it obsolete, because I can still
see a +13% improvement for 64 kB clusters (2 MB clusters + subclusters
gives you 64 kB subclusters)
In addition, it needs to be considered that ext4 is the default FS for
most Linux distributions.
As such, I personally am not convinced of reverting this patch. Let’s
keep it, have patches 2 through 4 for 4.1.1 and 4.2.0, and think about
what to do for ppc64/XFS later.
Max
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
prev parent reply other threads:[~2019-11-01 15:08 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-01 10:00 [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS Max Reitz
2019-11-01 10:00 ` [PATCH for-4.2 1/4] Revert "qcow2: skip writing zero buffers to empty COW areas" Max Reitz
2019-11-01 10:22 ` Vladimir Sementsov-Ogievskiy
2019-11-01 12:40 ` Eric Blake
2019-11-01 14:01 ` Max Reitz
2019-11-01 15:42 ` Kevin Wolf
2019-11-01 16:02 ` Max Reitz
2019-11-01 10:00 ` [PATCH for-4.2 2/4] block: Make wait/mark serialising requests public Max Reitz
2019-11-01 10:00 ` [PATCH for-4.2 3/4] block: Add bdrv_co_get_self_request() Max Reitz
2019-11-01 10:00 ` [PATCH for-4.2 4/4] block/file-posix: Let post-EOF fallocate serialize Max Reitz
2019-11-01 10:20 ` [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS Max Reitz
2019-11-01 10:28 ` Vladimir Sementsov-Ogievskiy
2019-11-01 11:12 ` Max Reitz
2019-11-01 11:16 ` Vladimir Sementsov-Ogievskiy
2019-11-01 11:20 ` Max Reitz
2019-11-01 12:34 ` Max Reitz
2019-11-01 13:09 ` Vladimir Sementsov-Ogievskiy
2019-11-01 13:36 ` Denis Lunev
2019-11-01 13:40 ` Max Reitz
2019-11-01 13:30 ` Max Reitz
2019-11-01 15:06 ` Max Reitz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=544d3cad-213b-752f-0831-075fd41a281d@redhat.com \
--to=mreitz@redhat.com \
--cc=anton.nefedov@virtuozzo.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).