From: Thomas Huth <thuth@redhat.com>
To: Nir Soffer <nsoffer@redhat.com>,
qemu-block <qemu-block@nongnu.org>, Kevin Wolf <kwolf@redhat.com>,
Max Reitz <mreitz@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
Viktor Mihajlovski <mihajlov@linux.ibm.com>,
Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>,
Richard Jones <rjones@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [PATCH] block/file-posix: Fix problem with fallocate(PUNCH_HOLE) on GPFS
Date: Wed, 19 May 2021 12:21:20 +0200 [thread overview]
Message-ID: <15a28d6c-8958-5feb-4418-134af27b5b6f@redhat.com> (raw)
In-Reply-To: <0b9b354c-c708-af16-105a-0e738eafa69e@redhat.com>
On 19/04/2021 07.06, Thomas Huth wrote:
> On 16/04/2021 22.34, Nir Soffer wrote:
>> On Fri, Apr 16, 2021 at 8:23 AM Thomas Huth <thuth@redhat.com> wrote:
>>>
>>> A customer reported that running
>>>
>>> qemu-img convert -t none -O qcow2 -f qcow2 input.qcow2 output.qcow2
>>>
>>> fails for them with the following error message when the images are
>>> stored on a GPFS file system:
>>>
>>> qemu-img: error while writing sector 0: Invalid argument
>>>
>>> After analyzing the strace output, it seems like the problem is in
>>> handle_aiocb_write_zeroes(): The call to fallocate(FALLOC_FL_PUNCH_HOLE)
>>> returns EINVAL, which can apparently happen if the file system has
>>> a different idea of the granularity of the operation. It's arguably
>>> a bug in GPFS, since the PUNCH_HOLE mode should not result in EINVAL
>>> according to the man-page of fallocate(), but the file system is out
>>> there in production and so we have to deal with it. In commit 294682cc3a
>>> ("block: workaround for unaligned byte range in fallocate()") we also
>>> already applied the a work-around for the same problem to the earlier
>>> fallocate(FALLOC_FL_ZERO_RANGE) call, so do it now similar with the
>>> PUNCH_HOLE call.
>>>
>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>> ---
>>> block/file-posix.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>> index 20e14f8e96..7a40428d52 100644
>>> --- a/block/file-posix.c
>>> +++ b/block/file-posix.c
>>> @@ -1675,6 +1675,13 @@ static int handle_aiocb_write_zeroes(void *opaque)
>>> }
>>> s->has_fallocate = false;
>>> } else if (ret != -ENOTSUP) {
>>> + if (ret == -EINVAL) {
>>> + /*
>>> + * File systems like GPFS do not like unaligned byte
>>> ranges,
>>> + * treat it like unsupported (so caller falls back to
>>> pwrite)
>>> + */
>>> + return -ENOTSUP;
>>
>> This skips the next fallback, using plain fallocate(0) if we write
>> after the end of the file. Is this intended?
>>
>> We can treat the buggy EINVAL return value as "filesystem is buggy,
>> let's not try other options", or "let's try the next option". Since falling
>> back to actually writing zeroes is so much slower, I think it is better to
>> try the next option.
>
> I just did the same work-around as in commit 294682cc3a7 ... so if we agree
> to try the other options, too, we should change that spot, too...
>
> However, what is not clear to me, how would you handle s->has_write_zeroes
> and s->has_discard in such a case? Set them to "false"? ... but it could
> still work for some blocks with different alignment ... but if we keep them
> set to "true", the code tries again and again to call these ioctls, maybe
> wasting other precious cycles for this?
>
> Maybe we should do a different approach instead: In case we hit a EINVAL
> here, print an error a la:
>
> error_report_once("You are running on a buggy file system, please complain
> to the file system vendor");
>
> and return -ENOTSUP ... then it's hopefully clear to the users why they are
> getting a bad performance, and that they should complain to the file system
> vendor instead to get their problem fixed.
Ping!
Any recommendations how to proceed here?
Thomas
next prev parent reply other threads:[~2021-05-19 10:25 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-16 5:23 [PATCH] block/file-posix: Fix problem with fallocate(PUNCH_HOLE) on GPFS Thomas Huth
2021-04-16 20:34 ` Nir Soffer
2021-04-19 5:06 ` Thomas Huth
2021-04-19 13:13 ` Kevin Wolf
2021-05-19 10:21 ` Thomas Huth [this message]
2021-05-19 10:41 ` Thomas Huth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=15a28d6c-8958-5feb-4418-134af27b5b6f@redhat.com \
--to=thuth@redhat.com \
--cc=andrey.shinkevich@virtuozzo.com \
--cc=borntraeger@de.ibm.com \
--cc=kwolf@redhat.com \
--cc=mihajlov@linux.ibm.com \
--cc=mreitz@redhat.com \
--cc=nsoffer@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=rjones@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).