qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [PATCH v2 03/20] fuse: Implement standard FUSE operations
Date: Thu, 15 Oct 2020 18:04:37 +0200	[thread overview]
Message-ID: <1768b35f-8615-1f0f-b008-6628882c6eaa@redhat.com> (raw)
In-Reply-To: <20201015155800.GL4610@merkur.fritz.box>


[-- Attachment #1.1: Type: text/plain, Size: 5046 bytes --]

On 15.10.20 17:58, Kevin Wolf wrote:
> Am 15.10.2020 um 17:18 hat Max Reitz geschrieben:
>> On 15.10.20 11:46, Kevin Wolf wrote:
>>> Am 22.09.2020 um 12:49 hat Max Reitz geschrieben:
>>>> This makes the export actually useful instead of only producing errors
>>>> whenever it is accessed.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
>>>> +/**
>>>> + * Handle client reads from the exported image.
>>>> + */
>>>> +static void fuse_read(fuse_req_t req, fuse_ino_t inode,
>>>> +                      size_t size, off_t offset, struct fuse_file_info *fi)
>>>> +{
>>>> +    FuseExport *exp = fuse_req_userdata(req);
>>>> +    int64_t length;
>>>> +    void *buf;
>>>> +    int ret;
>>>> +
>>>> +    /**
>>>> +     * Clients will expect short reads at EOF, so we have to limit
>>>> +     * offset+size to the image length.
>>>> +     */
>>>> +    length = blk_getlength(exp->common.blk);
>>>> +    if (length < 0) {
>>>> +        fuse_reply_err(req, -length);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    size = MIN(size, FUSE_MAX_BOUNCE_BYTES);
>>>
>>> "Read should send exactly the number of bytes requested except on EOF or
>>> error, otherwise the rest of the data will be substituted with zeroes."
>>
>> :(
>>
>>> Do we corrupt huge reads with this, so that read() succeeds, but the
>>> buffer is zeroed above 64M instead of containing the correct data? Maybe
>>> we should return an error instead?
>>
>> Hm.  It looks like there is a max_read option obeyed by the kernel
>> driver, and it appears it’s set by implementing .init() and setting
>> fuse_conn_info.max_read (also, “for the time being” it has to also set
>> in the mount options passed to fuse_session_new()).
>>
>> I’m not sure whether that does anything, though.  It appears that
>> whenever I do a cached read, it caps out at 128k (which is what
>> cuse_prep_data() in libfuse sets; though increasing that number there
>> doesn’t change anything, so I think that’s just a coincidence), and with
>> O_DIRECT, it caps out at 1M.
>>
>> But at least that would be grounds to return an error for >64M requests.
>>  (Limiting max_read to 64k does limit all read requests to 64k.)
> 
> Yes, setting max_read and making larger requests an error seems like a
> good solution.
> 
>> Further investigation is needed.
> 
> If you want :-)

Not really, but 128k per request is a bit sad.

[...]

>>>> +/**
>>>> + * Let clients flush the exported image.
>>>> + */
>>>> +static void fuse_flush(fuse_req_t req, fuse_ino_t inode,
>>>> +                       struct fuse_file_info *fi)
>>>> +{
>>>> +    FuseExport *exp = fuse_req_userdata(req);
>>>> +    int ret;
>>>> +
>>>> +    ret = blk_flush(exp->common.blk);
>>>> +    fuse_reply_err(req, ret < 0 ? -ret : 0);
>>>> +}
>>>
>>> This seems to be an implementation for .fsync rather than for .flush.
>>
>> Wouldn’t fsync entail a drain?
> 
> Hm, the libfuse documentation doesn't say anything about draining. I
> suppose this is because if requests need to be drained, it will do so in
> the kernel. But I haven't checked the code.

Hmm, well, yeah...  A sync doesn’t necessarily mean settling all
in-flight requests.

> So I expect that calling fsync() on the lower layer does everything that
> is needed. Which is bdrv_flush() in QEMU.
> 
>> Or is it the opposite, flush should just drain and not invoke
>> blk_flush()?  (Sorry, this all gets me confused every time.)
> 
> I'm just relying on the libfuse documentation there for flush:
> 
> "This is called on each close() of the opened file."

Ah, OK.

> and
> 
> "NOTE: the name of the method is misleading, since (unlike fsync) the
> filesystem is not forced to flush pending writes. One reason to flush
> data is if the filesystem wants to return write errors during close.
> However, such use is non-portable because POSIX does not require close
> to wait for delayed I/O to complete."
> 
>> (Though I do think .fsync should both flush and drain.)
>>
>>> Hmm, or maybe actually for both? We usually do bdrv_flush() during
>>> close, so it would be consistent to do the same here. It's our last
>>> chance to report an error to the user before the file is closed.
>>
>> I don’t understand what you mean.  What is “the same”?  Closing the
>> image?  Or indeed having .flush() be implemented with blk_flush()?
> 
> Implementing .flush(), which will be called when the image is closed,
> with blk_flush().
> 
> And still doing blk_flush() for .fsync, of course.

OK.

>> Do you mean that other parties may do the same as qemu does, i.e.
>> flush files before they are closed, which is why we should anticipate
>> the same and give users a chance to see errors?
> 
> I'm not exactly sure what failure of .flush() would look like for users.
> Probably close() failing, so they don't have to do anything special,
> just check their return values.

Checking return values on close()?  Sounds special to me. O:)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2020-10-15 16:10 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-22 10:49 [PATCH v2 00/20] block/export: Allow exporting BDSs via FUSE Max Reitz
2020-09-22 10:49 ` [PATCH v2 01/20] configure: Detect libfuse Max Reitz
2020-09-22 11:14   ` Thomas Huth
2020-09-22 11:21     ` Paolo Bonzini
2020-09-22 11:46     ` Max Reitz
2020-09-22 15:37     ` Max Reitz
2020-09-22 15:45       ` Paolo Bonzini
2020-09-22 10:49 ` [PATCH v2 02/20] fuse: Allow exporting BDSs via FUSE Max Reitz
2020-10-15  8:57   ` Kevin Wolf
2020-10-15 14:46     ` Max Reitz
2020-10-15 15:41       ` Kevin Wolf
2020-10-15 15:59         ` Max Reitz
2020-10-15 17:01           ` Kevin Wolf
2020-09-22 10:49 ` [PATCH v2 03/20] fuse: Implement standard FUSE operations Max Reitz
2020-10-15  9:46   ` Kevin Wolf
2020-10-15 15:18     ` Max Reitz
2020-10-15 15:58       ` Kevin Wolf
2020-10-15 16:04         ` Max Reitz [this message]
2020-09-22 10:49 ` [PATCH v2 04/20] fuse: Allow growable exports Max Reitz
2020-10-15 10:41   ` Kevin Wolf
2020-10-15 15:20     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 05/20] fuse: (Partially) implement fallocate() Max Reitz
2020-09-22 10:49 ` [PATCH v2 06/20] fuse: Implement hole detection through lseek Max Reitz
2020-09-22 10:49 ` [PATCH v2 07/20] iotests: Do not needlessly filter _make_test_img Max Reitz
2020-09-22 10:49 ` [PATCH v2 08/20] iotests: Do not pipe _make_test_img Max Reitz
2020-09-22 10:49 ` [PATCH v2 09/20] iotests: Use convert -n in some cases Max Reitz
2020-09-22 10:49 ` [PATCH v2 10/20] iotests/046: Avoid renaming images Max Reitz
2020-09-22 10:49 ` [PATCH v2 11/20] iotests: Derive image names from $TEST_IMG Max Reitz
2020-09-22 10:49 ` [PATCH v2 12/20] iotests/091: Use _cleanup_qemu instad of "wait" Max Reitz
2020-09-22 10:49 ` [PATCH v2 13/20] iotests: Restrict some Python tests to file Max Reitz
2020-09-22 10:49 ` [PATCH v2 14/20] iotests: Let _make_test_img guess $TEST_IMG_FILE Max Reitz
2020-09-22 10:49 ` [PATCH v2 15/20] iotests/287: Clean up subshell test image Max Reitz
2020-09-22 10:49 ` [PATCH v2 16/20] storage-daemon: Call bdrv_close_all() on exit Max Reitz
2020-09-22 10:49 ` [PATCH v2 17/20] iotests: Give access to the qemu-storage-daemon Max Reitz
2020-10-15 11:27   ` Kevin Wolf
2020-10-15 15:22     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 18/20] iotests: Allow testing FUSE exports Max Reitz
2020-10-15 11:43   ` Kevin Wolf
2020-10-15 15:27     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 19/20] iotests: Enable fuse for many tests Max Reitz
2020-09-22 10:49 ` [PATCH v2 20/20] iotests/308: Add test for FUSE exports Max Reitz
2020-09-22 15:58 ` [PATCH v2 00/20] block/export: Allow exporting BDSs via FUSE Daniel P. Berrangé
2020-09-23  7:21   ` Max Reitz
2020-09-23  9:08   ` Stefan Hajnoczi
2020-10-15 12:01 ` Kevin Wolf
2020-10-15 16:47   ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1768b35f-8615-1f0f-b008-6628882c6eaa@redhat.com \
    --to=mreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).