From: Max Reitz <mreitz@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: virtio-fs@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Virtio-fs] [PATCH v2 7/9] virtiofsd: Add inodes_by_handle hash table
Date: Tue, 13 Jul 2021 17:07:31 +0200 [thread overview]
Message-ID: <eda4ee02-56f8-079d-0829-041ed3471aed@redhat.com> (raw)
In-Reply-To: <eec1bcd6-957d-8e9f-457c-fb717b71336b@redhat.com>
So I’m coming back to this after three weeks (well, PTO), and this again
turns into a bit of a pain, actually.
I don’t think it’s anything serious, but I had thought we had found
something that would make us both happy because it wouldn’t be too ugly,
and now it’s turning ugly again... So I’m sending this mail as a heads
up before I send v3 in the next days, to explain my thought process.
On 21.06.21 11:02, Max Reitz wrote:
> On 18.06.21 20:29, Vivek Goyal wrote:
>
[...]
>> I am still reading your code and trying to understand it. But one
>> question came to mind. What happens if we can generate file handle
>> during lookup. But can't generate when same file is looked up again.
>>
>> - A file foo.txt is looked. We can create file handle and we add it
>> to lo->inodes_by_handle as well as lo->inodes_by_ds.
>>
>> - Say somebody deleted file and created again and inode number got
>> reused.
>>
>> - Now during ->revalidation path, lookup happens again. This time say
>> we can't generate file handle. If am reading lo_do_find() code
>> correctly, it will find the old inode using ids and return same
>> inode as result of lookup. And we did not recognize that inode
>> number has been reused.
>
> Oh, that’s a good point. If an lo_inode has no O_PATH fd but is only
> addressed by handle, we must always look it up by handle.
Also, just wanted to throw in this remark:
Now that I read the code again, lo_do_find() already has a condition to
prevent this. It’s this:
if (p && fhandle != NULL && p->fhandle != NULL) {
p = NULL;
}
There’s just one thing wrong with it, and that is the `fhandle != NULL`
part. It has no place there. But this piece of code does exactly what
we’d need it do if it were just:
if (p && p->fhandle != NULL) {
p = NULL;
}
[...]
> However, you made a good point in that we must require
> name_to_handle_at() to work if it worked before for some inode, not
> because it would be simpler, but because it would be wrong otherwise.
>
> As for the other way around... Well, now I don’t have a strong
> opinion on it. Handling temporary name_to_handle_at() failure after
> it worked the first time should not add extra complexity, but it
> wouldn’t be symmetric. Like, allowing temporary failure sometimes but
> not at other times.
(I think I mistyped here, it should be “Handling name_to_handle_at()
randomly working after it failed the first time”.)
> The next question is, how do we detect temporary failure, because if
> we look up some new inode, name_to_handle_at() fails, we ignore it,
> and then it starts to work and we fail all further lookups, that’s not
> good. We should have the first lookup fail. I suppose ENOTSUPP means
> “OK to ignore”, and for everything else we should let lookup fail?
> (And that pretty much answers my "what if name_to_handle_at() works
> the first time, but then fails" question. If we let anything but
> ENOTSUPP let the lookup fail, then we should do so every time.)
I don’t think this will work as cleanly as I’d hoped.
The problem I’m facing is that get_file_handle() doesn’t only call
name_to_handle_at(), but also contains a lot of code managing
mount_fds. There are a lot of places that can fail, too, and I think we
should have them fall back to using an O_PATH FD:
Say mount_fds doesn’t contain an FD for the new handle’s mount ID yet,
so we want to add one. However, it turns out that the file is not a
regular file or directory, so we cannot open it as a regular FD and add
it to mount_fds; or that it is a regular file, but without permission to
open it O_RDONLY. So we cannot return a file handle, because it will
not be usable until a mount FD is added.
I think in such a case we should fall back to an O_PATH FD, because this
is not some unexpected error, but just an unfortunate (but reproducible
and valid) circumstance where using `-o inode_file_handles` fails to do
something that works without it.
Now, however, this means that the next time we try to generate a handle
for this file (to look it up), it will absolutely work if some other FD
was added to mount_fds for this mount ID in the meantime.
We could get around this by not trying to open the file for which we are
to generate a handle to add its FD to mount_fds, but instead doing what
the open_by_handle_at() man page suggests:
> The mount_id argument returns an identifier for the filesystem mount
> that corresponds to pathname. This corresponds to the first field in
> one of the records in /proc/self/mountinfo. Opening the pathname in
> the fifth field of that record yields a file descriptor for the mount
> point; that file descriptor can be used in a subsequent call to
> open_by_handle_at().
However, I’d rather avoid parsing mountinfo. And as far as I
understand, the only problem here is that we’ll have to cope with the
fact that sometimes on lookups, we can generate a file handle, but the
lo_inode we want to find has no file handle attached to it (because
get_file_handle() failed the first time), and so we won’t find it by
that handle but have to look it up by its inode ID. (Which is safe,
because that lo_inode must have an O_PATH FD attached to it, so the
inode ID cannot be reused.) And that’s something that this series
already does, so I tend to favor that over parsing mountinfo.
Max
next prev parent reply other threads:[~2021-07-13 15:08 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-09 15:55 [PATCH v2 0/9] virtiofsd: Allow using file handles instead of O_PATH FDs Max Reitz
2021-06-09 15:55 ` [PATCH v2 1/9] virtiofsd: Add TempFd structure Max Reitz
2021-06-09 15:55 ` [PATCH v2 2/9] virtiofsd: Use lo_inode_open() instead of openat() Max Reitz
2021-06-09 15:55 ` [PATCH v2 3/9] virtiofsd: Add lo_inode_fd() helper Max Reitz
2021-06-09 15:55 ` [PATCH v2 4/9] virtiofsd: Let lo_fd() return a TempFd Max Reitz
2021-06-09 15:55 ` [PATCH v2 5/9] virtiofsd: Let lo_inode_open() " Max Reitz
2021-06-09 15:55 ` [PATCH v2 6/9] virtiofsd: Add lo_inode.fhandle Max Reitz
2021-06-09 15:55 ` [PATCH v2 7/9] virtiofsd: Add inodes_by_handle hash table Max Reitz
2021-06-11 20:04 ` [Virtio-fs] " Vivek Goyal
2021-06-16 13:38 ` Max Reitz
2021-06-17 21:21 ` Vivek Goyal
2021-06-18 8:28 ` Max Reitz
2021-06-18 18:29 ` Vivek Goyal
2021-06-21 9:02 ` Max Reitz
2021-06-21 15:51 ` Vivek Goyal
2021-06-21 17:07 ` Max Reitz
2021-06-21 21:27 ` Vivek Goyal
2021-07-13 15:07 ` Max Reitz [this message]
2021-07-20 14:50 ` Vivek Goyal
2021-07-21 8:29 ` Max Reitz
2021-06-18 8:30 ` Max Reitz
2021-06-09 15:55 ` [PATCH v2 8/9] virtiofsd: Optionally fill lo_inode.fhandle Max Reitz
2021-06-09 15:55 ` [PATCH v2 9/9] virtiofsd: Add lazy lo_do_find() Max Reitz
2021-06-11 19:19 ` [PATCH v2 0/9] virtiofsd: Allow using file handles instead of O_PATH FDs Vivek Goyal
2021-06-16 13:41 ` Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=eda4ee02-56f8-079d-0829-041ed3471aed@redhat.com \
--to=mreitz@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=vgoyal@redhat.com \
--cc=virtio-fs@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).