From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
Mateusz Guzik <mjguzik@gmail.com>, Jens Axboe <axboe@kernel.dk>,
Christoph Hellwig <hch@lst.de>, Aleksa Sarai <cyphar@cyphar.com>,
Seth Forshee <sforshee@kernel.org>,
linux-fsdevel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] file: always lock position
Date: Sat, 2 Sep 2023 04:43:57 +0100 [thread overview]
Message-ID: <20230902034357.GH3390869@ZenIV> (raw)
In-Reply-To: <CAHk-=wi97khTatMKCvJD4tBkf6rMKTP=fLQDnok7MGEEewSz9g@mail.gmail.com>
On Thu, Aug 03, 2023 at 11:35:53AM -0700, Linus Torvalds wrote:
> And yes, those exist. See at least 'adfs_dir_ops', and
> 'ceph_dir_fops'. They may be broken, but people actually did do things
> like that historically, maybe there's a reason adfs and ceph allow it.
Huh?
adfs is a red herring - there is a method called 'read' in
struct adfs_dir_ops, but it has nothing to do with read(2) or anything
of that sort; it is *used* by adfs_iterate() (via adfs_read_inode()),
but the only file_operations that sucker has is
const struct file_operations adfs_dir_operations = {
.read = generic_read_dir,
.llseek = generic_file_llseek,
.iterate_shared = adfs_iterate,
.fsync = generic_file_fsync,
};
and generic_read_dir() is "fail with -EISDIR".
ceph one is real, and it's... not nice. They do have a genuine
->read() on directories, which acts as if it had been a regular
file contents generated by sprintf() (at the first read(2)).
Format:
"entries: %20lld\n"
" files: %20lld\n"
" subdirs: %20lld\n"
"rentries: %20lld\n"
" rfiles: %20lld\n"
" rsubdirs: %20lld\n"
"rbytes: %20lld\n"
"rctime: %10lld.%09ld\n",
and yes, it does go stale. Just close and reopen...
File position is an offset in that string. Shared with position
used by getdents()...
It gets better - if you open a directory, then fork and have
both child and parent read from it, well...
if (!dfi->dir_info) {
dfi->dir_info = kmalloc(bufsize, GFP_KERNEL);
if (!dfi->dir_info)
return -ENOMEM;
No locking whatsoever. No priveleges needed either...
Frankly, my preference would be to remove that kludge, but
short of that we would need at least some exclusion for
those allocations and if we do that, we might just as well
have ceph_readdir() fail if we'd ever done ceph_read_dir()
on the same struct file and vice versa.
They are really not mixible.
As far as I can tell, ceph is the only place in mainline where
we have ->iterate_shared along with non-failing ->read. Nothing
mixes it with ->read_iter, ->write or ->write_iter.
lseek() is there, of course, and that's enough to kill the "special
fdget variant for directory syscalls" kind of approach. But read()
and write() on directories are very much not something people
legitimately do.
next prev parent reply other threads:[~2023-09-02 3:44 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-24 15:00 [PATCH] file: always lock position Christian Brauner
2023-07-24 15:53 ` Linus Torvalds
2023-07-24 16:19 ` Christian Brauner
2023-07-24 16:36 ` Linus Torvalds
2023-07-24 16:51 ` Linus Torvalds
2023-09-02 4:44 ` Al Viro
2023-07-24 17:23 ` Christian Brauner
2023-07-24 17:34 ` Linus Torvalds
2023-07-24 17:46 ` Christian Brauner
2023-07-24 18:01 ` Linus Torvalds
2023-07-24 18:05 ` Jens Axboe
2023-07-24 18:27 ` Linus Torvalds
2023-07-24 18:48 ` Christian Brauner
2023-07-24 22:25 ` Linus Torvalds
2023-07-24 22:56 ` Jens Axboe
2023-07-25 18:30 ` Linus Torvalds
2023-07-25 20:41 ` Jens Axboe
2023-07-25 20:51 ` Linus Torvalds
2023-07-25 20:58 ` Jens Axboe
2023-07-26 8:36 ` Christian Brauner
2023-07-26 10:31 ` David Laight
2023-07-26 12:53 ` Christian Brauner
2023-07-26 8:07 ` Christian Brauner
2023-07-24 16:46 ` Christian Brauner
2023-07-24 16:59 ` Linus Torvalds
2023-07-24 17:18 ` Linus Torvalds
2023-08-03 9:53 ` Mateusz Guzik
2023-08-03 14:15 ` Christian Brauner
2023-08-03 15:17 ` Mateusz Guzik
2023-08-03 15:18 ` Mateusz Guzik
2023-08-03 15:45 ` Linus Torvalds
2023-08-03 17:54 ` Mateusz Guzik
2023-08-03 18:02 ` Christian Brauner
2023-08-03 18:35 ` Linus Torvalds
2023-08-04 13:43 ` Christian Brauner
2023-08-04 13:59 ` Christoph Hellwig
2023-09-02 3:43 ` Al Viro [this message]
[not found] <20230804-turnverein-helfer-ef07a4d7bbec@brauner>
2023-08-05 11:46 ` Christian Brauner
2023-08-05 18:47 ` Linus Torvalds
2023-08-05 19:46 ` Linus Torvalds
2023-08-06 6:10 ` Christian Brauner
2023-08-06 13:25 ` Christian Brauner
2023-08-06 17:48 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230902034357.GH3390869@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=cyphar@cyphar.com \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=mjguzik@gmail.com \
--cc=sforshee@kernel.org \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).