From: Ulrich Drepper <drepper@redhat.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Rob Ross <rross@mcs.anl.gov>,
Trond Myklebust <trond.myklebust@fys.uio.no>,
Andreas Dilger <adilger@clusterfs.com>,
Sage Weil <sage@newdream.net>, Brad Boyer <flar@allandria.com>,
Anton Altaparmakov <aia21@cam.ac.uk>,
Gary Grider <ggrider@lanl.gov>,
linux-fsdevel@vger.kernel.org
Subject: Re: NFSv4/pNFS possible POSIX I/O API standards
Date: Tue, 05 Dec 2006 15:55:46 -0800
Message-ID: <45760702.6040805@redhat.com>
In-Reply-To: <20061205220538.GA1988@infradead.org>
Christoph Hellwig wrote:
> Ulrich, this is in reply to these API proposals:
I know the documents. The HECWG was supposed to submit an actual draft
to the OpenGroup-internal working group, but I haven't seen anything
yet. I'm not opposed to getting real-world experience first.
>> So other than this "lite" version of the readdirplus() call, and this
>> idea of making the flags indicate validity rather than accuracy, are
>> there other comments on the directory-related calls? I understand that
>> they might or might not ever make it in, but assuming they did, what
>> other changes would you like to see?
I don't think an accuracy flag is useful at all. Programs don't want to
use fuzzy information. If you want a fast 'ls -l' then add a mode which
doesn't print the fields which are not provided. Don't provide outdated
information. Similarly for other programs.
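
As a minimal sketch of the "don't print what was not provided" mode, the
listing code below simply leaves out any field the mask marks as absent.
The LITE_NO_* bits and format_entry() are hypothetical names made up for
this illustration; they are not part of any proposal in this thread.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical bits: a set bit means the field was NOT provided. */
#define LITE_NO_SIZE  0x1u
#define LITE_NO_MTIME 0x2u

/* Format one listing line, omitting every field that was not provided.
   Returns the length written, or -1 if the buffer is too small. */
int format_entry(char *out, size_t outlen, const char *name,
                 const struct stat *st, unsigned int missing)
{
    int n = snprintf(out, outlen, "%s", name);
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    if (!(missing & LITE_NO_SIZE)) {
        int m = snprintf(out + n, outlen - (size_t)n, " size=%lld",
                         (long long)st->st_size);
        if (m < 0 || (size_t)m >= outlen - (size_t)n)
            return -1;
        n += m;
    }
    if (!(missing & LITE_NO_MTIME)) {
        int m = snprintf(out + n, outlen - (size_t)n, " mtime=%lld",
                         (long long)st->st_mtime);
        if (m < 0 || (size_t)m >= outlen - (size_t)n)
            return -1;
        n += m;
    }
    return n;
}
```

Nothing stale is ever shown: a field is either current or absent from
the output.
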
> statlite needs to separate the flag for valid fields from the actual
> stat structure and reuse the existing stat(64) structure. stat lite
> needs to at least get a better name, even better be folded into *statat*,
> either by having a new AT_VALID_MASK flag that enables a new
> unsigned int valid argument or by folding the valid flags into the AT_
> flags.
Yes, this is also my pet peeve with this interface. I don't want to
have another data structure. Especially since programs might want to
store the value in places where normal stat results are returned.
And also yes on 'statat'. I strongly suggest defining only a statat
variant. In the standards group I'll vehemently oppose the introduction
of yet another superfluous non-*at interface.
As for reusing the existing statat interface and magically adding another
parameter through ellipsis: no. We need to become more type-safe. The
userlevel interface needs to be a new one. For the system call there is
no such restriction. We can indeed extend the existing syscall. We
have appropriate checks for the validity of the flags parameter in place
which make such calls backward compatible.
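
A sketch of what a separate, fully prototyped userlevel entry point could
look like. fstatat() is the existing POSIX call; fstatat_mask() and its
extra mask argument are hypothetical, chosen only to show that every
argument is visible to the compiler, which a variadic extension of the
existing function would not give us.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fstatat() and AT_FDCWD */
#endif
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* fstatat(), struct stat */

/* Hypothetical interface: same arguments as fstatat() plus an output
   mask, all spelled out in the prototype so each call is type-checked. */
int fstatat_mask(int dirfd, const char *path, struct stat *buf,
                 int flags, unsigned int *missing)
{
    /* This stand-in just calls plain fstatat(), which fills in every
       field, so no field is reported as missing. */
    if (fstatat(dirfd, path, buf, flags) != 0)
        return -1;
    *missing = 0;
    return 0;
}
```

A caller that passes the wrong type for the mask gets a compile-time
diagnostic here, whereas an argument passed through an ellipsis would be
accepted silently.
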
> I think having a stat lite variant is pretty much consensus, we just need
> to fine tune the actual API - and of course get a reference implementation.
> So if you want to get this going try to implement it based on
> http://marc.theaimsgroup.com/?l=linux-fsdevel&m=115487991724607&w=2.
> Bonus points for actually making use of the flags in some filesystems.
I don't like that approach. The flag parameter should be exclusively an
output parameter. By default the kernel should fill in all the fields
it has access to. If access is not easily possible then set the bit and
clear the field. There are of course certain fields which always should
be added. In the proposed man page these are already identified (i.e.,
those before the st_litemask member).
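
The output-only convention described above can be sketched with a toy
attribute cache. Everything here is an assumption for illustration: the
LITE_NO_* bits, struct cached_attrs, and fill_lite() are invented names,
and a real filesystem would decide what is "easily accessible" in its
own way.

```c
#include <sys/stat.h>

/* Hypothetical bits: a set bit means the field could not be provided. */
#define LITE_NO_SIZE  0x1u
#define LITE_NO_MTIME 0x2u

/* What a toy filesystem has on hand for one inode. */
struct cached_attrs {
    struct stat st;
    int size_is_current;    /* nonzero if st_size needs no extra work */
    int mtime_is_current;   /* nonzero if st_mtime needs no extra work */
};

/* Fill in everything that is cheaply available. For anything else,
   clear the field and set its bit in the returned mask. The caller
   passes no request flags at all; the mask is purely an output. */
unsigned int fill_lite(const struct cached_attrs *c, struct stat *out)
{
    unsigned int missing = 0;

    *out = c->st;
    if (!c->size_is_current) {
        out->st_size = 0;
        missing |= LITE_NO_SIZE;
    }
    if (!c->mtime_is_current) {
        out->st_mtime = 0;
        missing |= LITE_NO_MTIME;
    }
    return missing;
}
```
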
> At the actual
> C prototype level I would rename d_stat_err to d_stat_errno for consistency
> and maybe drop the readdirplus() entry point in favour of readdirplus_r
> only - there is no point in introducing new non-reentrant APIs today.
No, readdirplus should be kept (and yes, readdirplus_r must be added).
The reason is that the readdir_r interface is only needed if multiple
threads use the _same_ DIR stream. This is hardly ever the case.
Forcing everybody to use the _r variant means that we unconditionally
have to copy the data into the user-provided buffer. With readdir there
is the possibility to just pass back a pointer into the internal buffer
read into by getdents. This is how readdir works for most kernel/arch
combinations.
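
To make the cost difference concrete, here is a toy directory stream.
It is not the libc implementation: struct toy_dir, struct toy_entry, and
both functions are hypothetical, and the fixed array merely stands in
for the internal buffer that getdents fills.

```c
#include <stddef.h>
#include <string.h>

struct toy_entry {
    char name[32];
};

struct toy_dir {
    struct toy_entry buf[4];   /* stands in for the getdents buffer */
    size_t count;              /* entries currently in buf */
    size_t pos;                /* next entry to hand out */
};

/* readdir style: return a pointer into the stream's own buffer.
   No copy is made. */
const struct toy_entry *toy_readdir(struct toy_dir *d)
{
    return d->pos < d->count ? &d->buf[d->pos++] : NULL;
}

/* readdir_r style: always copy the entry into caller-provided storage. */
int toy_readdir_r(struct toy_dir *d, struct toy_entry *entry,
                  struct toy_entry **result)
{
    if (d->pos >= d->count) {
        *result = NULL;        /* end of directory */
        return 0;
    }
    memcpy(entry, &d->buf[d->pos++], sizeof *entry);
    *result = entry;
    return 0;
}
```

The pointer-returning variant only works this cheaply if the entry
layout the caller sees is the same as what sits in the internal buffer.
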
This requires that the dirent_plus structure matches so it's important
to get it right. I'm not comfortable with the current proposal. Yes,
having ordinary dirent and stat structure in there is a plus. But we
have overlap:
- d_ino and st_ino
- d_type and parts of st_mode
And we have superfluous information
- st_dev, the same for all entries, at least this is what readdir
assumes
I haven't made up my mind yet whether this is enough reason to introduce
a new type which isn't made up of the two structures.
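
The d_type overlap is easy to demonstrate: the file-type bits of st_mode
already determine it. The helper below is a sketch with a made-up name,
mode_to_dtype(); it assumes a system whose <dirent.h> provides the DT_*
constants (glibc and the BSDs do, but POSIX does not require them).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose DT_* and S_IFMT on glibc */
#endif
#include <dirent.h>
#include <sys/stat.h>

/* Derive the dirent-style type code from the file-type bits of st_mode. */
unsigned char mode_to_dtype(mode_t mode)
{
    switch (mode & S_IFMT) {
    case S_IFREG:  return DT_REG;
    case S_IFDIR:  return DT_DIR;
    case S_IFLNK:  return DT_LNK;
    case S_IFCHR:  return DT_CHR;
    case S_IFBLK:  return DT_BLK;
    case S_IFIFO:  return DT_FIFO;
    case S_IFSOCK: return DT_SOCK;
    default:       return DT_UNKNOWN;
    }
}
```

So an entry that carries a full st_mode carries the d_type information
twice, just as st_ino repeats d_ino.
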
And one last point: I haven't seen any discussion of why readdirplus
should do the equivalent of stat and there is no 'statlite' variant. Are
all the places where readdir is used either non-critical for performance
or dependent on accurate information?
--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖