From: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
To: Andreas Dilger <adilger-KloliPT79xf2eFz/2MeuCQ@public.gmane.org>
Cc: David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
samba-technical-w/Ol4Ecudpl8XjKLYN78aQ@public.gmane.org,
linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
wine-devel-5vRYHf7vrtgdnm+yROfE0A@public.gmane.org,
kfm-devel-RoXCvvDuEio@public.gmane.org,
nautilus-list-rDKQcyrBJuzYtjvyW6yDsg@public.gmane.org,
linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
libc-alpha-9JcytcrH/bA+uJoB2kUjGw@public.gmane.org
Subject: Re: [PATCH 0/6] Extended file stat system call
Date: Sat, 28 Apr 2012 10:38:33 +1000
Message-ID: <20120428003833.GH9541@dastard>
In-Reply-To: <ED5B8F1B-6C99-4516-85FA-A767E94B635F-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>
On Thu, Apr 26, 2012 at 09:22:04PM -0600, Andreas Dilger wrote:
> On 2012-04-26, at 7:06 PM, Dave Chinner wrote:
> > On Thu, Apr 19, 2012 at 03:05:58PM +0100, David Howells wrote:
> >>
> >> Implement a pair of new system calls to provide extended and further extensible stat functions.
> >>
> >> The second of the associated patches is the main patch that provides these new system calls:
> >>
> >>	ssize_t ret = xstat(int dfd,
> >>			    const char *filename,
> >>			    unsigned atflag,
> >>			    unsigned mask,
> >>			    struct xstat *buffer);
> >>
> >>	ssize_t ret = fxstat(int fd,
> >>			     unsigned atflag,
> >>			     unsigned mask,
> >>			     struct xstat *buffer);
> >>
> >> which are more fully documented in the first patch's description.
> >>
> >> These new stat functions provide a number of useful features, in summary:
> >>
> >> (1) More information: creation time, inode generation number, data
> >> version number, flags/attributes. A subset of these is available
> >> through a number of filesystems (CIFS, NFS, AFS, Ext4 and BTRFS).
> >
> > If we are adding per-inode flags, then what do we do with filesystem
> > specific flags? e.g. XFS has quite a number of per-inode flags that
> > don't align with any other filesystem (e.g. filestream allocator,
> > real time file, behaviour inheritance flags, etc), but may be useful
> > to retrieve in such a call. We currently have an ioctl to get that
> > information from each inode. Have you thought about how to handle
> > such flags?
>
> I'm sympathetic to your cause, but I don't want this to degrade into
> the same morass that it did last time when every attribute under the
> sun was added to the call.
Understood, which is why I'm not asking for everything under the sun
to be supported. I'm more interested in finding the useful subset of
information that a typical application might make use of.
> The intent is to replace the stat() call
> with something that can avoid overhead on filesystems for which some
> attributes are expensive, and that applications may not need. Some
> common attributes were added that are used by multiple filesystems.
>
> If it is too filesystem-specific, and there is little possibility
> that these attributes will be usable on other filesystems, then it
> should remain a filesystem specific ioctl() call.
Right, that's why I didn't mention the real-time bits, the
filestream allocation bits, or other things that are tightly bound
to the way XFS works....
> If you can make
> a case that these attributes have value on a few other filesystems,
> and applications are reasonably likely to be able to use them, and
> their addition does not make the API overly complex, then suggest
> away.
Exactly my thoughts ;)
> > Along the same lines, filesystems can have allocation constraints
> > that differ from the filesystem block size - ext4 with its
> > bigalloc hack, XFS with its per-inode extent size hints and the
> > realtime device, etc. Then there are optimal IO characteristics
> > (e.g. geometry hints like stripe unit/stripe width for the
> > allocation policy of that given file) that applications could use
> > if they were present rather than having to expose them through
> > ioctls that nobody even knows about...
>
> There is already "optimal IO size" that the application can use,
> how do the geometry hints differ?
Have a look at how XFS overloads stat.st_blksize depending on the
filesystem and inode config. It's amazingly convoluted, and based on
a combination of filesystem geometry, inode bits and mount options:

    xfs_vn_getattr()
    ....
        if (XFS_IS_REALTIME_INODE(ip)) {
            /*
             * If the file blocks are being allocated from a
             * realtime volume, then return the inode's realtime
             * extent size or the realtime volume's extent size.
             */
            stat->blksize =
                xfs_get_extsz_hint(ip) << mp->m_sb.sb_blocklog;
        } else
            stat->blksize = xfs_preferred_iosize(mp);
    ......

    xfs_extlen_t
    xfs_get_extsz_hint(
        struct xfs_inode *ip)
    {
        if ((ip->i_d.di_flags & XFS_DIFLAG_EXTSIZE) && ip->i_d.di_extsize)
            return ip->i_d.di_extsize;
        if (XFS_IS_REALTIME_INODE(ip))
            return ip->i_mount->m_sb.sb_rextsize;
        return 0;
    }
    ....

    static inline unsigned long
    xfs_preferred_iosize(xfs_mount_t *mp)
    {
        if (mp->m_flags & XFS_MOUNT_COMPAT_IOSIZE)
            return PAGE_CACHE_SIZE;
        return (mp->m_swidth ?
            (mp->m_swidth << mp->m_sb.sb_blocklog) :
            ((mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) ?
                (1 << (int)MAX(mp->m_readio_log, mp->m_writeio_log)) :
                PAGE_CACHE_SIZE));
    }

All of that can be exported as 4 parameters for normal files:

    allocation block size       (extent size hint)
    minimum io size             (PAGE_CACHE_SIZE)
    preferred minimum IO size   (mp->m_readio_log/mp->m_writeio_log)
    best aligned IO size        (stripe width)

And for realtime files it's a bit different because of the
block-based bitmap allocator it uses:

    allocation block size       (extent size hint)
    minimum io size             (PAGE_CACHE_SIZE)
    preferred minimum IO size   (extent size hint)
    best aligned IO size        (some multiple of extent size hint)

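
To make the normal-file case concrete, here is a rough userspace sketch of the four values exported as separate fields. Everything named here is hypothetical: `struct io_hints`, `struct file_model` and `normal_file_hints()` are made up for illustration and are not part of the xstat patches or of XFS. A value of 0 is assumed to mean "no hint set".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: four separate hints instead of one overloaded st_blksize. */
struct io_hints {
	uint32_t alloc_size;	/* allocation block size (extent size hint), bytes */
	uint32_t min_io;	/* minimum IO size, bytes */
	uint32_t pref_min_io;	/* preferred minimum IO size, bytes */
	uint32_t aligned_io;	/* best aligned IO size (stripe width), bytes */
};

/* Hypothetical, simplified stand-in for the mount/inode state of a normal file. */
struct file_model {
	uint32_t page_size;	/* stands in for PAGE_CACHE_SIZE */
	uint32_t block_size;	/* filesystem block size, bytes */
	uint32_t extsize_blocks;	/* extent size hint in blocks, 0 if unset */
	uint32_t swidth_blocks;	/* stripe width in blocks, 0 if unset */
	unsigned readio_log;	/* log2 of the preferred read size */
	unsigned writeio_log;	/* log2 of the preferred write size */
};

/* Fill in the four hints for a normal (non-realtime) file. */
static struct io_hints normal_file_hints(const struct file_model *f)
{
	struct io_hints h;
	unsigned io_log = f->readio_log > f->writeio_log ?
			  f->readio_log : f->writeio_log;

	assert(io_log < 32);	/* keep the shift below well defined */

	h.alloc_size = f->extsize_blocks * f->block_size;	/* 0: no hint */
	h.min_io = f->page_size;
	h.pref_min_io = (uint32_t)1 << io_log;
	h.aligned_io = f->swidth_blocks * f->block_size;	/* 0: no stripe geometry */
	return h;
}
```

With values reported separately like this, an application can choose which one it actually cares about instead of having to guess what a single number was meant to convey.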
> Userspace is able to handle
> st_blksize of several MB in size without problems, and any sane
> application will do the IO sized + aligned on multiples of this.
Actually, some applications still have problems with that. That's
the reason we only expose stripe widths in st_blksize when a mount
option is set. Stripe widths are known to get into the tens of MB,
and applications using st_blksize for memory allocation of IO
buffers tend to get into trouble with those.
That's why I'd prefer specific optimal IO hints - we don't have to
overload st_blksize with lots of meanings to pass what is relatively
trivial information back to the application.
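
For illustration, this is roughly the defensive code an application needs today if it sizes IO buffers from st_blksize. It is only a sketch: the 1 MiB cap, the 4 KiB floor and the helper names `io_buf_size()` and `alloc_io_buf()` are assumptions made for the example, not values or interfaces from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Assumed limits for illustration only. */
#define IO_BUF_MIN ((size_t)4096)
#define IO_BUF_MAX ((size_t)1 << 20)	/* 1 MiB */

/* Clamp a reported block size into a range that is safe to allocate. */
static size_t io_buf_size(long blksize)
{
	if (blksize < (long)IO_BUF_MIN)
		return IO_BUF_MIN;
	if ((size_t)blksize > IO_BUF_MAX)
		return IO_BUF_MAX;
	return (size_t)blksize;
}

/*
 * Allocate an IO buffer for an open file descriptor, sized from
 * st_blksize but clamped so that a stripe width of tens of MB does
 * not turn into a buffer of tens of MB. Returns NULL on failure.
 */
static void *alloc_io_buf(int fd, size_t *len)
{
	struct stat st;

	assert(len != NULL);
	if (fstat(fd, &st) != 0)
		return NULL;
	*len = io_buf_size((long)st.st_blksize);
	return malloc(*len);
}
```

The clamp throws away the alignment information the filesystem was trying to convey, which is the cost of squeezing several meanings into one field.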
Cheers,
Dave.
--
Dave Chinner
david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org