public inbox for linux-kernel@vger.kernel.org
From: tridge@samba.org
To: Andreas Dilger <adilger@clusterfs.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: performance of filesystem xattrs with Samba4
Date: Fri, 19 Nov 2004 22:43:40 +1100
Message-ID: <16797.56428.329257.785330@samba.org>
In-Reply-To: <20041119101600.GM1974@schnapps.adilger.int>

Andreas,

 > Also, we (CFS) have developed patches for ext3 + e2fsprogs to support
 > "fast" EAs stored in larger inodes on disk, and this can improve
 > performance dramatically in the case where you are accessing a large
 > number of inodes with EAs just.

yep, that could help a lot. I imagine it will provide a similar
benefit to the option to expand the inode size in XFS, which certainly
made a huge difference.

 > This patch also provides the infrastructure on disk for storing e.g.
 > nsecond and create timestamps in the ext3 large inodes, but the actual
 > implementation to save/load these isn't there yet.  If that were
 > available, would you use it instead of explicitly storing the NTTIME in
 > an EA?

certainly! 

For Samba4 we need 4 timestamps (create/change/write/access),
preferably all with 100ns resolution or better. All 4 timestamps need
to be settable (unlike st_ctime in POSIX).

The strategy I've adopted is this:

 - use st_atime and st_mtime for the access and write time fields,
   with nanosecond resolution if available, otherwise with 1 second
   resolution. It's just too expensive to update an EA on every
   read/write, so I didn't put these in the DosAttrib EA.

 - store create_time and change_time in the user.DosAttrib xattr, as
   64 bit 100ns resolution times (same format as NT uses and Samba
   uses internally). I store change_time there as its definition is a
   little different from the POSIX ctime field (plus it's settable).
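
To make the 64 bit 100ns format concrete, here is a minimal sketch of
the conversion between a struct timespec and an NT-style time. It
assumes the usual NT convention of counting 100ns units from
1601-01-01 UTC; the function names are made up for illustration and
are not the ones Samba uses.

```c
#include <stdint.h>
#include <time.h>

/* Seconds between 1601-01-01 (NT epoch) and 1970-01-01 (Unix epoch). */
#define NT_EPOCH_OFFSET_SECS 11644473600LL

/* Illustrative helper (hypothetical name): Unix timespec to a 64 bit
 * count of 100ns units since 1601. Sub-100ns precision is dropped. */
static uint64_t timespec_to_nttime(struct timespec ts)
{
    uint64_t secs = (uint64_t)((int64_t)ts.tv_sec + NT_EPOCH_OFFSET_SECS);
    return secs * 10000000ULL + (uint64_t)(ts.tv_nsec / 100);
}

/* Illustrative helper (hypothetical name): the inverse conversion. */
static struct timespec nttime_to_timespec(uint64_t nt)
{
    struct timespec ts;
    ts.tv_sec = (time_t)((int64_t)(nt / 10000000ULL) - NT_EPOCH_OFFSET_SECS);
    ts.tv_nsec = (long)(nt % 10000000ULL) * 100;
    return ts;
}
```

A value produced this way fits in 8 bytes, which is why keeping
create_time and change_time together in one xattr is cheap as long as
the xattr is not rewritten on every read/write.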

If we had a settable create_time field in the inode then I'd certainly
want to use it in Samba4. A non-settable one wouldn't be nearly as
useful. Some Win32 applications care about being able to set all the
time fields (such as Excel 2003).

This wouldn't allow us to get rid of the user.DosAttrib xattr
completely though, as we stick a bunch of other stuff in there and
will be expanding it soon to help with the case-insensitive speed
problem.

 > I believe the 2.6 stat interface will support nsecond timestamps,

yep, we are already using st.st_atim.tv_nsec when configure detects
it. It's very useful, but because ext3 doesn't store the nanoseconds
on disk, timestamps can regress when inodes are ejected from the cache
under memory pressure. That needs fixing.
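
A small model of the regression, as a sketch only: it assumes the
in-memory inode keeps tv_nsec while the on-disk format keeps whole
seconds, so a reload after eviction returns an earlier time than the
one a client was already shown. This is not ext3 code, and the helper
names are hypothetical.

```c
#include <time.h>

/* Model of reloading an inode timestamp from a seconds-only on-disk
 * format: the nanosecond part that lived in the cache is lost. */
static struct timespec reload_from_seconds_only_disk(struct timespec cached)
{
    struct timespec on_disk = { .tv_sec = cached.tv_sec, .tv_nsec = 0 };
    return on_disk;
}

/* Compare two timestamps; returns <0, 0 or >0 like strcmp(). */
static int timespec_cmp(struct timespec a, struct timespec b)
{
    if (a.tv_sec != b.tv_sec)
        return a.tv_sec < b.tv_sec ? -1 : 1;
    if (a.tv_nsec != b.tv_nsec)
        return a.tv_nsec < b.tv_nsec ? -1 : 1;
    return 0;
}
```

With a cached time whose tv_nsec is non-zero, the reloaded value
compares as older than the cached one even though the file was never
touched in between.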

 > but I don't think there is any API to get the create time to userspace
 > though we could hook this up to a pseudo EA.  The benefit of storing
 > these common fields in the inode instead of EAs is less overhead.

I think it would make more sense to have a new variant of utime() for
setting all available timestamps, and expose all timestamps in stat. A
separate API for create time seems a bit hackish.

 > I would just configure out the xattr sharing code entirely since it will
 > likely do nothing but increase overhead if any of the EAs on an inode
 > are unique (this is the most common case, except for POSIX-ACL-only setups).

I didn't know it was configurable. I can't see any CONFIG option for
it - is there some trick I've missed?

 > I've attached this patch here.

I'll give it a go and let you know how it changes the NBENCH results.

Cheers, Tridge
