linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: David Howells <dhowells@redhat.com>
Cc: viro@ZenIV.linux.org.uk, smfrench@gmail.com, jlayton@redhat.com,
	mcao@us.ibm.com, aneesh.kumar@linux.vnet.ibm.com,
	linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, samba-technical@lists.samba.org,
	sjayaraman@suse.de, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 0/3] Extended file stat functions [ver #2]
Date: Tue, 26 Nov 2013 11:40:34 +0100
Message-ID: <20131126104034.GA4854@quack.suse.cz>
In-Reply-To: <20100630011656.18960.4255.stgit@warthog.procyon.org.uk>

  Hello,

On Wed 30-06-10 02:16:56, David Howells wrote:
> Implement a pair of new system calls to provide extended and further extensible
> stat functions.
> 
> The third of the associated patches provides these new system calls:
> 
> 	struct xstat_dev {
> 		unsigned int	major;
> 		unsigned int	minor;
> 	};
> 
> 	struct xstat_time {
> 		unsigned long long	tv_sec;
> 		unsigned long long	tv_nsec;
> 	};
> 
> 	struct xstat {
> 		unsigned int		struct_version;
> 	#define XSTAT_STRUCT_VERSION	0
> 		unsigned int		st_mode;
> 		unsigned int		st_nlink;
> 		unsigned int		st_uid;
> 		unsigned int		st_gid;
> 		unsigned int		st_blksize;
> 		struct xstat_dev	st_rdev;
> 		struct xstat_dev	st_dev;
> 		unsigned long long	st_ino;
> 		unsigned long long	st_size;
> 		struct xstat_time	st_atime;
> 		struct xstat_time	st_mtime;
> 		struct xstat_time	st_ctime;
> 		struct xstat_time	st_btime;
> 		unsigned long long	st_blocks;
  While we are doing this, can we please also change 'st_blocks' to
'st_bytes'? The kernel has tracked space usage in bytes for a long time, so
it would be nice to propagate that to userspace via stat instead of a special
ioctl (at least quotacheck(8) needs to know the exact value).
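
To illustrate with the existing stat(2) interface (a minimal sketch, not part
of the patch set; both helper names are made up for this example): on Linux
st_blocks counts 512-byte units, so userspace can only reconstruct usage as a
multiple of 512 and anything accounted at a finer granularity is lost.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical helper: space usage as visible through stat(2) today.
 * On Linux st_blocks is in 512-byte units, so the result is always a
 * multiple of 512 even if the kernel accounts usage in exact bytes. */
static uint64_t usage_bytes_from_blocks(const struct stat *st)
{
	return (uint64_t)st->st_blocks * 512u;
}

/* Hypothetical wrapper: returns 0 and stores the usage on success,
 * -1 if stat() fails (errno is left as set by stat()). */
static int path_usage_bytes(const char *path, uint64_t *bytes)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	*bytes = usage_bytes_from_blocks(&st);
	assert(*bytes % 512 == 0);	/* the granularity st_bytes would remove */
	return 0;
}
```

With an 'st_bytes' field the multiplication and the rounding would both go
away.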

								Honza
  
> 		unsigned long long	st_gen;
> 		unsigned long long	st_data_version;
> 		unsigned long long	query_flags;
> 	#define XSTAT_QUERY_SIZE		0x00000001ULL
> 	#define XSTAT_QUERY_NLINK		0x00000002ULL
> 	#define XSTAT_QUERY_AMC_TIMES		0x00000004ULL
> 	#define XSTAT_QUERY_CREATION_TIME	0x00000008ULL
> 	#define XSTAT_QUERY_BLOCKS		0x00000010ULL
> 	#define XSTAT_QUERY_INODE_GENERATION	0x00000020ULL
> 	#define XSTAT_QUERY_DATA_VERSION	0x00000040ULL
> 	#define XSTAT_QUERY__ORDINARY_SET	0x00000017ULL
> 	#define XSTAT_QUERY__GET_ANYWAY		0x0000007fULL
> 	#define XSTAT_QUERY__DEFINED_SET	0x0000007fULL
> 		unsigned long long	extra_results[0];
> 	};
> 
> 	ssize_t ret = xstat(int dfd,
> 			    const char *filename,
> 			    unsigned atflag,
> 			    struct xstat *buffer,
> 			    size_t buflen);
> 
> 	ssize_t ret = fxstat(int fd,
> 			     struct xstat *buffer,
> 			     size_t buflen);
> 
> which are more fully documented in that patch's description.
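
For reference, the composite masks above decompose as follows. This is only
a sketch that copies the constants from the quoted struct;
xstat_extra_bits() is a hypothetical helper and not something the patch
defines.

```c
#include <assert.h>

/* Constants copied from the quoted struct xstat definition. */
#define XSTAT_QUERY_SIZE		0x00000001ULL
#define XSTAT_QUERY_NLINK		0x00000002ULL
#define XSTAT_QUERY_AMC_TIMES		0x00000004ULL
#define XSTAT_QUERY_CREATION_TIME	0x00000008ULL
#define XSTAT_QUERY_BLOCKS		0x00000010ULL
#define XSTAT_QUERY_INODE_GENERATION	0x00000020ULL
#define XSTAT_QUERY_DATA_VERSION	0x00000040ULL
#define XSTAT_QUERY__ORDINARY_SET	0x00000017ULL
#define XSTAT_QUERY__GET_ANYWAY		0x0000007fULL
#define XSTAT_QUERY__DEFINED_SET	0x0000007fULL

/* The ordinary set is size, link count, a/m/c times and block count. */
static_assert(XSTAT_QUERY__ORDINARY_SET ==
	      (XSTAT_QUERY_SIZE | XSTAT_QUERY_NLINK |
	       XSTAT_QUERY_AMC_TIMES | XSTAT_QUERY_BLOCKS),
	      "ordinary set mismatch");

/* The defined set adds creation time, inode generation and data version. */
static_assert(XSTAT_QUERY__DEFINED_SET ==
	      (XSTAT_QUERY__ORDINARY_SET | XSTAT_QUERY_CREATION_TIME |
	       XSTAT_QUERY_INODE_GENERATION | XSTAT_QUERY_DATA_VERSION),
	      "defined set mismatch");

/* Hypothetical helper: bits that are defined but not in the ordinary set. */
static unsigned long long xstat_extra_bits(void)
{
	return XSTAT_QUERY__DEFINED_SET & ~XSTAT_QUERY__ORDINARY_SET;
}
```

So a rename of st_blocks would presumably also mean renaming
XSTAT_QUERY_BLOCKS, which sits in the ordinary set.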
> 
> The bonuses of these new stat functions are:
> 
>  (1) The fields in the xstat struct are cleaned up.  There are no split or
>      duplicated fields.
> 
>  (2) Some extra information is made available (file creation time, inode
>      generation number and data version number) where provided by the
>      underlying filesystem.
> 
>      These are implemented here for Ext4 and AFS, but could also be provided
>      for CIFS, NTFS and BtrFS and probably others.
> 
>  (3) The structure is versioned and extensible, meaning that further new system
>      calls shouldn't be required.
> 
> Note that no lstat() equivalent is required as that can be implemented through
> xstat() with atflag == 0.
> 
> 
> The first patch makes const a bunch of system call userspace string/buffer
> arguments.  I can then make sys_xstat()'s filename pointer const too (though
> the entire first patch is not required for that).
> 
> The second patch makes the AFS filesystem use i_generation for the vnode ID
> uniquifier rather than i_version, and assigns i_version to hold the AFS data
> version number, making them more logical for when I want to get at them from
> afs_getattr().
> 
> There's a test program attached to the description for patch 3.  It can be run
> as follows:
> 
> 	[root@andromeda ~]# /tmp/xstat /afs/archive/linuxdev/fedora9/i386/repodata/
> 	xstat(/afs/archive/linuxdev/fedora9/i386/repodata/) = 152
> 	sv=0 qf=77 cr=0.0 iv=7a5 dv=5
> 	  Size: 2048            Blocks: 0          IO Block: 4096    directory
> 	Device: 00:15           Inode: 83          Links: 2
> 	Access: (0755/drwxr-xr-x)  Uid: 75338   Gid: 0
> 	Access: 2008-11-05 20:00:12.000000000+0000
> 	Modify: 2008-11-05 20:00:12.000000000+0000
> 	Change: 2008-11-05 20:00:12.000000000+0000
> 	Inode version: 7a5h
> 	Data version: 5h
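
Assuming "qf" in the output above is query_flags printed in hex (the test
program itself is not quoted here, so that is a guess), 0x77 can be decoded
with something like the sketch below. The table and the function name are
made up for illustration; only the bit values come from the quoted header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct xstat_flag_name {
	unsigned long long bit;
	const char *name;
};

/* Bit values as in the quoted XSTAT_QUERY_* defines. */
static const struct xstat_flag_name xstat_flag_names[] = {
	{ 0x01ULL, "SIZE" },
	{ 0x02ULL, "NLINK" },
	{ 0x04ULL, "AMC_TIMES" },
	{ 0x08ULL, "CREATION_TIME" },
	{ 0x10ULL, "BLOCKS" },
	{ 0x20ULL, "INODE_GENERATION" },
	{ 0x40ULL, "DATA_VERSION" },
};

/* Write the space-separated names of the bits set in qf into buf.
 * Returns the number of names written; stops early if buf is too small. */
static int xstat_describe_flags(unsigned long long qf, char *buf, size_t len)
{
	size_t used = 0;
	int n = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t i = 0;
	     i < sizeof(xstat_flag_names) / sizeof(xstat_flag_names[0]); i++) {
		int w;

		if (!(qf & xstat_flag_names[i].bit))
			continue;
		w = snprintf(buf + used, len - used, "%s%s",
			     n ? " " : "", xstat_flag_names[i].name);
		if (w < 0 || (size_t)w >= len - used) {
			buf[used] = '\0';	/* drop the truncated name */
			break;
		}
		used += (size_t)w;
		n++;
	}
	return n;
}
```

Under that assumption every defined bit except CREATION_TIME is set, and
BLOCKS is among them even though the block count shown is 0.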
> 
> 
> Things that need consideration:
> 
>  (1) Is it worth retaining the ability to arbitrarily add extra bits onto the
>      end of the stat buffer?  And what's the best way to do this?
> 
>      I've defined a way that from userspace involves assigning bits in
>      query_flags to extra results that you might want.  But this could instead
>      be done, say, by just upping the struct version number any time we want to
>      pass back more information.  Alternatively, we could go for a tagged data
>      method, perhaps using the same format as the recvmsg() control message
>      field.
> 
>      If we use tagged data then rather than being selective, we could just
>      return as many tagged data items as we feel the user might want and we can
>      cram into the buffer.  That could be rather slow, though.
> 
>  (2) What extra bits of information might we like to see available through the
>      stat interface?  Security labels?  NFS file IDs?  Xattrs?
> 
>      If we went for a tagged data method, xstat() could be modified to take a
>      list of tags as an argument, and could then return arbitrarily-sized
>      tagged results, including fs-specific stuff.
> 
>  (3) Does st_blksize really need to be 64 bits on a 64-bit system?  Or can it
>      be 32-bits?  Are we really likely to see something with a 4Gb+ blocksize?
> 
>  (4) Should the inode number and data version number fields be 128-bit?
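
As for the tagged-data alternative in (1), a rough userspace sketch of what
such a buffer could look like follows. Everything in it is hypothetical: the
xtag_* names, the header layout and the 8-byte alignment are my assumptions,
loosely modelled on how recvmsg() control messages are laid out, and none of
it comes from the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; nothing like this exists in the patches. */
struct xtag_hdr {
	uint32_t tag;	/* identifies the payload */
	uint32_t len;	/* header + payload in bytes, before padding */
};

/* Records start on 8-byte boundaries (an assumption of this sketch). */
#define XTAG_ALIGN(n)	(((n) + 7u) & ~(size_t)7u)

/* Append one record at offset off.
 * Returns the offset of the next record, or 0 if it does not fit. */
static size_t xtag_put(unsigned char *buf, size_t buflen, size_t off,
		       uint32_t tag, const void *data, uint32_t dlen)
{
	struct xtag_hdr h = { tag, (uint32_t)sizeof(h) + dlen };
	size_t end = off + XTAG_ALIGN(h.len);

	if (end > buflen)
		return 0;
	memset(buf + off, 0, end - off);	/* also clears the padding */
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + sizeof(h), data, dlen);
	return end;
}

/* Find the first record with the given tag in the first 'used' bytes.
 * Returns a pointer to its payload and stores the payload length,
 * or returns NULL if the tag is absent or a record is malformed. */
static const void *xtag_find(const unsigned char *buf, size_t used,
			     uint32_t tag, uint32_t *dlen)
{
	size_t off = 0;

	assert(buf != NULL && dlen != NULL);
	while (off + sizeof(struct xtag_hdr) <= used) {
		struct xtag_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > used - off)
			return NULL;	/* malformed record */
		if (h.tag == tag) {
			*dlen = h.len - (uint32_t)sizeof(h);
			return buf + off + sizeof(h);
		}
		off += XTAG_ALIGN(h.len);
	}
	return NULL;
}
```

The cost mentioned in (1) shows up in xtag_find(): the caller has to walk the
records to locate a field instead of reading it at a fixed struct offset.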
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
