* [PATCH] fs: switch f_iocb_flags and f_version
@ 2024-08-22 14:14 Christian Brauner
2024-08-22 14:55 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Christian Brauner @ 2024-08-22 14:14 UTC (permalink / raw)
To: linux-fsdevel
Cc: Christian Brauner, Jens Axboe, Al Viro, Linus Torvalds,
Jeff Layton, Jan Kara
Now that we shrank struct file by 24 bytes we still have a 4 byte hole.
Move f_version into the union and f_iocb_flags out of the union to fill
that hole and shrink struct file by another 4 bytes. This brings struct
file to 200 bytes down from 232 bytes.
I've tried to audit all codepaths that use f_version and none of them
rely on it in file->f_op->release() and never have since commit
1da177e4c3f4 ("Linux-2.6.12-rc2").
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
struct file {
union {
struct callback_head f_task_work; /* 0 16 */
struct llist_node f_llist; /* 0 8 */
u64 f_version; /* 0 8 */
}; /* 0 16 */
spinlock_t f_lock; /* 16 4 */
fmode_t f_mode; /* 20 4 */
atomic_long_t f_count; /* 24 8 */
struct mutex f_pos_lock; /* 32 32 */
/* --- cacheline 1 boundary (64 bytes) --- */
loff_t f_pos; /* 64 8 */
unsigned int f_flags; /* 72 4 */
unsigned int f_iocb_flags; /* 76 4 */
struct fown_struct * f_owner; /* 80 8 */
const struct cred * f_cred; /* 88 8 */
struct file_ra_state f_ra; /* 96 32 */
/* --- cacheline 2 boundary (128 bytes) --- */
struct path f_path; /* 128 16 */
struct inode * f_inode; /* 144 8 */
const struct file_operations * f_op; /* 152 8 */
void * f_security; /* 160 8 */
void * private_data; /* 168 8 */
struct hlist_head * f_ep; /* 176 8 */
struct address_space * f_mapping; /* 184 8 */
/* --- cacheline 3 boundary (192 bytes) --- */
errseq_t f_wb_err; /* 192 4 */
errseq_t f_sb_err; /* 196 4 */
/* size: 200, cachelines: 4, members: 20 */
/* last cacheline: 8 bytes */
};
---
include/linux/fs.h | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7eb4f706d59f..7a2994405e8e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -998,9 +998,8 @@ struct file {
struct callback_head f_task_work;
/* fput() must use workqueue (most kernel threads). */
struct llist_node f_llist;
- unsigned int f_iocb_flags;
+ u64 f_version;
};
-
/*
* Protects f_ep, f_flags.
* Must not be taken from IRQ context.
@@ -1011,6 +1010,7 @@ struct file {
struct mutex f_pos_lock;
loff_t f_pos;
unsigned int f_flags;
+ unsigned int f_iocb_flags;
struct fown_struct *f_owner;
const struct cred *f_cred;
struct file_ra_state f_ra;
@@ -1018,7 +1018,6 @@ struct file {
struct inode *f_inode; /* cached value */
const struct file_operations *f_op;
- u64 f_version;
#ifdef CONFIG_SECURITY
void *f_security;
#endif
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-22 14:14 [PATCH] fs: switch f_iocb_flags and f_version Christian Brauner @ 2024-08-22 14:55 ` Jens Axboe 2024-08-22 15:10 ` Christian Brauner 2024-08-22 15:54 ` Jeff Layton 2024-08-23 6:24 ` Christoph Hellwig 2 siblings, 1 reply; 11+ messages in thread From: Jens Axboe @ 2024-08-22 14:55 UTC (permalink / raw) To: Christian Brauner, linux-fsdevel Cc: Al Viro, Linus Torvalds, Jeff Layton, Jan Kara On 8/22/24 8:14 AM, Christian Brauner wrote: > Now that we shrank struct file by 24 bytes we still have a 4 byte hole. > Move f_version into the union and f_iocb_flags out of the union to fill > that hole and shrink struct file by another 4 bytes. This brings struct > file to 200 bytes down from 232 bytes. Nice! Now you just need to find 8 more bytes and we'll be down to 3 cachelines for struct file. > I've tried to audit all codepaths that use f_version and none of them > rely on it in file->f_op->release() and never have since commit > 1da177e4c3f4 ("Linux-2.6.12-rc2"). Do we want to add a comment to this effect? I know it's obvious from sharing with f_task_work, but... -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-22 14:55 ` Jens Axboe @ 2024-08-22 15:10 ` Christian Brauner 2024-08-22 16:17 ` Jens Axboe 0 siblings, 1 reply; 11+ messages in thread From: Christian Brauner @ 2024-08-22 15:10 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-fsdevel, Al Viro, Linus Torvalds, Jeff Layton, Jan Kara > Do we want to add a comment to this effect? I know it's obvious from > sharing with f_task_work, but... I'll add one. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-22 15:10 ` Christian Brauner @ 2024-08-22 16:17 ` Jens Axboe 2024-08-23 8:16 ` Christian Brauner 0 siblings, 1 reply; 11+ messages in thread From: Jens Axboe @ 2024-08-22 16:17 UTC (permalink / raw) To: Christian Brauner Cc: linux-fsdevel, Al Viro, Linus Torvalds, Jeff Layton, Jan Kara On 8/22/24 9:10 AM, Christian Brauner wrote: >> Do we want to add a comment to this effect? I know it's obvious from >> sharing with f_task_work, but... > > I'll add one. Sounds good. You can add my: Reviewed-by: Jens Axboe <axboe@kernel.dk> as well, forgot to mention that in the original reply. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-22 16:17 ` Jens Axboe @ 2024-08-23 8:16 ` Christian Brauner 2024-08-24 9:26 ` Christian Brauner 0 siblings, 1 reply; 11+ messages in thread From: Christian Brauner @ 2024-08-23 8:16 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-fsdevel, Al Viro, Linus Torvalds, Jeff Layton, Jan Kara [-- Attachment #1: Type: text/plain, Size: 604 bytes --] On Thu, Aug 22, 2024 at 10:17:37AM GMT, Jens Axboe wrote: > On 8/22/24 9:10 AM, Christian Brauner wrote: > >> Do we want to add a comment to this effect? I know it's obvious from > >> sharing with f_task_work, but... > > > > I'll add one. > > Sounds good. You can add my: > > Reviewed-by: Jens Axboe <axboe@kernel.dk> > > as well, forgot to mention that in the original reply. I think we can deliver 192 bytes aka 3 cachelines. Afaict we can move struct file_ra_state into the union instead of f_version. See the appended patch I'm testing now. If that works then we're down by 40 bytes this cycle. [-- Attachment #2: 0001-fs-switch-f_iocb_flags-and-f_ra.patch --] [-- Type: text/x-diff, Size: 1652 bytes --] From 51d5327717b370041733af2f3c6ea3cd75d793e2 Mon Sep 17 00:00:00 2001 From: Christian Brauner <brauner@kernel.org> Date: Thu, 22 Aug 2024 16:14:46 +0200 Subject: [PATCH] fs: switch f_iocb_flags and f_ra Now that we shrank struct file by 24 bytes we still have a 4 byte hole. If we move struct file_ra_state into the union and f_iocb_flags out of the union we close that whole and bring down struct file to 192 bytes. Which means struct file is 3 cachelines and we managed to shrink it by 40 bytes this cycle. I've tried to audit all codepaths that use f_ra and none of them seem to rely on it in file->f_op->release() and never have since commit 1da177e4c3f4 ("Linux-2.6.12-rc2"). Signed-off-by: Christian Brauner <brauner@kernel.org> --- include/linux/fs.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 7eb4f706d59f..6c19f87ea615 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -998,9 +998,9 @@ struct file { struct callback_head f_task_work; /* fput() must use workqueue (most kernel threads). */ struct llist_node f_llist; - unsigned int f_iocb_flags; + /* Invalid after last fput(). */ + struct file_ra_state f_ra; }; - /* * Protects f_ep, f_flags. * Must not be taken from IRQ context. @@ -1011,9 +1011,9 @@ struct file { struct mutex f_pos_lock; loff_t f_pos; unsigned int f_flags; + unsigned int f_iocb_flags; struct fown_struct *f_owner; const struct cred *f_cred; - struct file_ra_state f_ra; struct path f_path; struct inode *f_inode; /* cached value */ const struct file_operations *f_op; -- 2.43.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-23 8:16 ` Christian Brauner @ 2024-08-24 9:26 ` Christian Brauner 0 siblings, 0 replies; 11+ messages in thread From: Christian Brauner @ 2024-08-24 9:26 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-fsdevel, Al Viro, Linus Torvalds, Jeff Layton, Jan Kara [-- Attachment #1: Type: text/plain, Size: 814 bytes --] On Fri, Aug 23, 2024 at 10:16:28AM GMT, Christian Brauner wrote: > On Thu, Aug 22, 2024 at 10:17:37AM GMT, Jens Axboe wrote: > > On 8/22/24 9:10 AM, Christian Brauner wrote: > > >> Do we want to add a comment to this effect? I know it's obvious from > > >> sharing with f_task_work, but... > > > > > > I'll add one. > > > > Sounds good. You can add my: > > > > Reviewed-by: Jens Axboe <axboe@kernel.dk> > > > > as well, forgot to mention that in the original reply. > > I think we can deliver 192 bytes aka 3 cachelines. > Afaict we can move struct file_ra_state into the union instead of > f_version. See the appended patch I'm testing now. If that works then > we're down by 40 bytes this cycle. Seems to hold up. I've reorderd things so that no member crosses a cacheline. Patch appended and in vfs.misc. [-- Attachment #2: 0001-fs-pack-struct-file.patch --] [-- Type: text/x-diff, Size: 4195 bytes --] From 88dad26dcadd9e8a47ff0cd85e9aef5a5b1667f7 Mon Sep 17 00:00:00 2001 From: Christian Brauner <brauner@kernel.org> Date: Fri, 23 Aug 2024 21:06:58 +0200 Subject: [PATCH] fs: pack struct file Now that we shrunk struct file to 192 bytes aka 3 cachelines reorder struct file to not leave any holes or have members cross cachelines. Add a short comment to each of the fields and mark the cachelines. It's possible that we may have to tweak this based on profiling in the future. So far I had Jens test this comparing io_uring with non-fixed and fixed files and it improved performance. The layout is a combination of Jens' and my changes. Signed-off-by: Christian Brauner <brauner@kernel.org> --- include/linux/fs.h | 90 +++++++++++++++++++++++++--------------------- 1 file changed, 50 insertions(+), 40 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 6c19f87ea615..ace14421d7dc 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -986,52 +986,62 @@ static inline int ra_has_index(struct file_ra_state *ra, pgoff_t index) index < ra->start + ra->size); } -/* - * f_{lock,count,pos_lock} members can be highly contended and share - * the same cacheline. f_{lock,mode} are very frequently used together - * and so share the same cacheline as well. The read-mostly - * f_{path,inode,op} are kept on a separate cacheline. +/** + * struct file - Represents a file + * @f_lock: Protects f_ep, f_flags. Must not be taken from IRQ context. + * @f_mode: FMODE_* flags often used in hotpaths + * @f_mapping: Contents of a cacheable, mappable object. + * @f_flags: file flags + * @f_iocb_flags: iocb flags + * @private_data: filesystem or driver specific data + * @f_path: path of the file + * @f_inode: cached inode + * @f_count: reference count + * @f_pos_lock: lock protecting file position + * @f_pos: file position + * @f_version: file version + * @f_security: LSM security context of this file + * @f_owner: file owner + * @f_cred: stashed credentials of creator/opener + * @f_wb_err: writeback error + * @f_sb_err: per sb writeback errors + * @f_ep: link of all epoll hooks for this file + * @f_task_work: task work entry point + * @f_llist: work queue entrypoint + * @f_ra: file's readahead state */ struct file { - union { - /* fput() uses task work when closing and freeing file (default). */ - struct callback_head f_task_work; - /* fput() must use workqueue (most kernel threads). */ - struct llist_node f_llist; - /* Invalid after last fput(). */ - struct file_ra_state f_ra; - }; - /* - * Protects f_ep, f_flags. - * Must not be taken from IRQ context. - */ - spinlock_t f_lock; - fmode_t f_mode; - atomic_long_t f_count; - struct mutex f_pos_lock; - loff_t f_pos; - unsigned int f_flags; - unsigned int f_iocb_flags; - struct fown_struct *f_owner; - const struct cred *f_cred; - struct path f_path; - struct inode *f_inode; /* cached value */ + spinlock_t f_lock; + fmode_t f_mode; const struct file_operations *f_op; - - u64 f_version; + struct address_space *f_mapping; + unsigned int f_flags; + unsigned int f_iocb_flags; + void *private_data; + struct path f_path; + struct inode *f_inode; + /* --- cacheline 1 boundary (64 bytes) --- */ + atomic_long_t f_count; + struct mutex f_pos_lock; + loff_t f_pos; + u64 f_version; #ifdef CONFIG_SECURITY - void *f_security; + void *f_security; #endif - /* needed for tty driver, and maybe others */ - void *private_data; - + /* --- cacheline 2 boundary (128 bytes) --- */ + struct fown_struct *f_owner; + const struct cred *f_cred; + errseq_t f_wb_err; + errseq_t f_sb_err; #ifdef CONFIG_EPOLL - /* Used by fs/eventpoll.c to link all the hooks to this file */ - struct hlist_head *f_ep; -#endif /* #ifdef CONFIG_EPOLL */ - struct address_space *f_mapping; - errseq_t f_wb_err; - errseq_t f_sb_err; /* for syncfs */ + struct hlist_head *f_ep; +#endif + union { + struct callback_head f_task_work; + struct llist_node f_llist; + struct file_ra_state f_ra; + }; + /* --- cacheline 2 boundary (192 bytes) --- */ } __randomize_layout __attribute__((aligned(4))); /* lest something weird decides that 2 is OK */ -- 2.43.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-22 14:14 [PATCH] fs: switch f_iocb_flags and f_version Christian Brauner 2024-08-22 14:55 ` Jens Axboe @ 2024-08-22 15:54 ` Jeff Layton 2024-08-23 6:24 ` Christoph Hellwig 2 siblings, 0 replies; 11+ messages in thread From: Jeff Layton @ 2024-08-22 15:54 UTC (permalink / raw) To: Christian Brauner, linux-fsdevel Cc: Jens Axboe, Al Viro, Linus Torvalds, Jan Kara On Thu, 2024-08-22 at 16:14 +0200, Christian Brauner wrote: > Now that we shrank struct file by 24 bytes we still have a 4 byte hole. > Move f_version into the union and f_iocb_flags out of the union to fill > that hole and shrink struct file by another 4 bytes. This brings struct > file to 200 bytes down from 232 bytes. > > I've tried to audit all codepaths that use f_version and none of them > rely on it in file->f_op->release() and never have since commit > 1da177e4c3f4 ("Linux-2.6.12-rc2"). > > Signed-off-by: Christian Brauner <brauner@kernel.org> > --- > struct file { > union { > struct callback_head f_task_work; /* 0 16 */ > struct llist_node f_llist; /* 0 8 */ > u64 f_version; /* 0 8 */ > }; /* 0 16 */ > spinlock_t f_lock; /* 16 4 */ > fmode_t f_mode; /* 20 4 */ > atomic_long_t f_count; /* 24 8 */ > struct mutex f_pos_lock; /* 32 32 */ > /* --- cacheline 1 boundary (64 bytes) --- */ > loff_t f_pos; /* 64 8 */ > unsigned int f_flags; /* 72 4 */ > unsigned int f_iocb_flags; /* 76 4 */ > struct fown_struct * f_owner; /* 80 8 */ > const struct cred * f_cred; /* 88 8 */ > struct file_ra_state f_ra; /* 96 32 */ > /* --- cacheline 2 boundary (128 bytes) --- */ > struct path f_path; /* 128 16 */ > struct inode * f_inode; /* 144 8 */ > const struct file_operations * f_op; /* 152 8 */ > void * f_security; /* 160 8 */ > void * private_data; /* 168 8 */ > struct hlist_head * f_ep; /* 176 8 */ > struct address_space * f_mapping; /* 184 8 */ > /* --- cacheline 3 boundary (192 bytes) --- */ > errseq_t f_wb_err; /* 192 4 */ > errseq_t f_sb_err; /* 196 4 */ > > /* size: 200, cachelines: 4, members: 20 */ > /* last cacheline: 8 bytes */ > }; > --- > include/linux/fs.h | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 7eb4f706d59f..7a2994405e8e 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -998,9 +998,8 @@ struct file { > struct callback_head f_task_work; > /* fput() must use workqueue (most kernel threads). */ > struct llist_node f_llist; > - unsigned int f_iocb_flags; > + u64 f_version; > }; > - > /* > * Protects f_ep, f_flags. > * Must not be taken from IRQ context. > @@ -1011,6 +1010,7 @@ struct file { > struct mutex f_pos_lock; > loff_t f_pos; > unsigned int f_flags; > + unsigned int f_iocb_flags; > struct fown_struct *f_owner; > const struct cred *f_cred; > struct file_ra_state f_ra; > @@ -1018,7 +1018,6 @@ struct file { > struct inode *f_inode; /* cached value */ > const struct file_operations *f_op; > > - u64 f_version; > #ifdef CONFIG_SECURITY > void *f_security; > #endif Reviewed-by: Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-22 14:14 [PATCH] fs: switch f_iocb_flags and f_version Christian Brauner 2024-08-22 14:55 ` Jens Axboe 2024-08-22 15:54 ` Jeff Layton @ 2024-08-23 6:24 ` Christoph Hellwig 2024-08-23 6:34 ` Al Viro 2 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2024-08-23 6:24 UTC (permalink / raw) To: Christian Brauner Cc: linux-fsdevel, Jens Axboe, Al Viro, Linus Torvalds, Jeff Layton, Jan Kara Given that f_version only has about a dozen users maybe just kill it and make them use the private data instead? Many of the users looks pretty bogus to start with as they either only set or only compare the value as well. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-23 6:24 ` Christoph Hellwig @ 2024-08-23 6:34 ` Al Viro 2024-08-23 6:52 ` Christoph Hellwig 0 siblings, 1 reply; 11+ messages in thread From: Al Viro @ 2024-08-23 6:34 UTC (permalink / raw) To: Christoph Hellwig Cc: Christian Brauner, linux-fsdevel, Jens Axboe, Linus Torvalds, Jeff Layton, Jan Kara On Thu, Aug 22, 2024 at 11:24:43PM -0700, Christoph Hellwig wrote: > Given that f_version only has about a dozen users maybe just kill > it and make them use the private data instead? Many of the users > looks pretty bogus to start with as they either only set or only > compare the value as well. Take a look at the uses in fs/pipe.c ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-23 6:34 ` Al Viro @ 2024-08-23 6:52 ` Christoph Hellwig 2024-08-23 6:59 ` Al Viro 0 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2024-08-23 6:52 UTC (permalink / raw) To: Al Viro Cc: Christoph Hellwig, Christian Brauner, linux-fsdevel, Jens Axboe, Linus Torvalds, Jeff Layton, Jan Kara On Fri, Aug 23, 2024 at 07:34:11AM +0100, Al Viro wrote: > On Thu, Aug 22, 2024 at 11:24:43PM -0700, Christoph Hellwig wrote: > > Given that f_version only has about a dozen users maybe just kill > > it and make them use the private data instead? Many of the users > > looks pretty bogus to start with as they either only set or only > > compare the value as well. > > Take a look at the uses in fs/pipe.c I did not say "all", but "many" (and maybe should have said "a few"). ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] fs: switch f_iocb_flags and f_version 2024-08-23 6:52 ` Christoph Hellwig @ 2024-08-23 6:59 ` Al Viro 0 siblings, 0 replies; 11+ messages in thread From: Al Viro @ 2024-08-23 6:59 UTC (permalink / raw) To: Christoph Hellwig Cc: Christian Brauner, linux-fsdevel, Jens Axboe, Linus Torvalds, Jeff Layton, Jan Kara On Thu, Aug 22, 2024 at 11:52:41PM -0700, Christoph Hellwig wrote: > On Fri, Aug 23, 2024 at 07:34:11AM +0100, Al Viro wrote: > > On Thu, Aug 22, 2024 at 11:24:43PM -0700, Christoph Hellwig wrote: > > > Given that f_version only has about a dozen users maybe just kill > > > it and make them use the private data instead? Many of the users > > > looks pretty bogus to start with as they either only set or only > > > compare the value as well. > > > > Take a look at the uses in fs/pipe.c > > I did not say "all", but "many" (and maybe should have said "a few"). Not the point... From my (admittedly cursory) reading of that code, we might want different instances of struct file that share the same pipe_inode_info (pointed to by ->private_data) to have different values in ->f_version. If that is true, there's no hope of pushing it into ->private_data - we definitely do not what an extra level of indirection for getting to pipe_inode_info for those.. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-08-24 9:26 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-22 14:14 [PATCH] fs: switch f_iocb_flags and f_version Christian Brauner 2024-08-22 14:55 ` Jens Axboe 2024-08-22 15:10 ` Christian Brauner 2024-08-22 16:17 ` Jens Axboe 2024-08-23 8:16 ` Christian Brauner 2024-08-24 9:26 ` Christian Brauner 2024-08-22 15:54 ` Jeff Layton 2024-08-23 6:24 ` Christoph Hellwig 2024-08-23 6:34 ` Al Viro 2024-08-23 6:52 ` Christoph Hellwig 2024-08-23 6:59 ` Al Viro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).