* [Cluster-devel] [GFS2 PATCH] gfs2: Remove inode from ordered write list in gfs2_write_inode()
@ 2018-01-29 5:02 Abhi Das
2018-01-29 9:46 ` Steven Whitehouse
2018-01-30 17:23 ` Bob Peterson
0 siblings, 2 replies; 4+ messages in thread
From: Abhi Das @ 2018-01-29 5:02 UTC (permalink / raw)
To: cluster-devel.redhat.com
The VFS clears the I_DIRTY inode flag before calling gfs2_write_inode(),
having queued any data that needed to be written to disk.
This is a good time to remove such inodes from our ordered write list
so they don't hang around for long periods of time.
Signed-off-by: Abhi Das <adas@redhat.com>
---
fs/gfs2/super.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index d81d46e..596feb6 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -766,6 +766,12 @@ static int gfs2_write_inode(struct inode *inode, struct writeback_control *wbc)
ret = filemap_fdatawait(metamapping);
if (ret)
mark_inode_dirty_sync(inode);
+ else {
+ spin_lock(&inode->i_lock);
+ if (!(inode->i_flags & I_DIRTY))
+ gfs2_ordered_del_inode(ip);
+ spin_unlock(&inode->i_lock);
+ }
return ret;
}
--
2.4.11
* [Cluster-devel] [GFS2 PATCH] gfs2: Remove inode from ordered write list in gfs2_write_inode()
From: Steven Whitehouse @ 2018-01-29 9:46 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Looks good. Do you have any figures for how big a reduction in inodes on
the ordered list this gives?
Steve.
On 29/01/18 05:02, Abhi Das wrote:
> The VFS clears the I_DIRTY inode flag before calling gfs2_write_inode(),
> having queued any data that needed to be written to disk.
> This is a good time to remove such inodes from our ordered write list
> so they don't hang around for long periods of time.
>
> Signed-off-by: Abhi Das <adas@redhat.com>
> ---
> fs/gfs2/super.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index d81d46e..596feb6 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -766,6 +766,12 @@ static int gfs2_write_inode(struct inode *inode, struct writeback_control *wbc)
> ret = filemap_fdatawait(metamapping);
> if (ret)
> mark_inode_dirty_sync(inode);
> + else {
> + spin_lock(&inode->i_lock);
> + if (!(inode->i_flags & I_DIRTY))
> + gfs2_ordered_del_inode(ip);
> + spin_unlock(&inode->i_lock);
> + }
> return ret;
> }
>
* [Cluster-devel] [GFS2 PATCH] gfs2: Remove inode from ordered write list in gfs2_write_inode()
From: Abhijith Das @ 2018-01-29 16:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
----- Original Message -----
> From: "Steven Whitehouse" <swhiteho@redhat.com>
> To: "Abhi Das" <adas@redhat.com>, cluster-devel at redhat.com
> Sent: Monday, January 29, 2018 3:46:39 AM
> Subject: Re: [Cluster-devel] [GFS2 PATCH] gfs2: Remove inode from ordered write list in gfs2_write_inode()
>
> Hi,
>
> Looks good. Do you have any figures for how big a reduction in inodes on
> the ordered list this gives?
>
Here are some numbers from my million-file creation test. It creates a
directory tree with 10000 directories at the deepest level, each of them
containing 100 files of 8K.
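For readers who want to reproduce a similar load, here is a minimal sketch of such a workload generator. It is not the test program used for the numbers in this mail: the function name create_files(), the flat layout of leaf directories directly under one root, and the d%05d/f%03d naming are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create ndirs leaf directories under root, each holding nfiles files of
 * fsize bytes. Returns the number of files created, or -1 on error.
 * Hypothetical helper: layout and names are assumptions, not the real test.
 */
static long create_files(const char *root, int ndirs, int nfiles, size_t fsize)
{
	char path[4096];
	char block[8192];
	long created = 0;

	memset(block, 'x', sizeof(block));
	for (int d = 0; d < ndirs; d++) {
		snprintf(path, sizeof(path), "%s/d%05d", root, d);
		if (mkdir(path, 0755) != 0)
			return -1;
		for (int f = 0; f < nfiles; f++) {
			size_t left = fsize;
			int fd;

			snprintf(path, sizeof(path), "%s/d%05d/f%03d", root, d, f);
			fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
			if (fd < 0)
				return -1;
			/* Write the file in chunks of at most sizeof(block). */
			while (left > 0) {
				size_t chunk = left < sizeof(block) ? left : sizeof(block);
				ssize_t n = write(fd, block, chunk);

				if (n <= 0) {
					close(fd);
					return -1;
				}
				left -= (size_t)n;
			}
			if (close(fd) != 0)
				return -1;
			created++;
		}
	}
	return created;
}
```

Calling create_files(mountpoint, 10000, 100, 8192) on an empty directory would give the one million 8K files described above, under the flat-layout assumption.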
I added some instrumentation to periodically print out the size of the ordered
write list, along with a count for each place from which inodes are added to
or removed from it.
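The instrumentation itself is not part of this thread. As a rough user-space sketch of what such counters could look like: struct ord_stats and both helpers below are hypothetical, with field names taken from the labels in the output, and it assumes the list size equals additions minus all removals (which is consistent with every line shown in this mail).

```c
#include <stdio.h>

/* Hypothetical counters; field names mirror the labels in the log output. */
struct ord_stats {
	unsigned long add_inode;   /* inodes added to the ordered list */
	unsigned long trunc;       /* removals, one counter per call site */
	unsigned long evict;
	unsigned long wait;
	unsigned long write_inode;
	unsigned long ord_write;
	unsigned long setflags;
};

/* List size implied by the counters: additions minus all removals. */
static unsigned long ord_list_size(const struct ord_stats *s)
{
	return s->add_inode - (s->trunc + s->evict + s->wait +
			       s->write_inode + s->ord_write + s->setflags);
}

/* Format one report line in the same layout as the output in this mail. */
static int ord_stats_format(const struct ord_stats *s, char *buf, size_t len)
{
	return snprintf(buf, len,
			"Ord list Size:%lu +[add_inode:%lu] "
			"-[trunc:%lu evict:%lu wait:%lu write_inode:%lu "
			"ord_write:%lu setflags:%lu]",
			ord_list_size(s), s->add_inode, s->trunc, s->evict,
			s->wait, s->write_inode, s->ord_write, s->setflags);
}
```

For example, with add_inode at 710262 and ord_write at 4711, ord_list_size() gives 705551.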
This is the output without the patch:
[820545.521293] Ord list Size:23904 +[add_inode:23904] -[trunc:0 evict:0 wait:0 write_inode:0 ord_write:0 setflags:0]
[820575.535980] Ord list Size:73670 +[add_inode:73670] -[trunc:0 evict:0 wait:0 write_inode:0 ord_write:0 setflags:0]
...
[820966.796087] Ord list Size:667066 +[add_inode:667066] -[trunc:0 evict:0 wait:0 write_inode:0 ord_write:0 setflags:0]
[820996.811789] Ord list Size:705551 +[add_inode:710262] -[trunc:0 evict:0 wait:0 write_inode:0 ord_write:4711 setflags:0]
[821026.825525] Ord list Size:676518 +[add_inode:751608] -[trunc:0 evict:36 wait:0 write_inode:0 ord_write:75054 setflags:0]
...
[821146.927283] Ord list Size:587353 +[add_inode:922395] -[trunc:0 evict:376 wait:0 write_inode:0 ord_write:334666 setflags:0]
[821177.204024] Ord list Size:573822 +[add_inode:965143] -[trunc:0 evict:379 wait:0 write_inode:0 ord_write:390942 setflags:0]
[821207.402720] Ord list Size:569946 +[add_inode:1000000] -[trunc:0 evict:10143 wait:0 write_inode:0 ord_write:419911 setflags:0]
>> Test ended here when we hit a million files - took 662 seconds
[821237.416418] Ord list Size:564908 +[add_inode:1000000] -[trunc:0 evict:10143 wait:0 write_inode:0 ord_write:424949 setflags:0]
[821267.429116] Ord list Size:564908 +[add_inode:1000000] -[trunc:0 evict:10143 wait:0 write_inode:0 ord_write:424949 setflags:0]
...
[821387.475956] Ord list Size:564908 +[add_inode:1000000] -[trunc:0 evict:10143 wait:0 write_inode:0 ord_write:424949 setflags:0]
>> I did an unmount here
[821398.943003] Ord list Size:0 +[add_inode:1000000] -[trunc:0 evict:575051 wait:0 write_inode:0 ord_write:424949 setflags:0]
As you can see, the size of the ordered list climbs steadily as we create more
files. A portion of it gets pruned in gfs2_ordered_write(), where we remove
inodes whose mappings have zero pages (the result of my patch from a few weeks
ago). A large part of the list is still around well after the test completes
and doesn't get cleared out until unmount.
With the latest patch:
[819186.894664] Ord list Size:2586 +[add_inode:2586] -[trunc:0 evict:0 wait:0 write_inode:0 ord_write:0 setflags:0]
[819217.871372] Ord list Size:46996 +[add_inode:48240] -[trunc:0 evict:0 wait:0 write_inode:1244 ord_write:0 setflags:0]
[819247.891066] Ord list Size:52424 +[add_inode:97568] -[trunc:0 evict:0 wait:0 write_inode:45144 ord_write:0 setflags:0]
[819278.379780] Ord list Size:48550 +[add_inode:144557] -[trunc:0 evict:0 wait:0 write_inode:96007 ord_write:0 setflags:0]
...
[819760.628004] Ord list Size:49686 +[add_inode:937372] -[trunc:0 evict:0 wait:0 write_inode:887686 ord_write:0 setflags:0]
[819791.206728] Ord list Size:48754 +[add_inode:986127] -[trunc:0 evict:0 wait:0 write_inode:937373 ord_write:0 setflags:0]
[819821.223436] Ord list Size:14212 +[add_inode:1000000] -[trunc:0 evict:0 wait:0 write_inode:985788 ord_write:0 setflags:0]
>> test ends here - took 635s
[819851.236137] Ord list Size:0 +[add_inode:1000000] -[trunc:0 evict:0 wait:0 write_inode:1000000 ord_write:0 setflags:0]
[819881.247826] Ord list Size:0 +[add_inode:1000000] -[trunc:0 evict:0 wait:0 write_inode:1000000 ord_write:0 setflags:0]
With non-dirty inodes removed from the ordered list in gfs2_write_inode(),
the list no longer gets out of hand: only about 50k inodes are on it at any
point. Also, once the test completes, the entire list is cleared out.
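The decision the patch adds at the end of gfs2_write_inode() can be modelled in user space as below. This is only a sketch: struct model_inode and model_write_inode_tail() are hypothetical, the dirty field stands in for the patch's I_DIRTY test, and the i_lock locking from the real code is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the inode state the patch looks at. */
struct model_inode {
	bool dirty;       /* stands in for the I_DIRTY test in the patch */
	bool on_ordered;  /* whether the inode is on the ordered write list */
};

/*
 * Model of the tail of gfs2_write_inode() with the patch applied.
 * wait_ret plays the role of the filemap_fdatawait() return value.
 */
static int model_write_inode_tail(struct model_inode *ino, int wait_ret)
{
	if (wait_ret)
		ino->dirty = true;        /* mirrors mark_inode_dirty_sync() */
	else if (!ino->dirty)
		ino->on_ordered = false;  /* mirrors gfs2_ordered_del_inode() */
	return wait_ret;
}
```

An inode that is still dirty, or whose wait failed, stays on the list and is handled by a later pass.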
In my testing I haven't encountered any issues with it, but I'd be more
comfortable if somebody could add this patch to their test kernels (or I can
provide one).
Cheers!
--Abhi
> Steve.
>
>
> On 29/01/18 05:02, Abhi Das wrote:
> > The VFS clears the I_DIRTY inode flag before calling gfs2_write_inode(),
> > having queued any data that needed to be written to disk.
> > This is a good time to remove such inodes from our ordered write list
> > so they don't hang around for long periods of time.
> >
> > Signed-off-by: Abhi Das <adas@redhat.com>
> > ---
> > fs/gfs2/super.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> > index d81d46e..596feb6 100644
> > --- a/fs/gfs2/super.c
> > +++ b/fs/gfs2/super.c
> > @@ -766,6 +766,12 @@ static int gfs2_write_inode(struct inode *inode,
> > struct writeback_control *wbc)
> > ret = filemap_fdatawait(metamapping);
> > if (ret)
> > mark_inode_dirty_sync(inode);
> > + else {
> > + spin_lock(&inode->i_lock);
> > + if (!(inode->i_flags & I_DIRTY))
> > + gfs2_ordered_del_inode(ip);
> > + spin_unlock(&inode->i_lock);
> > + }
> > return ret;
> > }
> >
>
>
* [Cluster-devel] [GFS2 PATCH] gfs2: Remove inode from ordered write list in gfs2_write_inode()
From: Bob Peterson @ 2018-01-30 17:23 UTC (permalink / raw)
To: cluster-devel.redhat.com
----- Original Message -----
| The VFS clears the I_DIRTY inode flag before calling gfs2_write_inode(),
| having queued any data that needed to be written to disk.
| This is a good time to remove such inodes from our ordered write list
| so they don't hang around for long periods of time.
|
| Signed-off-by: Abhi Das <adas@redhat.com>
| ---
Hi,
Thanks. This is now pushed to the for-next branch of the linux-gfs2 tree:
https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs/gfs2?h=for-next&id=957a7acd46e64c52d2a1d59cd7273ed49455afb6
Regards,
Bob Peterson
Red Hat File Systems