linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] f2fs: avoid issuing small bios due to several dirty node pages
@ 2013-01-18  6:29 Jaegeuk Kim
  2013-01-18  7:58 ` Namjae Jeon
  0 siblings, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2013-01-18  6:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

If some small bios of dirty node pages are supposed to be issued during the
sequential data writes, there-in well-produced consecutive data bios are able
to be split by the small node bios, resulting in performance degradation.
So, let's collect a number of dirty node pages until reaching a threshold.
And, by default, I set the threshold as 2MB, a segment size.

This improves sequential write performance on i5, 512GB SSD (830 w/ SATA2) as
follows.
Before: 231 MB/s -> After: 255 MB/s

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
---
 fs/f2fs/node.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index f177c01..9bda63c 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1124,6 +1124,12 @@ static int f2fs_write_node_page(struct page *page,
 	return 0;
 }
 
+/*
+ * It is very important to gather dirty pages and write at once, so that we can
+ * submit a big bio without interfering other data writes.
+ * Be default, 512 pages (2MB), a segment size, is quite reasonable.
+ */
+#define COLLECT_DIRTY_NODES	512
 static int f2fs_write_node_pages(struct address_space *mapping,
 			    struct writeback_control *wbc)
 {
@@ -1131,17 +1137,16 @@ static int f2fs_write_node_pages(struct address_space *mapping,
 	struct block_device *bdev = sbi->sb->s_bdev;
 	long nr_to_write = wbc->nr_to_write;
 
-	if (wbc->for_kupdate)
-		return 0;
-
-	if (get_pages(sbi, F2FS_DIRTY_NODES) == 0)
-		return 0;
-
+	/* First check balancing cached NAT entries */
 	if (try_to_free_nats(sbi, NAT_ENTRY_PER_BLOCK)) {
 		write_checkpoint(sbi, false, false);
 		return 0;
 	}
 
+	/* collect a number of dirty node pages and write together */
+	if (get_pages(sbi, F2FS_DIRTY_NODES) < COLLECT_DIRTY_NODES)
+		return 0;
+
 	/* if mounting is failed, skip writing node pages */
 	wbc->nr_to_write = bio_get_nr_vecs(bdev);
 	sync_node_pages(sbi, 0, wbc);
-- 
1.8.0.1.250.gb7973fb


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] f2fs: avoid issuing small bios due to several dirty node pages
  2013-01-18  6:29 [PATCH] f2fs: avoid issuing small bios due to several dirty node pages Jaegeuk Kim
@ 2013-01-18  7:58 ` Namjae Jeon
  2013-01-18  8:25   ` Jaegeuk Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Namjae Jeon @ 2013-01-18  7:58 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-fsdevel, linux-f2fs-devel

2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
> If some small bios of dirty node pages are supposed to be issued during the
> sequential data writes, there-in well-produced consecutive data bios are
> able
> to be split by the small node bios, resulting in performance degradation.
> So, let's collect a number of dirty node pages until reaching a threshold.
> And, by default, I set the threshold as 2MB, a segment size.
>
> This improves sequential write performance on i5, 512GB SSD (830 w/ SATA2)
> as
> follows.
> Before: 231 MB/s -> After: 255 MB/s
>
> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
> ---
>  fs/f2fs/node.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index f177c01..9bda63c 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1124,6 +1124,12 @@ static int f2fs_write_node_page(struct page *page,
>  	return 0;
>  }
>
> +/*
> + * It is very important to gather dirty pages and write at once, so that we
> can
> + * submit a big bio without interfering other data writes.
> + * Be default, 512 pages (2MB), a segment size, is quite reasonable.
> + */
> +#define COLLECT_DIRTY_NODES	512
Hi Jaeguek.
It is just my opinion.
One of f2fs advantages is that user can set segment size to fit own device.
For future, Is it not good to use segment size set by user(from
superblock) instead of fixed size ?

Thanks.

>  static int f2fs_write_node_pages(struct address_space *mapping,
>  			    struct writeback_control *wbc)
>  {
> @@ -1131,17 +1137,16 @@ static int f2fs_write_node_pages(struct
> address_space *mapping,
>  	struct block_device *bdev = sbi->sb->s_bdev;
>  	long nr_to_write = wbc->nr_to_write;
>
> -	if (wbc->for_kupdate)
> -		return 0;
> -
> -	if (get_pages(sbi, F2FS_DIRTY_NODES) == 0)
> -		return 0;
> -
> +	/* First check balancing cached NAT entries */
>  	if (try_to_free_nats(sbi, NAT_ENTRY_PER_BLOCK)) {
>  		write_checkpoint(sbi, false, false);
>  		return 0;
>  	}
>
> +	/* collect a number of dirty node pages and write together */
> +	if (get_pages(sbi, F2FS_DIRTY_NODES) < COLLECT_DIRTY_NODES)
> +		return 0;
> +
>  	/* if mounting is failed, skip writing node pages */
>  	wbc->nr_to_write = bio_get_nr_vecs(bdev);
>  	sync_node_pages(sbi, 0, wbc);
> --
> 1.8.0.1.250.gb7973fb
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] f2fs: avoid issuing small bios due to several dirty node pages
  2013-01-18  7:58 ` Namjae Jeon
@ 2013-01-18  8:25   ` Jaegeuk Kim
  2013-01-18  8:57     ` Namjae Jeon
  0 siblings, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2013-01-18  8:25 UTC (permalink / raw)
  To: Namjae Jeon; +Cc: linux-fsdevel, linux-f2fs-devel

[-- Attachment #1: Type: text/plain, Size: 3085 bytes --]

2013-01-18 (금), 16:58 +0900, Namjae Jeon:
> 2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
> > If some small bios of dirty node pages are supposed to be issued during the
> > sequential data writes, there-in well-produced consecutive data bios are
> > able
> > to be split by the small node bios, resulting in performance degradation.
> > So, let's collect a number of dirty node pages until reaching a threshold.
> > And, by default, I set the threshold as 2MB, a segment size.
> >
> > This improves sequential write performance on i5, 512GB SSD (830 w/ SATA2)
> > as
> > follows.
> > Before: 231 MB/s -> After: 255 MB/s
> >
> > Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
> > ---
> >  fs/f2fs/node.c | 17 +++++++++++------
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index f177c01..9bda63c 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1124,6 +1124,12 @@ static int f2fs_write_node_page(struct page *page,
> >  	return 0;
> >  }
> >
> > +/*
> > + * It is very important to gather dirty pages and write at once, so that we
> > can
> > + * submit a big bio without interfering other data writes.
> > + * Be default, 512 pages (2MB), a segment size, is quite reasonable.
> > + */
> > +#define COLLECT_DIRTY_NODES	512
> Hi Jaeguek.
> It is just my opinion.
> One of f2fs advantages is that user can set segment size to fit own device.
> For future, Is it not good to use segment size set by user(from
> superblock) instead of fixed size ?

At this moment, it's very hard to change the segment size due to many
hardcoded data structures such as SSA, bitmaps, etc.
Instead, I proposed something like section and zone which are based on
segment.
But, somebody will be able to do later. :)

> 
> Thanks.
> 
> >  static int f2fs_write_node_pages(struct address_space *mapping,
> >  			    struct writeback_control *wbc)
> >  {
> > @@ -1131,17 +1137,16 @@ static int f2fs_write_node_pages(struct
> > address_space *mapping,
> >  	struct block_device *bdev = sbi->sb->s_bdev;
> >  	long nr_to_write = wbc->nr_to_write;
> >
> > -	if (wbc->for_kupdate)
> > -		return 0;
> > -
> > -	if (get_pages(sbi, F2FS_DIRTY_NODES) == 0)
> > -		return 0;
> > -
> > +	/* First check balancing cached NAT entries */
> >  	if (try_to_free_nats(sbi, NAT_ENTRY_PER_BLOCK)) {
> >  		write_checkpoint(sbi, false, false);
> >  		return 0;
> >  	}
> >
> > +	/* collect a number of dirty node pages and write together */
> > +	if (get_pages(sbi, F2FS_DIRTY_NODES) < COLLECT_DIRTY_NODES)
> > +		return 0;
> > +
> >  	/* if mounting is failed, skip writing node pages */
> >  	wbc->nr_to_write = bio_get_nr_vecs(bdev);
> >  	sync_node_pages(sbi, 0, wbc);
> > --
> > 1.8.0.1.250.gb7973fb
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >

-- 
Jaegeuk Kim
Samsung

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] f2fs: avoid issuing small bios due to several dirty node pages
  2013-01-18  8:25   ` Jaegeuk Kim
@ 2013-01-18  8:57     ` Namjae Jeon
  2013-01-18  9:28       ` Jaegeuk Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Namjae Jeon @ 2013-01-18  8:57 UTC (permalink / raw)
  To: jaegeuk.kim; +Cc: linux-fsdevel, linux-f2fs-devel

2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
> 2013-01-18 (금), 16:58 +0900, Namjae Jeon:
>> 2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
>> > If some small bios of dirty node pages are supposed to be issued during
>> > the
>> > sequential data writes, there-in well-produced consecutive data bios
>> > are
>> > able
>> > to be split by the small node bios, resulting in performance
>> > degradation.
>> > So, let's collect a number of dirty node pages until reaching a
>> > threshold.
>> > And, by default, I set the threshold as 2MB, a segment size.
>> >
>> > This improves sequential write performance on i5, 512GB SSD (830 w/
>> > SATA2)
>> > as
>> > follows.
>> > Before: 231 MB/s -> After: 255 MB/s
>> >
>> > Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
>> > ---
>> >  fs/f2fs/node.c | 17 +++++++++++------
>> >  1 file changed, 11 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>> > index f177c01..9bda63c 100644
>> > --- a/fs/f2fs/node.c
>> > +++ b/fs/f2fs/node.c
>> > @@ -1124,6 +1124,12 @@ static int f2fs_write_node_page(struct page
>> > *page,
>> >  	return 0;
>> >  }
>> >
>> > +/*
>> > + * It is very important to gather dirty pages and write at once, so
>> > that we
>> > can
>> > + * submit a big bio without interfering other data writes.
>> > + * Be default, 512 pages (2MB), a segment size, is quite reasonable.
>> > + */
>> > +#define COLLECT_DIRTY_NODES	512
>> Hi Jaeguek.
>> It is just my opinion.
>> One of f2fs advantages is that user can set segment size to fit own
>> device.
>> For future, Is it not good to use segment size set by user(from
>> superblock) instead of fixed size ?
>
> At this moment, it's very hard to change the segment size due to many
> hardcoded data structures such as SSA, bitmaps, etc.
> Instead, I proposed something like section and zone which are based on
> segment.
> But, somebody will be able to do later. :)
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Okay, I think It is better if you add TODO comment about this to be
fixed by the other.

Talk Incidentally, let me ask you other thing.
get_meta_page never return NULL about allocate page and readpage.
Is there any reason it try to infinitely retry about two case ?

Thanks for reply!
>

>>
>> Thanks.
>>
>> >  static int f2fs_write_node_pages(struct address_space *mapping,
>> >  			    struct writeback_control *wbc)
>> >  {
>> > @@ -1131,17 +1137,16 @@ static int f2fs_write_node_pages(struct
>> > address_space *mapping,
>> >  	struct block_device *bdev = sbi->sb->s_bdev;
>> >  	long nr_to_write = wbc->nr_to_write;
>> >
>> > -	if (wbc->for_kupdate)
>> > -		return 0;
>> > -
>> > -	if (get_pages(sbi, F2FS_DIRTY_NODES) == 0)
>> > -		return 0;
>> > -
>> > +	/* First check balancing cached NAT entries */
>> >  	if (try_to_free_nats(sbi, NAT_ENTRY_PER_BLOCK)) {
>> >  		write_checkpoint(sbi, false, false);
>> >  		return 0;
>> >  	}
>> >
>> > +	/* collect a number of dirty node pages and write together */
>> > +	if (get_pages(sbi, F2FS_DIRTY_NODES) < COLLECT_DIRTY_NODES)
>> > +		return 0;
>> > +
>> >  	/* if mounting is failed, skip writing node pages */
>> >  	wbc->nr_to_write = bio_get_nr_vecs(bdev);
>> >  	sync_node_pages(sbi, 0, wbc);
>> > --
>> > 1.8.0.1.250.gb7973fb
>> >
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel"
>> > in
>> > the body of a message to majordomo@vger.kernel.org
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >
>
> --
> Jaegeuk Kim
> Samsung
>
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] f2fs: avoid issuing small bios due to several dirty node pages
  2013-01-18  8:57     ` Namjae Jeon
@ 2013-01-18  9:28       ` Jaegeuk Kim
  2013-01-18  9:37         ` Namjae Jeon
  0 siblings, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2013-01-18  9:28 UTC (permalink / raw)
  To: Namjae Jeon; +Cc: linux-fsdevel, linux-f2fs-devel

[-- Attachment #1: Type: text/plain, Size: 2723 bytes --]

2013-01-18 (금), 17:57 +0900, Namjae Jeon:
> 2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
> > 2013-01-18 (금), 16:58 +0900, Namjae Jeon:
> >> 2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
> >> > If some small bios of dirty node pages are supposed to be issued during
> >> > the
> >> > sequential data writes, there-in well-produced consecutive data bios
> >> > are
> >> > able
> >> > to be split by the small node bios, resulting in performance
> >> > degradation.
> >> > So, let's collect a number of dirty node pages until reaching a
> >> > threshold.
> >> > And, by default, I set the threshold as 2MB, a segment size.
> >> >
> >> > This improves sequential write performance on i5, 512GB SSD (830 w/
> >> > SATA2)
> >> > as
> >> > follows.
> >> > Before: 231 MB/s -> After: 255 MB/s
> >> >
> >> > Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
> >> > ---
> >> >  fs/f2fs/node.c | 17 +++++++++++------
> >> >  1 file changed, 11 insertions(+), 6 deletions(-)
> >> >
> >> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> >> > index f177c01..9bda63c 100644
> >> > --- a/fs/f2fs/node.c
> >> > +++ b/fs/f2fs/node.c
> >> > @@ -1124,6 +1124,12 @@ static int f2fs_write_node_page(struct page
> >> > *page,
> >> >  	return 0;
> >> >  }
> >> >
> >> > +/*
> >> > + * It is very important to gather dirty pages and write at once, so
> >> > that we
> >> > can
> >> > + * submit a big bio without interfering other data writes.
> >> > + * Be default, 512 pages (2MB), a segment size, is quite reasonable.
> >> > + */
> >> > +#define COLLECT_DIRTY_NODES	512
> >> Hi Jaeguek.
> >> It is just my opinion.
> >> One of f2fs advantages is that user can set segment size to fit own
> >> device.
> >> For future, Is it not good to use segment size set by user(from
> >> superblock) instead of fixed size ?
> >
> > At this moment, it's very hard to change the segment size due to many
> > hardcoded data structures such as SSA, bitmaps, etc.
> > Instead, I proposed something like section and zone which are based on
> > segment.
> > But, somebody will be able to do later. :)
> Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
> Okay, I think It is better if you add TODO comment about this to be
> fixed by the other.
> 
> Talk Incidentally, let me ask you other thing.
> get_meta_page never return NULL about allocate page and readpage.
> Is there any reason it try to infinitely retry about two case ?

That is just to handle errors more easily especially for file system
metadata operations.
Definitely, we need to consider all the error handling codes in more
detail by using the fault-injection facility later.
Thanks,

-- 
Jaegeuk Kim
Samsung

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] f2fs: avoid issuing small bios due to several dirty node pages
  2013-01-18  9:28       ` Jaegeuk Kim
@ 2013-01-18  9:37         ` Namjae Jeon
  0 siblings, 0 replies; 6+ messages in thread
From: Namjae Jeon @ 2013-01-18  9:37 UTC (permalink / raw)
  To: jaegeuk.kim; +Cc: linux-fsdevel, linux-f2fs-devel

2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
> 2013-01-18 (금), 17:57 +0900, Namjae Jeon:
>> 2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
>> > 2013-01-18 (금), 16:58 +0900, Namjae Jeon:
>> >> 2013/1/18, Jaegeuk Kim <jaegeuk.kim@samsung.com>:
>> >> > If some small bios of dirty node pages are supposed to be issued
>> >> > during
>> >> > the
>> >> > sequential data writes, there-in well-produced consecutive data bios
>> >> > are
>> >> > able
>> >> > to be split by the small node bios, resulting in performance
>> >> > degradation.
>> >> > So, let's collect a number of dirty node pages until reaching a
>> >> > threshold.
>> >> > And, by default, I set the threshold as 2MB, a segment size.
>> >> >
>> >> > This improves sequential write performance on i5, 512GB SSD (830 w/
>> >> > SATA2)
>> >> > as
>> >> > follows.
>> >> > Before: 231 MB/s -> After: 255 MB/s
>> >> >
>> >> > Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
>> >> > ---
>> >> >  fs/f2fs/node.c | 17 +++++++++++------
>> >> >  1 file changed, 11 insertions(+), 6 deletions(-)
>> >> >
>> >> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>> >> > index f177c01..9bda63c 100644
>> >> > --- a/fs/f2fs/node.c
>> >> > +++ b/fs/f2fs/node.c
>> >> > @@ -1124,6 +1124,12 @@ static int f2fs_write_node_page(struct page
>> >> > *page,
>> >> >  	return 0;
>> >> >  }
>> >> >
>> >> > +/*
>> >> > + * It is very important to gather dirty pages and write at once, so
>> >> > that we
>> >> > can
>> >> > + * submit a big bio without interfering other data writes.
>> >> > + * Be default, 512 pages (2MB), a segment size, is quite
>> >> > reasonable.
>> >> > + */
>> >> > +#define COLLECT_DIRTY_NODES	512
>> >> Hi Jaeguek.
>> >> It is just my opinion.
>> >> One of f2fs advantages is that user can set segment size to fit own
>> >> device.
>> >> For future, Is it not good to use segment size set by user(from
>> >> superblock) instead of fixed size ?
>> >
>> > At this moment, it's very hard to change the segment size due to many
>> > hardcoded data structures such as SSA, bitmaps, etc.
>> > Instead, I proposed something like section and zone which are based on
>> > segment.
>> > But, somebody will be able to do later. :)
>> Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
>> Okay, I think It is better if you add TODO comment about this to be
>> fixed by the other.
>>
>> Talk Incidentally, let me ask you other thing.
>> get_meta_page never return NULL about allocate page and readpage.
>> Is there any reason it try to infinitely retry about two case ?
>
> That is just to handle errors more easily especially for file system
> metadata operations.
> Definitely, we need to consider all the error handling codes in more
> detail by using the fault-injection facility later.
> Thanks,
Okay, I see.
Thanks!
>
> --
> Jaegeuk Kim
> Samsung
>
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-01-18  9:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-18  6:29 [PATCH] f2fs: avoid issuing small bios due to several dirty node pages Jaegeuk Kim
2013-01-18  7:58 ` Namjae Jeon
2013-01-18  8:25   ` Jaegeuk Kim
2013-01-18  8:57     ` Namjae Jeon
2013-01-18  9:28       ` Jaegeuk Kim
2013-01-18  9:37         ` Namjae Jeon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).