linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jaegeuk Kim <jaegeuk.kim@samsung.com>
To: Jin Xu <jinuxstyle@gmail.com>
Cc: linux-f2fs-devel@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] f2fs: optimize gc for better performance
Date: Wed, 04 Sep 2013 18:40:35 +0900	[thread overview]
Message-ID: <1378287635.2354.84.camel@kjgkr> (raw)
In-Reply-To: <522677ED.8090203@gmail.com>

Hi Jin,

2013-09-04 (수), 07:59 +0800, Jin Xu:
> Hi Jaegeuk,
> 
> On 03/09/2013 08:45, Jaegeuk Kim wrote:
> > Hi Jin,
> >
> >> [...]
> >>>
> >>> It seems that we can obtain the performance gain just by setting the
> >>> MAX_VICTIM_SEARCH to 4096, for example.
> >>> So, how about just adding an ending criteria like below?
> >>>
> >>
> >> I agree that we could get the performance improvement by simply
> >> enlarging the MAX_VICTIM_SEARCH to 4096, but I am concerning the
> >> scalability a little bit. Because it might always searching the whole
> >> bitmap in some cases, for example, when dirty segments is 4000 and
> >> total segments is 409600.
> >>> [snip]
> >> [...]
> >>>
> >>> 	if (p->max_search > MAX_VICTIM_SEARCH)
> >>> 		p->max_search = MAX_VICTIM_SEARCH;
> >>>
> >>
> >> The optimization does not apply to SSR mode. There has a reason.
> >> As noticed in the test, when SSR selected the segments that have most
> >> garbage blocks, then when gc is needed, all the dirty segments might
> >> have very less garbage blocks, thus the gc overhead is high. This might
> >> lead to performance degradation. So the patch does not change the
> >> victim selection policy for SSR.
> >
> > I think it doesn't care.
> > GC is only triggered during the direct node block allocation.
> > What it means that we need to consider the number of GC triggers where
> > the GC triggers more frequently during the normal data allocation than
> > the node block allocation.
> > So, I think it would not degrade performance significatly.
> >
> > BTW, could you show some numbers for this?
> > Or could you test what I suggested?
> >
> > Thanks,
> >
> 
> I re-ran the test and got the following result:
> 
> ---------------------------------------
> 2GB SDHC
> create 52023 files of size 32768 bytes
> random re-write 100000 records of 4KB
> ---------------------------------------
> 
> | file creation (s) | rewrite time (s) | gc count | gc garbage blocks |
> 
> no patch       341         4227             1174          174840
> patched        296         2995             634           109314
> patched (KIM)  324         2958             645           106682
> 
> In this test, it does not show the minor performance degradation caused
> by applying the patch to SSR mode. Instead, the performance is a little 
> better with what you suggested.
> 
> I agree that the performance degradation would not be significant even
> it does degrade. I ever saw the minor degradation in some workloads, but
> I didn't save the data.
> 
> So, I agree that we can apply the patch to SSR mode as well.
> 
> And do you still have concerns about the formula for calculating the #
> of search?

Thank you for the test. :)
What I've concerned is that, if it is really important to get a victim
more accurately for the performance as you described, it doesn't need to
calculate the number of searches IMO. Just let's select nr_dirty. Why
not?
Only the thing that we should consider is to handle the case where the
nr_dirty is too large.
For this, we can just limit the # of searches to avoid performance
degradation.

Still actually, I'm not convincing the effectiveness of your formula.
If possible, could you show it with numbers?

Thanks,

> 
> >>
> >> What do you think now?
> >>
> >>> #define MAX_VICTIM_SEARCH 4096 /* covers 8GB */
> >>>
> >>>>    	p->offset = sbi->last_victim[p->gc_mode];
> >>>> @@ -243,6 +245,8 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
> >>>>    	struct victim_sel_policy p;
> >>>>    	unsigned int secno, max_cost;
> >>>>    	int nsearched = 0;
> >>>> +	unsigned int max_search = MAX_VICTIM_SEARCH;
> >>>> +	unsigned int nr_dirty;
> >>>>
> >>>>    	p.alloc_mode = alloc_mode;
> >>>>    	select_policy(sbi, gc_type, type, &p);
> >>>> @@ -258,6 +262,27 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
> >>>>    			goto got_it;
> >>>>    	}
> >>>>
> >>>> +	nr_dirty = dirty_i->nr_dirty[p.dirty_type];
> >>>> +	if (p.gc_mode == GC_GREEDY && p.alloc_mode != SSR) {
> >>>> +		if (TOTAL_SEGS(sbi) <= FULL_VICTIM_SEARCH_THRESH)
> >>>> +			max_search = nr_dirty; /* search all the dirty segs */
> >>>> +		else {
> >>>> +			/*
> >>>> +			 * With more dirty segments, garbage blocks are likely
> >>>> +			 * more scattered, thus search harder for better
> >>>> +			 * victim.
> >>>> +			 */
> >>>> +			max_search = div_u64 ((nr_dirty *
> >>>> +				FULL_VICTIM_SEARCH_THRESH), TOTAL_SEGS(sbi));
> >>>> +			if (max_search < MIN_VICTIM_SEARCH_GREEDY)
> >>>> +				max_search = MIN_VICTIM_SEARCH_GREEDY;
> >>>> +		}
> >>>> +	}
> >>>> +
> >>>> +	/* no more than the total dirty segments */
> >>>> +	if (max_search > nr_dirty)
> >>>> +		max_search = nr_dirty;
> >>>> +
> >>>>    	while (1) {
> >>>>    		unsigned long cost;
> >>>>    		unsigned int segno;
> >>>> @@ -290,7 +315,7 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
> >>>>    		if (cost == max_cost)
> >>>>    			continue;
> >>>>
> >>>> -		if (nsearched++ >= MAX_VICTIM_SEARCH) {
> >>>> +		if (nsearched++ >= max_search) {
> >>>
> >>> 		if (nsearched++ >= p.max_search) {
> >>>
> >>>>    			sbi->last_victim[p.gc_mode] = segno;
> >>>>    			break;
> >>>>    		}
> >>>> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
> >>>> index 2c6a6bd..2f525aa 100644
> >>>> --- a/fs/f2fs/gc.h
> >>>> +++ b/fs/f2fs/gc.h
> >>>> @@ -20,7 +20,9 @@
> >>>>    #define LIMIT_FREE_BLOCK	40 /* percentage over invalid + free space */
> >>>>
> >>>>    /* Search max. number of dirty segments to select a victim segment */
> >>>> -#define MAX_VICTIM_SEARCH	20
> >>>> +#define MAX_VICTIM_SEARCH		20
> >>>> +#define MIN_VICTIM_SEARCH_GREEDY	20
> >>>> +#define FULL_VICTIM_SEARCH_THRESH	4096
> >>>>
> >>>>    struct f2fs_gc_kthread {
> >>>>    	struct task_struct *f2fs_gc_task;
> >>>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> >>>> index 062424a..cd33f96 100644
> >>>> --- a/fs/f2fs/segment.h
> >>>> +++ b/fs/f2fs/segment.h
> >>>> @@ -142,6 +142,7 @@ struct victim_sel_policy {
> >>>>    	int alloc_mode;			/* LFS or SSR */
> >>>>    	int gc_mode;			/* GC_CB or GC_GREEDY */
> >>>>    	unsigned long *dirty_segmap;	/* dirty segment bitmap */
> >>>> +	int dirty_type;
> >>>
> >>> 	int max_search;		/* maximum # of segments to search */
> >>>
> >>>>    	unsigned int offset;		/* last scanned bitmap offset */
> >>>>    	unsigned int ofs_unit;		/* bitmap search unit */
> >>>>    	unsigned int min_cost;		/* minimum cost */
> >>>
> >>
> >
> 
> --
> Thanks,
> Jin
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Jaegeuk Kim
Samsung

  reply	other threads:[~2013-09-04  9:40 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-29  0:48 [PATCH] f2fs: optimize gc for better performance Jin Xu
2013-08-29  5:00 ` jin xu
2013-08-29 11:56 ` Jaegeuk Kim
2013-08-30  0:18   ` jin xu
2013-08-30  0:29   ` jin xu
2013-08-30  4:06   ` Jin Xu
2013-09-03  0:45     ` Jaegeuk Kim
2013-09-03 23:59       ` Jin Xu
2013-09-04  9:40         ` Jaegeuk Kim [this message]
2013-09-04 13:17           ` Jin Xu
2013-09-04 23:50             ` Jaegeuk Kim
2013-09-04 23:58               ` Jin Xu
  -- strict thread matches above, loose matches on Subject: below --
2013-09-05  4:45 Jin Xu
2013-09-05  4:49 ` Jaegeuk Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1378287635.2354.84.camel@kjgkr \
    --to=jaegeuk.kim@samsung.com \
    --cc=jinuxstyle@gmail.com \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).