linux-f2fs-devel.lists.sourceforge.net archive mirror
 help / color / mirror / Atom feed
From: Jaegeuk Kim <jaegeuk.kim@samsung.com>
To: Jin Xu <jinuxstyle@gmail.com>
Cc: linux-f2fs-devel@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] f2fs: optimize gc for better performance
Date: Thu, 05 Sep 2013 08:50:09 +0900	[thread overview]
Message-ID: <1378338609.2354.86.camel@kjgkr> (raw)
In-Reply-To: <522732EB.4020103@gmail.com>

Hi Jin,

2013-09-04 (수), 21:17 +0800, Jin Xu:
> Hi Jaegeuk,
> 
> On 04/09/2013 17:40, Jaegeuk Kim wrote:
> > Hi Jin,
> >
> > 2013-09-04 (수), 07:59 +0800, Jin Xu:
> >> Hi Jaegeuk,
> >>
> >> On 03/09/2013 08:45, Jaegeuk Kim wrote:
> >>> Hi Jin,
> >>>
> >>>> [...]
> >>>>>
> >>>>> It seems that we can obtain the performance gain just by setting the
> >>>>> MAX_VICTIM_SEARCH to 4096, for example.
> >>>>> So, how about just adding an ending criteria like below?
> >>>>>
> >>>>
> >>>> I agree that we could get the performance improvement by simply
> >>>> enlarging the MAX_VICTIM_SEARCH to 4096, but I am concerning the
> >>>> scalability a little bit. Because it might always searching the whole
> >>>> bitmap in some cases, for example, when dirty segments is 4000 and
> >>>> total segments is 409600.
> >>>>> [snip]
> >>>> [...]
> >>>>>
> >>>>> 	if (p->max_search > MAX_VICTIM_SEARCH)
> >>>>> 		p->max_search = MAX_VICTIM_SEARCH;
> >>>>>
> >>>>
> >>>> The optimization does not apply to SSR mode. There has a reason.
> >>>> As noticed in the test, when SSR selected the segments that have most
> >>>> garbage blocks, then when gc is needed, all the dirty segments might
> >>>> have very less garbage blocks, thus the gc overhead is high. This might
> >>>> lead to performance degradation. So the patch does not change the
> >>>> victim selection policy for SSR.
> >>>
> >>> I think it doesn't care.
> >>> GC is only triggered during the direct node block allocation.
> >>> What it means that we need to consider the number of GC triggers where
> >>> the GC triggers more frequently during the normal data allocation than
> >>> the node block allocation.
> >>> So, I think it would not degrade performance significatly.
> >>>
> >>> BTW, could you show some numbers for this?
> >>> Or could you test what I suggested?
> >>>
> >>> Thanks,
> >>>
> >>
> >> I re-ran the test and got the following result:
> >>
> >> ---------------------------------------
> >> 2GB SDHC
> >> create 52023 files of size 32768 bytes
> >> random re-write 100000 records of 4KB
> >> ---------------------------------------
> >>
> >> | file creation (s) | rewrite time (s) | gc count | gc garbage blocks |
> >>
> >> no patch       341         4227             1174          174840
> >> patched        296         2995             634           109314
> >> patched (KIM)  324         2958             645           106682
> >>
> >> In this test, it does not show the minor performance degradation caused
> >> by applying the patch to SSR mode. Instead, the performance is a little
> >> better with what you suggested.
> >>
> >> I agree that the performance degradation would not be significant even
> >> it does degrade. I ever saw the minor degradation in some workloads, but
> >> I didn't save the data.
> >>
> >> So, I agree that we can apply the patch to SSR mode as well.
> >>
> >> And do you still have concerns about the formula for calculating the #
> >> of search?
> >
> > Thank you for the test. :)
> > What I've concerned is that, if it is really important to get a victim
> > more accurately for the performance as you described, it doesn't need to
> > calculate the number of searches IMO. Just let's select nr_dirty. Why
> > not?
> > Only the thing that we should consider is to handle the case where the
> > nr_dirty is too large.
> > For this, we can just limit the # of searches to avoid performance
> > degradation.
> >
> > Still actually, I'm not convincing the effectiveness of your formula.
> > If possible, could you show it with numbers?
> 
> It's not easy to prove the effectiveness of the formula. It's just for
> eliminating my concern on the scalability of searching. Since it does
> not matter much for the performance improvement, we can put it aside
> and choose the simpler method as you suggested.
> 
> So, should I revise the patch based on what you suggested or will
> you take care of it?

Could you make a patch with your performance description and sumbit it
again?
Thanks a lot,

-- 
Jaegeuk Kim
Samsung

  reply	other threads:[~2013-09-04 23:50 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-29  0:48 [PATCH] f2fs: optimize gc for better performance Jin Xu
2013-08-29  5:00 ` jin xu
2013-08-29 11:56 ` Jaegeuk Kim
2013-08-30  0:18   ` jin xu
2013-08-30  0:29   ` jin xu
2013-08-30  4:06   ` Jin Xu
2013-09-03  0:45     ` Jaegeuk Kim
2013-09-03 23:59       ` Jin Xu
2013-09-04  9:40         ` Jaegeuk Kim
2013-09-04 13:17           ` Jin Xu
2013-09-04 23:50             ` Jaegeuk Kim [this message]
2013-09-04 23:58               ` Jin Xu
  -- strict thread matches above, loose matches on Subject: below --
2013-09-05  4:45 Jin Xu
2013-09-05  4:49 ` Jaegeuk Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1378338609.2354.86.camel@kjgkr \
    --to=jaegeuk.kim@samsung.com \
    --cc=jinuxstyle@gmail.com \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).