From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kent Overstreet
Subject: Re: reverse link from bucket to keys
Date: Thu, 11 Jul 2013 19:01:11 -0700
Message-ID: <20130712020111.GG17799@kmo-pixel>
References: <20130708214859.GB1959@kmo-pixel> <20130710230257.GD13527@kmo-pixel>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-bcache-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: sheng qiu
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-bcache@vger.kernel.org

On Thu, Jul 11, 2013 at 09:45:25AM -0400, sheng qiu wrote:
> Hi Kent,
>
> i do not know if this is a problem or i did not understand correctly.
> when i enable the copy_gc, at first the gc_moving_threshold is larger
> than the bucket size (512KB),

I haven't looked at the copy gc code in ages, but if I remember the
algorithm I used correctly - the threshold is set based on how many
buckets we want to free up, so if most of the buckets are reclaimable,
threshold > bucket size is what I'd expect.

> i guess maybe initially they are set to 1<<14 - 1. after some time, it
> seems all the buckets are set to use all the sectors they own, which
> is 1024 in my case. I guess this is because
> SET_GC_SECTORS_USED(bucket.size) is called while returning a free
> bucket for new allocation. then the gc_moving_threshold became 1024
> (512KB), and no bucket's GC_SECTORS_USED is smaller than this value.
> So no gc_moving event at all.
> at first, i thought the bcache code would set 0 sectors used for each
> newly allocated bucket, and then update GC_SECTORS_USED whenever we
> fill data into the bucket. In that way, we can track how much data is
> valid within a bucket.

When we allocate a bucket we generally fill it up right away - we don't
make it available for GC (the ->pin refcount) until after we're
finished writing to it, at which point GC_SECTORS_USED is going to
equal bucket_size.
So that's why the alloc code just sets GC_SECTORS_USED(b,
ca->sb.bucket_size) when you allocate it. But then over time, as new
writes come in that overwrite data already in the cache, GC will notice
those buckets aren't full of (live) data anymore.