Re: [RFC][PATCH v3 1/5] mm/zsmalloc: introduce class auto-compaction

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Joonsoo Kim <js1304@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH v3 1/5] mm/zsmalloc: introduce class auto-compaction
Date: Fri, 18 Mar 2016 11:00:29 +0900	[thread overview]
Message-ID: <20160318020029.GC572@swordfish> (raw)
In-Reply-To: <20160318011741.GD2154@bbox>

Hi,

On (03/18/16 10:17), Minchan Kim wrote:
> > > > hm, in this scenario both solutions are less than perfect. we jump
> > > > X times over 40% margin, we have X*NR_CLASS compaction scans in the
> > > > end. the difference is that we queue less works, yes, but we don't
> > > > have to use workqueue in the first place; compaction can be done
> > > > asynchronously by a pool's dedicated kthread. so we will just
> > > > wake_up() the process.
> > > 
> > > Hmm, kthread is over-engineered to me. If we want to create new kthread
> > > in the system, I guess we should persuade many people to merge in.
> > > Surely, we should have why it couldn't be done by others(e.g., workqueue).
> > > 
> > > I think your workqueue approach is good to me.
> > > Only problem I can see with it is we cannot start compaction when
> > > we want instantly so my conclusion is we need both direct and
> > > background compaction.
> > 
> > well, if we will keep the shrinker callbacks then it's not such a huge
> > issue, IMHO. for that type of forward progress guarantees we can have
> > our own, dedicated, workqueue with a rescuer thread (WQ_MEM_RECLAIM).
> 
> What I meant with direct compaction is shrinker while backgroud
> compaction is workqueue.
> So do you mean that you agree to remain shrinker?

hm, probably yes, hard to say. we don't have yet a solution for background
compaction.

> And do you want to use workqueue with WQ_MEM_RECLAIM rather than
> new kthread?

I have some concerns here. WQ_MEM_RECLAIM implies that there is a kthread
attached to wq, a rescuer thread, which will be idle until wq declares mayday.
But the kthread will be allocated anyway. And we can queue only one global
compaction work at a time; so wq does not buy us a lot here and a simple
wake_up_process() looks much better. it make sense to use wq if we can have
N compaction jobs queued, like I did in my initial patch, but otherwise
it's sort of overkill, isn't it?

> > just thought... I think it'll be tricky to implement this. We scan classes
> > from HIGH class_size to SMALL class_size, counting fragmentation value and
> > re-calculating the global fragmentation all the time; once the global
> > fragmentation passes the watermark, we start compacting from HIGH to
> > SMALL. the problem here is that as soon as we calculated the class B
> > fragmentation index and moved to class A we can't trust B anymore. classes
> > are not locked and absolutely free to change. so the global fragmentation
> > index likely will be inaccurate.
> > 
> 
> Actually, I don't think such inaccuracy will make big trouble here.
> But How about this simple idea?
> 
> If zs_free find wasted space is bigger than threshold(e.g., 10M)
>
> user defined, zs_free can queue work for background compaction(
> that background compaction work should be WQ_MEM_RECLAIM |
> WQ_CPU_INTENSIVE?). Once that work is executed, the work compacts
> all size_class unconditionally.

ok. global pool stats that will give us a fragmentation index, so we can
start compaction when the entire pool passes the watermark, not an
individual class.

> With it, less background compaction and more simple algorithm,

so you want to have

	zs_free()
		check pool watermark
			queue class compaction
			queue pool compaction

?

I think a simpler one will be to just queue global compaction, if pool
is fragmented -- compact everything, like we do in shrinker callback.

> no harmful other works by WQ_CPU_INTENSIVE.
> 
> > so I'm thinking about triggering a global compaction from zs_free() (to
> > queue less works), but instead of calculating global watermark and compacting
> > afterwards, just compact every class that has fragmentation over XY% (for
> > example 30%). "iterate from HI to LO and compact everything that is too
> > fragmented".
> 
> The problem with approach is we can compact only small size class which
> is fragment ratio is higher than bigger size class but compaction benefit
> is smaller than higher size class which is lower fragment ratio.
> With that, continue to need to background work until it meets user-defined
> global threshold.

good point.

> > 
> > we still need some sort of a pool->compact_ts timestamp to prevent too
> > frequent compaction jobs.
> 
> Yes, we need something to throttle mechanism. Need time to think more. :)

yes, need to think more :)

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Joonsoo Kim <js1304@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH v3 1/5] mm/zsmalloc: introduce class auto-compaction
Date: Fri, 18 Mar 2016 11:00:29 +0900	[thread overview]
Message-ID: <20160318020029.GC572@swordfish> (raw)
In-Reply-To: <20160318011741.GD2154@bbox>

Hi,

On (03/18/16 10:17), Minchan Kim wrote:
> > > > hm, in this scenario both solutions are less than perfect. we jump
> > > > X times over 40% margin, we have X*NR_CLASS compaction scans in the
> > > > end. the difference is that we queue less works, yes, but we don't
> > > > have to use workqueue in the first place; compaction can be done
> > > > asynchronously by a pool's dedicated kthread. so we will just
> > > > wake_up() the process.
> > > 
> > > Hmm, kthread is over-engineered to me. If we want to create new kthread
> > > in the system, I guess we should persuade many people to merge in.
> > > Surely, we should have why it couldn't be done by others(e.g., workqueue).
> > > 
> > > I think your workqueue approach is good to me.
> > > Only problem I can see with it is we cannot start compaction when
> > > we want instantly so my conclusion is we need both direct and
> > > background compaction.
> > 
> > well, if we will keep the shrinker callbacks then it's not such a huge
> > issue, IMHO. for that type of forward progress guarantees we can have
> > our own, dedicated, workqueue with a rescuer thread (WQ_MEM_RECLAIM).
> 
> What I meant with direct compaction is shrinker while backgroud
> compaction is workqueue.
> So do you mean that you agree to remain shrinker?

hm, probably yes, hard to say. we don't have yet a solution for background
compaction.

> And do you want to use workqueue with WQ_MEM_RECLAIM rather than
> new kthread?

I have some concerns here. WQ_MEM_RECLAIM implies that there is a kthread
attached to wq, a rescuer thread, which will be idle until wq declares mayday.
But the kthread will be allocated anyway. And we can queue only one global
compaction work at a time; so wq does not buy us a lot here and a simple
wake_up_process() looks much better. it make sense to use wq if we can have
N compaction jobs queued, like I did in my initial patch, but otherwise
it's sort of overkill, isn't it?

> > just thought... I think it'll be tricky to implement this. We scan classes
> > from HIGH class_size to SMALL class_size, counting fragmentation value and
> > re-calculating the global fragmentation all the time; once the global
> > fragmentation passes the watermark, we start compacting from HIGH to
> > SMALL. the problem here is that as soon as we calculated the class B
> > fragmentation index and moved to class A we can't trust B anymore. classes
> > are not locked and absolutely free to change. so the global fragmentation
> > index likely will be inaccurate.
> > 
> 
> Actually, I don't think such inaccuracy will make big trouble here.
> But How about this simple idea?
> 
> If zs_free find wasted space is bigger than threshold(e.g., 10M)
>
> user defined, zs_free can queue work for background compaction(
> that background compaction work should be WQ_MEM_RECLAIM |
> WQ_CPU_INTENSIVE?). Once that work is executed, the work compacts
> all size_class unconditionally.

ok. global pool stats that will give us a fragmentation index, so we can
start compaction when the entire pool passes the watermark, not an
individual class.

> With it, less background compaction and more simple algorithm,

so you want to have

	zs_free()
		check pool watermark
			queue class compaction
			queue pool compaction

?

I think a simpler one will be to just queue global compaction, if pool
is fragmented -- compact everything, like we do in shrinker callback.

> no harmful other works by WQ_CPU_INTENSIVE.
> 
> > so I'm thinking about triggering a global compaction from zs_free() (to
> > queue less works), but instead of calculating global watermark and compacting
> > afterwards, just compact every class that has fragmentation over XY% (for
> > example 30%). "iterate from HI to LO and compact everything that is too
> > fragmented".
> 
> The problem with approach is we can compact only small size class which
> is fragment ratio is higher than bigger size class but compaction benefit
> is smaller than higher size class which is lower fragment ratio.
> With that, continue to need to background work until it meets user-defined
> global threshold.

good point.

> > 
> > we still need some sort of a pool->compact_ts timestamp to prevent too
> > frequent compaction jobs.
> 
> Yes, we need something to throttle mechanism. Need time to think more. :)

yes, need to think more :)

	-ss

next prev parent reply	other threads:[~2016-03-18  1:59 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-03 14:45 [RFC][PATCH v3 0/5] mm/zsmalloc: rework compaction and increase density Sergey Senozhatsky
2016-03-03 14:45 ` Sergey Senozhatsky
2016-03-03 14:45 ` [RFC][PATCH v3 1/5] mm/zsmalloc: introduce class auto-compaction Sergey Senozhatsky
2016-03-03 14:45   ` Sergey Senozhatsky
2016-03-14  6:17   ` Minchan Kim
2016-03-14  6:17     ` Minchan Kim
2016-03-14  7:41     ` Sergey Senozhatsky
2016-03-14  7:41       ` Sergey Senozhatsky
2016-03-14  8:20       ` Sergey Senozhatsky
2016-03-14  8:20         ` Sergey Senozhatsky
2016-03-15  0:46       ` Minchan Kim
2016-03-15  0:46         ` Minchan Kim
2016-03-15  1:33         ` Sergey Senozhatsky
2016-03-15  1:33           ` Sergey Senozhatsky
2016-03-15  6:17           ` Minchan Kim
2016-03-15  6:17             ` Minchan Kim
2016-03-17  1:29             ` Sergey Senozhatsky
2016-03-17  1:29               ` Sergey Senozhatsky
2016-03-18  1:17               ` Minchan Kim
2016-03-18  1:17                 ` Minchan Kim
2016-03-18  2:00                 ` Sergey Senozhatsky [this message]
2016-03-18  2:00                   ` Sergey Senozhatsky
2016-03-18  4:03                   ` Minchan Kim
2016-03-18  4:03                     ` Minchan Kim
2016-03-18  4:10                     ` Sergey Senozhatsky
2016-03-18  4:10                       ` Sergey Senozhatsky
2016-03-03 14:46 ` [RFC][PATCH v3 2/5] mm/zsmalloc: remove shrinker compaction callbacks Sergey Senozhatsky
2016-03-03 14:46   ` Sergey Senozhatsky
2016-03-14  6:32   ` Minchan Kim
2016-03-14  6:32     ` Minchan Kim
2016-03-14  7:45     ` Sergey Senozhatsky
2016-03-14  7:45       ` Sergey Senozhatsky
2016-03-15  0:52       ` Minchan Kim
2016-03-15  0:52         ` Minchan Kim
2016-03-15  1:05         ` Sergey Senozhatsky
2016-03-15  1:05           ` Sergey Senozhatsky
2016-03-15  2:19           ` Minchan Kim
2016-03-15  2:19             ` Minchan Kim
2016-03-03 14:46 ` [RFC][PATCH v3 3/5] mm/zsmalloc: introduce zs_huge_object() Sergey Senozhatsky
2016-03-03 14:46   ` Sergey Senozhatsky
2016-03-14  6:53   ` Minchan Kim
2016-03-14  6:53     ` Minchan Kim
2016-03-14  8:08     ` Sergey Senozhatsky
2016-03-14  8:08       ` Sergey Senozhatsky
2016-03-15  0:54       ` Minchan Kim
2016-03-15  0:54         ` Minchan Kim
2016-03-03 14:46 ` [RFC][PATCH v3 4/5] zram: use zs_huge_object() Sergey Senozhatsky
2016-03-03 14:46   ` Sergey Senozhatsky
2016-03-03 14:46 ` [RFC][PATCH v3 5/5] mm/zsmalloc: reduce the number of huge classes Sergey Senozhatsky
2016-03-03 14:46   ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160318020029.GC572@swordfish \
    --to=sergey.senozhatsky.work@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=js1304@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=sergey.senozhatsky@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.