From: Will Huck <will.huckk@gmail.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>,
Simon Jeons <simon.jeons@gmail.com>,
Dan Magenheimer <dan.magenheimer@oracle.com>,
Luigi Semenzato <semenzato@google.com>,
linux-mm@kvack.org, Konrad Wilk <konrad.wilk@oracle.com>,
Seth Jennings <sjenning@linux.vnet.ibm.com>,
Nitin Gupta <ngupta@vflare.org>
Subject: Re: zcache+zram working together?
Date: Wed, 20 Feb 2013 14:00:28 +0800
Message-ID: <5124667C.9060002@gmail.com>
In-Reply-To: <20130220055324.GF16950@blaptop>
On 02/20/2013 01:53 PM, Minchan Kim wrote:
> On Wed, Feb 20, 2013 at 11:47:32AM +0900, Kyungmin Park wrote:
>> On Wed, Feb 20, 2013 at 9:06 AM, Minchan Kim <minchan@kernel.org> wrote:
>>> On Sat, Feb 16, 2013 at 04:15:41PM +0800, Simon Jeons wrote:
>>>> On 12/11/2012 02:42 PM, Minchan Kim wrote:
>>>>> On Fri, Dec 07, 2012 at 01:31:35PM -0800, Dan Magenheimer wrote:
>>>>>> Last summer, during the great(?) zcache-vs-zcache2 debate,
>>>>>> I wondered if there might be some way to obtain the strengths
>>>>>> of both. While following Luigi's recent efforts toward
>>>>>> using zram for ChromeOS "swap", I thought of an interesting
>>>>>> interposition of zram and zcache that, at first blush, makes
>>>>>> almost no sense at all, but after more thought, may serve as a
>>>>>> foundation for moving towards a more optimal solution for use
>>>>>> of "adaptive compression" in the kernel, at least for
>>>>>> embedded systems.
>>>>>>
>>>>>> To quickly review:
>>>>>>
>>>>>> Zram (when used for swap) compresses only anonymous pages and
>>>>>> only when they are swapped but uses the high-density zsmalloc
>>>>>> allocator and eliminates the need for a true swap device, thus
>>>>>> making zram a good fit for embedded systems. But, because zram
>>>>>> appears to the kernel as a swap device, zram data must traverse
>>>>>> the block I/O subsystem and is somewhat difficult to monitor and
>>>>>> control without significant changes to the swap and/or block
>>>>>> I/O subsystem, which are designed to handle fixed block-sized
>>>>>> data.
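For concreteness, a minimal zram-as-swap setup looks roughly like the
following (the 256 MiB disksize and the swap priority are arbitrary
examples, and attribute names can differ between kernel versions):

    # create one zram device and size it (disksize is in bytes)
    modprobe zram num_devices=1
    echo $((256 * 1024 * 1024)) > /sys/block/zram0/disksize

    # format it and enable it as swap
    mkswap /dev/zram0
    swapon -p 100 /dev/zram0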
>>>>>>
>>>>>> Zcache (zcache2) compresses BOTH clean page cache pages that
>>>>>> would otherwise be evicted, and anonymous pages that would
>>>>>> otherwise be sent to a swap device. Both paths use in-kernel
>>>>>> hooks (cleancache and frontswap respectively) which avoid
>>>>>> most or all of the block I/O subsystem and the swap subsystem.
>>>>>> Because of this and since it is designed using transcendent
>>>>>> memory ("tmem") principles, zcache has a great deal more
>>>>>> flexibility in control and monitoring. Zcache uses the simpler,
>>>>>> more predictable "zbud" allocator which achieves lower density
>>>>>> but provides greater flexibility under high pressure.
>>>>>> But zcache requires a swap device as a "backup", so it seems
>>>>>> unsuitable for embedded systems.
>>>>>>
>>>>>> (Minchan, I know at one point you were working on some
>>>>>> documentation to contrast zram and zcache so you may
>>>>>> have something more to add here...)
>>>>>>
>>>>>> What if one were to enable both? This is possible today with
>>>>>> no kernel change at all by configuring both zram and zcache2
>>>>>> into the kernel and then configuring zram at boottime.
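A sketch of what that configuration might look like, with the caveat that
the exact Kconfig symbol names depend on the tree and kernel version:

    # kernel .config (assumed symbol names)
    CONFIG_ZRAM=y
    CONFIG_ZCACHE=y
    CONFIG_CLEANCACHE=y
    CONFIG_FRONTSWAP=y

If I recall correctly, zcache stays off by default even when built in and
is enabled by adding "zcache" to the kernel command line; zram is then set
up as the swap device at boot, as sketched earlier.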
>>>>>>
>>>>>> When memory pressure is dominated by file pages, zcache (via
>>>>>> the cleancache hooks) provides compression to optimize memory
>>>>>> utilization. As more pressure is exerted by anonymous pages,
>>>>>> "swapping" occurs but the frontswap hooks route the data to
>>>>>> zcache which, as necessary, reclaims physical pages used by
>>>>>> compressed file pages to use for compressed anonymous pages.
>>>>>> At this point, any compressions unsuitable for zbud are rejected
>>>>>> by zcache and passed through to the "backup" swap device...
>>>>>> which is zram! Under high pressure from anonymous pages,
>>>>>> zcache can also be configured to "unuse" pages to zram (though
>>>>>> this functionality is still not merged).
>>>>>>
>>>>>> I've plugged zcache and zram together and watched them
>>>>>> work/cooperate, via their respective debugfs statistics.
>>>>>> While I don't have benchmarking results and may not have
>>>>>> time anytime soon to do much work on this, it seems like
>>>>>> there is some potential here, so I thought I'd publish the
>>>>>> idea so that others can give it a go and/or look at
>>>>>> other ways (including kernel changes) to combine the two.
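For anyone who wants to reproduce the observation, something like the
following shows both sides at work, assuming debugfs is mounted; the exact
counter names vary across kernel versions, so treat these paths as
assumptions:

    # zcache counters (compressed file/anon pages, zbud stats, ...)
    grep . /sys/kernel/debug/zcache/*

    # zram activity on the backing swap device
    cat /sys/block/zram0/num_writes /sys/block/zram0/compr_data_size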
>>>>>>
>>>>>> Feedback welcome and (early) happy holidays!
>>>>> Interesting, Dan!
>>>>> I would like to get a chance to investigate it if I have time in the
>>>>> future.
>>>>>
>>>>> Another synergy of using BOTH would be to remove CMA completely,
>>>>> because it complicates mm core code with hooks and still has problems
>>>>> with pinned pages and with evicting the working set for getting
>>>> Do you mean get_user_pages? Could you explain the downsides of CMA in
>>>> detail?
>>> Good question.
>>>
>>> 1. It ignores the working set.
>>> CMA can sweep out working-set pages in the CMA area to get contiguous
>>> memory.
>> Theoretically agreed, but there's no data to prove it.
> The CMA area is the last fallback type for allocation, and pages in that
> area are evicted when we need contiguous memory. That means a newly forked
> task's pages would likely be in that area, and a new task's pages would
> fit into the working-set category POV LRU. No?
Sorry for a silly question, what's the meaning of POV?
>>> 2. No guarantee of a contiguous memory area.
>>> As I mentioned, get_user_pages can pin a page, so migration ends up
>>> failing.
>> Right, it's a work item now; we have to guarantee these pages aren't
>> allocated from the CMA area.
> Good to hear. CMA's goal is to guarantee it.
> If it can't, there is no point in using it. FYI, the memory-hotplug people
> have the same problem and have tried to solve it. I wanted them to solve
> the CMA problem with their solution too, but I'm not sure they will.
> I'm looking forward to seeing your elegant work.
>
>>> 3. Latency
>>> CMA reclaims all pages in the CMA area when we need it. That means we
>>> sometimes have to write out dirty pages, so it can add big overhead from
>>> a latency point of view. Even unmapping all of the pages from the PTEs
>>> of every process isn't trivial.
>> It's a trade-off between requirements and performance. If the feature is
>> more important and needs more memory, that cost can be accepted.
> It depends on the use case, and as you already know, many people want to
> use CMA with low latency if possible. If CMA can't meet their latency
> requirements, they might use reserved memory rather than CMA, or use CMA
> together with some harmful jobs (e.g., sync + drop_caches).
> If the kernel can provide a better solution, they can avoid such things.
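The "harmful jobs" workaround mentioned above amounts to something like
this before a large contiguous allocation, throwing away clean page cache
indiscriminately just to make the allocation cheaper:

    sync
    echo 3 > /proc/sys/vm/drop_caches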
>
>>> 4. It adds many hooks in MM code. Personally, I really hate that.
>> But there are cases that need CMA, e.g., DRM playback.
>>
>> We have to guarantee physically contiguous memory for the TrustZone
>> solution on ARM. Without a reserved-memory concept, there's no way to get
>> physically contiguous memory except through CMA.
> I don't get it.
> I meant that I don't like the CONFIG_CMA hooks under mm/.
>
>
>> Thank you,
>> Kyungmin Park
>>
>>> --
>>> Kind regards,
>>> Minchan Kim
>>>