From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Rik van Riel <riel@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@suse.de>,
Johannes Weiner <hannes@cmpxchg.org>,
Minchan Kim <minchan@kernel.org>,
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
Zhang Yanfei <zhangyanfei@cn.fujitsu.com>,
"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>,
Tang Chen <tangchen@cn.fujitsu.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
Wen Congyang <wency@cn.fujitsu.com>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Michal Nazarewicz <mina86@mina86.com>,
Laura Abbott <lauraa@codeaurora.org>,
Heesub Shin <heesub.shin@samsung.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Ritesh Harjani <ritesh.list@gmail.com>,
t.stanislaws@samsung.com, Gioh Kim <gioh.kim@lge.com>,
linux-mm@kvack.org, Lisa Du <cldu@marvell.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 00/10] fix freepage count problems due to memory isolation
Date: Wed, 16 Jul 2014 13:14:26 +0200
Message-ID: <53C65E92.2000606@suse.cz>
In-Reply-To: <20140716084333.GA20359@js1304-P5Q-DELUXE>
On 07/16/2014 10:43 AM, Joonsoo Kim wrote:
>> I think your plan of multiple parallel CMA allocations (and thus
>> multiple parallel isolations) is also possible. The isolate pcplists
>> can be shared by pages coming from multiple parallel isolations. But
>> the flush operation needs pfn start/end parameters to flush only the
>> pages belonging to the given isolation. That might mean a bit of
>> inefficient list traversing, but I don't think it's a problem.
>
> I think that special pcplist would cause a problem if we have to check
> the pfn range. If there are too many pages on this pcplist, moving them
> from the pcplist to the isolate freelist takes too long in irq context,
> and the system could break. This operation cannot easily be stopped,
> because it is initiated by an IPI from another cpu, and the initiator of
> the IPI expects that all pages on the other cpus' pcplists have been
> moved properly when on_each_cpu() returns.
>
> And if there are that many pages, serious lock contention would also
> occur in this case.
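
(For concreteness, the problematic drain would look roughly like the
sketch below; isolate_drain_arg and pcp->isolate_list are hypothetical
names for the shared-pcplist scheme being discussed, not existing code.)

struct isolate_drain_arg {			/* hypothetical */
	struct zone *zone;
	unsigned long start_pfn, end_pfn;
};

/* Runs in IPI/irq context on every cpu via on_each_cpu(). */
static void drain_isolated_pcplist(void *data)
{
	struct isolate_drain_arg *arg = data;
	struct per_cpu_pages *pcp = &this_cpu_ptr(arg->zone->pageset)->pcp;
	struct page *page, *next;

	/* The whole walk runs with interrupts disabled. */
	list_for_each_entry_safe(page, next, &pcp->isolate_list, lru) {
		unsigned long pfn = page_to_pfn(page);

		if (pfn < arg->start_pfn || pfn >= arg->end_pfn)
			continue;	/* belongs to another isolation */
		list_del(&page->lru);
		/* ... move to the isolate freelist, taking zone->lock ... */
	}
}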
Hm, I see. So what if it wasn't a special pcplist, but a special "free list"
where the pages would just be linked together as on a pcplist, regardless of
order, and would not be merged until the CPU that drives the memory isolation
process decides it is safe to flush them away? That would remove the need for
IPIs and provide the same guarantees, I think.
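
Roughly, as a minimal sketch (zone->isolated_free_list and both helpers
are hypothetical, not existing kernel API):

/*
 * Hypothetical sketch: freed pages from isolated pageblocks are linked
 * onto a per-zone list without any buddy merging.  No IPIs are needed;
 * only the CPU driving the isolation flushes the list, under zone->lock.
 */

/* Caller holds zone->lock. */
static void free_isolated_page(struct zone *zone, struct page *page,
			       unsigned int order)
{
	set_page_private(page, order);	/* remember the order for the flush */
	list_add(&page->lru, &zone->isolated_free_list);
}

/* Called by the isolating CPU once it decides flushing is safe. */
static void flush_isolated_free_list(struct zone *zone,
				     unsigned long start_pfn,
				     unsigned long end_pfn)
{
	struct page *page, *next;
	unsigned long flags, pfn;

	spin_lock_irqsave(&zone->lock, flags);
	list_for_each_entry_safe(page, next, &zone->isolated_free_list, lru) {
		pfn = page_to_pfn(page);
		if (pfn < start_pfn || pfn >= end_pfn)
			continue;	/* another isolation's page */
		list_del(&page->lru);
		/* merge into the buddy lists as usual */
		__free_one_page(page, pfn, zone, page_private(page),
				MIGRATE_ISOLATE);
	}
	spin_unlock_irqrestore(&zone->lock, flags);
}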
> Anyway, my idea's key point is using PageIsolated() to distinguish
> isolated pages, instead of using PageBuddy(). If a page is PageIsolated(),
Is PageIsolated a completely new page flag? Those are a limited resource, so I
would expect some resistance to such an approach. Or a new special
page->_mapcount value? That could maybe work.
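
For reference, PageBuddy() is already implemented this way, as a magic
page->_mapcount value (PAGE_BUDDY_MAPCOUNT_VALUE), so a PageIsolated()
marker could mirror it. A sketch, with PAGE_ISOLATED_MAPCOUNT_VALUE a
hypothetical new constant:

/*
 * Sketch: mirror the PageBuddy() convention, which marks a free page
 * by storing a magic value in the otherwise unused page->_mapcount.
 * A second magic value marks isolated freepages without consuming a
 * page flag.  The value below is hypothetical.
 */
#define PAGE_ISOLATED_MAPCOUNT_VALUE	(-256)

static inline int PageIsolated(struct page *page)
{
	return atomic_read(&page->_mapcount) == PAGE_ISOLATED_MAPCOUNT_VALUE;
}

static inline void __SetPageIsolated(struct page *page)
{
	VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
	atomic_set(&page->_mapcount, PAGE_ISOLATED_MAPCOUNT_VALUE);
}

static inline void __ClearPageIsolated(struct page *page)
{
	VM_BUG_ON_PAGE(!PageIsolated(page), page);
	atomic_set(&page->_mapcount, -1);
}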
> it isn't handled as a freepage although it is in the buddy allocator. During
> free, a page with MIGRATE_ISOLATE will be marked PageIsolated() and
> won't be merged or counted as a freepage.
OK. Preventing wrong merging is the key point and this should work.
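
To spell out why: the merge loop in __free_one_page() only proceeds while
page_is_buddy() holds for the buddy, and page_is_buddy() requires
PageBuddy(); a buddy marked PageIsolated() fails the test and ends the
loop. Simplified from the current merge loop (guard-page handling
omitted):

	while (order < MAX_ORDER-1) {
		buddy_idx = __find_buddy_index(page_idx, order);
		buddy = page + (buddy_idx - page_idx);
		/* a PageIsolated() buddy is not PageBuddy(): stop here */
		if (!page_is_buddy(page, buddy, order))
			break;
		list_del(&buddy->lru);
		zone->free_area[order].nr_free--;
		rmv_page_order(buddy);
		combined_idx = buddy_idx & page_idx;
		page = page + (combined_idx - page_idx);
		page_idx = combined_idx;
		order++;
	}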
> When we move pages from the normal buddy list to the isolate buddy
> list, we check PageBuddy() and subtract the number of PageBuddy() pages
Do we really need to check PageBuddy()? Could a page get marked as
PageIsolated() but still go to the normal list instead of the isolate list?
> from the freepage count. Then we change the page from PageBuddy() to
> PageIsolated(), since it is handled as an isolated page at this point. In
> this way, the freepage count will be correct.
>
> Unisolation can be done by a similar approach.
>
> I made a prototype of this approach, and it is less intrusive to the core
> allocator than my previous patchset.
>
> Make sense?
I think so :)
> Thanks.
>
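
Putting the accounting described above together, a rough sketch of the
isolation step (the helper name and __SetPageIsolated() are hypothetical,
and this is not Joonsoo's actual prototype):

/*
 * Sketch: when isolating a pageblock, convert the freepages already on
 * the normal buddy lists.  Only PageBuddy() pages are counted; each is
 * re-marked PageIsolated() so it no longer merges or counts as a
 * freepage.  Unisolation would do the reverse.  Caller holds zone->lock.
 */
static long isolate_freepages_range_sketch(struct zone *zone,
					   unsigned long start_pfn,
					   unsigned long end_pfn)
{
	unsigned long pfn = start_pfn;
	long nr_isolated = 0;

	while (pfn < end_pfn) {
		struct page *page = pfn_to_page(pfn);
		unsigned int order;

		if (!PageBuddy(page)) {
			pfn++;
			continue;
		}
		order = page_order(page);
		list_move(&page->lru,
			  &zone->free_area[order].free_list[MIGRATE_ISOLATE]);
		rmv_page_order(page);		/* clears PageBuddy() */
		__SetPageIsolated(page);	/* hypothetical marker */
		set_page_private(page, order);	/* keep order for unisolation */
		nr_isolated += 1UL << order;
		pfn += 1UL << order;
	}
	/* the isolated pages no longer count as free */
	__mod_zone_page_state(zone, NR_FREE_PAGES, -nr_isolated);
	return nr_isolated;
}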
Thread overview: 41+ messages
2014-07-04 7:57 [PATCH 00/10] fix freepage count problems due to memory isolation Joonsoo Kim
2014-07-04 7:57 ` [PATCH 01/10] mm/page_alloc: remove unlikely macro on free_one_page() Joonsoo Kim
2014-07-04 12:03 ` Vlastimil Babka
2014-07-07 4:58 ` Joonsoo Kim
2014-07-04 12:52 ` Michal Nazarewicz
2014-07-04 12:53 ` Michal Nazarewicz
2014-07-04 7:57 ` [PATCH 02/10] mm/page_alloc: correct to clear guard attribute in DEBUG_PAGEALLOC Joonsoo Kim
2014-07-07 14:50 ` Vlastimil Babka
2014-07-04 7:57 ` [PATCH 03/10] mm/page_alloc: handle page on pcp correctly if it's pageblock is isolated Joonsoo Kim
2014-07-07 8:25 ` Gioh Kim
2014-07-08 7:18 ` Vlastimil Babka
2014-07-07 15:19 ` Vlastimil Babka
2014-07-14 6:24 ` Joonsoo Kim
2014-07-04 7:57 ` [PATCH 04/10] mm/page_alloc: carefully free the page on isolate pageblock Joonsoo Kim
2014-07-07 15:43 ` Vlastimil Babka
2014-07-04 7:57 ` [PATCH 05/10] mm/page_alloc: optimize and unify pageblock migratetype check in free path Joonsoo Kim
2014-07-07 15:50 ` Vlastimil Babka
2014-07-14 6:28 ` Joonsoo Kim
2014-07-04 7:57 ` [PATCH 06/10] mm/page_alloc: separate freepage migratetype interface Joonsoo Kim
2014-07-04 7:57 ` [PATCH 07/10] mm/page_alloc: store migratetype of the buddy list into freepage correctly Joonsoo Kim
2014-07-04 7:57 ` [PATCH 08/10] mm/page_alloc: use get_onbuddy_migratetype() to get buddy list type Joonsoo Kim
2014-07-07 15:57 ` Vlastimil Babka
2014-07-08 1:01 ` Gioh Kim
2014-07-08 7:23 ` Vlastimil Babka
2014-07-14 6:34 ` Joonsoo Kim
2014-07-04 7:57 ` [PATCH 09/10] mm/page_alloc: fix possible wrongly calculated freepage counter Joonsoo Kim
2014-07-04 7:57 ` [PATCH 10/10] mm/page_alloc: Stop merging pages on non-isolate and isolate buddy list Joonsoo Kim
2014-07-04 15:33 ` [PATCH 00/10] fix freepage count problems due to memory isolation Vlastimil Babka
2014-07-07 4:49 ` Joonsoo Kim
2014-07-07 14:33 ` Vlastimil Babka
2014-07-14 6:22 ` Joonsoo Kim
2014-07-14 9:49 ` Vlastimil Babka
2014-07-15 8:28 ` Joonsoo Kim
2014-07-15 8:36 ` Vlastimil Babka
2014-07-15 9:39 ` Joonsoo Kim
2014-07-15 10:00 ` Peter Zijlstra
2014-07-16 8:44 ` Joonsoo Kim
2014-07-16 8:43 ` Joonsoo Kim
2014-07-16 11:14 ` Vlastimil Babka [this message]
2014-07-17 6:12 ` Joonsoo Kim
2014-07-17 9:14 ` Vlastimil Babka