Allow migration of mlocked page?

linux-mm.kvack.org archive mirror

* Allow migration of mlocked page?
@ 2012-05-11  4:37 Minchan Kim
  2012-05-11  9:20 ` Peter Zijlstra
  2012-05-11 13:14 ` Mel Gorman
  0 siblings, 2 replies; 38+ messages in thread
From: Minchan Kim @ 2012-05-11  4:37 UTC (permalink / raw)
  To: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Peter Zijlstra
  Cc: Theodore Ts'o

Let's open a new thread.

On 05/11/2012 11:51 AM, KOSAKI Motohiro wrote:
> (5/10/12 8:50 PM), Minchan Kim wrote:
>> Hi KOSAKI,
>>
>> On 05/11/2012 02:53 AM, KOSAKI Motohiro wrote:
>>
>>>>>> Let's assume that an application wants to allocate a user-space memory
>>>>>> region using malloc() and then write something to the region. As you
>>>>>> may know, a user-space buffer doesn't have real physical pages right
>>>>>> after the malloc() call, so if the user tries to access the region the
>>>>>> page fault handler is triggered
>>>>>
>>>>>
>>>>> Understood.
>>>>>
>>>>>> and then in turn next process like swap in to fill physical frame
>>>>>> number into entry of the page faulted.
>>>>>
>>>>>
>>>>> Sorry, I can't understand your point due to my poor English.
>>>>> Could you rewrite it easily? :)
>>>>>
>>>>
>>>> Simply put, handle_mm_fault would be called to update the pte after
>>>> finding the vma and checking access rights. And as you know, there are
>>>> many cases of page fault processing, such as COW or demand paging.
>>>
>>> Hmm. If I understand correctly, you guys misunderstand mlock. It neither
>>> pins pages nor prevents pfn changes. It only guarantees that pages are
>>> not swapped out. E.g.
>>
>>
>> From a semantic point of view you're right, but the implementation
>> effectively pins pages.
>>
>>> the memory compaction
>>> feature may automatically change a page's physical address.
>>
>>
>> I tried it last year but decided to drop it because of the realtime issue.
>> https://lkml.org/lkml/2011/8/29/295
>>
>> So I think mlock is a kind of page pinning. If some other place that I
>> haven't noticed is migrating them, that place should be fixed.
>> Or my above patch should go ahead.
> 
> Thanks for pointing that out. I didn't realize your patch wasn't merged.
> I think it should go ahead. Think of the autonuma case:
> if mlock disables autonuma migration, that's a bug.  I don't think we can

"Bug" is rather exaggerated. It's just more overhead.

> promise that mlock doesn't change the physical page.
> I wonder if any realtime guys think page migration is a free lunch. They
> should disable both auto migration and compaction.

I think disabling migration is overkill. We can do better than that.
Quoting myself from last year's discussion:

"
We can partly solve that with another approach if it's really a problem
for RT processes. The other approach is to separate mlocked pages
at allocation time, like the pseudo patch below (which just shows the
concept).

ex)
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 3a93f73..8ae2e60 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -175,7 +175,8 @@ static inline struct page *
 alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
                                        unsigned long vaddr)
 {
-       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
+       gfp_t gfp_flag = vma->vm_flags & VM_LOCKED ? 0 : __GFP_MOVABLE;
+       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
 }

But that only covers newly allocated pages in an mlocked VMA.
Old pages in the VMA are still a problem.
We can solve that at mlock system call time by migrating the pages to
an UNMOVABLE block.
"
This would be a way to improve compaction/CMA: compaction would not migrate an
UNMOVABLE page group that is filled with unevictable pages, so mlocked pages
would stay pinned. But get_user_pages in drivers is still a problem. Alternatively,
we could migrate unevictable pages too, which would help compaction/CMA a lot, but
we would lose the pinning concept (it would break what the mlock man page says
about real-time applications). Hmm.

> 
> And think about an application that explicitly uses migrate_pages(2), or
> admins who use cpusets. Driver code can't assume such a scenario
> doesn't occur, yes?

Yes. Those seem to migrate mlocked pages now. Hmm,
Johannes, Mel.
Why should we be unfair to compaction only?

I hope to hear opinions from the rt guys, too.
Thanks.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: Allow migration of mlocked page?
  2012-05-11  4:37 Allow migration of mlocked page? Minchan Kim
@ 2012-05-11  9:20 ` Peter Zijlstra
  2012-05-11 16:20   ` Christoph Lameter
  2012-05-14  4:13   ` Minchan Kim
  2012-05-11 13:14 ` Mel Gorman
  1 sibling, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-11  9:20 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Fri, 2012-05-11 at 13:37 +0900, Minchan Kim wrote:
> I hope hear opinion from rt guys, too.

It's a problem yes, not sure your solution is any good though. As it
stands mlock() simply doesn't guarantee no faults, all it does is
guarantee no major faults.

Are you saying compaction doesn't actually move mlocked pages? I'm
somewhat surprised by that, I've always assumed it would.

It's sad that mlock() doesn't take a flags argument, so I'd rather
introduce a new madvise() flag for -rt, something like MADV_UNMOVABLE
(or whatever) which will basically copy the pages to an un-movable page
block and really pin the things.

That way mlock() can stay what the spec says it is and guarantee
residency.


* Re: Allow migration of mlocked page?
  2012-05-11  9:20 ` Peter Zijlstra
@ 2012-05-11 16:20   ` Christoph Lameter
  2012-05-11 23:24     ` KOSAKI Motohiro
  2012-05-14  4:13   ` Minchan Kim
  1 sibling, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-11 16:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Minchan Kim, Johannes Weiner, Mel Gorman, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Fri, 11 May 2012, Peter Zijlstra wrote:

> On Fri, 2012-05-11 at 13:37 +0900, Minchan Kim wrote:
> > I hope hear opinion from rt guys, too.
>
> Its a problem yes, not sure your solution is any good though. As it
> stands mlock() simply doesn't guarantee no faults, all it does is
> guarantee no major faults.

There are two different ways to lock pages down in memory that have
different counters in /proc/<pid>/status and also different semantics.

VmLck: Mlocked pages. This means there is a prohibition against evicting
pages. These pages can undergo page migration and therefore also be
handled by compaction. These pages have PG_mlocked set.

VmPin: Pinned pages. Page cannot be moved. These pages have an elevated
refcount that makes page migration fail.



* Re: Allow migration of mlocked page?
  2012-05-11 16:20   ` Christoph Lameter
@ 2012-05-11 23:24     ` KOSAKI Motohiro
  2012-05-14 13:45       ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: KOSAKI Motohiro @ 2012-05-11 23:24 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Minchan Kim, Johannes Weiner, Mel Gorman,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o,
	kosaki.motohiro

(5/11/12 12:20 PM), Christoph Lameter wrote:
> On Fri, 11 May 2012, Peter Zijlstra wrote:
>
>> On Fri, 2012-05-11 at 13:37 +0900, Minchan Kim wrote:
>>> I hope hear opinion from rt guys, too.
>>
>> Its a problem yes, not sure your solution is any good though. As it
>> stands mlock() simply doesn't guarantee no faults, all it does is
>> guarantee no major faults.
>
> There are two different ways to lock pages down in memory that have
> different counters in /proc/<pid>/status and also different semantics.
>
> VmLck: Mlocked pages. This means there is a prohibition against evicting
> pages. These pages can undergo page migration and therefore also be
> handled by compaction. These pages have PG_mlocked set.
>
> VmPin: Pinned pages. Page cannot be moved. These pages have an elevated
> refcount that makes page migration fail.

I don't see a VmPin counter on my box. Did you introduce this one recently?



* Re: Allow migration of mlocked page?
  2012-05-11 23:24     ` KOSAKI Motohiro
@ 2012-05-14 13:45       ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 13:45 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Peter Zijlstra, Minchan Kim, Johannes Weiner, Mel Gorman,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Fri, 11 May 2012, KOSAKI Motohiro wrote:

> I don't see VmPin counter in my box. Did you introduce this one recently?

Yes I think it was 3.3 or 3.2


* Re: Allow migration of mlocked page?
  2012-05-11  9:20 ` Peter Zijlstra
  2012-05-11 16:20   ` Christoph Lameter
@ 2012-05-14  4:13   ` Minchan Kim
  2012-05-14  6:56     ` Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Minchan Kim @ 2012-05-14  4:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On 05/11/2012 06:20 PM, Peter Zijlstra wrote:

> On Fri, 2012-05-11 at 13:37 +0900, Minchan Kim wrote:
>> I hope hear opinion from rt guys, too.
> 
> Its a problem yes, not sure your solution is any good though. As it
> stands mlock() simply doesn't guarantee no faults, all it does is
> guarantee no major faults.


I can't find such a definition in the man pages:
"
       Real-time  processes  that are using mlockall() to prevent delays on page faults should
       reserve enough locked stack pages before entering the time-critical section, so that no
       page fault can be caused by function calls
"
So I didn't expect that. Is your definition the commonly accepted one for server RT?
At least the embedded guys didn't expect it.


> 
> Are you saying compaction doesn't actually move mlocked pages? I'm


Yes.

> somewhat surprised by that, I've always assumed it would.


It seems everyone assumed it.

> 
> Its sad that mlock() doesn't take a flags argument, so I'd rather
> introduce a new madvise() flag for -rt, something like MADV_UNMOVABLE
> (or whatever) which will basically copy the pages to an un-movable page
> block and really pin the things.


1) We don't have spare vm_flags bits on 32-bit machines. Konstantin
   has sorted that out, but I'm not sure it's merged. Anyway, okay, it shouldn't be a problem.

2) It needs application fixes and, as Mel said, we might get new bug reports about latency.
   Doesn't it break the current mlock semantics? - " no page fault can be caused by function calls"
   Otherwise, we should fix the man page along the lines you describe -   "no major page fault can be caused by function calls"

 

> That way mlock() can stay what the spec says it is and guarantee
> residency.




-- 
Kind regards,
Minchan Kim


* Re: Allow migration of mlocked page?
  2012-05-14  4:13   ` Minchan Kim
@ 2012-05-14  6:56     ` Peter Zijlstra
  2012-05-14  7:37       ` Minchan Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-14  6:56 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Mon, 2012-05-14 at 13:13 +0900, Minchan Kim wrote:
> On 05/11/2012 06:20 PM, Peter Zijlstra wrote:
> 
> > On Fri, 2012-05-11 at 13:37 +0900, Minchan Kim wrote:
> >> I hope hear opinion from rt guys, too.
> > 
> > Its a problem yes, not sure your solution is any good though. As it
> > stands mlock() simply doesn't guarantee no faults, all it does is
> > guarantee no major faults.
> 
> 
> I can't find such definition from man pages
> "
>        Real-time  processes  that are using mlockall() to prevent delays on page faults should
>        reserve enough locked stack pages before entering the time-critical section, so that no
>        page fault can be caused by function calls
> "
> So I didn't expect it. Is your definition popular available on server RT?
> At least, embedded guys didn't expect it.

Sod the manpage, the opengroup.org definition only states the page will
not be paged-out.

  http://pubs.opengroup.org/onlinepubs/009604599/functions/mlock.html

It only states: 'shall be memory resident' that very much implies no
major faults. But I cannot make that mean no minor faults.


Also, no clue what the userspace guys know or think to know, in my
experience they get it wrong anyway, regardless of what the manpage/spec
says.

But I've been telling the -rt folks for a long while now that mlock only
guarantees no major faults (although apparently that's not entirely true
with current kernels, but see below).

> > Its sad that mlock() doesn't take a flags argument, so I'd rather
> > introduce a new madvise() flag for -rt, something like MADV_UNMOVABLE
> > (or whatever) which will basically copy the pages to an un-movable page
> > block and really pin the things.
> 
> 
> 1) We don't have space of vm_flags in 32bit machine and Konstantin
>    have sorted out but not sure it's merged. Anyway, Okay. It couldn't be a problem.

Or we just make the thing u64... :-)

> 2) It needs application's fix and as Mel said, we might get new bug reports about latency.
>    Doesn't it break current mlock semantic? - " no page fault can be caused by function calls"
>    Otherwise, we should fix man page like your saying -   "no major page fault can be caused by function calls" 

Well, if you look at v2.6.18:mm/rmap.c it would actually migrate mlocked
pages (which is what I remembered):

        if (!migration && ((vma->vm_flags & VM_LOCKED) ||
                        (ptep_clear_flush_young(vma, address, pte)))) {
                ret = SWAP_FAIL;
                goto out_unmap;
        }

So somewhere someone changed mlock() semantics already.

But yes, its going to cause pain whichever way around.


* Re: Allow migration of mlocked page?
  2012-05-14  6:56     ` Peter Zijlstra
@ 2012-05-14  7:37       ` Minchan Kim
  2012-05-14  7:45         ` Peter Zijlstra
  2012-05-14 13:47         ` Christoph Lameter
  0 siblings, 2 replies; 38+ messages in thread
From: Minchan Kim @ 2012-05-14  7:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On 05/14/2012 03:56 PM, Peter Zijlstra wrote:

> On Mon, 2012-05-14 at 13:13 +0900, Minchan Kim wrote:
>> On 05/11/2012 06:20 PM, Peter Zijlstra wrote:
>>
>>> On Fri, 2012-05-11 at 13:37 +0900, Minchan Kim wrote:
>>>> I hope hear opinion from rt guys, too.
>>>
>>> Its a problem yes, not sure your solution is any good though. As it
>>> stands mlock() simply doesn't guarantee no faults, all it does is
>>> guarantee no major faults.
>>
>>
>> I can't find such definition from man pages
>> "
>>        Real-time  processes  that are using mlockall() to prevent delays on page faults should
>>        reserve enough locked stack pages before entering the time-critical section, so that no
>>        page fault can be caused by function calls
>> "
>> So I didn't expect it. Is your definition popular available on server RT?
>> At least, embedded guys didn't expect it.
> 
> Sod the manpage, the opengroup.org definition only states the page will
> not be paged-out.
> 
>   http://pubs.opengroup.org/onlinepubs/009604599/functions/mlock.html
> 
> It only states: 'shall be memory resident' that very much implies no
> major faults. But I cannot make that mean no minor faults.


Yes and I saw this
'Upon successful return from mlock(), pages in the specified range shall be locked and memory-resident' 
It said "locked and memory-resident".

What's the meaning of "locked"? Isn't it pinning?

> 
> 
> Also, no clue what the userspace guys know or think to know, in my
> experience they get it wrong anyway, regardless of what the manpage/spec
> says.
> 
> But I've been telling the -rt folks for a long while that mlock only
> guarantees no major faults for a while now (although apparently that's
> not entirely true with current kernels, but see below).
> 
>>> Its sad that mlock() doesn't take a flags argument, so I'd rather
>>> introduce a new madvise() flag for -rt, something like MADV_UNMOVABLE
>>> (or whatever) which will basically copy the pages to an un-movable page
>>> block and really pin the things.
>>
>>
>> 1) We don't have space of vm_flags in 32bit machine and Konstantin
>>    have sorted out but not sure it's merged. Anyway, Okay. It couldn't be a problem.
> 
> Or we just make the thing u64... :-)
> 
>> 2) It needs application's fix and as Mel said, we might get new bug reports about latency.
>>    Doesn't it break current mlock semantic? - " no page fault can be caused by function calls"
>>    Otherwise, we should fix man page like your saying -   "no major page fault can be caused by function calls" 
> 
> Well, if you look at v2.6.18:mm/rmap.c it would actually migrate mlocked
> pages (which is what I remembered):
> 
>         if (!migration && ((vma->vm_flags & VM_LOCKED) ||
>                         (ptep_clear_flush_young(vma, address, pte)))) {
>                 ret = SWAP_FAIL;
>                 goto out_unmap;
>         }
> 
> So somewhere someone changed mlock() semantics already.


Yes. migrate_pages, cpuset_migrate_mm and memcg already seem to break it.
I think those are all done under the user's control, while compaction happens regardless of the user's intention.
I'm not sure they can be excused even though they are done under the user's control. :(


> 
> But yes, its going to cause pain whichever way around.



-- 
Kind regards,
Minchan Kim


* Re: Allow migration of mlocked page?
  2012-05-14  7:37       ` Minchan Kim
@ 2012-05-14  7:45         ` Peter Zijlstra
  2012-05-14  7:49           ` Peter Zijlstra
  2012-05-14 13:47         ` Christoph Lameter
  1 sibling, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-14  7:45 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Mon, 2012-05-14 at 16:37 +0900, Minchan Kim wrote:
> What's the meaning of "locked"? Isn't it pinning?

It doesn't say, the best inference I can make is that locked means the
effect of mlock() which is defined as: 'to be memory-resident', esp. so
since it then states: 'until unlocked' (or exit/exec).

So basically the statement: 'locked and memory-resident' is redundant.



* Re: Allow migration of mlocked page?
  2012-05-14  7:45         ` Peter Zijlstra
@ 2012-05-14  7:49           ` Peter Zijlstra
  2012-05-14  7:54             ` Minchan Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-14  7:49 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Mon, 2012-05-14 at 09:45 +0200, Peter Zijlstra wrote:
> On Mon, 2012-05-14 at 16:37 +0900, Minchan Kim wrote:
> > What's the meaning of "locked"? Isn't it pinning?
> 
> It doesn't say, the best inference I can make is that locked means the
> effect of mlock() which is defined as: 'to be memory-resident', esp. so
> since it then states: 'until unlocked' (or exit/exec).
> 
> So basically the statement: 'locked and memory-resident' is redundant.

An alternative interpretation of that statement is that mlock() should
keep pages memory-resident, but also make them memory-resident. IE, it
should fault the entire range in before returning the system-call.



* Re: Allow migration of mlocked page?
  2012-05-14  7:49           ` Peter Zijlstra
@ 2012-05-14  7:54             ` Minchan Kim
  0 siblings, 0 replies; 38+ messages in thread
From: Minchan Kim @ 2012-05-14  7:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Johannes Weiner, Mel Gorman, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On 05/14/2012 04:49 PM, Peter Zijlstra wrote:

> On Mon, 2012-05-14 at 09:45 +0200, Peter Zijlstra wrote:
>> On Mon, 2012-05-14 at 16:37 +0900, Minchan Kim wrote:
>>> What's the meaning of "locked"? Isn't it pinning?
>>
>> It doesn't say, the best inference I can make is that locked means the
>> effect of mlock() which is defined as: 'to be memory-resident', esp. so
>> since it then states: 'until unlocked' (or exit/exec).
>>
>> So basically the statement: 'locked and memory-resident' is redundant.
> 
> And alternative interpretation of that statement is that mlock() whould
> keep pages memory-resident, but also make them memory-resident. IE, it
> should fault the entire range in before returning the system-call.
> 


Fair enough.

Thanks.

-- 
Kind regards,
Minchan Kim


* Re: Allow migration of mlocked page?
  2012-05-14  7:37       ` Minchan Kim
  2012-05-14  7:45         ` Peter Zijlstra
@ 2012-05-14 13:47         ` Christoph Lameter
  2012-05-15  1:23           ` Minchan Kim
  1 sibling, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 13:47 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Peter Zijlstra, Johannes Weiner, Mel Gorman, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, Thomas Gleixner, Ingo Molnar,
	Theodore Ts'o, hugh.dickins@tiscali.co.uk

On Mon, 14 May 2012, Minchan Kim wrote:

> What's the meaning of "locked"? Isn't it pinning?

No. We agreed on that a long time ago when the page migration logic was
first merged. Mlock only means memory resident.

Hugh pushed for it initially.


* Re: Allow migration of mlocked page?
  2012-05-14 13:47         ` Christoph Lameter
@ 2012-05-15  1:23           ` Minchan Kim
  2012-05-15 11:07             ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Minchan Kim @ 2012-05-15  1:23 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Johannes Weiner, Mel Gorman, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, Thomas Gleixner, Ingo Molnar,
	Theodore Ts'o, hugh.dickins@tiscali.co.uk

On 05/14/2012 10:47 PM, Christoph Lameter wrote:

> On Mon, 14 May 2012, Minchan Kim wrote:
> 
>> What's the meaning of "locked"? Isn't it pinning?
> 
> No. We agreed to that a long time ago when the page migration logic was
> first merged. Mlock only means memory resident.


I realized that through Peter's link to opengroup.
Hmm, the problem is that it's not consistent with the man page, which says "no fault happens".
So many developers have been using it in the sense of "making sure latency". :(

> 
> Hugh pushed for it initially.




-- 
Kind regards,
Minchan Kim


* Re: Allow migration of mlocked page?
  2012-05-15  1:23           ` Minchan Kim
@ 2012-05-15 11:07             ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-15 11:07 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Christoph Lameter, Johannes Weiner, Mel Gorman, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, Thomas Gleixner, Ingo Molnar,
	Theodore Ts'o, hugh.dickins@tiscali.co.uk

On Tue, 2012-05-15 at 10:23 +0900, Minchan Kim wrote:
> So many developers have been used it by meaning of "making sure latency". :(

Many developers do many crazy things.. many use sched_yield() for
instance. Doesn't make it right though.


* Re: Allow migration of mlocked page?
  2012-05-11  4:37 Allow migration of mlocked page? Minchan Kim
  2012-05-11  9:20 ` Peter Zijlstra
@ 2012-05-11 13:14 ` Mel Gorman
  2012-05-11 23:25   ` KOSAKI Motohiro
  2012-05-14  4:25   ` Minchan Kim
  1 sibling, 2 replies; 38+ messages in thread
From: Mel Gorman @ 2012-05-11 13:14 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Johannes Weiner, Rik van Riel, Andrew Morton, Andrea Arcangeli,
	KAMEZAWA Hiroyuki, Christoph Lameter, linux-mm@kvack.org, tglx,
	Ingo Molnar, Peter Zijlstra, Theodore Ts'o

On Fri, May 11, 2012 at 01:37:26PM +0900, Minchan Kim wrote:
> > <SNIP>
> > promise mlock don't change physical page.
> > I wonder if any realtime guys page migration is free lunch. they should
> > disable both auto migration and compaction.
> 
> I think disable migration is overkill. We can do better than it.

The reason why we do not migrate mlock() pages is down to expectations of the
application developer.  mlock historically was a real-time extension. For
files, there is no guarantee of latency because obviously things like
writing to the page can stall in balance_dirty_pages() but for anonymous
memory, there is an expectation that access be low or zero latency. This
would be particularly true if they used something like MAP_POPULATE.

> Quote from discussion last year from me.
> 
> "
> We can solve a bit that by another approach if it's really problem
> with RT processes. The another approach is to separate mlocked pages
> with allocation time like below pseudo patch which just show the
> concept)
> 
> ex)
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 3a93f73..8ae2e60 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -175,7 +175,8 @@ static inline struct page *
>  alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>                                         unsigned long vaddr)
>  {
> -       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
> +       gfp_t gfp_flag = vma->vm_flags & VM_LOCKED ? 0 : __GFP_MOVABLE;
> +       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
>  }
> 
> But it's a solution about newly allocated page on mlocked vma.
> Old pages in the VMA is still a problem.

Yes.

> We can solve it at mlock system call through migrating the pages to
> UNMOVABLE block.

Combining the two would be suitable because once mlock returns, any mapped
page is locked in place and future allocations will be placed suitably. I'd
also be ok allowing file-backed mlocked pages to be migrated on the grounds
that no assumptions can be made about access latency anyway.

> "
> It would be a solution to enhance compaction/CMA and we can make that compaction doesn't migrate
> UNMOVABLE_PAGE_GROUP which make full by unevictable pages so mlocked page is still pinning page.
> But get_user_pages in drivers still a problem. Or we can migrate unevictable pages, too so that
> compaction/CMA would be good much but we lost pinning concept(It would break man page of mlocked
> about real-time application stuff). Hmm.
> 
> > 
> > And, think if application explictly use migrate_pages(2) or admins uses
> > cpusets. driver code can't assume such scenario
> > doesn't occur, yes?
> 
> Yes. it seems to migrate mlocked page now.
> Hmm,
> Johannes, Mel.
> Why should we be unfair on only compaction?
> 

If CMA decide they want to alter mlocked pages in this way, it's sort of
ok. While CMA is being used, there are no expectations on the RT
behaviour of the system - stalls are expected. In their use cases, CMA
failing is far worse than access latency to an mlocked page being
variable while CMA is running.

Compaction on the other hand is during the normal operation of the
machine. There are applications that assume that if anonymous memory
is mlocked() then access to it is close to zero latency. They are
not RT-critical processes (or they would disable THP) but depend on
this. Allowing compaction to migrate mlocked() pages will result in bugs
being reported by these people.

I've received one bug this year about access latency to mlocked() regions but
it turned out to be a file-backed region and related to when the write-fault
is incurred. The ultimate fix was in the application but we'll get new bug
reports if anonymous mlocked pages do not preserve the current guarantees
on access latency.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-11 13:14 ` Mel Gorman
@ 2012-05-11 23:25   ` KOSAKI Motohiro
  2012-05-14 13:32     ` Mel Gorman
  2012-05-14  4:25   ` Minchan Kim
  1 sibling, 1 reply; 38+ messages in thread
From: KOSAKI Motohiro @ 2012-05-11 23:25 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, Johannes Weiner, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Peter Zijlstra,
	Theodore Ts'o, kosaki.motohiro

(5/11/12 9:14 AM), Mel Gorman wrote:
> On Fri, May 11, 2012 at 01:37:26PM +0900, Minchan Kim wrote:
>>> <SNIP>
>>> promise mlock don't change physical page.
>>> I wonder if any realtime guys page migration is free lunch. they should
>>> disable both auto migration and compaction.
>>
>> I think disable migration is overkill. We can do better than it.
>
> The reason why we do not migrate mlock() pages is down to expectations of the
> application developer.  mlock historically was a real-time extention. For
> files, there is no guarantee of latency because obviously things like
> writing to the page can stall in balance_dirty_pages() but for anonymous
> memory, there is an expectation that access be low or zero latency. This
> would be particularly true if they used something like MAP_POPULATE.
>
>> Quote from discussion last year from me.
>>
>> "
>> We can solve a bit that by another approach if it's really problem
>> with RT processes. The another approach is to separate mlocked pages
>> with allocation time like below pseudo patch which just show the
>> concept)
>>
>> ex)
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 3a93f73..8ae2e60 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -175,7 +175,8 @@ static inline struct page *
>>   alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>>                                          unsigned long vaddr)
>>   {
>> -       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
>> +       gfp_t gfp_flag = vma->vm_flags & VM_LOCKED ? 0 : __GFP_MOVABLE;
>> +       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
>>   }
>>
>> But it's a solution about newly allocated page on mlocked vma.
>> Old pages in the VMA is still a problem.
>
> Yes.

I disagree. __GFP_MOVABLE is part of the zone mask. Therefore, turning off
__GFP_MOVABLE will break memory hotplug, and mlock may easily invoke the
OOM killer.



>> We can solve it at mlock system call through migrating the pages to
>> UNMOVABLE block.
>
> Combining the two would be suitable because once mlock returns, any mapped
> page is locked in place and future allocations will be placed suitable. I'd
> also be ok allowing file-backed mlocked pages to be migrated on the grounds
> that no assumptions can be made about access latency anyway.
>
>> "
>> It would be a solution to enhance compaction/CMA and we can make that compaction doesn't migrate
>> UNMOVABLE_PAGE_GROUP which make full by unevictable pages so mlocked page is still pinning page.
>> But get_user_pages in drivers still a problem. Or we can migrate unevictable pages, too so that
>> compaction/CMA would be good much but we lost pinning concept(It would break man page of mlocked
>> about real-time application stuff). Hmm.
>>
>>>
>>> And, think if application explictly use migrate_pages(2) or admins uses
>>> cpusets. driver code can't assume such scenario
>>> doesn't occur, yes?
>>
>> Yes. it seems to migrate mlocked page now.
>> Hmm,
>> Johannes, Mel.
>> Why should we be unfair on only compaction?
>>
>
> If CMA decide they want to alter mlocked pages in this way, it's sortof
> ok. While CMA is being used, there are no expectations on the RT
> behaviour of the system - stalls are expected. In their use cases, CMA
> failing is far worse than access latency to an mlocked page being
> variable while CMA is running.

That's strange. A CMA caller can't know whether the altered page is mlocked
or not, and almost all CMA users are in the embedded world, i.e. the RT
realm. So I don't think CMA and compaction are significantly different.


> Compaction on the other hand is during the normal operation of the
> machine. There are applications that assume that if anonymous memory
> is mlocked() then access to it is close to zero latency. They are
> not RT-critical processes (or they would disable THP) but depend on
> this. Allowing compaction to migrate mlocked() pages will result in bugs
> being reported by these people.
>
> I've received one bug this year about access latency to mlocked() regions but
> it turned out to be a file-backed region and related to when the write-fault
> is incurred. The ultimate fix was in the application but we'll get new bug
> reports if anonymous mlocked pages do not preserve the current guarantees
> on access latency.

Can you please tell us your opinion about autonuma? I doubt we can keep such
an mlock guarantee. I think we need to suggest an application fix. Maybe
introducing MADV_UNMOVABLE is a good start; it seems to solve the autonuma
issue too.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-11 23:25   ` KOSAKI Motohiro
@ 2012-05-14 13:32     ` Mel Gorman
  2012-05-14 13:51       ` Peter Zijlstra
                         ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Mel Gorman @ 2012-05-14 13:32 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Johannes Weiner, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Peter Zijlstra,
	Theodore Ts'o

On Fri, May 11, 2012 at 07:25:59PM -0400, KOSAKI Motohiro wrote:
> >>diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> >>index 3a93f73..8ae2e60 100644
> >>--- a/include/linux/highmem.h
> >>+++ b/include/linux/highmem.h
> >>@@ -175,7 +175,8 @@ static inline struct page *
> >>  alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
> >>                                         unsigned long vaddr)
> >>  {
> >>-       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
> >>+       gfp_t gfp_flag = vma->vm_flags & VM_LOCKED ? 0 : __GFP_MOVABLE;
> >>+       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
> >>  }
> >>
> >>But it's a solution about newly allocated page on mlocked vma.
> >>Old pages in the VMA is still a problem.
> >
> >Yes.
> 
> I disagree. __GFP_MOVABLE is one of zone mask. therefore, To turn off __GFP_MOVABLE
> will break memory hotplug. mlock may easily invoke oom killer.
> 

Fair point.

> >>We can solve it at mlock system call through migrating the pages to
> >>UNMOVABLE block.
> >
> >Combining the two would be suitable because once mlock returns, any mapped
> >page is locked in place and future allocations will be placed suitable. I'd
> >also be ok allowing file-backed mlocked pages to be migrated on the grounds
> >that no assumptions can be made about access latency anyway.
> >
> >>"
> >>It would be a solution to enhance compaction/CMA and we can make that compaction doesn't migrate
> >>UNMOVABLE_PAGE_GROUP which make full by unevictable pages so mlocked page is still pinning page.
> >>But get_user_pages in drivers still a problem. Or we can migrate unevictable pages, too so that
> >>compaction/CMA would be good much but we lost pinning concept(It would break man page of mlocked
> >>about real-time application stuff). Hmm.
> >>
> >>>
> >>>And, think if application explictly use migrate_pages(2) or admins uses
> >>>cpusets. driver code can't assume such scenario
> >>>doesn't occur, yes?
> >>
> >>Yes. it seems to migrate mlocked page now.
> >>Hmm,
> >>Johannes, Mel.
> >>Why should we be unfair on only compaction?
> >>
> >
> >If CMA decide they want to alter mlocked pages in this way, it's sortof
> >ok. While CMA is being used, there are no expectations on the RT
> >behaviour of the system - stalls are expected. In their use cases, CMA
> >failing is far worse than access latency to an mlocked page being
> >variable while CMA is running.
> 
> That's strange. CMA caller can't know the altered page is under mlock or not.
> and almost all CMA user is in embedded world. ie RT realm.

Embedded does not imply realtime constraints.

> So, I don't think
> CMA and compaction are significantly different.
> 

CMA is used in cases such as a mobile phone needing to allocate a large
contiguous range of memory for video decoding. Compaction is used by
features such as THP with khugepaged potentially using it frequently on
x86-64 machines. The use cases are different and compaction is used by
THP a lot more than CMA is used by anything.

If compaction can move mlocked pages then khugepaged can introduce unexpected
latencies on mlocked anonymous regions of memory.

> >Compaction on the other hand is during the normal operation of the
> >machine. There are applications that assume that if anonymous memory
> >is mlocked() then access to it is close to zero latency. They are
> >not RT-critical processes (or they would disable THP) but depend on
> >this. Allowing compaction to migrate mlocked() pages will result in bugs
> >being reported by these people.
> >
> >I've received one bug this year about access latency to mlocked() regions but
> >it turned out to be a file-backed region and related to when the write-fault
> >is incurred. The ultimate fix was in the application but we'll get new bug
> >reports if anonymous mlocked pages do not preserve the current guarantees
> >on access latency.
> 
> Can you please tell us your opinion about autonuma?

I think it will have the same problem as THP using compaction. If
mlocked pages can move then there may be unexpected latencies accessing
mlocked anonymous regions.

> I doubt we can keep such
> mlock guarantee. I think we need to suggest application fix. maybe to introduce
> MADV_UNMOVABLE is good start. it seems to solve autonuma issue too.
> 

That'll regress existing applications. It would be preferable to me that
it be the other way around to not move mlocked pages unless the user says
it's allowed.

-- 
Mel Gorman
SUSE Labs


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 13:32     ` Mel Gorman
@ 2012-05-14 13:51       ` Peter Zijlstra
  2012-05-14 14:01         ` Christoph Lameter
  2012-05-14 14:08         ` Peter Zijlstra
  2012-05-14 23:06       ` KOSAKI Motohiro
  2012-05-15  1:35       ` Minchan Kim
  2 siblings, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-14 13:51 UTC (permalink / raw)
  To: Mel Gorman
  Cc: KOSAKI Motohiro, Minchan Kim, Johannes Weiner, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	Christoph Lameter, linux-mm@kvack.org, tglx, Ingo Molnar,
	Theodore Ts'o

On Mon, 2012-05-14 at 14:32 +0100, Mel Gorman wrote:

> Embedded does not imply realtime constraints.
> 
> > So, I don't think
> > CMA and compaction are significantly different.
> > 
> 
> CMA is used in cases such as a mobile phone needing to allocate a large
> contiguous range of memory for video decoding. Compaction is used by
> features such as THP with khugepaged potentially using it frequently on
> x86-64 machines. The use cases are different and compaction is used by
> THP a lot more than CMA is used by anything.
> 
> If compaction can move mlocked pages then khugepaged can introduce unexpected
> latencies on mlocked anonymous regions of memory.

I'd like to see CMA used for memcg and things as well, where we only
allocate the shadow page frames on-demand.

This moves CMA out of the crappy hardware-only section and should result
in pretty much everybody using it (except me, since I have cgroup=n).

Anyway, THP isn't an issue for -rt; it's impossible to select when you
have PREEMPT_RT.

> > >Compaction on the other hand is during the normal operation of the
> > >machine. There are applications that assume that if anonymous memory
> > >is mlocked() then access to it is close to zero latency. They are
> > >not RT-critical processes (or they would disable THP) but depend on
> > >this. Allowing compaction to migrate mlocked() pages will result in bugs
> > >being reported by these people.
> > >
> > >I've received one bug this year about access latency to mlocked() regions but
> > >it turned out to be a file-backed region and related to when the write-fault
> > >is incurred. The ultimate fix was in the application but we'll get new bug
> > >reports if anonymous mlocked pages do not preserve the current guarantees
> > >on access latency.
> > 
> > Can you please tell us your opinion about autonuma?
> 
> I think it will have the same problem as THP using compaction. If
> mlocked pages can move then there may be unexpected latencies accessing
> mlocked anonymous regions.

numa and rt don't mix anyway.. don't worry about that.

> > I doubt we can keep such
> > mlock guarantee. I think we need to suggest application fix. maybe to introduce
> > MADV_UNMOVABLE is good start. it seems to solve autonuma issue too.
> > 
> 
> That'll regress existing applications. It would be preferable to me that
> it be the other way around to not move mlocked pages unless the user says
> it's allowed.

I'd say go for it, I've been telling everybody who would listen that
mlock() only means no major faults for a very long time now.




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 13:51       ` Peter Zijlstra
@ 2012-05-14 14:01         ` Christoph Lameter
  2012-05-14 14:14           ` Peter Zijlstra
  2012-05-15  1:38           ` Minchan Kim
  2012-05-14 14:08         ` Peter Zijlstra
  1 sibling, 2 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 14:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mel Gorman, KOSAKI Motohiro, Minchan Kim, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Mon, 14 May 2012, Peter Zijlstra wrote:

> I'd say go for it, I've been telling everybody who would listen that
> mlock() only means no major faults for a very long time now.

We could introduce a new page flag PG_pinned (it already exists for Xen)
that would mean no faults on the page?

The situation with pinned pages is not clean right now because page count
increases should only signal temporary references to a page but subsystems
use an elevated page count to pin pages for good (f.e. Infiniband memory
registration). The reclaim logic has no way to differentiate between a
pinned page and a temporary reference count increase for page handling.

Therefore f.e. the page migration logic will repeatedly try to move the
page and always fail to account for all references.

A PG_pinned could allow us to make that distinction to avoid overhead in
the reclaim and page migration logic and also we could add some semantics
that avoid page faults.
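
The distinction can be illustrated with a toy model. This is only a sketch:
struct toy_page, its fields and toy_try_migrate() are hypothetical and are
not the kernel's struct page or migration code. It assumes that migration
compares the observed reference count against the count it can explain,
and that a PG_pinned-style marker would let it give up immediately instead
of retrying.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only; this is not the kernel's struct page. */
struct toy_page {
	int refcount;	/* all references, temporary or long-term */
	int mapcount;	/* references held by page-table mappings */
	bool pinned;	/* hypothetical PG_pinned-style marker */
};

/*
 * Returns 0 if the page could be moved, -EAGAIN if an unexplained
 * reference was found (the caller would normally retry), or -EBUSY if
 * the page is marked pinned and retrying is pointless.
 */
int toy_try_migrate(const struct toy_page *page)
{
	/* One reference for the isolating caller plus one per mapping. */
	int expected = 1 + page->mapcount;

	if (page->pinned)
		return -EBUSY;
	if (page->refcount != expected)
		return -EAGAIN;
	return 0;
}
```

Without the marker, a long-term pin taken by raising the reference count
looks the same as a transient reference, so the model keeps returning
-EAGAIN; with the marker it can stop after the first check.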


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 14:01         ` Christoph Lameter
@ 2012-05-14 14:14           ` Peter Zijlstra
  2012-05-14 14:43             ` Christoph Lameter
  2012-05-14 23:04             ` Roland Dreier
  2012-05-15  1:38           ` Minchan Kim
  1 sibling, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-14 14:14 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mel Gorman, KOSAKI Motohiro, Minchan Kim, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o, roland

On Mon, 2012-05-14 at 09:01 -0500, Christoph Lameter wrote:
> On Mon, 14 May 2012, Peter Zijlstra wrote:
> 
> > I'd say go for it, I've been telling everybody who would listen that
> > mlock() only means no major faults for a very long time now.
> 
> We could introduce a new page flag PG_pinned (it already exists for Xen)
> that would mean no faults on the page?
> 
> The situation with pinned pages is not clean right now because page count
> increases should only signal temporary references to a page but subsystems
> use an elevated page count to pin pages for good (f.e. Infiniband memory
> registration). The reclaim logic has no way to differentiate between a
> pinned page and a temporary reference count increase for page handling.
> 
> Therefore f.e. the page migration logic will repeatedly try to move the
> page and always fail to account for all references.
> 
> A PG_pinned could allow us to make that distinction to avoid overhead in
> the reclaim and page migration logic and also we could add some semantics
> that avoid page faults.

Either that or a VMA flag, I think both infiniband and whatever new
mlock API we invent will pretty much always be VMA wide. Or does the
infinimuck take random pages out? All I really know about IB is to stay
the #$%! away from it [as Mel recently learned the hard way] :-)


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 14:14           ` Peter Zijlstra
@ 2012-05-14 14:43             ` Christoph Lameter
  2012-05-14 22:52               ` KOSAKI Motohiro
  2012-05-14 23:04             ` Roland Dreier
  1 sibling, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-14 14:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mel Gorman, KOSAKI Motohiro, Minchan Kim, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o, roland

On Mon, 14 May 2012, Peter Zijlstra wrote:

> > A PG_pinned could allow us to make that distinction to avoid overhead in
> > the reclaim and page migration logic and also we could add some semantics
> > that avoid page faults.
>
> Either that or a VMA flag, I think both infiniband and whatever new
> mlock API we invent will pretty much always be VMA wide. Or does the
> infinimuck take random pages out? All I really know about IB is to stay
> the #$%! away from it [as Mel recently learned the hard way] :-)

Devices (also infiniband) register buffers allocated on the heap and
increase the page count of the pages. It's not VMA bound.

Creating a VMA flag would force device driver writers to break up VMAs I
think.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 14:43             ` Christoph Lameter
@ 2012-05-14 22:52               ` KOSAKI Motohiro
  0 siblings, 0 replies; 38+ messages in thread
From: KOSAKI Motohiro @ 2012-05-14 22:52 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Mel Gorman, Minchan Kim, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o, roland

On Mon, May 14, 2012 at 10:43 AM, Christoph Lameter <cl@linux.com> wrote:
> On Mon, 14 May 2012, Peter Zijlstra wrote:
>
>> > A PG_pinned could allow us to make that distinction to avoid overhead in
>> > the reclaim and page migration logic and also we could add some semantics
>> > that avoid page faults.
>>
>> Either that or a VMA flag, I think both infiniband and whatever new
>> mlock API we invent will pretty much always be VMA wide. Or does the
>> infinimuck take random pages out? All I really know about IB is to stay
>> the #$%! away from it [as Mel recently learned the hard way] :-)
>
> Devices (also infiniband) register buffers allocated on the heap and
> increase the page count of the pages. Its not VMA bound.
>
> Creating a VMA flag would force device driver writers to break up VMAs I
> think.

Why do you dislike VMA splitting so much? InfiniBand is usually used for
HPC (i.e. on 64-bit architectures), where the number of VMAs is not a big
matter. An IB registered buffer is usually not just one or two pages; it's
usually bigger.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 14:14           ` Peter Zijlstra
  2012-05-14 14:43             ` Christoph Lameter
@ 2012-05-14 23:04             ` Roland Dreier
  2012-05-15 14:27               ` Christoph Lameter
  1 sibling, 1 reply; 38+ messages in thread
From: Roland Dreier @ 2012-05-14 23:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Lameter, Mel Gorman, KOSAKI Motohiro, Minchan Kim,
	Johannes Weiner, Rik van Riel, Andrew Morton, Andrea Arcangeli,
	KAMEZAWA Hiroyuki, linux-mm@kvack.org, tglx, Ingo Molnar,
	Theodore Ts'o

On Mon, May 14, 2012 at 7:14 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> Either that or a VMA flag, I think both infiniband and whatever new
> mlock API we invent will pretty much always be VMA wide. Or does the
> infinimuck take random pages out? All I really know about IB is to stay
> the #$%! away from it [as Mel recently learned the hard way] :-)

In general the InfiniBand pinning (calling get_user_pages()) is driven
by userspace, which doesn't really know anything about VMAs.

However userspace will often do madvise(DONT_FORK) on those
same ranges, so we'll probably have vma boundaries match up with
the ranges of pinned pages.
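
A minimal userspace sketch of that call, assuming Linux with glibc. The
helper name dontfork_middle_page() is hypothetical; madvise() and
MADV_DONTFORK are the real interface. It applies the advice to the middle
page of a three-page anonymous mapping, the kind of sub-range call after
which the vma boundaries would be expected to line up with the advised
range.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS and MADV_DONTFORK */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map three anonymous pages and mark only the
 * middle one MADV_DONTFORK. Returns 0 on success, -1 on error.
 */
int dontfork_middle_page(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	int rc;

	if (page <= 0)
		return -1;

	buf = mmap(NULL, 3 * (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* The advice covers only the middle page of the mapping. */
	rc = madvise(buf + page, (size_t)page, MADV_DONTFORK);

	munmap(buf, 3 * (size_t)page);
	return rc;
}
```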

In any case I don't see any problem with doing vma splitting in
drivers/infiniband/core/umem.c if need be.

 - R.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 23:04             ` Roland Dreier
@ 2012-05-15 14:27               ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-15 14:27 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Peter Zijlstra, Mel Gorman, KOSAKI Motohiro, Minchan Kim,
	Johannes Weiner, Rik van Riel, Andrew Morton, Andrea Arcangeli,
	KAMEZAWA Hiroyuki, linux-mm@kvack.org, tglx, Ingo Molnar,
	Theodore Ts'o

On Mon, 14 May 2012, Roland Dreier wrote:

> In any case I don't see any problem with doing vma splitting in
> drivers/infiniband/core/umem.c if need be.

Prohibiting migration is already supported at the VMA level. There is no
need to add anything extra.

"struct vm_operations_struct" has a field for the "migrate" function.
If that field is set to "fail_migrate_page" then no migration will ever
take place on the VMA.

But this feature is not accessible from user space. So far it has
only been used by special filesystems.

And disabling migration does not solve the "I want no faults
whatsoever" requirement that I keep hearing.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 14:01         ` Christoph Lameter
  2012-05-14 14:14           ` Peter Zijlstra
@ 2012-05-15  1:38           ` Minchan Kim
  1 sibling, 0 replies; 38+ messages in thread
From: Minchan Kim @ 2012-05-15  1:38 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Mel Gorman, KOSAKI Motohiro, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On 05/14/2012 11:01 PM, Christoph Lameter wrote:

> On Mon, 14 May 2012, Peter Zijlstra wrote:
> 
>> I'd say go for it, I've been telling everybody who would listen that
>> mlock() only means no major faults for a very long time now.
> 
> We could introduce a new page flag PG_pinned (it already exists for Xen)
> that would mean no faults on the page?
> 
> The situation with pinned pages is not clean right now because page count
> increases should only signal temporary references to a page but subsystems
> use an elevated page count to pin pages for good (f.e. Infiniband memory
> registration). The reclaim logic has no way to differentiate between a
> pinned page and a temporary reference count increase for page handling.
> 
> Therefore f.e. the page migration logic will repeatedly try to move the
> page and always fail to account for all references.
> 
> A PG_pinned could allow us to make that distinction to avoid overhead in
> the reclaim and page migration logic and also we could add some semantics
> that avoid page faults.


It does make sense!

-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 13:51       ` Peter Zijlstra
  2012-05-14 14:01         ` Christoph Lameter
@ 2012-05-14 14:08         ` Peter Zijlstra
  1 sibling, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-14 14:08 UTC (permalink / raw)
  To: Mel Gorman
  Cc: KOSAKI Motohiro, Minchan Kim, Johannes Weiner, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	Christoph Lameter, linux-mm@kvack.org, tglx, Ingo Molnar,
	Theodore Ts'o

Anyway, afaict there's only two options:

 1) make mlock() mean physically pinned (which we've so far always
rejected and isn't supported by whatever passes as a std for unix -- at
least not by the precise wording).

 2) keep mlock() to mean no major fault.

I strongly prefer 2 -- it's what we've always said.
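
The difference between the two kinds of fault can be observed from
userspace. The following is a minimal sketch, assuming Linux with glibc;
the helper name count_faults_on_locked_buffer() is hypothetical, while
getrusage() and its ru_majflt/ru_minflt fields are the real interface. It
locks an anonymous buffer, touches every page and reports how many major
and minor faults the touching caused.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: mlock() `npages` anonymous pages, write to each
 * one and store the number of major and minor faults seen while doing
 * so. Returns 0 on success, -1 on error (e.g. RLIMIT_MEMLOCK too low).
 */
int count_faults_on_locked_buffer(size_t npages, long *majflt, long *minflt)
{
	struct rusage before, after;
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	size_t len, i;

	if (page <= 0 || npages == 0)
		return -1;
	len = npages * (size_t)page;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	if (mlock(buf, len) != 0 || getrusage(RUSAGE_SELF, &before) != 0) {
		munmap(buf, len);
		return -1;
	}

	/* Touch one byte per page of the locked region. */
	for (i = 0; i < len; i += (size_t)page)
		buf[i] = 1;

	if (getrusage(RUSAGE_SELF, &after) != 0) {
		munmap(buf, len);
		return -1;
	}
	munmap(buf, len);	/* unmapping also drops the lock */

	*majflt = after.ru_majflt - before.ru_majflt;
	*minflt = after.ru_minflt - before.ru_minflt;
	return 0;
}
```

Under option 2 only the major-fault count is covered by the guarantee; any
effect of migration on a locked region would be expected to show up, if at
all, in the minor-fault count or as a short stall.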

This might mean there's a need for a stronger API -- one that also
guarantees physically pinned. This is a more expensive
resource/operation. It means we need to migrate all memory to UNMOVABLE
blocks, possibly growing the number of such blocks with all the
down-sides that has.

Alternatively -- in case we pick 1 -- we should create a weaker variant
that does what mlock means now in order to allow people to not pressure
the system unduly. 

I don't see any other way.. the current constraints really are mutually
exclusive.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 13:32     ` Mel Gorman
  2012-05-14 13:51       ` Peter Zijlstra
@ 2012-05-14 23:06       ` KOSAKI Motohiro
  2012-05-15  1:35       ` Minchan Kim
  2 siblings, 0 replies; 38+ messages in thread
From: KOSAKI Motohiro @ 2012-05-14 23:06 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, Johannes Weiner, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Peter Zijlstra,
	Theodore Ts'o

>> >If CMA decide they want to alter mlocked pages in this way, it's sortof
>> >ok. While CMA is being used, there are no expectations on the RT
>> >behaviour of the system - stalls are expected. In their use cases, CMA
>> >failing is far worse than access latency to an mlocked page being
>> >variable while CMA is running.
>>
>> That's strange. CMA caller can't know the altered page is under mlock or not.
>> and almost all CMA user is in embedded world. ie RT realm.
>
> Embedded does not imply realtime constraints.

True, but there is much overlap.


>> So, I don't think
>> CMA and compaction are significantly different.
>
> CMA is used in cases such as a mobile phone needing to allocate a large
> contiguous range of memory for video decoding. Compaction is used by
> features such as THP with khugepaged potentially using it frequently on
> x86-64 machines. The use cases are different and compaction is used by
> THP a lot more than CMA is used by anything.

Fair point. usecase frequency is clearly different.


> If compaction can move mlocked pages then khugepaged can introduce unexpected
> latencies on mlocked anonymous regions of memory.

Yes, it can. Then the problem depends on how many applications assume that
mlock provides no minor faults, right?

My claim is that such applications certainly exist, but they are very few.
Automatic moving keeps 99.9% of applications happy. For example, a modern
distro has >1000 utility commands, and I suspect that none of them cares
about minor faults.

OK, a few high-end and HPC applications certainly care about it. But are
they the majority?


>> >Compaction on the other hand is during the normal operation of the
>> >machine. There are applications that assume that if anonymous memory
>> >is mlocked() then access to it is close to zero latency. They are
>> >not RT-critical processes (or they would disable THP) but depend on
>> >this. Allowing compaction to migrate mlocked() pages will result in bugs
>> >being reported by these people.
>> >
>> >I've received one bug this year about access latency to mlocked() regions but
>> >it turned out to be a file-backed region and related to when the write-fault
>> >is incurred. The ultimate fix was in the application but we'll get new bug
>> >reports if anonymous mlocked pages do not preserve the current guarantees
>> >on access latency.
>>
>> Can you please tell us your opinion about autonuma?
>
> I think it will have the same problem as THP using compaction. If
> mlocked pages can move then there may be unexpected latencies accessing
> mlocked anonymous regions.
>
>> I doubt we can keep such
>> mlock guarantee. I think we need to suggest application fix. maybe to introduce
>> MADV_UNMOVABLE is good start. it seems to solve autonuma issue too.
>
> That'll regress existing applications. It would be preferable to me that
> it be the other way around to not move mlocked pages unless the user says
> it's allowed.

My conclusion is different, but I don't disagree with your point; see above.
I know you are right too.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 13:32     ` Mel Gorman
  2012-05-14 13:51       ` Peter Zijlstra
  2012-05-14 23:06       ` KOSAKI Motohiro
@ 2012-05-15  1:35       ` Minchan Kim
  2 siblings, 0 replies; 38+ messages in thread
From: Minchan Kim @ 2012-05-15  1:35 UTC (permalink / raw)
  To: Mel Gorman
  Cc: KOSAKI Motohiro, Johannes Weiner, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Peter Zijlstra,
	Theodore Ts'o

On 05/14/2012 10:32 PM, Mel Gorman wrote:

> On Fri, May 11, 2012 at 07:25:59PM -0400, KOSAKI Motohiro wrote:
>>>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>>>> index 3a93f73..8ae2e60 100644
>>>> --- a/include/linux/highmem.h
>>>> +++ b/include/linux/highmem.h
>>>> @@ -175,7 +175,8 @@ static inline struct page *
>>>>  alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>>>>                                         unsigned long vaddr)
>>>>  {
>>>> -       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
>>>> +       gfp_t gfp_flag = vma->vm_flags & VM_LOCKED ? 0 : __GFP_MOVABLE;
>>>> +       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
>>>>  }
>>>>
>>>> But it's a solution about newly allocated page on mlocked vma.
>>>> Old pages in the VMA is still a problem.
>>>
>>> Yes.
>>
>> I disagree. __GFP_MOVABLE is one of zone mask. therefore, To turn off __GFP_MOVABLE
>> will break memory hotplug. mlock may easily invoke oom killer.
>>
> 
> Fair point.
> 
>>>> We can solve it at mlock system call through migrating the pages to
>>>> UNMOVABLE block.
>>>
>>> Combining the two would be suitable because once mlock returns, any mapped
>>> page is locked in place and future allocations will be placed suitable. I'd
>>> also be ok allowing file-backed mlocked pages to be migrated on the grounds
>>> that no assumptions can be made about access latency anyway.
>>>
>>>> "
>>>> It would be a solution to enhance compaction/CMA and we can make that compaction doesn't migrate
>>>> UNMOVABLE_PAGE_GROUP which make full by unevictable pages so mlocked page is still pinning page.
>>>> But get_user_pages in drivers still a problem. Or we can migrate unevictable pages, too so that
>>>> compaction/CMA would be good much but we lost pinning concept(It would break man page of mlocked
>>>> about real-time application stuff). Hmm.
>>>>
>>>>>
>>>>> And, think if application explictly use migrate_pages(2) or admins uses
>>>>> cpusets. driver code can't assume such scenario
>>>>> doesn't occur, yes?
>>>>
>>>> Yes. it seems to migrate mlocked page now.
>>>> Hmm,
>>>> Johannes, Mel.
>>>> Why should we be unfair on only compaction?
>>>>
>>>
>>> If CMA decide they want to alter mlocked pages in this way, it's sortof
>>> ok. While CMA is being used, there are no expectations on the RT
>>> behaviour of the system - stalls are expected. In their use cases, CMA
>>> failing is far worse than access latency to an mlocked page being
>>> variable while CMA is running.
>>
>> That's strange. CMA caller can't know the altered page is under mlock or not.
>> and almost all CMA user is in embedded world. ie RT realm.
> 
> Embedded does not imply realtime constraints.
> 
>> So, I don't think
>> CMA and compaction are significantly different.
>>
> 
> CMA is used in cases such as a mobile phone needing to allocate a large
> contiguous range of memory for video decoding. Compaction is used by
> features such as THP with khugepaged potentially using it frequently on
> x86-64 machines. The use cases are different and compaction is used by
> THP a lot more than CMA is used by anything.


Firstly, CMA was born in the embedded area, but who knows whether others will need CMA in the future?

> 
> If compaction can move mlocked pages then khugepaged can introduce unexpected
> latencies on mlocked anonymous regions of memory.


I'm not a big fan of THP, so I'm not sure how important latency is in khugepaged.
But I guess the THP collapse success ratio could be more important than latency?
And I'm not sure how much anon mlocked page migration affects latency.
IMHO, it wouldn't be much.

> 
>>> Compaction on the other hand is during the normal operation of the
>>> machine. There are applications that assume that if anonymous memory
>>> is mlocked() then access to it is close to zero latency. They are
>>> not RT-critical processes (or they would disable THP) but depend on
>>> this. Allowing compaction to migrate mlocked() pages will result in bugs
>>> being reported by these people.
>>>
>>> I've received one bug this year about access latency to mlocked() regions but
>>> it turned out to be a file-backed region and related to when the write-fault
>>> is incurred. The ultimate fix was in the application but we'll get new bug
>>> reports if anonymous mlocked pages do not preserve the current guarantees
>>> on access latency.
>>
>> Can you please tell us your opinion about autonuma?
> 
> I think it will have the same problem as THP using compaction. If
> mlocked pages can move then there may be unexpected latencies accessing
> mlocked anonymous regions.
> 
>> I doubt we can keep such
>> mlock guarantee. I think we need to suggest application fix. maybe to introduce
>> MADV_UNMOVABLE is good start. it seems to solve autonuma issue too.
>>
> 
> That'll regress existing applications. It would be preferable to me that
> it be the other way around to not move mlocked pages unless the user says
> it's allowed.
> 



-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-11 13:14 ` Mel Gorman
  2012-05-11 23:25   ` KOSAKI Motohiro
@ 2012-05-14  4:25   ` Minchan Kim
  2012-05-14 13:39     ` Mel Gorman
  1 sibling, 1 reply; 38+ messages in thread
From: Minchan Kim @ 2012-05-14  4:25 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Johannes Weiner, Rik van Riel, Andrew Morton, Andrea Arcangeli,
	KAMEZAWA Hiroyuki, Christoph Lameter, linux-mm@kvack.org, tglx,
	Ingo Molnar, Peter Zijlstra, Theodore Ts'o

On 05/11/2012 10:14 PM, Mel Gorman wrote:

> On Fri, May 11, 2012 at 01:37:26PM +0900, Minchan Kim wrote:
>>> <SNIP>
>>> promise mlock don't change physical page.
>>> I wonder if any realtime guys page migration is free lunch. they should
>>> disable both auto migration and compaction.
>>
>> I think disable migration is overkill. We can do better than it.
> 
> The reason why we do not migrate mlock() pages is down to expectations of the
> application developer.  mlock historically was a real-time extension. For
> files, there is no guarantee of latency because obviously things like
> writing to the page can stall in balance_dirty_pages() but for anonymous
> memory, there is an expectation that access be low or zero latency. This
> would be particularly true if they used something like MAP_POPULATE.
> 
>> Quote from discussion last year from me.
>>
>> "
>> We can solve a bit that by another approach if it's really problem
>> with RT processes. The another approach is to separate mlocked pages
>> with allocation time like below pseudo patch which just show the
>> concept)
>>
>> ex)
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 3a93f73..8ae2e60 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -175,7 +175,8 @@ static inline struct page *
>>  alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>>                                         unsigned long vaddr)
>>  {
>> -       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
>> +       gfp_t gfp_flag = vma->vm_flags & VM_LOCKED ? 0 : __GFP_MOVABLE;
>> +       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
>>  }
>>
>> But it's a solution about newly allocated page on mlocked vma.
>> Old pages in the VMA is still a problem.
> 
> Yes.
> 
>> We can solve it at mlock system call through migrating the pages to
>> UNMOVABLE block.

> 

> Combining the two would be suitable because once mlock returns, any mapped
> page is locked in place and future allocations will be placed suitable. I'd
> also be ok allowing file-backed mlocked pages to be migrated on the grounds
> that no assumptions can be made about access latency anyway.
> 
>> "
>> It would be a solution to enhance compaction/CMA and we can make that compaction doesn't migrate
>> UNMOVABLE_PAGE_GROUP which make full by unevictable pages so mlocked page is still pinning page.
>> But get_user_pages in drivers still a problem. Or we can migrate unevictable pages, too so that
>> compaction/CMA would be good much but we lost pinning concept(It would break man page of mlocked
>> about real-time application stuff). Hmm.
>>
>>>
>>> And, think if application explictly use migrate_pages(2) or admins uses
>>> cpusets. driver code can't assume such scenario
>>> doesn't occur, yes?
>>
>> Yes. it seems to migrate mlocked page now.
>> Hmm,
>> Johannes, Mel.
>> Why should we be unfair on only compaction?
>>
> 
> If CMA decide they want to alter mlocked pages in this way, it's sortof
> ok. While CMA is being used, there are no expectations on the RT
> behaviour of the system - stalls are expected. In their use cases, CMA
> failing is far worse than access latency to an mlocked page being
> variable while CMA is running.
> 
> Compaction on the other hand is during the normal operation of the
> machine. There are applications that assume that if anonymous memory
> is mlocked() then access to it is close to zero latency. They are
> not RT-critical processes (or they would disable THP) but depend on
> this. Allowing compaction to migrate mlocked() pages will result in bugs
> being reported by these people.
> 
> I've received one bug this year about access latency to mlocked() regions but
> it turned out to be a file-backed region and related to when the write-fault
> is incurred. The ultimate fix was in the application but we'll get new bug
> reports if anonymous mlocked pages do not preserve the current guarantees
> on access latency.
> 


If so, what do you think about migration of mlocked pages by migrate_pages, cpuset_migrate_mm and memcg?
I think those are all done under the user's control, while compaction happens regardless of the user.
So do you think that's why compaction shouldn't migrate mlocked pages?


-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14  4:25   ` Minchan Kim
@ 2012-05-14 13:39     ` Mel Gorman
  2012-05-15  2:15       ` Minchan Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Mel Gorman @ 2012-05-14 13:39 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Johannes Weiner, Rik van Riel, Andrew Morton, Andrea Arcangeli,
	KAMEZAWA Hiroyuki, Christoph Lameter, linux-mm@kvack.org, tglx,
	Ingo Molnar, Peter Zijlstra, Theodore Ts'o

On Mon, May 14, 2012 at 01:25:04PM +0900, Minchan Kim wrote:
> > <SNIP>
> >
> > If CMA decide they want to alter mlocked pages in this way, it's sortof
> > ok. While CMA is being used, there are no expectations on the RT
> > behaviour of the system - stalls are expected. In their use cases, CMA
> > failing is far worse than access latency to an mlocked page being
> > variable while CMA is running.
> > 
> > Compaction on the other hand is during the normal operation of the
> > machine. There are applications that assume that if anonymous memory
> > is mlocked() then access to it is close to zero latency. They are
> > not RT-critical processes (or they would disable THP) but depend on
> > this. Allowing compaction to migrate mlocked() pages will result in bugs
> > being reported by these people.
> > 
> > I've received one bug this year about access latency to mlocked() regions but
> > it turned out to be a file-backed region and related to when the write-fault
> > is incurred. The ultimate fix was in the application but we'll get new bug
> > reports if anonymous mlocked pages do not preserve the current guarantees
> > on access latency.
> > 
> 
> If so, what do you think about migration of mlocked pages by migrate_pages, cpuset_migrate_mm and memcg?

migrate_pages() is a core function used by a variety of different callers. It
*optionally* could move mlocked pages and it would be up to the caller to
specify if that was allowed.

cpuset_migrate_mm() should be allowed to move mlocked() pages because it's
called in a path where the pages are on a node that should no longer be
accessible to the processes. In this case, the latency hit is unavoidable
and a bug reporter that says "there is an unexpected latency accessing memory
while a process moves memory to another node" will be told to get a clue.

Where does memcg call migrate_pages()?

> I think they all is done by under user's control while compaction happens regardless of user.
> So do you think that's why compaction shouldn't migrate mlocked page?
> 

Yes. If the user takes an explicit action that causes latencies when
accessing an mlocked anonymous region while the pages are migrated, that's
fine. I still do not think that THP and khugepaged should cause unexpected
latencies accessing mlocked anonymous regions because it is beyond the
control of the application.
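
As an illustration of an explicit, user-initiated move, here is a sketch
using move_pages(2), a per-page relative of the migrate_pages(2) syscall
mentioned above. It is an assumption that this is representative of the
callers being discussed; the helper name lock_and_move_page() is
hypothetical, and the raw syscall is used only to avoid depending on
libnuma. On kernels without NUMA support, or in sandboxes that filter the
call, it simply returns a negative errno.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>	/* MPOL_MF_MOVE */

/*
 * Hypothetical helper: map one anonymous page, mlock() it, then ask the
 * kernel to move it to `node`. Returns the per-page status reported by
 * move_pages(2) (the node the page is on, or a negative errno), or a
 * negative errno if a setup step or the syscall itself fails.
 */
int lock_and_move_page(int node)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *pages[1];
	int nodes[1] = { node };
	int status[1] = { -EINVAL };
	unsigned char *buf;
	long rc;
	int ret;

	if (page_size <= 0)
		return -EINVAL;

	buf = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -errno;

	buf[0] = 1;	/* make sure the page is populated */
	if (mlock(buf, (size_t)page_size) != 0) {
		ret = -errno;
		goto out;
	}

	/* pid 0 means the calling process; this is the explicit request. */
	pages[0] = buf;
	rc = syscall(SYS_move_pages, 0, 1UL, pages, nodes, status,
		     MPOL_MF_MOVE);
	ret = rc < 0 ? -errno : status[0];
out:
	munmap(buf, (size_t)page_size);
	return ret;
}
```

Any latency seen while such a call runs is something the caller asked for,
which is the distinction drawn above against khugepaged.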

-- 
Mel Gorman
SUSE Labs


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-14 13:39     ` Mel Gorman
@ 2012-05-15  2:15       ` Minchan Kim
  2012-05-15  4:33         ` KOSAKI Motohiro
  2012-05-15 14:09         ` Christoph Lameter
  0 siblings, 2 replies; 38+ messages in thread
From: Minchan Kim @ 2012-05-15  2:15 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Johannes Weiner, Rik van Riel, Andrew Morton, Andrea Arcangeli,
	KAMEZAWA Hiroyuki, Christoph Lameter, linux-mm@kvack.org, tglx,
	Ingo Molnar, Peter Zijlstra, Theodore Ts'o

On 05/14/2012 10:39 PM, Mel Gorman wrote:

> On Mon, May 14, 2012 at 01:25:04PM +0900, Minchan Kim wrote:
>>> <SNIP>
>>>
>>> If CMA decide they want to alter mlocked pages in this way, it's sortof
>>> ok. While CMA is being used, there are no expectations on the RT
>>> behaviour of the system - stalls are expected. In their use cases, CMA
>>> failing is far worse than access latency to an mlocked page being
>>> variable while CMA is running.
>>>
>>> Compaction on the other hand is during the normal operation of the
>>> machine. There are applications that assume that if anonymous memory
>>> is mlocked() then access to it is close to zero latency. They are
>>> not RT-critical processes (or they would disable THP) but depend on
>>> this. Allowing compaction to migrate mlocked() pages will result in bugs
>>> being reported by these people.
>>>
>>> I've received one bug this year about access latency to mlocked() regions but
>>> it turned out to be a file-backed region and related to when the write-fault
>>> is incurred. The ultimate fix was in the application but we'll get new bug
>>> reports if anonymous mlocked pages do not preserve the current guarantees
>>> on access latency.
>>>
>>
>> If so, what do you think about migration of mlocked pages by migrate_pages, cpuset_migrate_mm and memcg?
> 
> migrate_pages() is a core function used by a variety of different callers. It
> *optionally* could move mlocked pages and it would be up to the caller to
> specify if that was allowed.


Sorry I meant SYSCALL_DEFINE4(migrate_pages..);

> 
> cpuset_migrate_mm() should be allowed to move mlocked() pages because it's
> called in a path where the pages are on a node that should not longer be
> accessible to the processes. In this case, the latency hit is unavoidable
> and a bug reporter that says "there is an unexpected latency accessing memory
> while a process moves memory to another node" will be told to get a clue.


The point is that everything other than compaction already migrates mlocked pages.

> 
> Where does memcg call migrate_pages()?


I don't know the internals of memcg; I just saw the following code:
__unmap_and_move
{
	mem_cgroup_prepare_migration;
	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
	mem_cgroup_end_migration;
}

So I thought memcg can migrate mlocked pages, too.

> 
>> I think they all is done by under user's control while compaction happens regardless of user.
>> So do you think that's why compaction shouldn't migrate mlocked page?
>>
> 
> Yes. If the user takes an explicit action that causes latencies when
> accessing an mlocked anonymous region while the pages are migrated, that's
> fine. I still do not think that THP and khugepaged should cause unexpected
> latencies accessing mlocked anonymous regions because it is beyond the
> control of the application.
> 


Okay. Let's summarize the opinions in this thread so far.

1. mlock doesn't have pinning semantics according to the Open Group definition.
2. The man page says "No page fault". That's bad; the man page may need fixing.
3. There are several places which already migrate mlocked pages, but that's okay because
   it's done under the user's control, while compaction/khugepaged is not.
3. Many applications already use mlock with the semantics of 2. So let's break legacy applications if possible.
4. CMA considers getting free contiguous memory its top priority, so latency may be okay in CMA,
   while THP considers latency its top priority.
5. Let's define a new API, which would be
   5.1 mlock(SOFT) - it guarantees memory residency.
   5.2 mlock(HARD) - it guarantees 5.1 plus pinning.
   Current mlock could be 5.1, and then we should implement 5.2. Or
   current mlock could be 5.2, and then we should implement 5.1.
   We can implement it with PG_pinned or vma flags.

One clear point is that it's okay to migrate mlocked pages in CMA.
And we can migrate mlocked anonymous pages and mlocked file pages in MIGRATE_ASYNC mode in compaction
if we all agree with Peter, who says "mlocked means NO MAJOR FAULT".
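
The proposal in point 5 can be read as a small policy table. The sketch
below is one such reading, not kernel code: every identifier in it
(lock_mode, migrate_source, may_migrate) is hypothetical. LOCK_CURRENT
stands for the behaviour Mel argues for with today's mlock (the two points
numbered 3, and point 4), LOCK_SOFT for 5.1 and LOCK_HARD for 5.2.

```c
#include <stdbool.h>

/* Hypothetical names; none of these exist in the kernel. */
enum lock_mode {
	LOCK_NONE,	/* not mlocked */
	LOCK_CURRENT,	/* today's mlock(), with the expectations Mel describes */
	LOCK_SOFT,	/* 5.1: memory-resident only */
	LOCK_HARD,	/* 5.2: memory-resident and pinned */
};

enum migrate_source {
	SRC_USER,	/* migrate_pages(2), cpusets: explicit user/admin action */
	SRC_CMA,	/* contiguous allocation that must succeed */
	SRC_COMPACTION,	/* background compaction, e.g. on behalf of THP */
};

/* Return true if a page locked with `mode` may be migrated by `src`. */
bool may_migrate(enum lock_mode mode, enum migrate_source src)
{
	switch (mode) {
	case LOCK_HARD:
		/* Pinned: the physical page never changes. */
		return false;
	case LOCK_CURRENT:
		/* Only explicit actions and CMA may add access latency. */
		return src != SRC_COMPACTION;
	case LOCK_NONE:
	case LOCK_SOFT:
	default:
		/* Residency is all that is promised; minor faults are fine. */
		return true;
	}
}
```

Whether today's mlock() maps to LOCK_SOFT or stays closer to LOCK_CURRENT
is exactly the open question in the summary.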


-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-15  2:15       ` Minchan Kim
@ 2012-05-15  4:33         ` KOSAKI Motohiro
  2012-05-15 11:06           ` Peter Zijlstra
  2012-05-15 14:10           ` Christoph Lameter
  2012-05-15 14:09         ` Christoph Lameter
  1 sibling, 2 replies; 38+ messages in thread
From: KOSAKI Motohiro @ 2012-05-15  4:33 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Mel Gorman, Johannes Weiner, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, Christoph Lameter,
	linux-mm@kvack.org, tglx, Ingo Molnar, Peter Zijlstra,
	Theodore Ts'o

> Okay. Let's summary opinions in this thread until now.
>
> 1. mlock doesn't have pinning's semantic by definition of opengroup.
> 2. man page says "No page fault". It's bad. Maybe need fix up of man page.

Yes, it should be.

> 3. Thera are several places which already have migrate mlocked pages but it's okay because
>   it's done under user's control while compaction/khugepagd doesn't.

I disagree. Cpusets are used by admins. A realtime _application_ is written
by application developers. OK, they often overlap or are the same people, but
that's not always true. Memory hotplug is in a similar situation.

Moreover, think of a situation that mixes an RT app with a non-RT app that
uses migrate_pages. The RT app still faces minor page faults, and rt-app
developers do not expect that.


> 3. Many application already used mlock by semantic of 2. So let's break legacy application if possible.

Many? Really? I guess it's very few.

> 4. CMA consider getting of free contiguos memory as top priority so latency may be okay in CMA
>   while THP consider latency as top priority.
> 5. Let's define new API which would be
>   5.1 mlock(SOFT) - it can gaurantee memory-resident.
>   5.2 mlock(HARD) - it can gaurantee 1 and pinning.
>   Current mlock could be 5.1, then we should implement 5.2. Or
>   Current mlock could be 5.2, then we should implement 5.1
>   We can implement it by PG_pinned or vma flags.

I definitely agree we need both pinned and not-pinned mlock.


> One of clear point is that it's okay to migrate mlocked page in CMA.
> And we can migrate mlocked anonymous pages and mlocked file pages by MIGRATE_ASYNC mode in compaction
> if we all agree Peter who says "mlocked mean NO MAJOR FAULT".


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-15  4:33         ` KOSAKI Motohiro
@ 2012-05-15 11:06           ` Peter Zijlstra
  2012-05-15 14:12             ` Christoph Lameter
  2012-05-15 14:10           ` Christoph Lameter
  1 sibling, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-15 11:06 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Mel Gorman, Johannes Weiner, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	Christoph Lameter, linux-mm@kvack.org, tglx, Ingo Molnar,
	Theodore Ts'o

On Tue, 2012-05-15 at 00:33 -0400, KOSAKI Motohiro wrote:
> > 3. Thera are several places which already have migrate mlocked pages but it's okay because
> >   it's done under user's control while compaction/khugepagd doesn't.
> 
> I disagree. CPUSETS are used from admins. realtime _application_ is written
> by application developers. ok, they are often overwrapped or the same. but it's
> not exactly true. memory hotplug has similar situation.

I'm not exactly sure I get what you're saying, but with the current
scheme of things it's impossible to run an RT app properly without the
administrator knowing wtf he's doing.

So the fact that cpusets are admin only doesn't matter, he'd better know
about the RT apps and their requirements.

This very much includes crap like THP (which, as stated, is unavailable
for PREEMPT_RT) since that is under administrator control.

CMA and other allocation based compaction much less so though.

> Moreover, Think mix up rt-app and non-rt-migrate_pages-user-app situation. RT
> app still be faced minor page fault and it's not expected from rt-app
> developers. 

It would be if they'd listened to what I've been telling them for ages.

Anyway.. taking faults isn't the problem for RT, taking indeterministic
time to satisfy them is, and disk IO is completely off the charts
indeterministic. Minor faults much less so.

There is a very big difference between very fast and real-time, they've
got very little to do with one another.

That said, the way page migration currently works isn't ideal from a
determinism pov, the migration PTE can be present for a basically
indeterminate amount of time.

So yes, page migration is a 'serious' problem, but only because the way
it's implemented is sub-optimal.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-15 11:06           ` Peter Zijlstra
@ 2012-05-15 14:12             ` Christoph Lameter
  2012-05-15 14:45               ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Lameter @ 2012-05-15 14:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: KOSAKI Motohiro, Minchan Kim, Mel Gorman, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Tue, 15 May 2012, Peter Zijlstra wrote:

> So yes, page migration is a 'serious' problem, but only because the way
> its implemented is sub-optimal.

For the low-latency cases: page migration needs to be restricted to cpus
that are allowed to run high latency tasks or restricted to a time that no
low-latency responses are needed by the app. This means during setup or
special processing times (maybe after some action was completed).

A random compaction run can be very bad for a latency critical section.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-15 14:12             ` Christoph Lameter
@ 2012-05-15 14:45               ` Peter Zijlstra
  2012-05-15 15:11                 ` Christoph Lameter
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2012-05-15 14:45 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: KOSAKI Motohiro, Minchan Kim, Mel Gorman, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Tue, 2012-05-15 at 09:12 -0500, Christoph Lameter wrote:
> On Tue, 15 May 2012, Peter Zijlstra wrote:
> 
> > So yes, page migration is a 'serious' problem, but only because the way
> > its implemented is sub-optimal.
> 
> For the low-latency cases: page migration needs to be restricted to cpus
> that are allowed to run high latency tasks or restricted to a time that no
> low-latency responses are needed by the app. This means during setup or
> special processing times (maybe after some action was completed).
> 
> A random compaction run can be very bad for a latency critical section.

Yes however:

 1) low latency doesn't make real-time, time bounds do.
 2) the latency impact of migration can be _MUCH_ improved if someone
were to care about it.
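
To make point 1 concrete: a bound is about the worst case, not the
average. The sketch below is a hypothetical userspace helper
(measure_touch() and struct touch_stats are not from this thread) that
writes one byte per page and records both the mean and the maximum time
per access. A buffer can look "very fast" on average and still show an
occasional large maximum, for example when a page is faulted in or is
being migrated during the access. It assumes CLOCK_MONOTONIC is available
and that the timer overhead is acceptable for a rough comparison.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical result type: mean and worst-case time per page access. */
struct touch_stats {
	uint64_t avg_ns;
	uint64_t max_ns;
};

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Write one byte to each of `npages` pages of `buf` (which must be at
 * least npages * page_size bytes long) and time every access.
 */
struct touch_stats measure_touch(volatile unsigned char *buf,
				 size_t npages, size_t page_size)
{
	struct touch_stats st = { 0, 0 };
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < npages; i++) {
		uint64_t t0 = now_ns();
		uint64_t dt;

		buf[i * page_size] = (unsigned char)i;
		dt = now_ns() - t0;

		total += dt;
		if (dt > st.max_ns)
			st.max_ns = dt;	/* the figure a time bound cares about */
	}
	if (npages)
		st.avg_ns = total / npages;
	return st;
}
```

A low avg_ns with a high max_ns is the "very fast but not real-time" case.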



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Allow migration of mlocked page?
  2012-05-15 14:45               ` Peter Zijlstra
@ 2012-05-15 15:11                 ` Christoph Lameter
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-15 15:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: KOSAKI Motohiro, Minchan Kim, Mel Gorman, Johannes Weiner,
	Rik van Riel, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Theodore Ts'o

On Tue, 15 May 2012, Peter Zijlstra wrote:

> On Tue, 2012-05-15 at 09:12 -0500, Christoph Lameter wrote:
> > On Tue, 15 May 2012, Peter Zijlstra wrote:
> >
> > > So yes, page migration is a 'serious' problem, but only because the way
> > > its implemented is sub-optimal.
> >
> > For the low-latency cases: page migration needs to be restricted to cpus
> > that are allowed to run high latency tasks or restricted to a time that no
> > low-latency responses are needed by the app. This means during setup or
> > special processing times (maybe after some action was completed).
> >
> > A random compaction run can be very bad for a latency critical section.
>
> Yes however:
>
>  1) low latency doesn't make real-time, time bounds do.

Indeed. My requirements are low latency, not real-time deadlines. There is
some overlap, but the basic goal is different. For example, we cannot
tolerate the overhead (and scaling limits) added by the additional logic
that comes with a "real time" kernel.

>  2) the latency impact of migration can be _MUCH_ improved if someone
> were to care about it.

True. I think THP/compaction would benefit most from it.


* Re: Allow migration of mlocked page?
  2012-05-15  4:33         ` KOSAKI Motohiro
  2012-05-15 11:06           ` Peter Zijlstra
@ 2012-05-15 14:10           ` Christoph Lameter
  1 sibling, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-15 14:10 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Mel Gorman, Johannes Weiner, Rik van Riel,
	Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	linux-mm@kvack.org, tglx, Ingo Molnar, Peter Zijlstra,
	Theodore Ts'o

On Tue, 15 May 2012, KOSAKI Motohiro wrote:

> > 3. Many application already used mlock by semantic of 2. So let's break legacy application if possible.
>
> Many? really? I guess it's a very few.

They cannot have used the semantics since they never existed. Maybe the
application writers had certain assumptions about what mlock did but those
were wrong.


* Re: Allow migration of mlocked page?
  2012-05-15  2:15       ` Minchan Kim
  2012-05-15  4:33         ` KOSAKI Motohiro
@ 2012-05-15 14:09         ` Christoph Lameter
  1 sibling, 0 replies; 38+ messages in thread
From: Christoph Lameter @ 2012-05-15 14:09 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Mel Gorman, Johannes Weiner, Rik van Riel, Andrew Morton,
	Andrea Arcangeli, KAMEZAWA Hiroyuki, linux-mm@kvack.org, tglx,
	Ingo Molnar, Peter Zijlstra, Theodore Ts'o

On Tue, 15 May 2012, Minchan Kim wrote:

> One of clear point is that it's okay to migrate mlocked page in CMA.
> And we can migrate mlocked anonymous pages and mlocked file pages by MIGRATE_ASYNC mode in compaction
> if we all agree Peter who says "mlocked mean NO MAJOR FAULT".

As far as I can recall, the POSIX definition of mlocked means the page stays
in memory and is not evicted. It says nothing about faults.
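
A minimal sketch of the residency guarantee described above, assuming a
Linux system that provides mmap(), mlock() and mincore(). The helper name
count_resident_after_mlock is hypothetical and is not taken from this
thread. It checks only that locked pages are reported as resident; it says
nothing about whether their physical addresses stay fixed, which is the
question under debate here.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `npages` anonymous pages, mlock() them, and
 * count how many pages mincore() reports as resident.
 * Returns the resident count, or -1 if any step fails (for example when
 * RLIMIT_MEMLOCK is too small for mlock() to succeed).
 */
long count_resident_after_mlock(size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0)
        return -1;
    size_t len = npages * (size_t)page;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(npages);
    long resident = -1;

    /* mlock() faults the whole range in and keeps it resident. */
    if (vec != NULL && mlock(buf, len) == 0) {
        if (mincore(buf, len, vec) == 0) {
            resident = 0;
            /* Bit 0 of each vec entry is set when that page is in core. */
            for (size_t i = 0; i < npages; i++)
                resident += vec[i] & 1;
        }
        munlock(buf, len);
    }

    free(vec);
    munmap(buf, len);
    return resident;
}
```

Calling count_resident_after_mlock(4) should return 4 when the lock
succeeds, since every locked page must be in core. A result of -1 means the
lock or the query could not be performed at all.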


end of thread (~2012-05-15 15:12 UTC)

Thread overview: 38+ messages
2012-05-11  4:37 Allow migration of mlocked page? Minchan Kim
2012-05-11  9:20 ` Peter Zijlstra
2012-05-11 16:20   ` Christoph Lameter
2012-05-11 23:24     ` KOSAKI Motohiro
2012-05-14 13:45       ` Christoph Lameter
2012-05-14  4:13   ` Minchan Kim
2012-05-14  6:56     ` Peter Zijlstra
2012-05-14  7:37       ` Minchan Kim
2012-05-14  7:45         ` Peter Zijlstra
2012-05-14  7:49           ` Peter Zijlstra
2012-05-14  7:54             ` Minchan Kim
2012-05-14 13:47         ` Christoph Lameter
2012-05-15  1:23           ` Minchan Kim
2012-05-15 11:07             ` Peter Zijlstra
2012-05-11 13:14 ` Mel Gorman
2012-05-11 23:25   ` KOSAKI Motohiro
2012-05-14 13:32     ` Mel Gorman
2012-05-14 13:51       ` Peter Zijlstra
2012-05-14 14:01         ` Christoph Lameter
2012-05-14 14:14           ` Peter Zijlstra
2012-05-14 14:43             ` Christoph Lameter
2012-05-14 22:52               ` KOSAKI Motohiro
2012-05-14 23:04             ` Roland Dreier
2012-05-15 14:27               ` Christoph Lameter
2012-05-15  1:38           ` Minchan Kim
2012-05-14 14:08         ` Peter Zijlstra
2012-05-14 23:06       ` KOSAKI Motohiro
2012-05-15  1:35       ` Minchan Kim
2012-05-14  4:25   ` Minchan Kim
2012-05-14 13:39     ` Mel Gorman
2012-05-15  2:15       ` Minchan Kim
2012-05-15  4:33         ` KOSAKI Motohiro
2012-05-15 11:06           ` Peter Zijlstra
2012-05-15 14:12             ` Christoph Lameter
2012-05-15 14:45               ` Peter Zijlstra
2012-05-15 15:11                 ` Christoph Lameter
2012-05-15 14:10           ` Christoph Lameter
2012-05-15 14:09         ` Christoph Lameter
