linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* question on NUMA page migration
@ 2012-10-19 15:53 Rik van Riel
  2012-10-19 16:39 ` Peter Zijlstra
  2012-10-21  2:39 ` Ni zhan Chen
  0 siblings, 2 replies; 11+ messages in thread
From: Rik van Riel @ 2012-10-19 15:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrea Arcangeli, Ingo Molnar, Linux Memory Management List,
	Mel Gorman, Linux kernel Mailing List

Hi Andrea, Peter,

I have a question on page refcounting in your NUMA
page migration code.

In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?

In Andrea's case, we have a comment suggesting an
extra refcount is needed, immediately followed by
a put_page:

         /*
          * Pin the head subpage at least until the first
          * __isolate_lru_page succeeds (__isolate_lru_page pins it
          * again when it succeeds). If we unpin before
          * __isolate_lru_page successd, the page could be freed and
          * reallocated out from under us. Thus our previous checks on
          * the page, and the split_huge_page, would be worthless.
          *
          * We really only need to do this if "ret > 0" but it doesn't
          * hurt to do it unconditionally as nobody can reference
          * "page" anymore after this and so we can avoid an "if (ret >
          * 0)" branch here.
          */
         put_page(page);

This also confuses me.

If we do not need the extra refcount (and I do not
understand why NUMA migrate-on-fault needs one more
refcount than normal page migration), we can get
rid of the MIGRATE_FAULT mode.

If we do need the extra refcount, why is normal
page migration safe? :)

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-19 15:53 question on NUMA page migration Rik van Riel
@ 2012-10-19 16:39 ` Peter Zijlstra
  2012-10-19 17:13   ` Rik van Riel
  2012-10-21  2:39 ` Ni zhan Chen
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2012-10-19 16:39 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andrea Arcangeli, Ingo Molnar, Linux Memory Management List,
	Mel Gorman, Linux kernel Mailing List

On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:
> 
> If we do need the extra refcount, why is normal
> page migration safe? :) 

Its mostly a matter of how convoluted you make the code, regular page
migration is about as bad as you can get

Normal does:

  follow_page(FOLL_GET) +1

  isolate_lru_page() +1

  put_page() -1

ending up with a page with a single reference (for anon, or one extra
each for the mapping and buffer).

And while I suppose I could do a put_page() in migrate_misplaced_page()
that makes the function frob the page-count depending on the return
value.

I always try and avoid conditional locks/refs, therefore the code ends
up doing:

  page = vm_normal_page()
  if (page) {
    get_page()

    migrate_misplaced_page()

    put_page()
  }


where migrate_misplaced_page() does isolate_lru_page()/putback_lru_page,
and this leaves the page-count invariant.

We got a ref, therefore we must put a ref, is easier than we got a ref
and must put except when...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-19 16:39 ` Peter Zijlstra
@ 2012-10-19 17:13   ` Rik van Riel
  2012-10-19 17:53     ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Rik van Riel @ 2012-10-19 17:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrea Arcangeli, Ingo Molnar, Linux Memory Management List,
	Mel Gorman, Linux kernel Mailing List

On 10/19/2012 12:39 PM, Peter Zijlstra wrote:
> On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:
>>
>> If we do need the extra refcount, why is normal
>> page migration safe? :)
>
> Its mostly a matter of how convoluted you make the code, regular page
> migration is about as bad as you can get
>
> Normal does:
>
>    follow_page(FOLL_GET) +1
>
>    isolate_lru_page() +1
>
>    put_page() -1
>
> ending up with a page with a single reference (for anon, or one extra
> each for the mapping and buffer).

Would it make sense to have the normal page migration code always
work with the extra refcount, so we do not have to introduce a new
MIGRATE_FAULT migration mode?

On the other hand, compaction does not take the extra reference...

Another alternative might be to do the put_page inside
do_prot_none_numa().  That would be analogous to do_wp_page
disposing of the old page for the caller.

I am not real happy about NUMA migration introducing its own
migration mode...

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-19 17:13   ` Rik van Riel
@ 2012-10-19 17:53     ` Peter Zijlstra
  2012-10-19 18:33       ` Rik van Riel
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2012-10-19 17:53 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andrea Arcangeli, Ingo Molnar, Linux Memory Management List,
	Mel Gorman, Linux kernel Mailing List

On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:

> Would it make sense to have the normal page migration code always
> work with the extra refcount, so we do not have to introduce a new
> MIGRATE_FAULT migration mode?
> 
> On the other hand, compaction does not take the extra reference...

Right, it appears to not do this, it gets pages from the pfn and
zone->lock and the isolate_lru_page() call is the first reference.

> Another alternative might be to do the put_page inside
> do_prot_none_numa().  That would be analogous to do_wp_page
> disposing of the old page for the caller.

It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.

> I am not real happy about NUMA migration introducing its own
> migration mode...

You didn't seem to mind too much earlier, but I can remove it if you
want.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-19 17:53     ` Peter Zijlstra
@ 2012-10-19 18:33       ` Rik van Riel
  2012-10-20  1:23         ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Rik van Riel @ 2012-10-19 18:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrea Arcangeli, Ingo Molnar, Linux Memory Management List,
	Mel Gorman, Linux kernel Mailing List

On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:

>> Another alternative might be to do the put_page inside
>> do_prot_none_numa().  That would be analogous to do_wp_page
>> disposing of the old page for the caller.
>
> It'd have to be inside migrate_misplaced_page(), can't do before
> isolate_lru_page() or the page might disappear. Doing it after is
> (obviously) too late.

Keeping an extra refcount on the page might _still_
result in it disappearing from the process by some
other means, in-between you grabbing the refcount
and invoking migration of the page.

>> I am not real happy about NUMA migration introducing its own
>> migration mode...
>
> You didn't seem to mind too much earlier, but I can remove it if you
> want.

Could have been reviewing fatigue :)

And yes, it would have been nice to not have a special
migration mode for sched/numa.

Speaking of, when do you guys plan to submit a (cleaned up)
version of the sched/numa patch series for review on lkml?

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-19 18:33       ` Rik van Riel
@ 2012-10-20  1:23         ` Ingo Molnar
  2012-10-20 16:02           ` Rik van Riel
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2012-10-20  1:23 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Peter Zijlstra, Andrea Arcangeli, Linux Memory Management List,
	Mel Gorman, Linux kernel Mailing List


* Rik van Riel <riel@redhat.com> wrote:

> On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> >On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
> 
> >>Another alternative might be to do the put_page inside
> >>do_prot_none_numa().  That would be analogous to do_wp_page
> >>disposing of the old page for the caller.
> >
> >It'd have to be inside migrate_misplaced_page(), can't do before
> >isolate_lru_page() or the page might disappear. Doing it after is
> >(obviously) too late.
> 
> Keeping an extra refcount on the page might _still_
> result in it disappearing from the process by some
> other means, in-between you grabbing the refcount
> and invoking migration of the page.
> 
> >>I am not real happy about NUMA migration introducing its own
> >>migration mode...
> >
> >You didn't seem to mind too much earlier, but I can remove it if you
> >want.
> 
> Could have been reviewing fatigue :)

:-)

> And yes, it would have been nice to not have a special
> migration mode for sched/numa.
> 
> Speaking of, when do you guys plan to submit a (cleaned up)
> version of the sched/numa patch series for review on lkml?

Which commit(s) worry you specifically?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-20  1:23         ` Ingo Molnar
@ 2012-10-20 16:02           ` Rik van Riel
  2012-10-21 12:30             ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Rik van Riel @ 2012-10-20 16:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Andrea Arcangeli, Linux Memory Management List,
	Mel Gorman, Linux kernel Mailing List

On 10/19/2012 09:23 PM, Ingo Molnar wrote:
>
> * Rik van Riel <riel@redhat.com> wrote:
>
>> On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
>>> On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
>>
>>>> Another alternative might be to do the put_page inside
>>>> do_prot_none_numa().  That would be analogous to do_wp_page
>>>> disposing of the old page for the caller.
>>>
>>> It'd have to be inside migrate_misplaced_page(), can't do before
>>> isolate_lru_page() or the page might disappear. Doing it after is
>>> (obviously) too late.
>>
>> Keeping an extra refcount on the page might _still_
>> result in it disappearing from the process by some
>> other means, in-between you grabbing the refcount
>> and invoking migration of the page.
>>
>>>> I am not real happy about NUMA migration introducing its own
>>>> migration mode...
>>>
>>> You didn't seem to mind too much earlier, but I can remove it if you
>>> want.
>>
>> Could have been reviewing fatigue :)
>
> :-)
>
>> And yes, it would have been nice to not have a special
>> migration mode for sched/numa.
>>
>> Speaking of, when do you guys plan to submit a (cleaned up)
>> version of the sched/numa patch series for review on lkml?
>
> Which commit(s) worry you specifically?

One of them would be the commit that introduces MIGRATE_FAULT.
Adding it in one patch, and removing it into a next just makes
things harder on the reviewers.

03a040f6c17ab81659579ba6abe267c0562097e4


If the changesets with NUMA syscalls are still in your tree's
history, they should not be submitted as part of the patch
series, either.
-- 
All rights reversed

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-19 15:53 question on NUMA page migration Rik van Riel
  2012-10-19 16:39 ` Peter Zijlstra
@ 2012-10-21  2:39 ` Ni zhan Chen
  2012-10-21  2:40   ` Rik van Riel
  1 sibling, 1 reply; 11+ messages in thread
From: Ni zhan Chen @ 2012-10-21  2:39 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Peter Zijlstra, Andrea Arcangeli, Ingo Molnar,
	Linux Memory Management List, Mel Gorman,
	Linux kernel Mailing List

On 10/19/2012 11:53 PM, Rik van Riel wrote:
> Hi Andrea, Peter,
>
> I have a question on page refcounting in your NUMA
> page migration code.
>
> In Peter's case, I wonder why you introduce a new
> MIGRATE_FAULT migration mode. If the normal page
> migration / compaction logic can do without taking
> an extra reference count, why does your code need it?

Hi Rik van Riel,

This is which part of codes? Why I can't find MIGRATE_FAULT in latest 
v3.7-rc2?

Regards,
Chen

>
> In Andrea's case, we have a comment suggesting an
> extra refcount is needed, immediately followed by
> a put_page:
>
>         /*
>          * Pin the head subpage at least until the first
>          * __isolate_lru_page succeeds (__isolate_lru_page pins it
>          * again when it succeeds). If we unpin before
>          * __isolate_lru_page successd, the page could be freed and
>          * reallocated out from under us. Thus our previous checks on
>          * the page, and the split_huge_page, would be worthless.
>          *
>          * We really only need to do this if "ret > 0" but it doesn't
>          * hurt to do it unconditionally as nobody can reference
>          * "page" anymore after this and so we can avoid an "if (ret >
>          * 0)" branch here.
>          */
>         put_page(page);
>
> This also confuses me.
>
> If we do not need the extra refcount (and I do not
> understand why NUMA migrate-on-fault needs one more
> refcount than normal page migration), we can get
> rid of the MIGRATE_FAULT mode.
>
> If we do need the extra refcount, why is normal
> page migration safe? :)
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-21  2:39 ` Ni zhan Chen
@ 2012-10-21  2:40   ` Rik van Riel
  2012-10-21 12:31     ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Rik van Riel @ 2012-10-21  2:40 UTC (permalink / raw)
  To: Ni zhan Chen
  Cc: Peter Zijlstra, Andrea Arcangeli, Ingo Molnar,
	Linux Memory Management List, Mel Gorman,
	Linux kernel Mailing List

On 10/20/2012 10:39 PM, Ni zhan Chen wrote:
> On 10/19/2012 11:53 PM, Rik van Riel wrote:
>> Hi Andrea, Peter,
>>
>> I have a question on page refcounting in your NUMA
>> page migration code.
>>
>> In Peter's case, I wonder why you introduce a new
>> MIGRATE_FAULT migration mode. If the normal page
>> migration / compaction logic can do without taking
>> an extra reference count, why does your code need it?
>
> Hi Rik van Riel,
>
> This is which part of codes? Why I can't find MIGRATE_FAULT in latest
> v3.7-rc2?

It is in tip.git in the numa/core branch.


-- 
All rights reversed

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-20 16:02           ` Rik van Riel
@ 2012-10-21 12:30             ` Ingo Molnar
  0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2012-10-21 12:30 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Peter Zijlstra, Andrea Arcangeli, Linux Memory Management List,
	Mel Gorman, Thomas Gleixner, Linux kernel Mailing List


* Rik van Riel <riel@redhat.com> wrote:

> On 10/19/2012 09:23 PM, Ingo Molnar wrote:
> >
> >* Rik van Riel <riel@redhat.com> wrote:
> >
> >>On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> >>>On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
> >>
> >>>>Another alternative might be to do the put_page inside
> >>>>do_prot_none_numa().  That would be analogous to do_wp_page
> >>>>disposing of the old page for the caller.
> >>>
> >>>It'd have to be inside migrate_misplaced_page(), can't do before
> >>>isolate_lru_page() or the page might disappear. Doing it after is
> >>>(obviously) too late.
> >>
> >>Keeping an extra refcount on the page might _still_
> >>result in it disappearing from the process by some
> >>other means, in-between you grabbing the refcount
> >>and invoking migration of the page.
> >>
> >>>>I am not real happy about NUMA migration introducing its own
> >>>>migration mode...
> >>>
> >>>You didn't seem to mind too much earlier, but I can remove it if you
> >>>want.
> >>
> >>Could have been reviewing fatigue :)
> >
> >:-)
> >
> >>And yes, it would have been nice to not have a special
> >>migration mode for sched/numa.
> >>
> >>Speaking of, when do you guys plan to submit a (cleaned up)
> >>version of the sched/numa patch series for review on lkml?
> >
> >Which commit(s) worry you specifically?
> 
> One of them would be the commit that introduces MIGRATE_FAULT.

MIGRATE_FAULT is still used.

> Adding it in one patch, and removing it into a next just makes
> things harder on the reviewers.

Yes.

> 03a040f6c17ab81659579ba6abe267c0562097e4

It's still used:

comet:~/tip> git grep MIGRATE_FAULT
include/linux/migrate_mode.h: * MIGRATE_FAULT called from the fault path to migrate-on-fault for mempolicy
include/linux/migrate_mode.h:	MIGRATE_FAULT,
mm/migrate.c:	if (mode != MIGRATE_ASYNC && mode != MIGRATE_FAULT) {
mm/migrate.c:	if (mode == MIGRATE_FAULT) {
mm/migrate.c:		 * MIGRATE_FAULT has an extra reference on the page and
mm/migrate.c:	if ((mode == MIGRATE_ASYNC || mode == MIGRATE_FAULT) && head &&
mm/migrate.c:	if (mode != MIGRATE_ASYNC && mode != MIGRATE_FAULT)
mm/migrate.c:		if (!force || mode == MIGRATE_ASYNC || mode == MIGRATE_FAULT)
mm/migrate.c:	ret = __unmap_and_move(page, newpage, 0, 0, MIGRATE_FAULT);

> If the changesets with NUMA syscalls are still in your tree's 
> history, they should not be submitted as part of the patch 
> series, either.

No, the syscalls have not been there for quite some time.

If you have trouble with any specific commit, please quote the 
specific SHA1 so that I can have a look and resolve any specific 
concerns.

Otherwise, lets continue with the integration?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: question on NUMA page migration
  2012-10-21  2:40   ` Rik van Riel
@ 2012-10-21 12:31     ` Ingo Molnar
  0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2012-10-21 12:31 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Ni zhan Chen, Peter Zijlstra, Andrea Arcangeli,
	Linux Memory Management List, Mel Gorman,
	Linux kernel Mailing List


* Rik van Riel <riel@redhat.com> wrote:

> On 10/20/2012 10:39 PM, Ni zhan Chen wrote:
> >On 10/19/2012 11:53 PM, Rik van Riel wrote:
> >>Hi Andrea, Peter,
> >>
> >>I have a question on page refcounting in your NUMA
> >>page migration code.
> >>
> >>In Peter's case, I wonder why you introduce a new
> >>MIGRATE_FAULT migration mode. If the normal page
> >>migration / compaction logic can do without taking
> >>an extra reference count, why does your code need it?
> >
> >Hi Rik van Riel,
> >
> >This is which part of codes? Why I can't find MIGRATE_FAULT in latest
> >v3.7-rc2?
> 
> It is in tip.git in the numa/core branch.

The Git access URI is:

  git pull  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core
  git clone git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-10-21 12:31 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-19 15:53 question on NUMA page migration Rik van Riel
2012-10-19 16:39 ` Peter Zijlstra
2012-10-19 17:13   ` Rik van Riel
2012-10-19 17:53     ` Peter Zijlstra
2012-10-19 18:33       ` Rik van Riel
2012-10-20  1:23         ` Ingo Molnar
2012-10-20 16:02           ` Rik van Riel
2012-10-21 12:30             ` Ingo Molnar
2012-10-21  2:39 ` Ni zhan Chen
2012-10-21  2:40   ` Rik van Riel
2012-10-21 12:31     ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).