From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Avi Kivity <avi@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Xen-devel <xen-devel@lists.xensource.com>,
	Jan Beulich <jbeulich@novell.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: Question about x86/mm/gup.c's use of disabled interrupts
Date: Thu, 19 Mar 2009 10:31:55 -0700
Message-ID: <49C2818B.9060201@goop.org>
In-Reply-To: <200903191232.05459.nickpiggin@yahoo.com.au>

Nick Piggin wrote:
>> Also, assuming that disabling the interrupt is enough to get the
>> guarantees we need here, there's a Xen problem because we don't use IPIs
>> for cross-cpu tlb flushes (well, it happens within Xen).  I'll have to
>> think a bit about how to deal with that, but I'm thinking that we could
>> add a per-cpu "tlb flushes blocked" flag, and maintain some kind of
>> per-cpu deferred tlb flush count so we can get around to doing the flush
>> eventually.
>>
>> But I want to make sure I understand the exact algorithm here.
>>     
>
> FWIW, powerpc actually can flush tlbs without IPIs, and it also has
> a gup_fast. powerpc RCU frees its page _tables_ so we can walk them,
> and then I use speculative page references in order to be able to
> take a reference on the page without having it pinned.
>   

Ah, interesting.  So on powerpc, disabling interrupts prevents the RCU
free from happening, and non-atomic pte fetching is a non-issue there.
So that approach doesn't address the PAE side of the problem.
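
For illustration, the non-atomic fetch at issue on PAE can be sketched in
user-space C as below. This is only a rough model of the "read low half,
read high half, re-check low half" pattern, not the kernel's code: the
struct and function names are hypothetical, and C11 atomics stand in for
the kernel's read barriers. The retry alone is not sufficient; it assumes
the writer never swaps one present entry for a different present entry
without an intervening TLB flush, which is the guarantee that disabling
interrupts is meant to preserve.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a PAE pte stored as two 32-bit halves. */
struct demo_pae_pte {
    _Atomic uint32_t low;   /* assumed to hold the present bit */
    _Atomic uint32_t high;
};

/*
 * Read both halves without a 64-bit atomic load.  If the low half
 * changed while the high half was being read, the snapshot may be
 * torn, so try again.  The acquire loads keep the three reads in
 * program order.
 */
static uint64_t demo_read_split(struct demo_pae_pte *p)
{
    uint32_t low, high;

    do {
        low = atomic_load_explicit(&p->low, memory_order_acquire);
        high = atomic_load_explicit(&p->high, memory_order_acquire);
    } while (low != atomic_load_explicit(&p->low, memory_order_relaxed));

    return ((uint64_t)high << 32) | low;
}
```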

> Turning gup_get_pte into a pvop would be a bit nasty because on !PAE
> it is just a single load, and even on PAE it is pretty cheap.
>   

Well, it wouldn't be too bad; for !PAE it would turn into something we 
could inline, so there'd be little to no cost.  For PAE it would be out 
of line, but a direct function call, which would be nicely cached and 
very predictable once we've gone through the loop once (and for Xen 
I think I'd just make it a cmpxchg8b-based implementation, assuming that 
the tlb flush hypercall would offset the cost of making gup_fast a bit 
slower).
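
As a rough illustration of what a cmpxchg8b-style read could look like,
here is a minimal user-space sketch using C11 atomics. The type and
function names are hypothetical and this is not proposed kernel code; it
only shows the general trick of obtaining a consistent 64-bit snapshot
from a compare-and-exchange. Whether the compiler emits cmpxchg8b for it
depends on the target.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit PAE pte; not the kernel's pte_t. */
typedef _Atomic uint64_t demo_pte_t;

/*
 * Atomically read a 64-bit value using only compare-and-exchange.
 * If the entry equals 'expected' (0), it is rewritten with 0, which
 * changes nothing.  Otherwise the exchange fails and 'expected' is
 * overwritten with the current value.  Either way 'expected' ends up
 * holding an untorn snapshot of the entry.
 */
static uint64_t demo_pte_read_cmpxchg(demo_pte_t *ptep)
{
    uint64_t expected = 0;

    atomic_compare_exchange_strong(ptep, &expected, 0);
    return expected;
}
```

Note that a compare-and-exchange is a locked read-modify-write, so this
form needs the location to be writable and costs more than two plain
loads, which is the trade-off weighed above.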

But it would be better if we can address it at a higher level.

    J


