From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Avi Kivity <avi@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Xen-devel <xen-devel@lists.xensource.com>,
	Jan Beulich <jbeulich@novell.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: Question about x86/mm/gup.c's use of disabled interrupts
Date: Wed, 18 Mar 2009 15:14:08 -0700
Message-ID: <49C17230.20109@goop.org>
In-Reply-To: <49C16A48.4090303@redhat.com>

Avi Kivity wrote:
> Jeremy Fitzhardinge wrote:
>>>> Disabling the interrupt will prevent the tlb flush IPI from coming 
>>>> in and flushing this cpu's tlb, but I don't see how it will prevent 
>>>> some other cpu from actually updating the pte in the pagetable, 
>>>> which is what we're concerned about here.  
>>>
>>> The thread that cleared the pte holds the pte lock and is now 
>>> waiting for the IPI.  The thread that wants to update the pte will 
>>> wait for the pte lock, thus also waits on the IPI and gup_fast()'s 
>>> local_irq_enable().  I think.
>>
>> But hasn't it already done the pte update at that point?
>>
>> (I think this conversation really is moot because the kernel never 
>> does P->P pte updates any more; its always P->N->P.)
>
> I thought you were concerned about cpu 0 doing a gup_fast(), cpu 1 
> doing P->N, and cpu 2 doing N->P.  In this case cpu 2 is waiting on 
> the pte lock.

The issue is that if cpu 0 is doing a gup_fast() and other cpus are 
doing P->P updates, then gup_fast() can potentially get a mix of old and 
new pte values - where P->P is any aggregate set of unsynchronized P->N 
and N->P operations on any number of other cpus.  Ah, but if every P->N 
is followed by a tlb flush, then disabling interrupts will hold off any 
following N->P, allowing gup_fast to get a consistent pte snapshot.
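To make the "mix of old and new pte values" concrete, here is a minimal
user-space sketch of the split read, assuming a 64-bit PAE-style pte stored
as two 32-bit words.  The names (struct split_pte, pte_snapshot, and so on)
are hypothetical stand-ins, not the kernel's, and C11 atomics stand in for
the kernel's barriers.  The reader re-checks the low word after reading the
high word; that alone does not help if the low word goes P->N->P back to
the same value in between, which is why the N->P has to be held off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit pte kept as two 32-bit halves. */
struct split_pte {
    _Atomic uint32_t low;   /* carries the present bit (bit 0) */
    _Atomic uint32_t high;
};

/* P->N: clear the low word first, so the entry is non-present
 * before the high word changes. */
static void pte_clear_split(struct split_pte *p)
{
    atomic_store(&p->low, 0);
    atomic_store(&p->high, 0);
}

/* N->P: write the high word first, then the low word (with the
 * present bit) last.  Only valid when starting from a cleared entry. */
static void pte_set_split(struct split_pte *p, uint64_t val)
{
    assert(atomic_load(&p->low) == 0);
    atomic_store(&p->high, (uint32_t)(val >> 32));
    atomic_store(&p->low, (uint32_t)val);
}

/* Reader: low, then high, then re-check low; retry if low changed.
 * This catches a writer that was mid-update, but not a complete
 * P->N->P that restores the same low word with a different high word. */
static uint64_t pte_snapshot(struct split_pte *p)
{
    uint32_t low, high;

    do {
        low = atomic_load(&p->low);
        high = atomic_load(&p->high);
    } while (low != atomic_load(&p->low));

    return ((uint64_t)high << 32) | low;
}
```

With every P->N followed by a tlb flush that cannot complete while the
reader has interrupts off, the second case cannot happen inside
pte_snapshot(), so the re-check is enough.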

Hm, awkward if flush_tlb_others doesn't IPI...

> Won't stop munmap().

And I guess it does the tlb flush before freeing the pages, so disabling 
the interrupt helps here too.

Simplest fix is to make gup_get_pte() a pvop, but that does seem like 
putting a red flag in front of an inner-loop hotspot, or something...

The per-cpu tlb-flush exclusion flag might really be the way to go.
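Roughly, as a user-space sketch only: each cpu sets a flag for the duration
of its lockless walk, and a flusher treats a set flag the same way it would
treat an unanswered IPI.  Everything here (the names, the fixed cpu count,
the bitmask argument) is an assumption made for illustration and not an
existing kernel interface; a real flusher would spin or retry rather than
just report.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4   /* hypothetical fixed cpu count */

/* One flag per cpu: set while that cpu is inside a lockless walk. */
static atomic_bool gup_in_progress[NR_CPUS_SKETCH];

/* Walker side: mark this cpu as being inside gup_fast(). */
static void gup_fast_begin(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    atomic_store(&gup_in_progress[cpu], true);
}

/* Walker side: the walk is finished, flushes may complete again. */
static void gup_fast_end(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    atomic_store(&gup_in_progress[cpu], false);
}

/* Flusher side: true only if no cpu selected by cpumask is mid-walk.
 * A real implementation would wait here until this becomes true. */
static bool tlb_flush_may_complete(unsigned int cpumask)
{
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        if (((cpumask >> cpu) & 1u) && atomic_load(&gup_in_progress[cpu]))
            return false;
    }
    return true;
}
```

The point of the flag is that it works the same whether or not the flush
is delivered by IPI, at the cost of an extra store on each side of the walk.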

    J
