From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: David Miller <davem@davemloft.net>,
akpm@linux-foundation.org, aneesh.kumar@linux.vnet.ibm.com,
steve.capper@linaro.org, aarcange@redhat.com, mpe@ellerman.id.au,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, linux-arch@vger.kernel.org,
hannes@cmpxchg.org
Subject: Re: [PATCH V2 1/2] mm: Update generic gup implementation to handle hugepage directory
Date: Mon, 27 Oct 2014 07:50:41 +1100
Message-ID: <1414356641.364.142.camel@pasglop>
In-Reply-To: <1414167761.19984.17.camel@jarvis.lan>
On Fri, 2014-10-24 at 09:22 -0700, James Bottomley wrote:
> Parisc does this. As soon as one CPU issues a TLB purge, it's broadcast
> to all the CPUs on the inter-CPU bus. The next instruction isn't
> executed until they respond.
>
> But this is only for our CPU TLB. There's no other external
> consequence, so removal from the page tables isn't effected by this TLB
> flush, therefore the theory on which Dave bases the change to
> atomic_add() should work for us (of course, atomic_add is lock add
> unlock on our CPU, so it's not going to be of much benefit).
I'm not sure I follow you here.
Do you or do you not perform an IPI to do TLB flushes? If you don't
(for example because you have HW broadcast), then you need the
speculative get_page(). If you do (and can read a PTE atomically), you
can get away with atomic_add().
The reason is that, if you remember how zap_pte_range works, we perform
the flush before we get rid of the page.
So if you're using IPIs for the flush, the fact that gup_fast has
interrupts disabled will delay the IPI response and thus effectively
prevent the pages from actually being freed, allowing us to simply do
the atomic_add(), as on x86.
But if we don't use IPIs because we have HW broadcast of TLB
invalidations, then we don't have that synchronization. atomic_add won't
work; we need get_page_speculative() because the page could
concurrently be being freed.
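To make the difference concrete, here is a minimal userspace sketch of the
two ways of taking a reference. It is not kernel code: struct page_model,
grab_unconditional(), grab_speculative(), gup_grab() and the
ARCH_TLB_FLUSH_USES_IPI switch are all hypothetical names invented for
illustration, and C11 atomics stand in for the kernel's atomic_t. The
speculative variant is similar in spirit to the kernel's
get_page_unless_zero(): it only succeeds if the count is still non-zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct page_model {
    atomic_int refcount;    /* 0 means the page has already been freed */
};

/*
 * Unconditional grab (the atomic_add() case). Only safe when something
 * else guarantees the page cannot be freed under us, e.g. a TLB flush
 * IPI that cannot complete while we run with interrupts disabled.
 */
static inline void grab_unconditional(struct page_model *p, int nr)
{
    atomic_fetch_add(&p->refcount, nr);
}

/*
 * Speculative grab: take nr references only if the count is non-zero.
 * Returns false if the page was already on its way to being freed.
 */
static inline bool grab_speculative(struct page_model *p, int nr)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + nr))
            return true;
    }
    return false;
}

/* Drop one reference; returns true if this was the last one. */
static inline bool put_ref(struct page_model *p)
{
    int old = atomic_fetch_sub(&p->refcount, 1);

    assert(old > 0);
    return old == 1;
}

/* Hypothetical arch-provided switch; 0 models HW broadcast (no IPI). */
#ifndef ARCH_TLB_FLUSH_USES_IPI
#define ARCH_TLB_FLUSH_USES_IPI 0
#endif

/* Generic caller picks the primitive based on the arch switch. */
static inline bool gup_grab(struct page_model *p, int nr)
{
#if ARCH_TLB_FLUSH_USES_IPI
    grab_unconditional(p, nr);
    return true;
#else
    return grab_speculative(p, nr);
#endif
}
```

With a count of zero, grab_speculative() refuses and leaves the count
alone, whereas grab_unconditional() would bump it to one and hand out a
reference to a page that is already being freed. That is the case the IPI
synchronization rules out and HW broadcast does not.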
Cheers,
Ben.
> James
>
> > Another option would be to make the generic code use something defined
> > by the arch to decide whether to use speculative get or
> > not. I like the idea of keeping the bulk of that code generic...
> >
> > Cheers,
> > Ben.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>