* Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
From: Michael S. Tsirkin @ 2013-05-19 9:35 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-m32r-ja, kvm, Catalin Marinas, Will Deacon, David Howells,
linux-mm, Paul Mackerras, H. Peter Anvin, linux-arch,
linux-am33-list, Hirokazu Takata, x86, Ingo Molnar, Arnd Bergmann,
microblaze-uclinux, Chris Metcalf, rostedt, Thomas Gleixner,
linux-arm-kernel, Michal Simek, linux-m32r, linux-kernel,
Koichi Yasutake, linuxppc-dev
In-Reply-To: <20130516184041.GP19669@dyad.programming.kicks-ass.net>
On Thu, May 16, 2013 at 08:40:41PM +0200, Peter Zijlstra wrote:
> On Thu, May 16, 2013 at 02:16:10PM +0300, Michael S. Tsirkin wrote:
> > There are several ways to make sure might_fault
> > calling function does not sleep.
> > One is to use it on kernel or otherwise locked memory - apparently
> > nfs/sunrpc does this. As noted by Ingo, this is handled by the
> > migh_fault() implementation in mm/memory.c but not the one in
> > linux/kernel.h so in the current code might_fault() schedules
> > differently depending on CONFIG_PROVE_LOCKING, which is an undesired
> > semantical side effect.
> >
> > Another is to call pagefault_disable: in this case the page fault
> > handler will go to fixups processing and we get an error instead of
> > sleeping, so the might_sleep annotation is a false positive.
> > vhost driver wants to do this now in order to reuse socket ops
> > under a spinlock (and fall back on slower thread handler
> > on error).
>
> Are you using the assumption that spin_lock() implies preempt_disable() implies
> pagefault_disable()? Note that this assumption isn't valid for -rt where the
> spinlock becomes preemptible but we'll not disable pagefaults.
No, I was not assuming that. What I'm trying to say is that a caller
that does something like this under a spinlock:
preempt_disable
pagefault_disable
error = copy_to_user
pagefault_enable
preempt_enable_no_resched
is not doing anything wrong and should not get a warning,
as long as error is handled correctly later.
Right?
> > Address both issues by:
> > - dropping the unconditional call to might_sleep
> > from the fast might_fault code in linux/kernel.h
> > - checking for pagefault_disable() in the
> > CONFIG_PROVE_LOCKING implementation
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > include/linux/kernel.h | 1 -
> > mm/memory.c | 14 +++++++++-----
> > 2 files changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index e96329c..322b065 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -198,7 +198,6 @@ void might_fault(void);
> > #else
> > static inline void might_fault(void)
> > {
> > - might_sleep();
>
> This removes potential resched points for PREEMPT_VOLUNTARY -- was that
> intentional?
No it's a bug. Thanks for pointing this out.
OK so I guess it should be might_sleep_if(!in_atomic())
and this means might_fault would have to move from linux/kernel.h to
linux/uaccess.h, since in_atomic() is in linux/hardirq.h
Makes sense?
> > }
> > #endif
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 6dc1882..1b8327b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4222,13 +4222,17 @@ void might_fault(void)
> > if (segment_eq(get_fs(), KERNEL_DS))
> > return;
> >
> > - might_sleep();
> > /*
> > - * it would be nicer only to annotate paths which are not under
> > - * pagefault_disable, however that requires a larger audit and
> > - * providing helpers like get_user_atomic.
> > + * It would be nicer to annotate paths which are under preempt_disable
> > + * but not under pagefault_disable, however that requires a new flag
> > + * for differentiating between the two.
>
> -rt has this, pagefault_disable() doesn't change the preempt count but pokes
> at task_struct::pagefault_disable.
Good to know.
So maybe we can import this at least for CONFIG_PROVE_LOCKING?
To make the patch smaller I'd prefer doing both for now,
this way this patchset does not have to poke in too many
mm internals.
I can try doing that - unless
someone else has plans to merge this part soon anyway?
> > */
> > - if (!in_atomic() && current->mm)
> > + if (in_atomic())
> > + return;
> > +
> > + might_sleep();
> > +
> > + if (current->mm)
> > might_lock_read(¤t->mm->mmap_sem);
> > }
> > EXPORT_SYMBOL(might_fault);
> > --
> > MST
^ permalink raw reply
* Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
From: Steven Rostedt @ 2013-05-19 12:34 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-m32r-ja, kvm, Peter Zijlstra, Catalin Marinas, Will Deacon,
David Howells, linux-mm, Paul Mackerras, H. Peter Anvin,
linux-arch, linux-am33-list, Hirokazu Takata, x86, Ingo Molnar,
Arnd Bergmann, microblaze-uclinux, Chris Metcalf, Thomas Gleixner,
linux-arm-kernel, Michal Simek, linux-m32r, linux-kernel,
Koichi Yasutake, linuxppc-dev
In-Reply-To: <20130519093526.GD19883@redhat.com>
On Sun, 2013-05-19 at 12:35 +0300, Michael S. Tsirkin wrote:
> No, I was not assuming that. What I'm trying to say is that a caller
> that does something like this under a spinlock:
> preempt_disable
> pagefault_disable
> error = copy_to_user
> pagefault_enable
> preempt_enable_no_resched
>
> is not doing anything wrong and should not get a warning,
> as long as error is handled correctly later.
> Right?
>
What I see wrong with the above is the preempt_enable_no_resched(). The
only place that should be ever used is right before a schedule(), as you
don't want to schedule twice (once for the preempt_enable() and then
again for the schedule itself).
Remember, in -rt, a spin lock does not disable preemption.
-- Steve
^ permalink raw reply
* Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
From: Michael S. Tsirkin @ 2013-05-19 13:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-m32r-ja, kvm, Peter Zijlstra, Catalin Marinas, Will Deacon,
David Howells, linux-mm, Paul Mackerras, H. Peter Anvin,
linux-arch, linux-am33-list, Hirokazu Takata, x86, Ingo Molnar,
Arnd Bergmann, microblaze-uclinux, Chris Metcalf, Thomas Gleixner,
linux-arm-kernel, Michal Simek, linux-m32r, linux-kernel,
Koichi Yasutake, linuxppc-dev
In-Reply-To: <1368966844.6828.111.camel@gandalf.local.home>
On Sun, May 19, 2013 at 08:34:04AM -0400, Steven Rostedt wrote:
> On Sun, 2013-05-19 at 12:35 +0300, Michael S. Tsirkin wrote:
>
> > No, I was not assuming that. What I'm trying to say is that a caller
> > that does something like this under a spinlock:
> > preempt_disable
> > pagefault_disable
> > error = copy_to_user
> > pagefault_enable
> > preempt_enable_no_resched
> >
> > is not doing anything wrong and should not get a warning,
> > as long as error is handled correctly later.
> > Right?
> >
>
> What I see wrong with the above is the preempt_enable_no_resched(). The
> only place that should be ever used is right before a schedule(), as you
> don't want to schedule twice (once for the preempt_enable() and then
> again for the schedule itself).
>
> Remember, in -rt, a spin lock does not disable preemption.
>
> -- Steve
Right but we need to keep it working on upstream as well.
If I do preempt_enable under a spinlock upstream won't it
try to sleep under spinlock?
--
MST
^ permalink raw reply
* Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
From: Steven Rostedt @ 2013-05-19 16:06 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-m32r-ja, kvm, Peter Zijlstra, Catalin Marinas, Will Deacon,
David Howells, linux-mm, Paul Mackerras, H. Peter Anvin,
linux-arch, linux-am33-list, Hirokazu Takata, x86, Ingo Molnar,
Arnd Bergmann, microblaze-uclinux, Chris Metcalf, Thomas Gleixner,
linux-arm-kernel, Michal Simek, linux-m32r, linux-kernel,
Koichi Yasutake, linuxppc-dev
In-Reply-To: <20130519133418.GA24381@redhat.com>
On Sun, 2013-05-19 at 16:34 +0300, Michael S. Tsirkin wrote:
> Right but we need to keep it working on upstream as well.
> If I do preempt_enable under a spinlock upstream won't it
> try to sleep under spinlock?
No it wont. A spinlock calls preempt_disable implicitly, and a
preempt_enable() will not schedule unless preempt_count is zero, which
it wont be under a spinlock.
If it did, there would be lots of bugs all over the place because this
is done throughout the kernel (a preempt_enable() under a spinlock).
In other words, don't ever use preempt_enable_no_resched().
-- Steve
^ permalink raw reply
* Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
From: Michael S. Tsirkin @ 2013-05-19 16:40 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-m32r-ja, kvm, Peter Zijlstra, Catalin Marinas, Will Deacon,
David Howells, linux-mm, Paul Mackerras, H. Peter Anvin,
linux-arch, linux-am33-list, Hirokazu Takata, x86, Ingo Molnar,
Arnd Bergmann, microblaze-uclinux, Chris Metcalf, Thomas Gleixner,
linux-arm-kernel, Michal Simek, linux-m32r, linux-kernel,
Koichi Yasutake, linuxppc-dev
In-Reply-To: <1368979579.6828.114.camel@gandalf.local.home>
On Sun, May 19, 2013 at 12:06:19PM -0400, Steven Rostedt wrote:
> On Sun, 2013-05-19 at 16:34 +0300, Michael S. Tsirkin wrote:
>
> > Right but we need to keep it working on upstream as well.
> > If I do preempt_enable under a spinlock upstream won't it
> > try to sleep under spinlock?
>
> No it wont. A spinlock calls preempt_disable implicitly, and a
> preempt_enable() will not schedule unless preempt_count is zero, which
> it wont be under a spinlock.
>
> If it did, there would be lots of bugs all over the place because this
> is done throughout the kernel (a preempt_enable() under a spinlock).
>
> In other words, don't ever use preempt_enable_no_resched().
>
> -- Steve
>
OK I get it. So let me correct myself. The simple code
that does something like this under a spinlock:
> preempt_disable
> pagefault_disable
> error = copy_to_user
> pagefault_enable
> preempt_enable
>
is not doing anything wrong and should not get a warning,
as long as error is handled correctly later.
Right?
^ permalink raw reply
* Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
From: Steven Rostedt @ 2013-05-19 20:23 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-m32r-ja, kvm, Peter Zijlstra, Catalin Marinas, Will Deacon,
David Howells, linux-mm, Paul Mackerras, H. Peter Anvin,
linux-arch, linux-am33-list, Hirokazu Takata, x86, Ingo Molnar,
Arnd Bergmann, microblaze-uclinux, Chris Metcalf, Thomas Gleixner,
linux-arm-kernel, Michal Simek, linux-m32r, linux-kernel,
Koichi Yasutake, linuxppc-dev
In-Reply-To: <20130519164009.GA2434@redhat.com>
On Sun, 2013-05-19 at 19:40 +0300, Michael S. Tsirkin wrote:
> OK I get it. So let me correct myself. The simple code
> that does something like this under a spinlock:
> > preempt_disable
> > pagefault_disable
> > error = copy_to_user
> > pagefault_enable
> > preempt_enable
> >
> is not doing anything wrong and should not get a warning,
> as long as error is handled correctly later.
> Right?
I came in mid thread and I don't know the context. Anyway, the above
looks to me as you just don't want to sleep. If you try to copy data to
user space that happens not to be currently mapped for any reason, you
will get an error. Even if the address space is completely valid. Is
that what you want?
-- Steve
^ permalink raw reply
* Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep
From: Michael S. Tsirkin @ 2013-05-19 20:35 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-m32r-ja, kvm, Peter Zijlstra, Catalin Marinas, Will Deacon,
David Howells, linux-mm, Paul Mackerras, H. Peter Anvin,
linux-arch, linux-am33-list, Hirokazu Takata, x86, Ingo Molnar,
Arnd Bergmann, microblaze-uclinux, Chris Metcalf, Thomas Gleixner,
linux-arm-kernel, Michal Simek, linux-m32r, linux-kernel,
Koichi Yasutake, linuxppc-dev
In-Reply-To: <1368995002.6828.117.camel@gandalf.local.home>
On Sun, May 19, 2013 at 04:23:22PM -0400, Steven Rostedt wrote:
> On Sun, 2013-05-19 at 19:40 +0300, Michael S. Tsirkin wrote:
>
> > OK I get it. So let me correct myself. The simple code
> > that does something like this under a spinlock:
> > > preempt_disable
> > > pagefault_disable
> > > error = copy_to_user
> > > pagefault_enable
> > > preempt_enable
> > >
> > is not doing anything wrong and should not get a warning,
> > as long as error is handled correctly later.
> > Right?
>
> I came in mid thread and I don't know the context.
The context is that I want to change might_fault
from might_sleep to
might_sleep_if(!in_atomic())
so that above does not trigger warnings even with
CONFIG_DEBUG_ATOMIC_SLEEP enabled.
> Anyway, the above
> looks to me as you just don't want to sleep.
Exactly. upstream we can just do pagefault_disable but to make
this code -rt ready it's best to do preempt_disable as well.
> If you try to copy data to
> user space that happens not to be currently mapped for any reason, you
> will get an error. Even if the address space is completely valid. Is
> that what you want?
>
> -- Steve
>
Yes, this is by design.
We detect that and bounce the work to a thread outside
any locks.
Thanks,
--
MST
^ permalink raw reply
* Re: [PATCH 1/2] powerpc: use smp_rmb when looking at deposisted pgtable to store hash index
From: Michael Neuling @ 2013-05-20 1:18 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: paulus, linuxppc-dev, dwg
In-Reply-To: <1368778503-23230-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> We need to use smb_rmb when looking at hpte slot array. Otherwise we could
> reorder the hpte_slot array load bfore even we marked the pmd trans huge.
Does this need to go back into the stable series?
Mikey
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/pgtable-ppc64.h | 15 +++++++++++++++
> arch/powerpc/mm/hugepage-hash64.c | 6 +-----
> arch/powerpc/mm/pgtable_64.c | 6 +-----
> 3 files changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index d046289..46db094 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -10,6 +10,7 @@
> #else
> #include <asm/pgtable-ppc64-4k.h>
> #endif
> +#include <asm/barrier.h>
>
> #define FIRST_USER_ADDRESS 0
>
> @@ -393,6 +394,20 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
> hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
> }
>
> +static inline char *get_hpte_slot_array(pmd_t *pmdp)
> +{
> + /*
> + * The hpte hindex is stored in the pgtable whose address is in the
> + * second half of the PMD
> + *
> + * Order this load with the test for pmd_trans_huge in the caller
> + */
> + smp_rmb();
> + return *(char **)(pmdp + PTRS_PER_PMD);
> +
> +
> +}
> +
> extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
> pmd_t *pmdp);
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
> index e430766..c3ba3d5 100644
> --- a/arch/powerpc/mm/hugepage-hash64.c
> +++ b/arch/powerpc/mm/hugepage-hash64.c
> @@ -84,11 +84,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
>
> vpn = hpt_vpn(ea, vsid, ssize);
> hash = hpt_hash(vpn, shift, ssize);
> - /*
> - * The hpte hindex are stored in the pgtable whose address is in the
> - * second half of the PMD
> - */
> - hpte_slot_array = *(char **)(pmdp + PTRS_PER_PMD);
> + hpte_slot_array = get_hpte_slot_array(pmdp);
>
> valid = hpte_valid(hpte_slot_array, index);
> if (valid) {
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index 8dd7c83..19d6734 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -701,11 +701,7 @@ void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
> * Flush all the hptes mapping this hugepage
> */
> s_addr = addr & HPAGE_PMD_MASK;
> - /*
> - * The hpte hindex are stored in the pgtable whose address is in the
> - * second half of the PMD
> - */
> - hpte_slot_array = *(char **)(pmdp + PTRS_PER_PMD);
> + hpte_slot_array = get_hpte_slot_array(pmdp);
> /*
> * IF we try to do a HUGE PTE update after a withdraw is done.
> * we will find the below NULL. This happens when we do
> --
> 1.8.1.2
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
^ permalink raw reply
* Re: [PATCH 1/2] powerpc: use smp_rmb when looking at deposisted pgtable to store hash index
From: Aneesh Kumar K.V @ 2013-05-20 4:27 UTC (permalink / raw)
To: Michael Neuling; +Cc: paulus, linuxppc-dev, dwg
In-Reply-To: <26735.1369012726@ale.ozlabs.ibm.com>
Michael Neuling <mikey@neuling.org> writes:
> Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>
>> We need to use smb_rmb when looking at hpte slot array. Otherwise we could
>> reorder the hpte_slot array load bfore even we marked the pmd trans huge.
>
> Does this need to go back into the stable series?
>
No the changes are not yet upstream. hpte_slot_array changes are in the
THP series. I didn't want to post the entire series again with the
above two patches.
-aneesh
^ permalink raw reply
* Re: [PATCH 17/18] cpufreq: powerpc: move cpufreq driver to drivers/cpufreq
From: Viresh Kumar @ 2013-05-20 4:40 UTC (permalink / raw)
To: Benjamin Herrenschmidt, deepthi
Cc: robin.randhawa, linux-pm, patches, Liviu.Dudau, linux-kernel,
cpufreq, rjw, Steve.Bannister, Paul Mackerras, Olof Johansson,
arvind.chauhan, linuxppc-dev, linaro-kernel, charles.garcia-tobin
In-Reply-To: <CAKohponBfXsy21RTik+jWMukL_VnDNXOL7xmj4sEBF3HEBXqpQ@mail.gmail.com>
On 13 May 2013 11:34, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 22 April 2013 12:19, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 9 April 2013 14:05, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>>> On 5 April 2013 12:16, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>>>> On 4 April 2013 18:24, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>>>>> This patch moves cpufreq driver of powerpc platform to drivers/cpufreq.
>>>>>
>>>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>>> Cc: Paul Mackerras <paulus@samba.org>
>>>>> Cc: Olof Johansson <olof@lixom.net>
>>>>> Cc: linuxppc-dev@lists.ozlabs.org
>>>>> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>>>>> ---
>>>>> Compile Tested only.
>>>>>
>>>>> arch/powerpc/platforms/Kconfig | 31 ----------------------
>>>>> arch/powerpc/platforms/pasemi/Makefile | 1 -
>>>>> arch/powerpc/platforms/powermac/Makefile | 2 --
>>>>> drivers/cpufreq/Kconfig.powerpc | 26 ++++++++++++++++++
>>>>> drivers/cpufreq/Makefile | 3 +++
>>>>> .../cpufreq.c => drivers/cpufreq/pasemi-cpufreq.c | 0
>>>>> .../cpufreq/pmac32-cpufreq.c | 0
>>>>> .../cpufreq/pmac64-cpufreq.c | 0
>>>>> 8 files changed, 29 insertions(+), 34 deletions(-)
>>>>> rename arch/powerpc/platforms/pasemi/cpufreq.c => drivers/cpufreq/pasemi-cpufreq.c (100%)
>>>>> rename arch/powerpc/platforms/powermac/cpufreq_32.c => drivers/cpufreq/pmac32-cpufreq.c (100%)
>>>>> rename arch/powerpc/platforms/powermac/cpufreq_64.c => drivers/cpufreq/pmac64-cpufreq.c (100%)
>>>>
>>>> Hi Deepthi,
>>>>
>>>> Can you help testing this please?
>>>
>>> Ping!!
>>
>> Ping!!
>
> Hi Benjamin,
>
> Hope you are back from your vacations. Can you give it a try now?
Ping!!
^ permalink raw reply
* Re: [PATCH 1/2] powerpc: use smp_rmb when looking at deposisted pgtable to store hash index
From: Benjamin Herrenschmidt @ 2013-05-20 6:28 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linuxppc-dev, Michael Neuling, paulus, dwg
In-Reply-To: <87vc6e9sxx.fsf@linux.vnet.ibm.com>
On Mon, 2013-05-20 at 09:57 +0530, Aneesh Kumar K.V wrote:
> Michael Neuling <mikey@neuling.org> writes:
>
> > Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> >
> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> >>
> >> We need to use smb_rmb when looking at hpte slot array. Otherwise we could
> >> reorder the hpte_slot array load bfore even we marked the pmd trans huge.
> >
> > Does this need to go back into the stable series?
> >
>
> No the changes are not yet upstream. hpte_slot_array changes are in the
> THP series. I didn't want to post the entire series again with the
> above two patches.
Note that any patch that adds such a rmb should have a very clear
description of what is the corresponding wmb. It's almost never right to
have one without the other. IE. What is it that you are ordering
against.
Cheers,
Ben.
^ permalink raw reply
* Re: [PATCH 1/2] powerpc: use smp_rmb when looking at deposisted pgtable to store hash index
From: Aneesh Kumar K.V @ 2013-05-20 9:26 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Michael Neuling, paulus, dwg
In-Reply-To: <1369031318.6387.1.camel@pasglop>
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Mon, 2013-05-20 at 09:57 +0530, Aneesh Kumar K.V wrote:
>> Michael Neuling <mikey@neuling.org> writes:
>>
>> > Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>> >
>> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> >>
>> >> We need to use smb_rmb when looking at hpte slot array. Otherwise we could
>> >> reorder the hpte_slot array load bfore even we marked the pmd trans huge.
>> >
>> > Does this need to go back into the stable series?
>> >
>>
>> No the changes are not yet upstream. hpte_slot_array changes are in the
>> THP series. I didn't want to post the entire series again with the
>> above two patches.
>
> Note that any patch that adds such a rmb should have a very clear
> description of what is the corresponding wmb. It's almost never right to
> have one without the other. IE. What is it that you are ordering
> against.
Will update the commit message. Some of that smb_wmb() are in the
core THP, and the others in ppc64 code. Will update with more info.
When we deposit pgtable we use pgtable_trans_huge_deposit that calls
related smb_wmb(). When we read pgtable via pgtable_trans_huge_withdraw,
core THP code does the related smb_wmb() after setting the right
PTE information in the pgtable.
-aneesh
^ permalink raw reply
* [PATCH] powerpc/tm: Abort transactions on emulation and alignment faults
From: Michael Neuling @ 2013-05-20 11:07 UTC (permalink / raw)
To: benh; +Cc: Linux PPC dev, matt
If we are emulating an instruction inside an active user transaction that
touches memory, the kernel can't emulate it as it operates in transactional
suspend context. We need to abort these transactions and send them back to
userspace for the hardware to rollback.
We can service these if the user transaction is in suspend mode, since the
kernel will operate in the same suspend context.
This adds a check to all alignment faults and to specific instruction
emulations (only string instructions for now). If the user process is in an
active (non-suspended) transaction, we abort the transaction go back to
userspace allowing the HW to roll back the transaction and tell the user of the
failure. This also adds new tm abort cause codes to report the reason of the
persistent error to the user.
Crappy test case here http://neuling.org/devel/junkcode/aligntm.c
Signed-off-by: Michael Neuling <mikey@neuling.org>
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a613651..2f20a40 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -121,6 +121,8 @@
#define TM_CAUSE_SYSCALL 0xf9 /* Persistent */
#define TM_CAUSE_MISC 0xf6
#define TM_CAUSE_SIGNAL 0xf4
+#define TM_CAUSE_ALIGNMENT 0xf2
+#define TM_CAUSE_EMULATE 0xf0
#if defined(CONFIG_PPC_BOOK3S_64)
#define MSR_64BIT MSR_SF
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index a7a648f..245c1e1 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -53,6 +53,7 @@
#ifdef CONFIG_PPC64
#include <asm/firmware.h>
#include <asm/processor.h>
+#include <asm/tm.h>
#endif
#include <asm/kexec.h>
#include <asm/ppc-opcode.h>
@@ -932,6 +933,28 @@ static int emulate_isel(struct pt_regs *regs, u32 instword)
return 0;
}
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+static inline bool tm_abort_check(struct pt_regs *regs, int cause)
+{
+ /* If we're emulating a load/store in an active transaction, we cannot
+ * emulate it as the kernel operates in transaction suspended context.
+ * We need to abort the transaction. This creates a persistent TM
+ * abort so tell the user what caused it with a new code.
+ */
+ if (MSR_TM_TRANSACTIONAL(regs->msr)) {
+ tm_enable();
+ tm_abort(cause);
+ return true;
+ }
+ return false;
+}
+#else
+static inline bool tm_abort_check(struct pt_regs *regs, int reason)
+{
+ return false;
+}
+#endif
+
static int emulate_instruction(struct pt_regs *regs)
{
u32 instword;
@@ -971,6 +994,8 @@ static int emulate_instruction(struct pt_regs *regs)
/* Emulate load/store string insn. */
if ((instword & PPC_INST_STRING_GEN_MASK) == PPC_INST_STRING) {
+ if (tm_abort_check(regs, TM_CAUSE_EMULATE))
+ return -EINVAL;
PPC_WARN_EMULATED(string, regs);
return emulate_string_inst(regs, instword);
}
@@ -1147,7 +1172,10 @@ void alignment_exception(struct pt_regs *regs)
/* We restore the interrupt state now */
if (!arch_irq_disabled_regs(regs))
local_irq_enable();
-
+
+ if (tm_abort_check(regs, TM_CAUSE_ALIGNMENT))
+ goto bail;
+
/* we don't implement logging of alignment exceptions */
if (!(current->thread.align_ctl & PR_UNALIGN_SIGBUS))
fixed = fix_alignment(regs);
^ permalink raw reply related
* Re: [PATCH v2 2/4] powerpc/cputable: advertise DSCR support on P7/P7+
From: Will Schmidt @ 2013-05-20 15:04 UTC (permalink / raw)
To: Nishanth Aravamudan
Cc: Michael Neuling, Michael R Meissner, Steve Munroe, Peter Bergner,
Ryan Arnold, linuxppc-dev
In-Reply-To: <20130504004838.GB3532@linux.vnet.ibm.com>
On Fri, 2013-05-03 at 17:48 -0700, Nishanth Aravamudan wrote:
> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
>
> diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> index ae9f433..a792157 100644
> --- a/arch/powerpc/kernel/cputable.c
> +++ b/arch/powerpc/kernel/cputable.c
> @@ -98,6 +98,7 @@ extern void __restore_cpu_e6500(void);
> PPC_FEATURE_SMT | PPC_FEATURE_ICACHE_SNOOP | \
> PPC_FEATURE_TRUE_LE | \
> PPC_FEATURE_PSERIES_PERFMON_COMPAT)
> +#define COMMON_USER2_POWER7 (PPC_FEATURE2_DSCR)
> #define COMMON_USER_POWER8 (COMMON_USER_PPC64 | PPC_FEATURE_ARCH_2_06 |\
> PPC_FEATURE_SMT | PPC_FEATURE_ICACHE_SNOOP | \
> PPC_FEATURE_TRUE_LE | \
> @@ -428,6 +429,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
> .cpu_name = "POWER7 (architected)",
> .cpu_features = CPU_FTRS_POWER7,
> .cpu_user_features = COMMON_USER_POWER7,
> + .cpu_user_features2 = COMMON_USER2_POWER7,
> .mmu_features = MMU_FTRS_POWER7,
> .icache_bsize = 128,
> .dcache_bsize = 128,
> @@ -458,6 +460,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
> .cpu_name = "POWER7 (raw)",
> .cpu_features = CPU_FTRS_POWER7,
> .cpu_user_features = COMMON_USER_POWER7,
> + .cpu_user_features2 = COMMON_USER2_POWER7,
> .mmu_features = MMU_FTRS_POWER7,
> .icache_bsize = 128,
> .dcache_bsize = 128,
> @@ -475,6 +478,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
> .cpu_name = "POWER7+ (raw)",
> .cpu_features = CPU_FTRS_POWER7,
> .cpu_user_features = COMMON_USER_POWER7,
> + .cpu_user_features = COMMON_USER2_POWER7,
^ Oops here, I think. Please consider applying this on top.
Untested, but seems obvious.
Fix a typo in setting COMMON_USER2_POWER7 bits to .cpu_user_features2
cpu specs table.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
diff --git a/arch/powerpc/kernel/cputable.c
b/arch/powerpc/kernel/cputable.c
index c60bbec..51eecb5 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -482,7 +482,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
.cpu_name = "POWER7+ (raw)",
.cpu_features = CPU_FTRS_POWER7,
.cpu_user_features = COMMON_USER_POWER7,
- .cpu_user_features = COMMON_USER2_POWER7,
+ .cpu_user_features2 = COMMON_USER2_POWER7,
.mmu_features = MMU_FTRS_POWER7,
.icache_bsize = 128,
.dcache_bsize = 128,
> .mmu_features = MMU_FTRS_POWER7,
> .icache_bsize = 128,
> .dcache_bsize = 128,
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
^ permalink raw reply related
* Build failure with 3.9.3 (Regression 3.9.2->3.9.3)
From: Adam Lackorzynski @ 2013-05-20 15:14 UTC (permalink / raw)
To: linuxppc-dev
Hi,
3.9.3 introduced the following build failure:
CC arch/powerpc/kernel/rtas.o
arch/powerpc/kernel/rtas.c: In function =E2=80=98rtas_cpu_state_change_mask=
=E2=80=99:
arch/powerpc/kernel/rtas.c:843:4: error: implicit declaration of function =
=E2=80=98cpu_down=E2=80=99 [-Werror=3Dimplicit-function-declaration]
cc1: all warnings being treated as errors
make[1]: *** [arch/powerpc/kernel/rtas.o] Error 1
make: *** [arch/powerpc/kernel] Error 2
My kernel config has CONFIG_HOTPLUG_CPU off, that's why cpu_down is not
defined. Shall CONFIG_HOTPLUG_CPU just be enabled or should the code in
rtas.c be adapted?
Thanks,
Adam
--=20
Adam adam@os.inf.tu-dresden.de
Lackorzynski http://os.inf.tu-dresden.de/~adam/
^ permalink raw reply
* Re: [PATCH v2] can: flexcan: remove HAVE_CAN_FLEXCAN Kconfig symbol
From: Marc Kleine-Budde @ 2013-05-20 16:06 UTC (permalink / raw)
To: Shawn Guo
Cc: Arnd Bergmann, Sascha Hauer, U Bhaskar-B22300, linux-kernel,
linux-can, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20130517090949.GI8607@S2101-09.ap.freescale.net>
[-- Attachment #1: Type: text/plain, Size: 1204 bytes --]
On 05/17/2013 11:09 AM, Shawn Guo wrote:
> On Fri, May 17, 2013 at 10:59:17AM +0200, Marc Kleine-Budde wrote:
>> This patch removes the Kconfig symbol HAVE_CAN_FLEXCAN from arch/{arm,powerpc}
>> and allowing compilation unconditionally on all arm and powerpc platforms.
>>
>> This brings a bigger compile time coverage and removes the following dependency
>> warning found by Arnd Bergmann:
>>
>> warning: (SOC_IMX28 && SOC_IMX25 && SOC_IMX35 && IMX_HAVE_PLATFORM_FLEXCAN &&
>> SOC_IMX53 && SOC_IMX6Q) selects HAVE_CAN_FLEXCAN
>> which has unmet direct dependencies (NET && CAN && CAN_DEV)
>>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Shawn Guo <shawn.guo@linaro.org>
>
> Acked-by: Shawn Guo <shawn.guo@linaro.org>
Thanks.
An Acked-by by the powerpc people would be fine. However, if nobody
doesn't object, I'm sending this patch via linux-can and net-next upstream.
regards,
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]
^ permalink raw reply
* [PATCH] powerpc/platforms/83xx/mcu_mpc8349emitx: Use module_i2c_driver to register driver
From: Peter Huewe @ 2013-05-20 19:39 UTC (permalink / raw)
To: Kumar Gala; +Cc: Peter Huewe, Paul Mackerras, linuxppc-dev, linux-kernel
Removing some boilerplate by using module_i2c_driver instead of calling
register and unregister in the otherwise empty init/exit functions
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
---
arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c b/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c
index 624cb51..4b8ea95 100644
--- a/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c
+++ b/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c
@@ -231,18 +231,7 @@ static struct i2c_driver mcu_driver = {
.id_table = mcu_ids,
};
-static int __init mcu_init(void)
-{
- return i2c_add_driver(&mcu_driver);
-}
-module_init(mcu_init);
-
-static void __exit mcu_exit(void)
-{
- i2c_del_driver(&mcu_driver);
-}
-module_exit(mcu_exit);
-
+module_i2c_driver(mcu_driver);
MODULE_DESCRIPTION("Power Management and GPIO expander driver for "
"MPC8349E-mITX-compatible MCU");
MODULE_AUTHOR("Anton Vorontsov <avorontsov@ru.mvista.com>");
--
1.8.1.5
^ permalink raw reply related
* [PATCH 1/2] powerpc/PCI: Use PCI_UNKNOWN for unknown power state
From: Bjorn Helgaas @ 2013-05-20 23:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, David S. Miller
Cc: sparclinux, Rafael J. Wysocki, linuxppc-dev, linux-pci
Previously we initialized dev->current_state to 4 (PCI_D3cold), but I think
we wanted PCI_UNKNOWN (5) here based on the comment and the fact that the
generic version of this code, pci_setup_device(), uses PCI_UNKNOWN.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
arch/powerpc/kernel/pci_of_scan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
index 2a67e9b..d2d407d 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -165,7 +165,7 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
pr_debug(" class: 0x%x\n", dev->class);
pr_debug(" revision: 0x%x\n", dev->revision);
- dev->current_state = 4; /* unknown power state */
+ dev->current_state = PCI_UNKNOWN; /* unknown power state */
dev->error_state = pci_channel_io_normal;
dev->dma_mask = 0xffffffff;
^ permalink raw reply related
* [PATCH 2/2] sparc/PCI: Use PCI_UNKNOWN for unknown power state
From: Bjorn Helgaas @ 2013-05-20 23:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, David S. Miller
Cc: sparclinux, Rafael J. Wysocki, linuxppc-dev, linux-pci
In-Reply-To: <20130520231909.32416.81752.stgit@bhelgaas-glaptop>
Previously we initialized dev->current_state to 4 (PCI_D3cold), but I think
we wanted PCI_UNKNOWN (5) here based on the comment and the fact that the
generic version of this code, pci_setup_device(), uses PCI_UNKNOWN.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
arch/sparc/kernel/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index baf4366..972892a 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -327,7 +327,7 @@ static struct pci_dev *of_create_pci_dev(struct pci_pbm_info *pbm,
if ((dev->class >> 8) == PCI_CLASS_STORAGE_IDE)
pci_set_master(dev);
- dev->current_state = 4; /* unknown power state */
+ dev->current_state = PCI_UNKNOWN; /* unknown power state */
dev->error_state = pci_channel_io_normal;
dev->dma_mask = 0xffffffff;
^ permalink raw reply related
* Re: [PATCH 1/2] powerpc/PCI: Use PCI_UNKNOWN for unknown power state
From: Bjorn Helgaas @ 2013-05-20 23:26 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, David S. Miller
Cc: sparclinux, Rafael J. Wysocki, linuxppc-dev,
linux-pci@vger.kernel.org
In-Reply-To: <20130520231909.32416.81752.stgit@bhelgaas-glaptop>
On Mon, May 20, 2013 at 5:19 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Previously we initialized dev->current_state to 4 (PCI_D3cold), but I think
> we wanted PCI_UNKNOWN (5) here based on the comment and the fact that the
> generic version of this code, pci_setup_device(), uses PCI_UNKNOWN.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
> arch/powerpc/kernel/pci_of_scan.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
> index 2a67e9b..d2d407d 100644
> --- a/arch/powerpc/kernel/pci_of_scan.c
> +++ b/arch/powerpc/kernel/pci_of_scan.c
> @@ -165,7 +165,7 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
> pr_debug(" class: 0x%x\n", dev->class);
> pr_debug(" revision: 0x%x\n", dev->revision);
>
> - dev->current_state = 4; /* unknown power state */
> + dev->current_state = PCI_UNKNOWN; /* unknown power state */
If you're interested in the archaeology of this, I think it was first
introduced by 4267292b0 in 2005 for powerpc and subsequently copied
into sparc by a2fb23af1 in 2007.
Prior to the addition of PCI_UNKNOWN, pci_setup_device() set
current_state to 4, so I think 4267292b0 probably just copied that
code before 3fe9d19f9 added PCI_UNKNOWN and changed the initialization
in pci_setup_device().
> dev->error_state = pci_channel_io_normal;
> dev->dma_mask = 0xffffffff;
>
>
^ permalink raw reply
* Re: [PATCH v2 2/4] powerpc/cputable: advertise DSCR support on P7/P7+
From: Michael Neuling @ 2013-05-20 23:41 UTC (permalink / raw)
To: will_schmidt
Cc: Nishanth Aravamudan, Steve Munroe, Peter Bergner, Ryan Arnold,
linuxppc-dev, Michael R Meissner
In-Reply-To: <1369062258.18289.3.camel@brimstone>
Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
> On Fri, 2013-05-03 at 17:48 -0700, Nishanth Aravamudan wrote:
> > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> >
> > diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> > index ae9f433..a792157 100644
> > --- a/arch/powerpc/kernel/cputable.c
> > +++ b/arch/powerpc/kernel/cputable.c
> > @@ -98,6 +98,7 @@ extern void __restore_cpu_e6500(void);
> > PPC_FEATURE_SMT | PPC_FEATURE_ICACHE_SNOOP | \
> > PPC_FEATURE_TRUE_LE | \
> > PPC_FEATURE_PSERIES_PERFMON_COMPAT)
> > +#define COMMON_USER2_POWER7 (PPC_FEATURE2_DSCR)
> > #define COMMON_USER_POWER8 (COMMON_USER_PPC64 | PPC_FEATURE_ARCH_2_06 |\
> > PPC_FEATURE_SMT | PPC_FEATURE_ICACHE_SNOOP | \
> > PPC_FEATURE_TRUE_LE | \
> > @@ -428,6 +429,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
> > .cpu_name = "POWER7 (architected)",
> > .cpu_features = CPU_FTRS_POWER7,
> > .cpu_user_features = COMMON_USER_POWER7,
> > + .cpu_user_features2 = COMMON_USER2_POWER7,
> > .mmu_features = MMU_FTRS_POWER7,
> > .icache_bsize = 128,
> > .dcache_bsize = 128,
> > @@ -458,6 +460,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
> > .cpu_name = "POWER7 (raw)",
> > .cpu_features = CPU_FTRS_POWER7,
> > .cpu_user_features = COMMON_USER_POWER7,
> > + .cpu_user_features2 = COMMON_USER2_POWER7,
> > .mmu_features = MMU_FTRS_POWER7,
> > .icache_bsize = 128,
> > .dcache_bsize = 128,
> > @@ -475,6 +478,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
> > .cpu_name = "POWER7+ (raw)",
> > .cpu_features = CPU_FTRS_POWER7,
> > .cpu_user_features = COMMON_USER_POWER7,
> > + .cpu_user_features = COMMON_USER2_POWER7,
>
> ^ Oops here, I think. Please consider applying this on top.
> Untested, but seems obvious.
>
> Fix a typo in setting COMMON_USER2_POWER7 bits to .cpu_user_features2
> cpu specs table.
Nice catch. Thanks
Acked-by: Michael Neuling <mikey@neuling.org>
>
> Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
>
>
> diff --git a/arch/powerpc/kernel/cputable.c
> b/arch/powerpc/kernel/cputable.c
> index c60bbec..51eecb5 100644
> --- a/arch/powerpc/kernel/cputable.c
> +++ b/arch/powerpc/kernel/cputable.c
> @@ -482,7 +482,7 @@ static struct cpu_spec __initdata cpu_specs[] = {
> .cpu_name = "POWER7+ (raw)",
> .cpu_features = CPU_FTRS_POWER7,
> .cpu_user_features = COMMON_USER_POWER7,
> - .cpu_user_features = COMMON_USER2_POWER7,
> + .cpu_user_features2 = COMMON_USER2_POWER7,
> .mmu_features = MMU_FTRS_POWER7,
> .icache_bsize = 128,
> .dcache_bsize = 128,
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> > .mmu_features = MMU_FTRS_POWER7,
> > .icache_bsize = 128,
> > .dcache_bsize = 128,
> >
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/linuxppc-dev
> >
>
>
^ permalink raw reply
* Re: [PATCH 1/2] powerpc/PCI: Use PCI_UNKNOWN for unknown power state
From: Benjamin Herrenschmidt @ 2013-05-20 23:42 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Rafael J. Wysocki, Paul Mackerras, sparclinux,
linuxppc-dev, David S. Miller
In-Reply-To: <20130520231909.32416.81752.stgit@bhelgaas-glaptop>
On Mon, 2013-05-20 at 17:19 -0600, Bjorn Helgaas wrote:
> Previously we initialized dev->current_state to 4 (PCI_D3cold), but I think
> we wanted PCI_UNKNOWN (5) here based on the comment and the fact that the
> generic version of this code, pci_setup_device(), uses PCI_UNKNOWN.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
Ack. Do you send to Linus or should I ?
Thanks.
Cheers,
Ben.
> arch/powerpc/kernel/pci_of_scan.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
> index 2a67e9b..d2d407d 100644
> --- a/arch/powerpc/kernel/pci_of_scan.c
> +++ b/arch/powerpc/kernel/pci_of_scan.c
> @@ -165,7 +165,7 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
> pr_debug(" class: 0x%x\n", dev->class);
> pr_debug(" revision: 0x%x\n", dev->revision);
>
> - dev->current_state = 4; /* unknown power state */
> + dev->current_state = PCI_UNKNOWN; /* unknown power state */
> dev->error_state = pci_channel_io_normal;
> dev->dma_mask = 0xffffffff;
>
^ permalink raw reply
* Re: [PATCH 1/2] powerpc/PCI: Use PCI_UNKNOWN for unknown power state
From: Bjorn Helgaas @ 2013-05-20 23:45 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linux-pci@vger.kernel.org, Rafael J. Wysocki, Paul Mackerras,
sparclinux, linuxppc-dev, David S. Miller
In-Reply-To: <1369093321.6387.24.camel@pasglop>
On Mon, May 20, 2013 at 5:42 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Mon, 2013-05-20 at 17:19 -0600, Bjorn Helgaas wrote:
>> Previously we initialized dev->current_state to 4 (PCI_D3cold), but I think
>> we wanted PCI_UNKNOWN (5) here based on the comment and the fact that the
>> generic version of this code, pci_setup_device(), uses PCI_UNKNOWN.
>>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> ---
>
> Ack. Do you send to Linus or should I ?
I'll add your ack and take care of it.
>> arch/powerpc/kernel/pci_of_scan.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
>> index 2a67e9b..d2d407d 100644
>> --- a/arch/powerpc/kernel/pci_of_scan.c
>> +++ b/arch/powerpc/kernel/pci_of_scan.c
>> @@ -165,7 +165,7 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
>> pr_debug(" class: 0x%x\n", dev->class);
>> pr_debug(" revision: 0x%x\n", dev->revision);
>>
>> - dev->current_state = 4; /* unknown power state */
>> + dev->current_state = PCI_UNKNOWN; /* unknown power state */
>> dev->error_state = pci_channel_io_normal;
>> dev->dma_mask = 0xffffffff;
>>
>
>
^ permalink raw reply
* [PATCH v2] powerpc/tm: Abort transactions on emulation and alignment faults
From: Michael Neuling @ 2013-05-21 1:48 UTC (permalink / raw)
To: benh; +Cc: Linux PPC dev, matt
In-Reply-To: <23571.1369048074@ale.ozlabs.ibm.com>
If we are emulating an instruction inside an active user transaction that
touches memory, the kernel can't emulate it as it operates in transactional
suspend context. We need to abort these transactions and send them back to
userspace for the hardware to rollback.
We can service these if the user transaction is in suspend mode, since the
kernel will operate in the same suspend context.
This adds a check to all alignment faults and to specific instruction
emulations (only string instructions for now). If the user process is in an
active (non-suspended) transaction, we abort the transaction go back to
userspace allowing the HW to roll back the transaction and tell the user of the
failure. This also adds new tm abort cause codes to report the reason of the
persistent error to the user.
Crappy test case here http://neuling.org/devel/junkcode/aligntm.c
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # v3.9 only
---
v2:
Split persistent bit of cause into separate bit
Mark alignment and emulation as persistent.
syscall cause it not used currently.
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a613651..20be907 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -115,12 +115,15 @@
convention, bit0 is copied to TEXASR[56] (IBM bit 7) which is set if
the failure is persistent.
*/
+#define TM_CAUSE_PERSISTENT 0x01
#define TM_CAUSE_RESCHED 0xfe
#define TM_CAUSE_TLBI 0xfc
#define TM_CAUSE_FAC_UNAV 0xfa
-#define TM_CAUSE_SYSCALL 0xf9 /* Persistent */
+#define TM_CAUSE_SYSCALL 0xf8
#define TM_CAUSE_MISC 0xf6
#define TM_CAUSE_SIGNAL 0xf4
+#define TM_CAUSE_ALIGNMENT 0xf2
+#define TM_CAUSE_EMULATE 0xf0
#if defined(CONFIG_PPC_BOOK3S_64)
#define MSR_64BIT MSR_SF
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index a7a648f..f18c79c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -53,6 +53,7 @@
#ifdef CONFIG_PPC64
#include <asm/firmware.h>
#include <asm/processor.h>
+#include <asm/tm.h>
#endif
#include <asm/kexec.h>
#include <asm/ppc-opcode.h>
@@ -932,6 +933,28 @@ static int emulate_isel(struct pt_regs *regs, u32 instword)
return 0;
}
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+static inline bool tm_abort_check(struct pt_regs *regs, int cause)
+{
+ /* If we're emulating a load/store in an active transaction, we cannot
+ * emulate it as the kernel operates in transaction suspended context.
+ * We need to abort the transaction. This creates a persistent TM
+ * abort so tell the user what caused it with a new code.
+ */
+ if (MSR_TM_TRANSACTIONAL(regs->msr)) {
+ tm_enable();
+ tm_abort(cause);
+ return true;
+ }
+ return false;
+}
+#else
+static inline bool tm_abort_check(struct pt_regs *regs, int reason)
+{
+ return false;
+}
+#endif
+
static int emulate_instruction(struct pt_regs *regs)
{
u32 instword;
@@ -971,6 +994,9 @@ static int emulate_instruction(struct pt_regs *regs)
/* Emulate load/store string insn. */
if ((instword & PPC_INST_STRING_GEN_MASK) == PPC_INST_STRING) {
+ if (tm_abort_check(regs,
+ TM_CAUSE_EMULATE | TM_CAUSE_PERSISTENT))
+ return -EINVAL;
PPC_WARN_EMULATED(string, regs);
return emulate_string_inst(regs, instword);
}
@@ -1148,6 +1174,9 @@ void alignment_exception(struct pt_regs *regs)
if (!arch_irq_disabled_regs(regs))
local_irq_enable();
+ if (tm_abort_check(regs, TM_CAUSE_ALIGNMENT | TM_CAUSE_PERSISTENT))
+ goto bail;
+
/* we don't implement logging of alignment exceptions */
if (!(current->thread.align_ctl & PR_UNALIGN_SIGBUS))
fixed = fix_alignment(regs);
^ permalink raw reply related
* Re: [PATCH 2/2] sparc/PCI: Use PCI_UNKNOWN for unknown power state
From: David Miller @ 2013-05-21 1:51 UTC (permalink / raw)
To: bhelgaas; +Cc: rjw, paulus, linux-pci, sparclinux, linuxppc-dev
In-Reply-To: <20130520231916.32416.7318.stgit@bhelgaas-glaptop>
From: Bjorn Helgaas <bhelgaas@google.com>
Date: Mon, 20 May 2013 17:19:16 -0600
> Previously we initialized dev->current_state to 4 (PCI_D3cold), but I think
> we wanted PCI_UNKNOWN (5) here based on the comment and the fact that the
> generic version of this code, pci_setup_device(), uses PCI_UNKNOWN.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox