LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] x86/mpx: fix recursive munmap() corruption
From: Christophe Leroy @ 2020-10-23 12:28 UTC (permalink / raw)
  To: Laurent Dufour, Michael Ellerman
  Cc: mhocko, rguenther, linux-mm, Dave Hansen, x86, stable, LKML,
	Dave Hansen, Thomas Gleixner, luto, linuxppc-dev, Andrew Morton,
	vbabka
In-Reply-To: <9c2b2826-4083-fc9c-5a4d-c101858dd560@linux.vnet.ibm.com>

Hi Laurent

Le 07/05/2019 à 18:35, Laurent Dufour a écrit :
> Le 01/05/2019 à 12:32, Michael Ellerman a écrit :
>> Laurent Dufour <ldufour@linux.vnet.ibm.com> writes:
>>> Le 23/04/2019 à 18:04, Dave Hansen a écrit :
>>>> On 4/23/19 4:16 AM, Laurent Dufour wrote:
>> ...
>>>>> There are 2 assumptions here:
>>>>>    1. 'start' and 'end' are page aligned (this is guaranteed by __do_munmap().
>>>>>    2. the VDSO is 1 page (this is guaranteed by the union vdso_data_store on powerpc)
>>>>
>>>> Are you sure about #2?  The 'vdso64_pages' variable seems rather
>>>> unnecessary if the VDSO is only 1 page. ;)
>>>
>>> Hum, not so sure now ;)
>>> I got confused, only the header is one page.
>>> The test is working as a best effort, and don't cover the case where
>>> only few pages inside the VDSO are unmmapped (start >
>>> mm->context.vdso_base). This is not what CRIU is doing and so this was
>>> enough for CRIU support.
>>>
>>> Michael, do you think there is a need to manage all the possibility
>>> here, since the only user is CRIU and unmapping the VDSO is not a so
>>> good idea for other processes ?
>>
>> Couldn't we implement the semantic that if any part of the VDSO is
>> unmapped then vdso_base is set to zero? That should be fairly easy, eg:
>>
>>     if (start < vdso_end && end >= mm->context.vdso_base)
>>         mm->context.vdso_base = 0;
>>
>>
>> We might need to add vdso_end to the mm->context, but that should be OK.
>>
>> That seems like it would work for CRIU and make sense in general?
> 
> Sorry for the late answer, yes this would make more sense.
> 
> Here is a patch doing that.
> 

In your patch, the test seems overkill:

+	if ((start <= vdso_base && vdso_end <= end) ||  /* 1   */
+	    (vdso_base <= start && start < vdso_end) || /* 3,4 */
+	    (vdso_base < end && end <= vdso_end))       /* 2,3 */
+		mm->context.vdso_base = mm->context.vdso_end = 0;

What about

	if (start < vdso_end && vdso_start < end)
		mm->context.vdso_base = mm->context.vdso_end = 0;

This should cover all cases, or am I missing something ?


And do we really need to store vdso_end in the context ?
I think it should be possible to re-calculate it: the size of the VDSO should be (&vdso32_end - 
&vdso32_start) + PAGE_SIZE for 32 bits VDSO, and (&vdso64_end - &vdso64_start) + PAGE_SIZE for the 
64 bits VDSO.

Christophe

^ permalink raw reply

* Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()
From: Christophe Leroy @ 2020-10-23 11:57 UTC (permalink / raw)
  To: Will Deacon
  Cc: nathanl, linux-arch, Arnd Bergmann, Dmitry Safonov, open list,
	Paul Mackerras, Andy Lutomirski, Thomas Gleixner,
	Vincenzo Frascino, linuxppc-dev
In-Reply-To: <20201023112514.GE20933@willie-the-truck>



Le 23/10/2020 à 13:25, Will Deacon a écrit :
> On Fri, Oct 23, 2020 at 01:22:04PM +0200, Christophe Leroy wrote:
>> Hi Dmitry,
>>
>> Le 28/09/2020 à 17:08, Dmitry Safonov a écrit :
>>> On 9/27/20 8:43 AM, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 21/09/2020 à 13:26, Will Deacon a écrit :
>>>>> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>>>>>> Dmitry Safonov <0x7f454c46@gmail.com> writes:
>>> [..]
>>>>>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>>>>>
>>>>>> That would be great, thanks!
>>>>>
>>>>> I lost track of this one. Is there a patch kicking around to resolve
>>>>> this,
>>>>> or is the segfault expected behaviour?
>>>>>
>>>>
>>>> IIUC dmitry said he will cook a patch. I have not seen any patch yet.
>>>
>>> Yes, sorry about the delay - I was a bit busy with xfrm patches.
>>>
>>> I'll send patches for .close() this week, working on them now.
>>
>> I haven't seen the patches, did you sent them out finally ?
> 
> I think it's this series:
> 
> https://lore.kernel.org/r/20201013013416.390574-1-dima@arista.com
> 
> but they look really invasive to me, so I may cook a small hack for arm64
> in the meantine / for stable.
> 

Not sure we are talking about the same thing.

I can't see any new .close function added to vm_special_mapping in order to replace arch_unmap() hook.

Christophe

^ permalink raw reply

* Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()
From: Will Deacon @ 2020-10-23 11:25 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: nathanl, linux-arch, Arnd Bergmann, Dmitry Safonov, open list,
	Paul Mackerras, Andy Lutomirski, Thomas Gleixner,
	Vincenzo Frascino, linuxppc-dev
In-Reply-To: <ba9861da-2f5b-a649-5626-af00af634546@csgroup.eu>

On Fri, Oct 23, 2020 at 01:22:04PM +0200, Christophe Leroy wrote:
> Hi Dmitry,
> 
> Le 28/09/2020 à 17:08, Dmitry Safonov a écrit :
> > On 9/27/20 8:43 AM, Christophe Leroy wrote:
> > > 
> > > 
> > > Le 21/09/2020 à 13:26, Will Deacon a écrit :
> > > > On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
> > > > > Dmitry Safonov <0x7f454c46@gmail.com> writes:
> > [..]
> > > > > > I'll cook a patch for vm_special_mapping if you don't mind :-)
> > > > > 
> > > > > That would be great, thanks!
> > > > 
> > > > I lost track of this one. Is there a patch kicking around to resolve
> > > > this,
> > > > or is the segfault expected behaviour?
> > > > 
> > > 
> > > IIUC dmitry said he will cook a patch. I have not seen any patch yet.
> > 
> > Yes, sorry about the delay - I was a bit busy with xfrm patches.
> > 
> > I'll send patches for .close() this week, working on them now.
> 
> I haven't seen the patches, did you sent them out finally ?

I think it's this series:

https://lore.kernel.org/r/20201013013416.390574-1-dima@arista.com

but they look really invasive to me, so I may cook a small hack for arm64
in the meantine / for stable.

Will

^ permalink raw reply

* Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()
From: Christophe Leroy @ 2020-10-23 11:22 UTC (permalink / raw)
  To: Dmitry Safonov, Will Deacon, Michael Ellerman
  Cc: nathanl, linux-arch, Arnd Bergmann, open list, Paul Mackerras,
	Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino, linuxppc-dev
In-Reply-To: <542145eb-7d90-0444-867e-c9cbb6bdd8e3@gmail.com>

Hi Dmitry,

Le 28/09/2020 à 17:08, Dmitry Safonov a écrit :
> On 9/27/20 8:43 AM, Christophe Leroy wrote:
>>
>>
>> Le 21/09/2020 à 13:26, Will Deacon a écrit :
>>> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>>>> Dmitry Safonov <0x7f454c46@gmail.com> writes:
> [..]
>>>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>>>
>>>> That would be great, thanks!
>>>
>>> I lost track of this one. Is there a patch kicking around to resolve
>>> this,
>>> or is the segfault expected behaviour?
>>>
>>
>> IIUC dmitry said he will cook a patch. I have not seen any patch yet.
> 
> Yes, sorry about the delay - I was a bit busy with xfrm patches.
> 
> I'll send patches for .close() this week, working on them now.

I haven't seen the patches, did you sent them out finally ?

Christophe

^ permalink raw reply

* Re: [PATCH v2 1/2] powerpc: Introduce POWER10_DD1 feature
From: Ravi Bangoria @ 2020-10-23 10:14 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: Christophe Leroy, Ravi Bangoria, Michael Neuling, Nicholas Piggin,
	maddy, Paul Mackerras, naveen.n.rao, linuxppc-dev
In-Reply-To: <CACzsE9rqopXj37vrOcyM3sisomeYzxbeR6V4o5K3_Y8xqo5hHw@mail.gmail.com>


>>>> +static void __init fixup_cpu_features(void)
>>>> +{
>>>> +       unsigned long version = mfspr(SPRN_PVR);
>>>> +
>>>> +       if ((version & 0xffffffff) == 0x00800100)
>>>> +               cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
>>>> +}
>>>> +
>>> I am just wondering why this is needed here, but the same thing is not
>>> done for, say, CPU_FTR_POWER9_DD2_1?
>>
>> When we don't use DT cpu_features (PowerVM / kvm geusts), we call
>> identify_cpu() twice. First with Real PVR which sets "raw" cpu_spec
>> as cur_cpu_spec and then 2nd time with Logical PVR (0x0f...) which
>> (mostly) overwrites the cur_cpu_spec with "architected" mode cpu_spec.
>> I don't see DD version specific entries for "architected" mode in
>> cpu_specs[] for any previous processors. So I've introduced this
>> function to tweak cpu_features.
>>
>> Though, I don't know why we don't have similar thing for
>> CPU_FTR_POWER9_DD2_1. I've to check that.
>>
>>> And should we get a /* Power10 DD 1 */ added to cpu_specs[]?
>>
>> IIUC, we don't need such entry. For PowerVM / kvm guests, we overwrite
>> cpu_spec, so /* Power10 */ "raw" entry is sufficient. And For baremetal,
>> we don't use cpu_specs[] at all.
> I think even for powernv, using dt features can be disabled by the
> cmdline with dt_cpu_ftrs=off, then cpu_specs[] will then be used.

Ok... with dt_cpu_ftrs=off, we seem to be using raw mode cpu_specs[] entry on
baremetal. So yeah, I'll add /* Power10 DD1 */ raw mode entry into cpu_specs[].
Thanks for pointing it out.

-Ravi

^ permalink raw reply

* Re: [PATCH] powerpc: Send SIGBUS from machine_check
From: Joakim Tjernlund @ 2020-10-23  9:23 UTC (permalink / raw)
  To: linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au
In-Reply-To: <87d019yd0w.fsf@mpe.ellerman.id.au>

On Fri, 2020-10-23 at 11:57 +1100, Michael Ellerman wrote:
> 
> 
> Joakim Tjernlund <joakim.tjernlund@infinera.com> writes:
> > Embedded PPC CPU should send SIGBUS to user space when applicable.
> 
> Yeah, but it's not clear that it's applicable in all cases.
> 
> At least I need some reasoning for why it's safe in all cases below to
> just send a SIGBUS and take no other action.

For me this came from an User SDK accessing a PCI device(also using PCI IRQs) and this
SDK did some strange stuff during shutdown which disabled the device before SW was done.
This caused PCI accesses, both from User Space and kernel PCI IRQs access) to the device
which caused an Machine Check(PCI transfer failed). Without this patch, the kernel
would just OOPS and hang/do strange things even for an access made by User space.
Now the User app just gets a SIGBUS and the kernel still works as it should.

Perhaps a SIGBUS and recover isn't right in all cases but without it there will be a
system break down.


> Is there a particular CPU you're working on? Can we start with that and
> look at all the machine check causes and which can be safely handled.

This was a T1042(e5500) but we have e500 and mpc832x as well.

> 
> Some comments below ...
> 
> 
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index 0381242920d9..12715d24141c 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
> 
> At the beginning of the function we have:
> 
>         printk("Machine check in kernel mode.\n");
> 
> Which should be updated.

Sure, just remove the "in kernel mode" perhaps?

> 
> >                      reason & MCSR_MEA ? "Effective" : "Physical", addr);
> >       }
> > 
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             recoverable = 1;
> > +     }
> 
> For most of the error causes we take no action and set recoverable = 0.
> 
> Then you just declare that it is recoverable because it hit in
> userspace. Depending on the cause that might be OK, but it's not
> obviously correct in all cases.

Not so familiar with PPC that I can make out what is OK or not.
I do think you stand a better chance now that before though.  

> 
> 
> > +
> >  silent_out:
> >       mtspr(SPRN_MCSR, mcsr);
> >       return mfspr(SPRN_MCSR) == 0 && recoverable;
> > @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
> 
> Same comment about the printk().
> 
> >       if (reason & MCSR_BUS_RPERR)
> >               printk("Bus - Read Parity Error\n");
> > 
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             return 1;
> > +     }
> 
> And same comment more or less.
> 
> Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
> function does nothing to clear the cause of the machine check.
> 
> >       return 0;
> >  }
> > 
> > @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
> >       if (reason & MCSR_BUS_WRERR)
> >               printk("Bus - Write Bus Error on buffered store or cache line push\n");
> > 
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             return 1;
> > +     }
> 
> Same.
> 
> >       return 0;
> >  }
> >  #elif defined(CONFIG_PPC32)
> > @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
> >       default:
> >               printk("Unknown values in msr\n");
> >       }
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             return 1;
> > +     }
> 
> Same.
> 
> >       return 0;
> >  }
> >  #endif /* everything else */
> > --
> > 2.26.2
> 
> 
> cheers


^ permalink raw reply

* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Joe Perches @ 2020-10-23  8:03 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: clang-built-linux, Nick Desaulniers, Linus Torvalds, linux-kernel,
	linuxppc-dev
In-Reply-To: <CANiq72nfHjXkN65jy+unz0k66qvAALNhhhDZsxqPRLdtLKOW_Q@mail.gmail.com>

On Fri, 2020-10-23 at 08:08 +0200, Miguel Ojeda wrote:
> On Thu, Oct 22, 2020 at 4:36 AM Joe Perches <joe@perches.com> wrote:
> > 
> > Use a more generic form for __section that requires quotes to avoid
> > complications with clang and gcc differences.
> 
> I performed visual inspection (one by one...) and the only thing I saw
> is that sometimes the `__attribute__` has a whitespace afterwards and
> sometimes it doesn't, same for the commas inside, e.g.:
> 
> -  __used __attribute__((section(".modinfo"), unused, aligned(1)))  \
> +  __used __section(".modinfo") __attribute__((unused, aligned(1)))  \
> 
> and:
> 
> -    __attribute__ ((unused,__section__ ("__param"),aligned(sizeof(void *)))) \
> +    __section("__param") __attribute__ ((unused, aligned(sizeof(void *)))) \
> 
> I think the patch tries to follow the style of the replaced line, but
> for the commas in this last case it didn't. Anyway, it is not
> important.

Here the change follows the kernel style of space after comma.

> I can pick it up in my queue along with the __alias one and keep it
> for a few weeks in -next.

Thanks Miguel, but IMO it doesn't need time in next.

Applying it just before an rc1 minimizes conflicts.



^ permalink raw reply

* Re: C vdso
From: Christophe Leroy @ 2020-10-23  6:28 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev@ozlabs.org
In-Reply-To: <cc532aa8-a9e0-a105-b7b1-ee8d723b7ed6@csgroup.eu>

Hi Michael,

Le 24/09/2020 à 15:17, Christophe Leroy a écrit :
> Hi Michael
> 
> Le 17/09/2020 à 14:33, Michael Ellerman a écrit :
>> Hi Christophe,
>>
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> Hi Michael,
>>>
>>> What is the status with the generic C vdso merge ?
>>> In some mail, you mentionned having difficulties getting it working on
>>> ppc64, any progress ? What's the problem ? Can I help ?
>>
>> Yeah sorry I was hoping to get time to work on it but haven't been able
>> to.
>>
>> It's causing crashes on ppc64 ie. big endian.
>>
> 
>>
>> As you can see from the instruction dump we have jumped into the weeds somewhere.
>>
>> We also had the report from the kbuild robot about rela.opd being
>> discarded, which I think is indicative of a bigger problem. ie. we don't
>> process relocations for the VDSO, but opds require relocations (they
>> contain an absolute pointer).
>>
>> I thought we could get away with that, because the VDSO entry points
>> aren't proper functions (so they don't have opds), and I didn't think
>> we'd be calling via function pointers in the VDSO code (which would
>> require opds). But seems something is not working right.
>>
>> Sorry I haven't got back to you with those details. Things are a bit of
>> a mess inside IBM at the moment (always?), and I've been trying to get
>> everything done before I take a holiday next week.
>>
> 
> 
> Can you tell what defconfig you are using ? I have been able to setup a full glibc PPC64 cross 
> compilation chain and been able to test it under QEMU with success, using Nathan's vdsotest tool.


What config are you using ?

Christophe

> 
> I tested with both ppc64_defconfig and pseries_defconfig.
> 
> The only problem I got is with getcpu, which segfaults but both before and after applying my series, 
> so I guess this is unrelated.
> 
> Not sure we can pay too much attention to the exact measurement as it is a ppc64 QEMU running on a 
> x86 Linux which is running in a Virtual Box on a x86 windows Laptop, but at least it works:
> 
> Without the series:
> 
> clock-getres-monotonic:    vdso: 389 nsec/call
> clock-gettime-monotonic:    vdso: 781 nsec/call
> clock-getres-monotonic-coarse:    vdso: 13715 nsec/call
> clock-gettime-monotonic-coarse:    vdso: 312 nsec/call
> clock-getres-monotonic-raw:    vdso: 13589 nsec/call
> clock-getres-tai:    vdso: 13827 nsec/call
> clock-gettime-tai:    vdso: 14846 nsec/call
> clock-getres-boottime:    vdso: 13596 nsec/call
> clock-gettime-boottime:    vdso: 14758 nsec/call
> clock-getres-realtime:    vdso: 327 nsec/call
> clock-gettime-realtime:    vdso: 717 nsec/call
> clock-getres-realtime-coarse:    vdso: 14102 nsec/call
> clock-gettime-realtime-coarse:    vdso: 299 nsec/call
> gettimeofday:    vdso: 771 nsec/call
> 
> With the series:
> 
> clock-getres-monotonic:    vdso: 350 nsec/call
> clock-gettime-monotonic:    vdso: 726 nsec/call
> clock-getres-monotonic-coarse:    vdso: 356 nsec/call
> clock-gettime-monotonic-coarse:    vdso: 423 nsec/call
> clock-getres-monotonic-raw:    vdso: 349 nsec/call
> clock-getres-tai:    vdso: 419 nsec/call
> clock-gettime-tai:    vdso: 724 nsec/call
> clock-getres-boottime:    vdso: 352 nsec/call
> clock-gettime-boottime:    vdso: 752 nsec/call
> clock-getres-realtime:    vdso: 351 nsec/call
> clock-gettime-realtime:    vdso: 733 nsec/call
> clock-getres-realtime-coarse:    vdso: 356 nsec/call
> clock-gettime-realtime-coarse:    vdso: 367 nsec/call
> gettimeofday:    vdso: 796 nsec/call
> 
> 
> Thanks
> Christophe

^ permalink raw reply

* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Miguel Ojeda @ 2020-10-23  6:08 UTC (permalink / raw)
  To: Joe Perches
  Cc: clang-built-linux, Nick Desaulniers, Linus Torvalds, linux-kernel,
	linuxppc-dev
In-Reply-To: <fe8abcc88cff676ead8ee48db1e993e63b0611c7.1603327264.git.joe@perches.com>

On Thu, Oct 22, 2020 at 4:36 AM Joe Perches <joe@perches.com> wrote:
>
> Use a more generic form for __section that requires quotes to avoid
> complications with clang and gcc differences.

I performed visual inspection (one by one...) and the only thing I saw
is that sometimes the `__attribute__` has a whitespace afterwards and
sometimes it doesn't, same for the commas inside, e.g.:

-  __used __attribute__((section(".modinfo"), unused, aligned(1)))  \
+  __used __section(".modinfo") __attribute__((unused, aligned(1)))  \

and:

-    __attribute__ ((unused,__section__ ("__param"),aligned(sizeof(void *)))) \
+    __section("__param") __attribute__ ((unused, aligned(sizeof(void *)))) \

I think the patch tries to follow the style of the replaced line, but
for the commas in this last case it didn't. Anyway, it is not
important.

I can pick it up in my queue along with the __alias one and keep it
for a few weeks in -next.

Cheers,
Miguel

^ permalink raw reply

* [powerpc:fixes] BUILD SUCCESS 4ff753feab021242144818b9a3ba011238218145
From: kernel test robot @ 2020-10-23  4:14 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  fixes
branch HEAD: 4ff753feab021242144818b9a3ba011238218145  powerpc/pseries: Avoid using addr_to_pfn in real mode

elapsed time: 723m

configs tested: 120
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm                                 defconfig
arm64                            allyesconfig
arm64                               defconfig
arm                              allyesconfig
arm                              allmodconfig
powerpc                 mpc837x_mds_defconfig
mips                          ath79_defconfig
um                            kunit_defconfig
powerpc                     ppa8548_defconfig
arm                        spear6xx_defconfig
riscv                            alldefconfig
arm                            xcep_defconfig
arm                          exynos_defconfig
arm                        multi_v5_defconfig
powerpc                     ksi8560_defconfig
arm                         s3c6400_defconfig
mips                        maltaup_defconfig
arm                           omap1_defconfig
arm                         lubbock_defconfig
arm                           spitz_defconfig
powerpc                 xes_mpc85xx_defconfig
mips                        vocore2_defconfig
arm                      jornada720_defconfig
arc                          axs103_defconfig
arm                    vt8500_v6_v7_defconfig
powerpc                     redwood_defconfig
sh                             sh03_defconfig
mips                      bmips_stb_defconfig
sparc64                          alldefconfig
sh                     magicpanelr2_defconfig
arm                         axm55xx_defconfig
c6x                        evmc6678_defconfig
mips                         bigsur_defconfig
powerpc                     tqm8560_defconfig
mips                  cavium_octeon_defconfig
xtensa                    xip_kc705_defconfig
m68k                        m5407c3_defconfig
sh                             espt_defconfig
arm                        realview_defconfig
xtensa                         virt_defconfig
powerpc                      ppc6xx_defconfig
arm                       versatile_defconfig
h8300                               defconfig
m68k                        stmark2_defconfig
powerpc                        icon_defconfig
ia64                             allmodconfig
ia64                                defconfig
ia64                             allyesconfig
m68k                             allmodconfig
m68k                                defconfig
m68k                             allyesconfig
nios2                               defconfig
arc                              allyesconfig
nds32                             allnoconfig
c6x                              allyesconfig
nds32                               defconfig
nios2                            allyesconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
xtensa                           allyesconfig
h8300                            allyesconfig
arc                                 defconfig
sh                               allmodconfig
parisc                              defconfig
s390                             allyesconfig
parisc                           allyesconfig
s390                                defconfig
i386                             allyesconfig
sparc                            allyesconfig
sparc                               defconfig
i386                                defconfig
mips                             allyesconfig
mips                             allmodconfig
powerpc                          allyesconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
i386                 randconfig-a002-20201023
i386                 randconfig-a005-20201023
i386                 randconfig-a003-20201023
i386                 randconfig-a001-20201023
i386                 randconfig-a006-20201023
i386                 randconfig-a004-20201023
i386                 randconfig-a002-20201022
i386                 randconfig-a005-20201022
i386                 randconfig-a003-20201022
i386                 randconfig-a001-20201022
i386                 randconfig-a006-20201022
i386                 randconfig-a004-20201022
x86_64               randconfig-a011-20201022
x86_64               randconfig-a013-20201022
x86_64               randconfig-a016-20201022
x86_64               randconfig-a015-20201022
x86_64               randconfig-a012-20201022
x86_64               randconfig-a014-20201022
i386                 randconfig-a016-20201022
i386                 randconfig-a014-20201022
i386                 randconfig-a015-20201022
i386                 randconfig-a012-20201022
i386                 randconfig-a013-20201022
i386                 randconfig-a011-20201022
riscv                    nommu_k210_defconfig
riscv                            allyesconfig
riscv                    nommu_virt_defconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                          rv32_defconfig
riscv                            allmodconfig
x86_64                                   rhel
x86_64                           allyesconfig
x86_64                    rhel-7.6-kselftests
x86_64                              defconfig
x86_64                               rhel-8.3
x86_64                                  kexec

clang tested configs:
x86_64               randconfig-a001-20201022
x86_64               randconfig-a002-20201022
x86_64               randconfig-a003-20201022
x86_64               randconfig-a006-20201022
x86_64               randconfig-a004-20201022
x86_64               randconfig-a005-20201022

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* [PATCH] powerpc: Add config fragment for disabling -Werror
From: Michael Ellerman @ 2020-10-23  4:00 UTC (permalink / raw)
  To: linuxppc-dev

This makes it easy to disable building with -Werror:

  $ make defconfig
  $ grep WERROR .config
  # CONFIG_PPC_DISABLE_WERROR is not set
  CONFIG_PPC_WERROR=y

  $ make disable-werror.config
    GEN     Makefile
  Using .config as base
  Merging arch/powerpc/configs/disable-werror.config
  Value of CONFIG_PPC_DISABLE_WERROR is redefined by fragment arch/powerpc/configs/disable-werror.config:
  Previous value: # CONFIG_PPC_DISABLE_WERROR is not set
  New value: CONFIG_PPC_DISABLE_WERROR=y
  ...

  $ grep WERROR .config
  CONFIG_PPC_DISABLE_WERROR=y

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/configs/disable-werror.config | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 arch/powerpc/configs/disable-werror.config

diff --git a/arch/powerpc/configs/disable-werror.config b/arch/powerpc/configs/disable-werror.config
new file mode 100644
index 000000000000..6ea12a12432c
--- /dev/null
+++ b/arch/powerpc/configs/disable-werror.config
@@ -0,0 +1 @@
+CONFIG_PPC_DISABLE_WERROR=y
-- 
2.25.1


^ permalink raw reply related

* [PATCH] net: ucc_geth: Drop extraneous parentheses in comparison
From: Michael Ellerman @ 2020-10-23  3:32 UTC (permalink / raw)
  To: linuxppc-dev, netdev; +Cc: kuba, leoyang.li, davem, linux-kernel

Clang warns about the extra parentheses in this comparison:

  drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
  warning: equality comparison with extraneous parentheses
    if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~

It seems clear the intent here is to do a comparison not an
assignment, so drop the extra parentheses to avoid any confusion.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 drivers/net/ethernet/freescale/ucc_geth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index db791f60b884..d8ad478a0a13 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -1358,7 +1358,7 @@ static int adjust_enet_interface(struct ucc_geth_private *ugeth)
 	    (ugeth->phy_interface == PHY_INTERFACE_MODE_RTBI)) {
 		upsmr |= UCC_GETH_UPSMR_TBIM;
 	}
-	if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
+	if (ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII)
 		upsmr |= UCC_GETH_UPSMR_SGMM;
 
 	out_be32(&uf_regs->upsmr, upsmr);
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] serial: pmac_zilog: don't init if zilog is not available
From: Finn Thain @ 2020-10-23  3:21 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg KH, Laurent Vivier, Linux Kernel Mailing List, linux-m68k,
	Paul Mackerras, open list:SERIAL DRIVERS, Brad Boyer,
	linuxppc-dev, Joshua Thompson
In-Reply-To: <CAMuHMdVbo2C1yZ5E_A3L8J1zZigO8i8m5AFUTn9SjbY1sx16kA@mail.gmail.com>

On Thu, 22 Oct 2020, Geert Uytterhoeven wrote:

> 
> Thanks for your patch...
> 

You're welcome.

> I can't say I'm a fan of this...
> 

Sorry.

> 
> The real issue is this "extern struct platform_device scc_a_pdev, 
> scc_b_pdev", circumventing the driver framework.
> 
> Can we get rid of that?
> 

Is there a better alternative?

pmz_probe() is called by console_initcall(pmz_console_init) when 
CONFIG_SERIAL_PMACZILOG_CONSOLE=y because this has to happen earlier than 
the normal platform bus probing which takes place later as a typical 
module_initcall.

^ permalink raw reply

* [PATCH] powerpc/ps3: Drop unused DBG macro
From: Michael Ellerman @ 2020-10-23  3:13 UTC (permalink / raw)
  To: linuxppc-dev

This DBG macro is unused, and has been unused since the file was
originally merged into mainline. Just drop it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/boot/ps3.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/powerpc/boot/ps3.c b/arch/powerpc/boot/ps3.c
index 6e4efbdb6b7c..f157717ae814 100644
--- a/arch/powerpc/boot/ps3.c
+++ b/arch/powerpc/boot/ps3.c
@@ -21,13 +21,6 @@ extern int lv1_get_logical_ppe_id(u64 *out_1);
 extern int lv1_get_repository_node_value(u64 in_1, u64 in_2, u64 in_3,
 	u64 in_4, u64 in_5, u64 *out_1, u64 *out_2);
 
-#ifdef DEBUG
-#define DBG(fmt...) printf(fmt)
-#else
-static inline int __attribute__ ((format (printf, 1, 2))) DBG(
-	const char *fmt, ...) {return 0;}
-#endif
-
 BSS_STACK(4096);
 
 /* A buffer that may be edited by tools operating on a zImage binary so as to
-- 
2.25.1


^ permalink raw reply related

* [PATCHv2] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Po-Hsu Lin @ 2020-10-23  2:45 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-kselftest, mpe
  Cc: joe.lawrence, mathieu.desnoyers, po-hsu.lin, mbenes, shuah

The eeh-basic test got its own 60 seconds timeout (defined in commit
414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
device.

And we have discovered that the number of breakable devices varies
on different hardware. The device recovery time ranges from 0 to 35
seconds. In our test pool it will take about 30 seconds to run on a
Power8 system that with 5 breakable devices, 60 seconds to run on a
Power9 system that with 4 breakable devices.

Extend the timeout setting in the kselftest framework to 5 minutes
to give it a chance to finish.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
---
 tools/testing/selftests/powerpc/eeh/Makefile | 2 +-
 tools/testing/selftests/powerpc/eeh/settings | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/eeh/settings

diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
index b397bab..ae963eb 100644
--- a/tools/testing/selftests/powerpc/eeh/Makefile
+++ b/tools/testing/selftests/powerpc/eeh/Makefile
@@ -3,7 +3,7 @@ noarg:
 	$(MAKE) -C ../
 
 TEST_PROGS := eeh-basic.sh
-TEST_FILES := eeh-functions.sh
+TEST_FILES := eeh-functions.sh settings
 
 top_srcdir = ../../../../..
 include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
new file mode 100644
index 0000000..694d707
--- /dev/null
+++ b/tools/testing/selftests/powerpc/eeh/settings
@@ -0,0 +1 @@
+timeout=300
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Po-Hsu Lin @ 2020-10-23  2:37 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: joe.lawrence, linuxppc-dev, linux-kernel, mathieu.desnoyers,
	linux-kselftest, mbenes, shuah
In-Reply-To: <87a6wdy9si.fsf@mpe.ellerman.id.au>

On Fri, Oct 23, 2020 at 10:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Po-Hsu Lin <po-hsu.lin@canonical.com> writes:
> > The eeh-basic test got its own 60 seconds timeout (defined in commit
> > 414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
> > device.
> >
> > And we have discovered that the number of breakable devices varies
> > on different hardware. The device recovery time ranges from 0 to 35
> > seconds. In our test pool it will take about 30 seconds to run on a
> > Power8 system that with 5 breakable devices, 60 seconds to run on a
> > Power9 system that with 4 breakable devices.
> >
> > Thus it's better to disable the default 45 seconds timeout setting in
> > the kselftest framework to give it a chance to finish. And let the
> > test to take care of the timeout control.
>
> I'd prefer if we still had some timeout, maybe 5 or 10 minutes? Just in
> case the test goes completely bonkers.
>
OK, let's go for 5 minutes.
Will send V2 later.
Thanks for your suggestion!

> cheers
>
> > diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
> > index b397bab..ae963eb 100644
> > --- a/tools/testing/selftests/powerpc/eeh/Makefile
> > +++ b/tools/testing/selftests/powerpc/eeh/Makefile
> > @@ -3,7 +3,7 @@ noarg:
> >       $(MAKE) -C ../
> >
> >  TEST_PROGS := eeh-basic.sh
> > -TEST_FILES := eeh-functions.sh
> > +TEST_FILES := eeh-functions.sh settings
> >
> >  top_srcdir = ../../../../..
> >  include ../../lib.mk
> > diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
> > new file mode 100644
> > index 0000000..e7b9417
> > --- /dev/null
> > +++ b/tools/testing/selftests/powerpc/eeh/settings
> > @@ -0,0 +1 @@
> > +timeout=0
> > --
> > 2.7.4

^ permalink raw reply

* [PATCH] powerpc/85xx: Fix declaration made after definition
From: Michael Ellerman @ 2020-10-23  2:08 UTC (permalink / raw)
  To: linuxppc-dev

Currently the clang build of corenet64_smp_defconfig fails with:

  arch/powerpc/platforms/85xx/corenet_generic.c:210:1: error:
  attribute declaration must precede definition
  machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);

Fix it by moving the initcall definition prior to the machine
definition, and directly below the function it calls, which is the
usual style anyway.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/platforms/85xx/corenet_generic.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index 6aa8defb5857..8d6029099848 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -106,6 +106,7 @@ int __init corenet_gen_publish_devices(void)
 {
 	return of_platform_bus_probe(NULL, of_device_ids, NULL);
 }
+machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);
 
 static const char * const boards[] __initconst = {
 	"fsl,P2041RDB",
@@ -206,5 +207,3 @@ define_machine(corenet_generic) {
 	.power_save		= e500_idle,
 #endif
 };
-
-machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Michael Ellerman @ 2020-10-23  2:07 UTC (permalink / raw)
  To: Po-Hsu Lin, linux-kernel, linuxppc-dev, linux-kselftest
  Cc: joe.lawrence, mathieu.desnoyers, po-hsu.lin, mbenes, shuah
In-Reply-To: <20201022083616.41666-1-po-hsu.lin@canonical.com>

Po-Hsu Lin <po-hsu.lin@canonical.com> writes:
> The eeh-basic test got its own 60 seconds timeout (defined in commit
> 414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
> device.
>
> And we have discovered that the number of breakable devices varies
> on different hardware. The device recovery time ranges from 0 to 35
> seconds. In our test pool it will take about 30 seconds to run on a
> Power8 system that with 5 breakable devices, 60 seconds to run on a
> Power9 system that with 4 breakable devices.
>
> Thus it's better to disable the default 45 seconds timeout setting in
> the kselftest framework to give it a chance to finish. And let the
> test to take care of the timeout control.

I'd prefer if we still had some timeout, maybe 5 or 10 minutes? Just in
case the test goes completely bonkers.

cheers

> diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
> index b397bab..ae963eb 100644
> --- a/tools/testing/selftests/powerpc/eeh/Makefile
> +++ b/tools/testing/selftests/powerpc/eeh/Makefile
> @@ -3,7 +3,7 @@ noarg:
>  	$(MAKE) -C ../
>  
>  TEST_PROGS := eeh-basic.sh
> -TEST_FILES := eeh-functions.sh
> +TEST_FILES := eeh-functions.sh settings
>  
>  top_srcdir = ../../../../..
>  include ../../lib.mk
> diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
> new file mode 100644
> index 0000000..e7b9417
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/eeh/settings
> @@ -0,0 +1 @@
> +timeout=0
> -- 
> 2.7.4

^ permalink raw reply

* Re: [PATCH] powerpc: Send SIGBUS from machine_check
From: Michael Ellerman @ 2020-10-23  0:57 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev
In-Reply-To: <20201001170557.10915-1-joakim.tjernlund@infinera.com>

Joakim Tjernlund <joakim.tjernlund@infinera.com> writes:
> Embedded PPC CPU should send SIGBUS to user space when applicable.

Yeah, but it's not clear that it's applicable in all cases.

At least I need some reasoning for why it's safe in all cases below to
just send a SIGBUS and take no other action.

Is there a particular CPU you're working on? Can we start with that and
look at all the machine check causes and which can be safely handled.

Some comments below ...


> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0381242920d9..12715d24141c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)

At the beginning of the function we have:

	printk("Machine check in kernel mode.\n");

Which should be updated.

>  		       reason & MCSR_MEA ? "Effective" : "Physical", addr);
>  	}
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		recoverable = 1;
> +	}

For most of the error causes we take no action and set recoverable = 0.

Then you just declare that it is recoverable because it hit in
userspace. Depending on the cause that might be OK, but it's not
obviously correct in all cases.


> +
>  silent_out:
>  	mtspr(SPRN_MCSR, mcsr);
>  	return mfspr(SPRN_MCSR) == 0 && recoverable;
> @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)

Same comment about the printk().

>  	if (reason & MCSR_BUS_RPERR)
>  		printk("Bus - Read Parity Error\n");
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

And same comment more or less.

Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
function does nothing to clear the cause of the machine check.

>  	return 0;
>  }
>  
> @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
>  	if (reason & MCSR_BUS_WRERR)
>  		printk("Bus - Write Bus Error on buffered store or cache line push\n");
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

Same.

>  	return 0;
>  }
>  #elif defined(CONFIG_PPC32)
> @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
>  	default:
>  		printk("Unknown values in msr\n");
>  	}
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

Same.

>  	return 0;
>  }
>  #endif /* everything else */
> -- 
> 2.26.2


cheers

^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 22:07 UTC (permalink / raw)
  To: 'Al Viro', Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022192458.GV3576660@ZenIV.linux.org.uk>

From: Al Viro
> Sent: 22 October 2020 20:25
> 
> On Thu, Oct 22, 2020 at 12:04:52PM -0700, Nick Desaulniers wrote:
> 
> > Passing an `unsigned long` as an `unsigned int` does no such
> > narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> > calls, no masking instructions).
> > So if rw_copy_check_uvector() is inlined into import_iovec() (looking
> > at the mainline@1028ae406999), then children calls of
> > `rw_copy_check_uvector()` will be interpreting the `nr_segs` register
> > unmodified, ie. garbage in the upper 32b.
> 
> FWIW,
> 
> void f(unsinged long v)
> {
> 	if (v != 1)
> 		printf("failed\n");
> }
> 
> void g(unsigned int v)
> {
> 	f(v);
> }
> 
> void h(unsigned long v)
> {
> 	g(v);
> }
> 
> main()
> {
> 	h(0x100000001);
> }
> 
> must not produce any output on a host with 32bit int and 64bit long, regardless of
> the inlining, having functions live in different compilation units, etc.
> 
> Depending upon the calling conventions, compiler might do truncation in caller or
> in a callee, but it must be done _somewhere_.

Put g() in a separate compilation unit and use the 'wrong' type
in the prototypes t() used to call g() and g() uses to call f().

Then you might see where and masking does (or does not) happen.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 22:04 UTC (permalink / raw)
  To: 'Nick Desaulniers', Arnd Bergmann
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnhONvrHLAuz_BrAuEpnF5mD9p0YPGJs=NZZ0EZNo7dFQ@mail.gmail.com>

From: Nick Desaulniers
> Sent: 22 October 2020 20:05
> 
...
> Passing an `unsigned long` as an `unsigned int` does no such
> narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> calls, no masking instructions).

Right but is the called function going to use 32bit ops
and/or mask the register?
Certainly that is likely on x86-64.

I've rather lost track of where the masking is expected
to happen.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 21:28 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	Matthew Wilcox, Linus Torvalds, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022205932.GB3613750@gmail.com>

On Thu, Oct 22, 2020 at 01:59:32PM -0700, Eric Biggers wrote:

> Also note the following program succeeds on Linux 5.9 on x86_64.  On kernels
> that have this bug, it should fail.  (I couldn't get it to actually fail, so it
> must depend on the compiler and/or the kernel config...)

It doesn't.  See https://www.spinics.net/lists/linux-scsi/msg147836.html for
discussion of that mess.

ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
                  unsigned long vlen, loff_t *pos, rwf_t flags)
{
        struct iovec iovstack[UIO_FASTIOV];
        struct iovec *iov = iovstack;
        struct iov_iter iter;
        ssize_t ret;

        ret = import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter);
        if (ret >= 0) {
                ret = do_iter_read(file, &iter, pos, flags);
                kfree(iov);
        }

        return ret;
}

and import_iovec() takes unsigned int as the third argument, so it *will*
truncate to 32 bits, no matter what.  Has done so since 0504c074b546
"switch {compat_,}do_readv_writev() to {compat_,}import_iovec()" back in
March 2015.  Yes, it was an incompatible userland ABI change, even though
nothing that used glibc/uclibc/dietlibc would've noticed.

Better yet, up until 2.1.90pre1 passing a 64bit value as the _first_ argument
of readv(2) used to fail with -EBADF if it was too large; at that point it
started to get quietly truncated to 32bit first.  And again, no libc users
would've noticed (neither would anything except deliberate regression test
looking for that specific behaviour).

Note that we also have process_madvise(2) with size_t for vlen (huh?  It's
a number of array elements, not an object size) and process_vm_{read,write}v(2),
that have unsigned long for the same thing.  And the last two *are* using
the same unsigned long from glibc POV.

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Eric Biggers @ 2020-10-22 20:59 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	Matthew Wilcox, kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnq-yYLcF_coo=jMV-RH-SkuNp_kMB+KCBF5cz3PwiB8g@mail.gmail.com>

On Thu, Oct 22, 2020 at 10:00:44AM -0700, Nick Desaulniers wrote:
> On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > > Wait...
> > > readv(2) defines:
> > >       ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> >
> > It doesn't really matter what the manpage says.  What does the AOSP
> > libc header say?
> 
> Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38
> 
> Theoretically someone could bypass libc to make a system call, right?
> 
> >
> > > But the syscall is defined as:
> > >
> > > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > >                 unsigned long, vlen)
> > > {
> > >         return do_readv(fd, vec, vlen, 0);
> > > }
> >
> 

FWIW, glibc makes the readv() syscall assuming that fd and vlen are 'int' as
well.  So this problem isn't specific to Android's libc.

From objdump -d /lib/x86_64-linux-gnu/libc.so.6:

	00000000000f4db0 <readv@@GLIBC_2.2.5>:
	   f4db0:       64 8b 04 25 18 00 00    mov    %fs:0x18,%eax
	   f4db7:       00
	   f4db8:       85 c0                   test   %eax,%eax
	   f4dba:       75 14                   jne    f4dd0 <readv@@GLIBC_2.2.5+0x20>
	   f4dbc:       b8 13 00 00 00          mov    $0x13,%eax
	   f4dc1:       0f 05                   syscall
	   ...

There's some code for pthread cancellation, but no zeroing of the upper half of
the fd and vlen arguments, which are in %edi and %edx respectively.  But the
glibc function prototype uses 'int' for them, not 'unsigned long'
'ssize_t readv(int fd, const struct iovec *iov, int iovcnt);'.

So the high halves of the fd and iovcnt registers can contain garbage.  Or at
least that's what gcc (9.3.0) and clang (9.0.1) assume; they both compile the
following

void g(unsigned int x);

void f(unsigned long x)
{
        g(x);
}

into f() making a tail call to g(), without zeroing the top half of %rdi.

Also note the following program succeeds on Linux 5.9 on x86_64.  On kernels
that have this bug, it should fail.  (I couldn't get it to actually fail, so it
must depend on the compiler and/or the kernel config...)

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <sys/uio.h>
	#include <unistd.h>

	int main()
	{
		int fd = open("/dev/zero", O_RDONLY);
		char buf[1000];
		struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
		long ret;

		ret = syscall(__NR_readv, fd, &iov, 0x100000001);
		if (ret < 0)
			perror("readv failed");
		else
			printf("read %ld bytes\n", ret);
	}

I think the right fix is to change the readv() (and writev(), etc.) syscalls to
take 'unsigned int' rather than 'unsigned long', as that is what the users are
assuming...

- Eric

^ permalink raw reply

* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Joe Perches @ 2020-10-22 20:45 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: clang-built-linux, linuxppc-dev, Linus Torvalds, LKML,
	Miguel Ojeda
In-Reply-To: <CAKwvOdmUPA9XupXwYHy_qT7P+LrUc+wseT79K_oqw=3y6bwLfg@mail.gmail.com>

On Thu, 2020-10-22 at 13:42 -0700, Nick Desaulniers wrote:
> .On Wed, Oct 21, 2020 at 7:36 PM Joe Perches <joe@perches.com> wrote:
> > Use a more generic form for __section that requires quotes to avoid
> > complications with clang and gcc differences.
[]
> >  a quick test of x86_64 and s390 would be good.

x86_64 was compiled here.
I believe the robot tested the others.


^ permalink raw reply

* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Nick Desaulniers @ 2020-10-22 20:42 UTC (permalink / raw)
  To: Joe Perches
  Cc: clang-built-linux, linuxppc-dev, Linus Torvalds, LKML,
	Miguel Ojeda
In-Reply-To: <fe8abcc88cff676ead8ee48db1e993e63b0611c7.1603327264.git.joe@perches.com>

.On Wed, Oct 21, 2020 at 7:36 PM Joe Perches <joe@perches.com> wrote:
>
> Use a more generic form for __section that requires quotes to avoid
> complications with clang and gcc differences.
>
> Remove the quote operator # from compiler_attributes.h __section macro.
>
> Convert all unquoted __section(foo) uses to quoted __section("foo").
> Also convert __attribute__((section("foo"))) uses to __section("foo")
> even if the __attribute__ has multiple list entry forms.
>
> Conversion done using a script:
>
> Link: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl
>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>
> This conversion was previously submitted to -next last month
> https://lore.kernel.org/lkml/46f69161e60b802488ba8c8f3f8bbf922aa3b49b.camel@perches.com/
>
> Nick Desaulniers found a defect in the conversion of 2 boot files
> for powerpc, but no other defect was found for any other arch.

Untested, but:
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>

Good job handling the trickier cases when the attribute was mixed with
others, and printing it in scripts/mod/modpost.c.

The only cases that *might* be similar to PPC are:
>  arch/s390/boot/startup.c              |  2 +-
>  arch/x86/boot/compressed/pgtable_64.c |  2 +-
>  arch/x86/purgatory/purgatory.c        |  4 ++--

So a quick test of x86_64 and s390 would be good.

Thanks for the patch.

>
> The script was corrected to avoid converting these 2 files.
>
> There is no difference between the script output when run on today's -next
> and Linus' tree through commit f804b3159482, so this should be reasonable to
> apply now.


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox