* Re: [PATCH] x86/mpx: fix recursive munmap() corruption
From: Christophe Leroy @ 2020-10-23 12:28 UTC (permalink / raw)
To: Laurent Dufour, Michael Ellerman
Cc: mhocko, rguenther, linux-mm, Dave Hansen, x86, stable, LKML,
Dave Hansen, Thomas Gleixner, luto, linuxppc-dev, Andrew Morton,
vbabka
In-Reply-To: <9c2b2826-4083-fc9c-5a4d-c101858dd560@linux.vnet.ibm.com>
Hi Laurent
Le 07/05/2019 à 18:35, Laurent Dufour a écrit :
> Le 01/05/2019 à 12:32, Michael Ellerman a écrit :
>> Laurent Dufour <ldufour@linux.vnet.ibm.com> writes:
>>> Le 23/04/2019 à 18:04, Dave Hansen a écrit :
>>>> On 4/23/19 4:16 AM, Laurent Dufour wrote:
>> ...
>>>>> There are 2 assumptions here:
>>>>> 1. 'start' and 'end' are page aligned (this is guaranteed by __do_munmap().
>>>>> 2. the VDSO is 1 page (this is guaranteed by the union vdso_data_store on powerpc)
>>>>
>>>> Are you sure about #2? The 'vdso64_pages' variable seems rather
>>>> unnecessary if the VDSO is only 1 page. ;)
>>>
>>> Hum, not so sure now ;)
>>> I got confused, only the header is one page.
>>> The test is working as a best effort, and don't cover the case where
>>> only few pages inside the VDSO are unmmapped (start >
>>> mm->context.vdso_base). This is not what CRIU is doing and so this was
>>> enough for CRIU support.
>>>
>>> Michael, do you think there is a need to manage all the possibility
>>> here, since the only user is CRIU and unmapping the VDSO is not a so
>>> good idea for other processes ?
>>
>> Couldn't we implement the semantic that if any part of the VDSO is
>> unmapped then vdso_base is set to zero? That should be fairly easy, eg:
>>
>> if (start < vdso_end && end >= mm->context.vdso_base)
>> mm->context.vdso_base = 0;
>>
>>
>> We might need to add vdso_end to the mm->context, but that should be OK.
>>
>> That seems like it would work for CRIU and make sense in general?
>
> Sorry for the late answer, yes this would make more sense.
>
> Here is a patch doing that.
>
In your patch, the test seems overkill:
+ if ((start <= vdso_base && vdso_end <= end) || /* 1 */
+ (vdso_base <= start && start < vdso_end) || /* 3,4 */
+ (vdso_base < end && end <= vdso_end)) /* 2,3 */
+ mm->context.vdso_base = mm->context.vdso_end = 0;
What about
if (start < vdso_end && vdso_start < end)
mm->context.vdso_base = mm->context.vdso_end = 0;
This should cover all cases, or am I missing something ?
And do we really need to store vdso_end in the context ?
I think it should be possible to re-calculate it: the size of the VDSO should be (&vdso32_end -
&vdso32_start) + PAGE_SIZE for 32 bits VDSO, and (&vdso64_end - &vdso64_start) + PAGE_SIZE for the
64 bits VDSO.
Christophe
^ permalink raw reply
* Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()
From: Christophe Leroy @ 2020-10-23 11:57 UTC (permalink / raw)
To: Will Deacon
Cc: nathanl, linux-arch, Arnd Bergmann, Dmitry Safonov, open list,
Paul Mackerras, Andy Lutomirski, Thomas Gleixner,
Vincenzo Frascino, linuxppc-dev
In-Reply-To: <20201023112514.GE20933@willie-the-truck>
Le 23/10/2020 à 13:25, Will Deacon a écrit :
> On Fri, Oct 23, 2020 at 01:22:04PM +0200, Christophe Leroy wrote:
>> Hi Dmitry,
>>
>> Le 28/09/2020 à 17:08, Dmitry Safonov a écrit :
>>> On 9/27/20 8:43 AM, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 21/09/2020 à 13:26, Will Deacon a écrit :
>>>>> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>>>>>> Dmitry Safonov <0x7f454c46@gmail.com> writes:
>>> [..]
>>>>>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>>>>>
>>>>>> That would be great, thanks!
>>>>>
>>>>> I lost track of this one. Is there a patch kicking around to resolve
>>>>> this,
>>>>> or is the segfault expected behaviour?
>>>>>
>>>>
>>>> IIUC dmitry said he will cook a patch. I have not seen any patch yet.
>>>
>>> Yes, sorry about the delay - I was a bit busy with xfrm patches.
>>>
>>> I'll send patches for .close() this week, working on them now.
>>
>> I haven't seen the patches, did you sent them out finally ?
>
> I think it's this series:
>
> https://lore.kernel.org/r/20201013013416.390574-1-dima@arista.com
>
> but they look really invasive to me, so I may cook a small hack for arm64
> in the meantine / for stable.
>
Not sure we are talking about the same thing.
I can't see any new .close function added to vm_special_mapping in order to replace arch_unmap() hook.
Christophe
^ permalink raw reply
* Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()
From: Will Deacon @ 2020-10-23 11:25 UTC (permalink / raw)
To: Christophe Leroy
Cc: nathanl, linux-arch, Arnd Bergmann, Dmitry Safonov, open list,
Paul Mackerras, Andy Lutomirski, Thomas Gleixner,
Vincenzo Frascino, linuxppc-dev
In-Reply-To: <ba9861da-2f5b-a649-5626-af00af634546@csgroup.eu>
On Fri, Oct 23, 2020 at 01:22:04PM +0200, Christophe Leroy wrote:
> Hi Dmitry,
>
> Le 28/09/2020 à 17:08, Dmitry Safonov a écrit :
> > On 9/27/20 8:43 AM, Christophe Leroy wrote:
> > >
> > >
> > > Le 21/09/2020 à 13:26, Will Deacon a écrit :
> > > > On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
> > > > > Dmitry Safonov <0x7f454c46@gmail.com> writes:
> > [..]
> > > > > > I'll cook a patch for vm_special_mapping if you don't mind :-)
> > > > >
> > > > > That would be great, thanks!
> > > >
> > > > I lost track of this one. Is there a patch kicking around to resolve
> > > > this,
> > > > or is the segfault expected behaviour?
> > > >
> > >
> > > IIUC dmitry said he will cook a patch. I have not seen any patch yet.
> >
> > Yes, sorry about the delay - I was a bit busy with xfrm patches.
> >
> > I'll send patches for .close() this week, working on them now.
>
> I haven't seen the patches, did you sent them out finally ?
I think it's this series:
https://lore.kernel.org/r/20201013013416.390574-1-dima@arista.com
but they look really invasive to me, so I may cook a small hack for arm64
in the meantine / for stable.
Will
^ permalink raw reply
* Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()
From: Christophe Leroy @ 2020-10-23 11:22 UTC (permalink / raw)
To: Dmitry Safonov, Will Deacon, Michael Ellerman
Cc: nathanl, linux-arch, Arnd Bergmann, open list, Paul Mackerras,
Andy Lutomirski, Thomas Gleixner, Vincenzo Frascino, linuxppc-dev
In-Reply-To: <542145eb-7d90-0444-867e-c9cbb6bdd8e3@gmail.com>
Hi Dmitry,
Le 28/09/2020 à 17:08, Dmitry Safonov a écrit :
> On 9/27/20 8:43 AM, Christophe Leroy wrote:
>>
>>
>> Le 21/09/2020 à 13:26, Will Deacon a écrit :
>>> On Fri, Aug 28, 2020 at 12:14:28PM +1000, Michael Ellerman wrote:
>>>> Dmitry Safonov <0x7f454c46@gmail.com> writes:
> [..]
>>>>> I'll cook a patch for vm_special_mapping if you don't mind :-)
>>>>
>>>> That would be great, thanks!
>>>
>>> I lost track of this one. Is there a patch kicking around to resolve
>>> this,
>>> or is the segfault expected behaviour?
>>>
>>
>> IIUC dmitry said he will cook a patch. I have not seen any patch yet.
>
> Yes, sorry about the delay - I was a bit busy with xfrm patches.
>
> I'll send patches for .close() this week, working on them now.
I haven't seen the patches, did you sent them out finally ?
Christophe
^ permalink raw reply
* Re: [PATCH v2 1/2] powerpc: Introduce POWER10_DD1 feature
From: Ravi Bangoria @ 2020-10-23 10:14 UTC (permalink / raw)
To: Jordan Niethe
Cc: Christophe Leroy, Ravi Bangoria, Michael Neuling, Nicholas Piggin,
maddy, Paul Mackerras, naveen.n.rao, linuxppc-dev
In-Reply-To: <CACzsE9rqopXj37vrOcyM3sisomeYzxbeR6V4o5K3_Y8xqo5hHw@mail.gmail.com>
>>>> +static void __init fixup_cpu_features(void)
>>>> +{
>>>> + unsigned long version = mfspr(SPRN_PVR);
>>>> +
>>>> + if ((version & 0xffffffff) == 0x00800100)
>>>> + cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
>>>> +}
>>>> +
>>> I am just wondering why this is needed here, but the same thing is not
>>> done for, say, CPU_FTR_POWER9_DD2_1?
>>
>> When we don't use DT cpu_features (PowerVM / kvm geusts), we call
>> identify_cpu() twice. First with Real PVR which sets "raw" cpu_spec
>> as cur_cpu_spec and then 2nd time with Logical PVR (0x0f...) which
>> (mostly) overwrites the cur_cpu_spec with "architected" mode cpu_spec.
>> I don't see DD version specific entries for "architected" mode in
>> cpu_specs[] for any previous processors. So I've introduced this
>> function to tweak cpu_features.
>>
>> Though, I don't know why we don't have similar thing for
>> CPU_FTR_POWER9_DD2_1. I've to check that.
>>
>>> And should we get a /* Power10 DD 1 */ added to cpu_specs[]?
>>
>> IIUC, we don't need such entry. For PowerVM / kvm guests, we overwrite
>> cpu_spec, so /* Power10 */ "raw" entry is sufficient. And For baremetal,
>> we don't use cpu_specs[] at all.
> I think even for powernv, using dt features can be disabled by the
> cmdline with dt_cpu_ftrs=off, then cpu_specs[] will then be used.
Ok... with dt_cpu_ftrs=off, we seem to be using raw mode cpu_specs[] entry on
baremetal. So yeah, I'll add /* Power10 DD1 */ raw mode entry into cpu_specs[].
Thanks for pointing it out.
-Ravi
^ permalink raw reply
* Re: [PATCH] powerpc: Send SIGBUS from machine_check
From: Joakim Tjernlund @ 2020-10-23 9:23 UTC (permalink / raw)
To: linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au
In-Reply-To: <87d019yd0w.fsf@mpe.ellerman.id.au>
On Fri, 2020-10-23 at 11:57 +1100, Michael Ellerman wrote:
>
>
> Joakim Tjernlund <joakim.tjernlund@infinera.com> writes:
> > Embedded PPC CPU should send SIGBUS to user space when applicable.
>
> Yeah, but it's not clear that it's applicable in all cases.
>
> At least I need some reasoning for why it's safe in all cases below to
> just send a SIGBUS and take no other action.
For me this came from an User SDK accessing a PCI device(also using PCI IRQs) and this
SDK did some strange stuff during shutdown which disabled the device before SW was done.
This caused PCI accesses, both from User Space and kernel PCI IRQs access) to the device
which caused an Machine Check(PCI transfer failed). Without this patch, the kernel
would just OOPS and hang/do strange things even for an access made by User space.
Now the User app just gets a SIGBUS and the kernel still works as it should.
Perhaps a SIGBUS and recover isn't right in all cases but without it there will be a
system break down.
> Is there a particular CPU you're working on? Can we start with that and
> look at all the machine check causes and which can be safely handled.
This was a T1042(e5500) but we have e500 and mpc832x as well.
>
> Some comments below ...
>
>
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index 0381242920d9..12715d24141c 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
>
> At the beginning of the function we have:
>
> printk("Machine check in kernel mode.\n");
>
> Which should be updated.
Sure, just remove the "in kernel mode" perhaps?
>
> > reason & MCSR_MEA ? "Effective" : "Physical", addr);
> > }
> >
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + recoverable = 1;
> > + }
>
> For most of the error causes we take no action and set recoverable = 0.
>
> Then you just declare that it is recoverable because it hit in
> userspace. Depending on the cause that might be OK, but it's not
> obviously correct in all cases.
Not so familiar with PPC that I can make out what is OK or not.
I do think you stand a better chance now that before though.
>
>
> > +
> > silent_out:
> > mtspr(SPRN_MCSR, mcsr);
> > return mfspr(SPRN_MCSR) == 0 && recoverable;
> > @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
>
> Same comment about the printk().
>
> > if (reason & MCSR_BUS_RPERR)
> > printk("Bus - Read Parity Error\n");
> >
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + return 1;
> > + }
>
> And same comment more or less.
>
> Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
> function does nothing to clear the cause of the machine check.
>
> > return 0;
> > }
> >
> > @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
> > if (reason & MCSR_BUS_WRERR)
> > printk("Bus - Write Bus Error on buffered store or cache line push\n");
> >
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + return 1;
> > + }
>
> Same.
>
> > return 0;
> > }
> > #elif defined(CONFIG_PPC32)
> > @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
> > default:
> > printk("Unknown values in msr\n");
> > }
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + return 1;
> > + }
>
> Same.
>
> > return 0;
> > }
> > #endif /* everything else */
> > --
> > 2.26.2
>
>
> cheers
^ permalink raw reply
* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Joe Perches @ 2020-10-23 8:03 UTC (permalink / raw)
To: Miguel Ojeda
Cc: clang-built-linux, Nick Desaulniers, Linus Torvalds, linux-kernel,
linuxppc-dev
In-Reply-To: <CANiq72nfHjXkN65jy+unz0k66qvAALNhhhDZsxqPRLdtLKOW_Q@mail.gmail.com>
On Fri, 2020-10-23 at 08:08 +0200, Miguel Ojeda wrote:
> On Thu, Oct 22, 2020 at 4:36 AM Joe Perches <joe@perches.com> wrote:
> >
> > Use a more generic form for __section that requires quotes to avoid
> > complications with clang and gcc differences.
>
> I performed visual inspection (one by one...) and the only thing I saw
> is that sometimes the `__attribute__` has a whitespace afterwards and
> sometimes it doesn't, same for the commas inside, e.g.:
>
> - __used __attribute__((section(".modinfo"), unused, aligned(1))) \
> + __used __section(".modinfo") __attribute__((unused, aligned(1))) \
>
> and:
>
> - __attribute__ ((unused,__section__ ("__param"),aligned(sizeof(void *)))) \
> + __section("__param") __attribute__ ((unused, aligned(sizeof(void *)))) \
>
> I think the patch tries to follow the style of the replaced line, but
> for the commas in this last case it didn't. Anyway, it is not
> important.
Here the change follows the kernel style of space after comma.
> I can pick it up in my queue along with the __alias one and keep it
> for a few weeks in -next.
Thanks Miguel, but IMO it doesn't need time in next.
Applying it just before an rc1 minimizes conflicts.
^ permalink raw reply
* Re: C vdso
From: Christophe Leroy @ 2020-10-23 6:28 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev@ozlabs.org
In-Reply-To: <cc532aa8-a9e0-a105-b7b1-ee8d723b7ed6@csgroup.eu>
Hi Michael,
Le 24/09/2020 à 15:17, Christophe Leroy a écrit :
> Hi Michael
>
> Le 17/09/2020 à 14:33, Michael Ellerman a écrit :
>> Hi Christophe,
>>
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> Hi Michael,
>>>
>>> What is the status with the generic C vdso merge ?
>>> In some mail, you mentionned having difficulties getting it working on
>>> ppc64, any progress ? What's the problem ? Can I help ?
>>
>> Yeah sorry I was hoping to get time to work on it but haven't been able
>> to.
>>
>> It's causing crashes on ppc64 ie. big endian.
>>
>
>>
>> As you can see from the instruction dump we have jumped into the weeds somewhere.
>>
>> We also had the report from the kbuild robot about rela.opd being
>> discarded, which I think is indicative of a bigger problem. ie. we don't
>> process relocations for the VDSO, but opds require relocations (they
>> contain an absolute pointer).
>>
>> I thought we could get away with that, because the VDSO entry points
>> aren't proper functions (so they don't have opds), and I didn't think
>> we'd be calling via function pointers in the VDSO code (which would
>> require opds). But seems something is not working right.
>>
>> Sorry I haven't got back to you with those details. Things are a bit of
>> a mess inside IBM at the moment (always?), and I've been trying to get
>> everything done before I take a holiday next week.
>>
>
>
> Can you tell what defconfig you are using ? I have been able to setup a full glibc PPC64 cross
> compilation chain and been able to test it under QEMU with success, using Nathan's vdsotest tool.
What config are you using ?
Christophe
>
> I tested with both ppc64_defconfig and pseries_defconfig.
>
> The only problem I got is with getcpu, which segfaults but both before and after applying my series,
> so I guess this is unrelated.
>
> Not sure we can pay too much attention to the exact measurement as it is a ppc64 QEMU running on a
> x86 Linux which is running in a Virtual Box on a x86 windows Laptop, but at least it works:
>
> Without the series:
>
> clock-getres-monotonic: vdso: 389 nsec/call
> clock-gettime-monotonic: vdso: 781 nsec/call
> clock-getres-monotonic-coarse: vdso: 13715 nsec/call
> clock-gettime-monotonic-coarse: vdso: 312 nsec/call
> clock-getres-monotonic-raw: vdso: 13589 nsec/call
> clock-getres-tai: vdso: 13827 nsec/call
> clock-gettime-tai: vdso: 14846 nsec/call
> clock-getres-boottime: vdso: 13596 nsec/call
> clock-gettime-boottime: vdso: 14758 nsec/call
> clock-getres-realtime: vdso: 327 nsec/call
> clock-gettime-realtime: vdso: 717 nsec/call
> clock-getres-realtime-coarse: vdso: 14102 nsec/call
> clock-gettime-realtime-coarse: vdso: 299 nsec/call
> gettimeofday: vdso: 771 nsec/call
>
> With the series:
>
> clock-getres-monotonic: vdso: 350 nsec/call
> clock-gettime-monotonic: vdso: 726 nsec/call
> clock-getres-monotonic-coarse: vdso: 356 nsec/call
> clock-gettime-monotonic-coarse: vdso: 423 nsec/call
> clock-getres-monotonic-raw: vdso: 349 nsec/call
> clock-getres-tai: vdso: 419 nsec/call
> clock-gettime-tai: vdso: 724 nsec/call
> clock-getres-boottime: vdso: 352 nsec/call
> clock-gettime-boottime: vdso: 752 nsec/call
> clock-getres-realtime: vdso: 351 nsec/call
> clock-gettime-realtime: vdso: 733 nsec/call
> clock-getres-realtime-coarse: vdso: 356 nsec/call
> clock-gettime-realtime-coarse: vdso: 367 nsec/call
> gettimeofday: vdso: 796 nsec/call
>
>
> Thanks
> Christophe
^ permalink raw reply
* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Miguel Ojeda @ 2020-10-23 6:08 UTC (permalink / raw)
To: Joe Perches
Cc: clang-built-linux, Nick Desaulniers, Linus Torvalds, linux-kernel,
linuxppc-dev
In-Reply-To: <fe8abcc88cff676ead8ee48db1e993e63b0611c7.1603327264.git.joe@perches.com>
On Thu, Oct 22, 2020 at 4:36 AM Joe Perches <joe@perches.com> wrote:
>
> Use a more generic form for __section that requires quotes to avoid
> complications with clang and gcc differences.
I performed visual inspection (one by one...) and the only thing I saw
is that sometimes the `__attribute__` has a whitespace afterwards and
sometimes it doesn't, same for the commas inside, e.g.:
- __used __attribute__((section(".modinfo"), unused, aligned(1))) \
+ __used __section(".modinfo") __attribute__((unused, aligned(1))) \
and:
- __attribute__ ((unused,__section__ ("__param"),aligned(sizeof(void *)))) \
+ __section("__param") __attribute__ ((unused, aligned(sizeof(void *)))) \
I think the patch tries to follow the style of the replaced line, but
for the commas in this last case it didn't. Anyway, it is not
important.
I can pick it up in my queue along with the __alias one and keep it
for a few weeks in -next.
Cheers,
Miguel
^ permalink raw reply
* [powerpc:fixes] BUILD SUCCESS 4ff753feab021242144818b9a3ba011238218145
From: kernel test robot @ 2020-10-23 4:14 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes
branch HEAD: 4ff753feab021242144818b9a3ba011238218145 powerpc/pseries: Avoid using addr_to_pfn in real mode
elapsed time: 723m
configs tested: 120
configs skipped: 2
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
powerpc mpc837x_mds_defconfig
mips ath79_defconfig
um kunit_defconfig
powerpc ppa8548_defconfig
arm spear6xx_defconfig
riscv alldefconfig
arm xcep_defconfig
arm exynos_defconfig
arm multi_v5_defconfig
powerpc ksi8560_defconfig
arm s3c6400_defconfig
mips maltaup_defconfig
arm omap1_defconfig
arm lubbock_defconfig
arm spitz_defconfig
powerpc xes_mpc85xx_defconfig
mips vocore2_defconfig
arm jornada720_defconfig
arc axs103_defconfig
arm vt8500_v6_v7_defconfig
powerpc redwood_defconfig
sh sh03_defconfig
mips bmips_stb_defconfig
sparc64 alldefconfig
sh magicpanelr2_defconfig
arm axm55xx_defconfig
c6x evmc6678_defconfig
mips bigsur_defconfig
powerpc tqm8560_defconfig
mips cavium_octeon_defconfig
xtensa xip_kc705_defconfig
m68k m5407c3_defconfig
sh espt_defconfig
arm realview_defconfig
xtensa virt_defconfig
powerpc ppc6xx_defconfig
arm versatile_defconfig
h8300 defconfig
m68k stmark2_defconfig
powerpc icon_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a002-20201023
i386 randconfig-a005-20201023
i386 randconfig-a003-20201023
i386 randconfig-a001-20201023
i386 randconfig-a006-20201023
i386 randconfig-a004-20201023
i386 randconfig-a002-20201022
i386 randconfig-a005-20201022
i386 randconfig-a003-20201022
i386 randconfig-a001-20201022
i386 randconfig-a006-20201022
i386 randconfig-a004-20201022
x86_64 randconfig-a011-20201022
x86_64 randconfig-a013-20201022
x86_64 randconfig-a016-20201022
x86_64 randconfig-a015-20201022
x86_64 randconfig-a012-20201022
x86_64 randconfig-a014-20201022
i386 randconfig-a016-20201022
i386 randconfig-a014-20201022
i386 randconfig-a015-20201022
i386 randconfig-a012-20201022
i386 randconfig-a013-20201022
i386 randconfig-a011-20201022
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
clang tested configs:
x86_64 randconfig-a001-20201022
x86_64 randconfig-a002-20201022
x86_64 randconfig-a003-20201022
x86_64 randconfig-a006-20201022
x86_64 randconfig-a004-20201022
x86_64 randconfig-a005-20201022
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [PATCH] powerpc: Add config fragment for disabling -Werror
From: Michael Ellerman @ 2020-10-23 4:00 UTC (permalink / raw)
To: linuxppc-dev
This makes it easy to disable building with -Werror:
$ make defconfig
$ grep WERROR .config
# CONFIG_PPC_DISABLE_WERROR is not set
CONFIG_PPC_WERROR=y
$ make disable-werror.config
GEN Makefile
Using .config as base
Merging arch/powerpc/configs/disable-werror.config
Value of CONFIG_PPC_DISABLE_WERROR is redefined by fragment arch/powerpc/configs/disable-werror.config:
Previous value: # CONFIG_PPC_DISABLE_WERROR is not set
New value: CONFIG_PPC_DISABLE_WERROR=y
...
$ grep WERROR .config
CONFIG_PPC_DISABLE_WERROR=y
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/configs/disable-werror.config | 1 +
1 file changed, 1 insertion(+)
create mode 100644 arch/powerpc/configs/disable-werror.config
diff --git a/arch/powerpc/configs/disable-werror.config b/arch/powerpc/configs/disable-werror.config
new file mode 100644
index 000000000000..6ea12a12432c
--- /dev/null
+++ b/arch/powerpc/configs/disable-werror.config
@@ -0,0 +1 @@
+CONFIG_PPC_DISABLE_WERROR=y
--
2.25.1
^ permalink raw reply related
* [PATCH] net: ucc_geth: Drop extraneous parentheses in comparison
From: Michael Ellerman @ 2020-10-23 3:32 UTC (permalink / raw)
To: linuxppc-dev, netdev; +Cc: kuba, leoyang.li, davem, linux-kernel
Clang warns about the extra parentheses in this comparison:
drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
warning: equality comparison with extraneous parentheses
if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
It seems clear the intent here is to do a comparison not an
assignment, so drop the extra parentheses to avoid any confusion.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
drivers/net/ethernet/freescale/ucc_geth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index db791f60b884..d8ad478a0a13 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -1358,7 +1358,7 @@ static int adjust_enet_interface(struct ucc_geth_private *ugeth)
(ugeth->phy_interface == PHY_INTERFACE_MODE_RTBI)) {
upsmr |= UCC_GETH_UPSMR_TBIM;
}
- if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
+ if (ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII)
upsmr |= UCC_GETH_UPSMR_SGMM;
out_be32(&uf_regs->upsmr, upsmr);
--
2.25.1
^ permalink raw reply related
* Re: [PATCH] serial: pmac_zilog: don't init if zilog is not available
From: Finn Thain @ 2020-10-23 3:21 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Greg KH, Laurent Vivier, Linux Kernel Mailing List, linux-m68k,
Paul Mackerras, open list:SERIAL DRIVERS, Brad Boyer,
linuxppc-dev, Joshua Thompson
In-Reply-To: <CAMuHMdVbo2C1yZ5E_A3L8J1zZigO8i8m5AFUTn9SjbY1sx16kA@mail.gmail.com>
On Thu, 22 Oct 2020, Geert Uytterhoeven wrote:
>
> Thanks for your patch...
>
You're welcome.
> I can't say I'm a fan of this...
>
Sorry.
>
> The real issue is this "extern struct platform_device scc_a_pdev,
> scc_b_pdev", circumventing the driver framework.
>
> Can we get rid of that?
>
Is there a better alternative?
pmz_probe() is called by console_initcall(pmz_console_init) when
CONFIG_SERIAL_PMACZILOG_CONSOLE=y because this has to happen earlier than
the normal platform bus probing which takes place later as a typical
module_initcall.
^ permalink raw reply
* [PATCH] powerpc/ps3: Drop unused DBG macro
From: Michael Ellerman @ 2020-10-23 3:13 UTC (permalink / raw)
To: linuxppc-dev
This DBG macro is unused, and has been unused since the file was
originally merged into mainline. Just drop it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/boot/ps3.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/arch/powerpc/boot/ps3.c b/arch/powerpc/boot/ps3.c
index 6e4efbdb6b7c..f157717ae814 100644
--- a/arch/powerpc/boot/ps3.c
+++ b/arch/powerpc/boot/ps3.c
@@ -21,13 +21,6 @@ extern int lv1_get_logical_ppe_id(u64 *out_1);
extern int lv1_get_repository_node_value(u64 in_1, u64 in_2, u64 in_3,
u64 in_4, u64 in_5, u64 *out_1, u64 *out_2);
-#ifdef DEBUG
-#define DBG(fmt...) printf(fmt)
-#else
-static inline int __attribute__ ((format (printf, 1, 2))) DBG(
- const char *fmt, ...) {return 0;}
-#endif
-
BSS_STACK(4096);
/* A buffer that may be edited by tools operating on a zImage binary so as to
--
2.25.1
^ permalink raw reply related
* [PATCHv2] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Po-Hsu Lin @ 2020-10-23 2:45 UTC (permalink / raw)
To: linux-kernel, linuxppc-dev, linux-kselftest, mpe
Cc: joe.lawrence, mathieu.desnoyers, po-hsu.lin, mbenes, shuah
The eeh-basic test got its own 60 seconds timeout (defined in commit
414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
device.
And we have discovered that the number of breakable devices varies
on different hardware. The device recovery time ranges from 0 to 35
seconds. In our test pool it will take about 30 seconds to run on a
Power8 system that with 5 breakable devices, 60 seconds to run on a
Power9 system that with 4 breakable devices.
Extend the timeout setting in the kselftest framework to 5 minutes
to give it a chance to finish.
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
---
tools/testing/selftests/powerpc/eeh/Makefile | 2 +-
tools/testing/selftests/powerpc/eeh/settings | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/powerpc/eeh/settings
diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
index b397bab..ae963eb 100644
--- a/tools/testing/selftests/powerpc/eeh/Makefile
+++ b/tools/testing/selftests/powerpc/eeh/Makefile
@@ -3,7 +3,7 @@ noarg:
$(MAKE) -C ../
TEST_PROGS := eeh-basic.sh
-TEST_FILES := eeh-functions.sh
+TEST_FILES := eeh-functions.sh settings
top_srcdir = ../../../../..
include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
new file mode 100644
index 0000000..694d707
--- /dev/null
+++ b/tools/testing/selftests/powerpc/eeh/settings
@@ -0,0 +1 @@
+timeout=300
--
2.7.4
^ permalink raw reply related
* Re: [PATCH] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Po-Hsu Lin @ 2020-10-23 2:37 UTC (permalink / raw)
To: Michael Ellerman
Cc: joe.lawrence, linuxppc-dev, linux-kernel, mathieu.desnoyers,
linux-kselftest, mbenes, shuah
In-Reply-To: <87a6wdy9si.fsf@mpe.ellerman.id.au>
On Fri, Oct 23, 2020 at 10:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Po-Hsu Lin <po-hsu.lin@canonical.com> writes:
> > The eeh-basic test got its own 60 seconds timeout (defined in commit
> > 414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
> > device.
> >
> > And we have discovered that the number of breakable devices varies
> > on different hardware. The device recovery time ranges from 0 to 35
> > seconds. In our test pool it will take about 30 seconds to run on a
> > Power8 system that with 5 breakable devices, 60 seconds to run on a
> > Power9 system that with 4 breakable devices.
> >
> > Thus it's better to disable the default 45 seconds timeout setting in
> > the kselftest framework to give it a chance to finish. And let the
> > test to take care of the timeout control.
>
> I'd prefer if we still had some timeout, maybe 5 or 10 minutes? Just in
> case the test goes completely bonkers.
>
OK, let's go for 5 minutes.
Will send V2 later.
Thanks for your suggestion!
> cheers
>
> > diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
> > index b397bab..ae963eb 100644
> > --- a/tools/testing/selftests/powerpc/eeh/Makefile
> > +++ b/tools/testing/selftests/powerpc/eeh/Makefile
> > @@ -3,7 +3,7 @@ noarg:
> > $(MAKE) -C ../
> >
> > TEST_PROGS := eeh-basic.sh
> > -TEST_FILES := eeh-functions.sh
> > +TEST_FILES := eeh-functions.sh settings
> >
> > top_srcdir = ../../../../..
> > include ../../lib.mk
> > diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
> > new file mode 100644
> > index 0000000..e7b9417
> > --- /dev/null
> > +++ b/tools/testing/selftests/powerpc/eeh/settings
> > @@ -0,0 +1 @@
> > +timeout=0
> > --
> > 2.7.4
^ permalink raw reply
* [PATCH] powerpc/85xx: Fix declaration made after definition
From: Michael Ellerman @ 2020-10-23 2:08 UTC (permalink / raw)
To: linuxppc-dev
Currently the clang build of corenet64_smp_defconfig fails with:
arch/powerpc/platforms/85xx/corenet_generic.c:210:1: error:
attribute declaration must precede definition
machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);
Fix it by moving the initcall definition prior to the machine
definition, and directly below the function it calls, which is the
usual style anyway.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/platforms/85xx/corenet_generic.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index 6aa8defb5857..8d6029099848 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -106,6 +106,7 @@ int __init corenet_gen_publish_devices(void)
{
return of_platform_bus_probe(NULL, of_device_ids, NULL);
}
+machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);
static const char * const boards[] __initconst = {
"fsl,P2041RDB",
@@ -206,5 +207,3 @@ define_machine(corenet_generic) {
.power_save = e500_idle,
#endif
};
-
-machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);
--
2.25.1
^ permalink raw reply related
* Re: [PATCH] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Michael Ellerman @ 2020-10-23 2:07 UTC (permalink / raw)
To: Po-Hsu Lin, linux-kernel, linuxppc-dev, linux-kselftest
Cc: joe.lawrence, mathieu.desnoyers, po-hsu.lin, mbenes, shuah
In-Reply-To: <20201022083616.41666-1-po-hsu.lin@canonical.com>
Po-Hsu Lin <po-hsu.lin@canonical.com> writes:
> The eeh-basic test got its own 60 seconds timeout (defined in commit
> 414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
> device.
>
> And we have discovered that the number of breakable devices varies
> on different hardware. The device recovery time ranges from 0 to 35
> seconds. In our test pool it will take about 30 seconds to run on a
> Power8 system that with 5 breakable devices, 60 seconds to run on a
> Power9 system that with 4 breakable devices.
>
> Thus it's better to disable the default 45 seconds timeout setting in
> the kselftest framework to give it a chance to finish. And let the
> test to take care of the timeout control.
I'd prefer if we still had some timeout, maybe 5 or 10 minutes? Just in
case the test goes completely bonkers.
cheers
> diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
> index b397bab..ae963eb 100644
> --- a/tools/testing/selftests/powerpc/eeh/Makefile
> +++ b/tools/testing/selftests/powerpc/eeh/Makefile
> @@ -3,7 +3,7 @@ noarg:
> $(MAKE) -C ../
>
> TEST_PROGS := eeh-basic.sh
> -TEST_FILES := eeh-functions.sh
> +TEST_FILES := eeh-functions.sh settings
>
> top_srcdir = ../../../../..
> include ../../lib.mk
> diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
> new file mode 100644
> index 0000000..e7b9417
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/eeh/settings
> @@ -0,0 +1 @@
> +timeout=0
> --
> 2.7.4
^ permalink raw reply
* Re: [PATCH] powerpc: Send SIGBUS from machine_check
From: Michael Ellerman @ 2020-10-23 0:57 UTC (permalink / raw)
To: Joakim Tjernlund, linuxppc-dev
In-Reply-To: <20201001170557.10915-1-joakim.tjernlund@infinera.com>
Joakim Tjernlund <joakim.tjernlund@infinera.com> writes:
> Embedded PPC CPU should send SIGBUS to user space when applicable.
Yeah, but it's not clear that it's applicable in all cases.
At least I need some reasoning for why it's safe in all cases below to
just send a SIGBUS and take no other action.
Is there a particular CPU you're working on? Can we start with that and
look at all the machine check causes and which can be safely handled.
Some comments below ...
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0381242920d9..12715d24141c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
At the beginning of the function we have:
printk("Machine check in kernel mode.\n");
Which should be updated.
> reason & MCSR_MEA ? "Effective" : "Physical", addr);
> }
>
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + recoverable = 1;
> + }
For most of the error causes we take no action and set recoverable = 0.
Then you just declare that it is recoverable because it hit in
userspace. Depending on the cause that might be OK, but it's not
obviously correct in all cases.
> +
> silent_out:
> mtspr(SPRN_MCSR, mcsr);
> return mfspr(SPRN_MCSR) == 0 && recoverable;
> @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
Same comment about the printk().
> if (reason & MCSR_BUS_RPERR)
> printk("Bus - Read Parity Error\n");
>
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + return 1;
> + }
And same comment more or less.
Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
function does nothing to clear the cause of the machine check.
> return 0;
> }
>
> @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
> if (reason & MCSR_BUS_WRERR)
> printk("Bus - Write Bus Error on buffered store or cache line push\n");
>
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + return 1;
> + }
Same.
> return 0;
> }
> #elif defined(CONFIG_PPC32)
> @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
> default:
> printk("Unknown values in msr\n");
> }
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + return 1;
> + }
Same.
> return 0;
> }
> #endif /* everything else */
> --
> 2.26.2
cheers
^ permalink raw reply
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 22:07 UTC (permalink / raw)
To: 'Al Viro', Nick Desaulniers
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, Arnd Bergmann,
linux-block@vger.kernel.org, io-uring@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, Jens Axboe,
linux-parisc@vger.kernel.org, Greg KH,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
linux-fsdevel@vger.kernel.org, Andrew Morton,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022192458.GV3576660@ZenIV.linux.org.uk>
From: Al Viro
> Sent: 22 October 2020 20:25
>
> On Thu, Oct 22, 2020 at 12:04:52PM -0700, Nick Desaulniers wrote:
>
> > Passing an `unsigned long` as an `unsigned int` does no such
> > narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> > calls, no masking instructions).
> > So if rw_copy_check_uvector() is inlined into import_iovec() (looking
> > at the mainline@1028ae406999), then children calls of
> > `rw_copy_check_uvector()` will be interpreting the `nr_segs` register
> > unmodified, ie. garbage in the upper 32b.
>
> FWIW,
>
> void f(unsinged long v)
> {
> if (v != 1)
> printf("failed\n");
> }
>
> void g(unsigned int v)
> {
> f(v);
> }
>
> void h(unsigned long v)
> {
> g(v);
> }
>
> main()
> {
> h(0x100000001);
> }
>
> must not produce any output on a host with 32bit int and 64bit long, regardless of
> the inlining, having functions live in different compilation units, etc.
>
> Depending upon the calling conventions, compiler might do truncation in caller or
> in a callee, but it must be done _somewhere_.
Put g() in a separate compilation unit and use the 'wrong' type
in the prototypes t() used to call g() and g() uses to call f().
Then you might see where and masking does (or does not) happen.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 22:04 UTC (permalink / raw)
To: 'Nick Desaulniers', Arnd Bergmann
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
linux-fsdevel@vger.kernel.org, Andrew Morton,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnhONvrHLAuz_BrAuEpnF5mD9p0YPGJs=NZZ0EZNo7dFQ@mail.gmail.com>
From: Nick Desaulniers
> Sent: 22 October 2020 20:05
>
...
> Passing an `unsigned long` as an `unsigned int` does no such
> narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> calls, no masking instructions).
Right but is the called function going to use 32bit ops
and/or mask the register?
Certainly that is likely on x86-64.
I've rather lost track of where the masking is expected
to happen.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 21:28 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
Matthew Wilcox, Linus Torvalds, kernel-team@android.com,
Arnd Bergmann, linux-block@vger.kernel.org,
io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
Nick Desaulniers, linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, David Laight,
netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022205932.GB3613750@gmail.com>
On Thu, Oct 22, 2020 at 01:59:32PM -0700, Eric Biggers wrote:
> Also note the following program succeeds on Linux 5.9 on x86_64. On kernels
> that have this bug, it should fail. (I couldn't get it to actually fail, so it
> must depend on the compiler and/or the kernel config...)
It doesn't. See https://www.spinics.net/lists/linux-scsi/msg147836.html for
discussion of that mess.
ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
struct iov_iter iter;
ssize_t ret;
ret = import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter);
if (ret >= 0) {
ret = do_iter_read(file, &iter, pos, flags);
kfree(iov);
}
return ret;
}
and import_iovec() takes unsigned int as the third argument, so it *will*
truncate to 32 bits, no matter what. Has done so since 0504c074b546
"switch {compat_,}do_readv_writev() to {compat_,}import_iovec()" back in
March 2015. Yes, it was an incompatible userland ABI change, even though
nothing that used glibc/uclibc/dietlibc would've noticed.
Better yet, up until 2.1.90pre1 passing a 64bit value as the _first_ argument
of readv(2) used to fail with -EBADF if it was too large; at that point it
started to get quietly truncated to 32bit first. And again, no libc users
would've noticed (neither would anything except deliberate regression test
looking for that specific behaviour).
Note that we also have process_madvise(2) with size_t for vlen (huh? It's
a number of array elements, not an object size) and process_vm_{read,write}v(2),
that have unsigned long for the same thing. And the last two *are* using
the same unsigned long from glibc POV.
^ permalink raw reply
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Eric Biggers @ 2020-10-22 20:59 UTC (permalink / raw)
To: Nick Desaulniers
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
Matthew Wilcox, kernel-team@android.com, Arnd Bergmann,
linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, Jens Axboe,
linux-parisc@vger.kernel.org, Greg KH,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, David Laight,
netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnq-yYLcF_coo=jMV-RH-SkuNp_kMB+KCBF5cz3PwiB8g@mail.gmail.com>
On Thu, Oct 22, 2020 at 10:00:44AM -0700, Nick Desaulniers wrote:
> On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > > Wait...
> > > readv(2) defines:
> > > ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> >
> > It doesn't really matter what the manpage says. What does the AOSP
> > libc header say?
>
> Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38
>
> Theoretically someone could bypass libc to make a system call, right?
>
> >
> > > But the syscall is defined as:
> > >
> > > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > > unsigned long, vlen)
> > > {
> > > return do_readv(fd, vec, vlen, 0);
> > > }
> >
>
FWIW, glibc makes the readv() syscall assuming that fd and vlen are 'int' as
well. So this problem isn't specific to Android's libc.
From objdump -d /lib/x86_64-linux-gnu/libc.so.6:
00000000000f4db0 <readv@@GLIBC_2.2.5>:
f4db0: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
f4db7: 00
f4db8: 85 c0 test %eax,%eax
f4dba: 75 14 jne f4dd0 <readv@@GLIBC_2.2.5+0x20>
f4dbc: b8 13 00 00 00 mov $0x13,%eax
f4dc1: 0f 05 syscall
...
There's some code for pthread cancellation, but no zeroing of the upper half of
the fd and vlen arguments, which are in %edi and %edx respectively. But the
glibc function prototype uses 'int' for them, not 'unsigned long'
'ssize_t readv(int fd, const struct iovec *iov, int iovcnt);'.
So the high halves of the fd and iovcnt registers can contain garbage. Or at
least that's what gcc (9.3.0) and clang (9.0.1) assume; they both compile the
following
void g(unsigned int x);
void f(unsigned long x)
{
g(x);
}
into f() making a tail call to g(), without zeroing the top half of %rdi.
Also note the following program succeeds on Linux 5.9 on x86_64. On kernels
that have this bug, it should fail. (I couldn't get it to actually fail, so it
must depend on the compiler and/or the kernel config...)
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
int main()
{
int fd = open("/dev/zero", O_RDONLY);
char buf[1000];
struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
long ret;
ret = syscall(__NR_readv, fd, &iov, 0x100000001);
if (ret < 0)
perror("readv failed");
else
printf("read %ld bytes\n", ret);
}
I think the right fix is to change the readv() (and writev(), etc.) syscalls to
take 'unsigned int' rather than 'unsigned long', as that is what the users are
assuming...
- Eric
^ permalink raw reply
* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Joe Perches @ 2020-10-22 20:45 UTC (permalink / raw)
To: Nick Desaulniers
Cc: clang-built-linux, linuxppc-dev, Linus Torvalds, LKML,
Miguel Ojeda
In-Reply-To: <CAKwvOdmUPA9XupXwYHy_qT7P+LrUc+wseT79K_oqw=3y6bwLfg@mail.gmail.com>
On Thu, 2020-10-22 at 13:42 -0700, Nick Desaulniers wrote:
> .On Wed, Oct 21, 2020 at 7:36 PM Joe Perches <joe@perches.com> wrote:
> > Use a more generic form for __section that requires quotes to avoid
> > complications with clang and gcc differences.
[]
> > a quick test of x86_64 and s390 would be good.
x86_64 was compiled here.
I believe the robot tested the others.
^ permalink raw reply
* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Nick Desaulniers @ 2020-10-22 20:42 UTC (permalink / raw)
To: Joe Perches
Cc: clang-built-linux, linuxppc-dev, Linus Torvalds, LKML,
Miguel Ojeda
In-Reply-To: <fe8abcc88cff676ead8ee48db1e993e63b0611c7.1603327264.git.joe@perches.com>
.On Wed, Oct 21, 2020 at 7:36 PM Joe Perches <joe@perches.com> wrote:
>
> Use a more generic form for __section that requires quotes to avoid
> complications with clang and gcc differences.
>
> Remove the quote operator # from compiler_attributes.h __section macro.
>
> Convert all unquoted __section(foo) uses to quoted __section("foo").
> Also convert __attribute__((section("foo"))) uses to __section("foo")
> even if the __attribute__ has multiple list entry forms.
>
> Conversion done using a script:
>
> Link: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl
>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>
> This conversion was previously submitted to -next last month
> https://lore.kernel.org/lkml/46f69161e60b802488ba8c8f3f8bbf922aa3b49b.camel@perches.com/
>
> Nick Desaulniers found a defect in the conversion of 2 boot files
> for powerpc, but no other defect was found for any other arch.
Untested, but:
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Good job handling the trickier cases when the attribute was mixed with
others, and printing it in scripts/mod/modpost.c.
The only cases that *might* be similar to PPC are:
> arch/s390/boot/startup.c | 2 +-
> arch/x86/boot/compressed/pgtable_64.c | 2 +-
> arch/x86/purgatory/purgatory.c | 4 ++--
So a quick test of x86_64 and s390 would be good.
Thanks for the patch.
>
> The script was corrected to avoid converting these 2 files.
>
> There is no difference between the script output when run on today's -next
> and Linus' tree through commit f804b3159482, so this should be reasonable to
> apply now.
--
Thanks,
~Nick Desaulniers
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox