LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Nick Desaulniers @ 2020-10-22 17:00 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022164040.GV20115@casper.infradead.org>

On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > Wait...
> > readv(2) defines:
> >       ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
>
> It doesn't really matter what the manpage says.  What does the AOSP
> libc header say?

Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38

Theoretically someone could bypass libc to make a system call, right?

>
> > But the syscall is defined as:
> >
> > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> >                 unsigned long, vlen)
> > {
> >         return do_readv(fd, vec, vlen, 0);
> > }
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 16:50 UTC (permalink / raw)
  To: 'Matthew Wilcox'
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	'Christoph Hellwig', linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH, Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022164040.GV20115@casper.infradead.org>

From: Matthew Wilcox <willy@infradead.org>
> Sent: 22 October 2020 17:41
> 
> On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > Wait...
> > readv(2) defines:
> > 	ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> 
> It doesn't really matter what the manpage says.  What does the AOSP
> libc header say?

The only copy I can find is:

/usr/include/x86_64-linux-gnu/sys/uio.h:extern ssize_t readv (int __fd, const struct iovec *__iovec, int __count)
/usr/include/x86_64-linux-gnu/sys/uio.h-  __wur;

and not surprisingly agrees.
POSIX and/or TOG will (more or less) mandate the prototype.

> > But the syscall is defined as:
> >
> > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> >                 unsigned long, vlen)
> > {
> >         return do_readv(fd, vec, vlen, 0);
> > }

I wonder when the high bits of 'fd' get zapped?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Matthew Wilcox @ 2020-10-22 16:40 UTC (permalink / raw)
  To: David Laight
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	'Christoph Hellwig', linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH, Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <8f1fff0c358b4b669d51cc80098dbba1@AcuMS.aculab.com>

On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> Wait...
> readv(2) defines:
> 	ssize_t readv(int fd, const struct iovec *iov, int iovcnt);

It doesn't really matter what the manpage says.  What does the AOSP
libc header say?

> But the syscall is defined as:
> 
> SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
>                 unsigned long, vlen)
> {
>         return do_readv(fd, vec, vlen, 0);
> }


^ permalink raw reply

* Re: [PATCH] powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
From: Segher Boessenkool @ 2020-10-22 16:37 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: jakub, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <348c2d3f19ffcff8abe50d52513f989c4581d000.1603375524.git.christophe.leroy@csgroup.eu>

On Thu, Oct 22, 2020 at 02:05:46PM +0000, Christophe Leroy wrote:
> fls() and fls64() are using __builtin_ctz() and _builtin_ctzll().
> On powerpc, those builtins trivially use ctlzw and ctlzd power
> instructions.
> 
> Allthough those instructions provide the expected result with
> input argument 0, __builtin_ctz() and __builtin_ctzll() are
> documented as undefined for value 0.

> When the input of fls(x) is a constant, just check x for nullity and
> return either 0 or __builtin_clz(x). Otherwise, use cntlzw instruction
> directly.

That looks good :-)

Acked-by: Segher Boessenkool <segher@kernel.crashing.org>


Segher

^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 16:35 UTC (permalink / raw)
  To: 'Christoph Hellwig', David Hildenbrand
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH, Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022132342.GB8781@lst.de>

From: Christoph Hellwig
> Sent: 22 October 2020 14:24
> 
> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > My thinking: if the compiler that calls import_iovec() has garbage in
> > the upper 32 bit
> >
> > a) gcc will zero it out and not rely on it being zero.
> > b) clang will not zero it out, assuming it is zero.
> >
> > But
> >
> > a) will zero it out when calling the !inlined variant
> > b) clang will zero it out when calling the !inlined variant
> >
> > When inlining, b) strikes. We access garbage. That would mean that we
> > have calling code that's not generated by clang/gcc IIUC.
> 
> Most callchains of import_iovec start with the assembly syscall wrappers.

Wait...
readv(2) defines:
	ssize_t readv(int fd, const struct iovec *iov, int iovcnt);

But the syscall is defined as:

SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
                unsigned long, vlen)
{
        return do_readv(fd, vec, vlen, 0);
}

I'm guessing that nothing actually masks the high bits that come
from an application that is compiled with clang?

The vlen is 'unsigned long' through the first few calls.
So unless there is a non-inlined function than takes vlen
as 'int' the high garbage bits from userspace are kept.

Which makes it a bug in the kernel C syscall wrappers.
They need to explicitly mask the high bits of 32bit
arguments on arm64 but not x86-64.

What does the ARM EABI say about register parameters?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 16:15 UTC (permalink / raw)
  To: 'Greg KH', Arnd Bergmann
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022144021.GA1969554@kroah.com>

From: Greg KH
> Sent: 22 October 2020 15:40
> 
> On Thu, Oct 22, 2020 at 04:28:20PM +0200, Arnd Bergmann wrote:
...
> > Can you attach the iov_iter.s files from the broken build, plus the
> > one with 'noinline' for comparison? Maybe something can be seen
> > in there.
> 
> I don't know how to extract the .s files easily from the AOSP build
> system, I'll look into that.  I'm also now testing by downgrading to an
> older version of clang (10 instead of 11), to see if that matters at all
> or not...

Back from a day out - after it stopped raining.
Trying to use up leave before the end of the year.

Can you use objdump on the kernel binary itself and cut out
the single function?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Christoph Hellwig @ 2020-10-22 13:23 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <5fd6003b-55a6-2c3c-9a28-8fd3a575ca78@redhat.com>

On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> My thinking: if the compiler that calls import_iovec() has garbage in
> the upper 32 bit
> 
> a) gcc will zero it out and not rely on it being zero.
> b) clang will not zero it out, assuming it is zero.
> 
> But
> 
> a) will zero it out when calling the !inlined variant
> b) clang will zero it out when calling the !inlined variant
> 
> When inlining, b) strikes. We access garbage. That would mean that we
> have calling code that's not generated by clang/gcc IIUC.

Most callchains of import_iovec start with the assembly syscall wrappers.

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Greg KH @ 2020-10-22 14:40 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAK8P3a1B7OVdyzW0-97JwzZiwp0D0fnSfyete16QTvPp_1m07A@mail.gmail.com>

On Thu, Oct 22, 2020 at 04:28:20PM +0200, Arnd Bergmann wrote:
> On Thu, Oct 22, 2020 at 3:50 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Thu, Oct 22, 2020 at 02:57:59PM +0200, Greg KH wrote:
> > > On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> 
> > > >  struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > > > -               unsigned long nr_segs, unsigned long fast_segs,
> > > > +               unsigned nr_segs, unsigned fast_segs,
> > > >                 struct iovec *fast_iov, bool compat)
> > > >  {
> > > >         struct iovec *iov = fast_iov;
> > > > @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> > > > iovec __user *uvec,
> > > >                  struct iov_iter *i, bool compat)
> > > >  {
> > > >         ssize_t total_len = 0;
> > > > -       unsigned long seg;
> > > > +       unsigned seg;
> > > >         struct iovec *iov;
> > > >
> > > >         iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> > > >
> > >
> > > Ah, I tested the other way around, making everything "unsigned long"
> > > instead.  Will go try this too, as other tests are still running...
> >
> > Ok, no, this didn't work either.
> >
> > Nick, I think I need some compiler help here.  Any ideas?
> 
> I don't think the patch above would reliably clear the upper bits if they
> contain garbage.
> 
> If the integer extension is the problem, the way I'd try it is to make the
> function take an 'unsigned long' and then explictly mask the upper
> bits with
> 
>      seg = lower_32_bits(seg);
> 
> Can you attach the iov_iter.s files from the broken build, plus the
> one with 'noinline' for comparison? Maybe something can be seen
> in there.

I don't know how to extract the .s files easily from the AOSP build
system, I'll look into that.  I'm also now testing by downgrading to an
older version of clang (10 instead of 11), to see if that matters at all
or not...

thanks,

greg k-h

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Arnd Bergmann @ 2020-10-22 14:28 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022135036.GA1787470@kroah.com>

On Thu, Oct 22, 2020 at 3:50 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> On Thu, Oct 22, 2020 at 02:57:59PM +0200, Greg KH wrote:
> > On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:

> > >  struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > > -               unsigned long nr_segs, unsigned long fast_segs,
> > > +               unsigned nr_segs, unsigned fast_segs,
> > >                 struct iovec *fast_iov, bool compat)
> > >  {
> > >         struct iovec *iov = fast_iov;
> > > @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> > > iovec __user *uvec,
> > >                  struct iov_iter *i, bool compat)
> > >  {
> > >         ssize_t total_len = 0;
> > > -       unsigned long seg;
> > > +       unsigned seg;
> > >         struct iovec *iov;
> > >
> > >         iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> > >
> >
> > Ah, I tested the other way around, making everything "unsigned long"
> > instead.  Will go try this too, as other tests are still running...
>
> Ok, no, this didn't work either.
>
> Nick, I think I need some compiler help here.  Any ideas?

I don't think the patch above would reliably clear the upper bits if they
contain garbage.

If the integer extension is the problem, the way I'd try it is to make the
function take an 'unsigned long' and then explictly mask the upper
bits with

     seg = lower_32_bits(seg);

Can you attach the iov_iter.s files from the broken build, plus the
one with 'noinline' for comparison? Maybe something can be seen
in there.

       Arnd

^ permalink raw reply

* [PATCH] powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
From: Christophe Leroy @ 2020-10-22 14:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, jakub,
	segher
  Cc: linuxppc-dev, linux-kernel

fls() and fls64() are using __builtin_ctz() and _builtin_ctzll().
On powerpc, those builtins trivially use ctlzw and ctlzd power
instructions.

Allthough those instructions provide the expected result with
input argument 0, __builtin_ctz() and __builtin_ctzll() are
documented as undefined for value 0.

The easiest fix would be to use fls() and fls64() functions
defined in include/asm-generic/bitops/builtin-fls.h and
include/asm-generic/bitops/fls64.h, but GCC output is not optimal:

00000388 <testfls>:
 388:   2c 03 00 00     cmpwi   r3,0
 38c:   41 82 00 10     beq     39c <testfls+0x14>
 390:   7c 63 00 34     cntlzw  r3,r3
 394:   20 63 00 20     subfic  r3,r3,32
 398:   4e 80 00 20     blr
 39c:   38 60 00 00     li      r3,0
 3a0:   4e 80 00 20     blr

000003b0 <testfls64>:
 3b0:   2c 03 00 00     cmpwi   r3,0
 3b4:   40 82 00 1c     bne     3d0 <testfls64+0x20>
 3b8:   2f 84 00 00     cmpwi   cr7,r4,0
 3bc:   38 60 00 00     li      r3,0
 3c0:   4d 9e 00 20     beqlr   cr7
 3c4:   7c 83 00 34     cntlzw  r3,r4
 3c8:   20 63 00 20     subfic  r3,r3,32
 3cc:   4e 80 00 20     blr
 3d0:   7c 63 00 34     cntlzw  r3,r3
 3d4:   20 63 00 40     subfic  r3,r3,64
 3d8:   4e 80 00 20     blr

When the input of fls(x) is a constant, just check x for nullity and
return either 0 or __builtin_clz(x). Otherwise, use cntlzw instruction
directly.

For fls64() on PPC64, do the same but with __builtin_clzll() and
cntlzd instruction. On PPC32, lets take the generic fls64() which
will use our fls(). The result is as expected:

00000388 <testfls>:
 388:   7c 63 00 34     cntlzw  r3,r3
 38c:   20 63 00 20     subfic  r3,r3,32
 390:   4e 80 00 20     blr

000003a0 <testfls64>:
 3a0:   2c 03 00 00     cmpwi   r3,0
 3a4:   40 82 00 10     bne     3b4 <testfls64+0x14>
 3a8:   7c 83 00 34     cntlzw  r3,r4
 3ac:   20 63 00 20     subfic  r3,r3,32
 3b0:   4e 80 00 20     blr
 3b4:   7c 63 00 34     cntlzw  r3,r3
 3b8:   20 63 00 40     subfic  r3,r3,64
 3bc:   4e 80 00 20     blr

Fixes: 2fcff790dcb4 ("powerpc: Use builtin functions for fls()/__fls()/fls64()")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/bitops.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h
index 4a4d3afd5340..299ab33505a6 100644
--- a/arch/powerpc/include/asm/bitops.h
+++ b/arch/powerpc/include/asm/bitops.h
@@ -216,15 +216,34 @@ static inline void arch___clear_bit_unlock(int nr, volatile unsigned long *addr)
  */
 static inline int fls(unsigned int x)
 {
-	return 32 - __builtin_clz(x);
+	int lz;
+
+	if (__builtin_constant_p(x))
+		return x ? 32 - __builtin_clz(x) : 0;
+	asm("cntlzw %0,%1" : "=r" (lz) : "r" (x));
+	return 32 - lz;
 }
 
 #include <asm-generic/bitops/builtin-__fls.h>
 
+/*
+ * 64-bit can do this using one cntlzd (count leading zeroes doubleword)
+ * instruction; for 32-bit we use the generic version, which does two
+ * 32-bit fls calls.
+ */
+#ifdef CONFIG_PPC64
 static inline int fls64(__u64 x)
 {
-	return 64 - __builtin_clzll(x);
+	int lz;
+
+	if (__builtin_constant_p(x))
+		return x ? 64 - __builtin_clzll(x) : 0;
+	asm("cntlzd %0,%1" : "=r" (lz) : "r" (x));
+	return 64 - lz;
 }
+#else
+#include <asm-generic/bitops/fls64.h>
+#endif
 
 #ifdef CONFIG_PPC64
 unsigned int __arch_hweight8(unsigned int w);
-- 
2.25.0


^ permalink raw reply related

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Greg KH @ 2020-10-22 13:50 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022125759.GA1685526@kroah.com>

On Thu, Oct 22, 2020 at 02:57:59PM +0200, Greg KH wrote:
> On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> > On 22.10.20 14:18, Greg KH wrote:
> > > On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> > >> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > >>> On 22.10.20 11:32, David Laight wrote:
> > >>>> From: David Hildenbrand
> > >>>>> Sent: 22 October 2020 10:25
> > >>>> ...
> > >>>>> ... especially because I recall that clang and gcc behave slightly
> > >>>>> differently:
> > >>>>>
> > >>>>> https://github.com/hjl-tools/x86-psABI/issues/2
> > >>>>>
> > >>>>> "Function args are different: narrow types are sign or zero extended to
> > >>>>> 32 bits, depending on their type. clang depends on this for incoming
> > >>>>> args, but gcc doesn't make that assumption. But both compilers do it
> > >>>>> when calling, so gcc code can call clang code.
> > >>>>
> > >>>> It really is best to use 'int' (or even 'long') for all numeric
> > >>>> arguments (and results) regardless of the domain of the value.
> > >>>>
> > >>>> Related, I've always worried about 'bool'....
> > >>>>
> > >>>>> The upper 32 bits of registers are always undefined garbage for types
> > >>>>> smaller than 64 bits."
> > >>>>
> > >>>> On x86-64 the high bits are zeroed by all 32bit loads.
> > >>>
> > >>> Yeah, but does not help here.
> > >>>
> > >>>
> > >>> My thinking: if the compiler that calls import_iovec() has garbage in
> > >>> the upper 32 bit
> > >>>
> > >>> a) gcc will zero it out and not rely on it being zero.
> > >>> b) clang will not zero it out, assuming it is zero.
> > >>>
> > >>> But
> > >>>
> > >>> a) will zero it out when calling the !inlined variant
> > >>> b) clang will zero it out when calling the !inlined variant
> > >>>
> > >>> When inlining, b) strikes. We access garbage. That would mean that we
> > >>> have calling code that's not generated by clang/gcc IIUC.
> > >>>
> > >>> We can test easily by changing the parameters instead of adding an "inline".
> > >>
> > >> Let me try that as well, as I seem to have a good reproducer, but it
> > >> takes a while to run...
> > > 
> > > Ok, that didn't work.
> > > 
> > > And I can't seem to "fix" this by adding noinline to patches further
> > > along in the patch series (because this commit's function is no longer
> > > present due to later patches.)
> > 
> > We might have the same issues with iovec_from_user() and friends now.
> > 
> > > 
> > > Will keep digging...
> > > 
> > > greg k-h
> > > 
> > 
> > 
> > Might be worth to give this a try, just to see if it's related to
> > garbage in upper 32 bit and the way clang is handling it (might be a BUG
> > in clang, though):
> > 
> > 
> > diff --git a/include/linux/uio.h b/include/linux/uio.h
> > index 72d88566694e..7527298c6b56 100644
> > --- a/include/linux/uio.h
> > +++ b/include/linux/uio.h
> > @@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
> > size_t bytes, void *hashp,
> >                 struct iov_iter *i);
> > 
> >  struct iovec *iovec_from_user(const struct iovec __user *uvector,
> > -               unsigned long nr_segs, unsigned long fast_segs,
> > +               unsigned nr_segs, unsigned fast_segs,
> >                 struct iovec *fast_iov, bool compat);
> >  ssize_t import_iovec(int type, const struct iovec __user *uvec,
> >                  unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
> > diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> > index 1635111c5bd2..58417f1916dc 100644
> > --- a/lib/iov_iter.c
> > +++ b/lib/iov_iter.c
> > @@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
> > iov_iter *old, gfp_t flags)
> >  EXPORT_SYMBOL(dup_iter);
> > 
> >  static int copy_compat_iovec_from_user(struct iovec *iov,
> > -               const struct iovec __user *uvec, unsigned long nr_segs)
> > +               const struct iovec __user *uvec, unsigned nr_segs)
> >  {
> >         const struct compat_iovec __user *uiov =
> >                 (const struct compat_iovec __user *)uvec;
> > @@ -1684,7 +1684,7 @@ static int copy_compat_iovec_from_user(struct
> > iovec *iov,
> >  }
> > 
> >  static int copy_iovec_from_user(struct iovec *iov,
> > -               const struct iovec __user *uvec, unsigned long nr_segs)
> > +               const struct iovec __user *uvec, unsigned nr_segs)
> >  {
> >         unsigned long seg;
> > 
> > @@ -1699,7 +1699,7 @@ static int copy_iovec_from_user(struct iovec *iov,
> >  }
> > 
> >  struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > -               unsigned long nr_segs, unsigned long fast_segs,
> > +               unsigned nr_segs, unsigned fast_segs,
> >                 struct iovec *fast_iov, bool compat)
> >  {
> >         struct iovec *iov = fast_iov;
> > @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> > iovec __user *uvec,
> >                  struct iov_iter *i, bool compat)
> >  {
> >         ssize_t total_len = 0;
> > -       unsigned long seg;
> > +       unsigned seg;
> >         struct iovec *iov;
> > 
> >         iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> > 
> 
> Ah, I tested the other way around, making everything "unsigned long"
> instead.  Will go try this too, as other tests are still running...

Ok, no, this didn't work either.

Nick, I think I need some compiler help here.  Any ideas?

thanks,

greg k-h

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Greg KH @ 2020-10-22 12:57 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <98d9df88-b7ef-fdfb-7d90-2fa7a9d7bab5@redhat.com>

On Thu, Oct 22, 2020 at 02:42:24PM +0200, David Hildenbrand wrote:
> On 22.10.20 14:18, Greg KH wrote:
> > On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> >> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> >>> On 22.10.20 11:32, David Laight wrote:
> >>>> From: David Hildenbrand
> >>>>> Sent: 22 October 2020 10:25
> >>>> ...
> >>>>> ... especially because I recall that clang and gcc behave slightly
> >>>>> differently:
> >>>>>
> >>>>> https://github.com/hjl-tools/x86-psABI/issues/2
> >>>>>
> >>>>> "Function args are different: narrow types are sign or zero extended to
> >>>>> 32 bits, depending on their type. clang depends on this for incoming
> >>>>> args, but gcc doesn't make that assumption. But both compilers do it
> >>>>> when calling, so gcc code can call clang code.
> >>>>
> >>>> It really is best to use 'int' (or even 'long') for all numeric
> >>>> arguments (and results) regardless of the domain of the value.
> >>>>
> >>>> Related, I've always worried about 'bool'....
> >>>>
> >>>>> The upper 32 bits of registers are always undefined garbage for types
> >>>>> smaller than 64 bits."
> >>>>
> >>>> On x86-64 the high bits are zeroed by all 32bit loads.
> >>>
> >>> Yeah, but does not help here.
> >>>
> >>>
> >>> My thinking: if the compiler that calls import_iovec() has garbage in
> >>> the upper 32 bit
> >>>
> >>> a) gcc will zero it out and not rely on it being zero.
> >>> b) clang will not zero it out, assuming it is zero.
> >>>
> >>> But
> >>>
> >>> a) will zero it out when calling the !inlined variant
> >>> b) clang will zero it out when calling the !inlined variant
> >>>
> >>> When inlining, b) strikes. We access garbage. That would mean that we
> >>> have calling code that's not generated by clang/gcc IIUC.
> >>>
> >>> We can test easily by changing the parameters instead of adding an "inline".
> >>
> >> Let me try that as well, as I seem to have a good reproducer, but it
> >> takes a while to run...
> > 
> > Ok, that didn't work.
> > 
> > And I can't seem to "fix" this by adding noinline to patches further
> > along in the patch series (because this commit's function is no longer
> > present due to later patches.)
> 
> We might have the same issues with iovec_from_user() and friends now.
> 
> > 
> > Will keep digging...
> > 
> > greg k-h
> > 
> 
> 
> Might be worth to give this a try, just to see if it's related to
> garbage in upper 32 bit and the way clang is handling it (might be a BUG
> in clang, though):
> 
> 
> diff --git a/include/linux/uio.h b/include/linux/uio.h
> index 72d88566694e..7527298c6b56 100644
> --- a/include/linux/uio.h
> +++ b/include/linux/uio.h
> @@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
> size_t bytes, void *hashp,
>                 struct iov_iter *i);
> 
>  struct iovec *iovec_from_user(const struct iovec __user *uvector,
> -               unsigned long nr_segs, unsigned long fast_segs,
> +               unsigned nr_segs, unsigned fast_segs,
>                 struct iovec *fast_iov, bool compat);
>  ssize_t import_iovec(int type, const struct iovec __user *uvec,
>                  unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 1635111c5bd2..58417f1916dc 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
> iov_iter *old, gfp_t flags)
>  EXPORT_SYMBOL(dup_iter);
> 
>  static int copy_compat_iovec_from_user(struct iovec *iov,
> -               const struct iovec __user *uvec, unsigned long nr_segs)
> +               const struct iovec __user *uvec, unsigned nr_segs)
>  {
>         const struct compat_iovec __user *uiov =
>                 (const struct compat_iovec __user *)uvec;
> @@ -1684,7 +1684,7 @@ static int copy_compat_iovec_from_user(struct
> iovec *iov,
>  }
> 
>  static int copy_iovec_from_user(struct iovec *iov,
> -               const struct iovec __user *uvec, unsigned long nr_segs)
> +               const struct iovec __user *uvec, unsigned nr_segs)
>  {
>         unsigned long seg;
> 
> @@ -1699,7 +1699,7 @@ static int copy_iovec_from_user(struct iovec *iov,
>  }
> 
>  struct iovec *iovec_from_user(const struct iovec __user *uvec,
> -               unsigned long nr_segs, unsigned long fast_segs,
> +               unsigned nr_segs, unsigned fast_segs,
>                 struct iovec *fast_iov, bool compat)
>  {
>         struct iovec *iov = fast_iov;
> @@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
> iovec __user *uvec,
>                  struct iov_iter *i, bool compat)
>  {
>         ssize_t total_len = 0;
> -       unsigned long seg;
> +       unsigned seg;
>         struct iovec *iov;
> 
>         iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
> 

Ah, I tested the other way around, making everything "unsigned long"
instead.  Will go try this too, as other tests are still running...

thanks,

greg k-h

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Hildenbrand @ 2020-10-22 12:42 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022121849.GA1664412@kroah.com>

On 22.10.20 14:18, Greg KH wrote:
> On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
>> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
>>> On 22.10.20 11:32, David Laight wrote:
>>>> From: David Hildenbrand
>>>>> Sent: 22 October 2020 10:25
>>>> ...
>>>>> ... especially because I recall that clang and gcc behave slightly
>>>>> differently:
>>>>>
>>>>> https://github.com/hjl-tools/x86-psABI/issues/2
>>>>>
>>>>> "Function args are different: narrow types are sign or zero extended to
>>>>> 32 bits, depending on their type. clang depends on this for incoming
>>>>> args, but gcc doesn't make that assumption. But both compilers do it
>>>>> when calling, so gcc code can call clang code.
>>>>
>>>> It really is best to use 'int' (or even 'long') for all numeric
>>>> arguments (and results) regardless of the domain of the value.
>>>>
>>>> Related, I've always worried about 'bool'....
>>>>
>>>>> The upper 32 bits of registers are always undefined garbage for types
>>>>> smaller than 64 bits."
>>>>
>>>> On x86-64 the high bits are zeroed by all 32bit loads.
>>>
>>> Yeah, but does not help here.
>>>
>>>
>>> My thinking: if the compiler that calls import_iovec() has garbage in
>>> the upper 32 bit
>>>
>>> a) gcc will zero it out and not rely on it being zero.
>>> b) clang will not zero it out, assuming it is zero.
>>>
>>> But
>>>
>>> a) will zero it out when calling the !inlined variant
>>> b) clang will zero it out when calling the !inlined variant
>>>
>>> When inlining, b) strikes. We access garbage. That would mean that we
>>> have calling code that's not generated by clang/gcc IIUC.
>>>
>>> We can test easily by changing the parameters instead of adding an "inline".
>>
>> Let me try that as well, as I seem to have a good reproducer, but it
>> takes a while to run...
> 
> Ok, that didn't work.
> 
> And I can't seem to "fix" this by adding noinline to patches further
> along in the patch series (because this commit's function is no longer
> present due to later patches.)

We might have the same issues with iovec_from_user() and friends now.

> 
> Will keep digging...
> 
> greg k-h
> 


Might be worth to give this a try, just to see if it's related to
garbage in upper 32 bit and the way clang is handling it (might be a BUG
in clang, though):


diff --git a/include/linux/uio.h b/include/linux/uio.h
index 72d88566694e..7527298c6b56 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -267,7 +267,7 @@ size_t hash_and_copy_to_iter(const void *addr,
size_t bytes, void *hashp,
                struct iov_iter *i);

 struct iovec *iovec_from_user(const struct iovec __user *uvector,
-               unsigned long nr_segs, unsigned long fast_segs,
+               unsigned nr_segs, unsigned fast_segs,
                struct iovec *fast_iov, bool compat);
 ssize_t import_iovec(int type, const struct iovec __user *uvec,
                 unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 1635111c5bd2..58417f1916dc 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1652,7 +1652,7 @@ const void *dup_iter(struct iov_iter *new, struct
iov_iter *old, gfp_t flags)
 EXPORT_SYMBOL(dup_iter);

 static int copy_compat_iovec_from_user(struct iovec *iov,
-               const struct iovec __user *uvec, unsigned long nr_segs)
+               const struct iovec __user *uvec, unsigned nr_segs)
 {
        const struct compat_iovec __user *uiov =
                (const struct compat_iovec __user *)uvec;
@@ -1684,7 +1684,7 @@ static int copy_compat_iovec_from_user(struct
iovec *iov,
 }

 static int copy_iovec_from_user(struct iovec *iov,
-               const struct iovec __user *uvec, unsigned long nr_segs)
+               const struct iovec __user *uvec, unsigned nr_segs)
 {
        unsigned long seg;

@@ -1699,7 +1699,7 @@ static int copy_iovec_from_user(struct iovec *iov,
 }

 struct iovec *iovec_from_user(const struct iovec __user *uvec,
-               unsigned long nr_segs, unsigned long fast_segs,
+               unsigned nr_segs, unsigned fast_segs,
                struct iovec *fast_iov, bool compat)
 {
        struct iovec *iov = fast_iov;
@@ -1738,7 +1738,7 @@ ssize_t __import_iovec(int type, const struct
iovec __user *uvec,
                 struct iov_iter *i, bool compat)
 {
        ssize_t total_len = 0;
-       unsigned long seg;
+       unsigned seg;
        struct iovec *iov;

        iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);


-- 
Thanks,

David / dhildenb


^ permalink raw reply related

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Greg KH @ 2020-10-22 12:18 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022104805.GA1503673@kroah.com>

On Thu, Oct 22, 2020 at 12:48:05PM +0200, Greg KH wrote:
> On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > On 22.10.20 11:32, David Laight wrote:
> > > From: David Hildenbrand
> > >> Sent: 22 October 2020 10:25
> > > ...
> > >> ... especially because I recall that clang and gcc behave slightly
> > >> differently:
> > >>
> > >> https://github.com/hjl-tools/x86-psABI/issues/2
> > >>
> > >> "Function args are different: narrow types are sign or zero extended to
> > >> 32 bits, depending on their type. clang depends on this for incoming
> > >> args, but gcc doesn't make that assumption. But both compilers do it
> > >> when calling, so gcc code can call clang code.
> > > 
> > > It really is best to use 'int' (or even 'long') for all numeric
> > > arguments (and results) regardless of the domain of the value.
> > > 
> > > Related, I've always worried about 'bool'....
> > > 
> > >> The upper 32 bits of registers are always undefined garbage for types
> > >> smaller than 64 bits."
> > > 
> > > On x86-64 the high bits are zeroed by all 32bit loads.
> > 
> > Yeah, but does not help here.
> > 
> > 
> > My thinking: if the compiler that calls import_iovec() has garbage in
> > the upper 32 bit
> > 
> > a) gcc will zero it out and not rely on it being zero.
> > b) clang will not zero it out, assuming it is zero.
> > 
> > But
> > 
> > a) will zero it out when calling the !inlined variant
> > b) clang will zero it out when calling the !inlined variant
> > 
> > When inlining, b) strikes. We access garbage. That would mean that we
> > have calling code that's not generated by clang/gcc IIUC.
> > 
> > We can test easily by changing the parameters instead of adding an "inline".
> 
> Let me try that as well, as I seem to have a good reproducer, but it
> takes a while to run...

Ok, that didn't work.

And I can't seem to "fix" this by adding noinline to patches further
along in the patch series (because this commit's function is no longer
present due to later patches.)

Will keep digging...

greg k-h

^ permalink raw reply

* Re: [PATCH 1/2] powerpc: Introduce POWER10_DD1 feature
From: Michael Ellerman @ 2020-10-22 11:35 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: christophe.leroy, ravi.bangoria, mikey, jniethe5, npiggin, maddy,
	paulus, naveen.n.rao, linuxppc-dev
In-Reply-To: <20201020054454.194343-1-ravi.bangoria@linux.ibm.com>

Ravi Bangoria <ravi.bangoria@linux.ibm.com> writes:
> POWER10_DD1 feature flag will be needed while adding
> conditional code that applies only for Power10 DD1.
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/cputable.h | 8 ++++++--
>  arch/powerpc/kernel/dt_cpu_ftrs.c   | 3 +++
>  arch/powerpc/kernel/prom.c          | 9 +++++++++
>  3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> index 93bc70d4c9a1..d486f56c0d33 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -216,6 +216,7 @@ static inline void cpu_feature_keys_init(void) { }
>  #define CPU_FTR_P9_RADIX_PREFETCH_BUG	LONG_ASM_CONST(0x0002000000000000)
>  #define CPU_FTR_ARCH_31			LONG_ASM_CONST(0x0004000000000000)
>  #define CPU_FTR_DAWR1			LONG_ASM_CONST(0x0008000000000000)
> +#define CPU_FTR_POWER10_DD1		LONG_ASM_CONST(0x0010000000000000)
>  
>  #ifndef __ASSEMBLY__
>  
> @@ -479,6 +480,7 @@ static inline void cpu_feature_keys_init(void) { }
>  	    CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
>  	    CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
>  	    CPU_FTR_DAWR | CPU_FTR_DAWR1)
> +#define CPU_FTRS_POWER10_DD1	(CPU_FTRS_POWER10 | CPU_FTR_POWER10_DD1)
>  #define CPU_FTRS_CELL	(CPU_FTR_LWSYNC | \
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> @@ -497,14 +499,16 @@ static inline void cpu_feature_keys_init(void) { }
>  #define CPU_FTRS_POSSIBLE	\
>  	    (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
>  	     CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
> -	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
> +	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
> +	     CPU_FTRS_POWER10_DD1)
>  #else
>  #define CPU_FTRS_POSSIBLE	\
>  	    (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
>  	     CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
>  	     CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
>  	     CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
> -	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
> +	     CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
> +	     CPU_FTRS_POWER10_DD1)
>  #endif /* CONFIG_CPU_LITTLE_ENDIAN */
>  #endif
>  #else
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index 1098863e17ee..b2327f2967ff 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -811,6 +811,9 @@ static __init void cpufeatures_cpu_quirks(void)
>  	}
>  
>  	update_tlbie_feature_flag(version);
> +
> +	if ((version & 0xffffffff) == 0x00800100)
> +		cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
>  }
>  
>  static void __init cpufeatures_setup_finished(void)
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index c1545f22c077..c778c81284f7 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -305,6 +305,14 @@ static void __init check_cpu_feature_properties(unsigned long node)
>  	}
>  }
>  
> +static void __init fixup_cpu_features(void)
> +{
> +	unsigned long version = mfspr(SPRN_PVR);
> +
> +	if ((version & 0xffffffff) == 0x00800100)
> +		cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
> +}
> +
>  static int __init early_init_dt_scan_cpus(unsigned long node,
>  					  const char *uname, int depth,
>  					  void *data)
> @@ -378,6 +386,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
>  
>  		check_cpu_feature_properties(node);
>  		check_cpu_pa_features(node);
> +		fixup_cpu_features();
>  	}

This is not the way we normally do CPU features.

In the past we have always added a raw entry in cputable.c, see eg. the
Power9 DD 2.0, 2.1 entries.

Doing it here is not really safe, if you're running with an architected
PVR (via cpu-version property), you can't set the DD1 feature, because
you might be migrated to a future CPU that doesn't have the DD1 quirks.

cheers

^ permalink raw reply

* Re: [PATCH] powerpc: Send SIGBUS from machine_check
From: Joakim Tjernlund @ 2020-10-22 11:19 UTC (permalink / raw)
  To: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201001170557.10915-1-joakim.tjernlund@infinera.com>

ping

Also Cc: stable@vger.kernel.org

On Thu, 2020-10-01 at 19:05 +0200, Joakim Tjernlund wrote:
> Embedded PPC CPU should send SIGBUS to user space when applicable.
> 
> Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
> ---
>  arch/powerpc/kernel/traps.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0381242920d9..12715d24141c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
>  		       reason & MCSR_MEA ? "Effective" : "Physical", addr);
>  	}
>  
> 
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		recoverable = 1;
> +	}
> +
>  silent_out:
>  	mtspr(SPRN_MCSR, mcsr);
>  	return mfspr(SPRN_MCSR) == 0 && recoverable;
> @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
>  	if (reason & MCSR_BUS_RPERR)
>  		printk("Bus - Read Parity Error\n");
>  
> 
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}
>  	return 0;
>  }
>  
> 
> @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
>  	if (reason & MCSR_BUS_WRERR)
>  		printk("Bus - Write Bus Error on buffered store or cache line push\n");
>  
> 
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}
>  	return 0;
>  }
>  #elif defined(CONFIG_PPC32)
> @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
>  	default:
>  		printk("Unknown values in msr\n");
>  	}
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}
>  	return 0;
>  }
>  #endif /* everything else */


^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Greg KH @ 2020-10-22 10:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <5fd6003b-55a6-2c3c-9a28-8fd3a575ca78@redhat.com>

On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> On 22.10.20 11:32, David Laight wrote:
> > From: David Hildenbrand
> >> Sent: 22 October 2020 10:25
> > ...
> >> ... especially because I recall that clang and gcc behave slightly
> >> differently:
> >>
> >> https://github.com/hjl-tools/x86-psABI/issues/2
> >>
> >> "Function args are different: narrow types are sign or zero extended to
> >> 32 bits, depending on their type. clang depends on this for incoming
> >> args, but gcc doesn't make that assumption. But both compilers do it
> >> when calling, so gcc code can call clang code.
> > 
> > It really is best to use 'int' (or even 'long') for all numeric
> > arguments (and results) regardless of the domain of the value.
> > 
> > Related, I've always worried about 'bool'....
> > 
> >> The upper 32 bits of registers are always undefined garbage for types
> >> smaller than 64 bits."
> > 
> > On x86-64 the high bits are zeroed by all 32bit loads.
> 
> Yeah, but does not help here.
> 
> 
> My thinking: if the compiler that calls import_iovec() has garbage in
> the upper 32 bit
> 
> a) gcc will zero it out and not rely on it being zero.
> b) clang will not zero it out, assuming it is zero.
> 
> But
> 
> a) will zero it out when calling the !inlined variant
> b) clang will zero it out when calling the !inlined variant
> 
> When inlining, b) strikes. We access garbage. That would mean that we
> have calling code that's not generated by clang/gcc IIUC.
> 
> We can test easily by changing the parameters instead of adding an "inline".

Let me try that as well, as I seem to have a good reproducer, but it
takes a while to run...

greg k-h

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Hildenbrand @ 2020-10-22  9:36 UTC (permalink / raw)
  To: David Laight, Greg KH
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <bc0a091865f34700b9df332c6e9dcdfd@AcuMS.aculab.com>

On 22.10.20 11:32, David Laight wrote:
> From: David Hildenbrand
>> Sent: 22 October 2020 10:25
> ...
>> ... especially because I recall that clang and gcc behave slightly
>> differently:
>>
>> https://github.com/hjl-tools/x86-psABI/issues/2
>>
>> "Function args are different: narrow types are sign or zero extended to
>> 32 bits, depending on their type. clang depends on this for incoming
>> args, but gcc doesn't make that assumption. But both compilers do it
>> when calling, so gcc code can call clang code.
> 
> It really is best to use 'int' (or even 'long') for all numeric
> arguments (and results) regardless of the domain of the value.
> 
> Related, I've always worried about 'bool'....
> 
>> The upper 32 bits of registers are always undefined garbage for types
>> smaller than 64 bits."
> 
> On x86-64 the high bits are zeroed by all 32bit loads.

Yeah, but does not help here.


My thinking: if the compiler that calls import_iovec() has garbage in
the upper 32 bit

a) gcc will zero it out and not rely on it being zero.
b) clang will not zero it out, assuming it is zero.

But

a) will zero it out when calling the !inlined variant
b) clang will zero it out when calling the !inlined variant

When inlining, b) strikes. We access garbage. That would mean that we
have calling code that's not generated by clang/gcc IIUC.

We can test easily by changing the parameters instead of adding an "inline".

-- 
Thanks,

David / dhildenb


^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22  9:32 UTC (permalink / raw)
  To: 'David Hildenbrand', Greg KH
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <a1533569-948a-1d5b-e231-5531aa988047@redhat.com>

From: David Hildenbrand
> Sent: 22 October 2020 10:25
...
> ... especially because I recall that clang and gcc behave slightly
> differently:
> 
> https://github.com/hjl-tools/x86-psABI/issues/2
> 
> "Function args are different: narrow types are sign or zero extended to
> 32 bits, depending on their type. clang depends on this for incoming
> args, but gcc doesn't make that assumption. But both compilers do it
> when calling, so gcc code can call clang code.

It really is best to use 'int' (or even 'long') for all numeric
arguments (and results) regardless of the domain of the value.

Related, I've always worried about 'bool'....

> The upper 32 bits of registers are always undefined garbage for types
> smaller than 64 bits."

On x86-64 the high bits are zeroed by all 32bit loads.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Arnd Bergmann @ 2020-10-22  9:16 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, Android Kernel Team, linux-block, Al Viro,
	io-uring, Linux ARM, Jens Axboe, Parisc List, Networking,
	Nick Desaulniers, linux-kernel@vger.kernel.org, LSM List,
	David Laight, Linux FS-devel Mailing List, Andrew Morton,
	linuxppc-dev
In-Reply-To: <20201022082654.GA1477657@kroah.com>

On Thu, Oct 22, 2020 at 10:26 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> > On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> > > On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> > >
> > > I can't really figure out what the regression is, only that this commit
> > > triggers a "large Android system binary" from working properly.  There's
> > > no kernel log messages anywhere, and I don't have any way to strace the
> > > thing in the testing framework, so any hints that people can provide
> > > would be most appreciated.
> >
> > It's a pure move - modulo changed line breaks in the argument lists
> > the functions involved are identical before and after that (just checked
> > that directly, by checking out the trees before and after, extracting two
> > functions in question from fs/read_write.c and lib/iov_iter.c (before and
> > after, resp.) and checking the diff between those.
> >
> > How certain is your bisection?
>
> The bisection is very reproducable.
>
> But, this looks now to be a compiler bug.  I'm using the latest version
> of clang and if I put "noinline" at the front of the function,
> everything works.
>
> Nick, any ideas here as to who I should report this to?
>
> I'll work on a fixup patch for the Android kernel tree to see if I can
> work around it there, but others will hit this in Linus's tree sooner or
> later...

I see that Christoph rewrote the function again in bfdc59701d6d
("iov_iter: refactor rw_copy_check_uvector and import_iovec"),
do you know if the current mainline version is also affected?

Do you know if it happens across multiple architectures or might
be specific to either x86 or arm64?

https://bugs.llvm.org/ is obviously the place for reporting the
issue if it does turn out to be a bug in clang, but that requires
a specific thing going wrong in the output.

One idea I have for debugging it is to sprinkle the inlined
function with lots of barrier()s to prevent a lot of the optimizations.
If that solves the issue, you can bisect through those until you
find one barrier that makes the difference between working and
broken and then look at diff of the assembler output.

        Arnd

^ permalink raw reply

* [PATCH v3 3/3] powerpc: Fix update form addressing in inline assembly
From: Christophe Leroy @ 2020-10-22  9:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	mathieu.desnoyers
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <25e6fca46fda3e2a4298448edbf654f64756eee7.1603358942.git.christophe.leroy@csgroup.eu>

In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with update form addressing,
but the associated "<>" constraint is missing.

As mentioned in previous patch, this fails with gcc 4.9, so
"<>" can't be used directly.

Use UPD_CONSTR macro everywhere %Un modifier is used.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
---
v3: __set_pte_at() not impacted anymore (%U0 removed in previous patch)
v2: Build failure (circular inclusion) fixed by change in patch 1
---
 arch/powerpc/include/asm/atomic.h | 9 +++++----
 arch/powerpc/include/asm/io.h     | 4 ++--
 arch/powerpc/kvm/powerpc.c        | 4 ++--
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 8a55eb8cc97b..61c6e8b200e8 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -10,6 +10,7 @@
 #include <linux/types.h>
 #include <asm/cmpxchg.h>
 #include <asm/barrier.h>
+#include <asm/asm-const.h>
 
 /*
  * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
@@ -26,14 +27,14 @@ static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
 
-	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
 
 	return t;
 }
 
 static __inline__ void atomic_set(atomic_t *v, int i)
 {
-	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
 }
 
 #define ATOMIC_OP(op, asm_op)						\
@@ -316,14 +317,14 @@ static __inline__ s64 atomic64_read(const atomic64_t *v)
 {
 	s64 t;
 
-	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
 
 	return t;
 }
 
 static __inline__ void atomic64_set(atomic64_t *v, s64 i)
 {
-	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
 }
 
 #define ATOMIC64_OP(op, asm_op)						\
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index 58635960403c..87964dfb838e 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -122,7 +122,7 @@ static inline u##size name(const volatile u##size __iomem *addr)	\
 {									\
 	u##size ret;							\
 	__asm__ __volatile__("sync;"#insn"%U1%X1 %0,%1;twi 0,%0,0;isync"\
-		: "=r" (ret) : "m" (*addr) : "memory");			\
+		: "=r" (ret) : "m"UPD_CONSTR (*addr) : "memory");	\
 	return ret;							\
 }
 
@@ -130,7 +130,7 @@ static inline u##size name(const volatile u##size __iomem *addr)	\
 static inline void name(volatile u##size __iomem *addr, u##size val)	\
 {									\
 	__asm__ __volatile__("sync;"#insn"%U0%X0 %1,%0"			\
-		: "=m" (*addr) : "r" (val) : "memory");			\
+		: "=m"UPD_CONSTR (*addr) : "r" (val) : "memory");	\
 	mmiowb_set_pending();						\
 }
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 13999123b735..cf52d26f49cd 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1087,7 +1087,7 @@ static inline u64 sp_to_dp(u32 fprs)
 
 	preempt_disable();
 	enable_kernel_fp();
-	asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m" (fprd) : "m" (fprs)
+	asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m"UPD_CONSTR (fprd) : "m"UPD_CONSTR (fprs)
 	     : "fr0");
 	preempt_enable();
 	return fprd;
@@ -1099,7 +1099,7 @@ static inline u32 dp_to_sp(u64 fprd)
 
 	preempt_disable();
 	enable_kernel_fp();
-	asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m" (fprs) : "m" (fprd)
+	asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m"UPD_CONSTR (fprs) : "m"UPD_CONSTR (fprd)
 	     : "fr0");
 	preempt_enable();
 	return fprs;
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 2/3] powerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_at
From: Christophe Leroy @ 2020-10-22  9:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	mathieu.desnoyers
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <25e6fca46fda3e2a4298448edbf654f64756eee7.1603358942.git.christophe.leroy@csgroup.eu>

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

The placeholder for instruction selection should use the second
argument's operand, which is %1, not %0. This could generate incorrect
assembly code if the memory addressing of operand %0 is a different
form from that of operand %1.

Also remove the %Un placeholder because having %Un placeholders
for two operands which are based on the same local var (ptep) doesn't
make much sense. By the way, it doesn't change the current behaviour
because "<>" constraint is missing for the associated "=m".

Fixes: 9bf2b5cdc5fe ("powerpc: Fixes for CONFIG_PTE_64BIT for SMP support")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: <stable@vger.kernel.org> # v2.6.28+
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
[chleroy: revised commit log iaw segher's comments and removed %U0]
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v3: Remove %Un

v2: Changed commit log.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 4 ++--
 arch/powerpc/include/asm/nohash/pgtable.h    | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 36443cda8dcf..41d8bc6db303 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -522,9 +522,9 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 	if (pte_val(*ptep) & _PAGE_HASHPTE)
 		flush_hash_entry(mm, ptep, addr);
 	__asm__ __volatile__("\
-		stw%U0%X0 %2,%0\n\
+		stw%X0 %2,%0\n\
 		eieio\n\
-		stw%U0%X0 %L2,%1"
+		stw%X1 %L2,%1"
 	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
 	: "r" (pte) : "memory");
 
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 6277e7596ae5..ac75f4ab0dba 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -192,9 +192,9 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 	 */
 	if (IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_PTE_64BIT) && !percpu) {
 		__asm__ __volatile__("\
-			stw%U0%X0 %2,%0\n\
+			stw%X0 %2,%0\n\
 			eieio\n\
-			stw%U0%X0 %L2,%1"
+			stw%X1 %L2,%1"
 		: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
 		: "r" (pte) : "memory");
 		return;
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 1/3] powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
From: Christophe Leroy @ 2020-10-22  9:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	mathieu.desnoyers
  Cc: linuxppc-dev, linux-kernel

GCC 4.9 sometimes fails to build with "m<>" constraint in
inline assembly.

  CC      lib/iov_iter.o
In file included from ./arch/powerpc/include/asm/cmpxchg.h:6:0,
                 from ./arch/powerpc/include/asm/atomic.h:11,
                 from ./include/linux/atomic.h:7,
                 from ./include/linux/crypto.h:15,
                 from ./include/crypto/hash.h:11,
                 from lib/iov_iter.c:2:
lib/iov_iter.c: In function 'iovec_from_user.part.30':
./arch/powerpc/include/asm/uaccess.h:287:2: error: 'asm' operand has impossible constraints
  __asm__ __volatile__(    \
  ^
./include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
 # define unlikely(x) __builtin_expect(!!(x), 0)
                                          ^
./arch/powerpc/include/asm/uaccess.h:583:34: note: in expansion of macro 'unsafe_op_wrap'
 #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
                                  ^
./arch/powerpc/include/asm/uaccess.h:329:10: note: in expansion of macro '__get_user_asm'
  case 4: __get_user_asm(x, (u32 __user *)ptr, retval, "lwz"); break; \
          ^
./arch/powerpc/include/asm/uaccess.h:363:3: note: in expansion of macro '__get_user_size_allowed'
   __get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err); \
   ^
./arch/powerpc/include/asm/uaccess.h:100:2: note: in expansion of macro '__get_user_nocheck'
  __get_user_nocheck((x), (ptr), sizeof(*(ptr)), false)
  ^
./arch/powerpc/include/asm/uaccess.h:583:49: note: in expansion of macro '__get_user_allowed'
 #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
                                                 ^
lib/iov_iter.c:1663:3: note: in expansion of macro 'unsafe_get_user'
   unsafe_get_user(len, &uiov[i].iov_len, uaccess_end);
   ^
make[1]: *** [scripts/Makefile.build:283: lib/iov_iter.o] Error 1

Define a UPD_CONSTR macro that is "<>" by default and
only "" with GCC prior to GCC 5.

Fixes: fcf1f26895a4 ("powerpc/uaccess: Add pre-update addressing to __put_user_asm_goto()")
Fixes: 2f279eeb68b8 ("powerpc/uaccess: Add pre-update addressing to __get_user_asm() and __put_user_asm()")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
---
v2: Moved UPD_CONSTR to asm-const.h to avoid circular inclusion issues with patch 3.
---
 arch/powerpc/include/asm/asm-const.h | 13 +++++++++++++
 arch/powerpc/include/asm/uaccess.h   |  4 ++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-const.h b/arch/powerpc/include/asm/asm-const.h
index 082c1538c562..0ce2368bd20f 100644
--- a/arch/powerpc/include/asm/asm-const.h
+++ b/arch/powerpc/include/asm/asm-const.h
@@ -11,4 +11,17 @@
 #  define __ASM_CONST(x)	x##UL
 #  define ASM_CONST(x)		__ASM_CONST(x)
 #endif
+
+/*
+ * Inline assembly memory constraint
+ *
+ * GCC 4.9 doesn't properly handle pre update memory constraint "m<>"
+ *
+ */
+#if defined(GCC_VERSION) && GCC_VERSION < 50000
+#define UPD_CONSTR ""
+#else
+#define UPD_CONSTR "<>"
+#endif
+
 #endif /* _ASM_POWERPC_ASM_CONST_H */
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 604d705f1bb8..8f27ea48fadb 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -223,7 +223,7 @@ do {								\
 		"1:	" op "%U1%X1 %0,%1	# put_user\n"	\
 		EX_TABLE(1b, %l2)				\
 		:						\
-		: "r" (x), "m<>" (*addr)				\
+		: "r" (x), "m"UPD_CONSTR (*addr)		\
 		:						\
 		: label)
 
@@ -294,7 +294,7 @@ extern long __get_user_bad(void);
 		".previous\n"				\
 		EX_TABLE(1b, 3b)			\
 		: "=r" (err), "=r" (x)			\
-		: "m<>" (*addr), "i" (-EFAULT), "0" (err))
+		: "m"UPD_CONSTR (*addr), "i" (-EFAULT), "0" (err))
 
 #ifdef __powerpc64__
 #define __get_user_asm2(x, addr, err)			\
-- 
2.25.0


^ permalink raw reply related

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22  9:28 UTC (permalink / raw)
  To: 'David Hildenbrand', Greg KH
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <e04d0c5d-e834-a15b-7844-44dcc82785cc@redhat.com>

From: David Hildenbrand
> Sent: 22 October 2020 10:19
> 
> On 22.10.20 11:01, Greg KH wrote:
> > On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:
> >> On 22.10.20 10:40, David Laight wrote:
> >>> From: David Hildenbrand
> >>>> Sent: 22 October 2020 09:35
> >>>>
> >>>> On 22.10.20 10:26, Greg KH wrote:
> >>>>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
> >>>>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
> >>>>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
> >>>>>>>> From: David Laight <David.Laight@ACULAB.COM>
> >>>>>>>>
> >>>>>>>> This lets the compiler inline it into import_iovec() generating
> >>>>>>>> much better code.
> >>>>>>>>
> >>>>>>>> Signed-off-by: David Laight <david.laight@aculab.com>
> >>>>>>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
> >>>>>>>> ---
> >>>>>>>>  fs/read_write.c | 179 ------------------------------------------------
> >>>>>>>>  lib/iov_iter.c  | 176 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>>  2 files changed, 176 insertions(+), 179 deletions(-)
> >>>>>>>
> >>>>>>> Strangely, this commit causes a regression in Linus's tree right now.
> >>>>>>>
> >>>>>>> I can't really figure out what the regression is, only that this commit
> >>>>>>> triggers a "large Android system binary" from working properly.  There's
> >>>>>>> no kernel log messages anywhere, and I don't have any way to strace the
> >>>>>>> thing in the testing framework, so any hints that people can provide
> >>>>>>> would be most appreciated.
> >>>>>>
> >>>>>> It's a pure move - modulo changed line breaks in the argument lists
> >>>>>> the functions involved are identical before and after that (just checked
> >>>>>> that directly, by checking out the trees before and after, extracting two
> >>>>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
> >>>>>> after, resp.) and checking the diff between those.
> >>>>>>
> >>>>>> How certain is your bisection?
> >>>>>
> >>>>> The bisection is very reproducable.
> >>>>>
> >>>>> But, this looks now to be a compiler bug.  I'm using the latest version
> >>>>> of clang and if I put "noinline" at the front of the function,
> >>>>> everything works.
> >>>>
> >>>> Well, the compiler can do more invasive optimizations when inlining. If
> >>>> you have buggy code that relies on some unspecified behavior, inlining
> >>>> can change the behavior ... but going over that code, there isn't too
> >>>> much action going on. At least nothing screamed at me.
> >>>
> >>> Apart from all the optimisations that get rid off the 'pass be reference'
> >>> parameters and strange conditional tests.
> >>> Plenty of scope for the compiler getting it wrong.
> >>> But nothing even vaguely illegal.
> >>
> >> Not the first time that people blame the compiler to then figure out
> >> that something else is wrong ... but maybe this time is different :)
> >
> > I agree, I hate to blame the compiler, that's almost never the real
> > problem, but this one sure "feels" like it.
> >
> > I'm running some more tests, trying to narrow things down as just adding
> > a "noinline" to the function that got moved here doesn't work on Linus's
> > tree at the moment because the function was split into multiple
> > functions.
> >
> > Give me a few hours...
> 
> I might be wrong but
> 
> a) import_iovec() uses:
> - unsigned nr_segs -> int
> - unsigned fast_segs -> int
> b) rw_copy_check_uvector() uses:
> - unsigned long nr_segs -> long
> - unsigned long fast_seg -> long
> 
> So when calling rw_copy_check_uvector(), we have to zero-extend the
> registers used for passing the arguments. That's definitely done when
> calling the function explicitly. Maybe when inlining something is messed up?

That's also not needed on x86-64 - the high bits get cleared by 32bit writes.
But, IIRC, arm64 leaves them unchanged or undefined.

I guessing that every array access uses a *(Rx + Ry) addressing
mode. So indexing an array even with 'unsigned int' requires
an explicit zero-extend on arm64?
(x86-64 ends up with an explicit sign extend when indexing an
array with 'signed int'.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Hildenbrand @ 2020-10-22  9:25 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <e04d0c5d-e834-a15b-7844-44dcc82785cc@redhat.com>

On 22.10.20 11:19, David Hildenbrand wrote:
> On 22.10.20 11:01, Greg KH wrote:
>> On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:
>>> On 22.10.20 10:40, David Laight wrote:
>>>> From: David Hildenbrand
>>>>> Sent: 22 October 2020 09:35
>>>>>
>>>>> On 22.10.20 10:26, Greg KH wrote:
>>>>>> On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
>>>>>>> On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
>>>>>>>> On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:
>>>>>>>>> From: David Laight <David.Laight@ACULAB.COM>
>>>>>>>>>
>>>>>>>>> This lets the compiler inline it into import_iovec() generating
>>>>>>>>> much better code.
>>>>>>>>>
>>>>>>>>> Signed-off-by: David Laight <david.laight@aculab.com>
>>>>>>>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>>>>>>>> ---
>>>>>>>>>  fs/read_write.c | 179 ------------------------------------------------
>>>>>>>>>  lib/iov_iter.c  | 176 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>  2 files changed, 176 insertions(+), 179 deletions(-)
>>>>>>>>
>>>>>>>> Strangely, this commit causes a regression in Linus's tree right now.
>>>>>>>>
>>>>>>>> I can't really figure out what the regression is, only that this commit
>>>>>>>> triggers a "large Android system binary" from working properly.  There's
>>>>>>>> no kernel log messages anywhere, and I don't have any way to strace the
>>>>>>>> thing in the testing framework, so any hints that people can provide
>>>>>>>> would be most appreciated.
>>>>>>>
>>>>>>> It's a pure move - modulo changed line breaks in the argument lists
>>>>>>> the functions involved are identical before and after that (just checked
>>>>>>> that directly, by checking out the trees before and after, extracting two
>>>>>>> functions in question from fs/read_write.c and lib/iov_iter.c (before and
>>>>>>> after, resp.) and checking the diff between those.
>>>>>>>
>>>>>>> How certain is your bisection?
>>>>>>
>>>>>> The bisection is very reproducable.
>>>>>>
>>>>>> But, this looks now to be a compiler bug.  I'm using the latest version
>>>>>> of clang and if I put "noinline" at the front of the function,
>>>>>> everything works.
>>>>>
>>>>> Well, the compiler can do more invasive optimizations when inlining. If
>>>>> you have buggy code that relies on some unspecified behavior, inlining
>>>>> can change the behavior ... but going over that code, there isn't too
>>>>> much action going on. At least nothing screamed at me.
>>>>
>>>> Apart from all the optimisations that get rid off the 'pass be reference'
>>>> parameters and strange conditional tests.
>>>> Plenty of scope for the compiler getting it wrong.
>>>> But nothing even vaguely illegal.
>>>
>>> Not the first time that people blame the compiler to then figure out
>>> that something else is wrong ... but maybe this time is different :)
>>
>> I agree, I hate to blame the compiler, that's almost never the real
>> problem, but this one sure "feels" like it.
>>
>> I'm running some more tests, trying to narrow things down as just adding
>> a "noinline" to the function that got moved here doesn't work on Linus's
>> tree at the moment because the function was split into multiple
>> functions.
>>
>> Give me a few hours...
> 
> I might be wrong but
> 
> a) import_iovec() uses:
> - unsigned nr_segs -> int
> - unsigned fast_segs -> int
> b) rw_copy_check_uvector() uses:
> - unsigned long nr_segs -> long
> - unsigned long fast_seg -> long
> 
> So when calling rw_copy_check_uvector(), we have to zero-extend the
> registers used for passing the arguments. That's definitely done when
> calling the function explicitly. Maybe when inlining something is messed up?
> 
> Just a thought ...
> 

... especially because I recall that clang and gcc behave slightly
differently:

https://github.com/hjl-tools/x86-psABI/issues/2

"Function args are different: narrow types are sign or zero extended to
32 bits, depending on their type. clang depends on this for incoming
args, but gcc doesn't make that assumption. But both compilers do it
when calling, so gcc code can call clang code.

The upper 32 bits of registers are always undefined garbage for types
smaller than 64 bits."

Again, just a thought.

-- 
Thanks,

David / dhildenb


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox