LinuxPPC-Dev Archive on lore.kernel.org


* [PATCH] net: ucc_geth: Drop extraneous parentheses in comparison
From: Michael Ellerman @ 2020-10-23  3:32 UTC (permalink / raw)
  To: linuxppc-dev, netdev; +Cc: kuba, leoyang.li, davem, linux-kernel

Clang warns about the extra parentheses in this comparison:

  drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
  warning: equality comparison with extraneous parentheses
    if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~

It seems clear the intent here is to do a comparison not an
assignment, so drop the extra parentheses to avoid any confusion.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 drivers/net/ethernet/freescale/ucc_geth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index db791f60b884..d8ad478a0a13 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -1358,7 +1358,7 @@ static int adjust_enet_interface(struct ucc_geth_private *ugeth)
 	    (ugeth->phy_interface == PHY_INTERFACE_MODE_RTBI)) {
 		upsmr |= UCC_GETH_UPSMR_TBIM;
 	}
-	if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
+	if (ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII)
 		upsmr |= UCC_GETH_UPSMR_SGMM;
 
 	out_be32(&uf_regs->upsmr, upsmr);
-- 
2.25.1
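
Editorial note on the warning above: clang treats a doubled pair of
parentheses around a condition as the conventional marker for an
intentional assignment, which is why it complains when they wrap an
'==' comparison. A minimal standalone sketch follows; the enum and the
function names are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's PHY interface constants. */
enum phy_mode { MODE_RTBI, MODE_SGMII };

/* Plain comparison: one pair of parentheses, so clang stays quiet. */
static int is_sgmii(enum phy_mode mode)
{
	if (mode == MODE_SGMII)
		return 1;
	return 0;
}

/*
 * Intentional assignment used as a condition: the doubled parentheses
 * are the usual way to tell the compiler that '=' is really meant.
 */
static int store_and_test(int *dst, int src)
{
	if ((*dst = src))
		return 1;	/* the stored value was non-zero */
	return 0;
}

/* Small usage check for both helpers. */
void self_check(void)
{
	int slot = -1;

	assert(is_sgmii(MODE_SGMII) == 1);
	assert(is_sgmii(MODE_RTBI) == 0);
	assert(store_and_test(&slot, 0) == 0 && slot == 0);
	assert(store_and_test(&slot, 7) == 1 && slot == 7);
}
```

Writing is_sgmii() with doubled parentheses around the comparison would
behave identically at run time; only the diagnostic differs.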


^ permalink raw reply related

* Re: [PATCH] serial: pmac_zilog: don't init if zilog is not available
From: Finn Thain @ 2020-10-23  3:21 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg KH, Laurent Vivier, Linux Kernel Mailing List, linux-m68k,
	Paul Mackerras, open list:SERIAL DRIVERS, Brad Boyer,
	linuxppc-dev, Joshua Thompson
In-Reply-To: <CAMuHMdVbo2C1yZ5E_A3L8J1zZigO8i8m5AFUTn9SjbY1sx16kA@mail.gmail.com>

On Thu, 22 Oct 2020, Geert Uytterhoeven wrote:

> 
> Thanks for your patch...
> 

You're welcome.

> I can't say I'm a fan of this...
> 

Sorry.

> 
> The real issue is this "extern struct platform_device scc_a_pdev, 
> scc_b_pdev", circumventing the driver framework.
> 
> Can we get rid of that?
> 

Is there a better alternative?

pmz_probe() is called by console_initcall(pmz_console_init) when 
CONFIG_SERIAL_PMACZILOG_CONSOLE=y because this has to happen earlier than 
the normal platform bus probing which takes place later as a typical 
module_initcall.

^ permalink raw reply

* [PATCH] powerpc/ps3: Drop unused DBG macro
From: Michael Ellerman @ 2020-10-23  3:13 UTC (permalink / raw)
  To: linuxppc-dev

This DBG macro is unused, and has been unused since the file was
originally merged into mainline. Just drop it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/boot/ps3.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/powerpc/boot/ps3.c b/arch/powerpc/boot/ps3.c
index 6e4efbdb6b7c..f157717ae814 100644
--- a/arch/powerpc/boot/ps3.c
+++ b/arch/powerpc/boot/ps3.c
@@ -21,13 +21,6 @@ extern int lv1_get_logical_ppe_id(u64 *out_1);
 extern int lv1_get_repository_node_value(u64 in_1, u64 in_2, u64 in_3,
 	u64 in_4, u64 in_5, u64 *out_1, u64 *out_2);
 
-#ifdef DEBUG
-#define DBG(fmt...) printf(fmt)
-#else
-static inline int __attribute__ ((format (printf, 1, 2))) DBG(
-	const char *fmt, ...) {return 0;}
-#endif
-
 BSS_STACK(4096);
 
 /* A buffer that may be edited by tools operating on a zImage binary so as to
-- 
2.25.1


^ permalink raw reply related

* [PATCHv2] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Po-Hsu Lin @ 2020-10-23  2:45 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-kselftest, mpe
  Cc: joe.lawrence, mathieu.desnoyers, po-hsu.lin, mbenes, shuah

The eeh-basic test got its own 60 seconds timeout (defined in commit
414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
device.

And we have discovered that the number of breakable devices varies
on different hardware. The device recovery time ranges from 0 to 35
seconds. In our test pool it takes about 30 seconds to run on a
Power8 system with 5 breakable devices, and 60 seconds to run on a
Power9 system with 4 breakable devices.

Extend the timeout setting in the kselftest framework to 5 minutes
to give it a chance to finish.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
---
 tools/testing/selftests/powerpc/eeh/Makefile | 2 +-
 tools/testing/selftests/powerpc/eeh/settings | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/eeh/settings

diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
index b397bab..ae963eb 100644
--- a/tools/testing/selftests/powerpc/eeh/Makefile
+++ b/tools/testing/selftests/powerpc/eeh/Makefile
@@ -3,7 +3,7 @@ noarg:
 	$(MAKE) -C ../

 TEST_PROGS := eeh-basic.sh
-TEST_FILES := eeh-functions.sh
+TEST_FILES := eeh-functions.sh settings

 top_srcdir = ../../../../..
 include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
new file mode 100644
index 0000000..694d707
--- /dev/null
+++ b/tools/testing/selftests/powerpc/eeh/settings
@@ -0,0 +1 @@
+timeout=300
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Po-Hsu Lin @ 2020-10-23  2:37 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: joe.lawrence, linuxppc-dev, linux-kernel, mathieu.desnoyers,
	linux-kselftest, mbenes, shuah
In-Reply-To: <87a6wdy9si.fsf@mpe.ellerman.id.au>

On Fri, Oct 23, 2020 at 10:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Po-Hsu Lin <po-hsu.lin@canonical.com> writes:
> > The eeh-basic test got its own 60 seconds timeout (defined in commit
> > 414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
> > device.
> >
> > And we have discovered that the number of breakable devices varies
> > on different hardware. The device recovery time ranges from 0 to 35
> > seconds. In our test pool it will take about 30 seconds to run on a
> > Power8 system that with 5 breakable devices, 60 seconds to run on a
> > Power9 system that with 4 breakable devices.
> >
> > Thus it's better to disable the default 45 seconds timeout setting in
> > the kselftest framework to give it a chance to finish. And let the
> > test to take care of the timeout control.
>
> I'd prefer if we still had some timeout, maybe 5 or 10 minutes? Just in
> case the test goes completely bonkers.
>
OK, let's go for 5 minutes.
Will send V2 later.
Thanks for your suggestion!

> cheers
>
> > diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
> > index b397bab..ae963eb 100644
> > --- a/tools/testing/selftests/powerpc/eeh/Makefile
> > +++ b/tools/testing/selftests/powerpc/eeh/Makefile
> > @@ -3,7 +3,7 @@ noarg:
> >       $(MAKE) -C ../
> >
> >  TEST_PROGS := eeh-basic.sh
> > -TEST_FILES := eeh-functions.sh
> > +TEST_FILES := eeh-functions.sh settings
> >
> >  top_srcdir = ../../../../..
> >  include ../../lib.mk
> > diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
> > new file mode 100644
> > index 0000000..e7b9417
> > --- /dev/null
> > +++ b/tools/testing/selftests/powerpc/eeh/settings
> > @@ -0,0 +1 @@
> > +timeout=0
> > --
> > 2.7.4

^ permalink raw reply

* [PATCH] powerpc/85xx: Fix declaration made after definition
From: Michael Ellerman @ 2020-10-23  2:08 UTC (permalink / raw)
  To: linuxppc-dev

Currently the clang build of corenet64_smp_defconfig fails with:

  arch/powerpc/platforms/85xx/corenet_generic.c:210:1: error:
  attribute declaration must precede definition
  machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);

Fix it by moving the initcall definition prior to the machine
definition, and directly below the function it calls, which is the
usual style anyway.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/platforms/85xx/corenet_generic.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index 6aa8defb5857..8d6029099848 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -106,6 +106,7 @@ int __init corenet_gen_publish_devices(void)
 {
 	return of_platform_bus_probe(NULL, of_device_ids, NULL);
 }
+machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);
 
 static const char * const boards[] __initconst = {
 	"fsl,P2041RDB",
@@ -206,5 +207,3 @@ define_machine(corenet_generic) {
 	.power_save		= e500_idle,
 #endif
 };
-
-machine_arch_initcall(corenet_generic, corenet_gen_publish_devices);
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
From: Michael Ellerman @ 2020-10-23  2:07 UTC (permalink / raw)
  To: Po-Hsu Lin, linux-kernel, linuxppc-dev, linux-kselftest
  Cc: joe.lawrence, mathieu.desnoyers, po-hsu.lin, mbenes, shuah
In-Reply-To: <20201022083616.41666-1-po-hsu.lin@canonical.com>

Po-Hsu Lin <po-hsu.lin@canonical.com> writes:
> The eeh-basic test got its own 60 seconds timeout (defined in commit
> 414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
> device.
>
> And we have discovered that the number of breakable devices varies
> on different hardware. The device recovery time ranges from 0 to 35
> seconds. In our test pool it will take about 30 seconds to run on a
> Power8 system that with 5 breakable devices, 60 seconds to run on a
> Power9 system that with 4 breakable devices.
>
> Thus it's better to disable the default 45 seconds timeout setting in
> the kselftest framework to give it a chance to finish. And let the
> test to take care of the timeout control.

I'd prefer if we still had some timeout, maybe 5 or 10 minutes? Just in
case the test goes completely bonkers.

cheers

> diff --git a/tools/testing/selftests/powerpc/eeh/Makefile b/tools/testing/selftests/powerpc/eeh/Makefile
> index b397bab..ae963eb 100644
> --- a/tools/testing/selftests/powerpc/eeh/Makefile
> +++ b/tools/testing/selftests/powerpc/eeh/Makefile
> @@ -3,7 +3,7 @@ noarg:
>  	$(MAKE) -C ../
>  
>  TEST_PROGS := eeh-basic.sh
> -TEST_FILES := eeh-functions.sh
> +TEST_FILES := eeh-functions.sh settings
>  
>  top_srcdir = ../../../../..
>  include ../../lib.mk
> diff --git a/tools/testing/selftests/powerpc/eeh/settings b/tools/testing/selftests/powerpc/eeh/settings
> new file mode 100644
> index 0000000..e7b9417
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/eeh/settings
> @@ -0,0 +1 @@
> +timeout=0
> -- 
> 2.7.4

^ permalink raw reply

* Re: [PATCH] powerpc: Send SIGBUS from machine_check
From: Michael Ellerman @ 2020-10-23  0:57 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev
In-Reply-To: <20201001170557.10915-1-joakim.tjernlund@infinera.com>

Joakim Tjernlund <joakim.tjernlund@infinera.com> writes:
> Embedded PPC CPU should send SIGBUS to user space when applicable.

Yeah, but it's not clear that it's applicable in all cases.

At least I need some reasoning for why it's safe in all cases below to
just send a SIGBUS and take no other action.

Is there a particular CPU you're working on? Can we start with that and
look at all the machine check causes and which can be safely handled?

Some comments below ...

> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0381242920d9..12715d24141c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)

At the beginning of the function we have:

	printk("Machine check in kernel mode.\n");

Which should be updated.

>  		       reason & MCSR_MEA ? "Effective" : "Physical", addr);
>  	}
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		recoverable = 1;
> +	}

For most of the error causes we take no action and set recoverable = 0.

Then you just declare that it is recoverable because it hit in
userspace. Depending on the cause that might be OK, but it's not
obviously correct in all cases.

> +
>  silent_out:
>  	mtspr(SPRN_MCSR, mcsr);
>  	return mfspr(SPRN_MCSR) == 0 && recoverable;
> @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)

Same comment about the printk().

>  	if (reason & MCSR_BUS_RPERR)
>  		printk("Bus - Read Parity Error\n");
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

And same comment more or less.

Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
function does nothing to clear the cause of the machine check.

>  	return 0;
>  }
>  
> @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
>  	if (reason & MCSR_BUS_WRERR)
>  		printk("Bus - Write Bus Error on buffered store or cache line push\n");
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

Same.

>  	return 0;
>  }
>  #elif defined(CONFIG_PPC32)
> @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
>  	default:
>  		printk("Unknown values in msr\n");
>  	}
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

Same.

>  	return 0;
>  }
>  #endif /* everything else */
> -- 
> 2.26.2

cheers

^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 22:07 UTC (permalink / raw)
  To: 'Al Viro', Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022192458.GV3576660@ZenIV.linux.org.uk>

From: Al Viro
> Sent: 22 October 2020 20:25
> 
> On Thu, Oct 22, 2020 at 12:04:52PM -0700, Nick Desaulniers wrote:
> 
> > Passing an `unsigned long` as an `unsigned int` does no such
> > narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> > calls, no masking instructions).
> > So if rw_copy_check_uvector() is inlined into import_iovec() (looking
> > at the mainline@1028ae406999), then children calls of
> > `rw_copy_check_uvector()` will be interpreting the `nr_segs` register
> > unmodified, ie. garbage in the upper 32b.
> 
> FWIW,
> 
> void f(unsigned long v)
> {
> 	if (v != 1)
> 		printf("failed\n");
> }
> 
> void g(unsigned int v)
> {
> 	f(v);
> }
> 
> void h(unsigned long v)
> {
> 	g(v);
> }
> 
> main()
> {
> 	h(0x100000001);
> }
> 
> must not produce any output on a host with 32bit int and 64bit long, regardless of
> the inlining, having functions live in different compilation units, etc.
> 
> Depending upon the calling conventions, compiler might do truncation in caller or
> in a callee, but it must be done _somewhere_.

Put g() in a separate compilation unit and use the 'wrong' type
in the prototypes h() uses to call g() and g() uses to call f().

Then you might see where any masking does (or does not) happen.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-22 22:04 UTC (permalink / raw)
  To: 'Nick Desaulniers', Arnd Bergmann
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnhONvrHLAuz_BrAuEpnF5mD9p0YPGJs=NZZ0EZNo7dFQ@mail.gmail.com>

From: Nick Desaulniers
> Sent: 22 October 2020 20:05
> 
...
> Passing an `unsigned long` as an `unsigned int` does no such
> narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> calls, no masking instructions).

Right but is the called function going to use 32bit ops
and/or mask the register?
Certainly that is likely on x86-64.

I've rather lost track of where the masking is expected
to happen.

	David


^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 21:28 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	Matthew Wilcox, Linus Torvalds, kernel-team@android.com,
	Arnd Bergmann, linux-block@vger.kernel.org,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	Nick Desaulniers, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022205932.GB3613750@gmail.com>

On Thu, Oct 22, 2020 at 01:59:32PM -0700, Eric Biggers wrote:

> Also note the following program succeeds on Linux 5.9 on x86_64.  On kernels
> that have this bug, it should fail.  (I couldn't get it to actually fail, so it
> must depend on the compiler and/or the kernel config...)

It doesn't.  See https://www.spinics.net/lists/linux-scsi/msg147836.html for
discussion of that mess.

ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
                  unsigned long vlen, loff_t *pos, rwf_t flags)
{
        struct iovec iovstack[UIO_FASTIOV];
        struct iovec *iov = iovstack;
        struct iov_iter iter;
        ssize_t ret;

        ret = import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter);
        if (ret >= 0) {
                ret = do_iter_read(file, &iter, pos, flags);
                kfree(iov);
        }

        return ret;
}

and import_iovec() takes unsigned int as the third argument, so it *will*
truncate to 32 bits, no matter what.  Has done so since 0504c074b546
"switch {compat_,}do_readv_writev() to {compat_,}import_iovec()" back in
March 2015.  Yes, it was an incompatible userland ABI change, even though
nothing that used glibc/uclibc/dietlibc would've noticed.

Better yet, up until 2.1.90pre1 passing a 64bit value as the _first_ argument
of readv(2) used to fail with -EBADF if it was too large; at that point it
started to get quietly truncated to 32bit first.  And again, no libc users
would've noticed (neither would anything except deliberate regression test
looking for that specific behaviour).

Note that we also have process_madvise(2) with size_t for vlen (huh?  It's
a number of array elements, not an object size) and process_vm_{read,write}v(2),
that have unsigned long for the same thing.  And the last two *are* using
the same unsigned long from glibc POV.
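
Editorial note: the truncation described above for import_iovec() is
ordinary C conversion behaviour, and it can be checked in user space.
The sketch below is a variant of the f()/g()/h() example quoted
elsewhere in this thread. As assumptions for illustration, it returns a
result instead of printing, and it uses uint32_t/uint64_t in place of
'unsigned int'/'unsigned long' so that it does not depend on the host
being LP64.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the value arrived as exactly 1, else 0. */
static int f(uint64_t v)
{
	return v == 1;
}

/* Widening call: the 32-bit value is zero-extended to 64 bits. */
static int g(uint32_t v)
{
	return f(v);
}

/* Narrowing call: the value is reduced modulo 2^32 before g() sees it. */
static int h(uint64_t v)
{
	return g((uint32_t)v);
}

/* Small usage check for the call chain. */
void self_check(void)
{
	assert(f(0x100000001ULL) == 0);	/* f() alone sees all 64 bits */
	assert(h(0x100000001ULL) == 1);	/* through g(), bit 32 is gone */
	assert(h(2) == 0);
}
```

This only shows what the language requires of the result. Where the
generated code does the masking (caller or callee) is a calling
convention matter and is not visible at this level.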

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Eric Biggers @ 2020-10-22 20:59 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	Matthew Wilcox, kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnq-yYLcF_coo=jMV-RH-SkuNp_kMB+KCBF5cz3PwiB8g@mail.gmail.com>

On Thu, Oct 22, 2020 at 10:00:44AM -0700, Nick Desaulniers wrote:
> On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > > Wait...
> > > readv(2) defines:
> > >       ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> >
> > It doesn't really matter what the manpage says.  What does the AOSP
> > libc header say?
> 
> Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38
> 
> Theoretically someone could bypass libc to make a system call, right?
> 
> >
> > > But the syscall is defined as:
> > >
> > > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > >                 unsigned long, vlen)
> > > {
> > >         return do_readv(fd, vec, vlen, 0);
> > > }
> >
> 

FWIW, glibc makes the readv() syscall assuming that fd and vlen are 'int' as
well.  So this problem isn't specific to Android's libc.

From objdump -d /lib/x86_64-linux-gnu/libc.so.6:

	00000000000f4db0 <readv@@GLIBC_2.2.5>:
	   f4db0:       64 8b 04 25 18 00 00    mov    %fs:0x18,%eax
	   f4db7:       00
	   f4db8:       85 c0                   test   %eax,%eax
	   f4dba:       75 14                   jne    f4dd0 <readv@@GLIBC_2.2.5+0x20>
	   f4dbc:       b8 13 00 00 00          mov    $0x13,%eax
	   f4dc1:       0f 05                   syscall
	   ...

There's some code for pthread cancellation, but no zeroing of the upper half of
the fd and vlen arguments, which are in %edi and %edx respectively.  But the
glibc function prototype uses 'int' for them, not 'unsigned long':
'ssize_t readv(int fd, const struct iovec *iov, int iovcnt);'.

So the high halves of the fd and iovcnt registers can contain garbage.  Or at
least that's what gcc (9.3.0) and clang (9.0.1) assume; they both compile the
following

void g(unsigned int x);

void f(unsigned long x)
{
        g(x);
}

into f() making a tail call to g(), without zeroing the top half of %rdi.

Also note the following program succeeds on Linux 5.9 on x86_64.  On kernels
that have this bug, it should fail.  (I couldn't get it to actually fail, so it
must depend on the compiler and/or the kernel config...)

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <sys/uio.h>
	#include <unistd.h>

	int main()
	{
		int fd = open("/dev/zero", O_RDONLY);
		char buf[1000];
		struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
		long ret;

		ret = syscall(__NR_readv, fd, &iov, 0x100000001);
		if (ret < 0)
			perror("readv failed");
		else
			printf("read %ld bytes\n", ret);
	}

I think the right fix is to change the readv() (and writev(), etc.) syscalls to
take 'unsigned int' rather than 'unsigned long', as that is what the users are
assuming...

- Eric
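
Editorial note: the fix proposed above amounts to narrowing the
arguments at the syscall boundary, so that garbage in the upper register
halves never reaches the handler. The following is a user-space model
only, under stated assumptions: readv_model() and sys_readv_model() are
hypothetical names, raw registers are represented as uint64_t, and none
of this is the kernel's SYSCALL_DEFINE machinery.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical handler declared with 32-bit parameters, which is what
 * the libc prototypes already assume. It reports the count it saw.
 */
static uint32_t readv_model(uint32_t fd, uint32_t vlen)
{
	(void)fd;	/* unused in this model */
	return vlen;
}

/*
 * Hypothetical entry stub: arguments arrive as raw 64-bit register
 * values and are narrowed explicitly before the handler runs.
 */
static uint32_t sys_readv_model(uint64_t fd_reg, uint64_t vlen_reg)
{
	return readv_model((uint32_t)fd_reg, (uint32_t)vlen_reg);
}

/* Small usage check for the stub. */
void self_check(void)
{
	/* Upper-half garbage does not reach the handler. */
	assert(sys_readv_model(3, 0xdeadbeef00000001ULL) == 1);
	assert(sys_readv_model(3, 1) == 1);
}
```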

^ permalink raw reply

* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Joe Perches @ 2020-10-22 20:45 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: clang-built-linux, linuxppc-dev, Linus Torvalds, LKML,
	Miguel Ojeda
In-Reply-To: <CAKwvOdmUPA9XupXwYHy_qT7P+LrUc+wseT79K_oqw=3y6bwLfg@mail.gmail.com>

On Thu, 2020-10-22 at 13:42 -0700, Nick Desaulniers wrote:
> On Wed, Oct 21, 2020 at 7:36 PM Joe Perches <joe@perches.com> wrote:
> > Use a more generic form for __section that requires quotes to avoid
> > complications with clang and gcc differences.
[]
> >  a quick test of x86_64 and s390 would be good.

x86_64 was compiled here.
I believe the robot tested the others.


^ permalink raw reply

* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Nick Desaulniers @ 2020-10-22 20:42 UTC (permalink / raw)
  To: Joe Perches
  Cc: clang-built-linux, linuxppc-dev, Linus Torvalds, LKML,
	Miguel Ojeda
In-Reply-To: <fe8abcc88cff676ead8ee48db1e993e63b0611c7.1603327264.git.joe@perches.com>

On Wed, Oct 21, 2020 at 7:36 PM Joe Perches <joe@perches.com> wrote:
>
> Use a more generic form for __section that requires quotes to avoid
> complications with clang and gcc differences.
>
> Remove the quote operator # from compiler_attributes.h __section macro.
>
> Convert all unquoted __section(foo) uses to quoted __section("foo").
> Also convert __attribute__((section("foo"))) uses to __section("foo")
> even if the __attribute__ has multiple list entry forms.
>
> Conversion done using a script:
>
> Link: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl
>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>
> This conversion was previously submitted to -next last month
> https://lore.kernel.org/lkml/46f69161e60b802488ba8c8f3f8bbf922aa3b49b.camel@perches.com/
>
> Nick Desaulniers found a defect in the conversion of 2 boot files
> for powerpc, but no other defect was found for any other arch.

Untested, but:
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

Good job handling the trickier cases when the attribute was mixed with
others, and printing it in scripts/mod/modpost.c.

The only cases that *might* be similar to PPC are:
>  arch/s390/boot/startup.c              |  2 +-
>  arch/x86/boot/compressed/pgtable_64.c |  2 +-
>  arch/x86/purgatory/purgatory.c        |  4 ++--

So a quick test of x86_64 and s390 would be good.

Thanks for the patch.

>
> The script was corrected to avoid converting these 2 files.
>
> There is no difference between the script output when run on today's -next
> and Linus' tree through commit f804b3159482, so this should be reasonable to
> apply now.


-- 
Thanks,
~Nick Desaulniers
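
Editorial note: the patch description quoted above removes the '#'
(stringify) operator from the __section macro, so callers must pass a
quoted name. The sketch below models only the preprocessor side of that
change. SECTION_NAME_OLD and SECTION_NAME_NEW are hypothetical macros
that yield the section name as a string, rather than applying a section
attribute.

```c
#include <assert.h>
#include <string.h>

/* Old style (model): the macro stringifies its argument with '#'. */
#define SECTION_NAME_OLD(S)	#S
/* New style (model): the caller passes an already quoted string. */
#define SECTION_NAME_NEW(S)	S

static const char *old_unquoted(void)
{
	return SECTION_NAME_OLD(mysection);
}

static const char *new_quoted(void)
{
	return SECTION_NAME_NEW("mysection");
}

/* Mixing styles: '#' applied to a quoted argument keeps the quotes. */
static const char *old_quoted(void)
{
	return SECTION_NAME_OLD("mysection");
}

/* Small usage check for the three cases. */
void self_check(void)
{
	assert(strcmp(old_unquoted(), "mysection") == 0);
	assert(strcmp(new_quoted(), "mysection") == 0);
	assert(strcmp(old_quoted(), "\"mysection\"") == 0);
}
```

The third case illustrates why the macro and its callers have to be
converted together: a quoted argument passed to the stringifying form
would produce a name that includes the quote characters.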

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Nick Desaulniers @ 2020-10-22 20:11 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022192458.GV3576660@ZenIV.linux.org.uk>

On Thu, Oct 22, 2020 at 12:25 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Oct 22, 2020 at 12:04:52PM -0700, Nick Desaulniers wrote:
>
> > Passing an `unsigned long` as an `unsigned int` does no such
> > narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> > calls, no masking instructions).
> > So if rw_copy_check_uvector() is inlined into import_iovec() (looking
> > at the mainline@1028ae406999), then children calls of
> > `rw_copy_check_uvector()` will be interpreting the `nr_segs` register
> > unmodified, ie. garbage in the upper 32b.
>
> FWIW,
>
> void f(unsigned long v)
> {
>         if (v != 1)
>                 printf("failed\n");
> }
>
> void g(unsigned int v)
> {
>         f(v);
> }
>
> void h(unsigned long v)
> {
>         g(v);
> }
>
> main()
> {
>         h(0x100000001);
> }

A good/analogous example, but things get weird when the leaf node in
the call chain is inline asm: https://godbolt.org/z/s19TY5

(I'm not sure that's precisely what's going on here; I'll need to dive
more into the calls rw_copy_check_uvector() makes to see if there's
inline asm somewhere, pretty sure calls to get_user with `nr_segs`
exist).

>
> must not produce any output on a host with 32bit int and 64bit long, regardless of
> the inlining, having functions live in different compilation units, etc.
>
> Depending upon the calling conventions, compiler might do truncation in caller or
> in a callee, but it must be done _somewhere_.



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 20:09 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022200629.GX3576660@ZenIV.linux.org.uk>

On Thu, Oct 22, 2020 at 09:06:29PM +0100, Al Viro wrote:
> On Thu, Oct 22, 2020 at 08:24:58PM +0100, Al Viro wrote:
> 
> > Depending upon the calling conventions, compiler might do truncation in caller or
> > in a callee, but it must be done _somewhere_.
> 
> Unless I'm misreading AAPCS64,
> 	"Unlike in the 32-bit AAPCS, named integral values must be narrowed by the callee
> 	 rather than the caller"
> in 6.4.2 means that callee must not _not_ expect the upper 32 bits of %x0..%x7 to contain

Sorry, artefact of editing - that's

"in 6.4.2 means that callee must _not_ expect the upper 32 bits of %x0..%x7 to contain"

obviously.

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Arnd Bergmann @ 2020-10-22 20:06 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnhONvrHLAuz_BrAuEpnF5mD9p0YPGJs=NZZ0EZNo7dFQ@mail.gmail.com>

On Thu, Oct 22, 2020 at 9:05 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Thu, Oct 22, 2020 at 11:13 AM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Thu, Oct 22, 2020 at 7:54 PM Nick Desaulniers
> > <ndesaulniers@google.com> wrote:
> > > On Thu, Oct 22, 2020 at 9:35 AM David Laight <David.Laight@aculab.com> wrote:
> > > >
> > > > Which makes it a bug in the kernel C syscall wrappers.
> > > > They need to explicitly mask the high bits of 32bit
> > > > arguments on arm64 but not x86-64.
> > >
> > > Why not x86-64? Wouldn't it be *any* LP64 ISA?
> >
> > x86-64 is slightly special because most instructions on a 32-bit
> > argument clear the upper 32 bits, while on most architectures
> > the same instruction would leave the upper bits unchanged.
>
> Oh interesting, depends on the operations too on x86_64 IIUC?

It seems this doesn't impact the calling conventions (see below),
it's just that there are more cases on x86 where the callee doesn't
have to explicitly clear the upper bits because this is implied.

> > > Attaching a patch that uses the proper width, but I'm pretty sure
> > > there's still a signedness issue.  Greg, would you mind running this
> > > through the wringer?
> >
> > I would not expect this to change anything for the bug that Greg
> > is chasing, unless there is also a bug in clang.
> >
> > In the version before the patch, we get a 64-bit argument from
> > user space, which may consist of the intended value in the lower
> > bits plus garbage in the upper bits. However, vlen only gets
> > passed down into import_iovec() without any other operations
> > on it, and since import_iovec takes a 32-bit argument, this is
> > where it finally gets narrowed.
>
> Passing an `unsigned long` as an `unsigned int` does no such
> narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> calls, no masking instructions).

Sorry I got it wrong, looked up the aarch64 AAPCS now, which
explains

 "Unlike in the 32-bit AAPCS, named integral values must be
  narrowed by the callee rather than the caller."

Also confirmed using https://godbolt.org/z/acPrjj, which
shows more combinations of compilers and architectures
in addition to your example. I had expected arm64 to be
like powerpc64 and arm32 in this case, but it's the reverse.

I also verified that SYSCALL_DEFINEx() is correct on arm64
and saw that as of v4.19 it passes the syscall arguments
through pt_regs, which will do the right thing here regardless
of the argument passing rules. The earlier version also seems
to be working as intended.

         Arnd

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 20:06 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022192458.GV3576660@ZenIV.linux.org.uk>

On Thu, Oct 22, 2020 at 08:24:58PM +0100, Al Viro wrote:

> Depending upon the calling conventions, compiler might do truncation in caller or
> in a callee, but it must be done _somewhere_.

Unless I'm misreading AAPCS64,
	"Unlike in the 32-bit AAPCS, named integral values must be narrowed by the callee
	 rather than the caller"
in 6.4.2 means that callee must not _not_ expect the upper 32 bits of %x0..%x7 to contain
anything valid for 32bit arguments and it must zero-extend %w0..%w7 when passing that to
something that expects a 64bit argument.  On inlining it should be the same situation as
storing unsigned int argument into unsigned long local variable and working with that - if

void f(unsigned int w)
{
	unsigned long x = w;
	printf("%lx\n", x);
}

ends up passing %x0 to printf, it's an obvious bug - it must do something like
	uxtw x0, w0
first.
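
A minimal sketch of the same point in portable C, assuming fixed-width
types stand in for the registers. The name widen_arg is made up for
illustration; 'reg' models %x0 as the callee receives it.

```c
#include <stdint.h>

/*
 * Model of a callee that was handed a 32-bit argument in a 64-bit
 * register.  'reg' stands in for %x0 and may carry garbage in bits
 * 63..32.  widen_arg is a hypothetical name, not kernel code.
 */
uint64_t widen_arg(uint64_t reg)
{
	uint32_t w = (uint32_t)reg;	/* only the %w0 half is meaningful */
	uint64_t x = w;			/* zero-extension, the job uxtw does */

	return x;
}
```

Whatever sits in the upper half of 'reg', the value that reaches a
64-bit consumer has those bits cleared.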

What am I missing here?

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 19:27 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022192458.GV3576660@ZenIV.linux.org.uk>

On Thu, Oct 22, 2020 at 08:24:58PM +0100, Al Viro wrote:
> On Thu, Oct 22, 2020 at 12:04:52PM -0700, Nick Desaulniers wrote:
> 
> > Passing an `unsigned long` as an `unsigned int` does no such
> > narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> > calls, no masking instructions).
> > So if rw_copy_check_uvector() is inlined into import_iovec() (looking
> > at the mainline@1028ae406999), then children calls of
> > `rw_copy_check_uvector()` will be interpreting the `nr_segs` register
> > unmodified, ie. garbage in the upper 32b.
> 
> FWIW,
> 
> void f(unsinged long v)

s/unsinged/unsigned/, obviously...

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 19:24 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnhONvrHLAuz_BrAuEpnF5mD9p0YPGJs=NZZ0EZNo7dFQ@mail.gmail.com>

On Thu, Oct 22, 2020 at 12:04:52PM -0700, Nick Desaulniers wrote:

> Passing an `unsigned long` as an `unsigned int` does no such
> narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
> calls, no masking instructions).
> So if rw_copy_check_uvector() is inlined into import_iovec() (looking
> at the mainline@1028ae406999), then children calls of
> `rw_copy_check_uvector()` will be interpreting the `nr_segs` register
> unmodified, ie. garbage in the upper 32b.

FWIW,

void f(unsinged long v)
{
	if (v != 1)
		printf("failed\n");
}

void g(unsigned int v)
{
	f(v);
}

void h(unsigned long v)
{
	g(v);
}

main()
{
	h(0x100000001);
}

must not produce any output on a host with 32bit int and 64bit long, regardless of
the inlining, having functions live in different compilation units, etc.

Depending upon the calling conventions, compiler might do truncation in caller or
in a callee, but it must be done _somewhere_.
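
The same chain, sketched with fixed-width types so that it does not
depend on the host's int and long sizes. This is an illustration under
that assumption: uint64_t and uint32_t replace unsigned long and
unsigned int, and f() returns a status instead of printing.

```c
#include <stdint.h>

/* Returns 1 if the value arrived as exactly 1, 0 otherwise. */
int f(uint64_t v)
{
	return v == 1;
}

/* 32-bit parameter: the call to f() widens it with zero-extension. */
int g(uint32_t v)
{
	return f(v);
}

/* 64-bit parameter: the call to g() narrows it modulo 2^32. */
int h(uint64_t v)
{
	return g(v);
}
```

Calling h(0x100000001) has to return 1: the narrowing conversion in the
call to g() is part of the language semantics, so it cannot disappear
when the calls are inlined.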

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Nick Desaulniers @ 2020-10-22 19:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAK8P3a3LjG+ZvmQrkb9zpgov8xBkQQWrkHBPgjfYSqBKGrwT4w@mail.gmail.com>

On Thu, Oct 22, 2020 at 11:13 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Oct 22, 2020 at 7:54 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> > On Thu, Oct 22, 2020 at 9:35 AM David Laight <David.Laight@aculab.com> wrote:
> > >
> > > Which makes it a bug in the kernel C syscall wrappers.
> > > They need to explicitly mask the high bits of 32bit
> > > arguments on arm64 but not x86-64.
> >
> > Why not x86-64? Wouldn't it be *any* LP64 ISA?
>
> x86-64 is slightly special because most instructions on a 32-bit
> argument clear the upper 32 bits, while on most architectures
> the same instruction would leave the upper bits unchanged.

Oh interesting, depends on the operations too on x86_64 IIUC?

>
> > Attaching a patch that uses the proper width, but I'm pretty sure
> > there's still a signedness issue.  Greg, would you mind running this
> > through the wringer?
>
> I would not expect this to change anything for the bug that Greg
> is chasing, unless there is also a bug in clang.
>
> In the version before the patch, we get a 64-bit argument from
> user space, which may consist of the intended value in the lower
> bits plus garbage in the upper bits. However, vlen only gets
> passed down into import_iovec() without any other operations
> on it, and since import_iovec takes a 32-bit argument, this is
> where it finally gets narrowed.

Passing an `unsigned long` as an `unsigned int` does no such
narrowing: https://godbolt.org/z/TvfMxe (same vice-versa, just tail
calls, no masking instructions).
So if rw_copy_check_uvector() is inlined into import_iovec() (looking
at the mainline@1028ae406999), then children calls of
`rw_copy_check_uvector()` will be interpreting the `nr_segs` register
unmodified, ie. garbage in the upper 32b.

>
> After your patch, the SYSCALL_DEFINE3() does the narrowing
> conversion with the same clearing of the upper bits.
>
> If there is a problem somewhere leading up to import_iovec(),
> it would have to be in some code that expects to get a 32-bit
> register argument but gets called with a register that has
> garbage in the upper bits /without/ going through a correct
> sanitizing function like SYSCALL_DEFINE3().
>
>       Arnd



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Al Viro @ 2020-10-22 18:19 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	'Christoph Hellwig', linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH, Nick Desaulniers,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022164040.GV20115@casper.infradead.org>

On Thu, Oct 22, 2020 at 05:40:40PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > Wait...
> > readv(2) defines:
> > 	ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> 
> It doesn't really matter what the manpage says.  What does the AOSP
> libc header say?

FWIW, see https://www.spinics.net/lists/linux-scsi/msg147836.html and
subthread from there on...

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Arnd Bergmann @ 2020-10-22 18:12 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, linux-block@vger.kernel.org, Al Viro,
	io-uring@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Jens Axboe, linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAKwvOdnix6YGFhsmT_mY8ORNPTOsN3HwS33Dr0Ykn-pyJ6e-Bw@mail.gmail.com>

On Thu, Oct 22, 2020 at 7:54 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
> On Thu, Oct 22, 2020 at 9:35 AM David Laight <David.Laight@aculab.com> wrote:
> >
> > Which makes it a bug in the kernel C syscall wrappers.
> > They need to explicitly mask the high bits of 32bit
> > arguments on arm64 but not x86-64.
>
> Why not x86-64? Wouldn't it be *any* LP64 ISA?

x86-64 is slightly special because most instructions on a 32-bit
argument clear the upper 32 bits, while on most architectures
the same instruction would leave the upper bits unchanged.

> Attaching a patch that uses the proper width, but I'm pretty sure
> there's still a signedness issue.  Greg, would you mind running this
> through the wringer?

I would not expect this to change anything for the bug that Greg
is chasing, unless there is also a bug in clang.

In the version before the patch, we get a 64-bit argument from
user space, which may consist of the intended value in the lower
bits plus garbage in the upper bits. However, vlen only gets
passed down into import_iovec() without any other operations
on it, and since import_iovec takes a 32-bit argument, this is
where it finally gets narrowed.

After your patch, the SYSCALL_DEFINE3() does the narrowing
conversion with the same clearing of the upper bits.

If there is a problem somewhere leading up to import_iovec(),
it would have to be in some code that expects to get a 32-bit
register argument but gets called with a register that has
garbage in the upper bits /without/ going through a correct
sanitizing function like SYSCALL_DEFINE3().
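
A user-space sketch of the kind of sanitizing wrapper meant here. It is
not the kernel's actual SYSCALL_DEFINE3() expansion; sys_thing and
do_thing are hypothetical names, and uint64_t models a raw 64-bit
argument register.

```c
#include <stdint.h>

/* Typed implementation: only ever sees vlen as a 32-bit value. */
static uint64_t do_thing(uint32_t vlen)
{
	return vlen;	/* widened back with zero-extension */
}

/*
 * Entry point: the argument arrives as a full 64-bit register whose
 * upper half is not trusted.  The explicit cast is the sanitizing step.
 */
uint64_t sys_thing(uint64_t raw_vlen)
{
	return do_thing((uint32_t)raw_vlen);
}
```

Any code reached through sys_thing() sees a clean 32-bit count; the
problem case is a 32-bit consumer reached without passing through such
a cast.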

      Arnd

^ permalink raw reply

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Nick Desaulniers @ 2020-10-22 17:54 UTC (permalink / raw)
  To: David Laight
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <8f1fff0c358b4b669d51cc80098dbba1@AcuMS.aculab.com>

[-- Attachment #1: Type: text/plain, Size: 2086 bytes --]

On Thu, Oct 22, 2020 at 9:35 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Christoph Hellwig
> > Sent: 22 October 2020 14:24
> >
> > On Thu, Oct 22, 2020 at 11:36:40AM +0200, David Hildenbrand wrote:
> > > My thinking: if the compiler that calls import_iovec() has garbage in
> > > the upper 32 bit
> > >
> > > a) gcc will zero it out and not rely on it being zero.
> > > b) clang will not zero it out, assuming it is zero.
> > >
> > > But
> > >
> > > a) gcc will zero it out when calling the !inlined variant
> > > b) clang will zero it out when calling the !inlined variant
> > >
> > > When inlining, b) strikes. We access garbage. That would mean that we
> > > have calling code that's not generated by clang/gcc IIUC.
> >
> > Most callchains of import_iovec start with the assembly syscall wrappers.
>
> Wait...
> readv(2) defines:
>         ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
>
> But the syscall is defined as:
>
> SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
>                 unsigned long, vlen)
> {
>         return do_readv(fd, vec, vlen, 0);
> }
>
> I'm guessing that nothing actually masks the high bits that come
> from an application that is compiled with clang?
>
> The vlen is 'unsigned long' through the first few calls.
> So unless there is a non-inlined function that takes vlen
> as 'int' the high garbage bits from userspace are kept.

Yeah, that's likely a bug: https://godbolt.org/z/KfsPKs

>
> Which makes it a bug in the kernel C syscall wrappers.
> They need to explicitly mask the high bits of 32bit
> arguments on arm64 but not x86-64.

Why not x86-64? Wouldn't it be *any* LP64 ISA?

Attaching a patch that uses the proper width, but I'm pretty sure
there's still a signedness issue.  Greg, would you mind running this
through the wringer?
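
Purely as an illustration of what a signedness mismatch can look like
(the exact issue is not spelled out here): userspace declares iovcnt as
int, so a negative count reinterpreted as unsigned becomes a very large
one. count_as_unsigned is a hypothetical helper, with int32_t and
uint32_t standing in for int and unsigned int.

```c
#include <stdint.h>

/* What an 'int iovcnt' looks like once it is read as 'unsigned int'. */
uint32_t count_as_unsigned(int32_t iovcnt)
{
	return (uint32_t)iovcnt;	/* conversion is modulo 2^32 */
}
```

With this model, an iovcnt of -1 turns into 4294967295, so any range
check on the count has to be done against the unsigned value.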

>
> What does the ARM EABI say about register parameters?

AAPCS is the ABI for 64b ARM, IIUC, which is the ISA GKH is reporting
the problem against. IIUC, EABI is one of the 32b ABIs.  aarch64 is
LP64 just like x86_64.

--
Thanks,
~Nick Desaulniers

[-- Attachment #2: 0001-fs-fix-up-type-confusion-in-readv-writev.patch --]
[-- Type: application/octet-stream, Size: 4630 bytes --]

From aae26b13ffb9e38bb46b8c85985761b5f196b6f6 Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers@google.com>
Date: Thu, 22 Oct 2020 10:23:47 -0700
Subject: [PATCH] fs: fix up type confusion in readv/writev

The syscall interface doesn't match up with the interface libc is using
or that's defined in the manual pages.

ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

The kernel was defining `iovcnt` as `unsigned long` which is a problem
when userspace understands this to be `int`.

(There's still likely a signedness bug here, but use the proper widths
that import_iovec() expects.)

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 fs/read_write.c    | 10 +++++-----
 fs/splice.c        |  2 +-
 include/linux/fs.h |  2 +-
 lib/iov_iter.c     |  4 ++--
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 19f5c4bf75aa..b858f39a4475 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -890,7 +890,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
 EXPORT_SYMBOL(vfs_iter_write);
 
 ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
-		  unsigned long vlen, loff_t *pos, rwf_t flags)
+		  unsigned int vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
 	struct iovec *iov = iovstack;
@@ -907,7 +907,7 @@ ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
 }
 
 static ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
-		   unsigned long vlen, loff_t *pos, rwf_t flags)
+		   unsigned int vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
 	struct iovec *iov = iovstack;
@@ -925,7 +925,7 @@ static ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
 }
 
 static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
-			unsigned long vlen, rwf_t flags)
+			unsigned int vlen, rwf_t flags)
 {
 	struct fd f = fdget_pos(fd);
 	ssize_t ret = -EBADF;
@@ -1025,13 +1025,13 @@ static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
 }
 
 SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
-		unsigned long, vlen)
+		unsigned int, vlen)
 {
 	return do_readv(fd, vec, vlen, 0);
 }
 
 SYSCALL_DEFINE3(writev, unsigned long, fd, const struct iovec __user *, vec,
-		unsigned long, vlen)
+		unsigned int, vlen)
 {
 	return do_writev(fd, vec, vlen, 0);
 }
diff --git a/fs/splice.c b/fs/splice.c
index 70cc52af780b..7508eccfa143 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -342,7 +342,7 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = {
 EXPORT_SYMBOL(nosteal_pipe_buf_ops);
 
 static ssize_t kernel_readv(struct file *file, const struct kvec *vec,
-			    unsigned long vlen, loff_t offset)
+			    unsigned int vlen, loff_t offset)
 {
 	mm_segment_t old_fs;
 	loff_t pos = offset;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c4ae9cafbbba..211bce5e6e60 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1895,7 +1895,7 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
-		unsigned long, loff_t *, rwf_t);
+		unsigned int, loff_t *, rwf_t);
 extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
 				   loff_t, size_t, unsigned int);
 extern ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 1635111c5bd2..ded9d9c4eb28 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1734,7 +1734,7 @@ struct iovec *iovec_from_user(const struct iovec __user *uvec,
 }
 
 ssize_t __import_iovec(int type, const struct iovec __user *uvec,
-		 unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
+		 unsigned int nr_segs, unsigned int fast_segs, struct iovec **iovp,
 		 struct iov_iter *i, bool compat)
 {
 	ssize_t total_len = 0;
@@ -1803,7 +1803,7 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
  * Return: Negative error code on error, bytes imported on success
  */
 ssize_t import_iovec(int type, const struct iovec __user *uvec,
-		 unsigned nr_segs, unsigned fast_segs,
+		 unsigned int nr_segs, unsigned int fast_segs,
 		 struct iovec **iovp, struct iov_iter *i)
 {
 	return __import_iovec(type, uvec, nr_segs, fast_segs, iovp, i,
-- 
2.29.0.rc1.297.gfa9743e501-goog


^ permalink raw reply related

* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Nick Desaulniers @ 2020-10-22 17:00 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-aio@kvack.org, David Hildenbrand,
	linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
	keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
	Christoph Hellwig, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	kernel-team@android.com, Arnd Bergmann,
	linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, Greg KH,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, David Laight,
	netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201022164040.GV20115@casper.infradead.org>

On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > Wait...
> > readv(2) defines:
> >       ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
>
> It doesn't really matter what the manpage says.  What does the AOSP
> libc header say?

Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38

Theoretically someone could bypass libc to make a system call, right?

>
> > But the syscall is defined as:
> >
> > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> >                 unsigned long, vlen)
> > {
> >         return do_readv(fd, vec, vlen, 0);
> > }
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply
