LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH] vio: make remove callback return void
From: Sukadev Bhattiprolu @ 2021-01-28 19:07 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Tyrel Datwyler, Cristobal Forno, sparclinux, target-devel,
	Paul Mackerras, Breno Leitão, Peter Huewe, Jiri Slaby,
	Herbert Xu, linux-scsi, Nayna Jain, Jason Gunthorpe, Michael Cyr,
	Jakub Kicinski, Arnd Bergmann, James E.J. Bottomley, linux-block,
	Lijun Pan, Matt Mackall, Jens Axboe, Steven Royer,
	Martin K. Petersen, Greg Kroah-Hartman, linux-kernel,
	Jarkko Sakkinen, linux-crypto, netdev, Dany Madden,
	Paulo Flabiano Smorigo, linux-integrity, linuxppc-dev,
	David S. Miller
In-Reply-To: <20210127215010.99954-1-uwe@kleine-koenig.org>


Uwe Kleine-König [uwe@kleine-koenig.org] wrote:
> The driver core ignores the return value of struct bus_type::remove()
> because there is only little that can be done. To simplify the quest to
> make this function return void, let struct vio_driver::remove() return
> void, too. All users already unconditionally return 0, this commit makes
> it obvious that returning an error code is a bad idea and makes it
> obvious for future driver authors that returning an error code isn't
> intended.

Slightly off-topic, should ndo_stop() also return a void? Its return value
seems to be mostly ignored and __dev_close_many() has:

                /*
                 *      Call the device specific close. This cannot fail.
                 *      Only if device is UP
                 *
                 *      We allow it to be called even after a DETACH hot-plug
                 *      event.
                 */
                if (ops->ndo_stop)
                        ops->ndo_stop(dev);
Sukadev

^ permalink raw reply

* Re: [PATCH] vio: make remove callback return void
From: Uwe Kleine-König @ 2021-01-28 20:08 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Tyrel Datwyler, Cristobal Forno, sparclinux, target-devel,
	Paul Mackerras, Breno Leitão, Peter Huewe, Jiri Slaby,
	Herbert Xu, linux-scsi, Nayna Jain, Jason Gunthorpe, Michael Cyr,
	Jakub Kicinski, Arnd Bergmann, James E.J. Bottomley, linux-block,
	Lijun Pan, Matt Mackall, Jens Axboe, Steven Royer,
	Martin K. Petersen, Greg Kroah-Hartman, linux-kernel,
	Jarkko Sakkinen, linux-crypto, netdev, Dany Madden,
	Paulo Flabiano Smorigo, linux-integrity, linuxppc-dev,
	David S. Miller
In-Reply-To: <20210128190750.GA490196@us.ibm.com>


[-- Attachment #1.1: Type: text/plain, Size: 373 bytes --]

Hello Sukadev,

On 1/28/21 8:07 PM, Sukadev Bhattiprolu wrote:
> Slightly off-topic, should ndo_stop() also return a void? Its return value
> seems to be mostly ignored and [...]

I don't know enough about the network stack to tell. Probably it's a 
good idea to start a separate thread for this and address this to the 
netdev list only.

Best regards
Uwe



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH 04/13] module: use RCU to synchronize find_module
From: Thiago Jung Bauermann @ 2021-01-28 20:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Petr Mladek, Jiri Kosina, Andrew Donnellan, linux-kbuild,
	David Airlie, Masahiro Yamada, Josh Poimboeuf, Maarten Lankhorst,
	linux-kernel, Maxime Ripard, live-patching, Michal Marek,
	Joe Lawrence, dri-devel, Thomas Zimmermann, Jessica Yu,
	Frederic Barrat, Daniel Vetter, Miroslav Benes, linuxppc-dev
In-Reply-To: <20210128181421.2279-5-hch@lst.de>


Hi Christoph,

Christoph Hellwig <hch@lst.de> writes:

> diff --git a/kernel/module.c b/kernel/module.c
> index 981302f616b411..6772fb2680eb3e 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -668,7 +668,6 @@ static struct module *find_module_all(const char *name, size_t len,
>  
>  struct module *find_module(const char *name)
>  {
> -	module_assert_mutex();

Does it make sense to replace the assert above with the warn below (untested)?

     RCU_LOCKDEP_WARN(rcu_read_lock_sched_held());

>  	return find_module_all(name, strlen(name), false);
>  }

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* [PATCH net v2] ibmvnic: device remove has higher precedence over reset
From: Lijun Pan @ 2021-01-29  4:34 UTC (permalink / raw)
  To: netdev
  Cc: Lijun Pan, gregkh, julietk, Uwe Kleine-König, paulus, kernel,
	drt, kuba, sukadev, linuxppc-dev, davem

Returning -EBUSY in ibmvnic_remove() does not actually hold the
removal procedure since driver core doesn't care for the return
value (see __device_release_driver() in drivers/base/dd.c
calling dev->bus->remove()) though vio_bus_remove
(in arch/powerpc/platforms/pseries/vio.c) records the
return value and passes it on. [1]

During the device removal precedure, checking for resetting
bit is dropped so that we can continue executing all the
cleanup calls in the rest of the remove function. Otherwise,
it can cause latent memory leaks and kernel crashes.

[1] https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v2: drop v1's deletion of REMOVING check in __ibmvnic_reset.

 drivers/net/ethernet/ibm/ibmvnic.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 9778c83150f1..e19fa8bc763c 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -5438,11 +5438,6 @@ static int ibmvnic_remove(struct vio_dev *dev)
 	unsigned long flags;
 
 	spin_lock_irqsave(&adapter->state_lock, flags);
-	if (test_bit(0, &adapter->resetting)) {
-		spin_unlock_irqrestore(&adapter->state_lock, flags);
-		return -EBUSY;
-	}
-
 	adapter->state = VNIC_REMOVING;
 	spin_unlock_irqrestore(&adapter->state_lock, flags);
 
-- 
2.23.0


^ permalink raw reply related

* Re: [PATCH 04/13] module: use RCU to synchronize find_module
From: Christoph Hellwig @ 2021-01-29  5:10 UTC (permalink / raw)
  To: Thiago Jung Bauermann
  Cc: Petr Mladek, Jiri Kosina, Andrew Donnellan, linux-kbuild,
	David Airlie, Masahiro Yamada, Josh Poimboeuf, Maarten Lankhorst,
	linux-kernel, Maxime Ripard, live-patching, Michal Marek,
	Joe Lawrence, dri-devel, Thomas Zimmermann, Jessica Yu,
	Frederic Barrat, Daniel Vetter, Miroslav Benes, linuxppc-dev,
	Christoph Hellwig
In-Reply-To: <874kj023bj.fsf@manicouagan.localdomain>

On Thu, Jan 28, 2021 at 05:50:56PM -0300, Thiago Jung Bauermann wrote:
> >  struct module *find_module(const char *name)
> >  {
> > -	module_assert_mutex();
> 
> Does it make sense to replace the assert above with the warn below (untested)?
> 
>      RCU_LOCKDEP_WARN(rcu_read_lock_sched_held());

One caller actually holds module_mutex still.  And find_module_all,
which implements the actual logic already asserts that either
module_mutex is held or rcu_read_lock, so I don't tink we need an
extra one here.

^ permalink raw reply

* Re: [PATCH 1/2] crypto: talitos - Work around SEC6 ERRATA (AES-CTR mode data size error)
From: Herbert Xu @ 2021-01-29  5:10 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linux-kernel, linuxppc-dev, David S. Miller, linux-crypto
In-Reply-To: <4b7a870573f485b9fea496b13c9b02d86dd97314.1611169001.git.christophe.leroy@csgroup.eu>

On Wed, Jan 20, 2021 at 06:57:24PM +0000, Christophe Leroy wrote:
> Talitos Security Engine AESU considers any input
> data size that is not a multiple of 16 bytes to be an error.
> This is not a problem in general, except for Counter mode
> that is a stream cipher and can have an input of any size.
> 
> Test Manager for ctr(aes) fails on 4th test vector which has
> a length of 499 while all previous vectors which have a 16 bytes
> multiple length succeed.
> 
> As suggested by Freescale, round up the input data length to the
> nearest 16 bytes.
> 
> Fixes: 5e75ae1b3cef ("crypto: talitos - add new crypto modes")
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  drivers/crypto/talitos.c | 28 ++++++++++++++++------------
>  drivers/crypto/talitos.h |  1 +
>  2 files changed, 17 insertions(+), 12 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] powerpc/fault: fix wrong KUAP fault for IO_URING
From: Zorro Lang @ 2021-01-29  6:52 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: Jens Axboe, linuxppc-dev, Nicholas Piggin
In-Reply-To: <17ae2706-fe95-a5de-b9da-e3480800daf7@csgroup.eu>

On Thu, Jan 28, 2021 at 03:44:21PM +0100, Christophe Leroy wrote:
> 
> 
> Le 28/01/2021 à 15:42, Jens Axboe a écrit :
> > On 1/28/21 6:52 AM, Zorro Lang wrote:
> > > On Wed, Jan 27, 2021 at 08:06:37PM -0700, Jens Axboe wrote:
> > > > On 1/27/21 8:13 PM, Zorro Lang wrote:
> > > > > On Thu, Jan 28, 2021 at 10:18:07AM +1000, Nicholas Piggin wrote:
> > > > > > Excerpts from Jens Axboe's message of January 28, 2021 5:29 am:
> > > > > > > On 1/27/21 9:38 AM, Christophe Leroy wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Le 27/01/2021 à 15:56, Zorro Lang a écrit :
> > > > > > > > > On powerpc, io_uring test hit below KUAP fault on __do_page_fault.
> > > > > > > > > The fail source line is:
> > > > > > > > > 
> > > > > > > > >     if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, is_write)))
> > > > > > > > >         return SIGSEGV;
> > > > > > > > > 
> > > > > > > > > The is_user() is based on user_mod(regs) only. This's not suit for
> > > > > > > > > io_uring, where the helper thread can assume the user app identity
> > > > > > > > > and could perform this fault just fine. So turn to use mm to decide
> > > > > > > > > if this is valid or not.
> > > > > > > > 
> > > > > > > > I don't understand why testing is_user would be an issue. KUAP purpose
> > > > > > > > it to block any unallowed access from kernel to user memory
> > > > > > > > (Equivalent to SMAP on x86). So it really must be based on MSR_PR bit,
> > > > > > > > that is what is_user provides.
> > > > > > > > 
> > > > > > > > If the kernel access is legitimate, kernel should have opened
> > > > > > > > userspace access then you shouldn't get this "Bug: Read fault blocked
> > > > > > > > by KUAP!".
> > > > > > > > 
> > > > > > > > As far as I understand, the fault occurs in
> > > > > > > > iov_iter_fault_in_readable() which calls fault_in_pages_readable() And
> > > > > > > > fault_in_pages_readable() uses __get_user() so it is a legitimate
> > > > > > > > access and you really should get a KUAP fault.
> > > > > > > > 
> > > > > > > > So the problem is somewhere else, I think you proposed patch just
> > > > > > > > hides the problem, it doesn't fix it.
> > > > > > > 
> > > > > > > If we do kthread_use_mm(), can we agree that the user access is valid?
> > > > > > 
> > > > > > Yeah the io uring code is fine, provided it uses the uaccess primitives
> > > > > > like any other kernel code. It's looking more like a an arch/powerpc bug.
> > > > > > 
> > > > > > > We should be able to copy to/from user space, and including faults, if
> > > > > > > that's been done and the new mm assigned. Because it really should be.
> > > > > > > If SMAP was a problem on x86, we would have seen it long ago.
> > > > > > > 
> > > > > > > I'm assuming this may be breakage related to the recent uaccess changes
> > > > > > > related to set_fs and friends? Or maybe recent changes on the powerpc
> > > > > > > side?
> > > > > > > 
> > > > > > > Zorro, did 5.10 work?
> > > > > > 
> > > > > > Would be interesting to know.
> > > > > 
> > > > > Sure Nick and Jens, which 5.10 rc? version do you want to know ? Or any git
> > > > > commit(be the HEAD) in 5.10 phase?
> > > > 
> > > > I forget which versions had what series of this, but 5.10 final - and if
> > > > that fails, then 5.9 final. IIRC, 5.9 was pre any of these changes, and
> > > > 5.10 definitely has them.
> > > 
> > > I justed built linux v5.10 with same .config file, and gave it same test.
> > > v5.10 (HEAD=2c85ebc57b Linux 5.10) can't reproduce this bug:
> > > 
> > > # ./check generic/013 generic/051
> > > FSTYP         -- xfs (non-debug)
> > > PLATFORM      -- Linux/ppc64le ibm-p9z-xxx-xxxx 5.10.0 #3 SMP Thu Jan 28 04:12:14 EST 2021
> > > MKFS_OPTIONS  -- -f -m crc=1,finobt=1,reflink=1,rmapbt=1,bigtime=1,inobtcount=1 /dev/sda3
> > > MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda3 /mnt/xfstests/scratch
> > > 
> > > generic/013 138s ...  77s
> > > generic/051 103s ...  143s
> > > Ran: generic/013 generic/051
> > > Passed all 2 tests
> > 
> > Thanks for testing that, so I think it's safe to conclude that there's a
> > regression in powerpc fault handling for kthreads that use
> > kthread_use_mm in this release. A bisect would definitely find it, but
> > might be pointless if Christophe or Nick already have an idea of what it
> > is.
> > 
> 
> I don't have any idea yet, but I'd be curious to see the vmlinux binary matching the reported Oops.

I just upload the vmlinux and .config file, the vmlinux it too big, I have to
upload it to my google store and share the link as below:

config file: https://drive.google.com/file/d/1pMszboxdjbMPqSNXnMH-1UCZC-vtDnI9/view?usp=sharing
vmlinux: https://drive.google.com/file/d/1s7g2eBPAFFV61aM2dO9bvVTERGQ9mLYk/view?usp=sharing

I used latest upstream mainline linux, HEAD commit is:
76c057c84d (HEAD -> master, origin/master, origin/HEAD) Merge branch 'parisc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

The test failed on this kernel as:

# dmesg
[   96.200296] ------------[ cut here ]------------
[   96.200304] Bug: Read fault blocked by KUAP!
[   96.200309] WARNING: CPU: 3 PID: 1876 at arch/powerpc/mm/fault.c:229 bad_kernel_fault+0x180/0x310
[   96.200323] Modules linked in: bonding rfkill sunrpc pseries_rng uio_pdrv_genirq uio drm fuse drm_panel_orientation_quirks ip_tables xfs libcrc32c sd_mod t10_pi xts ibmvscsi ibmveth scsi_transport_srp vmx_crypto
[   96.200372] CPU: 3 PID: 1876 Comm: io_wqe_worker-0 Tainted: G        W         5.11.0-rc5+ #5
[   96.200380] NIP:  c00000000008f8a0 LR: c00000000008f89c CTR: 0000000000000000
[   96.200386] REGS: c00000000d3aafd0 TRAP: 0700   Tainted: G        W          (5.11.0-rc5+)
[   96.200393] MSR:  8000000000021033 <SF,ME,IR,DR,RI,LE>  CR: 48082204  XER: 00000005
[   96.200416] CFAR: c00000000015ddac IRQMASK: 1 
               GPR00: c00000000008f89c c00000000d3ab270 c000000002116900 0000000000000020 
               GPR04: c000000001bec250 0000000000000001 00000001fbb80000 0000000000000027 
               GPR08: 0000000000000001 0000000000000000 c000000020fbba00 0000000000000001 
               GPR12: 0000000000002000 c00000001ecaae00 c00000000019dae8 c000000008d48040 
               GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
               GPR20: c0000000012d9650 fcffffffffffffff c000000002164018 c000000001262da0 
               GPR24: 0000000000000000 0000000000000000 0000000000000300 c000000010c27b80 
               GPR28: 0000000000200000 0000000000000000 0000010007ffce60 c00000000d3ab470 
[   96.200521] NIP [c00000000008f8a0] bad_kernel_fault+0x180/0x310
[   96.200528] LR [c00000000008f89c] bad_kernel_fault+0x17c/0x310
[   96.200535] Call Trace:
[   96.200539] [c00000000d3ab270] [c00000000008f89c] bad_kernel_fault+0x17c/0x310 (unreliable)
[   96.200551] [c00000000d3ab2f0] [c000000000090494] __do_page_fault+0x5f4/0x900
[   96.200561] [c00000000d3ab3b0] [c0000000000907dc] do_page_fault+0x3c/0x120
[   96.200570] [c00000000d3ab400] [c00000000000c748] handle_page_fault+0x10/0x2c
[   96.200579] --- interrupt: 300 at fault_in_pages_readable+0x104/0x350
[   96.200579] --- interrupt: 300 at fault_in_pages_readable+0x104/0x350
[   96.200586] NIP:  c000000000849424 LR: c00000000084952c CTR: c0000000006984a0
[   96.200592] REGS: c00000000d3ab470 TRAP: 0300   Tainted: G        W          (5.11.0-rc5+)
[   96.200598] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44082804  XER: 00000001
[   96.200628] CFAR: c000000000010330 DAR: 0000010007ffce60 DSISR: 00200000 IRQMASK: 0 
               GPR00: c00000000084952c c00000000d3ab710 c000000002116900 0000000000000000 
               GPR04: c000000010c27ce0 0000000000000001 00000001fbb80000 0000000000010000 
               GPR08: 00000000271cd0a4 0000000000000200 0000000000000200 0000000000000000 
               GPR12: 0000000000002000 c00000001ecaae00 c00000000019dae8 c000000008d48040 
               GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
               GPR20: c0000000012d9650 fcffffffffffffff c000000002164018 c000000001262da0 
               GPR24: c000000020fbba00 bfffffffffffffff 0000000000000000 a8aaaaaaaaaaaaaa 
               GPR28: 0000010008005d71 c00000000bec3e00 0000000000000000 0000010007ffce60 
[   96.200734] NIP [c000000000849424] fault_in_pages_readable+0x104/0x350
[   96.200741] LR [c00000000084952c] fault_in_pages_readable+0x20c/0x350
[   96.200747] --- interrupt: 300
[   96.200752] [c00000000d3ab7a0] [c0000000008496d0] iov_iter_fault_in_readable+0x60/0x120
[   96.200761] [c00000000d3ab7e0] [c000000000698558] iomap_write_actor+0xb8/0x270
[   96.200771] [c00000000d3ab890] [c000000000693554] iomap_apply+0x2b4/0x740
[   96.200780] [c00000000d3ab9a0] [c000000000693dc0] iomap_file_buffered_write+0xa0/0x120
[   96.200790] [c00000000d3ab9f0] [c008000001d3efec] xfs_file_buffered_aio_write+0x354/0x590 [xfs]
[   96.200870] [c00000000d3aba90] [c0000000006691e4] io_write+0x104/0x4b0
[   96.200884] [c00000000d3abbb0] [c00000000066be54] io_issue_sqe+0x3d4/0xf50
[   96.200897] [c00000000d3abc60] [c00000000066f250] io_wq_submit_work+0xb0/0x2f0
[   96.200911] [c00000000d3abcb0] [c0000000006738a8] io_worker_handle_work+0x248/0x4a0
[   96.200925] [c00000000d3abd30] [c000000000673d28] io_wqe_worker+0x228/0x2a0
[   96.200939] [c00000000d3abda0] [c00000000019dc94] kthread+0x1b4/0x1c0
[   96.200950] [c00000000d3abe10] [c00000000000daf0] ret_from_kernel_thread+0x5c/0x6c
[   96.200959] Instruction dump:
[   96.200965] e87f0100 4810b155 60000000 2c230000 4182ffa8 409200ac 3c82ff15 38847e38 
[   96.200987] 3c62ff15 38637ed0 480ce4ad 60000000 <0fe00000> e8010090 38210080 38600001 
[   96.201008] irq event stamp: 46
[   96.201013] hardirqs last  enabled at (45): [<c0000000005428c4>] __slab_free+0x414/0x610
[   96.201021] hardirqs last disabled at (46): [<c000000000008a04>] data_access_common_virt+0x1a4/0x1c0
[   96.201030] softirqs last  enabled at (0): [<c00000000015ae68>] copy_process+0x688/0x1600
[   96.201038] softirqs last disabled at (0): [<0000000000000000>] 0x0
[   96.201045] ---[ end trace c2373fad985a304b ]---

# ./scripts/faddr2line vmlinux bad_kernel_fault+0x180/0x310
bad_kernel_fault+0x180/0x310:
bad_kernel_fault at arch/powerpc/mm/fault.c:229 (discriminator 6)

    217         // Read/write fault blocked by KUAP is bad, it can never succeed.
    218         if (bad_kuap_fault(regs, address, is_write)) {
    219                 pr_crit_ratelimited("Kernel attempted to %s user page (%lx) - exploit attempt? (uid: %d)\n",
    220                                     is_write ? "write" : "read", address,
    221                                     from_kuid(&init_user_ns, current_uid()));
    222 
    223                 // Fault on user outside of certain regions (eg. copy_tofrom_user()) is bad
    224                 if (!search_exception_tables(regs->nip))
    225                         return true;
    226 
    227                 // Read/write fault in a valid region (the exception table search passed
    228                 // above), but blocked by KUAP is bad, it can never succeed.
    229                 return WARN(true, "Bug: %s fault blocked by KUAP!", is_write ? "Write" : "Read");

Thanks,
Zorro

> 
> Christophe
> 


^ permalink raw reply

* [PATCH v2] powerpc/sstep: Fix array out of bound warning
From: Ravi Bangoria @ 2021-01-29  7:17 UTC (permalink / raw)
  To: mpe; +Cc: naveen.n.rao, ravi.bangoria, paulus, linuxppc-dev, jniethe5

Compiling kernel with -Warray-bounds throws below warning:

  In function 'emulate_vsx_store':
  warning: array subscript is above array bounds [-Warray-bounds]
  buf.d[2] = byterev_8(reg->d[1]);
  ~~~~~^~~
  buf.d[3] = byterev_8(reg->d[0]);
  ~~~~~^~~

Fix it by using temporary array variable 'union vsx_reg buf32[]' in
that code block. Also, with element_size = 32, 'union vsx_reg *reg'
is an array of size 2. So, use 'reg' as an array instead of pointer
in the same code block.

Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions")
Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
v1: http://lore.kernel.org/r/20210115061620.692500-1-ravi.bangoria@linux.ibm.com
v1->v2:
  - Change code only in the affected block

 arch/powerpc/lib/sstep.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bf7a7d62ae8b..ede093e96234 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -818,13 +818,15 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,
 			break;
 		if (rev) {
 			/* reverse 32 bytes */
-			buf.d[0] = byterev_8(reg->d[3]);
-			buf.d[1] = byterev_8(reg->d[2]);
-			buf.d[2] = byterev_8(reg->d[1]);
-			buf.d[3] = byterev_8(reg->d[0]);
-			reg = &buf;
+			union vsx_reg buf32[2];
+			buf32[0].d[0] = byterev_8(reg[1].d[1]);
+			buf32[0].d[1] = byterev_8(reg[1].d[0]);
+			buf32[1].d[0] = byterev_8(reg[0].d[1]);
+			buf32[1].d[1] = byterev_8(reg[0].d[0]);
+			memcpy(mem, buf32, size);
+		} else {
+			memcpy(mem, reg, size);
 		}
-		memcpy(mem, reg, size);
 		break;
 	case 16:
 		/* stxv, stxvx, stxvl, stxvll */
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH] powerpc/sstep: Fix array out of bound warning
From: Ravi Bangoria @ 2021-01-29  7:18 UTC (permalink / raw)
  To: Naveen N. Rao; +Cc: Ravi Bangoria, paulus, jniethe5, naveen.n.rao, linuxppc-dev
In-Reply-To: <20210128172043.GB117@DESKTOP-TDPLP67.localdomain>



On 1/28/21 10:50 PM, Naveen N. Rao wrote:
> On 2021/01/15 11:46AM, Ravi Bangoria wrote:
>> Compiling kernel with -Warray-bounds throws below warning:
>>
>>    In function 'emulate_vsx_store':
>>    warning: array subscript is above array bounds [-Warray-bounds]
>>    buf.d[2] = byterev_8(reg->d[1]);
>>    ~~~~~^~~
>>    buf.d[3] = byterev_8(reg->d[0]);
>>    ~~~~~^~~
>>
>> Fix it by converting local variable 'union vsx_reg buf' into an array.
>> Also consider function argument 'union vsx_reg *reg' as array instead
>> of pointer because callers are actually passing an array to it.
> 
> I think you should change the function prototype to reflect this.
> 
> However, while I agree with this change in principle, it looks to be a
> lot of code churn for a fairly narrow use. Perhaps we should just
> address the specific bug. Something like the below (not tested)?

Yes, this would serve the same purpose and it's more compact as well. Sent v2.

> 
> @@ -818,13 +818,15 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,
>                          break;
>                  if (rev) {
>                          /* reverse 32 bytes */
> -                       buf.d[0] = byterev_8(reg->d[3]);
> -                       buf.d[1] = byterev_8(reg->d[2]);
> -                       buf.d[2] = byterev_8(reg->d[1]);
> -                       buf.d[3] = byterev_8(reg->d[0]);
> -                       reg = &buf;
> +                       union vsx_reg buf32[2];
> +                       buf32[0].d[0] = byterev_8(reg[1].d[1]);
> +                       buf32[0].d[1] = byterev_8(reg[1].d[0]);
> +                       buf32[1].d[0] = byterev_8(reg[0].d[1]);
> +                       buf32[1].d[1] = byterev_8(reg[0].d[0]);
> +                       memcpy(mem, buf32, size);
> +               } else {
> +                       memcpy(mem, reg, size);
>                  }
> -               memcpy(mem, reg, size);
>                  break;
>          case 16:
>                  /* stxv, stxvx, stxvl, stxvll */
> 
> 
> - Naveen
> 

^ permalink raw reply

* Re: [PATCH] cxl: Simplify bool conversion
From: Frederic Barrat @ 2021-01-29  9:34 UTC (permalink / raw)
  To: Yang Li; +Cc: gregkh, linuxppc-dev, ajd, arnd, linux-kernel
In-Reply-To: <1611908705-98507-1-git-send-email-yang.lee@linux.alibaba.com>



On 29/01/2021 09:25, Yang Li wrote:
> Fix the following coccicheck warning:
> ./drivers/misc/cxl/sysfs.c:181:48-53: WARNING: conversion to bool not
> needed here
> 
> Reported-by: Abaci Robot <abaci@linux.alibaba.com>
> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
> ---


Thanks!
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>


>   drivers/misc/cxl/sysfs.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
> index d97a243..c173a5e 100644
> --- a/drivers/misc/cxl/sysfs.c
> +++ b/drivers/misc/cxl/sysfs.c
> @@ -178,7 +178,7 @@ static ssize_t perst_reloads_same_image_store(struct device *device,
>   	if ((rc != 1) || !(val == 1 || val == 0))
>   		return -EINVAL;
>   
> -	adapter->perst_same_image = (val == 1 ? true : false);
> +	adapter->perst_same_image = (val == 1);
>   	return count;
>   }
>   
> 

^ permalink raw reply

* Re: [PATCH 05/13] kallsyms: refactor {,module_}kallsyms_on_each_symbol
From: Petr Mladek @ 2021-01-29  9:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jiri Kosina, Andrew Donnellan, linux-kbuild, David Airlie,
	Masahiro Yamada, Josh Poimboeuf, Maarten Lankhorst, linux-kernel,
	Maxime Ripard, live-patching, Michal Marek, Joe Lawrence,
	dri-devel, Thomas Zimmermann, Jessica Yu, Frederic Barrat,
	Daniel Vetter, Miroslav Benes, linuxppc-dev
In-Reply-To: <20210128181421.2279-6-hch@lst.de>

On Thu 2021-01-28 19:14:13, Christoph Hellwig wrote:
> Require an explicit call to module_kallsyms_on_each_symbol to look
> for symbols in modules instead of the call from kallsyms_on_each_symbol,
> and acquire module_mutex inside of module_kallsyms_on_each_symbol instead
> of leaving that up to the caller.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  kernel/kallsyms.c       | 6 +++++-
>  kernel/livepatch/core.c | 6 +-----
>  kernel/module.c         | 8 ++++----
>  3 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index fe9de067771c34..a0d3f0865916f9 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -177,6 +177,10 @@ unsigned long kallsyms_lookup_name(const char *name)
>  	return module_kallsyms_lookup_name(name);
>  }
>  
> +/*
> + * Iterate over all symbols in vmlinux.  For symbols from modules use
> + * module_kallsyms_on_each_symbol instead.
> + */
>  int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
>  				      unsigned long),
>  			    void *data)
> @@ -192,7 +196,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
>  		if (ret != 0)
>  			return ret;
>  	}
> -	return module_kallsyms_on_each_symbol(fn, data);
> +	return 0;
>  }
>  
>  static unsigned long get_symbol_pos(unsigned long addr,
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 262cd9b003b9f0..f591dac5e86ef4 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -164,12 +164,8 @@ static int klp_find_object_symbol(const char *objname, const char *name,
>  		.pos = sympos,
>  	};
>  
> -	mutex_lock(&module_mutex);
> -	if (objname)
> +	if (objname || !kallsyms_on_each_symbol(klp_find_callback, &args))
>  		module_kallsyms_on_each_symbol(klp_find_callback, &args);
> -	else
> -		kallsyms_on_each_symbol(klp_find_callback, &args);
> -	mutex_unlock(&module_mutex);

This change is not needed. (objname == NULL) means that we are
interested only in symbols in "vmlinux".

module_kallsyms_on_each_symbol(klp_find_callback, &args)
will always fail when objname == NULL.

Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH 04/13] module: use RCU to synchronize find_module
From: Petr Mladek @ 2021-01-29 10:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jiri Kosina, Andrew Donnellan, linux-kbuild, David Airlie,
	Masahiro Yamada, Josh Poimboeuf, Maarten Lankhorst, linux-kernel,
	Maxime Ripard, live-patching, Michal Marek, Joe Lawrence,
	dri-devel, Thomas Zimmermann, Jessica Yu, Frederic Barrat,
	Daniel Vetter, Miroslav Benes, linuxppc-dev
In-Reply-To: <20210128181421.2279-5-hch@lst.de>

On Thu 2021-01-28 19:14:12, Christoph Hellwig wrote:
> Allow for a RCU-sched critical section around find_module, following
> the lower level find_module_all helper, and switch the two callers
> outside of module.c to use such a RCU-sched critical section instead
> of module_mutex.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

It looks good and safe.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH v2] powerpc/sstep: Fix array out of bound warning
From: Naveen N. Rao @ 2021-01-29 10:04 UTC (permalink / raw)
  To: Ravi Bangoria; +Cc: naveen.n.rao, paulus, linuxppc-dev, jniethe5
In-Reply-To: <20210129071745.111466-1-ravi.bangoria@linux.ibm.com>

On 2021/01/29 12:47PM, Ravi Bangoria wrote:
> Compiling kernel with -Warray-bounds throws below warning:
> 
>   In function 'emulate_vsx_store':
>   warning: array subscript is above array bounds [-Warray-bounds]
>   buf.d[2] = byterev_8(reg->d[1]);
>   ~~~~~^~~
>   buf.d[3] = byterev_8(reg->d[0]);
>   ~~~~~^~~
> 
> Fix it by using temporary array variable 'union vsx_reg buf32[]' in
> that code block. Also, with element_size = 32, 'union vsx_reg *reg'
> is an array of size 2. So, use 'reg' as an array instead of pointer
> in the same code block.
> 
> Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions")
> Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> ---
> v1: http://lore.kernel.org/r/20210115061620.692500-1-ravi.bangoria@linux.ibm.com
> v1->v2:
>   - Change code only in the affected block

I don't see the compiler warning with -Warray-bounds with this patch:
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

- Naveen


^ permalink raw reply

* Re: [PATCH 0/3] sched: Task priority related cleanups
From: Peter Zijlstra @ 2021-01-29 10:55 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Juri Lelli, Hillf Danton, Vincent Guittot, Arnd Bergmann,
	linux-kernel, Steven Rostedt, Ingo Molnar, Jeremy Kerr,
	linuxppc-dev
In-Reply-To: <20210128131040.296856-1-dietmar.eggemann@arm.com>

On Thu, Jan 28, 2021 at 02:10:37PM +0100, Dietmar Eggemann wrote:

> Dietmar Eggemann (3):
>   sched: Remove MAX_USER_RT_PRIO
>   sched: Remove USER_PRIO, TASK_USER_PRIO and MAX_USER_PRIO
>   sched/core: Update task_prio() function header

Thanks!

^ permalink raw reply

* Re: [PATCH 1/2] powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o
From: Michael Ellerman @ 2021-01-29 11:41 UTC (permalink / raw)
  To: Masahiro Yamada, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev
  Cc: Ravi Bangoria, Linux Kernel Mailing List, Nicholas Piggin,
	Oliver O'Halloran, Greentime Hu, Michal Suchanek,
	Ard Biesheuvel, Daniel Axtens
In-Reply-To: <CAK7LNASEVM8e5hohV4jbXOvMxSJ_Prm3es+fhezPkRc6UL=vdw@mail.gmail.com>

Masahiro Yamada <masahiroy@kernel.org> writes:
> On Thu, Dec 24, 2020 at 2:12 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>>
>> vgettimeofday.o is unnecessarily rebuilt. Adding it to 'targets' is not
>> enough to fix the issue. Kbuild is correctly rebuilding it because the
>> command line is changed.
>>
>> PowerPC builds each vdso directory twice; first in vdso_prepare to
>> generate vdso{32,64}-offsets.h, second as part of the ordinary build
>> process to embed vdso{32,64}.so.dbg into the kernel.
>>
>> The problem shows up when CONFIG_PPC_WERROR=y due to the following line
>> in arch/powerpc/Kbuild:
>>
>>   subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
>>
>> In the preparation stage, Kbuild directly visits the vdso directories,
>> hence it does not inherit subdir-ccflags-y. In the second descend,
>> Kbuild adds -Werror, which results in the command line flipping
>> with/without -Werror.
>>
>> It implies a potential danger; if a more critical flag that would impact
>> the resulted vdso, the offsets recorded in the headers might be different
>> from real offsets in the embedded vdso images.
>>
>> Removing the unneeded second descend solves the problem.
>>
>> Link: https://lore.kernel.org/linuxppc-dev/87tuslxhry.fsf@mpe.ellerman.id.au/
>> Reported-by: Michael Ellerman <mpe@ellerman.id.au>
>> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
>> ---
>
>
> Michael, please take a  look at this.
>
> The unneeded rebuild problem is still remaining.

Sorry missed those.

I guess I'll pick these up as fixes for v5.10.

cheers

^ permalink raw reply

* Re: [PATCH] powerpc: remove unneeded semicolons
From: Michael Ellerman @ 2021-01-29 11:48 UTC (permalink / raw)
  To: Chengyang Fan; +Cc: joe, linuxppc-dev
In-Reply-To: <20210125095338.1719405-1-cy.fan@huawei.com>

Chengyang Fan <cy.fan@huawei.com> writes:
> Remove superfluous semicolons after function definitions.

Is there a good reason why?

I realise they're superfluous, but they're also harmless as far as I'm
aware.

cheers

>  arch/powerpc/include/asm/book3s/32/mmu-hash.h       |  2 +-
>  arch/powerpc/include/asm/book3s/64/mmu.h            |  2 +-
>  arch/powerpc/include/asm/book3s/64/tlbflush-radix.h |  2 +-
>  arch/powerpc/include/asm/book3s/64/tlbflush.h       |  2 +-
>  arch/powerpc/include/asm/firmware.h                 |  2 +-
>  arch/powerpc/include/asm/kvm_ppc.h                  |  6 +++---
>  arch/powerpc/include/asm/paca.h                     |  6 +++---
>  arch/powerpc/include/asm/rtas.h                     |  2 +-
>  arch/powerpc/include/asm/setup.h                    |  6 +++---
>  arch/powerpc/include/asm/simple_spinlock.h          |  4 ++--
>  arch/powerpc/include/asm/smp.h                      |  2 +-
>  arch/powerpc/include/asm/xmon.h                     |  4 ++--
>  arch/powerpc/kernel/prom.c                          |  2 +-
>  arch/powerpc/kernel/setup.h                         | 12 ++++++------
>  arch/powerpc/platforms/powernv/subcore.h            |  2 +-
>  arch/powerpc/platforms/pseries/pseries.h            |  2 +-
>  16 files changed, 29 insertions(+), 29 deletions(-)

^ permalink raw reply

* Re: [PATCH] powerpc/fault: fix wrong KUAP fault for IO_URING
From: Christophe Leroy @ 2021-01-29 12:26 UTC (permalink / raw)
  To: Zorro Lang, Aneesh Kumar K.V; +Cc: Jens Axboe, linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210129065220.GS14354@localhost.localdomain>

+Aneesh

Le 29/01/2021 à 07:52, Zorro Lang a écrit :
> On Thu, Jan 28, 2021 at 03:44:21PM +0100, Christophe Leroy wrote:
>>
>>
>> Le 28/01/2021 à 15:42, Jens Axboe a écrit :
>>> On 1/28/21 6:52 AM, Zorro Lang wrote:
>>>> On Wed, Jan 27, 2021 at 08:06:37PM -0700, Jens Axboe wrote:
>>>>> On 1/27/21 8:13 PM, Zorro Lang wrote:
>>>>>> On Thu, Jan 28, 2021 at 10:18:07AM +1000, Nicholas Piggin wrote:
>>>>>>> Excerpts from Jens Axboe's message of January 28, 2021 5:29 am:
>>>>>>>> On 1/27/21 9:38 AM, Christophe Leroy wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Le 27/01/2021 à 15:56, Zorro Lang a écrit :
>>>>>>>>>> On powerpc, io_uring test hit below KUAP fault on __do_page_fault.
>>>>>>>>>> The fail source line is:
>>>>>>>>>>
>>>>>>>>>>      if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, is_write)))
>>>>>>>>>>          return SIGSEGV;
>>>>>>>>>>
>>>>>>>>>> The is_user() is based on user_mod(regs) only. This's not suit for
>>>>>>>>>> io_uring, where the helper thread can assume the user app identity
>>>>>>>>>> and could perform this fault just fine. So turn to use mm to decide
>>>>>>>>>> if this is valid or not.
>>>>>>>>>
>>>>>>>>> I don't understand why testing is_user would be an issue. KUAP purpose
>>>>>>>>> it to block any unallowed access from kernel to user memory
>>>>>>>>> (Equivalent to SMAP on x86). So it really must be based on MSR_PR bit,
>>>>>>>>> that is what is_user provides.
>>>>>>>>>
>>>>>>>>> If the kernel access is legitimate, kernel should have opened
>>>>>>>>> userspace access then you shouldn't get this "Bug: Read fault blocked
>>>>>>>>> by KUAP!".
>>>>>>>>>
>>>>>>>>> As far as I understand, the fault occurs in
>>>>>>>>> iov_iter_fault_in_readable() which calls fault_in_pages_readable() And
>>>>>>>>> fault_in_pages_readable() uses __get_user() so it is a legitimate
>>>>>>>>> access and you really should get a KUAP fault.
>>>>>>>>>
>>>>>>>>> So the problem is somewhere else, I think you proposed patch just
>>>>>>>>> hides the problem, it doesn't fix it.
>>>>>>>>
>>>>>>>> If we do kthread_use_mm(), can we agree that the user access is valid?
>>>>>>>
>>>>>>> Yeah the io uring code is fine, provided it uses the uaccess primitives
>>>>>>> like any other kernel code. It's looking more like a an arch/powerpc bug.
>>>>>>>
>>>>>>>> We should be able to copy to/from user space, and including faults, if
>>>>>>>> that's been done and the new mm assigned. Because it really should be.
>>>>>>>> If SMAP was a problem on x86, we would have seen it long ago.
>>>>>>>>
>>>>>>>> I'm assuming this may be breakage related to the recent uaccess changes
>>>>>>>> related to set_fs and friends? Or maybe recent changes on the powerpc
>>>>>>>> side?
>>>>>>>>
>>>>>>>> Zorro, did 5.10 work?
>>>>>>>
>>>>>>> Would be interesting to know.
>>>>>>
>>>>>> Sure Nick and Jens, which 5.10 rc? version do you want to know ? Or any git
>>>>>> commit(be the HEAD) in 5.10 phase?
>>>>>
>>>>> I forget which versions had what series of this, but 5.10 final - and if
>>>>> that fails, then 5.9 final. IIRC, 5.9 was pre any of these changes, and
>>>>> 5.10 definitely has them.
>>>>
>>>> I justed built linux v5.10 with same .config file, and gave it same test.
>>>> v5.10 (HEAD=2c85ebc57b Linux 5.10) can't reproduce this bug:
>>>>
>>>> # ./check generic/013 generic/051
>>>> FSTYP         -- xfs (non-debug)
>>>> PLATFORM      -- Linux/ppc64le ibm-p9z-xxx-xxxx 5.10.0 #3 SMP Thu Jan 28 04:12:14 EST 2021
>>>> MKFS_OPTIONS  -- -f -m crc=1,finobt=1,reflink=1,rmapbt=1,bigtime=1,inobtcount=1 /dev/sda3
>>>> MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda3 /mnt/xfstests/scratch
>>>>
>>>> generic/013 138s ...  77s
>>>> generic/051 103s ...  143s
>>>> Ran: generic/013 generic/051
>>>> Passed all 2 tests
>>>
>>> Thanks for testing that, so I think it's safe to conclude that there's a
>>> regression in powerpc fault handling for kthreads that use
>>> kthread_use_mm in this release. A bisect would definitely find it, but
>>> might be pointless if Christophe or Nick already have an idea of what it
>>> is.
>>>
>>
>> I don't have any idea yet, but I'd be curious to see the vmlinux binary matching the reported Oops.
> 
> I just upload the vmlinux and .config file, the vmlinux it too big, I have to
> upload it to my google store and share the link as below:
> 
> config file: https://drive.google.com/file/d/1pMszboxdjbMPqSNXnMH-1UCZC-vtDnI9/view?usp=sharing
> vmlinux: https://drive.google.com/file/d/1s7g2eBPAFFV61aM2dO9bvVTERGQ9mLYk/view?usp=sharing
> 
> I used latest upstream mainline linux, HEAD commit is:
> 76c057c84d (HEAD -> master, origin/master, origin/HEAD) Merge branch 'parisc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> 
> The test failed on this kernel as:
> 
> # dmesg
> [   96.200296] ------------[ cut here ]------------
> [   96.200304] Bug: Read fault blocked by KUAP!
> [   96.200309] WARNING: CPU: 3 PID: 1876 at arch/powerpc/mm/fault.c:229 bad_kernel_fault+0x180/0x310

> [   96.200734] NIP [c000000000849424] fault_in_pages_readable+0x104/0x350
> [   96.200741] LR [c00000000084952c] fault_in_pages_readable+0x20c/0x350
> [   96.200747] --- interrupt: 300


Problem happens in a section where userspace access is supposed to be granted, so the patch you 
proposed is definitely not the right fix.

c000000000849408:	2c 01 00 4c 	isync
c00000000084940c:	a6 03 3d 7d 	mtspr   29,r9  <== granting userspace access permission
c000000000849410:	2c 01 00 4c 	isync
c000000000849414:	00 00 36 e9 	ld      r9,0(r22)
c000000000849418:	20 00 29 81 	lwz     r9,32(r9)
c00000000084941c:	00 02 29 71 	andi.   r9,r9,512
c000000000849420:	78 d3 5e 7f 	mr      r30,r26
==> c000000000849424:	00 00 bf 8b 	lbz     r29,0(r31)  <== accessing userspace
c000000000849428:	10 00 82 41 	beq     c000000000849438 <fault_in_pages_readable+0x118>
c00000000084942c:	2c 01 00 4c 	isync
c000000000849430:	a6 03 bd 7e 	mtspr   29,r21  <== clearing userspace access permission
c000000000849434:	2c 01 00 4c 	isync

My first guess is that the problem is linked to the following function, see the comment

/*
  * For kernel thread that doesn't have thread.regs return
  * default AMR/IAMR values.
  */
static inline u64 current_thread_amr(void)
{
	if (current->thread.regs)
		return current->thread.regs->amr;
	return AMR_KUAP_BLOCKED;
}

Above function was introduced by commit 48a8ab4eeb82 ("powerpc/book3s64/pkeys: Don't update SPRN_AMR 
when in kernel mode")

Aneesh, any idea ?

Christophe

^ permalink raw reply

* Re: [PATCH 03/13] module: unexport find_module and module_mutex
From: Miroslav Benes @ 2021-01-29 15:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Petr Mladek, Jiri Kosina, Andrew Donnellan, linux-kbuild,
	David Airlie, Masahiro Yamada, Josh Poimboeuf, Maarten Lankhorst,
	linux-kernel, Maxime Ripard, live-patching, Michal Marek,
	Joe Lawrence, dri-devel, Thomas Zimmermann, Jessica Yu,
	Frederic Barrat, Daniel Vetter, linuxppc-dev
In-Reply-To: <20210128181421.2279-4-hch@lst.de>

On Thu, 28 Jan 2021, Christoph Hellwig wrote:

> find_module is not used by modular code any more, and random driver code
> has no business calling it to start with.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

M

^ permalink raw reply

* Re: [PATCH 04/13] module: use RCU to synchronize find_module
From: Miroslav Benes @ 2021-01-29 15:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Petr Mladek, Jiri Kosina, Andrew Donnellan, linux-kbuild,
	David Airlie, Masahiro Yamada, Josh Poimboeuf, Maarten Lankhorst,
	linux-kernel, Maxime Ripard, live-patching, Michal Marek,
	Joe Lawrence, dri-devel, Thomas Zimmermann, Jessica Yu,
	Frederic Barrat, Daniel Vetter, linuxppc-dev
In-Reply-To: <20210128181421.2279-5-hch@lst.de>

On Thu, 28 Jan 2021, Christoph Hellwig wrote:

> Allow for a RCU-sched critical section around find_module, following
> the lower level find_module_all helper, and switch the two callers
> outside of module.c to use such a RCU-sched critical section instead
> of module_mutex.

That's a nice idea.
 
> @@ -57,7 +58,7 @@ static void klp_find_object_module(struct klp_object *obj)
>  	if (!klp_is_module(obj))
>  		return;
>  
> -	mutex_lock(&module_mutex);
> +	rcu_read_lock_sched();
>  	/*
>  	 * We do not want to block removal of patched modules and therefore
>  	 * we do not take a reference here. The patches are removed by
> @@ -74,7 +75,7 @@ static void klp_find_object_module(struct klp_object *obj)
>  	if (mod && mod->klp_alive)

RCU always baffles me a bit, so I'll ask. Don't we need 
rcu_dereference_sched() here? "mod" comes from a RCU-protected list, so I 
wonder.

>  		obj->mod = mod;
>  
> -	mutex_unlock(&module_mutex);
> +	rcu_read_unlock_sched();
>  }

Thanks

Miroslav

^ permalink raw reply

* [PATCH AUTOSEL 5.10 18/41] scsi: ibmvfc: Set default timeout to avoid crash during migration
From: Sasha Levin @ 2021-01-29 15:36 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Brian King, Sasha Levin, linuxppc-dev, linux-scsi,
	Martin K . Petersen
In-Reply-To: <20210129153713.1592185-1-sashal@kernel.org>

From: Brian King <brking@linux.vnet.ibm.com>

[ Upstream commit 764907293edc1af7ac857389af9dc858944f53dc ]

While testing live partition mobility, we have observed occasional crashes
of the Linux partition. What we've seen is that during the live migration,
for specific configurations with large amounts of memory, slow network
links, and workloads that are changing memory a lot, the partition can end
up being suspended for 30 seconds or longer. This resulted in the following
scenario:

CPU 0                          CPU 1
-------------------------------  ----------------------------------
scsi_queue_rq                    migration_store
 -> blk_mq_start_request          -> rtas_ibm_suspend_me
  -> blk_add_timer                 -> on_each_cpu(rtas_percpu_suspend_me
              _______________________________________V
             |
             V
    -> IPI from CPU 1
     -> rtas_percpu_suspend_me
                                     -> __rtas_suspend_last_cpu

-- Linux partition suspended for > 30 seconds --
                                      -> for_each_online_cpu(cpu)
                                           plpar_hcall_norets(H_PROD
 -> scsi_dispatch_cmd
                                      -> scsi_times_out
                                       -> scsi_abort_command
                                        -> queue_delayed_work
  -> ibmvfc_queuecommand_lck
   -> ibmvfc_send_event
    -> ibmvfc_send_crq
     - returns H_CLOSED
   <- returns SCSI_MLQUEUE_HOST_BUSY
-> __blk_mq_requeue_request

                                      -> scmd_eh_abort_handler
                                       -> scsi_try_to_abort_cmd
                                         - returns SUCCESS
                                       -> scsi_queue_insert

Normally, the SCMD_STATE_COMPLETE bit would protect against the command
completion and the timeout, but that doesn't work here, since we don't
check that at all in the SCSI_MLQUEUE_HOST_BUSY path.

In this case we end up calling scsi_queue_insert on a request that has
already been queued, or possibly even freed, and we crash.

The patch below simply increases the default I/O timeout to avoid this race
condition. This is also the timeout value that nearly all IBM SAN storage
recommends setting as the default value.

Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 070cf516b98fe..57c9a71fa33a7 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2957,8 +2957,10 @@ static int ibmvfc_slave_configure(struct scsi_device *sdev)
 	unsigned long flags = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
-	if (sdev->type == TYPE_DISK)
+	if (sdev->type == TYPE_DISK) {
 		sdev->allow_restart = 1;
+		blk_queue_rq_timeout(sdev->request_queue, 120 * HZ);
+	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	return 0;
 }
-- 
2.27.0


^ permalink raw reply related

* [PATCH AUTOSEL 5.10 28/41] selftests/powerpc: Only test lwm/stmw on big endian
From: Sasha Levin @ 2021-01-29 15:36 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, linuxppc-dev, linux-kselftest, Libor Pechacek
In-Reply-To: <20210129153713.1592185-1-sashal@kernel.org>

From: Michael Ellerman <mpe@ellerman.id.au>

[ Upstream commit dd3a44c06f7b4f14e90065bf05d62c255b20005f ]

Newer binutils (>= 2.36) refuse to assemble lmw/stmw when building in
little endian mode. That breaks compilation of our alignment handler
test:

  /tmp/cco4l14N.s: Assembler messages:
  /tmp/cco4l14N.s:1440: Error: `lmw' invalid when little-endian
  /tmp/cco4l14N.s:1814: Error: `stmw' invalid when little-endian
  make[2]: *** [../../lib.mk:139: /output/kselftest/powerpc/alignment/alignment_handler] Error 1

These tests do pass on little endian machines, as the kernel will
still emulate those instructions even when running little
endian (which is arguably a kernel bug).

But we don't really need to test that case, so ifdef those
instructions out to get the alignment test building again.

Reported-by: Libor Pechacek <lpechacek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Link: https://lore.kernel.org/r/20210119041800.3093047-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../testing/selftests/powerpc/alignment/alignment_handler.c  | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index cb53a8b777e68..c25cf7cd45e9f 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -443,7 +443,6 @@ int test_alignment_handler_integer(void)
 	LOAD_DFORM_TEST(ldu);
 	LOAD_XFORM_TEST(ldx);
 	LOAD_XFORM_TEST(ldux);
-	LOAD_DFORM_TEST(lmw);
 	STORE_DFORM_TEST(stb);
 	STORE_XFORM_TEST(stbx);
 	STORE_DFORM_TEST(stbu);
@@ -462,7 +461,11 @@ int test_alignment_handler_integer(void)
 	STORE_XFORM_TEST(stdx);
 	STORE_DFORM_TEST(stdu);
 	STORE_XFORM_TEST(stdux);
+
+#ifdef __BIG_ENDIAN__
+	LOAD_DFORM_TEST(lmw);
 	STORE_DFORM_TEST(stmw);
+#endif
 
 	return rc;
 }
-- 
2.27.0


^ permalink raw reply related

* [PATCH AUTOSEL 5.4 10/19] scsi: ibmvfc: Set default timeout to avoid crash during migration
From: Sasha Levin @ 2021-01-29 15:37 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Brian King, Sasha Levin, linuxppc-dev, linux-scsi,
	Martin K . Petersen
In-Reply-To: <20210129153806.1592565-1-sashal@kernel.org>

From: Brian King <brking@linux.vnet.ibm.com>

[ Upstream commit 764907293edc1af7ac857389af9dc858944f53dc ]

While testing live partition mobility, we have observed occasional crashes
of the Linux partition. What we've seen is that during the live migration,
for specific configurations with large amounts of memory, slow network
links, and workloads that are changing memory a lot, the partition can end
up being suspended for 30 seconds or longer. This resulted in the following
scenario:

CPU 0                          CPU 1
-------------------------------  ----------------------------------
scsi_queue_rq                    migration_store
 -> blk_mq_start_request          -> rtas_ibm_suspend_me
  -> blk_add_timer                 -> on_each_cpu(rtas_percpu_suspend_me
              _______________________________________V
             |
             V
    -> IPI from CPU 1
     -> rtas_percpu_suspend_me
                                     -> __rtas_suspend_last_cpu

-- Linux partition suspended for > 30 seconds --
                                      -> for_each_online_cpu(cpu)
                                           plpar_hcall_norets(H_PROD
 -> scsi_dispatch_cmd
                                      -> scsi_times_out
                                       -> scsi_abort_command
                                        -> queue_delayed_work
  -> ibmvfc_queuecommand_lck
   -> ibmvfc_send_event
    -> ibmvfc_send_crq
     - returns H_CLOSED
   <- returns SCSI_MLQUEUE_HOST_BUSY
-> __blk_mq_requeue_request

                                      -> scmd_eh_abort_handler
                                       -> scsi_try_to_abort_cmd
                                         - returns SUCCESS
                                       -> scsi_queue_insert

Normally, the SCMD_STATE_COMPLETE bit would protect against the command
completion and the timeout, but that doesn't work here, since we don't
check that at all in the SCSI_MLQUEUE_HOST_BUSY path.

In this case we end up calling scsi_queue_insert on a request that has
already been queued, or possibly even freed, and we crash.

The patch below simply increases the default I/O timeout to avoid this race
condition. This is also the timeout value that nearly all IBM SAN storage
recommends setting as the default value.

Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 8a76284b59b08..523809a8a2323 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2881,8 +2881,10 @@ static int ibmvfc_slave_configure(struct scsi_device *sdev)
 	unsigned long flags = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
-	if (sdev->type == TYPE_DISK)
+	if (sdev->type == TYPE_DISK) {
 		sdev->allow_restart = 1;
+		blk_queue_rq_timeout(sdev->request_queue, 120 * HZ);
+	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	return 0;
 }
-- 
2.27.0


^ permalink raw reply related

* [PATCH AUTOSEL 5.4 14/19] selftests/powerpc: Only test lwm/stmw on big endian
From: Sasha Levin @ 2021-01-29 15:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, linuxppc-dev, linux-kselftest, Libor Pechacek
In-Reply-To: <20210129153806.1592565-1-sashal@kernel.org>

From: Michael Ellerman <mpe@ellerman.id.au>

[ Upstream commit dd3a44c06f7b4f14e90065bf05d62c255b20005f ]

Newer binutils (>= 2.36) refuse to assemble lmw/stmw when building in
little endian mode. That breaks compilation of our alignment handler
test:

  /tmp/cco4l14N.s: Assembler messages:
  /tmp/cco4l14N.s:1440: Error: `lmw' invalid when little-endian
  /tmp/cco4l14N.s:1814: Error: `stmw' invalid when little-endian
  make[2]: *** [../../lib.mk:139: /output/kselftest/powerpc/alignment/alignment_handler] Error 1

These tests do pass on little endian machines, as the kernel will
still emulate those instructions even when running little
endian (which is arguably a kernel bug).

But we don't really need to test that case, so ifdef those
instructions out to get the alignment test building again.

Reported-by: Libor Pechacek <lpechacek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Link: https://lore.kernel.org/r/20210119041800.3093047-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../testing/selftests/powerpc/alignment/alignment_handler.c  | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index 0453c50c949cb..0725239bbd85c 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -380,7 +380,6 @@ int test_alignment_handler_integer(void)
 	LOAD_DFORM_TEST(ldu);
 	LOAD_XFORM_TEST(ldx);
 	LOAD_XFORM_TEST(ldux);
-	LOAD_DFORM_TEST(lmw);
 	STORE_DFORM_TEST(stb);
 	STORE_XFORM_TEST(stbx);
 	STORE_DFORM_TEST(stbu);
@@ -399,7 +398,11 @@ int test_alignment_handler_integer(void)
 	STORE_XFORM_TEST(stdx);
 	STORE_DFORM_TEST(stdu);
 	STORE_XFORM_TEST(stdux);
+
+#ifdef __BIG_ENDIAN__
+	LOAD_DFORM_TEST(lmw);
 	STORE_DFORM_TEST(stmw);
+#endif
 
 	return rc;
 }
-- 
2.27.0


^ permalink raw reply related

* [PATCH AUTOSEL 4.19 08/12] scsi: ibmvfc: Set default timeout to avoid crash during migration
From: Sasha Levin @ 2021-01-29 15:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Brian King, Sasha Levin, linuxppc-dev, linux-scsi,
	Martin K . Petersen
In-Reply-To: <20210129153831.1592769-1-sashal@kernel.org>

From: Brian King <brking@linux.vnet.ibm.com>

[ Upstream commit 764907293edc1af7ac857389af9dc858944f53dc ]

While testing live partition mobility, we have observed occasional crashes
of the Linux partition. What we've seen is that during the live migration,
for specific configurations with large amounts of memory, slow network
links, and workloads that are changing memory a lot, the partition can end
up being suspended for 30 seconds or longer. This resulted in the following
scenario:

CPU 0                          CPU 1
-------------------------------  ----------------------------------
scsi_queue_rq                    migration_store
 -> blk_mq_start_request          -> rtas_ibm_suspend_me
  -> blk_add_timer                 -> on_each_cpu(rtas_percpu_suspend_me
              _______________________________________V
             |
             V
    -> IPI from CPU 1
     -> rtas_percpu_suspend_me
                                     -> __rtas_suspend_last_cpu

-- Linux partition suspended for > 30 seconds --
                                      -> for_each_online_cpu(cpu)
                                           plpar_hcall_norets(H_PROD
 -> scsi_dispatch_cmd
                                      -> scsi_times_out
                                       -> scsi_abort_command
                                        -> queue_delayed_work
  -> ibmvfc_queuecommand_lck
   -> ibmvfc_send_event
    -> ibmvfc_send_crq
     - returns H_CLOSED
   <- returns SCSI_MLQUEUE_HOST_BUSY
-> __blk_mq_requeue_request

                                      -> scmd_eh_abort_handler
                                       -> scsi_try_to_abort_cmd
                                         - returns SUCCESS
                                       -> scsi_queue_insert

Normally, the SCMD_STATE_COMPLETE bit would protect against the command
completion and the timeout, but that doesn't work here, since we don't
check that at all in the SCSI_MLQUEUE_HOST_BUSY path.

In this case we end up calling scsi_queue_insert on a request that has
already been queued, or possibly even freed, and we crash.

The patch below simply increases the default I/O timeout to avoid this race
condition. This is also the timeout value that nearly all IBM SAN storage
recommends setting as the default value.

Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 090ab377f65e5..50078a199fea0 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2890,8 +2890,10 @@ static int ibmvfc_slave_configure(struct scsi_device *sdev)
 	unsigned long flags = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
-	if (sdev->type == TYPE_DISK)
+	if (sdev->type == TYPE_DISK) {
 		sdev->allow_restart = 1;
+		blk_queue_rq_timeout(sdev->request_queue, 120 * HZ);
+	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	return 0;
 }
-- 
2.27.0


^ permalink raw reply related

* [PATCH AUTOSEL 4.19 09/12] selftests/powerpc: Only test lwm/stmw on big endian
From: Sasha Levin @ 2021-01-29 15:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, linuxppc-dev, linux-kselftest, Libor Pechacek
In-Reply-To: <20210129153831.1592769-1-sashal@kernel.org>

From: Michael Ellerman <mpe@ellerman.id.au>

[ Upstream commit dd3a44c06f7b4f14e90065bf05d62c255b20005f ]

Newer binutils (>= 2.36) refuse to assemble lmw/stmw when building in
little endian mode. That breaks compilation of our alignment handler
test:

  /tmp/cco4l14N.s: Assembler messages:
  /tmp/cco4l14N.s:1440: Error: `lmw' invalid when little-endian
  /tmp/cco4l14N.s:1814: Error: `stmw' invalid when little-endian
  make[2]: *** [../../lib.mk:139: /output/kselftest/powerpc/alignment/alignment_handler] Error 1

These tests do pass on little endian machines, as the kernel will
still emulate those instructions even when running little
endian (which is arguably a kernel bug).

But we don't really need to test that case, so ifdef those
instructions out to get the alignment test building again.

Reported-by: Libor Pechacek <lpechacek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Link: https://lore.kernel.org/r/20210119041800.3093047-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../testing/selftests/powerpc/alignment/alignment_handler.c  | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index 169a8b9719fb9..4f8335e0c9858 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -384,7 +384,6 @@ int test_alignment_handler_integer(void)
 	LOAD_DFORM_TEST(ldu);
 	LOAD_XFORM_TEST(ldx);
 	LOAD_XFORM_TEST(ldux);
-	LOAD_DFORM_TEST(lmw);
 	STORE_DFORM_TEST(stb);
 	STORE_XFORM_TEST(stbx);
 	STORE_DFORM_TEST(stbu);
@@ -403,7 +402,11 @@ int test_alignment_handler_integer(void)
 	STORE_XFORM_TEST(stdx);
 	STORE_DFORM_TEST(stdu);
 	STORE_XFORM_TEST(stdux);
+
+#ifdef __BIG_ENDIAN__
+	LOAD_DFORM_TEST(lmw);
 	STORE_DFORM_TEST(stmw);
+#endif
 
 	return rc;
 }
-- 
2.27.0


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox