LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH linux-next] powerpc/tm: remove duplicate include in tm-poison.c
From: cgel.zte @ 2021-08-05  6:52 UTC
  To: mpe
  Cc: yong.yiran, Zeal Robot, linuxppc-dev, linux-kernel, paulus,
	linux-kselftest, shuah

From: yong yiran <yong.yiran@zte.com.cn>

'inttypes.h' included in 'tm-poison.c' is duplicated.
Remove all but the first include of inttypes.h from tm-poison.c.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: yong yiran <yong.yiran@zte.com.cn>
---
 tools/testing/selftests/powerpc/tm/tm-poison.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/testing/selftests/powerpc/tm/tm-poison.c b/tools/testing/selftests/powerpc/tm/tm-poison.c
index 29e5f26af7b9..27c083a03d1f 100644
--- a/tools/testing/selftests/powerpc/tm/tm-poison.c
+++ b/tools/testing/selftests/powerpc/tm/tm-poison.c
@@ -20,7 +20,6 @@
 #include <sched.h>
 #include <sys/types.h>
 #include <signal.h>
-#include <inttypes.h>
 
 #include "tm.h"
 
-- 
2.25.1



* Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
From: Greg KH @ 2021-08-05  8:18 UTC
  To: Xianting Tian
  Cc: arnd, amit, Jiri Slaby, linux-kernel, virtualization,
	linuxppc-dev, osandov
In-Reply-To: <40f78d10-0a57-4620-e7e2-f806bd61abca@linux.alibaba.com>

On Thu, Aug 05, 2021 at 04:08:46PM +0800, Xianting Tian wrote:
> 
> On 2021/8/5 3:58 PM, Jiri Slaby wrote:
> > Hi,
> > 
> > On 04. 08. 21, 4:54, Xianting Tian wrote:
> > > @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno,
> > > int data,
> > >       hp->outbuf_size = outbuf_size;
> > >       hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
> > >   +    /*
> > > +     * hvc_con_outbuf is guaranteed to be aligned at least to the
> > > +     * size(N_OUTBUF) by kmalloc().
> > > +     */
> > > +    hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL);
> > > +    if (!hp->hvc_con_outbuf)
> > > +        return ERR_PTR(-ENOMEM);
> > 
> > This leaks hp, right?
> > 
> > BTW your 2 patches are still not threaded, that is hard to follow.
> 
> Yes, thanks, I found the bug and will fix it in v4.
> 
> This is the first time I have sent a patch series (more than one patch). I
> checked how to send a series on LKML.org: I should send a '0/2' message,
> which holds the history info for the series.

Please use 'git send-email' to send the full series all at once;
otherwise it is hard to thread the emails "by hand".

thanks,

greg k-h


* Re: [PATCH] powerpc/kprobes: Fix kprobe Oops happens in booke
From: Pu Lehui @ 2021-08-05  7:52 UTC
  To: Michael Ellerman, oleg, benh, paulus, naveen.n.rao, mhiramat,
	christophe.leroy, peterz, npiggin, ruscur
  Cc: zhangjinhao2, xukuohai, linuxppc-dev, linux-kernel
In-Reply-To: <87fsvoo1uy.fsf@mpe.ellerman.id.au>



On 2021/8/5 14:13, Michael Ellerman wrote:
> Pu Lehui <pulehui@huawei.com> writes:
>> When using kprobes on a powerpc booke series processor, an Oops happens
>> as shown below:
>>
>> [   35.861352] Oops: Exception in kernel mode, sig: 5 [#1]
>> [   35.861676] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
>> [   35.861905] Modules linked in:
>> [   35.862144] CPU: 0 PID: 76 Comm: sh Not tainted 5.14.0-rc3-00060-g7e96bf476270 #18
>> [   35.862610] NIP:  c0b96470 LR: c00107b4 CTR: c0161c80
>> [   35.862805] REGS: c387fe70 TRAP: 0700   Not tainted (5.14.0-rc3-00060-g7e96bf476270)
>> [   35.863198] MSR:  00029002 <CE,EE,ME>  CR: 24022824  XER: 20000000
>> [   35.863577]
>> [   35.863577] GPR00: c0015218 c387ff20 c313e300 c387ff50 00000004 40000002 40000000 0a1a2cce
>> [   35.863577] GPR08: 00000000 00000004 00000000 59764000 24022422 102490c2 00000000 00000000
>> [   35.863577] GPR16: 00000000 00000000 00000040 10240000 10240000 10240000 10240000 10220000
>> [   35.863577] GPR24: ffffffff 10240000 00000000 00000000 bfc655e8 00000800 c387ff50 00000000
>> [   35.865367] NIP [c0b96470] schedule+0x0/0x130
>> [   35.865606] LR [c00107b4] interrupt_exit_user_prepare_main+0xf4/0x100
>> [   35.865974] Call Trace:
>> [   35.866142] [c387ff20] [c0053224] irq_exit+0x114/0x120 (unreliable)
>> [   35.866472] [c387ff40] [c0015218] interrupt_return+0x14/0x13c
>> [   35.866728] --- interrupt: 900 at 0x100af3dc
>> [   35.866963] NIP:  100af3dc LR: 100de020 CTR: 00000000
>> [   35.867177] REGS: c387ff50 TRAP: 0900   Not tainted (5.14.0-rc3-00060-g7e96bf476270)
>> [   35.867488] MSR:  0002f902 <CE,EE,PR,FP,ME>  CR: 20022422  XER: 20000000
>> [   35.867808]
>> [   35.867808] GPR00: c001509c bfc65570 1024b4d0 00000000 100de020 20022422 bfc655a8 100af3dc
>> [   35.867808] GPR08: 0002f902 00000000 00000000 00000000 72656773 102490c2 00000000 00000000
>> [   35.867808] GPR16: 00000000 00000000 00000040 10240000 10240000 10240000 10240000 10220000
>> [   35.867808] GPR24: ffffffff 10240000 00000000 00000000 bfc655e8 10245910 ffffffff 00000001
>> [   35.869406] NIP [100af3dc] 0x100af3dc
>> [   35.869578] LR [100de020] 0x100de020
>> [   35.869751] --- interrupt: 900
>> [   35.870001] Instruction dump:
>> [   35.870283] 40c20010 815e0518 714a0100 41e2fd04 39200000 913e00c0 3b1e0450 4bfffd80
>> [   35.870666] 0fe00000 92a10024 4bfff1a9 60000000 <7fe00008> 7c0802a6 93e1001c 7c5f1378
>> [   35.871339] ---[ end trace 23ff848139efa9b9 ]---
>>
>> The booke architecture has no real mode; MMU translation is always
>> on. The MSR_IS/MSR_DS bits on booke select the address space and
>> do not indicate real mode.
>>
>> Fixes: 21f8b2fa3ca5 ("powerpc/kprobes: Ignore traps that happened in real mode")
>> Signed-off-by: Pu Lehui <pulehui@huawei.com>
>> ---
>>   arch/powerpc/include/asm/ptrace.h | 6 ++++++
>>   arch/powerpc/kernel/kprobes.c     | 5 +----
>>   2 files changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
>> index 3e5d470a6155..4aec1a97024b 100644
>> --- a/arch/powerpc/include/asm/ptrace.h
>> +++ b/arch/powerpc/include/asm/ptrace.h
>> @@ -187,6 +187,12 @@ static inline unsigned long frame_pointer(struct pt_regs *regs)
>>   #define user_mode(regs) (((regs)->msr & MSR_PR) != 0)
>>   #endif
>>   
>> +#ifdef CONFIG_BOOKE
>> +#define real_mode(regs)	0
>> +#else
>> +#define real_mode(regs)	(!((regs)->msr & MSR_IR) || !((regs)->msr & MSR_DR))
>> +#endif
> 
> I'm not sure about this helper.
> 
> Arguably it should only return true if both MSR_IR and MSR_DR are clear.
> 
> 
>> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
>> index cbc28d1a2e1b..fac9a5974718 100644
>> --- a/arch/powerpc/kernel/kprobes.c
>> +++ b/arch/powerpc/kernel/kprobes.c
>> @@ -289,10 +289,7 @@ int kprobe_handler(struct pt_regs *regs)
>>   	unsigned int *addr = (unsigned int *)regs->nip;
>>   	struct kprobe_ctlblk *kcb;
>>   
>> -	if (user_mode(regs))
>> -		return 0;
>> -
>> -	if (!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR))
>> +	if (user_mode(regs) || real_mode(regs))
>>   		return 0;
> 
> I think just adding an IS_ENABLED(CONFIG_BOOKE) here might be better.
> 
> cheers
> .
> 
Thanks for your suggestion, I will fix it in v2.
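
For reference, a minimal userspace sketch of the variants discussed above. It is
not the kernel code: the SKETCH_* bit values are assumptions that mirror
MSR_IR/MSR_DR (the same bit positions that booke uses as MSR_IS/MSR_DS), and the
is_booke parameter is a hypothetical stand-in for IS_ENABLED(CONFIG_BOOKE).

```c
#include <stdbool.h>

/* Assumed bit values; on booke the same bits act as MSR_IS/MSR_DS. */
#define SKETCH_MSR_IR 0x20UL
#define SKETCH_MSR_DR 0x10UL

/* Check as posted: "real mode" if either translation bit is clear. */
static bool real_mode_either(unsigned long msr)
{
	return !(msr & SKETCH_MSR_IR) || !(msr & SKETCH_MSR_DR);
}

/* Stricter variant: "real mode" only if both bits are clear. */
static bool real_mode_both(unsigned long msr)
{
	return !(msr & (SKETCH_MSR_IR | SKETCH_MSR_DR));
}

/*
 * Hypothetical model of the early return in kprobe_handler(): the
 * real-mode test is skipped entirely when building for booke.
 */
static bool kprobe_should_ignore(unsigned long msr, bool is_booke, bool user)
{
	if (user)
		return true;
	if (!is_booke && real_mode_either(msr))
		return true;
	return false;
}
```

With the MSR value from the Oops (00029002), both bits are clear, so either
real-mode test fires; only the booke guard keeps the trap from being ignored.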

Best regards
Lehui


* Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
From: Jiri Slaby @ 2021-08-05  8:09 UTC
  To: Xianting Tian, gregkh, amit, arnd, osandov
  Cc: linuxppc-dev, linux-kernel, virtualization
In-Reply-To: <0f26a1c3-53e8-9282-69e8-8d81a9cafc59@kernel.org>

On 05. 08. 21, 9:58, Jiri Slaby wrote:
> Hi,
> 
> On 04. 08. 21, 4:54, Xianting Tian wrote:
>> @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, 
>> int data,
>>       hp->outbuf_size = outbuf_size;
>>       hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];

This deserves cleanup too. Why is "outbuf" not "char outbuf[0] 
__ALIGNED__" at the end of the structure? The allocation would be easier 
(using struct_size()) and this line would be gone completely.
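
A rough userspace sketch of that layout, for illustration only. The names
(struct fam_hvc, fam_hvc_alloc) are hypothetical, the C99 "[]" form is used in
place of "[0]", the alignment attribute is left out, and the manual overflow
check only approximates what struct_size() does in the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for struct hvc_struct. */
struct fam_hvc {
	size_t outbuf_size;
	/* Flexible array member: its storage follows the struct directly. */
	char outbuf[];
};

/* One allocation covers the header and the trailing buffer. */
static struct fam_hvc *fam_hvc_alloc(size_t outbuf_size)
{
	struct fam_hvc *hp;

	/* Reject sizes that would overflow the total allocation size. */
	if (outbuf_size > SIZE_MAX - sizeof(*hp))
		return NULL;

	hp = calloc(1, sizeof(*hp) + outbuf_size);
	if (!hp)
		return NULL;

	hp->outbuf_size = outbuf_size;
	return hp;
}
```

No separate pointer assignment is needed here; hp->outbuf already names the
memory behind the header, and a single free(hp) releases both.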

>> +    /*
>> +     * hvc_con_outbuf is guaranteed to be aligned at least to the
>> +     * size(N_OUTBUF) by kmalloc().
>> +     */
>> +    hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL);
>> +    if (!hp->hvc_con_outbuf)
>> +        return ERR_PTR(-ENOMEM);
> 
> This leaks hp, right?

Actually, why don't you make
char c[N_OUTBUF] __ALIGNED__;

part of struct hvc_struct directly?

> BTW your 2 patches are still not threaded, that is hard to follow.
> 
>> +
>> +    spin_lock_init(&hp->hvc_con_lock);
>> +
>>       tty_port_init(&hp->port);
>>       hp->port.ops = &hvc_port_ops;
> 
> thanks,
-- 
js
suse labs


* [Bug 213961] Oops while loading radeon driver
From: bugzilla-daemon @ 2021-08-05  8:04 UTC
  To: linuxppc-dev
In-Reply-To: <bug-213961-206035@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=213961

--- Comment #8 from Christophe Leroy (christophe.leroy@csgroup.eu) ---
Great:

[   15.246367] NIP [bea42a80] radeon_agp_head_init+0x1c/0xf8 [radeon]
[   15.246969] LR [bea39860] radeon_driver_load_kms+0x1bc/0x1f4 [radeon]
[   15.247160] Call Trace:
[   15.247168] [f2b75c30] [c0c1ec60] 0xc0c1ec60 (unreliable)
[   15.247180] [f2b75c50] [bea39860] radeon_driver_load_kms+0x1bc/0x1f4 [radeon]
[   15.247343] [f2b75c80] [be8cc74c] drm_dev_register+0x10c/0x268 [drm]
[   15.247718] [f2b75cb0] [bea36484] radeon_pci_probe+0x108/0x190 [radeon]
[   15.248001] [f2b75cd0] [c03832fc] pci_device_probe+0xf4/0x1a4

So we now know we have a NULL pointer dereference in radeon_agp_head_init().

It looks like all this code is quite recent, or at least was modified
recently, so I think you should raise it with the radeon people. I'm not
sure the problem is PPC32-specific.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
From: Jiri Slaby @ 2021-08-05  7:58 UTC
  To: Xianting Tian, gregkh, amit, arnd, osandov
  Cc: linuxppc-dev, linux-kernel, virtualization
In-Reply-To: <20210804025453.93543-1-xianting.tian@linux.alibaba.com>

Hi,

On 04. 08. 21, 4:54, Xianting Tian wrote:
> @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
>   	hp->outbuf_size = outbuf_size;
>   	hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
>   
> +	/*
> +	 * hvc_con_outbuf is guaranteed to be aligned at least to the
> +	 * size(N_OUTBUF) by kmalloc().
> +	 */
> +	hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL);
> +	if (!hp->hvc_con_outbuf)
> +		return ERR_PTR(-ENOMEM);

This leaks hp, right?
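
To illustrate the pattern in question, here is a minimal userspace sketch, not
the hvc_console code. All names (struct fake_hvc, fake_hvc_alloc,
SKETCH_N_OUTBUF, failing_alloc) are hypothetical, and calloc()/free() stand in
for the kernel allocators.

```c
#include <stdlib.h>

#define SKETCH_N_OUTBUF 16	/* hypothetical size, stands in for N_OUTBUF */

/* Hypothetical, cut-down stand-in for struct hvc_struct. */
struct fake_hvc {
	char *con_outbuf;
};

/* Lets a caller inject an allocator, e.g. one that always fails. */
typedef void *(*alloc_fn)(size_t nmemb, size_t size);

static struct fake_hvc *fake_hvc_alloc(alloc_fn buf_alloc)
{
	struct fake_hvc *hp = calloc(1, sizeof(*hp));

	if (!hp)
		return NULL;

	hp->con_outbuf = buf_alloc(1, SKETCH_N_OUTBUF);
	if (!hp->con_outbuf) {
		free(hp);	/* returning without this would leak hp */
		return NULL;
	}
	return hp;
}

static void fake_hvc_free(struct fake_hvc *hp)
{
	free(hp->con_outbuf);
	free(hp);
}

/* Test helper: an allocator that always reports failure. */
static void *failing_alloc(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}
```

The second allocation is the only new failure point, so the error path has to
undo the first one before returning.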

BTW your 2 patches are still not threaded, that is hard to follow.

> +
> +	spin_lock_init(&hp->hvc_con_lock);
> +
>   	tty_port_init(&hp->port);
>   	hp->port.ops = &hvc_port_ops;
>   

thanks,
-- 
js
suse labs


* [PATCH kernel v2] KVM: PPC: Use arch_get_random_seed_long instead of powernv variant
From: Alexey Kardashevskiy @ 2021-08-05  7:56 UTC
  To: linuxppc-dev; +Cc: Alexey Kardashevskiy, kvm, kvm-ppc

The powernv_get_random_long() does not work in nested KVM (which is
pseries) and produces a crash when accessing in_be64(rng->regs) in
powernv_get_random_long().

This replaces powernv_get_random_long with the ppc_md machine hook
wrapper.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

Changes:
v2:
* replaces [PATCH kernel] powerpc/powernv: Check if powernv_rng is initialized
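
A simplified userspace model of why going through the machine hook is safer,
under stated assumptions: the sketch_* names are hypothetical, the struct is a
cut-down stand-in for ppc_md, and the wrapper is assumed to report failure when
no platform hook is registered rather than touching platform registers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the ppc_md machine hooks. */
struct sketch_machdep_calls {
	bool (*get_random_seed)(unsigned long *v);
};

static struct sketch_machdep_calls sketch_ppc_md;

/* Generic wrapper: checks for a registered hook before calling it. */
static bool sketch_arch_get_random_seed_long(unsigned long *v)
{
	if (sketch_ppc_md.get_random_seed)
		return sketch_ppc_md.get_random_seed(v);
	return false;
}

/* Hypothetical platform hook; the fixed value is for demonstration only. */
static bool sketch_platform_seed(unsigned long *v)
{
	*v = 0x1234UL;
	return true;
}
```

In this model a caller such as the H_RANDOM handler sees a plain failure on a
platform without a hook, which maps to H_HARDWARE instead of a crash.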

---
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index be0cde26f156..ecfd133e0ca8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1165,7 +1165,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 		break;
 #endif
 	case H_RANDOM:
-		if (!powernv_get_random_long(&vcpu->arch.regs.gpr[4]))
+		if (!arch_get_random_seed_long(&vcpu->arch.regs.gpr[4]))
 			ret = H_HARDWARE;
 		break;
 	case H_RPT_INVALIDATE:
-- 
2.30.2



* [PATCH 1/3] arch: Export machine_restart() instances so they can be called from modules
From: Lee Jones @ 2021-08-05  7:50 UTC
  To: lee.jones
  Cc: Rich Felker, Greg Kroah-Hartman, Catalin Marinas, Paul Walmsley,
	Sebastian Reichel, James E . J . Bottomley, Max Filippov, Guo Ren,
	linux-csky, sparclinux, linux-hexagon, linux-riscv, Will Deacon,
	Thomas Gleixner, Anton Ivanov, Jonas Bonn, linux-s390, Brian Cain,
	Helge Deller, linux-sh, Ley Foon Tan, Christian Borntraeger,
	Ingo Molnar, Geert Uytterhoeven, linux-snps-arc, Jeff Dike,
	uclinux-h8-devel, linux-xtensa, Albert Ou, Vasily Gorbik,
	Heiko Carstens, linux-um, Stefan Kristiansson, linux-m68k,
	openrisc, Borislav Petkov, John Crispin, Stafford Horne,
	linux-arm-kernel, Chris Zankel, Michal Simek, Thomas Bogendoerfer,
	linux-mips, Yoshinori Sato, linux-parisc, Vineet Gupta,
	linux-kernel, Palmer Dabbelt, Richard Weinberger, linuxppc-dev,
	David S . Miller
In-Reply-To: <20210805075032.723037-1-lee.jones@linaro.org>

A recent attempt to convert the Power Reset Restart driver to tristate
failed because of the following compile error (reported once merged by
Stephen Rothwell via Linux Next):

  ERROR: "machine_restart" [drivers/power/reset/restart-poweroff.ko] undefined!

This error occurs since some of the machine_restart() instances are
not currently exported for use in modules.  This patch aims to rectify
that.

Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: John Crispin <john@phrozen.org>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Rich Felker <dalias@libc.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
---

The 2 patches this change supports have the required Acks already.

NB: If it's safe to omit some of these, let me know and I'll revise the patch.

 arch/arc/kernel/reset.c            | 1 +
 arch/arm/kernel/reboot.c           | 1 +
 arch/arm64/kernel/process.c        | 1 +
 arch/csky/kernel/power.c           | 1 +
 arch/h8300/kernel/process.c        | 1 +
 arch/hexagon/kernel/reset.c        | 1 +
 arch/m68k/kernel/process.c         | 1 +
 arch/microblaze/kernel/reset.c     | 1 +
 arch/mips/kernel/reset.c           | 1 +
 arch/mips/lantiq/falcon/reset.c    | 1 +
 arch/mips/sgi-ip27/ip27-reset.c    | 1 +
 arch/nios2/kernel/process.c        | 1 +
 arch/openrisc/kernel/process.c     | 1 +
 arch/parisc/kernel/process.c       | 1 +
 arch/powerpc/kernel/setup-common.c | 1 +
 arch/riscv/kernel/reset.c          | 1 +
 arch/s390/kernel/setup.c           | 1 +
 arch/sh/kernel/reboot.c            | 1 +
 arch/sparc/kernel/process_32.c     | 1 +
 arch/sparc/kernel/reboot.c         | 1 +
 arch/um/kernel/reboot.c            | 1 +
 arch/x86/kernel/reboot.c           | 1 +
 arch/xtensa/kernel/setup.c         | 1 +
 23 files changed, 23 insertions(+)

diff --git a/arch/arc/kernel/reset.c b/arch/arc/kernel/reset.c
index fd6c3eb930bad..ae4f8a43b0af4 100644
--- a/arch/arc/kernel/reset.c
+++ b/arch/arc/kernel/reset.c
@@ -20,6 +20,7 @@ void machine_restart(char *__unused)
 	pr_info("Put your restart handler here\n");
 	machine_halt();
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_power_off(void)
 {
diff --git a/arch/arm/kernel/reboot.c b/arch/arm/kernel/reboot.c
index 0ce388f154226..2878260efd130 100644
--- a/arch/arm/kernel/reboot.c
+++ b/arch/arm/kernel/reboot.c
@@ -150,3 +150,4 @@ void machine_restart(char *cmd)
 	printk("Reboot failed -- System halted\n");
 	while (1);
 }
+EXPORT_SYMBOL(machine_restart);
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index b4bb67f17a2ca..cf89ce91d7145 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -212,6 +212,7 @@ void machine_restart(char *cmd)
 	printk("Reboot failed -- System halted\n");
 	while (1);
 }
+EXPORT_SYMBOL(machine_restart);
 
 #define bstr(suffix, str) [PSR_BTYPE_ ## suffix >> PSR_BTYPE_SHIFT] = str
 static const char *const btypes[] = {
diff --git a/arch/csky/kernel/power.c b/arch/csky/kernel/power.c
index 923ee4e381b81..b466c825cbb3c 100644
--- a/arch/csky/kernel/power.c
+++ b/arch/csky/kernel/power.c
@@ -28,3 +28,4 @@ void machine_restart(char *cmd)
 	do_kernel_restart(cmd);
 	asm volatile ("bkpt");
 }
+EXPORT_SYMBOL(machine_restart);
diff --git a/arch/h8300/kernel/process.c b/arch/h8300/kernel/process.c
index 46b1342ce515b..8203ac5cd33ec 100644
--- a/arch/h8300/kernel/process.c
+++ b/arch/h8300/kernel/process.c
@@ -66,6 +66,7 @@ void machine_restart(char *__unused)
 	local_irq_disable();
 	__asm__("jmp @@0");
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
diff --git a/arch/hexagon/kernel/reset.c b/arch/hexagon/kernel/reset.c
index da36114d928f0..433378d52063c 100644
--- a/arch/hexagon/kernel/reset.c
+++ b/arch/hexagon/kernel/reset.c
@@ -19,6 +19,7 @@ void machine_halt(void)
 void machine_restart(char *cmd)
 {
 }
+EXPORT_SYMBOL(machine_restart);
 
 void (*pm_power_off)(void) = NULL;
 EXPORT_SYMBOL(pm_power_off);
diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index da83cc83e7912..e0264704686e9 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -57,6 +57,7 @@ void machine_restart(char * __unused)
 		mach_reset();
 	for (;;);
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
diff --git a/arch/microblaze/kernel/reset.c b/arch/microblaze/kernel/reset.c
index 5f4722908164d..902fbe3777846 100644
--- a/arch/microblaze/kernel/reset.c
+++ b/arch/microblaze/kernel/reset.c
@@ -41,3 +41,4 @@ void machine_restart(char *cmd)
 	pr_emerg("Reboot failed -- System halted\n");
 	while (1);
 }
+EXPORT_SYMBOL(machine_restart);
diff --git a/arch/mips/kernel/reset.c b/arch/mips/kernel/reset.c
index 6288780b779e7..2d3193a3cf68b 100644
--- a/arch/mips/kernel/reset.c
+++ b/arch/mips/kernel/reset.c
@@ -99,6 +99,7 @@ void machine_restart(char *command)
 	pr_emerg("Reboot failed -- System halted\n");
 	machine_hang();
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
diff --git a/arch/mips/lantiq/falcon/reset.c b/arch/mips/lantiq/falcon/reset.c
index 261996c230cf6..80dd9759ffa55 100644
--- a/arch/mips/lantiq/falcon/reset.c
+++ b/arch/mips/lantiq/falcon/reset.c
@@ -51,6 +51,7 @@ static void machine_restart(char *command)
 		(void *)WDT_REG_BASE);
 	unreachable();
 }
+EXPORT_SYMBOL(machine_restart);
 
 static void machine_halt(void)
 {
diff --git a/arch/mips/sgi-ip27/ip27-reset.c b/arch/mips/sgi-ip27/ip27-reset.c
index 5ac5ad6387343..a3f8f4498b7c5 100644
--- a/arch/mips/sgi-ip27/ip27-reset.c
+++ b/arch/mips/sgi-ip27/ip27-reset.c
@@ -29,6 +29,7 @@
 #include "ip27-common.h"
 
 void machine_restart(char *command) __noreturn;
+EXPORT_SYMBOL(machine_restart);
 void machine_halt(void) __noreturn;
 void machine_power_off(void) __noreturn;
 
diff --git a/arch/nios2/kernel/process.c b/arch/nios2/kernel/process.c
index c5f916ca6845f..6f9459e8ae4ed 100644
--- a/arch/nios2/kernel/process.c
+++ b/arch/nios2/kernel/process.c
@@ -51,6 +51,7 @@ void machine_restart(char *__unused)
 	: "r" (cpuinfo.reset_addr)
 	: "r4");
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c
index eb62429681fc8..12c3022c46387 100644
--- a/arch/openrisc/kernel/process.c
+++ b/arch/openrisc/kernel/process.c
@@ -61,6 +61,7 @@ void machine_restart(char *cmd)
 	pr_emerg("Reboot failed -- System halted\n");
 	while (1);
 }
+EXPORT_SYMBOL(machine_restart);
 
 /*
  * Similar to machine_power_off, but don't shut off power.  Add code
diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index b144fbe29bc16..05e9f03124b64 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -96,6 +96,7 @@ void machine_restart(char *cmd)
 	while (1) ;
 
 }
+EXPORT_SYMBOL(machine_restart);
 
 void (*chassis_power_off)(void);
 
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 74a98fff2c2f9..54ebae540dd7d 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -159,6 +159,7 @@ void machine_restart(char *cmd)
 
 	machine_hang();
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_power_off(void)
 {
diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
index ee5878d968cc1..596a36b91eaa2 100644
--- a/arch/riscv/kernel/reset.c
+++ b/arch/riscv/kernel/reset.c
@@ -20,6 +20,7 @@ void machine_restart(char *cmd)
 	do_kernel_restart(cmd);
 	while (1);
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 5aab59ad56881..fd2394af0d43a 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -276,6 +276,7 @@ void machine_restart(char *command)
 		console_unblank();
 	_machine_restart(command);
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
diff --git a/arch/sh/kernel/reboot.c b/arch/sh/kernel/reboot.c
index 5c33f036418be..36b6c61f3b129 100644
--- a/arch/sh/kernel/reboot.c
+++ b/arch/sh/kernel/reboot.c
@@ -83,6 +83,7 @@ void machine_restart(char *cmd)
 {
 	machine_ops.restart(cmd);
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
index 3b9794978e5bc..30a1674683946 100644
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -104,6 +104,7 @@ void machine_restart(char * cmd)
 	prom_feval ("reset");
 	panic("Reboot failed!");
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_power_off(void)
 {
diff --git a/arch/sparc/kernel/reboot.c b/arch/sparc/kernel/reboot.c
index 69c1b6c047d53..53adef425d7de 100644
--- a/arch/sparc/kernel/reboot.c
+++ b/arch/sparc/kernel/reboot.c
@@ -52,4 +52,5 @@ void machine_restart(char *cmd)
 	prom_reboot("");
 	panic("Reboot failed!");
 }
+EXPORT_SYMBOL(machine_restart);
 
diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index 48c0610d506e0..4b764311efb89 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -47,6 +47,7 @@ void machine_restart(char * __unused)
 	uml_cleanup();
 	reboot_skas();
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_power_off(void)
 {
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index b29657b76e3fa..b48c30ead7167 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -733,6 +733,7 @@ static void native_machine_restart(char *__unused)
 		machine_shutdown();
 	__machine_emergency_restart(0);
 }
+EXPORT_SYMBOL(machine_restart);
 
 static void native_machine_halt(void)
 {
diff --git a/arch/xtensa/kernel/setup.c b/arch/xtensa/kernel/setup.c
index ed184106e4cf9..a84cc934300d5 100644
--- a/arch/xtensa/kernel/setup.c
+++ b/arch/xtensa/kernel/setup.c
@@ -564,6 +564,7 @@ void machine_restart(char * cmd)
 {
 	platform_restart();
 }
+EXPORT_SYMBOL(machine_restart);
 
 void machine_halt(void)
 {
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH 0/3] power: reset: Convert Power-Off driver to tristate
From: Lee Jones @ 2021-08-05  7:50 UTC
  To: lee.jones
  Cc: Rich Felker, linux-sh, Catalin Marinas, Paul Walmsley, linux-mips,
	James E . J . Bottomley, Max Filippov, Guo Ren, linux-csky,
	sparclinux, linux-riscv, Will Deacon, Thomas Gleixner,
	Anton Ivanov, Jonas Bonn, linux-s390, Brian Cain, linux-hexagon,
	Helge Deller, Ley Foon Tan, Christian Borntraeger, Ingo Molnar,
	Geert Uytterhoeven, linux-snps-arc, Jeff Dike, uclinux-h8-devel,
	linux-xtensa, Albert Ou, Vasily Gorbik, Heiko Carstens, linux-um,
	Stefan Kristiansson, Richard Weinberger, linux-m68k, openrisc,
	Borislav Petkov, John Crispin, Stafford Horne, linux-arm-kernel,
	Chris Zankel, Michal Simek, Thomas Bogendoerfer, Yoshinori Sato,
	linux-parisc, Vineet Gupta, linux-kernel, Palmer Dabbelt,
	linuxppc-dev, David S . Miller

Provide support to compile the Power-Off driver as a module.

Elliot Berman (2):
  reboot: Export reboot_mode
  power: reset: Enable tristate on restart power-off driver

Lee Jones (1):
  arch: Export machine_restart() instances so they can be called from
    modules

 arch/arc/kernel/reset.c            | 1 +
 arch/arm/kernel/reboot.c           | 1 +
 arch/arm64/kernel/process.c        | 1 +
 arch/csky/kernel/power.c           | 1 +
 arch/h8300/kernel/process.c        | 1 +
 arch/hexagon/kernel/reset.c        | 1 +
 arch/m68k/kernel/process.c         | 1 +
 arch/microblaze/kernel/reset.c     | 1 +
 arch/mips/kernel/reset.c           | 1 +
 arch/mips/lantiq/falcon/reset.c    | 1 +
 arch/mips/sgi-ip27/ip27-reset.c    | 1 +
 arch/nios2/kernel/process.c        | 1 +
 arch/openrisc/kernel/process.c     | 1 +
 arch/parisc/kernel/process.c       | 1 +
 arch/powerpc/kernel/setup-common.c | 1 +
 arch/riscv/kernel/reset.c          | 1 +
 arch/s390/kernel/setup.c           | 1 +
 arch/sh/kernel/reboot.c            | 1 +
 arch/sparc/kernel/process_32.c     | 1 +
 arch/sparc/kernel/reboot.c         | 1 +
 arch/um/kernel/reboot.c            | 1 +
 arch/x86/kernel/reboot.c           | 1 +
 arch/xtensa/kernel/setup.c         | 1 +
 drivers/power/reset/Kconfig        | 2 +-
 kernel/reboot.c                    | 2 ++
 25 files changed, 26 insertions(+), 1 deletion(-)

Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: John Crispin <john@phrozen.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-hexagon@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: openrisc@lists.librecores.org
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: sparclinux@vger.kernel.org
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
-- 
2.32.0.605.g8dce9f2422-goog



* Re: [RFC PATCH v0 0/5] PPC: KVM: pseries: Asynchronous page fault
From: Bharata B Rao @ 2021-08-05  7:35 UTC
  To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, kvm, bharata.rao
In-Reply-To: <20210805072439.501481-1-bharata@linux.ibm.com>

On Thu, Aug 05, 2021 at 12:54:34PM +0530, Bharata B Rao wrote:
> Hi,
> 
> This series adds asynchronous page fault support for pseries guests
> and enables the support for the same in powerpc KVM. This is an
> early RFC with details and multiple TODOs listed in patch descriptions.
> 
> This patch needs supporting enablement in QEMU too which will be
> posted separately.

QEMU part is posted here:
https://lore.kernel.org/qemu-devel/20210805073228.502292-2-bharata@linux.ibm.com/T/#u

Regards,
Bharata.


* [RFC PATCH v0 5/5] pseries: Asynchronous page fault support
From: Bharata B Rao @ 2021-08-05  7:24 UTC
  To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, kvm, bharata.rao
In-Reply-To: <20210805072439.501481-1-bharata@linux.ibm.com>

Add asynchronous page fault support for pseries guests.

1. Set up the guest to handle async-pf
   - Issue H_REG_SNS hcall to register the SNS region.
   - Set up the subvention interrupt irq.
   - Enable async-pf by updating the byte_b9 of VPA for each
     CPU.
2. Check if the page fault is an expropriation notification
   (SRR1_PROGTRAP set in SRR1) and if so put the task on
   wait queue based on the expropriation correlation number
   read from the VPA.
3. Handle subvention interrupt to wake any waiting tasks.
   The wait and wakeup mechanism from x86 async-pf implementation
   is being reused here.
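
The wait/wake bookkeeping in steps 2 and 3 can be pictured with the toy
userspace model below. It is only a sketch: every sketch_* name is
hypothetical, a fixed table replaces the real wait queues, and nothing here
blocks or schedules; it just tracks which task is parked on which
expropriation correlation number.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_WAITERS 8	/* hypothetical table size */

struct sketch_waiter {
	uint16_t corr;		/* expropriation correlation number */
	int task_id;		/* hypothetical task identifier */
	bool waiting;
};

static struct sketch_waiter sketch_waiters[SKETCH_MAX_WAITERS];

/* Step 2: park a task on a correlation number; false if the table is full. */
static bool sketch_wait_on(uint16_t corr, int task_id)
{
	for (size_t i = 0; i < SKETCH_MAX_WAITERS; i++) {
		if (!sketch_waiters[i].waiting) {
			sketch_waiters[i].corr = corr;
			sketch_waiters[i].task_id = task_id;
			sketch_waiters[i].waiting = true;
			return true;
		}
	}
	return false;
}

/* Step 3: a subvention for corr releases every task parked on it. */
static int sketch_wake(uint16_t corr)
{
	int woken = 0;

	for (size_t i = 0; i < SKETCH_MAX_WAITERS; i++) {
		if (sketch_waiters[i].waiting && sketch_waiters[i].corr == corr) {
			sketch_waiters[i].waiting = false;
			woken++;
		}
	}
	return woken;
}
```

Keying the waiters by correlation number is what lets one subvention interrupt
wake exactly the tasks whose page has come back.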

TODO:
- Check how to keep this feature together with other CMO features.
- The async-pf check in the page fault handler path is limited to
  guest with an #ifdef. This isn't sufficient and hence needs to
  be replaced by an appropriate check.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/include/asm/async-pf.h       |  12 ++
 arch/powerpc/mm/fault.c                   |   7 +-
 arch/powerpc/platforms/pseries/Makefile   |   2 +-
 arch/powerpc/platforms/pseries/async-pf.c | 219 ++++++++++++++++++++++
 4 files changed, 238 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/include/asm/async-pf.h
 create mode 100644 arch/powerpc/platforms/pseries/async-pf.c

diff --git a/arch/powerpc/include/asm/async-pf.h b/arch/powerpc/include/asm/async-pf.h
new file mode 100644
index 000000000000..95d6c3da9f50
--- /dev/null
+++ b/arch/powerpc/include/asm/async-pf.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Async page fault support via PAPR Expropriation/Subvention Notification
+ * option(ESN)
+ *
+ * Copyright 2020 Bharata B Rao, IBM Corp. <bharata@linux.ibm.com>
+ */
+
+#ifndef _ASM_POWERPC_ASYNC_PF_H
+#define _ASM_POWERPC_ASYNC_PF_H
+int handle_async_page_fault(struct pt_regs *regs, unsigned long addr);
+#endif
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index a8d0ce85d39a..bbdc61605885 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -44,7 +44,7 @@
 #include <asm/debug.h>
 #include <asm/kup.h>
 #include <asm/inst.h>
-
+#include <asm/async-pf.h>
 
 /*
  * do_page_fault error handling helpers
@@ -395,6 +395,11 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 	vm_fault_t fault, major = 0;
 	bool kprobe_fault = kprobe_page_fault(regs, 11);
 
+#ifdef CONFIG_PPC_PSERIES
+	if (handle_async_page_fault(regs, address))
+		return 0;
+#endif
+
 	if (unlikely(debugger_fault_handler(regs) || kprobe_fault))
 		return 0;
 
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 4cda0ef87be0..e0ada605ef20 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -6,7 +6,7 @@ obj-y			:= lpar.o hvCall.o nvram.o reconfig.o \
 			   of_helpers.o \
 			   setup.o iommu.o event_sources.o ras.o \
 			   firmware.o power.o dlpar.o mobility.o rng.o \
-			   pci.o pci_dlpar.o eeh_pseries.o msi.o
+			   pci.o pci_dlpar.o eeh_pseries.o msi.o async-pf.o
 obj-$(CONFIG_SMP)	+= smp.o
 obj-$(CONFIG_SCANLOG)	+= scanlog.o
 obj-$(CONFIG_KEXEC_CORE)	+= kexec.o
diff --git a/arch/powerpc/platforms/pseries/async-pf.c b/arch/powerpc/platforms/pseries/async-pf.c
new file mode 100644
index 000000000000..c2f3bbc0d674
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/async-pf.c
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Async page fault support via PAPR Expropriation/Subvention Notification
+ * option(ESN)
+ *
+ * Copyright 2020 Bharata B Rao, IBM Corp. <bharata@linux.ibm.com>
+ */
+
+#include <linux/interrupt.h>
+#include <linux/swait.h>
+#include <linux/irqdomain.h>
+#include <asm/machdep.h>
+#include <asm/hvcall.h>
+#include <asm/paca.h>
+
+static char sns_buffer[PAGE_SIZE] __aligned(4096);
+static uint16_t *esn_q = (uint16_t *)sns_buffer + 1;
+static unsigned long next_eq_entry, nr_eq_entries;
+
+#define ASYNC_PF_SLEEP_HASHBITS 8
+#define ASYNC_PF_SLEEP_HASHSIZE (1<<ASYNC_PF_SLEEP_HASHBITS)
+
+/* Controls access to SNS buffer */
+static DEFINE_RAW_SPINLOCK(async_sns_guest_lock);
+
+/* Wait queue handling is from the x86 async-pf implementation */
+struct async_pf_sleep_node {
+	struct hlist_node link;
+	struct swait_queue_head wq;
+	u64 token;
+	int cpu;
+};
+
+static struct async_pf_sleep_head {
+	raw_spinlock_t lock;
+	struct hlist_head list;
+} async_pf_sleepers[ASYNC_PF_SLEEP_HASHSIZE];
+
+static struct async_pf_sleep_node *_find_apf_task(struct async_pf_sleep_head *b,
+						  u64 token)
+{
+	struct hlist_node *p;
+
+	hlist_for_each(p, &b->list) {
+		struct async_pf_sleep_node *n =
+			hlist_entry(p, typeof(*n), link);
+		if (n->token == token)
+			return n;
+	}
+
+	return NULL;
+}
+static int async_pf_queue_task(u64 token, struct async_pf_sleep_node *n)
+{
+	u64 key = hash_64(token, ASYNC_PF_SLEEP_HASHBITS);
+	struct async_pf_sleep_head *b = &async_pf_sleepers[key];
+	struct async_pf_sleep_node *e;
+
+	raw_spin_lock(&b->lock);
+	e = _find_apf_task(b, token);
+	if (e) {
+		/* dummy entry exists -> wake up was delivered ahead of PF */
+		hlist_del(&e->link);
+		raw_spin_unlock(&b->lock);
+		kfree(e);
+		return false;
+	}
+
+	n->token = token;
+	n->cpu = smp_processor_id();
+	init_swait_queue_head(&n->wq);
+	hlist_add_head(&n->link, &b->list);
+	raw_spin_unlock(&b->lock);
+	return true;
+}
+
+/*
+ * Handle Expropriation notification.
+ */
+int handle_async_page_fault(struct pt_regs *regs, unsigned long addr)
+{
+	struct async_pf_sleep_node n;
+	DECLARE_SWAITQUEUE(wait);
+	unsigned long exp_corr_nr;
+
+	/* Is this Expropriation notification? */
+	if (!(mfspr(SPRN_SRR1) & SRR1_PROGTRAP))
+		return 0;
+
+	if (unlikely(!user_mode(regs)))
+		panic("Host injected async PF in kernel mode\n");
+
+	exp_corr_nr = be16_to_cpu(get_lppaca()->exp_corr_nr);
+	if (!async_pf_queue_task(exp_corr_nr, &n))
+		return 0;
+
+	for (;;) {
+		prepare_to_swait_exclusive(&n.wq, &wait, TASK_UNINTERRUPTIBLE);
+		if (hlist_unhashed(&n.link))
+			break;
+
+		local_irq_enable();
+		schedule();
+		local_irq_disable();
+	}
+
+	finish_swait(&n.wq, &wait);
+	return 1;
+}
+
+static void apf_task_wake_one(struct async_pf_sleep_node *n)
+{
+	hlist_del_init(&n->link);
+	if (swq_has_sleeper(&n->wq))
+		swake_up_one(&n->wq);
+}
+
+static void async_pf_wake_task(u64 token)
+{
+	u64 key = hash_64(token, ASYNC_PF_SLEEP_HASHBITS);
+	struct async_pf_sleep_head *b = &async_pf_sleepers[key];
+	struct async_pf_sleep_node *n;
+
+again:
+	raw_spin_lock(&b->lock);
+	n = _find_apf_task(b, token);
+	if (!n) {
+		/*
+		 * async PF was not yet handled.
+		 * Add dummy entry for the token.
+		 */
+		n = kzalloc(sizeof(*n), GFP_ATOMIC);
+		if (!n) {
+			/*
+			 * Allocation failed! Busy wait while other cpu
+			 * handles async PF.
+			 */
+			raw_spin_unlock(&b->lock);
+			cpu_relax();
+			goto again;
+		}
+		n->token = token;
+		n->cpu = smp_processor_id();
+		init_swait_queue_head(&n->wq);
+		hlist_add_head(&n->link, &b->list);
+	} else {
+		apf_task_wake_one(n);
+	}
+	raw_spin_unlock(&b->lock);
+}
+
+/*
+ * Handle Subvention notification.
+ */
+static irqreturn_t async_pf_handler(int irq, void *dev_id)
+{
+	uint16_t exp_token, old;
+
+	raw_spin_lock(&async_sns_guest_lock);
+	do {
+		exp_token = *(esn_q + next_eq_entry);
+		if (!exp_token)
+			break;
+
+		old = arch_cmpxchg(esn_q + next_eq_entry, exp_token, 0);
+		BUG_ON(old != exp_token);
+
+		async_pf_wake_task(exp_token);
+		next_eq_entry = (next_eq_entry + 1) % nr_eq_entries;
+	} while (1);
+	raw_spin_unlock(&async_sns_guest_lock);
+	return IRQ_HANDLED;
+}
+
+static int __init pseries_async_pf_init(void)
+{
+	long rc;
+	unsigned long ret[PLPAR_HCALL_BUFSIZE];
+	unsigned int irq, cpu;
+	int i;
+
+	/* Register buffer via H_REG_SNS */
+	rc = plpar_hcall(H_REG_SNS, ret, __pa(sns_buffer), PAGE_SIZE);
+	if (rc != H_SUCCESS)
+		return -1;
+
+	nr_eq_entries = (PAGE_SIZE - 2) / sizeof(uint16_t);
+
+	/* Register irq handler */
+	irq = irq_create_mapping(NULL, ret[1]);
+	if (!irq) {
+		plpar_hcall(H_REG_SNS, ret, -1, PAGE_SIZE);
+		return -1;
+	}
+
+	rc = request_irq(irq, async_pf_handler, 0, "sns-interrupt", NULL);
+	if (rc < 0) {
+		plpar_hcall(H_REG_SNS, ret, -1, PAGE_SIZE);
+		return -1;
+	}
+
+	for (i = 0; i < ASYNC_PF_SLEEP_HASHSIZE; i++)
+		raw_spin_lock_init(&async_pf_sleepers[i].lock);
+
+	/*
+	 * Enable subvention notifications from the hypervisor
+	 * by setting bit 0, byte 0 of SNS buffer
+	 */
+	*sns_buffer |= 0x1;
+
+	/* Enable LPPACA_EXP_INT_ENABLED in VPA */
+	for_each_possible_cpu(cpu)
+		lppaca_of(cpu).byte_b9 |= LPPACA_EXP_INT_ENABLED;
+
+	pr_info("%s: Enabled Async PF\n", __func__);
+	return 0;
+}
+
+machine_arch_initcall(pseries, pseries_async_pf_init);
-- 
2.31.1



* [RFC PATCH v0 4/5] KVM: PPC: BOOK3S HV: Async PF support
From: Bharata B Rao @ 2021-08-05  7:24 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, kvm, bharata.rao
In-Reply-To: <20210805072439.501481-1-bharata@linux.ibm.com>

Add asynchronous page fault support for PowerKVM by making
use of the Expropriation/Subvention Notification Option
defined by PAPR specifications.

1. When a guest-accessed page isn't immediately available in the
host, update the vcpu's VPA with a unique expropriation correlation
number and inject a DSI into the guest with the SRR1_PROGTRAP bit set
in SRR1. This tells the guest vcpu to put the faulting process to
sleep and schedule a different process.
   - Async PF is supported for data pages in this implementation
     though PAPR allows it for code pages too.
   - Async PF is supported only for user pages here.
   - The feature is currently limited only to radix guests.

2. When the page becomes available, update the Subvention Notification
Structure with the corresponding expropriation correlation number and
inform the guest via the subvention interrupt.
   - Subvention Notification Structure (SNS) is a region of memory
     shared between host and guest via which the communication related
     to expropriated and subvened pages happens between guest and host.
   - SNS region is registered by the guest via H_REG_SNS hcall which
     is implemented in QEMU.
   - H_REG_SNS implementation in QEMU needs a new ioctl KVM_PPC_SET_SNS.
     This ioctl is used to map and pin the guest page containing SNS
     in the host.
   - Subvention notification interrupt is raised to the guest by
     QEMU in response to the guest exit via KVM_REQ_ESN_EXIT. This
     interrupt informs the guest about the availability of the
     pages.
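
To make the SNS exchange in step 2 concrete, here is a userspace
sketch of the layout this series uses: byte 0 is the notification
control byte, byte 1 is the event queue state, and 16-bit event queue
entries start at byte 2. The host posts a correlation number into the
next free entry and the guest drains non-zero entries. The struct
sns_model type and the sns_*() helpers are hypothetical, locking and
endianness are ignored, and both sides live in one process purely for
illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SNS_SIZE		4096
#define SNS_NR_ENTRIES		((SNS_SIZE - 2) / sizeof(uint16_t))
#define SNS_CTRL_TRIGGER	0x1	/* guest enables notifications */
#define SNS_STATE_OVERFLOW	0x1	/* host found no free entry */

struct sns_model {
	uint16_t words[SNS_SIZE / sizeof(uint16_t)];	/* the shared page */
	size_t host_next;	/* next entry the host will fill */
	size_t guest_next;	/* next entry the guest will read */
};

static uint8_t *sns_ctrl(struct sns_model *s)  { return (uint8_t *)s->words; }
static uint8_t *sns_state(struct sns_model *s) { return (uint8_t *)s->words + 1; }
static uint16_t *sns_eq(struct sns_model *s)   { return s->words + 1; }

void sns_init(struct sns_model *s)
{
	memset(s, 0, sizeof(*s));
}

/* Guest side: set the trigger bit so the host may post entries. */
void sns_enable(struct sns_model *s)
{
	*sns_ctrl(s) |= SNS_CTRL_TRIGGER;
}

/*
 * Host side: post a non-zero token. Returns 0 on success, -1 if the
 * guest has not enabled notifications or the queue is full (in which
 * case the overflow state bit is also set).
 */
int sns_post(struct sns_model *s, uint16_t token)
{
	uint16_t *eq = sns_eq(s);

	if (token == 0 || !(*sns_ctrl(s) & SNS_CTRL_TRIGGER))
		return -1;
	if (eq[s->host_next] != 0) {
		*sns_state(s) |= SNS_STATE_OVERFLOW;
		return -1;
	}
	eq[s->host_next] = token;
	s->host_next = (s->host_next + 1) % SNS_NR_ENTRIES;
	return 0;
}

/* Guest side: copy up to @max pending tokens to @out and clear them. */
size_t sns_drain(struct sns_model *s, uint16_t *out, size_t max)
{
	uint16_t *eq = sns_eq(s);
	size_t n = 0;

	while (n < max && eq[s->guest_next] != 0) {
		out[n++] = eq[s->guest_next];
		eq[s->guest_next] = 0;
		s->guest_next = (s->guest_next + 1) % SNS_NR_ENTRIES;
	}
	return n;
}
```

A zero entry doubles as the "empty" marker, which is why the
correlation number must never be zero.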

TODO:
- H_REG_SNS is implemented in QEMU because this hcall needs to return
  the interrupt source number associated with the subvention interrupt.
  Claiming of IRQ line and raising an external interrupt seem to be
  straightforward from QEMU. Figure out the in-kernel equivalents for
  these two so that we can save a guest exit for each expropriated
  page and move the entire hcall implementation into the host kernel.
- The code is pretty much experimental and is barely able to boot a
  guest. I do see some requests for expropriated pages not getting
  fulfilled by the host, leading to long delays in the guest. This
  needs some debugging.
- A few other aspects recommended by PAPR around this feature (like
  setting of page state flags) need to be evaluated and incorporated
  into the implementation if found appropriate.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 Documentation/virt/kvm/api.rst            |  15 ++
 arch/powerpc/include/asm/hvcall.h         |   1 +
 arch/powerpc/include/asm/kvm_book3s_esn.h |  24 +++
 arch/powerpc/include/asm/kvm_host.h       |  21 +++
 arch/powerpc/include/asm/kvm_ppc.h        |   1 +
 arch/powerpc/include/asm/lppaca.h         |  12 +-
 arch/powerpc/include/uapi/asm/kvm.h       |   6 +
 arch/powerpc/kvm/Kconfig                  |   2 +
 arch/powerpc/kvm/Makefile                 |   5 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c    |   3 +
 arch/powerpc/kvm/book3s_hv.c              |  25 +++
 arch/powerpc/kvm/book3s_hv_esn.c          | 189 ++++++++++++++++++++++
 include/uapi/linux/kvm.h                  |   1 +
 tools/include/uapi/linux/kvm.h            |   1 +
 14 files changed, 303 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_esn.h
 create mode 100644 arch/powerpc/kvm/book3s_hv_esn.c

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index dae68e68ca23..512f078b9d02 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5293,6 +5293,21 @@ the trailing ``'\0'``, is indicated by ``name_size`` in the header.
 The Stats Data block contains an array of 64-bit values in the same order
 as the descriptors in Descriptors block.
 
+4.134 KVM_PPC_SET_SNS
+---------------------
+
+:Capability: basic
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_sns_reg (in)
+:Returns: 0 on successful completion, -EFAULT if the argument cannot be read
+
+As part of H_REG_SNS hypercall, this ioctl is used to map and pin
+the guest provided SNS structure in the host.
+
+This is used for providing asynchronous page fault support for
+powerpc pseries KVM guests.
+
 5. The kvm_run structure
 ========================
 
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 9bcf345cb208..9e33500c1723 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -321,6 +321,7 @@
 #define H_SCM_UNBIND_ALL        0x3FC
 #define H_SCM_HEALTH            0x400
 #define H_SCM_PERFORMANCE_STATS 0x418
+#define H_REG_SNS		0x41C
 #define H_RPT_INVALIDATE	0x448
 #define H_SCM_FLUSH		0x44C
 #define MAX_HCALL_OPCODE	H_SCM_FLUSH
diff --git a/arch/powerpc/include/asm/kvm_book3s_esn.h b/arch/powerpc/include/asm/kvm_book3s_esn.h
new file mode 100644
index 000000000000..d79a441ea31d
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s_esn.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_KVM_BOOK3S_ESN_H__
+#define __ASM_KVM_BOOK3S_ESN_H__
+
+/* SNS buffer EQ state flags */
+#define SNS_EQ_STATE_OPERATIONAL 0x0
+#define SNS_EQ_STATE_OVERFLOW 0x1
+
+/* SNS buffer Notification control bits */
+#define SNS_EQ_CNTRL_TRIGGER 0x1
+
+struct kvmppc_sns {
+	unsigned long gpa;
+	unsigned long len;
+	void *hva;
+	uint16_t exp_corr_nr;
+	uint16_t *eq;
+	uint8_t *eq_cntrl;
+	uint8_t *eq_state;
+	unsigned long next_eq_entry;
+	unsigned long nr_eq_entries;
+};
+
+#endif /* __ASM_KVM_BOOK3S_ESN_H__ */
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 204dc2d91388..8d7f73085ef5 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <asm/cacheflush.h>
 #include <asm/hvcall.h>
 #include <asm/mce.h>
+#include <asm/kvm_book3s_esn.h>
 
 #define KVM_MAX_VCPUS		NR_CPUS
 #define KVM_MAX_VCORES		NR_CPUS
@@ -325,6 +326,7 @@ struct kvm_arch {
 #endif
 	struct kvmppc_ops *kvm_ops;
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+	struct kvmppc_sns sns;
 	struct mutex uvmem_lock;
 	struct list_head uvmem_pfns;
 	struct mutex mmu_setup_lock;	/* nests inside vcpu mutexes */
@@ -855,6 +857,25 @@ struct kvm_vcpu_arch {
 #define __KVM_HAVE_ARCH_WQP
 #define __KVM_HAVE_CREATE_DEVICE
 
+/* Async pf */
+#define ASYNC_PF_PER_VCPU       64
+struct kvm_arch_async_pf {
+	unsigned long exp_token;
+};
+int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu,
+			       unsigned long gpa, unsigned long hva);
+
+void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
+			       struct kvm_async_pf *work);
+
+bool kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
+				     struct kvm_async_pf *work);
+
+void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
+				 struct kvm_async_pf *work);
+bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu);
+static inline void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu) {}
+
 static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 09235bdfd4ac..c14a84041d0e 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -228,6 +228,7 @@ extern long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
 int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
 
 extern int kvm_vm_ioctl_rtas_define_token(struct kvm *kvm, void __user *argp);
+long kvm_vm_ioctl_set_sns(struct kvm *kvm, struct kvm_ppc_sns_reg *sns_reg);
 extern int kvmppc_rtas_hcall(struct kvm_vcpu *vcpu);
 extern void kvmppc_rtas_tokens_free(struct kvm *kvm);
 
diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 57e432766f3e..17e89c3865e8 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -104,7 +104,17 @@ struct lppaca {
 	volatile __be32 dispersion_count; /* dispatch changed physical cpu */
 	volatile __be64 cmo_faults;	/* CMO page fault count */
 	volatile __be64 cmo_fault_time;	/* CMO page fault time */
-	u8	reserved10[104];
+
+	/*
+	 * TODO: Insert this at correct offset
+	 * 0x17D - Exp flags (1 byte)
+	 * 0x17E - Exp corr number (2 bytes)
+	 *
+	 * Here I am using only exp corr number at an easy to insert
+	 * offset.
+	 */
+	__be16 exp_corr_nr; /* Expropriation correlation number */
+	u8	reserved10[102];
 
 	/* cacheline 4-5 */
 
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 9f18fa090f1f..d72739126ae5 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -470,6 +470,12 @@ struct kvm_ppc_cpu_char {
 #define KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR	(1ULL << 61)
 #define KVM_PPC_CPU_BEHAV_FLUSH_COUNT_CACHE	(1ull << 58)
 
+/* For KVM_PPC_SET_SNS */
+struct kvm_ppc_sns_reg {
+	__u64 addr;
+	__u64 len;
+};
+
 /* Per-vcpu XICS interrupt controller state */
 #define KVM_REG_PPC_ICP_STATE	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8c)
 
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index e45644657d49..4f552649a4b2 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -85,6 +85,8 @@ config KVM_BOOK3S_64_HV
 	depends on KVM_BOOK3S_64 && PPC_POWERNV
 	select KVM_BOOK3S_HV_POSSIBLE
 	select MMU_NOTIFIER
+	select KVM_ASYNC_PF
+	select KVM_ASYNC_PF_SYNC
 	select CMA
 	help
 	  Support running unmodified book3s_64 guest kernels in
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 583c14ef596e..603ab382d021 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -6,7 +6,7 @@
 ccflags-y := -Ivirt/kvm -Iarch/powerpc/kvm
 KVM := ../../../virt/kvm
 
-common-objs-y = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/binary_stats.o
+common-objs-y = $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/binary_stats.o $(KVM)/async_pf.o
 common-objs-$(CONFIG_KVM_VFIO) += $(KVM)/vfio.o
 common-objs-$(CONFIG_KVM_MMIO) += $(KVM)/coalesced_mmio.o
 
@@ -70,7 +70,8 @@ kvm-hv-y += \
 	book3s_hv_interrupts.o \
 	book3s_64_mmu_hv.o \
 	book3s_64_mmu_radix.o \
-	book3s_hv_nested.o
+	book3s_hv_nested.o \
+	book3s_hv_esn.o
 
 kvm-hv-$(CONFIG_PPC_UV) += \
 	book3s_hv_uvmem.o
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 618206a504b0..1985f84bfebe 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -837,6 +837,9 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 	} else {
 		unsigned long pfn;
 
+		if (kvm_arch_setup_async_pf(vcpu, gpa, hva))
+			return RESUME_GUEST;
+
 		/* Call KVM generic code to do the slow-path check */
 		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
 					   writing, upgrade_p, NULL);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d07e9065f7c1..5cc564321521 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -77,6 +77,7 @@
 #include <asm/ultravisor.h>
 #include <asm/dtl.h>
 #include <asm/plpar_wrappers.h>
+#include <asm/kvm_book3s_esn.h>
 
 #include "book3s.h"
 
@@ -4570,6 +4571,11 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 		return -EINTR;
 	}
 
+	if (kvm_request_pending(vcpu)) {
+		if (!kvmppc_core_check_requests(vcpu))
+			return 0;
+	}
+
 	kvm = vcpu->kvm;
 	atomic_inc(&kvm->arch.vcpus_running);
 	/* Order vcpus_running vs. mmu_ready, see kvmppc_alloc_reset_hpt */
@@ -4591,6 +4597,7 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 	vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;
 
 	do {
+		kvm_check_async_pf_completion(vcpu);
 		if (cpu_has_feature(CPU_FTR_ARCH_300))
 			r = kvmhv_run_single_vcpu(vcpu, ~(u64)0,
 						  vcpu->arch.vcore->lpcr);
@@ -5257,6 +5264,8 @@ static void kvmppc_free_vcores(struct kvm *kvm)
 
 static void kvmppc_core_destroy_vm_hv(struct kvm *kvm)
 {
+	struct kvm_ppc_sns_reg sns_reg;
+
 	debugfs_remove_recursive(kvm->arch.debugfs_dir);
 
 	if (!cpu_has_feature(CPU_FTR_ARCH_300))
@@ -5283,6 +5292,11 @@ static void kvmppc_core_destroy_vm_hv(struct kvm *kvm)
 	kvmppc_free_lpid(kvm->arch.lpid);
 
 	kvmppc_free_pimap(kvm);
+
+	/* Needed for de-registering SNS buffer */
+	sns_reg.addr = -1;
+	sns_reg.len = 0;
+	kvm_vm_ioctl_set_sns(kvm, &sns_reg);
 }
 
 /* We don't need to emulate any privileged instructions or dcbz */
@@ -5561,6 +5575,17 @@ static long kvm_arch_vm_ioctl_hv(struct file *filp,
 		break;
 	}
 
+	case KVM_PPC_SET_SNS: {
+		struct kvm_ppc_sns_reg sns_reg;
+
+		r = -EFAULT;
+		if (copy_from_user(&sns_reg, argp, sizeof(sns_reg)))
+			break;
+
+		r = kvm_vm_ioctl_set_sns(kvm, &sns_reg);
+		break;
+	}
+
 	default:
 		r = -ENOTTY;
 	}
diff --git a/arch/powerpc/kvm/book3s_hv_esn.c b/arch/powerpc/kvm/book3s_hv_esn.c
new file mode 100644
index 000000000000..b322a14c1f83
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_hv_esn.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Async page fault support via PAPR Expropriation/Subvention Notification
+ * option(ESN)
+ *
+ * Copyright 2020 Bharata B Rao, IBM Corp. <bharata@linux.ibm.com>
+ */
+
+#include <linux/kvm_host.h>
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s_esn.h>
+
+static DEFINE_SPINLOCK(async_exp_lock); /* for updating exp_corr_nr */
+static DEFINE_SPINLOCK(async_sns_lock); /* SNS buffer updated under this lock */
+
+int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu,
+			       unsigned long gpa, unsigned long hva)
+{
+	struct kvm_arch_async_pf arch;
+	struct lppaca *vpa = vcpu->arch.vpa.pinned_addr;
+	u64 msr = kvmppc_get_msr(vcpu);
+	struct kvmppc_sns *sns = &vcpu->kvm->arch.sns;
+
+	/*
+	 * If VPA hasn't been registered yet, can't support
+	 * async pf.
+	 */
+	if (!vpa)
+		return 0;
+
+	/*
+	 * If SNS memory area hasn't been registered yet,
+	 * can't support async pf.
+	 */
+	if (!vcpu->kvm->arch.sns.eq)
+		return 0;
+
+	/*
+	 * If guest hasn't enabled expropriation interrupt,
+	 * don't try async pf.
+	 */
+	if (!(vpa->byte_b9 & LPPACA_EXP_INT_ENABLED))
+		return 0;
+
+	/*
+	 * If the fault is in the guest kernel, don't
+	 * try async pf.
+	 */
+	if (!(msr & MSR_PR) && !(msr & MSR_HV))
+		return 0;
+
+	spin_lock(&async_sns_lock);
+	/*
+	 * Check if subvention event queue can
+	 * overflow, if so, don't try async pf.
+	 */
+	if (*(sns->eq + sns->next_eq_entry)) {
+		pr_err("%s: SNS buffer overflow\n", __func__);
+		spin_unlock(&async_sns_lock);
+		return 0;
+	}
+	spin_unlock(&async_sns_lock);
+
+	/*
+	 * TODO:
+	 *
+	 * 1. Update exp flags bit 7 to 1
+	 * ("The Subvened page data will be restored")
+	 *
+	 * 2. Check if request to this page has been
+	 * notified to guest earlier, if so send back
+	 * the same exp corr number.
+	 *
+	 * 3. exp_corr_nr could be a random but non-zero
+	 * number. Not taking care of wrapping here. Fix
+	 * it.
+	 */
+	spin_lock(&async_exp_lock);
+	vpa->exp_corr_nr = cpu_to_be16(vcpu->kvm->arch.sns.exp_corr_nr);
+	arch.exp_token = vcpu->kvm->arch.sns.exp_corr_nr++;
+	spin_unlock(&async_exp_lock);
+
+	return kvm_setup_async_pf(vcpu, gpa, hva, &arch);
+}
+
+bool kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
+				     struct kvm_async_pf *work)
+{
+	/* Inject DSI to guest with srr1 bit 46 set */
+	kvmppc_core_queue_data_storage(vcpu, kvmppc_get_dar(vcpu), DSISR_NOHPTE, SRR1_PROGTRAP);
+	return true;
+}
+
+void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
+				 struct kvm_async_pf *work)
+{
+	struct kvmppc_sns *sns = &vcpu->kvm->arch.sns;
+
+	spin_lock(&async_sns_lock);
+	if (*sns->eq_cntrl != SNS_EQ_CNTRL_TRIGGER) {
+		pr_err("%s: SNS Notification Trigger not set by guest\n", __func__);
+		spin_unlock(&async_sns_lock);
+		/* TODO: Terminate the guest? */
+		return;
+	}
+
+	if (arch_cmpxchg(sns->eq + sns->next_eq_entry, 0,
+	    work->arch.exp_token)) {
+		*sns->eq_state |= SNS_EQ_STATE_OVERFLOW;
+		pr_err("%s: SNS buffer overflow\n", __func__);
+		spin_unlock(&async_sns_lock);
+		/* TODO: Terminate the guest? */
+		return;
+	}
+
+	sns->next_eq_entry = (sns->next_eq_entry + 1) % sns->nr_eq_entries;
+	spin_unlock(&async_sns_lock);
+
+	/*
+	 * Request a guest exit so that ESN virtual interrupt can
+	 * be injected by QEMU.
+	 */
+	kvm_make_request(KVM_REQ_ESN_EXIT, vcpu);
+}
+
+void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
+{
+	/* We will inject the page directly */
+}
+
+bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * PowerPC will always inject the page directly,
+	 * but we still want check_async_completion to cleanup
+	 */
+	return true;
+}
+
+long kvm_vm_ioctl_set_sns(struct kvm *kvm, struct kvm_ppc_sns_reg *sns_reg)
+{
+	unsigned long nb;
+
+	/* Deregister */
+	if (sns_reg->addr == -1) {
+		if (!kvm->arch.sns.hva)
+			return 0;
+
+		pr_info("%s: Deregistering SNS buffer for LPID %d\n",
+			__func__, kvm->arch.lpid);
+		kvmppc_unpin_guest_page(kvm, kvm->arch.sns.hva, kvm->arch.sns.gpa, false);
+		kvm->arch.sns.gpa = -1;
+		kvm->arch.sns.hva = 0;
+		return 0;
+	}
+
+	/*
+	 * Already registered with the same address?
+	 */
+	if (sns_reg->addr == kvm->arch.sns.gpa)
+		return 0;
+
+	/* If previous registration exists, free it */
+	if (kvm->arch.sns.hva) {
+		pr_info("%s: Deregistering Previous SNS buffer for LPID %d\n",
+			__func__, kvm->arch.lpid);
+		kvmppc_unpin_guest_page(kvm, kvm->arch.sns.hva, kvm->arch.sns.gpa, false);
+		kvm->arch.sns.gpa = -1;
+		kvm->arch.sns.hva = 0;
+	}
+
+	kvm->arch.sns.gpa = sns_reg->addr;
+	kvm->arch.sns.hva = kvmppc_pin_guest_page(kvm, kvm->arch.sns.gpa, &nb);
+	kvm->arch.sns.len = sns_reg->len;
+	kvm->arch.sns.nr_eq_entries = (kvm->arch.sns.len - 2) / sizeof(uint16_t);
+	kvm->arch.sns.next_eq_entry = 0;
+	kvm->arch.sns.eq = kvm->arch.sns.hva + 2;
+	kvm->arch.sns.eq_cntrl = kvm->arch.sns.hva;
+	kvm->arch.sns.eq_state = kvm->arch.sns.hva + 1;
+	kvm->arch.sns.exp_corr_nr = 1; /* Should be non-zero */
+
+	*(kvm->arch.sns.eq_state) = SNS_EQ_STATE_OPERATIONAL;
+
+	pr_info("%s: Registering SNS buffer for LPID %d sns_addr %llx eq %lx\n",
+		__func__, kvm->arch.lpid, sns_reg->addr,
+		(unsigned long)kvm->arch.sns.eq);
+
+	return 0;
+}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 47be532ed14b..dbe65e8d68d8 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1459,6 +1459,7 @@ struct kvm_s390_ucas_mapping {
 #define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
 #define KVM_PPC_SVM_OFF		  _IO(KVMIO,  0xb3)
 #define KVM_ARM_MTE_COPY_TAGS	  _IOR(KVMIO,  0xb4, struct kvm_arm_copy_mte_tags)
+#define KVM_PPC_SET_SNS		  _IOR(KVMIO, 0xb5, struct kvm_ppc_sns_reg)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index d9e4aabcb31a..e9dea164498f 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -1458,6 +1458,7 @@ struct kvm_s390_ucas_mapping {
 #define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
 #define KVM_PPC_SVM_OFF		  _IO(KVMIO,  0xb3)
 #define KVM_ARM_MTE_COPY_TAGS	  _IOR(KVMIO,  0xb4, struct kvm_arm_copy_mte_tags)
+#define KVM_PPC_SET_SNS		  _IOR(KVMIO, 0xb5, struct kvm_ppc_sns_reg)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
-- 
2.31.1



* [RFC PATCH v0 3/5] KVM: PPC: Book3S: Enable setting SRR1 flags for DSI
From: Bharata B Rao @ 2021-08-05  7:24 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, kvm, bharata.rao
In-Reply-To: <20210805072439.501481-1-bharata@linux.ibm.com>

kvmppc_core_queue_data_storage() doesn't provide an option to
set SRR1 flags when raising a DSI. Since kvmppc_inject_interrupt()
already accepts such flags, add an argument to pass them through.

This will be used to raise a DSI with SRR1_PROGTRAP set when an
expropriation interrupt needs to be injected into the guest.
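
As a rough illustration of the interface change, the userspace model
below threads an extra SRR1 flags argument through a DSI-queueing
helper and shows how the receiving side can tell an expropriation
notification from an ordinary DSI. Everything here is hypothetical:
struct vcpu_model, queue_data_storage() and is_expropriation() are
stand-ins, SRR1 is simplified to "MSR plus flags", and the flag value
assumes the "srr1 bit 46" (IBM bit numbering) mentioned elsewhere in
this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed from the "srr1 bit 46" comment; IBM numbering, bit 0 is the MSB. */
#define MODEL_SRR1_EXPROPRIATION	(UINT64_C(1) << (63 - 46))
/* Illustrative DSISR value meaning "no translation found". */
#define MODEL_DSISR_NOHPTE		UINT32_C(0x40000000)

struct vcpu_model {
	uint64_t msr;
	uint64_t srr1;
	uint64_t dar;
	uint32_t dsisr;
};

/*
 * Queue a data storage interrupt. Callers that have nothing to signal
 * pass 0 for @srr1_flags, which keeps the previous behaviour.
 */
void queue_data_storage(struct vcpu_model *v, uint64_t dar, uint32_t dsisr,
			uint64_t srr1_flags)
{
	v->dar = dar;
	v->dsisr = dsisr;
	/* Simplification: real hardware saves only selected MSR bits. */
	v->srr1 = v->msr | srr1_flags;
}

/* Guest-side check: was this DSI an expropriation notification? */
bool is_expropriation(const struct vcpu_model *v)
{
	return (v->srr1 & MODEL_SRR1_EXPROPRIATION) != 0;
}
```

Passing 0 at every existing call site, as the diff below does, leaves
their behaviour unchanged.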

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/include/asm/kvm_ppc.h     | 3 ++-
 arch/powerpc/kvm/book3s.c              | 6 +++---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 +++---
 arch/powerpc/kvm/book3s_hv.c           | 4 ++--
 arch/powerpc/kvm/book3s_hv_nested.c    | 4 ++--
 arch/powerpc/kvm/book3s_pr.c           | 4 ++--
 6 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 2d88944f9f34..09235bdfd4ac 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -143,7 +143,8 @@ extern void kvmppc_core_queue_dtlb_miss(struct kvm_vcpu *vcpu, ulong dear_flags,
 					ulong esr_flags);
 extern void kvmppc_core_queue_data_storage(struct kvm_vcpu *vcpu,
 					   ulong dear_flags,
-					   ulong esr_flags);
+					   ulong esr_flags,
+					   ulong srr1_flags);
 extern void kvmppc_core_queue_itlb_miss(struct kvm_vcpu *vcpu);
 extern void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu,
 					   ulong esr_flags);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 79833f78d1da..f7f6641a788d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -284,11 +284,11 @@ void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu)
 }
 
 void kvmppc_core_queue_data_storage(struct kvm_vcpu *vcpu, ulong dar,
-				    ulong flags)
+				    ulong dsisr, ulong srr1)
 {
 	kvmppc_set_dar(vcpu, dar);
-	kvmppc_set_dsisr(vcpu, flags);
-	kvmppc_inject_interrupt(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE, 0);
+	kvmppc_set_dsisr(vcpu, dsisr);
+	kvmppc_inject_interrupt(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE, srr1);
 }
 EXPORT_SYMBOL_GPL(kvmppc_core_queue_data_storage);
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index b5905ae4377c..618206a504b0 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -946,7 +946,7 @@ int kvmppc_book3s_radix_page_fault(struct kvm_vcpu *vcpu,
 	if (dsisr & DSISR_BADACCESS) {
 		/* Reflect to the guest as DSI */
 		pr_err("KVM: Got radix HV page fault with DSISR=%lx\n", dsisr);
-		kvmppc_core_queue_data_storage(vcpu, ea, dsisr);
+		kvmppc_core_queue_data_storage(vcpu, ea, dsisr, 0);
 		return RESUME_GUEST;
 	}
 
@@ -971,7 +971,7 @@ int kvmppc_book3s_radix_page_fault(struct kvm_vcpu *vcpu,
 			 * Bad address in guest page table tree, or other
 			 * unusual error - reflect it to the guest as DSI.
 			 */
-			kvmppc_core_queue_data_storage(vcpu, ea, dsisr);
+			kvmppc_core_queue_data_storage(vcpu, ea, dsisr, 0);
 			return RESUME_GUEST;
 		}
 		return kvmppc_hv_emulate_mmio(vcpu, gpa, ea, writing);
@@ -981,7 +981,7 @@ int kvmppc_book3s_radix_page_fault(struct kvm_vcpu *vcpu,
 		if (writing) {
 			/* give the guest a DSI */
 			kvmppc_core_queue_data_storage(vcpu, ea, DSISR_ISSTORE |
-						       DSISR_PROTFAULT);
+						       DSISR_PROTFAULT, 0);
 			return RESUME_GUEST;
 		}
 		kvm_ro = true;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 47ccd4a2df54..d07e9065f7c1 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1592,7 +1592,7 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 
 		if (!(vcpu->arch.fault_dsisr & (DSISR_NOHPTE | DSISR_PROTFAULT))) {
 			kvmppc_core_queue_data_storage(vcpu,
-				vcpu->arch.fault_dar, vcpu->arch.fault_dsisr);
+				vcpu->arch.fault_dar, vcpu->arch.fault_dsisr, 0);
 			r = RESUME_GUEST;
 			break;
 		}
@@ -1610,7 +1610,7 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 			r = RESUME_PAGE_FAULT;
 		} else {
 			kvmppc_core_queue_data_storage(vcpu,
-				vcpu->arch.fault_dar, err);
+				vcpu->arch.fault_dar, err, 0);
 			r = RESUME_GUEST;
 		}
 		break;
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 898f942eb198..a10ef0d5f925 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -1556,7 +1556,7 @@ static long int __kvmhv_nested_page_fault(struct kvm_vcpu *vcpu,
 	if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID)) {
 		if (dsisr & (DSISR_PRTABLE_FAULT | DSISR_BADACCESS)) {
 			/* unusual error -> reflect to the guest as a DSI */
-			kvmppc_core_queue_data_storage(vcpu, ea, dsisr);
+			kvmppc_core_queue_data_storage(vcpu, ea, dsisr, 0);
 			return RESUME_GUEST;
 		}
 
@@ -1567,7 +1567,7 @@ static long int __kvmhv_nested_page_fault(struct kvm_vcpu *vcpu,
 		if (writing) {
 			/* Give the guest a DSI */
 			kvmppc_core_queue_data_storage(vcpu, ea,
-					DSISR_ISSTORE | DSISR_PROTFAULT);
+					DSISR_ISSTORE | DSISR_PROTFAULT, 0);
 			return RESUME_GUEST;
 		}
 		kvm_ro = true;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 6bc9425acb32..f7fc8e01fd8e 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -754,7 +754,7 @@ static int kvmppc_handle_pagefault(struct kvm_vcpu *vcpu,
 			flags = DSISR_NOHPTE;
 		if (data) {
 			flags |= vcpu->arch.fault_dsisr & DSISR_ISSTORE;
-			kvmppc_core_queue_data_storage(vcpu, eaddr, flags);
+			kvmppc_core_queue_data_storage(vcpu, eaddr, flags, 0);
 		} else {
 			kvmppc_core_queue_inst_storage(vcpu, flags);
 		}
@@ -1229,7 +1229,7 @@ int kvmppc_handle_exit_pr(struct kvm_vcpu *vcpu, unsigned int exit_nr)
 			r = kvmppc_handle_pagefault(vcpu, dar, exit_nr);
 			srcu_read_unlock(&vcpu->kvm->srcu, idx);
 		} else {
-			kvmppc_core_queue_data_storage(vcpu, dar, fault_dsisr);
+			kvmppc_core_queue_data_storage(vcpu, dar, fault_dsisr, 0);
 			r = RESUME_GUEST;
 		}
 		break;
-- 
2.31.1



* [RFC PATCH v0 2/5] KVM: PPC: Add support for KVM_REQ_ESN_EXIT
From: Bharata B Rao @ 2021-08-05  7:24 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, kvm, bharata.rao
In-Reply-To: <20210805072439.501481-1-bharata@linux.ibm.com>

Add a new KVM exit request, KVM_REQ_ESN_EXIT, that will be used
to exit to userspace (QEMU) whenever a subvention notification
needs to be sent to the guest.

Userspace (QEMU) issues the subvention notification by
injecting an interrupt into the guest.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/include/asm/kvm_host.h | 1 +
 arch/powerpc/kvm/book3s_hv.c        | 8 ++++++++
 include/uapi/linux/kvm.h            | 1 +
 3 files changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 9f52f282b1aa..204dc2d91388 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -52,6 +52,7 @@
 #define KVM_REQ_WATCHDOG	KVM_ARCH_REQ(0)
 #define KVM_REQ_EPR_EXIT	KVM_ARCH_REQ(1)
 #define KVM_REQ_PENDING_TIMER	KVM_ARCH_REQ(2)
+#define KVM_REQ_ESN_EXIT	KVM_ARCH_REQ(3)
 
 #include <linux/mmu_notifier.h>
 
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 085fb8ecbf68..47ccd4a2df54 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2820,6 +2820,14 @@ static void kvmppc_core_vcpu_free_hv(struct kvm_vcpu *vcpu)
 
 static int kvmppc_core_check_requests_hv(struct kvm_vcpu *vcpu)
 {
+	/*
+	 * If subvention interrupt needs to be injected to the guest
+	 * exit to user space.
+	 */
+	if (kvm_check_request(KVM_REQ_ESN_EXIT, vcpu)) {
+		vcpu->run->exit_reason = KVM_EXIT_ESN;
+		return 0;
+	}
 	/* Indicate we want to get back into the guest */
 	return 1;
 }
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d9e4aabcb31a..47be532ed14b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -269,6 +269,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_AP_RESET_HOLD    32
 #define KVM_EXIT_X86_BUS_LOCK     33
 #define KVM_EXIT_XEN              34
+#define KVM_EXIT_ESN		  35
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
-- 
2.31.1



* [RFC PATCH v0 0/5] PPC: KVM: pseries: Asynchronous page fault
From: Bharata B Rao @ 2021-08-05  7:24 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, kvm, bharata.rao

Hi,

This series adds asynchronous page fault support for pseries guests
and enables the support for the same in powerpc KVM. This is an
early RFC with details and multiple TODOs listed in patch descriptions.

This series also needs supporting enablement in QEMU, which will be
posted separately.

Bharata B Rao (5):
  powerpc: Define Expropriation interrupt bit to VPA byte offset 0xB9
  KVM: PPC: Add support for KVM_REQ_ESN_EXIT
  KVM: PPC: Book3S: Enable setting SRR1 flags for DSI
  KVM: PPC: BOOK3S HV: Async PF support
  pseries: Asynchronous page fault support

 Documentation/virt/kvm/api.rst            |  15 ++
 arch/powerpc/include/asm/async-pf.h       |  12 ++
 arch/powerpc/include/asm/hvcall.h         |   1 +
 arch/powerpc/include/asm/kvm_book3s_esn.h |  24 +++
 arch/powerpc/include/asm/kvm_host.h       |  22 +++
 arch/powerpc/include/asm/kvm_ppc.h        |   4 +-
 arch/powerpc/include/asm/lppaca.h         |  20 +-
 arch/powerpc/include/uapi/asm/kvm.h       |   6 +
 arch/powerpc/kvm/Kconfig                  |   2 +
 arch/powerpc/kvm/Makefile                 |   5 +-
 arch/powerpc/kvm/book3s.c                 |   6 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c    |   9 +-
 arch/powerpc/kvm/book3s_hv.c              |  37 +++-
 arch/powerpc/kvm/book3s_hv_esn.c          | 189 +++++++++++++++++++
 arch/powerpc/kvm/book3s_hv_nested.c       |   4 +-
 arch/powerpc/kvm/book3s_pr.c              |   4 +-
 arch/powerpc/mm/fault.c                   |   7 +-
 arch/powerpc/platforms/pseries/Makefile   |   2 +-
 arch/powerpc/platforms/pseries/async-pf.c | 219 ++++++++++++++++++++++
 drivers/cpuidle/cpuidle-pseries.c         |   4 +-
 include/uapi/linux/kvm.h                  |   2 +
 tools/include/uapi/linux/kvm.h            |   1 +
 22 files changed, 574 insertions(+), 21 deletions(-)
 create mode 100644 arch/powerpc/include/asm/async-pf.h
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_esn.h
 create mode 100644 arch/powerpc/kvm/book3s_hv_esn.c
 create mode 100644 arch/powerpc/platforms/pseries/async-pf.c

-- 
2.31.1



* [RFC PATCH v0 1/5] powerpc: Define Expropriation interrupt bit to VPA byte offset 0xB9
From: Bharata B Rao @ 2021-08-05  7:24 UTC (permalink / raw)
  To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, kvm, bharata.rao
In-Reply-To: <20210805072439.501481-1-bharata@linux.ibm.com>

The field at VPA byte offset 0xB9 was named donate_dedicated_cpu because
that was the only bit in use. The Expropriation/Subvention support defines
another bit in byte offset 0xB9. Define this bit and rename the field
in the VPA to a generic name.
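
For illustration only, here is a minimal userspace sketch of why the
callers move from plain assignment to bit operations once the byte
carries two flags. The struct lppaca_sketch type and the donate_start()
and donate_stop() helpers are hypothetical stand-ins, not kernel code;
only the two flag names and values are taken from this patch.

```c
#include <stdint.h>

/* Flag values as defined by this patch for byte offset 0xB9. */
#define LPPACA_DONATE_DED_CPU_CYCLES 0x1
#define LPPACA_EXP_INT_ENABLED       0x2

/* Hypothetical one-field stand-in for struct lppaca. */
struct lppaca_sketch {
	uint8_t byte_b9;
};

/* Set only the donate bit; leave any other bit in the byte alone. */
static inline void donate_start(struct lppaca_sketch *l)
{
	l->byte_b9 |= LPPACA_DONATE_DED_CPU_CYCLES;
}

/* Clear only the donate bit; a plain "= 0" would also clear bit 0x2. */
static inline void donate_stop(struct lppaca_sketch *l)
{
	l->byte_b9 &= (uint8_t)~LPPACA_DONATE_DED_CPU_CYCLES;
}
```

With the old "= 1" and "= 0" stores, entering or leaving the cede loop
would overwrite LPPACA_EXP_INT_ENABLED if it happened to be set.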

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/include/asm/lppaca.h | 8 +++++++-
 drivers/cpuidle/cpuidle-pseries.c | 4 ++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index c390ec377bae..57e432766f3e 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -80,7 +80,7 @@ struct lppaca {
 	u8	ebb_regs_in_use;
 	u8	reserved7[6];
 	u8	dtl_enable_mask;	/* Dispatch Trace Log mask */
-	u8	donate_dedicated_cpu;	/* Donate dedicated CPU cycles */
+	u8	byte_b9; /* Donate dedicated CPU cycles & Expropriation int */
 	u8	fpregs_in_use;
 	u8	pmcregs_in_use;
 	u8	reserved8[28];
@@ -116,6 +116,12 @@ struct lppaca {
 
 #define lppaca_of(cpu)	(*paca_ptrs[cpu]->lppaca_ptr)
 
+/*
+ * Flags for Byte offset 0xB9
+ */
+#define LPPACA_DONATE_DED_CPU_CYCLES   0x1
+#define LPPACA_EXP_INT_ENABLED         0x2
+
 /*
  * We are using a non architected field to determine if a partition is
  * shared or dedicated. This currently works on both KVM and PHYP, but
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index a2b5c6f60cf0..b9d0f41c3f19 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -221,7 +221,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 	u8 old_latency_hint;
 
 	pseries_idle_prolog();
-	get_lppaca()->donate_dedicated_cpu = 1;
+	get_lppaca()->byte_b9 |= LPPACA_DONATE_DED_CPU_CYCLES;
 	old_latency_hint = get_lppaca()->cede_latency_hint;
 	get_lppaca()->cede_latency_hint = cede_latency_hint[index];
 
@@ -229,7 +229,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 	check_and_cede_processor();
 
 	local_irq_disable();
-	get_lppaca()->donate_dedicated_cpu = 0;
+	get_lppaca()->byte_b9 &= ~LPPACA_DONATE_DED_CPU_CYCLES;
 	get_lppaca()->cede_latency_hint = old_latency_hint;
 
 	pseries_idle_epilog();
-- 
2.31.1



* Re: [PATCH v1 11/55] powerpc/time: add API for KVM to re-arm the host timer/decrementer
From: Christophe Leroy @ 2021-08-05  7:22 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev
In-Reply-To: <20210726035036.739609-12-npiggin@gmail.com>



On 26/07/2021 at 05:49, Nicholas Piggin wrote:
> Rather than have KVM look up the host timer and fiddle with the
> irq-work internal details, have the powerpc/time.c code provide a
> function for KVM to re-arm the Linux timer code when exiting a
> guest.
> 
> This implementation has an improvement over existing code of
> marking a decrementer interrupt as soft-pending if a timer has
> expired, rather than setting DEC to a -ve value, which tended to
> cause host timers to take two interrupts (first hdec to exit the
> guest, then the immediate dec).
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/include/asm/time.h | 16 +++-------
>   arch/powerpc/kernel/time.c      | 52 +++++++++++++++++++++++++++------
>   arch/powerpc/kvm/book3s_hv.c    |  7 ++---
>   3 files changed, 49 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
> index 69b6be617772..924b2157882f 100644
> --- a/arch/powerpc/include/asm/time.h
> +++ b/arch/powerpc/include/asm/time.h
> @@ -99,18 +99,6 @@ extern void div128_by_32(u64 dividend_high, u64 dividend_low,
>   extern void secondary_cpu_time_init(void);
>   extern void __init time_init(void);
>   
> -#ifdef CONFIG_PPC64
> -static inline unsigned long test_irq_work_pending(void)
> -{
> -	unsigned long x;
> -
> -	asm volatile("lbz %0,%1(13)"
> -		: "=r" (x)
> -		: "i" (offsetof(struct paca_struct, irq_work_pending)));
> -	return x;
> -}
> -#endif
> -
>   DECLARE_PER_CPU(u64, decrementers_next_tb);
>   
>   static inline u64 timer_get_next_tb(void)
> @@ -118,6 +106,10 @@ static inline u64 timer_get_next_tb(void)
>   	return __this_cpu_read(decrementers_next_tb);
>   }
>   
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +void timer_rearm_host_dec(u64 now);
> +#endif
> +
>   /* Convert timebase ticks to nanoseconds */
>   unsigned long long tb_to_ns(unsigned long long tb_ticks);
>   
> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
> index 72d872b49167..016828b7401b 100644
> --- a/arch/powerpc/kernel/time.c
> +++ b/arch/powerpc/kernel/time.c
> @@ -499,6 +499,16 @@ EXPORT_SYMBOL(profile_pc);
>    * 64-bit uses a byte in the PACA, 32-bit uses a per-cpu variable...
>    */
>   #ifdef CONFIG_PPC64
> +static inline unsigned long test_irq_work_pending(void)
> +{
> +	unsigned long x;
> +
> +	asm volatile("lbz %0,%1(13)"
> +		: "=r" (x)
> +		: "i" (offsetof(struct paca_struct, irq_work_pending)));

Can we just use READ_ONCE() instead of hard-coding the read?
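
A rough userspace sketch of what a READ_ONCE()-based version could look
like. Everything named *_sketch is a hypothetical stand-in: the macro
below only imitates the kernel's READ_ONCE() with a volatile load, and
the global pointer stands in for the r13-based local_paca. It does not
settle whether the inline asm is still needed in the kernel.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's READ_ONCE(): one volatile load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, cut-down stand-in for struct paca_struct. */
struct paca_sketch {
	uint8_t irq_work_pending;
};

static struct paca_sketch paca_sketch_instance;
/* In the kernel this would be local_paca (held in r13), not a global. */
static struct paca_sketch *local_paca_sketch = &paca_sketch_instance;

/* Same result as the lbz-based helper: the byte, zero-extended. */
static inline unsigned long test_irq_work_pending_sketch(void)
{
	return READ_ONCE(local_paca_sketch->irq_work_pending);
}
```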


> +	return x;
> +}
> +
>   static inline void set_irq_work_pending_flag(void)
>   {
>   	asm volatile("stb %0,%1(13)" : :
> @@ -542,13 +552,44 @@ void arch_irq_work_raise(void)
>   	preempt_enable();
>   }
>   
> +static void set_dec_or_work(u64 val)
> +{
> +	set_dec(val);
> +	/* We may have raced with new irq work */
> +	if (unlikely(test_irq_work_pending()))
> +		set_dec(1);
> +}
> +
>   #else  /* CONFIG_IRQ_WORK */
>   
>   #define test_irq_work_pending()	0
>   #define clear_irq_work_pending()
>   
> +static void set_dec_or_work(u64 val)
> +{
> +	set_dec(val);
> +}
>   #endif /* CONFIG_IRQ_WORK */
>   
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +void timer_rearm_host_dec(u64 now)
> +{
> +	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
> +
> +	WARN_ON_ONCE(!arch_irqs_disabled());
> +	WARN_ON_ONCE(mfmsr() & MSR_EE);
> +
> +	if (now >= *next_tb) {
> +		local_paca->irq_happened |= PACA_IRQ_DEC;
> +	} else {
> +		now = *next_tb - now;
> +		if (now <= decrementer_max)
> +			set_dec_or_work(now);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(timer_rearm_host_dec);
> +#endif
> +
>   /*
>    * timer_interrupt - gets called when the decrementer overflows,
>    * with interrupts disabled.
> @@ -609,10 +650,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
>   	} else {
>   		now = *next_tb - now;
>   		if (now <= decrementer_max)
> -			set_dec(now);
> -		/* We may have raced with new irq work */
> -		if (test_irq_work_pending())
> -			set_dec(1);
> +			set_dec_or_work(now);
>   		__this_cpu_inc(irq_stat.timer_irqs_others);
>   	}
>   
> @@ -854,11 +892,7 @@ static int decrementer_set_next_event(unsigned long evt,
>   				      struct clock_event_device *dev)
>   {
>   	__this_cpu_write(decrementers_next_tb, get_tb() + evt);
> -	set_dec(evt);
> -
> -	/* We may have raced with new irq work */
> -	if (test_irq_work_pending())
> -		set_dec(1);
> +	set_dec_or_work(evt);
>   
>   	return 0;
>   }
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 6e6cfb10e9bb..0cef578930f9 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -4018,11 +4018,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
>   	vc->entry_exit_map = 0x101;
>   	vc->in_guest = 0;
>   
> -	next_timer = timer_get_next_tb();
> -	set_dec(next_timer - tb);
> -	/* We may have raced with new irq work */
> -	if (test_irq_work_pending())
> -		set_dec(1);
> +	timer_rearm_host_dec(tb);
> +
>   	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
>   
>   	kvmhv_load_host_pmu();
> 


* Re: [PATCH] powerpc/kprobes: Fix kprobe Oops happens in booke
From: Michael Ellerman @ 2021-08-05  6:13 UTC (permalink / raw)
  To: Pu Lehui, oleg, benh, paulus, naveen.n.rao, mhiramat,
	christophe.leroy, peterz, npiggin, ruscur
  Cc: zhangjinhao2, xukuohai, linuxppc-dev, linux-kernel, pulehui
In-Reply-To: <20210804143735.148547-1-pulehui@huawei.com>

Pu Lehui <pulehui@huawei.com> writes:
> When using kprobes on a powerpc booke series processor, an Oops happens
> as shown below:
>
> [   35.861352] Oops: Exception in kernel mode, sig: 5 [#1]
> [   35.861676] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
> [   35.861905] Modules linked in:
> [   35.862144] CPU: 0 PID: 76 Comm: sh Not tainted 5.14.0-rc3-00060-g7e96bf476270 #18
> [   35.862610] NIP:  c0b96470 LR: c00107b4 CTR: c0161c80
> [   35.862805] REGS: c387fe70 TRAP: 0700   Not tainted (5.14.0-rc3-00060-g7e96bf476270)
> [   35.863198] MSR:  00029002 <CE,EE,ME>  CR: 24022824  XER: 20000000
> [   35.863577]
> [   35.863577] GPR00: c0015218 c387ff20 c313e300 c387ff50 00000004 40000002 40000000 0a1a2cce
> [   35.863577] GPR08: 00000000 00000004 00000000 59764000 24022422 102490c2 00000000 00000000
> [   35.863577] GPR16: 00000000 00000000 00000040 10240000 10240000 10240000 10240000 10220000
> [   35.863577] GPR24: ffffffff 10240000 00000000 00000000 bfc655e8 00000800 c387ff50 00000000
> [   35.865367] NIP [c0b96470] schedule+0x0/0x130
> [   35.865606] LR [c00107b4] interrupt_exit_user_prepare_main+0xf4/0x100
> [   35.865974] Call Trace:
> [   35.866142] [c387ff20] [c0053224] irq_exit+0x114/0x120 (unreliable)
> [   35.866472] [c387ff40] [c0015218] interrupt_return+0x14/0x13c
> [   35.866728] --- interrupt: 900 at 0x100af3dc
> [   35.866963] NIP:  100af3dc LR: 100de020 CTR: 00000000
> [   35.867177] REGS: c387ff50 TRAP: 0900   Not tainted (5.14.0-rc3-00060-g7e96bf476270)
> [   35.867488] MSR:  0002f902 <CE,EE,PR,FP,ME>  CR: 20022422  XER: 20000000
> [   35.867808]
> [   35.867808] GPR00: c001509c bfc65570 1024b4d0 00000000 100de020 20022422 bfc655a8 100af3dc
> [   35.867808] GPR08: 0002f902 00000000 00000000 00000000 72656773 102490c2 00000000 00000000
> [   35.867808] GPR16: 00000000 00000000 00000040 10240000 10240000 10240000 10240000 10220000
> [   35.867808] GPR24: ffffffff 10240000 00000000 00000000 bfc655e8 10245910 ffffffff 00000001
> [   35.869406] NIP [100af3dc] 0x100af3dc
> [   35.869578] LR [100de020] 0x100de020
> [   35.869751] --- interrupt: 900
> [   35.870001] Instruction dump:
> [   35.870283] 40c20010 815e0518 714a0100 41e2fd04 39200000 913e00c0 3b1e0450 4bfffd80
> [   35.870666] 0fe00000 92a10024 4bfff1a9 60000000 <7fe00008> 7c0802a6 93e1001c 7c5f1378
> [   35.871339] ---[ end trace 23ff848139efa9b9 ]---
>
> There is no real mode for booke arch and the MMU translation is
> always on. The corresponding MSR_IS/MSR_DS bit in booke is used
> to switch the address space, but not for real mode judgment.
>
> Fixes: 21f8b2fa3ca5 ("powerpc/kprobes: Ignore traps that happened in real mode")
> Signed-off-by: Pu Lehui <pulehui@huawei.com>
> ---
>  arch/powerpc/include/asm/ptrace.h | 6 ++++++
>  arch/powerpc/kernel/kprobes.c     | 5 +----
>  2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> index 3e5d470a6155..4aec1a97024b 100644
> --- a/arch/powerpc/include/asm/ptrace.h
> +++ b/arch/powerpc/include/asm/ptrace.h
> @@ -187,6 +187,12 @@ static inline unsigned long frame_pointer(struct pt_regs *regs)
>  #define user_mode(regs) (((regs)->msr & MSR_PR) != 0)
>  #endif
>  
> +#ifdef CONFIG_BOOKE
> +#define real_mode(regs)	0
> +#else
> +#define real_mode(regs)	(!((regs)->msr & MSR_IR) || !((regs)->msr & MSR_DR))
> +#endif

I'm not sure about this helper.

Arguably it should only return true if both MSR_IR and MSR_DR are clear.
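
To make the difference concrete, a small userspace sketch of the two
readings. The bit values are illustrative (the kernel takes MSR_IR and
MSR_DR from asm/reg.h), and struct pt_regs_sketch plus both helper
names are hypothetical.

```c
#include <stdbool.h>

/* Illustrative bit values; the kernel defines these in asm/reg.h. */
#define MSR_IR 0x20UL	/* instruction relocate */
#define MSR_DR 0x10UL	/* data relocate */

/* Hypothetical stand-in carrying only the field used here. */
struct pt_regs_sketch {
	unsigned long msr;
};

/* As posted: true if either translation bit is clear. */
static inline bool real_mode_either(const struct pt_regs_sketch *regs)
{
	return !(regs->msr & MSR_IR) || !(regs->msr & MSR_DR);
}

/* Stricter reading: true only if both translation bits are clear. */
static inline bool real_mode_both(const struct pt_regs_sketch *regs)
{
	return !(regs->msr & (MSR_IR | MSR_DR));
}
```

The two only disagree when exactly one of IR/DR is set, which is the
case where calling it "real mode" is debatable.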


> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index cbc28d1a2e1b..fac9a5974718 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -289,10 +289,7 @@ int kprobe_handler(struct pt_regs *regs)
>  	unsigned int *addr = (unsigned int *)regs->nip;
>  	struct kprobe_ctlblk *kcb;
>  
> -	if (user_mode(regs))
> -		return 0;
> -
> -	if (!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR))
> +	if (user_mode(regs) || real_mode(regs))
>  		return 0;

I think just adding an IS_ENABLED(CONFIG_BOOKE) here might be better.
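
Roughly along these lines, as an untested userspace sketch. BOOKE_SKETCH
is a hypothetical stand-in for IS_ENABLED(CONFIG_BOOKE), the bit values
are illustrative, and struct kp_regs_sketch and kprobe_should_ignore()
are made up for the example.

```c
#include <stdbool.h>

/* Stand-in for IS_ENABLED(CONFIG_BOOKE), a compile-time 0 or 1. */
#ifndef BOOKE_SKETCH
#define BOOKE_SKETCH 0
#endif

/* Illustrative bit values; the kernel defines these in asm/reg.h. */
#define MSR_PR 0x4000UL	/* problem state, i.e. user mode */
#define MSR_IR 0x20UL	/* instruction relocate */
#define MSR_DR 0x10UL	/* data relocate */

/* Hypothetical stand-in carrying only the field used here. */
struct kp_regs_sketch {
	unsigned long msr;
};

/* Returns true where kprobe_handler() would return 0 early. */
static bool kprobe_should_ignore(const struct kp_regs_sketch *regs)
{
	if (regs->msr & MSR_PR)
		return true;

	/* Skip the real-mode test entirely on BookE. */
	if (!BOOKE_SKETCH &&
	    (!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR)))
		return true;

	return false;
}
```

That keeps the check local to kprobe_handler() without adding a new
helper to ptrace.h.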

cheers


* Re: [PATCH v4 10/10] net/ps3_gelic: Fix DMA mapping problems
From: Christophe Leroy @ 2021-08-05  5:10 UTC (permalink / raw)
  To: Geoff Levand, David S. Miller, Jakub Kicinski; +Cc: netdev, linuxppc-dev
In-Reply-To: <7aa1d9b1b4ffadcbdc6f88e4f8d4a323da307595.1627068552.git.geoff@infradead.org>



On 23/07/2021 at 22:31, Geoff Levand wrote:
> Fixes several DMA mapping problems with the PS3's gelic network driver:
> 
>   * Change from checking the return value of dma_map_single to using the
>     dma_mapping_error routine.
>   * Use the correct buffer length when mapping the RX skb.
>   * Improved error checking and debug logging.
> 
> Fixes runtime errors like these, and also other randomly occurring errors:
> 
>    IP-Config: Complete:
>    DMA-API: ps3_gelic_driver sb_05: device driver failed to check map error
>    WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:1027 .check_unmap+0x888/0x8dc
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>


CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#55: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:351:
+		descr->link.cpu_addr = dma_map_single(dev, descr,
+			descr->link.size, DMA_BIDIRECTIONAL);

WARNING:BRACES: braces {} are not necessary for single statement blocks
#62: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:358:
+				if (descr->link.cpu_addr) {
+					gelic_unmap_link(dev, descr);
+				}

CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#157: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:440:
+	cpu_addr = dma_map_single(dev, descr->skb->data,
+		descr->hw_regs.payload.size, DMA_FROM_DEVICE);

CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#262: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:612:
+			dev_info_ratelimited(dev,
+				"%s:%d: forcing end of tx descriptor with status %x\n",

CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#323: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:846:
+	cpu_addr = dma_map_single(dev, skb->data, descr->hw_regs.payload.size,
+		DMA_TO_DEVICE);


NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or --fix-inplace.

Commit cf6041cd6b17 ("net/ps3_gelic: Fix DMA mapping problems") has style problems, please review.

NOTE: Ignored message types: ARCH_INCLUDE_LINUX BIT_MACRO COMPARISON_TO_NULL DT_SPLIT_BINDING_PATCH 
EMAIL_SUBJECT FILE_PATH_CHANGES GLOBAL_INITIALISERS LINE_SPACING MULTIPLE_ASSIGNMENTS


> ---
>   drivers/net/ethernet/toshiba/ps3_gelic_net.c | 183 +++++++++++--------
>   1 file changed, 108 insertions(+), 75 deletions(-)
> 
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index 42f4de9ad5fe..11ddeacb1159 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -336,22 +336,31 @@ static int gelic_card_init_chain(struct gelic_card *card,
>   	struct gelic_descr_chain *chain, struct gelic_descr *start_descr,
>   	int descr_count)
>   {
> -	int i;
> -	struct gelic_descr *descr;
> +	struct gelic_descr *descr = start_descr;
>   	struct device *dev = ctodev(card);
> +	unsigned int index;
>   
> -	descr = start_descr;
> -	memset(descr, 0, sizeof(*descr) *descr_count);
> +	memset(start_descr, 0, descr_count * sizeof(*start_descr));
>   
> -	for (i = 0; i < descr_count; i++, descr++) {
> -		descr->link.size = sizeof(struct gelic_hw_regs);
> +	for (index = 0, descr = start_descr; index < descr_count;
> +		index++, descr++) {
>   		gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
> -		descr->link.cpu_addr =
> -			dma_map_single(dev, descr, descr->link.size,
> -				DMA_BIDIRECTIONAL);
>   
> -		if (!descr->link.cpu_addr)
> -			goto iommu_error;
> +		descr->link.size = sizeof(struct gelic_hw_regs);
> +		descr->link.cpu_addr = dma_map_single(dev, descr,
> +			descr->link.size, DMA_BIDIRECTIONAL);
> +
> +		if (unlikely(dma_mapping_error(dev, descr->link.cpu_addr))) {
> +			dev_err(dev, "%s:%d: dma_mapping_error\n", __func__,
> +				__LINE__);
> +
> +			for (index--, descr--; index > 0; index--, descr--) {
> +				if (descr->link.cpu_addr) {
> +					gelic_unmap_link(dev, descr);
> +				}
> +			}
> +			return -ENOMEM;
> +		}
>   
>   		descr->next = descr + 1;
>   		descr->prev = descr - 1;
> @@ -360,8 +369,9 @@ static int gelic_card_init_chain(struct gelic_card *card,
>   	(descr - 1)->next = start_descr;
>   	start_descr->prev = (descr - 1);
>   
> -	descr = start_descr;
> -	for (i = 0; i < descr_count; i++, descr++) {
> +	/* chain bus addr of hw descriptor */
> +	for (index = 0, descr = start_descr; index < descr_count;
> +		index++, descr++) {
>   		descr->hw_regs.next_descr_addr =
>   			cpu_to_be32(descr->next->link.cpu_addr);
>   	}
> @@ -373,12 +383,6 @@ static int gelic_card_init_chain(struct gelic_card *card,
>   	(descr - 1)->hw_regs.next_descr_addr = 0;
>   
>   	return 0;
> -
> -iommu_error:
> -	for (i--, descr--; 0 <= i; i--, descr--)
> -		if (descr->link.cpu_addr)
> -			gelic_unmap_link(dev, descr);
> -	return -ENOMEM;
>   }
>   
>   /**
> @@ -395,49 +399,63 @@ static int gelic_descr_prepare_rx(struct gelic_card *card,
>   	struct gelic_descr *descr)
>   {
>   	struct device *dev = ctodev(card);
> -	int offset;
> -	unsigned int bufsize;
> +	struct aligned_buff {
> +		unsigned int total_bytes;
> +		unsigned int offset;
> +	};
> +	struct aligned_buff a_buf;
> +	dma_addr_t cpu_addr;
>   
>   	if (gelic_descr_get_status(descr) !=  GELIC_DESCR_DMA_NOT_IN_USE) {
>   		dev_err(dev, "%s:%d: ERROR status\n", __func__, __LINE__);
>   	}
>   
> -	/* we need to round up the buffer size to a multiple of 128 */
> -	bufsize = ALIGN(GELIC_NET_MAX_MTU, GELIC_NET_RXBUF_ALIGN);
> +	a_buf.total_bytes = ALIGN(GELIC_NET_MAX_MTU, GELIC_NET_RXBUF_ALIGN)
> +		+ GELIC_NET_RXBUF_ALIGN;
> +
> +	descr->skb = dev_alloc_skb(a_buf.total_bytes);
>   
> -	/* and we need to have it 128 byte aligned, therefore we allocate a
> -	 * bit more */
> -	descr->skb = dev_alloc_skb(bufsize + GELIC_NET_RXBUF_ALIGN - 1);
>   	if (!descr->skb) {
> -		descr->hw_regs.payload.dev_addr = 0; /* tell DMAC don't touch memory */
> +		descr->hw_regs.payload.dev_addr = 0;
> +		descr->hw_regs.payload.size = 0;
>   		return -ENOMEM;
>   	}
> -	descr->hw_regs.payload.size = cpu_to_be32(bufsize);
> +
> +	a_buf.offset = PTR_ALIGN(descr->skb->data, GELIC_NET_RXBUF_ALIGN)
> +		- descr->skb->data;
> +
> +	if (a_buf.offset) {
> +		dev_dbg(dev, "%s:%d: offset=%u\n", __func__, __LINE__,
> +			a_buf.offset);
> +		skb_reserve(descr->skb, a_buf.offset);
> +	}
> +
>   	descr->hw_regs.dmac_cmd_status = 0;
>   	descr->hw_regs.result_size = 0;
>   	descr->hw_regs.valid_size = 0;
>   	descr->hw_regs.data_error = 0;
>   
> -	offset = ((unsigned long)descr->skb->data) &
> -		(GELIC_NET_RXBUF_ALIGN - 1);
> -	if (offset)
> -		skb_reserve(descr->skb, GELIC_NET_RXBUF_ALIGN - offset);
> -	/* io-mmu-map the skb */
> -	descr->hw_regs.payload.dev_addr = cpu_to_be32(dma_map_single(dev,
> -						     descr->skb->data,
> -						     GELIC_NET_MAX_MTU,
> -						     DMA_FROM_DEVICE));
> -	if (!descr->hw_regs.payload.dev_addr) {
> +	descr->hw_regs.payload.size = a_buf.total_bytes - a_buf.offset;
> +	cpu_addr = dma_map_single(dev, descr->skb->data,
> +		descr->hw_regs.payload.size, DMA_FROM_DEVICE);
> +	descr->hw_regs.payload.dev_addr = cpu_to_be32(cpu_addr);
> +
> +	if (unlikely(dma_mapping_error(dev, cpu_addr))) {
> +		dev_err(dev, "%s:%d: dma_mapping_error\n", __func__, __LINE__);
> +
> +		descr->hw_regs.payload.dev_addr = 0;
> +		descr->hw_regs.payload.size = 0;
> +
>   		dev_kfree_skb_any(descr->skb);
>   		descr->skb = NULL;
> -		dev_info(dev,
> -			 "%s:Could not iommu-map rx buffer\n", __func__);
> +
>   		gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
> +
>   		return -ENOMEM;
> -	} else {
> -		gelic_descr_set_status(descr, GELIC_DESCR_DMA_CARDOWNED);
> -		return 0;
>   	}
> +
> +	gelic_descr_set_status(descr, GELIC_DESCR_DMA_CARDOWNED);
> +	return 0;
>   }
>   
>   /**
> @@ -454,13 +472,18 @@ static void gelic_card_release_rx_chain(struct gelic_card *card)
>   		if (descr->skb) {
>   			dma_unmap_single(dev,
>   				be32_to_cpu(descr->hw_regs.payload.dev_addr),
> -				descr->skb->len, DMA_FROM_DEVICE);
> -			descr->hw_regs.payload.dev_addr = 0;
> +				descr->hw_regs.payload.size, DMA_FROM_DEVICE);
> +
>   			dev_kfree_skb_any(descr->skb);
>   			descr->skb = NULL;
> +
>   			gelic_descr_set_status(descr,
>   				GELIC_DESCR_DMA_NOT_IN_USE);
>   		}
> +
> +		descr->hw_regs.payload.dev_addr = 0;
> +		descr->hw_regs.payload.size = 0;
> +
>   		descr = descr->next;
>   	} while (descr != card->rx_chain.head);
>   }
> @@ -526,17 +549,19 @@ static void gelic_descr_release_tx(struct gelic_card *card,
>   		GELIC_DESCR_TX_TAIL));
>   
>   	dma_unmap_single(dev, be32_to_cpu(descr->hw_regs.payload.dev_addr),
> -		skb->len, DMA_TO_DEVICE);
> -	dev_kfree_skb_any(skb);
> +		descr->hw_regs.payload.size, DMA_TO_DEVICE);
>   
>   	descr->hw_regs.payload.dev_addr = 0;
>   	descr->hw_regs.payload.size = 0;
> +
> +	dev_kfree_skb_any(skb);
> +	descr->skb = NULL;
> +
>   	descr->hw_regs.next_descr_addr = 0;
>   	descr->hw_regs.result_size = 0;
>   	descr->hw_regs.valid_size = 0;
>   	descr->hw_regs.data_status = 0;
>   	descr->hw_regs.data_error = 0;
> -	descr->skb = NULL;
>   
>   	gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
>   }
> @@ -565,31 +590,34 @@ static void gelic_card_wake_queues(struct gelic_card *card)
>   static void gelic_card_release_tx_chain(struct gelic_card *card, int stop)
>   {
>   	struct gelic_descr_chain *tx_chain;
> -	enum gelic_descr_dma_status status;
>   	struct device *dev = ctodev(card);
> -	struct net_device *netdev;
> -	int release = 0;
> +	int release;
> +
> +	for (release = 0, tx_chain = &card->tx_chain;
> +		tx_chain->head != tx_chain->tail && tx_chain->tail;
> +		tx_chain->tail = tx_chain->tail->next) {
> +		enum gelic_descr_dma_status status;
> +		struct gelic_descr *descr;
> +		struct net_device *netdev;
> +
> +		descr = tx_chain->tail;
> +		status = gelic_descr_get_status(descr);
> +		netdev = descr->skb->dev;
>   
> -	for (tx_chain = &card->tx_chain;
> -	     tx_chain->head != tx_chain->tail && tx_chain->tail;
> -	     tx_chain->tail = tx_chain->tail->next) {
> -		status = gelic_descr_get_status(tx_chain->tail);
> -		netdev = tx_chain->tail->skb->dev;
>   		switch (status) {
>   		case GELIC_DESCR_DMA_RESPONSE_ERROR:
>   		case GELIC_DESCR_DMA_PROTECTION_ERROR:
>   		case GELIC_DESCR_DMA_FORCE_END:
> -			 dev_info_ratelimited(dev,
> -					 "%s:%d: forcing end of tx descriptor with status %x\n",
> -					 __func__, __LINE__, status);
> +			dev_info_ratelimited(dev,
> +				"%s:%d: forcing end of tx descriptor with status %x\n",
> +				__func__, __LINE__, status);
>   			netdev->stats.tx_dropped++;
>   			break;
>   
>   		case GELIC_DESCR_DMA_COMPLETE:
> -			if (tx_chain->tail->skb) {
> +			if (descr->skb) {
>   				netdev->stats.tx_packets++;
> -				netdev->stats.tx_bytes +=
> -					tx_chain->tail->skb->len;
> +				netdev->stats.tx_bytes += descr->skb->len;
>   			}
>   			break;
>   
> @@ -599,7 +627,7 @@ static void gelic_card_release_tx_chain(struct gelic_card *card, int stop)
>   			}
>   		}
>   
> -		gelic_descr_release_tx(card, tx_chain->tail);
> +		gelic_descr_release_tx(card, descr);
>   		release++;
>   	}
>   out:
> @@ -703,19 +731,19 @@ int gelic_net_stop(struct net_device *netdev)
>    *
>    * returns the address of the next descriptor, or NULL if not available.
>    */
> -static struct gelic_descr *
> -gelic_card_get_next_tx_descr(struct gelic_card *card)
> +static struct gelic_descr *gelic_card_get_next_tx_descr(struct gelic_card *card)
>   {
>   	if (!card->tx_chain.head)
>   		return NULL;
> +
>   	/*  see if the next descriptor is free */
>   	if (card->tx_chain.tail != card->tx_chain.head->next &&
> -		gelic_descr_get_status(card->tx_chain.head) ==
> -			GELIC_DESCR_DMA_NOT_IN_USE)
> +		(gelic_descr_get_status(card->tx_chain.head) ==
> +			GELIC_DESCR_DMA_NOT_IN_USE)) {
>   		return card->tx_chain.head;
> -	else
> -		return NULL;
> +	}
>   
> +	return NULL;
>   }
>   
>   /**
> @@ -809,18 +837,23 @@ static int gelic_descr_prepare_tx(struct gelic_card *card,
>   		if (!skb_tmp) {
>   			return -ENOMEM;
>   		}
> +
>   		skb = skb_tmp;
>   	}
>   
> -	cpu_addr = dma_map_single(dev, skb->data, skb->len, DMA_TO_DEVICE);
> +	descr->hw_regs.payload.size = skb->len;
> +	cpu_addr = dma_map_single(dev, skb->data, descr->hw_regs.payload.size,
> +		DMA_TO_DEVICE);
> +	descr->hw_regs.payload.dev_addr = cpu_to_be32(cpu_addr);
>   
> -	if (!cpu_addr) {
> +	if (unlikely(dma_mapping_error(dev, cpu_addr))) {
>   		dev_err(dev, "%s:%d: dma_mapping_error\n", __func__, __LINE__);
> +
> +		descr->hw_regs.payload.dev_addr = 0;
> +		descr->hw_regs.payload.size = 0;
>   		return -ENOMEM;
>   	}
>   
> -	descr->hw_regs.payload.dev_addr = cpu_to_be32(cpu_addr);
> -	descr->hw_regs.payload.size = cpu_to_be32(skb->len);
>   	descr->skb = skb;
>   	descr->hw_regs.data_status = 0;
>   	descr->hw_regs.next_descr_addr = 0; /* terminate hw descr */
> @@ -948,9 +981,9 @@ static void gelic_net_pass_skb_up(struct gelic_descr *descr,
>   
>   	data_status = be32_to_cpu(descr->hw_regs.data_status);
>   	data_error = be32_to_cpu(descr->hw_regs.data_error);
> -	/* unmap skb buffer */
> +
>   	dma_unmap_single(dev, be32_to_cpu(descr->hw_regs.payload.dev_addr),
> -			 GELIC_NET_MAX_MTU, DMA_FROM_DEVICE);
> +			 descr->hw_regs.payload.size, DMA_FROM_DEVICE);
>   
>   	skb_put(skb, be32_to_cpu(descr->hw_regs.valid_size) ?
>   		be32_to_cpu(descr->hw_regs.valid_size) :
> 


* Re: [PATCH v4 09/10] net/ps3_gelic: Add new routine gelic_work_to_card
From: Christophe Leroy @ 2021-08-05  5:10 UTC (permalink / raw)
  To: Geoff Levand, David S. Miller, Jakub Kicinski; +Cc: netdev, linuxppc-dev
In-Reply-To: <5634f7c76a67345c9735e05b68228ea899a8bf9d.1627068552.git.geoff@infradead.org>



On 23/07/2021 at 22:31, Geoff Levand wrote:
> Add new helper routine gelic_work_to_card that converts a work_struct
> to a gelic_card.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>

Commit 3ffdbef9f86f ("net/ps3_gelic: Add new routine gelic_work_to_card") has no obvious style 
problems and is ready for submission.


> ---
>   drivers/net/ethernet/toshiba/ps3_gelic_net.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index 60fcca5d20dd..42f4de9ad5fe 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -1420,6 +1420,11 @@ static const struct ethtool_ops gelic_ether_ethtool_ops = {
>   	.set_link_ksettings = gelic_ether_set_link_ksettings,
>   };
>   
> +static struct gelic_card *gelic_work_to_card(struct work_struct *work)
> +{
> +	return container_of(work, struct gelic_card, tx_timeout_task);
> +}
> +
>   /**
>    * gelic_net_tx_timeout_task - task scheduled by the watchdog timeout
>    * function (to be called not under interrupt status)
> @@ -1429,8 +1434,7 @@ static const struct ethtool_ops gelic_ether_ethtool_ops = {
>    */
>   static void gelic_net_tx_timeout_task(struct work_struct *work)
>   {
> -	struct gelic_card *card =
> -		container_of(work, struct gelic_card, tx_timeout_task);
> +	struct gelic_card *card = gelic_work_to_card(work);
>   	struct net_device *netdev = card->netdev[GELIC_PORT_ETHERNET_0];
>   	struct device *dev = ctodev(card);
>   
> 


* Re: [PATCH v4 08/10] net/ps3_gelic: Rename no to descr_count
From: Christophe Leroy @ 2021-08-05  5:09 UTC (permalink / raw)
  To: Geoff Levand, David S. Miller, Jakub Kicinski; +Cc: netdev, linuxppc-dev
In-Reply-To: <07e42ec30037d514c1d63f33efe4642364d89802.1627068552.git.geoff@infradead.org>



On 23/07/2021 at 22:31, Geoff Levand wrote:
> In an effort to make the PS3 gelic driver easier to maintain, rename
> the gelic_card_init_chain parameter 'no' to 'descr_count'.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>

CHECK:SPACING: spaces preferred around that '*' (ctx:WxV)
#40: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:344:
+	memset(descr, 0, sizeof(*descr) *descr_count);
  	                                ^


NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or --fix-inplace.

Commit fdbd3f08a0b1 ("net/ps3_gelic: Rename no to descr_count") has style problems, please review.


> ---
>   drivers/net/ethernet/toshiba/ps3_gelic_net.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index e55aa9fecfeb..60fcca5d20dd 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -325,7 +325,7 @@ static void gelic_card_free_chain(struct gelic_card *card,
>    * @card: card structure
>    * @chain: address of chain
>    * @start_descr: address of descriptor array
> - * @no: number of descriptors
> + * @descr_count: number of descriptors
>    *
>    * we manage a circular list that mirrors the hardware structure,
>    * except that the hardware uses bus addresses.
> @@ -334,16 +334,16 @@ static void gelic_card_free_chain(struct gelic_card *card,
>    */
>   static int gelic_card_init_chain(struct gelic_card *card,
>   	struct gelic_descr_chain *chain, struct gelic_descr *start_descr,
> -	int no)
> +	int descr_count)
>   {
>   	int i;
>   	struct gelic_descr *descr;
>   	struct device *dev = ctodev(card);
>   
>   	descr = start_descr;
> -	memset(descr, 0, sizeof(*descr) * no);
> +	memset(descr, 0, sizeof(*descr) *descr_count);
>   
> -	for (i = 0; i < no; i++, descr++) {
> +	for (i = 0; i < descr_count; i++, descr++) {
>   		descr->link.size = sizeof(struct gelic_hw_regs);
>   		gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
>   		descr->link.cpu_addr =
> @@ -361,7 +361,7 @@ static int gelic_card_init_chain(struct gelic_card *card,
>   	start_descr->prev = (descr - 1);
>   
>   	descr = start_descr;
> -	for (i = 0; i < no; i++, descr++) {
> +	for (i = 0; i < descr_count; i++, descr++) {
>   		descr->hw_regs.next_descr_addr =
>   			cpu_to_be32(descr->next->link.cpu_addr);
>   	}
> 


* Re: [PATCH v4 07/10] net/ps3_gelic: Add new routine gelic_unmap_link
From: Christophe Leroy @ 2021-08-05  5:09 UTC (permalink / raw)
  To: Geoff Levand, David S. Miller, Jakub Kicinski; +Cc: netdev, linuxppc-dev
In-Reply-To: <024b88e07095f00bc2eabfae2f526851600ee272.1627068552.git.geoff@infradead.org>



On 23/07/2021 at 22:31, Geoff Levand wrote:
> Put the common code for unmapping a link into its own routine,
> gelic_unmap_link, and add some debugging checks.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>

CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#31: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:300:
+	dma_unmap_single(dev, descr->link.cpu_addr, descr->link.size,
+		DMA_BIDIRECTIONAL);


NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or --fix-inplace.

Commit bcb1cb297705 ("net/ps3_gelic: Add new routine gelic_unmap_link") has style problems, please 
review.

NOTE: Ignored message types: ARCH_INCLUDE_LINUX BIT_MACRO COMPARISON_TO_NULL DT_SPLIT_BINDING_PATCH 
EMAIL_SUBJECT FILE_PATH_CHANGES GLOBAL_INITIALISERS LINE_SPACING MULTIPLE_ASSIGNMENTS



> ---
>   drivers/net/ethernet/toshiba/ps3_gelic_net.c | 23 +++++++++++++++-----
>   1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index 85fc1915c8be..e55aa9fecfeb 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -288,6 +288,21 @@ void gelic_card_down(struct gelic_card *card)
>   	mutex_unlock(&card->updown_lock);
>   }
>   
> +static void gelic_unmap_link(struct device *dev, struct gelic_descr *descr)
> +{
> +	BUG_ON_DEBUG(descr->hw_regs.payload.dev_addr);
> +	BUG_ON_DEBUG(descr->hw_regs.payload.size);
> +
> +	BUG_ON_DEBUG(!descr->link.cpu_addr);
> +	BUG_ON_DEBUG(!descr->link.size);
> +
> +	dma_unmap_single(dev, descr->link.cpu_addr, descr->link.size,
> +		DMA_BIDIRECTIONAL);
> +
> +	descr->link.cpu_addr = 0;
> +	descr->link.size = 0;
> +}
> +
>   /**
>    * gelic_card_free_chain - free descriptor chain
>    * @card: card structure
> @@ -301,9 +316,7 @@ static void gelic_card_free_chain(struct gelic_card *card,
>   
>   	for (descr = descr_in; descr && descr->link.cpu_addr;
>   		descr = descr->next) {
> -		dma_unmap_single(dev, descr->link.cpu_addr, descr->link.size,
> -			DMA_BIDIRECTIONAL);
> -		descr->link.cpu_addr = 0;
> +		gelic_unmap_link(dev, descr);
>   	}
>   }
>   
> @@ -364,9 +377,7 @@ static int gelic_card_init_chain(struct gelic_card *card,
>   iommu_error:
>   	for (i--, descr--; 0 <= i; i--, descr--)
>   		if (descr->link.cpu_addr)
> -			dma_unmap_single(dev, descr->link.cpu_addr,
> -					 descr->link.size,
> -					 DMA_BIDIRECTIONAL);
> +			gelic_unmap_link(dev, descr);
>   	return -ENOMEM;
>   }
>   
> 


* Re: [PATCH v4 06/10] net/ps3_gelic: Cleanup debug code
From: Christophe Leroy @ 2021-08-05  5:08 UTC (permalink / raw)
  To: Geoff Levand, David S. Miller, Jakub Kicinski; +Cc: netdev, linuxppc-dev
In-Reply-To: <8421aa2c148d840b11b7115208e5276017999c2a.1627068552.git.geoff@infradead.org>



On 23/07/2021 at 22:31, Geoff Levand wrote:
> In an effort to make the PS3 gelic driver easier to maintain, change the
> gelic_card_enable_rxdmac routine to use the optimizer to remove
> debug code.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>


WARNING:VSPRINTF_SPECIFIER_PX: Using vsprintf specifier '%px' potentially exposes the kernel memory 
layout, if you don't really need the address please consider using '%p'.
#38: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:171:
+		dev_err(dev, "%s:%d: head=%px\n", __func__, __LINE__,
+			card->rx_chain.head);


NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or --fix-inplace.

Commit 65f38d9720ac ("net/ps3_gelic: Cleanup debug code") has style problems, please review.

NOTE: Ignored message types: ARCH_INCLUDE_LINUX BIT_MACRO COMPARISON_TO_NULL DT_SPLIT_BINDING_PATCH 
EMAIL_SUBJECT FILE_PATH_CHANGES GLOBAL_INITIALISERS LINE_SPACING MULTIPLE_ASSIGNMENTS
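
As background on "use the optimizer to remove debug code": __is_defined()
from include/linux/kconfig.h expands to the constant 1 or 0, so the debug
branch stays visible to the compiler (and gets type-checked) but can be
dropped as dead code when the result is 0. The sketch below copies the shape
of those helper macros under demo_ names so that it builds in userspace. The
DEMO_* flags are hypothetical, and the extra trailing 0 in demo_is_defined_2
is a small deviation from the kernel macro, added only to keep strict ISO C
compilers quiet.

```c
/* Renamed userspace copies of the kconfig.h helper macros. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second_arg(ignored, val, ...) val
#define demo_is_defined(x) demo_is_defined_1(x)
#define demo_is_defined_1(val) demo_is_defined_2(DEMO_ARG_PLACEHOLDER_##val)
#define demo_is_defined_2(arg1_or_junk) \
	demo_take_second_arg(arg1_or_junk 1, 0, 0)

/* Hypothetical flags used only by this sketch. */
#define DEMO_DEBUG 1	/* defined to 1: detected */
#define DEMO_EMPTY	/* defined but empty: NOT detected */
/* DEMO_UNSET_FLAG is deliberately left undefined. */

/* Returns 1 only when the debug branch would be compiled in. */
static int demo_debug_branch_taken(void)
{
	if (demo_is_defined(DEMO_DEBUG))
		return 1;	/* kept: the condition is the constant 1 */
	return 0;
}
```

One consequence worth keeping in mind: the trick only recognises a macro that
expands to 1. A bare "#define DEBUG" with no value evaluates to 0 here, which
differs from what the replaced "#ifdef DEBUG" block would have done.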


> ---
>   drivers/net/ethernet/toshiba/ps3_gelic_net.c | 19 +++++++++----------
>   1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index 54e50ad9e629..85fc1915c8be 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -162,17 +162,16 @@ static void gelic_card_enable_rxdmac(struct gelic_card *card)
>   	struct device *dev = ctodev(card);
>   	int status;
>   
> -#ifdef DEBUG
> -	if (gelic_descr_get_status(card->rx_chain.head) !=
> -	    GELIC_DESCR_DMA_CARDOWNED) {
> -		printk(KERN_ERR "%s: status=%x\n", __func__,
> -		       be32_to_cpu(card->rx_chain.head->dmac_cmd_status));
> -		printk(KERN_ERR "%s: nextphy=%x\n", __func__,
> -		       be32_to_cpu(card->rx_chain.head->hw_regs.next_descr_addr));
> -		printk(KERN_ERR "%s: head=%p\n", __func__,
> -		       card->rx_chain.head);
> +	if (__is_defined(DEBUG) && (gelic_descr_get_status(card->rx_chain.head)
> +			!= GELIC_DESCR_DMA_CARDOWNED)) {
> +		dev_err(dev, "%s:%d: status=%x\n", __func__, __LINE__,
> +			be32_to_cpu(card->rx_chain.head->hw_regs.dmac_cmd_status));
> +		dev_err(dev, "%s:%d: nextphy=%x\n", __func__, __LINE__,
> +			be32_to_cpu(card->rx_chain.head->hw_regs.next_descr_addr));
> +		dev_err(dev, "%s:%d: head=%px\n", __func__, __LINE__,
> +			card->rx_chain.head);
>   	}
> -#endif
> +
>   	status = lv1_net_start_rx_dma(bus_id(card), dev_id(card),
>   		card->rx_chain.head->link.cpu_addr, 0);
>   
> 


* Re: [PATCH v4 05/10] net/ps3_gelic: Add vlan_id structure
From: Christophe Leroy @ 2021-08-05  5:07 UTC (permalink / raw)
  To: Geoff Levand, David S. Miller, Jakub Kicinski; +Cc: netdev, linuxppc-dev
In-Reply-To: <1cdd7f718dde93dcaebf7ddd025869901aa30523.1627068552.git.geoff@infradead.org>



On 23/07/2021 at 22:31, Geoff Levand wrote:
> In an effort to make the PS3 gelic driver easier to maintain, add
> a definition for the vlan_id structure.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>

Commit 4298d9fdc87f ("net/ps3_gelic: Add vlan_id structure") has no obvious style problems and is 
ready for submission.


> ---
>   drivers/net/ethernet/toshiba/ps3_gelic_net.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index 946e9bfa071b..54e50ad9e629 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -1614,13 +1614,14 @@ static struct gelic_card *gelic_alloc_card_net(struct net_device **netdev)
>   static void gelic_card_get_vlan_info(struct gelic_card *card)
>   {
>   	struct device *dev = ctodev(card);
> +	unsigned int i;
>   	u64 v1, v2;
>   	int status;
> -	unsigned int i;
> -	struct {
> +	struct vlan_id {
>   		int tx;
>   		int rx;
> -	} vlan_id_ix[2] = {
> +	};
> +	struct vlan_id vlan_id_ix[2] = {
>   		[GELIC_PORT_ETHERNET_0] = {
>   			.tx = GELIC_LV1_VLAN_TX_ETHERNET_0,
>   			.rx = GELIC_LV1_VLAN_RX_ETHERNET_0
> 


* Re: [PATCH v4 04/10] net/ps3_gelic: Add new macro BUG_ON_DEBUG
From: Christophe Leroy @ 2021-08-05  5:07 UTC (permalink / raw)
  To: Geoff Levand, David S. Miller, Jakub Kicinski; +Cc: netdev, linuxppc-dev
In-Reply-To: <bc659850d4eec3b2358c1ccb0e00952ceaa6012f.1627068552.git.geoff@infradead.org>



On 23/07/2021 at 22:31, Geoff Levand wrote:
> Add a new preprocessor macro BUG_ON_DEBUG, that expands to BUG_ON when
> the preprocessor macro DEBUG is defined, or to WARN_ON when DEBUG is not
> defined.  Also, replace all occurrences of BUG_ON with BUG_ON_DEBUG.
> 

CHECK:MACRO_ARG_REUSE: Macro argument reuse '_cond' - possible side-effects?
#23: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:47:
+#define BUG_ON_DEBUG(_cond) do { \
+	if (__is_defined(DEBUG)) \
+		BUG_ON(_cond); \
+	else \
+		WARN_ON(_cond); \
+} while (0)

WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & recovery code rather than BUG() 
or BUG_ON()
#25: FILE: drivers/net/ethernet/toshiba/ps3_gelic_net.c:49:
+		BUG_ON(_cond); \


NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or --fix-inplace.

Commit e4fbd62abdcd ("net/ps3_gelic: Add new macro BUG_ON_DEBUG") has style problems, please review.

NOTE: Ignored message types: ARCH_INCLUDE_LINUX BIT_MACRO COMPARISON_TO_NULL DT_SPLIT_BINDING_PATCH 
EMAIL_SUBJECT FILE_PATH_CHANGES GLOBAL_INITIALISERS LINE_SPACING MULTIPLE_ASSIGNMENTS
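
On the MACRO_ARG_REUSE check: as far as I can tell the report is about
'_cond' appearing twice in the macro text. Only one of the two branches runs,
so the condition is still evaluated once at run time. If a form that names
the argument a single time is wanted, one possible shape is sketched below in
userspace. DEMO_BUG_ON_DEBUG, DEMO_DEBUG and the two counters are
hypothetical stand-ins (the counters replace BUG_ON() and WARN_ON() so the
sketch can run outside the kernel); this is not a proposal for the exact
driver code.

```c
#include <stdbool.h>

static int demo_fatal_hits;	/* stands in for BUG_ON() firing */
static int demo_warn_hits;	/* stands in for WARN_ON() firing */

#ifndef DEMO_DEBUG
#define DEMO_DEBUG 0		/* 0 selects the warn-only path */
#endif

/*
 * The argument is named exactly once and stored in a local, so it is
 * evaluated exactly once no matter which branch is taken.
 */
#define DEMO_BUG_ON_DEBUG(cond) do {		\
	bool demo_hit_ = !!(cond);		\
	if (demo_hit_ && DEMO_DEBUG)		\
		demo_fatal_hits++;		\
	else if (demo_hit_)			\
		demo_warn_hits++;		\
} while (0)
```

With DEMO_DEBUG left at 0, a call such as DEMO_BUG_ON_DEBUG(++n > 0)
increments n once and bumps only the warning counter.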


> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>   drivers/net/ethernet/toshiba/ps3_gelic_net.c | 13 ++++++++++---
>   1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> index ded467d81f36..946e9bfa071b 100644
> --- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> +++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
> @@ -44,6 +44,13 @@ MODULE_AUTHOR("SCE Inc.");
>   MODULE_DESCRIPTION("Gelic Network driver");
>   MODULE_LICENSE("GPL");
>   
> +#define BUG_ON_DEBUG(_cond) do { \
> +	if (__is_defined(DEBUG)) \
> +		BUG_ON(_cond); \
> +	else \
> +		WARN_ON(_cond); \
> +} while (0)
> +
>   int gelic_card_set_irq_mask(struct gelic_card *card, u64 mask)
>   {
>   	struct device *dev = ctodev(card);
> @@ -505,7 +512,7 @@ static void gelic_descr_release_tx(struct gelic_card *card,
>   	struct sk_buff *skb = descr->skb;
>   	struct device *dev = ctodev(card);
>   
> -	BUG_ON(!(be32_to_cpu(descr->hw_regs.data_status) &
> +	BUG_ON_DEBUG(!(be32_to_cpu(descr->hw_regs.data_status) &
>   		GELIC_DESCR_TX_TAIL));
>   
>   	dma_unmap_single(dev, be32_to_cpu(descr->hw_regs.payload.dev_addr),
> @@ -1667,7 +1674,7 @@ static void gelic_card_get_vlan_info(struct gelic_card *card)
>   	}
>   
>   	if (card->vlan[GELIC_PORT_ETHERNET_0].tx) {
> -		BUG_ON(!card->vlan[GELIC_PORT_WIRELESS].tx);
> +		BUG_ON_DEBUG(!card->vlan[GELIC_PORT_WIRELESS].tx);
>   		card->vlan_required = 1;
>   	} else
>   		card->vlan_required = 0;
> @@ -1709,7 +1716,7 @@ static int ps3_gelic_driver_probe(struct ps3_system_bus_device *sb_dev)
>   	if (result) {
>   		dev_err(dev, "%s:%d: ps3_dma_region_create failed: %d\n",
>   			__func__, __LINE__, result);
> -		BUG_ON("check region type");
> +		BUG_ON_DEBUG("check region type");
>   		goto fail_dma_region;
>   	}
>   
> 


