LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v2] powerpc: pseries/hvconsole: fix stack overread via udbg
From: Michael Ellerman @ 2019-06-15 13:35 UTC (permalink / raw)
  To: Daniel Axtens, linuxppc-dev; +Cc: Dmitry Vyukov, Daniel Axtens
In-Reply-To: <20190603065657.7986-1-dja@axtens.net>

On Mon, 2019-06-03 at 06:56:57 UTC, Daniel Axtens wrote:
> While developing kasan for 64-bit book3s, I hit the following stack
> over-read.
> 
> It occurs because the hypercall to put characters onto the terminal
> takes 2 longs (128 bits/16 bytes) of characters at a time, and so
> hvc_put_chars would unconditionally copy 16 bytes from the argument
> buffer, regardless of supplied length. However, udbg_hvc_putc can
> call hvc_put_chars with a single-byte buffer, leading to the error.
> 
> [    0.001931] ==================================================================                                                  [150/819]
> [    0.001933] BUG: KASAN: stack-out-of-bounds in hvc_put_chars+0xdc/0x110
> [    0.001934] Read of size 8 at addr c0000000023e7a90 by task swapper/0
> [    0.001934]
> [    0.001935] CPU: 0 PID: 0 Comm: swapper Not tainted 5.2.0-rc2-next-20190528-02824-g048a6ab4835b #113
> [    0.001935] Call Trace:
> [    0.001936] [c0000000023e7790] [c000000001b4a450] dump_stack+0x104/0x154 (unreliable)
> [    0.001937] [c0000000023e77f0] [c0000000006d3524] print_address_description+0xa0/0x30c
> [    0.001938] [c0000000023e7880] [c0000000006d318c] __kasan_report+0x20c/0x224
> [    0.001939] [c0000000023e7950] [c0000000006d19d8] kasan_report+0x18/0x30
> [    0.001940] [c0000000023e7970] [c0000000006d4854] __asan_report_load8_noabort+0x24/0x40
> [    0.001941] [c0000000023e7990] [c0000000001511ac] hvc_put_chars+0xdc/0x110
> [    0.001942] [c0000000023e7a10] [c000000000f81cfc] hvterm_raw_put_chars+0x9c/0x110
> [    0.001943] [c0000000023e7a50] [c000000000f82634] udbg_hvc_putc+0x154/0x200
> [    0.001944] [c0000000023e7b10] [c000000000049c90] udbg_write+0xf0/0x240
> [    0.001945] [c0000000023e7b70] [c0000000002e5d88] console_unlock+0x868/0xd30
> [    0.001946] [c0000000023e7ca0] [c0000000002e6e00] register_console+0x970/0xe90
> [    0.001947] [c0000000023e7d80] [c000000001ff1928] register_early_udbg_console+0xf8/0x114
> [    0.001948] [c0000000023e7df0] [c000000001ff1174] setup_arch+0x108/0x790
> [    0.001948] [c0000000023e7e90] [c000000001fe41c8] start_kernel+0x104/0x784
> [    0.001949] [c0000000023e7f90] [c00000000000b368] start_here_common+0x1c/0x534
> [    0.001950]
> [    0.001950]
> [    0.001951] Memory state around the buggy address:
> [    0.001952]  c0000000023e7980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [    0.001952]  c0000000023e7a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
> [    0.001953] >c0000000023e7a80: f1 f1 01 f2 f2 f2 00 00 00 00 00 00 00 00 00 00
> [    0.001953]                          ^
> [    0.001954]  c0000000023e7b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [    0.001954]  c0000000023e7b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [    0.001955] ==================================================================
> 
> Document that a 16-byte buffer is requred, and provide it in udbg.
> 
> CC: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Daniel Axtens <dja@axtens.net>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/934bda59f286d0221f1a3ebab7f5156a

cheers

^ permalink raw reply

* Re: [PATCH] ocxl: do not use C++ style comments in uapi header
From: Michael Ellerman @ 2019-06-15 13:36 UTC (permalink / raw)
  To: Masahiro Yamada, Frederic Barrat, Andrew Donnellan, linuxppc-dev
  Cc: Arnd Bergmann, Greg KH, Randy Dunlap, linux-kernel,
	Masahiro Yamada, Joe Perches, Thomas Gleixner
In-Reply-To: <20190604111632.22479-1-yamada.masahiro@socionext.com>

On Tue, 2019-06-04 at 11:16:32 UTC, Masahiro Yamada wrote:
> Linux kernel tolerates C++ style comments these days. Actually, the
> SPDX License tags for .c files start with //.
> 
> On the other hand, uapi headers are written in more strict C, where
> the C++ comment style is forbidden.
> 
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
> Acked-by: Andrew Donnellan <ajd@linux.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2305ff225c0b1691ec2e93f3d6990e13

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/pseries: fix oops in hotplug memory notifier
From: Michael Ellerman @ 2019-06-15 13:36 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev
In-Reply-To: <20190607050407.25444-1-nathanl@linux.ibm.com>

On Fri, 2019-06-07 at 05:04:07 UTC, Nathan Lynch wrote:
> During post-migration device tree updates, we can oops in
> pseries_update_drconf_memory if the source device tree has an
> ibm,dynamic-memory-v2 property and the destination has a
> ibm,dynamic_memory (v1) property. The notifier processes an "update"
> for the ibm,dynamic-memory property but it's really an add in this
> scenario. So make sure the old property object is there before
> dereferencing it.
> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/0aa82c482ab2ece530a6f44897b63b27

cheers

^ permalink raw reply

* Re: [PATCH 1/3] powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild
From: Michael Ellerman @ 2019-06-15 13:36 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev
In-Reply-To: <20190612044506.16392-2-nathanl@linux.ibm.com>

On Wed, 2019-06-12 at 04:45:04 UTC, Nathan Lynch wrote:
> Allow external callers to force the cacheinfo code to release all its
> references to cache nodes, e.g. before processing device tree updates
> post-migration, and to rebuild the hierarchy afterward.
> 
> CPU online/offline must be blocked by callers; enforce this.
> 
> Fixes: 410bccf97881 ("powerpc/pseries: Partition migration in the kernel")
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/d4aa219a074a5abaf95a756b9f0d190b

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/mm/32s: only use MMU to mark initmem NX if STRICT_KERNEL_RWX
From: Andreas Schwab @ 2019-06-15 14:36 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linux-kernel, j.neuschaefer, Paul Mackerras, linuxppc-dev
In-Reply-To: <20190615152559.Horde.0lTFIZALxZ-RI75z94G3jA8@messagerie.si.c-s.fr>

On Jun 15 2019, Christophe Leroy <christophe.leroy@c-s.fr> wrote:

> Andreas Schwab <schwab@linux-m68k.org> a écrit :
>
>> If STRICT_KERNEL_RWX is disabled, never use the MMU to mark initmen
>> nonexecutable.
>
> I dont understand, can you elaborate ?

It breaks suspend.

> This area is mapped with BATs so using change_page_attr() is pointless.

There must be a reason STRICT_KERNEL_RWX is not available with
HIBERNATE.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply

* [PATCH] powerpc/4xx/uic: clear pending interrupt after irq type/pol change
From: Christian Lamparter @ 2019-06-15 15:23 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Thomas Gleixner, Paul Mackerras

When testing out gpio-keys with a button, a spurious
interrupt (and therefore a key press or release event)
gets triggered as soon as the driver enables the irq
line for the first time.

This patch clears any potential bogus generated interrupt
that was caused by the switching of the associated irq's
type and polarity.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 arch/powerpc/platforms/4xx/uic.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/platforms/4xx/uic.c b/arch/powerpc/platforms/4xx/uic.c
index 31f12ad37a98..36fb66ce54cf 100644
--- a/arch/powerpc/platforms/4xx/uic.c
+++ b/arch/powerpc/platforms/4xx/uic.c
@@ -154,6 +154,7 @@ static int uic_set_irq_type(struct irq_data *d, unsigned int flow_type)
 
 	mtdcr(uic->dcrbase + UIC_PR, pr);
 	mtdcr(uic->dcrbase + UIC_TR, tr);
+	mtdcr(uic->dcrbase + UIC_SR, ~mask);
 
 	raw_spin_unlock_irqrestore(&uic->lock, flags);
 
-- 
2.20.1


^ permalink raw reply related

* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.2-4 tag
From: pr-tracker-bot @ 2019-06-15 17:55 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, Linus Torvalds, linux-kernel, npiggin
In-Reply-To: <87v9x7nf9l.fsf@concordia.ellerman.id.au>

The pull request you sent on Sat, 15 Jun 2019 23:34:46 +1000:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.2-4

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/fa1827d7731ac24f44309ddc2ca806650912bf0e

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

^ permalink raw reply

* Re: [PATCH] powerpc/pseries: fix oops in hotplug memory notifier
From: Tyrel Datwyler @ 2019-06-15 22:22 UTC (permalink / raw)
  To: Nathan Lynch, Tyrel Datwyler; +Cc: linuxppc-dev
In-Reply-To: <87a7el3rk3.fsf@linux.ibm.com>

On 06/13/2019 06:05 PM, Nathan Lynch wrote:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
> 
>> Tyrel Datwyler <tyreld@linux.vnet.ibm.com> writes:
>>
>>> Maybe we are ok with this behavior as I haven't dug deep enough into the memory
>>> subsystem here to really understand what the memory code is updating, but it is
>>> concerning that we are doing it in some cases, but not all.
>>
>> I hope I've made a good case above that the notifier does not do any
>> useful work, and a counterpart for the v2 format isn't needed.  Do you
>> agree?
>>
>> If so, I'll send a patch later to remove the notifier altogether. In the
>> near term I would still like this minimal fix to go in.
> 
> Tyrel, ack?
> 

Sure, lets start with simple case to fix the crash.

-Tyrel


^ permalink raw reply

* Re:  Re: [PATCH v3 7/9] KVM: PPC: Ultravisor: Restrict LDBAR access
From: Ram Pai @ 2019-06-16  1:10 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Madhavan Srinivasan, rgrimm, Michael Anderson, Claudio Carvalho,
	kvm-ppc, Bharata B Rao, linuxppc-dev, Sukadev Bhattiprolu,
	Thiago Bauermann, Anshuman Khandual
In-Reply-To: <20190615074334.GD24709@blackberry>

On Sat, Jun 15, 2019 at 05:43:34PM +1000, Paul Mackerras wrote:
> On Thu, Jun 06, 2019 at 02:36:12PM -0300, Claudio Carvalho wrote:
> > When the ultravisor firmware is available, it takes control over the
> > LDBAR register. In this case, thread-imc updates and save/restore
> > operations on the LDBAR register are handled by ultravisor.
> > 
> > Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> 
> Acked-by: Paul Mackerras <paulus@ozlabs.org>
> 
> Just a note on the signed-off-by: it's a bit weird to have Ram's
> signoff when he is neither the author nor the sender of the patch.
> The author is assumed to be Claudio; if that is not correct then the
> patch should have a From: line at the start telling us who the author
> is, and ideally that person should have a signoff line before
> Claudio's signoff as the sender of the patch.

Ryan originally wrote this patch, which I than modified,
before Claudio further modified it to its current form.

So I think the author should be Ryan.

RP


^ permalink raw reply

* Re: [PATCH] powerpc/mm/32s: only use MMU to mark initmem NX if STRICT_KERNEL_RWX
From: christophe leroy @ 2019-06-16  8:06 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: linux-kernel, j.neuschaefer, Paul Mackerras, linuxppc-dev
In-Reply-To: <87pnne9aqo.fsf@igel.home>



Le 15/06/2019 à 16:36, Andreas Schwab a écrit :
> On Jun 15 2019, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
> 
>> Andreas Schwab <schwab@linux-m68k.org> a écrit :
>>
>>> If STRICT_KERNEL_RWX is disabled, never use the MMU to mark initmen
>>> nonexecutable.
>>
>> I dont understand, can you elaborate ?
> 
> It breaks suspend.

Ok, but we need to explain why it breaks suspend, and again your patch 
is wrong anyway because that area of memory is mapped with BATs so you 
can't use change_page_attr()

> 
>> This area is mapped with BATs so using change_page_attr() is pointless.
> 
> There must be a reason STRICT_KERNEL_RWX is not available with
> HIBERNATE.

Yes but HIBERNATE and suspend are not the same thing. I guess HIBERNATE 
is not available with STRICT_KERNEL_RWX because HIBERNATE have to write 
back saved state into read-only memory as well.

Christophe

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus


^ permalink raw reply

* Re: [PATCH v5 13/16] powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX
From: christophe leroy @ 2019-06-16  8:20 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: j.neuschaefer, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <87blyz9jor.fsf@igel.home>

Le 15/06/2019 à 13:23, Andreas Schwab a écrit :
> This breaks suspend (or resume) on the iBook G4.  no_console_suspend
> doesn't give any clues, the display just stays dark.
> 

After a quick look at the suspend functions, I have the feeling that 
those functions only store and restore BATs 0 to 3.

Could you build your kernel with CONFIG_PPC_PTDUMP and see in file 
/sys/kernel/debug/powerpc/segment_registers how many IBATs registers are 
used.
If any of registers IBATs 4 to 7 are used, could you adjust 
CONFIG_ETEXT_SHIFT so that only IBATs 0 to 3 be used, and check if 
suspend/resume works when IBATs 4 to 7 are not used ?

Thanks
Christophe

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus

^ permalink raw reply

* Re: [PATCH v5 13/16] powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX
From: christophe leroy @ 2019-06-16  8:01 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: j.neuschaefer, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <877e9n9gng.fsf@igel.home>



Le 15/06/2019 à 14:28, Andreas Schwab a écrit :
> On Feb 21 2019, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
> 
>> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
>> index a000768a5cc9..6e56a6240bfa 100644
>> --- a/arch/powerpc/mm/pgtable_32.c
>> +++ b/arch/powerpc/mm/pgtable_32.c
>> @@ -353,7 +353,10 @@ void mark_initmem_nx(void)
>>   	unsigned long numpages = PFN_UP((unsigned long)_einittext) -
>>   				 PFN_DOWN((unsigned long)_sinittext);
>>   
>> -	change_page_attr(page, numpages, PAGE_KERNEL);
>> +	if (v_block_mapped((unsigned long)_stext) + 1)
> 
> That is always true.
> 

Did you boot with 'nobats' kernel parameter ?

If not, that's normal to be true, it means that memory is mapped with BATs.

When you boot with 'nobats' parameter, this should return false.

Christophe

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus


^ permalink raw reply

* Re: [PATCH v5 13/16] powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX
From: Andreas Schwab @ 2019-06-16  8:45 UTC (permalink / raw)
  To: christophe leroy
  Cc: j.neuschaefer, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <1c271e47-6917-2f29-97b6-f3160ddf5117@c-s.fr>

On Jun 16 2019, christophe leroy <christophe.leroy@c-s.fr> wrote:

> Le 15/06/2019 à 14:28, Andreas Schwab a écrit :
>> On Feb 21 2019, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>>
>>> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
>>> index a000768a5cc9..6e56a6240bfa 100644
>>> --- a/arch/powerpc/mm/pgtable_32.c
>>> +++ b/arch/powerpc/mm/pgtable_32.c
>>> @@ -353,7 +353,10 @@ void mark_initmem_nx(void)
>>>   	unsigned long numpages = PFN_UP((unsigned long)_einittext) -
>>>   				 PFN_DOWN((unsigned long)_sinittext);
>>>   -	change_page_attr(page, numpages, PAGE_KERNEL);
>>> +	if (v_block_mapped((unsigned long)_stext) + 1)
>>
>> That is always true.
>>
>
> Did you boot with 'nobats' kernel parameter ?
>
> If not, that's normal to be true, it means that memory is mapped with BATs.

bool + 1 is always true.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply

* Re: [PATCH v5 13/16] powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX
From: Andreas Schwab @ 2019-06-16  9:29 UTC (permalink / raw)
  To: christophe leroy
  Cc: j.neuschaefer, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <a76f7759-a407-3d9a-0f43-654fd7ea0b1e@c-s.fr>

On Jun 16 2019, christophe leroy <christophe.leroy@c-s.fr> wrote:

> If any of registers IBATs 4 to 7 are used

Nope.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply

* Re: [PATCH v5 13/16] powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX
From: Andreas Schwab @ 2019-06-16 10:13 UTC (permalink / raw)
  To: christophe leroy
  Cc: j.neuschaefer, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <a76f7759-a407-3d9a-0f43-654fd7ea0b1e@c-s.fr>

On Jun 16 2019, christophe leroy <christophe.leroy@c-s.fr> wrote:

> If any of registers IBATs 4 to 7 are used, could you adjust
> CONFIG_ETEXT_SHIFT so that only IBATs 0 to 3 be used, and check if
> suspend/resume works when IBATs 4 to 7 are not used ?

I forgot to remove my patch.  With only 0-3 used, suspend/resume works.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply

* Re: [PATCH] powerpc/32s: fix initial setup of segment registers on secondary CPU
From: Michael Ellerman @ 2019-06-16 12:23 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	erhard_f
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <be07403806abc56ec027f6d47468411876e18bb5.1560267983.git.christophe.leroy@c-s.fr>

On Tue, 2019-06-11 at 15:47:20 UTC, Christophe Leroy wrote:
> The patch referenced below moved the loading of segment registers
> out of load_up_mmu() in order to do it earlier in the boot sequence.
> However, the secondary CPU still needs it to be done when loading up
> the MMU.
> 
> Reported-by: Erhard F. <erhard_f@mailbox.org>
> Fixes: 215b823707ce ("powerpc/32s: set up an early static hash table for KASAN")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/b7f8b440f3001cc1775c028f0a783786113c2ae3

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/32: fix build failure on book3e with KVM
From: Michael Ellerman @ 2019-06-16 12:23 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <97664671a229a4240bfb22f69ec4743e837c2b83.1558600628.git.christophe.leroy@c-s.fr>

On Thu, 2019-05-23 at 08:39:27 UTC, Christophe Leroy wrote:
> Build failure was introduced by the commit identified below,
> due to missed macro expension leading to wrong called function's name.
> 
> arch/powerpc/kernel/head_fsl_booke.o: In function `SystemCall':
> arch/powerpc/kernel/head_fsl_booke.S:416: undefined reference to `kvmppc_handler_BOOKE_INTERRUPT_SYSCALL_SPRN_SRR1'
> Makefile:1052: recipe for target 'vmlinux' failed
> 
> The called function should be kvmppc_handler_8_0x01B(). This patch fixes it.
> 
> Reported-by: Paul Mackerras <paulus@ozlabs.org>
> Fixes: 1a4b739bbb4f ("powerpc/32: implement fast entry for syscalls on BOOKE")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/82f6e266f8123d7938713c0e10c03aa655b3e68a

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/booke: fix fast syscall entry on SMP
From: Michael Ellerman @ 2019-06-16 12:23 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <c0ea3c32c7ed892501ddcc7a169948c305081551.1560433897.git.christophe.leroy@c-s.fr>

On Thu, 2019-06-13 at 13:52:30 UTC, Christophe Leroy wrote:
> Use r10 instead of r9 to calculate CPU offset as r9 contains
> the value from SRR1 which is used later.
> 
> Fixes: 1a4b739bbb4f ("powerpc/32: implement fast entry for syscalls on BOOKE")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/e8732ffa2e096d433c3f2349b871d43ed0d39f5c

cheers

^ permalink raw reply

* Re: [PATCH v3 1/3] powerpc/powernv: Add OPAL API interface to get secureboot state
From: Daniel Axtens @ 2019-06-16 23:56 UTC (permalink / raw)
  To: Nayna
  Cc: linux-efi, Ard Biesheuvel, Eric Richter, Nayna Jain, linux-kernel,
	Mimi Zohar, Claudio Carvalho, Matthew Garret, linuxppc-dev,
	Paul Mackerras, Jeremy Kerr, linux-integrity
In-Reply-To: <b2cedb05-6373-b357-f35c-bc112c78a6fc@linux.vnet.ibm.com>

Hi Nayna,

>> I guess I also somewhat object to calling it a 'backend' if we're using
>> it as a version scheme. I think the skiboot storage backends are true
>> backends - they provide different implementations of the same
>> functionality with the same API, but this seems like you're using it to
>> indicate different functionality. It seems like we're using it as if it
>> were called OPAL_SECVAR_VERSION.
>
> We are changing how we are exposing the version to the kernel. The 
> version will be exposed as device-tree entry rather than a OPAL runtime 
> service. We are not tied to the name "backend", we can switch to calling 
> it as "scheme" unless there is a better name.

This sounds like a good approach to me.

Kind regards,
Daniel

>
> Thanks & Regards,
>        - Nayna

^ permalink raw reply

* Re: [PATCH 0/1] PPC32: fix ptrace() access to FPU registers
From: Daniel Axtens @ 2019-06-17  1:19 UTC (permalink / raw)
  To: Radu Rendec, linuxppc-dev; +Cc: Radu Rendec, Paul Mackerras, Oleg Nesterov
In-Reply-To: <20190610232758.19010-1-radu.rendec@gmail.com>

Radu Rendec <radu.rendec@gmail.com> writes:

> Hi Everyone,
>
> I'm following up on the ptrace() problem that I reported a few days ago.
> I believe my version of the code handles all cases correctly. While the
> problem essentially boils down to dividing the fpidx by 2 on PPC32, it
> becomes tricky when the same code must work correctly on both PPC32 and
> PPC64.
>
> One other thing that I believe was handled incorrectly in the previous
> version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
> is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
> PT_FPR0 + 2*32 still corresponds to a possible address that userspace
> can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
> cause an invalid access to the FPU registers array.
>
> I tested the patch on 4.9.179, but that part of the code is identical in
> recent kernels so it should work just the same.
>
> I wrote a simple test program than can be used to quickly test (on an
> x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
> The code is included below.
>
> I also tested with gdbserver (test patch included below) and verified
> that it generates two ptrace() calls for each FPU register, with
> addresses between 0xc0 and 0x1bc.

Thanks for looking in to this. I can confirm your issue. What I'm
currently wondering is: what is the behaviour with a 32-bit userspace on
a 64-bit kernel? Should they also be going down the 32-bit path as far
as calculating offsets goes?

Regards,
Daniel

>
> 8<--------------- Makefile ---------------------------------------------
> .PHONY: all clean
>
> all: ptrace-fpregs-32 ptrace-fpregs-64
>
> ptrace-fpregs-32: ptrace-fpregs.c
> 	$(CC) -o ptrace-fpregs-32 -Wall -O2 -m32 ptrace-fpregs.c
>
> ptrace-fpregs-64: ptrace-fpregs.c
> 	$(CC) -o ptrace-fpregs-64 -Wall -O2 ptrace-fpregs.c
>
> clean:
> 	rm -f ptrace-fpregs-32 ptrace-fpregs-64
> 8<--------------- ptrace-fpregs.c --------------------------------------
> #include <stdio.h>
> #include <errno.h>
>
> #define PT_FPR0	48
>
> #ifndef __x86_64
>
> #define PT_FPR31 (PT_FPR0 + 2*31)
> #define PT_FPSCR (PT_FPR0 + 2*32 + 1)
>
> #else
>
> #define PT_FPSCR (PT_FPR0 + 32)
>
> #endif
>
> int test_access(unsigned long addr)
> {
> 	int ret;
>
> 	do {
> 		unsigned long index, fpidx;
>
> 		ret = -EIO;
>
> 		/* convert to index and check */
> 		index = addr / sizeof(long);
> 		if ((addr & (sizeof(long) - 1)) || (index > PT_FPSCR))
> 			break;
>
> 		if (index < PT_FPR0) {
> 			ret = printf("ptrace_put_reg(%lu)", index);
> 			break;
> 		}
>
> 		ret = 0;
> #ifndef __x86_64
> 		if (index == PT_FPSCR - 1) {
> 			/* corner case for PPC32; do nothing */
> 			printf("corner_case");
> 			break;
> 		}
> #endif
> 		if (index == PT_FPSCR) {
> 			printf("fpscr");
> 			break;
> 		}
>
> 		/*
> 		 * FPR is always 64-bit; on PPC32, userspace does two 32-bit
> 		 * accesses. Add bit2 to allow accessing the upper half on
> 		 * 32-bit; on 64-bit, bit2 is always 0 (we validate it above).
> 		 */
> 		fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
> 		printf("TS_FPR[%lu] + %lu", fpidx, addr & 4);
> 		break;
> 	} while (0);
>
> 	return ret;
> }
>
> int main(void)
> {
> 	unsigned long addr;
> 	int rc;
>
> 	for (addr = 0; addr < PT_FPSCR * sizeof(long) + 16; addr++) {
> 		printf("0x%04lx: ", addr);
> 		rc = test_access(addr);
> 		if (rc < 0)
> 			printf("!err!");
> 		printf("\t<%d>\n", rc);
> 	}
>
> 	return 0;
> }
> 8<--------------- gdb.patch --------------------------------------------
> --- gdb/gdbserver/linux-low.c.orig	2019-06-10 11:45:53.810882669 -0400
> +++ gdb/gdbserver/linux-low.c	2019-06-10 11:49:32.272929766 -0400
> @@ -4262,6 +4262,8 @@ store_register (struct regcache *regcach
>    pid = lwpid_of (get_thread_lwp (current_inferior));
>    for (i = 0; i < size; i += sizeof (PTRACE_XFER_TYPE))
>      {
> +      printf("writing register #%d offset %d at address %#x\n",
> +             regno, i, (unsigned int)regaddr);
>        errno = 0;
>        ptrace (PTRACE_POKEUSER, pid,
>  	    /* Coerce to a uintptr_t first to avoid potential gcc warning
> 8<----------------------------------------------------------------------
>
> Radu Rendec (1):
>   PPC32: fix ptrace() access to FPU registers
>
>  arch/powerpc/kernel/ptrace.c | 85 ++++++++++++++++++++++--------------
>  1 file changed, 52 insertions(+), 33 deletions(-)
>
> -- 
> 2.20.1

^ permalink raw reply

* Re: [PATCH v3 4/9] KVM: PPC: Ultravisor: Add generic ultravisor call handler
From: Paul Mackerras @ 2019-06-17  2:06 UTC (permalink / raw)
  To: Claudio Carvalho
  Cc: Madhavan Srinivasan, Michael Anderson, Ram Pai, kvm-ppc,
	Bharata B Rao, linuxppc-dev, Sukadev Bhattiprolu,
	Thiago Bauermann, Anshuman Khandual
In-Reply-To: <20190606173614.32090-5-cclaudio@linux.ibm.com>

On Thu, Jun 06, 2019 at 02:36:09PM -0300, Claudio Carvalho wrote:
> From: Ram Pai <linuxram@us.ibm.com>
> 
> Add the ucall() function, which can be used to make ultravisor calls
> with varied number of in and out arguments. Ultravisor calls can be made
> from the host or guests.
> 
> This copies the implementation of plpar_hcall().

One point which I missed when I looked at this patch previously is
that the ABI that we're defining here is different from the hcall ABI
in that we are putting the ucall number in r0, whereas hcalls have the
hcall number in r3.  That makes ucalls more like syscalls, which have
the syscall number in r0.  So that last sentence quoted above is
somewhat misleading.

The thing we need to consider is that when SMFCTRL[E] = 0, a ucall
instruction becomes a hcall (that is, sc 2 is executed as if it was
sc 1).  In that case, the first argument to the ucall will be
interpreted as the hcall number.  Mostly that will happen not to be a
valid hcall number, but sometimes it might unavoidably be a valid but
unintended hcall number.

I think that will make it difficult to get ucalls to fail gracefully
in the case where SMF/PEF is disabled.  It seems like the assignment
of ucall numbers was made so that they wouldn't overlap with valid
hcall numbers; presumably that was so that we could tell when an hcall
was actually intended to be a ucall.  However, using a different GPR
to pass the ucall number defeats that.

I realize that there is ultravisor code in development that takes the
ucall number in r0, and also that having the ucall number in r3 would
possibly make life more difficult for the place where we call
UV_RETURN in assembler code.  Nevertheless, perhaps we should consider
changing the ABI to be like the hcall ABI before everything gets set
in concrete.

Paul.

^ permalink raw reply

* Re: [PATCH 0/1] PPC32: fix ptrace() access to FPU registers
From: Radu Rendec @ 2019-06-17  2:27 UTC (permalink / raw)
  To: Daniel Axtens, linuxppc-dev; +Cc: Paul Mackerras, Oleg Nesterov
In-Reply-To: <87r27t2el0.fsf@dja-thinkpad.axtens.net>

On Mon, 2019-06-17 at 11:19 +1000, Daniel Axtens wrote:
> Radu Rendec <
> radu.rendec@gmail.com
> > writes:
> 
> > Hi Everyone,
> > 
> > I'm following up on the ptrace() problem that I reported a few days ago.
> > I believe my version of the code handles all cases correctly. While the
> > problem essentially boils down to dividing the fpidx by 2 on PPC32, it
> > becomes tricky when the same code must work correctly on both PPC32 and
> > PPC64.
> > 
> > One other thing that I believe was handled incorrectly in the previous
> > version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
> > is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
> > PT_FPR0 + 2*32 still corresponds to a possible address that userspace
> > can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
> > cause an invalid access to the FPU registers array.
> > 
> > I tested the patch on 4.9.179, but that part of the code is identical in
> > recent kernels so it should work just the same.
> > 
> > I wrote a simple test program than can be used to quickly test (on an
> > x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
> > The code is included below.
> > 
> > I also tested with gdbserver (test patch included below) and verified
> > that it generates two ptrace() calls for each FPU register, with
> > addresses between 0xc0 and 0x1bc.
> 
> Thanks for looking in to this. I can confirm your issue. What I'm
> currently wondering is: what is the behaviour with a 32-bit userspace on
> a 64-bit kernel? Should they also be going down the 32-bit path as far
> as calculating offsets goes?

Thanks for reviewing this. I haven't thought about the 32-bit userspace
on a 64-bit kernel, that is a very good question. Userspace passes a
pointer, so in theory it could go down either path as long as the
pointer points to the right data type.

I will go back to the gdb source code and try to figure out if that case
is handled in a special way. If not, it's probably safe to assume that a
32-bit userspace should always go down the 32-bit path regardless of the
kernel bitness (in which case I think I have to change my patch).

Best regards,
Radu

> > 8<--------------- Makefile ---------------------------------------------
> > .PHONY: all clean
> > 
> > all: ptrace-fpregs-32 ptrace-fpregs-64
> > 
> > ptrace-fpregs-32: ptrace-fpregs.c
> > 	$(CC) -o ptrace-fpregs-32 -Wall -O2 -m32 ptrace-fpregs.c
> > 
> > ptrace-fpregs-64: ptrace-fpregs.c
> > 	$(CC) -o ptrace-fpregs-64 -Wall -O2 ptrace-fpregs.c
> > 
> > clean:
> > 	rm -f ptrace-fpregs-32 ptrace-fpregs-64
> > 8<--------------- ptrace-fpregs.c --------------------------------------
> > #include <stdio.h>
> > #include <errno.h>
> > 
> > #define PT_FPR0	48
> > 
> > #ifndef __x86_64
> > 
> > #define PT_FPR31 (PT_FPR0 + 2*31)
> > #define PT_FPSCR (PT_FPR0 + 2*32 + 1)
> > 
> > #else
> > 
> > #define PT_FPSCR (PT_FPR0 + 32)
> > 
> > #endif
> > 
> > int test_access(unsigned long addr)
> > {
> > 	int ret;
> > 
> > 	do {
> > 		unsigned long index, fpidx;
> > 
> > 		ret = -EIO;
> > 
> > 		/* convert to index and check */
> > 		index = addr / sizeof(long);
> > 		if ((addr & (sizeof(long) - 1)) || (index > PT_FPSCR))
> > 			break;
> > 
> > 		if (index < PT_FPR0) {
> > 			ret = printf("ptrace_put_reg(%lu)", index);
> > 			break;
> > 		}
> > 
> > 		ret = 0;
> > #ifndef __x86_64
> > 		if (index == PT_FPSCR - 1) {
> > 			/* corner case for PPC32; do nothing */
> > 			printf("corner_case");
> > 			break;
> > 		}
> > #endif
> > 		if (index == PT_FPSCR) {
> > 			printf("fpscr");
> > 			break;
> > 		}
> > 
> > 		/*
> > 		 * FPR is always 64-bit; on PPC32, userspace does two 32-bit
> > 		 * accesses. Add bit2 to allow accessing the upper half on
> > 		 * 32-bit; on 64-bit, bit2 is always 0 (we validate it above).
> > 		 */
> > 		fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
> > 		printf("TS_FPR[%lu] + %lu", fpidx, addr & 4);
> > 		break;
> > 	} while (0);
> > 
> > 	return ret;
> > }
> > 
> > int main(void)
> > {
> > 	unsigned long addr;
> > 	int rc;
> > 
> > 	for (addr = 0; addr < PT_FPSCR * sizeof(long) + 16; addr++) {
> > 		printf("0x%04lx: ", addr);
> > 		rc = test_access(addr);
> > 		if (rc < 0)
> > 			printf("!err!");
> > 		printf("\t<%d>\n", rc);
> > 	}
> > 
> > 	return 0;
> > }
> > 8<--------------- gdb.patch --------------------------------------------
> > --- gdb/gdbserver/linux-low.c.orig	2019-06-10 11:45:53.810882669 -0400
> > +++ gdb/gdbserver/linux-low.c	2019-06-10 11:49:32.272929766 -0400
> > @@ -4262,6 +4262,8 @@ store_register (struct regcache *regcach
> >    pid = lwpid_of (get_thread_lwp (current_inferior));
> >    for (i = 0; i < size; i += sizeof (PTRACE_XFER_TYPE))
> >      {
> > +      printf("writing register #%d offset %d at address %#x\n",
> > +             regno, i, (unsigned int)regaddr);
> >        errno = 0;
> >        ptrace (PTRACE_POKEUSER, pid,
> >  	    /* Coerce to a uintptr_t first to avoid potential gcc warning
> > 8<----------------------------------------------------------------------
> > 
> > Radu Rendec (1):
> >   PPC32: fix ptrace() access to FPU registers
> > 
> >  arch/powerpc/kernel/ptrace.c | 85 ++++++++++++++++++++++--------------
> >  1 file changed, 52 insertions(+), 33 deletions(-)
> > 
> > -- 
> > 2.20.1


^ permalink raw reply

* Re: [RFC PATCH v4 6/6] kvmppc: Support reset of secure guest
From: Paul Mackerras @ 2019-06-17  4:06 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: linuxram, cclaudio, kvm-ppc, linux-mm, jglisse, aneesh.kumar,
	paulus, sukadev, linuxppc-dev
In-Reply-To: <20190528064933.23119-7-bharata@linux.ibm.com>

On Tue, May 28, 2019 at 12:19:33PM +0530, Bharata B Rao wrote:
> Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF.
> This ioctl will be issued by QEMU during reset and in this ioctl,
> we ask UV to terminate the guest via UV_SVM_TERMINATE ucall,
> reinitialize guest's partitioned scoped page tables and release all
> HMM pages of the secure guest.
> 
> After these steps, guest is ready to issue UV_ESM call once again
> to switch to secure mode.

Since you are adding a new KVM ioctl, you need to add a description of
it to Documentation/virtual/kvm/api.txt.

Paul.

^ permalink raw reply

* [PATCH] ocxl: Allow contexts to be attached with a NULL mm
From: Alastair D'Silva @ 2019-06-17  4:41 UTC (permalink / raw)
  To: alastair
  Cc: Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman, linux-kernel,
	Frederic Barrat, linuxppc-dev

From: Alastair D'Silva <alastair@d-silva.org>

If an OpenCAPI context is to be used directly by a kernel driver, there
may not be a suitable mm to use.

The patch makes the mm parameter to ocxl_context_attach optional.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 drivers/misc/ocxl/context.c |  9 ++++++---
 drivers/misc/ocxl/link.c    | 12 ++++++++----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
index bab9c9364184..994563a078eb 100644
--- a/drivers/misc/ocxl/context.c
+++ b/drivers/misc/ocxl/context.c
@@ -69,6 +69,7 @@ static void xsl_fault_error(void *data, u64 addr, u64 dsisr)
 int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
 {
 	int rc;
+	unsigned long pidr = 0;
 
 	// Locks both status & tidr
 	mutex_lock(&ctx->status_mutex);
@@ -77,9 +78,11 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
 		goto out;
 	}
 
-	rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
-			mm->context.id, ctx->tidr, amr, mm,
-			xsl_fault_error, ctx);
+	if (mm)
+		pidr = mm->context.id;
+
+	rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, pidr, ctx->tidr,
+			      amr, mm, xsl_fault_error, ctx);
 	if (rc)
 		goto out;
 
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index cce5b0d64505..43542f124807 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -523,7 +523,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 	pe->amr = cpu_to_be64(amr);
 	pe->software_state = cpu_to_be32(SPA_PE_VALID);
 
-	mm_context_add_copro(mm);
+	if (mm)
+		mm_context_add_copro(mm);
 	/*
 	 * Barrier is to make sure PE is visible in the SPA before it
 	 * is used by the device. It also helps with the global TLBI
@@ -546,7 +547,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
 	 * have a reference on mm_users. Incrementing mm_count solves
 	 * the problem.
 	 */
-	mmgrab(mm);
+	if (mm)
+		mmgrab(mm);
 	trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr, tidr);
 unlock:
 	mutex_unlock(&spa->spa_lock);
@@ -652,8 +654,10 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
 	if (!pe_data) {
 		WARN(1, "Couldn't find pe data when removing PE\n");
 	} else {
-		mm_context_remove_copro(pe_data->mm);
-		mmdrop(pe_data->mm);
+		if (pe_data->mm) {
+			mm_context_remove_copro(pe_data->mm);
+			mmdrop(pe_data->mm);
+		}
 		kfree_rcu(pe_data, rcu);
 	}
 unlock:
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH v4 1/6] kvmppc: HMM backend driver to manage pages of secure guest
From: Paul Mackerras @ 2019-06-17  5:31 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: linuxram, cclaudio, kvm-ppc, linux-mm, jglisse, aneesh.kumar,
	paulus, sukadev, linuxppc-dev
In-Reply-To: <20190528064933.23119-2-bharata@linux.ibm.com>

On Tue, May 28, 2019 at 12:19:28PM +0530, Bharata B Rao wrote:
> HMM driver for KVM PPC to manage page transitions of
> secure guest via H_SVM_PAGE_IN and H_SVM_PAGE_OUT hcalls.
> 
> H_SVM_PAGE_IN: Move the content of a normal page to secure page
> H_SVM_PAGE_OUT: Move the content of a secure page to normal page

Comments below...

> @@ -4421,6 +4435,7 @@ static void kvmppc_core_free_memslot_hv(struct kvm_memory_slot *free,
>  					struct kvm_memory_slot *dont)
>  {
>  	if (!dont || free->arch.rmap != dont->arch.rmap) {
> +		kvmppc_hmm_release_pfns(free);

I don't think this is the right place to do this.  The memslot will
have no pages mapped by this time, because higher levels of code will
have called kvmppc_core_flush_memslot_hv() before calling this.
Releasing the pfns should be done in that function.

> diff --git a/arch/powerpc/kvm/book3s_hv_hmm.c b/arch/powerpc/kvm/book3s_hv_hmm.c
> new file mode 100644
> index 000000000000..713806003da3

...

> +#define KVMPPC_PFN_HMM		(0x1ULL << 61)
> +
> +static inline bool kvmppc_is_hmm_pfn(unsigned long pfn)
> +{
> +	return !!(pfn & KVMPPC_PFN_HMM);
> +}

Since you are putting in these values in the rmap entries, you need to
be careful about overlaps between these values and the other uses of
rmap entries.  The value you have chosen would be in the middle of the
LPID field for an rmap entry for a guest that has nested guests, and
in fact kvmhv_remove_nest_rmap_range() effectively assumes that a
non-zero rmap entry must be a list of L2 guest mappings.  (This is for
radix guests; HPT guests use the rmap entry differently, but I am
assuming that we will enforce that only radix guests can be secure
guests.)

Maybe it is true that the rmap entry will be non-zero only for those
guest pages which are not mapped on the host side, that is,
kvmppc_radix_flush_memslot() will see !pte_present(*ptep) for any page
of a secure guest where the rmap entry contains a HMM pfn.  If that is
so and is a deliberate part of the design, then I would like to see it
written down in comments and commit messages so it's clear to others
working on the code in future.

Suraj is working on support for nested HPT guests, which will involve
changing the rmap format to indicate more explicitly what sort of
entry each rmap entry is.  Please work with him to define a format for
your rmap entries that is clearly distinguishable from the others.

I think it is reasonable to say that a secure guest can't have nested
guests, at least for now, but then we should make sure to kill all
nested guests when a guest goes secure.

...

> +/*
> + * Move page from normal memory to secure memory.
> + */
> +unsigned long
> +kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
> +		     unsigned long flags, unsigned long page_shift)
> +{
> +	unsigned long addr, end;
> +	unsigned long src_pfn, dst_pfn;
> +	struct kvmppc_hmm_migrate_args args;
> +	struct vm_area_struct *vma;
> +	int srcu_idx;
> +	unsigned long gfn = gpa >> page_shift;
> +	struct kvm_memory_slot *slot;
> +	unsigned long *rmap;
> +	int ret = H_SUCCESS;
> +
> +	if (page_shift != PAGE_SHIFT)
> +		return H_P3;
> +
> +	srcu_idx = srcu_read_lock(&kvm->srcu);
> +	slot = gfn_to_memslot(kvm, gfn);
> +	rmap = &slot->arch.rmap[gfn - slot->base_gfn];
> +	addr = gfn_to_hva(kvm, gpa >> page_shift);
> +	srcu_read_unlock(&kvm->srcu, srcu_idx);

Shouldn't we keep the srcu read lock until we have finished working on
the page?

> +	if (kvm_is_error_hva(addr))
> +		return H_PARAMETER;
> +
> +	end = addr + (1UL << page_shift);
> +
> +	if (flags)
> +		return H_P2;
> +
> +	args.rmap = rmap;
> +	args.lpid = kvm->arch.lpid;
> +	args.gpa = gpa;
> +	args.page_shift = page_shift;
> +
> +	down_read(&kvm->mm->mmap_sem);
> +	vma = find_vma_intersection(kvm->mm, addr, end);
> +	if (!vma || vma->vm_start > addr || vma->vm_end < end) {
> +		ret = H_PARAMETER;
> +		goto out;
> +	}
> +	ret = migrate_vma(&kvmppc_hmm_migrate_ops, vma, addr, end,
> +			  &src_pfn, &dst_pfn, &args);
> +	if (ret < 0)
> +		ret = H_PARAMETER;
> +out:
> +	up_read(&kvm->mm->mmap_sem);
> +	return ret;
> +}

...

> +/*
> + * Move page from secure memory to normal memory.
> + */
> +unsigned long
> +kvmppc_h_svm_page_out(struct kvm *kvm, unsigned long gpa,
> +		      unsigned long flags, unsigned long page_shift)
> +{
> +	unsigned long addr, end;
> +	struct vm_area_struct *vma;
> +	unsigned long src_pfn, dst_pfn = 0;
> +	int srcu_idx;
> +	int ret = H_SUCCESS;
> +
> +	if (page_shift != PAGE_SHIFT)
> +		return H_P3;
> +
> +	if (flags)
> +		return H_P2;
> +
> +	srcu_idx = srcu_read_lock(&kvm->srcu);
> +	addr = gfn_to_hva(kvm, gpa >> page_shift);
> +	srcu_read_unlock(&kvm->srcu, srcu_idx);

and likewise here, shouldn't we unlock later, after the migrate_vma()
call perhaps?

> +	if (kvm_is_error_hva(addr))
> +		return H_PARAMETER;
> +
> +	end = addr + (1UL << page_shift);
> +
> +	down_read(&kvm->mm->mmap_sem);
> +	vma = find_vma_intersection(kvm->mm, addr, end);
> +	if (!vma || vma->vm_start > addr || vma->vm_end < end) {
> +		ret = H_PARAMETER;
> +		goto out;
> +	}
> +	ret = migrate_vma(&kvmppc_hmm_fault_migrate_ops, vma, addr, end,
> +			  &src_pfn, &dst_pfn, NULL);
> +	if (ret < 0)
> +		ret = H_PARAMETER;
> +out:
> +	up_read(&kvm->mm->mmap_sem);
> +	return ret;
> +}
> +

Paul.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox