LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: (subset) [PATCH 00/38] Replace deprecated CPU-hotplug
From: Michael Ellerman @ 2021-08-18 13:38 UTC (permalink / raw)
  To: linux-kernel, Sebastian Andrzej Siewior; +Cc: linuxppc-dev
In-Reply-To: <20210803141621.780504-1-bigeasy@linutronix.de>

On Tue, 3 Aug 2021 16:15:43 +0200, Sebastian Andrzej Siewior wrote:
> This is a tree wide replacement of the deprecated CPU hotplug functions
> which are only wrappers around the actual functions.
> 
> Each patch is independent and can be picked up by the relevant maintainer.
> 
> [...]

Applied to powerpc/next.

[03/38] powerpc: Replace deprecated CPU-hotplug functions.
        https://git.kernel.org/powerpc/c/5ae36401ca4ea2737d779ce7c267444b16530001

cheers

^ permalink raw reply

* Re: [PATCH v2 1/2] powerpc/book3s64/radix: make tlb_single_page_flush_ceiling a debugfs entry
From: Michael Ellerman @ 2021-08-18 13:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe, Aneesh Kumar K.V
In-Reply-To: <20210812132831.233794-1-aneesh.kumar@linux.ibm.com>

On Thu, 12 Aug 2021 18:58:30 +0530, Aneesh Kumar K.V wrote:
> Similar to x86/s390 add a debugfs file to tune tlb_single_page_flush_ceiling.
> Also add a debugfs entry for tlb_local_single_page_flush_ceiling.

Applied to powerpc/next.

[1/2] powerpc/book3s64/radix: make tlb_single_page_flush_ceiling a debugfs entry
      https://git.kernel.org/powerpc/c/3e188b1ae8807f26cc5a530a9d55f3f643fe050a
[2/2] powerpc: rename powerpc_debugfs_root to arch_debugfs_dir
      https://git.kernel.org/powerpc/c/dbf77fed8b302e87561c7c2fc06050c88f4d3120

cheers

^ permalink raw reply

* Re: [PATCH v3 3/3] powerpc/perf: Fix the check for SIAR value
From: Christophe Leroy @ 2021-08-18 13:28 UTC (permalink / raw)
  To: Kajol Jain, mpe, linuxppc-dev; +Cc: atrajeev, maddy, rnsastry
In-Reply-To: <20210818131949.32008-3-kjain@linux.ibm.com>



Le 18/08/2021 à 15:19, Kajol Jain a écrit :
> Incase of random sampling, there can be scenarios where
> Sample Instruction Address Register(SIAR) may not latch
> to the sampled instruction and could result in
> the value of 0. In these scenarios it is preferred to
> return regs->nip. These corner cases are seen in the
> previous generation (p9) also.
> 
> Patch adds the check for SIAR value along with use_siar and
> siar_valid checks so that the function will return regs->nip
> incase SIAR is zero.
> 
> Patch drops the code under PPMU_P10_DD1 flag check
> which handles SIAR 0 case only for Power10 DD1.
> 
> Fixes: 2ca13a4cc56c9 ("powerpc/perf: Use regs->nip when SIAR is zero")
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> ---
> 
> Changelog:
> - Drop adding new ternary condition to check siar value.
> - Remove siar check specific for PPMU_P10_DD1 and add
>    it along with common checks as suggested by Christophe Leroy
>    and Michael Ellermen
> 
>   arch/powerpc/perf/core-book3s.c | 7 +------
>   1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 23ec89a59893..55efbba7572b 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -2254,12 +2254,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
>   	bool use_siar = regs_use_siar(regs);
>   	unsigned long siar = mfspr(SPRN_SIAR);
>   
> -	if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {
> -		if (siar)
> -			return siar;
> -		else
> -			return regs->nip;
> -	} else if (use_siar && siar_valid(regs))
> +	if (use_siar && siar_valid(regs) && siar)

You can probably now do

+	if (regs_use_siar(regs) && siar_valid(regs) && siar)

and remove use_siar

>   		return siar + perf_ip_adjust(regs);
>   	else
>   		return regs->nip;
> 

^ permalink raw reply

* [PATCH v3 3/3] powerpc/perf: Fix the check for SIAR value
From: Kajol Jain @ 2021-08-18 13:19 UTC (permalink / raw)
  To: mpe, linuxppc-dev, christophe.leroy; +Cc: kjain, atrajeev, maddy, rnsastry
In-Reply-To: <20210818131949.32008-1-kjain@linux.ibm.com>

Incase of random sampling, there can be scenarios where
Sample Instruction Address Register(SIAR) may not latch
to the sampled instruction and could result in
the value of 0. In these scenarios it is preferred to
return regs->nip. These corner cases are seen in the
previous generation (p9) also. 

Patch adds the check for SIAR value along with use_siar and
siar_valid checks so that the function will return regs->nip
incase SIAR is zero.

Patch drops the code under PPMU_P10_DD1 flag check
which handles SIAR 0 case only for Power10 DD1.

Fixes: 2ca13a4cc56c9 ("powerpc/perf: Use regs->nip when SIAR is zero")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---

Changelog:
- Drop adding new ternary condition to check siar value.
- Remove siar check specific for PPMU_P10_DD1 and add
  it along with common checks as suggested by Christophe Leroy
  and Michael Ellermen

 arch/powerpc/perf/core-book3s.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 23ec89a59893..55efbba7572b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2254,12 +2254,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
 	bool use_siar = regs_use_siar(regs);
 	unsigned long siar = mfspr(SPRN_SIAR);
 
-	if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {
-		if (siar)
-			return siar;
-		else
-			return regs->nip;
-	} else if (use_siar && siar_valid(regs))
+	if (use_siar && siar_valid(regs) && siar)
 		return siar + perf_ip_adjust(regs);
 	else
 		return regs->nip;
-- 
2.26.2


^ permalink raw reply related

* [PATCH v3 2/3] powerpc/perf: Drop the case of returning 0 as instruction pointer
From: Kajol Jain @ 2021-08-18 13:19 UTC (permalink / raw)
  To: mpe, linuxppc-dev, christophe.leroy; +Cc: kjain, atrajeev, maddy, rnsastry
In-Reply-To: <20210818131949.32008-1-kjain@linux.ibm.com>

Drop the case of returning 0 as instruction pointer since kernel
never executes at 0 and userspace almost never does either.

Fixes: e6878835ac47 ("powerpc/perf: Sample only if SIAR-Valid
bit is set in P7+")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 1b464aad29c4..23ec89a59893 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2261,8 +2261,6 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
 			return regs->nip;
 	} else if (use_siar && siar_valid(regs))
 		return siar + perf_ip_adjust(regs);
-	else if (use_siar)
-		return 0;		// no valid instruction pointer
 	else
 		return regs->nip;
 }
-- 
2.26.2


^ permalink raw reply related

* [PATCH v3 1/3] powerpc/perf: Use stack siar instead of mfspr
From: Kajol Jain @ 2021-08-18 13:19 UTC (permalink / raw)
  To: mpe, linuxppc-dev, christophe.leroy; +Cc: kjain, atrajeev, maddy, rnsastry

Minor optimization in the 'perf_instruction_pointer' function code by
making use of stack siar instead of mfspr.

Fixes: 75382aa72f06 ("powerpc/perf: Move code to select SIAR or pt_regs
into perf_read_regs")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bb0ee716de91..1b464aad29c4 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2260,7 +2260,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
 		else
 			return regs->nip;
 	} else if (use_siar && siar_valid(regs))
-		return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
+		return siar + perf_ip_adjust(regs);
 	else if (use_siar)
 		return 0;		// no valid instruction pointer
 	else
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used
From: Athira Rajeev @ 2021-08-18 12:14 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: Madhavan Srinivasan, linuxppc-dev
In-Reply-To: <1629286381.q658eskbmg.astroid@bobo.none>



> On 18-Aug-2021, at 5:11 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> Excerpts from Madhavan Srinivasan's message of August 17, 2021 11:06 pm:
>> 
>> On 8/16/21 12:59 PM, Nicholas Piggin wrote:
>>> Interrupt handling code would like to know whether perf is enabled, to
>>> know whether it should enable MSR[EE] to improve PMI coverage.
>>> 
>>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>>> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>>  arch/powerpc/include/asm/hw_irq.h |  2 ++
>>>  arch/powerpc/perf/core-book3s.c   | 13 +++++++++++++
>>>  2 files changed, 15 insertions(+)
>>> 
>>> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
>>> index 21cc571ea9c2..2d5c0d3ccbb6 100644
>>> --- a/arch/powerpc/include/asm/hw_irq.h
>>> +++ b/arch/powerpc/include/asm/hw_irq.h
>>> @@ -306,6 +306,8 @@ static inline bool lazy_irq_pending_nocheck(void)
>>>  	return __lazy_irq_pending(local_paca->irq_happened);
>>>  }
>>> 
>>> +bool power_pmu_running(void);
>>> +
>>>  /*
>>>   * This is called by asynchronous interrupts to conditionally
>>>   * re-enable hard interrupts after having cleared the source
>>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>>> index bb0ee716de91..76114a9afb2b 100644
>>> --- a/arch/powerpc/perf/core-book3s.c
>>> +++ b/arch/powerpc/perf/core-book3s.c
>>> @@ -2380,6 +2380,19 @@ static void perf_event_interrupt(struct pt_regs *regs)
>>>  	perf_sample_event_took(sched_clock() - start_clock);
>>>  }
>>> 
>>> +bool power_pmu_running(void)
>>> +{
>>> +	struct cpu_hw_events *cpuhw;
>>> +
>>> +	/* Could this simply test local_paca->pmcregs_in_use? */
>>> +
>>> +	if (!ppmu)
>>> +		return false;
>> 
>> 
>> This covers only when perf platform driver is not registered,
>> but we should also check for MMCR0[32], since pmu sprs can be
>> accessed via sysfs.
> 
> In that case do they actually do anything with the PMI? I don't think it 
> should matter hopefully.
> 
> But I do think a lot of this stuff could be cleaned up. We have 
> pmcs_enabled and ppc_enable_pmcs() in sysfs.c, ppc_set_pmu_inuse(), 
> ppc_md.enable_pmcs(), reserve_pmc_hardware(), etc and different users 
> call different things. We don't consistently disable either, e.g., we 
> never disable the H_PERFMON facility after we stop using perf even 
> though it says that slows down partition switch.

Hi Nick,

I have started looking at understanding the code path and working on it to get the code cleaned up.
I will work on posting the patch set for clean up.

Thanks
Athira Rajeev
> 
> I started to have a look at sorting it out but it looks like a big
> job so would take a bit of time if we want to do it.
> 
> Thanks,
> Nick


^ permalink raw reply

* Re: [PATCH v2] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Christophe Leroy @ 2021-08-18 12:09 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
  Cc: lvivier, jniethe5, aneesh.kumar, npiggin, farosas
In-Reply-To: <20210818120518.3603172-1-mpe@ellerman.id.au>



Le 18/08/2021 à 14:05, Michael Ellerman a écrit :
> Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
> on one of his systems:
> 
>    kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
>    BUG: Unable to handle kernel instruction fetch
>    Faulting instruction address: 0xc008000004073278
>    Oops: Kernel access of bad area, sig: 11 [#1]
>    LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>    Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
>    CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
>    Workqueue: events control_work_handler [virtio_console]
>    NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
>    REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
>    MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822 XER: 200400cf
>    ...
>    NIP fill_queue+0xf0/0x210 [virtio_console]
>    LR  fill_queue+0xf0/0x210 [virtio_console]
>    Call Trace:
>      fill_queue+0xb4/0x210 [virtio_console] (unreliable)
>      add_port+0x1a8/0x470 [virtio_console]
>      control_work_handler+0xbc/0x1e8 [virtio_console]
>      process_one_work+0x290/0x590
>      worker_thread+0x88/0x620
>      kthread+0x194/0x1a0
>      ret_from_kernel_thread+0x5c/0x64
> 
> Jordan, Fabiano & Murilo were able to reproduce and identify that the
> problem is caused by the call to module_enable_ro() in do_init_module(),
> which happens after the module's init function has already been called.
> 
> Our current implementation of change_page_attr() is not safe against
> concurrent accesses, because it invalidates the PTE before flushing the
> TLB and then installing the new PTE. That leaves a window in time where
> there is no valid PTE for the page, if another CPU tries to access the
> page at that time we see something like the fault above.
> 
> We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
> code doesn't handle a set_pte_at() of a valid PTE. See [1].
> 
> But we do have pte_update(), which replaces the old PTE with the new,
> meaning there's no window where the PTE is invalid. And the hash MMU
> version hash__pte_update() deals with synchronising the hash page table
> correctly.
> 
> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
> 
> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
> Reported-by: Laurent Vivier <lvivier@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>

> ---
>   arch/powerpc/mm/pageattr.c | 23 ++++++++++-------------
>   1 file changed, 10 insertions(+), 13 deletions(-)
> 
> v2: Use pte_update(..., ~0, pte_val(pte), ...) as suggested by Fabiano,
>      and ptep_get() as suggested by Christophe.
> 
> diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> index 0876216ceee6..edea388e9d3f 100644
> --- a/arch/powerpc/mm/pageattr.c
> +++ b/arch/powerpc/mm/pageattr.c
> @@ -18,16 +18,12 @@
>   /*
>    * Updates the attributes of a page in three steps:
>    *
> - * 1. invalidate the page table entry
> - * 2. flush the TLB
> - * 3. install the new entry with the updated attributes
> - *
> - * Invalidating the pte means there are situations where this will not work
> - * when in theory it should.
> - * For example:
> - * - removing write from page whilst it is being executed
> - * - setting a page read-only whilst it is being read by another CPU
> + * 1. take the page_table_lock
> + * 2. install the new entry with the updated attributes
> + * 3. flush the TLB
>    *
> + * This sequence is safe against concurrent updates, and also allows updating the
> + * attributes of a page currently being executed or accessed.
>    */
>   static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>   {
> @@ -36,9 +32,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>   
>   	spin_lock(&init_mm.page_table_lock);
>   
> -	/* invalidate the PTE so it's safe to modify */
> -	pte = ptep_get_and_clear(&init_mm, addr, ptep);
> -	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	pte = ptep_get(ptep);
>   
>   	/* modify the PTE bits as desired, then apply */
>   	switch (action) {
> @@ -59,11 +53,14 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>   		break;
>   	}
>   
> -	set_pte_at(&init_mm, addr, ptep, pte);
> +	pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
>   
>   	/* See ptesync comment in radix__set_pte_at() */
>   	if (radix_enabled())
>   		asm volatile("ptesync": : :"memory");
> +
> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +
>   	spin_unlock(&init_mm.page_table_lock);
>   
>   	return 0;
> 
> base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30
> 

^ permalink raw reply

* [PATCH v2] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Michael Ellerman @ 2021-08-18 12:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: lvivier, farosas, jniethe5, npiggin, aneesh.kumar

Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

  kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0xc008000004073278
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
  CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
  Workqueue: events control_work_handler [virtio_console]
  NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
  REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
  MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822 XER: 200400cf
  ...
  NIP fill_queue+0xf0/0x210 [virtio_console]
  LR  fill_queue+0xf0/0x210 [virtio_console]
  Call Trace:
    fill_queue+0xb4/0x210 [virtio_console] (unreliable)
    add_port+0x1a8/0x470 [virtio_console]
    control_work_handler+0xbc/0x1e8 [virtio_console]
    process_one_work+0x290/0x590
    worker_thread+0x88/0x620
    kthread+0x194/0x1a0
    ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/

Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/pageattr.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

v2: Use pte_update(..., ~0, pte_val(pte), ...) as suggested by Fabiano,
    and ptep_get() as suggested by Christophe.

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 0876216ceee6..edea388e9d3f 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -18,16 +18,12 @@
 /*
  * Updates the attributes of a page in three steps:
  *
- * 1. invalidate the page table entry
- * 2. flush the TLB
- * 3. install the new entry with the updated attributes
- *
- * Invalidating the pte means there are situations where this will not work
- * when in theory it should.
- * For example:
- * - removing write from page whilst it is being executed
- * - setting a page read-only whilst it is being read by another CPU
+ * 1. take the page_table_lock
+ * 2. install the new entry with the updated attributes
+ * 3. flush the TLB
  *
+ * This sequence is safe against concurrent updates, and also allows updating the
+ * attributes of a page currently being executed or accessed.
  */
 static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
 {
@@ -36,9 +32,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
 
 	spin_lock(&init_mm.page_table_lock);
 
-	/* invalidate the PTE so it's safe to modify */
-	pte = ptep_get_and_clear(&init_mm, addr, ptep);
-	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+	pte = ptep_get(ptep);
 
 	/* modify the PTE bits as desired, then apply */
 	switch (action) {
@@ -59,11 +53,14 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
 		break;
 	}
 
-	set_pte_at(&init_mm, addr, ptep, pte);
+	pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
 
 	/* See ptesync comment in radix__set_pte_at() */
 	if (radix_enabled())
 		asm volatile("ptesync": : :"memory");
+
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
 	spin_unlock(&init_mm.page_table_lock);
 
 	return 0;

base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used
From: Nicholas Piggin @ 2021-08-18 11:41 UTC (permalink / raw)
  To: linuxppc-dev, Madhavan Srinivasan; +Cc: Athira Rajeev
In-Reply-To: <2e3108d7-8d11-d204-c605-fe51cd361586@linux.ibm.com>

Excerpts from Madhavan Srinivasan's message of August 17, 2021 11:06 pm:
> 
> On 8/16/21 12:59 PM, Nicholas Piggin wrote:
>> Interrupt handling code would like to know whether perf is enabled, to
>> know whether it should enable MSR[EE] to improve PMI coverage.
>>
>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/include/asm/hw_irq.h |  2 ++
>>   arch/powerpc/perf/core-book3s.c   | 13 +++++++++++++
>>   2 files changed, 15 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
>> index 21cc571ea9c2..2d5c0d3ccbb6 100644
>> --- a/arch/powerpc/include/asm/hw_irq.h
>> +++ b/arch/powerpc/include/asm/hw_irq.h
>> @@ -306,6 +306,8 @@ static inline bool lazy_irq_pending_nocheck(void)
>>   	return __lazy_irq_pending(local_paca->irq_happened);
>>   }
>>   
>> +bool power_pmu_running(void);
>> +
>>   /*
>>    * This is called by asynchronous interrupts to conditionally
>>    * re-enable hard interrupts after having cleared the source
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index bb0ee716de91..76114a9afb2b 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -2380,6 +2380,19 @@ static void perf_event_interrupt(struct pt_regs *regs)
>>   	perf_sample_event_took(sched_clock() - start_clock);
>>   }
>>   
>> +bool power_pmu_running(void)
>> +{
>> +	struct cpu_hw_events *cpuhw;
>> +
>> +	/* Could this simply test local_paca->pmcregs_in_use? */
>> +
>> +	if (!ppmu)
>> +		return false;
> 
> 
> This covers only when perf platform driver is not registered,
> but we should also check for MMCR0[32], since pmu sprs can be
> accessed via sysfs.

In that case do they actually do anything with the PMI? I don't think it 
should matter hopefully.

But I do think a lot of this stuff could be cleaned up. We have 
pmcs_enabled and ppc_enable_pmcs() in sysfs.c, ppc_set_pmu_inuse(), 
ppc_md.enable_pmcs(), reserve_pmc_hardware(), etc and different users 
call different things. We don't consistently disable either, e.g., we 
never disable the H_PERFMON facility after we stop using perf even 
though it says that slows down partition switch.

I started to have a look at sorting it out but it looks like a big
job so would take a bit of time if we want to do it.

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH] scsi: ibmvfc: Stop using scsi_cmnd.tag
From: John Garry @ 2021-08-18 11:29 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: tyreld, bvanassche, linux-scsi, sfr, jejb, linux-kernel,
	linux-next, paulus, hare, linuxppc-dev, hch
In-Reply-To: <yq14kbnmqoh.fsf@ca-mkp.ca.oracle.com>

Hi Martin,

> 
>> Use scsi_cmd_to_rq(scsi_cmnd)->tag in preference to scsi_cmnd.tag.
> 
> Applied to 5.15/scsi-staging and rebased for bisectability.
> 

Thanks, and sorry for the hassle. But I would still like the maintainers 
to have a look, as I was curious about current usage of scsi_cmnd.tag in 
that driver.

> Just to be picky it looks like there's another scsi_cmmd tag lurking in
> qla1280.c but it's sitting behind an #ifdef DEBUG_QLA1280.
> 

That driver does not even compile with DEBUG_QLA1280 set beforehand. 
I'll fix that up and send as separate patches in case you want to 
shuffle the tag patch in earlier, which is prob not worth the effort.

I've done a good few more x86 randconfigs and tried to audit the code 
for more references, so hopefully that's the last.

Thanks

^ permalink raw reply

* Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()
From: kernel test robot @ 2021-08-18 11:03 UTC (permalink / raw)
  To: Xianting Tian, gregkh, jirislaby, amit, arnd, osandov
  Cc: kbuild-all, Xianting Tian, shile.zhang, linux-kernel,
	virtualization, linuxppc-dev
In-Reply-To: <20210818082122.166881-3-xianting.tian@linux.alibaba.com>

[-- Attachment #1: Type: text/plain, Size: 3013 bytes --]

Hi Xianting,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tty/tty-testing]
[also build test WARNING on char-misc/char-misc-testing soc/for-next v5.14-rc6 next-20210818]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Xianting-Tian/make-hvc-pass-dma-capable-memory-to-its-backend/20210818-162408
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tty-testing
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/e1b7662dafceb07a6905b64da2f1be27498c4a46
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Xianting-Tian/make-hvc-pass-dma-capable-memory-to-its-backend/20210818-162408
        git checkout e1b7662dafceb07a6905b64da2f1be27498c4a46
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=sh 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/tty/hvc/hvc_console.c: In function 'hvc_poll_put_char':
>> drivers/tty/hvc/hvc_console.c:880:55: warning: passing argument 2 of 'hp->ops->put_chars' makes pointer from integer without a cast [-Wint-conversion]
     880 |                 n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
         |                                                     ~~^~~~~~~~
         |                                                       |
         |                                                       char
   drivers/tty/hvc/hvc_console.c:880:55: note: expected 'const char *' but argument is of type 'char'

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for SND_ATMEL_SOC_PDC
   Depends on SOUND && !UML && SND && SND_SOC && SND_ATMEL_SOC && HAS_DMA
   Selected by
   - SND_ATMEL_SOC_SSC && SOUND && !UML && SND && SND_SOC && SND_ATMEL_SOC
   - SND_ATMEL_SOC_SSC_PDC && SOUND && !UML && SND && SND_SOC && SND_ATMEL_SOC && ATMEL_SSC


vim +880 drivers/tty/hvc/hvc_console.c

   870	
   871	static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
   872	{
   873		struct tty_struct *tty = driver->ttys[0];
   874		struct hvc_struct *hp = tty->driver_data;
   875		int n;
   876	
   877		hp->out_ch = ch;
   878	
   879		do {
 > 880			n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
   881		} while (n <= 0);
   882	}
   883	#endif
   884	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 55017 bytes --]

^ permalink raw reply

* [PATCH] powerpc/32: Remove unneccessary calculations in load_up_{fpu/altivec}
From: Christophe Leroy @ 2021-08-18  8:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

No need to re-read SPRN_THREAD, we can calculate thread address
from current (r2).

And remove a reload of value 1 into r4 as r4 is already 1.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/fpu.S    | 3 +--
 arch/powerpc/kernel/vector.S | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 6010adcee16e..ba4afe3b5a9c 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -91,8 +91,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 	isync
 	/* enable use of FP after return */
 #ifdef CONFIG_PPC32
-	mfspr	r5,SPRN_SPRG_THREAD	/* current task's THREAD (phys) */
-	tovirt(r5, r5)
+	addi	r5,r2,THREAD
 	lwz	r4,THREAD_FPEXC_MODE(r5)
 	ori	r9,r9,MSR_FP		/* enable FP for current */
 	or	r9,r9,r4
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index fc120fac1910..ba03eedfdcd8 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -65,9 +65,8 @@ _GLOBAL(load_up_altivec)
 1:
 	/* enable use of VMX after return */
 #ifdef CONFIG_PPC32
-	mfspr	r5,SPRN_SPRG_THREAD		/* current task's THREAD (phys) */
+	addi	r5,r2,THREAD
 	oris	r9,r9,MSR_VEC@h
-	tovirt(r5, r5)
 #else
 	ld	r4,PACACURRENT(r13)
 	addi	r5,r4,THREAD		/* Get THREAD */
@@ -81,7 +80,6 @@ _GLOBAL(load_up_altivec)
 	li	r4,1
 	stb	r4,THREAD_LOAD_VEC(r5)
 	addi	r6,r5,THREAD_VRSTATE
-	li	r4,1
 	li	r10,VRSTATE_VSCR
 	stw	r4,THREAD_USED_VR(r5)
 	lvx	v0,r10,r6
-- 
2.25.0


^ permalink raw reply related

* [PATCH v8 3/3] virtio-console: remove unnecessary kmemdup()
From: Xianting Tian @ 2021-08-18  8:21 UTC (permalink / raw)
  To: gregkh, jirislaby, amit, arnd, osandov
  Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
	virtualization
In-Reply-To: <20210818082122.166881-1-xianting.tian@linux.alibaba.com>

This revert commit c4baad5029 ("virtio-console: avoid DMA from stack")

hvc framework will never pass stack memory to the put_chars() function,
So the calling of kmemdup() is unnecessary, we can remove it.

Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
 drivers/char/virtio_console.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 7eaf303a7..4ed3ffb1d 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count)
 {
 	struct port *port;
 	struct scatterlist sg[1];
-	void *data;
-	int ret;
 
 	if (unlikely(early_put_chars))
 		return early_put_chars(vtermno, buf, count);
@@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count)
 	if (!port)
 		return -EPIPE;
 
-	data = kmemdup(buf, count, GFP_ATOMIC);
-	if (!data)
-		return -ENOMEM;
-
-	sg_init_one(sg, data, count);
-	ret = __send_to_port(port, sg, 1, count, data, false);
-	kfree(data);
-	return ret;
+	sg_init_one(sg, buf, count);
+	return __send_to_port(port, sg, 1, count, (void *)buf, false);
 }
 
 /*
-- 
2.17.1


^ permalink raw reply related

* [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()
From: Xianting Tian @ 2021-08-18  8:21 UTC (permalink / raw)
  To: gregkh, jirislaby, amit, arnd, osandov
  Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
	virtualization
In-Reply-To: <20210818082122.166881-1-xianting.tian@linux.alibaba.com>

As well known, hvc backend driver(eg, virtio-console) can register its
operations to hvc framework. The operations can contain put_chars(),
get_chars() and so on.

Some hvc backend may do dma in its operations. eg, put_chars() of
virtio-console. But in the code of hvc framework, it may pass DMA
incapable memory to put_chars() under a specific configuration, which
is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
1, c[] is on stack,
   hvc_console_print():
	char c[N_OUTBUF] __ALIGNED__;
	cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
   static void hvc_poll_put_char(,,char ch)
   {
	struct tty_struct *tty = driver->ttys[0];
	struct hvc_struct *hp = tty->driver_data;
	int n;

	do {
		n = hp->ops->put_chars(hp->vtermno, &ch, 1);
	} while (n <= 0);
   }

Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
is passed to virtio-console by hvc framework in above code. But I think
the fix is aggressive, it directly uses kmemdup() to alloc new buffer
from kmalloc area and do memcpy no matter the memory is in kmalloc area
or not. But most importantly, it should better be fixed in the hvc
framework, by changing it to never pass stack memory to the put_chars()
function in the first place. Otherwise, we still face the same issue if
a new hvc backend using dma added in the future.

In this patch, we make 'char out_buf[N_OUTBUF]' and 'chat out_ch' part
of 'struct hvc_struct', so both two buf are no longer the stack memory.
we can use it in above two cases separately.

Introduce another array(cons_outbufs[]) for buffer pointers next to
the cons_ops[] and vtermnos[] arrays. With the array, we can easily find
the buffer, instead of traversing hp list.

With the patch, we can remove the fix c4baad5029.

Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
 drivers/tty/hvc/hvc_console.c | 27 ++++++++++++---------------
 drivers/tty/hvc/hvc_console.h | 16 ++++++++++++++--
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..300e9c037 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -41,16 +41,6 @@
  */
 #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */
 
-/*
- * These sizes are most efficient for vio, because they are the
- * native transfer size. We could make them selectable in the
- * future to better deal with backends that want other buffer sizes.
- */
-#define N_OUTBUF	16
-#define N_INBUF		16
-
-#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
-
 static struct tty_driver *hvc_driver;
 static struct task_struct *hvc_task;
 
@@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp)
 static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES];
 static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
 	{[0 ... MAX_NR_HVC_CONSOLES - 1] = -1};
+static char *cons_outbufs[MAX_NR_HVC_CONSOLES];
 
 /*
  * Console APIs, NOT TTY.  These APIs are available immediately when
@@ -151,7 +142,7 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
 static void hvc_console_print(struct console *co, const char *b,
 			      unsigned count)
 {
-	char c[N_OUTBUF] __ALIGNED__;
+	char *c;
 	unsigned i = 0, n = 0;
 	int r, donecr = 0, index = co->index;
 
@@ -163,6 +154,10 @@ static void hvc_console_print(struct console *co, const char *b,
 	if (vtermnos[index] == -1)
 		return;
 
+	c = cons_outbufs[index];
+	if (!c)
+		return;
+
 	while (count > 0 || i > 0) {
 		if (count > 0 && i < sizeof(c)) {
 			if (b[n] == '\n' && !donecr) {
@@ -879,8 +874,10 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
 	struct hvc_struct *hp = tty->driver_data;
 	int n;
 
+	hp->out_ch = ch;
+
 	do {
-		n = hp->ops->put_chars(hp->vtermno, &ch, 1);
+		n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
 	} while (n <= 0);
 }
 #endif
@@ -922,8 +919,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
 			return ERR_PTR(err);
 	}
 
-	hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
-			GFP_KERNEL);
+	hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL);
 	if (!hp)
 		return ERR_PTR(-ENOMEM);
 
@@ -931,7 +927,6 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
 	hp->data = data;
 	hp->ops = ops;
 	hp->outbuf_size = outbuf_size;
-	hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
 
 	tty_port_init(&hp->port);
 	hp->port.ops = &hvc_port_ops;
@@ -964,6 +959,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
 	if (i < MAX_NR_HVC_CONSOLES) {
 		cons_ops[i] = ops;
 		vtermnos[i] = vtermno;
+		cons_outbufs[i] = hp->out_buf;
 	}
 
 	list_add_tail(&(hp->next), &hvc_structs);
@@ -988,6 +984,7 @@ int hvc_remove(struct hvc_struct *hp)
 	if (hp->index < MAX_NR_HVC_CONSOLES) {
 		vtermnos[hp->index] = -1;
 		cons_ops[hp->index] = NULL;
+		cons_outbufs[hp->index] = NULL;
 	}
 
 	/* Don't whack hp->irq because tty_hangup() will need to free the irq. */
diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
index 18d005814..b94576d55 100644
--- a/drivers/tty/hvc/hvc_console.h
+++ b/drivers/tty/hvc/hvc_console.h
@@ -32,13 +32,21 @@
  */
 #define HVC_ALLOC_TTY_ADAPTERS	8
 
+/*
+ * These sizes are most efficient for vio, because they are the
+ * native transfer size. We could make them selectable in the
+ * future to better deal with backends that want other buffer sizes.
+ */
+#define N_OUTBUF	16
+#define N_INBUF		16
+
+#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
+
 struct hvc_struct {
 	struct tty_port port;
 	spinlock_t lock;
 	int index;
 	int do_wakeup;
-	char *outbuf;
-	int outbuf_size;
 	int n_outbuf;
 	uint32_t vtermno;
 	const struct hv_ops *ops;
@@ -48,6 +56,10 @@ struct hvc_struct {
 	struct work_struct tty_resize;
 	struct list_head next;
 	unsigned long flags;
+	char out_ch;
+	char out_buf[N_OUTBUF] __ALIGNED__;
+	int outbuf_size;
+	char outbuf[0] __ALIGNED__;
 };
 
 /* implemented by a low level driver */
-- 
2.17.1


^ permalink raw reply related

* [PATCH v8 0/3] make hvc pass dma capable memory to its backend
From: Xianting Tian @ 2021-08-18  8:21 UTC (permalink / raw)
  To: gregkh, jirislaby, amit, arnd, osandov
  Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
	virtualization

Dear all,

This patch series make hvc framework pass DMA capable memory to
put_chars() of hvc backend(eg, virtio-console), and revert commit
c4baad5029 ("virtio-console: avoid DMA from stack”)

V1
virtio-console: avoid DMA from vmalloc area
https://lkml.org/lkml/2021/7/27/494

For v1 patch, Arnd Bergmann suggests to fix the issue in the first
place:
Make hvc pass DMA capable memory to put_chars()
The fix suggestion is included in v2.

V2
[PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/1/8
[PATCH 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/1/9

For v2 patch, Arnd Bergmann suggests to make new buf part of the
hvc_struct structure, and fix the compile issue.
The fix suggestion is included in v3.

V3
[PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/3/1347
[PATCH v3 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/3/1348

For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of
hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to
calculate the size of hvc_struct. The fix suggestion is included in
v4.

V4
[PATCH v4 0/2] make hvc pass dma capable memory to its backend
https://lkml.org/lkml/2021/8/5/1350
[PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/5/1351
[PATCH v4 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/5/1352

For v4 patch, Arnd Bergmann suggests to introduce another
array(cons_outbuf[]) for the buffer pointers next to the cons_ops[]
and vtermnos[] arrays. This fix included in this v5 patch.

V5
Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment,
use 'sizeof(long)' as dma alignment is wrong. fix it in v6.

V6
It contains coding error, fix it in v7 and it worked normally
according to test result.

V7
Greg KH suggests to add test and code review developer,
Jiri Slaby suggests to use lockless buffer and fix dma alignment
in separate patch.
fix above things in v8. 

drivers/tty/hvc/hvc_console.c | 27 ++++++++++++---------------
drivers/tty/hvc/hvc_console.h | 16 ++++++++++++++--
drivers/tty/hvc/hvc_console.h | 16 ++++++++++++--
3 file changed

^ permalink raw reply

* [PATCH v8 1/3] tty: hvc: use correct dma alignment size
From: Xianting Tian @ 2021-08-18  8:21 UTC (permalink / raw)
  To: gregkh, jirislaby, amit, arnd, osandov
  Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
	virtualization
In-Reply-To: <20210818082122.166881-1-xianting.tian@linux.alibaba.com>

Use L1_CACHE_BYTES as the dma alignment size, use 'sizeof(long)'
is wrong.

Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
 drivers/tty/hvc/hvc_console.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..5957ab728 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -49,7 +49,7 @@
 #define N_OUTBUF	16
 #define N_INBUF		16
 
-#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
+#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
 
 static struct tty_driver *hvc_driver;
 static struct task_struct *hvc_task;
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Michael Ellerman @ 2021-08-18  7:46 UTC (permalink / raw)
  To: Fabiano Rosas, linuxppc-dev; +Cc: lvivier, jniethe5, npiggin, aneesh.kumar
In-Reply-To: <87sfz8tam3.fsf@linux.ibm.com>

Fabiano Rosas <farosas@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>
> Hi, I already mentioned these things in private, but I'll post here so
> everyone can see:
>
>> Because pte_update() takes the set of PTE bits to set and clear we can't
>> use our existing helpers, eg. pte_wrprotect() etc. and instead have to
>> open code the set of flags. We will clean that up somehow in a future
>> commit.
>
> I tested the following on P9 and it seems to work fine. Not sure if it
> works for CONFIG_PPC_8xx, though.
>
>
>  static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>  {
>  	long action = (long)data;
>  	pte_t pte;
>  
>  	spin_lock(&init_mm.page_table_lock);
> -
> -	/* invalidate the PTE so it's safe to modify */
> -	pte = ptep_get_and_clear(&init_mm, addr, ptep);
> -	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	pte = *ptep;
>  
>  	/* modify the PTE bits as desired, then apply */
>  	switch (action) {
> @@ -59,11 +42,9 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>  		break;
>  	}
>  
> -	set_pte_at(&init_mm, addr, ptep, pte);
> +	pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);

I avoided that because it's not atomic, but pte_update() is not atomic
on some platforms anyway. And for now at least we still have the
page_table_lock as well.

So you're right that's a nicer way to do it.

And I'll use ptep_get() as Christophe suggested.

> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>  
> -	/* See ptesync comment in radix__set_pte_at() */
> -	if (radix_enabled())
> -		asm volatile("ptesync": : :"memory");

I didn't do that because I wanted to keep the patch minimal. We can do
that as a separate patch.

>  	spin_unlock(&init_mm.page_table_lock);
>  
>  	return 0;
> ---
>
> For reference, the full patch is here:
> https://github.com/farosas/linux/commit/923c95c84d7081d7be9503bf5b276dd93bd17036.patch
>
>>
>> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
>>
>> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
>> Reported-by: Laurent Vivier <lvivier@redhat.com>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
>
> ...
>
>> -	set_pte_at(&init_mm, addr, ptep, pte);
>> +	pte_update(&init_mm, addr, ptep, clear, set, 0);
>>  
>>  	/* See ptesync comment in radix__set_pte_at() */
>>  	if (radix_enabled())
>>  		asm volatile("ptesync": : :"memory");
>> +
>> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>
> I think there's an optimization possible here, when relaxing access, to
> skip the TLB flush. Would still need the ptesync though. Similar to what
> Nick did in e5f7cb58c2b7 ("powerpc/64s/radix: do not flush TLB when
> relaxing access").

That commit is specific to Radix, whereas this code needs to work on all
platforms.

We'd need to verify that it's safe to skip the flush on all platforms
and CPU versions.

What I think we can do, and would possibly be a more meaningful
optimisation, is to move the TLB flush out of the loop and up into
change_memory_attr(). So we just do it once for the range, rather than
per page. But that too would be a separate patch.

cheers

^ permalink raw reply

* [PATCH] sched/topology: Skip updating masks for non-online nodes
From: Srikar Dronamraju @ 2021-08-18  7:43 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Nathan Lynch, Gautham R Shenoy, Vincent Guittot,
	Srikar Dronamraju, Rik van Riel, linuxppc-dev,
	Geetika Moolchandani, LKML, Dietmar Eggemann, Thomas Gleixner,
	Laurent Dufour, Mel Gorman, Valentin Schneider

From: Valentin Schneider <valentin.schneider@arm.com>

The scheduler currently expects NUMA node distances to be stable from
init onwards, and as a consequence builds the related data structures
once-and-for-all at init (see sched_init_numa()).

Unfortunately, on some architectures node distance is unreliable for
offline nodes and may very well change upon onlining.

Skip over offline nodes during sched_init_numa(). Track nodes that have
been onlined at least once, and trigger a build of a node's NUMA masks
when it is first onlined post-init.

Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
Changelog:
[Fixed warning: no previous prototype for '__sched_domains_numa_masks_set']
 kernel/sched/topology.c | 65 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b77ad49dc14f..4e8698e62f07 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1482,6 +1482,8 @@ int				sched_max_numa_distance;
 static int			*sched_domains_numa_distance;
 static struct cpumask		***sched_domains_numa_masks;
 int __read_mostly		node_reclaim_distance = RECLAIM_DISTANCE;
+
+static unsigned long __read_mostly *sched_numa_onlined_nodes;
 #endif
 
 /*
@@ -1833,6 +1835,16 @@ void sched_init_numa(void)
 			sched_domains_numa_masks[i][j] = mask;
 
 			for_each_node(k) {
+				/*
+				 * Distance information can be unreliable for
+				 * offline nodes, defer building the node
+				 * masks to its bringup.
+				 * This relies on all unique distance values
+				 * still being visible at init time.
+				 */
+				if (!node_online(j))
+					continue;
+
 				if (sched_debug() && (node_distance(j, k) != node_distance(k, j)))
 					sched_numa_warn("Node-distance not symmetric");
 
@@ -1886,6 +1898,53 @@ void sched_init_numa(void)
 	sched_max_numa_distance = sched_domains_numa_distance[nr_levels - 1];
 
 	init_numa_topology_type();
+
+	sched_numa_onlined_nodes = bitmap_alloc(nr_node_ids, GFP_KERNEL);
+	if (!sched_numa_onlined_nodes)
+		return;
+
+	bitmap_zero(sched_numa_onlined_nodes, nr_node_ids);
+	for_each_online_node(i)
+		bitmap_set(sched_numa_onlined_nodes, i, 1);
+}
+
+static void __sched_domains_numa_masks_set(unsigned int node)
+{
+	int i, j;
+
+	/*
+	 * NUMA masks are not built for offline nodes in sched_init_numa().
+	 * Thus, when a CPU of a never-onlined-before node gets plugged in,
+	 * adding that new CPU to the right NUMA masks is not sufficient: the
+	 * masks of that CPU's node must also be updated.
+	 */
+	if (test_bit(node, sched_numa_onlined_nodes))
+		return;
+
+	bitmap_set(sched_numa_onlined_nodes, node, 1);
+
+	for (i = 0; i < sched_domains_numa_levels; i++) {
+		for (j = 0; j < nr_node_ids; j++) {
+			if (!node_online(j) || node == j)
+				continue;
+
+			if (node_distance(j, node) > sched_domains_numa_distance[i])
+				continue;
+
+			/* Add remote nodes in our masks */
+			cpumask_or(sched_domains_numa_masks[i][node],
+				   sched_domains_numa_masks[i][node],
+				   sched_domains_numa_masks[0][j]);
+		}
+	}
+
+	/*
+	 * A new node has been brought up, potentially changing the topology
+	 * classification.
+	 *
+	 * Note that this is racy vs any use of sched_numa_topology_type :/
+	 */
+	init_numa_topology_type();
 }
 
 void sched_domains_numa_masks_set(unsigned int cpu)
@@ -1893,8 +1952,14 @@ void sched_domains_numa_masks_set(unsigned int cpu)
 	int node = cpu_to_node(cpu);
 	int i, j;
 
+	__sched_domains_numa_masks_set(node);
+
 	for (i = 0; i < sched_domains_numa_levels; i++) {
 		for (j = 0; j < nr_node_ids; j++) {
+			if (!node_online(j))
+				continue;
+
+			/* Set ourselves in the remote node's masks */
 			if (node_distance(j, node) <= sched_domains_numa_distance[i])
 				cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
 		}

-- 
2.18.2


^ permalink raw reply related

* [PATCH v2] powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP
From: Christophe Leroy @ 2021-08-18  6:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, userm57,
	fthain
  Cc: linuxppc-dev, linux-kernel

Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
removed the 'isync' instruction after adding/removing NX bit in user
segments. The reasoning behind this change was that when setting the
NX bit we don't mind it taking effect with delay as the kernel never
executes text from userspace, and when clearing the NX bit this is
to return to userspace and then the 'rfi' should synchronise the
context.

However, it looks like on book3s/32 having a hash page table, at least
on the G3 processor, we get an unexpected fault from userspace, then
this is followed by something wrong in the verification of MSR_PR
at end of another interrupt.

This is fixed by adding back the removed isync() following update
of NX bit in user segment registers. Only do it for cores with an
hash table, as 603 cores don't exhibit that problem and the two isync
increase ./null_syscall selftest by 6 cycles on an MPC 832x.

First problem: unexpected WARN_ON() for mysterious PROTFAULT

	[   62.896426] WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0
	[   62.918111] Modules linked in:
	[   62.923350] CPU: 0 PID: 1660 Comm: Xorg Not tainted 5.13.0-pmac-00028-gb3c15b60339a #40
	[   62.943476] NIP:  c001b5c8 LR: c001b6f8 CTR: 00000000
	[   62.954714] REGS: e2d09e40 TRAP: 0700   Not tainted  (5.13.0-pmac-00028-gb3c15b60339a)
	[   62.974581] MSR:  00021032 <ME,IR,DR,RI>  CR: 42d04f30  XER: 20000000
	[   62.990009]
        	       GPR00: c000424c e2d09f00 c301b680 e2d09f40 0000001e 42000000 00cba028 00000000
        	       GPR08: 08000000 48000010 c301b680 e2d09f30 22d09f30 00c1fff0 00cba000 a7b7ba4c
        	       GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
        	       GPR24: a7b7b64c a7b7d2f0 00000004 00000000 c1efa6c0 00cba02c 00000300 e2d09f40
	[   63.075238] NIP [c001b5c8] do_page_fault+0x6c/0x5b0
	[   63.085952] LR [c001b6f8] do_page_fault+0x19c/0x5b0
	[   63.096678] Call Trace:
	[   63.100075] [e2d09f00] [e2d09f04] 0xe2d09f04 (unreliable)
	[   63.112359] [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4
	[   63.125168] --- interrupt: 300 at 0xa7a261dc
	[   63.134060] NIP:  a7a261dc LR: a7a253bc CTR: 00000000
	[   63.145302] REGS: e2d09f40 TRAP: 0300   Not tainted  (5.13.0-pmac-00028-gb3c15b60339a)
	[   63.165167] MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 228428e2  XER: 20000000
	[   63.182162] DAR: 00cba02c DSISR: 42000000
        	       GPR00: a7a27448 afa6b0e0 a74c35c0 a7b7b614 0000001e a7b7b614 00cba028 00000000
        	       GPR08: 00020fd9 00000031 00cb9ff8 a7a273b0 220028e2 00c1fff0 00cba000 a7b7ba4c
        	       GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
        	       GPR24: a7b7b64c a7b7d2f0 00000004 00000002 0000001e a7b7b614 a7b7aff4 00000030
	[   63.275233] NIP [a7a261dc] 0xa7a261dc
	[   63.282291] LR [a7a253bc] 0xa7a253bc
	[   63.289087] --- interrupt: 300
	[   63.294322] Instruction dump:
	[   63.299291] 7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 551e0036 4082038c
	[   63.318630] 2e1b0000 40920228 75280800 41820220 <0fe00000> 3b600000 41920214 81420594
	[   63.338503] ---[ end trace f642a84639cba377 ]---

Second problem: MSR PR is seen unset allthough the interrupt frame shows it set

	[   63.666846] kernel BUG at arch/powerpc/kernel/interrupt.c:458!
	[   63.680633] Oops: Exception in kernel mode, sig: 5 [#1]
	[   63.692201] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
	[   63.705011] Modules linked in:
	[   63.710247] CPU: 0 PID: 1660 Comm: Xorg Tainted: G        W         5.13.0-pmac-00028-gb3c15b60339a #40
	[   63.734553] NIP:  c0011434 LR: c001629c CTR: 00000000
	[   63.745796] REGS: e2d09e70 TRAP: 0700   Tainted: G        W          (5.13.0-pmac-00028-gb3c15b60339a)
	[   63.769844] MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 42d09f30  XER: 00000000
	[   63.786052]
        	       GPR00: 00000000 e2d09f30 c301b680 e2d09f40 83440000 c44d0e68 e2d09e8c 00000000
        	       GPR08: 00000002 00dc228a 00004000 e2d09f30 22d09f30 00c1fff0 afa6ceb4 00c26144
        	       GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
        	       GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
	[   63.871284] NIP [c0011434] interrupt_exit_kernel_prepare+0x10/0x70
	[   63.885922] LR [c001629c] interrupt_return+0x9c/0x144
	[   63.897168] Call Trace:
	[   63.900562] [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 (unreliable)
	[   63.916773] --- interrupt: 300 at 0xa09be008
	[   63.925659] NIP:  a09be008 LR: a09bdfe8 CTR: a09bdfc0
	[   63.936903] REGS: e2d09f40 TRAP: 0300   Tainted: G        W          (5.13.0-pmac-00028-gb3c15b60339a)
	[   63.960953] MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 420028e2  XER: 20000000
	[   63.977948] DAR: a539a308 DSISR: 0a000000
        	       GPR00: a7b90d50 afa6b2d0 a74c35c0 a0a8b690 a0a8b698 a5365d70 a4fa82a8 00000004
        	       GPR08: 00000000 a09bdfc0 00000000 a5360000 a09bde7c 00c1fff0 afa6ceb4 00c26144
        	       GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
        	       GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
	[   64.071020] NIP [a09be008] 0xa09be008
	[   64.078079] LR [a09bdfe8] 0xa09bdfe8
	[   64.084874] --- interrupt: 300
	[   64.090108] Instruction dump:
	[   64.095074] 80010024 83e1001c 7c0803a6 4bffff80 3bc00800 4bffffd0 486b42fd 4bffffcc
	[   64.114419] 81430084 71480002 41820038 554a0462 <0f0a0000> 80620060 74630001 40820034
	[   64.134298] ---[ end trace f642a84639cba378 ]---

Reported-by: Stan Johnson <userm57@yahoo.com>
Fixes: b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v2: Added comments to explain why we have a 'isync' for 604+ cores only.
---
 arch/powerpc/include/asm/book3s/32/kup.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 64201125a287..d4b145b279f6 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -4,6 +4,8 @@
 
 #include <asm/bug.h>
 #include <asm/book3s/32/mmu-hash.h>
+#include <asm/mmu.h>
+#include <asm/synch.h>
 
 #ifndef __ASSEMBLY__
 
@@ -28,6 +30,15 @@ static inline void kuep_lock(void)
 		return;
 
 	update_user_segments(mfsr(0) | SR_NX);
+	/*
+	 * This isync() shouldn't be necessary as the kernel is not excepted to
+	 * run any instruction in userspace soon after the update of segments,
+	 * but hash based cores (at least G3) seem to exhibit a random
+	 * behaviour when the 'isync' is not there. 603 cores don't have this
+	 * behaviour so don't do the 'isync' as it saves several CPU cycles.
+	 */
+	if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
+		isync();	/* Context sync required after mtsr() */
 }
 
 static inline void kuep_unlock(void)
@@ -36,6 +47,15 @@ static inline void kuep_unlock(void)
 		return;
 
 	update_user_segments(mfsr(0) & ~SR_NX);
+	/*
+	 * This isync() shouldn't be necessary as a 'rfi' will soon be executed
+	 * to return to userspace, but hash based cores (at least G3) seem to
+	 * exhibit a random behaviour when the 'isync' is not there. 603 cores
+	 * don't have this behaviour so don't do the 'isync' as it saves several
+	 * CPU cycles.
+	 */
+	if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
+		isync();	/* Context sync required after mtsr() */
 }
 
 #ifdef CONFIG_PPC_KUAP
-- 
2.25.0


^ permalink raw reply related

* Re: [PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow
From: Christophe Leroy @ 2021-08-18  6:42 UTC (permalink / raw)
  To: Kees Cook, linux-kernel
  Cc: Rasmus Villemoes, clang-built-linux, Greg Kroah-Hartman,
	Wang Wensheng, linux-staging, linux-wireless, Gustavo A. R. Silva,
	Qinglang Miao, linux-block, Hulk Robot, dri-devel, netdev,
	Andrew Morton, linuxppc-dev, linux-kbuild, linux-hardening
In-Reply-To: <20210818060533.3569517-62-keescook@chromium.org>



Le 18/08/2021 à 08:05, Kees Cook a écrit :
> In preparation for FORTIFY_SOURCE performing compile-time and run-time
> field bounds checking for memset(), avoid intentionally writing across
> neighboring fields.
> 
> Instead of writing across a field boundary with memset(), move the call
> to just the array, and an explicit zeroing of the prior field.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Qinglang Miao <miaoqinglang@huawei.com>
> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
> Cc: Hulk Robot <hulkci@huawei.com>
> Cc: Wang Wensheng <wangwensheng4@huawei.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
> Link: https://lore.kernel.org/lkml/87czqsnmw9.fsf@mpe.ellerman.id.au
> ---
>   drivers/macintosh/smu.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
> index 94fb63a7b357..59ce431da7ef 100644
> --- a/drivers/macintosh/smu.c
> +++ b/drivers/macintosh/smu.c
> @@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
>   	cmd->read = cmd->info.devaddr & 0x01;
>   	switch(cmd->info.type) {
>   	case SMU_I2C_TRANSFER_SIMPLE:
> -		memset(&cmd->info.sublen, 0, 4);
> +		cmd->info.sublen = 0;
> +		memset(&cmd->info.subaddr, 0, 3);

subaddr[] is a table, should the & be avoided ?
And while at it, why not use sizeof(subaddr) instead of 3 ?


>   		break;
>   	case SMU_I2C_TRANSFER_COMBINED:
>   		cmd->info.devaddr &= 0xfe;
> 

^ permalink raw reply

* [PATCH] ASoC: fsl_rpmsg: Check -EPROBE_DEFER for getting clocks
From: Shengjiu Wang @ 2021-08-18  6:03 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
	alsa-devel
  Cc: linuxppc-dev, linux-kernel

The devm_clk_get() may return -EPROBE_DEFER, then clocks
will be assigned to NULL wrongly. As the clocks are
optional so we can use devm_clk_get_optional() instead of
devm_clk_get().

Fixes: b73d9e6225e8 ("ASoC: fsl_rpmsg: Add CPU DAI driver for audio base on rpmsg")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 sound/soc/fsl/fsl_rpmsg.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/sound/soc/fsl/fsl_rpmsg.c b/sound/soc/fsl/fsl_rpmsg.c
index ea5c973e2e84..d60f4dac6c1b 100644
--- a/sound/soc/fsl/fsl_rpmsg.c
+++ b/sound/soc/fsl/fsl_rpmsg.c
@@ -165,25 +165,25 @@ static int fsl_rpmsg_probe(struct platform_device *pdev)
 	}
 
 	/* Get the optional clocks */
-	rpmsg->ipg = devm_clk_get(&pdev->dev, "ipg");
+	rpmsg->ipg = devm_clk_get_optional(&pdev->dev, "ipg");
 	if (IS_ERR(rpmsg->ipg))
-		rpmsg->ipg = NULL;
+		return PTR_ERR(rpmsg->ipg);
 
-	rpmsg->mclk = devm_clk_get(&pdev->dev, "mclk");
+	rpmsg->mclk = devm_clk_get_optional(&pdev->dev, "mclk");
 	if (IS_ERR(rpmsg->mclk))
-		rpmsg->mclk = NULL;
+		return PTR_ERR(rpmsg->mclk);
 
-	rpmsg->dma = devm_clk_get(&pdev->dev, "dma");
+	rpmsg->dma = devm_clk_get_optional(&pdev->dev, "dma");
 	if (IS_ERR(rpmsg->dma))
-		rpmsg->dma = NULL;
+		return PTR_ERR(rpmsg->dma);
 
-	rpmsg->pll8k = devm_clk_get(&pdev->dev, "pll8k");
+	rpmsg->pll8k = devm_clk_get_optional(&pdev->dev, "pll8k");
 	if (IS_ERR(rpmsg->pll8k))
-		rpmsg->pll8k = NULL;
+		return PTR_ERR(rpmsg->pll8k);
 
-	rpmsg->pll11k = devm_clk_get(&pdev->dev, "pll11k");
+	rpmsg->pll11k = devm_clk_get_optional(&pdev->dev, "pll11k");
 	if (IS_ERR(rpmsg->pll11k))
-		rpmsg->pll11k = NULL;
+		return PTR_ERR(rpmsg->pll11k);
 
 	platform_set_drvdata(pdev, rpmsg);
 	pm_runtime_enable(&pdev->dev);
-- 
2.27.0


^ permalink raw reply related

* [PATCH v2 57/63] powerpc/signal32: Use struct_group() to zero spe regs
From: Kees Cook @ 2021-08-18  6:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rasmus Villemoes, Greg Kroah-Hartman, linux-staging,
	linux-wireless, Gustavo A. R. Silva, linux-hardening, linux-block,
	clang-built-linux, netdev, Paul Mackerras, dri-devel,
	Sudeep Holla, Andrew Morton, linuxppc-dev, linux-kbuild,
	kernel test robot
In-Reply-To: <20210818060533.3569517-1-keescook@chromium.org>

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Add a struct_group() for the spe registers so that memset() can correctly reason
about the size:

   In function 'fortify_memset_chk',
       inlined from 'restore_user_regs.part.0' at arch/powerpc/kernel/signal_32.c:539:3:
>> include/linux/fortify-string.h:195:4: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
     195 |    __write_overflow_field();
         |    ^~~~~~~~~~~~~~~~~~~~~~~~

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/powerpc/include/asm/processor.h | 6 ++++--
 arch/powerpc/kernel/signal_32.c      | 6 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index f348e564f7dd..05dc567cb9a8 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -191,8 +191,10 @@ struct thread_struct {
 	int		used_vsr;	/* set if process has used VSX */
 #endif /* CONFIG_VSX */
 #ifdef CONFIG_SPE
-	unsigned long	evr[32];	/* upper 32-bits of SPE regs */
-	u64		acc;		/* Accumulator */
+	struct_group(spe,
+		unsigned long	evr[32];	/* upper 32-bits of SPE regs */
+		u64		acc;		/* Accumulator */
+	);
 	unsigned long	spefscr;	/* SPE & eFP status */
 	unsigned long	spefscr_last;	/* SPEFSCR value on last prctl
 					   call or trap return */
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 0608581967f0..77b86caf5c51 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -532,11 +532,11 @@ static long restore_user_regs(struct pt_regs *regs,
 	regs_set_return_msr(regs, regs->msr & ~MSR_SPE);
 	if (msr & MSR_SPE) {
 		/* restore spe registers from the stack */
-		unsafe_copy_from_user(current->thread.evr, &sr->mc_vregs,
-				      ELF_NEVRREG * sizeof(u32), failed);
+		unsafe_copy_from_user(&current->thread.spe, &sr->mc_vregs,
+				      sizeof(current->thread.spe), failed);
 		current->thread.used_spe = true;
 	} else if (current->thread.used_spe)
-		memset(current->thread.evr, 0, ELF_NEVRREG * sizeof(u32));
+		memset(&current->thread.spe, 0, sizeof(current->thread.spe));
 
 	/* Always get SPEFSCR back */
 	unsafe_get_user(current->thread.spefscr, (u32 __user *)&sr->mc_vregs + ELF_NEVRREG, failed);
-- 
2.30.2


^ permalink raw reply related

* [PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow
From: Kees Cook @ 2021-08-18  6:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kees Cook, Rasmus Villemoes, Greg Kroah-Hartman, Wang Wensheng,
	linux-staging, linux-wireless, Gustavo A. R. Silva, Qinglang Miao,
	linux-block, Hulk Robot, clang-built-linux, netdev, dri-devel,
	Andrew Morton, linuxppc-dev, linux-kbuild, linux-hardening
In-Reply-To: <20210818060533.3569517-1-keescook@chromium.org>

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing across a field boundary with memset(), move the call
to just the array, and an explicit zeroing of the prior field.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Qinglang Miao <miaoqinglang@huawei.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Hulk Robot <hulkci@huawei.com>
Cc: Wang Wensheng <wangwensheng4@huawei.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/87czqsnmw9.fsf@mpe.ellerman.id.au
---
 drivers/macintosh/smu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
index 94fb63a7b357..59ce431da7ef 100644
--- a/drivers/macintosh/smu.c
+++ b/drivers/macintosh/smu.c
@@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
 	cmd->read = cmd->info.devaddr & 0x01;
 	switch(cmd->info.type) {
 	case SMU_I2C_TRANSFER_SIMPLE:
-		memset(&cmd->info.sublen, 0, 4);
+		cmd->info.sublen = 0;
+		memset(&cmd->info.subaddr, 0, 3);
 		break;
 	case SMU_I2C_TRANSFER_COMBINED:
 		cmd->info.devaddr &= 0xfe;
-- 
2.30.2


^ permalink raw reply related

* [PATCH v2 36/63] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp
From: Kees Cook @ 2021-08-18  6:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tyrel Datwyler, Kees Cook, Martin K. Petersen, Greg Kroah-Hartman,
	linux-block, James E.J. Bottomley, linux-staging, linux-wireless,
	Gustavo A. R. Silva, dri-devel, linux-scsi, clang-built-linux,
	netdev, Paul Mackerras, linux-hardening, Andrew Morton,
	linuxppc-dev, Rasmus Villemoes, linux-kbuild
In-Reply-To: <20210818060533.3569517-1-keescook@chromium.org>

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the
upper union (evt_struct->iu.srp) instead, as that's what is being wiped.

Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/lkml/yq135rzp79c.fsf@ca-mkp.ca.oracle.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695359@linux.ibm.com
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 50df7dd9cb91..ea8e01f49cba 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -1055,8 +1055,9 @@ static int ibmvscsi_queuecommand_lck(struct scsi_cmnd *cmnd,
 		return SCSI_MLQUEUE_HOST_BUSY;
 
 	/* Set up the actual SRP IU */
+	BUILD_BUG_ON(sizeof(evt_struct->iu.srp) != SRP_MAX_IU_LEN);
+	memset(&evt_struct->iu.srp, 0x00, sizeof(evt_struct->iu.srp));
 	srp_cmd = &evt_struct->iu.srp.cmd;
-	memset(srp_cmd, 0x00, SRP_MAX_IU_LEN);
 	srp_cmd->opcode = SRP_CMD;
 	memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(srp_cmd->cdb));
 	int_to_scsilun(lun, &srp_cmd->lun);
-- 
2.30.2


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox