stable.vger.kernel.org archive mirror
* [PATCH 6.4 00/28] 6.4.1-rc1 review
@ 2023-06-29 18:43 Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 01/28] x86/microcode/AMD: Load late on both threads too Greg Kroah-Hartman
                   ` (28 more replies)
  0 siblings, 29 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, linux-kernel, torvalds, akpm, linux,
	shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

This is the start of the stable review cycle for the 6.4.1 release.
There are 28 patches in this series, all of which will be posted as a
response to this one.  If anyone has any issues with these being applied,
please let me know.

Responses should be made by Sat, 01 Jul 2023 18:41:39 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.1-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 6.4.1-rc1

Ricardo Cañuelo <ricardo.canuelo@collabora.com>
    Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"

Mike Hommey <mh@glandium.org>
    HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.

Ludvig Michaelsson <ludvig.michaelsson@yubico.com>
    HID: hidraw: fix data race on device refcount

Zhang Shurong <zhang_shurong@foxmail.com>
    fbdev: fix potential OOB read in fast_imageblit()

Hugh Dickins <hughd@google.com>
    mm/khugepaged: fix regression in collapse_file()

Linus Torvalds <torvalds@linux-foundation.org>
    gup: add warning if some caller would seem to want stack expansion

Jason Gerecke <jason.gerecke@wacom.com>
    HID: wacom: Use ktime_t rather than int when dealing with timestamps

Linus Torvalds <torvalds@linux-foundation.org>
    mm: always expand the stack with the mmap write lock held

Linus Torvalds <torvalds@linux-foundation.org>
    execve: expand new process stack manually ahead of time

Liam R. Howlett <Liam.Howlett@oracle.com>
    mm: make find_extend_vma() fail if write lock not held

Linus Torvalds <torvalds@linux-foundation.org>
    powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma()

Linus Torvalds <torvalds@linux-foundation.org>
    mm/fault: convert remaining simple cases to lock_mm_and_find_vma()

Ben Hutchings <ben@decadent.org.uk>
    arm/mm: Convert to using lock_mm_and_find_vma()

Ben Hutchings <ben@decadent.org.uk>
    riscv/mm: Convert to using lock_mm_and_find_vma()

Ben Hutchings <ben@decadent.org.uk>
    mips/mm: Convert to using lock_mm_and_find_vma()

Michael Ellerman <mpe@ellerman.id.au>
    powerpc/mm: Convert to using lock_mm_and_find_vma()

Linus Torvalds <torvalds@linux-foundation.org>
    arm64/mm: Convert to using lock_mm_and_find_vma()

Linus Torvalds <torvalds@linux-foundation.org>
    mm: make the page fault mmap locking killable

Linus Torvalds <torvalds@linux-foundation.org>
    mm: introduce new 'lock_mm_and_find_vma()' page fault helper

Peng Zhang <zhangpeng.00@bytedance.com>
    maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()

Oliver Hartkopp <socketcan@hartkopp.net>
    can: isotp: isotp_sendmsg(): fix return error fix on TX path

Wyes Karny <wyes.karny@amd.com>
    cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated

Thomas Gleixner <tglx@linutronix.de>
    x86/smp: Cure kexec() vs. mwait_play_dead() breakage

Thomas Gleixner <tglx@linutronix.de>
    x86/smp: Use dedicated cache-line for mwait_play_dead()

Thomas Gleixner <tglx@linutronix.de>
    x86/smp: Remove pointless wmb()s from native_stop_other_cpus()

Tony Battersby <tonyb@cybernetics.com>
    x86/smp: Dont access non-existing CPUID leaf

Thomas Gleixner <tglx@linutronix.de>
    x86/smp: Make stop_other_cpus() more robust

Borislav Petkov (AMD) <bp@alien8.de>
    x86/microcode/AMD: Load late on both threads too


-------------

Diffstat:

 Makefile                                  |   4 +-
 arch/alpha/Kconfig                        |   1 +
 arch/alpha/mm/fault.c                     |  13 +--
 arch/arc/Kconfig                          |   1 +
 arch/arc/mm/fault.c                       |  11 +--
 arch/arm/Kconfig                          |   1 +
 arch/arm/mm/fault.c                       |  63 ++++-----------
 arch/arm64/Kconfig                        |   1 +
 arch/arm64/mm/fault.c                     |  47 ++---------
 arch/csky/Kconfig                         |   1 +
 arch/csky/mm/fault.c                      |  22 ++----
 arch/hexagon/Kconfig                      |   1 +
 arch/hexagon/mm/vm_fault.c                |  18 +----
 arch/ia64/mm/fault.c                      |  36 ++-------
 arch/loongarch/Kconfig                    |   1 +
 arch/loongarch/mm/fault.c                 |  16 ++--
 arch/m68k/mm/fault.c                      |   9 ++-
 arch/microblaze/mm/fault.c                |   5 +-
 arch/mips/Kconfig                         |   1 +
 arch/mips/mm/fault.c                      |  12 +--
 arch/nios2/Kconfig                        |   1 +
 arch/nios2/mm/fault.c                     |  17 +---
 arch/openrisc/mm/fault.c                  |   5 +-
 arch/parisc/mm/fault.c                    |  23 +++---
 arch/powerpc/Kconfig                      |   1 +
 arch/powerpc/mm/copro_fault.c             |  14 +---
 arch/powerpc/mm/fault.c                   |  39 +--------
 arch/riscv/Kconfig                        |   1 +
 arch/riscv/mm/fault.c                     |  31 +++-----
 arch/s390/mm/fault.c                      |   5 +-
 arch/sh/Kconfig                           |   1 +
 arch/sh/mm/fault.c                        |  17 +---
 arch/sparc/Kconfig                        |   1 +
 arch/sparc/mm/fault_32.c                  |  32 ++------
 arch/sparc/mm/fault_64.c                  |   8 +-
 arch/um/kernel/trap.c                     |  11 +--
 arch/x86/Kconfig                          |   1 +
 arch/x86/include/asm/cpu.h                |   2 +
 arch/x86/include/asm/smp.h                |   2 +
 arch/x86/kernel/cpu/microcode/amd.c       |   2 +-
 arch/x86/kernel/process.c                 |  28 ++++++-
 arch/x86/kernel/smp.c                     |  73 ++++++++++-------
 arch/x86/kernel/smpboot.c                 |  81 ++++++++++++++++---
 arch/x86/mm/fault.c                       |  52 +-----------
 arch/xtensa/Kconfig                       |   1 +
 arch/xtensa/mm/fault.c                    |  14 +---
 drivers/cpufreq/amd-pstate.c              |   2 +-
 drivers/hid/hid-logitech-hidpp.c          |   2 +-
 drivers/hid/hidraw.c                      |   9 ++-
 drivers/hid/wacom_wac.c                   |   6 +-
 drivers/hid/wacom_wac.h                   |   2 +-
 drivers/iommu/amd/iommu_v2.c              |   4 +-
 drivers/iommu/iommu-sva.c                 |   2 +-
 drivers/thermal/mediatek/auxadc_thermal.c |  14 +---
 drivers/video/fbdev/core/sysimgblt.c      |   2 +-
 fs/binfmt_elf.c                           |   6 +-
 fs/exec.c                                 |  38 +++++----
 include/linux/mm.h                        |  16 ++--
 lib/maple_tree.c                          |  11 +--
 mm/Kconfig                                |   4 +
 mm/gup.c                                  |  14 +++-
 mm/khugepaged.c                           |   7 +-
 mm/memory.c                               | 127 ++++++++++++++++++++++++++++++
 mm/mmap.c                                 | 121 ++++++++++++++++++++++++----
 mm/nommu.c                                |  17 ++--
 net/can/isotp.c                           |   5 +-
 66 files changed, 605 insertions(+), 531 deletions(-)




* [PATCH 6.4 01/28] x86/microcode/AMD: Load late on both threads too
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 02/28] x86/smp: Make stop_other_cpus() more robust Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Borislav Petkov (AMD), stable

From: Borislav Petkov (AMD) <bp@alien8.de>

commit a32b0f0db3f396f1c9be2fe621e77c09ec3d8e7d upstream.

Do the same as early loading - load on both threads.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230605141332.25948-1-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/microcode/amd.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -705,7 +705,7 @@ static enum ucode_state apply_microcode_
 	rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
 
 	/* need to apply patch? */
-	if (rev >= mc_amd->hdr.patch_id) {
+	if (rev > mc_amd->hdr.patch_id) {
 		ret = UCODE_OK;
 		goto out;
 	}




* [PATCH 6.4 02/28] x86/smp: Make stop_other_cpus() more robust
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 01/28] x86/microcode/AMD: Load late on both threads too Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 03/28] x86/smp: Dont access non-existing CPUID leaf Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Tony Battersby, Thomas Gleixner,
	Borislav Petkov (AMD), Ashok Raj

From: Thomas Gleixner <tglx@linutronix.de>

commit 1f5e7eb7868e42227ac426c96d437117e6e06e8e upstream.

Tony reported intermittent lockups on poweroff. His analysis identified the
wbinvd() in stop_this_cpu() as the culprit. This was added to ensure that
on SME enabled machines a kexec() does not leave any stale data in the
caches when switching from encrypted to non-encrypted mode or vice versa.

That wbinvd() is conditional on the SME feature bit which is read directly
from CPUID. But that readout does not check whether the CPUID leaf is
available or not. If it's not available, the CPU will return the value of
the highest supported leaf instead. Depending on the content, the "SME" bit
might or might not be set.

That's incorrect but harmless. Making the CPUID readout conditional makes
the observed hangs go away, but it does not fix the underlying problem:

CPU0					CPU1

 stop_other_cpus()
   send_IPIs(REBOOT);			stop_this_cpu()
   while (num_online_cpus() > 1);         set_online(false);
   proceed... -> hang
				          wbinvd()

WBINVD is an expensive operation and if multiple CPUs issue it at the same
time the resulting delays are even larger.

But CPU0 has already observed num_online_cpus() going down to 1 and
proceeds, which causes the system to hang.

This issue exists independent of WBINVD, but the delays caused by WBINVD
make it more prominent.

Make this more robust by adding a cpumask which is initialized to the
online CPU mask before sending the IPIs; CPUs then clear their bit in
stop_this_cpu() after the WBINVD has completed. Check for that cpumask
becoming empty in stop_other_cpus() instead of watching num_online_cpus().

The cpumask cannot plug all holes either, but it's better than a raw
counter and allows restricting the NMI fallback IPI to only the CPUs
which have not reported in within the timeout window.
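
A condensed sketch of the resulting handshake, using the cpus_stop_mask,
REBOOT_VECTOR and stop_this_cpu() names from the patch below (the NMI
fallback and the final cpumask_clear() are left out):

	/* Initiating CPU: record which CPUs still have to stop. */
	unsigned long timeout = USEC_PER_SEC;

	cpumask_copy(&cpus_stop_mask, cpu_online_mask);
	cpumask_clear_cpu(smp_processor_id(), &cpus_stop_mask);

	apic_send_IPI_allbutself(REBOOT_VECTOR);

	/*
	 * Wait for the mask to drain instead of watching num_online_cpus(),
	 * which drops before the stopping CPUs have finished wbinvd().
	 */
	while (!cpumask_empty(&cpus_stop_mask) && timeout--)
		udelay(1);

	/* Stopping CPU, in stop_this_cpu(): report only after wbinvd(). */
	native_wbinvd();
	cpumask_clear_cpu(smp_processor_id(), &cpus_stop_mask);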

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Reported-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/3817d810-e0f1-8ef8-0bbd-663b919ca49b@cybernetics.com
Link: https://lore.kernel.org/r/87h6r770bv.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/cpu.h |    2 +
 arch/x86/kernel/process.c  |   23 +++++++++++++++-
 arch/x86/kernel/smp.c      |   62 +++++++++++++++++++++++++++++----------------
 3 files changed, 64 insertions(+), 23 deletions(-)

--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -98,4 +98,6 @@ extern u64 x86_read_arch_cap_msr(void);
 int intel_find_matching_signature(void *mc, unsigned int csig, int cpf);
 int intel_microcode_sanity_check(void *mc, bool print_err, int hdr_type);
 
+extern struct cpumask cpus_stop_mask;
+
 #endif /* _ASM_X86_CPU_H */
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -759,13 +759,23 @@ bool xen_set_default_idle(void)
 }
 #endif
 
+struct cpumask cpus_stop_mask;
+
 void __noreturn stop_this_cpu(void *dummy)
 {
+	unsigned int cpu = smp_processor_id();
+
 	local_irq_disable();
+
 	/*
-	 * Remove this CPU:
+	 * Remove this CPU from the online mask and disable it
+	 * unconditionally. This might be redundant in case that the reboot
+	 * vector was handled late and stop_other_cpus() sent an NMI.
+	 *
+	 * According to SDM and APM NMIs can be accepted even after soft
+	 * disabling the local APIC.
 	 */
-	set_cpu_online(smp_processor_id(), false);
+	set_cpu_online(cpu, false);
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
 
@@ -783,6 +793,15 @@ void __noreturn stop_this_cpu(void *dumm
 	 */
 	if (cpuid_eax(0x8000001f) & BIT(0))
 		native_wbinvd();
+
+	/*
+	 * This brings a cache line back and dirties it, but
+	 * native_stop_other_cpus() will overwrite cpus_stop_mask after it
+	 * observed that all CPUs reported stop. This write will invalidate
+	 * the related cache line on this CPU.
+	 */
+	cpumask_clear_cpu(cpu, &cpus_stop_mask);
+
 	for (;;) {
 		/*
 		 * Use native_halt() so that memory contents don't change
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -27,6 +27,7 @@
 #include <asm/mmu_context.h>
 #include <asm/proto.h>
 #include <asm/apic.h>
+#include <asm/cpu.h>
 #include <asm/idtentry.h>
 #include <asm/nmi.h>
 #include <asm/mce.h>
@@ -146,31 +147,43 @@ static int register_stop_handler(void)
 
 static void native_stop_other_cpus(int wait)
 {
-	unsigned long flags;
-	unsigned long timeout;
+	unsigned int cpu = smp_processor_id();
+	unsigned long flags, timeout;
 
 	if (reboot_force)
 		return;
 
-	/*
-	 * Use an own vector here because smp_call_function
-	 * does lots of things not suitable in a panic situation.
-	 */
+	/* Only proceed if this is the first CPU to reach this code */
+	if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
+		return;
 
 	/*
-	 * We start by using the REBOOT_VECTOR irq.
-	 * The irq is treated as a sync point to allow critical
-	 * regions of code on other cpus to release their spin locks
-	 * and re-enable irqs.  Jumping straight to an NMI might
-	 * accidentally cause deadlocks with further shutdown/panic
-	 * code.  By syncing, we give the cpus up to one second to
-	 * finish their work before we force them off with the NMI.
+	 * 1) Send an IPI on the reboot vector to all other CPUs.
+	 *
+	 *    The other CPUs should react on it after leaving critical
+	 *    sections and re-enabling interrupts. They might still hold
+	 *    locks, but there is nothing which can be done about that.
+	 *
+	 * 2) Wait for all other CPUs to report that they reached the
+	 *    HLT loop in stop_this_cpu()
+	 *
+	 * 3) If #2 timed out send an NMI to the CPUs which did not
+	 *    yet report
+	 *
+	 * 4) Wait for all other CPUs to report that they reached the
+	 *    HLT loop in stop_this_cpu()
+	 *
+	 * #3 can obviously race against a CPU reaching the HLT loop late.
+	 * That CPU will have reported already and the "have all CPUs
+	 * reached HLT" condition will be true despite the fact that the
+	 * other CPU is still handling the NMI. Again, there is no
+	 * protection against that as "disabled" APICs still respond to
+	 * NMIs.
 	 */
-	if (num_online_cpus() > 1) {
-		/* did someone beat us here? */
-		if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
-			return;
+	cpumask_copy(&cpus_stop_mask, cpu_online_mask);
+	cpumask_clear_cpu(cpu, &cpus_stop_mask);
 
+	if (!cpumask_empty(&cpus_stop_mask)) {
 		/* sync above data before sending IRQ */
 		wmb();
 
@@ -183,12 +196,12 @@ static void native_stop_other_cpus(int w
 		 * CPUs reach shutdown state.
 		 */
 		timeout = USEC_PER_SEC;
-		while (num_online_cpus() > 1 && timeout--)
+		while (!cpumask_empty(&cpus_stop_mask) && timeout--)
 			udelay(1);
 	}
 
 	/* if the REBOOT_VECTOR didn't work, try with the NMI */
-	if (num_online_cpus() > 1) {
+	if (!cpumask_empty(&cpus_stop_mask)) {
 		/*
 		 * If NMI IPI is enabled, try to register the stop handler
 		 * and send the IPI. In any case try to wait for the other
@@ -200,7 +213,8 @@ static void native_stop_other_cpus(int w
 
 			pr_emerg("Shutting down cpus with NMI\n");
 
-			apic_send_IPI_allbutself(NMI_VECTOR);
+			for_each_cpu(cpu, &cpus_stop_mask)
+				apic->send_IPI(cpu, NMI_VECTOR);
 		}
 		/*
 		 * Don't wait longer than 10 ms if the caller didn't
@@ -208,7 +222,7 @@ static void native_stop_other_cpus(int w
 		 * one or more CPUs do not reach shutdown state.
 		 */
 		timeout = USEC_PER_MSEC * 10;
-		while (num_online_cpus() > 1 && (wait || timeout--))
+		while (!cpumask_empty(&cpus_stop_mask) && (wait || timeout--))
 			udelay(1);
 	}
 
@@ -216,6 +230,12 @@ static void native_stop_other_cpus(int w
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
 	local_irq_restore(flags);
+
+	/*
+	 * Ensure that the cpus_stop_mask cache lines are invalidated on
+	 * the other CPUs. See comment vs. SME in stop_this_cpu().
+	 */
+	cpumask_clear(&cpus_stop_mask);
 }
 
 /*




* [PATCH 6.4 03/28] x86/smp: Dont access non-existing CPUID leaf
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 01/28] x86/microcode/AMD: Load late on both threads too Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 02/28] x86/smp: Make stop_other_cpus() more robust Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 04/28] x86/smp: Remove pointless wmb()s from native_stop_other_cpus() Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Tony Battersby, Thomas Gleixner,
	Mario Limonciello, Borislav Petkov (AMD)

From: Tony Battersby <tonyb@cybernetics.com>

commit 9b040453d4440659f33dc6f0aa26af418ebfe70b upstream.

stop_this_cpu() tests CPUID leaf 0x8000001f::EAX unconditionally. Intel
CPUs return the content of the highest supported leaf when a non-existing
leaf is read, while AMD CPUs return all zeros for unsupported leaves.

So the result of the test on Intel CPUs is a lottery.

While harmless, it's incorrect and causes the conditional wbinvd() to be
issued where not required.

Check whether the leaf is supported before reading it.
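
For illustration, the check boils down to validating the highest supported
extended leaf before reading leaf 0x8000001f; the patch below uses the
cached c->extended_cpuid_level value instead of re-reading leaf 0x80000000:

	/* CPUID leaf 0x80000000 reports the highest extended leaf. */
	if (cpuid_eax(0x80000000) >= 0x8000001f &&
	    (cpuid_eax(0x8000001f) & BIT(0)))	/* SME supported */
		native_wbinvd();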

[ tglx: Adjusted changelog ]

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/3817d810-e0f1-8ef8-0bbd-663b919ca49b@cybernetics.com
Link: https://lore.kernel.org/r/20230615193330.322186388@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/process.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -763,6 +763,7 @@ struct cpumask cpus_stop_mask;
 
 void __noreturn stop_this_cpu(void *dummy)
 {
+	struct cpuinfo_x86 *c = this_cpu_ptr(&cpu_info);
 	unsigned int cpu = smp_processor_id();
 
 	local_irq_disable();
@@ -777,7 +778,7 @@ void __noreturn stop_this_cpu(void *dumm
 	 */
 	set_cpu_online(cpu, false);
 	disable_local_APIC();
-	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
+	mcheck_cpu_clear(c);
 
 	/*
 	 * Use wbinvd on processors that support SME. This provides support
@@ -791,7 +792,7 @@ void __noreturn stop_this_cpu(void *dumm
 	 * Test the CPUID bit directly because the machine might've cleared
 	 * X86_FEATURE_SME due to cmdline options.
 	 */
-	if (cpuid_eax(0x8000001f) & BIT(0))
+	if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
 		native_wbinvd();
 
 	/*




* [PATCH 6.4 04/28] x86/smp: Remove pointless wmb()s from native_stop_other_cpus()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 03/28] x86/smp: Dont access non-existing CPUID leaf Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 05/28] x86/smp: Use dedicated cache-line for mwait_play_dead() Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Thomas Gleixner,
	Borislav Petkov (AMD)

From: Thomas Gleixner <tglx@linutronix.de>

commit 2affa6d6db28855e6340b060b809c23477aa546e upstream.

The wmb()s before sending the IPIs are not synchronizing anything.

If anything, the APIC IPI functions themselves have to provide, or act as,
the appropriate barriers.

Remove these cargo-cult barriers, which come with no explanation of what
they are supposed to synchronize.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.378358382@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/smp.c |    6 ------
 1 file changed, 6 deletions(-)

--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -184,9 +184,6 @@ static void native_stop_other_cpus(int w
 	cpumask_clear_cpu(cpu, &cpus_stop_mask);
 
 	if (!cpumask_empty(&cpus_stop_mask)) {
-		/* sync above data before sending IRQ */
-		wmb();
-
 		apic_send_IPI_allbutself(REBOOT_VECTOR);
 
 		/*
@@ -208,9 +205,6 @@ static void native_stop_other_cpus(int w
 		 * CPUs to stop.
 		 */
 		if (!smp_no_nmi_ipi && !register_stop_handler()) {
-			/* Sync above data before sending IRQ */
-			wmb();
-
 			pr_emerg("Shutting down cpus with NMI\n");
 
 			for_each_cpu(cpu, &cpus_stop_mask)




* [PATCH 6.4 05/28] x86/smp: Use dedicated cache-line for mwait_play_dead()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 04/28] x86/smp: Remove pointless wmb()s from native_stop_other_cpus() Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 06/28] x86/smp: Cure kexec() vs. mwait_play_dead() breakage Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Thomas Gleixner, Ashok Raj,
	Borislav Petkov (AMD)

From: Thomas Gleixner <tglx@linutronix.de>

commit f9c9987bf52f4e42e940ae217333ebb5a4c3b506 upstream.

Monitoring idletask::thread_info::flags in mwait_play_dead() has been an
obvious choice, as all that is needed is a cache line which is not written
to by other CPUs.

But there is a use case where a "dead" CPU needs to be brought out of
MWAIT: kexec().

This is required as kexec() can overwrite the text, page tables, stacks and
the monitored cacheline of the original kernel. The latter causes MWAIT to
resume execution, which obviously wreaks havoc on the kexec kernel and
usually results in triple faults.

Use a dedicated per CPU storage to prepare for that.
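
A minimal sketch of the idea, using the struct and helpers added by the
patch below; the important property is that the monitored address is
per-CPU data which nothing else writes to:

	struct mwait_cpu_dead {
		unsigned int	control;
		unsigned int	status;
	};
	static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);

	/* In mwait_play_dead(), with eax holding the MWAIT C-state hint: */
	struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);

	clflush(md);		/* flush the line so MONITOR re-arms cleanly */
	__monitor(md, 0, 0);	/* monitor this CPU's own cache line */
	__mwait(eax, 0);	/* sleep until that line is written */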

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.434553750@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/smpboot.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -101,6 +101,17 @@ EXPORT_PER_CPU_SYMBOL(cpu_die_map);
 DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
 
+struct mwait_cpu_dead {
+	unsigned int	control;
+	unsigned int	status;
+};
+
+/*
+ * Cache line aligned data for mwait_play_dead(). Separate on purpose so
+ * that it's unlikely to be touched by other CPUs.
+ */
+static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
+
 /* Logical package management. We might want to allocate that dynamically */
 unsigned int __max_logical_packages __read_mostly;
 EXPORT_SYMBOL(__max_logical_packages);
@@ -1758,10 +1769,10 @@ EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
  */
 static inline void mwait_play_dead(void)
 {
+	struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);
 	unsigned int eax, ebx, ecx, edx;
 	unsigned int highest_cstate = 0;
 	unsigned int highest_subcstate = 0;
-	void *mwait_ptr;
 	int i;
 
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
@@ -1796,13 +1807,6 @@ static inline void mwait_play_dead(void)
 			(highest_subcstate - 1);
 	}
 
-	/*
-	 * This should be a memory location in a cache line which is
-	 * unlikely to be touched by other processors.  The actual
-	 * content is immaterial as it is not actually modified in any way.
-	 */
-	mwait_ptr = &current_thread_info()->flags;
-
 	wbinvd();
 
 	while (1) {
@@ -1814,9 +1818,9 @@ static inline void mwait_play_dead(void)
 		 * case where we return around the loop.
 		 */
 		mb();
-		clflush(mwait_ptr);
+		clflush(md);
 		mb();
-		__monitor(mwait_ptr, 0, 0);
+		__monitor(md, 0, 0);
 		mb();
 		__mwait(eax, 0);
 




* [PATCH 6.4 06/28] x86/smp: Cure kexec() vs. mwait_play_dead() breakage
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 05/28] x86/smp: Use dedicated cache-line for mwait_play_dead() Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 07/28] cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Ashok Raj, Thomas Gleixner

From: Thomas Gleixner <tglx@linutronix.de>

commit d7893093a7417527c0d73c9832244e65c9d0114f upstream.

TLDR: It's a mess.

When kexec() is executed on a system with offline CPUs, which are parked in
mwait_play_dead(), it can end up in a triple fault during the bootup of the
kexec kernel or cause hard-to-diagnose data corruption.

The reason is that kexec() eventually overwrites the previous kernel's text,
page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernel's data.

Cure this by bringing the offlined CPUs out of MWAIT into HLT.

Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offlined CPUs to issue
HLT, which does not have the MWAIT problem.

That does not help if a stray NMI, MCE or SMI hits the offlined CPUs, as
those make them come out of HLT.

A follow up change will put them into INIT, which protects at least against
NMI and SMI.
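
Condensed from the patch below, the handshake between the kexec path and a
parked CPU looks roughly like this (the status handshake's 5ms retry loop
is omitted):

	/* kexec path: wake each parked CPU by writing its monitored line. */
	for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask)
		WRITE_ONCE(per_cpu_ptr(&mwait_cpu_dead, cpu)->control,
			   CPUDEAD_MWAIT_KEXEC_HLT);

	/* Parked CPU: MWAIT returns because its monitored line was written */
	if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
		WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
		for (;;)
			native_halt();	/* park in HLT instead of MWAIT */
	}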

Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/smp.h |    2 +
 arch/x86/kernel/smp.c      |    5 +++
 arch/x86/kernel/smpboot.c  |   59 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+)

--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
 int wbinvd_on_all_cpus(void);
 void cond_wakeup_cpu0(void);
 
+void smp_kick_mwait_play_dead(void);
+
 void native_smp_send_reschedule(int cpu);
 void native_send_call_func_ipi(const struct cpumask *mask);
 void native_send_call_func_single_ipi(int cpu);
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
 #include <linux/interrupt.h>
 #include <linux/cpu.h>
 #include <linux/gfp.h>
+#include <linux/kexec.h>
 
 #include <asm/mtrr.h>
 #include <asm/tlbflush.h>
@@ -157,6 +158,10 @@ static void native_stop_other_cpus(int w
 	if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
 		return;
 
+	/* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+	if (kexec_in_progress)
+		smp_kick_mwait_play_dead();
+
 	/*
 	 * 1) Send an IPI on the reboot vector to all other CPUs.
 	 *
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -53,6 +53,7 @@
 #include <linux/tboot.h>
 #include <linux/gfp.h>
 #include <linux/cpuidle.h>
+#include <linux/kexec.h>
 #include <linux/numa.h>
 #include <linux/pgtable.h>
 #include <linux/overflow.h>
@@ -106,6 +107,9 @@ struct mwait_cpu_dead {
 	unsigned int	status;
 };
 
+#define CPUDEAD_MWAIT_WAIT	0xDEADBEEF
+#define CPUDEAD_MWAIT_KEXEC_HLT	0x4A17DEAD
+
 /*
  * Cache line aligned data for mwait_play_dead(). Separate on purpose so
  * that it's unlikely to be touched by other CPUs.
@@ -173,6 +177,10 @@ static void smp_callin(void)
 {
 	int cpuid;
 
+	/* Mop up eventual mwait_play_dead() wreckage */
+	this_cpu_write(mwait_cpu_dead.status, 0);
+	this_cpu_write(mwait_cpu_dead.control, 0);
+
 	/*
 	 * If waken up by an INIT in an 82489DX configuration
 	 * cpu_callout_mask guarantees we don't get here before
@@ -1807,6 +1815,10 @@ static inline void mwait_play_dead(void)
 			(highest_subcstate - 1);
 	}
 
+	/* Set up state for the kexec() hack below */
+	md->status = CPUDEAD_MWAIT_WAIT;
+	md->control = CPUDEAD_MWAIT_WAIT;
+
 	wbinvd();
 
 	while (1) {
@@ -1824,10 +1836,57 @@ static inline void mwait_play_dead(void)
 		mb();
 		__mwait(eax, 0);
 
+		if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
+			/*
+			 * Kexec is about to happen. Don't go back into mwait() as
+			 * the kexec kernel might overwrite text and data including
+			 * page tables and stack. So mwait() would resume when the
+			 * monitor cache line is written to and then the CPU goes
+			 * south due to overwritten text, page tables and stack.
+			 *
+			 * Note: This does _NOT_ protect against a stray MCE, NMI,
+			 * SMI. They will resume execution at the instruction
+			 * following the HLT instruction and run into the problem
+			 * which this is trying to prevent.
+			 */
+			WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
+			while(1)
+				native_halt();
+		}
+
 		cond_wakeup_cpu0();
 	}
 }
 
+/*
+ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
+ * mwait_play_dead().
+ */
+void smp_kick_mwait_play_dead(void)
+{
+	u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
+	struct mwait_cpu_dead *md;
+	unsigned int cpu, i;
+
+	for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
+		md = per_cpu_ptr(&mwait_cpu_dead, cpu);
+
+		/* Does it sit in mwait_play_dead() ? */
+		if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
+			continue;
+
+		/* Wait up to 5ms */
+		for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
+			/* Bring it out of mwait */
+			WRITE_ONCE(md->control, newstate);
+			udelay(5);
+		}
+
+		if (READ_ONCE(md->status) != newstate)
+			pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
+	}
+}
+
 void __noreturn hlt_play_dead(void)
 {
 	if (__this_cpu_read(cpu_info.x86) >= 4)




* [PATCH 6.4 07/28] cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 06/28] x86/smp: Cure kexec() vs. mwait_play_dead() breakage Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 08/28] can: isotp: isotp_sendmsg(): fix return error fix on TX path Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Gautham R. Shenoy, Wyes Karny,
	Huang Rui, Perry Yuan, Rafael J. Wysocki

From: Wyes Karny <wyes.karny@amd.com>

commit f4aad639302a07454dcb23b408dcadf8a9efb031 upstream.

The amd-pstate passive mode driver name is hyphenated. So make the
amd-pstate active mode driver name consistent with that by renaming
"amd_pstate_epp" to "amd-pstate-epp".

Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/cpufreq/amd-pstate.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1356,7 +1356,7 @@ static struct cpufreq_driver amd_pstate_
 	.online		= amd_pstate_epp_cpu_online,
 	.suspend	= amd_pstate_epp_suspend,
 	.resume		= amd_pstate_epp_resume,
-	.name		= "amd_pstate_epp",
+	.name		= "amd-pstate-epp",
 	.attr		= amd_pstate_epp_attr,
 };
 




* [PATCH 6.4 08/28] can: isotp: isotp_sendmsg(): fix return error fix on TX path
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 07/28] cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 09/28] maple_tree: fix potential out-of-bounds access in mas_wr_end_piv() Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Carsten Schmidt, Oliver Hartkopp,
	Marc Kleine-Budde

From: Oliver Hartkopp <socketcan@hartkopp.net>

commit e38910c0072b541a91954682c8b074a93e57c09b upstream.

Commit d674a8f123b4 ("can: isotp: isotp_sendmsg(): fix return error on
FC timeout on TX path") introduced the missing correct return value in
the case of a protocol error.

But the way the error value is read and sent to user space does not
follow the common scheme of clearing the error after reading it, which
is what the sock_error() function provides. This leads to an error
report on the following write() attempt although everything should be
working.
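
For reference, sock_error() reads and clears sk->sk_err in one step and
returns a negative errno (simplified; the real helper lives in
include/net/sock.h), so the TX path can simply do:

	err = sock_error(sk);	/* reads and clears sk->sk_err */
	if (err)
		return err;	/* already a negative errno */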

Fixes: d674a8f123b4 ("can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path")
Reported-by: Carsten Schmidt <carsten.schmidt-achim@t-online.de>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/all/20230607072708.38809-1-socketcan@hartkopp.net
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/can/isotp.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1112,8 +1112,9 @@ wait_free_buffer:
 		if (err)
 			goto err_event_drop;
 
-		if (sk->sk_err)
-			return -sk->sk_err;
+		err = sock_error(sk);
+		if (err)
+			return err;
 	}
 
 	return size;




* [PATCH 6.4 09/28] maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 08/28] can: isotp: isotp_sendmsg(): fix return error fix on TX path Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 10/28] mm: introduce new lock_mm_and_find_vma() page fault helper Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Peng Zhang, Liam R. Howlett,
	Andrew Morton

From: Peng Zhang <zhangpeng.00@bytedance.com>

commit cd00dd2585c4158e81fdfac0bbcc0446afbad26d upstream.

Check the write offset end bounds before using it as the offset into the
pivot array.  This avoids a possible out-of-bounds access on the pivot
array if the write extends to the last slot in the node, in which case the
node maximum should be used as the end pivot.

akpm: this doesn't affect any current callers, but new users of maple tree
may encounter this problem if their code is backported into earlier kernels,
so let's fix it in -stable kernels just in case.

Link: https://lkml.kernel.org/r/20230506024752.2550-1-zhangpeng.00@bytedance.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 lib/maple_tree.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4263,11 +4263,13 @@ done:
 
 static inline void mas_wr_end_piv(struct ma_wr_state *wr_mas)
 {
-	while ((wr_mas->mas->last > wr_mas->end_piv) &&
-	       (wr_mas->offset_end < wr_mas->node_end))
-		wr_mas->end_piv = wr_mas->pivots[++wr_mas->offset_end];
+	while ((wr_mas->offset_end < wr_mas->node_end) &&
+	       (wr_mas->mas->last > wr_mas->pivots[wr_mas->offset_end]))
+		wr_mas->offset_end++;
 
-	if (wr_mas->mas->last > wr_mas->end_piv)
+	if (wr_mas->offset_end < wr_mas->node_end)
+		wr_mas->end_piv = wr_mas->pivots[wr_mas->offset_end];
+	else
 		wr_mas->end_piv = wr_mas->mas->max;
 }
 
@@ -4424,7 +4426,6 @@ static inline void *mas_wr_store_entry(s
 	}
 
 	/* At this point, we are at the leaf node that needs to be altered. */
-	wr_mas->end_piv = wr_mas->r_max;
 	mas_wr_end_piv(wr_mas);
 
 	if (!wr_mas->entry)




* [PATCH 6.4 10/28] mm: introduce new lock_mm_and_find_vma() page fault helper
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 09/28] maple_tree: fix potential out-of-bounds access in mas_wr_end_piv() Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:43 ` [PATCH 6.4 11/28] mm: make the page fault mmap locking killable Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Linus Torvalds

From: Linus Torvalds <torvalds@linux-foundation.org>

commit c2508ec5a58db67093f4fb8bf89a9a7c53a109e9 upstream.

.. and make x86 use it.

This basically extracts the existing x86 "find and expand faulting vma"
code, but extends it to also take the mmap lock for writing in case we
actually do need to expand the vma.

We've historically short-circuited that case, and have some rather ugly
special logic to serialize the stack segment expansion (since we only
hold the mmap lock for reading) that doesn't match the normal VM
locking.

That slight violation of locking worked well, right up until it didn't:
the maple tree code really does want proper locking even for simple
extension of an existing vma.

So extract the code for "look up the vma of the fault" from x86, fix it
up to do the necessary write locking, and make it available as a helper
function for other architectures that can use the common helper.

Note: I say "common helper", but it really only handles the normal
stack-grows-down case.  Which is all architectures except for PA-RISC
and IA64.  So some rare architectures can't use the helper, but if they
care they'll just need to open-code this logic.

It's also worth pointing out that this code really would like to have an
optimistic "mmap_upgrade_trylock()" to make it quicker to go from a
read-lock (for the common case) to taking the write lock (for having to
extend the vma) in the normal single-threaded situation where there is
no other locking activity.

But that _is_ all the very uncommon special case, so while it would be
nice to have such an operation, it probably doesn't matter in reality.
I did put in the skeleton code for such a possible future expansion,
even if it only acts as pseudo-documentation for what we're doing.
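
A sketch of what a converted architecture fault handler ends up looking
like; the error-path names vary per architecture, and the arm64, powerpc,
mips and other conversions later in this series follow this shape:

	vma = lock_mm_and_find_vma(mm, address, regs);
	if (unlikely(!vma)) {
		/* The mmap lock is not held here on failure. */
		bad_area_nosemaphore(regs, error_code, address);
		return;
	}

	/* mmap lock held for reading; vma covers 'address'. */
	fault = handle_mm_fault(vma, address, flags, regs);
	/* ... fault retry / signal handling ... */
	mmap_read_unlock(mm);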

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/Kconfig    |    1 
 arch/x86/mm/fault.c |   52 ----------------------
 include/linux/mm.h  |    2 
 mm/Kconfig          |    4 +
 mm/memory.c         |  121 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 130 insertions(+), 50 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -276,6 +276,7 @@ config X86
 	select HAVE_GENERIC_VDSO
 	select HOTPLUG_SMT			if SMP
 	select IRQ_FORCED_THREADING
+	select LOCK_MM_AND_FIND_VMA
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK
 	select NEED_PER_CPU_PAGE_FIRST_CHUNK
 	select NEED_SG_DMA_LENGTH
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -880,12 +880,6 @@ __bad_area(struct pt_regs *regs, unsigne
 	__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
 }
 
-static noinline void
-bad_area(struct pt_regs *regs, unsigned long error_code, unsigned long address)
-{
-	__bad_area(regs, error_code, address, 0, SEGV_MAPERR);
-}
-
 static inline bool bad_area_access_from_pkeys(unsigned long error_code,
 		struct vm_area_struct *vma)
 {
@@ -1366,51 +1360,10 @@ void do_user_addr_fault(struct pt_regs *
 lock_mmap:
 #endif /* CONFIG_PER_VMA_LOCK */
 
-	/*
-	 * Kernel-mode access to the user address space should only occur
-	 * on well-defined single instructions listed in the exception
-	 * tables.  But, an erroneous kernel fault occurring outside one of
-	 * those areas which also holds mmap_lock might deadlock attempting
-	 * to validate the fault against the address space.
-	 *
-	 * Only do the expensive exception table search when we might be at
-	 * risk of a deadlock.  This happens if we
-	 * 1. Failed to acquire mmap_lock, and
-	 * 2. The access did not originate in userspace.
-	 */
-	if (unlikely(!mmap_read_trylock(mm))) {
-		if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
-			/*
-			 * Fault from code in kernel from
-			 * which we do not expect faults.
-			 */
-			bad_area_nosemaphore(regs, error_code, address);
-			return;
-		}
 retry:
-		mmap_read_lock(mm);
-	} else {
-		/*
-		 * The above down_read_trylock() might have succeeded in
-		 * which case we'll have missed the might_sleep() from
-		 * down_read():
-		 */
-		might_sleep();
-	}
-
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (unlikely(!vma)) {
-		bad_area(regs, error_code, address);
-		return;
-	}
-	if (likely(vma->vm_start <= address))
-		goto good_area;
-	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
-		bad_area(regs, error_code, address);
-		return;
-	}
-	if (unlikely(expand_stack(vma, address))) {
-		bad_area(regs, error_code, address);
+		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
 
@@ -1418,7 +1371,6 @@ retry:
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
 	 */
-good_area:
 	if (unlikely(access_error(error_code, vma))) {
 		bad_area_access_error(regs, error_code, address, vma);
 		return;
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2325,6 +2325,8 @@ void unmap_mapping_pages(struct address_
 		pgoff_t start, pgoff_t nr, bool even_cows);
 void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows);
+struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
+		unsigned long address, struct pt_regs *regs);
 #else
 static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
 					 unsigned long address, unsigned int flags,
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1206,6 +1206,10 @@ config PER_VMA_LOCK
 	  This feature allows locking each virtual memory area separately when
 	  handling page faults instead of taking mmap_lock.
 
+config LOCK_MM_AND_FIND_VMA
+	bool
+	depends on !STACK_GROWSUP
+
 source "mm/damon/Kconfig"
 
 endmenu
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5262,6 +5262,127 @@ out:
 }
 EXPORT_SYMBOL_GPL(handle_mm_fault);
 
+#ifdef CONFIG_LOCK_MM_AND_FIND_VMA
+#include <linux/extable.h>
+
+static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
+{
+	/* Even if this succeeds, make it clear we *might* have slept */
+	if (likely(mmap_read_trylock(mm))) {
+		might_sleep();
+		return true;
+	}
+
+	if (regs && !user_mode(regs)) {
+		unsigned long ip = instruction_pointer(regs);
+		if (!search_exception_tables(ip))
+			return false;
+	}
+
+	mmap_read_lock(mm);
+	return true;
+}
+
+static inline bool mmap_upgrade_trylock(struct mm_struct *mm)
+{
+	/*
+	 * We don't have this operation yet.
+	 *
+	 * It should be easy enough to do: it's basically a
+	 *    atomic_long_try_cmpxchg_acquire()
+	 * from RWSEM_READER_BIAS -> RWSEM_WRITER_LOCKED, but
+	 * it also needs the proper lockdep magic etc.
+	 */
+	return false;
+}
+
+static inline bool upgrade_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
+{
+	mmap_read_unlock(mm);
+	if (regs && !user_mode(regs)) {
+		unsigned long ip = instruction_pointer(regs);
+		if (!search_exception_tables(ip))
+			return false;
+	}
+	mmap_write_lock(mm);
+	return true;
+}
+
+/*
+ * Helper for page fault handling.
+ *
+ * This is kind of equivalend to "mmap_read_lock()" followed
+ * by "find_extend_vma()", except it's a lot more careful about
+ * the locking (and will drop the lock on failure).
+ *
+ * For example, if we have a kernel bug that causes a page
+ * fault, we don't want to just use mmap_read_lock() to get
+ * the mm lock, because that would deadlock if the bug were
+ * to happen while we're holding the mm lock for writing.
+ *
+ * So this checks the exception tables on kernel faults in
+ * order to only do this all for instructions that are actually
+ * expected to fault.
+ *
+ * We can also actually take the mm lock for writing if we
+ * need to extend the vma, which helps the VM layer a lot.
+ */
+struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
+			unsigned long addr, struct pt_regs *regs)
+{
+	struct vm_area_struct *vma;
+
+	if (!get_mmap_lock_carefully(mm, regs))
+		return NULL;
+
+	vma = find_vma(mm, addr);
+	if (likely(vma && (vma->vm_start <= addr)))
+		return vma;
+
+	/*
+	 * Well, dang. We might still be successful, but only
+	 * if we can extend a vma to do so.
+	 */
+	if (!vma || !(vma->vm_flags & VM_GROWSDOWN)) {
+		mmap_read_unlock(mm);
+		return NULL;
+	}
+
+	/*
+	 * We can try to upgrade the mmap lock atomically,
+	 * in which case we can continue to use the vma
+	 * we already looked up.
+	 *
+	 * Otherwise we'll have to drop the mmap lock and
+	 * re-take it, and also look up the vma again,
+	 * re-checking it.
+	 */
+	if (!mmap_upgrade_trylock(mm)) {
+		if (!upgrade_mmap_lock_carefully(mm, regs))
+			return NULL;
+
+		vma = find_vma(mm, addr);
+		if (!vma)
+			goto fail;
+		if (vma->vm_start <= addr)
+			goto success;
+		if (!(vma->vm_flags & VM_GROWSDOWN))
+			goto fail;
+	}
+
+	if (expand_stack(vma, addr))
+		goto fail;
+
+success:
+	mmap_write_downgrade(mm);
+	return vma;
+
+fail:
+	mmap_write_unlock(mm);
+	return NULL;
+}
+#endif
+
 #ifdef CONFIG_PER_VMA_LOCK
 /*
  * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be




* [PATCH 6.4 11/28] mm: make the page fault mmap locking killable
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 10/28] mm: introduce new lock_mm_and_find_vma() page fault helper Greg Kroah-Hartman
@ 2023-06-29 18:43 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 12/28] arm64/mm: Convert to using lock_mm_and_find_vma() Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:43 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Linus Torvalds

From: Linus Torvalds <torvalds@linux-foundation.org>

commit eda0047296a16d65a7f2bc60a408f70d178b2014 upstream.

This is done as a separate patch from introducing the new
lock_mm_and_find_vma() helper, because while it's an obvious change,
it's not what x86 used to do in this area.

We already abort the page fault on fatal signals anyway, so why should
we wait for the mmap lock only to then abort later? With the new helper
function that returns without the lock held on failure anyway, this is
particularly easy and straightforward.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/memory.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5279,8 +5279,7 @@ static inline bool get_mmap_lock_careful
 			return false;
 	}
 
-	mmap_read_lock(mm);
-	return true;
+	return !mmap_read_lock_killable(mm);
 }
 
 static inline bool mmap_upgrade_trylock(struct mm_struct *mm)
@@ -5304,8 +5303,7 @@ static inline bool upgrade_mmap_lock_car
 		if (!search_exception_tables(ip))
 			return false;
 	}
-	mmap_write_lock(mm);
-	return true;
+	return !mmap_write_lock_killable(mm);
 }
 
 /*




* [PATCH 6.4 12/28] arm64/mm: Convert to using lock_mm_and_find_vma()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2023-06-29 18:43 ` [PATCH 6.4 11/28] mm: make the page fault mmap locking killable Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 13/28] powerpc/mm: " Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Linus Torvalds, Suren Baghdasaryan,
	Liam R . Howlett

From: Linus Torvalds <torvalds@linux-foundation.org>

commit ae870a68b5d13d67cf4f18d47bb01ee3fee40acb upstream.

This converts arm64 to use the new page fault helper.  It was very
straightforward, but still needed a fix for the "obvious" conversion I
initially did.  Thanks to Suren for the fix and testing.

Fixed-and-tested-by: Suren Baghdasaryan <surenb@google.com>
Unnecessary-code-removal-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/Kconfig    |    1 +
 arch/arm64/mm/fault.c |   47 ++++++++---------------------------------------
 2 files changed, 9 insertions(+), 39 deletions(-)

--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -225,6 +225,7 @@ config ARM64
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select KASAN_VMALLOC if KASAN
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -483,27 +483,14 @@ static void do_bad_area(unsigned long fa
 #define VM_FAULT_BADMAP		((__force vm_fault_t)0x010000)
 #define VM_FAULT_BADACCESS	((__force vm_fault_t)0x020000)
 
-static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr,
+static vm_fault_t __do_page_fault(struct mm_struct *mm,
+				  struct vm_area_struct *vma, unsigned long addr,
 				  unsigned int mm_flags, unsigned long vm_flags,
 				  struct pt_regs *regs)
 {
-	struct vm_area_struct *vma = find_vma(mm, addr);
-
-	if (unlikely(!vma))
-		return VM_FAULT_BADMAP;
-
 	/*
 	 * Ok, we have a good vm_area for this memory access, so we can handle
 	 * it.
-	 */
-	if (unlikely(vma->vm_start > addr)) {
-		if (!(vma->vm_flags & VM_GROWSDOWN))
-			return VM_FAULT_BADMAP;
-		if (expand_stack(vma, addr))
-			return VM_FAULT_BADMAP;
-	}
-
-	/*
 	 * Check that the permissions on the VMA allow for the fault which
 	 * occurred.
 	 */
@@ -617,31 +604,15 @@ static int __kprobes do_page_fault(unsig
 	}
 lock_mmap:
 #endif /* CONFIG_PER_VMA_LOCK */
-	/*
-	 * As per x86, we may deadlock here. However, since the kernel only
-	 * validly references user space from well defined areas of the code,
-	 * we can bug out early if this is from code which shouldn't.
-	 */
-	if (!mmap_read_trylock(mm)) {
-		if (!user_mode(regs) && !search_exception_tables(regs->pc))
-			goto no_context;
+
 retry:
-		mmap_read_lock(mm);
-	} else {
-		/*
-		 * The above mmap_read_trylock() might have succeeded in which
-		 * case, we'll have missed the might_sleep() from down_read().
-		 */
-		might_sleep();
-#ifdef CONFIG_DEBUG_VM
-		if (!user_mode(regs) && !search_exception_tables(regs->pc)) {
-			mmap_read_unlock(mm);
-			goto no_context;
-		}
-#endif
+	vma = lock_mm_and_find_vma(mm, addr, regs);
+	if (unlikely(!vma)) {
+		fault = VM_FAULT_BADMAP;
+		goto done;
 	}
 
-	fault = __do_page_fault(mm, addr, mm_flags, vm_flags, regs);
+	fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
@@ -660,9 +631,7 @@ retry:
 	}
 	mmap_read_unlock(mm);
 
-#ifdef CONFIG_PER_VMA_LOCK
 done:
-#endif
 	/*
 	 * Handle the "normal" (no error) case first.
 	 */




* [PATCH 6.4 13/28] powerpc/mm: Convert to using lock_mm_and_find_vma()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 12/28] arm64/mm: Convert to using lock_mm_and_find_vma() Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 14/28] mips/mm: " Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Michael Ellerman, Linus Torvalds

From: Michael Ellerman <mpe@ellerman.id.au>

commit e6fe228c4ffafdfc970cf6d46883a1f481baf7ea upstream.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/Kconfig    |    1 +
 arch/powerpc/mm/fault.c |   41 ++++-------------------------------------
 2 files changed, 5 insertions(+), 37 deletions(-)

--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -278,6 +278,7 @@ config PPC
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select KASAN_VMALLOC			if KASAN && MODULES
+	select LOCK_MM_AND_FIND_VMA
 	select MMU_GATHER_PAGE_SIZE
 	select MMU_GATHER_RCU_TABLE_FREE
 	select MMU_GATHER_MERGE_VMAS
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -84,11 +84,6 @@ static int __bad_area(struct pt_regs *re
 	return __bad_area_nosemaphore(regs, address, si_code);
 }
 
-static noinline int bad_area(struct pt_regs *regs, unsigned long address)
-{
-	return __bad_area(regs, address, SEGV_MAPERR);
-}
-
 static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
 				    struct vm_area_struct *vma)
 {
@@ -515,40 +510,12 @@ lock_mmap:
 	 * we will deadlock attempting to validate the fault against the
 	 * address space.  Luckily the kernel only validly references user
 	 * space from well defined areas of code, which are listed in the
-	 * exceptions table.
-	 *
-	 * As the vast majority of faults will be valid we will only perform
-	 * the source reference check when there is a possibility of a deadlock.
-	 * Attempt to lock the address space, if we cannot we then validate the
-	 * source.  If this is invalid we can skip the address space check,
-	 * thus avoiding the deadlock.
-	 */
-	if (unlikely(!mmap_read_trylock(mm))) {
-		if (!is_user && !search_exception_tables(regs->nip))
-			return bad_area_nosemaphore(regs, address);
-
+	 * exceptions table. lock_mm_and_find_vma() handles that logic.
+	 */
 retry:
-		mmap_read_lock(mm);
-	} else {
-		/*
-		 * The above down_read_trylock() might have succeeded in
-		 * which case we'll have missed the might_sleep() from
-		 * down_read():
-		 */
-		might_sleep();
-	}
-
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (unlikely(!vma))
-		return bad_area(regs, address);
-
-	if (unlikely(vma->vm_start > address)) {
-		if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
-			return bad_area(regs, address);
-
-		if (unlikely(expand_stack(vma, address)))
-			return bad_area(regs, address);
-	}
+		return bad_area_nosemaphore(regs, address);
 
 	if (unlikely(access_pkey_error(is_write, is_exec,
 				       (error_code & DSISR_KEYFAULT), vma)))




* [PATCH 6.4 14/28] mips/mm: Convert to using lock_mm_and_find_vma()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 13/28] powerpc/mm: " Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 15/28] riscv/mm: " Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Ben Hutchings, Linus Torvalds

From: Ben Hutchings <ben@decadent.org.uk>

commit 4bce37a68ff884e821a02a731897a8119e0c37b7 upstream.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/Kconfig    |    1 +
 arch/mips/mm/fault.c |   12 ++----------
 2 files changed, 3 insertions(+), 10 deletions(-)

--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -91,6 +91,7 @@ config MIPS
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN if 64BIT || !SMP
 	select IRQ_FORCED_THREADING
 	select ISA if EISA
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_REL if MODULES
 	select MODULES_USE_ELF_RELA if MODULES && 64BIT
 	select PERF_USE_VMALLOC
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -99,21 +99,13 @@ static void __do_page_fault(struct pt_re
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
-	mmap_read_lock(mm);
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (!vma)
-		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
+		goto bad_area_nosemaphore;
 /*
  * Ok, we have a good vm_area for this memory access, so
  * we can handle it..
  */
-good_area:
 	si_code = SEGV_ACCERR;
 
 	if (write) {




* [PATCH 6.4 15/28] riscv/mm: Convert to using lock_mm_and_find_vma()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 14/28] mips/mm: " Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 16/28] arm/mm: " Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Ben Hutchings, Linus Torvalds

From: Ben Hutchings <ben@decadent.org.uk>

commit 7267ef7b0b77f4ed23b7b3c87d8eca7bd9c2d007 upstream.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/Kconfig    |    1 +
 arch/riscv/mm/fault.c |   31 +++++++++++++------------------
 2 files changed, 14 insertions(+), 18 deletions(-)

--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -126,6 +126,7 @@ config RISCV
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select KASAN_VMALLOC if KASAN
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA if MODULES
 	select MODULE_SECTIONS if MODULES
 	select OF
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -84,13 +84,13 @@ static inline void mm_fault_error(struct
 	BUG();
 }
 
-static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
+static inline void
+bad_area_nosemaphore(struct pt_regs *regs, int code, unsigned long addr)
 {
 	/*
 	 * Something tried to access memory that isn't in our memory map.
 	 * Fix it, but check if it's kernel or user first.
 	 */
-	mmap_read_unlock(mm);
 	/* User mode accesses just cause a SIGSEGV */
 	if (user_mode(regs)) {
 		do_trap(regs, SIGSEGV, code, addr);
@@ -100,6 +100,15 @@ static inline void bad_area(struct pt_re
 	no_context(regs, addr);
 }
 
+static inline void
+bad_area(struct pt_regs *regs, struct mm_struct *mm, int code,
+	 unsigned long addr)
+{
+	mmap_read_unlock(mm);
+
+	bad_area_nosemaphore(regs, code, addr);
+}
+
 static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long addr)
 {
 	pgd_t *pgd, *pgd_k;
@@ -287,23 +296,10 @@ void handle_page_fault(struct pt_regs *r
 	else if (cause == EXC_INST_PAGE_FAULT)
 		flags |= FAULT_FLAG_INSTRUCTION;
 retry:
-	mmap_read_lock(mm);
-	vma = find_vma(mm, addr);
+	vma = lock_mm_and_find_vma(mm, addr, regs);
 	if (unlikely(!vma)) {
 		tsk->thread.bad_cause = cause;
-		bad_area(regs, mm, code, addr);
-		return;
-	}
-	if (likely(vma->vm_start <= addr))
-		goto good_area;
-	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
-		tsk->thread.bad_cause = cause;
-		bad_area(regs, mm, code, addr);
-		return;
-	}
-	if (unlikely(expand_stack(vma, addr))) {
-		tsk->thread.bad_cause = cause;
-		bad_area(regs, mm, code, addr);
+		bad_area_nosemaphore(regs, code, addr);
 		return;
 	}
 
@@ -311,7 +307,6 @@ retry:
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it.
 	 */
-good_area:
 	code = SEGV_ACCERR;
 
 	if (unlikely(access_error(cause, vma))) {




* [PATCH 6.4 16/28] arm/mm: Convert to using lock_mm_and_find_vma()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 15/28] riscv/mm: " Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 17/28] mm/fault: convert remaining simple cases to lock_mm_and_find_vma() Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Ben Hutchings, Linus Torvalds

From: Ben Hutchings <ben@decadent.org.uk>

commit 8b35ca3e45e35a26a21427f35d4093606e93ad0a upstream.

arm has an additional check for address < FIRST_USER_ADDRESS before
expanding the stack.  Since FIRST_USER_ADDRESS is defined everywhere
(generally as 0), move that check to the generic expand_downwards().
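
Concretely, the arm-specific floor ends up folded into the existing mmap_min_addr test at the top of expand_downwards(); a condensed sketch of the resulting check, mirroring the mm/mmap.c hunk below:

  /* expand_downwards(): combined low-address check (sketch) */
  address &= PAGE_MASK;
  /*
   * FIRST_USER_ADDRESS (0 on most architectures) now rides along with
   * the mmap_min_addr check instead of living in arm's fault handler.
   */
  if (address < mmap_min_addr || address < FIRST_USER_ADDRESS)
          return -EPERM;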

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm/Kconfig    |    1 
 arch/arm/mm/fault.c |   63 +++++++++++-----------------------------------------
 mm/mmap.c           |    2 -
 3 files changed, 16 insertions(+), 50 deletions(-)

--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -125,6 +125,7 @@ config ARM
 	select HAVE_UID16
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select IRQ_FORCED_THREADING
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_REL
 	select NEED_DMA_MAP_STATE
 	select OF_EARLY_FLATTREE if OF
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -232,37 +232,11 @@ static inline bool is_permission_fault(u
 	return false;
 }
 
-static vm_fault_t __kprobes
-__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int flags,
-		unsigned long vma_flags, struct pt_regs *regs)
-{
-	struct vm_area_struct *vma = find_vma(mm, addr);
-	if (unlikely(!vma))
-		return VM_FAULT_BADMAP;
-
-	if (unlikely(vma->vm_start > addr)) {
-		if (!(vma->vm_flags & VM_GROWSDOWN))
-			return VM_FAULT_BADMAP;
-		if (addr < FIRST_USER_ADDRESS)
-			return VM_FAULT_BADMAP;
-		if (expand_stack(vma, addr))
-			return VM_FAULT_BADMAP;
-	}
-
-	/*
-	 * ok, we have a good vm_area for this memory access, check the
-	 * permissions on the VMA allow for the fault which occurred.
-	 */
-	if (!(vma->vm_flags & vma_flags))
-		return VM_FAULT_BADACCESS;
-
-	return handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
-}
-
 static int __kprobes
 do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
 	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
 	int sig, code;
 	vm_fault_t fault;
 	unsigned int flags = FAULT_FLAG_DEFAULT;
@@ -301,31 +275,21 @@ do_page_fault(unsigned long addr, unsign
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
 
-	/*
-	 * As per x86, we may deadlock here.  However, since the kernel only
-	 * validly references user space from well defined areas of the code,
-	 * we can bug out early if this is from code which shouldn't.
-	 */
-	if (!mmap_read_trylock(mm)) {
-		if (!user_mode(regs) && !search_exception_tables(regs->ARM_pc))
-			goto no_context;
 retry:
-		mmap_read_lock(mm);
-	} else {
-		/*
-		 * The above down_read_trylock() might have succeeded in
-		 * which case, we'll have missed the might_sleep() from
-		 * down_read()
-		 */
-		might_sleep();
-#ifdef CONFIG_DEBUG_VM
-		if (!user_mode(regs) &&
-		    !search_exception_tables(regs->ARM_pc))
-			goto no_context;
-#endif
+	vma = lock_mm_and_find_vma(mm, addr, regs);
+	if (unlikely(!vma)) {
+		fault = VM_FAULT_BADMAP;
+		goto bad_area;
 	}
 
-	fault = __do_page_fault(mm, addr, flags, vm_flags, regs);
+	/*
+	 * ok, we have a good vm_area for this memory access, check the
+	 * permissions on the VMA allow for the fault which occurred.
+	 */
+	if (!(vma->vm_flags & vm_flags))
+		fault = VM_FAULT_BADACCESS;
+	else
+		fault = handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
 
 	/* If we need to retry but a fatal signal is pending, handle the
 	 * signal first. We do not need to release the mmap_lock because
@@ -356,6 +320,7 @@ retry:
 	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
 		return 0;
 
+bad_area:
 	/*
 	 * If we are in kernel mode at this point, we
 	 * have no context to handle this fault with.
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2036,7 +2036,7 @@ int expand_downwards(struct vm_area_stru
 	int error = 0;
 
 	address &= PAGE_MASK;
-	if (address < mmap_min_addr)
+	if (address < mmap_min_addr || address < FIRST_USER_ADDRESS)
 		return -EPERM;
 
 	/* Enforce stack_guard_gap */




* [PATCH 6.4 17/28] mm/fault: convert remaining simple cases to lock_mm_and_find_vma()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 16/28] arm/mm: " Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 18/28] powerpc/mm: convert coprocessor fault " Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Linus Torvalds

From: Linus Torvalds <torvalds@linux-foundation.org>

commit a050ba1e7422f2cc60ff8bfde3f96d34d00cb585 upstream.

This does the simple pattern conversion of alpha, arc, csky, hexagon,
loongarch, nios2, sh, sparc32, and xtensa to the lock_mm_and_find_vma()
helper.  They all have the regular fault handling pattern without odd
special cases.
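
Schematically, the conversion is the same on each of these architectures; a condensed before/after sketch, modelled on the alpha and sh hunks below with the architecture-specific details omitted:

  /* before: open-coded lock/lookup/stack-expansion in every fault handler */
  mmap_read_lock(mm);
  vma = find_vma(mm, address);
  if (!vma)
          goto bad_area;
  if (vma->vm_start > address) {
          if (!(vma->vm_flags & VM_GROWSDOWN))
                  goto bad_area;
          if (expand_stack(vma, address))
                  goto bad_area;
  }
  /* ... permission checks, handle_mm_fault() ... */

  /* after: the helper locks, looks up and expands in one place; on
   * failure it returns NULL with the mmap lock already released, so
   * the error path must skip the mmap_read_unlock(). */
  vma = lock_mm_and_find_vma(mm, address, regs);
  if (!vma)
          goto bad_area_nosemaphore;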

The remaining architectures all have something that keeps us from a
straightforward conversion: ia64 and parisc have stacks that can grow
both up as well as down (and ia64 has special address region checks).

And m68k, microblaze, openrisc, sparc64, and um end up having extra
rules about only expanding the stack down a limited amount below the
user space stack pointer.  That is something that x86 used to do too
(long long ago), and it probably could just be skipped, but it still
makes the conversion less than trivial.

Note that this conversion was done manually and with the exception of
alpha without any build testing, because I have a fairly limited cross-
building environment.  The cases are all simple, and I went through the
changes several times, but...

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/alpha/Kconfig         |    1 +
 arch/alpha/mm/fault.c      |   13 +++----------
 arch/arc/Kconfig           |    1 +
 arch/arc/mm/fault.c        |   11 +++--------
 arch/csky/Kconfig          |    1 +
 arch/csky/mm/fault.c       |   22 +++++-----------------
 arch/hexagon/Kconfig       |    1 +
 arch/hexagon/mm/vm_fault.c |   18 ++++--------------
 arch/loongarch/Kconfig     |    1 +
 arch/loongarch/mm/fault.c  |   16 ++++++----------
 arch/nios2/Kconfig         |    1 +
 arch/nios2/mm/fault.c      |   17 ++---------------
 arch/sh/Kconfig            |    1 +
 arch/sh/mm/fault.c         |   17 ++---------------
 arch/sparc/Kconfig         |    1 +
 arch/sparc/mm/fault_32.c   |   32 ++++++++------------------------
 arch/xtensa/Kconfig        |    1 +
 arch/xtensa/mm/fault.c     |   14 +++-----------
 18 files changed, 45 insertions(+), 124 deletions(-)

--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -30,6 +30,7 @@ config ALPHA
 	select HAS_IOPORT
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_MOD_ARCH_SPECIFIC
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA
 	select ODD_RT_SIGACTION
 	select OLD_SIGSUSPEND
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -119,20 +119,12 @@ do_page_fault(unsigned long address, uns
 		flags |= FAULT_FLAG_USER;
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
-	mmap_read_lock(mm);
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (!vma)
-		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
+		goto bad_area_nosemaphore;
 
 	/* Ok, we have a good vm_area for this memory access, so
 	   we can handle it.  */
- good_area:
 	si_code = SEGV_ACCERR;
 	if (cause < 0) {
 		if (!(vma->vm_flags & VM_EXEC))
@@ -192,6 +184,7 @@ retry:
  bad_area:
 	mmap_read_unlock(mm);
 
+ bad_area_nosemaphore:
 	if (user_mode(regs))
 		goto do_sigsegv;
 
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -41,6 +41,7 @@ config ARC
 	select HAVE_PERF_EVENTS
 	select HAVE_SYSCALL_TRACEPOINTS
 	select IRQ_DOMAIN
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA
 	select OF
 	select OF_EARLY_FLATTREE
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -113,15 +113,9 @@ void do_page_fault(unsigned long address
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
-	mmap_read_lock(mm);
-
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (!vma)
-		goto bad_area;
-	if (unlikely(address < vma->vm_start)) {
-		if (!(vma->vm_flags & VM_GROWSDOWN) || expand_stack(vma, address))
-			goto bad_area;
-	}
+		goto bad_area_nosemaphore;
 
 	/*
 	 * vm_area is good, now check permissions for this memory access
@@ -161,6 +155,7 @@ retry:
 bad_area:
 	mmap_read_unlock(mm);
 
+bad_area_nosemaphore:
 	/*
 	 * Major/minor page fault accounting
 	 * (in case of retry we only land here once)
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -96,6 +96,7 @@ config CSKY
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
+	select LOCK_MM_AND_FIND_VMA
 	select MAY_HAVE_SPARSE_IRQ
 	select MODULES_USE_ELF_RELA if MODULES
 	select OF
--- a/arch/csky/mm/fault.c
+++ b/arch/csky/mm/fault.c
@@ -97,13 +97,12 @@ static inline void mm_fault_error(struct
 	BUG();
 }
 
-static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
+static inline void bad_area_nosemaphore(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
 {
 	/*
 	 * Something tried to access memory that isn't in our memory map.
 	 * Fix it, but check if it's kernel or user first.
 	 */
-	mmap_read_unlock(mm);
 	/* User mode accesses just cause a SIGSEGV */
 	if (user_mode(regs)) {
 		do_trap(regs, SIGSEGV, code, addr);
@@ -238,20 +237,9 @@ asmlinkage void do_page_fault(struct pt_
 	if (is_write(regs))
 		flags |= FAULT_FLAG_WRITE;
 retry:
-	mmap_read_lock(mm);
-	vma = find_vma(mm, addr);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (unlikely(!vma)) {
-		bad_area(regs, mm, code, addr);
-		return;
-	}
-	if (likely(vma->vm_start <= addr))
-		goto good_area;
-	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
-		bad_area(regs, mm, code, addr);
-		return;
-	}
-	if (unlikely(expand_stack(vma, addr))) {
-		bad_area(regs, mm, code, addr);
+		bad_area_nosemaphore(regs, mm, code, addr);
 		return;
 	}
 
@@ -259,11 +247,11 @@ retry:
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it.
 	 */
-good_area:
 	code = SEGV_ACCERR;
 
 	if (unlikely(access_error(regs, vma))) {
-		bad_area(regs, mm, code, addr);
+		mmap_read_unlock(mm);
+		bad_area_nosemaphore(regs, mm, code, addr);
 		return;
 	}
 
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -28,6 +28,7 @@ config HEXAGON
 	select GENERIC_SMP_IDLE_THREAD
 	select STACKTRACE_SUPPORT
 	select GENERIC_CLOCKEVENTS_BROADCAST
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA
 	select GENERIC_CPU_DEVICES
 	select ARCH_WANT_LD_ORPHAN_WARN
--- a/arch/hexagon/mm/vm_fault.c
+++ b/arch/hexagon/mm/vm_fault.c
@@ -57,21 +57,10 @@ void do_page_fault(unsigned long address
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
-	mmap_read_lock(mm);
-	vma = find_vma(mm, address);
-	if (!vma)
-		goto bad_area;
+	vma = lock_mm_and_find_vma(mm, address, regs);
+	if (unlikely(!vma))
+		goto bad_area_nosemaphore;
 
-	if (vma->vm_start <= address)
-		goto good_area;
-
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-
-	if (expand_stack(vma, address))
-		goto bad_area;
-
-good_area:
 	/* Address space is OK.  Now check access rights. */
 	si_code = SEGV_ACCERR;
 
@@ -143,6 +132,7 @@ good_area:
 bad_area:
 	mmap_read_unlock(mm);
 
+bad_area_nosemaphore:
 	if (user_mode(regs)) {
 		force_sig_fault(SIGSEGV, si_code, (void __user *)address);
 		return;
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -130,6 +130,7 @@ config LOONGARCH
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
 	select IRQ_FORCED_THREADING
 	select IRQ_LOONGARCH_CPU
+	select LOCK_MM_AND_FIND_VMA
 	select MMU_GATHER_MERGE_VMAS if MMU
 	select MODULES_USE_ELF_RELA if MODULES
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK
--- a/arch/loongarch/mm/fault.c
+++ b/arch/loongarch/mm/fault.c
@@ -169,22 +169,18 @@ static void __kprobes __do_page_fault(st
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 retry:
-	mmap_read_lock(mm);
-	vma = find_vma(mm, address);
-	if (!vma)
-		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (!expand_stack(vma, address))
-		goto good_area;
+	vma = lock_mm_and_find_vma(mm, address, regs);
+	if (unlikely(!vma))
+		goto bad_area_nosemaphore;
+	goto good_area;
+
 /*
  * Something tried to access memory that isn't in our memory map..
  * Fix it, but check if it's kernel or user first..
  */
 bad_area:
 	mmap_read_unlock(mm);
+bad_area_nosemaphore:
 	do_sigsegv(regs, write, address, si_code);
 	return;
 
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -16,6 +16,7 @@ config NIOS2
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_KGDB
 	select IRQ_DOMAIN
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA
 	select OF
 	select OF_EARLY_FLATTREE
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -86,27 +86,14 @@ asmlinkage void do_page_fault(struct pt_
 
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-	if (!mmap_read_trylock(mm)) {
-		if (!user_mode(regs) && !search_exception_tables(regs->ea))
-			goto bad_area_nosemaphore;
 retry:
-		mmap_read_lock(mm);
-	}
-
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (!vma)
-		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
+		goto bad_area_nosemaphore;
 /*
  * Ok, we have a good vm_area for this memory access, so
  * we can handle it..
  */
-good_area:
 	code = SEGV_ACCERR;
 
 	switch (cause) {
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -59,6 +59,7 @@ config SUPERH
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select IRQ_FORCED_THREADING
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA
 	select NEED_SG_DMA_LENGTH
 	select NO_DMA if !MMU && !DMA_COHERENT
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -439,21 +439,9 @@ asmlinkage void __kprobes do_page_fault(
 	}
 
 retry:
-	mmap_read_lock(mm);
-
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (unlikely(!vma)) {
-		bad_area(regs, error_code, address);
-		return;
-	}
-	if (likely(vma->vm_start <= address))
-		goto good_area;
-	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
-		bad_area(regs, error_code, address);
-		return;
-	}
-	if (unlikely(expand_stack(vma, address))) {
-		bad_area(regs, error_code, address);
+		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
 
@@ -461,7 +449,6 @@ retry:
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
 	 */
-good_area:
 	if (unlikely(access_error(error_code, vma))) {
 		bad_area_access_error(regs, error_code, address);
 		return;
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -57,6 +57,7 @@ config SPARC32
 	select DMA_DIRECT_REMAP
 	select GENERIC_ATOMIC64
 	select HAVE_UID16
+	select LOCK_MM_AND_FIND_VMA
 	select OLD_SIGACTION
 	select ZONE_DMA
 
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -143,28 +143,19 @@ asmlinkage void do_sparc_fault(struct pt
 	if (pagefault_disabled() || !mm)
 		goto no_context;
 
+	if (!from_user && address >= PAGE_OFFSET)
+		goto no_context;
+
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
 retry:
-	mmap_read_lock(mm);
-
-	if (!from_user && address >= PAGE_OFFSET)
-		goto bad_area;
-
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (!vma)
-		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
+		goto bad_area_nosemaphore;
 	/*
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
 	 */
-good_area:
 	code = SEGV_ACCERR;
 	if (write) {
 		if (!(vma->vm_flags & VM_WRITE))
@@ -321,17 +312,9 @@ static void force_user_fault(unsigned lo
 
 	code = SEGV_MAPERR;
 
-	mmap_read_lock(mm);
-	vma = find_vma(mm, address);
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (!vma)
-		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
-good_area:
+		goto bad_area_nosemaphore;
 	code = SEGV_ACCERR;
 	if (write) {
 		if (!(vma->vm_flags & VM_WRITE))
@@ -350,6 +333,7 @@ good_area:
 	return;
 bad_area:
 	mmap_read_unlock(mm);
+bad_area_nosemaphore:
 	__do_fault_siginfo(code, SIGSEGV, tsk->thread.kregs, address);
 	return;
 
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -49,6 +49,7 @@ config XTENSA
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select IRQ_DOMAIN
+	select LOCK_MM_AND_FIND_VMA
 	select MODULES_USE_ELF_RELA
 	select PERF_USE_VMALLOC
 	select TRACE_IRQFLAGS_SUPPORT
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -130,23 +130,14 @@ void do_page_fault(struct pt_regs *regs)
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
 retry:
-	mmap_read_lock(mm);
-	vma = find_vma(mm, address);
-
+	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (!vma)
-		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
+		goto bad_area_nosemaphore;
 
 	/* Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
 	 */
 
-good_area:
 	code = SEGV_ACCERR;
 
 	if (is_write) {
@@ -205,6 +196,7 @@ good_area:
 	 */
 bad_area:
 	mmap_read_unlock(mm);
+bad_area_nosemaphore:
 	if (user_mode(regs)) {
 		force_sig_fault(SIGSEGV, code, (void *) address);
 		return;




* [PATCH 6.4 18/28] powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 17/28] mm/fault: convert remaining simple cases to lock_mm_and_find_vma() Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 19/28] mm: make find_extend_vma() fail if write lock not held Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Linus Torvalds

From: Linus Torvalds <torvalds@linux-foundation.org>

commit 2cd76c50d0b41cec5c87abfcdf25b236a2793fb6 upstream.

This is one of the simple cases, except there's no pt_regs pointer.
Which is fine, as lock_mm_and_find_vma() is set up to work fine with a
NULL pt_regs.

Powerpc already enabled LOCK_MM_AND_FIND_VMA for the main CPU faulting,
so we can just use the helper without any extra work.
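
The converted call site is then just the common pattern with a NULL pt_regs; a sketch matching the copro_fault.c hunk below:

  /* copro_handle_mm_fault(): there is no pt_regs here, and the helper
   * is set up to cope with a NULL pt_regs (see above), so pass NULL. */
  vma = lock_mm_and_find_vma(mm, ea, NULL);
  if (!vma)
          return -EFAULT;         /* helper already dropped the lock */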

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/mm/copro_fault.c |   14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -33,19 +33,11 @@ int copro_handle_mm_fault(struct mm_stru
 	if (mm->pgd == NULL)
 		return -EFAULT;
 
-	mmap_read_lock(mm);
-	ret = -EFAULT;
-	vma = find_vma(mm, ea);
+	vma = lock_mm_and_find_vma(mm, ea, NULL);
 	if (!vma)
-		goto out_unlock;
-
-	if (ea < vma->vm_start) {
-		if (!(vma->vm_flags & VM_GROWSDOWN))
-			goto out_unlock;
-		if (expand_stack(vma, ea))
-			goto out_unlock;
-	}
+		return -EFAULT;
 
+	ret = -EFAULT;
 	is_write = dsisr & DSISR_ISSTORE;
 	if (is_write) {
 		if (!(vma->vm_flags & VM_WRITE))




* [PATCH 6.4 19/28] mm: make find_extend_vma() fail if write lock not held
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 18/28] powerpc/mm: convert coprocessor fault " Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 20/28] execve: expand new process stack manually ahead of time Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Matthew Wilcox (Oracle),
	Liam R. Howlett, Linus Torvalds

From: Liam R. Howlett <Liam.Howlett@oracle.com>

commit f440fa1ac955e2898893f9301568435eb5cdfc4b upstream.

Make calls to extend_vma() and find_extend_vma() fail if the write lock
is required.

To avoid making this a flag-day event, this still allows the old
read-locking case for the trivial situations, and passes in a flag to
say "is it write-locked".  That way write-lockers can say "yes, I'm
being careful", and legacy users will continue to work in all the common
cases until they have been fully converted to the new world order.
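
In code terms the transition looks like this, condensed from the include/linux/mm.h, mm/mmap.c and fs/binfmt_elf.c hunks below:

  /* legacy read-locked callers keep the old names and implicitly pass
   * write_locked = false ... */
  #define expand_stack(vma, addr) expand_stack_locked(vma, addr, false)

  struct vm_area_struct *find_extend_vma(struct mm_struct *mm,
                  unsigned long addr)
  {
          return find_extend_vma_locked(mm, addr, false);
  }

  /* ... while converted callers take the mmap lock for writing and say so: */
  if (mmap_write_lock_killable(mm))
          return -EINTR;
  vma = find_extend_vma_locked(mm, bprm->p, true);
  mmap_write_unlock(mm);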

Co-Developed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/binfmt_elf.c    |    6 +++---
 fs/exec.c          |    5 +++--
 include/linux/mm.h |   10 +++++++---
 mm/memory.c        |    2 +-
 mm/mmap.c          |   50 +++++++++++++++++++++++++++++++++-----------------
 mm/nommu.c         |    3 ++-
 6 files changed, 49 insertions(+), 27 deletions(-)

--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -320,10 +320,10 @@ create_elf_tables(struct linux_binprm *b
 	 * Grow the stack manually; some architectures have a limit on how
 	 * far ahead a user-space access may be in order to grow the stack.
 	 */
-	if (mmap_read_lock_killable(mm))
+	if (mmap_write_lock_killable(mm))
 		return -EINTR;
-	vma = find_extend_vma(mm, bprm->p);
-	mmap_read_unlock(mm);
+	vma = find_extend_vma_locked(mm, bprm->p, true);
+	mmap_write_unlock(mm);
 	if (!vma)
 		return -EFAULT;
 
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -205,7 +205,8 @@ static struct page *get_arg_page(struct
 
 #ifdef CONFIG_STACK_GROWSUP
 	if (write) {
-		ret = expand_downwards(bprm->vma, pos);
+		/* We claim to hold the lock - nobody to race with */
+		ret = expand_downwards(bprm->vma, pos, true);
 		if (ret < 0)
 			return NULL;
 	}
@@ -853,7 +854,7 @@ int setup_arg_pages(struct linux_binprm
 	stack_base = vma->vm_end - stack_expand;
 #endif
 	current->mm->start_stack = bprm->p;
-	ret = expand_stack(vma, stack_base);
+	ret = expand_stack_locked(vma, stack_base, true);
 	if (ret)
 		ret = -EFAULT;
 
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3192,11 +3192,13 @@ extern vm_fault_t filemap_page_mkwrite(s
 
 extern unsigned long stack_guard_gap;
 /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
-extern int expand_stack(struct vm_area_struct *vma, unsigned long address);
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
+		bool write_locked);
+#define expand_stack(vma,addr) expand_stack_locked(vma,addr,false)
 
 /* CONFIG_STACK_GROWSUP still needs to grow downwards at some places */
-extern int expand_downwards(struct vm_area_struct *vma,
-		unsigned long address);
+int expand_downwards(struct vm_area_struct *vma, unsigned long address,
+		bool write_locked);
 #if VM_GROWSUP
 extern int expand_upwards(struct vm_area_struct *vma, unsigned long address);
 #else
@@ -3297,6 +3299,8 @@ unsigned long change_prot_numa(struct vm
 #endif
 
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
+struct vm_area_struct *find_extend_vma_locked(struct mm_struct *,
+		unsigned long addr, bool write_locked);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5368,7 +5368,7 @@ struct vm_area_struct *lock_mm_and_find_
 			goto fail;
 	}
 
-	if (expand_stack(vma, addr))
+	if (expand_stack_locked(vma, addr, true))
 		goto fail;
 
 success:
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1935,7 +1935,8 @@ static int acct_stack_growth(struct vm_a
  * PA-RISC uses this for its stack; IA64 for its Register Backing Store.
  * vma is the last one with address > vma->vm_end.  Have to extend vma.
  */
-int expand_upwards(struct vm_area_struct *vma, unsigned long address)
+int expand_upwards(struct vm_area_struct *vma, unsigned long address,
+		bool write_locked)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *next;
@@ -1959,6 +1960,8 @@ int expand_upwards(struct vm_area_struct
 	if (gap_addr < address || gap_addr > TASK_SIZE)
 		gap_addr = TASK_SIZE;
 
+	if (!write_locked)
+		return -EAGAIN;
 	next = find_vma_intersection(mm, vma->vm_end, gap_addr);
 	if (next && vma_is_accessible(next)) {
 		if (!(next->vm_flags & VM_GROWSUP))
@@ -2028,7 +2031,8 @@ int expand_upwards(struct vm_area_struct
 /*
  * vma is the first one with address < vma->vm_start.  Have to extend vma.
  */
-int expand_downwards(struct vm_area_struct *vma, unsigned long address)
+int expand_downwards(struct vm_area_struct *vma, unsigned long address,
+		bool write_locked)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	MA_STATE(mas, &mm->mm_mt, vma->vm_start, vma->vm_start);
@@ -2042,10 +2046,13 @@ int expand_downwards(struct vm_area_stru
 	/* Enforce stack_guard_gap */
 	prev = mas_prev(&mas, 0);
 	/* Check that both stack segments have the same anon_vma? */
-	if (prev && !(prev->vm_flags & VM_GROWSDOWN) &&
-			vma_is_accessible(prev)) {
-		if (address - prev->vm_end < stack_guard_gap)
+	if (prev) {
+		if (!(prev->vm_flags & VM_GROWSDOWN) &&
+		    vma_is_accessible(prev) &&
+		    (address - prev->vm_end < stack_guard_gap))
 			return -ENOMEM;
+		if (!write_locked && (prev->vm_end == address))
+			return -EAGAIN;
 	}
 
 	if (mas_preallocate(&mas, GFP_KERNEL))
@@ -2124,13 +2131,14 @@ static int __init cmdline_parse_stack_gu
 __setup("stack_guard_gap=", cmdline_parse_stack_guard_gap);
 
 #ifdef CONFIG_STACK_GROWSUP
-int expand_stack(struct vm_area_struct *vma, unsigned long address)
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
+		bool write_locked)
 {
-	return expand_upwards(vma, address);
+	return expand_upwards(vma, address, write_locked);
 }
 
-struct vm_area_struct *
-find_extend_vma(struct mm_struct *mm, unsigned long addr)
+struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm,
+		unsigned long addr, bool write_locked)
 {
 	struct vm_area_struct *vma, *prev;
 
@@ -2138,20 +2146,25 @@ find_extend_vma(struct mm_struct *mm, un
 	vma = find_vma_prev(mm, addr, &prev);
 	if (vma && (vma->vm_start <= addr))
 		return vma;
-	if (!prev || expand_stack(prev, addr))
+	if (!prev)
+		return NULL;
+	if (expand_stack_locked(prev, addr, write_locked))
 		return NULL;
 	if (prev->vm_flags & VM_LOCKED)
 		populate_vma_page_range(prev, addr, prev->vm_end, NULL);
 	return prev;
 }
 #else
-int expand_stack(struct vm_area_struct *vma, unsigned long address)
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
+		bool write_locked)
 {
-	return expand_downwards(vma, address);
+	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
+		return -EINVAL;
+	return expand_downwards(vma, address, write_locked);
 }
 
-struct vm_area_struct *
-find_extend_vma(struct mm_struct *mm, unsigned long addr)
+struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm,
+		unsigned long addr, bool write_locked)
 {
 	struct vm_area_struct *vma;
 	unsigned long start;
@@ -2162,10 +2175,8 @@ find_extend_vma(struct mm_struct *mm, un
 		return NULL;
 	if (vma->vm_start <= addr)
 		return vma;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		return NULL;
 	start = vma->vm_start;
-	if (expand_stack(vma, addr))
+	if (expand_stack_locked(vma, addr, write_locked))
 		return NULL;
 	if (vma->vm_flags & VM_LOCKED)
 		populate_vma_page_range(vma, addr, start, NULL);
@@ -2173,6 +2184,11 @@ find_extend_vma(struct mm_struct *mm, un
 }
 #endif
 
+struct vm_area_struct *find_extend_vma(struct mm_struct *mm,
+		unsigned long addr)
+{
+	return find_extend_vma_locked(mm, addr, false);
+}
 EXPORT_SYMBOL_GPL(find_extend_vma);
 
 /*
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -643,7 +643,8 @@ struct vm_area_struct *find_extend_vma(s
  * expand a stack to a given address
  * - not supported under NOMMU conditions
  */
-int expand_stack(struct vm_area_struct *vma, unsigned long address)
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
+		bool write_locked)
 {
 	return -ENOMEM;
 }




* [PATCH 6.4 20/28] execve: expand new process stack manually ahead of time
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 19/28] mm: make find_extend_vma() fail if write lock not held Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 21/28] mm: always expand the stack with the mmap write lock held Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Linus Torvalds

From: Linus Torvalds <torvalds@linux-foundation.org>

commit f313c51d26aa87e69633c9b46efb37a930faca71 upstream.

This is a small step towards a model where GUP itself would not expand
the stack, and any user that needs GUP to not look up existing mappings,
but actually expand on them, would have to do so manually before-hand,
and with the mm lock held for writing.

It turns out that execve() already did almost exactly that, except it
didn't take the mm lock at all (it's single-threaded so no locking
technically needed, but it could cause lockdep errors).  And it only did
it for the CONFIG_STACK_GROWSUP case, since in that case GUP has
obviously never expanded the stack downwards.

So just make that CONFIG_STACK_GROWSUP case do the right thing with
locking, and enable it generally.  This will eventually help GUP, and in
the meantime avoids a special case and the lockdep issue.
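
The resulting locking in get_arg_page() is short enough to show up front; a condensed sketch of the fs/exec.c hunk below, with error handling trimmed:

  /* expand the stack by hand, under the mmap write lock, before GUP */
  if (write && pos < vma->vm_start) {
          mmap_write_lock(mm);
          ret = expand_downwards(vma, pos, true);
          if (unlikely(ret < 0)) {
                  mmap_write_unlock(mm);
                  return NULL;
          }
          /* keep the mapping stable for GUP, but only for reading */
          mmap_write_downgrade(mm);
  } else
          mmap_read_lock(mm);

  ret = get_user_pages_remote(mm, pos, 1, write ? FOLL_WRITE : 0,
                              &page, NULL, NULL);
  mmap_read_unlock(mm);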

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/exec.c |   37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

--- a/fs/exec.c
+++ b/fs/exec.c
@@ -200,34 +200,39 @@ static struct page *get_arg_page(struct
 		int write)
 {
 	struct page *page;
+	struct vm_area_struct *vma = bprm->vma;
+	struct mm_struct *mm = bprm->mm;
 	int ret;
-	unsigned int gup_flags = 0;
 
-#ifdef CONFIG_STACK_GROWSUP
-	if (write) {
-		/* We claim to hold the lock - nobody to race with */
-		ret = expand_downwards(bprm->vma, pos, true);
-		if (ret < 0)
+	/*
+	 * Avoid relying on expanding the stack down in GUP (which
+	 * does not work for STACK_GROWSUP anyway), and just do it
+	 * by hand ahead of time.
+	 */
+	if (write && pos < vma->vm_start) {
+		mmap_write_lock(mm);
+		ret = expand_downwards(vma, pos, true);
+		if (unlikely(ret < 0)) {
+			mmap_write_unlock(mm);
 			return NULL;
-	}
-#endif
-
-	if (write)
-		gup_flags |= FOLL_WRITE;
+		}
+		mmap_write_downgrade(mm);
+	} else
+		mmap_read_lock(mm);
 
 	/*
 	 * We are doing an exec().  'current' is the process
-	 * doing the exec and bprm->mm is the new process's mm.
+	 * doing the exec and 'mm' is the new process's mm.
 	 */
-	mmap_read_lock(bprm->mm);
-	ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
+	ret = get_user_pages_remote(mm, pos, 1,
+			write ? FOLL_WRITE : 0,
 			&page, NULL, NULL);
-	mmap_read_unlock(bprm->mm);
+	mmap_read_unlock(mm);
 	if (ret <= 0)
 		return NULL;
 
 	if (write)
-		acct_arg_size(bprm, vma_pages(bprm->vma));
+		acct_arg_size(bprm, vma_pages(vma));
 
 	return page;
 }




* [PATCH 6.4 21/28] mm: always expand the stack with the mmap write lock held
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 20/28] execve: expand new process stack manually ahead of time Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 22/28] HID: wacom: Use ktime_t rather than int when dealing with timestamps Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vegard Nossum, Linus Torvalds,
	John Paul Adrian Glaubitz, Frank Scheiner

From: Linus Torvalds <torvalds@linux-foundation.org>

commit 8d7071af890768438c14db6172cc8f9f4d04e184 upstream.

This finishes the job of always holding the mmap write lock when
extending the user stack vma, and removes the 'write_locked' argument
from the vm helper functions again.

For some cases, we just avoid expanding the stack at all: drivers and
page pinning really shouldn't be extending any stacks.  Let's see if any
strange users really wanted that.

It's worth noting that architectures that weren't converted to the new
lock_mm_and_find_vma() helper function are left using the legacy
"expand_stack()" function, but it has been changed to drop the mmap_lock
and take it for writing while expanding the vma.  This makes it fairly
straightforward to convert the remaining architectures.

As a result of dropping and re-taking the lock, the calling conventions
for this function have also changed, since the old vma may no longer be
valid.  So it will now return the new vma if successful, and NULL - and
the lock dropped - if the area could not be extended.
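
For the architectures still on the legacy path, the call sites therefore change shape as in the m68k and sparc64 hunks below; schematically:

  /* before: expand in place, still holding the mmap read lock */
  if (expand_stack(vma, address))
          goto bad_area;                  /* error path unlocks */

  /* after: expand_stack() drops the read lock, retakes it for writing,
   * expands, and downgrades back to a read lock; on failure the lock
   * is already gone, so jump past the mmap_read_unlock(). */
  vma = expand_stack(mm, address);
  if (!vma)
          goto bad_area_nosemaphore;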

Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> # ia64
Tested-by: Frank Scheiner <frank.scheiner@web.de> # ia64
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/ia64/mm/fault.c         |   36 ++----------
 arch/m68k/mm/fault.c         |    9 ++-
 arch/microblaze/mm/fault.c   |    5 +
 arch/openrisc/mm/fault.c     |    5 +
 arch/parisc/mm/fault.c       |   23 +++-----
 arch/s390/mm/fault.c         |    5 +
 arch/sparc/mm/fault_64.c     |    8 +-
 arch/um/kernel/trap.c        |   11 ++-
 drivers/iommu/amd/iommu_v2.c |    4 -
 drivers/iommu/iommu-sva.c    |    2 
 fs/binfmt_elf.c              |    2 
 fs/exec.c                    |    4 -
 include/linux/mm.h           |   16 +----
 mm/gup.c                     |    6 +-
 mm/memory.c                  |   10 +++
 mm/mmap.c                    |  121 ++++++++++++++++++++++++++++++++++---------
 mm/nommu.c                   |   18 ++----
 17 files changed, 169 insertions(+), 116 deletions(-)

--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -110,10 +110,12 @@ retry:
          * register backing store that needs to expand upwards, in
          * this case vma will be null, but prev_vma will ne non-null
          */
-        if (( !vma && prev_vma ) || (address < vma->vm_start) )
-		goto check_expansion;
+        if (( !vma && prev_vma ) || (address < vma->vm_start) ) {
+		vma = expand_stack(mm, address);
+		if (!vma)
+			goto bad_area_nosemaphore;
+	}
 
-  good_area:
 	code = SEGV_ACCERR;
 
 	/* OK, we've got a good vm_area for this memory area.  Check the access permissions: */
@@ -177,35 +179,9 @@ retry:
 	mmap_read_unlock(mm);
 	return;
 
-  check_expansion:
-	if (!(prev_vma && (prev_vma->vm_flags & VM_GROWSUP) && (address == prev_vma->vm_end))) {
-		if (!vma)
-			goto bad_area;
-		if (!(vma->vm_flags & VM_GROWSDOWN))
-			goto bad_area;
-		if (REGION_NUMBER(address) != REGION_NUMBER(vma->vm_start)
-		    || REGION_OFFSET(address) >= RGN_MAP_LIMIT)
-			goto bad_area;
-		if (expand_stack(vma, address))
-			goto bad_area;
-	} else {
-		vma = prev_vma;
-		if (REGION_NUMBER(address) != REGION_NUMBER(vma->vm_start)
-		    || REGION_OFFSET(address) >= RGN_MAP_LIMIT)
-			goto bad_area;
-		/*
-		 * Since the register backing store is accessed sequentially,
-		 * we disallow growing it by more than a page at a time.
-		 */
-		if (address > vma->vm_end + PAGE_SIZE - sizeof(long))
-			goto bad_area;
-		if (expand_upwards(vma, address))
-			goto bad_area;
-	}
-	goto good_area;
-
   bad_area:
 	mmap_read_unlock(mm);
+  bad_area_nosemaphore:
 	if ((isr & IA64_ISR_SP)
 	    || ((isr & IA64_ISR_NA) && (isr & IA64_ISR_CODE_MASK) == IA64_ISR_CODE_LFETCH))
 	{
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -105,8 +105,9 @@ retry:
 		if (address + 256 < rdusp())
 			goto map_err;
 	}
-	if (expand_stack(vma, address))
-		goto map_err;
+	vma = expand_stack(mm, address);
+	if (!vma)
+		goto map_err_nosemaphore;
 
 /*
  * Ok, we have a good vm_area for this memory access, so
@@ -196,10 +197,12 @@ bus_err:
 	goto send_sig;
 
 map_err:
+	mmap_read_unlock(mm);
+map_err_nosemaphore:
 	current->thread.signo = SIGSEGV;
 	current->thread.code = SEGV_MAPERR;
 	current->thread.faddr = address;
-	goto send_sig;
+	return send_fault_sig(regs);
 
 acc_err:
 	current->thread.signo = SIGSEGV;
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -192,8 +192,9 @@ retry:
 			&& (kernel_mode(regs) || !store_updates_sp(regs)))
 				goto bad_area;
 	}
-	if (expand_stack(vma, address))
-		goto bad_area;
+	vma = expand_stack(mm, address);
+	if (!vma)
+		goto bad_area_nosemaphore;
 
 good_area:
 	code = SEGV_ACCERR;
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -127,8 +127,9 @@ retry:
 		if (address + PAGE_SIZE < regs->sp)
 			goto bad_area;
 	}
-	if (expand_stack(vma, address))
-		goto bad_area;
+	vma = expand_stack(mm, address);
+	if (!vma)
+		goto bad_area_nosemaphore;
 
 	/*
 	 * Ok, we have a good vm_area for this memory access, so
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -288,15 +288,19 @@ void do_page_fault(struct pt_regs *regs,
 retry:
 	mmap_read_lock(mm);
 	vma = find_vma_prev(mm, address, &prev_vma);
-	if (!vma || address < vma->vm_start)
-		goto check_expansion;
+	if (!vma || address < vma->vm_start) {
+		if (!prev || !(prev->vm_flags & VM_GROWSUP))
+			goto bad_area;
+		vma = expand_stack(mm, address);
+		if (!vma)
+			goto bad_area_nosemaphore;
+	}
+
 /*
  * Ok, we have a good vm_area for this memory access. We still need to
  * check the access permissions.
  */
 
-good_area:
-
 	if ((vma->vm_flags & acc_type) != acc_type)
 		goto bad_area;
 
@@ -347,17 +351,13 @@ good_area:
 	mmap_read_unlock(mm);
 	return;
 
-check_expansion:
-	vma = prev_vma;
-	if (vma && (expand_stack(vma, address) == 0))
-		goto good_area;
-
 /*
  * Something tried to access memory that isn't in our memory map..
  */
 bad_area:
 	mmap_read_unlock(mm);
 
+bad_area_nosemaphore:
 	if (user_mode(regs)) {
 		int signo, si_code;
 
@@ -449,7 +449,7 @@ handle_nadtlb_fault(struct pt_regs *regs
 {
 	unsigned long insn = regs->iir;
 	int breg, treg, xreg, val = 0;
-	struct vm_area_struct *vma, *prev_vma;
+	struct vm_area_struct *vma;
 	struct task_struct *tsk;
 	struct mm_struct *mm;
 	unsigned long address;
@@ -485,7 +485,7 @@ handle_nadtlb_fault(struct pt_regs *regs
 				/* Search for VMA */
 				address = regs->ior;
 				mmap_read_lock(mm);
-				vma = find_vma_prev(mm, address, &prev_vma);
+				vma = vma_lookup(mm, address);
 				mmap_read_unlock(mm);
 
 				/*
@@ -494,7 +494,6 @@ handle_nadtlb_fault(struct pt_regs *regs
 				 */
 				acc_type = (insn & 0x40) ? VM_WRITE : VM_READ;
 				if (vma
-				    && address >= vma->vm_start
 				    && (vma->vm_flags & acc_type) == acc_type)
 					val = 1;
 			}
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -457,8 +457,9 @@ retry:
 	if (unlikely(vma->vm_start > address)) {
 		if (!(vma->vm_flags & VM_GROWSDOWN))
 			goto out_up;
-		if (expand_stack(vma, address))
-			goto out_up;
+		vma = expand_stack(mm, address);
+		if (!vma)
+			goto out;
 	}
 
 	/*
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -383,8 +383,9 @@ continue_fault:
 				goto bad_area;
 		}
 	}
-	if (expand_stack(vma, address))
-		goto bad_area;
+	vma = expand_stack(mm, address);
+	if (!vma)
+		goto bad_area_nosemaphore;
 	/*
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
@@ -487,8 +488,9 @@ exit_exception:
 	 * Fix it, but check if it's kernel or user first..
 	 */
 bad_area:
-	insn = get_fault_insn(regs, insn);
 	mmap_read_unlock(mm);
+bad_area_nosemaphore:
+	insn = get_fault_insn(regs, insn);
 
 handle_kernel_fault:
 	do_kernel_fault(regs, si_code, fault_code, insn, address);
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -47,14 +47,15 @@ retry:
 	vma = find_vma(mm, address);
 	if (!vma)
 		goto out;
-	else if (vma->vm_start <= address)
+	if (vma->vm_start <= address)
 		goto good_area;
-	else if (!(vma->vm_flags & VM_GROWSDOWN))
+	if (!(vma->vm_flags & VM_GROWSDOWN))
 		goto out;
-	else if (is_user && !ARCH_IS_STACKGROW(address))
-		goto out;
-	else if (expand_stack(vma, address))
+	if (is_user && !ARCH_IS_STACKGROW(address))
 		goto out;
+	vma = expand_stack(mm, address);
+	if (!vma)
+		goto out_nosemaphore;
 
 good_area:
 	*code_out = SEGV_ACCERR;
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -485,8 +485,8 @@ static void do_fault(struct work_struct
 	flags |= FAULT_FLAG_REMOTE;
 
 	mmap_read_lock(mm);
-	vma = find_extend_vma(mm, address);
-	if (!vma || address < vma->vm_start)
+	vma = vma_lookup(mm, address);
+	if (!vma)
 		/* failed to get a vma in the right range */
 		goto out;
 
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -175,7 +175,7 @@ iommu_sva_handle_iopf(struct iommu_fault
 
 	mmap_read_lock(mm);
 
-	vma = find_extend_vma(mm, prm->addr);
+	vma = vma_lookup(mm, prm->addr);
 	if (!vma)
 		/* Unmapped area */
 		goto out_put_mm;
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -322,7 +322,7 @@ create_elf_tables(struct linux_binprm *b
 	 */
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;
-	vma = find_extend_vma_locked(mm, bprm->p, true);
+	vma = find_extend_vma_locked(mm, bprm->p);
 	mmap_write_unlock(mm);
 	if (!vma)
 		return -EFAULT;
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -211,7 +211,7 @@ static struct page *get_arg_page(struct
 	 */
 	if (write && pos < vma->vm_start) {
 		mmap_write_lock(mm);
-		ret = expand_downwards(vma, pos, true);
+		ret = expand_downwards(vma, pos);
 		if (unlikely(ret < 0)) {
 			mmap_write_unlock(mm);
 			return NULL;
@@ -859,7 +859,7 @@ int setup_arg_pages(struct linux_binprm
 	stack_base = vma->vm_end - stack_expand;
 #endif
 	current->mm->start_stack = bprm->p;
-	ret = expand_stack_locked(vma, stack_base, true);
+	ret = expand_stack_locked(vma, stack_base);
 	if (ret)
 		ret = -EFAULT;
 
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3192,18 +3192,11 @@ extern vm_fault_t filemap_page_mkwrite(s
 
 extern unsigned long stack_guard_gap;
 /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
-int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
-		bool write_locked);
-#define expand_stack(vma,addr) expand_stack_locked(vma,addr,false)
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long address);
+struct vm_area_struct *expand_stack(struct mm_struct * mm, unsigned long addr);
 
 /* CONFIG_STACK_GROWSUP still needs to grow downwards at some places */
-int expand_downwards(struct vm_area_struct *vma, unsigned long address,
-		bool write_locked);
-#if VM_GROWSUP
-extern int expand_upwards(struct vm_area_struct *vma, unsigned long address);
-#else
-  #define expand_upwards(vma, address) (0)
-#endif
+int expand_downwards(struct vm_area_struct *vma, unsigned long address);
 
 /* Look up the first VMA which satisfies  addr < vm_end,  NULL if none. */
 extern struct vm_area_struct * find_vma(struct mm_struct * mm, unsigned long addr);
@@ -3298,9 +3291,8 @@ unsigned long change_prot_numa(struct vm
 			unsigned long start, unsigned long end);
 #endif
 
-struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
 struct vm_area_struct *find_extend_vma_locked(struct mm_struct *,
-		unsigned long addr, bool write_locked);
+		unsigned long addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
 int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1096,7 +1096,7 @@ static long __get_user_pages(struct mm_s
 
 		/* first iteration or cross vma bound */
 		if (!vma || start >= vma->vm_end) {
-			vma = find_extend_vma(mm, start);
+			vma = vma_lookup(mm, start);
 			if (!vma && in_gate_area(mm, start)) {
 				ret = get_gate_page(mm, start & PAGE_MASK,
 						gup_flags, &vma,
@@ -1265,8 +1265,8 @@ int fixup_user_fault(struct mm_struct *m
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
 retry:
-	vma = find_extend_vma(mm, address);
-	if (!vma || address < vma->vm_start)
+	vma = vma_lookup(mm, address);
+	if (!vma)
 		return -EFAULT;
 
 	if (!vma_permits_fault(vma, fault_flags))
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5368,7 +5368,7 @@ struct vm_area_struct *lock_mm_and_find_
 			goto fail;
 	}
 
-	if (expand_stack_locked(vma, addr, true))
+	if (expand_stack_locked(vma, addr))
 		goto fail;
 
 success:
@@ -5713,6 +5713,14 @@ int __access_remote_vm(struct mm_struct
 	if (mmap_read_lock_killable(mm))
 		return 0;
 
+	/* We might need to expand the stack to access it */
+	vma = vma_lookup(mm, addr);
+	if (!vma) {
+		vma = expand_stack(mm, addr);
+		if (!vma)
+			return 0;
+	}
+
 	/* ignore errors, just check how much was successfully transferred */
 	while (len) {
 		int bytes, ret, offset;
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1935,8 +1935,7 @@ static int acct_stack_growth(struct vm_a
  * PA-RISC uses this for its stack; IA64 for its Register Backing Store.
  * vma is the last one with address > vma->vm_end.  Have to extend vma.
  */
-int expand_upwards(struct vm_area_struct *vma, unsigned long address,
-		bool write_locked)
+static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *next;
@@ -1960,8 +1959,6 @@ int expand_upwards(struct vm_area_struct
 	if (gap_addr < address || gap_addr > TASK_SIZE)
 		gap_addr = TASK_SIZE;
 
-	if (!write_locked)
-		return -EAGAIN;
 	next = find_vma_intersection(mm, vma->vm_end, gap_addr);
 	if (next && vma_is_accessible(next)) {
 		if (!(next->vm_flags & VM_GROWSUP))
@@ -2030,15 +2027,18 @@ int expand_upwards(struct vm_area_struct
 
 /*
  * vma is the first one with address < vma->vm_start.  Have to extend vma.
+ * mmap_lock held for writing.
  */
-int expand_downwards(struct vm_area_struct *vma, unsigned long address,
-		bool write_locked)
+int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	MA_STATE(mas, &mm->mm_mt, vma->vm_start, vma->vm_start);
 	struct vm_area_struct *prev;
 	int error = 0;
 
+	if (!(vma->vm_flags & VM_GROWSDOWN))
+		return -EFAULT;
+
 	address &= PAGE_MASK;
 	if (address < mmap_min_addr || address < FIRST_USER_ADDRESS)
 		return -EPERM;
@@ -2051,8 +2051,6 @@ int expand_downwards(struct vm_area_stru
 		    vma_is_accessible(prev) &&
 		    (address - prev->vm_end < stack_guard_gap))
 			return -ENOMEM;
-		if (!write_locked && (prev->vm_end == address))
-			return -EAGAIN;
 	}
 
 	if (mas_preallocate(&mas, GFP_KERNEL))
@@ -2131,14 +2129,12 @@ static int __init cmdline_parse_stack_gu
 __setup("stack_guard_gap=", cmdline_parse_stack_guard_gap);
 
 #ifdef CONFIG_STACK_GROWSUP
-int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
-		bool write_locked)
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long address)
 {
-	return expand_upwards(vma, address, write_locked);
+	return expand_upwards(vma, address);
 }
 
-struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm,
-		unsigned long addr, bool write_locked)
+struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm, unsigned long addr)
 {
 	struct vm_area_struct *vma, *prev;
 
@@ -2148,23 +2144,21 @@ struct vm_area_struct *find_extend_vma_l
 		return vma;
 	if (!prev)
 		return NULL;
-	if (expand_stack_locked(prev, addr, write_locked))
+	if (expand_stack_locked(prev, addr))
 		return NULL;
 	if (prev->vm_flags & VM_LOCKED)
 		populate_vma_page_range(prev, addr, prev->vm_end, NULL);
 	return prev;
 }
 #else
-int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
-		bool write_locked)
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long address)
 {
 	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
 		return -EINVAL;
-	return expand_downwards(vma, address, write_locked);
+	return expand_downwards(vma, address);
 }
 
-struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm,
-		unsigned long addr, bool write_locked)
+struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm, unsigned long addr)
 {
 	struct vm_area_struct *vma;
 	unsigned long start;
@@ -2176,7 +2170,7 @@ struct vm_area_struct *find_extend_vma_l
 	if (vma->vm_start <= addr)
 		return vma;
 	start = vma->vm_start;
-	if (expand_stack_locked(vma, addr, write_locked))
+	if (expand_stack_locked(vma, addr))
 		return NULL;
 	if (vma->vm_flags & VM_LOCKED)
 		populate_vma_page_range(vma, addr, start, NULL);
@@ -2184,12 +2178,91 @@ struct vm_area_struct *find_extend_vma_l
 }
 #endif
 
-struct vm_area_struct *find_extend_vma(struct mm_struct *mm,
-		unsigned long addr)
+/*
+ * IA64 has some horrid mapping rules: it can expand both up and down,
+ * but with various special rules.
+ *
+ * We'll get rid of this architecture eventually, so the ugliness is
+ * temporary.
+ */
+#ifdef CONFIG_IA64
+static inline bool vma_expand_ok(struct vm_area_struct *vma, unsigned long addr)
+{
+	return REGION_NUMBER(addr) == REGION_NUMBER(vma->vm_start) &&
+		REGION_OFFSET(addr) < RGN_MAP_LIMIT;
+}
+
+/*
+ * IA64 stacks grow down, but there's a special register backing store
+ * that can grow up. Only sequentially, though, so the new address must
+ * match vm_end.
+ */
+static inline int vma_expand_up(struct vm_area_struct *vma, unsigned long addr)
+{
+	if (!vma_expand_ok(vma, addr))
+		return -EFAULT;
+	if (vma->vm_end != (addr & PAGE_MASK))
+		return -EFAULT;
+	return expand_upwards(vma, addr);
+}
+
+static inline bool vma_expand_down(struct vm_area_struct *vma, unsigned long addr)
+{
+	if (!vma_expand_ok(vma, addr))
+		return -EFAULT;
+	return expand_downwards(vma, addr);
+}
+
+#elif defined(CONFIG_STACK_GROWSUP)
+
+#define vma_expand_up(vma,addr) expand_upwards(vma, addr)
+#define vma_expand_down(vma, addr) (-EFAULT)
+
+#else
+
+#define vma_expand_up(vma,addr) (-EFAULT)
+#define vma_expand_down(vma, addr) expand_downwards(vma, addr)
+
+#endif
+
+/*
+ * expand_stack(): legacy interface for page faulting. Don't use unless
+ * you have to.
+ *
+ * This is called with the mm locked for reading, drops the lock, takes
+ * the lock for writing, tries to look up a vma again, expands it if
+ * necessary, and downgrades the lock to reading again.
+ *
+ * If no vma is found or it can't be expanded, it returns NULL and has
+ * dropped the lock.
+ */
+struct vm_area_struct *expand_stack(struct mm_struct *mm, unsigned long addr)
 {
-	return find_extend_vma_locked(mm, addr, false);
+	struct vm_area_struct *vma, *prev;
+
+	mmap_read_unlock(mm);
+	if (mmap_write_lock_killable(mm))
+		return NULL;
+
+	vma = find_vma_prev(mm, addr, &prev);
+	if (vma && vma->vm_start <= addr)
+		goto success;
+
+	if (prev && !vma_expand_up(prev, addr)) {
+		vma = prev;
+		goto success;
+	}
+
+	if (vma && !vma_expand_down(vma, addr))
+		goto success;
+
+	mmap_write_unlock(mm);
+	return NULL;
+
+success:
+	mmap_write_downgrade(mm);
+	return vma;
 }
-EXPORT_SYMBOL_GPL(find_extend_vma);
 
 /*
  * Ok - we have the memory areas we should free on a maple tree so release them,
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -631,24 +631,20 @@ struct vm_area_struct *find_vma(struct m
 EXPORT_SYMBOL(find_vma);
 
 /*
- * find a VMA
- * - we don't extend stack VMAs under NOMMU conditions
- */
-struct vm_area_struct *find_extend_vma(struct mm_struct *mm, unsigned long addr)
-{
-	return find_vma(mm, addr);
-}
-
-/*
  * expand a stack to a given address
  * - not supported under NOMMU conditions
  */
-int expand_stack_locked(struct vm_area_struct *vma, unsigned long address,
-		bool write_locked)
+int expand_stack_locked(struct vm_area_struct *vma, unsigned long addr)
 {
 	return -ENOMEM;
 }
 
+struct vm_area_struct *expand_stack(struct mm_struct *mm, unsigned long addr)
+{
+	mmap_read_unlock(mm);
+	return NULL;
+}
+
 /*
  * look up the first VMA exactly that exactly matches addr
  * - should be called with mm->mmap_lock at least held readlocked



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6.4 22/28] HID: wacom: Use ktime_t rather than int when dealing with timestamps
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 21/28] mm: always expand the stack with the mmap write lock held Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 23/28] gup: add warning if some caller would seem to want stack expansion Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jason Gerecke, Benjamin Tissoires,
	Benjamin Tissoires

From: Jason Gerecke <jason.gerecke@wacom.com>

commit 9a6c0e28e215535b2938c61ded54603b4e5814c5 upstream.

Code which interacts with timestamps needs to use the ktime_t type
returned by functions like ktime_get. The int type does not offer
enough space to store these values, and attempting to use it is a
recipe for problems. In this particular case, overflows would occur
when calculating/storing timestamps leading to incorrect values being
reported to userspace. In some cases these bad timestamps cause input
handling in userspace to appear hung.
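
For illustration only (this is not driver code): ktime_get() returns
nanoseconds in a 64-bit ktime_t, and a plain int wraps once the value
passes INT_MAX, i.e. after roughly 2.1 seconds. The div_u64() used in
the hunk below is needed because a plain '/' on a 64-bit value would
pull in libgcc's 64-bit division helpers on 32-bit kernels.

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		int64_t now_ns = 5000000000LL;	/* ~5s worth of ktime_get() nanoseconds */
		int truncated = (int)now_ns;	/* what an int timestamp field would keep */

		/* the int copy is wrapped and useless as a timestamp */
		printf("ktime_t: %lld ns, int: %d\n", (long long)now_ns, truncated);
		return 0;
	}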

Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/901
Fixes: 17d793f3ed53 ("HID: wacom: insert timestamp to packed Bluetooth (BT) events")
CC: stable@vger.kernel.org
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230608213828.2108-1-jason.gerecke@wacom.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/hid/wacom_wac.c |    6 +++---
 drivers/hid/wacom_wac.h |    2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/hid/wacom_wac.c
+++ b/drivers/hid/wacom_wac.c
@@ -1314,7 +1314,7 @@ static void wacom_intuos_pro2_bt_pen(str
 	struct input_dev *pen_input = wacom->pen_input;
 	unsigned char *data = wacom->data;
 	int number_of_valid_frames = 0;
-	int time_interval = 15000000;
+	ktime_t time_interval = 15000000;
 	ktime_t time_packet_received = ktime_get();
 	int i;
 
@@ -1348,7 +1348,7 @@ static void wacom_intuos_pro2_bt_pen(str
 	if (number_of_valid_frames) {
 		if (wacom->hid_data.time_delayed)
 			time_interval = ktime_get() - wacom->hid_data.time_delayed;
-		time_interval /= number_of_valid_frames;
+		time_interval = div_u64(time_interval, number_of_valid_frames);
 		wacom->hid_data.time_delayed = time_packet_received;
 	}
 
@@ -1359,7 +1359,7 @@ static void wacom_intuos_pro2_bt_pen(str
 		bool range = frame[0] & 0x20;
 		bool invert = frame[0] & 0x10;
 		int frames_number_reversed = number_of_valid_frames - i - 1;
-		int event_timestamp = time_packet_received - frames_number_reversed * time_interval;
+		ktime_t event_timestamp = time_packet_received - frames_number_reversed * time_interval;
 
 		if (!valid)
 			continue;
--- a/drivers/hid/wacom_wac.h
+++ b/drivers/hid/wacom_wac.h
@@ -324,7 +324,7 @@ struct hid_data {
 	int ps_connected;
 	bool pad_input_event_flag;
 	unsigned short sequence_number;
-	int time_delayed;
+	ktime_t time_delayed;
 };
 
 struct wacom_remote_data {



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6.4 23/28] gup: add warning if some caller would seem to want stack expansion
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 22/28] HID: wacom: Use ktime_t rather than int when dealing with timestamps Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 24/28] mm/khugepaged: fix regression in collapse_file() Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Linus Torvalds

From: Linus Torvalds <torvalds@linux-foundation.org>

commit a425ac5365f6cb3cc47bf83e6bff0213c10445f7 upstream.

It feels very unlikely that anybody would want to do a GUP in an
unmapped area under the stack pointer, but real users sometimes do some
really strange things.  So add a (temporary) warning for the case where
a GUP fails and expanding the stack might have made it work.

It's trivial to do the expansion in the caller as part of getting the mm
lock in the first place - see __access_remote_vm() for ptrace, for
example - it's just that it's unnecessarily painful to do it deep in the
guts of the GUP lookup when we might have to drop and re-take the lock.

I doubt anybody actually does anything quite this strange, but let's be
proactive: adding these warnings is simple, and will make debugging it
much easier if they trigger.
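
For a caller that really does want GUP to reach into a not-yet-expanded
stack area, the intended pattern is to expand it while taking the mmap
lock, along the lines of this simplified sketch (mirroring the
__access_remote_vm() example mentioned above; error handling elided):

	if (mmap_read_lock_killable(mm))
		return -EINTR;

	/* expand a GROWSDOWN stack below 'start' before the GUP walk */
	if (!vma_lookup(mm, start) && !expand_stack(mm, start))
		return -EFAULT;		/* expand_stack() already dropped the lock */

	/* ... do the get_user_pages walk under the read lock ... */
	mmap_read_unlock(mm);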

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/gup.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1096,7 +1096,11 @@ static long __get_user_pages(struct mm_s
 
 		/* first iteration or cross vma bound */
 		if (!vma || start >= vma->vm_end) {
-			vma = vma_lookup(mm, start);
+			vma = find_vma(mm, start);
+			if (vma && (start < vma->vm_start)) {
+				WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN);
+				vma = NULL;
+			}
 			if (!vma && in_gate_area(mm, start)) {
 				ret = get_gate_page(mm, start & PAGE_MASK,
 						gup_flags, &vma,
@@ -1265,9 +1269,13 @@ int fixup_user_fault(struct mm_struct *m
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
 retry:
-	vma = vma_lookup(mm, address);
+	vma = find_vma(mm, address);
 	if (!vma)
 		return -EFAULT;
+	if (address < vma->vm_start ) {
+		WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN);
+		return -EFAULT;
+	}
 
 	if (!vma_permits_fault(vma, fault_flags))
 		return -EFAULT;



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6.4 24/28] mm/khugepaged: fix regression in collapse_file()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 23/28] gup: add warning if some caller would seem to want stack expansion Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 25/28] fbdev: fix potential OOB read in fast_imageblit() Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Hugh Dickins, stable, Andrew Morton,
	Matthew Wilcox, David Stevens, Peter Xu, Linus Torvalds

From: Hugh Dickins <hughd@google.com>

commit e8c716bc6812202ccf4ce0f0bad3428b794fb39c upstream.

There is no xas_pause(&xas) in collapse_file()'s main loop, at the points
where it does xas_unlock_irq(&xas) and then continues.

That would explain why, once two weeks ago and twice yesterday, I have
hit the VM_BUG_ON_PAGE(page != xas_load(&xas), page) since "mm/khugepaged:
fix iteration in collapse_file" removed the xas_set(&xas, index) just
before it: xas.xa_node could be left pointing to a stale node, if there
was concurrent activity on the file which transformed its xarray.

I tried inserting xas_pause()s, but then even bootup crashed on that
VM_BUG_ON_PAGE(): there appears to be a subtle "nextness" implicit in
xas_pause().

xas_next() and xas_pause() are good for use in simple loops, but not in
this one: xas_set() worked well until now, so use xas_set(&xas, index)
explicitly at the head of the loop; and change that VM_BUG_ON_PAGE() not
to need its own xas_set(), and not to interfere with the xa_state (which
would probably stop the crashes from xas_pause(), but I trust that less).

The user-visible effects of this bug (if VM_BUG_ONs are configured out)
would be data loss and data leak - potentially - though in practice I
expect it is more likely that a subsequent check (e.g. on mapping or on
nr_none) would notice an inconsistency, and just abandon the collapse.
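
To make the loop change concrete, a minimal sketch of the two patterns
(not the collapse_file() code itself; the locking and all of the real
work are omitted):

	/* old: advance the cached cursor - assumes it stayed valid across
	 * any xas_unlock_irq()/xas_lock_irq() in the loop body */
	page = xas_next(&xas);

	/* new: re-position for this index, so the next load re-walks the
	 * tree from the root and never touches a stale xa_node */
	xas_set(&xas, index);
	page = xas_load(&xas);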

Link: https://lore.kernel.org/linux-mm/f18e4b64-3f88-a8ab-56cc-d1f5f9c58d4@google.com/
Fixes: c8a8f3b4a95a ("mm/khugepaged: fix iteration in collapse_file")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Stevens <stevensd@chromium.org>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/khugepaged.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1918,9 +1918,9 @@ static int collapse_file(struct mm_struc
 		}
 	} while (1);
 
-	xas_set(&xas, start);
 	for (index = start; index < end; index++) {
-		page = xas_next(&xas);
+		xas_set(&xas, index);
+		page = xas_load(&xas);
 
 		VM_BUG_ON(index != xas.xa_index);
 		if (is_shmem) {
@@ -1935,7 +1935,6 @@ static int collapse_file(struct mm_struc
 						result = SCAN_TRUNCATED;
 						goto xa_locked;
 					}
-					xas_set(&xas, index + 1);
 				}
 				if (!shmem_charge(mapping->host, 1)) {
 					result = SCAN_FAIL;
@@ -2071,7 +2070,7 @@ static int collapse_file(struct mm_struc
 
 		xas_lock_irq(&xas);
 
-		VM_BUG_ON_PAGE(page != xas_load(&xas), page);
+		VM_BUG_ON_PAGE(page != xa_load(xas.xa, index), page);
 
 		/*
 		 * We control three references to the page:



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6.4 25/28] fbdev: fix potential OOB read in fast_imageblit()
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 24/28] mm/khugepaged: fix regression in collapse_file() Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 26/28] HID: hidraw: fix data race on device refcount Greg Kroah-Hartman
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Zhang Shurong, Helge Deller

From: Zhang Shurong <zhang_shurong@foxmail.com>

commit c2d22806aecb24e2de55c30a06e5d6eb297d161d upstream.

There is a potential OOB read in fast_imageblit():
the index in "colortab[(*src >> 4)]" can become negative because
"const char *s = image->data, *src" makes the source bytes signed.
This change makes sure the index into colortab is always positive
or zero.
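
A standalone illustration of the issue (not driver code): on platforms
where plain char is signed, a data byte with the high bit set turns the
table index negative.

	#include <stdio.h>

	int main(void)
	{
		const char data[] = "\x80";	/* pixel byte with the high bit set */
		const char *src = data;
		const unsigned char *usrc = (const unsigned char *)data;

		printf("signed index:   %d\n", *src >> 4);	/* typically -8: OOB */
		printf("unsigned index: %d\n", *usrc >> 4);	/* 8: non-negative */
		return 0;
	}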

Similar commit:
https://patchwork.kernel.org/patch/11746067

Potential bug report:
https://groups.google.com/g/syzkaller-bugs/c/9ubBXKeKXf4/m/k-QXy4UgAAAJ

Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/video/fbdev/core/sysimgblt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -189,7 +189,7 @@ static void fast_imageblit(const struct
 	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
 	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
 	u32 bit_mask, eorx, shift;
-	const char *s = image->data, *src;
+	const u8 *s = image->data, *src;
 	u32 *dst;
 	const u32 *tab;
 	size_t tablen;



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6.4 26/28] HID: hidraw: fix data race on device refcount
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 25/28] fbdev: fix potential OOB read in fast_imageblit() Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 27/28] HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651 Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Ludvig Michaelsson,
	Benjamin Tissoires

From: Ludvig Michaelsson <ludvig.michaelsson@yubico.com>

commit 944ee77dc6ec7b0afd8ec70ffc418b238c92f12b upstream.

The hidraw_open() function increments the hidraw device reference
counter. The counter has no dedicated synchronization mechanism,
resulting in a potential data race when concurrently opening a device.

The race is a regression introduced by commit 8590222e4b02 ("HID:
hidraw: Replace hidraw device table mutex with a rwsem"). While
minors_rwsem is intended to protect the hidraw_table itself, by instead
acquiring the lock for writing, the reference counter is also protected.
This is symmetrical to hidraw_release().
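
The racy update is an ordinary non-atomic increment, so two concurrent
opens that both hold the rwsem for reading can load and store the same
value. A minimal sketch of the fixed pattern (field names abbreviated
from drivers/hid/hidraw.c, everything else stripped):

	down_write(&minors_rwsem);		/* was down_read() */
	dev = hidraw_table[minor];
	if (dev && dev->exist)
		dev->open++;			/* plain int refcount, now serialized */
	up_write(&minors_rwsem);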

Link: https://github.com/systemd/systemd/issues/27947
Fixes: 8590222e4b02 ("HID: hidraw: Replace hidraw device table mutex with a rwsem")
Cc: stable@vger.kernel.org
Signed-off-by: Ludvig Michaelsson <ludvig.michaelsson@yubico.com>
Link: https://lore.kernel.org/r/20230621-hidraw-race-v1-1-a58e6ac69bab@yubico.com
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/hid/hidraw.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/drivers/hid/hidraw.c
+++ b/drivers/hid/hidraw.c
@@ -272,7 +272,12 @@ static int hidraw_open(struct inode *ino
 		goto out;
 	}
 
-	down_read(&minors_rwsem);
+	/*
+	 * Technically not writing to the hidraw_table but a write lock is
+	 * required to protect the device refcount. This is symmetrical to
+	 * hidraw_release().
+	 */
+	down_write(&minors_rwsem);
 	if (!hidraw_table[minor] || !hidraw_table[minor]->exist) {
 		err = -ENODEV;
 		goto out_unlock;
@@ -301,7 +306,7 @@ static int hidraw_open(struct inode *ino
 	spin_unlock_irqrestore(&hidraw_table[minor]->list_lock, flags);
 	file->private_data = list;
 out_unlock:
-	up_read(&minors_rwsem);
+	up_write(&minors_rwsem);
 out:
 	if (err < 0)
 		kfree(list);



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6.4 27/28] HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 26/28] HID: hidraw: fix data race on device refcount Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-29 18:44 ` [PATCH 6.4 28/28] Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe" Greg Kroah-Hartman
  2023-06-30  5:30 ` [PATCH 6.4 00/28] 6.4.1-rc1 review Naresh Kamboju
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Mike Hommey, Benjamin Tissoires

From: Mike Hommey <mh@glandium.org>

commit 5fe251112646d8626818ea90f7af325bab243efa upstream.

commit 498ba2069035 ("HID: logitech-hidpp: Don't restart communication if
not necessary") put restarting communication behind the
HIDPP_QUIRK_DELAYED_INIT flag; that restart was apparently necessary on the
T651, but the flag was not set for it.

Fixes: 498ba2069035 ("HID: logitech-hidpp: Don't restart communication if not necessary")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Hommey <mh@glandium.org>
Link: https://lore.kernel.org/r/20230617230957.6mx73th4blv7owqk@glandium.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/hid/hid-logitech-hidpp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hid/hid-logitech-hidpp.c
+++ b/drivers/hid/hid-logitech-hidpp.c
@@ -4553,7 +4553,7 @@ static const struct hid_device_id hidpp_
 	{ /* wireless touchpad T651 */
 	  HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH,
 		USB_DEVICE_ID_LOGITECH_T651),
-	  .driver_data = HIDPP_QUIRK_CLASS_WTP },
+	  .driver_data = HIDPP_QUIRK_CLASS_WTP | HIDPP_QUIRK_DELAYED_INIT },
 	{ /* Mouse Logitech Anywhere MX */
 	  LDJ_DEVICE(0x1017), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 },
 	{ /* Mouse logitech M560 */



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 6.4 28/28] Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe"
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 27/28] HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651 Greg Kroah-Hartman
@ 2023-06-29 18:44 ` Greg Kroah-Hartman
  2023-06-30  5:30 ` [PATCH 6.4 00/28] 6.4.1-rc1 review Naresh Kamboju
  28 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-29 18:44 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Ricardo Cañuelo,
	AngeloGioacchino Del Regno, Daniel Lezcano

From: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

commit 86edac7d3888c715fe3a81bd61f3617ecfe2e1dd upstream.

This reverts commit f05c7b7d9ea9477fcc388476c6f4ade8c66d2d26.

That change was causing a regression in the generic-adc-thermal-probed
bootrr test as reported in the kernelci-results list [1].
A proper rework will take longer, so revert it for now.

[1] https://groups.io/g/kernelci-results/message/42660

Fixes: f05c7b7d9ea9 ("thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe")
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Suggested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230525121811.3360268-1-ricardo.canuelo@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/thermal/mediatek/auxadc_thermal.c |   14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

--- a/drivers/thermal/mediatek/auxadc_thermal.c
+++ b/drivers/thermal/mediatek/auxadc_thermal.c
@@ -1222,12 +1222,7 @@ static int mtk_thermal_probe(struct plat
 		return -ENODEV;
 	}
 
-	auxadc_base = devm_of_iomap(&pdev->dev, auxadc, 0, NULL);
-	if (IS_ERR(auxadc_base)) {
-		of_node_put(auxadc);
-		return PTR_ERR(auxadc_base);
-	}
-
+	auxadc_base = of_iomap(auxadc, 0);
 	auxadc_phys_base = of_get_phys_base(auxadc);
 
 	of_node_put(auxadc);
@@ -1243,12 +1238,7 @@ static int mtk_thermal_probe(struct plat
 		return -ENODEV;
 	}
 
-	apmixed_base = devm_of_iomap(&pdev->dev, apmixedsys, 0, NULL);
-	if (IS_ERR(apmixed_base)) {
-		of_node_put(apmixedsys);
-		return PTR_ERR(apmixed_base);
-	}
-
+	apmixed_base = of_iomap(apmixedsys, 0);
 	apmixed_phys_base = of_get_phys_base(apmixedsys);
 
 	of_node_put(apmixedsys);



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2023-06-29 18:44 ` [PATCH 6.4 28/28] Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe" Greg Kroah-Hartman
@ 2023-06-30  5:30 ` Naresh Kamboju
  2023-06-30  5:52   ` Greg Kroah-Hartman
                     ` (2 more replies)
  28 siblings, 3 replies; 65+ messages in thread
From: Naresh Kamboju @ 2023-06-30  5:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Fri, 30 Jun 2023 at 00:18, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.4.1 release.
> There are 28 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat, 01 Jul 2023 18:41:39 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.1-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.

Following build regression noticed on Linux stable-rc 6.4 and also noticed on
Linux mainline master.

Regressions found on Parisc and Sparc build failed:
 - build/gcc-11-defconfig

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

Parisc Build log:
=============
arch/parisc/mm/fault.c: In function 'do_page_fault':
arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in
this function)
  292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
      |                      ^~~~
arch/parisc/mm/fault.c:292:22: note: each undeclared identifier is
reported only once for each function it appears in


sparc Build log:
===========
<stdin>:1519:2: warning: #warning syscall clone3 not implemented [-Wcpp]
arch/sparc/mm/fault_32.c: In function 'force_user_fault':
arch/sparc/mm/fault_32.c:315:49: error: 'regs' undeclared (first use
in this function)
  315 |         vma = lock_mm_and_find_vma(mm, address, regs);
      |                                                 ^~~~
arch/sparc/mm/fault_32.c:315:49: note: each undeclared identifier is
reported only once for each function it appears in


Links:
 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959811/suite/build/test/gcc-11-defconfig/details/
 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959811/suite/build/test/gcc-11-defconfig/log

 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959890/suite/build/test/gcc-11-defconfig/details/
 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959890/suite/build/test/gcc-11-defconfig/log


Both build failures noticed on mainline and sparc build have been
fixed yesterday.
 - https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.4-8542-g82a2a5105589/testrun/17963192/suite/build/test/gcc-11-defconfig/history/


The following patch fixed it:
---
From 0b26eadbf200abf6c97c6d870286c73219cdac65 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 29 Jun 2023 20:41:24 -0700
Subject: sparc32: fix lock_mm_and_find_vma() conversion

The sparc32 conversion to lock_mm_and_find_vma() in commit a050ba1e7422
("mm/fault: convert remaining simple cases to lock_mm_and_find_vma()")
missed the fact that we didn't actually have a 'regs' pointer available
in the 'force_user_fault()' case.

It's there in the regular page fault path ("do_sparc_fault()"), but not
the window underflow/overflow paths.

Which is all fine - we can just pass in a NULL pointer.  The register
state is only used to avoid deadlock with kernel faults, which is not
the case for any of these register window faults.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: a050ba1e7422 ("mm/fault: convert remaining simple cases to
lock_mm_and_find_vma()")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
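
With that, the force_user_fault() conversion simply passes a NULL regs
pointer, i.e. the call at arch/sparc/mm/fault_32.c:315 becomes:

	vma = lock_mm_and_find_vma(mm, address, NULL);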

--
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  5:30 ` [PATCH 6.4 00/28] 6.4.1-rc1 review Naresh Kamboju
@ 2023-06-30  5:52   ` Greg Kroah-Hartman
  2023-06-30  6:16   ` Linus Torvalds
  2023-06-30  6:28   ` Greg Kroah-Hartman
  2 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-30  5:52 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Fri, Jun 30, 2023 at 11:00:51AM +0530, Naresh Kamboju wrote:
> On Fri, 30 Jun 2023 at 00:18, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > This is the start of the stable review cycle for the 6.4.1 release.
> > There are 28 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat, 01 Jul 2023 18:41:39 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.1-rc1.gz
> > or in the git tree and branch at:
> >         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> 
> Results from Linaro’s test farm.
> 
> Following build regression noticed on Linux stable-rc 6.4 and also noticed on
> Linux mainline master.
> 
> Regressions found on Parisc and Sparc build failed:
>  - build/gcc-11-defconfig
> 
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> 
> Parisc Build log:
> =============
> arch/parisc/mm/fault.c: In function 'do_page_fault':
> arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in
> this function)
>   292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
>       |                      ^~~~
> arch/parisc/mm/fault.c:292:22: note: each undeclared identifier is
> reported only once for each function it appears in
> 
> 
> sparc Build log:
> ===========
> <stdin>:1519:2: warning: #warning syscall clone3 not implemented [-Wcpp]
> arch/sparc/mm/fault_32.c: In function 'force_user_fault':
> arch/sparc/mm/fault_32.c:315:49: error: 'regs' undeclared (first use
> in this function)
>   315 |         vma = lock_mm_and_find_vma(mm, address, regs);
>       |                                                 ^~~~
> arch/sparc/mm/fault_32.c:315:49: note: each undeclared identifier is
> reported only once for each function it appears in
> 
> 
> Links:
>  - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959811/suite/build/test/gcc-11-defconfig/details/
>  - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959811/suite/build/test/gcc-11-defconfig/log
> 
>  - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959890/suite/build/test/gcc-11-defconfig/details/
>  - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4-29-g8e5ddb853f08/testrun/17959890/suite/build/test/gcc-11-defconfig/log
> 
> 
> Both build failures noticed on mainline and sparc build have been
> fixed yesterday.
>  - https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.4-8542-g82a2a5105589/testrun/17963192/suite/build/test/gcc-11-defconfig/history/
> 
> 
> The following patch fixed it:
> ---
> >From 0b26eadbf200abf6c97c6d870286c73219cdac65 Mon Sep 17 00:00:00 2001
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Thu, 29 Jun 2023 20:41:24 -0700
> Subject: sparc32: fix lock_mm_and_find_vma() conversion
> 
> The sparc32 conversion to lock_mm_and_find_vma() in commit a050ba1e7422
> ("mm/fault: convert remaining simple cases to lock_mm_and_find_vma()")
> missed the fact that we didn't actually have a 'regs' pointer available
> in the 'force_user_fault()' case.
> 
> It's there in the regular page fault path ("do_sparc_fault()"), but not
> the window underflow/overflow paths.
> 
> Which is all fine - we can just pass in a NULL pointer.  The register
> state is only used to avoid deadlock with kernel faults, which is not
> the case for any of these register window faults.
> 
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Fixes: a050ba1e7422 ("mm/fault: convert remaining simple cases to
> lock_mm_and_find_vma()")
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Thanks!  That saves me having to dig.  I'll go push out updates with
this in them...

greg k-h

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  5:30 ` [PATCH 6.4 00/28] 6.4.1-rc1 review Naresh Kamboju
  2023-06-30  5:52   ` Greg Kroah-Hartman
@ 2023-06-30  6:16   ` Linus Torvalds
  2023-06-30  6:29     ` Greg Kroah-Hartman
  2023-06-30  6:29     ` [PATCH 6.4 00/28] 6.4.1-rc1 review Guenter Roeck
  2023-06-30  6:28   ` Greg Kroah-Hartman
  2 siblings, 2 replies; 65+ messages in thread
From: Linus Torvalds @ 2023-06-30  6:16 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Greg Kroah-Hartman, stable, patches, linux-kernel, akpm, linux,
	shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Thu, 29 Jun 2023 at 22:31, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> arch/parisc/mm/fault.c: In function 'do_page_fault':
> arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in this function)
>   292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))

Bah. "prev" should be "prev_vma" here.

I've pushed out the fix. Greg, apologies. It's

   ea3f8272876f parisc: fix expand_stack() conversion

and Naresh already pointed to the similarly silly sparc32 fix.

             Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  5:30 ` [PATCH 6.4 00/28] 6.4.1-rc1 review Naresh Kamboju
  2023-06-30  5:52   ` Greg Kroah-Hartman
  2023-06-30  6:16   ` Linus Torvalds
@ 2023-06-30  6:28   ` Greg Kroah-Hartman
  2 siblings, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-30  6:28 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Fri, Jun 30, 2023 at 11:00:51AM +0530, Naresh Kamboju wrote:
> On Fri, 30 Jun 2023 at 00:18, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > This is the start of the stable review cycle for the 6.4.1 release.
> > There are 28 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat, 01 Jul 2023 18:41:39 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.1-rc1.gz
> > or in the git tree and branch at:
> >         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> 
> Results from Linaro’s test farm.
> 
> Following build regression noticed on Linux stable-rc 6.4 and also noticed on
> Linux mainline master.
> 
> Regressions found on Parisc and Sparc build failed:
>  - build/gcc-11-defconfig
> 
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> 
> Parisc Build log:
> =============
> arch/parisc/mm/fault.c: In function 'do_page_fault':
> arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in
> this function)
>   292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
>       |                      ^~~~
> arch/parisc/mm/fault.c:292:22: note: each undeclared identifier is
> reported only once for each function it appears in

This is now fixed in Linus's tree with ea3f8272876f ("parisc: fix
expand_stack() conversion"), so I'll queue it up and push out
yet-another-rc...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:16   ` Linus Torvalds
@ 2023-06-30  6:29     ` Greg Kroah-Hartman
  2023-06-30  6:56       ` Helge Deller
  2023-06-30  6:29     ` [PATCH 6.4 00/28] 6.4.1-rc1 review Guenter Roeck
  1 sibling, 1 reply; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-30  6:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Naresh Kamboju, stable, patches, linux-kernel, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Thu, Jun 29, 2023 at 11:16:21PM -0700, Linus Torvalds wrote:
> On Thu, 29 Jun 2023 at 22:31, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> >
> > arch/parisc/mm/fault.c: In function 'do_page_fault':
> > arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in this function)
> >   292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
> 
> Bah. "prev" should be "prev_vma" here.
> 
> I've pushed out the fix. Greg, apologies. It's
> 
>    ea3f8272876f parisc: fix expand_stack() conversion
> 
> and Naresh already pointed to the similarly silly sparc32 fix.

Ah, I saw it hit your repo before your email here, sorry about that.
Now picked up.

greg k-h

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:16   ` Linus Torvalds
  2023-06-30  6:29     ` Greg Kroah-Hartman
@ 2023-06-30  6:29     ` Guenter Roeck
  2023-06-30  6:33       ` Guenter Roeck
  2023-06-30  6:33       ` Linus Torvalds
  1 sibling, 2 replies; 65+ messages in thread
From: Guenter Roeck @ 2023-06-30  6:29 UTC (permalink / raw)
  To: Linus Torvalds, Naresh Kamboju
  Cc: Greg Kroah-Hartman, stable, patches, linux-kernel, akpm, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On 6/29/23 23:16, Linus Torvalds wrote:
> On Thu, 29 Jun 2023 at 22:31, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>>
>> arch/parisc/mm/fault.c: In function 'do_page_fault':
>> arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in this function)
>>    292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
> 
> Bah. "prev" should be "prev_vma" here.
> 
> I've pushed out the fix. Greg, apologies. It's
> 
>     ea3f8272876f parisc: fix expand_stack() conversion
> 
> and Naresh already pointed to the similarly silly sparc32 fix.
> 
>               Linus

Did you see that one (in mainline) ?

Building csky:defconfig ... failed
--------------
Error log:
arch/csky/mm/fault.c: In function 'do_page_fault':
arch/csky/mm/fault.c:240:40: error: 'address' undeclared (first use in this function); did you mean 'addr'?
   240 |         vma = lock_mm_and_find_vma(mm, address, regs);

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:29     ` [PATCH 6.4 00/28] 6.4.1-rc1 review Guenter Roeck
@ 2023-06-30  6:33       ` Guenter Roeck
  2023-06-30  6:33       ` Linus Torvalds
  1 sibling, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2023-06-30  6:33 UTC (permalink / raw)
  To: Linus Torvalds, Naresh Kamboju
  Cc: Greg Kroah-Hartman, stable, patches, linux-kernel, akpm, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On 6/29/23 23:29, Guenter Roeck wrote:
> On 6/29/23 23:16, Linus Torvalds wrote:
>> On Thu, 29 Jun 2023 at 22:31, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>>>
>>> arch/parisc/mm/fault.c: In function 'do_page_fault':
>>> arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in this function)
>>>    292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
>>
>> Bah. "prev" should be "prev_vma" here.
>>
>> I've pushed out the fix. Greg, apologies. It's
>>
>>     ea3f8272876f parisc: fix expand_stack() conversion
>>
>> and Naresh already pointed to the similarly silly sparc32 fix.
>>
>>               Linus
> 
> Did you see that one (in mainline) ?
> 
> Building csky:defconfig ... failed
> --------------
> Error log:
> arch/csky/mm/fault.c: In function 'do_page_fault':
> arch/csky/mm/fault.c:240:40: error: 'address' undeclared (first use in this function); did you mean 'addr'?
>    240 |         vma = lock_mm_and_find_vma(mm, address, regs);
> 

This is also in {6.1,6.3,6.4}-rc unless I am missing something.

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:29     ` [PATCH 6.4 00/28] 6.4.1-rc1 review Guenter Roeck
  2023-06-30  6:33       ` Guenter Roeck
@ 2023-06-30  6:33       ` Linus Torvalds
  2023-06-30  6:47         ` Linus Torvalds
  2023-06-30  6:51         ` Greg Kroah-Hartman
  1 sibling, 2 replies; 65+ messages in thread
From: Linus Torvalds @ 2023-06-30  6:33 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Naresh Kamboju, Greg Kroah-Hartman, stable, patches, linux-kernel,
	akpm, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Thu, 29 Jun 2023 at 23:29, Guenter Roeck <linux@roeck-us.net> wrote:
>
> Did you see that one (in mainline) ?
>
> Building csky:defconfig ... failed

Nope. Thanks. Obvious fix: 'address' is called 'addr' here.
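
I.e. that line just becomes

	vma = lock_mm_and_find_vma(mm, addr, regs);

to match the local variable name in arch/csky/mm/fault.c.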

I knew we had all these tiny little mazes that looked the same but
were just _subtly_ different, but I still ended up doing too much
cut-and-paste.

And I only ended up cross-compiling the fairly small set that I had
existing cross-build environments for. Which was less than half our
~24 different architectures.

Oh well.  We'll get them all. Eventually. Let me go fix up that csky case.

              Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:33       ` Linus Torvalds
@ 2023-06-30  6:47         ` Linus Torvalds
  2023-06-30 22:51           ` Guenter Roeck
  2023-06-30  6:51         ` Greg Kroah-Hartman
  1 sibling, 1 reply; 65+ messages in thread
From: Linus Torvalds @ 2023-06-30  6:47 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Naresh Kamboju, Greg Kroah-Hartman, stable, patches, linux-kernel,
	akpm, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Thu, 29 Jun 2023 at 23:33, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Oh well.  We'll get them all. Eventually. Let me go fix up that csky case.

It's commit e55e5df193d2 ("csky: fix up lock_mm_and_find_vma() conversion").

Let's hope all the problems are these kinds of silly - but obvious -
naming differences between different architectures.

Because as long as they cause build errors, they may be embarrassing,
but easy to find and notice.

I may not have cared enough about some of these architectures, and it
shows. sparc32. parisc. csky...

             Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:33       ` Linus Torvalds
  2023-06-30  6:47         ` Linus Torvalds
@ 2023-06-30  6:51         ` Greg Kroah-Hartman
  1 sibling, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-06-30  6:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Naresh Kamboju, stable, patches, linux-kernel,
	akpm, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On Thu, Jun 29, 2023 at 11:33:45PM -0700, Linus Torvalds wrote:
> On Thu, 29 Jun 2023 at 23:29, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Did you see that one (in mainline) ?
> >
> > Building csky:defconfig ... failed
> 
> Nope. Thanks. Obvious fix: 'address' is called 'addr' here.
> 
> I knew we had all these tiny little mazes that looked the same but
> were just _subtly_ different, but I still ended up doing too much
> cut-and-paste.
> 
> And I only ended up cross-compiling the fairly small set that I had
> existing cross-build environments for. Which was less than half our
> ~24 different architectures.
> 
> Oh well.  We'll get them all. Eventually. Let me go fix up that csky case.

Thanks, I've picked that up now as well.

greg k-h

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:29     ` Greg Kroah-Hartman
@ 2023-06-30  6:56       ` Helge Deller
  2023-07-02 21:33         ` [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long Helge Deller
  0 siblings, 1 reply; 65+ messages in thread
From: Helge Deller @ 2023-06-30  6:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Linus Torvalds
  Cc: Naresh Kamboju, stable, patches, linux-kernel, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Jason Wang

On 6/30/23 08:29, Greg Kroah-Hartman wrote:
> On Thu, Jun 29, 2023 at 11:16:21PM -0700, Linus Torvalds wrote:
>> On Thu, 29 Jun 2023 at 22:31, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>>>
>>> arch/parisc/mm/fault.c: In function 'do_page_fault':
>>> arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in this function)
>>>    292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
>>
>> Bah. "prev" should be "prev_vma" here.
>>
>> I've pushed out the fix. Greg, apologies. It's
>>
>>     ea3f8272876f parisc: fix expand_stack() conversion
>>
>> and Naresh already pointed to the similarly silly sparc32 fix.
>
> Ah, I saw it hit your repo before your email here, sorry about that.
> Now picked up.

I've just cherry-picked ea3f8272876f on top of -rc2, built and run-tested it,
and everything is OK on parisc.

Thanks!
Helge

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30  6:47         ` Linus Torvalds
@ 2023-06-30 22:51           ` Guenter Roeck
  2023-07-01  1:24             ` Linus Torvalds
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2023-06-30 22:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Naresh Kamboju, Greg Kroah-Hartman, stable, patches, linux-kernel,
	akpm, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

On 6/29/23 23:47, Linus Torvalds wrote:
> On Thu, 29 Jun 2023 at 23:33, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Oh well.  We'll get them all. Eventually. Let me go fix up that csky case.
> 
> It's commit e55e5df193d2 ("csky: fix up lock_mm_and_find_vma() conversion").
> 
> Let's hope all the problems are these kinds of silly - but obvious -
> naming differences between different architectures.
> 
> Because as long as they cause build errors, they may be embarrassing,
> but easy to find and notice.
> 
> I may not have cared enough about some of these architectures, and it
> shows. sparc32. parisc. csky...
> 

There is one more, unfortunately.

Building xtensa:de212:kc705-nommu:nommu_kc705_defconfig ... failed
------------
Error log:
arch/xtensa/mm/fault.c: In function ‘do_page_fault’:
arch/xtensa/mm/fault.c:133:8: error: implicit declaration of function ‘lock_mm_and_find_vma’

This affects all stable release candidates as well as mainline.
mmu builds are fine, and indeed lock_mm_and_find_vma() is only declared
for CONFIG_MMU. I don't know if this needs a dummy or some other fix
for the nommu case.

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-06-30 22:51           ` Guenter Roeck
@ 2023-07-01  1:24             ` Linus Torvalds
  2023-07-01  2:49               ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Torvalds @ 2023-07-01  1:24 UTC (permalink / raw)
  To: Guenter Roeck, Chris Zankel, Max Filippov
  Cc: Naresh Kamboju, Greg Kroah-Hartman, stable, patches, linux-kernel,
	akpm, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-parisc, sparclinux,
	Stephen Rothwell, Helge Deller, Jason Wang

[-- Attachment #1: Type: text/plain, Size: 1387 bytes --]

On Fri, 30 Jun 2023 at 15:51, Guenter Roeck <linux@roeck-us.net> wrote:
>
> There is one more, unfortunately.
>
> Building xtensa:de212:kc705-nommu:nommu_kc705_defconfig ... failed

Heh. I didn't even realize that anybody would ever do
lock_mm_and_find_vma() code on a nommu platform.

With nommu, handle_mm_fault() will just BUG(), so it's kind of
pointless to do any of this at all, and I didn't expect anybody to
have this page faulting path that just causes that BUG() for any
faults.

But it turns out xtensa has a notion of protection faults even for
NOMMU configs:

    config PFAULT
        bool "Handle protection faults" if EXPERT && !MMU
        default y
        help
          Handle protection faults. MMU configurations must enable it.
          noMMU configurations may disable it if used memory map never
          generates protection faults or faults are always fatal.

          If unsure, say Y.

which is why it violated my expectations so badly.

I'm not sure if that protection fault handling really ever gets quite
this far (it certainly should *not* make it to the BUG() in
handle_mm_fault()), but I think the attached patch is likely the right
thing to do.

Can you check if it fixes that xtensa case? It looks
ObviouslyCorrect(tm) to me, but considering that I clearly missed this
case existing AT ALL, it might be best to double-check.

               Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 1876 bytes --]

 include/linux/mm.h |  5 +++--
 mm/nommu.c         | 11 +++++++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 39aa409e84d5..4f2c33c273eb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2323,6 +2323,9 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to);
 void truncate_pagecache_range(struct inode *inode, loff_t offset, loff_t end);
 int generic_error_remove_page(struct address_space *mapping, struct page *page);
 
+struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
+		unsigned long address, struct pt_regs *regs);
+
 #ifdef CONFIG_MMU
 extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
 				  unsigned long address, unsigned int flags,
@@ -2334,8 +2337,6 @@ void unmap_mapping_pages(struct address_space *mapping,
 		pgoff_t start, pgoff_t nr, bool even_cows);
 void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows);
-struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
-		unsigned long address, struct pt_regs *regs);
 #else
 static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
 					 unsigned long address, unsigned int flags,
diff --git a/mm/nommu.c b/mm/nommu.c
index 37d0b03143f1..fdc392735ec6 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -630,6 +630,17 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 }
 EXPORT_SYMBOL(find_vma);
 
+/*
+ * At least xtensa ends up having protection faults even with no
+ * MMU.. No stack expansion, at least.
+ */
+struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
+			unsigned long addr, struct pt_regs *regs)
+{
+	mmap_read_lock(mm);
+	return vma_lookup(mm, addr);
+}
+
 /*
  * expand a stack to a given address
  * - not supported under NOMMU conditions

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-07-01  1:24             ` Linus Torvalds
@ 2023-07-01  2:49               ` Guenter Roeck
  2023-07-01  4:22                 ` Linus Torvalds
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2023-07-01  2:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chris Zankel, Max Filippov, Naresh Kamboju, Greg Kroah-Hartman,
	stable, patches, linux-kernel, akpm, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
	conor, linux-parisc, sparclinux, Stephen Rothwell, Helge Deller,
	Jason Wang

On Fri, Jun 30, 2023 at 06:24:49PM -0700, Linus Torvalds wrote:
> On Fri, 30 Jun 2023 at 15:51, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > There is one more, unfortunately.
> >
> > Building xtensa:de212:kc705-nommu:nommu_kc705_defconfig ... failed
> 
> Heh. I didn't even realize that anybody would ever do
> lock_mm_and_find_vma() code on a nommu platform.
> 
> With nommu, handle_mm_fault() will just BUG(), so it's kind of
> pointless to do any of this at all, and I didn't expect anybody to
> have this page faulting path that just causes that BUG() for any
> faults.
> 
> But it turns out xtensa has a notion of protection faults even for
> NOMMU configs:
> 
>     config PFAULT
>         bool "Handle protection faults" if EXPERT && !MMU
>         default y
>         help
>           Handle protection faults. MMU configurations must enable it.
>           noMMU configurations may disable it if used memory map never
>           generates protection faults or faults are always fatal.
> 
>           If unsure, say Y.
> 
> which is why it violated my expectations so badly.
> 
> I'm not sure if that protection fault handling really ever gets quite
> this far (it certainly should *not* make it to the BUG() in
> handle_mm_fault()), but I think the attached patch is likely the right
> thing to do.
> 
> Can you check if it fixes that xtensa case? It looks
> ObviouslyCorrect(tm) to me, but considering that I clearly missed this
> case existing AT ALL, it might be best to double-check.
> 
>                Linus

Yes, the patch below fixes the problem.

Building xtensa:de212:kc705-nommu:nommu_kc705_defconfig ... running ......... passed

Thanks,
Guenter

>  include/linux/mm.h |  5 +++--
>  mm/nommu.c         | 11 +++++++++++
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 39aa409e84d5..4f2c33c273eb 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2323,6 +2323,9 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to);
>  void truncate_pagecache_range(struct inode *inode, loff_t offset, loff_t end);
>  int generic_error_remove_page(struct address_space *mapping, struct page *page);
>  
> +struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
> +		unsigned long address, struct pt_regs *regs);
> +
>  #ifdef CONFIG_MMU
>  extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
>  				  unsigned long address, unsigned int flags,
> @@ -2334,8 +2337,6 @@ void unmap_mapping_pages(struct address_space *mapping,
>  		pgoff_t start, pgoff_t nr, bool even_cows);
>  void unmap_mapping_range(struct address_space *mapping,
>  		loff_t const holebegin, loff_t const holelen, int even_cows);
> -struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
> -		unsigned long address, struct pt_regs *regs);
>  #else
>  static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
>  					 unsigned long address, unsigned int flags,
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 37d0b03143f1..fdc392735ec6 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -630,6 +630,17 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>  }
>  EXPORT_SYMBOL(find_vma);
>  
> +/*
> + * At least xtensa ends up having protection faults even with no
> + * MMU.. No stack expansion, at least.
> + */
> +struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
> +			unsigned long addr, struct pt_regs *regs)
> +{
> +	mmap_read_lock(mm);
> +	return vma_lookup(mm, addr);
> +}
> +
>  /*
>   * expand a stack to a given address
>   * - not supported under NOMMU conditions


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-07-01  2:49               ` Guenter Roeck
@ 2023-07-01  4:22                 ` Linus Torvalds
  2023-07-01  9:57                   ` Greg Kroah-Hartman
  2023-07-01 10:32                   ` Max Filippov
  0 siblings, 2 replies; 65+ messages in thread
From: Linus Torvalds @ 2023-07-01  4:22 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Chris Zankel, Max Filippov, Naresh Kamboju, Greg Kroah-Hartman,
	stable, patches, linux-kernel, akpm, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
	conor, linux-parisc, sparclinux, Stephen Rothwell, Helge Deller,
	Jason Wang

On Fri, 30 Jun 2023 at 19:50, Guenter Roeck <linux@roeck-us.net> wrote:
>
> Yes, the patch below fixes the problem.
>
> Building xtensa:de212:kc705-nommu:nommu_kc705_defconfig ... running ......... passed

Thanks. Committed as

  d85a143b69ab ("xtensa: fix NOMMU build with lock_mm_and_find_vma()
conversion")

and pushed out.

                    Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-07-01  4:22                 ` Linus Torvalds
@ 2023-07-01  9:57                   ` Greg Kroah-Hartman
  2023-07-01 10:32                   ` Max Filippov
  1 sibling, 0 replies; 65+ messages in thread
From: Greg Kroah-Hartman @ 2023-07-01  9:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Chris Zankel, Max Filippov, Naresh Kamboju, stable,
	patches, linux-kernel, akpm, shuah, patches, lkft-triage, pavel,
	jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow, conor,
	linux-parisc, sparclinux, Stephen Rothwell, Helge Deller,
	Jason Wang

On Fri, Jun 30, 2023 at 09:22:45PM -0700, Linus Torvalds wrote:
> On Fri, 30 Jun 2023 at 19:50, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Yes, the patch below fixes the problem.
> >
> > Building xtensa:de212:kc705-nommu:nommu_kc705_defconfig ... running ......... passed
> 
> Thanks. Committed as
> 
>   d85a143b69ab ("xtensa: fix NOMMU build with lock_mm_and_find_vma()
> conversion")
> 
> and pushed out.

Thanks, now queued up.

greg k-h

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-07-01  4:22                 ` Linus Torvalds
  2023-07-01  9:57                   ` Greg Kroah-Hartman
@ 2023-07-01 10:32                   ` Max Filippov
  2023-07-01 15:01                     ` Linus Torvalds
  1 sibling, 1 reply; 65+ messages in thread
From: Max Filippov @ 2023-07-01 10:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Chris Zankel, Naresh Kamboju, Greg Kroah-Hartman,
	stable, patches, linux-kernel, akpm, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
	conor, linux-parisc, sparclinux, Stephen Rothwell, Helge Deller,
	Jason Wang

Hi Linus,

On Fri, Jun 30, 2023 at 9:23 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, 30 Jun 2023 at 19:50, Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Yes, the patch below fixes the problem.
> >
> > Building xtensa:de212:kc705-nommu:nommu_kc705_defconfig ... running ......... passed
>
> Thanks. Committed as
>
>   d85a143b69ab ("xtensa: fix NOMMU build with lock_mm_and_find_vma()
> conversion")
>
> and pushed out.

Thanks for the build fix. Unfortunately, despite being obviously correct,
it doesn't release the mm lock in case the VMA is not found, so it results
in a runtime hang. I've posted a fix for that.
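
The fix boils down to dropping the mmap lock again when the lookup fails;
a minimal sketch of that shape (the exact patch may differ in detail):

	struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
				unsigned long addr, struct pt_regs *regs)
	{
		struct vm_area_struct *vma;

		mmap_read_lock(mm);
		vma = vma_lookup(mm, addr);
		/* callers expect the lock to be dropped on failure */
		if (!vma)
			mmap_read_unlock(mm);
		return vma;
	}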

-- 
Thanks.
-- Max

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review
  2023-07-01 10:32                   ` Max Filippov
@ 2023-07-01 15:01                     ` Linus Torvalds
  0 siblings, 0 replies; 65+ messages in thread
From: Linus Torvalds @ 2023-07-01 15:01 UTC (permalink / raw)
  To: Max Filippov
  Cc: Guenter Roeck, Chris Zankel, Naresh Kamboju, Greg Kroah-Hartman,
	stable, patches, linux-kernel, akpm, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
	conor, linux-parisc, sparclinux, Stephen Rothwell, Helge Deller,
	Jason Wang

On Sat, 1 Jul 2023 at 03:32, Max Filippov <jcmvbkbc@gmail.com> wrote:
>
> Thanks for the build fix. Unfortunately, despite being obviously correct,
> it doesn't release the mm lock in case the VMA is not found, so it results
> in a runtime hang. I've posted a fix for that.

Heh. I woke up this morning to that feeling of "Duh!" about this, and
find you already had fixed it. Patch applied.

            Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-06-30  6:56       ` Helge Deller
@ 2023-07-02 21:33         ` Helge Deller
  2023-07-02 22:45           ` Linus Torvalds
  0 siblings, 1 reply; 65+ messages in thread
From: Helge Deller @ 2023-07-02 21:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: stable, linux-kernel, akpm, linux, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

Hi Linus,

On 6/30/23 08:56, Helge Deller wrote:
> On 6/30/23 08:29, Greg Kroah-Hartman wrote:
>> On Thu, Jun 29, 2023 at 11:16:21PM -0700, Linus Torvalds wrote:
>>> On Thu, 29 Jun 2023 at 22:31, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>>>>
>>>> arch/parisc/mm/fault.c: In function 'do_page_fault':
>>>> arch/parisc/mm/fault.c:292:22: error: 'prev' undeclared (first use in this function)
>>>>    292 |                 if (!prev || !(prev->vm_flags & VM_GROWSUP))
>>>
>>> Bah. "prev" should be "prev_vma" here.
>>>
>>> I've pushed out the fix. Greg, apologies. It's
>>>
>>>     ea3f8272876f parisc: fix expand_stack() conversion
>>>
>>> and Naresh already pointed to the similarly silly sparc32 fix.
>>
>> Ah, I saw it hit your repo before your email here, sorry about that.
>> Now picked up.
>
> I've just cherry-picked ea3f8272876f on top of -rc2, built and run-tested it,
> and everything is OK on parisc.

Actually, your changes seem to trigger...:

root@debian:~# /usr/bin/ls /usr/bin/*
-bash: /usr/bin/ls: Argument list too long

or with a long gcc argument list:
gcc: fatal error: cannot execute '/usr/lib/gcc/hppa-linux-gnu/12/cc1': execv: Argument list too long

I'm trying to understand what's missing, but maybe you have some idea?

Helge

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-02 21:33         ` [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long Helge Deller
@ 2023-07-02 22:45           ` Linus Torvalds
  2023-07-02 23:30             ` Linus Torvalds
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Torvalds @ 2023-07-02 22:45 UTC (permalink / raw)
  To: Helge Deller
  Cc: stable, linux-kernel, akpm, linux, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

On Sun, 2 Jul 2023 at 14:33, Helge Deller <deller@gmx.de> wrote:
>
> Actually, your changes seem to trigger...:
>
> root@debian:~# /usr/bin/ls /usr/bin/*
> -bash: /usr/bin/ls: Argument list too long

So this only happens with _fairly_ long argument lists, right? Maybe
your config has a 64kB page size, and normal programs never expand
beyond a single page?

I bet it is because of f313c51d26aa ("execve: expand new process stack
manually ahead of time"), but I don't see exactly why.

But pa-risc is the only architecture with CONFIG_STACK_GROWSUP, and
while I really thought that commit should do the exact same thing as
the old

  #ifdef CONFIG_STACK_GROWSUP

special case, I must clearly have been wrong.
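
For reference, that old special case sat in get_arg_page() and looked
roughly like this (paraphrased; the exact pre-change code may differ
slightly):

	#ifdef CONFIG_STACK_GROWSUP
		if (write) {
			/* during execve even a grows-up stack is filled
			 * downwards, one argument page at a time */
			ret = expand_downwards(bprm->vma, pos);
			if (ret < 0)
				return NULL;
		}
	#endif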

Would you mind just verifying that yes, that commit on mainline is
broken for you, and the previous one works?

               Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-02 22:45           ` Linus Torvalds
@ 2023-07-02 23:30             ` Linus Torvalds
  2023-07-03  3:23               ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Torvalds @ 2023-07-02 23:30 UTC (permalink / raw)
  To: Helge Deller
  Cc: stable, linux-kernel, akpm, linux, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

[-- Attachment #1: Type: text/plain, Size: 789 bytes --]

On Sun, 2 Jul 2023 at 15:45, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Would you mind just verifying that yes, that commit on mainline is
> broken for you, and the previous one works?

Also, while I looked at it again, and still didn't understand why
parisc would be different here, I *did* realize that because parisc
has a stack that grows up, the debug warning I added for GUP won't
trigger.

So if I got that execve() logic wrong for STACK_GROWSUP (which I
clearly must have), then exactly because it's grows-up, a GUP failure
wouldn't warn about not expanding the stack.

IOW, would you mind applying something like this on top of the current
kernel, and let me know if it warns?

.. and here I thought ia64 would be the pain-point. Silly me.

                  Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 773 bytes --]

 mm/gup.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index ef29641671c7..66520194006b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1168,11 +1168,15 @@ static long __get_user_pages(struct mm_struct *mm,
 
 		/* first iteration or cross vma bound */
 		if (!vma || start >= vma->vm_end) {
-			vma = find_vma(mm, start);
+			struct vm_area_struct *prev = NULL;
+			vma = find_vma_prev(mm, start, &prev);
 			if (vma && (start < vma->vm_start)) {
 				WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN);
 				vma = NULL;
 			}
+			if (!vma && prev && start >= prev->vm_end)
+				WARN_ON_ONCE(prev->vm_flags & VM_GROWSUP);
+
 			if (!vma && in_gate_area(mm, start)) {
 				ret = get_gate_page(mm, start & PAGE_MASK,
 						gup_flags, &vma,

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-02 23:30             ` Linus Torvalds
@ 2023-07-03  3:23               ` Guenter Roeck
  2023-07-03  4:22                 ` Linus Torvalds
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2023-07-03  3:23 UTC (permalink / raw)
  To: Linus Torvalds, Helge Deller
  Cc: stable, linux-kernel, akpm, linux-parisc, Greg Kroah-Hartman,
	John David Anglin

On 7/2/23 16:30, Linus Torvalds wrote:
> On Sun, 2 Jul 2023 at 15:45, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Would you mind just verifying that yes, that commit on mainline is
>> broken for you, and the previous one works?
> 
> Also, while I looked at it again, and still didn't understand why
> parisc would be different here, I *did* realize that because parisc
> has a stack that grows up, the debug warning I added for GUP won't
> trigger.
> 
> So if I got that execve() logic wrong for STACK_GROWSUP (which I
> clearly must have), then exactly because it's grows-up, a GUP failure
> wouldn't warn about not expanding the stack.
> 
> IOW, would you mind applying something like this on top of the current
> kernel, and let me know if it warns?
> 

I can reproduce the problem in qemu. However, I do not see a warning
after applying your patch.

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  3:23               ` Guenter Roeck
@ 2023-07-03  4:22                 ` Linus Torvalds
  2023-07-03  4:46                   ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Torvalds @ 2023-07-03  4:22 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

[-- Attachment #1: Type: text/plain, Size: 662 bytes --]

On Sun, 2 Jul 2023 at 20:23, Guenter Roeck <linux@roeck-us.net> wrote:
>
> I can reproduce the problem in qemu. However, I do not see a warning
> after applying your patch.

Funky, funky.

I'm assuming it's the

                                page = get_arg_page(bprm, pos, 1);
                                if (!page) {
                                        ret = -E2BIG;
                                        goto out;
                                }

in copy_strings() that causes this. Or possibly, the version in
copy_string_kernel().

Does *this* get that "pr_warn()" printout (and a stack trace once,
just for good measure)?

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 773 bytes --]

 mm/gup.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index ef29641671c7..66520194006b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1168,11 +1168,15 @@ static long __get_user_pages(struct mm_struct *mm,
 
 		/* first iteration or cross vma bound */
 		if (!vma || start >= vma->vm_end) {
-			vma = find_vma(mm, start);
+			struct vm_area_struct *prev = NULL;
+			vma = find_vma_prev(mm, start, &prev);
 			if (vma && (start < vma->vm_start)) {
 				WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN);
 				vma = NULL;
 			}
+			if (!vma && prev && start >= prev->vm_end)
+				WARN_ON_ONCE(prev->vm_flags & VM_GROWSUP);
+
 			if (!vma && in_gate_area(mm, start)) {
 				ret = get_gate_page(mm, start & PAGE_MASK,
 						gup_flags, &vma,

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  4:22                 ` Linus Torvalds
@ 2023-07-03  4:46                   ` Guenter Roeck
  2023-07-03  4:49                     ` Linus Torvalds
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2023-07-03  4:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

On 7/2/23 21:22, Linus Torvalds wrote:
> On Sun, 2 Jul 2023 at 20:23, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> I can reproduce the problem in qemu. However, I do not see a warning
>> after applying your patch.
> 
> Funky, funky.
> 
> I'm assuming it's the
> 
>                                  page = get_arg_page(bprm, pos, 1);
>                                  if (!page) {
>                                          ret = -E2BIG;
>                                          goto out;
>                                  }
> 
> in copy_strings() that causes this. Or possibly, the version in
> copy_string_kernel().
> 
> Does *this* get that "pr_warn()" printout (and a stack trace once,
> just for good measure)?
> 

Sorry, you lost me. Isn't that the same patch as before ? Or
is it just time for me to go to bed ?

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  4:46                   ` Guenter Roeck
@ 2023-07-03  4:49                     ` Linus Torvalds
  2023-07-03  5:33                       ` Guenter Roeck
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Torvalds @ 2023-07-03  4:49 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

[-- Attachment #1: Type: text/plain, Size: 280 bytes --]

On Sun, 2 Jul 2023 at 21:46, Guenter Roeck <linux@roeck-us.net> wrote:
>
> Sorry, you lost me. Isn't that the same patch as before ? Or
> is it just time for me to go to bed ?

No, I think it's time for *me* to go to bed.

Let's get the right diff this time.

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 932 bytes --]

 fs/exec.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/exec.c b/fs/exec.c
index 1a827d55ba94..50462ee6085c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -212,6 +212,9 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 		ret = expand_downwards(vma, pos);
 		if (unlikely(ret < 0)) {
 			mmap_write_unlock(mm);
+			pr_warn("stack expand failed: %lx-%lx (%lx)\n",
+				vma->vm_start, vma->vm_end, pos);
+			WARN_ON_ONCE(1);
 			return NULL;
 		}
 		mmap_write_downgrade(mm);
@@ -226,8 +229,12 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 			write ? FOLL_WRITE : 0,
 			&page, NULL);
 	mmap_read_unlock(mm);
-	if (ret <= 0)
+	if (ret <= 0) {
+		pr_warn("get_user_pages_remote failed: %lx-%lx (%lx)\n",
+			vma->vm_start, vma->vm_end, pos);
+		WARN_ON_ONCE(1);
 		return NULL;
+	}
 
 	if (write)
 		acct_arg_size(bprm, vma_pages(vma));

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  4:49                     ` Linus Torvalds
@ 2023-07-03  5:33                       ` Guenter Roeck
  2023-07-03  6:20                         ` Linus Torvalds
  0 siblings, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2023-07-03  5:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

On 7/2/23 21:49, Linus Torvalds wrote:
> On Sun, 2 Jul 2023 at 21:46, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> Sorry, you lost me. Isn't that the same patch as before ? Or
>> is it just time for me to go to bed ?
> 
> No, I think it's time for *me* to go to bed.
> 
> Let's get the right diff this time.
> 

Here you are:

[   31.188688] stack expand failed: ffeff000-fff00000 (ffefeff2)
[   31.189131] ------------[ cut here ]------------
[   31.189259] WARNING: CPU: 0 PID: 472 at fs/exec.c:217 get_arg_page+0x1e8/0x1f4
[   31.189827] Modules linked in:
[   31.190083] CPU: 0 PID: 472 Comm: sh Tainted: G                 N 6.4.0-32bit+ #1
[   31.190213] Hardware name: 9000/778/B160L
[   31.190347]
[   31.190407]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[   31.190496] PSW: 00000000000001001011111100001111 Tainted: G                 N
[   31.190625] r00-03  0004bf0f 11026240 1034a3ec 12bb41c0
[   31.190741] r04-07  127ec400 00000001 12b725a4 12b72530
[   31.190821] r08-11  129e6708 ffefeff2 2ff9d000 ffeff000
[   31.190895] r12-15  127ec400 10e463f0 10e34348 12a4d1a0
[   31.190962] r16-19  00000002 00001000 ffefe000 12a4d1a0
[   31.191033] r20-23  0000000f 00001a46 013ae000 12bb4498
[   31.191103] r24-27  11542330 00000000 115430a0 10ed98d8
[   31.191173] r28-31  00000031 00000310 12bb4240 0000000f
[   31.191251] sr00-03  00000000 00000000 00000000 000000a0
[   31.191332] sr04-07  00000000 00000000 00000000 00000000
[   31.191407]
[   31.191443] IASQ: 00000000 00000000 IAOQ: 1034a3ec 1034a3f0
[   31.191522]  IIR: 03ffe01f    ISR: 00000000  IOR: 1065d424
[   31.191593]  CPU:        0   CR30: 12a4d1a0 CR31: 00000000
[   31.192675]  ORIG_R28: 12a4d1a0
[   31.192770]  IAOQ[0]: get_arg_page+0x1e8/0x1f4
[   31.192851]  IAOQ[1]: get_arg_page+0x1ec/0x1f4
[   31.192922]  RP(r2): get_arg_page+0x1e8/0x1f4
[   31.193007] Backtrace:
[   31.193085]  [<1034a9cc>] copy_strings+0x148/0x3d8
[   31.193214]  [<1034ad94>] do_execveat_common+0x138/0x21c
[   31.193302]  [<1034bcc4>] sys_execve+0x3c/0x54
[   31.193400]  [<101af1b4>] syscall_exit+0x0/0x10
[   31.193562]
[   31.193698] ---[ end trace 0000000000000000 ]---
[   31.200551] stack expand failed: ffeff000-fff00000 (ffefefee)
/bin/sh: ls: Argument list too long

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  5:33                       ` Guenter Roeck
@ 2023-07-03  6:20                         ` Linus Torvalds
  2023-07-03  7:08                           ` Helge Deller
  2023-07-03 12:59                           ` Guenter Roeck
  0 siblings, 2 replies; 65+ messages in thread
From: Linus Torvalds @ 2023-07-03  6:20 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

[-- Attachment #1: Type: text/plain, Size: 1869 bytes --]

On Sun, 2 Jul 2023 at 22:33, Guenter Roeck <linux@roeck-us.net> wrote:
>
> Here you are:
>
> [   31.188688] stack expand failed: ffeff000-fff00000 (ffefeff2)

Ahhah!

I think the problem is actually ridiculously simple.

The thing is, the parisc stack expands upwards. That's obvious. I've
mentioned it several times in just this thread as being the thing that
makes parisc special.

But it's *so* obvious that I didn't even think about what it really implies.

And part of all the changes was this part in expand_downwards():

        if (!(vma->vm_flags & VM_GROWSDOWN))
                return -EFAULT;

and that will *always* fail on parisc, because - as said multiple
times - the parisc stack expands upwards. It doesn't have VM_GROWSDOWN
set.

What a dum-dum I am.

And I did it that way because the *normal* stack expansion obviously
wants it that way and putting the check there not only made sense, but
simplified other code.

But fs/exec.c is special - and only special for parisc - in that it
really wants to expand a normally upwards-growing stack downwards
unconditionally.

Anyway, I think that new check in expand_downwards() is the right
thing to do, and the real fix here is to simply make vm_flags reflect
reality.

Because during execve, that stack that will _eventually_ grow upwards,
does in fact grow downwards.  Let's make it reflect that.

We already do magical extra setup for the stack flags during setup
(VM_STACK_INCOMPLETE_SETUP), so extending that logic to contain
VM_GROWSDOWN seems sane and the right thing to do.
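
Concretely, those setup-only bits are applied when the temporary stack
VMA is first created and dropped again once the stack has been moved to
its final location - roughly (paraphrased from fs/exec.c, helper names
differ in older kernels):

	/* __bprm_mm_init(): temporary one-page stack VMA, flagged as
	 * still being set up - with the attached change this would also
	 * carry VM_GROWSDOWN on grows-up architectures */
	vm_flags_init(vma, VM_SOFTDIRTY | VM_STACK_FLAGS |
			   VM_STACK_INCOMPLETE_SETUP);

	/* setup_arg_pages(): the stack is in its final place, so the
	 * setup-only bits (and the temporary VM_GROWSDOWN with them)
	 * get cleared */
	vm_flags_clear(vma, VM_STACK_INCOMPLETE_SETUP);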

IOW, I think a patch like the attached will fix the problem for real.

It needs a good commit log and maybe a code comment or two, but before
I bother to do that, let's verify that yes, it does actually fix
things.

In the meantime, I will actually go to bed, but I'm pretty sure this is it.

                     Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 948 bytes --]

 include/linux/mm.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 74f1be743ba2..2dd73e4f3d8e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -377,7 +377,7 @@ extern unsigned int kobjsize(const void *objp);
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
 
 /* Bits set in the VMA until the stack is in its final location */
-#define VM_STACK_INCOMPLETE_SETUP	(VM_RAND_READ | VM_SEQ_READ)
+#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY)
 
 #define TASK_EXEC ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0)
 
@@ -399,8 +399,10 @@ extern unsigned int kobjsize(const void *objp);
 
 #ifdef CONFIG_STACK_GROWSUP
 #define VM_STACK	VM_GROWSUP
+#define VM_STACK_EARLY	VM_GROWSDOWN
 #else
 #define VM_STACK	VM_GROWSDOWN
+#define VM_STACK_EARLY	0
 #endif
 
 #define VM_STACK_FLAGS	(VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT)

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  6:20                         ` Linus Torvalds
@ 2023-07-03  7:08                           ` Helge Deller
  2023-07-03 16:49                             ` Linus Torvalds
  2023-07-03 12:59                           ` Guenter Roeck
  1 sibling, 1 reply; 65+ messages in thread
From: Helge Deller @ 2023-07-03  7:08 UTC (permalink / raw)
  To: Linus Torvalds, Guenter Roeck
  Cc: stable, linux-kernel, akpm, linux-parisc, Greg Kroah-Hartman,
	John David Anglin

Hi Linus,

On 7/3/23 08:20, Linus Torvalds wrote:
> On Sun, 2 Jul 2023 at 22:33, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> Here you are:
>>
>> [   31.188688] stack expand failed: ffeff000-fff00000 (ffefeff2)
>
> Ahhah!
>
> I think the problem is actually ridiculously simple.
>
> The thing is, the parisc stack expands upwards. That's obvious. I've
> mentioned it several times in just this thread as being the thing that
> makes parisc special.
>
> But it's *so* obvious that I didn't even think about what it really implies.
>
> And part of all the changes was this part in expand_downwards():
>
>          if (!(vma->vm_flags & VM_GROWSDOWN))
>                  return -EFAULT;
>
> and that will *always* fail on parisc, because - as said multiple
> times - the parisc stack expands upwards. It doesn't have VM_GROWSDOWN
> set.
>
> What a dum-dum I am.
>
> And I did it that way because the *normal* stack expansion obviously
> wants it that way and putting the check there not only made sense, but
> simplified other code.
>
> But fs/execve.c is special - and only special for parisc - in that it
> really wants to  expand a normally upwards-growing stack downwards
> unconditionally.
>
> Anyway, I think that new check in expand_downwards() is the right
> thing to do, and the real fix here is to simply make vm_flags reflect
> reality.
>
> Because during execve, that stack that will _eventually_ grow upwards,
> does in fact grow downwards.  Let's make it reflect that.
>
> We already do magical extra setup for the stack flags during setup
> (VM_STACK_INCOMPLETE_SETUP), so extending that logic to contain
> VM_GROWSDOWN seems sane and the right thing to do.
>
> IOW, I think a patch like the attached will fix the problem for real.
>
> It needs a good commit log and maybe a code comment or two, but before
> I bother to do that, let's verify that yes, it does actually fix
> things.
>
> In the meantime, I will actually go to bed, but I'm pretty sure this is it.

Great, that patch fixes it!

I wonder if you want to
#define VM_STACK_EARLY VM_GROWSDOWN
even for the case where the stack grows down too (instead of 0),
just to make clear that in both cases the stack goes downwards initially.

Helge

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  6:20                         ` Linus Torvalds
  2023-07-03  7:08                           ` Helge Deller
@ 2023-07-03 12:59                           ` Guenter Roeck
  2023-07-03 13:06                             ` Guenter Roeck
  1 sibling, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2023-07-03 12:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

On 7/2/23 23:20, Linus Torvalds wrote:
> On Sun, 2 Jul 2023 at 22:33, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> Here you are:
>>
>> [   31.188688] stack expand failed: ffeff000-fff00000 (ffefeff2)
> 
> Ahhah!
> 
> I think the problem is actually ridiculously simple.
> 
> The thing is, the parisc stack expands upwards. That's obvious. I've
> mentioned it several times in just this thread as being the thing that
> makes parisc special.
> 
> But it's *so* obvious that I didn't even think about what it really implies.
> 
> And part of all the changes was this part in expand_downwards():
> 
>          if (!(vma->vm_flags & VM_GROWSDOWN))
>                  return -EFAULT;
> 
> and that will *always* fail on parisc, because - as said multiple
> times - the parisc stack expands upwards. It doesn't have VM_GROWSDOWN
> set.
> 
> What a dum-dum I am.
> 
> And I did it that way because the *normal* stack expansion obviously
> wants it that way and putting the check there not only made sense, but
> simplified other code.
> 
> But fs/exec.c is special - and only special for parisc - in that it
> really wants to expand a normally upwards-growing stack downwards
> unconditionally.
> 
> Anyway, I think that new check in expand_downwards() is the right
> thing to do, and the real fix here is to simply make vm_flags reflect
> reality.
> 
> Because during execve, that stack that will _eventually_ grow upwards,
> does in fact grow downwards.  Let's make it reflect that.
> 
> We already do magical extra setup for the stack flags during setup
> (VM_STACK_INCOMPLETE_SETUP), so extending that logic to contain
> VM_GROWSDOWN seems sane and the right thing to do.
> 
> IOW, I think a patch like the attached will fix the problem for real.
> 
> It needs a good commit log and maybe a code comment or two, but before
> I bother to do that, let's verify that yes, it does actually fix
> things.
> 

Yes, it does. I'll run a complete qemu test with it applied to be sure
there is no impact on other architectures (yes, I know, that should not
be the case, but better safe than sorry). I'll even apply
https://lore.kernel.org/all/20230609075528.9390-12-bhe@redhat.com/raw
to be able to test sh4.

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03 12:59                           ` Guenter Roeck
@ 2023-07-03 13:06                             ` Guenter Roeck
  0 siblings, 0 replies; 65+ messages in thread
From: Guenter Roeck @ 2023-07-03 13:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

On 7/3/23 05:59, Guenter Roeck wrote:
> On 7/2/23 23:20, Linus Torvalds wrote:
>> On Sun, 2 Jul 2023 at 22:33, Guenter Roeck <linux@roeck-us.net> wrote:
>>>
>>> Here you are:
>>>
>>> [   31.188688] stack expand failed: ffeff000-fff00000 (ffefeff2)
>>
>> Ahhah!
>>
>> I think the problem is actually ridiculously simple.
>>
>> The thing is, the parisc stack expands upwards. That's obvious. I've
>> mentioned it several times in just this thread as being the thing that
>> makes parisc special.
>>
>> But it's *so* obvious that I didn't even think about what it really implies.
>>
>> And part of all the changes was this part in expand_downwards():
>>
>>          if (!(vma->vm_flags & VM_GROWSDOWN))
>>                  return -EFAULT;
>>
>> and that will *always* fail on parisc, because - as said multiple
>> times - the parisc stack expands upwards. It doesn't have VM_GROWSDOWN
>> set.
>>
>> What a dum-dum I am.
>>
>> And I did it that way because the *normal* stack expansion obviously
>> wants it that way and putting the check there not only made sense, but
>> simplified other code.
>>
>> But fs/exec.c is special - and only special for parisc - in that it
>> really wants to expand a normally upwards-growing stack downwards
>> unconditionally.
>>
>> Anyway, I think that new check in expand_downwards() is the right
>> thing to do, and the real fix here is to simply make vm_flags reflect
>> reality.
>>
>> Because during execve, that stack that will _eventually_ grow upwards,
>> does in fact grow downwards.  Let's make it reflect that.
>>
>> We already do magical extra setup for the stack flags during setup
>> (VM_STACK_INCOMPLETE_SETUP), so extending that logic to contain
>> VM_GROWSDOWN seems sane and the right thing to do.
>>
>> IOW, I think a patch like the attached will fix the problem for real.
>>
>> It needs a good commit log and maybe a code comment or two, but before
>> I bother to do that, let's verify that yes, it does actually fix
>> things.
>>
> 
> Yes, it does. I'll run a complete qemu test with it applied to be sure
> there is no impact on other architectures (yes, I know, that should not
> be the case, but better safe than sorry). I'll even apply
> https://lore.kernel.org/all/20230609075528.9390-12-bhe@redhat.com/raw
> to be able to test sh4.
> 

Meh, should have figured. That fixes one problem with sh4 builds
and creates another.

Guenter


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03  7:08                           ` Helge Deller
@ 2023-07-03 16:49                             ` Linus Torvalds
  2023-07-03 17:19                               ` Guenter Roeck
  2023-07-03 19:24                               ` Helge Deller
  0 siblings, 2 replies; 65+ messages in thread
From: Linus Torvalds @ 2023-07-03 16:49 UTC (permalink / raw)
  To: Helge Deller
  Cc: Guenter Roeck, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

On Mon, 3 Jul 2023 at 00:08, Helge Deller <deller@gmx.de> wrote:
>
> Great, that patch fixes it!

Yeah, I was pretty sure this was it, but it's good to have it
confirmed. Committed.

> I wonder if you want to
> #define VM_STACK_EARLY VM_GROWSDOWN
> even for the case where the stack grows down too (instead of 0),
> just to make clear that in both cases the stack goes downwards initially.

No, that wouldn't work for the simple reason that the special bits in
VM_STACK_INCOMPLETE_SETUP are always cleared after the stack setup is
done.

So if we added VM_GROWSDOWN to those early bits in general, the bit
would then be cleared even when that wasn't the intent.

Yes, yes, we could change the VM_STACK_INCOMPLETE_SETUP logic to only
clear some of the bits in the end, but the end result would be
practically the same: we'd still have to do different things for
grows-up vs grows-down cases, so the difference might as well be here
in the VM_STACK_EARLY bit.

                 Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03 16:49                             ` Linus Torvalds
@ 2023-07-03 17:19                               ` Guenter Roeck
  2023-07-03 17:30                                 ` Linus Torvalds
  2023-07-03 19:24                               ` Helge Deller
  1 sibling, 1 reply; 65+ messages in thread
From: Guenter Roeck @ 2023-07-03 17:19 UTC (permalink / raw)
  To: Linus Torvalds, Helge Deller
  Cc: stable, linux-kernel, akpm, linux-parisc, Greg Kroah-Hartman,
	John David Anglin

On 7/3/23 09:49, Linus Torvalds wrote:
> On Mon, 3 Jul 2023 at 00:08, Helge Deller <deller@gmx.de> wrote:
>>
>> Great, that patch fixes it!
> 
> Yeah, I was pretty sure this was it, but it's good to have it
> confirmed. Committed.
> 

FWIW, my qemu boot tests didn't find any problems with other architectures.

Guenter

>> I wonder if you want to
>> #define VM_STACK_EARLY VM_GROWSDOWN
>> even for the case where the stack grows down too (instead of 0),
>> just to make clear that in both cases the stack goes downwards initially.
> 
> No, that wouldn't work for the simple reason that the special bits in
> VM_STACK_INCOMPLETE_SETUP are always cleared after the stack setup is
> done.
> 
> So if we added VM_GROWSDOWN to those early bits in general, the bit
> would then be cleared even when that wasn't the intent.
> 
> Yes, yes, we could change the VM_STACK_INCOMPLETE_SETUP logic to only
> clear some of the bits in the end, but the end result would be
> practically the same: we'd still have to do different things for
> grows-up vs grows-down cases, so the difference might as well be here
> in the VM_STACK_EARLY bit.
> 
>                   Linus


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03 17:19                               ` Guenter Roeck
@ 2023-07-03 17:30                                 ` Linus Torvalds
  0 siblings, 0 replies; 65+ messages in thread
From: Linus Torvalds @ 2023-07-03 17:30 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Helge Deller, stable, linux-kernel, akpm, linux-parisc,
	Greg Kroah-Hartman, John David Anglin

On Mon, 3 Jul 2023 at 10:19, Guenter Roeck <linux@roeck-us.net> wrote:
>
> FWIW, my qemu boot tests didn't find any problems with other architectures.

Thanks. This whole "let's get the stack expansion locking right"
wasn't exactly buttery smooth, but given all our crazy architectures
it was not entirely unexpected.

Let's hope it really is all done now,

            Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03 16:49                             ` Linus Torvalds
  2023-07-03 17:19                               ` Guenter Roeck
@ 2023-07-03 19:24                               ` Helge Deller
  2023-07-03 19:31                                 ` Sam James
  1 sibling, 1 reply; 65+ messages in thread
From: Helge Deller @ 2023-07-03 19:24 UTC (permalink / raw)
  To: Linus Torvalds, Greg Kroah-Hartman
  Cc: Guenter Roeck, stable, linux-kernel, akpm, linux-parisc,
	John David Anglin

On 7/3/23 18:49, Linus Torvalds wrote:
> On Mon, 3 Jul 2023 at 00:08, Helge Deller <deller@gmx.de> wrote:
>>
>> Great, that patch fixes it!
>
> Yeah, I was pretty sure this was it, but it's good to have it
> confirmed. Committed.

Thank you!

Nice to see that Greg picked up the patch for stable that fast as well!

>> I wonder if you want to
>> #define VM_STACK_EARLY VM_GROWSDOWN
>> even for the case where the stack grows down too (instead of 0),
>> just to make clear that in both cases the stack goes downwards initially.
>
> No, that wouldn't work for the simple reason that the special bits in
> VM_STACK_INCOMPLETE_SETUP are always cleared after the stack setup is
> done.
>
> So if we added VM_GROWSDOWN to those early bits in general, the bit
> would then be cleared even when that wasn't the intent.
>
> Yes, yes, we could change the VM_STACK_INCOMPLETE_SETUP logic to only
> clear some of the bits in the end, but the end result would be
> practically the same: we'd still have to do different things for
> grows-up vs grows-down cases, so the difference might as well be here
> in the VM_STACK_EARLY bit.

Ok, thanks for explaining!

Helge

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03 19:24                               ` Helge Deller
@ 2023-07-03 19:31                                 ` Sam James
  2023-07-03 19:36                                   ` Sam James
  0 siblings, 1 reply; 65+ messages in thread
From: Sam James @ 2023-07-03 19:31 UTC (permalink / raw)
  To: Helge Deller, Greg Kroah-Hartman
  Cc: Linus Torvalds, Guenter Roeck, stable, linux-kernel, akpm,
	linux-parisc, John David Anglin


Helge Deller <deller@gmx.de> writes:

> On 7/3/23 18:49, Linus Torvalds wrote:
>> On Mon, 3 Jul 2023 at 00:08, Helge Deller <deller@gmx.de> wrote:
>>>
>>> Great, that patch fixes it!
>>
>> Yeah, I was pretty sure this was it, but it's good to have it
>> confirmed. Committed.
>
> Thank you!
>
> Nice to see that Greg picked up the patch for stable that fast as well!

Sorry, where? I was just about to check if it was marked for backporting
but I can't see it in Greg's trees yet.

We need it for 6.1, 6.3, and 6.4.

(Apologies if I'm missing it somewhere obvious.)

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long
  2023-07-03 19:31                                 ` Sam James
@ 2023-07-03 19:36                                   ` Sam James
  0 siblings, 0 replies; 65+ messages in thread
From: Sam James @ 2023-07-03 19:36 UTC (permalink / raw)
  To: Sam James
  Cc: Helge Deller, Greg Kroah-Hartman, Linus Torvalds, Guenter Roeck,
	stable, linux-kernel, akpm, linux-parisc, John David Anglin


Sam James <sam@gentoo.org> writes:

> Helge Deller <deller@gmx.de> writes:
>
>> On 7/3/23 18:49, Linus Torvalds wrote:
>>> On Mon, 3 Jul 2023 at 00:08, Helge Deller <deller@gmx.de> wrote:
>>>>
>>>> Great, that patch fixes it!
>>>
>>> Yeah, I was pretty sure this was it, but it's good to have it
>>> confirmed. Committed.
>>
>> Thank you!
>>
>> Nice to see that Greg picked up the patch for stable that fast as well!
>
> Sorry, where? I was just about to check if it was marked for backporting
> but I can't see it in Greg's trees yet.
>
> We need it fo 6.1, 6.3, 6.4.
>
> (Apologies if I'm missing it somewhere obvious.)

.. and I was. I see it now, sorry!

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2023-07-03 19:36 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-29 18:43 [PATCH 6.4 00/28] 6.4.1-rc1 review Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 01/28] x86/microcode/AMD: Load late on both threads too Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 02/28] x86/smp: Make stop_other_cpus() more robust Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 03/28] x86/smp: Dont access non-existing CPUID leaf Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 04/28] x86/smp: Remove pointless wmb()s from native_stop_other_cpus() Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 05/28] x86/smp: Use dedicated cache-line for mwait_play_dead() Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 06/28] x86/smp: Cure kexec() vs. mwait_play_dead() breakage Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 07/28] cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 08/28] can: isotp: isotp_sendmsg(): fix return error fix on TX path Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 09/28] maple_tree: fix potential out-of-bounds access in mas_wr_end_piv() Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 10/28] mm: introduce new lock_mm_and_find_vma() page fault helper Greg Kroah-Hartman
2023-06-29 18:43 ` [PATCH 6.4 11/28] mm: make the page fault mmap locking killable Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 12/28] arm64/mm: Convert to using lock_mm_and_find_vma() Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 13/28] powerpc/mm: " Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 14/28] mips/mm: " Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 15/28] riscv/mm: " Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 16/28] arm/mm: " Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 17/28] mm/fault: convert remaining simple cases to lock_mm_and_find_vma() Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 18/28] powerpc/mm: convert coprocessor fault " Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 19/28] mm: make find_extend_vma() fail if write lock not held Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 20/28] execve: expand new process stack manually ahead of time Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 21/28] mm: always expand the stack with the mmap write lock held Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 22/28] HID: wacom: Use ktime_t rather than int when dealing with timestamps Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 23/28] gup: add warning if some caller would seem to want stack expansion Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 24/28] mm/khugepaged: fix regression in collapse_file() Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 25/28] fbdev: fix potential OOB read in fast_imageblit() Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 26/28] HID: hidraw: fix data race on device refcount Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 27/28] HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651 Greg Kroah-Hartman
2023-06-29 18:44 ` [PATCH 6.4 28/28] Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe" Greg Kroah-Hartman
2023-06-30  5:30 ` [PATCH 6.4 00/28] 6.4.1-rc1 review Naresh Kamboju
2023-06-30  5:52   ` Greg Kroah-Hartman
2023-06-30  6:16   ` Linus Torvalds
2023-06-30  6:29     ` Greg Kroah-Hartman
2023-06-30  6:56       ` Helge Deller
2023-07-02 21:33         ` [PATCH 6.4 00/28] 6.4.1-rc1 review - hppa argument list too long Helge Deller
2023-07-02 22:45           ` Linus Torvalds
2023-07-02 23:30             ` Linus Torvalds
2023-07-03  3:23               ` Guenter Roeck
2023-07-03  4:22                 ` Linus Torvalds
2023-07-03  4:46                   ` Guenter Roeck
2023-07-03  4:49                     ` Linus Torvalds
2023-07-03  5:33                       ` Guenter Roeck
2023-07-03  6:20                         ` Linus Torvalds
2023-07-03  7:08                           ` Helge Deller
2023-07-03 16:49                             ` Linus Torvalds
2023-07-03 17:19                               ` Guenter Roeck
2023-07-03 17:30                                 ` Linus Torvalds
2023-07-03 19:24                               ` Helge Deller
2023-07-03 19:31                                 ` Sam James
2023-07-03 19:36                                   ` Sam James
2023-07-03 12:59                           ` Guenter Roeck
2023-07-03 13:06                             ` Guenter Roeck
2023-06-30  6:29     ` [PATCH 6.4 00/28] 6.4.1-rc1 review Guenter Roeck
2023-06-30  6:33       ` Guenter Roeck
2023-06-30  6:33       ` Linus Torvalds
2023-06-30  6:47         ` Linus Torvalds
2023-06-30 22:51           ` Guenter Roeck
2023-07-01  1:24             ` Linus Torvalds
2023-07-01  2:49               ` Guenter Roeck
2023-07-01  4:22                 ` Linus Torvalds
2023-07-01  9:57                   ` Greg Kroah-Hartman
2023-07-01 10:32                   ` Max Filippov
2023-07-01 15:01                     ` Linus Torvalds
2023-06-30  6:51         ` Greg Kroah-Hartman
2023-06-30  6:28   ` Greg Kroah-Hartman
