LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH v7 11/15] powerpc/code-patching: Avoid r/w mapping of the zero page
From: Mukesh Kumar Chaurasiya @ 2026-06-03 18:03 UTC
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, linux-kernel, will, catalin.marinas,
	mark.rutland, Ard Biesheuvel, Ryan Roberts, Anshuman Khandual,
	Kevin Brodsky, Liz Prucka, Seth Jenkins, Kees Cook, Mike Rapoport,
	David Hildenbrand, Andrew Morton, Jann Horn, linux-mm,
	linux-hardening, linuxppc-dev, linux-sh, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy (CS GROUP)
In-Reply-To: <20260529150150.1670604-28-ardb+git@google.com>

On Fri, May 29, 2026 at 05:02:02PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> The only remaining use of map_patch_area() is mapping the zero page, and
> immediately unmapping it again so that the intermediate page table
> levels are all guaranteed to be populated.
> 
> The use of the zero page here is completely arbitrary, and not harmful
> per se, but currently, it creates a writable mapping, and does so in a
> manner that requires that the empty_zero_page[] symbol is not
> const-qualified.
> 
> Given that this is about to change, and that map_patch_area() now never
> maps anything other than the zero page, let's simplify the code and
> - remove the helpers and call [un]map_kernel_page() directly
> - take the PA of empty_zero_page directly
> - create a read-only temporary mapping.
> 
> This allows empty_zero_page[] to be repainted as const u8[] in a
> subsequent patch, without making substantial changes to this code
> patching logic.
> 
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
> Link: https://lore.kernel.org/all/20260520085423.485402-1-ardb@kernel.org/
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/powerpc/lib/code-patching.c | 52 +-------------------
>  1 file changed, 2 insertions(+), 50 deletions(-)
> 
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index f84e0337cc02..44ff9f684bef 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -60,9 +60,6 @@ struct patch_context {
>  
>  static DEFINE_PER_CPU(struct patch_context, cpu_patching_context);
>  
> -static int map_patch_area(void *addr, unsigned long text_poke_addr);
> -static void unmap_patch_area(unsigned long addr);
> -
>  static bool mm_patch_enabled(void)
>  {
>  	return IS_ENABLED(CONFIG_SMP) && radix_enabled();
> @@ -117,11 +114,11 @@ static int text_area_cpu_up(unsigned int cpu)
>  
>  	// Map/unmap the area to ensure all page tables are pre-allocated
>  	addr = (unsigned long)area->addr;
> -	err = map_patch_area(empty_zero_page, addr);
> +	err = map_kernel_page(addr, __pa_symbol(empty_zero_page), PAGE_KERNEL_RO);
>  	if (err)
>  		return err;
>  
> -	unmap_patch_area(addr);
> +	unmap_kernel_page(addr);
>  
>  	this_cpu_write(cpu_patching_context.area, area);
>  	this_cpu_write(cpu_patching_context.addr, addr);
> @@ -233,51 +230,6 @@ static unsigned long get_patch_pfn(void *addr)
>  		return __pa_symbol(addr) >> PAGE_SHIFT;
>  }
>  
> -/*
> - * This can be called for kernel text or a module.
> - */
> -static int map_patch_area(void *addr, unsigned long text_poke_addr)
> -{
> -	unsigned long pfn = get_patch_pfn(addr);
> -
> -	return map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
> -}
> -
> -static void unmap_patch_area(unsigned long addr)
> -{
> -	pte_t *ptep;
> -	pmd_t *pmdp;
> -	pud_t *pudp;
> -	p4d_t *p4dp;
> -	pgd_t *pgdp;
> -
> -	pgdp = pgd_offset_k(addr);
> -	if (WARN_ON(pgd_none(*pgdp)))
> -		return;
> -
> -	p4dp = p4d_offset(pgdp, addr);
> -	if (WARN_ON(p4d_none(*p4dp)))
> -		return;
> -
> -	pudp = pud_offset(p4dp, addr);
> -	if (WARN_ON(pud_none(*pudp)))
> -		return;
> -
> -	pmdp = pmd_offset(pudp, addr);
> -	if (WARN_ON(pmd_none(*pmdp)))
> -		return;
> -
> -	ptep = pte_offset_kernel(pmdp, addr);
> -	if (WARN_ON(pte_none(*ptep)))
> -		return;
> -
> -	/*
> -	 * In hash, pte_clear flushes the tlb, in radix, we have to
> -	 */
> -	pte_clear(&init_mm, addr, ptep);
> -	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> -}
> -
>  static int __do_patch_mem_mm(void *addr, unsigned long val, bool is_dword)
>  {
>  	int err;
> -- 
> 2.54.0.823.g6e5bcc1fc9-goog
> 
I don't see any functional change.

Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com>



* Re: [PATCH] KVM: PPC: Book3S HV: Validate arch_compat against host compatibility mode
From: Mukesh Kumar Chaurasiya @ 2026-06-03 17:57 UTC
  To: Amit Machhiwal
  Cc: linuxppc-dev, Madhavan Srinivasan, Vaibhav Jain,
	Harsh Prateek Bora, Ritesh Harjani, Anushree Mathur,
	Nicholas Piggin, Michael Ellerman, Christophe Leroy (CS GROUP),
	kvm, stable, linux-kernel
In-Reply-To: <20260603141539.47620-1-amachhiw@linux.ibm.com>

On Wed, Jun 03, 2026 at 07:45:39PM +0530, Amit Machhiwal wrote:
> On IBM POWER systems, newer processor generations can operate in
> compatibility modes corresponding to earlier generations. This becomes
> relevant for nested virtualization, where nested KVM guests may need to
> run with a specific processor compatibility level.
> 
> Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
> logical partition (L1) booted in Power10 compatibility mode, the guest
> fails to boot while setting 'arch_compat'. This happens because the CPU
> class is derived from the hardware PVR (via mfspr()), which reflects the
> physical processor generation (Power11), rather than the effective
> compatibility mode (Power10).
> 
> As a result, userspace may request a Power11 arch_compat for the L2
> guest. However, the L1 partition, running in Power10 compatibility, has
> only negotiated support up to Power10 with the Power Hypervisor (L0).
> When H_GUEST_SET_STATE is invoked with a Power11 Logical PVR, the
> hypervisor rejects the request, leading to a late guest boot failure:
> 
>   KVM-NESTEDv2: couldn't set guest wide elements
>   [..KVM reg dump..]
> 
> This situation should be detected earlier. Rejecting unsupported
> 'arch_compat' values in 'kvmppc_set_arch_compat()' avoids issuing an
> invalid H_GUEST_SET_STATE hcall and provides a clearer failure mode.
> 
> Add a check to reject Power11 'arch_compat' requests when the host is
> running in Power10 compatibility mode, returning -EINVAL early instead
> of deferring the failure to the hypervisor.
> 
> Suggested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> Tested-by: Anushree Mathur <anushree.mathur@linux.ibm.com>
> Cc: <stable@vger.kernel.org> # v6.13+
> Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
> ---
> Changelog:
> 
> * Moved this patch out of the v3 series [1] as discussed here [2]
> * Addressed below review comments from Ritesh:
>   - Based the PVR validation on cpu features
>   - Fixed hcall name typo
>   - Stable backport
> 
> [1] https://lore.kernel.org/all/20260522152744.55251-1-amachhiw@linux.ibm.com/
> [2] https://lore.kernel.org/all/20260522152744.55251-2-amachhiw@linux.ibm.com/
> ---
>  arch/powerpc/kvm/book3s_hv.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 61dbeea317f3..e16dbb199366 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -446,7 +446,17 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
>  			guest_pcr_bit = PCR_ARCH_300;
>  			break;
>  		case PVR_ARCH_31:
> +			guest_pcr_bit = PCR_ARCH_31;
> +			break;
>  		case PVR_ARCH_31_P11:
> +			/*
> +			 * Need to check this for ISA 3.1, as Power10 and
> +			 * Power11 share the same PCR. For any subsequent ISA
> +			 * versions, this will be taken care of by the guest vs
> +			 * host PCR comparison below.
> +			 */
> +			if (!cpu_has_feature(CPU_FTR_P11_PVR))
> +				return -EINVAL;
>  			guest_pcr_bit = PCR_ARCH_31;
>  			break;
>  		default:
> 
> base-commit: ba3e43a9e601636f5edb54e259a74f96ca3b8fd8
> -- 
> 2.50.1 (Apple Git-155)
> 
Yeah, it makes sense to throw an error early.
LGTM

Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com>



* Re: [PATCH v2] powerpc/entry: Disable interrupts before irqentry_exit
From: Mukesh Kumar Chaurasiya @ 2026-06-03 17:51 UTC
  To: Shrikanth Hegde
  Cc: maddy, linuxppc-dev, peterz, tglx, christophe.leroy, linux-kernel,
	venkat88
In-Reply-To: <20260603131054.216235-1-sshegde@linux.ibm.com>

On Wed, Jun 03, 2026 at 06:40:54PM +0530, Shrikanth Hegde wrote:
> Venkat reported a panic on powerpc-next tree where GENERIC_ENTRY has
> been enabled.
> 
> kernel BUG at kernel/sched/core.c:7512!
> NIP  preempt_schedule_irq+0x44/0x118
> LR   dynamic_irqentry_exit_cond_resched+0x40/0x1a4
> Call Trace:
>  dynamic_irqentry_exit_cond_resched+0x40/0x1a4
>  do_page_fault+0xc0/0x104
>  data_access_common_virt+0x210/0x220
> 
> This happens since __do_page_fault ends up enabling the interrupts and
> it could take significant time such that need_resched could be set. This
> leads to schedule call in irqentry_exit leading to the bug.
> 
> There are many such irq handlers which enables the interrupts.
> Fix it by disabling the irq before calling irqentry_exit. The same
> pattern exists today in interrupt_exit_kernel_prepare.
> 
> Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/all/7904105b-9dfa-4efd-a5ef-bc0276ed255d@linux.ibm.com/
> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
> ---
> This applies on top on powerpc/next tree.
> base: 6ed60999d33d '("powerpc: Remove unused functions")'
> 
> v1->v2:
> Leave those BUG_ON alone since they are tracking the register
> state of userspace (Peter Zijlstra)
> v1: https://lore.kernel.org/all/20260603095521.198267-1-sshegde@linux.ibm.com/
> 
>  arch/powerpc/include/asm/entry-common.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index de5601282755..fc636c42e89a 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -260,9 +260,10 @@ static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
>  		 * AMR can only have been unlocked if we interrupted the kernel.
>  		 */
>  		kuap_assert_locked();
> -
> -		local_irq_disable();
>  	}
> +
> +	/* irqentry_exit expects to be called with interrupts disabled */
> +	local_irq_disable();
>  }
>  
>  static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
> -- 
> 2.47.3
> 
LGTM

Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com>



* Re: [PATCH v2] scsi: core: Remove remaining references to the pktcdvd driver
From: Bart Van Assche @ 2026-06-03 16:06 UTC
  To: Catalin Iacob, Thomas Bogendoerfer, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy (CS GROUP),
	Rich Felker, John Paul Adrian Glaubitz, David S. Miller,
	Andreas Larsson, James E.J. Bottomley, Martin K. Petersen,
	Jens Axboe
  Cc: linux-mips, linux-kernel, linuxppc-dev, linux-sh, sparclinux,
	linux-scsi
In-Reply-To: <20260603-remove-pktcdvd-references-v2-1-c4402154d53a@gmail.com>

On 6/3/26 6:27 AM, Catalin Iacob wrote:
> Commit 1cea5180f2f8 ("block: remove pktcdvd driver") left behind some
> CONFIG_CONFIG_CDROM_PKTCDVD* references in defconfigs and around an
> export. Remove them.
> 
> Signed-off-by: Catalin Iacob <iacobcatalin@gmail.com>
> ---
> Found this incidentally while looking at kernel sources to understand
> what pktcdvd is
> ---
> Changes in v2:
> - Reworded commit message on John Paul Adrian's suggestion to be about
>    the removed references not the export symbol
> - Link to v1: https://patch.msgid.link/20260530-remove-pktcdvd-references-v1-1-aa56941d4315@gmail.com
> ---
>   arch/mips/configs/fuloong2e_defconfig    | 1 -
>   arch/mips/configs/ip22_defconfig         | 1 -
>   arch/mips/configs/ip27_defconfig         | 1 -
>   arch/mips/configs/ip30_defconfig         | 1 -
>   arch/mips/configs/jazz_defconfig         | 1 -
>   arch/mips/configs/malta_defconfig        | 1 -
>   arch/mips/configs/malta_kvm_defconfig    | 1 -
>   arch/mips/configs/maltaup_xpa_defconfig  | 1 -
>   arch/mips/configs/rm200_defconfig        | 1 -
>   arch/mips/configs/sb1250_swarm_defconfig | 1 -
>   arch/powerpc/configs/g5_defconfig        | 1 -
>   arch/powerpc/configs/ppc6xx_defconfig    | 1 -
>   arch/sh/configs/sh2007_defconfig         | 1 -
>   arch/sparc/configs/sparc64_defconfig     | 2 --
>   drivers/scsi/scsi_lib.c                  | 8 --------
>   15 files changed, 23 deletions(-)

Shouldn't this be split into two patches - one for the defconfig files
and a second patch for the SCSI core?

Thanks,

Bart.



* [powerpc:next-test] BUILD SUCCESS c8f80f95da32618d49c2a068ee561b85655b761d
From: kernel test robot @ 2026-06-03 15:39 UTC
  To: Madhavan Srinivasan; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: c8f80f95da32618d49c2a068ee561b85655b761d  powerpc/pseries/lparcfg: Replace deprecated strcpy in parse_system_parameter_string

elapsed time: 1977m

configs tested: 160
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha                             allnoconfig    gcc-15.2.0
alpha                            allyesconfig    gcc-15.2.0
alpha                               defconfig    gcc-15.2.0
arc                              allmodconfig    gcc-15.2.0
arc                               allnoconfig    gcc-15.2.0
arc                              allyesconfig    gcc-15.2.0
arc                                 defconfig    gcc-15.2.0
arc                   randconfig-001-20260603    gcc-13.4.0
arc                   randconfig-002-20260603    gcc-8.5.0
arm                               allnoconfig    clang-23
arm                              allyesconfig    gcc-15.2.0
arm                                 defconfig    clang-23
arm                   randconfig-001-20260603    gcc-13.4.0
arm                   randconfig-002-20260603    gcc-10.5.0
arm                   randconfig-003-20260603    gcc-8.5.0
arm                   randconfig-004-20260603    gcc-11.5.0
arm64                            allmodconfig    clang-19
arm64                             allnoconfig    gcc-15.2.0
arm64                               defconfig    gcc-15.2.0
arm64                 randconfig-001-20260603    clang-20
arm64                 randconfig-002-20260603    gcc-11.5.0
arm64                 randconfig-003-20260603    clang-23
arm64                 randconfig-004-20260603    gcc-14.3.0
csky                             allmodconfig    gcc-15.2.0
csky                              allnoconfig    gcc-15.2.0
csky                                defconfig    gcc-15.2.0
csky                  randconfig-001-20260603    gcc-10.5.0
csky                  randconfig-002-20260603    gcc-12.5.0
hexagon                          allmodconfig    clang-17
hexagon                           allnoconfig    clang-23
hexagon                             defconfig    clang-23
hexagon               randconfig-001-20260603    clang-16
hexagon               randconfig-002-20260603    clang-23
i386                             allmodconfig    gcc-14
i386                              allnoconfig    gcc-14
i386                             allyesconfig    gcc-14
i386        buildonly-randconfig-001-20260603    gcc-14
i386        buildonly-randconfig-002-20260603    gcc-14
i386        buildonly-randconfig-003-20260603    clang-20
i386        buildonly-randconfig-004-20260603    gcc-12
i386        buildonly-randconfig-005-20260603    gcc-14
i386        buildonly-randconfig-006-20260603    clang-20
i386                                defconfig    clang-20
i386                  randconfig-001-20260603    gcc-14
i386                  randconfig-002-20260603    gcc-14
i386                  randconfig-003-20260603    gcc-14
i386                  randconfig-004-20260603    gcc-14
i386                  randconfig-005-20260603    clang-20
i386                  randconfig-006-20260603    gcc-12
i386                  randconfig-007-20260603    clang-20
i386                  randconfig-011-20260603    gcc-14
i386                  randconfig-012-20260603    gcc-14
i386                  randconfig-013-20260603    clang-20
i386                  randconfig-014-20260603    clang-20
i386                  randconfig-015-20260603    clang-20
i386                  randconfig-016-20260603    gcc-14
i386                  randconfig-017-20260603    clang-20
loongarch                        allmodconfig    clang-19
loongarch                         allnoconfig    clang-23
loongarch                           defconfig    clang-19
loongarch             randconfig-001-20260603    gcc-16.1.0
loongarch             randconfig-002-20260603    clang-23
m68k                             allmodconfig    gcc-15.2.0
m68k                              allnoconfig    gcc-15.2.0
m68k                             allyesconfig    gcc-15.2.0
m68k                                defconfig    gcc-15.2.0
microblaze                        allnoconfig    gcc-15.2.0
microblaze                       allyesconfig    gcc-15.2.0
microblaze                          defconfig    gcc-15.2.0
mips                             allmodconfig    gcc-15.2.0
mips                              allnoconfig    gcc-15.2.0
mips                             allyesconfig    gcc-15.2.0
nios2                            allmodconfig    gcc-11.5.0
nios2                             allnoconfig    gcc-11.5.0
nios2                               defconfig    gcc-11.5.0
nios2                 randconfig-001-20260603    gcc-8.5.0
nios2                 randconfig-002-20260603    gcc-8.5.0
openrisc                         allmodconfig    gcc-15.2.0
openrisc                          allnoconfig    gcc-15.2.0
openrisc                            defconfig    gcc-15.2.0
parisc                           allmodconfig    gcc-15.2.0
parisc                            allnoconfig    gcc-15.2.0
parisc                           allyesconfig    gcc-15.2.0
parisc                              defconfig    gcc-15.2.0
parisc                randconfig-001-20260603    gcc-8.5.0
parisc                randconfig-002-20260603    gcc-15.2.0
parisc64                            defconfig    gcc-15.2.0
powerpc                          allmodconfig    gcc-15.2.0
powerpc                           allnoconfig    gcc-15.2.0
powerpc               randconfig-001-20260603    gcc-13.4.0
powerpc               randconfig-002-20260603    gcc-13.4.0
powerpc64             randconfig-001-20260603    clang-16
powerpc64             randconfig-002-20260603    gcc-10.5.0
riscv                            allmodconfig    clang-23
riscv                             allnoconfig    gcc-15.2.0
riscv                            allyesconfig    clang-16
riscv                               defconfig    clang-23
riscv                 randconfig-001-20260603    gcc-8.5.0
riscv                 randconfig-002-20260603    gcc-8.5.0
s390                             allmodconfig    clang-18
s390                              allnoconfig    clang-23
s390                             allyesconfig    gcc-15.2.0
s390                                defconfig    clang-23
s390                  randconfig-001-20260603    clang-16
s390                  randconfig-002-20260603    gcc-13.4.0
sh                               allmodconfig    gcc-15.2.0
sh                                allnoconfig    gcc-15.2.0
sh                               allyesconfig    gcc-15.2.0
sh                                  defconfig    gcc-15.2.0
sh                          polaris_defconfig    gcc-15.2.0
sh                    randconfig-001-20260603    gcc-10.5.0
sh                    randconfig-002-20260603    gcc-13.4.0
sparc                             allnoconfig    gcc-15.2.0
sparc                               defconfig    gcc-15.2.0
sparc                 randconfig-001-20260603    gcc-8.5.0
sparc                 randconfig-002-20260603    gcc-16.1.0
sparc64                          allmodconfig    clang-23
sparc64                             defconfig    clang-20
sparc64               randconfig-001-20260603    clang-23
sparc64               randconfig-002-20260603    clang-20
um                               allmodconfig    clang-19
um                                allnoconfig    clang-23
um                               allyesconfig    gcc-14
um                                  defconfig    clang-23
um                             i386_defconfig    gcc-14
um                    randconfig-001-20260603    gcc-14
um                    randconfig-002-20260603    gcc-14
um                           x86_64_defconfig    clang-23
x86_64                           allmodconfig    clang-20
x86_64                            allnoconfig    clang-20
x86_64                           allyesconfig    clang-20
x86_64      buildonly-randconfig-001-20260603    gcc-14
x86_64      buildonly-randconfig-002-20260603    gcc-14
x86_64      buildonly-randconfig-003-20260603    clang-20
x86_64      buildonly-randconfig-004-20260603    clang-20
x86_64      buildonly-randconfig-005-20260603    clang-20
x86_64      buildonly-randconfig-006-20260603    gcc-14
x86_64                              defconfig    gcc-14
x86_64                randconfig-001-20260603    clang-20
x86_64                randconfig-002-20260603    clang-20
x86_64                randconfig-003-20260603    clang-20
x86_64                randconfig-004-20260603    clang-20
x86_64                randconfig-005-20260603    clang-20
x86_64                randconfig-006-20260603    gcc-14
x86_64                randconfig-011-20260603    gcc-14
x86_64                randconfig-012-20260603    gcc-14
x86_64                randconfig-013-20260603    clang-20
x86_64                randconfig-014-20260603    gcc-14
x86_64                randconfig-015-20260603    gcc-14
x86_64                randconfig-016-20260603    clang-20
x86_64                randconfig-071-20260603    clang-20
x86_64                randconfig-072-20260603    gcc-14
x86_64                randconfig-073-20260603    gcc-12
x86_64                randconfig-074-20260603    clang-20
x86_64                randconfig-075-20260603    clang-20
x86_64                randconfig-076-20260603    gcc-14
x86_64                          rhel-9.4-rust    clang-20
xtensa                            allnoconfig    gcc-15.2.0
xtensa                randconfig-001-20260603    gcc-9.5.0
xtensa                randconfig-002-20260603    gcc-15.2.0

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2] scsi: core: Remove remaining references to the pktcdvd driver
From: John Garry @ 2026-06-03 15:36 UTC
  To: Catalin Iacob, Thomas Bogendoerfer, Madhavan Srinivasan,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy (CS GROUP),
	Rich Felker, John Paul Adrian Glaubitz, David S. Miller,
	Andreas Larsson, James E.J. Bottomley, Martin K. Petersen,
	Jens Axboe
  Cc: linux-mips, linux-kernel, linuxppc-dev, linux-sh, sparclinux,
	linux-scsi
In-Reply-To: <20260603-remove-pktcdvd-references-v2-1-c4402154d53a@gmail.com>

On 03/06/2026 14:27, Catalin Iacob wrote:
> Commit 1cea5180f2f8 ("block: remove pktcdvd driver") left behind some
> CONFIG_CONFIG_CDROM_PKTCDVD* references in defconfigs and around an
> export. Remove them.
> 
> Signed-off-by: Catalin Iacob <iacobcatalin@gmail.com>
> ---
> Found this incidentally while looking at kernel sources to understand
> what pktcdvd is
> ---
> Changes in v2:
> - Reworded commit message on John Paul Adrian's suggestion to be about
>    the removed references not the export symbol
> - Link to v1: https://patch.msgid.link/20260530-remove-pktcdvd-references-v1-1-aa56941d4315@gmail.com
> ---
>   arch/mips/configs/fuloong2e_defconfig    | 1 -
>   arch/mips/configs/ip22_defconfig         | 1 -
>   arch/mips/configs/ip27_defconfig         | 1 -
>   arch/mips/configs/ip30_defconfig         | 1 -
>   arch/mips/configs/jazz_defconfig         | 1 -
>   arch/mips/configs/malta_defconfig        | 1 -
>   arch/mips/configs/malta_kvm_defconfig    | 1 -
>   arch/mips/configs/maltaup_xpa_defconfig  | 1 -
>   arch/mips/configs/rm200_defconfig        | 1 -
>   arch/mips/configs/sb1250_swarm_defconfig | 1 -
>   arch/powerpc/configs/g5_defconfig        | 1 -
>   arch/powerpc/configs/ppc6xx_defconfig    | 1 -
>   arch/sh/configs/sh2007_defconfig         | 1 -
>   arch/sparc/configs/sparc64_defconfig     | 2 --

Obviously none of the changes above are related to the SCSI core, so they
can be made separately.

>   drivers/scsi/scsi_lib.c                  | 8 --------
>   15 files changed, 23 deletions(-)
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 85eef401925a..b67f0dc79499 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2224,14 +2224,6 @@ struct scsi_device *scsi_device_from_queue(struct request_queue *q)
>   
>   	return sdev;
>   }
> -/*
> - * pktcdvd should have been integrated into the SCSI layers, but for historical
> - * reasons like the old IDE driver it isn't.  This export allows it to safely
> - * probe if a given device is a SCSI one and only attach to that.
> - */
> -#ifdef CONFIG_CDROM_PKTCDVD_MODULE
> -EXPORT_SYMBOL_GPL(scsi_device_from_queue);
> -#endif
>   

I also think that the prototype of scsi_device_from_queue() can be
relocated from include/scsi/scsi_device.h to drivers/scsi/scsi_priv.h.

>   /**
>    * scsi_block_requests - Utility function used by low-level drivers to prevent




* Re: [PATCH] KVM: PPC: Book3S HV: Validate arch_compat against host compatibility mode
From: Ritesh Harjani @ 2026-06-03 15:11 UTC
  To: Amit Machhiwal, linuxppc-dev, Madhavan Srinivasan
  Cc: Amit Machhiwal, Vaibhav Jain, Harsh Prateek Bora, Anushree Mathur,
	Nicholas Piggin, Michael Ellerman, Christophe Leroy (CS GROUP),
	kvm, stable, linux-kernel
In-Reply-To: <20260603141539.47620-1-amachhiw@linux.ibm.com>

Amit Machhiwal <amachhiw@linux.ibm.com> writes:

> On IBM POWER systems, newer processor generations can operate in
> compatibility modes corresponding to earlier generations. This becomes
> relevant for nested virtualization, where nested KVM guests may need to
> run with a specific processor compatibility level.
>
> Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
> logical partition (L1) booted in Power10 compatibility mode, the guest
> fails to boot while setting 'arch_compat'. This happens because the CPU
> class is derived from the hardware PVR (via mfspr()), which reflects the
> physical processor generation (Power11), rather than the effective
> compatibility mode (Power10).
>
> As a result, userspace may request a Power11 arch_compat for the L2
> guest. However, the L1 partition, running in Power10 compatibility, has
> only negotiated support up to Power10 with the Power Hypervisor (L0).
> When H_GUEST_SET_STATE is invoked with a Power11 Logical PVR, the
> hypervisor rejects the request, leading to a late guest boot failure:
>
>   KVM-NESTEDv2: couldn't set guest wide elements
>   [..KVM reg dump..]
>

Thanks! It makes sense to return a proper error code to the user (VMM)
while the VM/VCPU are being initialized, rather than have the guest fail
to boot with a weird error like this at the time the kernel makes the
H_GUEST_SET_STATE hcall.

> This situation should be detected earlier. Rejecting unsupported
> 'arch_compat' values in 'kvmppc_set_arch_compat()' avoids issuing an
> invalid H_GUEST_SET_STATE hcall and provides a clearer failure mode.
>
> Add a check to reject Power11 'arch_compat' requests when the host is
> running in Power10 compatibility mode, returning -EINVAL early instead
> of deferring the failure to the hypervisor.
>
> Suggested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> Tested-by: Anushree Mathur <anushree.mathur@linux.ibm.com>
> Cc: <stable@vger.kernel.org> # v6.13+

Sure, v6.13 sounds fair as you pointed out.

> Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
> ---
> Changelog:
>
> * Moved this patch out of the v3 series [1] as discussed here [2]
> * Addressed below review comments from Ritesh:
>   - Based the PVR validation on cpu features
>   - Fixed hcall name typo
>   - Stable backport

The changes look good to me. Please feel free to add:

Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>




* Re: [PATCH v3 03/19] powerpc/mm: Fix wrong addr_pfn tracking in compound vmemmap population
From: Ritesh Harjani @ 2026-06-03 14:36 UTC
  To: Muchun Song, Oscar Salvador, David Hildenbrand, Andrew Morton,
	Madhavan Srinivasan, Michael Ellerman
  Cc: Muchun Song, Mike Rapoport, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, linux-mm, linux-kernel, Nicholas Piggin,
	Christophe Leroy (CS GROUP), Aneesh Kumar K.V, linuxppc-dev,
	Mike Kravetz, Muchun Song
In-Reply-To: <20260602101039.1867613-4-songmuchun@bytedance.com>

Muchun Song <songmuchun@bytedance.com> writes:

> vmemmap_populate_compound_pages() uses addr_pfn to determine the PFN
> offset within a compound page and to decide whether the current
> vmemmap slot should be populated as a head page mapping or should reuse
> a tail page mapping.
>
> However, addr_pfn is advanced manually in parallel with addr.  The loop
> itself progresses in vmemmap address space, so each PAGE_SIZE step in
> addr covers PAGE_SIZE / sizeof(struct page) struct page slots.  Since
> addr_pfn is compared against nr_pages in data-PFN units, it should
> advance by the same number of PFNs.  The existing manual increments do
> not match that and therefore do not reliably track the PFN
> corresponding to the current addr.
>
> As a result, pfn_offset can be computed from the wrong PFN and the code
> can make the head/tail decision for the wrong compound-page position.
>
> Fix this by deriving addr_pfn directly from the current vmemmap address
> instead of carrying it as loop state.
>
> Fixes: f2b79c0d7968 ("powerpc/book3s64/radix: add support for vmemmap optimization for radix")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Acked-by: Oscar Salvador <osalvador@suse.de>

Thanks for fixing it. I guess this was not caught because the section size
on powerpc is 16MB, and with a 64K page size we have 256 pfns to map. The
vmemmap size required for this is 256 * sizeof(struct page) = 16KB, which
is < 64K (page size). So basically we never loop in
vmemmap_populate_compound_pages(), because
next = addr + PAGE_SIZE will be > end after the first iteration itself.

But I agree this is a bug which needs fixing and it can be easily caught
with 4K pagesize, where we have 4096 pfns to map within a 16MB section.


The change looks good to me. Can we please add a stable tag too?
Cc: stable@kernel.org

Also, feel free to add:
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>



* Re: [PATCH v3 1/5] KVM: PPC: Book3S HV: Validate arch_compat against host compatibility mode
From: Amit Machhiwal @ 2026-06-03 14:26 UTC
  To: Ritesh Harjani, Madhavan Srinivasan
  Cc: Harsh Prateek Bora, Vaibhav Jain, Amit Machhiwal, linuxppc-dev,
	Anushree Mathur, Paolo Bonzini, Nicholas Piggin, Michael Ellerman,
	Christophe Leroy (CS GROUP), Jonathan Corbet, Shuah Khan, kvm,
	linux-kernel, linux-doc, lkp
In-Reply-To: <bjdsw43g.ritesh.list@gmail.com>

Hi Maddy, Ritesh,

I have made the suggested changes and posted the separate patch here:

 https://lore.kernel.org/all/20260603141539.47620-1-amachhiw@linux.ibm.com/

<..snip..>

> > Hence IMHO, this patch can be marked for the stable tree and is a
> > potential candidate for the 7.2 merge window. But I don't see the
> > applicability of a 'Fixes' tag to this patch.
> 
> I agree, we need not use a Fixes tag then. So, we shall mark this
> with the v6.10 tag then.
> 
> Cc: stable@vger.kernel.org # v6.10+

Please note that I have marked the patch for stable v6.13+, as KVM
support for Power11 was added via 96e266e3bcd6 ("KVM: PPC: Book3S HV:
Add Power11 capability support for Nested PAPR guests"). That commit
also introduced CPU_FTR_P11_PVR, on which the compat PVR check in the
patch is based.

Thanks,
Amit



* [PATCH] KVM: PPC: Book3S HV: Validate arch_compat against host compatibility mode
From: Amit Machhiwal @ 2026-06-03 14:15 UTC (permalink / raw)
  To: linuxppc-dev, Madhavan Srinivasan
  Cc: Amit Machhiwal, Vaibhav Jain, Harsh Prateek Bora, Ritesh Harjani,
	Anushree Mathur, Nicholas Piggin, Michael Ellerman,
	Christophe Leroy (CS GROUP), kvm, stable, linux-kernel

On IBM POWER systems, newer processor generations can operate in
compatibility modes corresponding to earlier generations. This becomes
relevant for nested virtualization, where nested KVM guests may need to
run with a specific processor compatibility level.

Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
logical partition (L1) booted in Power10 compatibility mode, the guest
fails to boot while setting 'arch_compat'. This happens because the CPU
class is derived from the hardware PVR (via mfspr()), which reflects the
physical processor generation (Power11), rather than the effective
compatibility mode (Power10).

As a result, userspace may request a Power11 arch_compat for the L2
guest. However, the L1 partition, running in Power10 compatibility, has
only negotiated support up to Power10 with the Power Hypervisor (L0).
When H_GUEST_SET_STATE is invoked with a Power11 Logical PVR, the
hypervisor rejects the request, leading to a late guest boot failure:

  KVM-NESTEDv2: couldn't set guest wide elements
  [..KVM reg dump..]

This situation should be detected earlier. Rejecting unsupported
'arch_compat' values in 'kvmppc_set_arch_compat()' avoids issuing an
invalid H_GUEST_SET_STATE hcall and provides a clearer failure mode.

Add a check to reject Power11 'arch_compat' requests when the host is
running in Power10 compatibility mode, returning -EINVAL early instead
of deferring the failure to the hypervisor.

Suggested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Tested-by: Anushree Mathur <anushree.mathur@linux.ibm.com>
Cc: <stable@vger.kernel.org> # v6.13+
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
---
Changelog:

* Moved this patch out of the v3 series [1] as discussed here [2]
* Addressed below review comments from Ritesh:
  - Based the PVR validation on cpu features
  - Fixed hcall name typo
  - Stable backport

[1] https://lore.kernel.org/all/20260522152744.55251-1-amachhiw@linux.ibm.com/
[2] https://lore.kernel.org/all/20260522152744.55251-2-amachhiw@linux.ibm.com/
---
 arch/powerpc/kvm/book3s_hv.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 61dbeea317f3..e16dbb199366 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -446,7 +446,17 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
 			guest_pcr_bit = PCR_ARCH_300;
 			break;
 		case PVR_ARCH_31:
+			guest_pcr_bit = PCR_ARCH_31;
+			break;
 		case PVR_ARCH_31_P11:
+			/*
+			 * Need to check this for ISA 3.1, as Power10 and
+			 * Power11 share the same PCR. For any subsequent ISA
+			 * versions, this will be taken care of by the guest vs
+			 * host PCR comparison below.
+			 */
+			if (!cpu_has_feature(CPU_FTR_P11_PVR))
+				return -EINVAL;
 			guest_pcr_bit = PCR_ARCH_31;
 			break;
 		default:

base-commit: ba3e43a9e601636f5edb54e259a74f96ca3b8fd8
-- 
2.50.1 (Apple Git-155)




* [PATCH v2] scsi: core: Remove remaining references to the pktcdvd driver
From: Catalin Iacob @ 2026-06-03 13:27 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Madhavan Srinivasan, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy (CS GROUP), Rich Felker,
	John Paul Adrian Glaubitz, David S. Miller, Andreas Larsson,
	James E.J. Bottomley, Martin K. Petersen, Jens Axboe
  Cc: linux-mips, linux-kernel, linuxppc-dev, linux-sh, sparclinux,
	linux-scsi, Catalin Iacob
In-Reply-To: <20260530-remove-pktcdvd-references-v1-1-aa56941d4315@gmail.com>

Commit 1cea5180f2f8 ("block: remove pktcdvd driver") left behind some
CONFIG_CDROM_PKTCDVD* references in defconfigs and around an
export. Remove them.

Signed-off-by: Catalin Iacob <iacobcatalin@gmail.com>
---
Found this incidentally while looking at kernel sources to understand
what pktcdvd is.
---
Changes in v2:
- Reworded commit message on John Paul Adrian's suggestion to be about
  the removed references not the export symbol
- Link to v1: https://patch.msgid.link/20260530-remove-pktcdvd-references-v1-1-aa56941d4315@gmail.com
---
 arch/mips/configs/fuloong2e_defconfig    | 1 -
 arch/mips/configs/ip22_defconfig         | 1 -
 arch/mips/configs/ip27_defconfig         | 1 -
 arch/mips/configs/ip30_defconfig         | 1 -
 arch/mips/configs/jazz_defconfig         | 1 -
 arch/mips/configs/malta_defconfig        | 1 -
 arch/mips/configs/malta_kvm_defconfig    | 1 -
 arch/mips/configs/maltaup_xpa_defconfig  | 1 -
 arch/mips/configs/rm200_defconfig        | 1 -
 arch/mips/configs/sb1250_swarm_defconfig | 1 -
 arch/powerpc/configs/g5_defconfig        | 1 -
 arch/powerpc/configs/ppc6xx_defconfig    | 1 -
 arch/sh/configs/sh2007_defconfig         | 1 -
 arch/sparc/configs/sparc64_defconfig     | 2 --
 drivers/scsi/scsi_lib.c                  | 8 --------
 15 files changed, 23 deletions(-)

diff --git a/arch/mips/configs/fuloong2e_defconfig b/arch/mips/configs/fuloong2e_defconfig
index b6fe3c962464..840130a73992 100644
--- a/arch/mips/configs/fuloong2e_defconfig
+++ b/arch/mips/configs/fuloong2e_defconfig
@@ -89,7 +89,6 @@ CONFIG_MTD_CFI_STAA=m
 CONFIG_MTD_PHYSMAP=m
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=m
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_BLK_DEV_SD=y
 CONFIG_BLK_DEV_SR=y
diff --git a/arch/mips/configs/ip22_defconfig b/arch/mips/configs/ip22_defconfig
index e123848f94ab..61f09cc9ac12 100644
--- a/arch/mips/configs/ip22_defconfig
+++ b/arch/mips/configs/ip22_defconfig
@@ -177,7 +177,6 @@ CONFIG_NET_ACT_SIMP=m
 CONFIG_NET_ACT_SKBEDIT=m
 CONFIG_RFKILL=m
 CONFIG_CONNECTOR=m
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_RAID_ATTRS=m
 CONFIG_SCSI=y
diff --git a/arch/mips/configs/ip27_defconfig b/arch/mips/configs/ip27_defconfig
index fea0ccee6948..60da9cf71b72 100644
--- a/arch/mips/configs/ip27_defconfig
+++ b/arch/mips/configs/ip27_defconfig
@@ -83,7 +83,6 @@ CONFIG_CFG80211=m
 CONFIG_MAC80211=m
 CONFIG_RFKILL=m
 CONFIG_BLK_DEV_LOOP=y
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_SCSI=y
 CONFIG_BLK_DEV_SD=y
diff --git a/arch/mips/configs/ip30_defconfig b/arch/mips/configs/ip30_defconfig
index 718f3060d9fa..5c2911ff9a87 100644
--- a/arch/mips/configs/ip30_defconfig
+++ b/arch/mips/configs/ip30_defconfig
@@ -77,7 +77,6 @@ CONFIG_NET_ACT_PEDIT=m
 CONFIG_NET_ACT_SKBEDIT=m
 # CONFIG_VGA_ARB is not set
 CONFIG_BLK_DEV_LOOP=y
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_SCSI=y
 CONFIG_BLK_DEV_SD=y
diff --git a/arch/mips/configs/jazz_defconfig b/arch/mips/configs/jazz_defconfig
index a790c2610fd3..dd3486b8d1fc 100644
--- a/arch/mips/configs/jazz_defconfig
+++ b/arch/mips/configs/jazz_defconfig
@@ -33,7 +33,6 @@ CONFIG_BLK_DEV_FD=m
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=m
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_RAID_ATTRS=m
 CONFIG_SCSI=y
diff --git a/arch/mips/configs/malta_defconfig b/arch/mips/configs/malta_defconfig
index 81704ec67f09..b10dac71f400 100644
--- a/arch/mips/configs/malta_defconfig
+++ b/arch/mips/configs/malta_defconfig
@@ -224,7 +224,6 @@ CONFIG_BLK_DEV_FD=m
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=y
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_RAID_ATTRS=m
 CONFIG_BLK_DEV_SD=y
diff --git a/arch/mips/configs/malta_kvm_defconfig b/arch/mips/configs/malta_kvm_defconfig
index 82a97f58bce1..bdd5d99884e3 100644
--- a/arch/mips/configs/malta_kvm_defconfig
+++ b/arch/mips/configs/malta_kvm_defconfig
@@ -228,7 +228,6 @@ CONFIG_BLK_DEV_FD=m
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=y
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_RAID_ATTRS=m
 CONFIG_BLK_DEV_SD=y
diff --git a/arch/mips/configs/maltaup_xpa_defconfig b/arch/mips/configs/maltaup_xpa_defconfig
index 0f9ef20744f9..523c0ff329ac 100644
--- a/arch/mips/configs/maltaup_xpa_defconfig
+++ b/arch/mips/configs/maltaup_xpa_defconfig
@@ -226,7 +226,6 @@ CONFIG_BLK_DEV_FD=m
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=y
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_RAID_ATTRS=m
 CONFIG_BLK_DEV_SD=y
diff --git a/arch/mips/configs/rm200_defconfig b/arch/mips/configs/rm200_defconfig
index ad9fbd0cbb38..60054e54bc5a 100644
--- a/arch/mips/configs/rm200_defconfig
+++ b/arch/mips/configs/rm200_defconfig
@@ -177,7 +177,6 @@ CONFIG_PARIDE_ON26=m
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=m
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_RAID_ATTRS=m
 CONFIG_SCSI=y
diff --git a/arch/mips/configs/sb1250_swarm_defconfig b/arch/mips/configs/sb1250_swarm_defconfig
index 4a25b8d3e507..a50a7c097542 100644
--- a/arch/mips/configs/sb1250_swarm_defconfig
+++ b/arch/mips/configs/sb1250_swarm_defconfig
@@ -43,7 +43,6 @@ CONFIG_FW_LOADER=m
 CONFIG_CONNECTOR=m
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=9220
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_ATA_OVER_ETH=m
 CONFIG_RAID_ATTRS=m
 CONFIG_BLK_DEV_SD=y
diff --git a/arch/powerpc/configs/g5_defconfig b/arch/powerpc/configs/g5_defconfig
index 5ca1676e6058..647775f6d174 100644
--- a/arch/powerpc/configs/g5_defconfig
+++ b/arch/powerpc/configs/g5_defconfig
@@ -57,7 +57,6 @@ CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=65536
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_BLK_DEV_SD=y
 CONFIG_CHR_DEV_ST=y
 CONFIG_BLK_DEV_SR=y
diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig
index eda1fec7ffd9..5c3e25fd8edd 100644
--- a/arch/powerpc/configs/ppc6xx_defconfig
+++ b/arch/powerpc/configs/ppc6xx_defconfig
@@ -306,7 +306,6 @@ CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=16384
-CONFIG_CDROM_PKTCDVD=m
 CONFIG_VIRTIO_BLK=m
 CONFIG_ENCLOSURE_SERVICES=m
 CONFIG_SENSORS_TSL2550=m
diff --git a/arch/sh/configs/sh2007_defconfig b/arch/sh/configs/sh2007_defconfig
index 5d9080499485..f287a41cd38c 100644
--- a/arch/sh/configs/sh2007_defconfig
+++ b/arch/sh/configs/sh2007_defconfig
@@ -45,7 +45,6 @@ CONFIG_NETWORK_SECMARK=y
 CONFIG_NET_PKTGEN=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
-CONFIG_CDROM_PKTCDVD=y
 CONFIG_RAID_ATTRS=y
 CONFIG_SCSI=y
 CONFIG_BLK_DEV_SD=y
diff --git a/arch/sparc/configs/sparc64_defconfig b/arch/sparc/configs/sparc64_defconfig
index 632081a262ba..4abea39281cd 100644
--- a/arch/sparc/configs/sparc64_defconfig
+++ b/arch/sparc/configs/sparc64_defconfig
@@ -60,8 +60,6 @@ CONFIG_CONNECTOR=m
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_CRYPTOLOOP=m
 CONFIG_BLK_DEV_NBD=m
-CONFIG_CDROM_PKTCDVD=m
-CONFIG_CDROM_PKTCDVD_WCACHE=y
 CONFIG_ATA_OVER_ETH=m
 CONFIG_SUNVDC=m
 CONFIG_ATA=y
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 85eef401925a..b67f0dc79499 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2224,14 +2224,6 @@ struct scsi_device *scsi_device_from_queue(struct request_queue *q)
 
 	return sdev;
 }
-/*
- * pktcdvd should have been integrated into the SCSI layers, but for historical
- * reasons like the old IDE driver it isn't.  This export allows it to safely
- * probe if a given device is a SCSI one and only attach to that.
- */
-#ifdef CONFIG_CDROM_PKTCDVD_MODULE
-EXPORT_SYMBOL_GPL(scsi_device_from_queue);
-#endif
 
 /**
  * scsi_block_requests - Utility function used by low-level drivers to prevent

---
base-commit: e43ffb69e0438cddd72aaa30898b4dc446f664f8
change-id: 20260530-remove-pktcdvd-references-9d5c6362a5de

Best regards,
--  
Catalin Iacob <iacobcatalin@gmail.com>




* [PATCH v3 phy-next 15/16] phy: lynx-10g: new driver
From: Vladimir Oltean @ 2026-06-03 13:21 UTC (permalink / raw)
  To: linux-phy
  Cc: Ioana Ciornei, Vinod Koul, Neil Armstrong, Tanjeff Moos,
	linux-kernel, devicetree, Conor Dooley, Krzysztof Kozlowski,
	Rob Herring, linux-arm-kernel, chleroy, linuxppc-dev
In-Reply-To: <20260603131914.503053-1-vladimir.oltean@nxp.com>

Introduce a driver for the networking lanes of the 10G Lynx SerDes
block, present on the majority of Layerscape and QorIQ (Freescale/NXP)
SoCs.

As with the 28G Lynx, the SerDes lanes come pre-initialized out of
reset and the consumers use them that way outside the Generic PHY
framework (for networking, the static configuration remains for the
entire SoC lifetime, whereas for SATA and PCIe, the hardware
reconfigures itself automatically for other link speeds).

The need for the Generic PHY framework arises specifically in networking
use cases where a static lane configuration is not sufficient. For
example, a network MAC may be connected to an SFP cage, into which
various SFP or SFP+ modules can be plugged. Each of them may require a
different SerDes protocol (SGMII, 1000Base-X, 10GBase-R), which phylink +
sfp-bus are responsible for figuring out. The phylink drivers are:
- enetc
- felix
- dpaa_eth (fman_memac)
- dpaa2-eth
- dpaa2-switch

and they all need to reconfigure the SerDes for the requested link mode,
using phy_set_mode_ext() (and phy_validate() to see if it is supported
in the first place).

Note that SerDes 2 on LS1088A is exclusively non-networking, so there is
currently no need for this driver there. Therefore we skip matching on
its compatible string and do not probe on that device.

Co-developed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
Cc: devicetree@vger.kernel.org
Cc: Conor Dooley <conor+dt@kernel.org>
Cc: Krzysztof Kozlowski <krzk+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: chleroy@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org

v2->v3:
- fix lynx_10g_power_on() procedure
- include <linux/of.h> instead of <linux/of_device.h>
- fix build warning introduced in v2 in lynx_10g_lane_set_nrate()
v1->v2:
- move lynx_lane_restrict_fixed_mode_change() to lynx-core, even though
  the 28G Lynx as instantiated in LX2 does not have QSGMII.
- lynx_10g_validate() now calls the new lynx_phy_mode_to_lane_mode()
  which does verify that the current lane mode is supported
- avoid line size checkpatch warnings in lynx_10g_lane_set_nrate() by
  saving the nrate to a variable and calling lynx_lane_rmw() only once
- remove redundant "if (!lane->powered_up)" checks from
  lynx_10g_lane_halt() and lynx_10g_lane_reset() - also checked at
  the only call site, lynx_10g_set_mode(), as in lynx-28g
- expand CC list (flagged by Patchwork)
---
 drivers/phy/freescale/Kconfig             |   10 +
 drivers/phy/freescale/Makefile            |    1 +
 drivers/phy/freescale/phy-fsl-lynx-10g.c  | 1278 +++++++++++++++++++++
 drivers/phy/freescale/phy-fsl-lynx-core.c |   38 +
 drivers/phy/freescale/phy-fsl-lynx-core.h |    4 +
 include/soc/fsl/phy-fsl-lynx.h            |   27 +
 6 files changed, 1358 insertions(+)
 create mode 100644 drivers/phy/freescale/phy-fsl-lynx-10g.c

diff --git a/drivers/phy/freescale/Kconfig b/drivers/phy/freescale/Kconfig
index ac575d531db7..5bf3864fbe64 100644
--- a/drivers/phy/freescale/Kconfig
+++ b/drivers/phy/freescale/Kconfig
@@ -54,6 +54,16 @@ endif
 config PHY_FSL_LYNX_CORE
 	tristate
 
+config PHY_FSL_LYNX_10G
+	tristate "Freescale Layerscape Lynx 10G SerDes PHY support"
+	depends on OF
+	depends on ARCH_LAYERSCAPE || COMPILE_TEST
+	select GENERIC_PHY
+	select PHY_FSL_LYNX_CORE
+	help
+	  Enable this to add support for the Lynx 10G SerDes PHY as found on
+	  NXP's Layerscape platform such as LS1088A or LS1028A.
+
 config PHY_FSL_LYNX_28G
 	tristate "Freescale Layerscape Lynx 28G SerDes PHY support"
 	depends on OF
diff --git a/drivers/phy/freescale/Makefile b/drivers/phy/freescale/Makefile
index d7aa62cdeb39..5b0e180d6972 100644
--- a/drivers/phy/freescale/Makefile
+++ b/drivers/phy/freescale/Makefile
@@ -5,5 +5,6 @@ obj-$(CONFIG_PHY_MIXEL_MIPI_DPHY)	+= phy-fsl-imx8-mipi-dphy.o
 obj-$(CONFIG_PHY_FSL_IMX8M_PCIE)	+= phy-fsl-imx8m-pcie.o
 obj-$(CONFIG_PHY_FSL_IMX8QM_HSIO)	+= phy-fsl-imx8qm-hsio.o
 obj-$(CONFIG_PHY_FSL_LYNX_CORE)		+= phy-fsl-lynx-core.o
+obj-$(CONFIG_PHY_FSL_LYNX_10G)		+= phy-fsl-lynx-10g.o
 obj-$(CONFIG_PHY_FSL_LYNX_28G)		+= phy-fsl-lynx-28g.o
 obj-$(CONFIG_PHY_FSL_SAMSUNG_HDMI_PHY)	+= phy-fsl-samsung-hdmi.o
diff --git a/drivers/phy/freescale/phy-fsl-lynx-10g.c b/drivers/phy/freescale/phy-fsl-lynx-10g.c
new file mode 100644
index 000000000000..7dd5d94b51cf
--- /dev/null
+++ b/drivers/phy/freescale/phy-fsl-lynx-10g.c
@@ -0,0 +1,1278 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* Copyright 2021-2026 NXP */
+
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/phy.h>
+#include <linux/phy/phy.h>
+#include <linux/platform_device.h>
+#include <linux/workqueue.h>
+
+#include "phy-fsl-lynx-core.h"
+
+/* SoC IP wrapper for protocol converters */
+#define PCCR8				0x220
+#define PCCR8_SGMIIa_KX			BIT(3)
+#define PCCR8_SGMIIa_CFG		BIT(0)
+
+#define PCCR9				0x224
+#define PCCR9_QSGMIIa_CFG		BIT(0)
+#define PCCR9_QXGMIIa_CFG		BIT(0)
+
+#define PCCRB				0x22c
+#define PCCRB_XFIa_CFG			BIT(0)
+#define PCCRB_SXGMIIa_CFG		BIT(0)
+
+#define SGMII_CFG(id)			(28 - (id) * 4)
+#define QSGMII_CFG(id)			(28 - (id) * 4)
+#define SXGMII_CFG(id)			(28 - (id) * 4)
+#define QXGMII_CFG(id)			(12 - (id) * 4)
+#define XFI_CFG(id)			(28 - (id) * 4)
+
+#define CR(x)				((x) * 4)
+
+#define A				0
+#define B				1
+#define C				2
+#define D				3
+#define E				4
+#define F				5
+#define G				6
+#define H				7
+
+#define SGMIIaCR0(id)			(0x1800 + (id) * 0x10)
+#define QSGMIIaCR0(id)			(0x1880 + (id) * 0x10)
+#define XAUIaCR0(id)			(0x1900 + (id) * 0x10)
+#define XFIaCR0(id)			(0x1980 + (id) * 0x10)
+#define SXGMIIaCR0(id)			(0x1a80 + (id) * 0x10)
+#define QXGMIIaCR0(id)			(0x1b00 + (id) * 0x20)
+
+#define SGMIIaCR0_RST_SGM		BIT(31)
+#define SGMIIaCR0_RST_SGM_OFF		SGMIIaCR0_RST_SGM
+#define SGMIIaCR0_RST_SGM_ON		0
+#define SGMIIaCR0_PD_SGM		BIT(30)
+#define SGMIIaCR1_SGPCS_EN		BIT(11)
+#define SGMIIaCR1_SGPCS_DIS		0x0
+
+#define QSGMIIaCR0_RST_QSGM		BIT(31)
+#define QSGMIIaCR0_RST_QSGM_OFF		QSGMIIaCR0_RST_QSGM
+#define QSGMIIaCR0_RST_QSGM_ON		0
+#define QSGMIIaCR0_PD_QSGM		BIT(30)
+
+/* Per PLL registers */
+#define PLLnCR0(pll)			((pll) * 0x20 + 0x4)
+
+#define PLLnCR0_POFF			BIT(31)
+
+#define PLLnCR0_REFCLK_SEL		GENMASK(30, 28)
+#define PLLnCR0_REFCLK_SEL_100MHZ	0x0
+#define PLLnCR0_REFCLK_SEL_125MHZ	0x1
+#define PLLnCR0_REFCLK_SEL_156MHZ	0x2
+#define PLLnCR0_REFCLK_SEL_150MHZ	0x3
+#define PLLnCR0_REFCLK_SEL_161MHZ	0x4
+#define PLLnCR0_PLL_LCK			BIT(23)
+#define PLLnCR0_FRATE_SEL		GENMASK(19, 16)
+#define PLLnCR0_FRATE_5G		0x0
+#define PLLnCR0_FRATE_5_15625G		0x6
+#define PLLnCR0_FRATE_4G		0x7
+#define PLLnCR0_FRATE_3_125G		0x9
+#define PLLnCR0_FRATE_3G		0xa
+
+/* Per SerDes lane registers */
+
+/* Lane a Protocol Select status register */
+#define LNaPSSR0(lane)			(0x100 + (lane) * 0x20)
+#define LNaPSSR0_TYPE			GENMASK(30, 26)
+#define LNaPSSR0_IS_QUAD		GENMASK(25, 24)
+#define LNaPSSR0_MAC			GENMASK(19, 16)
+#define LNaPSSR0_PCS			GENMASK(10, 8)
+#define LNaPSSR0_LANE			GENMASK(2, 0)
+
+/* Lane a General Control Register */
+#define LNaGCR0(lane)			(0x800 + (lane) * 0x40 + 0x0)
+#define LNaGCR0_RPLL_PLLF		BIT(31)
+#define LNaGCR0_RPLL_PLLS		0x0
+#define LNaGCR0_RPLL_MSK		BIT(31)
+#define LNaGCR0_RRAT_SEL		GENMASK(29, 28)
+#define LNaGCR0_TRAT_SEL		GENMASK(25, 24)
+#define LNaGCR0_TPLL_PLLF		BIT(27)
+#define LNaGCR0_TPLL_PLLS		0x0
+#define LNaGCR0_TPLL_MSK		BIT(27)
+#define LNaGCR0_RRST_OFF		LNaGCR0_RRST
+#define LNaGCR0_TRST_OFF		LNaGCR0_TRST
+#define LNaGCR0_RRST_ON			0x0
+#define LNaGCR0_TRST_ON			0x0
+#define LNaGCR0_RRST			BIT(22)
+#define LNaGCR0_TRST			BIT(21)
+#define LNaGCR0_RX_PD			BIT(20)
+#define LNaGCR0_TX_PD			BIT(19)
+#define LNaGCR0_IF20BIT_EN		BIT(18)
+#define LNaGCR0_PROTS			GENMASK(11, 7)
+
+#define LNaGCR1(lane)			(0x800 + (lane) * 0x40 + 0x4)
+#define LNaGCR1_RDAT_INV		BIT(31)
+#define LNaGCR1_TDAT_INV		BIT(30)
+#define LNaGCR1_OPAD_CTL		BIT(26)
+#define LNaGCR1_REIDL_TH		GENMASK(22, 20)
+#define LNaGCR1_REIDL_EX_SEL		GENMASK(19, 18)
+#define LNaGCR1_REIDL_ET_SEL		GENMASK(17, 16)
+#define LNaGCR1_REIDL_EX_MSB		BIT(15)
+#define LNaGCR1_REIDL_ET_MSB		BIT(14)
+#define LNaGCR1_REQ_CTL_SNP		BIT(13)
+#define LNaGCR1_REQ_CDR_SNP		BIT(12)
+#define LNaGCR1_TRSTDIR			BIT(7)
+#define LNaGCR1_REQ_BIN_SNP		BIT(6)
+#define LNaGCR1_ISLEW_RCTL		GENMASK(5, 4)
+#define LNaGCR1_OSLEW_RCTL		GENMASK(1, 0)
+
+#define LNaRECR0(lane)			(0x800 + (lane) * 0x40 + 0x10)
+#define LNaRECR0_RXEQ_BST		BIT(28)
+#define LNaRECR0_GK2OVD			GENMASK(27, 24)
+#define LNaRECR0_GK3OVD			GENMASK(19, 16)
+#define LNaRECR0_GK2OVD_EN		BIT(15)
+#define LNaRECR0_GK3OVD_EN		BIT(14)
+#define LNaRECR0_OSETOVD_EN		BIT(13)
+#define LNaRECR0_BASE_WAND		GENMASK(11, 10)
+#define LNaRECR0_OSETOVD		GENMASK(6, 0)
+
+#define LNaTECR0(lane)			(0x800 + (lane) * 0x40 + 0x18)
+#define LNaTECR0_TEQ_TYPE		GENMASK(29, 28)
+#define LNaTECR0_SGN_PREQ		BIT(26)
+#define LNaTECR0_RATIO_PREQ		GENMASK(25, 22)
+#define LNaTECR0_SGN_POST1Q		BIT(21)
+#define LNaTECR0_RATIO_PST1Q		GENMASK(20, 16)
+#define LNaTECR0_ADPT_EQ		GENMASK(13, 8)
+#define LNaTECR0_AMP_RED		GENMASK(5, 0)
+
+#define LNaTTLCR0(lane)			(0x800 + (lane) * 0x40 + 0x20)
+#define LNaTTLCR1(lane)			(0x800 + (lane) * 0x40 + 0x24)
+#define LNaTTLCR2(lane)			(0x800 + (lane) * 0x40 + 0x28)
+
+#define LNaTCSR3(lane)			(0x800 + (lane) * 0x40 + 0x3C)
+#define LNaTCSR3_CDR_LCK		BIT(27)
+
+enum lynx_10g_rat_sel {
+	RAT_SEL_FULL = 0x0,
+	RAT_SEL_HALF = 0x1,
+	RAT_SEL_QUARTER = 0x2,
+	RAT_SEL_DOUBLE = 0x3,
+};
+
+enum lynx_10g_eq_type {
+	EQ_TYPE_NO_EQ = 0,
+	EQ_TYPE_2TAP = 1,
+	EQ_TYPE_3TAP = 2,
+};
+
+enum lynx_10g_proto_sel {
+	PROTO_SEL_PCIE = 0,
+	PROTO_SEL_SGMII_BASEX_KX_QSGMII = 1,
+	PROTO_SEL_SATA = 2,
+	PROTO_SEL_XAUI = 4,
+	PROTO_SEL_XFI_10GBASER_KR_SXGMII = 0xa,
+};
+
+struct lynx_10g_proto_conf {
+	int proto_sel;
+	int if20bit_en;
+	int reidl_th;
+	int reidl_et_msb;
+	int reidl_et_sel;
+	int reidl_ex_msb;
+	int reidl_ex_sel;
+	int islew_rctl;
+	int oslew_rctl;
+	int rxeq_bst;
+	int gk2ovd;
+	int gk3ovd;
+	int gk2ovd_en;
+	int gk3ovd_en;
+	int base_wand;
+	int teq_type;
+	int sgn_preq;
+	int ratio_preq;
+	int sgn_post1q;
+	int ratio_post1q;
+	int adpt_eq;
+	int amp_red;
+	int ttlcr0;
+};
+
+static const struct lynx_10g_proto_conf lynx_10g_proto_conf[LANE_MODE_MAX] = {
+	[LANE_MODE_1000BASEX_SGMII] = {
+		.proto_sel = PROTO_SEL_SGMII_BASEX_KX_QSGMII,
+		.reidl_th = 1,
+		.reidl_ex_sel = 3,
+		.reidl_et_msb = 1,
+		.islew_rctl = 1,
+		.oslew_rctl = 1,
+		.gk2ovd = 15,
+		.gk3ovd = 15,
+		.gk2ovd_en = 1,
+		.gk3ovd_en = 1,
+		.teq_type = EQ_TYPE_NO_EQ,
+		.adpt_eq = 48,
+		.amp_red = 6,
+		.ttlcr0 = 0x39000400,
+	},
+	[LANE_MODE_2500BASEX] = {
+		.proto_sel = PROTO_SEL_SGMII_BASEX_KX_QSGMII,
+		.islew_rctl = 2,
+		.oslew_rctl = 2,
+		.teq_type = EQ_TYPE_2TAP,
+		.sgn_post1q = 1,
+		.ratio_post1q = 6,
+		.adpt_eq = 48,
+		.ttlcr0 = 0x00000400,
+	},
+	[LANE_MODE_QSGMII] = {
+		.proto_sel = PROTO_SEL_SGMII_BASEX_KX_QSGMII,
+		.islew_rctl = 1,
+		.oslew_rctl = 1,
+		.teq_type = EQ_TYPE_2TAP,
+		.sgn_post1q = 1,
+		.ratio_post1q = 6,
+		.adpt_eq = 48,
+		.amp_red = 2,
+		.ttlcr0 = 0x00000400,
+	},
+	[LANE_MODE_10G_QXGMII] = {
+		.proto_sel = PROTO_SEL_XFI_10GBASER_KR_SXGMII,
+		.if20bit_en = 1,
+		.islew_rctl = 1,
+		.oslew_rctl = 1,
+		.base_wand = 1,
+		.teq_type = EQ_TYPE_NO_EQ,
+		.adpt_eq = 48,
+		.ttlcr0 = 0x00000400,
+	},
+	[LANE_MODE_USXGMII] = {
+		.proto_sel = PROTO_SEL_XFI_10GBASER_KR_SXGMII,
+		.if20bit_en = 1,
+		.islew_rctl = 1,
+		.oslew_rctl = 1,
+		.base_wand = 1,
+		.teq_type = EQ_TYPE_NO_EQ,
+		.sgn_post1q = 1,
+		.adpt_eq = 48,
+		.ttlcr0 = 0x00000400,
+	},
+	[LANE_MODE_10GBASER] = {
+		.proto_sel = PROTO_SEL_XFI_10GBASER_KR_SXGMII,
+		.if20bit_en = 1,
+		.islew_rctl = 2,
+		.oslew_rctl = 2,
+		.rxeq_bst = 1,
+		.base_wand = 1,
+		.teq_type = EQ_TYPE_2TAP,
+		.sgn_post1q = 1,
+		.ratio_post1q = 3,
+		.adpt_eq = 48,
+		.amp_red = 7,
+		.ttlcr0 = 0x00000400,
+	},
+};
+
+static void lynx_10g_cdr_lock_check(struct lynx_lane *lane)
+{
+	u32 tcsr3 = lynx_lane_read(lane, LNaTCSR3);
+
+	if (tcsr3 & LNaTCSR3_CDR_LCK)
+		return;
+
+	dev_dbg(&lane->phy->dev,
+		"Lane %c CDR unlocked, resetting receiver...\n",
+		'A' + lane->id);
+
+	lynx_lane_rmw(lane, LNaGCR0, LNaGCR0_RRST_ON, LNaGCR0_RRST);
+	usleep_range(1, 2);
+	lynx_lane_rmw(lane, LNaGCR0, LNaGCR0_RRST_OFF, LNaGCR0_RRST);
+
+	usleep_range(1, 2);
+}
+
+static void lynx_10g_pll_read_configuration(struct lynx_pll *pll)
+{
+	u32 val;
+
+	val = lynx_pll_read(pll, PLLnCR0);
+	pll->frate_sel = FIELD_GET(PLLnCR0_FRATE_SEL, val);
+	pll->refclk_sel = FIELD_GET(PLLnCR0_REFCLK_SEL, val);
+	pll->enabled = !(val & PLLnCR0_POFF);
+	pll->locked = !!(val & PLLnCR0_PLL_LCK);
+
+	if (!pll->enabled)
+		return;
+
+	switch (pll->frate_sel) {
+	case PLLnCR0_FRATE_5G:
+		/* 5GHz clock net */
+		__set_bit(LANE_MODE_1000BASEX_SGMII, pll->supported);
+		__set_bit(LANE_MODE_QSGMII, pll->supported);
+		break;
+	case PLLnCR0_FRATE_3_125G:
+		__set_bit(LANE_MODE_2500BASEX, pll->supported);
+		break;
+	case PLLnCR0_FRATE_5_15625G:
+		/* 10.3125GHz clock net */
+		__set_bit(LANE_MODE_10GBASER, pll->supported);
+		__set_bit(LANE_MODE_USXGMII, pll->supported);
+		__set_bit(LANE_MODE_10G_QXGMII, pll->supported);
+		break;
+	default:
+		break;
+	}
+}
+
+/* On LS1028A, SGMIIA_CFG, SGMIIB_CFG, and SGMIIC_CFG from PCCR8 have the
+ * ability to map either an ENETC PCS or a Felix switch PCS to the same lane.
+ * The PHY API lacks the capability to distinguish between one consumer and
+ * another, so we don't support changing the initial muxing done by the RCW.
+ * However, when disabling a PCS through PCCR8, we need to properly restore
+ * the original value to keep the same muxing, and for that we need to back
+ * it up (here).
+ */
+static void lynx_10g_backup_pccr_val(struct lynx_lane *lane)
+{
+	u32 val;
+	int err;
+
+	if (lane->mode == LANE_MODE_UNKNOWN)
+		return;
+
+	err = lynx_pccr_read(lane, lane->mode, &val);
+	if (err) {
+		dev_warn(&lane->phy->dev,
+			 "The driver doesn't know how to access the PCCR for lane mode %s\n",
+			 lynx_lane_mode_str(lane->mode));
+		lane->mode = LANE_MODE_UNKNOWN;
+		return;
+	}
+
+	lane->default_pccr[lane->mode] = val;
+
+	switch (lane->mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		lane->default_pccr[LANE_MODE_1000BASEX_SGMII] = val & ~PCCR8_SGMIIa_KX;
+		lane->default_pccr[LANE_MODE_2500BASEX] = val & ~PCCR8_SGMIIa_KX;
+		break;
+	default:
+		break;
+	}
+}
+
+static bool lynx_10g_lane_is_3_125g(struct lynx_lane *lane)
+{
+	struct lynx_priv *priv = lane->priv;
+	struct lynx_pll *pll;
+	u32 gcr0;
+
+	gcr0 = lynx_lane_read(lane, LNaGCR0);
+
+	if (gcr0 & LNaGCR0_TPLL_PLLF)
+		pll = &priv->pll[0];
+	else
+		pll = &priv->pll[1];
+
+	if (pll->frate_sel != PLLnCR0_FRATE_3_125G)
+		return false;
+
+	if (FIELD_GET(LNaGCR0_TRAT_SEL, gcr0) != RAT_SEL_FULL ||
+	    FIELD_GET(LNaGCR0_RRAT_SEL, gcr0) != RAT_SEL_FULL)
+		return false;
+
+	return true;
+}
+
+static void lynx_10g_lane_read_configuration(struct lynx_lane *lane)
+{
+	u32 pssr0 = lynx_lane_read(lane, LNaPSSR0);
+	struct lynx_priv *priv = lane->priv;
+	int proto;
+
+	proto = FIELD_GET(LNaPSSR0_TYPE, pssr0);
+	switch (proto) {
+	case PROTO_SEL_SGMII_BASEX_KX_QSGMII:
+		if (lynx_10g_lane_is_3_125g(lane))
+			lane->mode = LANE_MODE_2500BASEX;
+		else if (FIELD_GET(LNaPSSR0_IS_QUAD, pssr0))
+			lane->mode = LANE_MODE_QSGMII;
+		else
+			lane->mode = LANE_MODE_1000BASEX_SGMII;
+		break;
+	case PROTO_SEL_XFI_10GBASER_KR_SXGMII:
+		if (FIELD_GET(LNaPSSR0_IS_QUAD, pssr0))
+			lane->mode = LANE_MODE_10G_QXGMII;
+		else if (priv->info->quirks & LYNX_QUIRK_HAS_HARDCODED_USXGMII)
+			lane->mode = LANE_MODE_USXGMII;
+		else
+			lane->mode = LANE_MODE_10GBASER;
+		break;
+	case PROTO_SEL_PCIE:
+	case PROTO_SEL_SATA:
+	case PROTO_SEL_XAUI:
+		break;
+	default:
+		dev_warn(&lane->phy->dev, "Unknown lane protocol 0x%x\n",
+			 proto);
+	}
+
+	lynx_10g_backup_pccr_val(lane);
+}
+
+static int ls1028a_get_pccr(enum lynx_lane_mode lane_mode, int lane,
+			    struct lynx_pccr *pccr)
+{
+	switch (lane_mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		pccr->offset = PCCR8;
+		pccr->width = 4;
+		pccr->shift = SGMII_CFG(lane);
+		break;
+	case LANE_MODE_QSGMII:
+		if (lane != 1)
+			return -EINVAL;
+
+		pccr->offset = PCCR9;
+		pccr->width = 3;
+		pccr->shift = QSGMII_CFG(A);
+		break;
+	case LANE_MODE_10G_QXGMII:
+		if (lane != 1)
+			return -EINVAL;
+
+		pccr->offset = PCCR9;
+		pccr->width = 3;
+		pccr->shift = QXGMII_CFG(A);
+		break;
+	case LANE_MODE_USXGMII:
+		if (lane != 0)
+			return -EINVAL;
+
+		pccr->offset = PCCRB;
+		pccr->width = 3;
+		pccr->shift = SXGMII_CFG(A);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int ls1028a_get_pcvt_offset(int lane, enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		return SGMIIaCR0(lane);
+	case LANE_MODE_QSGMII:
+		return lane == 1 ? QSGMIIaCR0(A) : -EINVAL;
+	case LANE_MODE_USXGMII:
+		return lane == 0 ? SXGMIIaCR0(A) : -EINVAL;
+	case LANE_MODE_10G_QXGMII:
+		return lane == 1 ? QXGMIIaCR0(A) : -EINVAL;
+	default:
+		return -EINVAL;
+	}
+}
+
+static const struct lynx_info lynx_info_ls1028a = {
+	.get_pccr = ls1028a_get_pccr,
+	.get_pcvt_offset = ls1028a_get_pcvt_offset,
+	.pll_read_configuration = lynx_10g_pll_read_configuration,
+	.lane_read_configuration = lynx_10g_lane_read_configuration,
+	.cdr_lock_check = lynx_10g_cdr_lock_check,
+	.num_lanes = 4,
+	.index = 1,
+	.quirks = LYNX_QUIRK_HAS_HARDCODED_USXGMII,
+};
+
+static int ls1046a_serdes1_get_pccr(enum lynx_lane_mode lane_mode, int lane,
+				    struct lynx_pccr *pccr)
+{
+	switch (lane_mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		pccr->offset = PCCR8;
+		pccr->width = 4;
+		pccr->shift = SGMII_CFG(lane);
+		break;
+	case LANE_MODE_QSGMII:
+		if (lane != 1)
+			return -EINVAL;
+
+		pccr->offset = PCCR9;
+		pccr->width = 3;
+		pccr->shift = QSGMII_CFG(B);
+		break;
+	case LANE_MODE_10GBASER:
+		switch (lane) {
+		case 2:
+			pccr->shift = XFI_CFG(A);
+			break;
+		case 3:
+			pccr->shift = XFI_CFG(B);
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		pccr->offset = PCCRB;
+		pccr->width = 3;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int ls1046a_serdes1_get_pcvt_offset(int lane, enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		return SGMIIaCR0(lane);
+	case LANE_MODE_QSGMII:
+		if (lane != 1)
+			return -EINVAL;
+
+		return QSGMIIaCR0(B);
+	case LANE_MODE_10GBASER:
+		switch (lane) {
+		case 2:
+			return XFIaCR0(A);
+		case 3:
+			return XFIaCR0(B);
+		default:
+			return -EINVAL;
+		}
+	default:
+		return -EINVAL;
+	}
+}
+
+static const struct lynx_info lynx_info_ls1046a_serdes1 = {
+	.get_pccr = ls1046a_serdes1_get_pccr,
+	.get_pcvt_offset = ls1046a_serdes1_get_pcvt_offset,
+	.pll_read_configuration = lynx_10g_pll_read_configuration,
+	.lane_read_configuration = lynx_10g_lane_read_configuration,
+	.cdr_lock_check = lynx_10g_cdr_lock_check,
+	.num_lanes = 4,
+	.index = 1,
+};
+
+static int ls1046a_serdes2_get_pccr(enum lynx_lane_mode lane_mode, int lane,
+				    struct lynx_pccr *pccr)
+{
+	switch (lane_mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		if (lane != 1)
+			return -EINVAL;
+
+		pccr->offset = PCCR8;
+		pccr->width = 4;
+		pccr->shift = SGMII_CFG(B);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int ls1046a_serdes2_get_pcvt_offset(int lane, enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		if (lane != 1)
+			return -EINVAL;
+
+		return SGMIIaCR0(B);
+	default:
+		return -EINVAL;
+	}
+}
+
+static const struct lynx_info lynx_info_ls1046a_serdes2 = {
+	.get_pccr = ls1046a_serdes2_get_pccr,
+	.get_pcvt_offset = ls1046a_serdes2_get_pcvt_offset,
+	.pll_read_configuration = lynx_10g_pll_read_configuration,
+	.lane_read_configuration = lynx_10g_lane_read_configuration,
+	.cdr_lock_check = lynx_10g_cdr_lock_check,
+	.num_lanes = 4,
+	.index = 2,
+};
+
+static int ls1088a_serdes1_get_pccr(enum lynx_lane_mode lane_mode, int lane,
+				    struct lynx_pccr *pccr)
+{
+	switch (lane_mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+		pccr->offset = PCCR8;
+		pccr->width = 4;
+		pccr->shift = SGMII_CFG(lane);
+		break;
+	case LANE_MODE_QSGMII:
+		switch (lane) {
+		case 0:
+			pccr->shift = QSGMII_CFG(A);
+			break;
+		case 1:
+		case 3:
+			pccr->shift = QSGMII_CFG(B);
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		pccr->offset = PCCR9;
+		pccr->width = 3;
+		break;
+	case LANE_MODE_10GBASER:
+		switch (lane) {
+		case 2:
+			pccr->shift = XFI_CFG(A);
+			break;
+		case 3:
+			pccr->shift = XFI_CFG(B);
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		pccr->offset = PCCRB;
+		pccr->width = 3;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int ls1088a_serdes1_get_pcvt_offset(int lane, enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+		return SGMIIaCR0(lane);
+	case LANE_MODE_QSGMII:
+		switch (lane) {
+		case 0:
+			return QSGMIIaCR0(A);
+		case 1:
+		case 3:
+			return QSGMIIaCR0(B);
+		default:
+			return -EINVAL;
+		}
+	case LANE_MODE_10GBASER:
+		switch (lane) {
+		case 2:
+			return XFIaCR0(A);
+		case 3:
+			return XFIaCR0(B);
+		default:
+			return -EINVAL;
+		}
+	default:
+		return -EINVAL;
+	}
+}
+
+static const struct lynx_info lynx_info_ls1088a_serdes1 = {
+	.get_pccr = ls1088a_serdes1_get_pccr,
+	.get_pcvt_offset = ls1088a_serdes1_get_pcvt_offset,
+	.pll_read_configuration = lynx_10g_pll_read_configuration,
+	.lane_read_configuration = lynx_10g_lane_read_configuration,
+	.cdr_lock_check = lynx_10g_cdr_lock_check,
+	.num_lanes = 4,
+	.index = 1,
+};
+
+static int ls2088a_serdes1_get_pccr(enum lynx_lane_mode lane_mode, int lane,
+				    struct lynx_pccr *pccr)
+{
+	switch (lane_mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		pccr->offset = PCCR8;
+		pccr->width = 4;
+		pccr->shift = SGMII_CFG(lane);
+		break;
+	case LANE_MODE_QSGMII:
+		switch (lane) {
+		case 2:
+		case 6:
+			pccr->shift = QSGMII_CFG(A);
+			break;
+		case 7:
+			pccr->shift = QSGMII_CFG(B);
+			break;
+		case 0:
+		case 4:
+			pccr->shift = QSGMII_CFG(C);
+			break;
+		case 1:
+		case 5:
+			pccr->shift = QSGMII_CFG(D);
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		pccr->offset = PCCR9;
+		pccr->width = 3;
+		break;
+	case LANE_MODE_10GBASER:
+		pccr->offset = PCCRB;
+		pccr->width = 3;
+		pccr->shift = XFI_CFG(lane);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int ls2088a_serdes1_get_pcvt_offset(int lane, enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		return SGMIIaCR0(lane);
+	case LANE_MODE_QSGMII:
+		switch (lane) {
+		case 2:
+		case 6:
+			return QSGMIIaCR0(A);
+		case 7:
+			return QSGMIIaCR0(B);
+		case 0:
+		case 4:
+			return QSGMIIaCR0(C);
+		case 1:
+		case 5:
+			return QSGMIIaCR0(D);
+		default:
+			return -EINVAL;
+		}
+	case LANE_MODE_10GBASER:
+		return XFIaCR0(lane);
+	default:
+		return -EINVAL;
+	}
+}
+
+static const struct lynx_info lynx_info_ls2088a_serdes1 = {
+	.get_pccr = ls2088a_serdes1_get_pccr,
+	.get_pcvt_offset = ls2088a_serdes1_get_pcvt_offset,
+	.pll_read_configuration = lynx_10g_pll_read_configuration,
+	.lane_read_configuration = lynx_10g_lane_read_configuration,
+	.cdr_lock_check = lynx_10g_cdr_lock_check,
+	.num_lanes = 8,
+	.index = 1,
+};
+
+static int ls2088a_serdes2_get_pccr(enum lynx_lane_mode lane_mode, int lane,
+				    struct lynx_pccr *pccr)
+{
+	switch (lane_mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		pccr->offset = PCCR8;
+		pccr->width = 4;
+		pccr->shift = SGMII_CFG(lane);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int ls2088a_serdes2_get_pcvt_offset(int lane, enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		return SGMIIaCR0(lane);
+	default:
+		return -EINVAL;
+	}
+}
+
+static const struct lynx_info lynx_info_ls2088a_serdes2 = {
+	.get_pccr = ls2088a_serdes2_get_pccr,
+	.get_pcvt_offset = ls2088a_serdes2_get_pcvt_offset,
+	.pll_read_configuration = lynx_10g_pll_read_configuration,
+	.lane_read_configuration = lynx_10g_lane_read_configuration,
+	.cdr_lock_check = lynx_10g_cdr_lock_check,
+	.num_lanes = 8,
+	.index = 2,
+};
+
+/* Halting puts the lane in a mode in which it can be reconfigured */
+static void lynx_10g_lane_halt(struct phy *phy)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+
+	/* Issue a reset request */
+	lynx_lane_rmw(lane, LNaGCR0,
+		      LNaGCR0_RRST_ON | LNaGCR0_TRST_ON,
+		      LNaGCR0_RRST | LNaGCR0_TRST);
+
+	/* The RM says to wait for at least 50ns */
+	usleep_range(1, 2);
+}
+
+static void lynx_10g_lane_reset(struct phy *phy)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+
+	/* Finalize the reset request */
+	lynx_lane_rmw(lane, LNaGCR0,
+		      LNaGCR0_RRST_OFF | LNaGCR0_TRST_OFF,
+		      LNaGCR0_RRST | LNaGCR0_TRST);
+}
+
+static int lynx_10g_power_off(struct phy *phy)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+
+	if (!lane->powered_up)
+		return 0;
+
+	/* Issue a reset request with the power down bits set */
+	lynx_lane_rmw(lane, LNaGCR0,
+		      LNaGCR0_RRST_ON | LNaGCR0_TRST_ON |
+		      LNaGCR0_RX_PD | LNaGCR0_TX_PD,
+		      LNaGCR0_RRST | LNaGCR0_TRST |
+		      LNaGCR0_RX_PD | LNaGCR0_TX_PD);
+
+	/* The RM says to wait for at least 50ns */
+	usleep_range(1, 2);
+
+	lane->powered_up = false;
+
+	return 0;
+}
+
+static int lynx_10g_power_on(struct phy *phy)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+
+	if (lane->powered_up)
+		return 0;
+
+	/* RM says that to enable a previously powered down lane, set
+	 * LNmGCR0[{R,T}X_PD]=0, wait 15 us, then set LNmGCR0[{R,T}RST]=1.
+	 */
+	lynx_lane_rmw(lane, LNaGCR0, 0, LNaGCR0_RX_PD | LNaGCR0_TX_PD);
+	usleep_range(150, 300);
+	lynx_10g_lane_reset(phy);
+
+	lane->powered_up = true;
+
+	return 0;
+}
+
+static void lynx_10g_lane_set_nrate(struct lynx_lane *lane,
+				    struct lynx_pll *pll,
+				    enum lynx_lane_mode mode)
+{
+	enum lynx_10g_rat_sel nrate;
+
+	switch (pll->frate_sel) {
+	case PLLnCR0_FRATE_5G:
+		switch (mode) {
+		case LANE_MODE_1000BASEX_SGMII:
+			nrate = RAT_SEL_QUARTER;
+			break;
+		case LANE_MODE_QSGMII:
+			nrate = RAT_SEL_FULL;
+			break;
+		default:
+			return;
+		}
+		break;
+	case PLLnCR0_FRATE_3_125G:
+		switch (mode) {
+		case LANE_MODE_2500BASEX:
+			nrate = RAT_SEL_FULL;
+			break;
+		default:
+			return;
+		}
+		break;
+	case PLLnCR0_FRATE_5_15625G:
+		switch (mode) {
+		case LANE_MODE_10GBASER:
+		case LANE_MODE_USXGMII:
+		case LANE_MODE_10G_QXGMII:
+			nrate = RAT_SEL_DOUBLE;
+			break;
+		default:
+			return;
+		}
+		break;
+	default:
+		return;
+	}
+
+	lynx_lane_rmw(lane, LNaGCR0,
+		      FIELD_PREP(LNaGCR0_TRAT_SEL, nrate) |
+		      FIELD_PREP(LNaGCR0_RRAT_SEL, nrate),
+		      LNaGCR0_RRAT_SEL | LNaGCR0_TRAT_SEL);
+}
+
+static void lynx_10g_lane_set_pll(struct lynx_lane *lane,
+				  struct lynx_pll *pll)
+{
+	if (pll->id == 0) {
+		lynx_lane_rmw(lane, LNaGCR0,
+			      LNaGCR0_RPLL_PLLF | LNaGCR0_TPLL_PLLF,
+			      LNaGCR0_RPLL_MSK | LNaGCR0_TPLL_MSK);
+	} else {
+		lynx_lane_rmw(lane, LNaGCR0,
+			      LNaGCR0_RPLL_PLLS | LNaGCR0_TPLL_PLLS,
+			      LNaGCR0_RPLL_MSK | LNaGCR0_TPLL_MSK);
+	}
+}
+
+static void lynx_10g_lane_remap_pll(struct lynx_lane *lane,
+				    enum lynx_lane_mode lane_mode)
+{
+	struct lynx_priv *priv = lane->priv;
+	struct lynx_pll *pll;
+
+	/* Switch to the PLL that works with this interface type */
+	pll = lynx_pll_get(priv, lane_mode);
+	if (unlikely(!pll))
+		return;
+
+	lynx_10g_lane_set_pll(lane, pll);
+
+	/* Choose the portion of clock net to be used on this lane */
+	lynx_10g_lane_set_nrate(lane, pll, lane_mode);
+}
+
+static void lynx_10g_lane_change_proto_conf(struct lynx_lane *lane,
+					    enum lynx_lane_mode mode)
+{
+	const struct lynx_10g_proto_conf *conf = &lynx_10g_proto_conf[mode];
+
+	lynx_lane_rmw(lane, LNaGCR0,
+		      FIELD_PREP(LNaGCR0_PROTS, conf->proto_sel) |
+		      FIELD_PREP(LNaGCR0_IF20BIT_EN, conf->if20bit_en),
+		      LNaGCR0_PROTS | LNaGCR0_IF20BIT_EN);
+	lynx_lane_rmw(lane, LNaGCR1,
+		      FIELD_PREP(LNaGCR1_REIDL_TH, conf->reidl_th) |
+		      FIELD_PREP(LNaGCR1_REIDL_ET_MSB, conf->reidl_et_msb) |
+		      FIELD_PREP(LNaGCR1_REIDL_ET_SEL, conf->reidl_et_sel) |
+		      FIELD_PREP(LNaGCR1_REIDL_EX_MSB, conf->reidl_ex_msb) |
+		      FIELD_PREP(LNaGCR1_REIDL_EX_SEL, conf->reidl_ex_sel) |
+		      FIELD_PREP(LNaGCR1_ISLEW_RCTL, conf->islew_rctl) |
+		      FIELD_PREP(LNaGCR1_OSLEW_RCTL, conf->oslew_rctl),
+		      LNaGCR1_REIDL_TH |
+		      LNaGCR1_REIDL_ET_MSB | LNaGCR1_REIDL_ET_SEL |
+		      LNaGCR1_REIDL_EX_MSB | LNaGCR1_REIDL_EX_SEL |
+		      LNaGCR1_ISLEW_RCTL | LNaGCR1_OSLEW_RCTL);
+	lynx_lane_rmw(lane, LNaRECR0,
+		      FIELD_PREP(LNaRECR0_RXEQ_BST, conf->rxeq_bst) |
+		      FIELD_PREP(LNaRECR0_GK2OVD, conf->gk2ovd) |
+		      FIELD_PREP(LNaRECR0_GK3OVD, conf->gk3ovd) |
+		      FIELD_PREP(LNaRECR0_GK2OVD_EN, conf->gk2ovd_en) |
+		      FIELD_PREP(LNaRECR0_GK3OVD_EN, conf->gk3ovd_en) |
+		      FIELD_PREP(LNaRECR0_BASE_WAND, conf->base_wand),
+		      LNaRECR0_RXEQ_BST | LNaRECR0_GK2OVD | LNaRECR0_GK3OVD |
+		      LNaRECR0_GK2OVD_EN | LNaRECR0_GK3OVD_EN |
+		      LNaRECR0_BASE_WAND);
+	lynx_lane_rmw(lane, LNaTECR0,
+		      FIELD_PREP(LNaTECR0_TEQ_TYPE, conf->teq_type) |
+		      FIELD_PREP(LNaTECR0_SGN_PREQ, conf->sgn_preq) |
+		      FIELD_PREP(LNaTECR0_RATIO_PREQ, conf->ratio_preq) |
+		      FIELD_PREP(LNaTECR0_SGN_POST1Q, conf->sgn_post1q) |
+		      FIELD_PREP(LNaTECR0_RATIO_PST1Q, conf->ratio_post1q) |
+		      FIELD_PREP(LNaTECR0_ADPT_EQ, conf->adpt_eq) |
+		      FIELD_PREP(LNaTECR0_AMP_RED, conf->amp_red),
+		      LNaTECR0_TEQ_TYPE | LNaTECR0_SGN_PREQ |
+		      LNaTECR0_RATIO_PREQ | LNaTECR0_SGN_POST1Q |
+		      LNaTECR0_RATIO_PST1Q | LNaTECR0_ADPT_EQ |
+		      LNaTECR0_AMP_RED);
+	lynx_lane_write(lane, LNaTTLCR0, conf->ttlcr0);
+}
+
+static int lynx_10g_lane_disable_pcvt(struct lynx_lane *lane,
+				      enum lynx_lane_mode mode)
+{
+	struct lynx_priv *priv = lane->priv;
+	int err;
+
+	spin_lock(&priv->pcc_lock);
+
+	err = lynx_pccr_write(lane, mode, 0);
+	if (err)
+		goto out;
+
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		err = lynx_pcvt_rmw(lane, mode, CR(1), SGMIIaCR1_SGPCS_DIS,
+				    SGMIIaCR1_SGPCS_EN);
+		if (err)
+			goto out;
+
+		lynx_pcvt_rmw(lane, mode, CR(0),
+			      SGMIIaCR0_RST_SGM_ON | SGMIIaCR0_PD_SGM,
+			      SGMIIaCR0_RST_SGM | SGMIIaCR0_PD_SGM);
+		break;
+	case LANE_MODE_QSGMII:
+		err = lynx_pcvt_rmw(lane, mode, CR(0),
+				    QSGMIIaCR0_RST_QSGM_ON | QSGMIIaCR0_PD_QSGM,
+				    QSGMIIaCR0_RST_QSGM | QSGMIIaCR0_PD_QSGM);
+		if (err)
+			goto out;
+		break;
+	default:
+		err = 0;
+	}
+
+out:
+	spin_unlock(&priv->pcc_lock);
+
+	return err;
+}
+
+static int lynx_10g_lane_enable_pcvt(struct lynx_lane *lane,
+				     enum lynx_lane_mode mode)
+{
+	struct lynx_priv *priv = lane->priv;
+	u32 val;
+	int err;
+
+	spin_lock(&priv->pcc_lock);
+
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		err = lynx_pcvt_rmw(lane, mode, CR(1), SGMIIaCR1_SGPCS_EN,
+				    SGMIIaCR1_SGPCS_EN);
+		if (err)
+			goto out;
+
+		lynx_pcvt_rmw(lane, mode, CR(0), SGMIIaCR0_RST_SGM_OFF,
+			      SGMIIaCR0_RST_SGM | SGMIIaCR0_PD_SGM);
+		break;
+	case LANE_MODE_QSGMII:
+		err = lynx_pcvt_rmw(lane, mode, CR(0), QSGMIIaCR0_RST_QSGM_OFF,
+				    QSGMIIaCR0_RST_QSGM | QSGMIIaCR0_PD_QSGM);
+		if (err)
+			goto out;
+		break;
+	default:
+		err = 0;
+	}
+
+	if (lane->default_pccr[mode]) {
+		err = lynx_pccr_write(lane, mode, lane->default_pccr[mode]);
+		goto out;
+	}
+
+	val = 0;
+
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+		val |= PCCR8_SGMIIa_CFG;
+		break;
+	case LANE_MODE_QSGMII:
+		val |= PCCR9_QSGMIIa_CFG;
+		break;
+	case LANE_MODE_10G_QXGMII:
+		val |= PCCR9_QXGMIIa_CFG;
+		break;
+	case LANE_MODE_10GBASER:
+		val |= PCCRB_XFIa_CFG;
+		break;
+	case LANE_MODE_USXGMII:
+		val |= PCCRB_SXGMIIa_CFG;
+		break;
+	default:
+		err = 0;
+		goto out;
+	}
+
+	err = lynx_pccr_write(lane, mode, val);
+out:
+	spin_unlock(&priv->pcc_lock);
+
+	return err;
+}
+
+static bool lynx_10g_lane_mode_needs_rcw_override(struct lynx_lane *lane,
+						  enum lynx_lane_mode new)
+{
+	enum lynx_lane_mode curr = lane->mode;
+
+	/* Major protocol changes, which replace the PCS connection to the
+	 * GMII MAC with the one to the XGMII MAC (or vice versa), require an
+	 * RCW override procedure to reconfigure an internal mux, as documented
+	 * here:
+	 * https://lore.kernel.org/linux-phy/20230810102631.bvozjer3t67r67iy@skbuf/
+	 * This is SoC-specific, and not yet implemented in drivers/soc/fsl/guts.c.
+	 *
+	 * So the supported set of protocols depends on the initial lane mode.
+	 *
+	 * Minor protocol changes (SGMII <-> 1000Base-X <-> 2500Base-X or
+	 * 10GBase-R <-> USXGMII) are supported.
+	 */
+	if ((lynx_lane_mode_uses_gmii_mac(curr) &&
+	     lynx_lane_mode_uses_xgmii_mac(new)) ||
+	    (lynx_lane_mode_uses_xgmii_mac(curr) &&
+	     lynx_lane_mode_uses_gmii_mac(new)))
+		return true;
+
+	return false;
+}
+
+static int lynx_10g_validate(struct phy *phy, enum phy_mode mode, int submode,
+			     union phy_configure_opts *opts)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+	enum lynx_lane_mode lane_mode;
+	int err;
+
+	err = lynx_phy_mode_to_lane_mode(phy, mode, submode, &lane_mode);
+	if (err)
+		return err;
+
+	if (lynx_10g_lane_mode_needs_rcw_override(lane, lane_mode))
+		return -EINVAL;
+
+	return 0;
+}
+
+static int lynx_10g_set_mode(struct phy *phy, enum phy_mode mode, int submode)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+	bool powered_up = lane->powered_up;
+	enum lynx_lane_mode lane_mode;
+	int err;
+
+	err = lynx_10g_validate(phy, mode, submode, NULL);
+	if (err)
+		return err;
+
+	lane_mode = phy_interface_to_lane_mode(submode);
+	/* lynx_10g_validate() already made sure the lane_mode is supported */
+
+	if (lane_mode == lane->mode)
+		return 0;
+
+	/* If the lane is powered up, put the lane into the halt state while
+	 * the reconfiguration is being done.
+	 */
+	if (powered_up)
+		lynx_10g_lane_halt(phy);
+
+	err = lynx_10g_lane_disable_pcvt(lane, lane->mode);
+	if (err)
+		goto out;
+
+	lynx_10g_lane_change_proto_conf(lane, lane_mode);
+	lynx_10g_lane_remap_pll(lane, lane_mode);
+	WARN_ON(lynx_10g_lane_enable_pcvt(lane, lane_mode));
+
+	lane->mode = lane_mode;
+
+out:
+	if (powered_up) {
+		/* The RM says to wait for at least 120 ns */
+		usleep_range(1, 2);
+		lynx_10g_lane_reset(phy);
+	}
+
+	return err;
+}
+
+static int lynx_10g_init(struct phy *phy)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+
+	/* Mark the lane as initialized */
+	lane->init = true;
+
+	/* SerDes lanes are powered on at boot time. Any lane that is
+	 * managed by this driver will get powered off when its consumer
+	 * calls phy_init().
+	 */
+	lane->powered_up = true;
+	lynx_10g_power_off(phy);
+
+	return 0;
+}
+
+static int lynx_10g_exit(struct phy *phy)
+{
+	struct lynx_lane *lane = phy_get_drvdata(phy);
+
+	/* The lane returns to the state where it isn't managed by the
+	 * consumer, so we must treat it as if it isn't initialized, and always
+	 * powered on.
+	 */
+	lane->init = false;
+	lane->powered_up = false;
+	lynx_10g_power_on(phy);
+
+	return 0;
+}
+
+static const struct phy_ops lynx_10g_ops = {
+	.init		= lynx_10g_init,
+	.exit		= lynx_10g_exit,
+	.power_on	= lynx_10g_power_on,
+	.power_off	= lynx_10g_power_off,
+	.set_mode	= lynx_10g_set_mode,
+	.validate	= lynx_10g_validate,
+	.owner		= THIS_MODULE,
+};
+
+static int lynx_10g_probe(struct platform_device *pdev)
+{
+	return lynx_probe(pdev, of_device_get_match_data(&pdev->dev),
+			  &lynx_10g_ops);
+}
+
+static const struct of_device_id lynx_10g_of_match_table[] = {
+	{ .compatible = "fsl,ls1028a-serdes", .data = &lynx_info_ls1028a },
+	{ .compatible = "fsl,ls1046a-serdes1", .data = &lynx_info_ls1046a_serdes1 },
+	{ .compatible = "fsl,ls1046a-serdes2", .data = &lynx_info_ls1046a_serdes2 },
+	{ .compatible = "fsl,ls1088a-serdes1", .data = &lynx_info_ls1088a_serdes1 },
+	{ .compatible = "fsl,ls2088a-serdes1", .data = &lynx_info_ls2088a_serdes1 },
+	{ .compatible = "fsl,ls2088a-serdes2", .data = &lynx_info_ls2088a_serdes2 },
+	{}
+};
+MODULE_DEVICE_TABLE(of, lynx_10g_of_match_table);
+
+static struct platform_driver lynx_10g_driver = {
+	.probe	= lynx_10g_probe,
+	.remove	= lynx_remove,
+	.driver	= {
+		.name = "lynx-10g",
+		.of_match_table = lynx_10g_of_match_table,
+	},
+};
+module_platform_driver(lynx_10g_driver);
+
+MODULE_IMPORT_NS("PHY_FSL_LYNX");
+MODULE_AUTHOR("Ioana Ciornei <ioana.ciornei@nxp.com>");
+MODULE_AUTHOR("Vladimir Oltean <vladimir.oltean@nxp.com>");
+MODULE_DESCRIPTION("Lynx 10G SerDes PHY driver for Layerscape SoCs");
+MODULE_LICENSE("GPL");
diff --git a/drivers/phy/freescale/phy-fsl-lynx-core.c b/drivers/phy/freescale/phy-fsl-lynx-core.c
index 1e411bfab404..2cfe9236ffc5 100644
--- a/drivers/phy/freescale/phy-fsl-lynx-core.c
+++ b/drivers/phy/freescale/phy-fsl-lynx-core.c
@@ -11,6 +11,12 @@ const char *lynx_lane_mode_str(enum lynx_lane_mode lane_mode)
 	switch (lane_mode) {
 	case LANE_MODE_1000BASEX_SGMII:
 		return "1000Base-X/SGMII";
+	case LANE_MODE_2500BASEX:
+		return "2500Base-X";
+	case LANE_MODE_QSGMII:
+		return "QSGMII";
+	case LANE_MODE_10G_QXGMII:
+		return "10G-QXGMII";
 	case LANE_MODE_10GBASER:
 		return "10GBase-R";
 	case LANE_MODE_USXGMII:
@@ -29,6 +35,12 @@ enum lynx_lane_mode phy_interface_to_lane_mode(phy_interface_t intf)
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_1000BASEX:
 		return LANE_MODE_1000BASEX_SGMII;
+	case PHY_INTERFACE_MODE_2500BASEX:
+		return LANE_MODE_2500BASEX;
+	case PHY_INTERFACE_MODE_QSGMII:
+		return LANE_MODE_QSGMII;
+	case PHY_INTERFACE_MODE_10G_QXGMII:
+		return LANE_MODE_10G_QXGMII;
 	case PHY_INTERFACE_MODE_10GBASER:
 		return LANE_MODE_10GBASER;
 	case PHY_INTERFACE_MODE_USXGMII:
@@ -89,6 +101,29 @@ bool lynx_lane_supports_mode(struct lynx_lane *lane, enum lynx_lane_mode mode)
 }
 EXPORT_SYMBOL_NS_GPL(lynx_lane_supports_mode, "PHY_FSL_LYNX");
 
+/* The quad protocols are fixed because the lane has multiple consumers, and
+ * one phy_set_mode_ext() affects the other consumers as well. We have no use
+ * case for dynamic protocol changing here, so disallow it.
+ */
+static enum lynx_lane_mode lynx_fixed_protocols[] = {
+	LANE_MODE_QSGMII,
+	LANE_MODE_10G_QXGMII,
+};
+
+static bool lynx_lane_restrict_fixed_mode_change(struct lynx_lane *lane,
+						 enum lynx_lane_mode new)
+{
+	enum lynx_lane_mode curr = lane->mode;
+
+	for (int i = 0; i < ARRAY_SIZE(lynx_fixed_protocols); i++)
+		if ((curr == lynx_fixed_protocols[i] ||
+		     new == lynx_fixed_protocols[i]) &&
+		     curr != new)
+			return true;
+
+	return false;
+}
+
 /* Translate the mode/submode from phy_validate() and phy_set_mode_ext() to a
  * lane_mode and return 0 if it is supported and we can transition to it from
  * the current lane mode, or return negative error otherwise.
@@ -112,6 +147,9 @@ int lynx_phy_mode_to_lane_mode(struct phy *phy, enum phy_mode mode,
 	if (!lynx_lane_supports_mode(lane, tmp_lane_mode))
 		return -EINVAL;
 
+	if (lynx_lane_restrict_fixed_mode_change(lane, tmp_lane_mode))
+		return -EINVAL;
+
 	if (lane_mode)
 		*lane_mode = tmp_lane_mode;
 
diff --git a/drivers/phy/freescale/phy-fsl-lynx-core.h b/drivers/phy/freescale/phy-fsl-lynx-core.h
index 37fa4b544faa..a60429ba9324 100644
--- a/drivers/phy/freescale/phy-fsl-lynx-core.h
+++ b/drivers/phy/freescale/phy-fsl-lynx-core.h
@@ -9,6 +9,7 @@
 #include <soc/fsl/phy-fsl-lynx.h>
 
 #define LYNX_NUM_PLL				2
+#define LYNX_QUIRK_HAS_HARDCODED_USXGMII	BIT(0)
 
 struct lynx_priv;
 struct lynx_lane;
@@ -36,6 +37,7 @@ struct lynx_lane {
 	bool init;
 	unsigned int id;
 	enum lynx_lane_mode mode;
+	u32 default_pccr[LANE_MODE_MAX];
 };
 
 struct lynx_info {
@@ -48,6 +50,8 @@ struct lynx_info {
 	void (*cdr_lock_check)(struct lynx_lane *lane);
 	int first_lane;
 	int num_lanes;
+	int index;
+	unsigned long quirks;
 };
 
 struct lynx_priv {
diff --git a/include/soc/fsl/phy-fsl-lynx.h b/include/soc/fsl/phy-fsl-lynx.h
index 92e8272d5ae1..ff5a7d1835b5 100644
--- a/include/soc/fsl/phy-fsl-lynx.h
+++ b/include/soc/fsl/phy-fsl-lynx.h
@@ -7,10 +7,37 @@
 enum lynx_lane_mode {
 	LANE_MODE_UNKNOWN,
 	LANE_MODE_1000BASEX_SGMII,
+	LANE_MODE_2500BASEX,
+	LANE_MODE_QSGMII,
+	LANE_MODE_10G_QXGMII,
 	LANE_MODE_10GBASER,
 	LANE_MODE_USXGMII,
 	LANE_MODE_25GBASER,
 	LANE_MODE_MAX,
 };
 
+static inline bool lynx_lane_mode_uses_gmii_mac(enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_1000BASEX_SGMII:
+	case LANE_MODE_2500BASEX:
+	case LANE_MODE_QSGMII:
+	case LANE_MODE_10G_QXGMII:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static inline bool lynx_lane_mode_uses_xgmii_mac(enum lynx_lane_mode mode)
+{
+	switch (mode) {
+	case LANE_MODE_10GBASER:
+	case LANE_MODE_USXGMII:
+		return true;
+	default:
+		return false;
+	}
+}
+
 #endif /* __PHY_FSL_LYNX_H_ */
-- 
2.34.1
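
The two mode-change restrictions in the patch above (fixed quad protocols,
and no GMII <-> XGMII MAC switch without an RCW override) can be modeled in
isolation. The following is only a user-space sketch: the enum is a reduced
copy of enum lynx_lane_mode, and transition_allowed() and is_fixed() are
hypothetical names that do not exist in the driver. The
lynx_lane_supports_mode() check is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, illustrative copy of the lane modes touched by the patch. */
enum lane_mode {
	MODE_1000BASEX_SGMII,
	MODE_2500BASEX,
	MODE_QSGMII,
	MODE_10G_QXGMII,
	MODE_10GBASER,
	MODE_USXGMII,
};

/* Mirrors lynx_lane_mode_uses_gmii_mac() from the patch. */
static bool uses_gmii_mac(enum lane_mode m)
{
	return m == MODE_1000BASEX_SGMII || m == MODE_2500BASEX ||
	       m == MODE_QSGMII || m == MODE_10G_QXGMII;
}

/* Mirrors lynx_lane_mode_uses_xgmii_mac() from the patch. */
static bool uses_xgmii_mac(enum lane_mode m)
{
	return m == MODE_10GBASER || m == MODE_USXGMII;
}

/* Quad protocols have several consumers per lane, so they stay fixed. */
static bool is_fixed(enum lane_mode m)
{
	return m == MODE_QSGMII || m == MODE_10G_QXGMII;
}

/* Hypothetical helper: true if switching curr -> next would be accepted. */
static bool transition_allowed(enum lane_mode curr, enum lane_mode next)
{
	/* Entering or leaving a fixed protocol is rejected. */
	if (curr != next && (is_fixed(curr) || is_fixed(next)))
		return false;

	/* Crossing between the GMII and XGMII MACs needs an RCW override. */
	if ((uses_gmii_mac(curr) && uses_xgmii_mac(next)) ||
	    (uses_xgmii_mac(curr) && uses_gmii_mac(next)))
		return false;

	return true;
}
```

With this model, SGMII -> 2500Base-X and 10GBase-R -> USXGMII are accepted,
while SGMII -> 10GBase-R and QSGMII -> SGMII are rejected.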




* [PATCH v2] powerpc/entry: Disable interrupts before irqentry_exit
From: Shrikanth Hegde @ 2026-06-03 13:10 UTC (permalink / raw)
  To: maddy, linuxppc-dev
  Cc: sshegde, peterz, tglx, christophe.leroy, linux-kernel, venkat88,
	mkchauras

Venkat reported a panic on the powerpc-next tree, where GENERIC_ENTRY has
been enabled.

kernel BUG at kernel/sched/core.c:7512!
NIP  preempt_schedule_irq+0x44/0x118
LR   dynamic_irqentry_exit_cond_resched+0x40/0x1a4
Call Trace:
 dynamic_irqentry_exit_cond_resched+0x40/0x1a4
 do_page_fault+0xc0/0x104
 data_access_common_virt+0x210/0x220

This happens because __do_page_fault ends up enabling interrupts and can
run long enough for need_resched to be set. irqentry_exit then calls into
the scheduler with interrupts still enabled, which triggers the BUG.

Many such irq handlers enable interrupts. Fix it by disabling interrupts
before calling irqentry_exit. The same pattern exists today in
interrupt_exit_kernel_prepare.

Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/7904105b-9dfa-4efd-a5ef-bc0276ed255d@linux.ibm.com/
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
This applies on top of the powerpc/next tree.
base: 6ed60999d33d ("powerpc: Remove unused functions")

v1->v2:
Leave those BUG_ON alone since they are tracking the register
state of userspace (Peter Zijlstra)
v1: https://lore.kernel.org/all/20260603095521.198267-1-sshegde@linux.ibm.com/
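
For readers who want to see the ordering problem in isolation, here is a
small user-space model. It is only a sketch under assumptions: every name
below is hypothetical, the interrupt state is a plain bool, and the "cond"
flag stands in for whatever condition guarded local_irq_disable() before
this change (that condition is not visible in the hunk).

```c
#include <assert.h>
#include <stdbool.h>

/* Models the CPU interrupt-enable state. */
static bool irqs_enabled;

static void irq_enable_model(void)  { irqs_enabled = true; }
static void irq_disable_model(void) { irqs_enabled = false; }

/* Models a handler (e.g. a page fault path) that re-enables interrupts. */
static void handler_model(void)
{
	irq_enable_model();
}

/* Before the patch: interrupts are disabled only inside a conditional. */
static void exit_prepare_old(bool cond)
{
	if (cond)
		irq_disable_model();
}

/* After the patch: interrupts are disabled unconditionally. */
static void exit_prepare_new(bool cond)
{
	(void)cond;
	irq_disable_model();
}

/*
 * Runs one modeled interrupt and reports whether the generic exit code
 * would be reached with interrupts disabled, as it expects.
 */
static bool exit_reached_with_irqs_off(bool cond, void (*prepare)(bool))
{
	irqs_enabled = false;	/* interrupts are off on entry */
	handler_model();	/* handler turns them back on */
	prepare(cond);		/* arch hook runs before the generic exit */
	return !irqs_enabled;
}
```

In this model exit_reached_with_irqs_off(false, exit_prepare_old) returns
false, which corresponds to the reported BUG, while the new variant returns
true for both values of cond.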

 arch/powerpc/include/asm/entry-common.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index de5601282755..fc636c42e89a 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -260,9 +260,10 @@ static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
 		 * AMR can only have been unlocked if we interrupted the kernel.
 		 */
 		kuap_assert_locked();
-
-		local_irq_disable();
 	}
+
+	/* irqentry_exit expects to be called with interrupts disabled */
+	local_irq_disable();
 }
 
 static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
-- 
2.47.3




* Re: [PATCH v3 15/19] mm/hugetlb_vmemmap: Move bootmem HVO setup to early init
From: Usama Arif @ 2026-06-03 12:35 UTC (permalink / raw)
  To: Muchun Song
  Cc: Oscar Salvador, David Hildenbrand, Andrew Morton,
	Madhavan Srinivasan, Michael Ellerman, Mike Rapoport,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, linux-mm,
	linux-kernel, Nicholas Piggin, Christophe Leroy (CS GROUP),
	Ritesh Harjani (IBM), Aneesh Kumar K.V, linuxppc-dev, Muchun Song
In-Reply-To: <f9dd874a-b637-4740-9a63-8da66de323ca@linux.dev>



On 03/06/2026 13:24, Muchun Song wrote:
> 
> 
> On 2026/6/3 20:02, Usama Arif wrote:
>> On Tue,  2 Jun 2026 18:10:35 +0800 Muchun Song <songmuchun@bytedance.com> wrote:
>>
>>> Bootmem HugeTLB pages currently defer HVO setup to
>>> hugetlb_vmemmap_init_late(), because the optimization needs zone
>>> information.
>>>
>>> Now that zone initialization is available earlier, the bootmem HVO setup
>>> can be done directly from hugetlb_vmemmap_init_early(). This lets
>>> gigantic HugeTLB pages apply HVO as soon as they are allocated.
>>>
>>> Bootmem gigantic pages that span multiple zones are now filtered out
>>> when they are allocated, so the remaining bootmem gigantic pages seen by
>>> later hugetlb initialization are already zone-valid. As a result,
>>> hugetlb_vmemmap_init_late() no longer needs to handle bootmem HVO setup.
>>>
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>> ---

Acked-by: Usama Arif <usama.arif@linux.dev>




* Re: [PATCH 04/23] pmdomain: imx: fix OF node refcount
From: Ulf Hansson @ 2026-06-03 10:15 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Lee Jones, Mark Brown, Thierry Reding, Sebastian Hesselbarth,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Srinivas Kandagatla, Greg Kroah-Hartman, Vinod Koul,
	Rafael J. Wysocki, Danilo Krummrich, Rob Herring, Saravana Kannan,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy (CS GROUP), Andi Shyti, Andy Shevchenko,
	Joerg Roedel, Will Deacon, Robin Murphy, Doug Berger,
	Florian Fainelli, Broadcom internal kernel review list,
	Ulf Hansson, Frank Li, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, Matthew Brost, Thomas Hellström, Rodrigo Vivi,
	David Airlie, Simona Vetter, Peter Chen, Paul Cercueil, Bin Liu,
	Philipp Zabel, Maximilian Luz, Hans de Goede, Ilpo Järvinen,
	Krzysztof Kozlowski, Benjamin Herrenschmidt, brgl, linux-kernel,
	netdev, linux-arm-msm, linux-sound, driver-core, devicetree,
	linuxppc-dev, linux-i2c, iommu, linux-pm, imx, linux-arm-kernel,
	intel-xe, dri-devel, linux-usb, linux-mips, platform-driver-x86,
	stable
In-Reply-To: <20260521-pdev-fwnode-ref-v1-4-88c324a1b8d2@oss.qualcomm.com>

On Thu, May 21, 2026 at 10:36 AM Bartosz Golaszewski
<bartosz.golaszewski@oss.qualcomm.com> wrote:
>
> for_each_child_of_node_scoped() decrements the reference count of the
> node after each iteration. Assigning it without incrementing the refcount
> to a dynamically allocated platform device will result in a double put
> in platform_device_release(). Add the missing call to of_node_get().
>
> Cc: stable@vger.kernel.org
> Fixes: 3e4d109ee8fc ("pmdomain: imx: gpc: Simplify with scoped for each OF child loop")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

Applied for fixes, thanks!

Kind regards
Uffe
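
The refcount imbalance described in the quoted commit message can be
illustrated with a toy model. This is a user-space sketch only: node_model,
dev_model and the helpers below are hypothetical stand-ins for the device
node, the platform device, of_node_get()/of_node_put() and the release
callback, not the kernel implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a device node with a reference count. */
struct node_model {
	int refcount;
};

/* Stand-in for a dynamically allocated platform device. */
struct dev_model {
	struct node_model *of_node;
};

static struct node_model *node_get(struct node_model *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node_model *n)
{
	if (n)
		n->refcount--;
}

/* Models the release callback, which drops the device's node reference. */
static void dev_release(struct dev_model *d)
{
	node_put(d->of_node);
	d->of_node = NULL;
}

/*
 * Models one loop iteration followed by the device being released.
 * Returns the net refcount change; 0 means gets and puts are balanced.
 */
static int iterate_and_release(struct node_model *n, int take_ref)
{
	int before = n->refcount;
	struct dev_model dev;

	node_get(n);				/* iterator holds the child */
	dev.of_node = take_ref ? node_get(n) : n;	/* the assignment */
	node_put(n);				/* scoped iterator drops it */
	dev_release(&dev);			/* device drops its reference */

	return n->refcount - before;
}
```

Without the extra reference the model ends one count short (a double put);
with it, the count returns to its starting value.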


> ---
>  drivers/pmdomain/imx/gpc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pmdomain/imx/gpc.c b/drivers/pmdomain/imx/gpc.c
> index de695f1944ab31de3d37ce8000d0c577579d64f9..42e50c9b4fb9ffb96a20a462d4eb5168942a893c 100644
> --- a/drivers/pmdomain/imx/gpc.c
> +++ b/drivers/pmdomain/imx/gpc.c
> @@ -487,7 +487,7 @@ static int imx_gpc_probe(struct platform_device *pdev)
>                         domain->ipg_rate_mhz = ipg_rate_mhz;
>
>                         pd_pdev->dev.parent = &pdev->dev;
> -                       pd_pdev->dev.of_node = np;
> +                       pd_pdev->dev.of_node = of_node_get(np);
>                         pd_pdev->dev.fwnode = of_fwnode_handle(np);
>
>                         ret = platform_device_add(pd_pdev);
>
> --
> 2.47.3



* Re: [PATCH v3 15/19] mm/hugetlb_vmemmap: Move bootmem HVO setup to early init
From: Muchun Song @ 2026-06-03 12:24 UTC (permalink / raw)
  To: Usama Arif
  Cc: Oscar Salvador, David Hildenbrand, Andrew Morton,
	Madhavan Srinivasan, Michael Ellerman, Mike Rapoport,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, linux-mm,
	linux-kernel, Nicholas Piggin, Christophe Leroy (CS GROUP),
	Ritesh Harjani (IBM), Aneesh Kumar K.V, linuxppc-dev, Muchun Song
In-Reply-To: <20260603120246.1572177-1-usama.arif@linux.dev>



On 2026/6/3 20:02, Usama Arif wrote:
> On Tue,  2 Jun 2026 18:10:35 +0800 Muchun Song <songmuchun@bytedance.com> wrote:
>
>> Bootmem HugeTLB pages currently defer HVO setup to
>> hugetlb_vmemmap_init_late(), because the optimization needs zone
>> information.
>>
>> Now that zone initialization is available earlier, the bootmem HVO setup
>> can be done directly from hugetlb_vmemmap_init_early(). This lets
>> gigantic HugeTLB pages apply HVO as soon as they are allocated.
>>
>> Bootmem gigantic pages that span multiple zones are now filtered out
>> when they are allocated, so the remaining bootmem gigantic pages seen by
>> later hugetlb initialization are already zone-valid. As a result,
>> hugetlb_vmemmap_init_late() no longer needs to handle bootmem HVO setup.
>>
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>>   mm/hugetlb_vmemmap.c | 67 +++++++++-----------------------------------
>>   1 file changed, 13 insertions(+), 54 deletions(-)
>>
>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>> index ea6af85bfec1..464578ee246e 100644
>> --- a/mm/hugetlb_vmemmap.c
>> +++ b/mm/hugetlb_vmemmap.c
>> @@ -745,6 +745,8 @@ static bool vmemmap_should_optimize_bootmem_page(struct huge_bootmem_page *m)
>>   	return true;
>>   }
>>   
>> +static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn);
>> +
>>   /*
>>    * Initialize memmap section for a gigantic page, HVO-style.
>>    */
>> @@ -752,6 +754,7 @@ void __init hugetlb_vmemmap_init_early(int nid)
>>   {
>>   	unsigned long psize, paddr, section_size;
>>   	unsigned long ns, i, pnum, pfn, nr_pages;
>> +	unsigned long start, end;
>>   	struct huge_bootmem_page *m = NULL;
>>   	void *map;
>>   
>> @@ -761,6 +764,8 @@ void __init hugetlb_vmemmap_init_early(int nid)
>>   	section_size = (1UL << PA_SECTION_SHIFT);
>>   
>>   	list_for_each_entry(m, &huge_boot_pages[nid], list) {
>> +		struct zone *zone;
>> +
>>   		if (!vmemmap_should_optimize_bootmem_page(m))
>>   			continue;
>>   
>> @@ -769,6 +774,14 @@ void __init hugetlb_vmemmap_init_early(int nid)
>>   		paddr = virt_to_phys(m);
>>   		pfn = PHYS_PFN(paddr);
>>   		map = pfn_to_page(pfn);
>> +		start = (unsigned long)map;
>> +		end = start + hugetlb_vmemmap_size(m->hstate);
>> +		zone = pfn_to_zone(nid, pfn);
>> +
>> +		if (vmemmap_populate_hvo(start, end, huge_page_order(m->hstate),
>> +					 zone, HUGETLB_VMEMMAP_RESERVE_SIZE))
>> +			panic("Failed to allocate memmap for HugeTLB page\n");
> The replaced hugetlb_vmemmap_init_late() path used to fall back to
> vmemmap_populate() if vmemmap_populate_hvo() returned an error and
> just lost the HVO optimization for that page.
>
> The new path panics on any non-zero return.  Is the panic intended,
> given that vmemmap_populate_hvo() returns -ENOMEM on allocation
> failure and HVO is normally treated as an optimization rather than a
> hard requirement?

This is intentional; see patch 6:

     mm/sparse: Panic on memmap and usemap allocation failure

We already panic on OOM anyway.

Muchun,
Thanks.

>
>> +		memmap_boot_pages_add(DIV_ROUND_UP(HUGETLB_VMEMMAP_RESERVE_SIZE, PAGE_SIZE));
>>   
>>   		pnum = pfn_to_section_nr(pfn);
>>   		ns = psize / section_size;
>> @@ -800,60 +813,6 @@ static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn)
>>   
>>   void __init hugetlb_vmemmap_init_late(int nid)
>>   {
>> -	struct huge_bootmem_page *m, *tm;
>> -	unsigned long phys, nr_pages, start, end;
>> -	unsigned long pfn, nr_mmap;
>> -	struct zone *zone = NULL;
>> -	struct hstate *h;
>> -	void *map;
>> -
>> -	if (!READ_ONCE(vmemmap_optimize_enabled))
>> -		return;
>> -
>> -	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
>> -		if (!(m->flags & HUGE_BOOTMEM_HVO))
>> -			continue;
>> -
>> -		phys = virt_to_phys(m);
>> -		h = m->hstate;
>> -		pfn = PHYS_PFN(phys);
>> -		nr_pages = pages_per_huge_page(h);
>> -		map = pfn_to_page(pfn);
>> -		start = (unsigned long)map;
>> -		end = start + nr_pages * sizeof(struct page);
>> -
>> -		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
>> -			/*
>> -			 * Oops, the hugetlb page spans multiple zones.
>> -			 * Remove it from the list, and populate it normally.
>> -			 */
>> -			list_del(&m->list);
>> -
>> -			vmemmap_populate(start, end, nid, NULL);
>> -			nr_mmap = end - start;
>> -			memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
>> -
>> -			memblock_phys_free(phys, huge_page_size(h));
>> -			continue;
>> -		}
>> -
>> -		if (!zone || !zone_spans_pfn(zone, pfn))
>> -			zone = pfn_to_zone(nid, pfn);
>> -		if (WARN_ON_ONCE(!zone))
>> -			continue;
>> -
>> -		if (vmemmap_populate_hvo(start, end, huge_page_order(h), zone,
>> -					 HUGETLB_VMEMMAP_RESERVE_SIZE) < 0) {
>> -			/* Fallback if HVO population fails */
>> -			vmemmap_populate(start, end, nid, NULL);
>> -			nr_mmap = end - start;
>> -		} else {
>> -			m->flags |= HUGE_BOOTMEM_ZONES_VALID;
>> -			nr_mmap = HUGETLB_VMEMMAP_RESERVE_SIZE;
>> -		}
>> -
>> -		memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
>> -	}
>>   }
>>   #endif
>>   
>> -- 
>> 2.54.0
>>
>>



^ permalink raw reply

* Re: [PATCH v3 15/19] mm/hugetlb_vmemmap: Move bootmem HVO setup to early init
From: Usama Arif @ 2026-06-03 12:02 UTC (permalink / raw)
  To: Muchun Song
  Cc: Usama Arif, Oscar Salvador, David Hildenbrand, Andrew Morton,
	Madhavan Srinivasan, Michael Ellerman, Muchun Song, Mike Rapoport,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, linux-mm,
	linux-kernel, Nicholas Piggin, Christophe Leroy (CS GROUP),
	Ritesh Harjani (IBM), Aneesh Kumar K.V, linuxppc-dev,
	Mike Kravetz
In-Reply-To: <20260602101039.1867613-16-songmuchun@bytedance.com>

On Tue,  2 Jun 2026 18:10:35 +0800 Muchun Song <songmuchun@bytedance.com> wrote:

> Bootmem HugeTLB pages currently defer HVO setup to
> hugetlb_vmemmap_init_late(), because the optimization needs zone
> information.
> 
> Now that zone initialization is available earlier, the bootmem HVO setup
> can be done directly from hugetlb_vmemmap_init_early(). This lets
> gigantic HugeTLB pages apply HVO as soon as they are allocated.
> 
> Bootmem gigantic pages that span multiple zones are now filtered out
> when they are allocated, so the remaining bootmem gigantic pages seen by
> later hugetlb initialization are already zone-valid. As a result,
> hugetlb_vmemmap_init_late() no longer needs to handle bootmem HVO setup.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/hugetlb_vmemmap.c | 67 +++++++++-----------------------------------
>  1 file changed, 13 insertions(+), 54 deletions(-)
> 
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index ea6af85bfec1..464578ee246e 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -745,6 +745,8 @@ static bool vmemmap_should_optimize_bootmem_page(struct huge_bootmem_page *m)
>  	return true;
>  }
>  
> +static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn);
> +
>  /*
>   * Initialize memmap section for a gigantic page, HVO-style.
>   */
> @@ -752,6 +754,7 @@ void __init hugetlb_vmemmap_init_early(int nid)
>  {
>  	unsigned long psize, paddr, section_size;
>  	unsigned long ns, i, pnum, pfn, nr_pages;
> +	unsigned long start, end;
>  	struct huge_bootmem_page *m = NULL;
>  	void *map;
>  
> @@ -761,6 +764,8 @@ void __init hugetlb_vmemmap_init_early(int nid)
>  	section_size = (1UL << PA_SECTION_SHIFT);
>  
>  	list_for_each_entry(m, &huge_boot_pages[nid], list) {
> +		struct zone *zone;
> +
>  		if (!vmemmap_should_optimize_bootmem_page(m))
>  			continue;
>  
> @@ -769,6 +774,14 @@ void __init hugetlb_vmemmap_init_early(int nid)
>  		paddr = virt_to_phys(m);
>  		pfn = PHYS_PFN(paddr);
>  		map = pfn_to_page(pfn);
> +		start = (unsigned long)map;
> +		end = start + hugetlb_vmemmap_size(m->hstate);
> +		zone = pfn_to_zone(nid, pfn);
> +
> +		if (vmemmap_populate_hvo(start, end, huge_page_order(m->hstate),
> +					 zone, HUGETLB_VMEMMAP_RESERVE_SIZE))
> +			panic("Failed to allocate memmap for HugeTLB page\n");

The replaced hugetlb_vmemmap_init_late() path used to fall back to
vmemmap_populate() if vmemmap_populate_hvo() returned an error and
just lost the HVO optimization for that page.

The new path panics on any non-zero return.  Is the panic intended,
given that vmemmap_populate_hvo() returns -ENOMEM on allocation
failure and HVO is normally treated as an optimization rather than a
hard requirement?
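
To make the difference concrete, here is a minimal userspace sketch of
the two policies. Everything prefixed sketch_ is hypothetical and only
models the control flow of the removed and the added hunk; it is not the
kernel code.

```c
#include <errno.h>

/* Possible results for one bootmem HugeTLB page in this model. */
enum sketch_outcome { SKETCH_HVO, SKETCH_FALLBACK, SKETCH_PANIC };

/* Hypothetical stand-in for vmemmap_populate_hvo(): fails on request. */
static int sketch_populate_hvo(int simulate_oom)
{
	return simulate_oom ? -ENOMEM : 0;
}

/* Removed late path: an HVO failure only loses the optimization. */
static enum sketch_outcome sketch_late_policy(int simulate_oom)
{
	if (sketch_populate_hvo(simulate_oom) < 0)
		return SKETCH_FALLBACK;	/* models vmemmap_populate() of the full range */
	return SKETCH_HVO;
}

/* New early path: any non-zero return is fatal. */
static enum sketch_outcome sketch_early_policy(int simulate_oom)
{
	if (sketch_populate_hvo(simulate_oom))
		return SKETCH_PANIC;	/* models panic() */
	return SKETCH_HVO;
}
```

With a simulated allocation failure the late policy yields
SKETCH_FALLBACK while the early policy yields SKETCH_PANIC, which is the
case the question is about.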

> +		memmap_boot_pages_add(DIV_ROUND_UP(HUGETLB_VMEMMAP_RESERVE_SIZE, PAGE_SIZE));
>  
>  		pnum = pfn_to_section_nr(pfn);
>  		ns = psize / section_size;
> @@ -800,60 +813,6 @@ static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn)
>  
>  void __init hugetlb_vmemmap_init_late(int nid)
>  {
> -	struct huge_bootmem_page *m, *tm;
> -	unsigned long phys, nr_pages, start, end;
> -	unsigned long pfn, nr_mmap;
> -	struct zone *zone = NULL;
> -	struct hstate *h;
> -	void *map;
> -
> -	if (!READ_ONCE(vmemmap_optimize_enabled))
> -		return;
> -
> -	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
> -		if (!(m->flags & HUGE_BOOTMEM_HVO))
> -			continue;
> -
> -		phys = virt_to_phys(m);
> -		h = m->hstate;
> -		pfn = PHYS_PFN(phys);
> -		nr_pages = pages_per_huge_page(h);
> -		map = pfn_to_page(pfn);
> -		start = (unsigned long)map;
> -		end = start + nr_pages * sizeof(struct page);
> -
> -		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
> -			/*
> -			 * Oops, the hugetlb page spans multiple zones.
> -			 * Remove it from the list, and populate it normally.
> -			 */
> -			list_del(&m->list);
> -
> -			vmemmap_populate(start, end, nid, NULL);
> -			nr_mmap = end - start;
> -			memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
> -
> -			memblock_phys_free(phys, huge_page_size(h));
> -			continue;
> -		}
> -
> -		if (!zone || !zone_spans_pfn(zone, pfn))
> -			zone = pfn_to_zone(nid, pfn);
> -		if (WARN_ON_ONCE(!zone))
> -			continue;
> -
> -		if (vmemmap_populate_hvo(start, end, huge_page_order(h), zone,
> -					 HUGETLB_VMEMMAP_RESERVE_SIZE) < 0) {
> -			/* Fallback if HVO population fails */
> -			vmemmap_populate(start, end, nid, NULL);
> -			nr_mmap = end - start;
> -		} else {
> -			m->flags |= HUGE_BOOTMEM_ZONES_VALID;
> -			nr_mmap = HUGETLB_VMEMMAP_RESERVE_SIZE;
> -		}
> -
> -		memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
> -	}
>  }
>  #endif
>  
> -- 
> 2.54.0
> 
> 


^ permalink raw reply

* Re: [PATCH] powerpc/entry: Disable interrupts before irqentry_exit
From: Shrikanth Hegde @ 2026-06-03 11:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: maddy, linuxppc-dev, tglx, christophe.leroy, linux-kernel,
	venkat88, mkchauras
In-Reply-To: <20260603101155.GV3493090@noisy.programming.kicks-ass.net>

Hi Peter. Thanks for taking a look at it.

On 6/3/26 3:41 PM, Peter Zijlstra wrote:
> On Wed, Jun 03, 2026 at 03:25:21PM +0530, Shrikanth Hegde wrote:
>> Venkat reported a panic on powerpc-next tree where GENERIC_ENTRY has
>> been enabled.
>>
>> kernel BUG at kernel/sched/core.c:7512!
>> NIP  preempt_schedule_irq+0x44/0x118
>> LR   dynamic_irqentry_exit_cond_resched+0x40/0x1a4
>> Call Trace:
>>   dynamic_irqentry_exit_cond_resched+0x40/0x1a4
>>   do_page_fault+0xc0/0x104
>>   data_access_common_virt+0x210/0x220
>>
>> This happens because __do_page_fault ends up enabling interrupts and can
>> take long enough for need_resched to be set. This leads to a schedule
>> call in irqentry_exit with interrupts still enabled, which triggers the
>> BUG.
>>
>> Many such irq handlers enable interrupts.
>> Fix it by disabling irqs before calling irqentry_exit. The same
>> pattern exists today in interrupt_exit_kernel_prepare.
>>
>> While there, turn those BUG_ONs into WARN_ONs. Interrupts are disabled
>> right after, so it is not that severe. This will still help to catch the
>> offending callsites.
>>
>> Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
>> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Closes: https://lore.kernel.org/all/7904105b-9dfa-4efd-a5ef-bc0276ed255d@linux.ibm.com/
>> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
>> ---
>>
>> This applies on top of the powerpc/next tree.
>> base: 6ed60999d33d '("powerpc: Remove unused functions")'
>>
>>   arch/powerpc/include/asm/entry-common.h | 9 +++++----
>>   1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
>> index de5601282755..7da373a56813 100644
>> --- a/arch/powerpc/include/asm/entry-common.h
>> +++ b/arch/powerpc/include/asm/entry-common.h
>> @@ -253,16 +253,17 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
>>   static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
>>   {
>>   	if (user_mode(regs)) {
>> -		BUG_ON(regs_is_unrecoverable(regs));
>> -		BUG_ON(regs_irqs_disabled(regs));
>> +		WARN_ON(regs_is_unrecoverable(regs));
>> +		WARN_ON(regs_irqs_disabled(regs));
> 
> So while you will indeed disable IRQs right below, this checks the IRQ
> state of regs, not the current state.
> 
> What you are allowing through is a *userspace* state that has IRQs
> disabled. That is an invalid state.

OK. I should have looked at it more closely. Let me keep them as is.

> 
>>   		/*
>>   		 * We don't need to restore AMR on the way back to userspace for KUAP.
>>   		 * AMR can only have been unlocked if we interrupted the kernel.
>>   		 */
>>   		kuap_assert_locked();
>> -
>> -		local_irq_disable();
>>   	}
>> +
>> +	/* irqentry_exit expects to be called with interrupts disabled */
>> +	local_irq_disable();
>>   }
>>   
>>   static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
>> -- 
>> 2.47.3
>>



^ permalink raw reply

* Re: [PATCH v7 00/15] arm64: Unmap linear alias of kernel data/bss
From: Ard Biesheuvel @ 2026-06-03 11:24 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, kernel-team,
	linux-kernel, Mark Rutland, Ryan Roberts, Anshuman Khandual,
	Kevin Brodsky, Liz Prucka, Seth Jenkins, Kees Cook, Mike Rapoport,
	David Hildenbrand, Andrew Morton, Jann Horn, linux-mm,
	linux-hardening, linuxppc-dev, linux-sh, maz
In-Reply-To: <aiAOaYwPlGEG6FML@willie-the-truck>



On Wed, 3 Jun 2026, at 13:22, Will Deacon wrote:
> On Wed, Jun 03, 2026 at 10:57:49AM +0200, Ard Biesheuvel wrote:
>> (cc Marc)
>> 
>> On Tue, 2 Jun 2026, at 22:34, Will Deacon wrote:
>> > On Fri, 29 May 2026 17:01:51 +0200, Ard Biesheuvel wrote:
>> >> One of the reasons the lack of randomization of the linear map on arm64
>> >> is considered problematic is the fact that bootloaders adhering to the
>> >> original arm64 boot protocol (i.e., a substantial fraction of all
>> >> Android phones) may place the kernel at the base of DRAM, and therefore
>> >> at the base of the non-randomized linear map. This puts a writable alias
>> >> of the kernel's data and bss regions at a predictable location, removing
>> >> the need for an attacker to guess where KASLR mapped the kernel.
>> >> 
>> >> [...]
>> >
>> > It would've been nice to hear from the ppc folks on patch 11, but I've
>> > picked it up on the assumption that they'll love the negative diff stat.
>> > Worst case, we can drop/revert stuff if they have late objections.
>> >
>> 
>> Thanks.
>> 
>> There is a de facto ack from Michael Ellerman in the Link:, which is why
>> I included it.
>> 
>> Note that Sashiko found an issue with KVM+MTE, where a read-only mapping
>> of the zero page in the linear map may result in issues:
>> 
>> """
>> Does moving the zero page to .rodata (or unmapping/read-only mapping its
>> linear alias) expose a guest-to-host denial of service with KVM and MTE?
>> When an MTE-enabled KVM guest reads an unmapped memory address, KVM handles
>> the stage-2 fault by mapping the host's shared zero page. KVM will then
>> call sanitise_mte_tags() in arch/arm64/kvm/mmu.c.
>> Since the PG_mte_tagged flag is never set on the zero page, KVM's
>> try_page_mte_tagging() succeeds, and it calls mte_clear_page_tags().
>> This executes the STGM instruction using the zero page's linear map alias.
>> If this alias is read-only or unmapped, won't the STGM instruction trigger
>> a synchronous permission fault or translation fault in EL1, causing a host
>> kernel panic?
>> """
>> 
>> Marc seems to think it is legit, so I came up with the following (I'll send
>> it out separately with another pair of tweaks):
>
> Thanks, it also looks like we're getting some early WARN_ON()s firing in
> CI from split_kernel_leaf_mapping() after applying your changes:
>
> https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2571596185/test_aarch64/14662134813/artifacts/jobwatch/logs/recipes/21399931/tasks/219104268/results/1007729692/logs/journalctl.log
>

OK I'll investigate


^ permalink raw reply

* Re: ppc64le kunit test failure: guest_state_buffer_test
From: Vaibhav Jain @ 2026-06-03 11:25 UTC (permalink / raw)
  To: Eric Biggers; +Cc: linuxppc-dev, kvm
In-Reply-To: <20260603074035.GA1148188@sol>

Hi Eric,

I just successfully tested this kunit test on kernel tag next-20260602
and qemu master HEAD == 405c32d2b18a("Merge tag 'pull-tpm-2026-06-01-1'
of https://github.com/stefanberger/qemu-tpm into staging") [1]

If you are using QEMU >= v10.0.0, then the kunit test failure you
reported is due to QEMU not enabling the 'cap-nested-papr' capability,
which is needed to enable KVM-HV APIv2 support in QEMU.

Without 'cap-nested-papr', QEMU doesn't register the H_GUEST_GET_STATE
hcall which this kunit test relies on. Hence it returns -2 (H_FUNCTION)
for this unsupported hcall. The failure of kunit 'test_gs_hostwide_msg'
is expected, as the underlying hypervisor doesn't support the nested-papr
APIv2 capability.
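
As an illustration only, a caller could tell "hcall not implemented by
this hypervisor" apart from a real error roughly as follows. This is a
standalone sketch: sketch_classify_hcall() and the SKETCH_ names are
hypothetical, the only values taken from above are -2 for H_FUNCTION and
the usual 0 for success, and this is not what the kunit test does today.

```c
/* Return codes modelled on the PAPR hcall convention. */
#define SKETCH_H_SUCCESS	0L
#define SKETCH_H_FUNCTION	(-2L)	/* hcall not registered by the hypervisor */

enum sketch_verdict {
	SKETCH_SUPPORTED,	/* hcall worked */
	SKETCH_UNSUPPORTED,	/* e.g. cap-nested-papr not enabled */
	SKETCH_ERROR		/* any other failure */
};

/* Map a raw hcall return code to a coarse verdict. */
static enum sketch_verdict sketch_classify_hcall(long rc)
{
	if (rc == SKETCH_H_SUCCESS)
		return SKETCH_SUPPORTED;
	if (rc == SKETCH_H_FUNCTION)
		return SKETCH_UNSUPPORTED;
	return SKETCH_ERROR;
}
```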

To fix this, please enable the qemu-system-ppc64 machine capability
cap-nested-papr to get nested-papr APIv2 support.

Below is the test log showing the 'guest_state_buffer_test' kunit test
passing:

$ ./qemu-system-ppc64 --version
QEMU emulator version 11.0.50 (v11.0.0-1621-g405c32d2b1)
Copyright (c) 2003-2026 Fabrice Bellard and the QEMU Project developers

# The kunit test is specific to nested-papr APIv2, so QEMU's cap-nested-papr needs to be enabled
$ ./qemu-system-ppc64 -display none -nographic -kernel ~/linux/vmlinux \
    -machine pseries,cap-nested-papr=true

<snip>
Booting from memory...                                       
OF stdout device is: /vdevice/vty@71000000                                                                                       
Preparing to boot Linux version 7.1.0-rc6-next-20260602 (vaibhav@*********) (gcc (GCC) 16.1.1 20260515 (Red Hat 16.1.1-2), GNU ld version 2.46-3.fc44) #2 SMP PREEMPT Wed Jun  3 04:11:38 CDT 2026
Detected machine type: 0000000000000101

<snip>
[    4.335850][    T1]     KTAP version 1
[    4.335896][    T1]     # Subtest: guest_state_buffer_test
[    4.335946][    T1]     # module: test_guest_state_buffer
[    4.335970][    T1]     1..7
[    4.337296][    T1]     ok 1 test_creating_buffer
[    4.338998][    T1]     ok 2 test_adding_element
[    4.341996][    T1]     ok 3 test_gs_bitmap
[    4.343406][    T1]     ok 4 test_gs_parsing
[    4.345607][    T1]     ok 5 test_gs_msg
[    4.347247][    T1]     ok 6 test_gs_hostwide_msg
[    4.348012][  T131]     # test_gs_hostwide_counters: Guest Heap Size=0 bytes
[    4.348183][  T131]     # test_gs_hostwide_counters: Guest Heap Size Max=0 bytes
[    4.348350][  T131]     # test_gs_hostwide_counters: Guest Page-table Size=0 bytes
[    4.348653][  T131]     # test_gs_hostwide_counters: Guest Page-table Size Max=0 bytes
[    4.348813][  T131]     # test_gs_hostwide_counters: Guest Page-table Reclaim Size=0 bytes
[    4.349354][    T1]     ok 7 test_gs_hostwide_counters
[    4.349569][    T1] # guest_state_buffer_test: pass:7 fail:0 skip:0 total:7
[    4.349635][    T1] # Totals: pass:7 fail:0 skip:0 total:7
[    4.349708][    T1] ok 4 guest_state_buffer_test


Can you try adding 'cap-nested-papr=true' to the QEMU machine you are
using and see if the problem resolves for you? If it persists, can you
please share the QEMU command line you are using?

[1] https://gitlab.com/qemu-project/qemu/-/commit/405c32d2b18a683ba36301351af75125d9afda08


Eric Biggers <ebiggers@kernel.org> writes:

> On Wed, Jun 03, 2026 at 01:03:09PM +0530, Vaibhav Jain wrote:
>> Hi Eric,
>> 
>> Thanks for trying and reporting this. This kunit test depends on
>> availability of QEMU commit 5f7d861e("spapr: nested: Add support for
>> reporting Hostwide state counter ") [1] that was merged in v10.0.0.
>> 
>> Since you haven't mentioned the QEMU version used, I assume it's a
>> version < v10.0.0. With the QEMU patch available you should see this test
>> passing with results similar to those described in the original
>> cover letter of the patch series at [2] that introduced this kunit test.
>> 
>> [1] https://gitlab.com/qemu-project/qemu/-/commit/5f7d861e65d90e0446b8f22a0bc859a5d8058ea6
>> 
>> [2] https://lore.kernel.org/all/20250416162740.93143-1-vaibhav@linux.ibm.com/
>
> Nope, it fails even on the master branch of QEMU.
>
> - Eric

-- 
Cheers
~ Vaibhav


^ permalink raw reply

* Re: [PATCH v7 00/15] arm64: Unmap linear alias of kernel data/bss
From: Will Deacon @ 2026-06-03 11:22 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, kernel-team,
	linux-kernel, Mark Rutland, Ryan Roberts, Anshuman Khandual,
	Kevin Brodsky, Liz Prucka, Seth Jenkins, Kees Cook, Mike Rapoport,
	David Hildenbrand, Andrew Morton, Jann Horn, linux-mm,
	linux-hardening, linuxppc-dev, linux-sh, maz
In-Reply-To: <2c7929d4-6622-475d-af1b-bcd0cd997cd3@app.fastmail.com>

On Wed, Jun 03, 2026 at 10:57:49AM +0200, Ard Biesheuvel wrote:
> (cc Marc)
> 
> On Tue, 2 Jun 2026, at 22:34, Will Deacon wrote:
> > On Fri, 29 May 2026 17:01:51 +0200, Ard Biesheuvel wrote:
> >> One of the reasons the lack of randomization of the linear map on arm64
> >> is considered problematic is the fact that bootloaders adhering to the
> >> original arm64 boot protocol (i.e., a substantial fraction of all
> >> Android phones) may place the kernel at the base of DRAM, and therefore
> >> at the base of the non-randomized linear map. This puts a writable alias
> >> of the kernel's data and bss regions at a predictable location, removing
> >> the need for an attacker to guess where KASLR mapped the kernel.
> >> 
> >> [...]
> >
> > It would've been nice to hear from the ppc folks on patch 11, but I've
> > picked it up on the assumption that they'll love the negative diff stat.
> > Worst case, we can drop/revert stuff if they have late objections.
> >
> 
> Thanks.
> 
> There is a de facto ack from Michael Ellerman in the Link:, which is why
> I included it.
> 
> Note that Sashiko found an issue with KVM+MTE, where a read-only mapping
> of the zero page in the linear map may result in issues:
> 
> """
> Does moving the zero page to .rodata (or unmapping/read-only mapping its
> linear alias) expose a guest-to-host denial of service with KVM and MTE?
> When an MTE-enabled KVM guest reads an unmapped memory address, KVM handles
> the stage-2 fault by mapping the host's shared zero page. KVM will then
> call sanitise_mte_tags() in arch/arm64/kvm/mmu.c.
> Since the PG_mte_tagged flag is never set on the zero page, KVM's
> try_page_mte_tagging() succeeds, and it calls mte_clear_page_tags().
> This executes the STGM instruction using the zero page's linear map alias.
> If this alias is read-only or unmapped, won't the STGM instruction trigger
> a synchronous permission fault or translation fault in EL1, causing a host
> kernel panic?
> """
> 
> Marc seems to think it is legit, so I came up with the following (I'll send
> it out separately with another pair of tweaks):

Thanks, it also looks like we're getting some early WARN_ON()s firing in
CI from split_kernel_leaf_mapping() after applying your changes:

https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2571596185/test_aarch64/14662134813/artifacts/jobwatch/logs/recipes/21399931/tasks/219104268/results/1007729692/logs/journalctl.log

Will


^ permalink raw reply

* Re: [PATCH] [SCSI] qla2xxx: Handle the INTx not connected while passing through
From: Shivaprasad G Bhat @ 2026-06-03 10:56 UTC (permalink / raw)
  To: Madhavan Srinivasan, njavali, GR-QLogic-Storage-Upstream,
	James.Bottomley, martin.petersen, Kyle.Mahlkuch
  Cc: linux-scsi, linux-kernel, alex.williamson, linuxppc-dev, harshpb
In-Reply-To: <66039318-07c0-4453-a295-bc39a2a5b8ec@linux.ibm.com>

On 6/3/26 1:42 PM, Madhavan Srinivasan wrote:
>
> On 5/15/26 7:15 PM, Shivaprasad G Bhat wrote:
>> The PCI_INTERRUPT_PIN reports if the device supports the INTx.
>> However, when the device is assigned to a guest via vfio, the
>> PCI_INTERRUPT_PIN is set to 0 (i.e. none) if the line is not
>> connected and/or the platform cannot route the interrupt.
>>
>> In such cases, the guest PCI_INTERRUPT_PIN is 0 and the port
>> number becomes -1 (255, uint8_t underflow) for qla[25|27|28]xx and
>> qla2031 devices. The flt_region_nvram is never set, and subsequently
>> the LUN detection fails. The warnings below show the NVRAM configuration
>> failure.
>>
>>   []-0073:1: Inconsistent NVRAM checksum=0xffffffc0 id=HCAM version=0x100.
>>   []-0074:1: Falling back to functioning (yet invalid -- WWPN) defaults.
>>   []-0076:1: NVRAM configuration failed.
>>
>> The patch handles the case, and sets the port_no to devfn like
>> it's done everywhere else.
>
> Any update on this? Do you have any comments/concerns that should be
> addressed?
>
Hi Maddy,


Priya has tested this already. I have requested Kyle to help with 
reviewing this patch.

Meanwhile, I would request Nilesh, James or Martin to provide feedback, if
any.
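
For reviewers, the underflow from the patch description can be
reproduced in plain C. This is only a sketch: sketch_port_no_old() and
sketch_port_no_new() are hypothetical helpers that model just the
decrement branch (qla25xx/27xx/28xx/2031), and SKETCH_PCI_FUNC() assumes
the function number lives in the low three bits of devfn.

```c
#include <stdint.h>

/* Assumed encoding: function number in the low three bits of devfn. */
#define SKETCH_PCI_FUNC(devfn)	((uint8_t)((devfn) & 0x07))

/* Old behaviour: unconditional decrement of the interrupt pin value. */
static uint8_t sketch_port_no_old(uint8_t pin)
{
	return (uint8_t)(pin - 1);	/* pin == 0 wraps around to 255 */
}

/* Patched behaviour: fall back to the PCI function when pin is 0. */
static uint8_t sketch_port_no_new(uint8_t pin, uint8_t devfn)
{
	if (pin == 0)		/* none of INT[A|B|C|D], e.g. virtualized by vfio */
		return SKETCH_PCI_FUNC(devfn);
	return (uint8_t)(pin - 1);
}
```

With pin 0 the old helper returns 255, while the new one returns the
function number, so a device at function 1 ends up as port 1.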


Thanks,

Shivaprasad



> Maddy
>
>> Reference: commit 2bd42b03ab6b ("vfio/pci: Virtualize zero INTx PIN if no pdev->irq")
>> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
>> ---
>>   drivers/scsi/qla2xxx/qla_os.c |   15 ++++++++++-----
>>   1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
>> index 72b1c28e4dae..a8d6a0a021f4 100644
>> --- a/drivers/scsi/qla2xxx/qla_os.c
>> +++ b/drivers/scsi/qla2xxx/qla_os.c
>> @@ -2803,11 +2803,16 @@ qla2x00_set_isp_flags(struct qla_hw_data *ha)
>>       else {
>>           /* Get adapter physical port no from interrupt pin register. */
>>           pci_read_config_byte(ha->pdev, PCI_INTERRUPT_PIN, &ha->port_no);
>> -        if (IS_QLA25XX(ha) || IS_QLA2031(ha) ||
>> -            IS_QLA27XX(ha) || IS_QLA28XX(ha))
>> -            ha->port_no--;
>> -        else
>> -            ha->port_no = !(ha->port_no & 1);
>> +        if (ha->port_no == 0) {
>> +            /* None of INT[A|B|C|D], may be virtualized by vfio */
>> +            ha->port_no = PCI_FUNC(ha->pdev->devfn);
>> +        } else {
>> +            if (IS_QLA25XX(ha) || IS_QLA2031(ha) ||
>> +                IS_QLA27XX(ha) || IS_QLA28XX(ha))
>> +                ha->port_no--;
>> +            else
>> +                ha->port_no = !(ha->port_no & 1);
>> +        }
>>       }
>>         ql_dbg_pci(ql_dbg_init, ha->pdev, 0x000b,
>>
>>


^ permalink raw reply

* [PATCH v4] powerpc: Simplify access_ok()
From: Christophe Leroy (CS GROUP) @ 2026-06-03 10:20 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Madhavan Srinivasan
  Cc: Christophe Leroy (CS GROUP), linux-kernel, linuxppc-dev

With the implementation of masked user access, we always have a memory
gap between user memory space and kernel memory space, so use it to
simplify access_ok() by relying on access fault in case of an access
in the gap.

Most of the time the size is known at build time.

On powerpc64, the kernel space starts at 0x8000000000000000 which is
always more than two times TASK_SIZE_MAX so when the size is known at
build time and not greater than TASK_SIZE_MAX, only the address needs to
be verified. If not, the bitwise OR of address and size must be lower
than TASK_SIZE_MAX. As TASK_SIZE_MAX is a power of 2, just check that
there is no bit set outside of the TASK_SIZE_MAX - 1 mask.

On powerpc32, there is a guaranteed gap of 128KB so when the size is
known at build time and not greater than 128KB, just check that the
address is below TASK_SIZE. Otherwise use the original formula.
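
For illustration, the powerpc64 mask check can be tried out in userspace
with the sketch below. The SKETCH_ constants are assumed values chosen
to match the numbers quoted above, and sketch_range_ok_exact() is only a
reference formula for comparison; neither is kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit for illustration; must be a power of two. */
#define SKETCH_TASK_SIZE_MAX	UINT64_C(0x0010000000000000)
/* Start of kernel space as stated above. */
#define SKETCH_KERNEL_START	UINT64_C(0x8000000000000000)

/* Even addr and size both just below the limit cannot reach the kernel. */
_Static_assert(2 * SKETCH_TASK_SIZE_MAX <= SKETCH_KERNEL_START,
	       "gap must cover the worst-case overrun");

/* Exact check: the whole range [addr, addr + size) is below the limit. */
static bool sketch_range_ok_exact(uint64_t addr, uint64_t size)
{
	return size <= SKETCH_TASK_SIZE_MAX &&
	       addr <= SKETCH_TASK_SIZE_MAX - size;
}

/* Mask check: neither addr nor size has a bit set at or above the limit. */
static bool sketch_range_ok_masked(uint64_t addr, uint64_t size)
{
	return !((addr | size) & ~(SKETCH_TASK_SIZE_MAX - 1));
}
```

The mask check is deliberately looser than the exact one: it accepts a
range that starts below the limit and runs past it, on the basis that
the overrun lands in the gap and faults there instead of reaching kernel
memory.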

Signed-off-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
---
v2: Fix build failure following untested last minute change :(

v3: Using statically_true() following comment from David.

v4: Rebased on top of today's powerpc/merge branch 3af068d1f05b ("Automatic merge of 'next' into merge (2026-05-22 09:59)")
---
 arch/powerpc/include/asm/uaccess.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index e98c628e3899..7b8c56962c31 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -18,8 +18,34 @@
 /* Threshold above which VMX copy path is used */
 #define VMX_COPY_THRESHOLD 3328
 
+#define __access_ok __access_ok
+
 #include <asm-generic/access_ok.h>
 
+/*
+ * On powerpc64, TASK_SIZE_MAX is 0x0010000000000000 then even if both ptr and size
+ * are TASK_SIZE_MAX we are still inside the memory gap. So make it simple.
+ */
+static __always_inline int __access_ok(const void __user *ptr, unsigned long size)
+{
+	unsigned long addr = (unsigned long)ptr;
+
+	if (IS_ENABLED(CONFIG_PPC64)) {
+		BUILD_BUG_ON(!is_power_of_2(TASK_SIZE_MAX));
+		BUILD_BUG_ON(TASK_SIZE_MAX > 0x0010000000000000);
+
+		if (statically_true(size > TASK_SIZE_MAX))
+			return false;
+		if (statically_true(size <= TASK_SIZE_MAX))
+			return !(addr & ~(TASK_SIZE_MAX - 1));
+		return !((size | addr) & ~(TASK_SIZE_MAX - 1));
+	} else {
+		if (statically_true(size <= SZ_128K))
+			return addr < TASK_SIZE;
+		return size <= TASK_SIZE && addr <= TASK_SIZE - size;
+	}
+}
+
 /*
  * These are the main single-value transfer routines.  They automatically
  * use the right size if we just have the right pointer type.
-- 
2.54.0



^ permalink raw reply related

* Re: [PATCH] powerpc/entry: Disable interrupts before irqentry_exit
From: Peter Zijlstra @ 2026-06-03 10:11 UTC (permalink / raw)
  To: Shrikanth Hegde
  Cc: maddy, linuxppc-dev, tglx, christophe.leroy, linux-kernel,
	venkat88, mkchauras
In-Reply-To: <20260603095521.198267-1-sshegde@linux.ibm.com>

On Wed, Jun 03, 2026 at 03:25:21PM +0530, Shrikanth Hegde wrote:
> Venkat reported a panic on powerpc-next tree where GENERIC_ENTRY has
> been enabled.
> 
> kernel BUG at kernel/sched/core.c:7512!
> NIP  preempt_schedule_irq+0x44/0x118
> LR   dynamic_irqentry_exit_cond_resched+0x40/0x1a4
> Call Trace:
>  dynamic_irqentry_exit_cond_resched+0x40/0x1a4
>  do_page_fault+0xc0/0x104
>  data_access_common_virt+0x210/0x220
> 
> This happens because __do_page_fault ends up enabling interrupts and can
> take long enough for need_resched to be set. This leads to a schedule
> call in irqentry_exit with interrupts still enabled, which triggers the
> BUG.
> 
> Many such irq handlers enable interrupts.
> Fix it by disabling irqs before calling irqentry_exit. The same
> pattern exists today in interrupt_exit_kernel_prepare.
> 
> While there, turn those BUG_ONs into WARN_ONs. Interrupts are disabled
> right after, so it is not that severe. This will still help to catch the
> offending callsites.
> 
> Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/all/7904105b-9dfa-4efd-a5ef-bc0276ed255d@linux.ibm.com/
> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
> ---
> 
> This applies on top of the powerpc/next tree.
> base: 6ed60999d33d '("powerpc: Remove unused functions")'
> 
>  arch/powerpc/include/asm/entry-common.h | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index de5601282755..7da373a56813 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -253,16 +253,17 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
>  static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
>  {
>  	if (user_mode(regs)) {
> -		BUG_ON(regs_is_unrecoverable(regs));
> -		BUG_ON(regs_irqs_disabled(regs));
> +		WARN_ON(regs_is_unrecoverable(regs));
> +		WARN_ON(regs_irqs_disabled(regs));

So while you will indeed disable IRQs right below, this checks the IRQ
state of regs, not the current state.

What you are allowing through is a *userspace* state that has IRQs
disabled. That is an invalid state.
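
To spell out the distinction, here is a small userspace model. It is
only a sketch: struct sketch_regs, struct sketch_cpu and
sketch_exit_prepare() are hypothetical names, with the regs structure
standing for the saved state of the interrupted context and the cpu
structure for the live state.

```c
#include <stdbool.h>

/* Saved state of the interrupted context (what the regs checks look at). */
struct sketch_regs {
	bool user_mode;
	bool irqs_enabled;
};

/* Live state of the CPU while the handler runs. */
struct sketch_cpu {
	bool irqs_enabled;
};

/*
 * Models the exit-prepare step: disable interrupts on the live CPU and
 * report whether the saved state was valid. Disabling interrupts here
 * does not change what was saved in regs.
 */
static bool sketch_exit_prepare(struct sketch_cpu *cpu,
				const struct sketch_regs *regs)
{
	bool saved_state_valid = !(regs->user_mode && !regs->irqs_enabled);

	cpu->irqs_enabled = false;	/* models local_irq_disable() */
	return saved_state_valid;
}
```

A userspace regs with interrupts disabled is reported as invalid no
matter what the live state is afterwards, so disabling IRQs a few lines
later does nothing to make that condition less severe.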

>  		/*
>  		 * We don't need to restore AMR on the way back to userspace for KUAP.
>  		 * AMR can only have been unlocked if we interrupted the kernel.
>  		 */
>  		kuap_assert_locked();
> -
> -		local_irq_disable();
>  	}
> +
> +	/* irqentry_exit expects to be called with interrupts disabled */
> +	local_irq_disable();
>  }
>  
>  static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
> -- 
> 2.47.3
> 


^ permalink raw reply

* [PATCH] powerpc/entry: Disable interrupts before irqentry_exit
From: Shrikanth Hegde @ 2026-06-03  9:55 UTC (permalink / raw)
  To: maddy, linuxppc-dev
  Cc: sshegde, peterz, tglx, christophe.leroy, linux-kernel, venkat88,
	mkchauras

Venkat reported a panic on powerpc-next tree where GENERIC_ENTRY has
been enabled.

kernel BUG at kernel/sched/core.c:7512!
NIP  preempt_schedule_irq+0x44/0x118
LR   dynamic_irqentry_exit_cond_resched+0x40/0x1a4
Call Trace:
 dynamic_irqentry_exit_cond_resched+0x40/0x1a4
 do_page_fault+0xc0/0x104
 data_access_common_virt+0x210/0x220

This happens because __do_page_fault ends up enabling interrupts and can
take long enough for need_resched to be set. This leads to a schedule
call in irqentry_exit with interrupts still enabled, which triggers the
BUG.

Many such irq handlers enable interrupts.
Fix it by disabling irqs before calling irqentry_exit. The same
pattern exists today in interrupt_exit_kernel_prepare.

While there, turn those BUG_ONs into WARN_ONs. Interrupts are disabled
right after, so it is not that severe. This will still help to catch the
offending callsites.

Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/7904105b-9dfa-4efd-a5ef-bc0276ed255d@linux.ibm.com/
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---

This applies on top of the powerpc/next tree.
base: 6ed60999d33d '("powerpc: Remove unused functions")'

 arch/powerpc/include/asm/entry-common.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index de5601282755..7da373a56813 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -253,16 +253,17 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
 static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
 {
 	if (user_mode(regs)) {
-		BUG_ON(regs_is_unrecoverable(regs));
-		BUG_ON(regs_irqs_disabled(regs));
+		WARN_ON(regs_is_unrecoverable(regs));
+		WARN_ON(regs_irqs_disabled(regs));
 		/*
 		 * We don't need to restore AMR on the way back to userspace for KUAP.
 		 * AMR can only have been unlocked if we interrupted the kernel.
 		 */
 		kuap_assert_locked();
-
-		local_irq_disable();
 	}
+
+	/* irqentry_exit expects to be called with interrupts disabled */
+	local_irq_disable();
 }
 
 static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
-- 
2.47.3



^ permalink raw reply related

