[PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode

Archive-only list for patches
 help / color / mirror / Atom feed

* [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty()
@ 2025-08-25 12:14 Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.1] cdc_ncm: Flag Intel OEM version of Fibocom L850-GL as WWAN Sasha Levin
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Jiufei Xue, Jan Kara, Christian Brauner, Sasha Levin, viro,
	linux-fsdevel

From: Jiufei Xue <jiufei.xue@samsung.com>

[ Upstream commit d02d2c98d25793902f65803ab853b592c7a96b29 ]

An use-after-free issue occurred when __mark_inode_dirty() get the
bdi_writeback that was in the progress of switching.

CPU: 1 PID: 562 Comm: systemd-random- Not tainted 6.6.56-gb4403bd46a8e #1
......
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __mark_inode_dirty+0x124/0x418
lr : __mark_inode_dirty+0x118/0x418
sp : ffffffc08c9dbbc0
........
Call trace:
 __mark_inode_dirty+0x124/0x418
 generic_update_time+0x4c/0x60
 file_modified+0xcc/0xd0
 ext4_buffered_write_iter+0x58/0x124
 ext4_file_write_iter+0x54/0x704
 vfs_write+0x1c0/0x308
 ksys_write+0x74/0x10c
 __arm64_sys_write+0x1c/0x28
 invoke_syscall+0x48/0x114
 el0_svc_common.constprop.0+0xc0/0xe0
 do_el0_svc+0x1c/0x28
 el0_svc+0x40/0xe4
 el0t_64_sync_handler+0x120/0x12c
 el0t_64_sync+0x194/0x198

Root cause is:

systemd-random-seed                         kworker
----------------------------------------------------------------------
___mark_inode_dirty                     inode_switch_wbs_work_fn

  spin_lock(&inode->i_lock);
  inode_attach_wb
  locked_inode_to_wb_and_lock_list
     get inode->i_wb
     spin_unlock(&inode->i_lock);
     spin_lock(&wb->list_lock)
  spin_lock(&inode->i_lock)
  inode_io_list_move_locked
  spin_unlock(&wb->list_lock)
  spin_unlock(&inode->i_lock)
                                    spin_lock(&old_wb->list_lock)
                                      inode_do_switch_wbs
                                        spin_lock(&inode->i_lock)
                                        inode->i_wb = new_wb
                                        spin_unlock(&inode->i_lock)
                                    spin_unlock(&old_wb->list_lock)
                                    wb_put_many(old_wb, nr_switched)
                                      cgwb_release
                                      old wb released
  wb_wakeup_delayed() accesses wb,
  then trigger the use-after-free
  issue

Fix this race condition by holding inode spinlock until
wb_wakeup_delayed() finished.

Signed-off-by: Jiufei Xue <jiufei.xue@samsung.com>
Link: https://lore.kernel.org/20250728100715.3863241-1-jiufei.xue@samsung.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my thorough analysis of the commit and the kernel code, here's
my assessment:

**Backport Status: YES**

## Extensive Explanation:

This commit fixes a **critical use-after-free vulnerability** in the
writeback subsystem that should definitely be backported to stable
kernel trees. Here's why:

### 1. **Real Security Bug with Clear Impact**
The commit fixes a genuine use-after-free vulnerability that occurs in
`__mark_inode_dirty()`. The bug manifests as a kernel crash with a clear
call trace showing memory corruption. This is not a theoretical issue -
it has been observed in production (kernel 6.6.56).

### 2. **Race Condition Details**
The race condition occurs between two concurrent operations:
- **Thread A** (`__mark_inode_dirty`): Gets a reference to
  `inode->i_wb`, releases the inode lock, then calls
  `wb_wakeup_delayed(wb)`
- **Thread B** (`inode_switch_wbs_work_fn`): Switches the inode's
  writeback context, releases the old wb via `wb_put_many()`, which can
  trigger `cgwb_release` and free the wb structure

The vulnerability window exists because Thread A accesses the wb
structure (`wb_wakeup_delayed(wb)`) after releasing the inode lock but
before completing its operation, while Thread B can free that same wb
structure in parallel.

### 3. **Minimal and Contained Fix**
The fix is remarkably simple and surgical - it only reorders lock
releases:
```c
- spin_unlock(&wb->list_lock);
- spin_unlock(&inode->i_lock);
- trace_writeback_dirty_inode_enqueue(inode);
-
  if (wakeup_bdi && (wb->bdi->capabilities & BDI_CAP_WRITEBACK))
      wb_wakeup_delayed(wb);
+
+ spin_unlock(&wb->list_lock);
+ spin_unlock(&inode->i_lock);
+ trace_writeback_dirty_inode_enqueue(inode);
```

The fix ensures that `wb_wakeup_delayed()` is called while still holding
the locks, preventing the wb from being freed during the operation. This
is a classic lock ordering fix with minimal code change (just moving 3
lines).

### 4. **Critical Subsystem**
The writeback subsystem is fundamental to filesystem operations - it
handles dirty page writeback, inode management, and I/O scheduling. A
use-after-free here can lead to:
- System crashes/panics
- Data corruption
- Security vulnerabilities (potential for exploitation)

### 5. **Matches Stable Criteria**
This fix perfectly aligns with stable kernel rules:
- **Fixes a real bug**: Documented use-after-free with crash trace
- **Already upstream**: Has been reviewed by Jan Kara (filesystem
  maintainer) and merged by Christian Brauner
- **Small change**: Only 6 lines moved, no new logic
- **No new features**: Pure bug fix
- **Low regression risk**: Simply extends existing lock protection
  slightly

### 6. **Similar Historical Fixes**
The git history shows multiple similar use-after-free fixes in this
subsystem have been backported:
- `f87904c07551` - writeback: avoid use-after-free after removing device
- `8e00c4e9dd85` - writeback: fix use-after-free in
  finish_writeback_work()
- `614a4e377314` - writeback, cgroup: fix premature wb_put()

### 7. **CONFIG_CGROUP_WRITEBACK Dependency**
While this bug primarily affects systems with `CONFIG_CGROUP_WRITEBACK`
enabled (used for cgroup-aware writeback), this is a common
configuration in modern distributions, especially those using cgroups v2
for container workloads.

### 8. **No Side Effects**
The change has no functional side effects beyond fixing the race:
- The trace event still fires at the same logical point
- The wb_wakeup_delayed() still executes under the same conditions
- Only the lock hold duration is slightly extended to cover the wb
  access

This is a textbook example of a commit that should be backported to
stable kernels - it fixes a serious bug with minimal, well-understood
changes in a critical subsystem.

 fs/fs-writeback.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index cc57367fb641..a07b8cf73ae2 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2608,10 +2608,6 @@ void __mark_inode_dirty(struct inode *inode, int flags)
 			wakeup_bdi = inode_io_list_move_locked(inode, wb,
 							       dirty_list);

-			spin_unlock(&wb->list_lock);
-			spin_unlock(&inode->i_lock);
-			trace_writeback_dirty_inode_enqueue(inode);
-
 			/*
 			 * If this is the first dirty inode for this bdi,
 			 * we have to wake-up the corresponding bdi thread
@@ -2621,6 +2617,11 @@ void __mark_inode_dirty(struct inode *inode, int flags)
 			if (wakeup_bdi &&
 			    (wb->bdi->capabilities & BDI_CAP_WRITEBACK))
 				wb_wakeup_delayed(wb);
+
+			spin_unlock(&wb->list_lock);
+			spin_unlock(&inode->i_lock);
+			trace_writeback_dirty_inode_enqueue(inode);
+
 			return;
 		}
 	}
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16-6.1] cdc_ncm: Flag Intel OEM version of Fibocom L850-GL as WWAN
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.6] LoongArch: Save LBT before FPU in setup_sigcontext() Sasha Levin
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Lubomir Rintel, Jakub Kicinski, Sasha Levin, oliver, linux-usb

From: Lubomir Rintel <lkundrak@v3.sk>

[ Upstream commit 4a73a36cb704813f588af13d9842d0ba5a185758 ]

This lets NetworkManager/ModemManager know that this is a modem and
needs to be connected first.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://patch.msgid.link/20250814154214.250103-1-lkundrak@v3.sk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Analysis of the Commit:

1. **Nature of the change**: This is a simple device ID addition that
   adds USB vendor/device ID pair (0x8087:0x095a) for the Intel OEM
   version of Fibocom L850-GL modem to the cdc_ncm driver's device
   table.

2. **Bug fix vs feature**: This is a **hardware enablement fix** rather
   than a new feature. Without this change, the Intel-branded Fibocom
   L850-GL modem won't be properly recognized as a WWAN device, causing
   NetworkManager/ModemManager to fail to handle it correctly. This
   directly impacts users with this hardware.

3. **Code impact**: The change is minimal - just 7 lines adding a new
   entry to the `cdc_devs[]` USB device table:
  ```c
  /* Intel modem (label from OEM reads Fibocom L850-GL) */
  { USB_DEVICE_AND_INTERFACE_INFO(0x8087, 0x095a,
  USB_CLASS_COMM,
  USB_CDC_SUBCLASS_NCM, USB_CDC_PROTO_NONE),
  .driver_info = (unsigned long)&wwan_info,
  },
  ```

4. **Risk assessment**:
   - **Extremely low risk** - The change only adds a new device ID entry
   - No existing functionality is modified
   - Uses the existing `wwan_info` driver configuration (FLAG_WWAN flag)
   - Follows the same pattern as other WWAN devices in the driver
   - Cannot cause regressions for other hardware

5. **User impact**: Users with this specific hardware (Intel OEM version
   with VID:PID 0x8087:0x095a) cannot use their modem properly without
   this fix. The modem won't be recognized as a WWAN device, preventing
   proper network management.

6. **Stable tree criteria compliance**:
   - ✓ Fixes a real bug (hardware not working properly)
   - ✓ Minimal change (7 lines)
   - ✓ No architectural changes
   - ✓ Self-contained to specific hardware
   - ✓ Clear and obvious correctness

7. **Historical context**: The git history shows numerous quirks and
   fixes for the Fibocom L850-GL modem variants, indicating this is
   well-known hardware that has required various fixes over time. This
   particular Intel OEM variant (0x8087:0x095a) was simply missing from
   the device table.

This is a textbook example of a stable-worthy commit: it enables
specific hardware that should already be working, with zero risk to
existing functionality.

 drivers/net/usb/cdc_ncm.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index ea0e5e276cd6..5d123df0a866 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -2087,6 +2087,13 @@ static const struct usb_device_id cdc_devs[] = {
 	  .driver_info = (unsigned long)&wwan_info,
 	},

+	/* Intel modem (label from OEM reads Fibocom L850-GL) */
+	{ USB_DEVICE_AND_INTERFACE_INFO(0x8087, 0x095a,
+		USB_CLASS_COMM,
+		USB_CDC_SUBCLASS_NCM, USB_CDC_PROTO_NONE),
+	  .driver_info = (unsigned long)&wwan_info,
+	},
+
 	/* DisplayLink docking stations */
 	{ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
 		| USB_DEVICE_ID_MATCH_VENDOR,
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16-6.6] LoongArch: Save LBT before FPU in setup_sigcontext()
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.1] cdc_ncm: Flag Intel OEM version of Fibocom L850-GL as WWAN Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] btrfs: clear block dirty if submit_one_sector() failed Sasha Levin
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Huacai Chen, Hanlu Li, Sasha Levin, chenhuacai, yangtiezhu

From: Huacai Chen <chenhuacai@loongson.cn>

[ Upstream commit 112ca94f6c3b3e0b2002a240de43c487a33e0234 ]

Now if preemption happens between protected_save_fpu_context() and
protected_save_lbt_context(), FTOP context is lost. Because FTOP is
saved by protected_save_lbt_context() but protected_save_fpu_context()
disables TM before that. So save LBT before FPU in setup_sigcontext()
to avoid this potential risk.

Signed-off-by: Hanlu Li <lihanlu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my analysis of this commit and the Linux kernel codebase, I can
now provide a comprehensive assessment.

**Backport Status: YES**

## Extensive Analysis

This commit fixes a **critical race condition** in the LoongArch signal
handling code that can lead to **data loss** of processor state during
signal context saving. Here's the detailed analysis:

### The Bug Being Fixed

1. **Context Loss Issue**: The commit addresses a race condition where
   the FTOP (x87 FPU stack pointer) context can be lost during signal
   handling. This happens when:
   - A preemption occurs between `protected_save_fpu_context()` and
     `protected_save_lbt_context()`
   - The FPU context save operation disables TM (likely Transaction
     Memory or a similar mechanism) before LBT context is saved
   - Since FTOP is part of the LBT (Loongson Binary Translation)
     extension context, it gets lost

2. **Binary Translation Context**: LBT is a hardware extension used to
   accelerate binary translation on LoongArch processors. According to
   the original LBT support commit (bd3c5798484a), it includes:
   - 4 scratch registers (scr0-scr3)
   - x86/ARM eflags register
   - x87 FPU stack pointer (FTOP)

### Code Changes Analysis

The fix is **minimal and surgical** - it simply reorders the save
operations:

**Before (buggy order):**
```c
// Save FPU contexts first (LASX/LSX/FPU)
if (extctx->lasx.addr)
    err |= protected_save_lasx_context(extctx);
else if (extctx->lsx.addr)
    err |= protected_save_lsx_context(extctx);
else if (extctx->fpu.addr)
    err |= protected_save_fpu_context(extctx);

// Save LBT context last - PROBLEM: FTOP may be lost by now
#ifdef CONFIG_CPU_HAS_LBT
if (extctx->lbt.addr)
    err |= protected_save_lbt_context(extctx);
#endif
```

**After (fixed order):**
```c
// Save LBT context FIRST to preserve FTOP
#ifdef CONFIG_CPU_HAS_LBT
if (extctx->lbt.addr)
    err |= protected_save_lbt_context(extctx);
#endif

// Then save FPU contexts (LASX/LSX/FPU)
if (extctx->lasx.addr)
    err |= protected_save_lasx_context(extctx);
else if (extctx->lsx.addr)
    err |= protected_save_lsx_context(extctx);
else if (extctx->fpu.addr)
    err |= protected_save_fpu_context(extctx);
```

### Why This Should Be Backported

1. **Data Corruption Risk**: This bug can cause loss of processor state
   during signal handling, which could lead to:
   - Incorrect program execution after signal return
   - Potential application crashes
   - Data corruption in applications using binary translation features

2. **Small, Contained Fix**: The change is:
   - Only 10 lines (5 insertions, 5 deletions)
   - Confined to a single function in signal handling
   - Simply reorders existing operations without adding new logic
   - Protected by `#ifdef CONFIG_CPU_HAS_LBT` so it only affects systems
     with LBT support

3. **No Architectural Changes**: This is purely a bug fix that:
   - Doesn't introduce new features
   - Doesn't change kernel APIs or ABIs
   - Doesn't modify core subsystem behavior
   - Only affects LoongArch architecture with LBT extension enabled

4. **Clear Bug with Clear Fix**: The problem is well-defined (race
   condition causing context loss) and the solution is straightforward
   (reorder operations to save LBT before FPU).

5. **Affects User-Space Reliability**: Signal handling is a fundamental
   mechanism used by many applications. A bug here can affect system
   stability and application reliability.

### Risk Assessment

The risk of regression is **very low** because:
- The change only affects code paths when LBT is enabled
  (`CONFIG_CPU_HAS_LBT`)
- It's a simple reordering of independent save operations
- The fix has been tested and merged into mainline
- It doesn't change the fundamental logic, just the execution order

This commit clearly meets the stable kernel criteria for backporting as
it fixes an important bug with minimal risk of introducing new issues.

 arch/loongarch/kernel/signal.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/loongarch/kernel/signal.c b/arch/loongarch/kernel/signal.c
index 4740cb5b2388..c9f7ca778364 100644
--- a/arch/loongarch/kernel/signal.c
+++ b/arch/loongarch/kernel/signal.c
@@ -677,6 +677,11 @@ static int setup_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 	for (i = 1; i < 32; i++)
 		err |= __put_user(regs->regs[i], &sc->sc_regs[i]);
 
+#ifdef CONFIG_CPU_HAS_LBT
+	if (extctx->lbt.addr)
+		err |= protected_save_lbt_context(extctx);
+#endif
+
 	if (extctx->lasx.addr)
 		err |= protected_save_lasx_context(extctx);
 	else if (extctx->lsx.addr)
@@ -684,11 +689,6 @@ static int setup_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 	else if (extctx->fpu.addr)
 		err |= protected_save_fpu_context(extctx);
 
-#ifdef CONFIG_CPU_HAS_LBT
-	if (extctx->lbt.addr)
-		err |= protected_save_lbt_context(extctx);
-#endif
-
 	/* Set the "end" magic */
 	info = (struct sctx_info *)extctx->end.addr;
 	err |= __put_user(0, &info->magic);
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16] btrfs: clear block dirty if submit_one_sector() failed
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.1] cdc_ncm: Flag Intel OEM version of Fibocom L850-GL as WWAN Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.6] LoongArch: Save LBT before FPU in setup_sigcontext() Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] platform/x86/amd: pmc: Drop SMU F/W match for Cezanne Sasha Levin
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Qu Wenruo, Filipe Manana, David Sterba, Sasha Levin, clm, josef,
	linux-btrfs

From: Qu Wenruo <wqu@suse.com>

[ Upstream commit 4bcd3061e8154606af7f721cb75ca04ffe191a12 ]

[BUG]
If submit_one_sector() failed, the block will be kept dirty, but with
their corresponding range finished in the ordered extent.

This means if a writeback happens later again, we can hit the following
problems:

- ASSERT(block_start != EXTENT_MAP_HOLE) in submit_one_sector()
  If the original extent map is a hole, then we can hit this case, as
  the new ordered extent failed, we will drop the new extent map and
  re-read one from the disk.

- DEBUG_WARN() in btrfs_writepage_cow_fixup()
  This is because we no longer have an ordered extent for those dirty
  blocks. The original for them is already finished with error.

[CAUSE]
The function submit_one_sector() is not following the regular error
handling of writeback.  The common practice is to clear the folio dirty,
start and finish the writeback for the block.

This is normally done by extent_clear_unlock_delalloc() with
PAGE_START_WRITEBACK | PAGE_END_WRITEBACK flags during
run_delalloc_range().

So if we keep those failed blocks dirty, they will stay in the page
cache and wait for the next writeback.

And since the original ordered extent is already finished and removed,
depending on the original extent map, we either hit the ASSERT() inside
submit_one_sector(), or hit the DEBUG_WARN() in
btrfs_writepage_cow_fixup().

[FIX]
Follow the regular error handling to clear the dirty flag for the block,
start and finish writeback for that block instead.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my analysis of the commit and the kernel repository context,
here's my assessment:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Bug Fix Analysis

1. **Critical Bug Being Fixed**: The commit fixes a serious error
   handling bug in btrfs writeback that can lead to two different
   assertion failures:
   - `ASSERT(block_start != EXTENT_MAP_HOLE)` in submit_one_sector()
   - `DEBUG_WARN()` in btrfs_writepage_cow_fixup()

2. **Data Integrity Issue**: The bug causes dirty blocks to remain dirty
   after a failed submission, but their corresponding ordered extent is
   already finished with error. This creates an inconsistent state
   where:
   - Dirty blocks exist without proper ordered extent tracking
   - Subsequent writeback attempts will fail with assertions/warnings
   - The filesystem enters an undefined state that could affect data
     integrity

3. **Clear Root Cause**: The commit message clearly identifies the
   problem - submit_one_sector() was not following standard writeback
   error handling practices. The fix aligns the error handling with the
   rest of the btrfs writeback code.

## Code Change Analysis

The fix is minimal and contained:
```c
if (IS_ERR(em)) {
+    /*
+     * When submission failed, we should still clear the folio dirty.
+     * Or the folio will be written back again but without any
+     * ordered extent.
+     */
+    btrfs_folio_clear_dirty(fs_info, folio, filepos, sectorsize);
+    btrfs_folio_set_writeback(fs_info, folio, filepos, sectorsize);
+    btrfs_folio_clear_writeback(fs_info, folio, filepos, sectorsize);
    return PTR_ERR(em);
}
```

The changes:
- Add proper error handling to clear dirty flag
- Set and clear writeback status to properly finish the failed writeback
- Update comments to clarify the behavior

## Stable Tree Criteria Met

1. **Fixes a real bug**: Yes - prevents assertion failures and potential
   filesystem corruption
2. **Small and contained**: Yes - only ~10 lines of actual code change
   in one function
3. **No new features**: Correct - purely bug fix
4. **Low regression risk**: The change follows established patterns used
   elsewhere in btrfs (extent_clear_unlock_delalloc)
5. **Important enough**: Yes - prevents filesystem errors and potential
   data integrity issues

The fix is straightforward, follows existing btrfs patterns, and
addresses a clear bug that could affect users running btrfs filesystems,
especially under I/O error conditions. This makes it an excellent
candidate for stable backporting.

 fs/btrfs/extent_io.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1dc931c4937f..2e127c109e5b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1483,7 +1483,7 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,

 /*
  * Return 0 if we have submitted or queued the sector for submission.
- * Return <0 for critical errors.
+ * Return <0 for critical errors, and the sector will have its dirty flag cleared.
  *
  * Caller should make sure filepos < i_size and handle filepos >= i_size case.
  */
@@ -1506,8 +1506,17 @@ static int submit_one_sector(struct btrfs_inode *inode,
 	ASSERT(filepos < i_size);

 	em = btrfs_get_extent(inode, NULL, filepos, sectorsize);
-	if (IS_ERR(em))
+	if (IS_ERR(em)) {
+		/*
+		 * When submission failed, we should still clear the folio dirty.
+		 * Or the folio will be written back again but without any
+		 * ordered extent.
+		 */
+		btrfs_folio_clear_dirty(fs_info, folio, filepos, sectorsize);
+		btrfs_folio_set_writeback(fs_info, folio, filepos, sectorsize);
+		btrfs_folio_clear_writeback(fs_info, folio, filepos, sectorsize);
 		return PTR_ERR(em);
+	}

 	extent_offset = filepos - em->start;
 	em_end = btrfs_extent_map_end(em);
@@ -1637,8 +1646,8 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
 	 * Here we set writeback and clear for the range. If the full folio
 	 * is no longer dirty then we clear the PAGECACHE_TAG_DIRTY tag.
 	 *
-	 * If we hit any error, the corresponding sector will still be dirty
-	 * thus no need to clear PAGECACHE_TAG_DIRTY.
+	 * If we hit any error, the corresponding sector will have its dirty
+	 * flag cleared and writeback finished, thus no need to handle the error case.
 	 */
 	if (!submitted_io && !error) {
 		btrfs_folio_set_writeback(fs_info, folio, start, len);
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16] platform/x86/amd: pmc: Drop SMU F/W match for Cezanne
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
                   ` (2 preceding siblings ...)
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] btrfs: clear block dirty if submit_one_sector() failed Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] LoongArch: Add cpuhotplug hooks to fix high cpu usage of vCPU threads Sasha Levin
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Mario Limonciello, Chris Bainbridge, Ilpo Järvinen,
	Sasha Levin, Shyam-sundar.S-k, platform-driver-x86

From: Mario Limonciello <mario.limonciello@amd.com>

[ Upstream commit 5b9e07551faa7bb2f26cb039cc6e8d00bc4d0831 ]

Chris reported that even on a BIOS that has a new enough SMU F/W
version there is still a spurious IRQ1.  Although the solution was
added to SMU F/W 64.66.0 it turns out there needs to be a matching
SBIOS change to activate it.  Thus Linux shouldn't be avoiding the
IRQ1 workaround on newer SMU F/W because there is no indication the
BIOS change is in place.

Drop the match for 64.66.0+ and instead match all RN/CZN/BRC (they
all share same SMU F/W). Adjust the quirk infrastructure to allow
quirking the workaround on or off and also adjust existing quirks
to match properly.

Unfortunately this may cause some systems that did have the SBIOS
change in place to regress in keyboard wakeup but we don't have a
way to know.  If a user reports a keyboard wakeup regression they can
run with amd_pmc.disable_workarounds=1 to deactivate the workaround
and share DMI data so that their system can be quirked not to use
the workaround in the upstream kernel.

Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4449
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250724185156.1827592-1-superm1@kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have a better understanding. Let me analyze the commit to
determine if it should be backported:

**Backport Status: YES**

## Extensive Explanation:

This commit should be backported to stable kernel trees for the
following reasons:

### 1. **Fixes a Real User-Impacting Bug**
The commit addresses a spurious IRQ1 (keyboard interrupt) issue that
causes unintended system wakeups from suspend. The commit message
references a specific bug report
(https://gitlab.freedesktop.org/drm/amd/-/issues/4449) where users are
experiencing this problem. This directly impacts user experience with
system suspend/resume functionality.

### 2. **Regression Fix for Previous Workaround**
The code shows this is fixing an incomplete workaround that was
previously implemented. The original workaround avoided applying the
IRQ1 disable on SMU firmware version 64.66.0+, assuming the firmware fix
was sufficient. However, the commit message reveals that:
- The SMU firmware fix requires a matching SBIOS change to be activated
- Linux has no way to detect if the SBIOS change is present
- This means systems with newer SMU firmware but without the SBIOS
  change still experience the spurious IRQ1 issue

### 3. **Limited Scope and Low Risk**
The changes are confined to the AMD PMC driver quirks handling:
- Removes the SMU firmware version check from `amd_pmc_wa_irq1()`
  function
- Adjusts the quirk infrastructure to allow both s2idle bug and spurious
  8042 fixes
- Updates DMI matches to use the combined quirk where appropriate
- The changes are self-contained within the platform-specific driver

### 4. **Hardware-Specific Fix**
The fix targets specific AMD CPU models (Renoir/Cezanne/Barcelo -
RN/CZN/BRC) that share the same SMU firmware. This hardware-specific
nature means:
- It won't affect other platforms
- The risk is limited to AMD systems that already have the issue
- The workaround provides a module parameter
  (`amd_pmc.disable_workarounds=1`) for users who might experience
  regressions

### 5. **Addresses Known Hardware/Firmware Limitation**
The commit acknowledges a hardware/firmware limitation where:
- A fix exists in SMU firmware 64.66.0+
- But it requires SBIOS activation that Linux cannot detect
- This is a defensive approach to ensure all affected systems get the
  workaround

### 6. **Provides User Control**
The commit message mentions that users who experience keyboard wakeup
regression can use `amd_pmc.disable_workarounds=1` to disable the
workaround and provide DMI data for future quirking. This gives users an
escape hatch if needed.

### 7. **Follows Stable Kernel Criteria**
This commit meets the stable kernel backport criteria:
- **Fixes a real bug**: Spurious IRQ1 wakeups affecting suspend/resume
- **Already tested**: Has a "Tested-by" tag from the bug reporter
- **Small and contained**: Changes are limited to the AMD PMC driver
- **No new features**: Only adjusts existing workaround logic
- **Clear impact**: Users experience unwanted system wakeups

### Code Analysis Details:
The key change in `drivers/platform/x86/amd/pmc/pmc.c` removes the SMU
version check:
```c
- /* cezanne platform firmware has a fix in 64.66.0 */
- if (pdev->cpu_id == AMD_CPU_ID_CZN) {
- if (!pdev->major) {
- rc = amd_pmc_get_smu_version(pdev);
- if (rc)
- return rc;
- }
- if (pdev->major > 64 || (pdev->major == 64 && pdev->minor > 65))
- return 0;
- }
```

This ensures the workaround is always applied for affected CPUs,
regardless of SMU firmware version.

The quirks restructuring in `pmc-quirks.c` creates a combined quirk
(`quirk_s2idle_spurious_8042`) that applies both fixes where needed,
showing careful consideration of the various affected systems.

 drivers/platform/x86/amd/pmc/pmc-quirks.c | 54 ++++++++++++++---------
 drivers/platform/x86/amd/pmc/pmc.c        | 13 ------
 2 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/drivers/platform/x86/amd/pmc/pmc-quirks.c b/drivers/platform/x86/amd/pmc/pmc-quirks.c
index ded4c84f5ed1..7ffc659b2794 100644
--- a/drivers/platform/x86/amd/pmc/pmc-quirks.c
+++ b/drivers/platform/x86/amd/pmc/pmc-quirks.c
@@ -28,10 +28,15 @@ static struct quirk_entry quirk_spurious_8042 = {
 	.spurious_8042 = true,
 };
 
+static struct quirk_entry quirk_s2idle_spurious_8042 = {
+	.s2idle_bug_mmio = FCH_PM_BASE + FCH_PM_SCRATCH,
+	.spurious_8042 = true,
+};
+
 static const struct dmi_system_id fwbug_list[] = {
 	{
 		.ident = "L14 Gen2 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20X5"),
@@ -39,7 +44,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "T14s Gen2 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20XF"),
@@ -47,7 +52,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "X13 Gen2 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20XH"),
@@ -55,7 +60,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "T14 Gen2 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20XK"),
@@ -63,7 +68,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "T14 Gen1 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20UD"),
@@ -71,7 +76,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "T14 Gen1 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20UE"),
@@ -79,7 +84,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "T14s Gen1 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20UH"),
@@ -87,7 +92,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "T14s Gen1 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20UJ"),
@@ -95,7 +100,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "P14s Gen1 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "20Y1"),
@@ -103,7 +108,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "P14s Gen2 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "21A0"),
@@ -111,7 +116,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "P14s Gen2 AMD",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "21A1"),
@@ -152,7 +157,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "IdeaPad 1 14AMN7",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "82VF"),
@@ -160,7 +165,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "IdeaPad 1 15AMN7",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "82VG"),
@@ -168,7 +173,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "IdeaPad 1 15AMN7",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "82X5"),
@@ -176,7 +181,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "IdeaPad Slim 3 14AMN8",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "82XN"),
@@ -184,7 +189,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	},
 	{
 		.ident = "IdeaPad Slim 3 15AMN8",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "82XQ"),
@@ -193,7 +198,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	/* https://gitlab.freedesktop.org/drm/amd/-/issues/4434 */
 	{
 		.ident = "Lenovo Yoga 6 13ALC6",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "82ND"),
@@ -202,7 +207,7 @@ static const struct dmi_system_id fwbug_list[] = {
 	/* https://gitlab.freedesktop.org/drm/amd/-/issues/2684 */
 	{
 		.ident = "HP Laptop 15s-eq2xxx",
-		.driver_data = &quirk_s2idle_bug,
+		.driver_data = &quirk_s2idle_spurious_8042,
 		.matches = {
 			DMI_MATCH(DMI_SYS_VENDOR, "HP"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "HP Laptop 15s-eq2xxx"),
@@ -285,6 +290,16 @@ void amd_pmc_quirks_init(struct amd_pmc_dev *dev)
 {
 	const struct dmi_system_id *dmi_id;
 
+	/*
+	 * IRQ1 may cause an interrupt during resume even without a keyboard
+	 * press.
+	 *
+	 * Affects Renoir, Cezanne and Barcelo SoCs
+	 *
+	 * A solution is available in PMFW 64.66.0, but it must be activated by
+	 * SBIOS. If SBIOS is known to have the fix a quirk can be added for
+	 * a given system to avoid workaround.
+	 */
 	if (dev->cpu_id == AMD_CPU_ID_CZN)
 		dev->disable_8042_wakeup = true;
 
@@ -295,6 +310,5 @@ void amd_pmc_quirks_init(struct amd_pmc_dev *dev)
 	if (dev->quirks->s2idle_bug_mmio)
 		pr_info("Using s2idle quirk to avoid %s platform firmware bug\n",
 			dmi_id->ident);
-	if (dev->quirks->spurious_8042)
-		dev->disable_8042_wakeup = true;
+	dev->disable_8042_wakeup = dev->quirks->spurious_8042;
 }
diff --git a/drivers/platform/x86/amd/pmc/pmc.c b/drivers/platform/x86/amd/pmc/pmc.c
index 0b9b23eb7c2c..bd318fd02ccf 100644
--- a/drivers/platform/x86/amd/pmc/pmc.c
+++ b/drivers/platform/x86/amd/pmc/pmc.c
@@ -530,19 +530,6 @@ static int amd_pmc_get_os_hint(struct amd_pmc_dev *dev)
 static int amd_pmc_wa_irq1(struct amd_pmc_dev *pdev)
 {
 	struct device *d;
-	int rc;
-
-	/* cezanne platform firmware has a fix in 64.66.0 */
-	if (pdev->cpu_id == AMD_CPU_ID_CZN) {
-		if (!pdev->major) {
-			rc = amd_pmc_get_smu_version(pdev);
-			if (rc)
-				return rc;
-		}
-
-		if (pdev->major > 64 || (pdev->major == 64 && pdev->minor > 65))
-			return 0;
-	}
 
 	d = bus_find_device_by_name(&serio_bus, NULL, "serio0");
 	if (!d)
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16] LoongArch: Add cpuhotplug hooks to fix high cpu usage of vCPU threads
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
                   ` (3 preceding siblings ...)
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] platform/x86/amd: pmc: Drop SMU F/W match for Cezanne Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.12] btrfs: zoned: skip ZONE FINISH of conventional zones Sasha Levin
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Xianglai Li, Huacai Chen, Sasha Levin, tglx, peterz, chenhuacai,
	jiaxun.yang, kees, linux-kernel

From: Xianglai Li <lixianglai@loongson.cn>

[ Upstream commit 8ef7f3132e4005a103b382e71abea7ad01fbeb86 ]

When the CPU is offline, the timer of LoongArch is not correctly closed.
This is harmless for real machines, but resulting in an excessively high
cpu usage rate of the offline vCPU thread in the virtual machines.

To correctly close the timer, we have made the following modifications:

Register the cpu hotplug event (CPUHP_AP_LOONGARCH_ARCH_TIMER_STARTING)
for LoongArch. This event's hooks will be called to close the timer when
the CPU is offline.

Clear the timer interrupt when the timer is turned off. Since before the
timer is turned off, there may be a timer interrupt that has already been
in the pending state due to the interruption of the disabled, which also
affects the halt state of the offline vCPU.

Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my analysis of the commit and its context, here is my
assessment:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Bug Fix for Real User Impact

1. **Fixes a concrete bug affecting virtual machines**: The commit
   addresses a problem where offline vCPUs consume excessive CPU
   resources in virtual machines. This is a real performance bug that
   affects users running LoongArch VMs.

2. **Clear problem statement**: The commit message clearly describes the
   issue - when a CPU is offlined, the LoongArch timer is not properly
   disabled, leading to high CPU usage by offline vCPU threads in
   virtual environments.

## Small and Contained Change

The commit is relatively small and well-contained:
- Adds ~20 lines of code for timer management hooks
- Registers CPU hotplug callbacks using existing infrastructure
  (CPUHP_AP_LOONGARCH_ARCH_TIMER_STARTING)
- The changes are isolated to the LoongArch timer subsystem

## Follows Established Patterns

1. **Uses standard kernel infrastructure**: The fix properly uses the
   cpuhotplug framework that other architectures already use (ARM, MIPS,
   RISCV all have similar CPUHP_AP_*_TIMER_STARTING entries).

2. **Similar to previous fixes**: Commit 355170a7ecac ("LoongArch:
   Implement constant timer shutdown interface") addressed a related
   issue with timer shutdown, and this commit completes the proper timer
   management during CPU hotplug.

## Minimal Risk of Regression

1. **Architecture-specific**: Changes are confined to LoongArch
   architecture code, with no impact on other architectures.

2. **Clear timer interrupt handling**: The fix properly clears pending
   timer interrupts when disabling the timer, preventing interrupt
   storms.

3. **Protected by proper locking**: Uses existing state_lock for
   synchronization.

## Virtual Machine Support is Important

With increasing use of virtualization, proper vCPU management is
critical for production environments. High CPU usage by offline vCPUs
can significantly impact VM performance and host resource utilization.

## Technical Correctness

The implementation correctly:
- Enables timer interrupts on CPU startup (`set_csr_ecfg(ECFGF_TIMER)`)
- Shuts down the timer on CPU dying (`constant_set_state_shutdown()`)
- Clears pending timer interrupts
  (`write_csr_tintclear(CSR_TINTCLR_TI)`)

This is a straightforward bug fix that addresses a clear performance
issue in virtual machine environments without introducing new features
or architectural changes, making it an ideal candidate for stable
backport.

 arch/loongarch/kernel/time.c | 22 ++++++++++++++++++++++
 include/linux/cpuhotplug.h   |  1 +
 2 files changed, 23 insertions(+)

diff --git a/arch/loongarch/kernel/time.c b/arch/loongarch/kernel/time.c
index 367906b10f81..f3092f2de8b5 100644
--- a/arch/loongarch/kernel/time.c
+++ b/arch/loongarch/kernel/time.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
  */
 #include <linux/clockchips.h>
+#include <linux/cpuhotplug.h>
 #include <linux/delay.h>
 #include <linux/export.h>
 #include <linux/init.h>
@@ -102,6 +103,23 @@ static int constant_timer_next_event(unsigned long delta, struct clock_event_dev
 	return 0;
 }

+static int arch_timer_starting(unsigned int cpu)
+{
+	set_csr_ecfg(ECFGF_TIMER);
+
+	return 0;
+}
+
+static int arch_timer_dying(unsigned int cpu)
+{
+	constant_set_state_shutdown(this_cpu_ptr(&constant_clockevent_device));
+
+	/* Clear Timer Interrupt */
+	write_csr_tintclear(CSR_TINTCLR_TI);
+
+	return 0;
+}
+
 static unsigned long get_loops_per_jiffy(void)
 {
 	unsigned long lpj = (unsigned long)const_clock_freq;
@@ -172,6 +190,10 @@ int constant_clockevent_init(void)
 	lpj_fine = get_loops_per_jiffy();
 	pr_info("Constant clock event device register\n");

+	cpuhp_setup_state(CPUHP_AP_LOONGARCH_ARCH_TIMER_STARTING,
+			  "clockevents/loongarch/timer:starting",
+			  arch_timer_starting, arch_timer_dying);
+
 	return 0;
 }

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index df366ee15456..e62064cb9e08 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -169,6 +169,7 @@ enum cpuhp_state {
 	CPUHP_AP_QCOM_TIMER_STARTING,
 	CPUHP_AP_TEGRA_TIMER_STARTING,
 	CPUHP_AP_ARMADA_TIMER_STARTING,
+	CPUHP_AP_LOONGARCH_ARCH_TIMER_STARTING,
 	CPUHP_AP_MIPS_GIC_TIMER_STARTING,
 	CPUHP_AP_ARC_TIMER_STARTING,
 	CPUHP_AP_REALTEK_TIMER_STARTING,
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16-6.12] btrfs: zoned: skip ZONE FINISH of conventional zones
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
                   ` (4 preceding siblings ...)
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] LoongArch: Add cpuhotplug hooks to fix high cpu usage of vCPU threads Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-5.10] drm/amd/display: Don't warn when missing DCE encoder caps Sasha Levin
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Johannes Thumshirn, Naohiro Aota, Anand Jain, David Sterba,
	Sasha Levin, clm, josef, linux-btrfs

From: Johannes Thumshirn <johannes.thumshirn@wdc.com>

[ Upstream commit f0ba0e7172a222ea6043b61ecd86723c46d7bcf2 ]

Don't call ZONE FINISH for conventional zones as this will result in I/O
errors. Instead check if the zone that needs finishing is a conventional
zone and if yes skip it.

Also factor out the actual handling of finishing a single zone into a
helper function, as do_zone_finish() is growing ever bigger and the
indentations levels are getting higher.

Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Bug Fix Analysis

1. **Critical Bug Fix**: The commit fixes a concrete bug where calling
   `REQ_OP_ZONE_FINISH` on conventional zones results in I/O errors.
   This is a real runtime error that affects users of btrfs on zoned
   storage devices that have conventional zones mixed with sequential
   zones.

2. **Small and Contained Change**: The fix is relatively small and well-
   contained:
   - Adds a new helper function `call_zone_finish()` that encapsulates
     the zone finishing logic
   - Most importantly, adds the critical check: `if
     (btrfs_dev_is_sequential(device, physical))` before issuing the
     `REQ_OP_ZONE_FINISH` operation
   - The refactoring merely moves existing code into the helper function
     without changing the logic

3. **Clear Root Cause**: The bug occurs because the original code
   unconditionally calls `blkdev_zone_mgmt()` with `REQ_OP_ZONE_FINISH`
   on all zones, but this operation is invalid for conventional zones
   (non-sequential zones). The fix properly checks if a zone is
   sequential before attempting to finish it.

4. **No New Features**: This commit doesn't introduce any new
   functionality - it's purely a bug fix that prevents I/O errors.

5. **Minimal Risk**: The change has minimal risk of regression:
   - The check for sequential zones is straightforward
   - The refactoring doesn't change the existing logic flow
   - The fix has been reviewed by multiple developers familiar with the
     zoned code

6. **Affects Real Users**: This bug affects users running btrfs on SMR
   (Shingled Magnetic Recording) drives or ZNS (Zoned Namespace) SSDs
   that have a mix of conventional and sequential zones, which is a
   common configuration.

## Code Analysis

The key fix in `call_zone_finish()` at line 2262:
```c
if (btrfs_dev_is_sequential(device, physical)) {
    // Only call zone finish for sequential zones
    ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH, ...);
}
```

This prevents the invalid operation on conventional zones while
maintaining the correct behavior for sequential zones. The subsequent
operations (updating reserved_active_zones and clearing active zone) are
still performed regardless of zone type, which is the correct behavior.

The commit follows stable kernel rules perfectly: it's a clear bug fix,
has minimal changes, doesn't introduce new features, and addresses a
real user-facing issue that causes I/O errors.

 fs/btrfs/zoned.c | 55 ++++++++++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 5439d8374716..950e72dc537c 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2246,6 +2246,40 @@ static void wait_eb_writebacks(struct btrfs_block_group *block_group)
 	rcu_read_unlock();
 }
 
+static int call_zone_finish(struct btrfs_block_group *block_group,
+			    struct btrfs_io_stripe *stripe)
+{
+	struct btrfs_device *device = stripe->dev;
+	const u64 physical = stripe->physical;
+	struct btrfs_zoned_device_info *zinfo = device->zone_info;
+	int ret;
+
+	if (!device->bdev)
+		return 0;
+
+	if (zinfo->max_active_zones == 0)
+		return 0;
+
+	if (btrfs_dev_is_sequential(device, physical)) {
+		unsigned int nofs_flags;
+
+		nofs_flags = memalloc_nofs_save();
+		ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
+				       physical >> SECTOR_SHIFT,
+				       zinfo->zone_size >> SECTOR_SHIFT);
+		memalloc_nofs_restore(nofs_flags);
+
+		if (ret)
+			return ret;
+	}
+
+	if (!(block_group->flags & BTRFS_BLOCK_GROUP_DATA))
+		zinfo->reserved_active_zones++;
+	btrfs_dev_clear_active_zone(device, physical);
+
+	return 0;
+}
+
 static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_written)
 {
 	struct btrfs_fs_info *fs_info = block_group->fs_info;
@@ -2330,31 +2364,12 @@ static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_writ
 	down_read(&dev_replace->rwsem);
 	map = block_group->physical_map;
 	for (i = 0; i < map->num_stripes; i++) {
-		struct btrfs_device *device = map->stripes[i].dev;
-		const u64 physical = map->stripes[i].physical;
-		struct btrfs_zoned_device_info *zinfo = device->zone_info;
-		unsigned int nofs_flags;
-
-		if (!device->bdev)
-			continue;
-
-		if (zinfo->max_active_zones == 0)
-			continue;
-
-		nofs_flags = memalloc_nofs_save();
-		ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
-				       physical >> SECTOR_SHIFT,
-				       zinfo->zone_size >> SECTOR_SHIFT);
-		memalloc_nofs_restore(nofs_flags);
 
+		ret = call_zone_finish(block_group, &map->stripes[i]);
 		if (ret) {
 			up_read(&dev_replace->rwsem);
 			return ret;
 		}
-
-		if (!(block_group->flags & BTRFS_BLOCK_GROUP_DATA))
-			zinfo->reserved_active_zones++;
-		btrfs_dev_clear_active_zone(device, physical);
 	}
 	up_read(&dev_replace->rwsem);
 
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16-5.10] drm/amd/display: Don't warn when missing DCE encoder caps
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
                   ` (5 preceding siblings ...)
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.12] btrfs: zoned: skip ZONE FINISH of conventional zones Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.1] Bluetooth: hci_sync: Avoid adding default advertising on startup Sasha Levin
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Timur Kristóf, Alex Deucher, Rodrigo Siqueira, Alex Hung,
	Sasha Levin

From: Timur Kristóf <timur.kristof@gmail.com>

[ Upstream commit 8246147f1fbaed522b8bcc02ca34e4260747dcfb ]

On some GPUs the VBIOS just doesn't have encoder caps,
or maybe not for every encoder.

This isn't really a problem and it's handled well,
so let's not litter the logs with it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 33e0227ee96e62d034781e91f215e32fd0b1d512)
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my analysis of this commit, here is my determination:

**Backport Status: YES**

## Extensive Explanation:

This commit is a good candidate for backporting to stable kernel trees
for the following reasons:

### 1. **It fixes a user-visible issue (log spam)**
The commit addresses warning messages that unnecessarily clutter system
logs. The code shows that when `get_encoder_cap_info()` returns
`BP_RESULT_NORECORD` (meaning the VBIOS doesn't have encoder
capabilities for that specific encoder), it was incorrectly logging a
warning. Looking at the code pattern across the AMD display driver,
`BP_RESULT_NORECORD` is a normal, expected condition - not an error.

### 2. **The fix is minimal and contained**
The change is extremely simple - it only modifies the condition check
from:
```c
if (BP_RESULT_OK == result) {
    // handle success
} else {
    DC_LOG_WARNING(...); // Always warn on non-OK
}
```
to:
```c
if (result == BP_RESULT_OK) {
    // handle success
} else if (result != BP_RESULT_NORECORD) {
    DC_LOG_WARNING(...); // Only warn on actual errors
}
```

### 3. **No functional changes or new features**
The commit doesn't change any behavior - it only suppresses
inappropriate warning messages. The driver already handles the missing
encoder caps gracefully (as noted in the commit message: "This isn't
really a problem and it's handled well").

### 4. **Low risk of regression**
The change only affects logging behavior. It doesn't modify:
- Any hardware initialization sequences
- Any encoder capabilities detection logic
- Any functional paths in the driver
- Any data structures or APIs

### 5. **Pattern consistency across the codebase**
My grep analysis shows that `BP_RESULT_NORECORD` is commonly handled
without warnings in other parts of the AMD display driver. For example,
in `dc_link.c`, it uses `ASSERT(bp_result == BP_RESULT_NORECORD)`
showing it's an expected condition. Many other functions simply return
`BP_RESULT_NORECORD` without logging warnings.

### 6. **Applied to two identical code paths**
The commit applies the same fix to both
`dce110_link_encoder_construct()` and `dce60_link_encoder_construct()`
functions, maintaining consistency and preventing the warning in both
DCE 11.0 and DCE 6.0 hardware paths.

### 7. **Already cherry-picked**
The commit message shows "(cherry picked from commit 33e0227ee96e...)",
indicating it was already deemed important enough to cherry-pick to
another branch, suggesting its stability importance.

### 8. **Quality of Life improvement for users**
Reducing unnecessary log spam improves the user experience and makes it
easier to identify real problems in system logs. This is especially
important for users monitoring their systems for actual issues.

The commit follows stable tree rules perfectly: it's a small, contained
fix for a real issue (log spam) that affects users, with minimal risk of
introducing regressions.

 drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
index 4a9d07c31bc5..0c50fe266c8a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
@@ -896,13 +896,13 @@ void dce110_link_encoder_construct(
 						enc110->base.id, &bp_cap_info);

 	/* Override features with DCE-specific values */
-	if (BP_RESULT_OK == result) {
+	if (result == BP_RESULT_OK) {
 		enc110->base.features.flags.bits.IS_HBR2_CAPABLE =
 				bp_cap_info.DP_HBR2_EN;
 		enc110->base.features.flags.bits.IS_HBR3_CAPABLE =
 				bp_cap_info.DP_HBR3_EN;
 		enc110->base.features.flags.bits.HDMI_6GB_EN = bp_cap_info.HDMI_6GB_EN;
-	} else {
+	} else if (result != BP_RESULT_NORECORD) {
 		DC_LOG_WARNING("%s: Failed to get encoder_cap_info from VBIOS with error code %d!\n",
 				__func__,
 				result);
@@ -1798,13 +1798,13 @@ void dce60_link_encoder_construct(
 						enc110->base.id, &bp_cap_info);

 	/* Override features with DCE-specific values */
-	if (BP_RESULT_OK == result) {
+	if (result == BP_RESULT_OK) {
 		enc110->base.features.flags.bits.IS_HBR2_CAPABLE =
 				bp_cap_info.DP_HBR2_EN;
 		enc110->base.features.flags.bits.IS_HBR3_CAPABLE =
 				bp_cap_info.DP_HBR3_EN;
 		enc110->base.features.flags.bits.HDMI_6GB_EN = bp_cap_info.HDMI_6GB_EN;
-	} else {
+	} else if (result != BP_RESULT_NORECORD) {
 		DC_LOG_WARNING("%s: Failed to get encoder_cap_info from VBIOS with error code %d!\n",
 				__func__,
 				result);
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16-6.1] Bluetooth: hci_sync: Avoid adding default advertising on startup
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
                   ` (6 preceding siblings ...)
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-5.10] drm/amd/display: Don't warn when missing DCE encoder caps Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.6] cpupower: Fix a bug where the -t option of the set subcommand was not working Sasha Levin
  2025-08-25 12:15 ` [PATCH AUTOSEL 6.16-6.12] drm/rockchip: vop2: make vp registers nonvolatile Sasha Levin
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable
  Cc: Yang Li, Luiz Augusto von Dentz, Sasha Levin, marcel,
	johan.hedberg, luiz.dentz, linux-bluetooth

From: Yang Li <yang.li@amlogic.com>

[ Upstream commit de5d7d3f27ddd4046736f558a40e252ddda82013 ]

list_empty(&hdev->adv_instances) is always true during startup,
so an advertising instance is added by default.

Call trace:
  dump_backtrace+0x94/0xec
  show_stack+0x18/0x24
  dump_stack_lvl+0x48/0x60
  dump_stack+0x18/0x24
  hci_setup_ext_adv_instance_sync+0x17c/0x328
  hci_powered_update_adv_sync+0xb4/0x12c
  hci_powered_update_sync+0x54/0x70
  hci_power_on_sync+0xe4/0x278
  hci_set_powered_sync+0x28/0x34
  set_powered_sync+0x40/0x58
  hci_cmd_sync_work+0x94/0x100
  process_one_work+0x168/0x444
  worker_thread+0x378/0x3f4
  kthread+0x108/0x10c
  ret_from_fork+0x10/0x20

Link: https://github.com/bluez/bluez/issues/1442
Signed-off-by: Yang Li <yang.li@amlogic.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees. Here's my
extensive analysis:

## Bug Analysis

The commit fixes a **logic error** in `hci_powered_update_adv_sync()`
that causes unnecessary default advertising instance creation during
Bluetooth startup. The problematic code at line 3290 uses an OR
condition:

```c
if (hci_dev_test_flag(hdev, HCI_ADVERTISING) ||
    list_empty(&hdev->adv_instances))
```

This condition is **always true** during startup because
`list_empty(&hdev->adv_instances)` returns true when the advertising
instances list is initially empty. This causes the code block to execute
unconditionally, creating a default advertising instance even when not
needed.

## The Fix

The patch changes the OR (`||`) to AND (`&&`):

```c
if (hci_dev_test_flag(hdev, HCI_ADVERTISING) &&
    list_empty(&hdev->adv_instances))
```

This ensures the default advertising setup only occurs when:
1. The HCI_ADVERTISING flag is explicitly set, AND
2. There are no advertising instances configured

## Why This Should Be Backported

1. **Clear Bug Fix**: This is a straightforward logic error that causes
   incorrect behavior during Bluetooth initialization. The stack trace
   in the commit message shows this happens during normal startup flow
   (`hci_power_on_sync` → `hci_powered_update_sync` →
   `hci_powered_update_adv_sync`).

2. **Small and Contained**: The fix is a single character change (|| to
   &&) that only affects the conditional logic. No architectural changes
   or new features are introduced.

3. **Prevents Resource Waste**: The bug causes unnecessary advertising
   instance creation on every Bluetooth startup, which wastes system
   resources and may interfere with user-configured advertising
   settings.

4. **Low Risk**: The change is minimal and only affects the specific
   condition for creating default advertising. The same pattern
   (checking both flags with AND) is already used in other parts of the
   codebase (e.g., `reenable_adv_sync()` function).

5. **User-Visible Impact**: The issue has an associated BlueZ bug report
   (#1442), indicating real users are affected by this problem.

6. **Long-Standing Issue**: The problematic code was introduced in
   commit cf75ad8b41d2a (October 2021), meaning this bug has been
   affecting users for an extended period across multiple kernel
   versions.

The fix follows stable kernel rules perfectly: it's a important bugfix
with minimal code change and very low regression risk, making it an
ideal candidate for stable backporting.

 net/bluetooth/hci_sync.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
index 7938c004071c..795952d5f921 100644
--- a/net/bluetooth/hci_sync.c
+++ b/net/bluetooth/hci_sync.c
@@ -3344,7 +3344,7 @@ static int hci_powered_update_adv_sync(struct hci_dev *hdev)
 	 * advertising data. This also applies to the case
 	 * where BR/EDR was toggled during the AUTO_OFF phase.
 	 */
-	if (hci_dev_test_flag(hdev, HCI_ADVERTISING) ||
+	if (hci_dev_test_flag(hdev, HCI_ADVERTISING) &&
 	    list_empty(&hdev->adv_instances)) {
 		if (ext_adv_capable(hdev)) {
 			err = hci_setup_ext_adv_instance_sync(hdev, 0x00);
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16-6.6] cpupower: Fix a bug where the -t option of the set subcommand was not working.
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
                   ` (7 preceding siblings ...)
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.1] Bluetooth: hci_sync: Avoid adding default advertising on startup Sasha Levin
@ 2025-08-25 12:14 ` Sasha Levin
  2025-08-25 12:15 ` [PATCH AUTOSEL 6.16-6.12] drm/rockchip: vop2: make vp registers nonvolatile Sasha Levin
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:14 UTC (permalink / raw)
  To: patches, stable; +Cc: Shinji Nomoto, Shuah Khan, Sasha Levin

From: Shinji Nomoto <fj5851bi@fujitsu.com>

[ Upstream commit b3eaf14f4c63fd6abc7b68c6d7a07c5680a6d8e5 ]

The set subcommand's -t option is documented as being available for boost
configuration, but it was not actually functioning due to a bug
in the option handling.

Link: https://lore.kernel.org/r/20250522061122.2149188-2-fj5851bi@fujitsu.com
Signed-off-by: Shinji Nomoto <fj5851bi@fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit is a clear bug fix that should be backported to stable
kernel trees. Here's my detailed analysis:

## Bug Analysis

The commit fixes a critical bug where the `-t` option for the `cpupower
set` subcommand was documented and accepted as a valid option (defined
in `set_opts[]` structure), but was completely non-functional due to a
missing character in the `getopt_long()` call.

### Specific Code Changes

The bug is in line 58-59 of the original code:
```c
while ((ret = getopt_long(argc, argv, "b:e:m:",
                          set_opts, NULL)) != -1) {
```

The fix adds the missing `t:` to the argument string:
```c
while ((ret = getopt_long(argc, argv, "b:e:m:t:",
                          set_opts, NULL)) != -1) {
```

### Root Cause

When the turbo-boost feature was added in commit `eb426fc6bdd6` (merged
in v6.6-rc1), the developer:
1. Added the option to the `set_opts[]` array correctly
2. Added the case handler for 't' in the switch statement
3. **BUT forgot to add "t:" to the getopt_long argument string**

This means that while the option was defined and had handling code,
`getopt_long()` would never recognize `-t` as a valid option, making the
feature completely unusable.

## Why This Should Be Backported

1. **Clear Bug Fix**: This is an obvious bug where documented
   functionality doesn't work at all. Users trying to use `cpupower set
   -t` or `--turbo-boost` would get errors.

2. **Small and Contained**: The fix is a single-line change adding 2
   characters ("t:") to a string. This is as minimal as fixes get.

3. **No Side Effects**: The change only enables already-implemented
   functionality. It doesn't introduce new code paths or change existing
   behavior.

4. **User Impact**: The turbo-boost control feature is important for
   power management, and users on stable kernels with v6.6+ would expect
   this documented feature to work.

5. **Affects Stable Versions**: The bug was introduced in v6.6-rc1 and
   affects all kernels from v6.6 onwards that include the turbo-boost
   feature.

6. **Low Risk**: There's virtually no regression risk - the worst case
   is the option continues not working, which is the current state.

This is exactly the type of fix that stable kernel rules recommend: a
clear bug fix that restores documented functionality with minimal code
change and no architectural modifications.

 tools/power/cpupower/utils/cpupower-set.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/power/cpupower/utils/cpupower-set.c b/tools/power/cpupower/utils/cpupower-set.c
index 0677b58374ab..59ace394cf3e 100644
--- a/tools/power/cpupower/utils/cpupower-set.c
+++ b/tools/power/cpupower/utils/cpupower-set.c
@@ -62,8 +62,8 @@ int cmd_set(int argc, char **argv)

 	params.params = 0;
 	/* parameter parsing */
-	while ((ret = getopt_long(argc, argv, "b:e:m:",
-						set_opts, NULL)) != -1) {
+	while ((ret = getopt_long(argc, argv, "b:e:m:t:",
+				  set_opts, NULL)) != -1) {
 		switch (ret) {
 		case 'b':
 			if (params.perf_bias)
-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH AUTOSEL 6.16-6.12] drm/rockchip: vop2: make vp registers nonvolatile
  2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
                   ` (8 preceding siblings ...)
  2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.6] cpupower: Fix a bug where the -t option of the set subcommand was not working Sasha Levin
@ 2025-08-25 12:15 ` Sasha Levin
  9 siblings, 0 replies; 11+ messages in thread
From: Sasha Levin @ 2025-08-25 12:15 UTC (permalink / raw)
  To: patches, stable
  Cc: Piotr Zalewski, Diederik de Haas, Andy Yan, Heiko Stuebner,
	Sasha Levin, hjc, dri-devel, linux-arm-kernel, linux-rockchip

From: Piotr Zalewski <pZ010001011111@proton.me>

[ Upstream commit a52dffaa46c2c5ff0b311c4dc1288581f7b9109e ]

Make video port registers nonvolatile. As DSP_CTRL register is written
to twice due to gamma LUT enable bit which is set outside of the main
DSP_CTRL initialization within atomic_enable (for rk356x case it is also
necessary to always disable gamma LUT before writing a new LUT) there is
a chance that DSP_CTRL value read-out in gamma LUT init/update code is
not the one which was written by the preceding DSP_CTRL initialization
code within atomic_enable. This might result in misconfigured DSP_CTRL
which leads to no visual output[1]. Since DSP_CTRL write takes effect
after VSYNC[1] the issue is not always present. When tested on Pinetab2
with kernel 6.14 it happenes only when DRM is compiled as a module[1].
In order to confirm that it is a timing issue I inserted 18ms udelay
before vop2_crtc_atomic_try_set_gamma in atomic enable and compiled DRM
as module - this has also fixed the issue.

[1] https://lore.kernel.org/linux-rockchip/562b38e5.a496.1975f09f983.Coremail.andyshrk@163.com/

Reported-by: Diederik de Haas <didi.debian@cknow.org>
Closes: https://lore.kernel.org/linux-rockchip/DAEVDSTMWI1E.J454VZN0R9MA@cknow.org/
Suggested-by: Andy Yan <andy.yan@rock-chips.com>
Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me>
Tested-by: Diederik de Haas <didi.debian@cknow.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250706083629.140332-2-pZ010001011111@proton.me
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my analysis of the commit, let me provide my determination:

**Backport Status: YES**

## Extensive Analysis:

This commit is a strong candidate for backporting to stable kernel trees
for the following reasons:

### 1. **Bug Fix Nature**
The commit fixes a real bug that causes **no visual output** on
displays. This is a critical user-visible issue where the display driver
fails to produce any image output. The commit message clearly describes
this as fixing a situation where "misconfigured DSP_CTRL which leads to
no visual output."

### 2. **Race Condition Fix**
The code changes address a race condition in register access patterns.
The issue occurs because:
- DSP_CTRL register is written twice during atomic_enable
- The second write happens for gamma LUT configuration
- Due to the register being volatile, read-modify-write operations can
  read stale values
- DSP_CTRL writes only take effect after VSYNC, creating a timing window
  for the race

### 3. **Small and Contained Change**
The actual code change is minimal - just one line adding the VP register
range to the nonvolatile list:
```c
+       regmap_reg_range(RK3568_VP0_CTRL_BASE, RK3588_VP3_CTRL_BASE +
255),
```

This makes the video port registers (0xc00-0xeff based on the offsets)
use the regmap cache instead of reading back from hardware, preventing
the race condition.

### 4. **Hardware-Specific Fix**
The fix is specific to Rockchip VOP2 hardware (RK3568/RK3588 SoCs) and
doesn't affect other subsystems. It's confined to the DRM/Rockchip
display driver.

### 5. **Tested and Confirmed**
The commit includes:
- A "Reported-by" tag indicating a real user issue
- A "Tested-by" tag confirming the fix works
- Clear reproduction details (happens on Pinetab2 with kernel 6.14 when
  DRM is compiled as module)

### 6. **Low Risk of Regression**
The change only affects how registers are cached in the regmap
framework. It doesn't change any functional logic, algorithms, or
introduce new features. The comment update accurately reflects that both
window AND video port registers are now marked as non-volatile for cache
coherency.

### 7. **Follows Stable Rules**
This fix aligns with stable kernel rules:
- Fixes a real bug that users are hitting
- Small change (essentially 1 line of functional code)
- Already tested by reporter
- Fix is obvious once the problem is understood
- No new features or architectural changes

### 8. **Critical Display Functionality**
Display output is fundamental functionality. A bug that causes "no
visual output" severely impacts usability of devices using these SoCs,
making this a high-priority fix for stable trees.

The commit properly documents the issue with external references and
follows good practices for a bug fix that should be backported to stable
kernels.

 drivers/gpu/drm/rockchip/rockchip_drm_vop2.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
index 186f6452a7d3..b50927a824b4 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
@@ -2579,12 +2579,13 @@ static int vop2_win_init(struct vop2 *vop2)
 }

 /*
- * The window registers are only updated when config done is written.
- * Until that they read back the old value. As we read-modify-write
- * these registers mark them as non-volatile. This makes sure we read
- * the new values from the regmap register cache.
+ * The window and video port registers are only updated when config
+ * done is written. Until that they read back the old value. As we
+ * read-modify-write these registers mark them as non-volatile. This
+ * makes sure we read the new values from the regmap register cache.
  */
 static const struct regmap_range vop2_nonvolatile_range[] = {
+	regmap_reg_range(RK3568_VP0_CTRL_BASE, RK3588_VP3_CTRL_BASE + 255),
 	regmap_reg_range(0x1000, 0x23ff),
 };

-- 
2.50.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-08-25 12:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-25 12:14 [PATCH AUTOSEL 6.16-5.15] fs: writeback: fix use-after-free in __mark_inode_dirty() Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.1] cdc_ncm: Flag Intel OEM version of Fibocom L850-GL as WWAN Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.6] LoongArch: Save LBT before FPU in setup_sigcontext() Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] btrfs: clear block dirty if submit_one_sector() failed Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] platform/x86/amd: pmc: Drop SMU F/W match for Cezanne Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16] LoongArch: Add cpuhotplug hooks to fix high cpu usage of vCPU threads Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.12] btrfs: zoned: skip ZONE FINISH of conventional zones Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-5.10] drm/amd/display: Don't warn when missing DCE encoder caps Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.1] Bluetooth: hci_sync: Avoid adding default advertising on startup Sasha Levin
2025-08-25 12:14 ` [PATCH AUTOSEL 6.16-6.6] cpupower: Fix a bug where the -t option of the set subcommand was not working Sasha Levin
2025-08-25 12:15 ` [PATCH AUTOSEL 6.16-6.12] drm/rockchip: vop2: make vp registers nonvolatile Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox