* [PATCH AUTOSEL 6.16-5.10] md: dm-zoned-target: Initialize return variable r to avoid uninitialized use
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.4] i3c: don't fail if GETHDRCAP is unsupported Sasha Levin
` (11 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Purva Yeshi, Mikulas Patocka, Sasha Levin, agk, snitzer, dm-devel
From: Purva Yeshi <purvayeshi550@gmail.com>
[ Upstream commit 487767bff572d46f7c37ad846c4078f6d6c9cc55 ]
Fix Smatch-detected error:
drivers/md/dm-zoned-target.c:1073 dmz_iterate_devices()
error: uninitialized symbol 'r'.
Smatch detects a possible use of the uninitialized variable 'r' in
dmz_iterate_devices() because if dmz->nr_ddevs is zero, the loop is
skipped and 'r' is returned without being set, leading to undefined
behavior.
Initialize 'r' to 0 before the loop. This ensures that if there are no
devices to iterate over, the function still returns a defined value.
Signed-off-by: Purva Yeshi <purvayeshi550@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my
extensive analysis:
## Analysis of the Bug
The commit fixes a genuine uninitialized variable bug in
`dmz_iterate_devices()` function in drivers/md/dm-zoned-target.c:1073.
The bug occurs when:
1. The variable `r` is declared but not initialized at line 1065
2. If `dmz->nr_ddevs` is 0, the for loop at line 1067 is never entered
3. The function returns `r` at line 1073, which contains garbage data
This is a clear programming error that can lead to undefined behavior.
The function could return any arbitrary value when there are no devices
to iterate over.
## Why This Should Be Backported
1. **Real Bug Fix**: This fixes an actual bug that can cause
unpredictable behavior. An uninitialized return value can lead to:
- Incorrect error handling by callers
- Potential system instability
- Unpredictable behavior in device mapper operations
2. **Small and Contained Fix**: The fix is minimal - just initializing
`r` to 0. This follows the pattern used in other similar
iterate_devices implementations like `dm-cache-target.c:3438` where
`int r = 0;` is properly initialized.
3. **No Side Effects**: The change has no architectural impact or side
effects. It simply ensures the function returns a defined value (0)
when there are no devices to iterate, which is the expected behavior.
4. **Affects Users**: While the condition (nr_ddevs == 0) might be rare,
it's still a possible configuration that could affect users of dm-
zoned devices. The undefined behavior could manifest in various ways
depending on what's in memory.
5. **Static Analysis Finding**: This was caught by Smatch static
analysis, indicating it's a real code quality issue that should be
fixed.
6. **Follows Stable Rules**: The fix is:
- Obviously correct
- Fixes a real bug (uninitialized variable use)
- Minimal risk of regression
- Small change (1 line)
- No new features or architectural changes
## Technical Context
Looking at the code structure, `nr_ddevs` is set from `argc` in line
858, which must be at least 1 based on the check at line 835. However,
there could be error paths or future code changes that might result in
this function being called with nr_ddevs == 0, making this defensive
programming important.
The fix aligns with how other device mapper targets handle their
iterate_devices functions, where the return value is properly
initialized before use.
drivers/md/dm-zoned-target.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 5da3db06da10..9da329078ea4 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -1062,7 +1062,7 @@ static int dmz_iterate_devices(struct dm_target *ti,
struct dmz_target *dmz = ti->private;
unsigned int zone_nr_sectors = dmz_zone_nr_sectors(dmz->metadata);
sector_t capacity;
- int i, r;
+ int i, r = 0;
for (i = 0; i < dmz->nr_ddevs; i++) {
capacity = dmz->dev[i].capacity & ~(zone_nr_sectors - 1);
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-5.4] i3c: don't fail if GETHDRCAP is unsupported
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] md: dm-zoned-target: Initialize return variable r to avoid uninitialized use Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] dm-mpath: don't print the "loaded" message if registering fails Sasha Levin
` (10 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Wolfram Sang, Frank Li, Alexandre Belloni, Sasha Levin, linux-i3c
From: Wolfram Sang <wsa+renesas@sang-engineering.com>
[ Upstream commit 447270cdb41b1c8c3621bb14b93a6749f942556e ]
'I3C_BCR_HDR_CAP' is still spec v1.0 and has been renamed to 'advanced
capabilities' in v1.1 onwards. The ST pressure sensor LPS22DF does not
have HDR, but has the 'advanced cap' bit set. The core still wants to
get additional information using the CCC 'GETHDRCAP' (or GETCAPS in v1.1
onwards). Not all controllers support this CCC and will notify the upper
layers about it. For instantiating the device, we can ignore this
unsupported CCC as standard communication will work. Without this patch,
the device will not be instantiated at all.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250704204524.6124-1-wsa+renesas@sang-engineering.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the surrounding code, here's my
assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a real bug that prevents device instantiation**: The commit
message clearly states that "Without this patch, the device will not
be instantiated at all." This is a concrete functionality issue where
legitimate I3C devices (like the ST pressure sensor LPS22DF) cannot
be used with controllers that don't support the GETHDRCAP CCC
command.
2. **Small and contained fix**: The change is minimal - only 2 lines
changed:
- Line 1441: Changed from `if (ret)` to `if (ret && ret !=
-ENOTSUPP)`
- This allows the function to continue when GETHDRCAP returns
-ENOTSUPP instead of failing
3. **Clear regression prevention**: The fix prevents device
initialization failure for hardware combinations that should work.
The I3C spec evolved from v1.0 to v1.1, and this handles backward
compatibility gracefully.
4. **No architectural changes**: This is a simple error handling
adjustment that doesn't change any core functionality or introduce
new features.
5. **Fixes interoperability issue**: The commit addresses a real-world
hardware compatibility problem between certain I3C controllers and
devices, which is exactly the type of bug that stable kernels should
fix.
6. **Low risk of regression**: The change only affects the error path
when GETHDRCAP fails with -ENOTSUPP. It doesn't change behavior for
successful cases or other error conditions.
7. **Follows stable rules**:
- It's obviously correct (treats unsupported CCC as non-fatal)
- It's tiny (2-line change)
- Fixes one specific issue
- Fixes a real bug that prevents hardware from working
The commit allows I3C device initialization to proceed even when the
controller doesn't support the GETHDRCAP command, which is reasonable
since HDR capabilities are optional and standard I3C communication will
still work without them.
drivers/i3c/master.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
index fd81871609d9..e53c69d24873 100644
--- a/drivers/i3c/master.c
+++ b/drivers/i3c/master.c
@@ -1439,7 +1439,7 @@ static int i3c_master_retrieve_dev_info(struct i3c_dev_desc *dev)
if (dev->info.bcr & I3C_BCR_HDR_CAP) {
ret = i3c_master_gethdrcap_locked(master, &dev->info);
- if (ret)
+ if (ret && ret != -ENOTSUPP)
return ret;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-5.10] dm-mpath: don't print the "loaded" message if registering fails
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] md: dm-zoned-target: Initialize return variable r to avoid uninitialized use Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.4] i3c: don't fail if GETHDRCAP is unsupported Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] rtc: ds1307: handle oscillator stop flag (OSF) for ds1341 Sasha Levin
` (9 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable; +Cc: Mikulas Patocka, Sasha Levin, agk, snitzer, dm-devel
From: Mikulas Patocka <mpatocka@redhat.com>
[ Upstream commit 6e11952a6abc4641dc8ae63f01b318b31b44e8db ]
If dm_register_path_selector, don't print the "version X loaded" message.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is a clear candidate for backporting to stable kernel trees.
Here's my extensive analysis:
## Bug Analysis
The commit fixes a logic error in the error handling of four device-
mapper path selector modules. Looking at the code changes:
1. **Original buggy behavior**: When `dm_register_path_selector()` fails
(returns r < 0), the code would:
- Print "register failed" error message
- **Continue to print "version X loaded" message**
- Return the error code
2. **Fixed behavior**: When registration fails:
- Print "register failed" error message
- **Return immediately with the error code**
- Skip the misleading "loaded" message
## Why This Should Be Backported
1. **Clear Bug Fix**: This is a straightforward bug fix that corrects
misleading log messages. When module registration fails, it
incorrectly reports success with "version X loaded".
2. **Small and Contained**: The fix is minimal - just adding an early
return statement in four similar init functions:
- `dm_hst_init()` in dm-ps-historical-service-time.c
- `dm_ql_init()` in dm-ps-queue-length.c
- `dm_rr_init()` in dm-ps-round-robin.c
- `dm_st_init()` in dm-ps-service-time.c
3. **No Side Effects**: The change only affects error path behavior and
logging. It doesn't introduce new functionality or change any
successful operation paths.
4. **User-Visible Issue**: This fixes a user-visible problem where
system logs would incorrectly indicate a module loaded successfully
when it actually failed.
5. **Consistent Pattern**: All four files have the identical bug and
identical fix, showing this is a systematic issue that should be
addressed.
6. **Low Risk**: The change is trivial - adding an early return on
error. There's virtually no risk of regression since:
- It only changes behavior when registration already failed
- The function already returns the error code at the end
- This just makes it return earlier without printing the misleading
message
7. **Affects Core Subsystem**: Device-mapper is a critical storage
subsystem, and accurate error reporting is important for system
administrators debugging storage issues.
The commit perfectly fits stable kernel criteria: it's a small, obvious
fix for a real bug that affects users (through misleading log messages),
with minimal risk of introducing new problems.
drivers/md/dm-ps-historical-service-time.c | 4 +++-
drivers/md/dm-ps-queue-length.c | 4 +++-
drivers/md/dm-ps-round-robin.c | 4 +++-
drivers/md/dm-ps-service-time.c | 4 +++-
4 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/md/dm-ps-historical-service-time.c b/drivers/md/dm-ps-historical-service-time.c
index b49e10d76d03..2c8626a83de4 100644
--- a/drivers/md/dm-ps-historical-service-time.c
+++ b/drivers/md/dm-ps-historical-service-time.c
@@ -541,8 +541,10 @@ static int __init dm_hst_init(void)
{
int r = dm_register_path_selector(&hst_ps);
- if (r < 0)
+ if (r < 0) {
DMERR("register failed %d", r);
+ return r;
+ }
DMINFO("version " HST_VERSION " loaded");
diff --git a/drivers/md/dm-ps-queue-length.c b/drivers/md/dm-ps-queue-length.c
index e305f05ad1e5..eb543e6431e0 100644
--- a/drivers/md/dm-ps-queue-length.c
+++ b/drivers/md/dm-ps-queue-length.c
@@ -260,8 +260,10 @@ static int __init dm_ql_init(void)
{
int r = dm_register_path_selector(&ql_ps);
- if (r < 0)
+ if (r < 0) {
DMERR("register failed %d", r);
+ return r;
+ }
DMINFO("version " QL_VERSION " loaded");
diff --git a/drivers/md/dm-ps-round-robin.c b/drivers/md/dm-ps-round-robin.c
index d1745b123dc1..66a15ac0c22c 100644
--- a/drivers/md/dm-ps-round-robin.c
+++ b/drivers/md/dm-ps-round-robin.c
@@ -220,8 +220,10 @@ static int __init dm_rr_init(void)
{
int r = dm_register_path_selector(&rr_ps);
- if (r < 0)
+ if (r < 0) {
DMERR("register failed %d", r);
+ return r;
+ }
DMINFO("version " RR_VERSION " loaded");
diff --git a/drivers/md/dm-ps-service-time.c b/drivers/md/dm-ps-service-time.c
index 969d31c40272..f8c43aecdb27 100644
--- a/drivers/md/dm-ps-service-time.c
+++ b/drivers/md/dm-ps-service-time.c
@@ -341,8 +341,10 @@ static int __init dm_st_init(void)
{
int r = dm_register_path_selector(&st_ps);
- if (r < 0)
+ if (r < 0) {
DMERR("register failed %d", r);
+ return r;
+ }
DMINFO("version " ST_VERSION " loaded");
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-5.10] rtc: ds1307: handle oscillator stop flag (OSF) for ds1341
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (2 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] dm-mpath: don't print the "loaded" message if registering fails Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-11 16:46 ` Meagan Lloyd
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.4] i3c: add missing include to internal header Sasha Levin
` (8 subsequent siblings)
12 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Meagan Lloyd, Tyler Hicks, Rodolfo Giometti, Alexandre Belloni,
Sasha Levin, linux-rtc
From: Meagan Lloyd <meaganlloyd@linux.microsoft.com>
[ Upstream commit 523923cfd5d622b8f4ba893fdaf29fa6adeb8c3e ]
In using CONFIG_RTC_HCTOSYS, rtc_hctosys() will sync the RTC time to the
kernel time as long as rtc_read_time() succeeds. In some power loss
situations, our supercapacitor-backed DS1342 RTC comes up with either an
unpredictable future time or the default 01/01/00 from the datasheet.
The oscillator stop flag (OSF) is set in these scenarios due to the
power loss and can be used to determine the validity of the RTC data.
This change expands the oscillator stop flag (OSF) handling that has
already been implemented for some chips to the ds1341 chip (DS1341 and
DS1342 share a datasheet). This handling manages the validity of the RTC
data in .read_time and .set_time based on the OSF.
Signed-off-by: Meagan Lloyd <meaganlloyd@linux.microsoft.com>
Reviewed-by: Tyler Hicks <code@tyhicks.com>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Link: https://lore.kernel.org/r/1749665656-30108-3-git-send-email-meaganlloyd@linux.microsoft.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Bug Fix**: This commit fixes a real bug where the DS1341/DS1342 RTC
can report invalid time data after power loss without properly
checking the oscillator stop flag (OSF). The commit message clearly
describes the problem: "In some power loss situations, our
supercapacitor-backed DS1342 RTC comes up with either an
unpredictable future time or the default 01/01/00."
2. **Data Corruption Prevention**: Without this fix, systems using
DS1341/DS1342 RTCs can have incorrect system time after power loss,
which could lead to data corruption, incorrect timestamps, or system
malfunction. This is particularly critical for systems using
CONFIG_RTC_HCTOSYS.
3. **Minimal Risk**: The change is very small and contained - it only
adds OSF handling for the ds1341 chip type in two switch statements
(ds1307_get_time:282-287 and ds1307_set_time:380-383). The pattern
exactly follows the existing OSF handling already implemented for
ds_1338, ds_1340, and ds_1388 chips.
4. **Established Pattern**: The commit follows an established pattern in
the driver. Looking at the code, OSF handling is already implemented
for:
- ds_1338: Uses DS1338_BIT_OSF in DS1307_REG_CONTROL
- ds_1340: Uses DS1340_BIT_OSF in DS1340_REG_FLAG
- ds_1388: Uses DS1388_BIT_OSF in DS1388_REG_FLAG
- And now ds_1341: Uses DS1337_BIT_OSF in DS1337_REG_STATUS
5. **No New Features**: This is purely a bug fix - it doesn't add any
new functionality, just ensures existing functionality (reading valid
time) works correctly after power loss.
6. **Previous Similar Fixes**: The git history shows similar OSF fixes
have been made before, such as commit f471b05f76e4 ("rtc: ds1307:
Clear OSF flag on DS1388 when setting time"), indicating this is a
known class of issues that needs fixing.
7. **Hardware-Specific Fix**: This only affects systems with
DS1341/DS1342 RTCs, so there's no risk to systems using other RTC
chips. The change is guarded by the chip type check.
The commit meets all the criteria for stable backporting: it fixes a
real bug that affects users, the fix is small and self-contained, it
doesn't introduce new features or architectural changes, and follows
established patterns in the codebase.
drivers/rtc/rtc-ds1307.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
index 5efbe69bf5ca..7a60e5ca2b8a 100644
--- a/drivers/rtc/rtc-ds1307.c
+++ b/drivers/rtc/rtc-ds1307.c
@@ -279,6 +279,13 @@ static int ds1307_get_time(struct device *dev, struct rtc_time *t)
if (tmp & DS1340_BIT_OSF)
return -EINVAL;
break;
+ case ds_1341:
+ ret = regmap_read(ds1307->regmap, DS1337_REG_STATUS, &tmp);
+ if (ret)
+ return ret;
+ if (tmp & DS1337_BIT_OSF)
+ return -EINVAL;
+ break;
case ds_1388:
ret = regmap_read(ds1307->regmap, DS1388_REG_FLAG, &tmp);
if (ret)
@@ -377,6 +384,10 @@ static int ds1307_set_time(struct device *dev, struct rtc_time *t)
regmap_update_bits(ds1307->regmap, DS1340_REG_FLAG,
DS1340_BIT_OSF, 0);
break;
+ case ds_1341:
+ regmap_update_bits(ds1307->regmap, DS1337_REG_STATUS,
+ DS1337_BIT_OSF, 0);
+ break;
case ds_1388:
regmap_update_bits(ds1307->regmap, DS1388_REG_FLAG,
DS1388_BIT_OSF, 0);
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH AUTOSEL 6.16-5.10] rtc: ds1307: handle oscillator stop flag (OSF) for ds1341
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] rtc: ds1307: handle oscillator stop flag (OSF) for ds1341 Sasha Levin
@ 2025-08-11 16:46 ` Meagan Lloyd
2025-08-16 13:07 ` Sasha Levin
0 siblings, 1 reply; 18+ messages in thread
From: Meagan Lloyd @ 2025-08-11 16:46 UTC (permalink / raw)
To: Sasha Levin, patches, stable
Cc: Tyler Hicks, Rodolfo Giometti, Alexandre Belloni, linux-rtc
On 8/8/2025 8:30 AM, Sasha Levin wrote:
> From: Meagan Lloyd <meaganlloyd@linux.microsoft.com>
>
> [ Upstream commit 523923cfd5d622b8f4ba893fdaf29fa6adeb8c3e ]
Hi Sasha,
Without the first patch of the series, the OSF bit will get cleared in probe, so for it to meaningfully handle an invalid RTC, we'll need to take the first patch too.
The upstream commit is 48458654659c9c2e149c211d86637f1592470da5
https://lore.kernel.org/r/1749665656-30108-2-git-send-email-meaganlloyd@linux.microsoft.com
It removes the clear of the OSF status flag in probe in the shared ds_1337, ds_1339, ds_1341, ds_3231 switch case block.
Thank you,
Meagan
>
> In using CONFIG_RTC_HCTOSYS, rtc_hctosys() will sync the RTC time to the
> kernel time as long as rtc_read_time() succeeds. In some power loss
> situations, our supercapacitor-backed DS1342 RTC comes up with either an
> unpredictable future time or the default 01/01/00 from the datasheet.
> The oscillator stop flag (OSF) is set in these scenarios due to the
> power loss and can be used to determine the validity of the RTC data.
>
> This change expands the oscillator stop flag (OSF) handling that has
> already been implemented for some chips to the ds1341 chip (DS1341 and
> DS1342 share a datasheet). This handling manages the validity of the RTC
> data in .read_time and .set_time based on the OSF.
>
> Signed-off-by: Meagan Lloyd <meaganlloyd@linux.microsoft.com>
> Reviewed-by: Tyler Hicks <code@tyhicks.com>
> Acked-by: Rodolfo Giometti <giometti@enneenne.com>
> Link: https://lore.kernel.org/r/1749665656-30108-3-git-send-email-meaganlloyd@linux.microsoft.com
> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>
> LLM Generated explanations, may be completely bogus:
>
> **Backport Status: YES**
>
> This commit should be backported to stable kernel trees for the
> following reasons:
>
> 1. **Bug Fix**: This commit fixes a real bug where the DS1341/DS1342 RTC
> can report invalid time data after power loss without properly
> checking the oscillator stop flag (OSF). The commit message clearly
> describes the problem: "In some power loss situations, our
> supercapacitor-backed DS1342 RTC comes up with either an
> unpredictable future time or the default 01/01/00."
>
> 2. **Data Corruption Prevention**: Without this fix, systems using
> DS1341/DS1342 RTCs can have incorrect system time after power loss,
> which could lead to data corruption, incorrect timestamps, or system
> malfunction. This is particularly critical for systems using
> CONFIG_RTC_HCTOSYS.
>
> 3. **Minimal Risk**: The change is very small and contained - it only
> adds OSF handling for the ds1341 chip type in two switch statements
> (ds1307_get_time:282-287 and ds1307_set_time:380-383). The pattern
> exactly follows the existing OSF handling already implemented for
> ds_1338, ds_1340, and ds_1388 chips.
>
> 4. **Established Pattern**: The commit follows an established pattern in
> the driver. Looking at the code, OSF handling is already implemented
> for:
> - ds_1338: Uses DS1338_BIT_OSF in DS1307_REG_CONTROL
> - ds_1340: Uses DS1340_BIT_OSF in DS1340_REG_FLAG
> - ds_1388: Uses DS1388_BIT_OSF in DS1388_REG_FLAG
> - And now ds_1341: Uses DS1337_BIT_OSF in DS1337_REG_STATUS
>
> 5. **No New Features**: This is purely a bug fix - it doesn't add any
> new functionality, just ensures existing functionality (reading valid
> time) works correctly after power loss.
>
> 6. **Previous Similar Fixes**: The git history shows similar OSF fixes
> have been made before, such as commit f471b05f76e4 ("rtc: ds1307:
> Clear OSF flag on DS1388 when setting time"), indicating this is a
> known class of issues that needs fixing.
>
> 7. **Hardware-Specific Fix**: This only affects systems with
> DS1341/DS1342 RTCs, so there's no risk to systems using other RTC
> chips. The change is guarded by the chip type check.
>
> The commit meets all the criteria for stable backporting: it fixes a
> real bug that affects users, the fix is small and self-contained, it
> doesn't introduce new features or architectural changes, and follows
> established patterns in the codebase.
>
> drivers/rtc/rtc-ds1307.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
> index 5efbe69bf5ca..7a60e5ca2b8a 100644
> --- a/drivers/rtc/rtc-ds1307.c
> +++ b/drivers/rtc/rtc-ds1307.c
> @@ -279,6 +279,13 @@ static int ds1307_get_time(struct device *dev, struct rtc_time *t)
> if (tmp & DS1340_BIT_OSF)
> return -EINVAL;
> break;
> + case ds_1341:
> + ret = regmap_read(ds1307->regmap, DS1337_REG_STATUS, &tmp);
> + if (ret)
> + return ret;
> + if (tmp & DS1337_BIT_OSF)
> + return -EINVAL;
> + break;
> case ds_1388:
> ret = regmap_read(ds1307->regmap, DS1388_REG_FLAG, &tmp);
> if (ret)
> @@ -377,6 +384,10 @@ static int ds1307_set_time(struct device *dev, struct rtc_time *t)
> regmap_update_bits(ds1307->regmap, DS1340_REG_FLAG,
> DS1340_BIT_OSF, 0);
> break;
> + case ds_1341:
> + regmap_update_bits(ds1307->regmap, DS1337_REG_STATUS,
> + DS1337_BIT_OSF, 0);
> + break;
> case ds_1388:
> regmap_update_bits(ds1307->regmap, DS1388_REG_FLAG,
> DS1388_BIT_OSF, 0);
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH AUTOSEL 6.16-5.10] rtc: ds1307: handle oscillator stop flag (OSF) for ds1341
2025-08-11 16:46 ` Meagan Lloyd
@ 2025-08-16 13:07 ` Sasha Levin
0 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-16 13:07 UTC (permalink / raw)
To: Meagan Lloyd
Cc: patches, stable, Tyler Hicks, Rodolfo Giometti, Alexandre Belloni,
linux-rtc
On Mon, Aug 11, 2025 at 09:46:07AM -0700, Meagan Lloyd wrote:
>On 8/8/2025 8:30 AM, Sasha Levin wrote:
>> From: Meagan Lloyd <meaganlloyd@linux.microsoft.com>
>>
>> [ Upstream commit 523923cfd5d622b8f4ba893fdaf29fa6adeb8c3e ]
>
>Hi Sasha,
>
>Without the first patch of the series, the OSF bit will get cleared in probe, so for it to meaningfully handle an invalid RTC, we'll need to take the first patch too.
>
>The upstream commit is 48458654659c9c2e149c211d86637f1592470da5
>
>https://lore.kernel.org/r/1749665656-30108-2-git-send-email-meaganlloyd@linux.microsoft.com
>
>It removes the clear of the OSF status flag in probe in the shared ds_1337, ds_1339, ds_1341, ds_3231 switch case block.
I'll take it too, thanks!
--
Thanks,
Sasha
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH AUTOSEL 6.16-5.4] i3c: add missing include to internal header
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (3 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] rtc: ds1307: handle oscillator stop flag (OSF) for ds1341 Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] i3c: master: Initialize ret in i3c_i2c_notifier_call() Sasha Levin
` (7 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Wolfram Sang, kernel test robot, Frank Li, Alexandre Belloni,
Sasha Levin, linux-i3c
From: Wolfram Sang <wsa+renesas@sang-engineering.com>
[ Upstream commit 3b661ca549b9e5bb11d0bc97ada6110aac3282d2 ]
LKP found a random config which failed to build because IO accessors
were not defined:
In file included from drivers/i3c/master.c:21:
drivers/i3c/internals.h: In function 'i3c_writel_fifo':
>> drivers/i3c/internals.h:35:9: error: implicit declaration of function 'writesl' [-Werror=implicit-function-declaration]
Add the proper header to where the IO accessors are used.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507150208.BZDzzJ5E-lkp@intel.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250717120046.9022-2-wsa+renesas@sang-engineering.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a real build failure**: The commit fixes an actual
compilation error where `writesl` function is used without including
the proper header (`<linux/io.h>`). This is a concrete build
regression that prevents kernel compilation in certain
configurations.
2. **Minimal and safe change**: The fix is extremely simple - it only
adds a single `#include <linux/io.h>` line to the internals.h header
file. This is a one-line change with virtually no risk of introducing
regressions.
3. **Detected by kernel test robot**: The issue was found by the Linux
Kernel Test Robot (LKP), indicating it affects real-world build
configurations that are tested regularly. The error message shows:
```
drivers/i3c/internals.h:35:9: error: implicit declaration of function
'writesl'
```
4. **Affects core I3C infrastructure**: The internals.h header is
included by core I3C files (master.c and device.c), so a build
failure here can prevent the entire I3C subsystem from compiling.
5. **Recent regression**: Looking at the git history, the
`i3c_writel_fifo()` and `i3c_readl_fifo()` functions were recently
added in commit 733b439375b4, and are already being used by I3C
controller drivers (dw-i3c-master.c and cdns-i3c-master.c as shown in
commits 6e055b1fb2fc and c20d3fa70491). This missing include is a
regression that breaks builds after these recent changes.
6. **Clear stable backport criteria**: This meets the stable kernel
rules perfectly:
- It fixes a real bug (build failure)
- It's obviously correct (missing include for used functions)
- It's a minimal change (1 line)
- No new features or architectural changes
- Zero risk of functional regression
The commit is a textbook example of what should be backported to stable
trees - a simple, obvious fix for a build regression with no side
effects.
drivers/i3c/internals.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/i3c/internals.h b/drivers/i3c/internals.h
index 433f6088b7ce..ce04aa4f269e 100644
--- a/drivers/i3c/internals.h
+++ b/drivers/i3c/internals.h
@@ -9,6 +9,7 @@
#define I3C_INTERNALS_H
#include <linux/i3c/master.h>
+#include <linux/io.h>
void i3c_bus_normaluse_lock(struct i3c_bus *bus);
void i3c_bus_normaluse_unlock(struct i3c_bus *bus);
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-6.1] i3c: master: Initialize ret in i3c_i2c_notifier_call()
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (4 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.4] i3c: add missing include to internal header Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.4] PCI: pnv_php: Work around switches with broken presence detection Sasha Levin
` (6 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Jorge Marques, Frank Li, Alexandre Belloni, Sasha Levin,
linux-i3c
From: Jorge Marques <jorge.marques@analog.com>
[ Upstream commit 290ce8b2d0745e45a3155268184523a8c75996f1 ]
Set ret to -EINVAL if i3c_i2c_notifier_call() receives an invalid
action, resolving uninitialized warning.
Signed-off-by: Jorge Marques <jorge.marques@analog.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250622-i3c-master-ret-uninitialized-v1-1-aabb5625c932@analog.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's my
extensive analysis:
## Bug Analysis
The commit fixes a serious uninitialized variable bug in
`i3c_i2c_notifier_call()`. Looking at the code:
1. **The Bug**: The function declares `int ret` at line 2449 but doesn't
initialize it. The switch statement only handles two cases
(`BUS_NOTIFY_ADD_DEVICE` and `BUS_NOTIFY_DEL_DEVICE`), leaving `ret`
uninitialized for any other action values.
2. **When It Triggers**: Since this notifier is registered with
`bus_register_notifier(&i2c_bus_type, &i2cdev_notifier)`, it receives
ALL bus notification events for I2C devices, including:
- `BUS_NOTIFY_REMOVED_DEVICE`
- `BUS_NOTIFY_BIND_DRIVER`
- `BUS_NOTIFY_BOUND_DRIVER`
- `BUS_NOTIFY_UNBIND_DRIVER`
- `BUS_NOTIFY_UNBOUND_DRIVER`
- `BUS_NOTIFY_DRIVER_NOT_BOUND`
3. **Impact**: When any of these unhandled actions occur, the function
returns an uninitialized stack value, which could:
- Accidentally return `NOTIFY_STOP` or `NOTIFY_BAD`, halting the
notification chain
- Cause unpredictable behavior in the device/driver binding process
- Lead to intermittent, hard-to-debug failures
## Backport Criteria Analysis
1. **Fixes a real bug**: ✓ Yes - fixes an uninitialized variable that
causes undefined behavior
2. **Small and contained**: ✓ Yes - adds only 2 lines (default case
returning -EINVAL)
3. **No side effects**: ✓ Correct - the fix properly handles unexpected
actions by returning an error
4. **No architectural changes**: ✓ Correct - simple bug fix, no design
changes
5. **Critical subsystem**: ✓ Yes - affects I3C/I2C device management and
driver binding
6. **Follows stable rules**: ✓ Yes - important bugfix with minimal risk
## Additional Context
- The bug was introduced in commit 72a4501b5d08 ("i3c: support
dynamically added i2c devices") in January 2022
- The I3C subsystem has had other uninitialized variable fixes (e.g.,
commit 6cbf8b38dfe3)
- This is a classic compiler warning fix that prevents real runtime
issues
- The fix is conservative, returning -EINVAL for unexpected actions
rather than silently ignoring them
This is exactly the type of bug that stable kernels should fix: a real
issue with unpredictable runtime consequences, fixed with a minimal,
safe change.
drivers/i3c/master.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
index e53c69d24873..dfa0bad991cf 100644
--- a/drivers/i3c/master.c
+++ b/drivers/i3c/master.c
@@ -2467,6 +2467,8 @@ static int i3c_i2c_notifier_call(struct notifier_block *nb, unsigned long action
case BUS_NOTIFY_DEL_DEVICE:
ret = i3c_master_i2c_detach(adap, client);
break;
+ default:
+ ret = -EINVAL;
}
i3c_bus_maintenance_unlock(&master->bus);
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-5.4] PCI: pnv_php: Work around switches with broken presence detection
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (5 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] i3c: master: Initialize ret in i3c_i2c_notifier_call() Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] module: Prevent silent truncation of module name in delete_module(2) Sasha Levin
` (5 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Timothy Pearson, Shawn Anastasio, Bjorn Helgaas,
Madhavan Srinivasan, Sasha Levin, mpe, linuxppc-dev
From: Timothy Pearson <tpearson@raptorengineering.com>
[ Upstream commit 80f9fc2362797538ebd4fd70a1dfa838cc2c2cdb ]
The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
was observed to incorrectly assert the Presence Detect Set bit in its
capabilities when tested on a Raptor Computing Systems Blackbird system,
resulting in the hot insert path never attempting a rescan of the bus
and any downstream devices not being re-detected.
Work around this by additionally checking whether the PCIe data link is
active or not when performing presence detection on downstream switches'
ports, similar to the pciehp_hpc.c driver.
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/505981576.1359853.1752615415117.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is suitable for backporting to stable kernel trees based on
the following analysis:
## Commit Analysis
### 1. Bug Fix Nature
The commit fixes a real hardware bug where the Microsemi Switchtec
PM8533 PFX 48xG3 PCIe switch incorrectly reports presence detection.
This causes downstream devices to not be re-detected after hot
insertion, which is a significant functional issue affecting users.
### 2. Small and Contained Fix
The patch adds only 33 lines of code:
- Adds a new helper function `pcie_check_link_active()` (11 lines)
- Modifies `pnv_php_get_adapter_state()` with a targeted workaround (13
lines)
- The changes are confined to a single file:
`drivers/pci/hotplug/pnv_php.c`
### 3. Clear and Minimal Side Effects
The workaround is:
- Only activated for specific hardware (downstream PCIe ports)
- Only when presence detection returns EMPTY
- Uses an additional check (link active status) as a fallback
- Similar approach already proven in pciehp_hpc.c driver
### 4. Hardware Quirk Fix
This is explicitly a hardware quirk workaround for broken hardware
(Microsemi Switchtec PM8533), which is one of the accepted categories
for stable backports according to stable kernel rules.
### 5. No Architectural Changes
The commit:
- Does not introduce new features
- Does not change any APIs or interfaces
- Does not modify core kernel infrastructure
- Simply adds a fallback detection mechanism
### 6. Tested Solution
The commit message indicates this was "observed" and tested on a Raptor
Computing Systems Blackbird system, suggesting real-world validation.
## Stable Tree Compliance
According to Documentation/process/stable-kernel-rules.rst, this patch
meets the criteria:
- Fixes a real bug that bothers people (hotplug detection failure)
- Is a hardware quirk fix (explicitly allowed)
- Is under 100 lines with context
- Is obviously correct (uses established pattern from pciehp_hpc.c)
- Has clear user impact (devices not being detected after hot insertion)
The workaround pattern of checking link active status when presence
detection fails is a well-established approach in PCIe hotplug drivers,
making this a low-risk backport candidate.
drivers/pci/hotplug/pnv_php.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 573a41869c15..aaa8e0a2a893 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -391,6 +391,20 @@ static int pnv_php_get_power_state(struct hotplug_slot *slot, u8 *state)
return 0;
}
+static int pcie_check_link_active(struct pci_dev *pdev)
+{
+ u16 lnk_status;
+ int ret;
+
+ ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
+ if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
+ return -ENODEV;
+
+ ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
+
+ return ret;
+}
+
static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
{
struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
@@ -403,6 +417,19 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
*/
ret = pnv_pci_get_presence_state(php_slot->id, &presence);
if (ret >= 0) {
+ if (pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
+ presence == OPAL_PCI_SLOT_EMPTY) {
+ /*
+ * Similar to pciehp_hpc, check whether the Link Active
+ * bit is set to account for broken downstream bridges
+ * that don't properly assert Presence Detect State, as
+ * was observed on the Microsemi Switchtec PM8533 PFX
+ * [11f8:8533].
+ */
+ if (pcie_check_link_active(php_slot->pdev) > 0)
+ presence = OPAL_PCI_SLOT_PRESENT;
+ }
+
*state = presence;
ret = 0;
} else {
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-6.1] module: Prevent silent truncation of module name in delete_module(2)
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (6 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.4] PCI: pnv_php: Work around switches with broken presence detection Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] apparmor: use the condition in AA_BUG_FMT even with debug disabled Sasha Levin
` (4 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Petr Pavlu, Daniel Gomez, Sasha Levin, mcgrof, da.gomez,
linux-modules, linux-kernel
From: Petr Pavlu <petr.pavlu@suse.com>
[ Upstream commit a6323bd4e611567913e23df5b58f2d4e4da06789 ]
Passing a module name longer than MODULE_NAME_LEN to the delete_module
syscall results in its silent truncation. This really isn't much of
a problem in practice, but it could theoretically lead to the removal of an
incorrect module. It is more sensible to return ENAMETOOLONG or ENOENT in
such a case.
Update the syscall to return ENOENT, as documented in the delete_module(2)
man page to mean "No module by that name exists." This is appropriate
because a module with a name longer than MODULE_NAME_LEN cannot be loaded
in the first place.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Link: https://lore.kernel.org/r/20250630143535.267745-2-petr.pavlu@suse.com
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis, here's my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Security Fix for Silent Truncation**: The commit fixes a security-
relevant issue where module names longer than MODULE_NAME_LEN (56
bytes on 64-bit systems) are silently truncated in the
delete_module() syscall. This could theoretically lead to
unintentionally removing the wrong module if two modules share the
same truncated prefix.
2. **Minimal and Contained Change**: The fix is very small and localized
- it only modifies the delete_module syscall implementation in
kernel/module/main.c. The changes are:
- Changes from `strncpy_from_user(name, name_user,
MODULE_NAME_LEN-1)` to `strncpy_from_user(name, name_user,
MODULE_NAME_LEN)`
- Adds proper length checking: `if (len == 0 || len ==
MODULE_NAME_LEN) return -ENOENT;`
- Returns the correct error code (ENOENT) for oversized names
3. **Follows Stable Tree Rules**:
- Fixes a bug that could affect users (incorrect module removal)
- Very low risk of regression - the change only affects error
handling for invalid input
- No new features or architectural changes
- Improves consistency with documented behavior (delete_module(2) man
page)
4. **Prevents Inconsistent Behavior**: The commit message correctly
points out that modules with names longer than MODULE_NAME_LEN cannot
be loaded in the first place, so returning ENOENT for such names in
delete_module makes the behavior consistent across module operations.
5. **Clear Bug Fix**: The old code would accept a 57+ character module
name, truncate it to 56 characters, and potentially remove a
different module. The new code properly rejects such names with
ENOENT, preventing this dangerous silent truncation.
The change is defensive programming that prevents a potential security
issue without introducing any backward compatibility concerns, making it
an ideal candidate for stable backporting.
kernel/module/main.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/module/main.c b/kernel/module/main.c
index c2c08007029d..cbd637627eb4 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -751,14 +751,16 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
struct module *mod;
char name[MODULE_NAME_LEN];
char buf[MODULE_FLAGS_BUF_SIZE];
- int ret, forced = 0;
+ int ret, len, forced = 0;
if (!capable(CAP_SYS_MODULE) || modules_disabled)
return -EPERM;
- if (strncpy_from_user(name, name_user, MODULE_NAME_LEN-1) < 0)
- return -EFAULT;
- name[MODULE_NAME_LEN-1] = '\0';
+ len = strncpy_from_user(name, name_user, MODULE_NAME_LEN);
+ if (len == 0 || len == MODULE_NAME_LEN)
+ return -ENOENT;
+ if (len < 0)
+ return len;
audit_log_kern_module(name);
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-6.1] apparmor: use the condition in AA_BUG_FMT even with debug disabled
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (7 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] module: Prevent silent truncation of module name in delete_module(2) Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.6] powerpc/eeh: Make EEH driver device hotplug safe Sasha Levin
` (3 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Mateusz Guzik, John Johansen, Sasha Levin, rdunlap, ryan.lee,
linux
From: Mateusz Guzik <mjguzik@gmail.com>
[ Upstream commit 67e370aa7f968f6a4f3573ed61a77b36d1b26475 ]
This follows the established practice and fixes a build failure for me:
security/apparmor/file.c: In function ‘__file_sock_perm’:
security/apparmor/file.c:544:24: error: unused variable ‘sock’ [-Werror=unused-variable]
544 | struct socket *sock = (struct socket *) file->private_data;
| ^~~~
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **It fixes a real build failure**: The commit message clearly states
it fixes a build failure with `-Werror=unused-variable`. When
`CONFIG_SECURITY_APPARMOR_DEBUG_ASSERTS` is disabled, the
`AA_BUG_FMT` macro becomes just `no_printk(fmt, ##args)`, which
doesn't evaluate the condition `X`. This causes the `sock` variable
in `__file_sock_perm()` to be unused, triggering a compiler warning
that becomes an error with `-Werror`.
2. **The fix is minimal and contained**: The change only modifies the
`AA_BUG_FMT` macro definition when debug is disabled, adding
`BUILD_BUG_ON_INVALID(X)` which forces the compiler to evaluate the
condition without generating any runtime code. This is a well-
established pattern in the kernel (as seen in `VM_BUG_ON` and similar
macros in `include/linux/mmdebug.h`).
3. **No functional changes or side effects**: The fix doesn't change any
runtime behavior. `BUILD_BUG_ON_INVALID(e)` is defined as
`((void)(sizeof((__force long)(e))))` which only evaluates the
expression at compile time to ensure it's valid, without generating
any code.
4. **Follows established kernel patterns**: The commit message states
"This follows the established practice," and indeed, examining other
kernel debug macros like `VM_BUG_ON` shows they use the exact same
pattern - using `BUILD_BUG_ON_INVALID` when debug is disabled to
ensure the condition is still evaluated by the compiler.
5. **Low risk**: This is a compile-time only change that prevents build
failures. It cannot introduce runtime regressions since it generates
no runtime code.
6. **Affects a security subsystem**: AppArmor is a security module, and
build failures in security code can prevent users from building
kernels with their desired security configuration.
The commit is a classic example of a safe, minimal fix that resolves a
real problem without introducing new risks, making it an ideal candidate
for stable backporting.
security/apparmor/include/lib.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/security/apparmor/include/lib.h b/security/apparmor/include/lib.h
index f11a0db7f51d..e83f45e936a7 100644
--- a/security/apparmor/include/lib.h
+++ b/security/apparmor/include/lib.h
@@ -48,7 +48,11 @@ extern struct aa_dfa *stacksplitdfa;
#define AA_BUG_FMT(X, fmt, args...) \
WARN((X), "AppArmor WARN %s: (" #X "): " fmt, __func__, ##args)
#else
-#define AA_BUG_FMT(X, fmt, args...) no_printk(fmt, ##args)
+#define AA_BUG_FMT(X, fmt, args...) \
+ do { \
+ BUILD_BUG_ON_INVALID(X); \
+ no_printk(fmt, ##args); \
+ } while (0)
#endif
#define AA_ERROR(fmt, args...) \
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-6.6] powerpc/eeh: Make EEH driver device hotplug safe
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (8 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] apparmor: use the condition in AA_BUG_FMT even with debug disabled Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.12] apparmor: fix x_table_lookup when stacking is not the first entry Sasha Levin
` (2 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Timothy Pearson, Bjorn Helgaas, Madhavan Srinivasan, Sasha Levin,
mahesh, linuxppc-dev
From: Timothy Pearson <tpearson@raptorengineering.com>
[ Upstream commit 1010b4c012b0d78dfb9d3132b49aa2ef024a07a7 ]
Multiple race conditions existed between the PCIe hotplug driver and the
EEH driver, leading to a variety of kernel oopses of the same general
nature:
<pcie device unplug>
<eeh driver trigger>
<hotplug removal trigger>
<pcie tree reconfiguration>
<eeh recovery next step>
<oops in EEH driver bus iteration loop>
A second class of oops is also seen when the underlying bus disappears
during device recovery.
Refactor the EEH module to be PCI rescan and remove safe. Also clean
up a few minor formatting / readability issues.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1334208367.1359861.1752615503144.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
## 1. **Critical Bug Fix for Race Conditions**
The commit explicitly fixes "Multiple race conditions" between the PCIe
hotplug driver and the EEH (Enhanced Error Handling) driver that lead to
kernel oopses. These are serious kernel crashes that can cause system
instability or complete system failure.
## 2. **Clear Bug Symptoms Described**
The commit message describes two classes of kernel oopses:
- Race conditions during concurrent PCIe device unplug and EEH recovery
operations
- Crashes when the underlying bus disappears during device recovery
These are real-world stability issues that affect system reliability.
## 3. **Focused Fix with Clear Scope**
The changes are focused on making the EEH driver "PCI rescan and remove
safe" by properly coordinating locking around critical sections. The key
changes include:
- Moving `pci_lock_rescan_remove()` calls to protect larger critical
sections in `eeh_handle_normal_event()` and
`eeh_handle_special_event()`
- Removing scattered lock/unlock pairs from individual functions like
`eeh_pe_report_edev()` and `eeh_reset_device()`
- Adding proper NULL checks (e.g., checking if `edev->pdev` exists
before dereferencing in `eeh_bridge_check_link()`)
- Adding `*result = PCI_ERS_RESULT_DISCONNECT` when no device is found
## 4. **Minimal Risk of Regression**
The changes are:
- Confined to the PowerPC EEH subsystem
- Primarily involve moving existing locking to broader scopes
- Add defensive NULL checks
- Don't introduce new features or change architecture
## 5. **Follows Stable Kernel Rules**
This fix meets the stable kernel criteria:
- Fixes a real bug (kernel oopses)
- Has been tested (signed off by multiple maintainers)
- Is self-contained to a specific subsystem
- Doesn't add new features
- Has clear bug symptoms that users experience
## 6. **Critical for PowerPC Server Systems**
EEH is crucial for PowerPC server systems' reliability, especially in
environments with PCIe hotplug operations. The race conditions can lead
to unexpected downtime in production systems.
The refactoring to consolidate the locking at higher levels (in the main
event handlers) rather than scattered throughout the code makes the
locking strategy clearer and reduces the chance of deadlocks or missed
locking, which is a good defensive programming practice for stable
kernels.
arch/powerpc/kernel/eeh_driver.c | 48 +++++++++++++++++++++-----------
arch/powerpc/kernel/eeh_pe.c | 10 ++++---
2 files changed, 38 insertions(+), 20 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 7efe04c68f0f..dd50de91c438 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -257,13 +257,12 @@ static void eeh_pe_report_edev(struct eeh_dev *edev, eeh_report_fn fn,
struct pci_driver *driver;
enum pci_ers_result new_result;
- pci_lock_rescan_remove();
pdev = edev->pdev;
if (pdev)
get_device(&pdev->dev);
- pci_unlock_rescan_remove();
if (!pdev) {
eeh_edev_info(edev, "no device");
+ *result = PCI_ERS_RESULT_DISCONNECT;
return;
}
device_lock(&pdev->dev);
@@ -304,8 +303,9 @@ static void eeh_pe_report(const char *name, struct eeh_pe *root,
struct eeh_dev *edev, *tmp;
pr_info("EEH: Beginning: '%s'\n", name);
- eeh_for_each_pe(root, pe) eeh_pe_for_each_dev(pe, edev, tmp)
- eeh_pe_report_edev(edev, fn, result);
+ eeh_for_each_pe(root, pe)
+ eeh_pe_for_each_dev(pe, edev, tmp)
+ eeh_pe_report_edev(edev, fn, result);
if (result)
pr_info("EEH: Finished:'%s' with aggregate recovery state:'%s'\n",
name, pci_ers_result_name(*result));
@@ -383,6 +383,8 @@ static void eeh_dev_restore_state(struct eeh_dev *edev, void *userdata)
if (!edev)
return;
+ pci_lock_rescan_remove();
+
/*
* The content in the config space isn't saved because
* the blocked config space on some adapters. We have
@@ -393,14 +395,19 @@ static void eeh_dev_restore_state(struct eeh_dev *edev, void *userdata)
if (list_is_last(&edev->entry, &edev->pe->edevs))
eeh_pe_restore_bars(edev->pe);
+ pci_unlock_rescan_remove();
return;
}
pdev = eeh_dev_to_pci_dev(edev);
- if (!pdev)
+ if (!pdev) {
+ pci_unlock_rescan_remove();
return;
+ }
pci_restore_state(pdev);
+
+ pci_unlock_rescan_remove();
}
/**
@@ -647,9 +654,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
if (any_passed || driver_eeh_aware || (pe->type & EEH_PE_VF)) {
eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
} else {
- pci_lock_rescan_remove();
pci_hp_remove_devices(bus);
- pci_unlock_rescan_remove();
}
/*
@@ -665,8 +670,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
if (rc)
return rc;
- pci_lock_rescan_remove();
-
/* Restore PE */
eeh_ops->configure_bridge(pe);
eeh_pe_restore_bars(pe);
@@ -674,7 +677,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
/* Clear frozen state */
rc = eeh_clear_pe_frozen_state(pe, false);
if (rc) {
- pci_unlock_rescan_remove();
return rc;
}
@@ -709,7 +711,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
pe->tstamp = tstamp;
pe->freeze_count = cnt;
- pci_unlock_rescan_remove();
return 0;
}
@@ -843,10 +844,13 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
{LIST_HEAD_INIT(rmv_data.removed_vf_list), 0};
int devices = 0;
+ pci_lock_rescan_remove();
+
bus = eeh_pe_bus_get(pe);
if (!bus) {
pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
__func__, pe->phb->global_number, pe->addr);
+ pci_unlock_rescan_remove();
return;
}
@@ -1094,10 +1098,15 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true);
eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
- pci_lock_rescan_remove();
- pci_hp_remove_devices(bus);
- pci_unlock_rescan_remove();
+ bus = eeh_pe_bus_get(pe);
+ if (bus)
+ pci_hp_remove_devices(bus);
+ else
+ pr_err("%s: PCI bus for PHB#%x-PE#%x disappeared\n",
+ __func__, pe->phb->global_number, pe->addr);
+
/* The passed PE should no longer be used */
+ pci_unlock_rescan_remove();
return;
}
@@ -1114,6 +1123,8 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
eeh_clear_slot_attention(edev->pdev);
eeh_pe_state_clear(pe, EEH_PE_RECOVERING, true);
+
+ pci_unlock_rescan_remove();
}
/**
@@ -1132,6 +1143,7 @@ void eeh_handle_special_event(void)
unsigned long flags;
int rc;
+ pci_lock_rescan_remove();
do {
rc = eeh_ops->next_error(&pe);
@@ -1171,10 +1183,12 @@ void eeh_handle_special_event(void)
break;
case EEH_NEXT_ERR_NONE:
+ pci_unlock_rescan_remove();
return;
default:
pr_warn("%s: Invalid value %d from next_error()\n",
__func__, rc);
+ pci_unlock_rescan_remove();
return;
}
@@ -1186,7 +1200,9 @@ void eeh_handle_special_event(void)
if (rc == EEH_NEXT_ERR_FROZEN_PE ||
rc == EEH_NEXT_ERR_FENCED_PHB) {
eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
+ pci_unlock_rescan_remove();
eeh_handle_normal_event(pe);
+ pci_lock_rescan_remove();
} else {
eeh_for_each_pe(pe, tmp_pe)
eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev)
@@ -1199,7 +1215,6 @@ void eeh_handle_special_event(void)
eeh_report_failure, NULL);
eeh_set_channel_state(pe, pci_channel_io_perm_failure);
- pci_lock_rescan_remove();
list_for_each_entry(hose, &hose_list, list_node) {
phb_pe = eeh_phb_pe_get(hose);
if (!phb_pe ||
@@ -1218,7 +1233,6 @@ void eeh_handle_special_event(void)
}
pci_hp_remove_devices(bus);
}
- pci_unlock_rescan_remove();
}
/*
@@ -1228,4 +1242,6 @@ void eeh_handle_special_event(void)
if (rc == EEH_NEXT_ERR_DEAD_IOC)
break;
} while (rc != EEH_NEXT_ERR_NONE);
+
+ pci_unlock_rescan_remove();
}
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index d283d281d28e..e740101fadf3 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -671,10 +671,12 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
eeh_ops->write_config(edev, cap + PCI_EXP_LNKCTL, 2, val);
/* Check link */
- if (!edev->pdev->link_active_reporting) {
- eeh_edev_dbg(edev, "No link reporting capability\n");
- msleep(1000);
- return;
+ if (edev->pdev) {
+ if (!edev->pdev->link_active_reporting) {
+ eeh_edev_dbg(edev, "No link reporting capability\n");
+ msleep(1000);
+ return;
+ }
}
/* Wait the link is up until timeout (5s) */
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-6.12] apparmor: fix x_table_lookup when stacking is not the first entry
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (9 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.6] powerpc/eeh: Make EEH driver device hotplug safe Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] dm-table: fix checking for rq stackable devices Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] PCI: pnv_php: Clean up allocated IRQs on unplug Sasha Levin
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable; +Cc: John Johansen, Sasha Levin, apparmor
From: John Johansen <john.johansen@canonical.com>
[ Upstream commit a9eb185be84e998aa9a99c7760534ccc06216705 ]
x_table_lookup currently does stacking during label_parse() if the
target specifies a stack but its only caller ensures that it will
never be used with stacking.
Refactor to slightly simplify the code in x_to_label(), this
also fixes a long standing problem where x_to_labels check on stacking
is only on the first element to the table option list, instead of
the element that is found and used.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
## Bug Fix Analysis
### 1. **Critical Bug Fix**
This fixes a **long-standing bug** in AppArmor's domain transition
handling where stacking directives would fail when they weren't the
first entry in the transition table. The commit message explicitly
states this fixes a "long standing problem" where the stacking check was
only performed on the first element rather than the element actually
found and used.
### 2. **Security Impact**
- **Policy Bypass Risk**: The bug could cause incorrect security
profiles to be applied during domain transitions, potentially allowing
unauthorized access
- **Inconsistent Enforcement**: Applications might not receive the
correct stacked security context, leading to security policy
violations
- **Profile Selection Errors**: Wrong profiles could be selected when
stacking is involved, compromising the intended security boundaries
### 3. **Small and Contained Fix**
The changes are:
- Limited to two functions in a single file
(`security/apparmor/domain.c`)
- Focused on fixing the logic flow without introducing new features
- Clear refactoring that separates concerns properly
### 4. **Low Regression Risk**
- The fix simplifies the code by properly separating lookup from
stacking logic
- Changes from `aa_label_parse()` to `aa_label_merge()` for stacking is
the correct semantic operation
- Proper resource management with `aa_put_label(stack)` prevents memory
leaks
- The refactored loop in `x_table_lookup()` is cleaner and easier to
understand
### 5. **Real-World Impact**
This bug would manifest in production systems where:
- AppArmor profiles use stacking features
- The stacking directive appears after other entries in the transition
table
- Domain transitions are critical for security isolation
### 6. **Clear Problem and Solution**
The commit clearly identifies:
- **Problem**: Stacking check only on first element, not the found
element
- **Solution**: Proper separation of lookup and stacking logic, checking
the actual found entry
### 7. **Maintainer Authorship**
The commit is authored by John Johansen, the AppArmor maintainer,
indicating authoritative understanding of the subsystem and the fix's
importance.
This is exactly the type of commit that stable trees should include: a
clear bug fix for a security-relevant subsystem that could cause policy
enforcement failures in production systems, with minimal code changes
and low regression risk.
security/apparmor/domain.c | 52 +++++++++++++++++++++-----------------
1 file changed, 29 insertions(+), 23 deletions(-)
diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
index 5939bd9a9b9b..08ca9057f82b 100644
--- a/security/apparmor/domain.c
+++ b/security/apparmor/domain.c
@@ -508,6 +508,7 @@ static const char *next_name(int xtype, const char *name)
* @name: returns: name tested to find label (NOT NULL)
*
* Returns: refcounted label, or NULL on failure (MAYBE NULL)
+ * @name will always be set with the last name tried
*/
struct aa_label *x_table_lookup(struct aa_profile *profile, u32 xindex,
const char **name)
@@ -517,6 +518,7 @@ struct aa_label *x_table_lookup(struct aa_profile *profile, u32 xindex,
struct aa_label *label = NULL;
u32 xtype = xindex & AA_X_TYPE_MASK;
int index = xindex & AA_X_INDEX_MASK;
+ const char *next;
AA_BUG(!name);
@@ -524,25 +526,27 @@ struct aa_label *x_table_lookup(struct aa_profile *profile, u32 xindex,
/* TODO: move lookup parsing to unpack time so this is a straight
* index into the resultant label
*/
- for (*name = rules->file->trans.table[index]; !label && *name;
- *name = next_name(xtype, *name)) {
+ for (next = rules->file->trans.table[index]; next;
+ next = next_name(xtype, next)) {
+ const char *lookup = (*next == '&') ? next + 1 : next;
+ *name = next;
if (xindex & AA_X_CHILD) {
- struct aa_profile *new_profile;
- /* release by caller */
- new_profile = aa_find_child(profile, *name);
- if (new_profile)
- label = &new_profile->label;
+ /* TODO: switich to parse to get stack of child */
+ struct aa_profile *new = aa_find_child(profile, lookup);
+
+ if (new)
+ /* release by caller */
+ return &new->label;
continue;
}
- label = aa_label_parse(&profile->label, *name, GFP_KERNEL,
+ label = aa_label_parse(&profile->label, lookup, GFP_KERNEL,
true, false);
- if (IS_ERR(label))
- label = NULL;
+ if (!IS_ERR_OR_NULL(label))
+ /* release by caller */
+ return label;
}
- /* released by caller */
-
- return label;
+ return NULL;
}
/**
@@ -567,9 +571,9 @@ static struct aa_label *x_to_label(struct aa_profile *profile,
struct aa_ruleset *rules = list_first_entry(&profile->rules,
typeof(*rules), list);
struct aa_label *new = NULL;
+ struct aa_label *stack = NULL;
struct aa_ns *ns = profile->ns;
u32 xtype = xindex & AA_X_TYPE_MASK;
- const char *stack = NULL;
switch (xtype) {
case AA_X_NONE:
@@ -578,13 +582,14 @@ static struct aa_label *x_to_label(struct aa_profile *profile,
break;
case AA_X_TABLE:
/* TODO: fix when perm mapping done at unload */
- stack = rules->file->trans.table[xindex & AA_X_INDEX_MASK];
- if (*stack != '&') {
- /* released by caller */
- new = x_table_lookup(profile, xindex, lookupname);
- stack = NULL;
+ /* released by caller
+ * if null for both stack and direct want to try fallback
+ */
+ new = x_table_lookup(profile, xindex, lookupname);
+ if (!new || **lookupname != '&')
break;
- }
+ stack = new;
+ new = NULL;
fallthrough; /* to X_NAME */
case AA_X_NAME:
if (xindex & AA_X_CHILD)
@@ -599,6 +604,7 @@ static struct aa_label *x_to_label(struct aa_profile *profile,
break;
}
+ /* fallback transition check */
if (!new) {
if (xindex & AA_X_INHERIT) {
/* (p|c|n)ix - don't change profile but do
@@ -617,12 +623,12 @@ static struct aa_label *x_to_label(struct aa_profile *profile,
/* base the stack on post domain transition */
struct aa_label *base = new;
- new = aa_label_parse(base, stack, GFP_KERNEL, true, false);
- if (IS_ERR(new))
- new = NULL;
+ new = aa_label_merge(base, stack, GFP_KERNEL);
+ /* null on error */
aa_put_label(base);
}
+ aa_put_label(stack);
/* released by caller */
return new;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-6.1] dm-table: fix checking for rq stackable devices
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (10 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.12] apparmor: fix x_table_lookup when stacking is not the first entry Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] PCI: pnv_php: Clean up allocated IRQs on unplug Sasha Levin
12 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Benjamin Marzinski, Mike Snitzer, Mikulas Patocka, Sasha Levin,
agk, dm-devel
From: Benjamin Marzinski <bmarzins@redhat.com>
[ Upstream commit 8ca719b81987be690f197e82fdb030580c0a07f3 ]
Due to the semantics of iterate_devices(), the current code allows a
request-based dm table as long as it includes one request-stackable
device. It is supposed to only allow tables where there are no
non-request-stackable devices.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the code context, here's my
assessment:
**Backport Status: YES**
This commit fixes a clear logic bug in the device mapper table
validation code that could lead to incorrect device stacking
configurations. Here's the detailed analysis:
**1. The Bug:**
The original code had inverted logic when checking if devices are
request-stackable:
- The function `device_is_rq_stackable()` returned `true` (non-zero) for
stackable devices
- But `iterate_devices()` stops iteration on non-zero returns and
returns that non-zero value
- The check `!ti->type->iterate_devices(ti, device_is_rq_stackable,
NULL)` would pass if ANY device returned true (stackable), not if ALL
devices were stackable
**2. The Fix:**
- Renamed function to `device_is_not_rq_stackable()` to clarify its
purpose
- Inverted the return values: returns `true` for non-stackable devices
- Changed the logic check to `ti->type->iterate_devices(ti,
device_is_not_rq_stackable, NULL)` without negation
- Now correctly fails if ANY device is not stackable
**3. Why This Should Be Backported:**
a) **Fixes a real bug**: The current code allows invalid device stacking
configurations where request-based DM tables could include non-request-
stackable devices, potentially leading to:
- Data corruption
- System crashes
- I/O failures
b) **Small and contained fix**: The change is minimal (4 lines changed)
and only affects the validation logic without changing any core
functionality
c) **Clear semantics**: The fix makes the code's intent clearer by
renaming the function to match what it actually checks
d) **No architectural changes**: This is purely a bug fix that corrects
validation logic
e) **Critical subsystem**: Device mapper is a critical component used in
production environments for LVM, encryption, and other storage features
f) **Follows stable rules**: This is exactly the type of fix stable
kernels want - important bug fixes with minimal risk
The commit message clearly explains the bug: "the current code allows a
request-based dm table as long as it includes one request-stackable
device. It is supposed to only allow tables where there are no non-
request-stackable devices." This could lead to serious issues in
production systems using device mapper.
drivers/md/dm-table.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 24a857ff6d0b..79ba4bacd0f9 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -899,17 +899,17 @@ static bool dm_table_supports_dax(struct dm_table *t,
return true;
}
-static int device_is_rq_stackable(struct dm_target *ti, struct dm_dev *dev,
- sector_t start, sector_t len, void *data)
+static int device_is_not_rq_stackable(struct dm_target *ti, struct dm_dev *dev,
+ sector_t start, sector_t len, void *data)
{
struct block_device *bdev = dev->bdev;
struct request_queue *q = bdev_get_queue(bdev);
/* request-based cannot stack on partitions! */
if (bdev_is_partition(bdev))
- return false;
+ return true;
- return queue_is_mq(q);
+ return !queue_is_mq(q);
}
static int dm_table_determine_type(struct dm_table *t)
@@ -1005,7 +1005,7 @@ static int dm_table_determine_type(struct dm_table *t)
/* Non-request-stackable devices can't be used for request-based dm */
if (!ti->type->iterate_devices ||
- !ti->type->iterate_devices(ti, device_is_rq_stackable, NULL)) {
+ ti->type->iterate_devices(ti, device_is_not_rq_stackable, NULL)) {
DMERR("table load rejected: including non-request-stackable devices");
return -EINVAL;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH AUTOSEL 6.16-5.10] PCI: pnv_php: Clean up allocated IRQs on unplug
2025-08-08 15:30 [PATCH AUTOSEL 6.16-6.6] apparmor: shift ouid when mediating hard links in userns Sasha Levin
` (11 preceding siblings ...)
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-6.1] dm-table: fix checking for rq stackable devices Sasha Levin
@ 2025-08-08 15:30 ` Sasha Levin
2025-08-08 15:59 ` Timothy Pearson
12 siblings, 1 reply; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 15:30 UTC (permalink / raw)
To: patches, stable
Cc: Timothy Pearson, Shawn Anastasio, Bjorn Helgaas,
Madhavan Srinivasan, Sasha Levin, mpe, linuxppc-dev
From: Timothy Pearson <tpearson@raptorengineering.com>
[ Upstream commit 4668619092554e1b95c9a5ac2941ca47ba6d548a ]
When the root of a nested PCIe bridge configuration is unplugged, the
pnv_php driver leaked the allocated IRQ resources for the child bridges'
hotplug event notifications, resulting in a panic.
Fix this by walking all child buses and deallocating all its IRQ resources
before calling pci_hp_remove_devices().
Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
that it is only destroyed in pnv_php_free_slot(), instead of
pnv_php_disable_irq(). This is required since pnv_php_disable_irq() will
now be called by workers triggered by hot unplug interrupts, so the
workqueue needs to stay allocated.
The abridged kernel panic that occurs without this patch is as follows:
WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c
CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
Call Trace:
msi_device_data_release+0x34/0x9c (unreliable)
release_nodes+0x64/0x13c
devres_release_all+0xc0/0x140
device_del+0x2d4/0x46c
pci_destroy_dev+0x5c/0x194
pci_hp_remove_devices+0x90/0x128
pci_hp_remove_devices+0x44/0x128
pnv_php_disable_slot+0x54/0xd4
power_write_file+0xf8/0x18c
pci_slot_attr_store+0x40/0x5c
sysfs_kf_write+0x64/0x78
kernfs_fop_write_iter+0x1b0/0x290
vfs_write+0x3bc/0x50c
ksys_write+0x84/0x140
system_call_exception+0x124/0x230
system_call_vectored_common+0x15c/0x2ec
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[bhelgaas: tidy comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/2013845045.1359852.1752615367790.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a Real Bug with Kernel Panic**: The commit explicitly fixes a
kernel panic (WARNING and crash) that occurs when unplugging nested
PCIe bridge configurations. The panic trace shows it happens in
`msi_device_data_release()` during hotplug operations, which is a
serious stability issue.
2. **Clear Resource Leak Fix**: The commit addresses a specific IRQ
resource leak where child bridge IRQ resources were not being
properly cleaned up during hot unplug operations. This is a real bug
that affects system stability and resource management.
3. **Small and Contained Fix**: The changes are well-contained within
the PowerNV PHP (PCI Hotplug) driver. The modifications include:
- Adding a new `disable_msi` parameter to `pnv_php_disable_irq()`
- Moving workqueue destruction to `pnv_php_free_slot()`
- Adding functions to recursively disable IRQs for child slots
- Re-enabling IRQ on slot enable for surprise hotplug support
4. **Meets Stable Criteria**:
- Fixes a real bug (kernel panic/crash)
- Is obviously correct - properly cleans up resources
- Has clear testing evidence (fixes the described panic)
- Size is reasonable (under 100 lines of actual changes)
- Doesn't introduce new features or architectural changes
5. **Critical for PowerNV Platform Users**: This affects IBM PowerNV
platforms using PCIe hotplug functionality, where nested bridge
configurations can cause system crashes without this fix.
6. **Well-Documented Problem**: The commit message provides a clear
description of the issue, including the kernel panic trace, making it
easy to verify the problem and solution.
The fix follows proper cleanup patterns by ensuring IRQ resources are
freed before removing devices, and properly manages the lifetime of the
workqueue to avoid use-after-free scenarios. This is exactly the type of
bug fix that stable kernels should include to maintain system stability.
drivers/pci/hotplug/pnv_php.c | 96 ++++++++++++++++++++++++++++-------
1 file changed, 77 insertions(+), 19 deletions(-)
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index aaa8e0a2a893..5476c9e7760d 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -3,6 +3,7 @@
* PCI Hotplug Driver for PowerPC PowerNV platform.
*
* Copyright Gavin Shan, IBM Corporation 2016.
+ * Copyright (C) 2025 Raptor Engineering, LLC
*/
#include <linux/bitfield.h>
@@ -36,8 +37,10 @@ static void pnv_php_register(struct device_node *dn);
static void pnv_php_unregister_one(struct device_node *dn);
static void pnv_php_unregister(struct device_node *dn);
+static void pnv_php_enable_irq(struct pnv_php_slot *php_slot);
+
static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
- bool disable_device)
+ bool disable_device, bool disable_msi)
{
struct pci_dev *pdev = php_slot->pdev;
u16 ctrl;
@@ -53,19 +56,15 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
php_slot->irq = 0;
}
- if (php_slot->wq) {
- destroy_workqueue(php_slot->wq);
- php_slot->wq = NULL;
- }
-
- if (disable_device) {
+ if (disable_device || disable_msi) {
if (pdev->msix_enabled)
pci_disable_msix(pdev);
else if (pdev->msi_enabled)
pci_disable_msi(pdev);
+ }
+ if (disable_device)
pci_disable_device(pdev);
- }
}
static void pnv_php_free_slot(struct kref *kref)
@@ -74,7 +73,8 @@ static void pnv_php_free_slot(struct kref *kref)
struct pnv_php_slot, kref);
WARN_ON(!list_empty(&php_slot->children));
- pnv_php_disable_irq(php_slot, false);
+ pnv_php_disable_irq(php_slot, false, false);
+ destroy_workqueue(php_slot->wq);
kfree(php_slot->name);
kfree(php_slot);
}
@@ -588,8 +588,58 @@ static int pnv_php_reset_slot(struct hotplug_slot *slot, bool probe)
static int pnv_php_enable_slot(struct hotplug_slot *slot)
{
struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
+ u32 prop32;
+ int ret;
+
+ ret = pnv_php_enable(php_slot, true);
+ if (ret)
+ return ret;
+
+ /* (Re-)enable interrupt if the slot supports surprise hotplug */
+ ret = of_property_read_u32(php_slot->dn, "ibm,slot-surprise-pluggable",
+ &prop32);
+ if (!ret && prop32)
+ pnv_php_enable_irq(php_slot);
- return pnv_php_enable(php_slot, true);
+ return 0;
+}
+
+/*
+ * Disable any hotplug interrupts for all slots on the provided bus, as well as
+ * all downstream slots in preparation for a hot unplug.
+ */
+static int pnv_php_disable_all_irqs(struct pci_bus *bus)
+{
+ struct pci_bus *child_bus;
+ struct pci_slot *slot;
+
+ /* First go down child buses */
+ list_for_each_entry(child_bus, &bus->children, node)
+ pnv_php_disable_all_irqs(child_bus);
+
+ /* Disable IRQs for all pnv_php slots on this bus */
+ list_for_each_entry(slot, &bus->slots, list) {
+ struct pnv_php_slot *php_slot = to_pnv_php_slot(slot->hotplug);
+
+ pnv_php_disable_irq(php_slot, false, true);
+ }
+
+ return 0;
+}
+
+/*
+ * Disable any hotplug interrupts for all downstream slots on the provided
+ * bus in preparation for a hot unplug.
+ */
+static int pnv_php_disable_all_downstream_irqs(struct pci_bus *bus)
+{
+ struct pci_bus *child_bus;
+
+ /* Go down child buses, recursively deactivating their IRQs */
+ list_for_each_entry(child_bus, &bus->children, node)
+ pnv_php_disable_all_irqs(child_bus);
+
+ return 0;
}
static int pnv_php_disable_slot(struct hotplug_slot *slot)
@@ -606,6 +656,13 @@ static int pnv_php_disable_slot(struct hotplug_slot *slot)
php_slot->state != PNV_PHP_STATE_REGISTERED)
return 0;
+ /*
+ * Free all IRQ resources from all child slots before remove.
+ * Note that we do not disable the root slot IRQ here as that
+ * would also deactivate the slot hot (re)plug interrupt!
+ */
+ pnv_php_disable_all_downstream_irqs(php_slot->bus);
+
/* Remove all devices behind the slot */
pci_lock_rescan_remove();
pci_hp_remove_devices(php_slot->bus);
@@ -674,6 +731,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct device_node *dn)
return NULL;
}
+ /* Allocate workqueue for this slot's interrupt handling */
+ php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
+ if (!php_slot->wq) {
+ SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
+ kfree(php_slot->name);
+ kfree(php_slot);
+ return NULL;
+ }
+
if (dn->child && PCI_DN(dn->child))
php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
else
@@ -870,14 +936,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
u16 sts, ctrl;
int ret;
- /* Allocate workqueue */
- php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
- if (!php_slot->wq) {
- SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
- pnv_php_disable_irq(php_slot, true);
- return;
- }
-
/* Check PDC (Presence Detection Change) is broken or not */
ret = of_property_read_u32(php_slot->dn, "ibm,slot-broken-pdc",
&broken_pdc);
@@ -896,7 +954,7 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
ret = request_irq(irq, pnv_php_interrupt, IRQF_SHARED,
php_slot->name, php_slot);
if (ret) {
- pnv_php_disable_irq(php_slot, true);
+ pnv_php_disable_irq(php_slot, true, true);
SLOT_WARN(php_slot, "Error %d enabling IRQ %d\n", ret, irq);
return;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH AUTOSEL 6.16-5.10] PCI: pnv_php: Clean up allocated IRQs on unplug
2025-08-08 15:30 ` [PATCH AUTOSEL 6.16-5.10] PCI: pnv_php: Clean up allocated IRQs on unplug Sasha Levin
@ 2025-08-08 15:59 ` Timothy Pearson
2025-08-08 17:04 ` Sasha Levin
0 siblings, 1 reply; 18+ messages in thread
From: Timothy Pearson @ 2025-08-08 15:59 UTC (permalink / raw)
To: Sasha Levin
Cc: patches, stable, Timothy Pearson, Shawn Anastasio, Bjorn Helgaas,
Madhavan Srinivasan, Michael Ellerman, linuxppc-dev
While the autoselector has the right idea, in that this is critical functionality that currently induces panics on older stable kernels, this entire patch series should be backported, not just these two isolated patches.
The correct series for backport would be:
PCI: pnv_php: Fix surprise plug detection and recovery
powerpc/eeh: Make EEH driver device hotplug safe
powerpc/eeh: Export eeh_unfreeze_pe()
PCI: pnv_php: Work around switches with broken presence detection
PCI: pnv_php: Clean up allocated IRQs on unplug
Backport is especially critical for Debian, so that we don't ship a broken kernel with the soon to be released Trixie version.
Thanks!
----- Original Message -----
> From: "Sasha Levin" <sashal@kernel.org>
> To: patches@lists.linux.dev, stable@vger.kernel.org
> Cc: "Timothy Pearson" <tpearson@raptorengineering.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>, "Bjorn
> Helgaas" <bhelgaas@google.com>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Sasha Levin" <sashal@kernel.org>,
> "Michael Ellerman" <mpe@ellerman.id.au>, "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>
> Sent: Friday, August 8, 2025 10:30:54 AM
> Subject: [PATCH AUTOSEL 6.16-5.10] PCI: pnv_php: Clean up allocated IRQs on unplug
> From: Timothy Pearson <tpearson@raptorengineering.com>
>
> [ Upstream commit 4668619092554e1b95c9a5ac2941ca47ba6d548a ]
>
> When the root of a nested PCIe bridge configuration is unplugged, the
> pnv_php driver leaked the allocated IRQ resources for the child bridges'
> hotplug event notifications, resulting in a panic.
>
> Fix this by walking all child buses and deallocating all its IRQ resources
> before calling pci_hp_remove_devices().
>
> Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
> that it is only destroyed in pnv_php_free_slot(), instead of
> pnv_php_disable_irq(). This is required since pnv_php_disable_irq() will
> now be called by workers triggered by hot unplug interrupts, so the
> workqueue needs to stay allocated.
>
> The abridged kernel panic that occurs without this patch is as follows:
>
> WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292
> msi_device_data_release+0x6c/0x9c
> CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
> Call Trace:
> msi_device_data_release+0x34/0x9c (unreliable)
> release_nodes+0x64/0x13c
> devres_release_all+0xc0/0x140
> device_del+0x2d4/0x46c
> pci_destroy_dev+0x5c/0x194
> pci_hp_remove_devices+0x90/0x128
> pci_hp_remove_devices+0x44/0x128
> pnv_php_disable_slot+0x54/0xd4
> power_write_file+0xf8/0x18c
> pci_slot_attr_store+0x40/0x5c
> sysfs_kf_write+0x64/0x78
> kernfs_fop_write_iter+0x1b0/0x290
> vfs_write+0x3bc/0x50c
> ksys_write+0x84/0x140
> system_call_exception+0x124/0x230
> system_call_vectored_common+0x15c/0x2ec
>
> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
> [bhelgaas: tidy comments]
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> Link:
> https://patch.msgid.link/2013845045.1359852.1752615367790.JavaMail.zimbra@raptorengineeringinc.com
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>
> LLM Generated explanations, may be completely bogus:
>
> Based on my analysis of the commit, here is my assessment:
>
> **Backport Status: YES**
>
> This commit should be backported to stable kernel trees for the
> following reasons:
>
> 1. **Fixes a Real Bug with Kernel Panic**: The commit explicitly fixes a
> kernel panic (WARNING and crash) that occurs when unplugging nested
> PCIe bridge configurations. The panic trace shows it happens in
> `msi_device_data_release()` during hotplug operations, which is a
> serious stability issue.
>
> 2. **Clear Resource Leak Fix**: The commit addresses a specific IRQ
> resource leak where child bridge IRQ resources were not being
> properly cleaned up during hot unplug operations. This is a real bug
> that affects system stability and resource management.
>
> 3. **Small and Contained Fix**: The changes are well-contained within
> the PowerNV PHP (PCI Hotplug) driver. The modifications include:
> - Adding a new `disable_msi` parameter to `pnv_php_disable_irq()`
> - Moving workqueue destruction to `pnv_php_free_slot()`
> - Adding functions to recursively disable IRQs for child slots
> - Re-enabling IRQ on slot enable for surprise hotplug support
>
> 4. **Meets Stable Criteria**:
> - Fixes a real bug (kernel panic/crash)
> - Is obviously correct - properly cleans up resources
> - Has clear testing evidence (fixes the described panic)
> - Size is reasonable (under 100 lines of actual changes)
> - Doesn't introduce new features or architectural changes
>
> 5. **Critical for PowerNV Platform Users**: This affects IBM PowerNV
> platforms using PCIe hotplug functionality, where nested bridge
> configurations can cause system crashes without this fix.
>
> 6. **Well-Documented Problem**: The commit message provides a clear
> description of the issue, including the kernel panic trace, making it
> easy to verify the problem and solution.
>
> The fix follows proper cleanup patterns by ensuring IRQ resources are
> freed before removing devices, and properly manages the lifetime of the
> workqueue to avoid use-after-free scenarios. This is exactly the type of
> bug fix that stable kernels should include to maintain system stability.
>
> drivers/pci/hotplug/pnv_php.c | 96 ++++++++++++++++++++++++++++-------
> 1 file changed, 77 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index aaa8e0a2a893..5476c9e7760d 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -3,6 +3,7 @@
> * PCI Hotplug Driver for PowerPC PowerNV platform.
> *
> * Copyright Gavin Shan, IBM Corporation 2016.
> + * Copyright (C) 2025 Raptor Engineering, LLC
> */
>
> #include <linux/bitfield.h>
> @@ -36,8 +37,10 @@ static void pnv_php_register(struct device_node *dn);
> static void pnv_php_unregister_one(struct device_node *dn);
> static void pnv_php_unregister(struct device_node *dn);
>
> +static void pnv_php_enable_irq(struct pnv_php_slot *php_slot);
> +
> static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
> - bool disable_device)
> + bool disable_device, bool disable_msi)
> {
> struct pci_dev *pdev = php_slot->pdev;
> u16 ctrl;
> @@ -53,19 +56,15 @@ static void pnv_php_disable_irq(struct pnv_php_slot
> *php_slot,
> php_slot->irq = 0;
> }
>
> - if (php_slot->wq) {
> - destroy_workqueue(php_slot->wq);
> - php_slot->wq = NULL;
> - }
> -
> - if (disable_device) {
> + if (disable_device || disable_msi) {
> if (pdev->msix_enabled)
> pci_disable_msix(pdev);
> else if (pdev->msi_enabled)
> pci_disable_msi(pdev);
> + }
>
> + if (disable_device)
> pci_disable_device(pdev);
> - }
> }
>
> static void pnv_php_free_slot(struct kref *kref)
> @@ -74,7 +73,8 @@ static void pnv_php_free_slot(struct kref *kref)
> struct pnv_php_slot, kref);
>
> WARN_ON(!list_empty(&php_slot->children));
> - pnv_php_disable_irq(php_slot, false);
> + pnv_php_disable_irq(php_slot, false, false);
> + destroy_workqueue(php_slot->wq);
> kfree(php_slot->name);
> kfree(php_slot);
> }
> @@ -588,8 +588,58 @@ static int pnv_php_reset_slot(struct hotplug_slot *slot,
> bool probe)
> static int pnv_php_enable_slot(struct hotplug_slot *slot)
> {
> struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
> + u32 prop32;
> + int ret;
> +
> + ret = pnv_php_enable(php_slot, true);
> + if (ret)
> + return ret;
> +
> + /* (Re-)enable interrupt if the slot supports surprise hotplug */
> + ret = of_property_read_u32(php_slot->dn, "ibm,slot-surprise-pluggable",
> + &prop32);
> + if (!ret && prop32)
> + pnv_php_enable_irq(php_slot);
>
> - return pnv_php_enable(php_slot, true);
> + return 0;
> +}
> +
> +/*
> + * Disable any hotplug interrupts for all slots on the provided bus, as well as
> + * all downstream slots in preparation for a hot unplug.
> + */
> +static int pnv_php_disable_all_irqs(struct pci_bus *bus)
> +{
> + struct pci_bus *child_bus;
> + struct pci_slot *slot;
> +
> + /* First go down child buses */
> + list_for_each_entry(child_bus, &bus->children, node)
> + pnv_php_disable_all_irqs(child_bus);
> +
> + /* Disable IRQs for all pnv_php slots on this bus */
> + list_for_each_entry(slot, &bus->slots, list) {
> + struct pnv_php_slot *php_slot = to_pnv_php_slot(slot->hotplug);
> +
> + pnv_php_disable_irq(php_slot, false, true);
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * Disable any hotplug interrupts for all downstream slots on the provided
> + * bus in preparation for a hot unplug.
> + */
> +static int pnv_php_disable_all_downstream_irqs(struct pci_bus *bus)
> +{
> + struct pci_bus *child_bus;
> +
> + /* Go down child buses, recursively deactivating their IRQs */
> + list_for_each_entry(child_bus, &bus->children, node)
> + pnv_php_disable_all_irqs(child_bus);
> +
> + return 0;
> }
>
> static int pnv_php_disable_slot(struct hotplug_slot *slot)
> @@ -606,6 +656,13 @@ static int pnv_php_disable_slot(struct hotplug_slot *slot)
> php_slot->state != PNV_PHP_STATE_REGISTERED)
> return 0;
>
> + /*
> + * Free all IRQ resources from all child slots before remove.
> + * Note that we do not disable the root slot IRQ here as that
> + * would also deactivate the slot hot (re)plug interrupt!
> + */
> + pnv_php_disable_all_downstream_irqs(php_slot->bus);
> +
> /* Remove all devices behind the slot */
> pci_lock_rescan_remove();
> pci_hp_remove_devices(php_slot->bus);
> @@ -674,6 +731,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct
> device_node *dn)
> return NULL;
> }
>
> + /* Allocate workqueue for this slot's interrupt handling */
> + php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
> + if (!php_slot->wq) {
> + SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
> + kfree(php_slot->name);
> + kfree(php_slot);
> + return NULL;
> + }
> +
> if (dn->child && PCI_DN(dn->child))
> php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
> else
> @@ -870,14 +936,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot,
> int irq)
> u16 sts, ctrl;
> int ret;
>
> - /* Allocate workqueue */
> - php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
> - if (!php_slot->wq) {
> - SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
> - pnv_php_disable_irq(php_slot, true);
> - return;
> - }
> -
> /* Check PDC (Presence Detection Change) is broken or not */
> ret = of_property_read_u32(php_slot->dn, "ibm,slot-broken-pdc",
> &broken_pdc);
> @@ -896,7 +954,7 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot,
> int irq)
> ret = request_irq(irq, pnv_php_interrupt, IRQF_SHARED,
> php_slot->name, php_slot);
> if (ret) {
> - pnv_php_disable_irq(php_slot, true);
> + pnv_php_disable_irq(php_slot, true, true);
> SLOT_WARN(php_slot, "Error %d enabling IRQ %d\n", ret, irq);
> return;
> }
> --
> 2.39.5
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH AUTOSEL 6.16-5.10] PCI: pnv_php: Clean up allocated IRQs on unplug
2025-08-08 15:59 ` Timothy Pearson
@ 2025-08-08 17:04 ` Sasha Levin
0 siblings, 0 replies; 18+ messages in thread
From: Sasha Levin @ 2025-08-08 17:04 UTC (permalink / raw)
To: Timothy Pearson
Cc: patches, stable, Shawn Anastasio, Bjorn Helgaas,
Madhavan Srinivasan, Michael Ellerman, linuxppc-dev
On Fri, Aug 08, 2025 at 10:59:00AM -0500, Timothy Pearson wrote:
>While the autoselector has the right idea, in that this is critical functionality that currently induces panics on older stable kernels, this entire patch series should be backported, not just these two isolated patches.
>
>The correct series for backport would be:
>
> PCI: pnv_php: Fix surprise plug detection and recovery
> powerpc/eeh: Make EEH driver device hotplug safe
> powerpc/eeh: Export eeh_unfreeze_pe()
> PCI: pnv_php: Work around switches with broken presence detection
> PCI: pnv_php: Clean up allocated IRQs on unplug
>
>Backport is especially critical for Debian, so that we don't ship a broken kernel with the soon to be released Trixie version.
Okay, I'll just queue these up directly. Thanks!
--
Thanks,
Sasha
^ permalink raw reply [flat|nested] 18+ messages in thread