Linux-ARM-Kernel Archive on lore.kernel.org
* Re: [PATCH 0/3] arm-smmu-v3: Add PMCG child support and update PMU MMIO mapping
From: Robin Murphy @ 2026-04-14  9:32 UTC (permalink / raw)
  To: Peng Fan
  Cc: Will Deacon, Joerg Roedel, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Mark Rutland, linux-arm-kernel, iommu, devicetree,
	linux-kernel, linux-perf-users, Peng Fan
In-Reply-To: <ad3w/P1vA2uKsV/o@shlinux89>

On 2026-04-14 8:47 am, Peng Fan wrote:
> Hi Robin,
> 
> On Fri, Apr 10, 2026 at 01:07:29PM +0100, Robin Murphy wrote:
>> On 08/04/2026 2:47 pm, Peng Fan wrote:
>>> On Wed, Apr 08, 2026 at 12:15:31PM +0100, Robin Murphy wrote:
>>>> On 2026-04-08 8:51 am, Peng Fan (OSS) wrote:
>>>>> This patch series adds proper support for describing and probing the
>>>>> Arm SMMU v3 PMCG (Performance Monitor Control Group) as a child node of
>>>>> the SMMU in Devicetree, and updates the relevant drivers accordingly.
>>>>>
>>>>> The SMMU v3 architecture allows an optional PMCG block, typically
>>>>> associated with TCUs, to be implemented within the SMMU register
>>>>> address space. For example, mmu700 PMCG is at the offset 0x2000 of the
>>>>> TCU page 0.
>>>>
>>>> But what's wrong with the existing binding? Especially given that it even has
>>>> an upstream user already:
>>>>
>>>> https://git.kernel.org/torvalds/c/aef9703dcbf8
>>>>
>>>>> Patch 1 updates the SMMU v3 Devicetree binding to allow PMCG child nodes,
>>>>> referencing the existing arm,smmu-v3-pmcg binding.
>>>>>
>>>>> Patch 2 updates the arm-smmu-v3 driver to populate platform devices for
>>>>> child nodes described in DT once the SMMU probe succeeds.
>>>>>
>>>>> Patch 3 updates the SMMUv3 PMU driver to correctly handle MMIO mapping when
>>>>> PMCG is described as a child node. The PMCG registers occupy a sub-region
>>>>> of the parent SMMU MMIO window, which is already requested by the SMMU
>>>>
>>>> That has not been the case since 52f3fab0067d ("iommu/arm-smmu-v3: Don't
>>>> reserve implementation defined register space") nearly 6 years ago, where the
>>>> whole purpose was to support Arm's PMCG implementation properly. What kernel
>>>> is this based on?
>>>
>>> Seems I am wrong. I thought PMCG is in page 0, so there were resource
>>> conflicts. I just retest without this patchset, all goes well.
>>>
>>> But from dt perspective, should the TCU PMCG node be child node of
>>> SMMU node?
>>
>> No. PMCGs can be used entirely independently of the SMMU itself, and while
>> most of the events do relate to SMMU translation and thus aren't necessarily
>> meaningful if it's not in use, there are still some which can be useful for
>> basic traffic counting, monitoring GPT/translation activity from _other_
>> security states (if observation is delegated to Non-Secure) and possibly
>> other things, even if the "main" Non-Secure SMMU interface isn't advertised
>> at all. It would be unreasonable to require the SMMU node to be present and
>> enabled *and* have a driver to populate PMCGs, to monitor events which are
>> outside the scope of that driver.
> 
> Thanks for explaining this in detail.
> 
> Just have one more question, we are using mmu-700, but MMU-700 implementation
> defined TCU and TBU events are not supported.
> 
> Should we introduce a compatible string saying "arm,mmu700-tcu-pmcg" or
> "arm,mmu700-tbu-pmcg"? TBH, I have not checked MMU600(AE) or else.

MMU-700 and all other Arm implementations are still fully compatible 
with "arm,mmu-600-pmcg" in terms of what that means. That lets the 
driver correctly construct the "identifier" attribute, which then allows 
userspace to know what exact PMU implementation it is.

We don't maintain ever-growing lists of aliases for imp-def events in 
the kernel driver, same as we don't for CPU PMUs either. Generally, 
anyone who has reason to go near those is likely to already have the TRM 
to hand and thus have the encodings anyway, but I suppose you could add 
jevents with the proper meaningful descriptions if you really wanted to.

Thanks,
Robin.



* [PATCH 3/3] iio: adc: xilinx-ams: refactor alarm mapping to table-driven approach
From: Guilherme Ivo Bozi @ 2026-04-14  9:29 UTC (permalink / raw)
  To: Salih Erim, Conall O'Griofa, Jonathan Cameron, Michal Simek
  Cc: David Lechner, Nuno Sá, Andy Shevchenko, linux-iio,
	linux-arm-kernel, linux-kernel, Guilherme Ivo Bozi
In-Reply-To: <20260414093018.7153-1-guilherme.bozi@usp.br>

Replace multiple open-coded switch statements that map between
scan_index, alarm bits, and register offsets with a centralized
table-driven approach.

Introduce a struct-based alarm_map to describe the relationship
between scan indices and alarm offsets, and add a helper to
translate scan_index to event IDs. This removes duplicated logic
across ams_get_alarm_offset(), ams_event_to_channel(), and
ams_get_alarm_mask().

The new approach improves maintainability, reduces code size,
and makes it easier to extend or modify alarm mappings in the
future, while preserving existing behavior.

Signed-off-by: Guilherme Ivo Bozi <guilherme.bozi@usp.br>
---
 drivers/iio/adc/xilinx-ams.c | 163 +++++++++++++----------------------
 1 file changed, 60 insertions(+), 103 deletions(-)

diff --git a/drivers/iio/adc/xilinx-ams.c b/drivers/iio/adc/xilinx-ams.c
index 1d84310b61a9..d5ed2e787641 100644
--- a/drivers/iio/adc/xilinx-ams.c
+++ b/drivers/iio/adc/xilinx-ams.c
@@ -102,6 +102,7 @@
 #define AMS_PS_SEQ_MASK			GENMASK(21, 0)
 #define AMS_PL_SEQ_MASK			GENMASK_ULL(59, 22)
 
+#define AMS_ALARM_INVALID		-1
 #define AMS_ALARM_TEMP			0x140
 #define AMS_ALARM_SUPPLY1		0x144
 #define AMS_ALARM_SUPPLY2		0x148
@@ -763,9 +764,51 @@ static int ams_read_raw(struct iio_dev *indio_dev,
 	}
 }
 
+struct ams_alarm_map {
+	enum ams_ps_pl_seq scan_index;
+	int base_offset;
+};
+
+/*
+ * Array index matches enum ams_alarm_bit.
+ * Entries with base_offset == AMS_ALARM_INVALID are unused/invalid
+ * (e.g. RESERVED) and must be skipped.
+ */
+static const struct ams_alarm_map alarm_map[] = {
+	[AMS_ALARM_BIT_TEMP] = { AMS_SEQ_TEMP, AMS_ALARM_TEMP },
+	[AMS_ALARM_BIT_SUPPLY1] = { AMS_SEQ_SUPPLY1, AMS_ALARM_SUPPLY1 },
+	[AMS_ALARM_BIT_SUPPLY2] = { AMS_SEQ_SUPPLY2, AMS_ALARM_SUPPLY2 },
+	[AMS_ALARM_BIT_SUPPLY3] = { AMS_SEQ_SUPPLY3, AMS_ALARM_SUPPLY3 },
+	[AMS_ALARM_BIT_SUPPLY4] = { AMS_SEQ_SUPPLY4, AMS_ALARM_SUPPLY4 },
+	[AMS_ALARM_BIT_SUPPLY5] = { AMS_SEQ_SUPPLY5, AMS_ALARM_SUPPLY5 },
+	[AMS_ALARM_BIT_SUPPLY6] = { AMS_SEQ_SUPPLY6, AMS_ALARM_SUPPLY6 },
+	[AMS_ALARM_BIT_RESERVED] = { 0, AMS_ALARM_INVALID },
+	[AMS_ALARM_BIT_SUPPLY7] = { AMS_SEQ_SUPPLY7, AMS_ALARM_SUPPLY7 },
+	[AMS_ALARM_BIT_SUPPLY8] = { AMS_SEQ_SUPPLY8, AMS_ALARM_SUPPLY8 },
+	[AMS_ALARM_BIT_SUPPLY9] = { AMS_SEQ_SUPPLY9, AMS_ALARM_SUPPLY9 },
+	[AMS_ALARM_BIT_SUPPLY10] = { AMS_SEQ_SUPPLY10, AMS_ALARM_SUPPLY10 },
+	[AMS_ALARM_BIT_VCCAMS] = { AMS_SEQ_VCCAMS, AMS_ALARM_VCCAMS },
+	[AMS_ALARM_BIT_TEMP_REMOTE] = { AMS_SEQ_TEMP_REMOTE, AMS_ALARM_TEMP_REMOTE }
+};
+
+static int ams_scan_index_to_event(int scan_index)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(alarm_map); i++) {
+		if (alarm_map[i].base_offset == AMS_ALARM_INVALID)
+			continue;
+
+		if (alarm_map[i].scan_index == scan_index)
+			return i;
+	}
+
+	return -EINVAL;
+}
+
 static int ams_get_alarm_offset(int scan_index, enum iio_event_direction dir)
 {
-	int offset;
+	int offset, event;
 
 	if (scan_index >= AMS_PS_SEQ_MAX)
 		scan_index -= AMS_PS_SEQ_MAX;
@@ -779,36 +822,11 @@ static int ams_get_alarm_offset(int scan_index, enum iio_event_direction dir)
 		offset = 0;
 	}
 
-	switch (scan_index) {
-	case AMS_SEQ_TEMP:
-		return AMS_ALARM_TEMP + offset;
-	case AMS_SEQ_SUPPLY1:
-		return AMS_ALARM_SUPPLY1 + offset;
-	case AMS_SEQ_SUPPLY2:
-		return AMS_ALARM_SUPPLY2 + offset;
-	case AMS_SEQ_SUPPLY3:
-		return AMS_ALARM_SUPPLY3 + offset;
-	case AMS_SEQ_SUPPLY4:
-		return AMS_ALARM_SUPPLY4 + offset;
-	case AMS_SEQ_SUPPLY5:
-		return AMS_ALARM_SUPPLY5 + offset;
-	case AMS_SEQ_SUPPLY6:
-		return AMS_ALARM_SUPPLY6 + offset;
-	case AMS_SEQ_SUPPLY7:
-		return AMS_ALARM_SUPPLY7 + offset;
-	case AMS_SEQ_SUPPLY8:
-		return AMS_ALARM_SUPPLY8 + offset;
-	case AMS_SEQ_SUPPLY9:
-		return AMS_ALARM_SUPPLY9 + offset;
-	case AMS_SEQ_SUPPLY10:
-		return AMS_ALARM_SUPPLY10 + offset;
-	case AMS_SEQ_VCCAMS:
-		return AMS_ALARM_VCCAMS + offset;
-	case AMS_SEQ_TEMP_REMOTE:
-		return AMS_ALARM_TEMP_REMOTE + offset;
-	default:
+	event = ams_scan_index_to_event(scan_index);
+	if (event < 0 || alarm_map[event].base_offset == AMS_ALARM_INVALID)
 		return 0;
-	}
+
+	return alarm_map[event].base_offset + offset;
 }
 
 static const struct iio_chan_spec *ams_event_to_channel(struct iio_dev *dev,
@@ -821,49 +839,13 @@ static const struct iio_chan_spec *ams_event_to_channel(struct iio_dev *dev,
 		scan_index = AMS_PS_SEQ_MAX;
 	}
 
-	switch (event) {
-	case AMS_ALARM_BIT_TEMP:
-		scan_index += AMS_SEQ_TEMP;
-		break;
-	case AMS_ALARM_BIT_SUPPLY1:
-		scan_index += AMS_SEQ_SUPPLY1;
-		break;
-	case AMS_ALARM_BIT_SUPPLY2:
-		scan_index += AMS_SEQ_SUPPLY2;
-		break;
-	case AMS_ALARM_BIT_SUPPLY3:
-		scan_index += AMS_SEQ_SUPPLY3;
-		break;
-	case AMS_ALARM_BIT_SUPPLY4:
-		scan_index += AMS_SEQ_SUPPLY4;
-		break;
-	case AMS_ALARM_BIT_SUPPLY5:
-		scan_index += AMS_SEQ_SUPPLY5;
-		break;
-	case AMS_ALARM_BIT_SUPPLY6:
-		scan_index += AMS_SEQ_SUPPLY6;
-		break;
-	case AMS_ALARM_BIT_SUPPLY7:
-		scan_index += AMS_SEQ_SUPPLY7;
-		break;
-	case AMS_ALARM_BIT_SUPPLY8:
-		scan_index += AMS_SEQ_SUPPLY8;
-		break;
-	case AMS_ALARM_BIT_SUPPLY9:
-		scan_index += AMS_SEQ_SUPPLY9;
-		break;
-	case AMS_ALARM_BIT_SUPPLY10:
-		scan_index += AMS_SEQ_SUPPLY10;
-		break;
-	case AMS_ALARM_BIT_VCCAMS:
-		scan_index += AMS_SEQ_VCCAMS;
-		break;
-	case AMS_ALARM_BIT_TEMP_REMOTE:
-		scan_index += AMS_SEQ_TEMP_REMOTE;
-		break;
-	default:
-		break;
-	}
+	if (event < 0 || event >= ARRAY_SIZE(alarm_map))
+		return NULL;
+
+	if (alarm_map[event].base_offset == AMS_ALARM_INVALID)
+		return NULL;
+
+	scan_index += alarm_map[event].scan_index;
 
 	for (i = 0; i < dev->num_channels; i++)
 		if (dev->channels[i].scan_index == scan_index)
@@ -877,43 +859,18 @@ static const struct iio_chan_spec *ams_event_to_channel(struct iio_dev *dev,
 
 static int ams_get_alarm_mask(int scan_index)
 {
-	int bit = 0;
+	int bit = 0, event;
 
 	if (scan_index >= AMS_PS_SEQ_MAX) {
 		bit = AMS_PL_ALARM_START;
 		scan_index -= AMS_PS_SEQ_MAX;
 	}
 
-	switch (scan_index) {
-	case AMS_SEQ_TEMP:
-		return BIT(AMS_ALARM_BIT_TEMP + bit);
-	case AMS_SEQ_SUPPLY1:
-		return BIT(AMS_ALARM_BIT_SUPPLY1 + bit);
-	case AMS_SEQ_SUPPLY2:
-		return BIT(AMS_ALARM_BIT_SUPPLY2 + bit);
-	case AMS_SEQ_SUPPLY3:
-		return BIT(AMS_ALARM_BIT_SUPPLY3 + bit);
-	case AMS_SEQ_SUPPLY4:
-		return BIT(AMS_ALARM_BIT_SUPPLY4 + bit);
-	case AMS_SEQ_SUPPLY5:
-		return BIT(AMS_ALARM_BIT_SUPPLY5 + bit);
-	case AMS_SEQ_SUPPLY6:
-		return BIT(AMS_ALARM_BIT_SUPPLY6 + bit);
-	case AMS_SEQ_SUPPLY7:
-		return BIT(AMS_ALARM_BIT_SUPPLY7 + bit);
-	case AMS_SEQ_SUPPLY8:
-		return BIT(AMS_ALARM_BIT_SUPPLY8 + bit);
-	case AMS_SEQ_SUPPLY9:
-		return BIT(AMS_ALARM_BIT_SUPPLY9 + bit);
-	case AMS_SEQ_SUPPLY10:
-		return BIT(AMS_ALARM_BIT_SUPPLY10 + bit);
-	case AMS_SEQ_VCCAMS:
-		return BIT(AMS_ALARM_BIT_VCCAMS + bit);
-	case AMS_SEQ_TEMP_REMOTE:
-		return BIT(AMS_ALARM_BIT_TEMP_REMOTE + bit);
-	default:
+	event = ams_scan_index_to_event(scan_index);
+	if (event < 0)
 		return 0;
-	}
+
+	return BIT(event + bit);
 }
 
 static int ams_read_event_config(struct iio_dev *indio_dev,
-- 
2.47.3
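
The table-driven idea in this patch can be tried outside the kernel. The following is a minimal userspace sketch, not the driver code: every `demo_*` / `DEMO_*` name and every enum value is a hypothetical stand-in chosen for illustration. It shows the two lookups the patch relies on: a direct index for alarm bit to entry, and a linear scan that skips reserved slots for scan index to alarm bit.

```c
#include <stddef.h>

#define DEMO_OFFSET_INVALID (-1)

/* Hypothetical alarm bit positions; one slot is reserved. */
enum demo_alarm_bit {
	DEMO_BIT_TEMP,
	DEMO_BIT_SUPPLY1,
	DEMO_BIT_RESERVED,
	DEMO_BIT_SUPPLY2,
	DEMO_BIT_COUNT
};

/* Hypothetical scan indices; the values are illustrative only. */
enum demo_seq {
	DEMO_SEQ_TEMP = 8,
	DEMO_SEQ_SUPPLY1 = 9,
	DEMO_SEQ_SUPPLY2 = 10
};

struct demo_alarm_map {
	int scan_index;
	int base_offset;
};

/* Array index is the alarm bit, so bit -> entry is a direct lookup. */
static const struct demo_alarm_map demo_map[DEMO_BIT_COUNT] = {
	[DEMO_BIT_TEMP]     = { DEMO_SEQ_TEMP,    0x140 },
	[DEMO_BIT_SUPPLY1]  = { DEMO_SEQ_SUPPLY1, 0x144 },
	[DEMO_BIT_RESERVED] = { 0,                DEMO_OFFSET_INVALID },
	[DEMO_BIT_SUPPLY2]  = { DEMO_SEQ_SUPPLY2, 0x148 },
};

/* scan_index -> alarm bit: linear scan that skips invalid entries. */
int demo_scan_index_to_bit(int scan_index)
{
	for (size_t i = 0; i < DEMO_BIT_COUNT; i++) {
		if (demo_map[i].base_offset == DEMO_OFFSET_INVALID)
			continue;
		if (demo_map[i].scan_index == scan_index)
			return (int)i;
	}
	return -1;
}

/* alarm bit -> scan_index: bounds check, then reject reserved slots. */
int demo_bit_to_scan_index(unsigned int bit)
{
	if (bit >= DEMO_BIT_COUNT)
		return -1;
	if (demo_map[bit].base_offset == DEMO_OFFSET_INVALID)
		return -1;
	return demo_map[bit].scan_index;
}
```

Skipping invalid entries before comparing matters here because the reserved slot stores scan index 0, which could otherwise be mistaken for a real match.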




* [PATCH 2/3] iio: adc: xilinx-ams: use guard(mutex) for automatic locking
From: Guilherme Ivo Bozi @ 2026-04-14  9:29 UTC (permalink / raw)
  To: Salih Erim, Conall O'Griofa, Jonathan Cameron, Michal Simek
  Cc: David Lechner, Nuno Sá, Andy Shevchenko, linux-iio,
	linux-arm-kernel, linux-kernel, Guilherme Ivo Bozi
In-Reply-To: <20260414093018.7153-1-guilherme.bozi@usp.br>

Replace open-coded mutex_lock()/mutex_unlock() pairs with
guard(mutex) to simplify locking and ensure proper unlock on
all control flow paths.

This removes explicit unlock handling, reduces boilerplate,
and avoids potential mistakes in error paths while keeping
the behavior unchanged.

Signed-off-by: Guilherme Ivo Bozi <guilherme.bozi@usp.br>
---
 drivers/iio/adc/xilinx-ams.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/iio/adc/xilinx-ams.c b/drivers/iio/adc/xilinx-ams.c
index f364e69a5a0d..1d84310b61a9 100644
--- a/drivers/iio/adc/xilinx-ams.c
+++ b/drivers/iio/adc/xilinx-ams.c
@@ -720,22 +720,20 @@ static int ams_read_raw(struct iio_dev *indio_dev,
 	int ret;
 
 	switch (mask) {
-	case IIO_CHAN_INFO_RAW:
-		mutex_lock(&ams->lock);
+	case IIO_CHAN_INFO_RAW: {
+		guard(mutex)(&ams->lock);
 		if (chan->scan_index >= AMS_CTRL_SEQ_BASE) {
 			ret = ams_read_vcc_reg(ams, chan->address, val);
 			if (ret)
-				goto unlock_mutex;
+				return ret;
 			ams_enable_channel_sequence(indio_dev);
 		} else if (chan->scan_index >= AMS_PS_SEQ_MAX)
 			*val = readl(ams->pl_base + chan->address);
 		else
 			*val = readl(ams->ps_base + chan->address);
 
-		ret = IIO_VAL_INT;
-unlock_mutex:
-		mutex_unlock(&ams->lock);
-		return ret;
+		return IIO_VAL_INT;
+	}
 	case IIO_CHAN_INFO_SCALE:
 		switch (chan->type) {
 		case IIO_VOLTAGE:
@@ -939,7 +937,7 @@ static int ams_write_event_config(struct iio_dev *indio_dev,
 
 	alarm = ams_get_alarm_mask(chan->scan_index);
 
-	mutex_lock(&ams->lock);
+	guard(mutex)(&ams->lock);
 
 	if (state)
 		ams->alarm_mask |= alarm;
@@ -948,8 +946,6 @@ static int ams_write_event_config(struct iio_dev *indio_dev,
 
 	ams_update_alarm(ams, ams->alarm_mask);
 
-	mutex_unlock(&ams->lock);
-
 	return 0;
 }
 
@@ -962,15 +958,13 @@ static int ams_read_event_value(struct iio_dev *indio_dev,
 	struct ams *ams = iio_priv(indio_dev);
 	unsigned int offset = ams_get_alarm_offset(chan->scan_index, dir);
 
-	mutex_lock(&ams->lock);
+	guard(mutex)(&ams->lock);
 
 	if (chan->scan_index >= AMS_PS_SEQ_MAX)
 		*val = readl(ams->pl_base + offset);
 	else
 		*val = readl(ams->ps_base + offset);
 
-	mutex_unlock(&ams->lock);
-
 	return IIO_VAL_INT;
 }
 
@@ -983,7 +977,7 @@ static int ams_write_event_value(struct iio_dev *indio_dev,
 	struct ams *ams = iio_priv(indio_dev);
 	unsigned int offset;
 
-	mutex_lock(&ams->lock);
+	guard(mutex)(&ams->lock);
 
 	/* Set temperature channel threshold to direct threshold */
 	if (chan->type == IIO_TEMP) {
@@ -1005,8 +999,6 @@ static int ams_write_event_value(struct iio_dev *indio_dev,
 	else
 		writel(val, ams->ps_base + offset);
 
-	mutex_unlock(&ams->lock);
-
 	return 0;
 }
 
-- 
2.47.3
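
For readers unfamiliar with scope-based locking, here is a rough userspace sketch of the mechanism. It assumes GCC or Clang, since it relies on the `cleanup` variable attribute, and it uses a toy C11 `atomic_flag` spinlock in place of a kernel mutex. `DEMO_GUARD` and the other `demo_*` names are hypothetical and are not the kernel's `guard()` implementation.

```c
#include <stdatomic.h>

/* Toy spinlock standing in for a kernel mutex (assumption for this sketch). */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static int demo_value = 42;

static atomic_flag *demo_acquire(atomic_flag *l)
{
	while (atomic_flag_test_and_set(l))
		;	/* spin until the flag was previously clear */
	return l;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void demo_release(atomic_flag **l)
{
	atomic_flag_clear(*l);
}

/* Take the lock now; release it when the enclosing scope is left. */
#define DEMO_GUARD(l) \
	atomic_flag *demo_guard_ \
	__attribute__((cleanup(demo_release), unused)) = demo_acquire(l)

int demo_read(int fail, int *val)
{
	DEMO_GUARD(&demo_lock);

	if (fail)
		return -1;	/* lock is released on this early return */

	*val = demo_value;
	return 0;		/* and on the normal return */
}

/* Helper for checking that no path leaked the lock. */
int demo_lock_is_free(void)
{
	if (atomic_flag_test_and_set(&demo_lock))
		return 0;
	atomic_flag_clear(&demo_lock);
	return 1;
}
```

Because the release is tied to the variable's scope, a guard declared under a `case` label needs its own block, which is presumably why the patch wraps the `IIO_CHAN_INFO_RAW` case in braces.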




* [PATCH 1/3] iio: adc: xilinx-ams: fix out-of-bounds channel lookup in event handling
From: Guilherme Ivo Bozi @ 2026-04-14  9:29 UTC (permalink / raw)
  To: Salih Erim, Conall O'Griofa, Jonathan Cameron, Michal Simek
  Cc: David Lechner, Nuno Sá, Andy Shevchenko, linux-iio,
	linux-arm-kernel, linux-kernel, Guilherme Ivo Bozi
In-Reply-To: <20260414093018.7153-1-guilherme.bozi@usp.br>

ams_event_to_channel() may return a pointer past the end of
dev->channels when no matching scan_index is found. This can lead
to invalid memory access in ams_handle_event().

Add a bounds check in ams_event_to_channel() and return NULL when
no channel is found. Also guard the caller to safely handle this
case.

Fixes: d5c70627a794 ("iio: adc: Add Xilinx AMS driver")
Signed-off-by: Guilherme Ivo Bozi <guilherme.bozi@usp.br>
---
 drivers/iio/adc/xilinx-ams.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/iio/adc/xilinx-ams.c b/drivers/iio/adc/xilinx-ams.c
index 124470c92529..f364e69a5a0d 100644
--- a/drivers/iio/adc/xilinx-ams.c
+++ b/drivers/iio/adc/xilinx-ams.c
@@ -871,6 +871,9 @@ static const struct iio_chan_spec *ams_event_to_channel(struct iio_dev *dev,
 		if (dev->channels[i].scan_index == scan_index)
 			break;
 
+	if (i >= dev->num_channels)
+		return NULL;
+
 	return &dev->channels[i];
 }
 
@@ -1012,6 +1015,8 @@ static void ams_handle_event(struct iio_dev *indio_dev, u32 event)
 	const struct iio_chan_spec *chan;
 
 	chan = ams_event_to_channel(indio_dev, event);
+	if (!chan)
+		return;
 
 	if (chan->type == IIO_TEMP) {
 		/*
-- 
2.47.3
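
The failure mode fixed here is easy to reproduce in isolation: a search loop that falls off the end leaves the index equal to the element count, and taking the address of that element yields a pointer past the array. The sketch below is illustrative only, with hypothetical `demo_*` names and channel data. It mirrors the shape of the fix: check the index after the loop, return NULL, and have the caller bail out.

```c
#include <stddef.h>

struct demo_chan {
	int scan_index;
	int type;
};

/* Hypothetical channel table used only for this example. */
static const struct demo_chan demo_chans[] = { { 1, 0 }, { 5, 1 } };
#define DEMO_NUM_CHANS (sizeof(demo_chans) / sizeof(demo_chans[0]))

const struct demo_chan *demo_event_to_channel(int scan_index)
{
	size_t i;

	for (i = 0; i < DEMO_NUM_CHANS; i++)
		if (demo_chans[i].scan_index == scan_index)
			break;

	/* Without this check, &demo_chans[i] would point past the array. */
	if (i >= DEMO_NUM_CHANS)
		return NULL;

	return &demo_chans[i];
}

/* Caller side: never dereference the result without a NULL check. */
int demo_handle_event(int scan_index)
{
	const struct demo_chan *chan = demo_event_to_channel(scan_index);

	if (!chan)
		return -1;

	return chan->type;
}
```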




* [PATCH 0/3] iio: adc: xilinx-ams: refactor alarm handling to table-driven design
From: Guilherme Ivo Bozi @ 2026-04-14  9:29 UTC (permalink / raw)
  To: Salih Erim, Conall O'Griofa, Jonathan Cameron, Michal Simek
  Cc: David Lechner, Nuno Sá, Andy Shevchenko, linux-iio,
	linux-arm-kernel, linux-kernel, Guilherme Ivo Bozi

This series addresses significant code duplication in alarm handling
logic across the Xilinx AMS IIO driver.

An analysis of the codebase (ArKanjo explorer) revealed multiple
duplicated mappings between scan_index, alarm bits, and register
offsets.

To address this, the series introduces a centralized table-driven
mapping (alarm_map) that replaces multiple switch statements spread
across the driver.

This improves:
- maintainability (single source of truth for mappings)
- readability (removes repeated switch logic)
- extensibility (new alarms require only table updates)

No functional changes are intended apart from the bounds-check fix in patch 1.

Series overview:
- Patch 1: fix out-of-bounds channel lookup
- Patch 2: convert mutex handling to guard(mutex)
- Patch 3: introduce table-driven alarm mapping

Guilherme Ivo Bozi (3):
  iio: adc: xilinx-ams: fix out-of-bounds channel lookup in event
    handling
  iio: adc: xilinx-ams: use guard(mutex) for automatic locking
  iio: adc: xilinx-ams: refactor alarm mapping to table-driven approach

 drivers/iio/adc/xilinx-ams.c | 192 +++++++++++++----------------------
 1 file changed, 73 insertions(+), 119 deletions(-)

-- 
2.47.3




* Re: [PATCH 0/3] mm: split the file's i_mmap tree for NUMA
From: Huang Shijie @ 2026-04-14  9:11 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: akpm, viro, brauner, linux-mm, linux-kernel, linux-arm-kernel,
	linux-fsdevel, muchun.song, osalvador, linux-trace-kernel,
	linux-perf-users, linux-parisc, nvdimm, zhongyuan, fangbaoshun,
	yingzhiwei
In-Reply-To: <76pfiwabdgsej6q2yxfh3efuqvsyg7mt7rvl5itzzjyhdrto5r@53viaxsackzv>

On Mon, Apr 13, 2026 at 05:33:21PM +0200, Mateusz Guzik wrote:
> On Mon, Apr 13, 2026 at 02:20:39PM +0800, Huang Shijie wrote:
> >   On NUMA systems there may be many NUMA nodes and many CPUs.
> > For example, a Hygon server has 12 NUMA nodes and 384 CPUs.
> > In the UnixBench tests, there is a test "execl" which tests
> > the execve system call.
> > 
> >   When we test our server with "./Run -c 384 execl",
> > the test result is not good enough. The i_mmap locks are heavily contended on
> > "libc.so" and "ld.so". For example, the i_mmap tree for "libc.so" can have
> > over 6000 VMAs, and the VMAs can be on different NUMA nodes.
> > The insert/remove operations do not run quickly enough.
> > 
> > Patches 1 and 2 hide direct accesses to i_mmap.
> > Patch 3 splits the i_mmap into sibling trees, and we get better
> > performance with this patch set:
> >     a 77% performance improvement (average of 10 runs)
> > 
> 
> To my reading you kept the lock as-is and only distributed the protected
> state.
> 
> While I don't doubt the improvement, I'm confident should you take a
> look at the profile you are going to find this still does not scale with
> rwsem being one of the problems (there are other global locks, some of
> which have experimental patches for).
IMHO, when the number of VMAs in the i_mmap tree is very large, optimising only
the rwsem does not help much in our NUMA case.

On our NUMA server, remote access could be the major issue.


> 
> Apart from that this does nothing to help high core systems which are
> all one node, which imo puts another question mark on this specific
> proposal.
Yes, this patch set only focuses on the NUMA case.
The one-node case should use the original i_mmap.

Maybe I can add a new config, CONFIG_SPLIT_I_MMAP. The config would be disabled
by default, and enabled when there is more than one NUMA node.

> 
> Of course one may question whether a RB tree is the right choice here,
> it may be the lock-protected cost can go way down with merely a better
> data structure.
> 
> Regardless of that, for actual scalability, there will be no way around
> decentralizing locking around this and partitioning per some core count
> (not just by numa awareness).
> 
> Decentralizing locking is definitely possible, but I have not looked
> into specifics of how problematic it is. Best case scenario it will
> merely with separate locks. Worst case scenario something needs a fully
> stabilized state for traversal, in that case another rw lock can be
Yes. 

The traversal may need to hold many locks.

> slapped around this, creating locking order read lock -> per-subset
> write lock -- this will suffer scalability due to the read locking, but
> it will still scale drastically better as apart from that there will be
> no serialization. In this setting the problematic consumer will write
> lock the new thing to stabilize the state.
> 
> So my non-maintainer opinion is that the patchset is not worth it as it
> fails to address anything for significantly more common and already
> affected setups.
This patch set aims to reduce the remote access latency of VMA insert/remove
on NUMA.

> 
> Have you looked into splitting the lock?
> 
I have tried.

But there are two disadvantages:
  1.) The traversal may need to hold many locks, which makes the
      code very ugly.

  2.) Even if we split the locks, each lock still protects a tree. When that
      tree becomes big enough, VMA insert/remove also becomes slow on NUMA,
      because the tree holds VMAs from different NUMA nodes.


Thanks
Huang Shijie
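
The trade-off argued in this reply can be pictured with a small sketch. It is only an illustration under stated assumptions and is not the patch set's implementation, which is not shown in this message: the `demo_*` names, the fixed node count, and the use of linked lists instead of trees are all hypothetical, and locking is left out entirely. The point is that an insert touches only the shard of one node, while a full traversal has to visit every shard.

```c
#include <stddef.h>

#define DEMO_NR_NODES 4	/* hypothetical node count */

struct demo_vma {
	unsigned long start;
	struct demo_vma *next;
};

/* One sub-container per node; a real design would keep a tree per node. */
struct demo_mapping {
	struct demo_vma *head[DEMO_NR_NODES];
	size_t count[DEMO_NR_NODES];
};

/* Insert touches only the shard that belongs to the given node. */
void demo_insert(struct demo_mapping *m, int node, struct demo_vma *v)
{
	v->next = m->head[node];
	m->head[node] = v;
	m->count[node]++;
}

/* A full traversal has to walk every shard in turn. */
size_t demo_count_all(const struct demo_mapping *m)
{
	size_t total = 0;

	for (int n = 0; n < DEMO_NR_NODES; n++)
		for (const struct demo_vma *v = m->head[n]; v; v = v->next)
			total++;

	return total;
}
```

With one lock per shard, the traversal in `demo_count_all()` is where the "hold many locks" concern from the discussion would show up.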




* Re: [PATCH] arm64: dts: exynos850: Add SRAM node
From: Krzysztof Kozlowski @ 2026-04-14  9:08 UTC (permalink / raw)
  To: Alexey Klimov, Sam Protsenko, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Alim Akhtar
  Cc: linux-samsung-soc, linux-arm-kernel, devicetree, linux-kernel
In-Reply-To: <DHSR70EGYY4N.2EA2HWIXJR7QR@linaro.org>

On 14/04/2026 11:00, Alexey Klimov wrote:
> On Mon Apr 13, 2026 at 4:23 PM BST, Krzysztof Kozlowski wrote:
>> On 13/04/2026 16:52, Alexey Klimov wrote:
>>> SRAM is used by the ACPM protocol to retrieve the ACPM channels
>>> information and configuration data. Add the SRAM node.
>>>
>>> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
>>> ---
>>>  arch/arm64/boot/dts/exynos/exynos850.dtsi | 8 ++++++++
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/exynos/exynos850.dtsi b/arch/arm64/boot/dts/exynos/exynos850.dtsi
>>> index cb55015c8dce..cf4a6168846c 100644
>>> --- a/arch/arm64/boot/dts/exynos/exynos850.dtsi
>>> +++ b/arch/arm64/boot/dts/exynos/exynos850.dtsi
>>> @@ -910,6 +910,14 @@ spi_2: spi@11d20000 {
>>>  			};
>>>  		};
>>>  	};
>>> +
>>> +	apm_sram: sram@2039000 {
>>> +		compatible = "mmio-sram";
>>> +		reg = <0x0 0x2039000 0x40000>;
>>> +		#address-cells = <1>;
>>> +		#size-cells = <1>;
>>> +		ranges = <0x0 0x0 0x2039000 0x40000>;
>>
>> You miss here children.
> 
> Thank you! I guess I should convert it to smth like this:
> 
> apm_sram: sram@2039000 {
> 		compatible = "mmio-sram";
> 		reg = <0x0 0x2039000 0x40000>;
> 		ranges = <0x0 0x0 0x2039000 0x40000>;
> 		#address-cells = <1>;
> 		#size-cells = <1>;
> 
> 		acpm_sram_region: sram-section@0 {
> 			reg = <0x0 0x40000>;

This covers the entire block, so it feels pointless. Maybe the requirement
for children should be dropped. What's the point of having children? Why
does the driver need them?

> 		};
> 	};
> 
> And then later reference shmem = &acpm_sram_region from acpm node.
> 
>> Also, 'ranges' should be after 'reg'.
> 
> Thanks, will fix this.
> 
> FWIW this commit is a copy of commit 48e7821b26904
> https://lore.kernel.org/r/20250207-gs101-acpm-dt-v4-1-230ba8663a2d@linaro.org


Huh, we should fix that one as well.


Best regards,
Krzysztof



* Re: [RFC PATCH 00/12] Add Linux RISC-V trace support via CoreSight
From: Zane Leung @ 2026-04-14  9:04 UTC (permalink / raw)
  To: Anup Patel
  Cc: robh, krzk+dt, conor+dt, palmer, pjw, gregkh, alexander.shishkin,
	irogers, coresight, peterz, mingo, namhyung, mark.rutland, jolsa,
	adrian.hunter, mchitale, atish.patra, andrew.jones, sunilvl,
	linux-riscv, linux-kernel, anup.patel, mayuresh.chitale,
	zhuangqiubin, suzuki.poulose, mike.leach, james.clark,
	alexander.shishkin, linux-arm-kernel
In-Reply-To: <CAAhSdy2Sx98qcCMBVF+_BV1G-syrO74Y_S298hMwkW+vgLtb5Q@mail.gmail.com>


On 4/14/2026 3:23 PM, Anup Patel wrote:
> On Tue, Apr 14, 2026 at 9:12 AM Zane Leung <liangzhen@linux.spacemit.com> wrote:
>> From: liangzhen <zhen.liang@spacemit.com>
>>
>> This series adds Linux RISC-V trace support via CoreSight, implementing RISC-V
>> trace drivers within the CoreSight framework and integrating them with perf tools.
>> The K3 SoC contains RISC-V Encoder, Funnel, ATB, CoreSight Funnel, and CoreSight TMC
>> components, which can be directly leveraged through the existing CoreSight infrastructure.
>>
>> Linux RISC-V trace support from Anup Patel:
>> (https://patchwork.kernel.org/project/linux-riscv/cover/20260225062448.4027948-1-anup.patel@oss.qualcomm.com/)
>> which currently lacks ATB component support and guidance on reusing CoreSight components.
> What stops you from adding RISC-V trace funnel and ATB bridge drivers
> on top of this series ?
>
Firstly, my work started much earlier than this series. Secondly, it is difficult to add the CoreSight funnel and CoreSight TMC components to this series, whereas building on the CoreSight framework lets me reuse them directly.

>
>> The series includes:
>> - RISC-V trace driver implementation within the CoreSight framework
> RISC-V trace is very different from the CoreSight framework in many ways:
> 1) Types of components supported
> 2) Trace packet formats
> 3) The way MMIO based components are discovered
> 4) ... and more ...
1) I believe RISC-V trace is CoreSight-like: it has encoders, funnels, sinks and bridges that are described through DT and controlled via MMIO.
2) I think the only difference in packet format is the CoreSight frame that the ATB component generates to distinguish trace sources. Once that frame is removed, what remains is RISC-V trace data.
3) The current CoreSight framework code does not use such a mechanism; components are described through DT.
>> - RISC-V Trace Encoder, Funnel, and ATB Bridge drivers as CoreSight devices
>> - RISC-V trace PMU record capabilities and parsing events in perf.
>> - RISC-V Nexus Trace decoder for perf tools
>>
>> Any comments or suggestions are welcome.
>>
>> Verification on K3 SoC:
>> To verify this patch series on K3 hardware, the following device tree nodes are required:
>> 1. RISC-V Trace Encoder node (8)
>> 2. RISC-V ATB Bridge node (8)
>> 3. RISC-V Trace Funnel node (2)
>> 4. CoreSight Funnel configuration for RISC-V (1)
>> 5. CoreSight TMC configuration for trace buffer (1)
>>
>> /{
>>         dummy_clk: apb-pclk {
>>                 compatible = "fixed-clock";
>>                 #clock-cells = <0x0>;
>>                 clock-output-names = "clk14mhz";
>>                 clock-frequency = <14000000>;
>>         };
>>
>>
>>         soc: soc {
>>                 #address-cells = <2>;
>>                 #size-cells = <2>;
>>
>>                 encoder0: encoder@d9002000 {
>>                         compatible = "riscv,trace-encoder";
>>                         reg = <0x0 0xd9002000 0x0 0x1000>;
>>                         cpus = <&cpu_0>;
>>                         out-ports {
>>                                 port {
>>                                         cluster0_encoder0_out_port: endpoint {
>>                                                 remote-endpoint = <&cluster0_bridge0_in_port>;
>>                                         };
>>                                 };
>>                         };
>>                 };
>>
>>                 bridge0: bridge@d9003000 {
>>                         compatible = "riscv,trace-atbbridge";
>>                         reg = <0x0 0xd9003000 0x0 0x1000>;
>>                         cpus = <&cpu_0>;
>>                         out-ports {
>>                                 port {
>>                                         cluster0_bridge0_out_port: endpoint {
>>                                                 remote-endpoint = <&cluster0_funnel_in_port0>;
>>                                         };
>>                                 };
>>                         };
>>                         in-ports {
>>                                 port {
>>                                         cluster0_bridge0_in_port: endpoint {
>>                                                 remote-endpoint = <&cluster0_encoder0_out_port>;
>>                                         };
>>                                 };
>>                         };
>>                 };
>>
>> ...
>>
>>                 cluster0_funnel: funnel@d9000000 {
>>                         compatible = "riscv,trace-funnel";
>>                         reg = <0x0 0xd9000000 0x0 0x1000>;
>>                         cpus = <&cpu_0>, <&cpu_1>, <&cpu_2>, <&cpu_3>;
>>                         riscv,timestamp-present;
>>                         out-ports {
>>                                 port {
>>                                         cluster0_funnel_out_port: endpoint {
>>                                                 remote-endpoint = <&main_funnel_in_port0>;
>>                                         };
>>                                 };
>>                         };
>>                         in-ports {
>>                                 #address-cells = <1>;
>>                                 #size-cells = <0>;
>>
>>                                 port@0 {
>>                                         reg = <0>;
>>                                         cluster0_funnel_in_port0: endpoint {
>>                                                 remote-endpoint = <&cluster0_bridge0_out_port>;
>>                                         };
>>                                 };
>>
>>                                 port@1 {
>>                                         reg = <1>;
>>                                         cluster0_funnel_in_port1: endpoint {
>>                                                 remote-endpoint = <&cluster0_bridge1_out_port>;
>>                                         };
>>                                 };
>>
>>                                 port@2 {
>>                                         reg = <2>;
>>                                         cluster0_funnel_in_port2: endpoint {
>>                                                 remote-endpoint = <&cluster0_bridge2_out_port>;
>>                                         };
>>                                 };
>>
>>                                 port@3 {
>>                                         reg = <3>;
>>                                         cluster0_funnel_in_port3: endpoint {
>>                                                 remote-endpoint = <&cluster0_bridge3_out_port>;
>>                                         };
>>                                 };
>>                         };
>>                 };
>>
>>                 cluster1_funnel: funnel@d9010000 {
>>                         compatible = "riscv,trace-funnel";
>>                         reg = <0x0 0xd9010000 0x0 0x1000>;
>>                         cpus = <&cpu_4>, <&cpu_5>, <&cpu_6>, <&cpu_7>;
>>                         riscv,timestamp-present;
>>                         out-ports {
>>                                 port {
>>                                         cluster1_funnel_out_port: endpoint {
>>                                                 remote-endpoint = <&main_funnel_in_port1>;
>>                                         };
>>                                 };
>>                         };
>>                         in-ports {
>>                                 #address-cells = <1>;
>>                                 #size-cells = <0>;
>>
>>                                 port@0 {
>>                                         reg = <0>;
>>                                         cluster1_funnel_in_port0: endpoint {
>>                                                 remote-endpoint = <&cluster1_bridge0_out_port>;
>>                                         };
>>                                 };
>>
>>                                 port@1 {
>>                                         reg = <1>;
>>                                         cluster1_funnel_in_port1: endpoint {
>>                                                 remote-endpoint = <&cluster1_bridge1_out_port>;
>>                                         };
>>                                 };
>>
>>                                 port@2 {
>>                                         reg = <2>;
>>                                         cluster1_funnel_in_port2: endpoint {
>>                                                 remote-endpoint = <&cluster1_bridge2_out_port>;
>>                                         };
>>                                 };
>>
>>                                 port@3 {
>>                                         reg = <3>;
>>                                         cluster1_funnel_in_port3: endpoint {
>>                                                 remote-endpoint = <&cluster1_bridge3_out_port>;
>>                                         };
>>                                 };
>>                         };
>>                 };
>>
>>                 main_funnel: funnel@d9042000 {
>>                         compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
> Is it legally allowed to mix and match ARM coresight IPs with
> RISC-V trace components at hardware level ?
The ATB Bridge acts as an ATB initiator and lets RISC-V trace be sent to the Arm CoreSight infrastructure instead of to the RISC-V compliant sink defined in the RISC-V Trace Control Interface specification. See:
https://github.com/riscv-non-isa/riscv-nexus-trace/blob/105dc1c349556622e4d202d22b584a887ded462f/docs/RISC-V-Trace-Control-Interface.adoc#L184
With the ATB Bridge, the trace is read through CoreSight components (ETB/TMC/TPIU). See:
https://github.com/riscv-non-isa/riscv-nexus-trace/blob/105dc1c349556622e4d202d22b584a887ded462f/docs/RISC-V-Trace-Control-Interface.adoc#L1684
>
>>                         reg = <0x0 0xd9042000 0x0 0x1000>;
>>                         clocks = <&dummy_clk>;
>>                         clock-names = "apb_pclk";
>>                         out-ports {
>>                                 port {
>>                                         main_funnel_out_port: endpoint {
>>                                                 remote-endpoint = <&etf_in_port>;
>>                                         };
>>                                 };
>>                         };
>>                         in-ports {
>>                                 #address-cells = <1>;
>>                                 #size-cells = <0>;
>>
>>                                 port@0 {
>>                                         reg = <0>;
>>                                         main_funnel_in_port0: endpoint {
>>                                                 remote-endpoint = <&cluster0_funnel_out_port>;
>>                                         };
>>                                 };
>>
>>                                 port@1 {
>>                                         reg = <1>;
>>                                         main_funnel_in_port1: endpoint {
>>                                                 remote-endpoint = <&cluster1_funnel_out_port>;
>>                                         };
>>                                 };
>>                         };
>>                 };
>>
>>                 etf: etf@d9043000 {
>>                         compatible = "arm,coresight-tmc", "arm,primecell";
>>                         reg = <0x0 0xd9043000 0x0 0x1000>;
>>                         clocks = <&dummy_clk>;
>>                         clock-names = "apb_pclk";
>>                         out-ports {
>>                                 port {
>>                                         etf_out_port: endpoint {
>>                                                 remote-endpoint = <&etr_in_port>;
>>                                         };
>>                                 };
>>                         };
>>                         in-ports {
>>                                 port {
>>                                         etf_in_port: endpoint {
>>                                                 remote-endpoint = <&main_funnel_out_port>;
>>                                         };
>>                                 };
>>                         };
>>                 };
>>
>>                 etr: etr@d9044000 {
>>                         compatible = "arm,coresight-tmc", "arm,primecell";
>>                         reg = <0x0 0xd9044000 0x0 0x1000>;
>>                         clocks = <&dummy_clk>;
>>                         clock-names = "apb_pclk";
>>                         arm,scatter-gather;
>>                         in-ports {
>>                                 port {
>>                                         etr_in_port: endpoint {
>>                                                 remote-endpoint = <&etf_out_port>;
>>                                         };
>>                                 };
>>                         };
>>                 };
>>         };
>> };
>>
>> Verification case:
>>
>> ~ # perf list pmu
>>   rvtrace//                                          [Kernel PMU event]
>>
>> ~ # perf record -e rvtrace/@tmc_etr0/ --per-thread uname
>> Linux
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.191 MB perf.data ]
>> ~ # perf script
>>            uname     137 [003]          1           branches:  ffffffff80931470 rvtrace_poll_bit+0x38 ([kernel.kallsyms]) => ffffffff80931492 rvtrace_poll_bit+0x5a ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff809328a6 encoder_enable_hw+0x252 ([kernel.kallsyms]) => ffffffff809328ba encoder_enable_hw+0x266 ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff80932c4a encoder_enable+0x82 ([kernel.kallsyms]) => ffffffff80932c36 encoder_enable+0x6e ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff80928198 etm_event_start+0xf0 ([kernel.kallsyms]) => ffffffff809281aa etm_event_start+0x102 ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff809281e6 etm_event_start+0x13e ([kernel.kallsyms]) => ffffffff8092755e coresight_get_sink_id+0x16 ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff8092820e etm_event_start+0x166 ([kernel.kallsyms]) => ffffffff80928226 etm_event_start+0x17e ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff801c3bb4 perf_report_aux_output_id+0x0 ([kernel.kallsyms]) => ffffffff801c3bd6 perf_report_aux_output_id+0x22 ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff801c3c5a perf_report_aux_output_id+0xa6 ([kernel.kallsyms]) => ffffffff801c3bf0 perf_report_aux_output_id+0x3c ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff801c3c40 perf_report_aux_output_id+0x8c ([kernel.kallsyms]) => ffffffff801c3aea __perf_event_header__init_id+0x2a ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff801c3b42 __perf_event_header__init_id+0x82 ([kernel.kallsyms]) => ffffffff801c3b4a __perf_event_header__init_id+0x8a ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff801c3bb0 __perf_event_header__init_id+0xf0 ([kernel.kallsyms]) => ffffffff801c3b58 __perf_event_header__init_id+0x98 ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff8004c658 __task_pid_nr_ns+0x0 ([kernel.kallsyms]) => ffffffff8004c696 __task_pid_nr_ns+0x3e ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff8004c71e __task_pid_nr_ns+0xc6 ([kernel.kallsyms]) => ffffffff8004c6a4 __task_pid_nr_ns+0x4c ([kernel.kallsyms])
>>            uname     137 [003]          1           branches:  ffffffff8004c6e4 __task_pid_nr_ns+0x8c ([kernel.kallsyms]) => ffffffff8004c6e4 __task_pid_nr_ns+0x8c ([kernel.kallsyms])
>> ...
>>
>> liangzhen (12):
>>   coresight: Add RISC-V support to CoreSight tracing
>>   coresight: Initial implementation of RISC-V trace driver
>>   coresight: Add RISC-V Trace Encoder driver
>>   coresight: Add RISC-V Trace Funnel driver
>>   coresight: Add RISC-V Trace ATB Bridge driver
>>   coresight rvtrace: Add timestamp component support for encoder and
>>     funnel
>>   coresight: Add RISC-V PMU name support
>>   perf tools: riscv: making rvtrace PMU listable
>>   perf tools: Add RISC-V trace PMU record capabilities
>>   perf tools: Add Nexus RISC-V Trace decoder
>>   perf symbols: Add RISC-V PLT entry sizes
>>   perf tools: Integrate RISC-V trace decoder into auxtrace
>>
>>  drivers/hwtracing/Kconfig                     |    2 +
>>  drivers/hwtracing/coresight/Kconfig           |   46 +-
>>  drivers/hwtracing/coresight/Makefile          |    6 +
>>  drivers/hwtracing/coresight/coresight-core.c  |    8 +
>>  .../hwtracing/coresight/coresight-etm-perf.c  |    1 -
>>  .../hwtracing/coresight/coresight-etm-perf.h  |   21 +
>>  .../hwtracing/coresight/coresight-platform.c  |    1 -
>>  .../hwtracing/coresight/coresight-tmc-etf.c   |    4 +
>>  .../hwtracing/coresight/coresight-tmc-etr.c   |    4 +
>>  .../hwtracing/coresight/rvtrace-atbbridge.c   |  239 +++
>>  drivers/hwtracing/coresight/rvtrace-core.c    |  135 ++
>>  .../coresight/rvtrace-encoder-core.c          |  562 +++++++
>>  .../coresight/rvtrace-encoder-sysfs.c         |  363 +++++
>>  drivers/hwtracing/coresight/rvtrace-encoder.h |  151 ++
>>  drivers/hwtracing/coresight/rvtrace-funnel.c  |  337 ++++
>>  drivers/hwtracing/coresight/rvtrace-funnel.h  |   39 +
>>  .../hwtracing/coresight/rvtrace-timestamp.c   |  278 ++++
>>  .../hwtracing/coresight/rvtrace-timestamp.h   |   64 +
>>  include/linux/coresight-pmu.h                 |    4 +
>>  include/linux/rvtrace.h                       |  116 ++
>>  tools/arch/riscv/include/asm/insn.h           |  645 ++++++++
>>  tools/perf/arch/riscv/util/Build              |    2 +
>>  tools/perf/arch/riscv/util/auxtrace.c         |  490 ++++++
>>  tools/perf/arch/riscv/util/pmu.c              |   20 +
>>  tools/perf/util/Build                         |    3 +
>>  tools/perf/util/auxtrace.c                    |    4 +
>>  tools/perf/util/auxtrace.h                    |    1 +
>>  tools/perf/util/nexus-rv-decoder/Build        |    1 +
>>  .../util/nexus-rv-decoder/nexus-rv-decoder.c  | 1364 +++++++++++++++++
>>  .../util/nexus-rv-decoder/nexus-rv-decoder.h  |  139 ++
>>  .../perf/util/nexus-rv-decoder/nexus-rv-msg.h |  190 +++
>>  tools/perf/util/rvtrace-decoder.c             | 1039 +++++++++++++
>>  tools/perf/util/rvtrace.h                     |   40 +
>>  tools/perf/util/symbol-elf.c                  |    4 +
>>  34 files changed, 6320 insertions(+), 3 deletions(-)
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-atbbridge.c
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-core.c
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-encoder-core.c
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-encoder-sysfs.c
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-encoder.h
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-funnel.c
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-funnel.h
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-timestamp.c
>>  create mode 100644 drivers/hwtracing/coresight/rvtrace-timestamp.h
>>  create mode 100644 include/linux/rvtrace.h
>>  create mode 100644 tools/arch/riscv/include/asm/insn.h
>>  create mode 100644 tools/perf/arch/riscv/util/auxtrace.c
>>  create mode 100644 tools/perf/arch/riscv/util/pmu.c
>>  create mode 100644 tools/perf/util/nexus-rv-decoder/Build
>>  create mode 100644 tools/perf/util/nexus-rv-decoder/nexus-rv-decoder.c
>>  create mode 100644 tools/perf/util/nexus-rv-decoder/nexus-rv-decoder.h
>>  create mode 100644 tools/perf/util/nexus-rv-decoder/nexus-rv-msg.h
>>  create mode 100644 tools/perf/util/rvtrace-decoder.c
>>  create mode 100644 tools/perf/util/rvtrace.h
>>
>> --
>> 2.34.1
>>
> NACK to this approach of retrofitting RISC-V trace into ARM coresight.
I agree that integrating RISC-V trace directly into CoreSight is not a good approach. I think we should instead abstract the shared CoreSight logic and reuse it in RISC-V trace.
>
> Regards,
> Anup

Thanks,
Zane



* Re: [PATCH] arm64: dts: exynos850: Add SRAM node
From: Alexey Klimov @ 2026-04-14  9:00 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Alexey Klimov, Sam Protsenko, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Alim Akhtar
  Cc: linux-samsung-soc, linux-arm-kernel, devicetree, linux-kernel
In-Reply-To: <2ff077e1-8983-4a41-bb21-5e4140545aa3@kernel.org>

On Mon Apr 13, 2026 at 4:23 PM BST, Krzysztof Kozlowski wrote:
> On 13/04/2026 16:52, Alexey Klimov wrote:
>> SRAM is used by the ACPM protocol to retrieve the ACPM channels
>> information and configuration data. Add the SRAM node.
>> 
>> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
>> ---
>>  arch/arm64/boot/dts/exynos/exynos850.dtsi | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>> 
>> diff --git a/arch/arm64/boot/dts/exynos/exynos850.dtsi b/arch/arm64/boot/dts/exynos/exynos850.dtsi
>> index cb55015c8dce..cf4a6168846c 100644
>> --- a/arch/arm64/boot/dts/exynos/exynos850.dtsi
>> +++ b/arch/arm64/boot/dts/exynos/exynos850.dtsi
>> @@ -910,6 +910,14 @@ spi_2: spi@11d20000 {
>>  			};
>>  		};
>>  	};
>> +
>> +	apm_sram: sram@2039000 {
>> +		compatible = "mmio-sram";
>> +		reg = <0x0 0x2039000 0x40000>;
>> +		#address-cells = <1>;
>> +		#size-cells = <1>;
>> +		ranges = <0x0 0x0 0x2039000 0x40000>;
>
> You miss here children.

Thank you! I guess I should convert it to something like this:

	apm_sram: sram@2039000 {
		compatible = "mmio-sram";
		reg = <0x0 0x2039000 0x40000>;
		ranges = <0x0 0x0 0x2039000 0x40000>;
		#address-cells = <1>;
		#size-cells = <1>;

		acpm_sram_region: sram-section@0 {
			reg = <0x0 0x40000>;
		};
	};

And then later reference it from the acpm node via shmem = <&acpm_sram_region>.

> Also, 'ranges' should be after 'reg'.

Thanks, will fix this.

FWIW this commit is a copy of commit 48e7821b26904
https://lore.kernel.org/r/20250207-gs101-acpm-dt-v4-1-230ba8663a2d@linaro.org

Best regards,
Alexey




* Re: [PATCH v2] Bluetooth: Add Broadcom channel priority commands
From: Neal Gompa @ 2026-04-14  8:59 UTC (permalink / raw)
  To: fnkl.kernel
  Cc: Sven Peter, Janne Grunau, Marcel Holtmann, Luiz Augusto von Dentz,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-kernel, asahi, linux-arm-kernel,
	linux-bluetooth, netdev
In-Reply-To: <20260407-brcm-prio-v2-1-3f745edf49af@gmail.com>

On Tue, Apr 7, 2026 at 1:46 PM Sasha Finkelstein via B4 Relay
<devnull+fnkl.kernel.gmail.com@kernel.org> wrote:
>
> From: Sasha Finkelstein <fnkl.kernel@gmail.com>
>
> Certain Broadcom bluetooth chips (bcm4377/bcm4378/bcm438) need ACL
> streams carrying audio to be set as "high priority" using a vendor
> specific command to prevent 10-ish second-long dropouts whenever
> something does a device scan. This patch sends the command when the
> socket priority is set to TC_PRIO_INTERACTIVE, as BlueZ does for audio.
>
> Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com>
> ---
> Changes in v2:
> - new ioctl got nack-ed, so let's use sk_priority as the trigger
> - Link to v1: https://lore.kernel.org/r/20260407-brcm-prio-v1-1-f38b17376640@gmail.com
> ---

Thank you so much for this!

Reviewed-by: Neal Gompa <neal@gompa.dev>


-- 
真実はいつも一つ!/ Always, there's only one truth!



* Re: [net-next] net: airoha: Wait for TX to complete in airoha_dev_stop()
From: Lorenzo Bianconi @ 2026-04-14  8:57 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: edumazet, kuba, davem, netdev, andrew+netdev, linux-arm-kernel,
	linux-mediatek
In-Reply-To: <20260414082949.104611-1-pabeni@redhat.com>


> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> 
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net: airoha: Wait for TX to complete in airoha_dev_stop()
> 
> Wait for TX to complete in airoha_dev_stop routine before stopping the
> TX DMA and run airoha_qdma_cleanup_tx_queue routine. Moreover,
> start/stop TX/RX NAPIs in ndo_open()/ndo_stop() callbacks in order to be
> sure the TX NAPIs have completed before stopping the TX DMA engine in
> airoha_dev_stop routine.
> 
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index 8e4b043af4bc..9e40c8f375c1 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> 
> [ ... ]
> 
> > @@ -1684,18 +1686,26 @@ static int airoha_dev_stop(struct net_device *dev)
> >  	struct airoha_qdma *qdma = port->qdma;
> >  	int i, err;
> >
> > -	netif_tx_disable(dev);
> >  	err = airoha_set_vip_for_gdm_port(port, false);
> >  	if (err)
> >  		return err;
> >
> > -	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
> > -		netdev_tx_reset_subqueue(dev, i);
> > -
> >  	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
> >  				    FE_PSE_PORT_DROP);
> >
> > +	netif_tx_disable(dev);
> >  	if (atomic_dec_and_test(&qdma->users)) {
> > +		u32 val;
> > +
> > +		/* Wait for TX to complete */
> > +		err = read_poll_timeout(airoha_qdma_rr, val,
> > +					!(val & GLOBAL_CFG_TX_DMA_BUSY_MASK),
> > +					USEC_PER_MSEC, 100 * USEC_PER_MSEC,
> > +					false, qdma, REG_QDMA_GLOBAL_CFG);
> > +		if (err)
> > +			return err;
> >                   ^^^^^^^^^^
> 
> Can this leave the device in an inconsistent state? When read_poll_timeout()
> fails (TX DMA busy for more than 100ms), airoha_dev_stop() returns an error
> immediately. However, atomic_dec_and_test(&qdma->users) has already
> decremented qdma->users to 0, but airoha_qdma_stop_napi() was never called,
> so all NAPIs remain enabled.
> 
> The networking core (__dev_close_many in net/core/dev.c) ignores the
> ndo_stop return value and marks the device as down regardless. When the
> device is later brought up again, airoha_dev_open() sees atomic_fetch_inc()
> return 0 (old value) and calls airoha_qdma_start_napi() -> napi_enable(),
> which hits BUG_ON(!test_bit(NAPI_STATE_SCHED, &val)) in napi_enable_locked()
> because the NAPI was never disabled.
> 
> The early return also skips the netdev_tx_reset_subqueue() calls at the end
> of the function.
> 
> Should the code either continue with cleanup on timeout (don't return), or
> ensure NAPI and DMA state are properly cleaned up before returning?

ack, I will fix it in v2.

Regards,
Lorenzo

> 
> > +
> > +		airoha_qdma_stop_napi(qdma);
> >  		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
> >  				  GLOBAL_CFG_TX_DMA_EN_MASK |
> >  				  GLOBAL_CFG_RX_DMA_EN_MASK);
> 
> [ ... ]
> 



* Re: [PATCH v2 1/2] mmc: dw_mmc: implement option for configuring DMA threshold
From: Shawn Lin @ 2026-04-14  8:50 UTC (permalink / raw)
  To: Kaustabh Chakraborty, Ulf Hansson, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Jaehoon Chung,
	Krzysztof Kozlowski, Alim Akhtar
  Cc: shawn.lin, linux-mmc, devicetree, linux-kernel, linux-arm-kernel,
	linux-samsung-soc
In-Reply-To: <20260414-dwmmc-dma-thr-v2-1-4058078f5361@disroot.org>

On Tue, 2026/04/14 16:36, Kaustabh Chakraborty wrote:
> Some controllers, such as certain Exynos SDIO ones, are unable to
> perform DMA transfers of a small number of bytes properly. Following the
> device tree schema, implement the property to define the DMA transfer
> threshold (previously hard-coded to 16 bytes) so that transfers of fewer
> bytes can safely skip DMA on such controllers. The value of 16 bytes
> stays as the default for controllers which do not define it. This value
> can be overridden by implementation-specific init sequences.
> 
> Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
> ---
>   drivers/mmc/host/dw_mmc.c | 5 +++--
>   drivers/mmc/host/dw_mmc.h | 2 ++
>   2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> index 20193ee7b73eb..9dd9fed4ccf49 100644
> --- a/drivers/mmc/host/dw_mmc.c
> +++ b/drivers/mmc/host/dw_mmc.c
> @@ -40,7 +40,6 @@
>   				 SDMMC_INT_RESP_ERR | SDMMC_INT_HLE)
>   #define DW_MCI_ERROR_FLAGS	(DW_MCI_DATA_ERROR_FLAGS | \
>   				 DW_MCI_CMD_ERROR_FLAGS)
> -#define DW_MCI_DMA_THRESHOLD	16
>   
>   #define DW_MCI_FREQ_MAX	200000000	/* unit: HZ */
>   #define DW_MCI_FREQ_MIN	100000		/* unit: HZ */
> @@ -821,7 +820,7 @@ static int dw_mci_pre_dma_transfer(struct dw_mci *host,
>   	 * non-word-aligned buffers or lengths. Also, we don't bother
>   	 * with all the DMA setup overhead for short transfers.
>   	 */
> -	if (data->blocks * data->blksz < DW_MCI_DMA_THRESHOLD)
> +	if (data->blocks * data->blksz < host->dma_threshold)
>   		return -EINVAL;
>   
>   	if (data->blksz & 3)
> @@ -3245,6 +3244,8 @@ int dw_mci_probe(struct dw_mci *host)
>   		goto err_clk_ciu;
>   	}
>   
> +	host->dma_threshold = 16;

I'd prefer to set it in dw_mci_alloc_host() instead of picking a
random place to put it, for better code management.

> +
>   	if (host->rstc) {
>   		reset_control_assert(host->rstc);
>   		usleep_range(10, 50);
> diff --git a/drivers/mmc/host/dw_mmc.h b/drivers/mmc/host/dw_mmc.h
> index 42e58be74ce09..fc7601fba849f 100644
> --- a/drivers/mmc/host/dw_mmc.h
> +++ b/drivers/mmc/host/dw_mmc.h
> @@ -164,6 +164,8 @@ struct dw_mci {
>   	void __iomem		*fifo_reg;
>   	u32			data_addr_override;
>   	bool			wm_aligned;
> +	/* Configurable data byte threshold value for DMA transfer. */

Not here. There is a long comment block before struct dw_mci { } that
describes each member of it; please add it there.

> +	u32			dma_threshold;
>   
>   	struct scatterlist	*sg;
>   	struct sg_mapping_iter	sg_miter;
> 



* [PATCH v2 1/2] mmc: dw_mmc: implement option for configuring DMA threshold
From: Kaustabh Chakraborty @ 2026-04-14  8:36 UTC (permalink / raw)
  To: Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jaehoon Chung, Shawn Lin, Krzysztof Kozlowski, Alim Akhtar
  Cc: linux-mmc, devicetree, linux-kernel, linux-arm-kernel,
	linux-samsung-soc, Kaustabh Chakraborty
In-Reply-To: <20260414-dwmmc-dma-thr-v2-0-4058078f5361@disroot.org>

Some controllers, such as certain Exynos SDIO ones, are unable to
perform DMA transfers of a small number of bytes properly. Following the
device tree schema, implement the property to define the DMA transfer
threshold (previously hard-coded to 16 bytes) so that transfers of fewer
bytes can safely skip DMA on such controllers. The value of 16 bytes
stays as the default for controllers which do not define it. This value
can be overridden by implementation-specific init sequences.

Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
---
 drivers/mmc/host/dw_mmc.c | 5 +++--
 drivers/mmc/host/dw_mmc.h | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index 20193ee7b73eb..9dd9fed4ccf49 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -40,7 +40,6 @@
 				 SDMMC_INT_RESP_ERR | SDMMC_INT_HLE)
 #define DW_MCI_ERROR_FLAGS	(DW_MCI_DATA_ERROR_FLAGS | \
 				 DW_MCI_CMD_ERROR_FLAGS)
-#define DW_MCI_DMA_THRESHOLD	16
 
 #define DW_MCI_FREQ_MAX	200000000	/* unit: HZ */
 #define DW_MCI_FREQ_MIN	100000		/* unit: HZ */
@@ -821,7 +820,7 @@ static int dw_mci_pre_dma_transfer(struct dw_mci *host,
 	 * non-word-aligned buffers or lengths. Also, we don't bother
 	 * with all the DMA setup overhead for short transfers.
 	 */
-	if (data->blocks * data->blksz < DW_MCI_DMA_THRESHOLD)
+	if (data->blocks * data->blksz < host->dma_threshold)
 		return -EINVAL;
 
 	if (data->blksz & 3)
@@ -3245,6 +3244,8 @@ int dw_mci_probe(struct dw_mci *host)
 		goto err_clk_ciu;
 	}
 
+	host->dma_threshold = 16;
+
 	if (host->rstc) {
 		reset_control_assert(host->rstc);
 		usleep_range(10, 50);
diff --git a/drivers/mmc/host/dw_mmc.h b/drivers/mmc/host/dw_mmc.h
index 42e58be74ce09..fc7601fba849f 100644
--- a/drivers/mmc/host/dw_mmc.h
+++ b/drivers/mmc/host/dw_mmc.h
@@ -164,6 +164,8 @@ struct dw_mci {
 	void __iomem		*fifo_reg;
 	u32			data_addr_override;
 	bool			wm_aligned;
+	/* Configurable data byte threshold value for DMA transfer. */
+	u32			dma_threshold;
 
 	struct scatterlist	*sg;
 	struct sg_mapping_iter	sg_miter;

-- 
2.53.0




* [PATCH v2 0/2] Configuring DMA threshold value for DW-MMC controllers
From: Kaustabh Chakraborty @ 2026-04-14  8:35 UTC (permalink / raw)
  To: Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jaehoon Chung, Shawn Lin, Krzysztof Kozlowski, Alim Akhtar
  Cc: linux-mmc, devicetree, linux-kernel, linux-arm-kernel,
	linux-samsung-soc, Kaustabh Chakraborty

In Samsung Exynos 7870 devices with Broadcom Wi-Fi, it has been observed
that small DMA transfers are unreliable and are not written properly,
which renders the cache incoherent.

Experimental observations show that DMA transfers with sizes somewhere
around 64 to 512 bytes are not tolerated. We must thus implement a
mechanism to fall back to PIO transfer in this case. One such approach,
which this series implements, is to make the DMA transfer threshold,
which is already defined in the driver, configurable.
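
As a rough illustration of the fallback decision, here is a minimal
user-space sketch. It assumes the two checks visible in the
dw_mci_pre_dma_transfer() hunk of patch 1 are the relevant ones; the
model_* types and function are hypothetical stand-ins, not the driver's
real structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for struct dw_mci and struct mmc_data. */
struct model_host {
	uint32_t dma_threshold;	/* bytes; transfers below this skip DMA */
};

struct model_data {
	uint32_t blocks;
	uint32_t blksz;
};

/*
 * Returns 0 when DMA may be used and -EINVAL when the caller should
 * fall back to PIO, mirroring the checks quoted in patch 1.
 */
static int model_pre_dma_check(const struct model_host *host,
			       const struct model_data *data)
{
	assert(host && data);

	/* Short transfers: not worth (or not safe for) DMA. */
	if (data->blocks * data->blksz < host->dma_threshold)
		return -EINVAL;

	/* Block sizes that are not word-aligned are also left to PIO. */
	if (data->blksz & 3)
		return -EINVAL;

	return 0;
}
```

With the default threshold of 16, a 64-byte transfer passes the check;
with a threshold of 512, the same transfer is sent down the PIO path.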

Note that this patch is likely to be labelled as a workaround. These
smaller transfers seem to succeed on downstream kernels; however,
efforts to figure out how were in vain. It is also very possible that
the downstream Broadcom Wi-Fi SDIO driver uses PIO transfers as well.

Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
---
Changes in v2:
- Remove dt-binding to set DMA threshold (Krzysztof Kozlowski)
- Add comment to describe struct dw_mci::dma_threshold (Shawn Lin)
- Set DMA threshold in Exynos 7870 DW-MMC driver (Krzysztof Kozlowski)
- Link to v1: https://lore.kernel.org/r/20260412-dwmmc-dma-thr-v1-0-75a2f658eee3@disroot.org

---
Kaustabh Chakraborty (2):
      mmc: dw_mmc: implement option for configuring DMA threshold
      mmc: dw_mmc: exynos: increase DMA threshold value for exynos7870

 drivers/mmc/host/dw_mmc-exynos.c | 1 +
 drivers/mmc/host/dw_mmc.c        | 5 +++--
 drivers/mmc/host/dw_mmc.h        | 2 ++
 3 files changed, 6 insertions(+), 2 deletions(-)
---
base-commit: 1c7cc4904160c6fc6377564140062d68a3dc93a0
change-id: 20260412-dwmmc-dma-thr-1090d8285ea7

Best regards,
-- 
Kaustabh Chakraborty <kauschluss@disroot.org>




* [PATCH v2 2/2] mmc: dw_mmc: exynos: increase DMA threshold value for exynos7870
From: Kaustabh Chakraborty @ 2026-04-14  8:36 UTC (permalink / raw)
  To: Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jaehoon Chung, Shawn Lin, Krzysztof Kozlowski, Alim Akhtar
  Cc: linux-mmc, devicetree, linux-kernel, linux-arm-kernel,
	linux-samsung-soc, Kaustabh Chakraborty
In-Reply-To: <20260414-dwmmc-dma-thr-v2-0-4058078f5361@disroot.org>

Exynos 7870 compatible controllers, such as SDIO ones, are not able to
perform DMA transfers for small sizes of data (~16 to ~512 bytes),
resulting in cache issues in subsequent transfers. Increase the DMA
transfer threshold to 512 bytes so that the shorter transfers take
place without DMA.

Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
---
 drivers/mmc/host/dw_mmc-exynos.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mmc/host/dw_mmc-exynos.c b/drivers/mmc/host/dw_mmc-exynos.c
index 261344d3a8cfe..4b76b997ddc15 100644
--- a/drivers/mmc/host/dw_mmc-exynos.c
+++ b/drivers/mmc/host/dw_mmc-exynos.c
@@ -141,6 +141,7 @@ static int dw_mci_exynos_priv_init(struct dw_mci *host)
 		priv->ctrl_type == DW_MCI_TYPE_EXYNOS7870_SMU) {
 		/* Quirk needed for certain Exynos SoCs */
 		host->quirks |= DW_MMC_QUIRK_FIFO64_32;
+		host->dma_threshold = 512;
 	}
 
 	if (priv->ctrl_type == DW_MCI_TYPE_ARTPEC8) {

-- 
2.53.0




* Re: [net-next] net: airoha: Wait for TX to complete in airoha_dev_stop()
From: Paolo Abeni @ 2026-04-14  8:29 UTC (permalink / raw)
  To: lorenzo
  Cc: edumazet, kuba, davem, netdev, andrew+netdev, pabeni,
	linux-arm-kernel, linux-mediatek
In-Reply-To: <20260411-airoha-fix-ndo_stop-v1-1-caddaa181739@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: airoha: Wait for TX to complete in airoha_dev_stop()

Wait for TX to complete in airoha_dev_stop routine before stopping the
TX DMA and run airoha_qdma_cleanup_tx_queue routine. Moreover,
start/stop TX/RX NAPIs in ndo_open()/ndo_stop() callbacks in order to be
sure the TX NAPIs have completed before stopping the TX DMA engine in
airoha_dev_stop routine.

> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index 8e4b043af4bc..9e40c8f375c1 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c

[ ... ]

> @@ -1684,18 +1686,26 @@ static int airoha_dev_stop(struct net_device *dev)
>  	struct airoha_qdma *qdma = port->qdma;
>  	int i, err;
>
> -	netif_tx_disable(dev);
>  	err = airoha_set_vip_for_gdm_port(port, false);
>  	if (err)
>  		return err;
>
> -	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
> -		netdev_tx_reset_subqueue(dev, i);
> -
>  	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
>  				    FE_PSE_PORT_DROP);
>
> +	netif_tx_disable(dev);
>  	if (atomic_dec_and_test(&qdma->users)) {
> +		u32 val;
> +
> +		/* Wait for TX to complete */
> +		err = read_poll_timeout(airoha_qdma_rr, val,
> +					!(val & GLOBAL_CFG_TX_DMA_BUSY_MASK),
> +					USEC_PER_MSEC, 100 * USEC_PER_MSEC,
> +					false, qdma, REG_QDMA_GLOBAL_CFG);
> +		if (err)
> +			return err;
>                   ^^^^^^^^^^

Can this leave the device in an inconsistent state? When read_poll_timeout()
fails (TX DMA busy for more than 100ms), airoha_dev_stop() returns an error
immediately. However, atomic_dec_and_test(&qdma->users) has already
decremented qdma->users to 0, but airoha_qdma_stop_napi() was never called,
so all NAPIs remain enabled.

The networking core (__dev_close_many in net/core/dev.c) ignores the
ndo_stop return value and marks the device as down regardless. When the
device is later brought up again, airoha_dev_open() sees atomic_fetch_inc()
return 0 (old value) and calls airoha_qdma_start_napi() -> napi_enable(),
which hits BUG_ON(!test_bit(NAPI_STATE_SCHED, &val)) in napi_enable_locked()
because the NAPI was never disabled.

The early return also skips the netdev_tx_reset_subqueue() calls at the end
of the function.

Should the code either continue with cleanup on timeout (don't return), or
ensure NAPI and DMA state are properly cleaned up before returning?
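
To make the first option concrete, here is a minimal user-space model of
a stop path that records the timeout but still finishes teardown. It is
a sketch under assumptions, not driver code: the model_* names, the poll
counter and the -EBUSY return in the open path are all hypothetical
stand-ins for the qdma state and NAPI behaviour described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's QDMA state; not the real struct. */
struct model_qdma {
	int users;		/* models the atomic qdma->users count */
	bool napi_enabled;	/* models the NAPI enable state */
	bool dma_enabled;	/* models the TX/RX DMA enable bits */
	int busy_polls;		/* number of polls the TX DMA stays busy */
};

/* Models the poll loop: 0 once idle, -ETIMEDOUT after max_polls tries. */
static int model_wait_tx_idle(struct model_qdma *q, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (q->busy_polls == 0)
			return 0;
		q->busy_polls--;
	}
	return q->busy_polls == 0 ? 0 : -ETIMEDOUT;
}

/* Stop path that keeps the error code but never skips the teardown. */
static int model_dev_stop(struct model_qdma *q, int max_polls)
{
	int err = 0;

	assert(q->users > 0);
	if (--q->users == 0) {
		err = model_wait_tx_idle(q, max_polls);
		/* No early return: teardown runs even after a timeout. */
		q->napi_enabled = false;
		q->dma_enabled = false;
	}
	return err;
}

/* Open path: re-enabling an already enabled NAPI is the reported hazard. */
static int model_dev_open(struct model_qdma *q)
{
	if (q->users++ == 0) {
		if (q->napi_enabled)
			return -EBUSY;	/* stands in for the BUG_ON above */
		q->napi_enabled = true;
		q->dma_enabled = true;
	}
	return 0;
}
```

In this model a timed-out stop still leaves NAPI disabled, so a later
open succeeds instead of tripping over stale state.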

> +
> +		airoha_qdma_stop_napi(qdma);
>  		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
>  				  GLOBAL_CFG_TX_DMA_EN_MASK |
>  				  GLOBAL_CFG_RX_DMA_EN_MASK);

[ ... ]




* [PATCH v2] PCI: host-common: Request bus reassignment when not probe-only
From: Ratheesh Kannoth @ 2026-04-14  8:17 UTC (permalink / raw)
  To: linux-pci, linux-arm-kernel, linux-kernel, bhelgaas
  Cc: will, lpieralisi, kwilczynski, mani, robh, vidyas,
	Ratheesh Kannoth, Bjorn Helgaas

pci_host_common_init() is used by several generic ECAM host drivers.
After PCI core changes around pci_flags and preserve_config, these hosts
no longer opted into full bus number reassignment the way they did
before.

When PCI_PROBE_ONLY is not set, add PCI_REASSIGN_ALL_BUS so
pci_scan_bridge_extend() takes the reassignment path: bus numbers can be
assigned from firmware EA data (e.g. pci_ea_fixed_busnrs()). Skip the
flag in probe-only mode so existing assignments are not overridden.

CC: Bjorn Helgaas <helgaas@kernel.org>
CC: Vidya Sagar <vidyas@nvidia.com>
CC: Manivannan Sadhasivam <mani@kernel.org>
Fixes: 7246a4520b4b ("PCI: Use preserve_config in place of pci_flags")
Link: https://lore.kernel.org/netdev/adcXzcz2wWJFw4d7@rkannoth-OptiPlex-7090/
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

---
v1 -> v2 : https://lore.kernel.org/linux-pci/20260410142124.2673056-1-rkannoth@marvell.com/
---
 drivers/pci/controller/pci-host-common.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
index d6258c1cffe5..99952fb7189b 100644
--- a/drivers/pci/controller/pci-host-common.c
+++ b/drivers/pci/controller/pci-host-common.c
@@ -68,6 +68,10 @@ int pci_host_common_init(struct platform_device *pdev,
 	if (IS_ERR(cfg))
 		return PTR_ERR(cfg);
 
+	/* Do not reassign bus numbers if probe only */
+	if (!pci_has_flag(PCI_PROBE_ONLY))
+		pci_add_flags(PCI_REASSIGN_ALL_BUS);
+
 	bridge->sysdata = cfg;
 	bridge->ops = (struct pci_ops *)&ops->pci_ops;
 	bridge->enable_device = ops->enable_device;
-- 
2.43.0




* Re: [PATCH v4 3/9] coresight: etm4x: fix leaked trace id
From: Jie Gan @ 2026-04-14  8:04 UTC (permalink / raw)
  To: Yeoreum Yun, coresight, linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mike.leach, james.clark, alexander.shishkin,
	leo.yan
In-Reply-To: <20260413142003.3549310-4-yeoreum.yun@arm.com>



On 4/13/2026 10:19 PM, Yeoreum Yun wrote:
> If etm4_enable_sysfs() fails in cscfg_csdev_enable_active_config(),
> the trace ID may be leaked because it is not released.
> 
> To address this, call etm4_release_trace_id() when etm4_enable_sysfs()
> fails in cscfg_csdev_enable_active_config().
> 

LGTM.

Reviewed-by: Jie Gan <jie.gan@oss.qualcomm.com>

> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>   drivers/hwtracing/coresight/coresight-etm4x-core.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 8ebfd3924143..1bc9f13e33f7 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -918,8 +918,10 @@ static int etm4_enable_sysfs(struct coresight_device *csdev, struct coresight_pa
>   	cscfg_config_sysfs_get_active_cfg(&cfg_hash, &preset);
>   	if (cfg_hash) {
>   		ret = cscfg_csdev_enable_active_config(csdev, cfg_hash, preset);
> -		if (ret)
> +		if (ret) {
> +			etm4_release_trace_id(drvdata);
>   			return ret;
> +		}
>   	}
>   
>   	raw_spin_lock(&drvdata->spinlock);




* Re: [PATCH v4 2/9] coresight: etm4x: exclude ss_status from drvdata->config
From: Jie Gan @ 2026-04-14  8:02 UTC (permalink / raw)
  To: Yeoreum Yun, coresight, linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mike.leach, james.clark, alexander.shishkin,
	leo.yan
In-Reply-To: <20260413142003.3549310-3-yeoreum.yun@arm.com>



On 4/13/2026 10:19 PM, Yeoreum Yun wrote:
> The purpose of TRCSSCSRn register is to show status of
> the corresponding Single-shot Comparator Control and input supports.
> That means writable field's purpose for reset or restore from idle status
> not for configuration.
> 
> Therefore, exclude ss_status from drvdata->config, move it to etm4x_caps.
> This includes remove TRCSSCRn from configurable item and
> remove saving in etm4_disable_hw().
> 

LGTM.

Reviewed-by: Jie Gan <jie.gan@oss.qualcomm.com>

> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>   .../hwtracing/coresight/coresight-etm4x-cfg.c  |  1 -
>   .../hwtracing/coresight/coresight-etm4x-core.c | 18 +++++-------------
>   .../coresight/coresight-etm4x-sysfs.c          |  7 ++-----
>   drivers/hwtracing/coresight/coresight-etm4x.h  |  4 +++-
>   4 files changed, 10 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-cfg.c b/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> index c302072b293a..d14d7c8a23e5 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> @@ -86,7 +86,6 @@ static int etm4_cfg_map_reg_offset(struct etmv4_drvdata *drvdata,
>   		off_mask =  (offset & GENMASK(11, 5));
>   		do {
>   			CHECKREGIDX(TRCSSCCRn(0), ss_ctrl, idx, off_mask);
> -			CHECKREGIDX(TRCSSCSRn(0), ss_status, idx, off_mask);
>   			CHECKREGIDX(TRCSSPCICRn(0), ss_pe_cmp, idx, off_mask);
>   		} while (0);
>   	} else if ((offset >= TRCCIDCVRn(0)) && (offset <= TRCVMIDCVRn(7))) {
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 6443f3717b37..8ebfd3924143 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -91,7 +91,7 @@ static bool etm4x_sspcicrn_present(struct etmv4_drvdata *drvdata, int n)
>   	const struct etmv4_caps *caps = &drvdata->caps;
>   
>   	return (n < caps->nr_ss_cmp) && caps->nr_pe &&
> -	       (drvdata->config.ss_status[n] & TRCSSCSRn_PC);
> +	       (caps->ss_status[n] & TRCSSCSRn_PC);
>   }
>   
>   u64 etm4x_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
> @@ -571,11 +571,9 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
>   		etm4x_relaxed_write32(csa, config->res_ctrl[i], TRCRSCTLRn(i));
>   
>   	for (i = 0; i < caps->nr_ss_cmp; i++) {
> -		/* always clear status bit on restart if using single-shot */
> -		if (config->ss_ctrl[i] || config->ss_pe_cmp[i])
> -			config->ss_status[i] &= ~TRCSSCSRn_STATUS;
>   		etm4x_relaxed_write32(csa, config->ss_ctrl[i], TRCSSCCRn(i));
> -		etm4x_relaxed_write32(csa, config->ss_status[i], TRCSSCSRn(i));
> +		/* always clear status and pending bits on restart if using single-shot */
> +		etm4x_relaxed_write32(csa, caps->ss_status[i], TRCSSCSRn(i));
>   		if (etm4x_sspcicrn_present(drvdata, i))
>   			etm4x_relaxed_write32(csa, config->ss_pe_cmp[i], TRCSSPCICRn(i));
>   	}
> @@ -1053,12 +1051,6 @@ static void etm4_disable_hw(struct etmv4_drvdata *drvdata)
>   
>   	etm4_disable_trace_unit(drvdata);
>   
> -	/* read the status of the single shot comparators */
> -	for (i = 0; i < caps->nr_ss_cmp; i++) {
> -		config->ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> -	}
> -
>   	/* read back the current counter values */
>   	for (i = 0; i < caps->nr_cntr; i++) {
>   		config->cntr_val[i] =
> @@ -1501,8 +1493,8 @@ static void etm4_init_arch_data(void *info)
>   	 */
>   	caps->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
>   	for (i = 0; i < caps->nr_ss_cmp; i++) {
> -		drvdata->config.ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +		caps->ss_status[i] = etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +		caps->ss_status[i] &= ~(TRCSSCSRn_STATUS | TRCSSCSRn_PENDING);
>   	}
>   	/* NUMCIDC, bits[27:24] number of Context ID comparators for tracing */
>   	caps->numcidc = FIELD_GET(TRCIDR4_NUMCIDC_MASK, etmidr4);
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> index 50408215d1ac..dd62f01674cf 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> @@ -1827,8 +1827,6 @@ static ssize_t sshot_ctrl_store(struct device *dev,
>   	raw_spin_lock(&drvdata->spinlock);
>   	idx = config->ss_idx;
>   	config->ss_ctrl[idx] = FIELD_PREP(TRCSSCCRn_SAC_ARC_RST_MASK, val);
> -	/* must clear bit 31 in related status register on programming */
> -	config->ss_status[idx] &= ~TRCSSCSRn_STATUS;
>   	raw_spin_unlock(&drvdata->spinlock);
>   	return size;
>   }
> @@ -1839,10 +1837,11 @@ static ssize_t sshot_status_show(struct device *dev,
>   {
>   	unsigned long val;
>   	struct etmv4_drvdata *drvdata = dev_get_drvdata(dev->parent);
> +	const struct etmv4_caps *caps = &drvdata->caps;
>   	struct etmv4_config *config = &drvdata->config;
>   
>   	raw_spin_lock(&drvdata->spinlock);
> -	val = config->ss_status[config->ss_idx];
> +	val = caps->ss_status[config->ss_idx];
>   	raw_spin_unlock(&drvdata->spinlock);
>   	return scnprintf(buf, PAGE_SIZE, "%#lx\n", val);
>   }
> @@ -1877,8 +1876,6 @@ static ssize_t sshot_pe_ctrl_store(struct device *dev,
>   	raw_spin_lock(&drvdata->spinlock);
>   	idx = config->ss_idx;
>   	config->ss_pe_cmp[idx] = FIELD_PREP(TRCSSPCICRn_PC_MASK, val);
> -	/* must clear bit 31 in related status register on programming */
> -	config->ss_status[idx] &= ~TRCSSCSRn_STATUS;
>   	raw_spin_unlock(&drvdata->spinlock);
>   	return size;
>   }
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> index 8168676f2945..8864cfb76bad 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> @@ -213,6 +213,7 @@
>   #define TRCACATRn_EXLEVEL_MASK			GENMASK(14, 8)
>   
>   #define TRCSSCSRn_STATUS			BIT(31)
> +#define TRCSSCSRn_PENDING			BIT(30)
>   #define TRCSSCCRn_SAC_ARC_RST_MASK		GENMASK(24, 0)
>   
>   #define TRCSSPCICRn_PC_MASK			GENMASK(7, 0)
> @@ -861,6 +862,7 @@ enum etm_impdef_type {
>    * @lpoverride:	If the implementation can support low-power state over.
>    * @skip_power_up: Indicates if an implementation can skip powering up
>    *		   the trace unit.
> + * @ss_status:	The status of the corresponding single-shot comparator.
>    */
>   struct etmv4_caps {
>   	u8	nr_pe;
> @@ -899,6 +901,7 @@ struct etmv4_caps {
>   	bool	atbtrig : 1;
>   	bool	lpoverride : 1;
>   	bool	skip_power_up : 1;
> +	u32	ss_status[ETM_MAX_SS_CMP];
>   };
>   
>   /**
> @@ -977,7 +980,6 @@ struct etmv4_config {
>   	u32				res_ctrl[ETM_MAX_RES_SEL]; /* TRCRSCTLRn */
>   	u8				ss_idx;
>   	u32				ss_ctrl[ETM_MAX_SS_CMP];
> -	u32				ss_status[ETM_MAX_SS_CMP];
>   	u32				ss_pe_cmp[ETM_MAX_SS_CMP];
>   	u8				addr_idx;
>   	u64				addr_val[ETM_MAX_SINGLE_ADDR_CMP];




* RE: [PATCH v7 0/4] PCI: Add support for resetting the Root Ports in a platform specific way
From: Hongxing Zhu @ 2026-04-14  7:59 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Brian Norris
  Cc: manivannan.sadhasivam@oss.qualcomm.com, Bjorn Helgaas,
	Mahesh J Salgaonkar, Oliver O'Halloran, Will Deacon,
	Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
	Heiko Stuebner, Philipp Zabel, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org,
	linux-arm-msm@vger.kernel.org, linux-rockchip@lists.infradead.org,
	Niklas Cassel, Wilfred Mallawa, Krishna Chaitanya Chundru,
	Lukas Wunner, Wilson Ding, Miles Chen
In-Reply-To: <klzq2i2ne62hdri65gz7s5pxmvely277optr2lrkvdrrahl3ca@k3hdo6o4nkjz>


> -----Original Message-----
> From: Manivannan Sadhasivam <mani@kernel.org>
> Sent: April 14, 2026 0:35
> To: Brian Norris <briannorris@chromium.org>; Hongxing Zhu
> <hongxing.zhu@nxp.com>
> Cc: Hongxing Zhu <hongxing.zhu@nxp.com>;
> manivannan.sadhasivam@oss.qualcomm.com; Bjorn Helgaas
> <bhelgaas@google.com>; Mahesh J Salgaonkar <mahesh@linux.ibm.com>;
> Oliver O'Halloran <oohall@gmail.com>; Will Deacon <will@kernel.org>;
> Lorenzo Pieralisi <lpieralisi@kernel.org>; Krzysztof Wilczyński
> <kwilczynski@kernel.org>; Rob Herring <robh@kernel.org>; Heiko Stuebner
> <heiko@sntech.de>; Philipp Zabel <p.zabel@pengutronix.de>;
> linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org;
> linuxppc-dev@lists.ozlabs.org; linux-arm-kernel@lists.infradead.org;
> linux-arm-msm@vger.kernel.org; linux-rockchip@lists.infradead.org; Niklas
> Cassel <cassel@kernel.org>; Wilfred Mallawa <wilfred.mallawa@wdc.com>;
> Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>; Lukas
> Wunner <lukas@wunner.de>; Wilson Ding <dingwei@marvell.com>; Miles
> Chen <minhuachen@google.com>
> Subject: Re: [PATCH v7 0/4] PCI: Add support for resetting the Root Ports in a
> platform specific way
> 
> Hi Brian,
> 
> On Wed, Apr 08, 2026 at 06:58:34PM -0700, Brian Norris wrote:
> > Hi Richard and Mani,
> >
> > For the record, I've been using a form of an earlier version of this
> > patchset in my environment for some time now, and I've run across
> > problems that *might* relate to what Richard is reporting, but I'm not
> > quite sure at the moment. Details below.
> >
> > On Wed, Mar 25, 2026 at 07:06:49AM +0000, Hongxing Zhu wrote:
> > > Hi Mani:
> > > I've accidentally encountered a new issue based on the reset root port
> patch-set.
> > > After performing a few hot-reset operations, the PCIe link enters a
> continuous up/down cycling pattern.
> > >
> > > I found that calling pci_reset_secondary_bus() first in
> pcibios_reset_secondary_bus() appears to resolve this issue.
> > > Have you experienced a similar problem?
> > >
> > > "
> > > ...
> > > [  141.897701] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> > > [  142.086341] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> > > [  142.092038] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
> > > ...
> > > "
> > >
> > > Platform: i.MX95 EVK board plus local Root Ports reset supports based
> on the #1 and #2 patches of v7 patch-set.
> > > Notes of the logs:
> > > - One Gen3 NVME device is connected.
> > > - "./memtool 4c341058=0;./memtool 4c341058=1;" is used to toggle the
> LTSSM_EN bit to trigger the link down.
> > > - Toggle BIT6 of Bridge Control Register to trigger hot reset by
> "./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff;"
> > > - The Root Port reset patches works correctly at first.
> > > However, after several hot-reset triggers, the link enters a repeated
> down/up cycling state.
> > >
> > > Logs:
> > > [    3.553188] imx6q-pcie 4c300000.pcie: host bridge
> /soc/pcie@4c300000 ranges:
> > > [    3.560308] imx6q-pcie 4c300000.pcie:       IO
> 0x006ff00000..0x006fffffff -> 0x0000000000
> > > [    3.568525] imx6q-pcie 4c300000.pcie:      MEM
> 0x0910000000..0x091fffffff -> 0x0010000000
> > > [    3.577314] imx6q-pcie 4c300000.pcie: config reg[1] 0x60100000 ==
> cpu 0x60100000
> > > [    3.796029] imx6q-pcie 4c300000.pcie: iATU: unroll T, 128 ob, 128 ib,
> align 4K, limit 1024G
> > > [    4.003746] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> > > [    4.009553] imx6q-pcie 4c300000.pcie: PCI host bridge to bus 0000:00
> > > root@imx95evk:~#
> > > root@imx95evk:~#
> > > root@imx95evk:~# ./memtool 4c341058=0;./memtool 4c341058=1;
> Writing
> > > 32-bit value 0x0 to address 0x4C341058 Writing 32-bit v
> > > [   87.265348] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d01)
> link down detected
> > > alue 0x1 to adder
> > > [   87.273106] imx6q-pcie 4c300000.pcie: Stop root bus and handle link
> down
> > > ss 0x4C341058
> > > [   87.281264] pcieport 0000:00:00.0: Recovering Root Port due to Link
> Down
> > > [   87.289245] pci 0000:01:00.0: AER: can't recover (no error_detected
> callback)
> > > root@imx95evk:~# [   87.514216] imx6q-pcie 4c300000.pcie:
> PCIe(LNK_STS:0x00000700) link up detected
> > > [   87.702968] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> > > [   87.834983] pcieport 0000:00:00.0: Root Port has been reset
> > > [   87.840714] pcieport 0000:00:00.0: AER: device recovery failed
> > > [   87.846592] imx6q-pcie 4c300000.pcie: Rescan bus after link up is
> detected
> > > [   87.855947] pcieport 0000:00:00.0: bridge configuration invalid ([bus
> 00-00]), reconfiguring
> >
> > I've seen this same line ("bridge configuration invalid") before, and
> > I believe that's because the saved state (pci_save_state(); more about
> > this below) is invalid -- it contains 0 values in places where they
> > should be non-zero. So when those values are restored
> > (pci_restore_state()), we get confused.
> >
> > I believe we've pinned down one reason this invalid state occurs --
> > it's because of an automatic (mis)feature in the DesignWare PCIe
> hardware.
> > Specifically, it's because of what the controller does during a
> > surprise link-down error.
> >
> > From the Designware docs:
> >
> >   "[...] during normal operation, the link might fail and go down. After
> >   this link-down event, the controller requests the DWC_pcie_clkrst.v
> >   module to hot-reset the controller. There is no difference in the
> >   handling of a link-down reset or a hot reset; the controller asserts
> >   the link_req_rst_not output requesting the DWC_pcie_clkrst.v module
> to
> >   reset the controller."
> >
> > In some of the adjacent documentation (and confirmed in local
> > testing), it suggests that this automatic reset will also reset
> > various DBI (i.e., PCIe config space) registers. It also seems as if
> > there's not really a good way to completely stop this automatic reset
> > -- the docs mention some SW methods prevent the reset, but they all
> seem racy or incomplete.
> >
> > Anyway, I think this implies that patch 1 is somewhat wrong [1]. It
> > includes some code like this:
> >
> > 		pci_save_state(dev);
> > 		ret = host->reset_root_port(host, dev);
> > 		if (ret)
> > 			pci_err(dev, "Failed to reset Root Port: %d\n", ret);
> > 		else
> > 			/* Now restore it on success */
> > 			pci_restore_state(dev);
> >
> > That first line (pci_save_state()) is prone to saving invalid state,
> > depending on whether the link-down event has finished flushing and
> > resetting the controller yet or not. The resulting impact is a bit
> > hard to judge, depending on what (mis)configuration you end up with.
> >
> 
> Thanks a lot for your investigation. I think your observation makes sense and
> could be the culprit in saving the corrupted state. Even on non-DWC
> controllers, there is no guarantee that the Root Port config registers state will
> be preserved after LDn (before Root Port reset).
> 
> > I also noticed commit a2f1e22390ac ("PCI/ERR: Ensure error
> > recoverability at all times") was merged recently. With that change, I
> > believe it is now safe to perform pci_restore_state() even without
> > pci_save_state() here.
> >
> > So ... can we remove pci_save_state() from
> > pcibios_reset_secondary_bus()? Might that help?
> 
> I think so. I will also test it locally and report back soon.
> 
> > It sounds like my above
> > observations *may* match Richard's reports, but I'm not sure. And
> > anyway, the documented hardware behavior is racy, so it's hard to
> > propose a foolproof solution.
> >
> 
> @Richard: Can you confirm if removing 'pci_save_state(dev);' from
> pcibios_reset_secondary_bus() fixes your issue?
I have tested the hot reset trigger hundreds of times, and it works
consistently.
@Brian Norris @Manivannan Sadhasivam
Thanks a lot for your help.
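
For readers following the pci_save_state() discussion, the ordering problem can
be sketched with a small user-space model in C. Everything here is
hypothetical: struct rp_model, the model_* helpers and the register value are
illustrative, and the model assumes that a valid copy of the Root Port state
was saved at enumeration time and is still available at recovery time. It is
not the PCI core implementation.

```c
#include <stdint.h>

/* Illustrative bus-number value: primary 0x00, secondary 0x01, subordinate 0x01. */
#define MODEL_BUS_NUMBERS 0x00010100u

/* Hypothetical model of one Root Port config register plus its saved copy. */
struct rp_model {
	uint32_t bus_numbers;	/* live register contents */
	uint32_t saved;		/* copy kept by the save step */
};

/* Enumeration programs the register and saves a known-good copy. */
static void model_enumerate(struct rp_model *rp)
{
	rp->bus_numbers = MODEL_BUS_NUMBERS;
	rp->saved = rp->bus_numbers;
}

/* Models the automatic controller reset after a surprise link down. */
static void model_link_down_auto_reset(struct rp_model *rp)
{
	rp->bus_numbers = 0;
}

/*
 * Recovery path: optionally save first, then reset, then restore.
 * Saving at this point captures whatever the hardware reset left behind.
 */
static void model_recover(struct rp_model *rp, int save_before_reset)
{
	if (save_before_reset)
		rp->saved = rp->bus_numbers;	/* may capture zeros */
	rp->bus_numbers = 0;			/* platform-specific reset */
	rp->bus_numbers = rp->saved;		/* restore step */
}
```

In this model, saving after the link-down reset restores a zeroed register
(comparable to the "[bus 00-00]" message in the log), while skipping the save
restores the value programmed at enumeration.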

Best Regards
Richard Zhu
> 
> - Mani
> 
> --
> மணிவண்ணன் சதாசிவம்


* Re: [PATCH v4 1/9] coresight: etm4x: introduce struct etm4_caps
From: Yeoreum Yun @ 2026-04-14  7:55 UTC (permalink / raw)
  To: Leo Yan
  Cc: coresight, linux-arm-kernel, linux-kernel, suzuki.poulose,
	mike.leach, james.clark, alexander.shishkin, jie.gan
In-Reply-To: <20260413172104.GD356832@e132581.arm.com>

Hi Leo!

> On Mon, Apr 13, 2026 at 03:19:54PM +0100, Yeoreum Yun wrote:
> > Introduce struct etm4_caps to describe ETMv4 capabilities
> > and move capabilities information into it.
> >
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
>
> LGTM:
>
> Reviewed-by: Leo Yan <leo.yan@arm.com>

Thanks.

>
> FWIW, two comments from Sashiko are valuable for me, please see below.
>
> > ---
> >  .../coresight/coresight-etm4x-core.c          | 234 +++++++++---------
> >  .../coresight/coresight-etm4x-sysfs.c         | 190 ++++++++------
> >  drivers/hwtracing/coresight/coresight-etm4x.h | 176 ++++++-------
> >  3 files changed, 328 insertions(+), 272 deletions(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > index d565a73f0042..6443f3717b37 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > @@ -88,8 +88,9 @@ static int etm4_probe_cpu(unsigned int cpu);
> >   */
> >  static bool etm4x_sspcicrn_present(struct etmv4_drvdata *drvdata, int n)
> >  {
> > -	return (n < drvdata->nr_ss_cmp) &&
> > -	       drvdata->nr_pe &&
> > +	const struct etmv4_caps *caps = &drvdata->caps;
> > +
> > +	return (n < caps->nr_ss_cmp) && caps->nr_pe &&
> >  	       (drvdata->config.ss_status[n] & TRCSSCSRn_PC);
>
> As Sashiko suggests:
>
>   "This isn't a regression introduced by this patch, but should this be
>    checking caps->nr_pe_cmp instead of caps->nr_pe?"
>
> I confirmed the ETMv4 specification (ARM IHI0064H.b), the comment
> above is valid as the we should check caps->nr_pe_cmp instead.
>
> Could you first use a patch to fix the typo and then apply
> capabilities afterwards?  This is helpful for porting to stable
> kernels.

Sashiko makes a valid point, and I'm a bit surprised nobody caught it before.
Okay, I'll send it in the next round.

>
> [...]
>
> > @@ -525,14 +530,14 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
> >  	if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 1))
> >  		dev_err(etm_dev,
> >  			"timeout while waiting for Idle Trace Status\n");
> > -	if (drvdata->nr_pe)
> > +	if (caps->nr_pe)
> >  		etm4x_relaxed_write32(csa, config->pe_sel, TRCPROCSELR);
> >  	etm4x_relaxed_write32(csa, config->cfg, TRCCONFIGR);
> >  	/* nothing specific implemented */
> >  	etm4x_relaxed_write32(csa, 0x0, TRCAUXCTLR);
> >  	etm4x_relaxed_write32(csa, config->eventctrl0, TRCEVENTCTL0R);
> >  	etm4x_relaxed_write32(csa, config->eventctrl1, TRCEVENTCTL1R);
> > -	if (drvdata->stallctl)
> > +	if (caps->stallctl)
> >  		etm4x_relaxed_write32(csa, config->stall_ctrl, TRCSTALLCTLR);
> >  	etm4x_relaxed_write32(csa, config->ts_ctrl, TRCTSCTLR);
> >  	etm4x_relaxed_write32(csa, config->syncfreq, TRCSYNCPR);
> > @@ -542,17 +547,17 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
> >  	etm4x_relaxed_write32(csa, config->vinst_ctrl, TRCVICTLR);
> >  	etm4x_relaxed_write32(csa, config->viiectlr, TRCVIIECTLR);
> >  	etm4x_relaxed_write32(csa, config->vissctlr, TRCVISSCTLR);
> > -	if (drvdata->nr_pe_cmp)
> > +	if (caps->nr_pe_cmp)
> >  		etm4x_relaxed_write32(csa, config->vipcssctlr, TRCVIPCSSCTLR);
> > -	for (i = 0; i < drvdata->nrseqstate - 1; i++)
> > +	for (i = 0; i < caps->nrseqstate - 1; i++)
> >  		etm4x_relaxed_write32(csa, config->seq_ctrl[i], TRCSEQEVRn(i));
>
> Sashiko's comment:
>
>   "If the hardware does not implement a sequencer, caps->nrseqstate (a u8)
>    will be 0. Does 0 - 1 evaluate to -1 as an int, which then gets promoted
>    to ULONG_MAX against val (an unsigned long)?"
>
> This is a good catch.  The condition check should be:
>
>   for (i = 0; i < caps->nrseqstate; i++)
>        ...;
>
> The issue is irrelevant to your patch, but could you use a patch to fix
> "nrseqstate - 1" first and then apply the cap refactoring on it?  This
> would be friendly for porting to stable kernel.

Okay. I'll send it next round.
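
As a side note on the "nrseqstate - 1" arithmetic, here is a small standalone
C sketch of the integer promotion involved. The helper names are hypothetical
and this is not the driver code; which case applies depends on the type of the
other operand at each call site. With a signed int index the loop simply does
not run when the count is 0, while a comparison against an unsigned long sees
the -1 converted to ULONG_MAX.

```c
#include <stdint.h>

/* Iterations of "for (i = 0; i < n - 1; i++)" with a signed int index. */
static unsigned long count_iters_signed(uint8_t nrseqstate)
{
	unsigned long iters = 0;
	int i;

	/* nrseqstate is promoted to int, so 0 - 1 is the int value -1. */
	for (i = 0; i < nrseqstate - 1; i++)
		iters++;
	return iters;
}

/* Result of a "val < n - 1" bound check when val is an unsigned long. */
static int bound_check_unsigned(unsigned long val, uint8_t nrseqstate)
{
	int bound = nrseqstate - 1;	/* -1 when nrseqstate is 0 */

	/*
	 * The usual arithmetic conversions perform this conversion
	 * implicitly in "val < nrseqstate - 1"; it is spelled out here.
	 * A bound of -1 becomes ULONG_MAX.
	 */
	unsigned long ubound = (unsigned long)bound;

	return val < ubound;
}
```

So count_iters_signed(0) performs no iterations, but bound_check_unsigned()
accepts almost any value when nrseqstate is 0, which is the case the quoted
comment is worried about.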

Thanks!

--
Sincerely,
Yeoreum Yun



* Re: [PATCH 0/3] arm-smmu-v3: Add PMCG child support and update PMU MMIO mapping
From: Peng Fan @ 2026-04-14  7:47 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, Joerg Roedel, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Mark Rutland, linux-arm-kernel, iommu, devicetree,
	linux-kernel, linux-perf-users, Peng Fan
In-Reply-To: <65629411-0e1c-4c9c-bc9f-6488097bd77f@arm.com>

Hi Robin,

On Fri, Apr 10, 2026 at 01:07:29PM +0100, Robin Murphy wrote:
>On 08/04/2026 2:47 pm, Peng Fan wrote:
>> On Wed, Apr 08, 2026 at 12:15:31PM +0100, Robin Murphy wrote:
>> > On 2026-04-08 8:51 am, Peng Fan (OSS) wrote:
>> > > This patch series adds proper support for describing and probing the
>> > > Arm SMMU v3 PMCG (Performance Monitor Control Group) as a child node of
>> > > the SMMU in Devicetree, and updates the relevant drivers accordingly.
>> > > 
>> > > The SMMU v3 architecture allows an optional PMCG block, typically
>> > > associated with TCUs, to be implemented within the SMMU register
>> > > address space. For example, mmu700 PMCG is at the offset 0x2000 of the
>> > > TCU page 0.
>> > 
>> > But what's wrong with the existing binding? Especially given that it even has
>> > an upstream user already:
>> > 
>> > https://git.kernel.org/torvalds/c/aef9703dcbf8
>> > 
>> > > Patch 1 updates the SMMU v3 Devicetree binding to allow PMCG child nodes,
>> > > referencing the existing arm,smmu-v3-pmcg binding.
>> > > 
>> > > Patch 2 updates the arm-smmu-v3 driver to populate platform devices for
>> > > child nodes described in DT once the SMMU probe succeeds.
>> > > 
>> > > Patch 3 updates the SMMUv3 PMU driver to correctly handle MMIO mapping when
>> > > PMCG is described as a child node. The PMCG registers occupy a sub-region
>> > > of the parent SMMU MMIO window, which is already requested by the SMMU
>> > 
>> > That has not been the case since 52f3fab0067d ("iommu/arm-smmu-v3: Don't
>> > reserve implementation defined register space") nearly 6 years ago, where the
>> > whole purpose was to support Arm's PMCG implementation properly. What kernel
>> > is this based on?
>> 
>> Seems I am wrong. I thought PMCG is in page 0, so there were resource
>> conflicts. I just retest without this patchset, all goes well.
>> 
>> But from dt perspective, should the TCU PMCG node be child node of
>> SMMU node?
>
>No. PMCGs can be used entirely independently of the SMMU itself, and while
>most of the events do relate to SMMU translation and thus aren't necessarily
>meaningful if it's not in use, there are still some which can be useful for
>basic traffic counting, monitoring GPT/translation activity from _other_
>security states (if observation is delegated to Non-Secure) and possibly
>other things, even if the "main" Non-Secure SMMU interface isn't advertised
>at all. It would be unreasonable to require the SMMU node to be present and
>enabled *and* have a driver to populate PMCGs, to monitor events which are
>outside the scope of that driver.

Thanks for explaining this in detail.

I have one more question: we are using MMU-700, but the MMU-700
implementation-defined TCU and TBU events are not supported.

Should we introduce a compatible string such as "arm,mmu700-tcu-pmcg" or
"arm,mmu700-tbu-pmcg"? To be honest, I have not checked MMU-600(AE) or others.

Thanks,
Peng

>
>Thanks,
>Robin.
>



* Re: [PATCH] mm/arm: pgtable: remove young bit check for pte_valid_user
From: Brian Ruley @ 2026-04-14  7:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King (Oracle), Steve Capper, linux-arm-kernel,
	linux-kernel, catalin.marinas
In-Reply-To: <ad0Ky09tLcFx7JCa@zoo11.fihel.lab.ge-healthcare.net>

On Apr 13, Brian Ruley wrote:
> 
> In the meanwhile, I'll see if I can add some instrumentation to
> verify this is the bug we're seeing.

The instrumentation worked (used the same ring buffer approach to track
dcache flushes). Here's the output:

```
kernel: [48629.557043] SIGILL at b6b80ac0 cpu 1 pid 32663 linux_pte=8eff659f hw_pte=8eff6e7e young=1 exec=1
kernel: [48629.557157] dcache flush START   cpu0 pfn=8eff6 ts=48629557020154
kernel: [48629.557207] dcache flush FINISH  cpu0 pfn=8eff6 ts=48629557036154
kernel: [48629.557230] dcache flush SKIPPED cpu1 pfn=8eff6 ts=48629557020154
audisp-syslog: type=ANOM_ABEND msg=audit(1776154637.460:15): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=32663 comm="journalctl" exe="/usr/bin/journalctl" sig=4 res=1 AUID="unset" UID="root" GID="root"
```
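
For anyone wanting to reproduce this kind of tracing, below is a rough,
single-threaded C sketch of a ring buffer of flush events plus a check for the
pattern in the output above: a SKIPPED record on one CPU whose timestamp falls
inside another CPU's START..FINISH window for the same pfn. All names
(flush_ring, flush_ring_log, flush_ring_skip_raced) are hypothetical and this
is not the actual instrumentation; a real version would need per-CPU buffers
or locking. Reading the log as an overlap of this kind is an interpretation,
not something the output states on its own.

```c
#include <stddef.h>
#include <stdint.h>

enum flush_event { FLUSH_START, FLUSH_FINISH, FLUSH_SKIPPED };

struct flush_record {
	enum flush_event ev;
	unsigned int cpu;
	unsigned long pfn;
	uint64_t ts;		/* timestamp, e.g. in nanoseconds */
};

#define FLUSH_RING_SIZE 8	/* small on purpose; old entries are overwritten */

struct flush_ring {
	struct flush_record rec[FLUSH_RING_SIZE];
	size_t head;		/* total number of records ever logged */
};

/* Append one record, overwriting the oldest slot once the ring is full. */
static void flush_ring_log(struct flush_ring *r, enum flush_event ev,
			   unsigned int cpu, unsigned long pfn, uint64_t ts)
{
	struct flush_record *slot = &r->rec[r->head % FLUSH_RING_SIZE];

	slot->ev = ev;
	slot->cpu = cpu;
	slot->pfn = pfn;
	slot->ts = ts;
	r->head++;
}

/*
 * Return 1 if some CPU skipped a flush of @pfn while a different CPU had
 * started, but not yet finished, flushing the same pfn.
 */
static int flush_ring_skip_raced(const struct flush_ring *r, unsigned long pfn)
{
	size_t n = r->head < FLUSH_RING_SIZE ? r->head : FLUSH_RING_SIZE;
	size_t s, a, b;

	for (s = 0; s < n; s++) {
		const struct flush_record *skip = &r->rec[s];

		if (skip->ev != FLUSH_SKIPPED || skip->pfn != pfn)
			continue;
		for (a = 0; a < n; a++) {
			const struct flush_record *start = &r->rec[a];

			if (start->ev != FLUSH_START || start->pfn != pfn ||
			    start->cpu == skip->cpu || start->ts > skip->ts)
				continue;
			for (b = 0; b < n; b++) {
				const struct flush_record *fin = &r->rec[b];

				if (fin->ev == FLUSH_FINISH &&
				    fin->pfn == pfn &&
				    fin->cpu == start->cpu &&
				    fin->ts > skip->ts)
					return 1;
			}
		}
	}
	return 0;
}
```

Feeding the three records from the output above into flush_ring_log() makes
flush_ring_skip_raced() report an overlap for pfn 0x8eff6.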

BR,
Brian



* Re: [PATCH 3/5] ARM: configs: Drop redundant SND_ATMEL_SOC
From: Claudiu Beznea @ 2026-04-14  7:38 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Nicolas Ferre, Alexandre Belloni,
	Bjorn Andersson, Dmitry Baryshkov, Dinh Nguyen,
	Krzysztof Kozlowski
  Cc: Arnd Bergmann, Linus Walleij, Drew Fustini, soc, linux-arm-kernel,
	linux-kernel
In-Reply-To: <20260412-b4-defconfig-multi-v7-v1-3-e76de035c2df@oss.qualcomm.com>



On 4/12/26 20:12, Krzysztof Kozlowski wrote:
> CONFIG_SND_ATMEL_SOC is gone since commit 4f30f84feb77 ("ASoC: atmel:
> Standardize ASoC menu") and can be simply dropped without effect.
> 
> No impact on include/generated/autoconf.h.
> 
> Signed-off-by: Krzysztof Kozlowski<krzysztof.kozlowski@oss.qualcomm.com>

Acked-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>



* Re: [PATCH 10/10] drm: of: forbid bridge-only calls to drm_of_find_panel_or_bridge()
From: Luca Ceresoli @ 2026-04-14  7:02 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Rob Clark, Dmitry Baryshkov, Abhinav Kumar,
	Jessica Zhang, Sean Paul, Marijn Suijten, Xinliang Liu, Tian Tao,
	Xinwei Kong, Sumit Semwal, Yongqin Liu, John Stultz,
	Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
	Jonas Karlman, Jernej Skrabec, Tomi Valkeinen, Michal Simek,
	Hui Pu, Ian Ray, Thomas Petazzoni, dri-devel, linux-kernel,
	linux-arm-msm, freedreno, linux-arm-kernel
In-Reply-To: <nligqvm3lq6n556onglmb345arxztd4pc6fboo4yrs3bfu27eu@uiyu2xklnexb>

Hello Dmitry, Maxime,

On Mon Apr 13, 2026 at 8:04 PM CEST, Dmitry Baryshkov wrote:
> On Mon, Apr 13, 2026 at 03:58:42PM +0200, Luca Ceresoli wrote:
>> Up to now drm_of_find_panel_or_bridge() can be called with a bridge pointer
>> only, a panel pointer only, or both a bridge and a panel pointers. The
>> logic to handle all the three cases is somewhat complex to read however.
>>
>> Now all bridge-only callers have been converted to
>> of_drm_get_bridge_by_endpoint(), which is simpler and handles bridge
>> refcounting. So forbid new bridge-only users by mandating a non-NULL panel
>> pointer in the docs and in the sanity checks along with a warning.
>
> Are there remaining users which still use either the bridge or the
> panel? Would it be possible / better to drop the two-arg version?

Yes. I counted ~20 panel+bridge and 4 panel-only callers with this series
applied, and on top of those there are devm_drm_of_get_bridge() and
drmm_of_get_bridge() which to me are the real issue because they make it
impossible to correctly handle bridge lifetime.

We discussed this with both you and Maxime a while back. AFAIK Maxime has a
plan to make every panel automatically instantiate a panel_bridge. I think
that's the only reasonable approach to get rid of
drm_of_find_panel_or_bridge() + *_of_get_bridge() and make bridge lifetime
easier and safe.

@Maxime, do you have updates on that idea?

Best regards,
Luca

--
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



