Linux-ARM-Kernel Archive on lore.kernel.org


* Re: [PATCH net-next v3 09/12] net: airoha: Support multiple net_devices for a single FE GDM port
From: Jakub Kicinski @ 2026-04-10  2:57 UTC
  To: lorenzo
  Cc: Jakub Kicinski, andrew+netdev, davem, edumazet, pabeni, robh,
	krzk+dt, conor+dt, ansuelsmth, benjamin.larsson, linux-arm-kernel,
	linux-mediatek, netdev, devicetree
In-Reply-To: <20260406-airoha-eth-multi-serdes-v3-9-ab6ea49d59ff@kernel.org>

This is an AI-generated review of your patch. The human sending this
email says "For your consideration."
---
net: airoha: Support multiple net_devices for a single FE GDM port

This commit introduces support for mapping multiple net_device interfaces
to a single Frame Engine (FE) GDM port via an external hardware arbiter.
It updates the rx/tx paths, port allocation, and registration logic to
handle the new net_device array.

> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
[ ... ]
> @@ -615,9 +617,8 @@ static int airoha_qdma_rx_process(struct airoha_queue *q, int budget)
>  		u32 hash, reason, msg1 = le32_to_cpu(desc->msg1);
>  		struct page *page = virt_to_head_page(e->buf);
>  		u32 desc_ctrl = le32_to_cpu(desc->ctrl);
> -		struct airoha_gdm_port *port;
> -		struct net_device *netdev;
> -		int data_len, len, p;
> +		struct airoha_gdm_dev *dev;
> +		int data_len, len;
>  
>  		if (!(desc_ctrl & QDMA_DESC_DONE_MASK))
>  			break;

While looking at this code, I noticed an issue that predates this patch.
Can the lack of a memory barrier after checking the descriptor ownership bit
cause stale reads?

The driver checks QDMA_DESC_DONE_MASK to verify that the hardware has finished
writing the descriptor. However, it reads desc->msg1 before this check and
without a dma_rmb():

		u32 hash, reason, msg1 = le32_to_cpu(desc->msg1);
		struct page *page = virt_to_head_page(e->buf);
		u32 desc_ctrl = le32_to_cpu(desc->ctrl);

		if (!(desc_ctrl & QDMA_DESC_DONE_MASK))
			break;

Since weakly ordered architectures can reorder memory reads, could the CPU
prefetch msg1 before the DMA engine writes the new descriptor values, leading
to routing the packet to the wrong gdm_dev?
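
As a rough illustration of the ordering concern, here is a minimal userspace
sketch using C11 atomics. It is not the driver code: demo_desc,
DEMO_DESC_DONE_MASK and both helpers are hypothetical names, and the acquire
fence only stands in for what dma_rmb() would provide in the kernel. The
point it tries to show is that the payload field is read only after the DONE
bit has been observed, with a barrier between the two loads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QDMA_DESC_DONE_MASK; the real bit is not shown above. */
#define DEMO_DESC_DONE_MASK (1u << 31)

struct demo_desc {
	_Atomic uint32_t ctrl;	/* carries the DONE/ownership bit */
	uint32_t msg1;		/* only meaningful once DONE is observed */
};

/* Plays the role of the hardware: write the payload, then publish DONE. */
static void demo_desc_complete(struct demo_desc *d, uint32_t msg1)
{
	d->msg1 = msg1;
	atomic_store_explicit(&d->ctrl, DEMO_DESC_DONE_MASK, memory_order_release);
}

/* Returns true and fills *msg1 only after DONE has been observed. */
static bool demo_desc_poll(struct demo_desc *d, uint32_t *msg1)
{
	uint32_t ctrl;

	assert(d != NULL && msg1 != NULL);

	ctrl = atomic_load_explicit(&d->ctrl, memory_order_relaxed);
	if (!(ctrl & DEMO_DESC_DONE_MASK))
		return false;

	/* Userspace analogue of dma_rmb(): the msg1 load is ordered after the ctrl load. */
	atomic_thread_fence(memory_order_acquire);
	*msg1 = d->msg1;
	return true;
}
```

Calling demo_desc_poll() before demo_desc_complete() leaves *msg1 untouched,
which is the behavior the quoted code cannot guarantee when msg1 is loaded
ahead of the DONE check.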

[ ... ]
> @@ -1764,9 +1765,7 @@ static int airoha_set_gdm2_loopback(struct airoha_gdm_dev *dev)
>  	airoha_fe_clear(eth, REG_FE_VIP_PORT_EN, BIT(AIROHA_GDM2_IDX));
>  	airoha_fe_clear(eth, REG_FE_IFC_PORT_EN, BIT(AIROHA_GDM2_IDX));
>  
> -	/* XXX: handle XSI_USB_PORT and XSI_PCE1_PORT */
> -	nbq = port->id == AIROHA_GDM3_IDX && airoha_is_7581(eth) ? 4 : 0;
> -	src_port = eth->soc->ops.get_src_port_id(port, nbq);
> +	src_port = eth->soc->ops.get_src_port_id(port, dev->nbq);
>  	if (src_port < 0)
>  		return src_port;

Can configuring the loopback for one device break connectivity for other
devices sharing the same port?

When multiple net_devices (like PCIE0 and PCIE1 on GDM3) open and call
airoha_set_gdm2_loopback(), they modify global hardware registers such as
REG_FE_WAN_PORT.

Since WAN0_MASK is unconditionally updated to track the current device's
src_port:

	airoha_fe_rmw(eth, REG_FE_WAN_PORT,
		      WAN1_EN_MASK | WAN1_MASK | WAN0_MASK,
		      FIELD_PREP(WAN0_MASK, src_port));

won't the last configured device clobber the hardware state for the others?

[ ... ]
> @@ -1987,7 +1986,8 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
>  	}
>  
>  	fport = airoha_get_fe_port(dev);
> -	msg1 = FIELD_PREP(QDMA_ETH_TXMSG_FPORT_MASK, fport) |
> +	msg1 = FIELD_PREP(QDMA_ETH_TXMSG_NBOQ_MASK, dev->nbq) |
> +	       FIELD_PREP(QDMA_ETH_TXMSG_FPORT_MASK, fport) |
>  	       FIELD_PREP(QDMA_ETH_TXMSG_METER_MASK, 0x7f);
>  
>  	q = &qdma->q_tx[qid];

Does sharing the hardware QDMA TX queues among multiple net_device instances
introduce a permanent transmit stall?

In airoha_dev_xmit(), if a shared hardware queue is full, the driver stops the
netdev_queue of the specific netdev making the call:

	if (q->queued + nr_frags >= q->ndesc) {
		/* not enough space in the queue */
		netif_tx_stop_queue(txq);
		spin_unlock_bh(&q->lock);
		return NETDEV_TX_BUSY;
	}

However, the TX completion handler (airoha_tx_irq_process()) only wakes the
queue of the netdev attached to the completed packet. If an interface's TX
queue is stopped but it has no pending packets in the hardware queue (because
another interface filled it), will it ever get woken up?
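
The scenario can be reproduced with a small standalone model. This is only a
sketch under simplifying assumptions: demo_ring and every demo_* helper are
hypothetical, the ring holds one descriptor per packet (no fragments), and the
"wake all" variant is just one possible direction, not something the patch or
the driver is known to do.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NDESC 4	/* hypothetical ring size */
#define DEMO_NDEVS 2	/* two netdevs sharing one hardware ring */

struct demo_ring {
	int owner[DEMO_NDESC];		/* netdev that queued each in-flight slot */
	int head, tail, queued;
	bool stopped[DEMO_NDEVS];	/* per-netdev "tx queue stopped" flag */
};

/* Simplified form of the quoted check: a full ring stops only the caller's queue. */
static bool demo_xmit(struct demo_ring *r, int dev)
{
	assert(dev >= 0 && dev < DEMO_NDEVS);

	if (r->queued >= DEMO_NDESC) {
		r->stopped[dev] = true;
		return false;	/* corresponds to NETDEV_TX_BUSY */
	}

	r->owner[r->head] = dev;
	r->head = (r->head + 1) % DEMO_NDESC;
	r->queued++;
	return true;
}

/* Retire one descriptor; returns the owning netdev, or -1 if the ring is empty. */
static int demo_retire(struct demo_ring *r)
{
	int dev;

	if (!r->queued)
		return -1;

	dev = r->owner[r->tail];
	r->tail = (r->tail + 1) % DEMO_NDESC;
	r->queued--;
	return dev;
}

/* Completion as described in the review: wake only the packet's own netdev. */
static void demo_complete_wake_owner(struct demo_ring *r)
{
	int dev = demo_retire(r);

	if (dev >= 0)
		r->stopped[dev] = false;
}

/* One possible alternative: wake every netdev that shares the ring. */
static void demo_complete_wake_all(struct demo_ring *r)
{
	int i;

	if (demo_retire(r) < 0)
		return;

	for (i = 0; i < DEMO_NDEVS; i++)
		r->stopped[i] = false;
}
```

If netdev 0 fills the ring and netdev 1 is then stopped, draining the ring
with demo_complete_wake_owner() leaves stopped[1] set even though the ring is
empty, while a single demo_complete_wake_all() clears it.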

[ ... ]
> @@ -2955,11 +2958,25 @@ static int airoha_alloc_gdm_device(struct airoha_eth *eth,
>  			 netdev->dev_addr);
>  	}
>  
> +	/* Allowed nbq for EN7581 on GDM3 port are 4 and 5 for PCIE0
> +	 * and PCIE1 respectively.
> +	 */
> +	index = nbq;
> +	if (index && airoha_is_7581(eth) && port->id == AIROHA_GDM3_IDX)
> +		index -= 4;
> +
> +	if (index >= ARRAY_SIZE(port->devs) || port->devs[index]) {
> +		dev_err(eth->dev, "invalid nbq id: %d\n", nbq);
> +		return -EINVAL;
> +	}
> +
> +	netdev->dev.of_node = of_node_get(np);
>  	dev = netdev_priv(netdev);

Does this code leak the of_node reference?

The node reference is explicitly incremented via of_node_get(np). The networking
core does not automatically drop explicit of_node references on free_netdev().
Without a matching of_node_put() in the error paths or the driver's teardown
function (airoha_remove()), will this reference permanently leak when the device
is destroyed?

[ ... ]
> @@ -3003,7 +3021,43 @@ static int airoha_alloc_gdm_port(struct airoha_eth *eth,
>  	if (err)
>  		return err;
>  
> -	return airoha_alloc_gdm_device(eth, port, np);
> +	for_each_child_of_node(np, node) {
> +		/* Multiple external serdes connected to the FE GDM port via an
> +		 * external arbiter.
> +		 */
> +		const __be32 *nbq_ptr;
> +		int nbq;
> +
> +		if (!of_device_is_compatible(node, "airoha,eth-port"))
> +			continue;
> +
> +		d++;
> +		if (!of_device_is_available(node))
> +			continue;
> +
> +		nbq_ptr = of_get_property(node, "reg", NULL);
> +		if (!nbq_ptr) {
> +			dev_err(eth->dev, "missing nbq id\n");
> +			of_node_put(node);
> +			return -EINVAL;
> +		}
> +
> +		/* Verify the provided nbq parameter is valid */
> +		nbq = be32_to_cpup(nbq_ptr);
> +		err = eth->soc->ops.get_src_port_id(port, nbq);
> +		if (err < 0) {
> +			of_node_put(node);
> +			return err;
> +		}
> +
> +		err = airoha_alloc_gdm_device(eth, port, nbq, node);
> +		if (err) {
> +			of_node_put(node);
> +			return err;
> +		}
> +	}
> +
> +	return !d ? airoha_alloc_gdm_device(eth, port, 0, np) : 0;
>  }

Does passing 0 for nbq in the fallback path break older device trees on the
EN7581 SoC?

Later, when the interface is brought up, airoha_set_gdm2_loopback() calls
airoha_en7581_get_src_port_id(), which returns -EINVAL for GDM3 if nbq is
not 4 or 5:

	switch (port->id) {
	case AIROHA_GDM3_IDX:
		/* 7581 SoC supports PCIe serdes on GDM3 port */
		if (nbq == 4)
			return HSGMII_LAN_7581_PCIE0_SRCPORT;
		if (nbq == 5)
			return HSGMII_LAN_7581_PCIE1_SRCPORT;
		break;

If nbq is hardcoded to 0 for legacy DTs, won't this cause interface
initialization to fail unconditionally for GDM3?
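
A standalone sketch of the lookup makes the failure mode concrete. The
DEMO_* constants and both functions are hypothetical stand-ins (the real
SRCPORT values are not shown in this thread); demo_legacy_nbq() simply mirrors
the expression that the patch removes from airoha_set_gdm2_loopback() and is
shown only as one way a legacy default could be kept.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical values; the real enum and SRCPORT constants are not shown above. */
#define DEMO_GDM3_IDX		3
#define DEMO_PCIE0_SRCPORT	0x10
#define DEMO_PCIE1_SRCPORT	0x11

/* Same shape as the quoted switch: GDM3 accepts only nbq 4 and 5. */
static int demo_get_src_port_id(int port_id, int nbq)
{
	switch (port_id) {
	case DEMO_GDM3_IDX:
		if (nbq == 4)
			return DEMO_PCIE0_SRCPORT;
		if (nbq == 5)
			return DEMO_PCIE1_SRCPORT;
		break;
	default:
		break;
	}

	return -EINVAL;
}

/* Default taken from the expression removed by the patch: 4 for GDM3 on 7581, else 0. */
static int demo_legacy_nbq(int port_id, bool is_7581)
{
	return (port_id == DEMO_GDM3_IDX && is_7581) ? 4 : 0;
}
```

With nbq fixed at 0 the GDM3 lookup returns -EINVAL, whereas feeding it the
old default resolves to the PCIE0 source port.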


* Re: [PATCH net-next v3 10/12] net: airoha: Do not stop GDM port if it is shared
From: Jakub Kicinski @ 2026-04-10  2:57 UTC
  To: lorenzo
  Cc: Jakub Kicinski, andrew+netdev, davem, edumazet, pabeni, robh,
	krzk+dt, conor+dt, ansuelsmth, benjamin.larsson, linux-arm-kernel,
	linux-mediatek, netdev, devicetree, Xuegang Lu
In-Reply-To: <20260406-airoha-eth-multi-serdes-v3-10-ab6ea49d59ff@kernel.org>

This is an AI-generated review of your patch. The human sending this
email says "For your consideration."
---
net: airoha: Do not stop GDM port if it is shared

Theoretically, in the current codebase, two independent net_devices can
be connected to the same GDM port so we need to check the GDM port is not
used by any other running net_device before setting the forward
configuration to FE_PSE_PORT_DROP.

Tested-by: Xuegang Lu <xuegang.lu@airoha.com>

> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index fd27ba13029bd..5b0cd37b155e4 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c

[ ... ]

> @@ -1688,18 +1693,18 @@ static int airoha_dev_stop(struct net_device *netdev)
>  	struct airoha_gdm_dev *dev = netdev_priv(netdev);
>  	struct airoha_gdm_port *port = dev->port;
>  	struct airoha_qdma *qdma = dev->qdma;
> -	int i, err;
> +	int i;
>  
>  	netif_tx_disable(netdev);
> -	err = airoha_set_vip_for_gdm_port(dev, false);
> -	if (err)
> -		return err;
> -
>  	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
>  		netdev_tx_reset_subqueue(netdev, i);
>  
> -	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
> -				    FE_PSE_PORT_DROP);
> +	if (atomic_dec_and_test(&port->users)) {
> +		airoha_set_vip_for_gdm_port(dev, false);
> +		airoha_set_gdm_port_fwd_cfg(qdma->eth,
> +					    REG_GDM_FWD_CFG(port->id),
> +					    FE_PSE_PORT_DROP);
> +	}
>  
>  	if (atomic_dec_and_test(&qdma->users)) {
>  		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
> 				  GLOBAL_CFG_TX_DMA_EN_MASK |
> 				  GLOBAL_CFG_RX_DMA_EN_MASK);
> 
> 		for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++) {
> 			if (!qdma->q_tx[i].ndesc)
> 				continue;
> 
> 			airoha_qdma_cleanup_tx_queue(&qdma->q_tx[i]);
> 		}
> 	}

This problem wasn't introduced by this patch, but does the QDMA teardown
path have a DMA corruption vulnerability here?

When GLOBAL_CFG_TX_DMA_EN_MASK is cleared, it fails to poll
GLOBAL_CFG_TX_DMA_BUSY_MASK, proceeding to unmap memory while the hardware
DMA engine may still be processing in-flight descriptors.

Furthermore, airoha_qdma_cleanup_tx_queue() unmaps the DMA addresses and
frees the SKBs, but it does not appear to clear the physical hardware
descriptors or reset the hardware ring pointers (REG_TX_CPU_IDX /
REG_TX_DMA_IDX).

When the device is later reopened in airoha_dev_open(),
GLOBAL_CFG_TX_DMA_EN_MASK is set again without re-initializing the rings.
Will the hardware DMA engine resume fetching from its previous index,
processing stale descriptors whose backing memory was already unmapped
and freed?
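
For reference, the "disable, then wait for idle" step could look roughly like
the bounded poll below. This is a userspace sketch against a fake register:
demo_fake_hw, demo_read_global_cfg() and DEMO_TX_DMA_BUSY_MASK are
hypothetical, and in-kernel code would more likely rely on a helper such as
read_poll_timeout() from <linux/iopoll.h> with a real timeout instead of a
retry count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position; stands in for GLOBAL_CFG_TX_DMA_BUSY_MASK. */
#define DEMO_TX_DMA_BUSY_MASK (1u << 1)

/* Fake register that reports "busy" for a fixed number of reads. */
struct demo_fake_hw {
	int busy_reads_left;
};

static uint32_t demo_read_global_cfg(struct demo_fake_hw *hw)
{
	if (hw->busy_reads_left > 0) {
		hw->busy_reads_left--;
		return DEMO_TX_DMA_BUSY_MASK;
	}

	return 0;
}

/* Bounded poll: 0 once the busy bit clears, -ETIMEDOUT after max_tries reads. */
static int demo_wait_tx_dma_idle(struct demo_fake_hw *hw, int max_tries)
{
	int i;

	assert(hw != NULL);

	for (i = 0; i < max_tries; i++) {
		if (!(demo_read_global_cfg(hw) & DEMO_TX_DMA_BUSY_MASK))
			return 0;
	}

	return -ETIMEDOUT;
}
```

Only after such a wait returns 0 would it be safe, in this model, to unmap
buffers and reset the ring indices.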

[ ... ]

> @@ -1851,10 +1856,22 @@ static void airoha_dev_get_stats64(struct net_device *netdev,
>  static int airoha_dev_change_mtu(struct net_device *netdev, int mtu)
>  {
>  	struct airoha_gdm_dev *dev = netdev_priv(netdev);
> +	u32 cur_len, len = ETH_HLEN + mtu + ETH_FCS_LEN;
>  	struct airoha_gdm_port *port = dev->port;
> -	u32 len = ETH_HLEN + mtu + ETH_FCS_LEN;
>  	struct airoha_eth *eth = dev->eth;
>  
> +	cur_len = FIELD_GET(GDM_LONG_LEN_MASK,
> +			    airoha_fe_rr(eth, REG_GDM_LEN_CFG(port->id)));
> +	if (len < cur_len) {
> +		u8 port_refcnt = atomic_read(&port->users);
> +
> +		/* We can decrease the device MTU just if the GDM port is
> +		 * not shared or if the other device is not running.
> +		 */
> +		if (port_refcnt > 1 || (port_refcnt && !netif_running(netdev)))
> +			return -EBUSY;
> +	}
> +

Does this MTU decrease validation create a state trap on shared ports?

If both eth0 and eth1 (sharing a port) are up, and a user increases eth0 MTU
to 9000, the hardware MTU updates to 9000. If the user then tries to lower
eth0 MTU back to 1500, len < cur_len evaluates to true. Because eth1 is also
up (port_refcnt > 1), this returns -EBUSY.

This would prevent either interface from decreasing its MTU without fully
bringing down the sibling interface.

Should the software netdev->mtu be decoupled from the hardware validation,
allowing valid MTU changes in software while dynamically programming the
hardware MTU to the maximum of all currently up interfaces on the shared
port?
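
A minimal sketch of that idea, assuming the software MTU is always accepted
and the hardware length is recomputed from the running devices. demo_netdev
and demo_port_frame_len() are hypothetical; the two length constants carry the
same values as the kernel's ETH_HLEN and ETH_FCS_LEN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ETH_HLEN		14	/* Ethernet header length, as ETH_HLEN */
#define DEMO_ETH_FCS_LEN	4	/* frame check sequence length, as ETH_FCS_LEN */

struct demo_netdev {
	bool running;
	unsigned int mtu;	/* software MTU, stored regardless of siblings */
};

/*
 * Frame length to program for the shared port: header + largest MTU among
 * running devices + FCS. With no running device this is just header + FCS.
 */
static unsigned int demo_port_frame_len(const struct demo_netdev *devs, size_t n)
{
	unsigned int max_mtu = 0;
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (devs[i].running && devs[i].mtu > max_mtu)
			max_mtu = devs[i].mtu;
	}

	return DEMO_ETH_HLEN + max_mtu + DEMO_ETH_FCS_LEN;
}
```

In this model, lowering eth0 from 9000 to 1500 while eth1 stays at 1500 simply
drops the programmed length from 9018 to 1518, with no -EBUSY needed.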


* [PATCH net v2] net: ethernet: mtk_eth_soc: initialize PPE per-tag-layer MTU registers
From: Daniel Golle @ 2026-04-10  2:57 UTC
  To: Felix Fietkau, Lorenzo Bianconi, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, Pablo Neira Ayuso, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek

The PPE enforces output frame size limits via per-tag-layer VLAN_MTU
registers that the driver never initializes. The hardware defaults do
not account for PPPoE overhead, causing the PPE to punt encapsulated
frames back to the CPU instead of forwarding them.

Initialize the registers at PPE start and on MTU changes using the
maximum GMAC MTU. This is a conservative approximation -- the actual
per-PPE requirement depends on egress path, but using the global
maximum ensures the limits are never too small.

Fixes: ba37b7caf1ed2 ("net: ethernet: mtk_eth_soc: add support for initializing the PPE")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
v2: rebase on top of current net/main

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 22 ++++++++++++++-
 drivers/net/ethernet/mediatek/mtk_ppe.c     | 30 +++++++++++++++++++++
 drivers/net/ethernet/mediatek/mtk_ppe.h     |  1 +
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index ddc321a02fdae..796f79088f366 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -3566,12 +3566,23 @@ static int mtk_device_event(struct notifier_block *n, unsigned long event, void
 	return NOTIFY_DONE;
 }
 
+static int mtk_max_gmac_mtu(struct mtk_eth *eth)
+{
+	int i, max_mtu = ETH_DATA_LEN;
+
+	for (i = 0; i < ARRAY_SIZE(eth->netdev); i++)
+		if (eth->netdev[i] && eth->netdev[i]->mtu > max_mtu)
+			max_mtu = eth->netdev[i]->mtu;
+
+	return max_mtu;
+}
+
 static int mtk_open(struct net_device *dev)
 {
 	struct mtk_mac *mac = netdev_priv(dev);
 	struct mtk_eth *eth = mac->hw;
 	struct mtk_mac *target_mac;
-	int i, err, ppe_num;
+	int i, err, ppe_num, mtu;
 
 	ppe_num = eth->soc->ppe_num;
 
@@ -3618,6 +3629,10 @@ static int mtk_open(struct net_device *dev)
 			mtk_gdm_config(eth, target_mac->id, gdm_config);
 		}
 
+		mtu = mtk_max_gmac_mtu(eth);
+		for (i = 0; i < ARRAY_SIZE(eth->ppe); i++)
+			mtk_ppe_update_mtu(eth->ppe[i], mtu);
+
 		napi_enable(&eth->tx_napi);
 		napi_enable(&eth->rx_napi);
 		mtk_tx_irq_enable(eth, MTK_TX_DONE_INT);
@@ -4311,6 +4326,7 @@ static int mtk_change_mtu(struct net_device *dev, int new_mtu)
 	int length = new_mtu + MTK_RX_ETH_HLEN;
 	struct mtk_mac *mac = netdev_priv(dev);
 	struct mtk_eth *eth = mac->hw;
+	int max_mtu, i;
 
 	if (rcu_access_pointer(eth->prog) &&
 	    length > MTK_PP_MAX_BUF_SIZE) {
@@ -4321,6 +4337,10 @@ static int mtk_change_mtu(struct net_device *dev, int new_mtu)
 	mtk_set_mcr_max_rx(mac, length);
 	WRITE_ONCE(dev->mtu, new_mtu);
 
+	max_mtu = mtk_max_gmac_mtu(eth);
+	for (i = 0; i < ARRAY_SIZE(eth->ppe); i++)
+		mtk_ppe_update_mtu(eth->ppe[i], max_mtu);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.c b/drivers/net/ethernet/mediatek/mtk_ppe.c
index 75f7728fc7962..18279e2a7022e 100644
--- a/drivers/net/ethernet/mediatek/mtk_ppe.c
+++ b/drivers/net/ethernet/mediatek/mtk_ppe.c
@@ -973,6 +973,36 @@ static void mtk_ppe_init_foe_table(struct mtk_ppe *ppe)
 	}
 }
 
+void mtk_ppe_update_mtu(struct mtk_ppe *ppe, int mtu)
+{
+	int base;
+	u32 val;
+
+	if (!ppe)
+		return;
+
+	/* The PPE checks output frame size against per-tag-layer MTU limits,
+	 * treating PPPoE and DSA tags just like 802.1Q VLAN tags. The Linux
+	 * device MTU already accounts for PPPoE (PPPOE_SES_HLEN) and DSA tag
+	 * overhead, but 802.1Q VLAN tags are handled transparently without
+	 * being reflected by the lower device MTU being increased by 4.
+	 * Use the maximum MTU across all GMAC interfaces so that PPE output
+	 * frame limits are sufficiently high regardless of which port a flow
+	 * egresses through.
+	 */
+	base = ETH_HLEN + mtu;
+
+	val = FIELD_PREP(MTK_PPE_VLAN_MTU0_NONE, base) |
+	      FIELD_PREP(MTK_PPE_VLAN_MTU0_1TAG, base + VLAN_HLEN);
+	ppe_w32(ppe, MTK_PPE_VLAN_MTU0, val);
+
+	val = FIELD_PREP(MTK_PPE_VLAN_MTU1_2TAG,
+			 base + 2 * VLAN_HLEN) |
+	      FIELD_PREP(MTK_PPE_VLAN_MTU1_3TAG,
+			 base + 3 * VLAN_HLEN);
+	ppe_w32(ppe, MTK_PPE_VLAN_MTU1, val);
+}
+
 void mtk_ppe_start(struct mtk_ppe *ppe)
 {
 	u32 val;
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.h b/drivers/net/ethernet/mediatek/mtk_ppe.h
index 223f709e2704f..ba85e39a155bf 100644
--- a/drivers/net/ethernet/mediatek/mtk_ppe.h
+++ b/drivers/net/ethernet/mediatek/mtk_ppe.h
@@ -346,6 +346,7 @@ struct mtk_ppe {
 struct mtk_ppe *mtk_ppe_init(struct mtk_eth *eth, void __iomem *base, int index);
 
 void mtk_ppe_deinit(struct mtk_eth *eth);
+void mtk_ppe_update_mtu(struct mtk_ppe *ppe, int mtu);
 void mtk_ppe_start(struct mtk_ppe *ppe);
 int mtk_ppe_stop(struct mtk_ppe *ppe);
 int mtk_ppe_prepare_reset(struct mtk_ppe *ppe);
-- 
2.53.0



* Re: [PATCH net-next v3 00/12] net: airoha: Support multiple net_devices connected to the same GDM port
From: Jakub Kicinski @ 2026-04-10  2:59 UTC
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Christian Marangi,
	Benjamin Larsson, linux-arm-kernel, linux-mediatek, netdev,
	devicetree, Xuegang Lu
In-Reply-To: <20260406-airoha-eth-multi-serdes-v3-0-ab6ea49d59ff@kernel.org>

On Mon, 06 Apr 2026 12:34:05 +0200 Lorenzo Bianconi wrote:
> EN7581 or AN7583 SoCs support connecting multiple external SerDes (e.g.
> Ethernet or USB SerDes) to GDM3 or GDM4 ports via a hw arbiter that
> manages the traffic in a TDM manner. As a result multiple net_devices can
> connect to the same GDM{3,4} port and there is a theoretical "1:n"
> relation between GDM ports and net_devices.

Looks like this driver uses page pool.
If you're sharing the same page pool across multiple netdevs
it must not be linked to a netdev.



* RE: [PATCH V2 0/8] PCI: imx6: Integrate pwrctrl API and update device trees
From: Sherry Sun @ 2026-04-10  3:00 UTC
  To: Manivannan Sadhasivam
  Cc: robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	Frank Li, s.hauer@pengutronix.de, kernel@pengutronix.de,
	festevam@gmail.com, lpieralisi@kernel.org, kwilczynski@kernel.org,
	bhelgaas@google.com, Hongxing Zhu, l.stach@pengutronix.de,
	imx@lists.linux.dev, linux-pci@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <omtn42mopdz7igg7jaqwehd67l6xc77zk7zzqwkufgnsycvadg@5kodhpgfesre>

> Subject: Re: [PATCH V2 0/8] PCI: imx6: Integrate pwrctrl API and update device
> trees
> 
> On Thu, Apr 02, 2026 at 06:09:59PM +0800, Sherry Sun wrote:
> > Note: This patch set depends on my previous patch set [1] which adds
> > Root Port device tree nodes and support parsing the reset property in
> > new Root Port binding in pci-imx6 driver.
> >
> > This series integrates the PCI pwrctrl framework into the pci-imx6
> > driver and updates i.MX EVK board device trees to support it.
> >
> > Patches 2-8 update the device trees for the i.MX EVK boards maintained
> > by NXP to move power supply properties from the PCIe controller node
> > to the Root Port child node, which is required for the pwrctrl framework.
> > Affected boards:
> > - i.MX6Q/DL SABRESD
> > - i.MX6SX SDB
> > - i.MX8MM EVK
> > - i.MX8MP EVK
> > - i.MX8MQ EVK
> > - i.MX8DXL/QM/QXP EVK
> > - i.MX95 15x15/19x19 EVK
> >
> > The driver maintains legacy regulator handling for device trees that
> > haven't been updated yet. Both old and new device tree structures are
> > supported.
> >
> 
> Thanks for the work! Due to some recently merged patches, this series (Patch
> 1) doesn't apply on top of pci/controller/dwc-imx6 branch. Please rebase and
> resend!
> 
> - Mani

Hi Mani, thanks for the reminder.
Actually this patch set depends on my PERST# patch set [1], which adds
support for Root Port dts nodes and correctly adjusts the sequence of PERST#
assert/deassert and regulator/clock enable in the pci-imx6 driver.
I will resend this series once the PERST# patch set has been accepted.

[1] https://lore.kernel.org/all/20260410023055.2439146-1-sherry.sun@nxp.com/

Best Regards
Sherry


* RE: [PATCH v3 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Tian, Kevin @ 2026-04-10  3:13 UTC
  To: Jason Gunthorpe, Nicolin Chen
  Cc: will@kernel.org, robin.murphy@arm.com, bhelgaas@google.com,
	joro@8bytes.org, praan@google.com, baolu.lu@linux.intel.com,
	miko.lenczewski@arm.com, linux-arm-kernel@lists.infradead.org,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, Williams, Dan J,
	jonathan.cameron@huawei.com, Vikram Sethi,
	linux-cxl@vger.kernel.org
In-Reply-To: <20260409225252.GU3357077@nvidia.com>

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, April 10, 2026 6:53 AM
> 
> On Thu, Apr 09, 2026 at 03:45:26PM -0700, Nicolin Chen wrote:
> 
> > One question regarding VM case: if a device is ats_always_on, while
> > VM somehow doesn't set nested_domain->enable_ats. Should the kernel
> > at least spit a warning, given that it would surely fail the device?
> 
> No, just let it break, the resulting failure has to be contained to the
> VM or the platform is broken..
> 
> The HV can't turn on ATS because it can't know what invalidations
> to push so not much other choice.
> 

Talking in theory: the host can append a devtlb invalidation cmd
after the iotlb invalidation (if vcmdq is not used)?



* [PATCH v1] clk: imx95-blk-ctl: Add func_out_en clock for i.MX9x PCIe
From: Richard Zhu @ 2026-04-10  3:16 UTC
  To: abelvesa, peng.fan, mturquette, sboyd, Frank.Li, s.hauer,
	festevam
  Cc: linux-clk, imx, linux-arm-kernel, linux-kernel, kernel,
	Richard Zhu

When internal PLL clock is used as PCIe REF clock, the BIT6(CREF_EN) and
BIT2(FUNC_OUTPUT_EN) control the PCIE_REF_OUT_CLK.

If BIT6(CREF_EN) and BIT2(FUNC_OUTPUT_EN) are left at their default value
of 1'b1, then with the typical 100-ohm termination on the board this
results in approximately 6mA of power consumption.

When the PCIe internal PLL clock is not enabled, these two bits should
be cleared to 1'b0 to eliminate this power consumption.

Add a func_out_en clock for i.MX9x PCIe to serve as the parent gate clock
of the CREF_EN (BIT6) gate clock. Both of these two gate clocks enable
the output of the internal 100MHz differential reference clock.

Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
---
 drivers/clk/imx/clk-imx95-blk-ctl.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/imx/clk-imx95-blk-ctl.c b/drivers/clk/imx/clk-imx95-blk-ctl.c
index 56bed4471995..1f9259f45607 100644
--- a/drivers/clk/imx/clk-imx95-blk-ctl.c
+++ b/drivers/clk/imx/clk-imx95-blk-ctl.c
@@ -286,18 +286,28 @@ static const struct imx95_blk_ctl_dev_data netcmix_dev_data = {
 static const struct imx95_blk_ctl_clk_dev_data hsio_blk_ctl_clk_dev_data[] = {
 	[0] = {
 		.name = "hsio_blk_ctl_clk",
-		.parent_names = (const char *[]){ "hsio_pll", },
+		.parent_names = (const char *[]){ "func_out_en", },
 		.num_parents = 1,
 		.reg = 0,
 		.bit_idx = 6,
 		.bit_width = 1,
 		.type = CLK_GATE,
 		.flags = CLK_SET_RATE_PARENT,
+	},
+	[1] = {
+		.name = "func_out_en",
+		.parent_names = (const char *[]){ "hsio_pll", },
+		.num_parents = 1,
+		.reg = 0,
+		.bit_idx = 2,
+		.bit_width = 1,
+		.type = CLK_GATE,
+		.flags = CLK_SET_RATE_PARENT,
 	}
 };
 
 static const struct imx95_blk_ctl_dev_data hsio_blk_ctl_dev_data = {
-	.num_clks = 1,
+	.num_clks = ARRAY_SIZE(hsio_blk_ctl_clk_dev_data),
 	.clk_dev_data = hsio_blk_ctl_clk_dev_data,
 	.clk_reg_offset = 0,
 };
-- 
2.37.1




* Re: [RFC V1 14/16] arm64/mm: Enable fixmap with 5 level page table
From: Anshuman Khandual @ 2026-04-10  3:22 UTC
  To: David Hildenbrand (Arm), linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Ryan Roberts, Mark Rutland,
	Lorenzo Stoakes, Andrew Morton, Mike Rapoport, Linu Cherian,
	linux-kernel, linux-mm
In-Reply-To: <4382c90a-bc92-470e-9aa7-4666753479ca@kernel.org>



On 08/04/26 5:59 PM, David Hildenbrand (Arm) wrote:
> On 2/24/26 06:11, Anshuman Khandual wrote:
>> Enable fixmap with 5 level page table when required. This creates table
>> entries at the PGD level. Add a fallback stub for pgd_page_paddr() when
>> (PGTABLE_LEVELS <= 4) which helps in intercepting any unintended usage.
> 
> Can you add the "why" ?

Would the following reworded commit message work?

------------------------------------------------------------------
arm64/mm: Enable fixmap with 5 level page table

FEAT_D128 halves PTRS_PER_PXX, thus shrinking the VA range covered
by each page table level. Hence, in order to preserve all existing
VA range configurations, some geometries now need to become 5-level.

Since fixmap is used to build and manipulate page tables early on
during boot, the mapping must also gain that additional level, which
was not required earlier.

Enable fixmap with 5 level page table when required. This creates table
entries at the PGD level. Add a fallback stub for pgd_page_paddr() when
(PGTABLE_LEVELS <= 4) which helps in intercepting any unintended usage.
-------------------------------------------------------------------
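
To make the "halving" argument concrete, the arithmetic can be sketched as
below. This is only an illustration of the level count, assuming one page per
table and a full VA range; demo_pgtable_levels() is a hypothetical helper and
does not reproduce the kernel's own macros or the configurations the series
actually enables.

```c
#include <assert.h>

/*
 * Levels needed to translate va_bits of virtual address when each table
 * occupies one page of (1 << page_shift) bytes and each descriptor takes
 * (1 << desc_shift) bytes.
 */
static unsigned int demo_pgtable_levels(unsigned int va_bits,
					unsigned int page_shift,
					unsigned int desc_shift)
{
	unsigned int bits_per_level, bits_to_resolve;

	assert(page_shift > desc_shift && va_bits > page_shift);

	bits_per_level = page_shift - desc_shift;
	bits_to_resolve = va_bits - page_shift;

	/* Round up: a partially used top-level table still counts as a level. */
	return (bits_to_resolve + bits_per_level - 1) / bits_per_level;
}
```

With 4K pages and a 48-bit VA, 8-byte descriptors (desc_shift 3) resolve 9
bits per level and need 4 levels, while 16-byte descriptors (desc_shift 4)
resolve 8 bits per level and need 5.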



* Re: [PATCH net v3] net: airoha: Add dma_rmb() and READ_ONCE() in airoha_qdma_rx_process()
From: Jakub Kicinski @ 2026-04-10  3:35 UTC
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
	Xuegang Lu, Simon Horman, linux-arm-kernel, linux-mediatek,
	netdev
In-Reply-To: <20260407-airoha_qdma_rx_process-fix-reordering-v3-1-91c36e9da31f@kernel.org>

On Tue, 07 Apr 2026 08:48:04 +0200 Lorenzo Bianconi wrote:
> Add missing dma_rmb() in airoha_qdma_rx_process routine to make sure the
> DMA read operations are completed when the NIC reports the processing on
> the current descriptor is done. Moreover, add missing READ_ONCE() in
> airoha_qdma_rx_process() for DMA descriptor control fields in order to
> avoid any compiler reordering.

Sashiko seems to have more orthogonal complaints FWIW



* Re: [PATCH v14 2/4] asm-generic: Move TIF_SINGLESTEP to generic TIF bits
From: Jinjie Ruan @ 2026-04-10  3:39 UTC
  To: Mark Rutland
  Cc: catalin.marinas, will, chenhuacai, kernel, hca, gor, agordeev,
	borntraeger, svens, oleg, tglx, mingo, bp, dave.hansen, hpa, arnd,
	shuah, kevin.brodsky, yeoreum.yun, anshuman.khandual, thuth,
	ryan.roberts, song, ziyao, linusw, schuster.simon, jremus, akpm,
	mathieu.desnoyers, kmal, dvyukov, reddybalavignesh9979, x86,
	linux-arm-kernel, linux-kernel, loongarch, linux-s390, linux-arch,
	linux-kselftest
In-Reply-To: <addYbV3_9eFZg_b4@J2N7QTR9R3>



On 2026/4/9 15:42, Mark Rutland wrote:
> On Fri, Mar 20, 2026 at 06:42:20PM +0800, Jinjie Ruan wrote:
>> Currently, x86, ARM64, s390, and LoongArch all define and use
>> TIF_SINGLESTEP to track single-stepping state.
> 
> Do the architectures actually use the flag in the same way?

As far as I know, the behavior of setting and clearing the
TIF_SINGLESTEP flag is consistent across these architectures, at least
within the ptrace handling logic of user_enable_single_step() and
user_disable_single_step().

> 
> I'd expect that this is used subtly differently across those
> architectures, and so isn't necessarily generic.
> 
>> Since this flag is shared across multiple major architectures and serves
>> a common purpose in the generic entry/exit paths, move TIF_SINGLESTEP
>> into the generic Thread Information Flags (TIF) infrastructure.
>>
>> This consolidation reduces architecture-specific boilerplate code and
>> ensures consistency for generic features that rely on single-step
>> state tracking.
> 
> Is it necessary to make this generic in order to move to generic irq
> flags? I'd expect that generic code cannot make use of this due to the
> different semantics across architectures, as noted above.
> 
> I think it's probably better to keep this architecture-specific for now,
> where architectures can clearly define how they're using this bit.

Hi Mark,

Thank you for the feedback. You may well be right, and your concern aligns
with the original intent behind the generic TIF infrastructure.

I noticed that when the generic TIF infrastructure was first introduced
(see commit 29589343488e: "asm-generic: Provide generic TIF
infrastructure"), it explicitly mentioned:
"This could probably be extended by TIF_SINGLESTEP and BLOCKSTEP, but
those are only used in architecture specific code. So leave them alone
for now."

It seems that moving TIF_SINGLESTEP to generic TIF bits at this stage is
indeed premature. Furthermore, in the generic entry implementation, the
single-step exit handling is actually managed by
SYSCALL_WORK_SYSCALL_EXIT_TRAP rather than directly relying on a generic
TIF_SINGLESTEP flag.


Best regards,
Jinjie

> 
> Am I missing some reason why it's necessary to make this generic?
> 
> Mark.
> 
>> Cc: Thomas Gleixner <tglx@kernel.org>
>> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
>> Reviewed-by: Linus Walleij <linusw@kernel.org>
>> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
>> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>>  arch/loongarch/include/asm/thread_info.h | 11 +++++------
>>  arch/s390/include/asm/thread_info.h      |  7 +++----
>>  arch/x86/include/asm/thread_info.h       |  6 ++----
>>  include/asm-generic/thread_info_tif.h    |  5 +++++
>>  4 files changed, 15 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/loongarch/include/asm/thread_info.h b/arch/loongarch/include/asm/thread_info.h
>> index 4d7117fcdc78..a2ec87f18e1d 100644
>> --- a/arch/loongarch/include/asm/thread_info.h
>> +++ b/arch/loongarch/include/asm/thread_info.h
>> @@ -70,6 +70,7 @@ register unsigned long current_stack_pointer __asm__("$sp");
>>   */
>>  #define HAVE_TIF_NEED_RESCHED_LAZY
>>  #define HAVE_TIF_RESTORE_SIGMASK
>> +#define HAVE_TIF_SINGLESTEP
>>  
>>  #include <asm-generic/thread_info_tif.h>
>>  
>> @@ -82,11 +83,10 @@ register unsigned long current_stack_pointer __asm__("$sp");
>>  #define TIF_32BIT_REGS		21	/* 32-bit general purpose registers */
>>  #define TIF_32BIT_ADDR		22	/* 32-bit address space */
>>  #define TIF_LOAD_WATCH		23	/* If set, load watch registers */
>> -#define TIF_SINGLESTEP		24	/* Single Step */
>> -#define TIF_LSX_CTX_LIVE	25	/* LSX context must be preserved */
>> -#define TIF_LASX_CTX_LIVE	26	/* LASX context must be preserved */
>> -#define TIF_USEDLBT		27	/* LBT was used by this task this quantum (SMP) */
>> -#define TIF_LBT_CTX_LIVE	28	/* LBT context must be preserved */
>> +#define TIF_LSX_CTX_LIVE	24	/* LSX context must be preserved */
>> +#define TIF_LASX_CTX_LIVE	25	/* LASX context must be preserved */
>> +#define TIF_USEDLBT		26	/* LBT was used by this task this quantum (SMP) */
>> +#define TIF_LBT_CTX_LIVE	27	/* LBT context must be preserved */
>>  
>>  #define _TIF_NOHZ		BIT(TIF_NOHZ)
>>  #define _TIF_USEDFPU		BIT(TIF_USEDFPU)
>> @@ -96,7 +96,6 @@ register unsigned long current_stack_pointer __asm__("$sp");
>>  #define _TIF_32BIT_REGS		BIT(TIF_32BIT_REGS)
>>  #define _TIF_32BIT_ADDR		BIT(TIF_32BIT_ADDR)
>>  #define _TIF_LOAD_WATCH		BIT(TIF_LOAD_WATCH)
>> -#define _TIF_SINGLESTEP		BIT(TIF_SINGLESTEP)
>>  #define _TIF_LSX_CTX_LIVE	BIT(TIF_LSX_CTX_LIVE)
>>  #define _TIF_LASX_CTX_LIVE	BIT(TIF_LASX_CTX_LIVE)
>>  #define _TIF_USEDLBT		BIT(TIF_USEDLBT)
>> diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
>> index 1bcd42614e41..95be5258a422 100644
>> --- a/arch/s390/include/asm/thread_info.h
>> +++ b/arch/s390/include/asm/thread_info.h
>> @@ -61,6 +61,7 @@ void arch_setup_new_exec(void);
>>   */
>>  #define HAVE_TIF_NEED_RESCHED_LAZY
>>  #define HAVE_TIF_RESTORE_SIGMASK
>> +#define HAVE_TIF_SINGLESTEP
>>  
>>  #include <asm-generic/thread_info_tif.h>
>>  
>> @@ -69,15 +70,13 @@ void arch_setup_new_exec(void);
>>  #define TIF_GUARDED_STORAGE	17	/* load guarded storage control block */
>>  #define TIF_ISOLATE_BP_GUEST	18	/* Run KVM guests with isolated BP */
>>  #define TIF_PER_TRAP		19	/* Need to handle PER trap on exit to usermode */
>> -#define TIF_SINGLESTEP		21	/* This task is single stepped */
>> -#define TIF_BLOCK_STEP		22	/* This task is block stepped */
>> -#define TIF_UPROBE_SINGLESTEP	23	/* This task is uprobe single stepped */
>> +#define TIF_BLOCK_STEP		20	/* This task is block stepped */
>> +#define TIF_UPROBE_SINGLESTEP	21	/* This task is uprobe single stepped */
>>  
>>  #define _TIF_ASCE_PRIMARY	BIT(TIF_ASCE_PRIMARY)
>>  #define _TIF_GUARDED_STORAGE	BIT(TIF_GUARDED_STORAGE)
>>  #define _TIF_ISOLATE_BP_GUEST	BIT(TIF_ISOLATE_BP_GUEST)
>>  #define _TIF_PER_TRAP		BIT(TIF_PER_TRAP)
>> -#define _TIF_SINGLESTEP	BIT(TIF_SINGLESTEP)
>>  #define _TIF_BLOCK_STEP		BIT(TIF_BLOCK_STEP)
>>  #define _TIF_UPROBE_SINGLESTEP	BIT(TIF_UPROBE_SINGLESTEP)
>>  
>> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
>> index 0067684afb5b..f59072ba1473 100644
>> --- a/arch/x86/include/asm/thread_info.h
>> +++ b/arch/x86/include/asm/thread_info.h
>> @@ -98,9 +98,8 @@ struct thread_info {
>>  #define TIF_IO_BITMAP		22	/* uses I/O bitmap */
>>  #define TIF_SPEC_FORCE_UPDATE	23	/* Force speculation MSR update in context switch */
>>  #define TIF_FORCED_TF		24	/* true if TF in eflags artificially */
>> -#define TIF_SINGLESTEP		25	/* reenable singlestep on user return*/
>> -#define TIF_BLOCKSTEP		26	/* set when we want DEBUGCTLMSR_BTF */
>> -#define TIF_ADDR32		27	/* 32-bit address space on 64 bits */
>> +#define TIF_BLOCKSTEP		25	/* set when we want DEBUGCTLMSR_BTF */
>> +#define TIF_ADDR32		26	/* 32-bit address space on 64 bits */
>>  
>>  #define _TIF_SSBD		BIT(TIF_SSBD)
>>  #define _TIF_SPEC_IB		BIT(TIF_SPEC_IB)
>> @@ -112,7 +111,6 @@ struct thread_info {
>>  #define _TIF_SPEC_FORCE_UPDATE	BIT(TIF_SPEC_FORCE_UPDATE)
>>  #define _TIF_FORCED_TF		BIT(TIF_FORCED_TF)
>>  #define _TIF_BLOCKSTEP		BIT(TIF_BLOCKSTEP)
>> -#define _TIF_SINGLESTEP		BIT(TIF_SINGLESTEP)
>>  #define _TIF_ADDR32		BIT(TIF_ADDR32)
>>  
>>  /* flags to check in __switch_to() */
>> diff --git a/include/asm-generic/thread_info_tif.h b/include/asm-generic/thread_info_tif.h
>> index da1610a78f92..b277fe06aee3 100644
>> --- a/include/asm-generic/thread_info_tif.h
>> +++ b/include/asm-generic/thread_info_tif.h
>> @@ -48,4 +48,9 @@
>>  #define TIF_RSEQ		11	// Run RSEQ fast path
>>  #define _TIF_RSEQ		BIT(TIF_RSEQ)
>>  
>> +#ifdef HAVE_TIF_SINGLESTEP
>> +#define TIF_SINGLESTEP		12	/* reenable singlestep on user return*/
>> +#define _TIF_SINGLESTEP		BIT(TIF_SINGLESTEP)
>> +#endif
>> +
>>  #endif /* _ASM_GENERIC_THREAD_INFO_TIF_H_ */
>> -- 
>> 2.34.1
>>
> 



* Re: [PATCH net v3] net: airoha: Add dma_rmb() and READ_ONCE() in airoha_qdma_rx_process()
From: patchwork-bot+netdevbpf @ 2026-04-10  3:40 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, xuegang.lu, horms,
	linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260407-airoha_qdma_rx_process-fix-reordering-v3-1-91c36e9da31f@kernel.org>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 07 Apr 2026 08:48:04 +0200 you wrote:
> Add missing dma_rmb() in airoha_qdma_rx_process routine to make sure the
> DMA read operations are completed when the NIC reports the processing on
> the current descriptor is done. Moreover, add missing READ_ONCE() in
> airoha_qdma_rx_process() for DMA descriptor control fields in order to
> avoid any compiler reordering.
> 
> Fixes: 23020f0493270 ("net: airoha: Introduce ethernet support for EN7581 SoC")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> 
> [...]
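
The ordering rule described in the commit message can be sketched in
userspace C. This is a minimal sketch under stated assumptions, not the
driver code: C11 atomics stand in for READ_ONCE() and dma_rmb(), and
struct rx_desc, DESC_DONE_MASK and rx_desc_read() are hypothetical names
chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_DONE_MASK	(1u << 31)	/* hypothetical ownership bit */

struct rx_desc {
	_Atomic uint32_t ctrl;	/* written last by the producer ("device") */
	uint32_t msg1;		/* payload, valid only once DONE is set */
};

/* Returns true and fills *msg1 only if the descriptor is complete. */
static bool rx_desc_read(struct rx_desc *desc, uint32_t *msg1)
{
	/* One non-torn load of the control word (READ_ONCE() analogue). */
	uint32_t ctrl = atomic_load_explicit(&desc->ctrl, memory_order_relaxed);

	if (!(ctrl & DESC_DONE_MASK))
		return false;

	/* Order the ctrl load before the payload loads (dma_rmb() analogue). */
	atomic_thread_fence(memory_order_acquire);

	*msg1 = desc->msg1;
	return true;
}
```

The point of the sketch is only the sequence: check the done bit first,
then the barrier, then read the remaining descriptor fields.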

Here is the summary with links:
  - [net,v3] net: airoha: Add dma_rmb() and READ_ONCE() in airoha_qdma_rx_process()
    https://git.kernel.org/netdev/net/c/4ae0604a0673

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html





* Re: [PATCH v2 8/8] arm64: dts: qcom: eliza: Add support for MM clock controllers
From: Taniya Das @ 2026-04-10  3:55 UTC (permalink / raw)
  To: Bryan O'Donoghue, Bjorn Andersson, Michael Turquette,
	Stephen Boyd, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Konrad Dybcio, Maxime Coquelin, Alexandre Torgue
  Cc: Ajit Pandey, Imran Shaik, Jagadeesh Kona, linux-arm-msm,
	linux-clk, devicetree, linux-kernel, linux-stm32,
	linux-arm-kernel
In-Reply-To: <cb5a40e8-e2e3-4ed9-a9c6-0daa9f408710@nxsw.ie>



On 4/10/2026 12:10 AM, Bryan O'Donoghue wrote:
> On 09/04/2026 19:10, Taniya Das wrote:
>> +		videocc: clock-controller@aaf0000 {
>> +			compatible = "qcom,eliza-videocc";
>> +			reg = <0x0 0xaaf0000 0x0 0x10000>;
>> +
>> +			clocks = <&bi_tcxo_div2>,
>> +				 <&sleep_clk>,
>> +				 <&gcc GCC_VIDEO_AHB_CLK>;
>> +
>> +			#clock-cells = <1>;
>> +			#reset-cells = <1>;
>> +			#power-domain-cells = <1>;
>> +		};
>> +
>> +		camcc: clock-controller@ade0000 {
>> +			compatible = "qcom,eliza-camcc";
>> +			reg = <0x0 0x0ade0000 0x0 0x20000>;
>> +
>> +			clocks = <&gcc GCC_CAMERA_AHB_CLK>,
>> +				 <&bi_tcxo_div2>,
>> +				 <&sleep_clk>;
>> +
>> +			#clock-cells = <1>;
>> +			#reset-cells = <1>;
>> +		};
> 
> This looks odd.
> 
> Why do these two controllers have no power-domains ?

Bryan, on Eliza the videocc and camcc are connected on CX and MXA.

-- 
Thanks,
Taniya Das




* Re: [RFC V1 11/16] arm64/mm: Route all pgtable atomics to central helpers
From: Anshuman Khandual @ 2026-04-10  4:02 UTC (permalink / raw)
  To: David Hildenbrand (Arm), linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Ryan Roberts, Mark Rutland,
	Lorenzo Stoakes, Andrew Morton, Mike Rapoport, Linu Cherian,
	linux-kernel, linux-mm
In-Reply-To: <aeb1972c-57ba-4f21-8289-66424e4e619b@kernel.org>

On 08/04/26 5:58 PM, David Hildenbrand (Arm) wrote:
> On 2/24/26 06:11, Anshuman Khandual wrote:
>> Route all cmpxchg() operations performed on various page table entries to a
>> new ptdesc_cmpxchg_relaxed() helper. Similarly route all xchg() operations
>> performed on page table entries to a new ptdesc_xchg_relaxed() helper.
>>
>> Currently these helpers just forward to the same APIs that were previously
>> called direct, but in future we will change the routing for D128 which is
>> too long to use the standard APIs.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/include/asm/pgtable.h | 23 +++++++++++++++++------
>>  arch/arm64/mm/fault.c            |  2 +-
>>  2 files changed, 18 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 42124d2f323d..cf69ce68f951 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -87,6 +87,17 @@ static inline void arch_leave_lazy_mmu_mode(void)
>>  #define ptdesc_get(x)		READ_ONCE(x)
>>  #define ptdesc_set(x, val)	WRITE_ONCE(x, val)
>>  
>> +static inline ptdesc_t ptdesc_cmpxchg_relaxed(ptdesc_t *ptep, ptdesc_t old,
>> +					      ptdesc_t new)
>> +{
>> +	return cmpxchg_relaxed(ptep, old, new);
>> +}
>> +
>> +static inline ptdesc_t ptdesc_xchg_relaxed(ptdesc_t *ptep, ptdesc_t new)
>> +{
>> +	return xchg_relaxed(ptep, new);
>> +}
>> +
> 
> We really want the rename of ptdesc_t before this change.
> 

Planning to rename ptdesc_t as ptent_t in a pre-requisite
patch early in the series.
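
For illustration, the routing idea can be sketched in userspace C. This
is a rough sketch, assuming a 64-bit entry type and C11 atomics in place
of the kernel's cmpxchg_relaxed()/xchg_relaxed(); the ptent_* helper
names are hypothetical and only mirror the planned ptent_t rename.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t ptent_t;	/* assumed 64-bit entry for this sketch */

/* Returns the value observed in *ptep; it equals 'old' on success. */
static inline ptent_t ptent_cmpxchg_relaxed(_Atomic ptent_t *ptep,
					    ptent_t old, ptent_t new_val)
{
	/* On failure, 'old' is overwritten with the current value. */
	atomic_compare_exchange_strong_explicit(ptep, &old, new_val,
						memory_order_relaxed,
						memory_order_relaxed);
	return old;
}

/* Unconditionally stores new_val and returns the previous value. */
static inline ptent_t ptent_xchg_relaxed(_Atomic ptent_t *ptep,
					 ptent_t new_val)
{
	return atomic_exchange_explicit(ptep, new_val, memory_order_relaxed);
}
```

With every caller going through these two helpers, a wider entry type
would only need the helper bodies changed, not each call site.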




* Re: [RFC V1 12/16] arm64/mm: Abstract printing of pxd_val()
From: Anshuman Khandual @ 2026-04-10  4:05 UTC (permalink / raw)
  To: David Hildenbrand (Arm), linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Ryan Roberts, Mark Rutland,
	Lorenzo Stoakes, Andrew Morton, Mike Rapoport, Linu Cherian,
	linux-kernel, linux-mm
In-Reply-To: <4290fcdb-47d1-4f74-97f3-51ac581efeaf@kernel.org>

On 08/04/26 5:58 PM, David Hildenbrand (Arm) wrote:
> On 2/24/26 06:11, Anshuman Khandual wrote:
> 
> Subject: you probably mean "pxx_val()" ?

Yes, that sounds better. Will change.

> 
>> Ahead of adding support for D128 pgtables, refactor places that print
>> PTE values to use the new __PRIpte format specifier and __PRIpte_args()
>> macro to prepare the argument(s). When using D128 pgtables in future,
>> we can simply redefine __PRIpte and __PRIpte_args().
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/include/asm/pgtable-types.h |  3 +++
>>  arch/arm64/include/asm/pgtable.h       | 22 +++++++++++-----------
>>  arch/arm64/mm/fault.c                  | 10 +++++-----
>>  3 files changed, 19 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable-types.h b/arch/arm64/include/asm/pgtable-types.h
>> index 265e8301d7ba..dc3791dc9f14 100644
>> --- a/arch/arm64/include/asm/pgtable-types.h
>> +++ b/arch/arm64/include/asm/pgtable-types.h
>> @@ -11,6 +11,9 @@
>>  
>>  #include <asm/types.h>
>>  
>> +#define __PRIpte		"016llx"
>> +#define __PRIpte_args(val)	((u64)val)
> 
> Same comment regarding "pte" being misleading.

Sure - will rename __PRIpte as PRIpxx instead.
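
A small userspace sketch of how such a format-specifier/argument macro
pair would be used, assuming the PRIpxx spelling mentioned above. The
companion name PRIpxx_args() and the format_entry() helper are
assumptions made for this example only.

```c
#include <stdint.h>
#include <stdio.h>

/* 64-bit flavour; a wider entry format would redefine both macros. */
#define PRIpxx			"016llx"
#define PRIpxx_args(val)	((unsigned long long)(val))

/* Formats one entry value; callers never spell out the width. */
static int format_entry(char *buf, size_t len, uint64_t val)
{
	return snprintf(buf, len, "pte=%" PRIpxx, PRIpxx_args(val));
}
```

Call sites only ever see the macro pair, so a change of entry width
stays confined to the two definitions.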

> 
> 




* Re: [RFC V1 01/16] mm: Abstract printing of pxd_val()
From: Anshuman Khandual @ 2026-04-10  4:21 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Ryan Roberts,
	Mark Rutland, Lorenzo Stoakes, Andrew Morton, David Hildenbrand,
	Linu Cherian, linux-kernel, linux-mm
In-Reply-To: <adeAhYW-hJ7-6-Xy@kernel.org>

On 09/04/26 4:03 PM, Mike Rapoport wrote:
> Hi Anshuman,
> 
> On Tue, Feb 24, 2026 at 10:41:38AM +0530, Anshuman Khandual wrote:
>> Ahead of adding support for D128 pgtables, refactor places that print
>> PTE values to use the new __PRIpte format specifier and __PRIpte_args()
>> macro to prepare the argument(s). When using D128 pgtables in future,
>> we can simply redefine __PRIpte and __PRIpte_args().
>>
>> Besides there is also an assumption about pxd_val() being always capped
>> at 'unsigned long long' size but that will not work for D128 pgtables.
>> Just increase its size to u128 if the compiler supports via a separate
>> data type pxdval_t which also defaults to existing 'unsigned long long'.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> Cc: Mike Rapoport <rppt@kernel.org>
>> Cc: linux-mm@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  include/linux/pgtable.h |  5 +++++
>>  mm/memory.c             | 29 +++++++++++++++++++----------
>>  2 files changed, 24 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index a50df42a893f..da17139a1279 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -17,6 +17,11 @@
>>  #include <asm-generic/pgtable_uffd.h>
>>  #include <linux/page_table_check.h>
>>  
>> +#ifndef __PRIpte
>> +#define __PRIpte		"016llx"
>> +#define __PRIpte_args(val)	((u64)val)
>> +#endif
>> +
>>  #if 5 - defined(__PAGETABLE_P4D_FOLDED) - defined(__PAGETABLE_PUD_FOLDED) - \
>>  	defined(__PAGETABLE_PMD_FOLDED) != CONFIG_PGTABLE_LEVELS
>>  #error CONFIG_PGTABLE_LEVELS is not consistent with __PAGETABLE_{P4D,PUD,PMD}_FOLDED
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 07778814b4a8..cfc3077fc52f 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -532,9 +532,15 @@ static bool is_bad_page_map_ratelimited(void)
>>  	return false;
>>  }
>>  
>> +#ifdef __SIZEOF_INT128__
>> +	typedef u128 pxdval_t;
> 
> I don't think the typedef should be indented.

Sure, will drop the indent from the pxdval_t typedef.

> 
>> +#else
>> +	typedef unsigned long long pxdval_t;
>> +#endif
> 
> Don't we want this in, say, include/linux/pgtable.h?
> 

Sure, will move the typedef into the above header.
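
As a rough illustration of the two review points (no indentation, and a
header-style definition), a userspace sketch could look like the
following. It assumes unsigned __int128 as the spelling behind u128, and
the pxdval_lo()/pxdval_hi() helpers are hypothetical, added only to show
width-independent use of the type.

```c
#include <stdint.h>

#ifdef __SIZEOF_INT128__
typedef unsigned __int128 pxdval_t;
#else
typedef unsigned long long pxdval_t;
#endif

/* Low 64 bits of an entry value, whatever width pxdval_t has. */
static inline uint64_t pxdval_lo(pxdval_t val)
{
	return (uint64_t)val;
}

/* High 64 bits; always 0 when pxdval_t is only 64 bits wide. */
static inline uint64_t pxdval_hi(pxdval_t val)
{
	if (sizeof(pxdval_t) > sizeof(uint64_t))
		return (uint64_t)(val >> (8 * (sizeof(pxdval_t) - sizeof(uint64_t))));
	return 0;
}
```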



* Re: [RFC V1 02/16] mm: Add read-write accessors for vm_page_prot
From: Anshuman Khandual @ 2026-04-10  4:29 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Ryan Roberts,
	Mark Rutland, Lorenzo Stoakes, Andrew Morton, David Hildenbrand,
	Linu Cherian, linux-kernel, linux-mm
In-Reply-To: <adeBg_eZvmz-ST67@kernel.org>



On 09/04/26 4:07 PM, Mike Rapoport wrote:
> Hi Anshuman,
> 
> On Tue, Feb 24, 2026 at 10:41:39AM +0530, Anshuman Khandual wrote:
>> Currently vma->vm_page_prot is safely read from and written to, without any
>> locks with READ_ONCE() and WRITE_ONCE(). But with introduction of D128 page
>> tables on arm64 platform, vm_page_prot grows to 128 bits which can't safely
>> be handled with READ_ONCE() and WRITE_ONCE().
>>
>> Add read and write accessors for vm_page_prot like pgprot_read/write_once()
>> which any platform can override when required, although still defaulting as
>> READ_ONCE() and WRITE_ONCE(), thus preserving the functionality for others.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> Cc: Mike Rapoport <rppt@kernel.org>
>> Cc: linux-mm@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  include/linux/pgtable.h | 14 ++++++++++++++
>>  mm/huge_memory.c        |  4 ++--
>>  mm/memory.c             |  2 +-
>>  mm/migrate.c            |  2 +-
>>  mm/mmap.c               |  2 +-
>>  5 files changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index da17139a1279..8858b8b03a02 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -495,6 +495,20 @@ static inline pgd_t pgdp_get(pgd_t *pgdp)
>>  }
>>  #endif
>>  
>> +#ifndef pgprot_read_once
>> +static inline pgprot_t pgprot_read_once(pgprot_t *prot)
> 
> I don't think we need _once in the helper name. Presence of the helper
> already implies that pointer should not be just dereferenced from one side
> and that using the helper will do The Right Thing from the other side.


Makes sense - will drop _once from the helper name.
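
A minimal sketch of the overridable-accessor pattern under discussion,
written as plain userspace C. It assumes a 64-bit pgprot_t and uses
volatile accesses as a stand-in for READ_ONCE()/WRITE_ONCE(); the
shortened names pgprot_read()/pgprot_write() are what the helpers might
look like after the rename, not what the series currently contains.

```c
typedef struct { unsigned long long pgprot; } pgprot_t;

/* An architecture could define pgprot_read before this point. */
#ifndef pgprot_read
static inline pgprot_t pgprot_read(const pgprot_t *prot)
{
	pgprot_t ret;

	/* Volatile access: a load the compiler may not repeat or elide. */
	ret.pgprot = *(const volatile unsigned long long *)&prot->pgprot;
	return ret;
}
#define pgprot_read pgprot_read
#endif

#ifndef pgprot_write
static inline void pgprot_write(pgprot_t *prot, pgprot_t val)
{
	/* Volatile access: a store the compiler may not repeat or elide. */
	*(volatile unsigned long long *)&prot->pgprot = val.pgprot;
}
#define pgprot_write pgprot_write
#endif
```

Generic code would call only the helpers, so a platform with a wider
pgprot_t can supply its own definitions without touching the callers.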




* Re: [PATCH v3 4/7] arm64: dts: ti: k3-am62a7-sk: Split r5f memory region
From: Vignesh Raghavendra @ 2026-04-10  4:30 UTC (permalink / raw)
  To: Markus Schneider-Pargmann (TI), Bjorn Andersson, Mathieu Poirier,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Suman Anna,
	Nishanth Menon, Tero Kristo
  Cc: Vishal Mahaveer, Kevin Hilman, Dhruva Gole, Sebin Francis,
	Kendall Willis, Akashdeep Kaur, linux-remoteproc, devicetree,
	linux-kernel, linux-arm-kernel
In-Reply-To: <20260318-topic-am62a-ioddr-dt-v6-19-v3-4-c41473cb23c3@baylibre.com>

Hi Markus

On 18/03/26 20:43, Markus Schneider-Pargmann (TI) wrote:
> Split the firmware memory region in more specific parts so it is better
> described where to find which information. Specifically the LPM metadata
> region is important as bootloader software like U-Boot has to know where
> that data is to be able to read that data.
> 
> Signed-off-by: Markus Schneider-Pargmann (TI) <msp@baylibre.com>
> ---
>  arch/arm64/boot/dts/ti/k3-am62a7-sk.dts | 40 +++++++++++++++++++++++++++++++--
>  1 file changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
> index e99bdbc2e0cbdf858f1631096f9c2a086191bab3..c381cc33064ec427751a9ac5bcdff745a9559a89 100644
> --- a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
> +++ b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
> @@ -59,9 +59,33 @@ wkup_r5fss0_core0_dma_memory_region: memory@9c800000 {
>  			no-map;
>  		};
>  
> -		wkup_r5fss0_core0_memory_region: memory@9c900000 {
> +		wkup_r5fss0_core0_ipc_region: memory@9c900000 {

There are still references to wkup_r5fss0_core0_memory_region in
k3-am62a-ti-ipc-firmware.dtsi (same comment applies to next 2 patches as
well)

Don't those need to be updated too?

>  			compatible = "shared-dma-pool";
> -			reg = <0x00 0x9c900000 0x00 0xf00000>;
> +			reg = <0x00 0x9c900000 0x00 0x100000>;
> +			no-map;
> +		};
> +
> +		wkup_r5fss0_core0_lpm_fs_stub_region: memory@9ca00000 {
> +			compatible = "shared-dma-pool";
> +			reg = <0x00 0x9ca00000 0x00 0x8000>;
> +			no-map;
> +		};
> +
> +		wkup_r5fss0_core0_lpm_metadata_region: memory@9ca08000 {
> +			compatible = "shared-dma-pool";
> +			reg = <0x00 0x9ca08000 0x00 0x1000>;
> +			no-map;
> +		};
> +
> +		wkup_r5fss0_core0_lpm_rest_region: memory@9ca09000 {
> +			compatible = "shared-dma-pool";
> +			reg = <0x00 0x9ca09000 0x00 0x97000>;
> +			no-map;
> +		};
> +
> +		wkup_r5fss0_core0_dm_region: memory@9caa0000 {
> +			compatible = "shared-dma-pool";
> +			reg = <0x00 0x9caa0000 0x00 0xd60000>;
>  			no-map;
>  		};
>  
> @@ -922,3 +946,15 @@ &mcu_uart0 {
>  };
>  
>  #include "k3-am62a-ti-ipc-firmware.dtsi"
> +
> +&wkup_r5fss0_core0 {
> +	memory-region = <&wkup_r5fss0_core0_dma_memory_region>,
> +			<&wkup_r5fss0_core0_ipc_region>,
> +			<&wkup_r5fss0_core0_lpm_fs_stub_region>,
> +			<&wkup_r5fss0_core0_lpm_metadata_region>,
> +			<&wkup_r5fss0_core0_lpm_rest_region>,
> +			<&wkup_r5fss0_core0_dm_region>;
> +	memory-region-names = "dma", "ipc", "lpm-stub",
> +			      "lpm-metadata", "lpm-context",
> +			      "dm-firmware";
> +};
> 
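
As a side check on the split itself, the five new regions in the hunk
above appear to tile the old 0xf00000 region exactly, with no gap or
overlap. A small standalone C sketch of that arithmetic follows; struct
region, r5f_split and regions_tile() are hypothetical names used only
for this check, while the base and size values are copied from the
quoted reg properties.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct region {
	uint64_t base;
	uint64_t size;
};

/* Sub-regions from the quoted hunk, in address order. */
static const struct region r5f_split[] = {
	{ 0x9c900000, 0x100000 },	/* ipc */
	{ 0x9ca00000, 0x8000 },		/* lpm-stub */
	{ 0x9ca08000, 0x1000 },		/* lpm-metadata */
	{ 0x9ca09000, 0x97000 },	/* lpm-context */
	{ 0x9caa0000, 0xd60000 },	/* dm-firmware */
};

/* True if the regions tile [base, base + size) with no gap or overlap. */
static bool regions_tile(const struct region *r, size_t n,
			 uint64_t base, uint64_t size)
{
	uint64_t next = base;

	for (size_t i = 0; i < n; i++) {
		if (r[i].base != next)
			return false;
		next += r[i].size;
	}
	return next == base + size;
}
```

Calling regions_tile() with base 0x9c900000 and size 0xf00000 is the
check of interest: each region must start where the previous one ends,
and the last one must end at 0x9d800000.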

-- 
Regards
Vignesh
https://ti.com/opensource




* RE: [PATCH v1] clk: imx95-blk-ctl: Add func_out_en clock for i.MX9x PCIe
From: Peng Fan @ 2026-04-10  4:35 UTC (permalink / raw)
  To: Hongxing Zhu, abelvesa@kernel.org, mturquette@baylibre.com,
	sboyd@kernel.org, Frank Li, s.hauer@pengutronix.de,
	festevam@gmail.com
  Cc: linux-clk@vger.kernel.org, imx@nxp.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kernel@pengutronix.de
In-Reply-To: <20260410031656.2578632-1-hongxing.zhu@nxp.com>

> Subject: [PATCH v1] clk: imx95-blk-ctl: Add func_out_en clock for i.MX9x PCIe
> 
> When internal PLL clock is used as PCIe REF clock, the BIT6(CREF_EN)
> and BIT2(FUNC_OUTPUT_EN) control the PCIE_REF_OUT_CLK.
> 
> The default value of BIT6(CREF_EN) and BIT2(FUNC_OUTPUT_EN) is 1'b1.
> With the typical 100-ohm termination on the board, this results in
> approximately 6mA of power consumption.
> 
> When PCIe internal PLL clock is not enabled, these two bits should be
> cleared to 1'b0 to eliminate this power consumption.
> 
> Add a func_out_en clock for i.MX9x PCIe to serve as the parent gate
> clock of the CREF_EN (BIT6) gate clock. Both of these two gate clocks
> enable the output of the internal 100MHz differential reference clock.
> 
> Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>

Reviewed-by: Peng Fan <peng.fan@nxp.com>



* Re: [RFC V1 05/16] arm64/mm: Convert READ_ONCE() as pmdp_get() while accessing PMD
From: Anshuman Khandual @ 2026-04-10  4:48 UTC (permalink / raw)
  To: David Hildenbrand (Arm), linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Ryan Roberts, Mark Rutland,
	Lorenzo Stoakes, Andrew Morton, Mike Rapoport, Linu Cherian,
	linux-kernel, linux-mm, kasan-dev
In-Reply-To: <628a1d9b-cc16-4cf6-8c19-b6ed49af8492@kernel.org>



On 08/04/26 5:41 PM, David Hildenbrand (Arm) wrote:
> On 2/24/26 06:11, Anshuman Khandual wrote:
>> Convert all READ_ONCE() based PMD accesses as pmdp_get() instead which will
>> support both D64 and D128 translation regime going forward.
> 
> You should mention the move from pmdp_test_and_clear_young(), and why it
> is performed.

Sure, will do that. There was a build problem when calling pmdp_get()
from pmdp_test_and_clear_young() while it was still inside the header,
which necessitated this move.

> 
> Nothing else jumped at me :)
> 




* Re: [PATCH v2 4/8] clk: qcom: videocc: Add video clock controller driver for Eliza
From: Jie Gan @ 2026-04-10  4:48 UTC (permalink / raw)
  To: Taniya Das, Bjorn Andersson, Michael Turquette, Stephen Boyd,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Konrad Dybcio,
	Maxime Coquelin, Alexandre Torgue
  Cc: Ajit Pandey, Imran Shaik, Jagadeesh Kona, linux-arm-msm,
	linux-clk, devicetree, linux-kernel, linux-stm32,
	linux-arm-kernel, Konrad Dybcio
In-Reply-To: <20260409-eliza_mm_cc_v2-v2-4-bc0c6dd77bc5@oss.qualcomm.com>



On 4/10/2026 2:10 AM, Taniya Das wrote:
> Add support for the video clock controller for video clients to be able
> to request for videocc clocks on Eliza platform.
> 
> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
> Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
> ---
>   drivers/clk/qcom/Kconfig         |   9 +
>   drivers/clk/qcom/Makefile        |   1 +
>   drivers/clk/qcom/videocc-eliza.c | 403 +++++++++++++++++++++++++++++++++++++++
>   3 files changed, 413 insertions(+)
> 
> diff --git a/drivers/clk/qcom/Kconfig b/drivers/clk/qcom/Kconfig
> index 22eb80be60ad3bde897f2c507ac9897951fbb8fe..4b0d40a38a6328fe9c41ebb15ae6821012223920 100644
> --- a/drivers/clk/qcom/Kconfig
> +++ b/drivers/clk/qcom/Kconfig
> @@ -45,6 +45,15 @@ config CLK_ELIZA_TCSRCC
>   	  Support for the TCSR clock controller on Eliza devices.
>   	  Say Y if you want to use peripheral devices such as USB/PCIe/UFS.
>   
> +config CLK_ELIZA_VIDEOCC
> +	tristate "Eliza Video Clock Controller"
> +	depends on ARM64 || COMPILE_TEST
> +	select CLK_GLYMUR_GCC

Hi,

My bot flagged a [BUG] here; please ignore it if it's a false positive.

CLK_ELIZA_VIDEOCC selects CLK_GLYMUR_GCC instead of CLK_ELIZA_GCC

- select CLK_GLYMUR_GCC pulls in gcc-glymur.c instead of gcc-eliza.c
- On an Eliza system, gcc-glymur.c will never probe (no matching DTS 
node), so GCC_VIDEO_AHB_CLK from the Eliza GCC will never be available 
to videocc
- The videocc driver's clocks = <&gcc GCC_VIDEO_AHB_CLK> will fail to 
resolve at runtime
- The correct fix is select CLK_ELIZA_GCC, consistent with all other 
Eliza clock controllers

Thanks,
Jie

> +	help
> +	  Support for the video clock controller on Eliza devices.
> +	  Say Y if you want to support video devices and functionality such as
> +	  video encode and decode.
> +
>   config CLK_GLYMUR_DISPCC
>   	tristate "Glymur Display Clock Controller"
>   	depends on ARM64 || COMPILE_TEST
> diff --git a/drivers/clk/qcom/Makefile b/drivers/clk/qcom/Makefile
> index b818fd5af8bfb85a51ee90fdc3baa93af30dc39a..e7e239c5a0d088b2e78354bf421d871a4e4e6d9d 100644
> --- a/drivers/clk/qcom/Makefile
> +++ b/drivers/clk/qcom/Makefile
> @@ -23,6 +23,7 @@ obj-$(CONFIG_APQ_MMCC_8084) += mmcc-apq8084.o
>   obj-$(CONFIG_CLK_ELIZA_DISPCC) += dispcc-eliza.o
>   obj-$(CONFIG_CLK_ELIZA_GCC) += gcc-eliza.o
>   obj-$(CONFIG_CLK_ELIZA_TCSRCC) += tcsrcc-eliza.o
> +obj-$(CONFIG_CLK_ELIZA_VIDEOCC) += videocc-eliza.o
>   obj-$(CONFIG_CLK_GFM_LPASS_SM8250) += lpass-gfm-sm8250.o
>   obj-$(CONFIG_CLK_GLYMUR_DISPCC) += dispcc-glymur.o
>   obj-$(CONFIG_CLK_GLYMUR_GCC) += gcc-glymur.o
> diff --git a/drivers/clk/qcom/videocc-eliza.c b/drivers/clk/qcom/videocc-eliza.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..cb541cfec50c12761251a822e32094e763922cdb
> --- /dev/null
> +++ b/drivers/clk/qcom/videocc-eliza.c
> @@ -0,0 +1,403 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#include <linux/clk-provider.h>
> +#include <linux/mod_devicetable.h>
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include <linux/regmap.h>
> +
> +#include <dt-bindings/clock/qcom,eliza-videocc.h>
> +
> +#include "clk-alpha-pll.h"
> +#include "clk-branch.h"
> +#include "clk-pll.h"
> +#include "clk-rcg.h"
> +#include "clk-regmap.h"
> +#include "clk-regmap-divider.h"
> +#include "clk-regmap-mux.h"
> +#include "common.h"
> +#include "gdsc.h"
> +#include "reset.h"
> +
> +enum {
> +	DT_BI_TCXO,
> +	DT_SLEEP_CLK,
> +	DT_AHB_CLK,
> +};
> +
> +enum {
> +	P_BI_TCXO,
> +	P_SLEEP_CLK,
> +	P_VIDEO_CC_PLL0_OUT_MAIN,
> +};
> +
> +static const struct pll_vco lucid_ole_vco[] = {
> +	{ 249600000, 2300000000, 0 },
> +};
> +
> +/* 576.0 MHz Configuration */
> +static const struct alpha_pll_config video_cc_pll0_config = {
> +	.l = 0x1e,
> +	.alpha = 0x0,
> +	.config_ctl_val = 0x20485699,
> +	.config_ctl_hi_val = 0x00182261,
> +	.config_ctl_hi1_val = 0x82aa299c,
> +	.test_ctl_val = 0x00000000,
> +	.test_ctl_hi_val = 0x00000003,
> +	.test_ctl_hi1_val = 0x00009000,
> +	.test_ctl_hi2_val = 0x00000034,
> +	.user_ctl_val = 0x00000000,
> +	.user_ctl_hi_val = 0x00000005,
> +};
> +
> +static struct clk_alpha_pll video_cc_pll0 = {
> +	.offset = 0x0,
> +	.config = &video_cc_pll0_config,
> +	.vco_table = lucid_ole_vco,
> +	.num_vco = ARRAY_SIZE(lucid_ole_vco),
> +	.regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_LUCID_OLE],
> +	.clkr = {
> +		.hw.init = &(const struct clk_init_data) {
> +			.name = "video_cc_pll0",
> +			.parent_data = &(const struct clk_parent_data) {
> +				.index = DT_BI_TCXO,
> +			},
> +			.num_parents = 1,
> +			.ops = &clk_alpha_pll_lucid_evo_ops,
> +		},
> +	},
> +};
> +
> +static const struct parent_map video_cc_parent_map_0[] = {
> +	{ P_BI_TCXO, 0 },
> +};
> +
> +static const struct clk_parent_data video_cc_parent_data_0[] = {
> +	{ .index = DT_BI_TCXO },
> +};
> +
> +static const struct parent_map video_cc_parent_map_1[] = {
> +	{ P_BI_TCXO, 0 },
> +	{ P_VIDEO_CC_PLL0_OUT_MAIN, 1 },
> +};
> +
> +static const struct clk_parent_data video_cc_parent_data_1[] = {
> +	{ .index = DT_BI_TCXO },
> +	{ .hw = &video_cc_pll0.clkr.hw },
> +};
> +
> +static const struct parent_map video_cc_parent_map_2[] = {
> +	{ P_SLEEP_CLK, 0 },
> +};
> +
> +static const struct clk_parent_data video_cc_parent_data_2[] = {
> +	{ .index = DT_SLEEP_CLK },
> +};
> +
> +static const struct freq_tbl ftbl_video_cc_ahb_clk_src[] = {
> +	F(19200000, P_BI_TCXO, 1, 0, 0),
> +	{ }
> +};
> +
> +static struct clk_rcg2 video_cc_ahb_clk_src = {
> +	.cmd_rcgr = 0x8018,
> +	.mnd_width = 0,
> +	.hid_width = 5,
> +	.parent_map = video_cc_parent_map_0,
> +	.freq_tbl = ftbl_video_cc_ahb_clk_src,
> +	.hw_clk_ctrl = true,
> +	.clkr.hw.init = &(const struct clk_init_data) {
> +		.name = "video_cc_ahb_clk_src",
> +		.parent_data = video_cc_parent_data_0,
> +		.num_parents = ARRAY_SIZE(video_cc_parent_data_0),
> +		.flags = CLK_SET_RATE_PARENT,
> +		.ops = &clk_rcg2_shared_ops,
> +	},
> +};
> +
> +static const struct freq_tbl ftbl_video_cc_mvs0_clk_src[] = {
> +	F(576000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	F(633000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	F(720000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	F(1014000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	F(1098000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	F(1113000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	F(1332000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	F(1600000000, P_VIDEO_CC_PLL0_OUT_MAIN, 1, 0, 0),
> +	{ }
> +};
> +
> +static struct clk_rcg2 video_cc_mvs0_clk_src = {
> +	.cmd_rcgr = 0x8000,
> +	.mnd_width = 0,
> +	.hid_width = 5,
> +	.parent_map = video_cc_parent_map_1,
> +	.freq_tbl = ftbl_video_cc_mvs0_clk_src,
> +	.hw_clk_ctrl = true,
> +	.clkr.hw.init = &(const struct clk_init_data) {
> +		.name = "video_cc_mvs0_clk_src",
> +		.parent_data = video_cc_parent_data_1,
> +		.num_parents = ARRAY_SIZE(video_cc_parent_data_1),
> +		.flags = CLK_SET_RATE_PARENT,
> +		.ops = &clk_rcg2_shared_ops,
> +	},
> +};
> +
> +static const struct freq_tbl ftbl_video_cc_sleep_clk_src[] = {
> +	F(32000, P_SLEEP_CLK, 1, 0, 0),
> +	{ }
> +};
> +
> +static struct clk_rcg2 video_cc_sleep_clk_src = {
> +	.cmd_rcgr = 0x8110,
> +	.mnd_width = 0,
> +	.hid_width = 5,
> +	.parent_map = video_cc_parent_map_2,
> +	.freq_tbl = ftbl_video_cc_sleep_clk_src,
> +	.clkr.hw.init = &(const struct clk_init_data) {
> +		.name = "video_cc_sleep_clk_src",
> +		.parent_data = video_cc_parent_data_2,
> +		.num_parents = ARRAY_SIZE(video_cc_parent_data_2),
> +		.flags = CLK_SET_RATE_PARENT,
> +		.ops = &clk_rcg2_shared_ops,
> +	},
> +};
> +
> +static struct clk_rcg2 video_cc_xo_clk_src = {
> +	.cmd_rcgr = 0x80f4,
> +	.mnd_width = 0,
> +	.hid_width = 5,
> +	.parent_map = video_cc_parent_map_0,
> +	.freq_tbl = ftbl_video_cc_ahb_clk_src,
> +	.clkr.hw.init = &(const struct clk_init_data) {
> +		.name = "video_cc_xo_clk_src",
> +		.parent_data = video_cc_parent_data_0,
> +		.num_parents = ARRAY_SIZE(video_cc_parent_data_0),
> +		.flags = CLK_SET_RATE_PARENT,
> +		.ops = &clk_rcg2_shared_ops,
> +	},
> +};
> +
> +static struct clk_regmap_div video_cc_mvs0_div_clk_src = {
> +	.reg = 0x80ac,
> +	.shift = 0,
> +	.width = 4,
> +	.clkr.hw.init = &(const struct clk_init_data) {
> +		.name = "video_cc_mvs0_div_clk_src",
> +		.parent_hws = (const struct clk_hw*[]) {
> +			&video_cc_mvs0_clk_src.clkr.hw,
> +		},
> +		.num_parents = 1,
> +		.flags = CLK_SET_RATE_PARENT,
> +		.ops = &clk_regmap_div_ro_ops,
> +	},
> +};
> +
> +static struct clk_regmap_div video_cc_mvs0c_div2_div_clk_src = {
> +	.reg = 0x8058,
> +	.shift = 0,
> +	.width = 4,
> +	.clkr.hw.init = &(const struct clk_init_data) {
> +		.name = "video_cc_mvs0c_div2_div_clk_src",
> +		.parent_hws = (const struct clk_hw*[]) {
> +			&video_cc_mvs0_clk_src.clkr.hw,
> +		},
> +		.num_parents = 1,
> +		.flags = CLK_SET_RATE_PARENT,
> +		.ops = &clk_regmap_div_ro_ops,
> +	},
> +};
> +
> +static struct clk_branch video_cc_mvs0_clk = {
> +	.halt_reg = 0x80a0,
> +	.halt_check = BRANCH_HALT_VOTED,
> +	.hwcg_reg = 0x80a0,
> +	.hwcg_bit = 1,
> +	.clkr = {
> +		.enable_reg = 0x80a0,
> +		.enable_mask = BIT(0),
> +		.hw.init = &(const struct clk_init_data) {
> +			.name = "video_cc_mvs0_clk",
> +			.parent_hws = (const struct clk_hw*[]) {
> +				&video_cc_mvs0_div_clk_src.clkr.hw,
> +			},
> +			.num_parents = 1,
> +			.flags = CLK_SET_RATE_PARENT,
> +			.ops = &clk_branch2_ops,
> +		},
> +	},
> +};
> +
> +static struct clk_branch video_cc_mvs0_shift_clk = {
> +	.halt_reg = 0x8144,
> +	.halt_check = BRANCH_HALT_VOTED,
> +	.hwcg_reg = 0x8144,
> +	.hwcg_bit = 1,
> +	.clkr = {
> +		.enable_reg = 0x8144,
> +		.enable_mask = BIT(0),
> +		.hw.init = &(const struct clk_init_data) {
> +			.name = "video_cc_mvs0_shift_clk",
> +			.parent_hws = (const struct clk_hw*[]) {
> +				&video_cc_xo_clk_src.clkr.hw,
> +			},
> +			.num_parents = 1,
> +			.flags = CLK_SET_RATE_PARENT,
> +			.ops = &clk_branch2_ops,
> +		},
> +	},
> +};
> +
> +static struct clk_branch video_cc_mvs0c_clk = {
> +	.halt_reg = 0x804c,
> +	.halt_check = BRANCH_HALT,
> +	.clkr = {
> +		.enable_reg = 0x804c,
> +		.enable_mask = BIT(0),
> +		.hw.init = &(const struct clk_init_data) {
> +			.name = "video_cc_mvs0c_clk",
> +			.parent_hws = (const struct clk_hw*[]) {
> +				&video_cc_mvs0c_div2_div_clk_src.clkr.hw,
> +			},
> +			.num_parents = 1,
> +			.flags = CLK_SET_RATE_PARENT,
> +			.ops = &clk_branch2_ops,
> +		},
> +	},
> +};
> +
> +static struct clk_branch video_cc_mvs0c_shift_clk = {
> +	.halt_reg = 0x8148,
> +	.halt_check = BRANCH_HALT_VOTED,
> +	.hwcg_reg = 0x8148,
> +	.hwcg_bit = 1,
> +	.clkr = {
> +		.enable_reg = 0x8148,
> +		.enable_mask = BIT(0),
> +		.hw.init = &(const struct clk_init_data) {
> +			.name = "video_cc_mvs0c_shift_clk",
> +			.parent_hws = (const struct clk_hw*[]) {
> +				&video_cc_xo_clk_src.clkr.hw,
> +			},
> +			.num_parents = 1,
> +			.flags = CLK_SET_RATE_PARENT,
> +			.ops = &clk_branch2_ops,
> +		},
> +	},
> +};
> +
> +static struct gdsc video_cc_mvs0c_gdsc = {
> +	.gdscr = 0x8034,
> +	.en_rest_wait_val = 0x2,
> +	.en_few_wait_val = 0x2,
> +	.clk_dis_wait_val = 0x6,
> +	.pd = {
> +		.name = "video_cc_mvs0c_gdsc",
> +	},
> +	.pwrsts = PWRSTS_OFF_ON,
> +	.flags = POLL_CFG_GDSCR | RETAIN_FF_ENABLE,
> +};
> +
> +static struct gdsc video_cc_mvs0_gdsc = {
> +	.gdscr = 0x808c,
> +	.en_rest_wait_val = 0x2,
> +	.en_few_wait_val = 0x2,
> +	.clk_dis_wait_val = 0x6,
> +	.pd = {
> +		.name = "video_cc_mvs0_gdsc",
> +	},
> +	.pwrsts = PWRSTS_OFF_ON,
> +	.parent = &video_cc_mvs0c_gdsc.pd,
> +	.flags = POLL_CFG_GDSCR | RETAIN_FF_ENABLE | HW_CTRL_TRIGGER,
> +};
> +
> +static struct clk_regmap *video_cc_eliza_clocks[] = {
> +	[VIDEO_CC_AHB_CLK_SRC] = &video_cc_ahb_clk_src.clkr,
> +	[VIDEO_CC_MVS0_CLK] = &video_cc_mvs0_clk.clkr,
> +	[VIDEO_CC_MVS0_CLK_SRC] = &video_cc_mvs0_clk_src.clkr,
> +	[VIDEO_CC_MVS0_DIV_CLK_SRC] = &video_cc_mvs0_div_clk_src.clkr,
> +	[VIDEO_CC_MVS0_SHIFT_CLK] = &video_cc_mvs0_shift_clk.clkr,
> +	[VIDEO_CC_MVS0C_CLK] = &video_cc_mvs0c_clk.clkr,
> +	[VIDEO_CC_MVS0C_DIV2_DIV_CLK_SRC] = &video_cc_mvs0c_div2_div_clk_src.clkr,
> +	[VIDEO_CC_MVS0C_SHIFT_CLK] = &video_cc_mvs0c_shift_clk.clkr,
> +	[VIDEO_CC_PLL0] = &video_cc_pll0.clkr,
> +	[VIDEO_CC_SLEEP_CLK_SRC] = &video_cc_sleep_clk_src.clkr,
> +	[VIDEO_CC_XO_CLK_SRC] = &video_cc_xo_clk_src.clkr,
> +};
> +
> +static struct gdsc *video_cc_eliza_gdscs[] = {
> +	[VIDEO_CC_MVS0_GDSC] = &video_cc_mvs0_gdsc,
> +	[VIDEO_CC_MVS0C_GDSC] = &video_cc_mvs0c_gdsc,
> +};
> +
> +static const struct qcom_reset_map video_cc_eliza_resets[] = {
> +	[VIDEO_CC_INTERFACE_BCR] = { 0x80d8 },
> +	[VIDEO_CC_MVS0_CLK_ARES] = { 0x80a0, 2 },
> +	[VIDEO_CC_MVS0_BCR] = { 0x8088 },
> +	[VIDEO_CC_MVS0C_CLK_ARES] = { 0x804c, 2 },
> +	[VIDEO_CC_MVS0C_BCR] = { 0x8030 },
> +	[VIDEO_CC_XO_CLK_ARES] = { 0x810c, 2 },
> +};
> +
> +static struct clk_alpha_pll *video_cc_eliza_plls[] = {
> +	&video_cc_pll0,
> +};
> +
> +static u32 video_cc_eliza_critical_cbcrs[] = {
> +	0x80dc, /* VIDEO_CC_AHB_CLK */
> +	0x8128, /* VIDEO_CC_SLEEP_CLK */
> +	0x810c, /* VIDEO_CC_XO_CLK */
> +};
> +
> +static const struct regmap_config video_cc_eliza_regmap_config = {
> +	.reg_bits = 32,
> +	.reg_stride = 4,
> +	.val_bits = 32,
> +	.max_register = 0x9f50,
> +	.fast_io = true,
> +};
> +
> +static struct qcom_cc_driver_data video_cc_eliza_driver_data = {
> +	.alpha_plls = video_cc_eliza_plls,
> +	.num_alpha_plls = ARRAY_SIZE(video_cc_eliza_plls),
> +	.clk_cbcrs = video_cc_eliza_critical_cbcrs,
> +	.num_clk_cbcrs = ARRAY_SIZE(video_cc_eliza_critical_cbcrs),
> +};
> +
> +static const struct qcom_cc_desc video_cc_eliza_desc = {
> +	.config = &video_cc_eliza_regmap_config,
> +	.clks = video_cc_eliza_clocks,
> +	.num_clks = ARRAY_SIZE(video_cc_eliza_clocks),
> +	.resets = video_cc_eliza_resets,
> +	.num_resets = ARRAY_SIZE(video_cc_eliza_resets),
> +	.gdscs = video_cc_eliza_gdscs,
> +	.num_gdscs = ARRAY_SIZE(video_cc_eliza_gdscs),
> +	.driver_data = &video_cc_eliza_driver_data,
> +};
> +
> +static const struct of_device_id video_cc_eliza_match_table[] = {
> +	{ .compatible = "qcom,eliza-videocc" },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(of, video_cc_eliza_match_table);
> +
> +static int video_cc_eliza_probe(struct platform_device *pdev)
> +{
> +	return qcom_cc_probe(pdev, &video_cc_eliza_desc);
> +}
> +
> +static struct platform_driver video_cc_eliza_driver = {
> +	.probe = video_cc_eliza_probe,
> +	.driver = {
> +		.name = "videocc-eliza",
> +		.of_match_table = video_cc_eliza_match_table,
> +	},
> +};
> +
> +module_platform_driver(video_cc_eliza_driver);
> +
> +MODULE_DESCRIPTION("QTI VIDEOCC Eliza Driver");
> +MODULE_LICENSE("GPL");
> 



^ permalink raw reply

* Re: [RFC V1 06/16] arm64/mm: Convert READ_ONCE() as pudp_get() while accessing PUD
From: Anshuman Khandual @ 2026-04-10  4:50 UTC (permalink / raw)
  To: David Hildenbrand (Arm), linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Ryan Roberts, Mark Rutland,
	Lorenzo Stoakes, Andrew Morton, Mike Rapoport, Linu Cherian,
	linux-kernel, linux-mm, kasan-dev
In-Reply-To: <b1e1783c-3621-41e4-b65d-38cf66a1124c@kernel.org>



On 08/04/26 5:45 PM, David Hildenbrand (Arm) wrote:
> On 2/24/26 06:11, Anshuman Khandual wrote:
>> Convert all READ_ONCE() based PUD accesses as pudp_get() instead which will
>> support both D64 and D128 translation regime going forward.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: kasan-dev@googlegroups.com
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
> 
> I was wondering for a second whether it would be better to structure
> this as "convert READ_ONCE to use pxxxp_get() in fault.c" instead,
> essentially, to touch each file only once.

But won't that create too many patches?

> 
> Anyhow
> 
> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
> 

Thanks!



^ permalink raw reply

* Re: [PATCH 3/4] perf arm_spe: Decode Arm N1 IMPDEF events
From: Namhyung Kim @ 2026-04-10  4:51 UTC (permalink / raw)
  To: James Clark
  Cc: Ian Rogers, John Garry, Will Deacon, Mike Leach, Leo Yan,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel
In-Reply-To: <1b40ec9a-a003-4acc-9399-1a8d96fc4e42@linaro.org>

On Wed, Apr 08, 2026 at 09:47:41AM +0100, James Clark wrote:
> 
> 
> On 07/04/2026 7:26 pm, Ian Rogers wrote:
> > On Tue, Apr 7, 2026 at 5:35 AM James Clark <james.clark@linaro.org> wrote:
> > > 
> > > 
> > > 
> > > On 02/04/2026 4:26 pm, Ian Rogers wrote:
> > > > On Wed, Apr 1, 2026 at 7:26 AM James Clark <james.clark@linaro.org> wrote:
> > > > > 
> > > > >   From the TRM [1], N1 has one IMPDEF event which isn't covered by the
> > > > > common list. Add a framework so that more cores can be added in the
> > > > > future and that the N1 IMPDEF event can be decoded. Also increase the
> > > > > size of the buffer because we're adding more strings and if it gets
> > > > > truncated it falls back to a hex dump only.
> > > > > 
> > > > > [1]: https://developer.arm.com/documentation/100616/0401/Statistical-Profiling-Extension/implementation-defined-features-of-SPE
> > > > > Suggested-by: Al Grant <al.grant@arm.com>
> > > > > Signed-off-by: James Clark <james.clark@linaro.org>
> > > > > ---
> > > > >    tools/perf/util/arm-spe-decoder/Build              |  2 +
> > > > >    .../util/arm-spe-decoder/arm-spe-pkt-decoder.c     | 45 ++++++++++++++++++++--
> > > > >    .../util/arm-spe-decoder/arm-spe-pkt-decoder.h     |  5 ++-
> > > > >    tools/perf/util/arm-spe.c                          | 13 ++++---
> > > > >    4 files changed, 54 insertions(+), 11 deletions(-)
> > > > > 
> > > > > diff --git a/tools/perf/util/arm-spe-decoder/Build b/tools/perf/util/arm-spe-decoder/Build
> > > > > index ab500e0efe24..97a298d1e279 100644
> > > > > --- a/tools/perf/util/arm-spe-decoder/Build
> > > > > +++ b/tools/perf/util/arm-spe-decoder/Build
> > > > > @@ -1 +1,3 @@
> > > > >    perf-util-y += arm-spe-pkt-decoder.o arm-spe-decoder.o
> > > > > +
> > > > > +CFLAGS_arm-spe-pkt-decoder.o += -I$(srctree)/tools/arch/arm64/include/ -I$(OUTPUT)arch/arm64/include/generated/
> > > > > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > > > > index c880b0dec3a1..42a7501d4dfe 100644
> > > > > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > > > > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > > > > @@ -15,6 +15,8 @@
> > > > > 
> > > > >    #include "arm-spe-pkt-decoder.h"
> > > > > 
> > > > > +#include "../../arm64/include/asm/cputype.h"
> > > > 
> > > > Sashiko spotted:
> > > > https://sashiko.dev/#/patchset/20260401-james-spe-impdef-decode-v1-0-ad0d372c220c%40linaro.org
> > > > """
> > > > This isn't a bug, but does this include directive rely on accidental
> > > > path normalization?
> > > > 
> > > > The relative path ../../arm64/include/asm/cputype.h does not exist relative
> > > > to arm-spe-pkt-decoder.c. It only compiles because the Build file adds
> > > > -I$(srctree)/tools/arch/arm64/include/ to CFLAGS.
> > > > 
> > > > Would it be cleaner to use #include <asm/cputype.h> to explicitly rely on
> > > > the include path?
> > > > [ ... ]
> > > > """
> > > > I wouldn't use <asm/cputype.h> due to cross-compilation and the like,
> > > > instead just add the extra "../" into the include path.
> > > > 
> > > 
> > > Do you mean change the #include to this?
> > > 
> > >     #include "../../../arm64/include/asm/cputype.h"
> > > 
> > > I still need to add:
> > > 
> > >     CFLAGS_arm-spe-pkt-decoder.o += -I$(srctree)/tools/arch/arm64/include/
> > > 
> > > To make this include in cputype.h work:
> > > 
> > >     #include <asm/sysreg.h>
> > > 
> > > Which probably only works because there isn't a sysreg.h on other
> > > architectures. But I'm not sure what the significance of ../../ vs
> > > ../../../ is if either compile? arm-spe.c already does it with ../../
> > > which is what I copied.
> > 
> > Hmm.. maybe the path should be
> > "../../../arch/arm64/include/asm/cputype.h". The include preference is
> > for a path relative to the source file and
> > ../../arm64/include/asm/cputype.h doesn't exist. It is kind of horrid
> 
> Up 3 dirs from arm-spe-pkt-decoder.c would be
> "tools/arm64/include/asm/cputype.h" which doesn't exist either unless I'm
> missing something?
> 
> If I get the compiler to print the path it uses with 3, then it would use the
> x86 uapi include path, which doesn't seem any less weird to me:
> 
>  ...
>  In file included from util/arm-spe-decoder/arm-spe-pkt-decoder.c:19:
> 
>  linux/tools/arch/x86/include/uapi/../../../arm64/include/asm/cputype.h:254:10:
> 
> 
> > to add an include path and then use a relative path to escape into a
> > higher-level directory. arm-spe.c is a little different as it is one
> > directory higher in the directory layout.
> > 
> 
> It is one folder higher, but arm-spe.c still relies on adding a special
> include path to CFLAGS_arm-spe.o to make it work. It's not including a real
> path relative to the source either.
> 
> Yeah, it's a bit horrible, but I don't think the asm/ thing combined with
> copying headers from the kernel to tools was expected to handle the case where
> we would want to use asm/ stuff for a different arch than the compile one.
> It might not be normal to use relative include paths to escape to higher
> directories, but it's not a normal situation either. I think it's a special
> case for a special scenario. I'm not sure of a better way, but this is
> working for now.

I hope we can clean up the header inclusion path someday.  My idea is
that it always starts from the root of the perf directory.  So it'd
include "util/xxx.h" even if the source file is already in util/.

Thanks,
Namhyung
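
As an illustration of the lookup order discussed in this thread, here is
a small userspace sketch. It assumes the usual rule for quoted includes
(the includer's directory is tried first, then each -I directory in
order) and collapses ".." lexically instead of asking the filesystem.
The one-file tree and the helper names (existing_files, normalize_path,
resolve_include) are hypothetical and not part of perf.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATH_LEN 256
#define MAX_PARTS 64

/* Hypothetical miniature source tree: only this header "exists". */
static const char *const existing_files[] = {
	"tools/arch/arm64/include/asm/cputype.h",
};

static int file_exists(const char *path)
{
	size_t i;

	for (i = 0; i < sizeof(existing_files) / sizeof(existing_files[0]); i++)
		if (!strcmp(existing_files[i], path))
			return 1;
	return 0;
}

/* Collapse "." and ".." components lexically (no symlink handling). */
static void normalize_path(const char *path, char *out, size_t outsz)
{
	char buf[MAX_PATH_LEN];
	const char *parts[MAX_PARTS];
	size_t nparts = 0, i;
	char *tok;

	snprintf(buf, sizeof(buf), "%s", path);
	for (tok = strtok(buf, "/"); tok; tok = strtok(NULL, "/")) {
		if (!strcmp(tok, "."))
			continue;
		if (!strcmp(tok, "..")) {
			if (nparts > 0)
				nparts--;
			continue;
		}
		if (nparts < MAX_PARTS)
			parts[nparts++] = tok;
	}

	out[0] = '\0';
	for (i = 0; i < nparts; i++) {
		if (i)
			strncat(out, "/", outsz - strlen(out) - 1);
		strncat(out, parts[i], outsz - strlen(out) - 1);
	}
}

/*
 * Model of quoted-include lookup: try the includer's directory first,
 * then each -I directory in order. Returns 1 and fills 'out' on a hit,
 * returns 0 and empties 'out' otherwise.
 */
static int resolve_include(const char *src_dir, const char *const *inc_dirs,
			   size_t ninc, const char *name,
			   char *out, size_t outsz)
{
	char joined[MAX_PATH_LEN];
	size_t i;

	snprintf(joined, sizeof(joined), "%s/%s", src_dir, name);
	normalize_path(joined, out, outsz);
	if (file_exists(out))
		return 1;

	for (i = 0; i < ninc; i++) {
		snprintf(joined, sizeof(joined), "%s/%s", inc_dirs[i], name);
		normalize_path(joined, out, outsz);
		if (file_exists(out))
			return 1;
	}

	out[0] = '\0';
	return 0;
}
```

In this model, with src_dir "tools/perf/util/arm-spe-decoder" and a
single -I entry "tools/arch/arm64/include", the name
"../../arm64/include/asm/cputype.h" misses relative to the source
directory and only resolves through the -I entry, whereas
"../../../arch/arm64/include/asm/cputype.h" resolves relative to the
source directory with no -I entry at all.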



^ permalink raw reply

* Re: [RFC V1 07/16] arm64/mm: Convert READ_ONCE() as p4dp_get() while accessing P4D
From: Anshuman Khandual @ 2026-04-10  5:05 UTC (permalink / raw)
  To: David Hildenbrand (Arm), linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Ryan Roberts, Mark Rutland,
	Lorenzo Stoakes, Andrew Morton, Mike Rapoport, Linu Cherian,
	linux-kernel, linux-mm, kasan-dev
In-Reply-To: <238ab437-06b1-43ef-86bb-9341c02040b1@kernel.org>

On 08/04/26 5:47 PM, David Hildenbrand (Arm) wrote:
> On 2/24/26 06:11, Anshuman Khandual wrote:
>> Convert all READ_ONCE() based P4D accesses as p4dp_get() instead which will
>> support both D64 and D128 translation regime going forward.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: kasan-dev@googlegroups.com
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
> 
> 
> [...]
> 
>>  static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index a80d06db4de6..16ae11b29f66 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -354,7 +354,7 @@ static int alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
>>  {
>>  	int ret = 0;
>>  	unsigned long next;
>> -	p4d_t p4d = READ_ONCE(*p4dp);
>> +	p4d_t p4d = p4dp_get(p4dp);
>>  	pud_t *pudp;
>>  
>>  	if (p4d_none(p4d)) {
>> @@ -443,7 +443,7 @@ static int alloc_init_p4d(pgd_t *pgdp, unsigned long addr, unsigned long end,
>>  	}
>>  
>>  	do {
>> -		p4d_t old_p4d = READ_ONCE(*p4dp);
>> +		p4d_t old_p4d = p4dp_get(p4dp);
>>  
>>  		next = p4d_addr_end(addr, end);
>>  
>> @@ -453,7 +453,7 @@ static int alloc_init_p4d(pgd_t *pgdp, unsigned long addr, unsigned long end,
>>  			goto out;
>>  
>>  		BUG_ON(p4d_val(old_p4d) != 0 &&
>> -		       p4d_val(old_p4d) != READ_ONCE(p4d_val(*p4dp)));
>> +		       p4d_val(old_p4d) != (p4d_val(p4dp_get(p4dp))));
> 
> Same here, while at it remove the BUG_ON. (see below)
> 
>>  
>>  		phys += next - addr;
>>  	} while (p4dp++, addr = next, addr != end);
>> @@ -1541,7 +1541,7 @@ static void unmap_hotplug_p4d_range(pgd_t *pgdp, unsigned long addr,
>>  	do {
>>  		next = p4d_addr_end(addr, end);
>>  		p4dp = p4d_offset(pgdp, addr);
>> -		p4d = READ_ONCE(*p4dp);
>> +		p4d = p4dp_get(p4dp);
>>  		if (p4d_none(p4d))
>>  			continue;
>>  
>> @@ -1703,7 +1703,7 @@ static void free_empty_p4d_table(pgd_t *pgdp, unsigned long addr,
>>  	do {
>>  		next = p4d_addr_end(addr, end);
>>  		p4dp = p4d_offset(pgdp, addr);
>> -		p4d = READ_ONCE(*p4dp);
>> +		p4d = p4dp_get(p4dp);
>>  		if (p4d_none(p4d))
>>  			continue;
>>  
>> @@ -1724,7 +1724,7 @@ static void free_empty_p4d_table(pgd_t *pgdp, unsigned long addr,
>>  	 */
>>  	p4dp = p4d_offset(pgdp, 0UL);
>>  	for (i = 0; i < PTRS_PER_P4D; i++) {
>> -		if (!p4d_none(READ_ONCE(p4dp[i])))
>> +		if (!p4d_none(p4dp_get(p4dp + i)))
>>  			return;
>>  	}
>>  
>> @@ -2258,4 +2258,21 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma,
>>  }
>>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
>>  
>> +#if CONFIG_PGTABLE_LEVELS > 3
>> +phys_addr_t pud_offset_phys(p4d_t *p4dp, unsigned long addr)
>> +{
>> +	p4d_t p4d = p4dp_get(p4dp);
>> +
>> +	BUG_ON(!pgtable_l4_enabled());
> 
> Heh, while at it, convert that to a VM_WARN_ON_ONCE() or anything else
> that is not a BUG.
> 
> I strongly assume CONFIG_DEBUG_VM checks are sufficient.

There are multiple similar BUG_ON() instances:

arch/arm64/include/asm/pgtable.h:       BUG_ON(!pgtable_l4_enabled());
arch/arm64/include/asm/pgtable.h:       BUG_ON(!pgtable_l5_enabled());

arch/arm64/mm/mmu.c:                    BUG_ON(pmd_val(old_pmd) != 0 &&
arch/arm64/mm/mmu.c:                    BUG_ON(pud_val(old_pud) != 0 &&
arch/arm64/mm/mmu.c:            BUG_ON(p4d_val(old_p4d) != 0 &&

Shall we convert all of them to VM_WARN_ON_ONCE() in a separate patch
as a prerequisite?
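
For reference, the behavioural difference behind such a conversion can
be modelled in userspace roughly as below. This is only a sketch: the
MODEL_* macros, warn_count, l4_enabled and model_pud_offset_phys() are
hypothetical stand-ins, not kernel code. The real VM_WARN_ON_ONCE() is
additionally compiled out when CONFIG_DEBUG_VM is disabled, which this
model does not reproduce.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter so the warning behaviour is observable. */
static int warn_count;

/* Stand-in for pgtable_l4_enabled(); 0 means the invariant is violated. */
static int l4_enabled;

/* Stand-in for BUG_ON(): a failed invariant terminates execution. */
#define MODEL_BUG_ON(cond)						\
	do {								\
		if (cond) {						\
			fprintf(stderr, "BUG: %s\n", #cond);		\
			abort();					\
		}							\
	} while (0)

/*
 * Stand-in for VM_WARN_ON_ONCE(): report the first hit at this call
 * site, then keep going.
 */
#define MODEL_VM_WARN_ON_ONCE(cond)					\
	do {								\
		static int warned;					\
		if ((cond) && !warned) {				\
			warned = 1;					\
			warn_count++;					\
			fprintf(stderr, "WARNING: %s\n", #cond);	\
		}							\
	} while (0)

/* Simplified shape of pud_offset_phys() with the check converted. */
static uint64_t model_pud_offset_phys(uint64_t table_paddr, uint64_t index)
{
	MODEL_VM_WARN_ON_ONCE(!l4_enabled);

	return table_paddr + index * sizeof(uint64_t);
}
```

With l4_enabled left at 0, repeated calls to model_pud_offset_phys()
emit a single warning and still return the computed address, whereas
the MODEL_BUG_ON() variant would abort on the first call.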

> 
>> +
>> +	return p4d_page_paddr(p4d) + pud_index(addr) * sizeof(pud_t);
>> +}
>> +
> 



^ permalink raw reply

* Re: [PATCH v5 0/4] perf arm_spe: Extend SIMD operations
From: Namhyung Kim @ 2026-04-10  5:06 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Mark Rutland, Arnaldo Carvalho de Melo,
	linux-perf-users, linux-arm-kernel
In-Reply-To: <20260408-perf_support_arm_spev1-3-v5-0-b5bcea6217bb@arm.com>

Hi Leo,

On Wed, Apr 08, 2026 at 10:42:30AM +0100, Leo Yan wrote:
> This series extends the SIMD flags for Arm SPE and updates the documentation.
> 
> Since I failed to get the perf core maintainers' review for the uAPI header
> update for data source fields (since last September), I dropped the uAPI
> changes from this series and sent only the SIMD flag changes; I hope this
> makes it easier for the perf tool maintainers to pick up.

Sure, I believe these SIMD flags are independent from the kernel and
uAPI changes as they are from ARM SPE formats directly.

> 
> Anyway, uAPI patches will be sent out separately.
> 
> This version is rebased onto the latest perf-tools-next branch.

Thanks for your work,
Namhyung

> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
> Changes in v5:
> - Dropped uAPI header changes for data source fields updating.
> - Link to v4: https://lore.kernel.org/r/20260106-perf_support_arm_spev1-3-v4-0-b887bb999f6e@arm.com
> 
> Changes in v4 (resend):
> - Updated for Ian and James' review tags.
> - Link to v4: https://lore.kernel.org/r/20260106-perf_support_arm_spev1-3-v4-0-bb2d143b3860@arm.com
> 
> Changes in v4:
> - Updated for Ian and James' review tags.
> - Rebased on the latest perf-tools-next branch.
> - Link to v3: https://lore.kernel.org/r/20251112-perf_support_arm_spev1-3-v3-0-e63c9829f9d9@arm.com
> 
> Changes in v3:
> - Rebased on the latest perf-tools-next branch.
> - Link to v2: https://lore.kernel.org/r/20251017-perf_support_arm_spev1-3-v2-0-2d41e4746e1b@arm.com
> 
> Changes in v2:
> - Refined to use enums for 2nd operation types. (James)
> - Avoided adjustment bit positions for operations. (James)
> - Used enum for extended operation type in uapi header and defined
>   individual bit field for operation details in uapi header. (James)
> - Refined SIMD flag definitions. (James)
> - Extracted a separate commit for updating tool's header. (James/Arnaldo)
> - Minor improvement for printing memory events.
> - Rebased on the latest perf-tools-next branch.
> - Link to v1: https://lore.kernel.org/r/20250929-perf_support_arm_spev1-3-v1-0-1150b3c83857@arm.com
> 
> ---
> Leo Yan (4):
>       perf sort: Support sort ASE and SME
>       perf sort: Sort disabled and full predicated flags
>       perf report: Update document for SIMD flags
>       perf arm_spe: Improve SIMD flags setting
> 
>  tools/perf/Documentation/perf-report.txt |  5 ++++-
>  tools/perf/util/arm-spe.c                | 26 ++++++++++++++++++++------
>  tools/perf/util/sample.h                 | 21 ++++++++++++++++-----
>  tools/perf/util/sort.c                   | 21 +++++++++++++++------
>  4 files changed, 55 insertions(+), 18 deletions(-)
> ---
> base-commit: dc647eb00969cd213c84d6caee90c480317e857d
> change-id: 20250820-perf_support_arm_spev1-3-b6efd6fc77b2
> 
> Best regards,
> -- 
> Leo Yan <leo.yan@arm.com>
> 


^ permalink raw reply

* Re: [PATCH v5 0/4] perf arm_spe: Extend SIMD operations
From: Namhyung Kim @ 2026-04-10  5:10 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, Mark Rutland, Arnaldo Carvalho de Melo,
	linux-perf-users, linux-arm-kernel
In-Reply-To: <20260408-perf_support_arm_spev1-3-v5-0-b5bcea6217bb@arm.com>

On Wed, Apr 08, 2026 at 10:42:30AM +0100, Leo Yan wrote:
> This series extends the SIMD flags for Arm SPE and updates the documentation.
> 
> Since I failed to get the perf core maintainers' review for the uAPI header
> update for data source fields (since last September), I dropped the uAPI
> changes from this series and sent only the SIMD flag changes; I hope this
> makes it easier for the perf tool maintainers to pick up.
> 
> Anyway, uAPI patches will be sent out separately.
> 
> This version is rebased onto the latest perf-tools-next branch.

Unfortunately it doesn't apply due to a recent change.  Can you please
rebase it once again?

Thanks,
Namhyung

> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
> Changes in v5:
> - Dropped uAPI header changes for data source fields updating.
> - Link to v4: https://lore.kernel.org/r/20260106-perf_support_arm_spev1-3-v4-0-b887bb999f6e@arm.com
> 
> Changes in v4 (resend):
> - Updated for Ian and James' review tags.
> - Link to v4: https://lore.kernel.org/r/20260106-perf_support_arm_spev1-3-v4-0-bb2d143b3860@arm.com
> 
> Changes in v4:
> - Updated for Ian and James' review tags.
> - Rebased on the latest perf-tools-next branch.
> - Link to v3: https://lore.kernel.org/r/20251112-perf_support_arm_spev1-3-v3-0-e63c9829f9d9@arm.com
> 
> Changes in v3:
> - Rebased on the latest perf-tools-next branch.
> - Link to v2: https://lore.kernel.org/r/20251017-perf_support_arm_spev1-3-v2-0-2d41e4746e1b@arm.com
> 
> Changes in v2:
> - Refined to use enums for 2nd operation types. (James)
> - Avoided adjustment bit positions for operations. (James)
> - Used enum for extended operation type in uapi header and defined
>   individual bit field for operation details in uapi header. (James)
> - Refined SIMD flag definitions. (James)
> - Extracted a separate commit for updating tool's header. (James/Arnaldo)
> - Minor improvement for printing memory events.
> - Rebased on the latest perf-tools-next branch.
> - Link to v1: https://lore.kernel.org/r/20250929-perf_support_arm_spev1-3-v1-0-1150b3c83857@arm.com
> 
> ---
> Leo Yan (4):
>       perf sort: Support sort ASE and SME
>       perf sort: Sort disabled and full predicated flags
>       perf report: Update document for SIMD flags
>       perf arm_spe: Improve SIMD flags setting
> 
>  tools/perf/Documentation/perf-report.txt |  5 ++++-
>  tools/perf/util/arm-spe.c                | 26 ++++++++++++++++++++------
>  tools/perf/util/sample.h                 | 21 ++++++++++++++++-----
>  tools/perf/util/sort.c                   | 21 +++++++++++++++------
>  4 files changed, 55 insertions(+), 18 deletions(-)
> ---
> base-commit: dc647eb00969cd213c84d6caee90c480317e857d
> change-id: 20250820-perf_support_arm_spev1-3-b6efd6fc77b2
> 
> Best regards,
> -- 
> Leo Yan <leo.yan@arm.com>
> 


^ permalink raw reply
