linux-doc.vger.kernel.org archive mirror
* [PATCH v3 0/2] LoongArch irq-redirect support
@ 2025-05-23 10:18 Tianyang Zhang
  2025-05-23 10:18 ` [PATCH v3 1/2] Docs/LoongArch: Add Advanced Extended-Redirect IRQ model description Tianyang Zhang
  2025-05-23 10:18 ` [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support Tianyang Zhang
  0 siblings, 2 replies; 7+ messages in thread
From: Tianyang Zhang @ 2025-05-23 10:18 UTC (permalink / raw)
  To: chenhuacai, kernel, corbet, alexs, si.yanteng, tglx, jiaxun.yang,
	peterz, wangliupu, lvjianmin, maobibo, siyanteng, gaosong,
	yangtiezhu
  Cc: loongarch, linux-doc, linux-kernel, Tianyang Zhang

This series introduces support for the interrupt-redirect controller, a
hardware feature that is first implemented on the 3C6000.

Changelog:
	v0->v1:
	1. Rename the model names in the document.
	2. Adjust the code format.
	3. Remove architecture-specific prefixes.
	4. Refactor the initialization logic; the IR driver no longer sets AVEC_ENABLE.
	5. Enhance compatibility under certain configurations.
	v1->v2:
	1. Fix an erroneous enabling issue.
	v2->v3:
	1. Replace smp_call with address mapping to access registers.
	2. Fix some code style issues.


Tianyang Zhang (2):
  Docs/LoongArch: Add Advanced Extended-Redirect IRQ model description
  irq/irq-loongarch-ir:Add Redirect irqchip support

 .../arch/loongarch/irq-chip-model.rst         |  38 ++
 .../zh_CN/arch/loongarch/irq-chip-model.rst   |  37 ++
 arch/loongarch/include/asm/cpu-features.h     |   1 +
 arch/loongarch/include/asm/cpu.h              |   2 +
 arch/loongarch/include/asm/loongarch.h        |   6 +
 arch/loongarch/kernel/cpu-probe.c             |   2 +
 drivers/irqchip/Makefile                      |   2 +-
 drivers/irqchip/irq-loongarch-avec.c          |  20 +-
 drivers/irqchip/irq-loongarch-ir.c            | 562 ++++++++++++++++++
 drivers/irqchip/irq-loongson.h                |  12 +
 include/linux/cpuhotplug.h                    |   1 +
 11 files changed, 669 insertions(+), 14 deletions(-)
 create mode 100644 drivers/irqchip/irq-loongarch-ir.c

-- 
2.20.1



* [PATCH v3 1/2] Docs/LoongArch: Add Advanced Extended-Redirect IRQ model description
  2025-05-23 10:18 [PATCH v3 0/2] LoongArch irq-redirect support Tianyang Zhang
@ 2025-05-23 10:18 ` Tianyang Zhang
  2025-05-26  1:45   ` Yanteng Si
  2025-05-23 10:18 ` [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support Tianyang Zhang
  1 sibling, 1 reply; 7+ messages in thread
From: Tianyang Zhang @ 2025-05-23 10:18 UTC (permalink / raw)
  To: chenhuacai, kernel, corbet, alexs, si.yanteng, tglx, jiaxun.yang,
	peterz, wangliupu, lvjianmin, maobibo, siyanteng, gaosong,
	yangtiezhu
  Cc: loongarch, linux-doc, linux-kernel, Tianyang Zhang

Introduce the redirect interrupt controller. When the redirect interrupt
controller is enabled, the routing target of an MSI interrupt is no
longer a specific CPU and vector number but a specific redirect entry;
the actual CPU and vector number used are described by that entry.

Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
---
 .../arch/loongarch/irq-chip-model.rst         | 38 +++++++++++++++++++
 .../zh_CN/arch/loongarch/irq-chip-model.rst   | 37 ++++++++++++++++++
 2 files changed, 75 insertions(+)

diff --git a/Documentation/arch/loongarch/irq-chip-model.rst b/Documentation/arch/loongarch/irq-chip-model.rst
index a7ecce11e445..d9a2e8d7f70e 100644
--- a/Documentation/arch/loongarch/irq-chip-model.rst
+++ b/Documentation/arch/loongarch/irq-chip-model.rst
@@ -181,6 +181,44 @@ go to PCH-PIC/PCH-LPC and gathered by EIOINTC, and then go to CPUINTC directly::
              | Devices |
              +---------+
 
+Advanced Extended IRQ model (with redirection)
+==============================================
+
+In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupts go
+to CPUINTC directly, CPU UARTS interrupts go to LIOINTC, and PCH-MSI interrupts
+go to REDIRECT, which remaps them to AVECINTC and then delivers them to CPUINTC
+directly, while all other device interrupts go to PCH-PIC/PCH-LPC, are gathered
+by EIOINTC, and then go to CPUINTC directly::
+
+ +-----+     +-----------------------+     +-------+
+ | IPI | --> |        CPUINTC        | <-- | Timer |
+ +-----+     +-----------------------+     +-------+
+              ^          ^          ^
+              |          |          |
+       +---------+ +----------+ +---------+     +-------+
+       | EIOINTC | | AVECINTC | | LIOINTC | <-- | UARTs |
+       +---------+ +----------+ +---------+     +-------+
+            ^            ^
+            |            |
+            |      +----------+
+            |      | REDIRECT |
+            |      +----------+
+            |            ^
+            |            |
+       +---------+  +---------+
+       | PCH-PIC |  | PCH-MSI |
+       +---------+  +---------+
+         ^     ^           ^
+         |     |           |
+ +---------+ +---------+ +---------+
+ | Devices | | PCH-LPC | | Devices |
+ +---------+ +---------+ +---------+
+                  ^
+                  |
+             +---------+
+             | Devices |
+             +---------+
+
 ACPI-related definitions
 ========================
 
diff --git a/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst b/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
index d4ff80de47b6..7e4e3e55c7ad 100644
--- a/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
+++ b/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
@@ -174,6 +174,43 @@ CPU串口(UARTs)中断发送到LIOINTC,PCH-MSI中断发送到AVECINTC,
              | Devices |
              +---------+
 
+高级扩展IRQ模型 (带重定向)
+==========================
+
+在这种模型里面,IPI(Inter-Processor Interrupt)和CPU本地时钟中断直接发送到CPUINTC,
+CPU串口(UARTs)中断发送到LIOINTC,PCH-MSI中断首先发送到REDIRECT模块,完成重定向后发
+送到AVECINTC,而后通过AVECINTC直接送达CPUINTC,而其他所有设备的中断则分别发送到所连
+接的PCH-PIC/PCH-LPC,然后由EIOINTC统一收集,再直接到达CPUINTC::
+
+ +-----+     +-----------------------+     +-------+
+ | IPI | --> |        CPUINTC        | <-- | Timer |
+ +-----+     +-----------------------+     +-------+
+              ^          ^          ^
+              |          |          |
+       +---------+ +----------+ +---------+     +-------+
+       | EIOINTC | | AVECINTC | | LIOINTC | <-- | UARTs |
+       +---------+ +----------+ +---------+     +-------+
+            ^            ^
+            |            |
+            |      +----------+
+            |      | REDIRECT |
+            |      +----------+
+            |            ^
+            |            |
+       +---------+  +---------+
+       | PCH-PIC |  | PCH-MSI |
+       +---------+  +---------+
+         ^     ^           ^
+         |     |           |
+ +---------+ +---------+ +---------+
+ | Devices | | PCH-LPC | | Devices |
+ +---------+ +---------+ +---------+
+                  ^
+                  |
+             +---------+
+             | Devices |
+             +---------+
+
 ACPI相关的定义
 ==============
 
-- 
2.20.1



* [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support
  2025-05-23 10:18 [PATCH v3 0/2] LoongArch irq-redirect support Tianyang Zhang
  2025-05-23 10:18 ` [PATCH v3 1/2] Docs/LoongArch: Add Advanced Extended-Redirect IRQ model description Tianyang Zhang
@ 2025-05-23 10:18 ` Tianyang Zhang
  2025-05-24 14:12   ` Huacai Chen
  2025-05-25  9:06   ` Thomas Gleixner
  1 sibling, 2 replies; 7+ messages in thread
From: Tianyang Zhang @ 2025-05-23 10:18 UTC (permalink / raw)
  To: chenhuacai, kernel, corbet, alexs, si.yanteng, tglx, jiaxun.yang,
	peterz, wangliupu, lvjianmin, maobibo, siyanteng, gaosong,
	yangtiezhu
  Cc: loongarch, linux-doc, linux-kernel, Tianyang Zhang

The main function of the redirect interrupt controller is to manage the
redirected-interrupt table, which consists of many redirect entries.
When an MSI interrupt is requested, the driver creates a corresponding
redirect entry that describes the target CPU/vector number and the
operating mode of the interrupt. The redirect interrupt module has an
independent cache; during interrupt routing, entries that hit the cache
are used preferentially. The driver invalidates individual cached
entries via a command queue.

Co-developed-by: Liupu Wang <wangliupu@loongson.cn>
Signed-off-by: Liupu Wang <wangliupu@loongson.cn>
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
---
 arch/loongarch/include/asm/cpu-features.h |   1 +
 arch/loongarch/include/asm/cpu.h          |   2 +
 arch/loongarch/include/asm/loongarch.h    |   6 +
 arch/loongarch/kernel/cpu-probe.c         |   2 +
 drivers/irqchip/Makefile                  |   2 +-
 drivers/irqchip/irq-loongarch-avec.c      |  20 +-
 drivers/irqchip/irq-loongarch-ir.c        | 562 ++++++++++++++++++++++
 drivers/irqchip/irq-loongson.h            |  12 +
 include/linux/cpuhotplug.h                |   1 +
 9 files changed, 594 insertions(+), 14 deletions(-)
 create mode 100644 drivers/irqchip/irq-loongarch-ir.c

diff --git a/arch/loongarch/include/asm/cpu-features.h b/arch/loongarch/include/asm/cpu-features.h
index fc83bb32f9f0..03f7e93e81e0 100644
--- a/arch/loongarch/include/asm/cpu-features.h
+++ b/arch/loongarch/include/asm/cpu-features.h
@@ -68,5 +68,6 @@
 #define cpu_has_ptw		cpu_opt(LOONGARCH_CPU_PTW)
 #define cpu_has_lspw		cpu_opt(LOONGARCH_CPU_LSPW)
 #define cpu_has_avecint		cpu_opt(LOONGARCH_CPU_AVECINT)
+#define cpu_has_redirectint	cpu_opt(LOONGARCH_CPU_REDIRECTINT)
 
 #endif /* __ASM_CPU_FEATURES_H */
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index 98cf4d7b4b0a..33cd96e569d8 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -102,6 +102,7 @@ enum cpu_type_enum {
 #define CPU_FEATURE_PTW			27	/* CPU has hardware page table walker */
 #define CPU_FEATURE_LSPW		28	/* CPU has LSPW (lddir/ldpte instructions) */
 #define CPU_FEATURE_AVECINT		29	/* CPU has AVEC interrupt */
+#define CPU_FEATURE_REDIRECTINT		30      /* CPU has interrupt remap */
 
 #define LOONGARCH_CPU_CPUCFG		BIT_ULL(CPU_FEATURE_CPUCFG)
 #define LOONGARCH_CPU_LAM		BIT_ULL(CPU_FEATURE_LAM)
@@ -133,5 +134,6 @@ enum cpu_type_enum {
 #define LOONGARCH_CPU_PTW		BIT_ULL(CPU_FEATURE_PTW)
 #define LOONGARCH_CPU_LSPW		BIT_ULL(CPU_FEATURE_LSPW)
 #define LOONGARCH_CPU_AVECINT		BIT_ULL(CPU_FEATURE_AVECINT)
+#define LOONGARCH_CPU_REDIRECTINT	BIT_ULL(CPU_FEATURE_REDIRECTINT)
 
 #endif /* _ASM_CPU_H */
diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
index 52651aa0e583..95e06cb6831e 100644
--- a/arch/loongarch/include/asm/loongarch.h
+++ b/arch/loongarch/include/asm/loongarch.h
@@ -1130,6 +1130,7 @@
 #define  IOCSRF_FLATMODE		BIT_ULL(10)
 #define  IOCSRF_VM			BIT_ULL(11)
 #define  IOCSRF_AVEC			BIT_ULL(15)
+#define  IOCSRF_REDIRECTINT		BIT_ULL(16)
 
 #define LOONGARCH_IOCSR_VENDOR		0x10
 
@@ -1189,6 +1190,11 @@
 
 #define LOONGARCH_IOCSR_EXTIOI_NODEMAP_BASE	0x14a0
 #define LOONGARCH_IOCSR_EXTIOI_IPMAP_BASE	0x14c0
+#define LOONGARCH_IOCSR_REDIRECT_CFG		0x15e0
+#define LOONGARCH_IOCSR_REDIRECT_TBR		0x15e8  /* IRT BASE REG*/
+#define LOONGARCH_IOCSR_REDIRECT_CQB		0x15f0  /* IRT CACHE QUEUE BASE */
+#define LOONGARCH_IOCSR_REDIRECT_CQH		0x15f8  /* IRT CACHE QUEUE HEAD, 32bit */
+#define LOONGARCH_IOCSR_REDIRECT_CQT		0x15fc  /* IRT CACHE QUEUE TAIL, 32bit */
 #define LOONGARCH_IOCSR_EXTIOI_EN_BASE		0x1600
 #define LOONGARCH_IOCSR_EXTIOI_BOUNCE_BASE	0x1680
 #define LOONGARCH_IOCSR_EXTIOI_ISR_BASE		0x1800
diff --git a/arch/loongarch/kernel/cpu-probe.c b/arch/loongarch/kernel/cpu-probe.c
index fedaa67cde41..543474fd1399 100644
--- a/arch/loongarch/kernel/cpu-probe.c
+++ b/arch/loongarch/kernel/cpu-probe.c
@@ -289,6 +289,8 @@ static inline void cpu_probe_loongson(struct cpuinfo_loongarch *c, unsigned int
 		c->options |= LOONGARCH_CPU_EIODECODE;
 	if (config & IOCSRF_AVEC)
 		c->options |= LOONGARCH_CPU_AVECINT;
+	if (config & IOCSRF_REDIRECTINT)
+		c->options |= LOONGARCH_CPU_REDIRECTINT;
 	if (config & IOCSRF_VM)
 		c->options |= LOONGARCH_CPU_HYPERVISOR;
 }
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 365bcea9a61f..2bb8618f96d1 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -114,7 +114,7 @@ obj-$(CONFIG_LS1X_IRQ)			+= irq-ls1x.o
 obj-$(CONFIG_TI_SCI_INTR_IRQCHIP)	+= irq-ti-sci-intr.o
 obj-$(CONFIG_TI_SCI_INTA_IRQCHIP)	+= irq-ti-sci-inta.o
 obj-$(CONFIG_TI_PRUSS_INTC)		+= irq-pruss-intc.o
-obj-$(CONFIG_IRQ_LOONGARCH_CPU)		+= irq-loongarch-cpu.o irq-loongarch-avec.o
+obj-$(CONFIG_IRQ_LOONGARCH_CPU)		+= irq-loongarch-cpu.o irq-loongarch-avec.o irq-loongarch-ir.o
 obj-$(CONFIG_LOONGSON_LIOINTC)		+= irq-loongson-liointc.o
 obj-$(CONFIG_LOONGSON_EIOINTC)		+= irq-loongson-eiointc.o
 obj-$(CONFIG_LOONGSON_HTPIC)		+= irq-loongson-htpic.o
diff --git a/drivers/irqchip/irq-loongarch-avec.c b/drivers/irqchip/irq-loongarch-avec.c
index 80e55955a29f..7f4a671038ee 100644
--- a/drivers/irqchip/irq-loongarch-avec.c
+++ b/drivers/irqchip/irq-loongarch-avec.c
@@ -24,7 +24,6 @@
 #define VECTORS_PER_REG		64
 #define IRR_VECTOR_MASK		0xffUL
 #define IRR_INVALID_MASK	0x80000000UL
-#define AVEC_MSG_OFFSET		0x100000
 
 #ifdef CONFIG_SMP
 struct pending_list {
@@ -47,15 +46,6 @@ struct avecintc_chip {
 
 static struct avecintc_chip loongarch_avec;
 
-struct avecintc_data {
-	struct list_head	entry;
-	unsigned int		cpu;
-	unsigned int		vec;
-	unsigned int		prev_cpu;
-	unsigned int		prev_vec;
-	unsigned int		moving;
-};
-
 static inline void avecintc_enable(void)
 {
 	u64 value;
@@ -85,7 +75,7 @@ static inline void pending_list_init(int cpu)
 	INIT_LIST_HEAD(&plist->head);
 }
 
-static void avecintc_sync(struct avecintc_data *adata)
+void avecintc_sync(struct avecintc_data *adata)
 {
 	struct pending_list *plist;
 
@@ -109,7 +99,7 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
 			return -EBUSY;
 
 		if (cpu_online(adata->cpu) && cpumask_test_cpu(adata->cpu, dest))
-			return 0;
+			return IRQ_SET_MASK_OK_DONE;
 
 		cpumask_and(&intersect_mask, dest, cpu_online_mask);
 
@@ -121,7 +111,8 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
 		adata->cpu = cpu;
 		adata->vec = vector;
 		per_cpu_ptr(irq_map, adata->cpu)[adata->vec] = irq_data_to_desc(data);
-		avecintc_sync(adata);
+		if (!cpu_has_redirectint)
+			avecintc_sync(adata);
 	}
 
 	irq_data_update_effective_affinity(data, cpumask_of(cpu));
@@ -412,6 +403,9 @@ static int __init pch_msi_parse_madt(union acpi_subtable_headers *header,
 
 static inline int __init acpi_cascade_irqdomain_init(void)
 {
+	if (cpu_has_redirectint)
+		return redirect_acpi_init(loongarch_avec.domain);
+
 	return acpi_table_parse_madt(ACPI_MADT_TYPE_MSI_PIC, pch_msi_parse_madt, 1);
 }
 
diff --git a/drivers/irqchip/irq-loongarch-ir.c b/drivers/irqchip/irq-loongarch-ir.c
new file mode 100644
index 000000000000..ac1ee3f78aa4
--- /dev/null
+++ b/drivers/irqchip/irq-loongarch-ir.c
@@ -0,0 +1,562 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020 Loongson Technologies, Inc.
+ */
+
+#include <linux/cpuhotplug.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqdomain.h>
+#include <linux/spinlock.h>
+#include <linux/msi.h>
+
+#include <asm/irq.h>
+#include <asm/loongarch.h>
+#include <asm/setup.h>
+#include <larchintrin.h>
+
+#include "irq-loongson.h"
+#include "irq-msi-lib.h"
+
+#define IRD_ENTRIES			65536
+
+/* redirect entry size 128bits */
+#define IRD_PAGE_ORDER			(20 - PAGE_SHIFT)
+
+/* irt cache invalid queue */
+#define	INVALID_QUEUE_SIZE		4096
+
+#define INVALID_QUEUE_PAGE_ORDER	(16 - PAGE_SHIFT)
+
+#define GPID_ADDR_MASK			0x3ffffffffffULL
+#define GPID_ADDR_SHIFT			6
+
+#define CQB_SIZE_SHIFT			0
+#define CQB_SIZE_MASK			0xf
+#define CQB_ADDR_SHIFT			12
+#define CQB_ADDR_MASK			(0xfffffffffULL)
+
+#define CFG_DISABLE_IDLE		2
+#define INVALID_INDEX			0
+
+#define MAX_IR_ENGINES			16
+
+struct irq_domain *redirect_domain;
+
+struct redirect_entry {
+	struct  {
+		__u64	valid	: 1,
+			res1	: 5,
+			gpid	: 42,
+			res2	: 8,
+			vector	: 8;
+	}	lo;
+	__u64	hi;
+};
+
+struct redirect_gpid {
+	u64	pir[4];      /* Pending interrupt requested */
+	u8	en	: 1, /* doorbell */
+		res0	: 7;
+	u8	irqnum;
+	u16	res1;
+	u32	dst;
+	u32	rsvd[6];
+} __aligned(64);
+
+struct redirect_table {
+	int			node;
+	struct redirect_entry	*table;
+	unsigned long		*bitmap;
+	unsigned int		nr_ird;
+	struct page		*page;
+	raw_spinlock_t		lock;
+};
+
+struct redirect_item {
+	int			index;
+	struct redirect_entry	*entry;
+	struct redirect_gpid	*gpid;
+	struct redirect_table	*table;
+};
+
+struct redirect_queue {
+	int		node;
+	u64		base;
+	u32		max_size;
+	int		head;
+	int		tail;
+	struct page	*page;
+	raw_spinlock_t	lock;
+};
+
+struct irde_desc {
+	struct redirect_table	ird_table;
+	struct redirect_queue	inv_queue;
+};
+
+struct irde_inv_cmd {
+	union {
+		__u64	cmd_info;
+		struct {
+			__u64	res1		: 4,
+				type		: 1,
+				need_notice	: 1,
+				pad		: 2,
+				index		: 16,
+				pad2		: 40;
+		}	index;
+	};
+	__u64		notice_addr;
+};
+
+static struct irde_desc irde_descs[MAX_IR_ENGINES];
+static phys_addr_t msi_base_addr;
+static phys_addr_t redirect_reg_base = 0x1fe00000;
+
+#define REDIRECT_REG_BASE(reg, node) \
+	(UNCACHE_BASE | redirect_reg_base | (u64)(node) << NODE_ADDRSPACE_SHIFT | (reg))
+#define	redirect_reg_queue_head(node)	REDIRECT_REG_BASE(LOONGARCH_IOCSR_REDIRECT_CQH, (node))
+#define	redirect_reg_queue_tail(node)	REDIRECT_REG_BASE(LOONGARCH_IOCSR_REDIRECT_CQT, (node))
+#define read_queue_head(node)		(*((u32 *)(redirect_reg_queue_head(node))))
+#define read_queue_tail(node)		(*((u32 *)(redirect_reg_queue_tail(node))))
+#define write_queue_tail(node, val)	(*((u32 *)(redirect_reg_queue_tail(node))) = (val))
+
+static inline bool invalid_queue_is_full(int node, u32 *tail)
+{
+	u32 head;
+
+	head = read_queue_head(node);
+	*tail = read_queue_tail(node);
+
+	return !!(head == ((*tail + 1) % INVALID_QUEUE_SIZE));
+}
+
+static void invalid_enqueue(struct redirect_queue *rqueue, struct irde_inv_cmd *cmd)
+{
+	struct irde_inv_cmd *inv_addr;
+	u32 tail;
+
+	guard(raw_spinlock_irqsave)(&rqueue->lock);
+
+	while (invalid_queue_is_full(rqueue->node, &tail))
+		cpu_relax();
+
+	inv_addr = (struct irde_inv_cmd *)(rqueue->base + tail * sizeof(struct irde_inv_cmd));
+	memcpy(inv_addr, cmd, sizeof(struct irde_inv_cmd));
+	tail = (tail + 1) % INVALID_QUEUE_SIZE;
+
+	/*
+	 * Uncached accesses may be reordered against cached-memory accesses, so a
+	 * barrier is needed to make the command visible before the tail is updated.
+	 */
+	wmb();
+
+	write_queue_tail(rqueue->node, tail);
+}
+
+static void irde_invlid_entry_node(struct redirect_item *item)
+{
+	struct redirect_queue *rqueue;
+	struct irde_inv_cmd cmd;
+	volatile u64 raddr = 0;
+	int node = item->table->node;
+
+	rqueue = &(irde_descs[node].inv_queue);
+	cmd.cmd_info = 0;
+	cmd.index.type = INVALID_INDEX;
+	cmd.index.need_notice = 1;
+	cmd.index.index = item->index;
+	cmd.notice_addr = (u64)(__pa(&raddr));
+
+	invalid_enqueue(rqueue, &cmd);
+
+	while (!raddr)
+		cpu_relax();
+
+}
+
+static inline struct avecintc_data *irq_data_get_avec_data(struct irq_data *data)
+{
+	return data->parent_data->chip_data;
+}
+
+static int redirect_table_alloc(struct redirect_item *item, struct redirect_table *ird_table)
+{
+	int index;
+
+	guard(raw_spinlock_irqsave)(&ird_table->lock);
+
+	index = find_first_zero_bit(ird_table->bitmap, IRD_ENTRIES);
+	if (index > IRD_ENTRIES) {
+		pr_err("No redirect entry to use\n");
+		return -ENOMEM;
+	}
+
+	__set_bit(index, ird_table->bitmap);
+
+	item->index = index;
+	item->entry = &ird_table->table[index];
+	item->table = ird_table;
+
+	return 0;
+}
+
+static int redirect_table_free(struct redirect_item *item)
+{
+	struct redirect_table *ird_table;
+	struct redirect_entry *entry;
+
+	ird_table = item->table;
+
+	entry = item->entry;
+	memset(entry, 0, sizeof(struct redirect_entry));
+
+	scoped_guard(raw_spinlock_irqsave, &ird_table->lock)
+		bitmap_release_region(ird_table->bitmap, item->index, 0);
+
+	kfree(item->gpid);
+
+	irde_invlid_entry_node(item);
+
+	return 0;
+}
+
+static inline void redirect_domain_prepare_entry(struct redirect_item *item,
+					struct avecintc_data *adata)
+{
+	struct redirect_entry *entry = item->entry;
+
+	item->gpid->en = 1;
+	item->gpid->irqnum = adata->vec;
+	item->gpid->dst = adata->cpu;
+
+	entry->lo.valid = 1;
+	entry->lo.gpid = ((long)item->gpid >> GPID_ADDR_SHIFT) & (GPID_ADDR_MASK);
+	entry->lo.vector = 0xff;
+}
+
+static int redirect_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force)
+{
+	struct redirect_item *item = data->chip_data;
+	struct avecintc_data *adata;
+	int ret;
+
+	ret = irq_chip_set_affinity_parent(data, dest, force);
+	if (ret == IRQ_SET_MASK_OK_DONE) {
+		return ret;
+	} else if (ret) {
+		pr_err("IRDE:set_affinity error %d\n", ret);
+		return ret;
+	}
+
+	adata = irq_data_get_avec_data(data);
+	redirect_domain_prepare_entry(item, adata);
+	irde_invlid_entry_node(item);
+	avecintc_sync(adata);
+
+	return IRQ_SET_MASK_OK;
+}
+
+static void redirect_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
+{
+	struct redirect_item *item;
+
+	item = irq_data_get_irq_chip_data(d);
+	msg->address_lo = (msi_base_addr | 1 << 2 | ((item->index & 0xffff) << 4));
+	msg->address_hi = 0x0;
+	msg->data = 0x0;
+}
+
+static inline void redirect_ack_irq(struct irq_data *d)
+{
+}
+
+static inline void redirect_unmask_irq(struct irq_data *d)
+{
+}
+
+static inline void redirect_mask_irq(struct irq_data *d)
+{
+}
+
+static struct irq_chip loongarch_redirect_chip = {
+	.name			= "REDIRECT",
+	.irq_ack		= redirect_ack_irq,
+	.irq_mask		= redirect_mask_irq,
+	.irq_unmask		= redirect_unmask_irq,
+	.irq_set_affinity	= redirect_set_affinity,
+	.irq_compose_msi_msg	= redirect_compose_msi_msg,
+};
+
+static void redirect_free_resources(struct irq_domain *domain, unsigned int virq,
+				unsigned int nr_irqs)
+{
+	for (int i = 0; i < nr_irqs; i++) {
+		struct irq_data *irq_data;
+
+		irq_data = irq_domain_get_irq_data(domain, virq  + i);
+		if (irq_data && irq_data->chip_data) {
+			struct redirect_item *item;
+
+			item = irq_data->chip_data;
+			redirect_table_free(item);
+			kfree(item);
+		}
+	}
+}
+
+static int redirect_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			unsigned int nr_irqs, void *arg)
+{
+	struct redirect_table *ird_table;
+	struct avecintc_data *avec_data;
+	struct irq_data *irq_data;
+	msi_alloc_info_t *info;
+	int ret, i, node;
+
+	info = (msi_alloc_info_t *)arg;
+	node = dev_to_node(info->desc->dev);
+	ird_table = &irde_descs[node].ird_table;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		struct redirect_item *item;
+
+		item = kzalloc(sizeof(struct redirect_item), GFP_KERNEL);
+		if (!item) {
+			pr_err("Alloc redirect descriptor failed\n");
+			goto out_free_resources;
+		}
+
+		irq_data = irq_domain_get_irq_data(domain, virq + i);
+
+		avec_data = irq_data_get_avec_data(irq_data);
+		ret = redirect_table_alloc(item, ird_table);
+		if (ret) {
+			pr_err("Alloc redirect table entry failed\n");
+			goto out_free_resources;
+		}
+
+		item->gpid = kzalloc_node(sizeof(struct redirect_gpid), GFP_KERNEL, node);
+		if (!item->gpid) {
+			pr_err("Alloc redirect GPID failed\n");
+			goto out_free_resources;
+		}
+
+		irq_data->chip_data = item;
+		irq_data->chip = &loongarch_redirect_chip;
+		redirect_domain_prepare_entry(item, avec_data);
+	}
+	return 0;
+
+out_free_resources:
+	redirect_free_resources(domain, virq, nr_irqs);
+	irq_domain_free_irqs_common(domain, virq, nr_irqs);
+
+	return -EINVAL;
+}
+
+static void redirect_domain_free(struct irq_domain *domain, unsigned int virq, unsigned int nr_irqs)
+{
+	redirect_free_resources(domain, virq, nr_irqs);
+	return irq_domain_free_irqs_common(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops redirect_domain_ops = {
+	.alloc		= redirect_domain_alloc,
+	.free		= redirect_domain_free,
+	.select		= msi_lib_irq_domain_select,
+};
+
+static int redirect_queue_init(int node)
+{
+	struct redirect_queue *rqueue = &(irde_descs[node].inv_queue);
+	struct page *pages;
+
+	pages = alloc_pages_node(0, GFP_KERNEL | __GFP_ZERO, INVALID_QUEUE_PAGE_ORDER);
+	if (!pages) {
+		pr_err("Node [%d] Invalid Queue alloc pages failed!\n", node);
+		return -ENOMEM;
+	}
+
+	rqueue->page = pages;
+	rqueue->base = (u64)page_address(pages);
+	rqueue->max_size = INVALID_QUEUE_SIZE;
+	rqueue->head = 0;
+	rqueue->tail = 0;
+	rqueue->node = node;
+	raw_spin_lock_init(&rqueue->lock);
+
+	iocsr_write32(0, LOONGARCH_IOCSR_REDIRECT_CQH);
+	iocsr_write32(0, LOONGARCH_IOCSR_REDIRECT_CQT);
+	iocsr_write64(((rqueue->base & (CQB_ADDR_MASK << CQB_ADDR_SHIFT)) |
+				(CQB_SIZE_MASK << CQB_SIZE_SHIFT)), LOONGARCH_IOCSR_REDIRECT_CQB);
+	return 0;
+}
+
+static int redirect_table_init(int node)
+{
+	struct redirect_table *ird_table = &(irde_descs[node].ird_table);
+	unsigned long *bitmap;
+	struct page *pages;
+
+	pages = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, IRD_PAGE_ORDER);
+	if (!pages) {
+		pr_err("Node [%d] redirect table alloc pages failed!\n", node);
+		return -ENOMEM;
+	}
+	ird_table->page = pages;
+	ird_table->table = page_address(pages);
+
+	bitmap = bitmap_zalloc(IRD_ENTRIES, GFP_KERNEL);
+	if (!bitmap) {
+		pr_err("Node [%d] redirect table bitmap alloc pages failed!\n", node);
+		return -ENOMEM;
+	}
+
+	ird_table->bitmap = bitmap;
+	ird_table->nr_ird = IRD_ENTRIES;
+	ird_table->node = node;
+
+	raw_spin_lock_init(&ird_table->lock);
+
+	if (redirect_queue_init(node))
+		return -EINVAL;
+
+	iocsr_write64(CFG_DISABLE_IDLE, LOONGARCH_IOCSR_REDIRECT_CFG);
+	iocsr_write64(__pa(ird_table->table), LOONGARCH_IOCSR_REDIRECT_TBR);
+
+	return 0;
+}
+
+static void redirect_table_fini(int node)
+{
+	struct redirect_table *ird_table = &(irde_descs[node].ird_table);
+	struct redirect_queue *rqueue = &(irde_descs[node].inv_queue);
+
+	if (ird_table->page) {
+		__free_pages(ird_table->page, IRD_PAGE_ORDER);
+		ird_table->table = NULL;
+		ird_table->page = NULL;
+	}
+
+	if (ird_table->page) {
+		bitmap_free(ird_table->bitmap);
+		ird_table->bitmap = NULL;
+	}
+
+	if (rqueue->page) {
+		__free_pages(rqueue->page, INVALID_QUEUE_PAGE_ORDER);
+		rqueue->page = NULL;
+		rqueue->base = 0;
+	}
+
+	iocsr_write64(0, LOONGARCH_IOCSR_REDIRECT_CQB);
+	iocsr_write64(0, LOONGARCH_IOCSR_REDIRECT_TBR);
+}
+
+static int redirect_cpu_online(unsigned int cpu)
+{
+	int ret, node = cpu_to_node(cpu);
+
+	if (cpu != cpumask_first(cpumask_of_node(node)))
+		return 0;
+
+	ret = redirect_table_init(node);
+	if (ret) {
+		redirect_table_fini(node);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+#if defined(CONFIG_ACPI)
+static int __init redirect_reg_base_init(void)
+{
+	acpi_status status;
+	uint64_t addr = 0;
+
+	if (acpi_disabled)
+		return 0;
+
+	status = acpi_evaluate_integer(NULL, "\\_SB.NO00", NULL, &addr);
+	if (ACPI_FAILURE(status) || !addr)
+		pr_info("redirect_iocsr_base used default 0x1fe00000\n");
+	else
+		redirect_reg_base = addr;
+
+	return 0;
+}
+subsys_initcall_sync(redirect_reg_base_init);
+
+static int __init pch_msi_parse_madt(union acpi_subtable_headers *header,
+		const unsigned long end)
+{
+	struct acpi_madt_msi_pic *pchmsi_entry = (struct acpi_madt_msi_pic *)header;
+
+	msi_base_addr = pchmsi_entry->msg_address - AVEC_MSG_OFFSET;
+
+	return pch_msi_acpi_init_avec(redirect_domain);
+}
+
+static int __init acpi_cascade_irqdomain_init(void)
+{
+	return acpi_table_parse_madt(ACPI_MADT_TYPE_MSI_PIC, pch_msi_parse_madt, 1);
+}
+
+int __init redirect_acpi_init(struct irq_domain *parent)
+{
+	struct fwnode_handle *fwnode;
+	struct irq_domain *domain;
+	int ret;
+
+	fwnode = irq_domain_alloc_named_fwnode("redirect");
+	if (!fwnode) {
+		pr_err("Unable to alloc redirect domain handle\n");
+		goto fail;
+	}
+
+	domain = irq_domain_create_hierarchy(parent, 0, IRD_ENTRIES, fwnode,
+			&redirect_domain_ops, irde_descs);
+	if (!domain) {
+		pr_err("Unable to alloc redirect domain\n");
+		goto out_free_fwnode;
+	}
+
+	redirect_domain = domain;
+
+	ret = redirect_table_init(0);
+	if (ret)
+		goto out_free_table;
+
+	ret = acpi_cascade_irqdomain_init();
+	if (ret < 0) {
+		pr_err("Failed to cascade IRQ domain, ret=%d\n", ret);
+		goto out_free_table;
+	}
+
+	cpuhp_setup_state_nocalls(CPUHP_AP_IRQ_REDIRECT_STARTING,
+				  "irqchip/loongarch/redirect:starting",
+				  redirect_cpu_online, NULL);
+
+	pr_info("loongarch irq redirect modules init succeeded\n");
+	return 0;
+
+out_free_table:
+	redirect_table_fini(0);
+	irq_domain_remove(redirect_domain);
+	redirect_domain = NULL;
+out_free_fwnode:
+	irq_domain_free_fwnode(fwnode);
+fail:
+	return -EINVAL;
+}
+#endif
diff --git a/drivers/irqchip/irq-loongson.h b/drivers/irqchip/irq-loongson.h
index 11fa138d1f44..05ad40ffb62b 100644
--- a/drivers/irqchip/irq-loongson.h
+++ b/drivers/irqchip/irq-loongson.h
@@ -5,6 +5,15 @@
 
 #ifndef _DRIVERS_IRQCHIP_IRQ_LOONGSON_H
 #define _DRIVERS_IRQCHIP_IRQ_LOONGSON_H
+#define AVEC_MSG_OFFSET		0x100000
+struct avecintc_data {
+	struct list_head        entry;
+	unsigned int            cpu;
+	unsigned int            vec;
+	unsigned int            prev_cpu;
+	unsigned int            prev_vec;
+	unsigned int            moving;
+};
 
 int find_pch_pic(u32 gsi);
 
@@ -24,4 +33,7 @@ int pch_msi_acpi_init(struct irq_domain *parent,
 					struct acpi_madt_msi_pic *acpi_pchmsi);
 int pch_msi_acpi_init_avec(struct irq_domain *parent);
 
+int redirect_acpi_init(struct irq_domain *parent);
+
+void avecintc_sync(struct avecintc_data *adata);
 #endif /* _DRIVERS_IRQCHIP_IRQ_LOONGSON_H */
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 1987400000b4..6a4ff072db42 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -145,6 +145,7 @@ enum cpuhp_state {
 	CPUHP_AP_IRQ_MIPS_GIC_STARTING,
 	CPUHP_AP_IRQ_EIOINTC_STARTING,
 	CPUHP_AP_IRQ_AVECINTC_STARTING,
+	CPUHP_AP_IRQ_REDIRECT_STARTING,
 	CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
 	CPUHP_AP_IRQ_THEAD_ACLINT_SSWI_STARTING,
 	CPUHP_AP_IRQ_RISCV_IMSIC_STARTING,
-- 
2.20.1



* Re: [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support
  2025-05-23 10:18 ` [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support Tianyang Zhang
@ 2025-05-24 14:12   ` Huacai Chen
  2025-05-25  9:06   ` Thomas Gleixner
  1 sibling, 0 replies; 7+ messages in thread
From: Huacai Chen @ 2025-05-24 14:12 UTC (permalink / raw)
  To: Tianyang Zhang
  Cc: kernel, corbet, alexs, si.yanteng, tglx, jiaxun.yang, peterz,
	wangliupu, lvjianmin, maobibo, siyanteng, gaosong, yangtiezhu,
	loongarch, linux-doc, linux-kernel

Hi, Tianyang,

On Fri, May 23, 2025 at 6:18 PM Tianyang Zhang
<zhangtianyang@loongson.cn> wrote:
>
> The main function of the Redirected interrupt controller is to manage the
> redirected-interrupt table, which consists of many redirected entries.
> When MSI interrupts are requested, the driver creates a corresponding
> redirected entry that describes the target CPU/vector number and the
> operating mode of the interrupt. The redirected interrupt module has an
> independent cache, and during the interrupt routing process, it will
> prioritize the redirected entries that hit the cache. The driver
> invalidates certain entry caches via a command queue.
>
> Co-developed-by: Liupu Wang <wangliupu@loongson.cn>
> Signed-off-by: Liupu Wang <wangliupu@loongson.cn>
> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
> ---
>  arch/loongarch/include/asm/cpu-features.h |   1 +
>  arch/loongarch/include/asm/cpu.h          |   2 +
>  arch/loongarch/include/asm/loongarch.h    |   6 +
>  arch/loongarch/kernel/cpu-probe.c         |   2 +
>  drivers/irqchip/Makefile                  |   2 +-
>  drivers/irqchip/irq-loongarch-avec.c      |  20 +-
>  drivers/irqchip/irq-loongarch-ir.c        | 562 ++++++++++++++++++++++
>  drivers/irqchip/irq-loongson.h            |  12 +
>  include/linux/cpuhotplug.h                |   1 +
>  9 files changed, 594 insertions(+), 14 deletions(-)
>  create mode 100644 drivers/irqchip/irq-loongarch-ir.c
>
> diff --git a/arch/loongarch/include/asm/cpu-features.h b/arch/loongarch/include/asm/cpu-features.h
> index fc83bb32f9f0..03f7e93e81e0 100644
> --- a/arch/loongarch/include/asm/cpu-features.h
> +++ b/arch/loongarch/include/asm/cpu-features.h
> @@ -68,5 +68,6 @@
>  #define cpu_has_ptw            cpu_opt(LOONGARCH_CPU_PTW)
>  #define cpu_has_lspw           cpu_opt(LOONGARCH_CPU_LSPW)
>  #define cpu_has_avecint                cpu_opt(LOONGARCH_CPU_AVECINT)
> +#define cpu_has_redirectint    cpu_opt(LOONGARCH_CPU_REDIRECTINT)
>
>  #endif /* __ASM_CPU_FEATURES_H */
> diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
> index 98cf4d7b4b0a..33cd96e569d8 100644
> --- a/arch/loongarch/include/asm/cpu.h
> +++ b/arch/loongarch/include/asm/cpu.h
> @@ -102,6 +102,7 @@ enum cpu_type_enum {
>  #define CPU_FEATURE_PTW                        27      /* CPU has hardware page table walker */
>  #define CPU_FEATURE_LSPW               28      /* CPU has LSPW (lddir/ldpte instructions) */
>  #define CPU_FEATURE_AVECINT            29      /* CPU has AVEC interrupt */
> +#define CPU_FEATURE_REDIRECTINT                30      /* CPU has interrupt remmap */
>
>  #define LOONGARCH_CPU_CPUCFG           BIT_ULL(CPU_FEATURE_CPUCFG)
>  #define LOONGARCH_CPU_LAM              BIT_ULL(CPU_FEATURE_LAM)
> @@ -133,5 +134,6 @@ enum cpu_type_enum {
>  #define LOONGARCH_CPU_PTW              BIT_ULL(CPU_FEATURE_PTW)
>  #define LOONGARCH_CPU_LSPW             BIT_ULL(CPU_FEATURE_LSPW)
>  #define LOONGARCH_CPU_AVECINT          BIT_ULL(CPU_FEATURE_AVECINT)
> +#define LOONGARCH_CPU_REDIRECTINT      BIT_ULL(CPU_FEATURE_REDIRECTINT)
>
>  #endif /* _ASM_CPU_H */
> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
> index 52651aa0e583..95e06cb6831e 100644
> --- a/arch/loongarch/include/asm/loongarch.h
> +++ b/arch/loongarch/include/asm/loongarch.h
> @@ -1130,6 +1130,7 @@
>  #define  IOCSRF_FLATMODE               BIT_ULL(10)
>  #define  IOCSRF_VM                     BIT_ULL(11)
>  #define  IOCSRF_AVEC                   BIT_ULL(15)
> +#define  IOCSRF_REDIRECTINT            BIT_ULL(16)
>
>  #define LOONGARCH_IOCSR_VENDOR         0x10
>
> @@ -1189,6 +1190,11 @@
>
>  #define LOONGARCH_IOCSR_EXTIOI_NODEMAP_BASE    0x14a0
>  #define LOONGARCH_IOCSR_EXTIOI_IPMAP_BASE      0x14c0
> +#define LOONGARCH_IOCSR_REDIRECT_CFG           0x15e0
> +#define LOONGARCH_IOCSR_REDIRECT_TBR           0x15e8  /* IRT BASE REG*/
> +#define LOONGARCH_IOCSR_REDIRECT_CQB           0x15f0  /* IRT CACHE QUEUE BASE */
> +#define LOONGARCH_IOCSR_REDIRECT_CQH           0x15f8  /* IRT CACHE QUEUE HEAD, 32bit */
> +#define LOONGARCH_IOCSR_REDIRECT_CQT           0x15fc  /* IRT CACHE QUEUE TAIL, 32bit */
>  #define LOONGARCH_IOCSR_EXTIOI_EN_BASE         0x1600
>  #define LOONGARCH_IOCSR_EXTIOI_BOUNCE_BASE     0x1680
>  #define LOONGARCH_IOCSR_EXTIOI_ISR_BASE                0x1800
> diff --git a/arch/loongarch/kernel/cpu-probe.c b/arch/loongarch/kernel/cpu-probe.c
> index fedaa67cde41..543474fd1399 100644
> --- a/arch/loongarch/kernel/cpu-probe.c
> +++ b/arch/loongarch/kernel/cpu-probe.c
> @@ -289,6 +289,8 @@ static inline void cpu_probe_loongson(struct cpuinfo_loongarch *c, unsigned int
>                 c->options |= LOONGARCH_CPU_EIODECODE;
>         if (config & IOCSRF_AVEC)
>                 c->options |= LOONGARCH_CPU_AVECINT;
> +       if (config & IOCSRF_REDIRECTINT)
> +               c->options |= LOONGARCH_CPU_REDIRECTINT;
>         if (config & IOCSRF_VM)
>                 c->options |= LOONGARCH_CPU_HYPERVISOR;
>  }
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 365bcea9a61f..2bb8618f96d1 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -114,7 +114,7 @@ obj-$(CONFIG_LS1X_IRQ)                      += irq-ls1x.o
>  obj-$(CONFIG_TI_SCI_INTR_IRQCHIP)      += irq-ti-sci-intr.o
>  obj-$(CONFIG_TI_SCI_INTA_IRQCHIP)      += irq-ti-sci-inta.o
>  obj-$(CONFIG_TI_PRUSS_INTC)            += irq-pruss-intc.o
> -obj-$(CONFIG_IRQ_LOONGARCH_CPU)                += irq-loongarch-cpu.o irq-loongarch-avec.o
> +obj-$(CONFIG_IRQ_LOONGARCH_CPU)                += irq-loongarch-cpu.o irq-loongarch-avec.o irq-loongarch-ir.o
>  obj-$(CONFIG_LOONGSON_LIOINTC)         += irq-loongson-liointc.o
>  obj-$(CONFIG_LOONGSON_EIOINTC)         += irq-loongson-eiointc.o
>  obj-$(CONFIG_LOONGSON_HTPIC)           += irq-loongson-htpic.o
> diff --git a/drivers/irqchip/irq-loongarch-avec.c b/drivers/irqchip/irq-loongarch-avec.c
> index 80e55955a29f..7f4a671038ee 100644
> --- a/drivers/irqchip/irq-loongarch-avec.c
> +++ b/drivers/irqchip/irq-loongarch-avec.c
> @@ -24,7 +24,6 @@
>  #define VECTORS_PER_REG                64
>  #define IRR_VECTOR_MASK                0xffUL
>  #define IRR_INVALID_MASK       0x80000000UL
> -#define AVEC_MSG_OFFSET                0x100000
>
>  #ifdef CONFIG_SMP
>  struct pending_list {
> @@ -47,15 +46,6 @@ struct avecintc_chip {
>
>  static struct avecintc_chip loongarch_avec;
>
> -struct avecintc_data {
> -       struct list_head        entry;
> -       unsigned int            cpu;
> -       unsigned int            vec;
> -       unsigned int            prev_cpu;
> -       unsigned int            prev_vec;
> -       unsigned int            moving;
> -};
> -
>  static inline void avecintc_enable(void)
>  {
>         u64 value;
> @@ -85,7 +75,7 @@ static inline void pending_list_init(int cpu)
>         INIT_LIST_HEAD(&plist->head);
>  }
>
> -static void avecintc_sync(struct avecintc_data *adata)
> +void avecintc_sync(struct avecintc_data *adata)
>  {
>         struct pending_list *plist;
>
> @@ -109,7 +99,7 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
>                         return -EBUSY;
>
>                 if (cpu_online(adata->cpu) && cpumask_test_cpu(adata->cpu, dest))
> -                       return 0;
> +                       return IRQ_SET_MASK_OK_DONE;
>
>                 cpumask_and(&intersect_mask, dest, cpu_online_mask);
>
> @@ -121,7 +111,8 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
>                 adata->cpu = cpu;
>                 adata->vec = vector;
>                 per_cpu_ptr(irq_map, adata->cpu)[adata->vec] = irq_data_to_desc(data);
> -               avecintc_sync(adata);
> +               if (!cpu_has_redirectint)
> +                       avecintc_sync(adata);
>         }
>
>         irq_data_update_effective_affinity(data, cpumask_of(cpu));
> @@ -412,6 +403,9 @@ static int __init pch_msi_parse_madt(union acpi_subtable_headers *header,
>
>  static inline int __init acpi_cascade_irqdomain_init(void)
>  {
> +       if (cpu_has_redirectint)
> +               return redirect_acpi_init(loongarch_avec.domain);
> +
>         return acpi_table_parse_madt(ACPI_MADT_TYPE_MSI_PIC, pch_msi_parse_madt, 1);
>  }
>
> diff --git a/drivers/irqchip/irq-loongarch-ir.c b/drivers/irqchip/irq-loongarch-ir.c
> new file mode 100644
> index 000000000000..ac1ee3f78aa4
> --- /dev/null
> +++ b/drivers/irqchip/irq-loongarch-ir.c
> @@ -0,0 +1,562 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2020 Loongson Technologies, Inc.
> + */
> +
> +#include <linux/cpuhotplug.h>
> +#include <linux/init.h>
> +#include <linux/interrupt.h>
> +#include <linux/kernel.h>
> +#include <linux/irq.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqdomain.h>
> +#include <linux/spinlock.h>
> +#include <linux/msi.h>
> +
> +#include <asm/irq.h>
> +#include <asm/loongarch.h>
> +#include <asm/setup.h>
> +#include <larchintrin.h>
> +
> +#include "irq-loongson.h"
> +#include "irq-msi-lib.h"
> +
> +#define IRD_ENTRIES                    65536
> +
> +/* redirect entry size 128bits */
> +#define IRD_PAGE_ORDER                 (20 - PAGE_SHIFT)
> +
> +/* irt cache invalid queue */
> +#define        INVALID_QUEUE_SIZE              4096
> +
> +#define INVALID_QUEUE_PAGE_ORDER       (16 - PAGE_SHIFT)
> +
> +#define GPID_ADDR_MASK                 0x3ffffffffffULL
> +#define GPID_ADDR_SHIFT                        6
> +
> +#define CQB_SIZE_SHIFT                 0
> +#define CQB_SIZE_MASK                  0xf
> +#define CQB_ADDR_SHIFT                 12
> +#define CQB_ADDR_MASK                  (0xfffffffffULL)
> +
> +#define CFG_DISABLE_IDLE               2
> +#define INVALID_INDEX                  0
> +
> +#define MAX_IR_ENGINES                 16
> +
> +struct irq_domain *redirect_domain;
> +
> +struct redirect_entry {
> +       struct  {
> +               __u64   valid   : 1,
> +                       res1    : 5,
> +                       gpid    : 42,
> +                       res2    : 8,
> +                       vector  : 8;
> +       }       lo;
> +       __u64   hi;
> +};
> +
> +struct redirect_gpid {
> +       u64     pir[4];      /* Pending interrupt requested */
> +       u8      en      : 1, /* doorbell */
> +               res0    : 7;
> +       u8      irqnum;
> +       u16     res1;
> +       u32     dst;
> +       u32     rsvd[6];
> +} __aligned(64);
> +
> +struct redirect_table {
> +       int                     node;
> +       struct redirect_entry   *table;
> +       unsigned long           *bitmap;
> +       unsigned int            nr_ird;
> +       struct page             *page;
> +       raw_spinlock_t          lock;
> +};
> +
> +struct redirect_item {
> +       int                     index;
> +       struct redirect_entry   *entry;
> +       struct redirect_gpid    *gpid;
> +       struct redirect_table   *table;
> +};
> +
> +struct redirect_queue {
> +       int             node;
> +       u64             base;
> +       u32             max_size;
> +       int             head;
> +       int             tail;
> +       struct page     *page;
> +       raw_spinlock_t  lock;
> +};
> +
> +struct irde_desc {
> +       struct redirect_table   ird_table;
> +       struct redirect_queue   inv_queue;
> +};
> +
> +struct irde_inv_cmd {
> +       union {
> +               __u64   cmd_info;
> +               struct {
> +                       __u64   res1            : 4,
> +                               type            : 1,
> +                               need_notice     : 1,
> +                               pad             : 2,
> +                               index           : 16,
> +                               pad2            : 40;
> +               }       index;
> +       };
> +       __u64           notice_addr;
> +};
> +
> +static struct irde_desc irde_descs[MAX_IR_ENGINES];
> +static phys_addr_t msi_base_addr;
> +static phys_addr_t redirect_reg_base = 0x1fe00000;
> +
> +#define REDIRECT_REG_BASE(reg, node) \
> +       (UNCACHE_BASE | redirect_reg_base | (u64)(node) << NODE_ADDRSPACE_SHIFT | (reg))
IO_BASE is a little better than UNCACHE_BASE.
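
That is, something like:

#define REDIRECT_REG_BASE(reg, node) \
	(IO_BASE | redirect_reg_base | (u64)(node) << NODE_ADDRSPACE_SHIFT | (reg))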

> +#define        redirect_reg_queue_head(node)   REDIRECT_REG_BASE(LOONGARCH_IOCSR_REDIRECT_CQH, (node))
> +#define        redirect_reg_queue_tail(node)   REDIRECT_REG_BASE(LOONGARCH_IOCSR_REDIRECT_CQT, (node))
> +#define read_queue_head(node)          (*((u32 *)(redirect_reg_queue_head(node))))
> +#define read_queue_tail(node)          (*((u32 *)(redirect_reg_queue_tail(node))))
> +#define write_queue_tail(node, val)    (*((u32 *)(redirect_reg_queue_tail(node))) = (val))
You can use readl() and writel() directly, then you can remove the
memory barrier around write_queue_tail().
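
For instance, an untested sketch (keeping the existing address computation
and with <linux/io.h> included):

static inline u32 read_queue_head(int node)
{
	return readl((void __iomem *)redirect_reg_queue_head(node));
}

static inline u32 read_queue_tail(int node)
{
	return readl((void __iomem *)redirect_reg_queue_tail(node));
}

static inline void write_queue_tail(int node, u32 val)
{
	/* writel() already orders this store after prior normal memory writes */
	writel(val, (void __iomem *)redirect_reg_queue_tail(node));
}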

> +
> +static inline bool invalid_queue_is_full(int node, u32 *tail)
> +{
> +       u32 head;
> +
> +       head = read_queue_head(node);
> +       *tail = read_queue_tail(node);
> +
> +       return !!(head == ((*tail + 1) % INVALID_QUEUE_SIZE));
> +}
> +
> +static void invalid_enqueue(struct redirect_queue *rqueue, struct irde_inv_cmd *cmd)
> +{
> +       struct irde_inv_cmd *inv_addr;
> +       u32 tail;
> +
> +       guard(raw_spinlock_irqsave)(&rqueue->lock);
> +
> +       while (invalid_queue_is_full(rqueue->node, &tail))
> +               cpu_relax();
> +
> +       inv_addr = (struct irde_inv_cmd *)(rqueue->base + tail * sizeof(struct irde_inv_cmd));
> +       memcpy(inv_addr, cmd, sizeof(struct irde_inv_cmd));
> +       tail = (tail + 1) % INVALID_QUEUE_SIZE;
> +
> +       /*
> +        * The uncache-memory access may have an out of order problem cache-memory access,
> +        * so a barrier is needed to ensure tail is valid
> +        */
> +       wmb();
> +
> +       write_queue_tail(rqueue->node, tail);
> +}
> +
> +static void irde_invlid_entry_node(struct redirect_item *item)
s/irde_invlid_entry_node/irde_invalid_entry_node/g

> +{
> +       struct redirect_queue *rqueue;
> +       struct irde_inv_cmd cmd;
> +       volatile u64 raddr = 0;
> +       int node = item->table->node;
> +
> +       rqueue = &(irde_descs[node].inv_queue);
> +       cmd.cmd_info = 0;
> +       cmd.index.type = INVALID_INDEX;
> +       cmd.index.need_notice = 1;
> +       cmd.index.index = item->index;
> +       cmd.notice_addr = (u64)(__pa(&raddr));
> +
> +       invalid_enqueue(rqueue, &cmd);
> +
> +       while (!raddr)
> +               cpu_relax();
> +
> +}
> +
> +static inline struct avecintc_data *irq_data_get_avec_data(struct irq_data *data)
> +{
> +       return data->parent_data->chip_data;
> +}
> +
> +static int redirect_table_alloc(struct redirect_item *item, struct redirect_table *ird_table)
> +{
> +       int index;
> +
> +       guard(raw_spinlock_irqsave)(&ird_table->lock);
> +
> +       index = find_first_zero_bit(ird_table->bitmap, IRD_ENTRIES);
> +       if (index > IRD_ENTRIES) {
> +               pr_err("No redirect entry to use\n");
> +               return -ENOMEM;
> +       }
> +
> +       __set_bit(index, ird_table->bitmap);
> +
> +       item->index = index;
> +       item->entry = &ird_table->table[index];
> +       item->table = ird_table;
> +
> +       return 0;
> +}
> +
> +static int redirect_table_free(struct redirect_item *item)
> +{
> +       struct redirect_table *ird_table;
> +       struct redirect_entry *entry;
> +
> +       ird_table = item->table;
> +
> +       entry = item->entry;
> +       memset(entry, 0, sizeof(struct redirect_entry));
> +
> +       scoped_guard(raw_spinlock_irqsave, &ird_table->lock)
> +               bitmap_release_region(ird_table->bitmap, item->index, 0);
> +
> +       kfree(item->gpid);
> +
> +       irde_invlid_entry_node(item);
> +
> +       return 0;
> +}
> +
> +static inline void redirect_domain_prepare_entry(struct redirect_item *item,
> +                                       struct avecintc_data *adata)
> +{
> +       struct redirect_entry *entry = item->entry;
> +
> +       item->gpid->en = 1;
> +       item->gpid->irqnum = adata->vec;
> +       item->gpid->dst = adata->cpu;
> +
> +       entry->lo.valid = 1;
> +       entry->lo.gpid = ((long)item->gpid >> GPID_ADDR_SHIFT) & (GPID_ADDR_MASK);
> +       entry->lo.vector = 0xff;
> +}
> +
> +static int redirect_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force)
> +{
> +       struct redirect_item *item = data->chip_data;
> +       struct avecintc_data *adata;
> +       int ret;
> +
> +       ret = irq_chip_set_affinity_parent(data, dest, force);
> +       if (ret == IRQ_SET_MASK_OK_DONE) {
> +               return ret;
> +       } else if (ret) {
> +               pr_err("IRDE:set_affinity error %d\n", ret);
> +               return ret;
> +       }
> +
> +       adata = irq_data_get_avec_data(data);
> +       redirect_domain_prepare_entry(item, adata);
> +       irde_invlid_entry_node(item);
> +       avecintc_sync(adata);
> +
> +       return IRQ_SET_MASK_OK;
> +}
Have you tried building without SMP? This function (and maybe more)
should be guarded by CONFIG_SMP.
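
For example (illustrative only; other SMP-only pieces would need the same
treatment):

#ifdef CONFIG_SMP
static int redirect_set_affinity(struct irq_data *data,
				 const struct cpumask *dest, bool force)
{
	/* body as in this patch */
}
#endif

static struct irq_chip loongarch_redirect_chip = {
	.name			= "REDIRECT",
	.irq_ack		= redirect_ack_irq,
	.irq_mask		= redirect_mask_irq,
	.irq_unmask		= redirect_unmask_irq,
#ifdef CONFIG_SMP
	.irq_set_affinity	= redirect_set_affinity,
#endif
	.irq_compose_msi_msg	= redirect_compose_msi_msg,
};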

> +
> +static void redirect_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
> +{
> +       struct redirect_item *item;
> +
> +       item = irq_data_get_irq_chip_data(d);
> +       msg->address_lo = (msi_base_addr | 1 << 2 | ((item->index & 0xffff) << 4));
> +       msg->address_hi = 0x0;
> +       msg->data = 0x0;
> +}
> +
> +static inline void redirect_ack_irq(struct irq_data *d)
> +{
> +}
> +
> +static inline void redirect_unmask_irq(struct irq_data *d)
> +{
> +}
> +
> +static inline void redirect_mask_irq(struct irq_data *d)
> +{
> +}
> +
> +static struct irq_chip loongarch_redirect_chip = {
> +       .name                   = "REDIRECT",
> +       .irq_ack                = redirect_ack_irq,
> +       .irq_mask               = redirect_mask_irq,
> +       .irq_unmask             = redirect_unmask_irq,
> +       .irq_set_affinity       = redirect_set_affinity,
> +       .irq_compose_msi_msg    = redirect_compose_msi_msg,
> +};
> +
> +static void redirect_free_resources(struct irq_domain *domain, unsigned int virq,
> +                               unsigned int nr_irqs)
> +{
> +       for (int i = 0; i < nr_irqs; i++) {
> +               struct irq_data *irq_data;
> +
> +               irq_data = irq_domain_get_irq_data(domain, virq  + i);
> +               if (irq_data && irq_data->chip_data) {
> +                       struct redirect_item *item;
> +
> +                       item = irq_data->chip_data;
> +                       redirect_table_free(item);
> +                       kfree(item);
> +               }
> +       }
> +}
> +
> +static int redirect_domain_alloc(struct irq_domain *domain, unsigned int virq,
> +                       unsigned int nr_irqs, void *arg)
> +{
> +       struct redirect_table *ird_table;
> +       struct avecintc_data *avec_data;
> +       struct irq_data *irq_data;
> +       msi_alloc_info_t *info;
> +       int ret, i, node;
> +
> +       info = (msi_alloc_info_t *)arg;
> +       node = dev_to_node(info->desc->dev);
> +       ird_table = &irde_descs[node].ird_table;
> +
> +       ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
> +       if (ret < 0)
> +               return ret;
> +
> +       for (i = 0; i < nr_irqs; i++) {
> +               struct redirect_item *item;
> +
> +               item = kzalloc(sizeof(struct redirect_item), GFP_KERNEL);
> +               if (!item) {
> +                       pr_err("Alloc redirect descriptor failed\n");
> +                       goto out_free_resources;
> +               }
> +
> +               irq_data = irq_domain_get_irq_data(domain, virq + i);
> +
> +               avec_data = irq_data_get_avec_data(irq_data);
> +               ret = redirect_table_alloc(item, ird_table);
> +               if (ret) {
> +                       pr_err("Alloc redirect table entry failed\n");
> +                       goto out_free_resources;
> +               }
> +
> +               item->gpid = kzalloc_node(sizeof(struct redirect_gpid), GFP_KERNEL, node);
> +               if (!item->gpid) {
> +                       pr_err("Alloc redirect GPID failed\n");
> +                       goto out_free_resources;
> +               }
> +
> +               irq_data->chip_data = item;
> +               irq_data->chip = &loongarch_redirect_chip;
> +               redirect_domain_prepare_entry(item, avec_data);
> +       }
> +       return 0;
> +
> +out_free_resources:
> +       redirect_free_resources(domain, virq, nr_irqs);
> +       irq_domain_free_irqs_common(domain, virq, nr_irqs);
> +
> +       return -EINVAL;
> +}
> +
> +static void redirect_domain_free(struct irq_domain *domain, unsigned int virq, unsigned int nr_irqs)
> +{
> +       redirect_free_resources(domain, virq, nr_irqs);
> +       return irq_domain_free_irqs_common(domain, virq, nr_irqs);
> +}
> +
> +static const struct irq_domain_ops redirect_domain_ops = {
> +       .alloc          = redirect_domain_alloc,
> +       .free           = redirect_domain_free,
> +       .select         = msi_lib_irq_domain_select,
> +};
> +
> +static int redirect_queue_init(int node)
> +{
> +       struct redirect_queue *rqueue = &(irde_descs[node].inv_queue);
> +       struct page *pages;
> +
> +       pages = alloc_pages_node(0, GFP_KERNEL | __GFP_ZERO, INVALID_QUEUE_PAGE_ORDER);
> +       if (!pages) {
> +               pr_err("Node [%d] Invalid Queue alloc pages failed!\n", node);
> +               return -ENOMEM;
> +       }
> +
> +       rqueue->page = pages;
> +       rqueue->base = (u64)page_address(pages);
> +       rqueue->max_size = INVALID_QUEUE_SIZE;
> +       rqueue->head = 0;
> +       rqueue->tail = 0;
> +       rqueue->node = node;
> +       raw_spin_lock_init(&rqueue->lock);
> +
> +       iocsr_write32(0, LOONGARCH_IOCSR_REDIRECT_CQH);
> +       iocsr_write32(0, LOONGARCH_IOCSR_REDIRECT_CQT);
> +       iocsr_write64(((rqueue->base & (CQB_ADDR_MASK << CQB_ADDR_SHIFT)) |
> +                               (CQB_SIZE_MASK << CQB_SIZE_SHIFT)), LOONGARCH_IOCSR_REDIRECT_CQB);
> +       return 0;
> +}
> +
> +static int redirect_table_init(int node)
> +{
> +       struct redirect_table *ird_table = &(irde_descs[node].ird_table);
> +       unsigned long *bitmap;
> +       struct page *pages;
> +
> +       pages = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, IRD_PAGE_ORDER);
> +       if (!pages) {
> +               pr_err("Node [%d] redirect table alloc pages failed!\n", node);
> +               return -ENOMEM;
> +       }
> +       ird_table->page = pages;
> +       ird_table->table = page_address(pages);
> +
> +       bitmap = bitmap_zalloc(IRD_ENTRIES, GFP_KERNEL);
> +       if (!bitmap) {
> +               pr_err("Node [%d] redirect table bitmap alloc pages failed!\n", node);
> +               return -ENOMEM;
> +       }
> +
> +       ird_table->bitmap = bitmap;
> +       ird_table->nr_ird = IRD_ENTRIES;
> +       ird_table->node = node;
> +
> +       raw_spin_lock_init(&ird_table->lock);
> +
> +       if (redirect_queue_init(node))
> +               return -EINVAL;
> +
> +       iocsr_write64(CFG_DISABLE_IDLE, LOONGARCH_IOCSR_REDIRECT_CFG);
> +       iocsr_write64(__pa(ird_table->table), LOONGARCH_IOCSR_REDIRECT_TBR);
> +
> +       return 0;
> +}
> +
> +static void redirect_table_fini(int node)
> +{
> +       struct redirect_table *ird_table = &(irde_descs[node].ird_table);
> +       struct redirect_queue *rqueue = &(irde_descs[node].inv_queue);
> +
> +       if (ird_table->page) {
> +               __free_pages(ird_table->page, IRD_PAGE_ORDER);
> +               ird_table->table = NULL;
> +               ird_table->page = NULL;
> +       }
> +
> +       if (ird_table->page) {
> +               bitmap_free(ird_table->bitmap);
> +               ird_table->bitmap = NULL;
> +       }
> +
> +       if (rqueue->page) {
> +               __free_pages(rqueue->page, INVALID_QUEUE_PAGE_ORDER);
> +               rqueue->page = NULL;
> +               rqueue->base = 0;
> +       }
> +
> +       iocsr_write64(0, LOONGARCH_IOCSR_REDIRECT_CQB);
> +       iocsr_write64(0, LOONGARCH_IOCSR_REDIRECT_TBR);
> +}
> +
> +static int redirect_cpu_online(unsigned int cpu)
> +{
> +       int ret, node = cpu_to_node(cpu);
> +
> +       if (cpu != cpumask_first(cpumask_of_node(node)))
> +               return 0;
> +
> +       ret = redirect_table_init(node);
> +       if (ret) {
> +               redirect_table_fini(node);
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
> +#if defined(CONFIG_ACPI)
> +static int __init redirect_reg_base_init(void)
> +{
> +       acpi_status status;
> +       uint64_t addr = 0;
> +
> +       if (acpi_disabled)
> +               return 0;
> +
> +       status = acpi_evaluate_integer(NULL, "\\_SB.NO00", NULL, &addr);
> +       if (ACPI_FAILURE(status) || !addr)
> +               pr_info("redirect_iocsr_base used default 0x1fe00000\n");
> +       else
> +               redirect_reg_base = addr;
> +
> +       return 0;
> +}
> +subsys_initcall_sync(redirect_reg_base_init);
Can this function be called at the end of redirect_acpi_init() instead?
Running it as an initcall is too late, because the irqchip drivers begin
to work before that.
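
Untested sketch of that rearrangement (the subsys_initcall_sync() line
would be dropped and the function called directly):

	ret = acpi_cascade_irqdomain_init();
	if (ret < 0) {
		pr_err("Failed to cascade IRQ domain, ret=%d\n", ret);
		goto out_free_table;
	}

	redirect_reg_base_init();

	cpuhp_setup_state_nocalls(CPUHP_AP_IRQ_REDIRECT_STARTING,
				  "irqchip/loongarch/redirect:starting",
				  redirect_cpu_online, NULL);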

Huacai

> +
> +static int __init pch_msi_parse_madt(union acpi_subtable_headers *header,
> +               const unsigned long end)
> +{
> +       struct acpi_madt_msi_pic *pchmsi_entry = (struct acpi_madt_msi_pic *)header;
> +
> +       msi_base_addr = pchmsi_entry->msg_address - AVEC_MSG_OFFSET;
> +
> +       return pch_msi_acpi_init_avec(redirect_domain);
> +}
> +
> +static int __init acpi_cascade_irqdomain_init(void)
> +{
> +       return acpi_table_parse_madt(ACPI_MADT_TYPE_MSI_PIC, pch_msi_parse_madt, 1);
> +}
> +
> +int __init redirect_acpi_init(struct irq_domain *parent)
> +{
> +       struct fwnode_handle *fwnode;
> +       struct irq_domain *domain;
> +       int ret;
> +
> +       fwnode = irq_domain_alloc_named_fwnode("redirect");
> +       if (!fwnode) {
> +               pr_err("Unable to alloc redirect domain handle\n");
> +               goto fail;
> +       }
> +
> +       domain = irq_domain_create_hierarchy(parent, 0, IRD_ENTRIES, fwnode,
> +                       &redirect_domain_ops, irde_descs);
> +       if (!domain) {
> +               pr_err("Unable to alloc redirect domain\n");
> +               goto out_free_fwnode;
> +       }
> +
> +       redirect_domain = domain;
> +
> +       ret = redirect_table_init(0);
> +       if (ret)
> +               goto out_free_table;
> +
> +       ret = acpi_cascade_irqdomain_init();
> +       if (ret < 0) {
> +               pr_err("Failed to cascade IRQ domain, ret=%d\n", ret);
> +               goto out_free_table;
> +       }
> +
> +       cpuhp_setup_state_nocalls(CPUHP_AP_IRQ_REDIRECT_STARTING,
> +                                 "irqchip/loongarch/redirect:starting",
> +                                 redirect_cpu_online, NULL);
> +
> +       pr_info("loongarch irq redirect modules init succeeded\n");
> +       return 0;
> +
> +out_free_table:
> +       redirect_table_fini(0);
> +       irq_domain_remove(redirect_domain);
> +       redirect_domain = NULL;
> +out_free_fwnode:
> +       irq_domain_free_fwnode(fwnode);
> +fail:
> +       return -EINVAL;
> +}
> +#endif
> diff --git a/drivers/irqchip/irq-loongson.h b/drivers/irqchip/irq-loongson.h
> index 11fa138d1f44..05ad40ffb62b 100644
> --- a/drivers/irqchip/irq-loongson.h
> +++ b/drivers/irqchip/irq-loongson.h
> @@ -5,6 +5,15 @@
>
>  #ifndef _DRIVERS_IRQCHIP_IRQ_LOONGSON_H
>  #define _DRIVERS_IRQCHIP_IRQ_LOONGSON_H
> +#define AVEC_MSG_OFFSET                0x100000
> +struct avecintc_data {
> +       struct list_head        entry;
> +       unsigned int            cpu;
> +       unsigned int            vec;
> +       unsigned int            prev_cpu;
> +       unsigned int            prev_vec;
> +       unsigned int            moving;
> +};
>
>  int find_pch_pic(u32 gsi);
>
> @@ -24,4 +33,7 @@ int pch_msi_acpi_init(struct irq_domain *parent,
>                                         struct acpi_madt_msi_pic *acpi_pchmsi);
>  int pch_msi_acpi_init_avec(struct irq_domain *parent);
>
> +int redirect_acpi_init(struct irq_domain *parent);
> +
> +void avecintc_sync(struct avecintc_data *adata);
>  #endif /* _DRIVERS_IRQCHIP_IRQ_LOONGSON_H */
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 1987400000b4..6a4ff072db42 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -145,6 +145,7 @@ enum cpuhp_state {
>         CPUHP_AP_IRQ_MIPS_GIC_STARTING,
>         CPUHP_AP_IRQ_EIOINTC_STARTING,
>         CPUHP_AP_IRQ_AVECINTC_STARTING,
> +       CPUHP_AP_IRQ_REDIRECT_STARTING,
>         CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
>         CPUHP_AP_IRQ_THEAD_ACLINT_SSWI_STARTING,
>         CPUHP_AP_IRQ_RISCV_IMSIC_STARTING,
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support
  2025-05-23 10:18 ` [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support Tianyang Zhang
  2025-05-24 14:12   ` Huacai Chen
@ 2025-05-25  9:06   ` Thomas Gleixner
  2025-05-27  1:22     ` Tianyang Zhang
  1 sibling, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2025-05-25  9:06 UTC (permalink / raw)
  To: Tianyang Zhang, chenhuacai, kernel, corbet, alexs, si.yanteng,
	jiaxun.yang, peterz, wangliupu, lvjianmin, maobibo, siyanteng,
	gaosong, yangtiezhu
  Cc: loongarch, linux-doc, linux-kernel, Tianyang Zhang

On Fri, May 23 2025 at 18:18, Tianyang Zhang wrote:
>  
> -static void avecintc_sync(struct avecintc_data *adata)
> +void avecintc_sync(struct avecintc_data *adata)
>  {
>  	struct pending_list *plist;
>  
> @@ -109,7 +99,7 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
>  			return -EBUSY;
>  
>  		if (cpu_online(adata->cpu) && cpumask_test_cpu(adata->cpu, dest))
> -			return 0;
> +			return IRQ_SET_MASK_OK_DONE;

This change really wants to be separate, with a proper explanation, and
not buried inside of this pile of changes.

> +static inline bool invalid_queue_is_full(int node, u32 *tail)
> +{
> +	u32 head;
> +
> +	head = read_queue_head(node);

Please move the initialization into the declaration line:

       u32 head = read_queue...();

All over the place, where it's the first operation in the code. That
makes the code more dense and easier to follow.

> +	*tail = read_queue_tail(node);
> +
> +	return !!(head == ((*tail + 1) % INVALID_QUEUE_SIZE));

What's the !! for? A == B is a boolean expression already.

> +}
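
Combining both points, something like this (a sketch only, using the
same helper names as the patch):

	static inline bool invalid_queue_is_full(int node, u32 *tail)
	{
		u32 head = read_queue_head(node);

		*tail = read_queue_tail(node);

		/* full when advancing the tail would catch up with the head */
		return head == ((*tail + 1) % INVALID_QUEUE_SIZE);
	}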
> +
> +static void invalid_enqueue(struct redirect_queue *rqueue, struct irde_inv_cmd *cmd)
> +{
> +	struct irde_inv_cmd *inv_addr;
> +	u32 tail;
> +
> +	guard(raw_spinlock_irqsave)(&rqueue->lock);
> +
> +	while (invalid_queue_is_full(rqueue->node, &tail))
> +		cpu_relax();
> +
> +	inv_addr = (struct irde_inv_cmd *)(rqueue->base + tail * sizeof(struct irde_inv_cmd));
> +	memcpy(inv_addr, cmd, sizeof(struct irde_inv_cmd));
> +	tail = (tail + 1) % INVALID_QUEUE_SIZE;
> +
> +	/*
> +	 * The uncache-memory access may have an out of order problem cache-memory access,
> +	 * so a barrier is needed to ensure tail is valid
> +	 */

This comment does not make sense at all.

What's the actual uncached vs. cached access problem here? AFAICT it's
all about the ordering of the writes:

    You need to ensure that the memcpy() data is visible _before_ the
    tail is updated, no?

> +	wmb();
> +
> +	write_queue_tail(rqueue->node, tail);
> +}
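
If that is what the barrier is for, the comment only needs to say so,
e.g. (sketch):

	/*
	 * Ensure the command copied above is visible before the new
	 * tail is published to the hardware.
	 */
	wmb();

	write_queue_tail(rqueue->node, tail);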

> +static int redirect_table_free(struct redirect_item *item)

That return value is there to be ignored by the only caller, right?

> +{
> +	struct redirect_table *ird_table;
> +	struct redirect_entry *entry;
> +
> +	ird_table = item->table;
> +
> +	entry = item->entry;
> +	memset(entry, 0, sizeof(struct redirect_entry));
> +
> +	scoped_guard(raw_spinlock_irqsave, &ird_table->lock)
> +		bitmap_release_region(ird_table->bitmap, item->index, 0);
> +
> +	kfree(item->gpid);
> +
> +	irde_invlid_entry_node(item);
> +
> +	return 0;
> +}
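
If it really is ignored everywhere, the function could simply become
void, e.g. (sketch, body otherwise unchanged from the patch):

	static void redirect_table_free(struct redirect_item *item)
	{
		struct redirect_table *ird_table = item->table;

		memset(item->entry, 0, sizeof(struct redirect_entry));

		scoped_guard(raw_spinlock_irqsave, &ird_table->lock)
			bitmap_release_region(ird_table->bitmap, item->index, 0);

		kfree(item->gpid);

		irde_invlid_entry_node(item);
	}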
> +
> +static inline void redirect_domain_prepare_entry(struct redirect_item *item,
> +					struct avecintc_data *adata)

Please align the argument in the second line properly:

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#line-breaks
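
I.e. the continuation line starts right under the first argument:

	static inline void redirect_domain_prepare_entry(struct redirect_item *item,
							 struct avecintc_data *adata)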

> +
> +static inline void redirect_ack_irq(struct irq_data *d)
> +{
> +}
> +
> +static inline void redirect_unmask_irq(struct irq_data *d)
> +{
> +}
> +
> +static inline void redirect_mask_irq(struct irq_data *d)
> +{
> +}

These want some explanation why they are empty.

> +
> +static struct irq_chip loongarch_redirect_chip = {
> +	.name			= "REDIRECT",
> +	.irq_ack		= redirect_ack_irq,
> +	.irq_mask		= redirect_mask_irq,
> +	.irq_unmask		= redirect_unmask_irq,
> +	.irq_set_affinity	= redirect_set_affinity,
> +	.irq_compose_msi_msg	= redirect_compose_msi_msg,
> +};
> +out_free_resources:
> +	redirect_free_resources(domain, virq, nr_irqs);
> +	irq_domain_free_irqs_common(domain, virq, nr_irqs);
> +
> +	return -EINVAL;

-ENOMEM?

> +}
> +
> +	bitmap = bitmap_zalloc(IRD_ENTRIES, GFP_KERNEL);
> +	if (!bitmap) {
> +		pr_err("Node [%d] redirect table bitmap alloc pages failed!\n", node);
> +		return -ENOMEM;

Leaks pages.

> +	}
> +
> +	ird_table->bitmap = bitmap;
> +	ird_table->nr_ird = IRD_ENTRIES;
> +	ird_table->node = node;
> +
> +	raw_spin_lock_init(&ird_table->lock);
> +
> +	if (redirect_queue_init(node))
> +		return -EINVAL;

Leaks pages and bitmap.

> +
> +	iocsr_write64(CFG_DISABLE_IDLE, LOONGARCH_IOCSR_REDIRECT_CFG);
> +	iocsr_write64(__pa(ird_table->table), LOONGARCH_IOCSR_REDIRECT_TBR);
> +
> +	return 0;
> +}
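
One way to unwind the allocations on failure, as a rough sketch (the
error labels are invented here, not taken from the patch):

	static int redirect_table_init(int node)
	{
		struct redirect_table *ird_table = &irde_descs[node].ird_table;
		unsigned long *bitmap;
		struct page *pages;
		int ret = -ENOMEM;

		pages = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, IRD_PAGE_ORDER);
		if (!pages)
			return -ENOMEM;
		ird_table->page = pages;
		ird_table->table = page_address(pages);

		bitmap = bitmap_zalloc(IRD_ENTRIES, GFP_KERNEL);
		if (!bitmap)
			goto out_free_pages;		/* don't leak the pages */
		ird_table->bitmap = bitmap;
		ird_table->nr_ird = IRD_ENTRIES;
		ird_table->node = node;

		raw_spin_lock_init(&ird_table->lock);

		if (redirect_queue_init(node)) {
			ret = -EINVAL;
			goto out_free_bitmap;		/* don't leak pages + bitmap */
		}

		iocsr_write64(CFG_DISABLE_IDLE, LOONGARCH_IOCSR_REDIRECT_CFG);
		iocsr_write64(__pa(ird_table->table), LOONGARCH_IOCSR_REDIRECT_TBR);

		return 0;

	out_free_bitmap:
		bitmap_free(bitmap);
		ird_table->bitmap = NULL;
	out_free_pages:
		__free_pages(pages, IRD_PAGE_ORDER);
		ird_table->page = NULL;
		ird_table->table = NULL;
		return ret;
	}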

> +#if defined(CONFIG_ACPI)

#ifdef CONFIG_ACPI

> +static int __init redirect_reg_base_init(void)
> +{
> +	acpi_status status;
> +	uint64_t addr = 0;

What's this initialization for?

> +int __init redirect_acpi_init(struct irq_domain *parent)
> +{
> +	struct fwnode_handle *fwnode;
> +	struct irq_domain *domain;
> +	int ret;
> +
> +	fwnode = irq_domain_alloc_named_fwnode("redirect");
> +	if (!fwnode) {
> +		pr_err("Unable to alloc redirect domain handle\n");
> +		goto fail;
> +	}
> +
> +	domain = irq_domain_create_hierarchy(parent, 0, IRD_ENTRIES, fwnode,
> +			&redirect_domain_ops, irde_descs);

Please align the arguments in the second line properly.

> +	if (!domain) {
> +		pr_err("Unable to alloc redirect domain\n");
> +		goto out_free_fwnode;
> +	}
> +
> +	redirect_domain = domain;
> +
> +	ret = redirect_table_init(0);
> +	if (ret)
> +		goto out_free_table;
> +
> +	ret = acpi_cascade_irqdomain_init();
> +	if (ret < 0) {
> +		pr_err("Failed to cascade IRQ domain, ret=%d\n", ret);
> +		goto out_free_table;
> +	}
> +
> +	cpuhp_setup_state_nocalls(CPUHP_AP_IRQ_REDIRECT_STARTING,
> +				  "irqchip/loongarch/redirect:starting",
> +				  redirect_cpu_online, NULL);

Hmm.

> +static int redirect_cpu_online(unsigned int cpu)
> +{
> +	int ret, node = cpu_to_node(cpu);
> +
> +	if (cpu != cpumask_first(cpumask_of_node(node)))
> +		return 0;
> +
> +	ret = redirect_table_init(node);
> +	if (ret) {
> +		redirect_table_fini(node);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}

So if you unplug all CPUs of a node and then replug the first CPU in the
node, then this invokes redirect_table_init() unconditionally, which
will unconditionally allocate pages and bitmap again ....
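
One direction, as a sketch, would be to make the per-node setup
idempotent (whether the IOCSR registers still need reprogramming on the
replugged CPU is a separate question):

	static int redirect_cpu_online(unsigned int cpu)
	{
		int node = cpu_to_node(cpu);

		if (cpu != cpumask_first(cpumask_of_node(node)))
			return 0;

		/* the table survives hot-unplug; don't allocate it twice */
		if (irde_descs[node].ird_table.table)
			return 0;

		if (redirect_table_init(node)) {
			redirect_table_fini(node);
			return -EINVAL;
		}

		return 0;
	}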

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 1/2] Docs/LoongArch: Add Advanced Extended-Redirect IRQ model description
  2025-05-23 10:18 ` [PATCH v3 1/2] Docs/LoongArch: Add Advanced Extended-Redirect IRQ model description Tianyang Zhang
@ 2025-05-26  1:45   ` Yanteng Si
  0 siblings, 0 replies; 7+ messages in thread
From: Yanteng Si @ 2025-05-26  1:45 UTC (permalink / raw)
  To: Tianyang Zhang, chenhuacai, kernel, corbet, alexs, tglx,
	jiaxun.yang, peterz, wangliupu, lvjianmin, maobibo, siyanteng,
	gaosong, yangtiezhu
  Cc: loongarch, linux-doc, linux-kernel


On 5/23/25 6:18 PM, Tianyang Zhang wrote:
> Introduce the redirect interrupt controllers.When the redirect interrupt
> controller is enabled, the routing target of MSI interrupts is no longer a
> specific CPU and vector number, but a specific redirect entry. The actual
> CPU and vector number used are described by the redirect entry.
>
> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>

For the Chinese translation only:


Reviewed-by: Yanteng Si <si.yanteng@linux.dev>


Thanks,

Yanteng

> ---
>   .../arch/loongarch/irq-chip-model.rst         | 38 +++++++++++++++++++
>   .../zh_CN/arch/loongarch/irq-chip-model.rst   | 37 ++++++++++++++++++
>   2 files changed, 75 insertions(+)
>
> diff --git a/Documentation/arch/loongarch/irq-chip-model.rst b/Documentation/arch/loongarch/irq-chip-model.rst
> index a7ecce11e445..d9a2e8d7f70e 100644
> --- a/Documentation/arch/loongarch/irq-chip-model.rst
> +++ b/Documentation/arch/loongarch/irq-chip-model.rst
> @@ -181,6 +181,44 @@ go to PCH-PIC/PCH-LPC and gathered by EIOINTC, and then go to CPUINTC directly::
>                | Devices |
>                +---------+
>   
> +Advanced Extended IRQ model (with redirection)
> +==============================================
> +
> +In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupt go
> +to CPUINTC directly, CPU UARTS interrupts go to LIOINTC, PCH-MSI interrupts go
> +to REDIRECT for remapping it to AVEC, and then go to CPUINTC directly, while all
> +other devices interrupts go to PCH-PIC/PCH-LPC and gathered by EIOINTC, and then
> +go to CPUINTC directly::
> +
> + +-----+     +-----------------------+     +-------+
> + | IPI | --> |        CPUINTC        | <-- | Timer |
> + +-----+     +-----------------------+     +-------+
> +              ^          ^          ^
> +              |          |          |
> +       +---------+ +----------+ +---------+     +-------+
> +       | EIOINTC | | AVECINTC | | LIOINTC | <-- | UARTs |
> +       +---------+ +----------+ +---------+     +-------+
> +            ^            ^
> +            |            |
> +            |      +----------+
> +            |      | REDIRECT |
> +            |      +----------+
> +            |            ^
> +            |            |
> +       +---------+  +---------+
> +       | PCH-PIC |  | PCH-MSI |
> +       +---------+  +---------+
> +         ^     ^           ^
> +         |     |           |
> + +---------+ +---------+ +---------+
> + | Devices | | PCH-LPC | | Devices |
> + +---------+ +---------+ +---------+
> +                  ^
> +                  |
> +             +---------+
> +             | Devices |
> +             +---------+
> +
>   ACPI-related definitions
>   ========================
>   
> diff --git a/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst b/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
> index d4ff80de47b6..7e4e3e55c7ad 100644
> --- a/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
> +++ b/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
> @@ -174,6 +174,43 @@ CPU串口(UARTs)中断发送到LIOINTC,PCH-MSI中断发送到AVECINTC,
>                | Devices |
>                +---------+
>   
> +高级扩展IRQ模型 (带重定向)
> +==========================
> +
> +在这种模型里面,IPI(Inter-Processor Interrupt)和CPU本地时钟中断直接发送到CPUINTC,
> +CPU串口(UARTs)中断发送到LIOINTC,PCH-MSI中断首先发送到REDIRECT模块,完成重定向后发
> +送到AVECINTC,而后通过AVECINTC直接送达CPUINTC,而其他所有设备的中断则分别发送到所连
> +接的PCH-PIC/PCH-LPC,然后由EIOINTC统一收集,再直接到达CPUINTC::
> +
> + +-----+     +-----------------------+     +-------+
> + | IPI | --> |        CPUINTC        | <-- | Timer |
> + +-----+     +-----------------------+     +-------+
> +              ^          ^          ^
> +              |          |          |
> +       +---------+ +----------+ +---------+     +-------+
> +       | EIOINTC | | AVECINTC | | LIOINTC | <-- | UARTs |
> +       +---------+ +----------+ +---------+     +-------+
> +            ^            ^
> +            |            |
> +            |      +----------+
> +            |      | REDIRECT |
> +            |      +----------+
> +            |            ^
> +            |            |
> +       +---------+  +---------+
> +       | PCH-PIC |  | PCH-MSI |
> +       +---------+  +---------+
> +         ^     ^           ^
> +         |     |           |
> + +---------+ +---------+ +---------+
> + | Devices | | PCH-LPC | | Devices |
> + +---------+ +---------+ +---------+
> +                  ^
> +                  |
> +             +---------+
> +             | Devices |
> +             +---------+
> +
>   ACPI相关的定义
>   ==============
>   

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support
  2025-05-25  9:06   ` Thomas Gleixner
@ 2025-05-27  1:22     ` Tianyang Zhang
  0 siblings, 0 replies; 7+ messages in thread
From: Tianyang Zhang @ 2025-05-27  1:22 UTC (permalink / raw)
  To: Thomas Gleixner, chenhuacai, kernel, corbet, alexs, si.yanteng,
	jiaxun.yang, peterz, wangliupu, lvjianmin, maobibo, siyanteng,
	gaosong, yangtiezhu
  Cc: loongarch, linux-doc, linux-kernel

Hi, Thomas

On 2025/5/25 5:06 PM, Thomas Gleixner wrote:
> On Fri, May 23 2025 at 18:18, Tianyang Zhang wrote:
>>   
>> -static void avecintc_sync(struct avecintc_data *adata)
>> +void avecintc_sync(struct avecintc_data *adata)
>>   {
>>   	struct pending_list *plist;
>>   
>> @@ -109,7 +99,7 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
>>   			return -EBUSY;
>>   
>>   		if (cpu_online(adata->cpu) && cpumask_test_cpu(adata->cpu, dest))
>> -			return 0;
>> +			return IRQ_SET_MASK_OK_DONE;
> This change really wants to be seperate with a proper explanation and
> not burried inside of this pile of changes.
OK, I got it. I will add some explanatory information.
>
>> +static inline bool invalid_queue_is_full(int node, u32 *tail)
>> +{
>> +	u32 head;
>> +
>> +	head = read_queue_head(node);
> Please move the initialization into the declaration line:
>
>         u32 head = read_queue...();
>
> All over the place, where it's the first operation in the code. That
> makes the code more dense and easier to follow.
OK, I got it, thanks
>
>> +	*tail = read_queue_tail(node);
>> +
>> +	return !!(head == ((*tail + 1) % INVALID_QUEUE_SIZE));
> What's the !! for? A == B is a boolean expression already.
Emmm... this is actually a rookie mistake, thanks
>
>> +}
>> +
>> +static void invalid_enqueue(struct redirect_queue *rqueue, struct irde_inv_cmd *cmd)
>> +{
>> +	struct irde_inv_cmd *inv_addr;
>> +	u32 tail;
>> +
>> +	guard(raw_spinlock_irqsave)(&rqueue->lock);
>> +
>> +	while (invalid_queue_is_full(rqueue->node, &tail))
>> +		cpu_relax();
>> +
>> +	inv_addr = (struct irde_inv_cmd *)(rqueue->base + tail * sizeof(struct irde_inv_cmd));
>> +	memcpy(inv_addr, cmd, sizeof(struct irde_inv_cmd));
>> +	tail = (tail + 1) % INVALID_QUEUE_SIZE;
>> +
>> +	/*
>> +	 * The uncache-memory access may have an out of order problem cache-memory access,
>> +	 * so a barrier is needed to ensure tail is valid
>> +	 */
> This comment does not make sense at all.
>
> What's the actual uncached vs. cached access problem here? AFAICT it's
> all about the ordering of the writes:
>
>      You need to ensure that the memcpy() data is visible _before_ the
>      tail is updated, no?

Yes, the fundamental purpose is to ensure that all of the command data is
visible before the tail register is updated.

I will revise the comment here. Thank you

>> +	wmb();
>> +
>> +	write_queue_tail(rqueue->node, tail);
>> +}
>> +static int redirect_table_free(struct redirect_item *item)
> That return value is there to be ignored by the only caller, right?
Let's re-evaluate the significance of the return value here, thanks
>
>> +{
>> +	struct redirect_table *ird_table;
>> +	struct redirect_entry *entry;
>> +
>> +	ird_table = item->table;
>> +
>> +	entry = item->entry;
>> +	memset(entry, 0, sizeof(struct redirect_entry));
>> +
>> +	scoped_guard(raw_spinlock_irqsave, &ird_table->lock)
>> +		bitmap_release_region(ird_table->bitmap, item->index, 0);
>> +
>> +	kfree(item->gpid);
>> +
>> +	irde_invlid_entry_node(item);
>> +
>> +	return 0;
>> +}
>> +
>> +static inline void redirect_domain_prepare_entry(struct redirect_item *item,
>> +					struct avecintc_data *adata)
> Please align the argument in the second line properly:
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#line-breaks
OK, I got it, thanks
>
>> +
>> +static inline void redirect_ack_irq(struct irq_data *d)
>> +{
>> +}
>> +
>> +static inline void redirect_unmask_irq(struct irq_data *d)
>> +{
>> +}
>> +
>> +static inline void redirect_mask_irq(struct irq_data *d)
>> +{
>> +}
> These want some explanation why they are empty.
OK, I got it, thanks
>
>> +
>> +static struct irq_chip loongarch_redirect_chip = {
>> +	.name			= "REDIRECT",
>> +	.irq_ack		= redirect_ack_irq,
>> +	.irq_mask		= redirect_mask_irq,
>> +	.irq_unmask		= redirect_unmask_irq,
>> +	.irq_set_affinity	= redirect_set_affinity,
>> +	.irq_compose_msi_msg	= redirect_compose_msi_msg,
>> +};
>> +out_free_resources:
>> +	redirect_free_resources(domain, virq, nr_irqs);
>> +	irq_domain_free_irqs_common(domain, virq, nr_irqs);
>> +
>> +	return -EINVAL;
> -ENOMEM?
OK, I got it, thanks
>> +}
>> +
>> +	bitmap = bitmap_zalloc(IRD_ENTRIES, GFP_KERNEL);
>> +	if (!bitmap) {
>> +		pr_err("Node [%d] redirect table bitmap alloc pages failed!\n", node);
>> +		return -ENOMEM;
> Leaks pages.
OK, I got it, thanks
>
>> +	}
>> +
>> +	ird_table->bitmap = bitmap;
>> +	ird_table->nr_ird = IRD_ENTRIES;
>> +	ird_table->node = node;
>> +
>> +	raw_spin_lock_init(&ird_table->lock);
>> +
>> +	if (redirect_queue_init(node))
>> +		return -EINVAL;
> Leaks pages and bitmap.
OK, I got it, thanks
>
>> +
>> +	iocsr_write64(CFG_DISABLE_IDLE, LOONGARCH_IOCSR_REDIRECT_CFG);
>> +	iocsr_write64(__pa(ird_table->table), LOONGARCH_IOCSR_REDIRECT_TBR);
>> +
>> +	return 0;
>> +}
>> +#if defined(CONFIG_ACPI)
> #ifdef CONFIG_ACPI
OK, I got it, thanks
>
>> +static int __init redirect_reg_base_init(void)
>> +{
>> +	acpi_status status;
>> +	uint64_t addr = 0;
> What's this initialization for?

The initial purpose here was to confirm the validity of the data returned
by acpi_evaluate_integer(), but perhaps this is not necessary.

I will double-check this, thanks

>
>> +int __init redirect_acpi_init(struct irq_domain *parent)
>> +{
>> +	struct fwnode_handle *fwnode;
>> +	struct irq_domain *domain;
>> +	int ret;
>> +
>> +	fwnode = irq_domain_alloc_named_fwnode("redirect");
>> +	if (!fwnode) {
>> +		pr_err("Unable to alloc redirect domain handle\n");
>> +		goto fail;
>> +	}
>> +
>> +	domain = irq_domain_create_hierarchy(parent, 0, IRD_ENTRIES, fwnode,
>> +			&redirect_domain_ops, irde_descs);
> Please align the arguments in the second line properly.
OK, I got it, thanks
>> +static int redirect_cpu_online(unsigned int cpu)
>> +{
>> +	int ret, node = cpu_to_node(cpu);
>> +
>> +	if (cpu != cpumask_first(cpumask_of_node(node)))
>> +		return 0;
>> +
>> +	ret = redirect_table_init(node);
>> +	if (ret) {
>> +		redirect_table_fini(node);
>> +		return -EINVAL;
>> +	}
>> +
>> +	return 0;
>> +}
> So if you unplug all CPUs of a node and then replug the first CPU in the
> node, then this invokes redirect_table_init() unconditionally, which
> will unconditionally allocate pages and bitmap again ....
We need to reconsider this, thank you
>
> Thanks,
>
>          tglx

Thanks again

     Tianyang


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-05-27  1:23 UTC | newest]

Thread overview: 7+ messages
2025-05-23 10:18 [PATCH v3 0/2] Loongarch irq-redirect supprot Tianyang Zhang
2025-05-23 10:18 ` [PATCH v3 1/2] Docs/LoongArch: Add Advanced Extended-Redirect IRQ model description Tianyang Zhang
2025-05-26  1:45   ` Yanteng Si
2025-05-23 10:18 ` [PATCH v3 2/2] irq/irq-loongarch-ir:Add Redirect irqchip support Tianyang Zhang
2025-05-24 14:12   ` Huacai Chen
2025-05-25  9:06   ` Thomas Gleixner
2025-05-27  1:22     ` Tianyang Zhang
