From: Anup Patel <apatel@ventanamicro.com>
To: "Björn Töpel" <bjorn@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>,
Paul Walmsley <paul.walmsley@sifive.com>,
Thomas Gleixner <tglx@linutronix.de>,
Rob Herring <robh+dt@kernel.org>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>,
Frank Rowand <frowand.list@gmail.com>,
Conor Dooley <conor+dt@kernel.org>,
devicetree@vger.kernel.org,
Saravana Kannan <saravanak@google.com>,
Marc Zyngier <maz@kernel.org>, Anup Patel <anup@brainfault.org>,
linux-kernel@vger.kernel.org,
Atish Patra <atishp@atishpatra.org>,
linux-riscv@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
Andrew Jones <ajones@ventanamicro.com>
Subject: Re: [PATCH v12 00/25] Linux RISC-V AIA Support
Date: Thu, 1 Feb 2024 20:37:22 +0530
Message-ID: <CAK9=C2UX0sRb5UbLdm8xwe1dP=x+enJRYzAuCPf6MdHTLTC_Cw@mail.gmail.com>
In-Reply-To: <87fryeognc.fsf@all.your.base.are.belong.to.us>

On Tue, Jan 30, 2024 at 11:19 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> Anup Patel <apatel@ventanamicro.com> writes:
>
> > On Tue, Jan 30, 2024 at 8:18 PM Björn Töpel <bjorn@kernel.org> wrote:
> >>
> >> Björn Töpel <bjorn@kernel.org> writes:
> >>
> >> > Anup Patel <apatel@ventanamicro.com> writes:
> >> >
> >> >> On Tue, Jan 30, 2024 at 1:22 PM Björn Töpel <bjorn@kernel.org> wrote:
> >> >>>
> >> >>> Björn Töpel <bjorn@kernel.org> writes:
> >> >>>
> >> >>> > Anup Patel <apatel@ventanamicro.com> writes:
> >> >>> >
> >> >>> >> The RISC-V AIA specification is ratified as-per the RISC-V international
> >> >>> >> process. The latest ratified AIA specification can be found at:
> >> >>> >> https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf
> >> >>> >>
> >> >>> >> At a high-level, the AIA specification adds three things:
> >> >>> >> 1) AIA CSRs
> >> >>> >> - Improved local interrupt support
> >> >>> >> 2) Incoming Message Signaled Interrupt Controller (IMSIC)
> >> >>> >> - Per-HART MSI controller
> >> >>> >> - Support MSI virtualization
> >> >>> >> - Support IPI along with virtualization
> >> >>> >> 3) Advanced Platform-Level Interrupt Controller (APLIC)
> >> >>> >> - Wired interrupt controller
> >> >>> >> - In MSI-mode, converts wired interrupt into MSIs (i.e. MSI generator)
> >> >>> >> - In Direct-mode, injects external interrupts directly into HARTs
> >> >>> >>
> >> >>> >> For an overview of the AIA specification, refer to the AIA virtualization
> >> >>> >> talk at KVM Forum 2022:
> >> >>> >> https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
> >> >>> >> https://www.youtube.com/watch?v=r071dL8Z0yo
> >> >>> >>
> >> >>> >> To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).
> >> >>> >>
> >> >>> >> These patches can also be found in the riscv_aia_v12 branch at:
> >> >>> >> https://github.com/avpatel/linux.git
> >> >>> >>
> >> >>> >> Changes since v11:
> >> >>> >> - Rebased on Linux-6.8-rc1
> >> >>> >> - Included kernel/irq related patches from "genirq, irqchip: Convert ARM
> >> >>> >> MSI handling to per device MSI domains" series by Thomas.
> >> >>> >> (PATCH7, PATCH8, PATCH9, PATCH14, PATCH16, PATCH17, PATCH18, PATCH19,
> >> >>> >> PATCH20, PATCH21, PATCH22, PATCH23, and PATCH32 of
> >> >>> >> https://lore.kernel.org/linux-arm-kernel/20221121135653.208611233@linutronix.de/)
> >> >>> >> - Updated APLIC MSI-mode driver to use the new WIRED_TO_MSI mechanism.
> >> >>> >> - Updated IMSIC driver to support per-device MSI domains for PCI and
> >> >>> >> platform devices.
> >> >>> >
> >> >>> > Thanks for working on this, Anup! I'm still reviewing the patches.
> >> >>> >
> >> >>> > I'm hitting a boot hang in text patching, with this series applied on
> >> >>> > 6.8-rc2. IPI issues?
> >> >>>
> >> >>> Not text patching! One cpu spinning in smp_call_function_many_cond() and
> >> >>> the others are in cpu_relax(). Smells like IPI...
> >> >>
> >> >> I tried bootefi from U-Boot multiple times but can't reproduce the
> >> >> issue you are seeing.
> >> >
> >> > Thanks! I can reproduce without EFI, and simpler command-line:
> >> >
> >> > qemu-system-riscv64 \
> >> > -bios /path/to/fw_dynamic.bin \
> >> > -kernel /path/to/Image \
> >> > -append 'earlycon console=tty0 console=ttyS0' \
> >> > -machine virt,aia=aplic-imsic \
> >> > -no-reboot -nodefaults -nographic \
> >> > -smp 4 \
> >> > -object rng-random,filename=/dev/urandom,id=rng0 \
> >> > -device virtio-rng-device,rng=rng0 \
> >> > -m 4G -chardev stdio,id=char0 -serial chardev:char0
> >> >
> >> > I can reproduce with your upstream riscv_aia_v12 plus the config in the
> >> > gist [1], and all latest QEMU/OpenSBI:
> >> >
> >> > QEMU: 11be70677c70 ("Merge tag 'pull-vfio-20240129' of https://github.com/legoater/qemu into staging")
> >> > OpenSBI: bb90a9ebf6d9 ("lib: sbi: Print number of debug triggers found")
> >> > Linux: d9b9d6eb987f ("MAINTAINERS: Add entry for RISC-V AIA drivers")
> >> >
> >> > Removing ",aia=aplic-imsic" from the CLI above completes the boot (i.e.
> >> > panicking about missing root mount ;-))
> >>
> >> More context; The hang is during a late initcall, where an ftrace direct
> >> (register_ftrace_direct()) modification is done.
> >>
> >> Stop machine is used to call into __ftrace_modify_call(). Then into the
> >> arch specific patch_text_nosync(), where flush_icache_range() hangs in
> >> flush_icache_all(). From "on_each_cpu(ipi_remote_fence_i, NULL, 1);" to
> >> on_each_cpu_cond_mask() "smp_call_function_many_cond(mask, func, info,
> >> scf_flags, cond_func);" which never returns from "csd_lock_wait(csd)"
> >> right before the end of the function.
> >>
> >> Any ideas? Disabling CONFIG_HID_BPF, that does the early ftrace code
> >> patching fixes the boot hang, but it does seem related to IPI...
> >>
> > Looks like flush_icache_all() does not use the IPIs (on_each_cpu()
> > and friends) correctly.
> >
> > On other hand, the flush_icache_mm() does the right thing by
> > doing local flush on the current CPU and IPI based flush on other
> > CPUs.
> >
> > Can you try the following patch ?
> >
> > diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> > index 55a34f2020a8..a3dfbe4de832 100644
> > --- a/arch/riscv/mm/cacheflush.c
> > +++ b/arch/riscv/mm/cacheflush.c
> > @@ -19,12 +19,18 @@ static void ipi_remote_fence_i(void *info)
> >
> > void flush_icache_all(void)
> > {
> > + cpumask_t others;
> > +
> > local_flush_icache_all();
> >
> > + cpumask_andnot(&others, cpu_online_mask, cpumask_of(smp_processor_id()));
> > + if (cpumask_empty(&others))
> > + return;
> > +
> > if (IS_ENABLED(CONFIG_RISCV_SBI) && !riscv_use_ipi_for_rfence())
> > - sbi_remote_fence_i(NULL);
> > + sbi_remote_fence_i(&others);
> > else
> > - on_each_cpu(ipi_remote_fence_i, NULL, 1);
> > + on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
> > }
> > EXPORT_SYMBOL(flush_icache_all);
>
> Unfortunately, I see the same hang. LMK if you'd like me to try anything
> else.

I was able to reproduce this at my end, but only with your config.

Digging further, the issue shows up only when in-kernel IPIs (instead
of SBI calls) are used for cache flushing and some of the tracers (or
debugging features) are enabled. With the tracers (or debug features)
disabled, we don't see any issue. In fact, the upstream defconfig
works perfectly fine with AIA drivers and in-kernel IPIs.

It seems AIA-based in-kernel IPIs are exposing some other issue
in the RISC-V kernel. I will debug more to find the root cause.
Regards,
Anup