* Question on the scheduling of timer interrupt and FIO interrupt
From: wangwudi @ 2025-08-01 6:26 UTC
To: Marc Zyngier, Thomas Gleixner, linux-arm-kernel, linux-kernel
Cc: yangwei24, yaohongshi

Hi all,

When running some FIO tests on an ARM64 server (Kunpeng), frequent NVMe
interrupts occupy the CPU, and the CPU's hardirq load is 100%. The
arch_timer interrupt that feeds the watchdog cannot be serviced, which
triggers the hardlockup.

The GIC driver uses GICV3_PRIO_IRQ to set the same priority for the
arch_timer interrupt and the NVMe interrupts. The GIC spec says: "If, on a
particular CPU interface, multiple pending interrupts have the same
priority, and have sufficient priority for the interface to signal them to
the PE, it is IMPLEMENTATION DEFINED how the interface selects which
interrupt to signal."

Shall we consider setting a higher priority for the arch_timer interrupt to
fix this case?

Thanks for your help.
Wangwudi

* Re: Question on the scheduling of timer interrupt and FIO interrupt
From: Marc Zyngier @ 2025-08-01 12:01 UTC
To: wangwudi
Cc: Thomas Gleixner, linux-arm-kernel, linux-kernel, yangwei24, yaohongshi, Zenghui Yu

+ Zenghui, in case he has seen this before.

On Fri, 01 Aug 2025 07:26:20 +0100,
wangwudi <wangwudi@hisilicon.com> wrote:
>
> Hi all,
>
> When running some FIO tests on an ARM64 server (Kunpeng), frequent NVMe
> interrupts occupy the CPU, and the CPU's hardirq load is 100%. The
> arch_timer interrupt that feeds the watchdog cannot be serviced, which
> triggers the hardlockup.

I am extremely surprised that even with a screaming NVMe (or even
several of them), you end up in a situation where you don't have the
resource to take the timer interrupt.

> The GIC driver uses GICV3_PRIO_IRQ to set the same priority for the
> arch_timer interrupt and the NVMe interrupts. The GIC spec says: "If, on a
> particular CPU interface, multiple pending interrupts have the same
> priority, and have sufficient priority for the interface to signal them to
> the PE, it is IMPLEMENTATION DEFINED how the interface selects which
> interrupt to signal."
>
> Shall we consider setting a higher priority for the arch_timer interrupt to
> fix this case?

Linux only deals with two priorities: the normal interrupt priority,
and NMI, where the NMI can preempt any other interrupt. Obviously, we
don't want to make the timer an NMI, as it would break a lot of
things.

Which means that even if you were to give the timer a higher priority,
it should not be allowed to preempt any other interrupt. Which means
that you'd need to set the binary point so that both the NVMe and
timer priorities fall into the same preemption bucket.

But it also means that you now are eating into the few bits of
priority that we have, and that will cause problems with the NMI
priority. Also, how do you decide what interrupts should be of a
higher priority?

I find it surprising that your GIC doesn't have some form of
round-robin scheme to pick the next HPPI, because that's clearly a
fairness problem, and punting that on SW is pretty ugly.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
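
Editor's illustration of the "same preemption bucket" point above. This is
a minimal sketch, not kernel code. It assumes the binary point split in
which a binary point value N makes priority bits [7:N+1] the group priority
and bits [N:0] the subpriority (the Non-secure view of the Group 1 binary
point may be offset from this). The EX_PRIO_* values and both helper
functions are hypothetical names chosen for the example; they are not the
values Linux programs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical priorities for illustration (lower value = higher priority). */
enum {
	EX_PRIO_NMI   = 0x80,
	EX_PRIO_TIMER = 0xc0,
	EX_PRIO_IRQ   = 0xd0,
};

/*
 * Group priority for binary point value bpr (0..7), assuming bits
 * [7:bpr+1] are the group priority and bits [bpr:0] the subpriority.
 */
static uint8_t group_priority(uint8_t prio, unsigned int bpr)
{
	assert(bpr <= 7);
	uint8_t sub_mask = (uint8_t)((1u << (bpr + 1)) - 1u);

	return prio & (uint8_t)~sub_mask;
}

/*
 * A pending interrupt may preempt an active one only if its group
 * priority is strictly higher, i.e. numerically lower.
 */
static bool can_preempt(uint8_t pending, uint8_t active, unsigned int bpr)
{
	return group_priority(pending, bpr) < group_priority(active, bpr);
}
```

With these example values, can_preempt(EX_PRIO_TIMER, EX_PRIO_IRQ, 3) is
true, because bit 4 is still part of the group priority. With a binary
point of 4 the timer and the ordinary interrupts share a group, so the timer
only wins arbitration between pending interrupts and never preempts, while
EX_PRIO_NMI still preempts both. The cost is the one described above: the
bit that separates the timer from other interrupts is no longer available
for distinguishing preemption levels.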

* Re: Question on the scheduling of timer interrupt and FIO interrupt
From: Zenghui Yu @ 2025-08-04 3:27 UTC
To: Marc Zyngier
Cc: wangwudi, Thomas Gleixner, linux-arm-kernel, linux-kernel, yangwei24, yaohongshi, Zenghui Yu

Hi Marc,

On 2025/8/1 20:01, Marc Zyngier wrote:
> + Zenghui, in case he has seen this before.

I haven't heard of it before.

> On Fri, 01 Aug 2025 07:26:20 +0100,
> wangwudi <wangwudi@hisilicon.com> wrote:
> >
> > Hi all,
> >
> > When running some FIO tests on an ARM64 server (Kunpeng), frequent NVMe
> > interrupts occupy the CPU, and the CPU's hardirq load is 100%. The
> > arch_timer interrupt that feeds the watchdog cannot be serviced, which
> > triggers the hardlockup.
>
> I am extremely surprised that even with a screaming NVMe (or even
> several of them), you end up in a situation where you don't have the
> resource to take the timer interrupt.

+1. I will probably have an offline discussion with Wudi today, or a bit
later, to dig out more clues about it.

> > The GIC driver uses GICV3_PRIO_IRQ to set the same priority for the
> > arch_timer interrupt and the NVMe interrupts. The GIC spec says: "If, on a
> > particular CPU interface, multiple pending interrupts have the same
> > priority, and have sufficient priority for the interface to signal them to
> > the PE, it is IMPLEMENTATION DEFINED how the interface selects which
> > interrupt to signal."
> >
> > Shall we consider setting a higher priority for the arch_timer interrupt to
> > fix this case?
>
> Linux only deals with two priorities: the normal interrupt priority,
> and NMI, where the NMI can preempt any other interrupt. Obviously, we
> don't want to make the timer an NMI, as it would break a lot of
> things.
>
> Which means that even if you were to give the timer a higher priority,
> it should not be allowed to preempt any other interrupt. Which means
> that you'd need to set the binary point so that both the NVMe and
> timer priorities fall into the same preemption bucket.
>
> But it also means that you now are eating into the few bits of
> priority that we have, and that will cause problems with the NMI
> priority. Also, how do you decide what interrupts should be of a
> higher priority?
>
> I find it surprising that your GIC doesn't have some form of
> round-robin scheme to pick the next HPPI, because that's clearly a
> fairness problem, and punting that on SW is pretty ugly.
>
> Thanks,
>
> 	M.
>
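
Editor's illustration of the fairness concern quoted above. This is a toy
model only: the thread does not say how the Kunpeng GIC selects between
pending interrupts of equal priority, so the "lowest line first" policy
below is an assumed example of an unfair IMPLEMENTATION DEFINED choice,
and every name here (EX_NR_IRQS, pick_fixed, pick_round_robin, and the
served_* helpers) is hypothetical. Each bit of `pending` stands for one
interrupt line that re-asserts as soon as it has been handled.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NR_IRQS 8

/* Fixed tie-break: always pick the lowest-numbered pending line. */
static int pick_fixed(uint8_t pending)
{
	for (int i = 0; i < EX_NR_IRQS; i++)
		if (pending & (1u << i))
			return i;
	return -1;
}

/* Round-robin tie-break: start searching just after the last line served. */
static int pick_round_robin(uint8_t pending, int last)
{
	assert(last >= 0 && last < EX_NR_IRQS);
	for (int n = 1; n <= EX_NR_IRQS; n++) {
		int i = (last + n) % EX_NR_IRQS;

		if (pending & (1u << i))
			return i;
	}
	return -1;
}

/* How often is `victim` served in `rounds` picks under the fixed policy? */
static int served_fixed(uint8_t pending, int victim, int rounds)
{
	int hits = 0;

	for (int r = 0; r < rounds; r++)
		if (pick_fixed(pending) == victim)
			hits++;
	return hits;
}

/* Same count under round-robin; stops early if nothing is pending. */
static int served_round_robin(uint8_t pending, int victim, int rounds)
{
	int hits = 0;
	int last = EX_NR_IRQS - 1;

	for (int r = 0; r < rounds; r++) {
		int next = pick_round_robin(pending, last);

		if (next < 0)
			break;
		if (next == victim)
			hits++;
		last = next;
	}
	return hits;
}
```

Taking line 0 as a constantly pending NVMe interrupt and line 5 as the
timer (pending = 0x21), served_fixed(0x21, 5, 100) returns 0 while
served_round_robin(0x21, 5, 100) returns 50. In this model the timer is
starved under the fixed policy without any change to priorities, and gets
every second slot under round-robin.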