From: "enlin.mu" <enlin.mu@linux.dev>
To: Thomas Gleixner <tglx@linutronix.de>,
anna-maria@linutronix.de, frederic@kernel.org,
linux-kernel@vger.kernel.org, enlin.mu@unisoc.com
Subject: Re: [PATCH V2] hrtimer: Check running timer state
Date: Mon, 17 Nov 2025 07:58:57 +0800 [thread overview]
Message-ID: <68734d1a-b77e-495e-922a-7fce3a122789@linux.dev> (raw)
In-Reply-To: <87o6p4bgzn.ffs@tglx>
On 2025/11/15 02:34, Thomas Gleixner wrote:
> On Fri, Nov 14 2025 at 19:42, Enlin Mu wrote:
>> When the running timer is not NULL, print debugging information.
>
> What for?
>
>> The test code is roughly as follows:
>>
>> static struct hrtimer serial_timer;
>> enum hrtimer_restart serial_timer_handler(struct hrtimer * timer)
>> {
>> local_irq_disable();
>
> What is this for?
Hi Thomas
This code comes from my colleauge who is not familiar with the
running machanism of hrtimer, which is why there is this logical error.
>
>> ......
>> do_someting();
>> copy_data_with_dma();
>> ......
>> hrtimer_forward_now(*serial_timer, ns_to_ktime(1000*2000));
>> local_irq_enable();
>
> And this?
>
>> return HRTIMER_RESTART;
>> }
>>
>> static int serial_start(struct uart_port *port)
>> {
>> ......
>> ......
>> hrtime_init(&serial_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>
> That function does not exist.
>
>> ktime = no_to_ktime(1000*2000);
>> serial_timer.function = serial_timer_handler;
>> hrtimer_start(&serial_timer, ktime, HRTIMER_MODE_REL);
>> ......
>> return 0;
>> }
>>
>> static void serial_shutdown(struct uart_port *port)
>> {
>> ......
>> hrtimer_cancle(&serial_timer);
>> ......
>> serial_release_dma(port);
>> ......
>> }
>
>> The cpu6 canceled serial_timer and released dma,but the
>> serial_timer still ran many times on CPU7 until a panic occurred.
>> The reason for the panic is that serial_timer accessed the
>> released dma,But the serial_timer had been canceled for
>> some time now on cpu6.
>
> You still fail to explain how the timer can still run after being
> canceled.
Hi Thomas
The serial_timer on cpu6 has not been properly cancelled.
The serial_timer is once again entered into the active queue on cpu7.
---------------------------
on cpu7
step 1: preempted by hrtimer_usb
static void __run_hrtimer(.....)
{
......
base->running = timer; //timer is serial_timer;
.....
.....
restart = fn(timer); // fn is serial_timer_handler
.... ==> irq is enable. serial_timer is preempted by hrtimer_usb
....
}
step 2: hrtimer_usb return
static void __run_hrtimer(.....)
{
......
base->running = timer; //timer is hrtimer_usb;
.....
.....
restart = fn(timer);
....
}
step 3: serial_timer continues to execute
static void __run_hrtimer(.....)
{
......
(restart != HRTIMER_NORESTART &&
timer->state & HRTIMER_STATE_ENQUEUED))
(timer, base, HRTIMER_MODE_ABS); //timer is serial_timer
}
step 4: serial_timer run again.
.......
........
step n: serial_timer run again.
.......
step n+1: serial_timer run again.
.......
panic from serial_timer;
---------------------------
on cpu6
step 1:
hrtimer_cancel
step 2:
hrtimer_try_to_cancel
step 3:
hrtimer_active return false, because running timer is hrtimer_usb
on cpu7(step 2 on cpu7)
step 4:
hrtimer_cancel returns 0
step 5:
serial_timer is added active queue on cpu 7
---------------------------
Due to English not being my native language, I am unable to
explain this exception with a more complex description, and
can only provide a simple step-by-step explanation.
>
>> The cpu6 can successfully cancel the serial_timer because the
>> running timer has changed and it is another timer(such as
>> hrtimer_usb).
>
> After that the timer _cannot_ be running anymore unless some other code
> re-arms it afterwards.
>
>> When the serial_timer is enable to interrupt, the next hrtimer
>> (such as hrtimer_usb) on cpu7 preempts the return of ther serial_timer,
>> causing a change in the running timer.
>
> Then fix your timer callback. The callback is invoked in hard interrupt
> context and the callback enables interrupts, which is a NONO. You
> clearly never ran your code with lockdep enabled. It would have told you
> so.
>
>> Signed-off-by: Enlin Mu <enlin.mu@linux.dev>
>> Signed-off-by: Enlin Mu <enlin.mu@unisoc.com>
>> Signed-off-by: Enlin Mu <enlin.mu@linux.dev>
>
> Interesting Signed-off-by chain. Seems you're co-developing this patch
> with your Alter ego.
>
> Thanks,
>
> tglx
next prev parent reply other threads:[~2025-11-16 23:59 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-14 11:42 [PATCH V2] hrtimer: Check running timer state Enlin Mu
2025-11-14 18:34 ` Thomas Gleixner
2025-11-16 23:58 ` enlin.mu [this message]
2025-11-17 7:31 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=68734d1a-b77e-495e-922a-7fce3a122789@linux.dev \
--to=enlin.mu@linux.dev \
--cc=anna-maria@linutronix.de \
--cc=enlin.mu@unisoc.com \
--cc=frederic@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox