From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-186.mta1.migadu.com (out-186.mta1.migadu.com [95.215.58.186]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5D021257AEC for ; Sun, 16 Nov 2025 23:59:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.186 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763337561; cv=none; b=FDTRJToEh9hJcsViEjGyoHFMxetHQqUe/aQZNAlaFaYxo96S//gVRQK+kQZZciGrqQXl/2WrDBrkoXKhuFDtgEljepQrF77vHNljTVeNOGbuNyE6N4zId0Zzt2qyrszXox5rg4qqOtdK4/Svtm+i+zZqwBdgX7AgVvhlWyRaCEE= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763337561; c=relaxed/simple; bh=owRlTfyjNUGj3FJG8WHshcCxvB6bodG1C9EFiiHJh2Q=; h=Message-ID:Date:MIME-Version:Subject:To:References:From: In-Reply-To:Content-Type; b=qeZVT5CJqyqEedEp4+QM+bG9rUZNZVzpca85m77+UEte3W7oCB568ja6KAH/BKqrJFesBQXt2zj4PPC+UcXKdse+jzdiouAiAyspKpH/C4VcclJtnvCY/kYIyU1ND9o38kz1/+KCTQ7/5xe7H/b7JvzRMV/j7Vx3PDuXIvOuVm4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=bIMS4q5D; arc=none smtp.client-ip=95.215.58.186 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="bIMS4q5D" Message-ID: <68734d1a-b77e-495e-922a-7fce3a122789@linux.dev> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1763337557; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FyLZMf/M+Nzf8e9DuRWlFufH1YaG+RKQIkKRO+as0dg=; b=bIMS4q5DHuJsunzmx3jovjmr4DncqQmYDJnVZGjUxPqhObWH9G2uIfS9GeNnnQqpQNk+X0 U6fwcIer7QZafRgs4ANja8HLvb+eBmEpEfITVoCe8e4wKYaaJfJE1dyQdXWbTMKTBlfRzy yEI1hK6OA/kNzdBj3+ASU4LHDm9P+zs= Date: Mon, 17 Nov 2025 07:58:57 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Subject: Re: [PATCH V2] hrtimer: Check running timer state To: Thomas Gleixner , anna-maria@linutronix.de, frederic@kernel.org, linux-kernel@vger.kernel.org, enlin.mu@unisoc.com References: <20251114114230.3802-1-enlin.mu@linux.dev> <87o6p4bgzn.ffs@tglx> Content-Language: en-US X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: "enlin.mu" In-Reply-To: <87o6p4bgzn.ffs@tglx> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Migadu-Flow: FLOW_OUT On 2025/11/15 02:34, Thomas Gleixner wrote: > On Fri, Nov 14 2025 at 19:42, Enlin Mu wrote: >> When the running timer is not NULL, print debugging information. > > What for? > >> The test code is roughly as follows: >> >> static struct hrtimer serial_timer; >> enum hrtimer_restart serial_timer_handler(struct hrtimer * timer) >> { >> local_irq_disable(); > > What is this for? Hi Thomas This code comes from my colleauge who is not familiar with the running machanism of hrtimer, which is why there is this logical error. > >> ...... >> do_someting(); >> copy_data_with_dma(); >> ...... >> hrtimer_forward_now(*serial_timer, ns_to_ktime(1000*2000)); >> local_irq_enable(); > > And this? > >> return HRTIMER_RESTART; >> } >> >> static int serial_start(struct uart_port *port) >> { >> ...... >> ...... >> hrtime_init(&serial_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); > > That function does not exist. > >> ktime = no_to_ktime(1000*2000); >> serial_timer.function = serial_timer_handler; >> hrtimer_start(&serial_timer, ktime, HRTIMER_MODE_REL); >> ...... >> return 0; >> } >> >> static void serial_shutdown(struct uart_port *port) >> { >> ...... >> hrtimer_cancle(&serial_timer); >> ...... >> serial_release_dma(port); >> ...... >> } > >> The cpu6 canceled serial_timer and released dma,but the >> serial_timer still ran many times on CPU7 until a panic occurred. >> The reason for the panic is that serial_timer accessed the >> released dma,But the serial_timer had been canceled for >> some time now on cpu6. > > You still fail to explain how the timer can still run after being > canceled. Hi Thomas The serial_timer on cpu6 has not been properly cancelled. The serial_timer is once again entered into the active queue on cpu7. --------------------------- on cpu7 step 1: preempted by hrtimer_usb static void __run_hrtimer(.....) { ...... base->running = timer; //timer is serial_timer; ..... ..... restart = fn(timer); // fn is serial_timer_handler .... ==> irq is enable. serial_timer is preempted by hrtimer_usb .... } step 2: hrtimer_usb return static void __run_hrtimer(.....) { ...... base->running = timer; //timer is hrtimer_usb; ..... ..... restart = fn(timer); .... } step 3: serial_timer continues to execute static void __run_hrtimer(.....) { ...... (restart != HRTIMER_NORESTART && timer->state & HRTIMER_STATE_ENQUEUED)) (timer, base, HRTIMER_MODE_ABS); //timer is serial_timer } step 4: serial_timer run again. ....... ........ step n: serial_timer run again. ....... step n+1: serial_timer run again. ....... panic from serial_timer; --------------------------- on cpu6 step 1: hrtimer_cancel step 2: hrtimer_try_to_cancel step 3: hrtimer_active return false, because running timer is hrtimer_usb on cpu7(step 2 on cpu7) step 4: hrtimer_cancel returns 0 step 5: serial_timer is added active queue on cpu 7 --------------------------- Due to English not being my native language, I am unable to explain this exception with a more complex description, and can only provide a simple step-by-step explanation. > >> The cpu6 can successfully cancel the serial_timer because the >> running timer has changed and it is another timer(such as >> hrtimer_usb). > > After that the timer _cannot_ be running anymore unless some other code > re-arms it afterwards. > >> When the serial_timer is enable to interrupt, the next hrtimer >> (such as hrtimer_usb) on cpu7 preempts the return of ther serial_timer, >> causing a change in the running timer. > > Then fix your timer callback. The callback is invoked in hard interrupt > context and the callback enables interrupts, which is a NONO. You > clearly never ran your code with lockdep enabled. It would have told you > so. > >> Signed-off-by: Enlin Mu >> Signed-off-by: Enlin Mu >> Signed-off-by: Enlin Mu > > Interesting Signed-off-by chain. Seems you're co-developing this patch > with your Alter ego. > > Thanks, > > tglx