public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Anna-Maria Behnsen <anna-maria@linutronix.de>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	syzkaller-bugs@googlegroups.com
Subject: Re: [PATCH] timers: Annotate possible non critical data race of next_expiry
Date: Tue, 03 Sep 2024 08:55:50 +0200	[thread overview]
Message-ID: <87o755b4zt.fsf@somnus> (raw)
In-Reply-To: <ZtTo0wB_Jccoi0oM@pavilion.home>

Frederic Weisbecker <frederic@kernel.org> writes:

> Le Thu, Aug 29, 2024 at 05:43:05PM +0200, Anna-Maria Behnsen a écrit :
>> Global timers could be expired remotely when the target CPU is idle. After
>> a remote timer expiry, the remote timer_base->next_expiry value is updated
>> while holding the timer_base->lock. When the formerly idle CPU becomes
>> active at the same time and checks whether timers need to expire, this
>> check is done lockless as it is on the local CPU. This could lead to a data
>> race, which was reported by sysbot:
>> 
>>   https://lore.kernel.org/r/000000000000916e55061f969e14@google.com
>> 
>> When the value is read lockless but changed by the remote CPU, only two non
>> critical scenarios could happen:
>> 
>> 1) The already update value is read -> everything is perfect
>> 
>> 2) The old value is read -> a superfluous timer soft interrupt is raised
>> 
>> The same situation could happen when enqueueing a new first pinned timer by
>> a remote CPU also with non critical scenarios:
>> 
>> 1) The already update value is read -> everything is perfect
>> 
>> 2) The old value is read -> when the CPU is idle, an IPI is executed
>> nevertheless and when the CPU isn't idle, the updated value will be visible
>> on the next tick and the timer might be late one jiffie.
>> 
>> As this is very unlikely to happen, the overhead of doing the check under
>> the lock is a way more effort, than a superfluous timer soft interrupt or a
>> possible 1 jiffie delay of the timer.
>> 
>> Document and annotate this non critical behavior in the code by using
>> READ/WRITE_ONCE() pair when accessing timer_base->next_expiry.
>> 
>> Reported-by: syzbot+bf285fcc0a048e028118@syzkaller.appspotmail.com
>> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
>> Closes: https://lore.kernel.org/lkml/000000000000916e55061f969e14@google.com
>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
>
> Just a few nits:
>
>> ---
>>  kernel/time/timer.c | 41 ++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 36 insertions(+), 5 deletions(-)
>> 
>> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
>> index 18aa759c3cae..71b96a9bf6e8 100644
>> --- a/kernel/time/timer.c
>> +++ b/kernel/time/timer.c
>> @@ -672,7 +672,7 @@ static void enqueue_timer(struct timer_base *base, struct timer_list *timer,
>>  		 * Set the next expiry time and kick the CPU so it
>>  		 * can reevaluate the wheel:
>>  		 */
>> -		base->next_expiry = bucket_expiry;
>> +		WRITE_ONCE(base->next_expiry, bucket_expiry);
>>  		base->timers_pending = true;
>>  		base->next_expiry_recalc = false;
>>  		trigger_dyntick_cpu(base, timer);
>> @@ -1964,7 +1964,7 @@ static void next_expiry_recalc(struct timer_base *base)
>>  		clk += adj;
>>  	}
>>  
>> -	base->next_expiry = next;
>> +	WRITE_ONCE(base->next_expiry, next);
>>  	base->next_expiry_recalc = false;
>>  	base->timers_pending = !(next == base->clk + NEXT_TIMER_MAX_DELTA);
>>  }
>> @@ -2018,7 +2018,7 @@ static unsigned long next_timer_interrupt(struct timer_base *base,
>>  	 * easy comparable to find out which base holds the first pending timer.
>>  	 */
>>  	if (!base->timers_pending)
>> -		base->next_expiry = basej + NEXT_TIMER_MAX_DELTA;
>> +		WRITE_ONCE(base->next_expiry, basej + NEXT_TIMER_MAX_DELTA);
>>  
>>  	return base->next_expiry;
>>  }
>> @@ -2462,8 +2462,39 @@ static void run_local_timers(void)
>>  	hrtimer_run_queues();
>>  
>>  	for (int i = 0; i < NR_BASES; i++, base++) {
>> -		/* Raise the softirq only if required. */
>> -		if (time_after_eq(jiffies, base->next_expiry) ||
>> +		/*
>> +		 * Raise the softirq only if required.
>> +		 *
>> +		 * timer_base::next_expiry can be written by a remote CPU while
>> +		 * holding the lock. If this write happens at the same time than
>> +		 * the lockless local read, sanity checker could complain about
>> +		 * data corruption.
>> +		 *
>> +		 * There are two possible situations where
>> +		 * timer_base::next_expiry is written by a remote CPU:
>> +		 *
>> +		 * 1. Remote CPU expires global timers of this CPU and updates
>> +		 * timer_base::next_expiry of BASE_LOCAL afterwards in
>
> BASE_GLOBAL ?
>
>> +		 * next_timer_interrupt() or timer_recalc_next_expiry(). The
>> +		 * worst outcome is a superfluous raise of the timer softirq
>> +		 * when the not yet updated value is read.
>> +		 *
>> +		 * 2. A new first pinned timer is enqueued by a remote CPU and
>> +		 * therefore timer_base::next_expiry of BASE_GLOBAL is
>
> BASE_LOCAL ?

Thanks for the review. Yes you are right, those base names should be
switched...

> Thanks.

Thanks,

	Anna-Maria


  reply	other threads:[~2024-09-03  6:55 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-13 20:40 [syzbot] [kernel?] KCSAN: data-race in next_expiry_recalc / update_process_times (2) syzbot
2024-08-29 15:43 ` [PATCH] timers: Annotate possible non critical data race of next_expiry Anna-Maria Behnsen
2024-09-01 22:21   ` Frederic Weisbecker
2024-09-03  6:55     ` Anna-Maria Behnsen [this message]
2024-09-04  9:13   ` [PATCH v2] " Anna-Maria Behnsen
2024-09-04 10:08   ` [tip: timers/core] " tip-bot2 for Anna-Maria Behnsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o755b4zt.fsf@somnus \
    --to=anna-maria@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox