linux-kernel.vger.kernel.org archive mirror
* Re: Race condition when replacing the broadcast timer
       [not found] <042520850d394f0bb0004a226db63d0d@xiaomi.com>
@ 2024-06-27 11:26 ` Thomas Gleixner
  2024-06-28  1:59   ` [External Mail]Re: " 朱恺乾
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2024-06-27 11:26 UTC (permalink / raw)
  To: 朱恺乾, Daniel Lezcano
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	Anna-Maria Behnsen

On Wed, Jun 26 2024 at 02:17, 朱恺乾 wrote:
> We found a possible race condition when replacing the broadcast
> timer. Here is how the race happens:

> 1. In thread 0, ___tick_broadcast_oneshot_control() is updating
> next_event on timer 0, the current broadcast timer.

> 2. In thread 1, tick_install_broadcast_device() is about to replace
> timer 0 with a new timer 1.

> 3. If thread 0 fetches the broadcast timer first, it gets the old
> timer (timer 0). After thread 1 shuts the old timer down and marks it
> as detached, thread 0 can still re-enable the old timer, which now has
> a noop handler, if it runs later than thread 1.

> 4. As the old timer is bound to a CPU, the kernel fails at
> clockevents.c:653 when that CPU is unplugged.
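
The interleaving in steps 1-3 can be replayed as a small single-threaded
userspace model. This is only a sketch under stated assumptions:
model_dev, model_install(), model_program() and model_replay() are
made-up names, the device states are reduced to three values, and the
"serialized" branch simply runs the lookup and the programming as one
unit after the replacement, which is what a common lock around both
sides would enforce. It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of device states for the model. */
enum dev_state { DEV_DETACHED, DEV_SHUTDOWN, DEV_ONESHOT };

struct model_dev {
	enum dev_state state;
};

/* Stand-in for tick_broadcast_device.evtdev. */
static struct model_dev *bc_dev;

/* "Thread 1": detach the current device and install the new one. */
static void model_install(struct model_dev *newdev)
{
	struct model_dev *cur = bc_dev;

	if (cur)
		cur->state = DEV_DETACHED;
	bc_dev = newdev;
}

/* "Thread 0", second half: program the device it looked up earlier. */
static void model_program(struct model_dev *bc)
{
	assert(bc != NULL);
	bc->state = DEV_ONESHOT;
}

/*
 * Replay the steps in a fixed order and report whether the old device
 * (timer 0) is still detached at the end.
 */
bool model_replay(bool serialized)
{
	struct model_dev timer0 = { DEV_ONESHOT };
	struct model_dev timer1 = { DEV_SHUTDOWN };
	struct model_dev *bc;
	bool still_detached;

	bc_dev = &timer0;

	if (serialized) {
		/* Replacement completes before the lookup starts. */
		model_install(&timer1);
		bc = bc_dev;		/* sees timer 1 */
		model_program(bc);
	} else {
		bc = bc_dev;		/* thread 0 looks up timer 0 */
		model_install(&timer1);	/* thread 1 detaches timer 0 */
		model_program(bc);	/* thread 0 re-enables timer 0 */
	}

	still_detached = (timer0.state == DEV_DETACHED);
	bc_dev = NULL;			/* do not keep pointers to locals */
	return still_detached;
}
```

In this model, model_replay(false) returns false because timer 0 ends up
re-enabled after it was detached, while model_replay(true) returns true.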

Clearly tick_install_broadcast_device() lacks serialization.

The untested patch below should cure that.

Thanks,

        tglx
---
 kernel/time/clockevents.c    |   31 +++++++++++++++++++------------
 kernel/time/tick-broadcast.c |   36 ++++++++++++++++++++++--------------
 kernel/time/tick-internal.h  |    2 ++
 3 files changed, 43 insertions(+), 26 deletions(-)

--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -557,23 +557,14 @@ void clockevents_handle_noop(struct cloc
 {
 }
 
-/**
- * clockevents_exchange_device - release and request clock devices
- * @old:	device to release (can be NULL)
- * @new:	device to request (can be NULL)
- *
- * Called from various tick functions with clockevents_lock held and
- * interrupts disabled.
- */
-void clockevents_exchange_device(struct clock_event_device *old,
-				 struct clock_event_device *new)
+void __clockevents_exchange_device(struct clock_event_device *old,
+				   struct clock_event_device *new)
 {
 	/*
 	 * Caller releases a clock event device. We queue it into the
 	 * released list and do a notify add later.
 	 */
 	if (old) {
-		module_put(old->owner);
 		clockevents_switch_state(old, CLOCK_EVT_STATE_DETACHED);
 		list_move(&old->list, &clockevents_released);
 	}
@@ -585,6 +576,22 @@ void clockevents_exchange_device(struct
 }
 
 /**
+ * clockevents_exchange_device - release and request clock devices
+ * @old:	device to release (can be NULL)
+ * @new:	device to request (can be NULL)
+ *
+ * Called from various tick functions with clockevents_lock held and
+ * interrupts disabled.
+ */
+void clockevents_exchange_device(struct clock_event_device *old,
+				 struct clock_event_device *new)
+{
+	if (old)
+		module_put(old->owner);
+	__clockevents_exchange_device(old, new);
+}
+
+/**
  * clockevents_suspend - suspend clock devices
  */
 void clockevents_suspend(void)
@@ -650,7 +657,7 @@ void tick_cleanup_dead_cpu(int cpu)
 		if (cpumask_test_cpu(cpu, dev->cpumask) &&
 		    cpumask_weight(dev->cpumask) == 1 &&
 		    !tick_is_broadcast_device(dev)) {
-			BUG_ON(!clockevent_state_detached(dev));
+			WARN_ON(!clockevent_state_detached(dev));
 			list_del(&dev->list);
 		}
 	}
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -162,23 +162,31 @@ static bool tick_set_oneshot_wakeup_devi
  */
 void tick_install_broadcast_device(struct clock_event_device *dev, int cpu)
 {
-	struct clock_event_device *cur = tick_broadcast_device.evtdev;
+	struct clock_event_device *cur;
 
-	if (tick_set_oneshot_wakeup_device(dev, cpu))
-		return;
+	scoped_guard(raw_spinlock_irqsave, &tick_broadcast_lock) {
 
-	if (!tick_check_broadcast_device(cur, dev))
-		return;
+		if (tick_set_oneshot_wakeup_device(dev, cpu))
+			return;
 
-	if (!try_module_get(dev->owner))
-		return;
+		cur = tick_broadcast_device.evtdev;
+		if (!tick_check_broadcast_device(cur, dev))
+			return;
 
-	clockevents_exchange_device(cur, dev);
+		if (!try_module_get(dev->owner))
+			return;
+
+		__clockevents_exchange_device(cur, dev);
+		if (cur)
+			cur->event_handler = clockevents_handle_noop;
+		WRITE_ONCE(tick_broadcast_device.evtdev, dev);
+		if (!cpumask_empty(tick_broadcast_mask))
+			tick_broadcast_start_periodic(dev);
+	}
+
+	/* Module release must be outside of the lock */
 	if (cur)
-		cur->event_handler = clockevents_handle_noop;
-	tick_broadcast_device.evtdev = dev;
-	if (!cpumask_empty(tick_broadcast_mask))
-		tick_broadcast_start_periodic(dev);
+		module_put(cur->owner);
 
 	if (!(dev->features & CLOCK_EVT_FEAT_ONESHOT))
 		return;
@@ -1185,7 +1193,7 @@ int tick_broadcast_oneshot_active(void)
  */
 bool tick_broadcast_oneshot_available(void)
 {
-	struct clock_event_device *bc = tick_broadcast_device.evtdev;
+	struct clock_event_device *bc = READ_ONCE(tick_broadcast_device.evtdev);
 
 	return bc ? bc->features & CLOCK_EVT_FEAT_ONESHOT : false;
 }
@@ -1193,7 +1201,7 @@ bool tick_broadcast_oneshot_available(vo
 #else
 int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
 {
-	struct clock_event_device *bc = tick_broadcast_device.evtdev;
+	struct clock_event_device *bc = READ_ONCE(tick_broadcast_device.evtdev);
 
 	if (!bc || (bc->features & CLOCK_EVT_FEAT_HRTIMER))
 		return -EBUSY;
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -53,6 +53,8 @@ static inline void clockevent_set_state(
 }
 
 extern void clockevents_shutdown(struct clock_event_device *dev);
+extern void __clockevents_exchange_device(struct clock_event_device *old,
+					  struct clock_event_device *new);
 extern void clockevents_exchange_device(struct clock_event_device *old,
 					struct clock_event_device *new);
 extern void clockevents_switch_state(struct clock_event_device *dev,

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [External Mail]Re: Race condition when replacing the broadcast timer
  2024-06-27 11:26 ` Race condition when replacing the broadcast timer Thomas Gleixner
@ 2024-06-28  1:59   ` 朱恺乾
  2024-06-28  7:22     ` Daniel Lezcano
  0 siblings, 1 reply; 9+ messages in thread
From: 朱恺乾 @ 2024-06-28  1:59 UTC (permalink / raw)
  To: Thomas Gleixner, Daniel Lezcano
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	Anna-Maria Behnsen

Thanks for the fast reply.
May I know when there'll be a formal patch on the mainline?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [External Mail]Re: Race condition when replacing the broadcast timer
  2024-06-28  1:59   ` [External Mail]Re: " 朱恺乾
@ 2024-06-28  7:22     ` Daniel Lezcano
  2024-07-01  2:11       ` 朱恺乾
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Lezcano @ 2024-06-28  7:22 UTC (permalink / raw)
  To: 朱恺乾, Thomas Gleixner
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	Anna-Maria Behnsen

On 28/06/2024 03:59, 朱恺乾 wrote:
> Thanks for the fast reply.
> May I know when there'll be a formal patch on the mainline?

Do you confirm the patch fixes the issue ?



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [External Mail]Re: Race condition when replacing the broadcast timer
  2024-06-28  7:22     ` Daniel Lezcano
@ 2024-07-01  2:11       ` 朱恺乾
  2024-07-10 20:30         ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: 朱恺乾 @ 2024-07-01  2:11 UTC (permalink / raw)
  To: Daniel Lezcano, Thomas Gleixner, 张嘉伟
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	Anna-Maria Behnsen, 梁伟鹏,
	翁金飞

+Jiawei

Jiawei,
Please update here when you have the test result



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [External Mail]Re: Race condition when replacing the broadcast timer
  2024-07-01  2:11       ` 朱恺乾
@ 2024-07-10 20:30         ` Thomas Gleixner
  2024-07-29 11:44           ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2024-07-10 20:30 UTC (permalink / raw)
  To: 朱恺乾, Daniel Lezcano,
	张嘉伟
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	Anna-Maria Behnsen, 梁伟鹏,
	翁金飞

On Mon, Jul 01 2024 at 02:11, 朱恺乾 wrote:
> Jiawei,
> Please update here when you have the test result

Any update on this?

* RE: [External Mail]Re: Race condition when replacing the broadcast timer
  2024-07-10 20:30         ` Thomas Gleixner
@ 2024-07-29 11:44           ` Thomas Gleixner
  2024-08-12 14:19             ` [PATCH] tick/broadcast: Plug clockevents replacement race Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2024-07-29 11:44 UTC (permalink / raw)
  To: 朱恺乾, Daniel Lezcano,
	张嘉伟
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	Anna-Maria Behnsen, 梁伟鹏,
	翁金飞

On Wed, Jul 10 2024 at 22:30, Thomas Gleixner wrote:
> On Mon, Jul 01 2024 at 02:11, 朱恺乾 wrote:
>> Jiawei,
>> Please update here when you have the test result
>
> Any update on this?

Gentle reminder.

* [PATCH] tick/broadcast: Plug clockevents replacement race
  2024-07-29 11:44           ` Thomas Gleixner
@ 2024-08-12 14:19             ` Thomas Gleixner
  2024-09-25  9:45               ` Anna-Maria Behnsen
  2024-10-17 16:16               ` Frederic Weisbecker
  0 siblings, 2 replies; 9+ messages in thread
From: Thomas Gleixner @ 2024-08-12 14:19 UTC (permalink / raw)
  To: 朱恺乾, Daniel Lezcano,
	张嘉伟
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	Anna-Maria Behnsen, 梁伟鹏,
	翁金飞

朱恺乾 reported and decoded the following race condition when a broadcast
device is replaced:

CPUA					CPUB
 __tick_broadcast_oneshot_control()
   bc = tick_broadcast_device.evtdev;
					tick_install_broadcast_device(dev)
        				clockevents_exchange_device(cur, dev)
					   shutdown(cur);
					   detach(cur);
					   cur->handler = noop;
					   tick_broadcast_device.evtdev = dev;

  tick_broadcast_set_event(bc, next_event); <- FAIL: arms a detached device.

If the original broadcast device has a restricted interrupt affinity mask
and the last CPU in that mask goes offline then the BUG() in
tick_cleanup_dead_cpu() triggers because the clockevent device is not in
detached state.

The reason for this is that tick_install_broadcast_device() is not
serialized vs. tick broadcast operations.

The obvious cure is to serialize tick_install_broadcast_device() with
tick_broadcast_lock against a concurrent tick broadcast operation.

That requires splitting clockevents_exchange_device() into two parts, one
which does the exchange, shutdown and detach operation and the other which
drops the module reference count. This is required because the module
reference cannot be dropped while holding tick_broadcast_lock.

Let clockevents_exchange_device() do both operations as before, but let the
broadcast device code take the two step approach and do the device
exchange under tick_broadcast_lock and drop the module reference count
after releasing it.

Fixes: f8381cba04ba ("[PATCH] tick-management: broadcast functionality")
Reported-by: 朱恺乾 <zhukaiqian@xiaomi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/clockevents.c    |   33 ++++++++++++++++++++-------------
 kernel/time/tick-broadcast.c |   36 ++++++++++++++++++++++--------------
 kernel/time/tick-internal.h  |    2 ++
 3 files changed, 44 insertions(+), 27 deletions(-)

--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -557,34 +557,41 @@ void clockevents_handle_noop(struct cloc
 {
 }
 
-/**
- * clockevents_exchange_device - release and request clock devices
- * @old:	device to release (can be NULL)
- * @new:	device to request (can be NULL)
- *
- * Called from various tick functions with clockevents_lock held and
- * interrupts disabled.
- */
-void clockevents_exchange_device(struct clock_event_device *old,
-				 struct clock_event_device *new)
+void __clockevents_exchange_device(struct clock_event_device *old,
+				   struct clock_event_device *new)
 {
 	/*
 	 * Caller releases a clock event device. We queue it into the
 	 * released list and do a notify add later.
 	 */
 	if (old) {
-		module_put(old->owner);
 		clockevents_switch_state(old, CLOCK_EVT_STATE_DETACHED);
 		list_move(&old->list, &clockevents_released);
 	}
 
 	if (new) {
-		BUG_ON(!clockevent_state_detached(new));
+		WARN_ON(!clockevent_state_detached(new));
 		clockevents_shutdown(new);
 	}
 }
 
 /**
+ * clockevents_exchange_device - release and request clock devices
+ * @old:	device to release (can be NULL)
+ * @new:	device to request (can be NULL)
+ *
+ * Called from various tick functions with clockevents_lock held and
+ * interrupts disabled.
+ */
+void clockevents_exchange_device(struct clock_event_device *old,
+				 struct clock_event_device *new)
+{
+	__clockevents_exchange_device(old, new);
+	if (old)
+		module_put(old->owner);
+}
+
+/**
  * clockevents_suspend - suspend clock devices
  */
 void clockevents_suspend(void)
@@ -650,7 +657,7 @@ void tick_cleanup_dead_cpu(int cpu)
 		if (cpumask_test_cpu(cpu, dev->cpumask) &&
 		    cpumask_weight(dev->cpumask) == 1 &&
 		    !tick_is_broadcast_device(dev)) {
-			BUG_ON(!clockevent_state_detached(dev));
+			WARN_ON(!clockevent_state_detached(dev));
 			list_del(&dev->list);
 		}
 	}
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -162,23 +162,31 @@ static bool tick_set_oneshot_wakeup_devi
  */
 void tick_install_broadcast_device(struct clock_event_device *dev, int cpu)
 {
-	struct clock_event_device *cur = tick_broadcast_device.evtdev;
+	struct clock_event_device *cur;
 
-	if (tick_set_oneshot_wakeup_device(dev, cpu))
-		return;
+	scoped_guard(raw_spinlock_irqsave, &tick_broadcast_lock) {
 
-	if (!tick_check_broadcast_device(cur, dev))
-		return;
+		if (tick_set_oneshot_wakeup_device(dev, cpu))
+			return;
 
-	if (!try_module_get(dev->owner))
-		return;
+		cur = tick_broadcast_device.evtdev;
+		if (!tick_check_broadcast_device(cur, dev))
+			return;
 
-	clockevents_exchange_device(cur, dev);
+		if (!try_module_get(dev->owner))
+			return;
+
+		__clockevents_exchange_device(cur, dev);
+		if (cur)
+			cur->event_handler = clockevents_handle_noop;
+		WRITE_ONCE(tick_broadcast_device.evtdev, dev);
+		if (!cpumask_empty(tick_broadcast_mask))
+			tick_broadcast_start_periodic(dev);
+	}
+
+	/* Module release must be outside of the lock */
 	if (cur)
-		cur->event_handler = clockevents_handle_noop;
-	tick_broadcast_device.evtdev = dev;
-	if (!cpumask_empty(tick_broadcast_mask))
-		tick_broadcast_start_periodic(dev);
+		module_put(cur->owner);
 
 	if (!(dev->features & CLOCK_EVT_FEAT_ONESHOT))
 		return;
@@ -1209,7 +1217,7 @@ int tick_broadcast_oneshot_active(void)
  */
 bool tick_broadcast_oneshot_available(void)
 {
-	struct clock_event_device *bc = tick_broadcast_device.evtdev;
+	struct clock_event_device *bc = READ_ONCE(tick_broadcast_device.evtdev);
 
 	return bc ? bc->features & CLOCK_EVT_FEAT_ONESHOT : false;
 }
@@ -1217,7 +1225,7 @@ bool tick_broadcast_oneshot_available(vo
 #else
 int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
 {
-	struct clock_event_device *bc = tick_broadcast_device.evtdev;
+	struct clock_event_device *bc = READ_ONCE(tick_broadcast_device.evtdev);
 
 	if (!bc || (bc->features & CLOCK_EVT_FEAT_HRTIMER))
 		return -EBUSY;
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -53,6 +53,8 @@ static inline void clockevent_set_state(
 }
 
 extern void clockevents_shutdown(struct clock_event_device *dev);
+extern void __clockevents_exchange_device(struct clock_event_device *old,
+					  struct clock_event_device *new);
 extern void clockevents_exchange_device(struct clock_event_device *old,
 					struct clock_event_device *new);
 extern void clockevents_switch_state(struct clock_event_device *dev,
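
The two-step approach described in the changelog can be illustrated with a
small userspace model. This is only a sketch under stated assumptions: a
pthread mutex stands in for tick_broadcast_lock, a plain counter stands in for
the module reference count, and every name below (struct owner, struct device,
exchange_locked(), install_device(), arm_current()) is hypothetical and not
part of the patch. The real exchange also shuts the new device down, which is
simplified away here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the owning module and the clock event device. */
struct owner {
	int refcount;
};

struct device {
	struct owner *owner;
	int detached;
};

static pthread_mutex_t broadcast_lock = PTHREAD_MUTEX_INITIALIZER;
static struct device *broadcast_dev;	/* models tick_broadcast_device.evtdev */

/* Step 1: exchange and detach. Never touches the reference count. */
static void exchange_locked(struct device *old, struct device *new)
{
	if (old)
		old->detached = 1;
	if (new)
		new->detached = 0;	/* simplification of the real shutdown */
}

/* Models the reworked install path: exchange under the lock, put after it. */
static void install_device(struct device *new)
{
	struct device *cur;

	pthread_mutex_lock(&broadcast_lock);
	cur = broadcast_dev;
	new->owner->refcount++;		/* models try_module_get() */
	exchange_locked(cur, new);
	broadcast_dev = new;
	pthread_mutex_unlock(&broadcast_lock);

	/* Step 2: drop the old reference only after the lock is released. */
	if (cur)
		cur->owner->refcount--;	/* models module_put() */
}

/* Models a broadcast operation that looks up the device under the lock. */
static int arm_current(void)
{
	int armed_detached;

	pthread_mutex_lock(&broadcast_lock);
	armed_detached = broadcast_dev ? broadcast_dev->detached : 0;
	pthread_mutex_unlock(&broadcast_lock);

	return armed_detached;
}
```

Because both the lookup and the exchange happen under the same lock in this
model, arm_current() can only ever observe the currently installed device,
which is the property the patch aims for.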

* Re: [PATCH] tick/broadcast: Plug clockevents replacement race
  2024-08-12 14:19             ` [PATCH] tick/broadcast: Plug clockevents replacement race Thomas Gleixner
@ 2024-09-25  9:45               ` Anna-Maria Behnsen
  2024-10-17 16:16               ` Frederic Weisbecker
  1 sibling, 0 replies; 9+ messages in thread
From: Anna-Maria Behnsen @ 2024-09-25  9:45 UTC (permalink / raw)
  To: Thomas Gleixner, 朱恺乾, Daniel Lezcano,
	张嘉伟
  Cc: linux-kernel@vger.kernel.org, 王韬, 熊亮,
	isaacmanjarres@google.com, Frederic Weisbecker,
	梁伟鹏, 翁金飞

Thomas Gleixner <tglx@linutronix.de> writes:

> 朱恺乾 reported and decoded the following race condition when a broadcast
> device is replaced:
>
> CPUA					CPUB
>  __tick_broadcast_oneshot_control()
>    bc = tick_broadcast_device.evtdev;
> 					tick_install_broadcast_device(dev)
>         				clockevents_exchange_device(cur, dev)
> 					   shutdown(cur);
> 					   detach(cur);
> 					   cur->handler = noop;
> 					   tick_broadcast_device.evtdev = dev;
>
>   tick_broadcast_set_event(bc, next_event); <- FAIL: arms a detached device.
>
> If the original broadcast device has a restricted interrupt affinity mask
> and the last CPU in that mask goes offline then the BUG() in
> tick_cleanup_dead_cpu() triggers because the clockevent device is not in
> detached state.
>
> The reason for this is that tick_install_broadcast_device() is not
> serialized vs. tick broadcast operations.
>
> The obvious cure is to serialize tick_install_broadcast_device() with
> tick_broadcast_lock against a concurrent tick broadcast operation.
>
> That requires splitting clockevents_exchange_device() into two parts, one
> which does the exchange, shutdown and detach operation and the other which
> drops the module reference count. This is required because the module
> reference cannot be dropped while holding tick_broadcast_lock.
>
> Let clockevents_exchange_device() do both operations as before, but let the
> broadcast device code take the two step approach and do the device
> exchange under tick_broadcast_lock and drop the module reference count
> after releasing it.
>
> Fixes: f8381cba04ba ("[PATCH] tick-management: broadcast functionality")
> Reported-by: 朱恺乾 <zhukaiqian@xiaomi.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/time/clockevents.c    |   33 ++++++++++++++++++++-------------
>  kernel/time/tick-broadcast.c |   36 ++++++++++++++++++++++--------------
>  kernel/time/tick-internal.h  |    2 ++
>  3 files changed, 44 insertions(+), 27 deletions(-)
>
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -557,34 +557,41 @@ void clockevents_handle_noop(struct cloc

[...]

>  
>  /**
> + * clockevents_exchange_device - release and request clock devices
> + * @old:	device to release (can be NULL)
> + * @new:	device to request (can be NULL)
> + *
> + * Called from various tick functions with clockevents_lock held and
> + * interrupts disabled.

Can you please transform the comment into a lockdep annotation?

Thanks
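
As a rough illustration of the request above: in the kernel, such an
annotation would presumably be built from lockdep_assert_held() on
clockevents_lock plus lockdep_assert_irqs_disabled(), but the exact form is
not stated in this thread. The sketch below is a userspace model of the idea
only. All names in it (struct model_lock, model_assert_held(),
model_exchange_device() and so on) are hypothetical, and the check counts
violations where lockdep would emit a warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel lock; it only records whether it is held. */
struct model_lock {
	bool held;
};

static struct model_lock model_clockevents_lock;
static int model_violations;	/* counts calls made without the lock */

static void model_lock_acquire(struct model_lock *lock)
{
	assert(!lock->held);
	lock->held = true;
}

static void model_lock_release(struct model_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

/*
 * Stand-in for lockdep_assert_held(): lockdep would emit a warning, this
 * model merely counts the violation so that it can be inspected.
 */
static void model_assert_held(const struct model_lock *lock)
{
	if (!lock->held)
		model_violations++;
}

/* Models the exchange function with the comment turned into a check. */
static void model_exchange_device(void)
{
	model_assert_held(&model_clockevents_lock);
	/* The actual exchange work is omitted here. */
}
```

The point of the conversion is that the locking rule stops being a comment a
caller can overlook and becomes a check that fires at the call site.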

* Re: [PATCH] tick/broadcast: Plug clockevents replacement race
  2024-08-12 14:19             ` [PATCH] tick/broadcast: Plug clockevents replacement race Thomas Gleixner
  2024-09-25  9:45               ` Anna-Maria Behnsen
@ 2024-10-17 16:16               ` Frederic Weisbecker
  1 sibling, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2024-10-17 16:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: 朱恺乾, Daniel Lezcano,
	张嘉伟, linux-kernel@vger.kernel.org,
	王韬, 熊亮, isaacmanjarres@google.com,
	Anna-Maria Behnsen, 梁伟鹏,
	翁金飞

On Mon, Aug 12, 2024 at 04:19:48PM +0200, Thomas Gleixner wrote:
> 朱恺乾 reported and decoded the following race condition when a broadcast
> device is replaced:
> 
> CPUA					CPUB
>  __tick_broadcast_oneshot_control()
>    bc = tick_broadcast_device.evtdev;
> 					tick_install_broadcast_device(dev)
>         				clockevents_exchange_device(cur, dev)
> 					   shutdown(cur);
> 					   detach(cur);
> 					   cur->handler = noop;
> 					   tick_broadcast_device.evtdev = dev;
> 
>   tick_broadcast_set_event(bc, next_event); <- FAIL: arms a detached device.
> 
> If the original broadcast device has a restricted interrupt affinity mask
> and the last CPU in that mask goes offline then the BUG() in
> tick_cleanup_dead_cpu() triggers because the clockevent device is not in
> detached state.
> 
> The reason for this is that tick_install_broadcast_device() is not
> serialized vs. tick broadcast operations.
> 
> The obvious cure is to serialize tick_install_broadcast_device() with
> tick_broadcast_lock against a concurrent tick broadcast operation.
> 
> That requires splitting clockevents_exchange_device() into two parts, one
> which does the exchange, shutdown and detach operation and the other which
> drops the module reference count. This is required because the module
> reference cannot be dropped while holding tick_broadcast_lock.

The reason why the module reference can not be dropped while holding
tick_broadcast_lock is not obvious though. What can go wrong?

Thanks.

end of thread, other threads:[~2024-10-17 16:16 UTC | newest]

Thread overview: 9+ messages
     [not found] <042520850d394f0bb0004a226db63d0d@xiaomi.com>
2024-06-27 11:26 ` Race condition when replacing the broadcast timer Thomas Gleixner
2024-06-28  1:59   ` [External Mail]Re: " 朱恺乾
2024-06-28  7:22     ` Daniel Lezcano
2024-07-01  2:11       ` 朱恺乾
2024-07-10 20:30         ` Thomas Gleixner
2024-07-29 11:44           ` Thomas Gleixner
2024-08-12 14:19             ` [PATCH] tick/broadcast: Plug clockevents replacement race Thomas Gleixner
2024-09-25  9:45               ` Anna-Maria Behnsen
2024-10-17 16:16               ` Frederic Weisbecker
