Kexec Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [bug report] PM / reboot: Eliminate race between reboot and suspend
@ 2020-11-18 18:59 Dan Carpenter
  2021-01-20  9:30 ` Baoquan He
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Carpenter @ 2020-11-18 18:59 UTC (permalink / raw)
  To: kernelfans; +Cc: kexec, linux-pm

Hello Pingfan Liu,

The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
and suspend" from Jul 31, 2018, leads to the following static checker
warning:

	kernel/power/main.c:27 lock_system_sleep()
	warn: called with lock held.  '&system_transition_mutex'

kernel/reboot.c
   345  
   346          mutex_lock(&system_transition_mutex);
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The patch changed the code to take this lock.

   347          switch (cmd) {
   348          case LINUX_REBOOT_CMD_RESTART:
   349                  kernel_restart(NULL);
   350                  break;
   351  
   352          case LINUX_REBOOT_CMD_CAD_ON:
   353                  C_A_D = 1;
   354                  break;
   355  
   356          case LINUX_REBOOT_CMD_CAD_OFF:
   357                  C_A_D = 0;
   358                  break;
   359  
   360          case LINUX_REBOOT_CMD_HALT:
   361                  kernel_halt();
   362                  do_exit(0);
   363                  panic("cannot halt");
   364  
   365          case LINUX_REBOOT_CMD_POWER_OFF:
   366                  kernel_power_off();
   367                  do_exit(0);
   368                  break;
   369  
   370          case LINUX_REBOOT_CMD_RESTART2:
   371                  ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
   372                  if (ret < 0) {
   373                          ret = -EFAULT;
   374                          break;
   375                  }
   376                  buffer[sizeof(buffer) - 1] = '\0';
   377  
   378                  kernel_restart(buffer);
   379                  break;
   380  
   381  #ifdef CONFIG_KEXEC_CORE
   382          case LINUX_REBOOT_CMD_KEXEC:
   383                  ret = kernel_kexec();
                        ^^^^^^^^^^^^^^^^^^^^
Called with lock held.

   384                  break;
   385  #endif

But kernel_kexec() also tries to take the &system_transition_mutex so
it will dead lock.

kernel/kexec_core.c
  1125  int kernel_kexec(void)
  1126  {
  1127          int error = 0;
  1128  
  1129          if (!mutex_trylock(&kexec_mutex))
  1130                  return -EBUSY;
  1131          if (!kexec_image) {
  1132                  error = -EINVAL;
  1133                  goto Unlock;
  1134          }
  1135  
  1136  #ifdef CONFIG_KEXEC_JUMP
  1137          if (kexec_image->preserve_context) {
  1138                  lock_system_sleep();
                        ^^^^^^^^^^^^^^^^^^^
Here.

  1139                  pm_prepare_console();
  1140                  error = freeze_processes();
  1141                  if (error) {
  1142                          error = -EBUSY;
  1143                          goto Restore_console;
  1144                  }
  1145                  suspend_console();
  1146                  error = dpm_suspend_start(PMSG_FREEZE);
  1147                  if (error)
  1148                          goto Resume_console;
  1149                  /* At this point, dpm_suspend_start() has been called,
  1150                   * but *not* dpm_suspend_end(). We *must* call
  1151                   * dpm_suspend_end() now.  Otherwise, drivers for
  1152                   * some devices (e.g. interrupt controllers) become
  1153                   * desynchronized with the actual state of the
  1154                   * hardware at resume time, and evil weirdness ensues.
  1155                   */
  1156                  error = dpm_suspend_end(PMSG_FREEZE);
  1157                  if (error)
  1158                          goto Resume_devices;
  1159                  error = suspend_disable_secondary_cpus();
  1160                  if (error)

regards,
dan carpenter

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
  2020-11-18 18:59 [bug report] PM / reboot: Eliminate race between reboot and suspend Dan Carpenter
@ 2021-01-20  9:30 ` Baoquan He
  2021-01-21  9:10   ` Pingfan Liu
  0 siblings, 1 reply; 5+ messages in thread
From: Baoquan He @ 2021-01-20  9:30 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: dyoung, kexec, kernelfans, linux-pm

Hi,

On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> Hello Pingfan Liu,
> 
> The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> and suspend" from Jul 31, 2018, leads to the following static checker
> warning:
> 
> 	kernel/power/main.c:27 lock_system_sleep()
> 	warn: called with lock held.  '&system_transition_mutex'

This is a good finding. I think we can simply remove the lock/unlock
pair of system_transition_mutex in kernel_kexec() function. The dead
lock should be easily triggered, but it hasn't caused any failure report
because the feature 'kexec jump' is almost not used by anyone as far as
I know. We may need to find out who is using it and where it's used
through an inquiry. Before that, we can just remove the lock operation
inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks. 


diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 80905e5aa8ae..a0b6780740c8 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1134,7 +1134,6 @@ int kernel_kexec(void)
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
-		lock_system_sleep();
 		pm_prepare_console();
 		error = freeze_processes();
 		if (error) {
@@ -1197,7 +1196,6 @@ int kernel_kexec(void)
 		thaw_processes();
  Restore_console:
 		pm_restore_console();
-		unlock_system_sleep();
 	}
 #endif
 

> 
> kernel/reboot.c
>    345  
>    346          mutex_lock(&system_transition_mutex);
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The patch changed the code to take this lock.
> 
>    347          switch (cmd) {
>    348          case LINUX_REBOOT_CMD_RESTART:
>    349                  kernel_restart(NULL);
>    350                  break;
>    351  
>    352          case LINUX_REBOOT_CMD_CAD_ON:
>    353                  C_A_D = 1;
>    354                  break;
>    355  
>    356          case LINUX_REBOOT_CMD_CAD_OFF:
>    357                  C_A_D = 0;
>    358                  break;
>    359  
>    360          case LINUX_REBOOT_CMD_HALT:
>    361                  kernel_halt();
>    362                  do_exit(0);
>    363                  panic("cannot halt");
>    364  
>    365          case LINUX_REBOOT_CMD_POWER_OFF:
>    366                  kernel_power_off();
>    367                  do_exit(0);
>    368                  break;
>    369  
>    370          case LINUX_REBOOT_CMD_RESTART2:
>    371                  ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
>    372                  if (ret < 0) {
>    373                          ret = -EFAULT;
>    374                          break;
>    375                  }
>    376                  buffer[sizeof(buffer) - 1] = '\0';
>    377  
>    378                  kernel_restart(buffer);
>    379                  break;
>    380  
>    381  #ifdef CONFIG_KEXEC_CORE
>    382          case LINUX_REBOOT_CMD_KEXEC:
>    383                  ret = kernel_kexec();
>                         ^^^^^^^^^^^^^^^^^^^^
> Called with lock held.
> 
>    384                  break;
>    385  #endif
> 
> But kernel_kexec() also tries to take the &system_transition_mutex so
> it will dead lock.
> 
> kernel/kexec_core.c
>   1125  int kernel_kexec(void)
>   1126  {
>   1127          int error = 0;
>   1128  
>   1129          if (!mutex_trylock(&kexec_mutex))
>   1130                  return -EBUSY;
>   1131          if (!kexec_image) {
>   1132                  error = -EINVAL;
>   1133                  goto Unlock;
>   1134          }
>   1135  
>   1136  #ifdef CONFIG_KEXEC_JUMP
>   1137          if (kexec_image->preserve_context) {
>   1138                  lock_system_sleep();
>                         ^^^^^^^^^^^^^^^^^^^
> Here.
> 
>   1139                  pm_prepare_console();
>   1140                  error = freeze_processes();
>   1141                  if (error) {
>   1142                          error = -EBUSY;
>   1143                          goto Restore_console;
>   1144                  }
>   1145                  suspend_console();
>   1146                  error = dpm_suspend_start(PMSG_FREEZE);
>   1147                  if (error)
>   1148                          goto Resume_console;
>   1149                  /* At this point, dpm_suspend_start() has been called,
>   1150                   * but *not* dpm_suspend_end(). We *must* call
>   1151                   * dpm_suspend_end() now.  Otherwise, drivers for
>   1152                   * some devices (e.g. interrupt controllers) become
>   1153                   * desynchronized with the actual state of the
>   1154                   * hardware at resume time, and evil weirdness ensues.
>   1155                   */
>   1156                  error = dpm_suspend_end(PMSG_FREEZE);
>   1157                  if (error)
>   1158                          goto Resume_devices;
>   1159                  error = suspend_disable_secondary_cpus();
>   1160                  if (error)
> 
> regards,
> dan carpenter
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
  2021-01-20  9:30 ` Baoquan He
@ 2021-01-21  9:10   ` Pingfan Liu
  2021-01-21 14:42     ` Rafael J. Wysocki
  0 siblings, 1 reply; 5+ messages in thread
From: Pingfan Liu @ 2021-01-21  9:10 UTC (permalink / raw)
  To: Baoquan He; +Cc: Dave Young, Kexec Mailing List, Dan Carpenter, linux-pm

On Wed, Jan 20, 2021 at 5:30 PM Baoquan He <bhe@redhat.com> wrote:
>
> Hi,
>
> On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> > Hello Pingfan Liu,
> >
> > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> > and suspend" from Jul 31, 2018, leads to the following static checker
> > warning:
> >
> >       kernel/power/main.c:27 lock_system_sleep()
> >       warn: called with lock held.  '&system_transition_mutex'
>
> This is a good finding. I think we can simply remove the lock/unlock
> pair of system_transition_mutex in kernel_kexec() function. The dead
> lock should be easily triggered, but it hasn't caused any failure report
> because the feature 'kexec jump' is almost not used by anyone as far as
> I know. We may need to find out who is using it and where it's used
> through an inquiry. Before that, we can just remove the lock operation
> inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
>
>
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 80905e5aa8ae..a0b6780740c8 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1134,7 +1134,6 @@ int kernel_kexec(void)
>
>  #ifdef CONFIG_KEXEC_JUMP
>         if (kexec_image->preserve_context) {
> -               lock_system_sleep();
>                 pm_prepare_console();
>                 error = freeze_processes();
>                 if (error) {
> @@ -1197,7 +1196,6 @@ int kernel_kexec(void)
>                 thaw_processes();
>   Restore_console:
>                 pm_restore_console();
> -               unlock_system_sleep();

This should work since the only caller syscall_reboot has already
placed kernel_kexec() under the protection of system_transition_mutex.

Thanks for the fix.

Reviewed-by: Pingfan Liu <kernelfans@gmail.com>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
  2021-01-21  9:10   ` Pingfan Liu
@ 2021-01-21 14:42     ` Rafael J. Wysocki
  2021-01-22  7:38       ` Baoquan He
  0 siblings, 1 reply; 5+ messages in thread
From: Rafael J. Wysocki @ 2021-01-21 14:42 UTC (permalink / raw)
  To: Pingfan Liu, Baoquan He
  Cc: Dave Young, Kexec Mailing List, Dan Carpenter, Linux PM

On Thu, Jan 21, 2021 at 10:14 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> On Wed, Jan 20, 2021 at 5:30 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > Hi,
> >
> > On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> > > Hello Pingfan Liu,
> > >
> > > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> > > and suspend" from Jul 31, 2018, leads to the following static checker
> > > warning:
> > >
> > >       kernel/power/main.c:27 lock_system_sleep()
> > >       warn: called with lock held.  '&system_transition_mutex'
> >
> > This is a good finding. I think we can simply remove the lock/unlock
> > pair of system_transition_mutex in kernel_kexec() function. The dead
> > lock should be easily triggered, but it hasn't caused any failure report
> > because the feature 'kexec jump' is almost not used by anyone as far as
> > I know. We may need to find out who is using it and where it's used
> > through an inquiry. Before that, we can just remove the lock operation
> > inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
> >
> >
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 80905e5aa8ae..a0b6780740c8 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -1134,7 +1134,6 @@ int kernel_kexec(void)
> >
> >  #ifdef CONFIG_KEXEC_JUMP
> >         if (kexec_image->preserve_context) {
> > -               lock_system_sleep();
> >                 pm_prepare_console();
> >                 error = freeze_processes();
> >                 if (error) {
> > @@ -1197,7 +1196,6 @@ int kernel_kexec(void)
> >                 thaw_processes();
> >   Restore_console:
> >                 pm_restore_console();
> > -               unlock_system_sleep();
>
> This should work since the only caller syscall_reboot has already
> placed kernel_kexec() under the protection of system_transition_mutex.
>
> Thanks for the fix.
>
> Reviewed-by: Pingfan Liu <kernelfans@gmail.com>

OK, so can anyone please submit that patch formally (Cc linux-pm, please)?

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
  2021-01-21 14:42     ` Rafael J. Wysocki
@ 2021-01-22  7:38       ` Baoquan He
  0 siblings, 0 replies; 5+ messages in thread
From: Baoquan He @ 2021-01-22  7:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Kexec Mailing List, Dave Young, Dan Carpenter, Pingfan Liu,
	Linux PM

On 01/21/21 at 03:42pm, Rafael J. Wysocki wrote:
> On Thu, Jan 21, 2021 at 10:14 AM Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > On Wed, Jan 20, 2021 at 5:30 PM Baoquan He <bhe@redhat.com> wrote:
> > >
> > > Hi,
> > >
> > > On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> > > > Hello Pingfan Liu,
> > > >
> > > > The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> > > > and suspend" from Jul 31, 2018, leads to the following static checker
> > > > warning:
> > > >
> > > >       kernel/power/main.c:27 lock_system_sleep()
> > > >       warn: called with lock held.  '&system_transition_mutex'
> > >
> > > This is a good finding. I think we can simply remove the lock/unlock
> > > pair of system_transition_mutex in kernel_kexec() function. The dead
> > > lock should be easily triggered, but it hasn't caused any failure report
> > > because the feature 'kexec jump' is almost not used by anyone as far as
> > > I know. We may need to find out who is using it and where it's used
> > > through an inquiry. Before that, we can just remove the lock operation
> > > inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
> > >
> > >
> > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > > index 80905e5aa8ae..a0b6780740c8 100644
> > > --- a/kernel/kexec_core.c
> > > +++ b/kernel/kexec_core.c
> > > @@ -1134,7 +1134,6 @@ int kernel_kexec(void)
> > >
> > >  #ifdef CONFIG_KEXEC_JUMP
> > >         if (kexec_image->preserve_context) {
> > > -               lock_system_sleep();
> > >                 pm_prepare_console();
> > >                 error = freeze_processes();
> > >                 if (error) {
> > > @@ -1197,7 +1196,6 @@ int kernel_kexec(void)
> > >                 thaw_processes();
> > >   Restore_console:
> > >                 pm_restore_console();
> > > -               unlock_system_sleep();
> >
> > This should work since the only caller syscall_reboot has already
> > placed kernel_kexec() under the protection of system_transition_mutex.
> >
> > Thanks for the fix.
> >
> > Reviewed-by: Pingfan Liu <kernelfans@gmail.com>
> 
> OK, so can anyone please submit that patch formally (Cc linux-pm, please)?

I will submit a patch with Pingfan's ack, thanks.


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-01-22  7:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-11-18 18:59 [bug report] PM / reboot: Eliminate race between reboot and suspend Dan Carpenter
2021-01-20  9:30 ` Baoquan He
2021-01-21  9:10   ` Pingfan Liu
2021-01-21 14:42     ` Rafael J. Wysocki
2021-01-22  7:38       ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox