From: Baoquan He <bhe@redhat.com>
To: Dan Carpenter <dan.carpenter@oracle.com>
Cc: dyoung@redhat.com, kexec@lists.infradead.org,
kernelfans@gmail.com, linux-pm@vger.kernel.org
Subject: Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
Date: Wed, 20 Jan 2021 17:30:15 +0800 [thread overview]
Message-ID: <20210120093015.GE20161@MiWiFi-R3L-srv> (raw)
In-Reply-To: <20201118185917.GA433776@mwanda>
Hi,
On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> Hello Pingfan Liu,
>
> The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> and suspend" from Jul 31, 2018, leads to the following static checker
> warning:
>
> kernel/power/main.c:27 lock_system_sleep()
> warn: called with lock held. '&system_transition_mutex'
This is a good finding. I think we can simply remove the lock/unlock
pair of system_transition_mutex in kernel_kexec() function. The dead
lock should be easily triggered, but it hasn't caused any failure report
because the feature 'kexec jump' is almost not used by anyone as far as
I know. We may need to find out who is using it and where it's used
through an inquiry. Before that, we can just remove the lock operation
inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 80905e5aa8ae..a0b6780740c8 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1134,7 +1134,6 @@ int kernel_kexec(void)
#ifdef CONFIG_KEXEC_JUMP
if (kexec_image->preserve_context) {
- lock_system_sleep();
pm_prepare_console();
error = freeze_processes();
if (error) {
@@ -1197,7 +1196,6 @@ int kernel_kexec(void)
thaw_processes();
Restore_console:
pm_restore_console();
- unlock_system_sleep();
}
#endif
>
> kernel/reboot.c
> 345
> 346 mutex_lock(&system_transition_mutex);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The patch changed the code to take this lock.
>
> 347 switch (cmd) {
> 348 case LINUX_REBOOT_CMD_RESTART:
> 349 kernel_restart(NULL);
> 350 break;
> 351
> 352 case LINUX_REBOOT_CMD_CAD_ON:
> 353 C_A_D = 1;
> 354 break;
> 355
> 356 case LINUX_REBOOT_CMD_CAD_OFF:
> 357 C_A_D = 0;
> 358 break;
> 359
> 360 case LINUX_REBOOT_CMD_HALT:
> 361 kernel_halt();
> 362 do_exit(0);
> 363 panic("cannot halt");
> 364
> 365 case LINUX_REBOOT_CMD_POWER_OFF:
> 366 kernel_power_off();
> 367 do_exit(0);
> 368 break;
> 369
> 370 case LINUX_REBOOT_CMD_RESTART2:
> 371 ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
> 372 if (ret < 0) {
> 373 ret = -EFAULT;
> 374 break;
> 375 }
> 376 buffer[sizeof(buffer) - 1] = '\0';
> 377
> 378 kernel_restart(buffer);
> 379 break;
> 380
> 381 #ifdef CONFIG_KEXEC_CORE
> 382 case LINUX_REBOOT_CMD_KEXEC:
> 383 ret = kernel_kexec();
> ^^^^^^^^^^^^^^^^^^^^
> Called with lock held.
>
> 384 break;
> 385 #endif
>
> But kernel_kexec() also tries to take the &system_transition_mutex so
> it will dead lock.
>
> kernel/kexec_core.c
> 1125 int kernel_kexec(void)
> 1126 {
> 1127 int error = 0;
> 1128
> 1129 if (!mutex_trylock(&kexec_mutex))
> 1130 return -EBUSY;
> 1131 if (!kexec_image) {
> 1132 error = -EINVAL;
> 1133 goto Unlock;
> 1134 }
> 1135
> 1136 #ifdef CONFIG_KEXEC_JUMP
> 1137 if (kexec_image->preserve_context) {
> 1138 lock_system_sleep();
> ^^^^^^^^^^^^^^^^^^^
> Here.
>
> 1139 pm_prepare_console();
> 1140 error = freeze_processes();
> 1141 if (error) {
> 1142 error = -EBUSY;
> 1143 goto Restore_console;
> 1144 }
> 1145 suspend_console();
> 1146 error = dpm_suspend_start(PMSG_FREEZE);
> 1147 if (error)
> 1148 goto Resume_console;
> 1149 /* At this point, dpm_suspend_start() has been called,
> 1150 * but *not* dpm_suspend_end(). We *must* call
> 1151 * dpm_suspend_end() now. Otherwise, drivers for
> 1152 * some devices (e.g. interrupt controllers) become
> 1153 * desynchronized with the actual state of the
> 1154 * hardware at resume time, and evil weirdness ensues.
> 1155 */
> 1156 error = dpm_suspend_end(PMSG_FREEZE);
> 1157 if (error)
> 1158 goto Resume_devices;
> 1159 error = suspend_disable_secondary_cpus();
> 1160 if (error)
>
> regards,
> dan carpenter
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
WARNING: multiple messages have this Message-ID (diff)
From: Baoquan He <bhe@redhat.com>
To: Dan Carpenter <dan.carpenter@oracle.com>
Cc: kernelfans@gmail.com, kexec@lists.infradead.org,
linux-pm@vger.kernel.org, dyoung@redhat.com
Subject: Re: [bug report] PM / reboot: Eliminate race between reboot and suspend
Date: Wed, 20 Jan 2021 17:30:15 +0800 [thread overview]
Message-ID: <20210120093015.GE20161@MiWiFi-R3L-srv> (raw)
In-Reply-To: <20201118185917.GA433776@mwanda>
Hi,
On 11/18/20 at 09:59pm, Dan Carpenter wrote:
> Hello Pingfan Liu,
>
> The patch 55f2503c3b69: "PM / reboot: Eliminate race between reboot
> and suspend" from Jul 31, 2018, leads to the following static checker
> warning:
>
> kernel/power/main.c:27 lock_system_sleep()
> warn: called with lock held. '&system_transition_mutex'
This is a good finding. I think we can simply remove the lock/unlock
pair of system_transition_mutex in kernel_kexec() function. The dead
lock should be easily triggered, but it hasn't caused any failure report
because the feature 'kexec jump' is almost not used by anyone as far as
I know. We may need to find out who is using it and where it's used
through an inquiry. Before that, we can just remove the lock operation
inside CONFIG_KEXEC_JUMP ifdeffery scope. Thanks.
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 80905e5aa8ae..a0b6780740c8 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1134,7 +1134,6 @@ int kernel_kexec(void)
#ifdef CONFIG_KEXEC_JUMP
if (kexec_image->preserve_context) {
- lock_system_sleep();
pm_prepare_console();
error = freeze_processes();
if (error) {
@@ -1197,7 +1196,6 @@ int kernel_kexec(void)
thaw_processes();
Restore_console:
pm_restore_console();
- unlock_system_sleep();
}
#endif
>
> kernel/reboot.c
> 345
> 346 mutex_lock(&system_transition_mutex);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The patch changed the code to take this lock.
>
> 347 switch (cmd) {
> 348 case LINUX_REBOOT_CMD_RESTART:
> 349 kernel_restart(NULL);
> 350 break;
> 351
> 352 case LINUX_REBOOT_CMD_CAD_ON:
> 353 C_A_D = 1;
> 354 break;
> 355
> 356 case LINUX_REBOOT_CMD_CAD_OFF:
> 357 C_A_D = 0;
> 358 break;
> 359
> 360 case LINUX_REBOOT_CMD_HALT:
> 361 kernel_halt();
> 362 do_exit(0);
> 363 panic("cannot halt");
> 364
> 365 case LINUX_REBOOT_CMD_POWER_OFF:
> 366 kernel_power_off();
> 367 do_exit(0);
> 368 break;
> 369
> 370 case LINUX_REBOOT_CMD_RESTART2:
> 371 ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
> 372 if (ret < 0) {
> 373 ret = -EFAULT;
> 374 break;
> 375 }
> 376 buffer[sizeof(buffer) - 1] = '\0';
> 377
> 378 kernel_restart(buffer);
> 379 break;
> 380
> 381 #ifdef CONFIG_KEXEC_CORE
> 382 case LINUX_REBOOT_CMD_KEXEC:
> 383 ret = kernel_kexec();
> ^^^^^^^^^^^^^^^^^^^^
> Called with lock held.
>
> 384 break;
> 385 #endif
>
> But kernel_kexec() also tries to take the &system_transition_mutex so
> it will dead lock.
>
> kernel/kexec_core.c
> 1125 int kernel_kexec(void)
> 1126 {
> 1127 int error = 0;
> 1128
> 1129 if (!mutex_trylock(&kexec_mutex))
> 1130 return -EBUSY;
> 1131 if (!kexec_image) {
> 1132 error = -EINVAL;
> 1133 goto Unlock;
> 1134 }
> 1135
> 1136 #ifdef CONFIG_KEXEC_JUMP
> 1137 if (kexec_image->preserve_context) {
> 1138 lock_system_sleep();
> ^^^^^^^^^^^^^^^^^^^
> Here.
>
> 1139 pm_prepare_console();
> 1140 error = freeze_processes();
> 1141 if (error) {
> 1142 error = -EBUSY;
> 1143 goto Restore_console;
> 1144 }
> 1145 suspend_console();
> 1146 error = dpm_suspend_start(PMSG_FREEZE);
> 1147 if (error)
> 1148 goto Resume_console;
> 1149 /* At this point, dpm_suspend_start() has been called,
> 1150 * but *not* dpm_suspend_end(). We *must* call
> 1151 * dpm_suspend_end() now. Otherwise, drivers for
> 1152 * some devices (e.g. interrupt controllers) become
> 1153 * desynchronized with the actual state of the
> 1154 * hardware at resume time, and evil weirdness ensues.
> 1155 */
> 1156 error = dpm_suspend_end(PMSG_FREEZE);
> 1157 if (error)
> 1158 goto Resume_devices;
> 1159 error = suspend_disable_secondary_cpus();
> 1160 if (error)
>
> regards,
> dan carpenter
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>
next prev parent reply other threads:[~2021-01-20 9:30 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-18 18:59 [bug report] PM / reboot: Eliminate race between reboot and suspend Dan Carpenter
2020-11-18 18:59 ` Dan Carpenter
2021-01-20 9:30 ` Baoquan He [this message]
2021-01-20 9:30 ` Baoquan He
2021-01-21 9:10 ` Pingfan Liu
2021-01-21 9:10 ` Pingfan Liu
2021-01-21 14:42 ` Rafael J. Wysocki
2021-01-21 14:42 ` Rafael J. Wysocki
2021-01-22 7:38 ` Baoquan He
2021-01-22 7:38 ` Baoquan He
-- strict thread matches above, loose matches on Subject: below --
2019-11-19 6:18 Dan Carpenter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210120093015.GE20161@MiWiFi-R3L-srv \
--to=bhe@redhat.com \
--cc=dan.carpenter@oracle.com \
--cc=dyoung@redhat.com \
--cc=kernelfans@gmail.com \
--cc=kexec@lists.infradead.org \
--cc=linux-pm@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.