public inbox for linux-edac@vger.kernel.org
* [bug report] x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
@ 2023-10-25  5:44 Dan Carpenter
  2023-10-25  6:19 ` Zhiquan Li
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Carpenter @ 2023-10-25  5:44 UTC (permalink / raw)
  To: zhiquan1.li; +Cc: linux-edac

Hello Zhiquan Li,

The patch 1d11b153d23b: "x86/mce: Mark fatal MCE's page as poison to
avoid panic in the kdump kernel" from Oct 23, 2023 (linux-next),
leads to the following Smatch static checker warning:

	arch/x86/kernel/cpu/mce/core.c:299 mce_panic()
	error: we previously assumed 'final' could be null (see line 281)

arch/x86/kernel/cpu/mce/core.c
    270         /* Now print uncorrected but with the final one last */
    271         llist_for_each_entry(l, pending, llnode) {
    272                 struct mce *m = &l->mce;
    273                 if (!(m->status & MCI_STATUS_UC))
    274                         continue;
    275                 if (!final || mce_cmp(m, final)) {
    276                         print_mce(m);
    277                         if (!apei_err)
    278                                 apei_err = apei_write_mce(m);
    279                 }
    280         }
    281         if (final) {
                    ^^^^^
This assumes final can be NULL

    282                 print_mce(final);
    283                 if (!apei_err)
    284                         apei_err = apei_write_mce(final);
    285         }
    286         if (exp)
    287                 pr_emerg(HW_ERR "Machine check: %s\n", exp);
    288         if (!fake_panic) {
    289                 if (panic_timeout == 0)
    290                         panic_timeout = mca_cfg.panic_timeout;
    291 
    292                 /*
    293                  * Kdump skips the poisoned page in order to avoid
    294                  * touching the error bits again. Poison the page even
    295                  * if the error is fatal and the machine is about to
    296                  * panic.
    297                  */
    298                 if (kexec_crash_loaded()) {
--> 299                         p = pfn_to_online_page(final->addr >> PAGE_SHIFT);
                                                       ^^^^^^^^^^^
Unchecked dereference

    300                         if (final && (final->status & MCI_STATUS_ADDRV) && p)
                                    ^^^^^
Checked too late

    301                                 SetPageHWPoison(p);
    302                 }
    303                 panic(msg);
    304         } else
    305                 pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
    306 
    307 out:
    308         instrumentation_end();
    309 }

regards,
dan carpenter


* Re: [bug report] x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
  2023-10-25  5:44 [bug report] x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel Dan Carpenter
@ 2023-10-25  6:19 ` Zhiquan Li
  2023-10-25  7:48   ` Borislav Petkov
  0 siblings, 1 reply; 3+ messages in thread
From: Zhiquan Li @ 2023-10-25  6:19 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: linux-edac



On 2023/10/25 13:44, Dan Carpenter wrote:
> Hello Zhiquan Li,
> 
> The patch 1d11b153d23b: "x86/mce: Mark fatal MCE's page as poison to
> avoid panic in the kdump kernel" from Oct 23, 2023 (linux-next),
> leads to the following Smatch static checker warning:
> 
> 	arch/x86/kernel/cpu/mce/core.c:299 mce_panic()
> 	error: we previously assumed 'final' could be null (see line 281)
> 
> arch/x86/kernel/cpu/mce/core.c
>     270         /* Now print uncorrected but with the final one last */
>     271         llist_for_each_entry(l, pending, llnode) {
>     272                 struct mce *m = &l->mce;
>     273                 if (!(m->status & MCI_STATUS_UC))
>     274                         continue;
>     275                 if (!final || mce_cmp(m, final)) {
>     276                         print_mce(m);
>     277                         if (!apei_err)
>     278                                 apei_err = apei_write_mce(m);
>     279                 }
>     280         }
>     281         if (final) {
>                     ^^^^^
> This assumes final can be NULL
> 
>     282                 print_mce(final);
>     283                 if (!apei_err)
>     284                         apei_err = apei_write_mce(final);
>     285         }
>     286         if (exp)
>     287                 pr_emerg(HW_ERR "Machine check: %s\n", exp);
>     288         if (!fake_panic) {
>     289                 if (panic_timeout == 0)
>     290                         panic_timeout = mca_cfg.panic_timeout;
>     291 
>     292                 /*
>     293                  * Kdump skips the poisoned page in order to avoid
>     294                  * touching the error bits again. Poison the page even
>     295                  * if the error is fatal and the machine is about to
>     296                  * panic.
>     297                  */
>     298                 if (kexec_crash_loaded()) {
> --> 299                         p = pfn_to_online_page(final->addr >> PAGE_SHIFT);
>                                                        ^^^^^^^^^^^
> Unchecked dereference
> 
>     300                         if (final && (final->status & MCI_STATUS_ADDRV) && p)
>                                     ^^^^^
> Checked too late
> 
>     301                                 SetPageHWPoison(p);

Nice catch!

This part should be changed like this:

+                       if (final && (final->status & MCI_STATUS_ADDRV)) {
+                               p = pfn_to_online_page(final->addr >> PAGE_SHIFT);
+                               if (p)
+                                       SetPageHWPoison(p);
+                       }

The assignment cannot be put into the "if" condition because the checkpatch.pl
script will complain about it, so it has to go on a separate line.
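
For illustration only, here is a minimal user-space sketch of the ordering
the report asks for: test 'final' and the ADDRV bit first, then look up the
page, then poison it, with the lookup and the poisoning kept inside one
braced block. The struct layouts, fake_pfn_to_online_page(), fake_pages[],
poison_final_page() and self_check() are hypothetical stand-ins invented for
this sketch, not the kernel's definitions, and the real mce_panic() may well
end up looking different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel definitions; assumptions, not the real ones. */
#define PAGE_SHIFT        12
#define MCI_STATUS_ADDRV  (1ULL << 58)  /* "address valid" bit of MCi_STATUS */
#define NR_FAKE_PAGES     4

struct page { bool hwpoison; };
struct mce  { uint64_t status; uint64_t addr; };

static struct page fake_pages[NR_FAKE_PAGES];

/* Hypothetical replacement for pfn_to_online_page(): NULL when pfn is out of range. */
static struct page *fake_pfn_to_online_page(uint64_t pfn)
{
	return pfn < NR_FAKE_PAGES ? &fake_pages[pfn] : NULL;
}

/*
 * Check 'final' and ADDRV before touching final->addr, and only test 'p'
 * after it has been assigned. Returns true if a page was marked.
 */
bool poison_final_page(const struct mce *final)
{
	struct page *p;

	if (final && (final->status & MCI_STATUS_ADDRV)) {
		p = fake_pfn_to_online_page(final->addr >> PAGE_SHIFT);
		if (p) {
			p->hwpoison = true;
			return true;
		}
	}
	return false;
}

/* Usage example: a NULL record is ignored, a valid one marks its page. */
void self_check(void)
{
	struct mce m = { .status = MCI_STATUS_ADDRV, .addr = 2ULL << PAGE_SHIFT };

	assert(!poison_final_page(NULL));
	assert(poison_final_page(&m));
	assert(fake_pages[2].hwpoison);
}
```

With this shape, a NULL 'final' or a record without a valid address never
reaches the dereference or the page lookup.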

I'll re-validate the patch and send V5 soon.
Thanks a lot, Dan.

Best Regards,
Zhiquan

>     302                 }
>     303                 panic(msg);
>     304         } else
>     305                 pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
>     306 
>     307 out:
>     308         instrumentation_end();
>     309 }
> 
> regards,
> dan carpenter


* Re: [bug report] x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
  2023-10-25  6:19 ` Zhiquan Li
@ 2023-10-25  7:48   ` Borislav Petkov
  0 siblings, 0 replies; 3+ messages in thread
From: Borislav Petkov @ 2023-10-25  7:48 UTC (permalink / raw)
  To: Zhiquan Li; +Cc: Dan Carpenter, linux-edac

On Wed, Oct 25, 2023 at 02:19:26PM +0800, Zhiquan Li wrote:
> +                       if (final && (final->status & MCI_STATUS_ADDRV)) {
> +                               p = pfn_to_online_page(final->addr >> PAGE_SHIFT);
> +                               if (p)
> +                                       SetPageHWPoison(p);
> +                       }
> 
> The assignment cannot be put into the "if" condition because the checkpatch.pl
> script will complain about it.

It doesn't complain here:

$ git diff | ./scripts/checkpatch.pl 
total: 0 errors, 0 warnings, 14 lines checked

Your patch has no obvious style problems and is ready for submission

> I'll re-validate the patch and send V5 soon.

When you do, please send it with my edits to the text.

Zapped from tip for now.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
