From: Nikolay Borisov <nik.borisov@suse.com>
To: "Luck, Tony" <tony.luck@intel.com>,
Borislav Petkov <bp@alien8.de>,
"Li, Rongqing" <lirongqing@baidu.com>
Cc: Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
"x86@kernel.org" <x86@kernel.org>,
"H . Peter Anvin" <hpa@zytor.com>,
Yazen Ghannam <yazen.ghannam@amd.com>,
"Zhuo, Qiuxu" <qiuxu.zhuo@intel.com>,
Avadhut Naik <avadhut.naik@amd.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Subject: Re: Re: Re: Re: [External Mail] Re: [PATCH] x86/mce: Fix timer interval adjustment after logging a MCE event
Date: Tue, 13 Jan 2026 20:55:08 +0200
Message-ID: <be786e9a-0302-47be-b2a8-bfa4449c7ab7@suse.com>
In-Reply-To: <SJ1PR11MB6083F2650A8DB801F0EF26C8FC8EA@SJ1PR11MB6083.namprd11.prod.outlook.com>
On 13.01.26 at 20:53, Luck, Tony wrote:
>>> The comment in mce_timer_fn says to adjust the polling interval, but
>>> I notice the kernel log always shows an MCE log every 5 minutes. Is this
>>> normal?
>>
>> Use git annotate to figure out which patch added this comment and in context
>> of what and that'll tell you why.
>>
>> As to the 5 minutes, look at how the check interval gets established.
>
> Once upon a time the polling interval started out at 5 minutes, but the
> interval was halved each time an error was found (so interval went
> 150s, 75s, 37s, ... down to 1s). If no error was found, then the interval
> was doubled (going back up to 300s).
>
> This is described in the comment:
>
> /*
> * Alert userspace if needed. If we logged an MCE, reduce the polling
> * interval, otherwise increase the polling interval.
> */
>
> It seems that the kernel isn't doing that today. It polls at a fixed 300 seconds
> even though errors are being found and logged. Interestingly, the timestamps
> are 327.68 seconds apart, rather than 300 and change. So there is some strange
> stuff going on.
>
> I can reproduce here on an Icelake system. Booted with mce=no_cmci to force polling
> (and turned off BIOS firmware first mode). Injecting an error every 30 seconds I also see
> a constant 327 seconds between logs (multiple logs show up because my injection picks the memory
> channel "randomly", so there can be several banks with errors when polling happens).
>
> $ dmesg | grep 'Machine Check Event:'
> [ 662.632988] EDAC skx MC4: CPU 40: Machine Check Event: 0x0 Bank 13: 0x8c00014200800090
> [ 662.727377] EDAC skx MC6: CPU 40: Machine Check Event: 0x0 Bank 21: 0x8c0000c200800090
> [ 990.283484] EDAC skx MC4: CPU 121: Machine Check Event: 0x0 Bank 13: 0x8c00010200800090
> [ 990.378233] EDAC skx MC6: CPU 121: Machine Check Event: 0x0 Bank 21: 0x8c00014200800090
> [ 990.467199] EDAC skx MC0: CPU 3: Machine Check Event: 0x0 Bank 13: 0x8c00004200800090
> [ 1317.939260] EDAC skx MC4: CPU 122: Machine Check Event: 0x0 Bank 13: 0x8c00010200800090
> [ 1318.033721] EDAC skx MC6: CPU 122: Machine Check Event: 0x0 Bank 21: 0x8c00010200800090
> [ 1318.122612] EDAC skx MC0: CPU 14: Machine Check Event: 0x0 Bank 13: 0x8c00004200800090
> [ 1318.211507] EDAC skx MC2: CPU 14: Machine Check Event: 0x0 Bank 21: 0x8c00004200800090
> [ 1645.590773] EDAC skx MC4: CPU 129: Machine Check Event: 0x0 Bank 13: 0x8c00010200800090
> [ 1645.685153] EDAC skx MC6: CPU 129: Machine Check Event: 0x0 Bank 21: 0x8c00018200800090
> [ 1645.773744] EDAC skx MC0: CPU 100: Machine Check Event: 0x0 Bank 13: 0x8c00004200800090
>
> -Tony
>
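For reference, the halve-on-error / double-on-quiet scheme described in the quoted text can be sketched as below. This is only an illustration of the behaviour as Tony describes it, not the kernel's code: the helper name next_poll_interval() and the POLL_MIN_SECS / POLL_MAX_SECS constants are hypothetical, with the bounds taken from the "down to 1s" and "back up to 300s" figures above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bounds taken from the description above, not from the kernel. */
#define POLL_MIN_SECS 1UL	/* floor: "down to 1s" */
#define POLL_MAX_SECS 300UL	/* ceiling: the 5 minute starting interval */

/*
 * Hypothetical helper: return the next polling interval in seconds.
 * Halve the interval when the last poll logged an error, double it
 * otherwise, and clamp the result to [POLL_MIN_SECS, POLL_MAX_SECS].
 */
static unsigned long next_poll_interval(unsigned long cur_secs, bool logged_error)
{
	unsigned long next;

	assert(cur_secs >= POLL_MIN_SECS && cur_secs <= POLL_MAX_SECS);

	next = logged_error ? cur_secs / 2 : cur_secs * 2;

	if (next < POLL_MIN_SECS)
		next = POLL_MIN_SECS;
	if (next > POLL_MAX_SECS)
		next = POLL_MAX_SECS;

	return next;
}
```

Starting from 300 and passing logged_error = true on every poll gives 150, 75, 37, 18, 9, 4, 2, 1, which matches the sequence in the quoted text. With an error injected every 30 seconds, a poller following this scheme would not stay at the 5 minute interval.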
At this stage I think lirongqing's patch is ok, but in the long run (i.e.
tomorrow) I will send a patch that simply eliminates mce_notify_irq's
call in mce_timer_fn. I.e., that function should be called only from the
early notifier.
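On the 327.68 second spacing in the quoted dmesg output: one possible explanation, offered as an assumption and not something established in this message, is timer wheel granularity. The sketch below is a simplified model of the level layout documented in kernel/time/timer.c (each level 8 times coarser than the previous one, 64 buckets per level, expiry rounded up to the next bucket boundary). It assumes HZ=1000, which has not been checked against the config of the test system. The helper names wheel_level() and wheel_expiry() are hypothetical.

```c
#include <assert.h>

/* Assumed, not taken from this thread: HZ=1000, so 1 jiffy = 1 ms. */
#define HZ_ASSUMED	1000UL
#define LVL_CLK_SHIFT	3	/* each level is 8x coarser than the previous */
#define LVL_BITS	6	/* 64 buckets per level */
#define LVL_SIZE	(1UL << LVL_BITS)
#define LVL_DEPTH	9

/* First delta (in jiffies) that belongs to level n, for n >= 1. */
static unsigned long lvl_start(unsigned int n)
{
	return (LVL_SIZE - 1) << ((n - 1) * LVL_CLK_SHIFT);
}

/* Hypothetical helper: wheel level for a timer 'delta' jiffies in the future. */
static unsigned int wheel_level(unsigned long delta)
{
	unsigned int lvl = 0;

	while (lvl + 1 < LVL_DEPTH && delta >= lvl_start(lvl + 1))
		lvl++;
	return lvl;
}

/*
 * Hypothetical helper: effective expiry (in jiffies) of a timer armed at
 * 'now' for 'delta' jiffies, rounded up to the next bucket boundary of
 * its level.
 */
static unsigned long wheel_expiry(unsigned long now, unsigned long delta)
{
	unsigned int shift = wheel_level(delta) * LVL_CLK_SHIFT;

	return (((now + delta) >> shift) + 1) << shift;
}
```

Under these assumptions a 300 second timer (300000 jiffies) lands in level 5, where the granularity is 32768 jiffies. wheel_expiry(0, 300 * HZ_ASSUMED) returns 327680, and re-arming from that point returns 655360, so successive expiries are 327680 jiffies (327.68 seconds) apart. That is consistent with the logs, but it does not by itself explain why the interval is not being halved.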