From: Dave Hansen <dave.hansen@intel.com>
To: Filippo Sironi <sironi@amazon.de>, linux-kernel@vger.kernel.org
Cc: tony.luck@intel.com, bp@alien8.de, tglx@linutronix.de,
mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org,
hpa@zytor.com, linux-edac@vger.kernel.org
Subject: Re: [PATCH] x86/mce: Increase the size of the MCE pool from 2 to 8 pages
Date: Wed, 11 Oct 2023 10:32:44 -0700
Message-ID: <afaef377-25e0-49f6-a99f-3e5bd4b44f87@intel.com>
In-Reply-To: <20231011163320.79732-1-sironi@amazon.de>
On 10/11/23 09:33, Filippo Sironi wrote:
> On some of our large servers and some of our most sorry servers ( 🙂 ),
> we're seeing the kernel reporting the warning in mce_gen_pool_add: "MCE
> records pool full!". Let's increase the amount of memory that we use to
> store the MCE records from 2 to 8 pages to prevent this from happening
> and be able to collect useful information.
MCE_POOLSZ is used to size gen_pool_buf[], which sits just outside your
diff context:
> #define MCE_POOLSZ (2 * PAGE_SIZE)
>
> static struct gen_pool *mce_evt_pool;
> static LLIST_HEAD(mce_event_llist);
> static char gen_pool_buf[MCE_POOLSZ];
That's in .bss, which means it eats up memory for *everyone*. It seems a
little silly to eat up an extra 6 pages of memory for *everyone* in
order to get rid of a message on what I assume is a relatively small set
of "sorry servers".
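To put a number on that cost, here is a rough userspace sketch of the
arithmetic. It assumes 4 KiB pages; SKETCH_PAGE_SIZE and the helper
names are made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration: 4 KiB pages, as on typical x86 configs. */
#define SKETCH_PAGE_SIZE 4096u

/* Statically sized buffers like these are reserved whether or not any
 * error is ever recorded. */
static char pool_2_pages[2 * SKETCH_PAGE_SIZE];
static char pool_8_pages[8 * SKETCH_PAGE_SIZE];

/* Extra bytes each system pays when the pool grows from 2 to 8 pages. */
static size_t extra_bss_bytes(void)
{
	return sizeof(pool_8_pages) - sizeof(pool_2_pages);
}

/* Total extra bytes across a hypothetical fleet of identical systems. */
static unsigned long long fleet_cost_bytes(unsigned long long nr_systems)
{
	return (unsigned long long)extra_bss_bytes() * nr_systems;
}
```

Under that page-size assumption, extra_bss_bytes() comes out to 24576
bytes (6 pages), and the cost scales linearly with the number of
systems passed to fleet_cost_bytes().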
Is there any way that the size of the pool can be more automatically
determined? Is the likelihood of a bunch of errors proportional to the
number of CPUs or amount of RAM or some other aspect of the hardware?
Could the pool be emptied more aggressively so that it does not fill up?
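As one possible shape for "automatically determined", here is a minimal
userspace sketch that scales the pool with the CPU count and keeps a
floor for small systems. Everything in it is hypothetical: struct
pool_policy, its fields, and pool_bytes_for() are illustrative names,
and whether CPU count is even the right input is exactly the open
question above.

```c
#include <assert.h>
#include <stddef.h>

/* All fields are hypothetical tuning knobs, not existing kernel symbols. */
struct pool_policy {
	size_t record_size;     /* bytes per stored error record */
	size_t records_per_cpu; /* records budgeted for each CPU */
	size_t min_records;     /* floor so small systems keep a usable pool */
};

/* Return the pool size in bytes for a machine with nr_cpus CPUs. */
static size_t pool_bytes_for(const struct pool_policy *p, size_t nr_cpus)
{
	size_t records = nr_cpus * p->records_per_cpu;

	/* Never go below the floor, however few CPUs there are. */
	if (records < p->min_records)
		records = p->min_records;

	return records * p->record_size;
}
```

With made-up values of 128-byte records, 2 records per CPU and a
64-record floor, a 4-CPU machine would get 8192 bytes and a 256-CPU
machine 65536 bytes, so only the big systems pay for the bigger pool.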
Last, what is the _actual_ harm caused by missing this "useful
information"? Is collecting that information collectively really worth
24KB*NR_X86_SYSTEMS_ON_EARTH? Is it really that valuable to know that
the system got 4,000 ECC errors on a DIMM versus 1,000?
If there's no other choice and this extra information is *CRITICAL*,
then by all means let's enlarge the buffer. But, let's please do it for
a known, tangible benefit.