From: Karandeep Chahal <kchahal@ddn.com>
To: linux-kernel@vger.kernel.org
Cc: mchehab@redhat.com
Subject: [PATCH] sb_edac.c, kernel linux-3.2-rc6.
Date: Fri, 23 Dec 2011 16:35:21 -0500
Message-ID: <4EF4F419.2080809@ddn.com>
While testing the Sandy Bridge EDAC module, I discovered a problem in
the way sb_edac registers itself for machine check notifications. The
symptoms include:
1. Injecting a machine check exception can hang the system for
10-15 seconds.
2. Removing and re-inserting the kernel module can cause a panic.
The system hangs for 10-15 seconds because the kernel
(notifier_call_chain) calls the sb_edac notifier 0xffffffff times
((u32)(-1)). This happens because sb_edac calls
atomic_notifier_chain_register once for each memory controller (MC),
passing the same static notifier_block structure every time, so the
same structure ends up registered twice. The patch fixes this by making
sure that sb_edac registers for machine check notifications only once,
in the module init function, and unregisters once, in sbridge_exit.
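To illustrate why a double registration can produce this behaviour, here is a minimal userspace sketch. It is not the kernel code: struct notifier_block is reduced to three fields, and chain_register, chain_call, demo_decoder, self_loop_after and calls_made are hypothetical names made up for this example. It assumes the registration helper does a plain priority-ordered insert with no duplicate check. Under that assumption, inserting the same block a second time leaves its next pointer aimed at itself, so a chain walk only stops when its call budget runs out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct notifier_block (assumption). */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb, unsigned long val,
			     void *data);
	struct notifier_block *next;
	int priority;
};

/* Priority-ordered insert with no duplicate check (assumed behaviour). */
void chain_register(struct notifier_block **nl, struct notifier_block *n)
{
	assert(n != NULL);
	while (*nl != NULL) {
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
}

/* Walk the chain, making at most `limit` calls; return the calls made. */
unsigned long chain_call(struct notifier_block *head, unsigned long limit)
{
	unsigned long calls = 0;
	struct notifier_block *nb = head;

	while (nb != NULL && calls < limit) {
		nb->notifier_call(nb, 0, NULL);
		calls++;
		nb = nb->next;
	}
	return calls;
}

/* Hypothetical decoder callback that does nothing. */
int demo_decoder(struct notifier_block *nb, unsigned long val, void *data)
{
	(void)nb;
	(void)val;
	(void)data;
	return 0;
}

/* Returns 1 if the block points to itself after registration. */
int self_loop_after(int register_twice)
{
	struct notifier_block nb = { demo_decoder, NULL, 0 };
	struct notifier_block *head = NULL;

	chain_register(&head, &nb);
	if (register_twice)
		chain_register(&head, &nb);
	return nb.next == &nb;
}

/* Returns how many callbacks run before the walk ends or hits `limit`. */
unsigned long calls_made(int register_twice, unsigned long limit)
{
	struct notifier_block nb = { demo_decoder, NULL, 0 };
	struct notifier_block *head = NULL;

	chain_register(&head, &nb);
	if (register_twice)
		chain_register(&head, &nb);
	return chain_call(head, limit);
}
```

With a single registration the walk makes one call and stops at the NULL terminator. With two registrations of the same block it consumes the whole budget, which is consistent with the (u32)(-1) call count described above. Note that a third registration of the same block would never return in this sketch, because the insert loop itself would spin on the self-loop.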
I am also copying Mauro Carvalho Chehab (maintainer of sb_edac) for
review of the patch.
Cheers,
Karan
--- linux-3.2-rc6/drivers/edac/sb_edac.c 2011-12-16 21:36:26.000000000 -0500
+++ linux-3.2-rc6-new/drivers/edac/sb_edac.c 2011-12-23 14:54:57.000000000 -0500
@@ -1661,9 +1661,6 @@
 	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
 		__func__, mci, &sbridge_dev->pdev[0]->dev);
-	atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
-					 &sbridge_mce_dec);
-
 	/* Remove MC sysfs nodes */
 	edac_mc_del_mc(mci->dev);
@@ -1731,8 +1728,6 @@
 		goto fail0;
 	}
-	atomic_notifier_chain_register(&x86_mce_decoder_chain,
-				       &sbridge_mce_dec);
 	return 0;
 fail0:
@@ -1861,8 +1856,11 @@
 	pci_rc = pci_register_driver(&sbridge_driver);
-	if (pci_rc >= 0)
+	if (pci_rc >= 0) {
+		atomic_notifier_chain_register(&x86_mce_decoder_chain,
+					       &sbridge_mce_dec);
 		return 0;
+	}
 	sbridge_printk(KERN_ERR, "Failed to register device with error %d.\n",
 		       pci_rc);
@@ -1877,6 +1875,9 @@
 static void __exit sbridge_exit(void)
 {
 	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
+					 &sbridge_mce_dec);
+
 	pci_unregister_driver(&sbridge_driver);
 }
[-- Attachment #3: README --]
[-- Type: text/plain, Size: 186 bytes --]
[PATCH] sb_edac.c, kernel linux-3.2-rc6.
Karandeep Chahal <karandeepchahal@gmail.com>
This sb_edac patch fixes incorrect registration with the machine check notifier chain on Sandy Bridge.