From: Karandeep Chahal <kchahal@ddn.com>
To: linux-kernel@vger.kernel.org
Cc: mchehab@redhat.com
Subject: [PATCH] sb_edac.c, kernel linux-3.2-rc6.
Date: Fri, 23 Dec 2011 16:35:21 -0500
Message-ID: <4EF4F419.2080809@ddn.com>
[-- Attachment #1: Type: text/plain, Size: 2157 bytes --]
While testing the Sandy Bridge EDAC module, I discovered a problem in the
way sb_edac registers itself for machine check notifications. The
symptoms include:
1. Injecting a machine check exception can cause the system to hang for
   10-15 seconds.
2. Removing and re-inserting the kernel module can cause a kernel panic.
The system hangs for 10-15 seconds because the kernel
(notifier_call_chain) calls the sb_edac notifier 0xffffffff times
((u32)(-1)) for a single event. This happens because sb_edac calls
atomic_notifier_chain_register once for each memory controller (MC),
each time with the same static notifier_block structure
(sbridge_mce_dec), so on the test system the same structure is
registered twice. The patch fixes the problem by registering for
machine check notifications only once, at module init, and
unregistering once at module exit.
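
The effect of registering the same block twice can be reproduced in user
space. What follows is a minimal sketch, not kernel code. It assumes that
the insertion loop in chain_register() matches notifier_chain_register()
in kernel/notifier.c for this kernel version, and it reduces struct
notifier_block to the fields that loop needs. The names chain_register,
chain_call and events_seen, and the calls counter, are hypothetical. The
cap argument stands in for the kernel's nr_to_call limit.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	struct notifier_block *next;
	int priority;
	int calls;	/* hypothetical counter, not in the kernel struct */
};

/* Priority-ordered insert, assumed to mirror notifier_chain_register(). */
static void chain_register(struct notifier_block **nl,
			   struct notifier_block *n)
{
	assert(n != NULL);
	while (*nl != NULL) {
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;	/* if nl == &n->next, this makes n point to itself */
}

/* Walk the chain once, giving up after cap callbacks. */
static unsigned long chain_call(struct notifier_block *nb, unsigned long cap)
{
	unsigned long count = 0;

	while (nb != NULL && count < cap) {
		nb->calls++;
		count++;
		nb = nb->next;
	}
	return count;
}

/*
 * Register one block once or twice (as with one or two MCs) and return
 * how many callbacks a single event triggers, capped at cap.
 * A third registration is not attempted: it would never leave the
 * insertion loop once the chain contains a cycle.
 */
static unsigned long events_seen(int register_twice, unsigned long cap)
{
	struct notifier_block *head = NULL;
	struct notifier_block dec = { NULL, 0, 0 };

	chain_register(&head, &dec);
	if (register_twice)
		chain_register(&head, &dec);
	return chain_call(head, cap);
}
```

With a single registration, events_seen(0, 1000) reports one callback.
With two registrations the block's next pointer refers to itself, so
events_seen(1, 1000) runs until the cap is reached. In the kernel the only
bound is the nr_to_call counter, which is consistent with the 0xffffffff
calls observed.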
I am also copying Mauro Carvalho Chehab (maintainer of sb_edac) for
review of the patch.
Cheers,
Karan
--- linux-3.2-rc6/drivers/edac/sb_edac.c 2011-12-16 21:36:26.000000000 -0500
+++ linux-3.2-rc6-new/drivers/edac/sb_edac.c 2011-12-23 14:54:57.000000000 -0500
@@ -1661,9 +1661,6 @@
debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
__func__, mci, &sbridge_dev->pdev[0]->dev);
- atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
- &sbridge_mce_dec);
-
/* Remove MC sysfs nodes */
edac_mc_del_mc(mci->dev);
@@ -1731,8 +1728,6 @@
goto fail0;
}
- atomic_notifier_chain_register(&x86_mce_decoder_chain,
- &sbridge_mce_dec);
return 0;
fail0:
@@ -1861,8 +1856,11 @@
pci_rc = pci_register_driver(&sbridge_driver);
- if (pci_rc >= 0)
+ if (pci_rc >= 0) {
+ atomic_notifier_chain_register(&x86_mce_decoder_chain,
+ &sbridge_mce_dec);
return 0;
+ }
sbridge_printk(KERN_ERR, "Failed to register device with error %d.\n",
pci_rc);
@@ -1877,6 +1875,9 @@
static void __exit sbridge_exit(void)
{
debugf2("MC: " __FILE__ ": %s()\n", __func__);
+ atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
+ &sbridge_mce_dec);
+
pci_unregister_driver(&sbridge_driver);
}
[-- Attachment #3: README --]
[-- Type: text/plain, Size: 186 bytes --]
[PATCH] sb_edac.c, kernel linux-3.2-rc6.
Karandeep Chahal <karandeepchahal@gmail.com>
This sb_edac patch fixes incorrect registration of the Sandy Bridge EDAC driver on the machine check notifier chain.