From: Karandeep Chahal <kchahal@ddn.com>
To: linux-kernel@vger.kernel.org
Cc: mchehab@redhat.com
Subject: [PATCH] sb_edac.c, kernel linux-3.2-rc6.
Date: Fri, 23 Dec 2011 16:35:21 -0500
Message-ID: <4EF4F419.2080809@ddn.com>

[-- Attachment #1: Type: text/plain, Size: 2157 bytes --]

While testing the Sandy Bridge EDAC module, I discovered a problem in the
way sb_edac registers itself for machine check notifications. The
symptoms include:
1. Injecting a machine check exception can cause the system to hang for
10-15 seconds.
2. Removing and re-inserting the kernel module can cause a kernel panic.

The system hangs for 10-15 seconds because the kernel
(notifier_call_chain) calls the sb_edac notifier 0xffffffff times,
i.e. (u32)(-1) times.

The problem occurs because sb_edac registers the same static
notifier_block structure twice: atomic_notifier_chain_register gets
called once for each memory controller (MC), each time with the same
structure. The patch fixes this by making sure that sb_edac registers
for machine check notifications only once, in sbridge_init, and
unregisters once, in sbridge_exit.
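
To illustrate the failure mode described above, here is a minimal
user-space sketch, not the kernel code itself. It assumes a simplified
priority-ordered insert modeled on the kernel's notifier chain
registration; the struct layout and the names chain_register,
chain_call, demo_handler and calls_after_registering are hypothetical
stand-ins. Under that assumption, registering one block twice leaves the
block's next pointer pointing at itself, so a chain walk only stops when
its call budget runs out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct notifier_block (assumption). */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb);
	struct notifier_block *next;
	int priority;
};

static unsigned int handler_calls;

/* Hypothetical handler: only counts how often it is invoked. */
static int demo_handler(struct notifier_block *nb)
{
	(void)nb;
	handler_calls++;
	return 0;
}

/* Priority-ordered insert with no check for an already-registered block. */
static void chain_register(struct notifier_block **nl, struct notifier_block *n)
{
	assert(n != NULL);
	while (*nl != NULL) {
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;	/* on a second registration this sets n->next = n */
}

/* Walk the chain, invoking at most `budget` callbacks. */
static void chain_call(struct notifier_block *head, unsigned int budget)
{
	struct notifier_block *nb = head;

	while (nb != NULL && budget != 0) {
		nb->notifier_call(nb);
		nb = nb->next;
		budget--;
	}
}

/* Register one block `times` times, walk the chain, return the call count. */
unsigned int calls_after_registering(int times, unsigned int budget)
{
	struct notifier_block block = { demo_handler, NULL, 0 };
	struct notifier_block *head = NULL;
	int i;

	handler_calls = 0;
	for (i = 0; i < times; i++)
		chain_register(&head, &block);
	chain_call(head, budget);
	return handler_calls;
}
```

In this model, calls_after_registering(1, 1000) returns 1, while
calls_after_registering(2, 1000) returns 1000 because the walk never
reaches a NULL next pointer. As far as I understand, the kernel walks
the chain with a call budget of -1, which is consistent with the
0xffffffff invocations reported above.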

I am also copying Mauro Carvalho Chehab (maintainer of sb_edac) for
review of the patch.

Cheers,
Karan

--- linux-3.2-rc6/drivers/edac/sb_edac.c    2011-12-16 21:36:26.000000000 -0500
+++ linux-3.2-rc6-new/drivers/edac/sb_edac.c    2011-12-23 14:54:57.000000000 -0500
@@ -1661,9 +1661,6 @@
      debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
          __func__, mci, &sbridge_dev->pdev[0]->dev);

-    atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
-                                     &sbridge_mce_dec);
-
      /* Remove MC sysfs nodes */
      edac_mc_del_mc(mci->dev);

@@ -1731,8 +1728,6 @@
          goto fail0;
      }

-    atomic_notifier_chain_register(&x86_mce_decoder_chain,
-                                   &sbridge_mce_dec);
      return 0;

  fail0:
@@ -1861,8 +1856,11 @@

      pci_rc = pci_register_driver(&sbridge_driver);

-    if (pci_rc >= 0)
+    if (pci_rc >= 0) {
+         atomic_notifier_chain_register(&x86_mce_decoder_chain,
+                                        &sbridge_mce_dec);
          return 0;
+    }

      sbridge_printk(KERN_ERR, "Failed to register device with error %d.\n",
                pci_rc);
@@ -1877,6 +1875,9 @@
  static void __exit sbridge_exit(void)
  {
      debugf2("MC: " __FILE__ ": %s()\n", __func__);
+    atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
+                                     &sbridge_mce_dec);
+
      pci_unregister_driver(&sbridge_driver);
  }



[-- Attachment #2: me.patch --]
[-- Type: text/x-patch, Size: 1147 bytes --]

--- linux-3.2-rc6/drivers/edac/sb_edac.c	2011-12-16 21:36:26.000000000 -0500
+++ linux-3.2-rc6-new/drivers/edac/sb_edac.c	2011-12-23 14:54:57.000000000 -0500
@@ -1661,9 +1661,6 @@
 	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
 		__func__, mci, &sbridge_dev->pdev[0]->dev);
 
-	atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
-					 &sbridge_mce_dec);
-
 	/* Remove MC sysfs nodes */
 	edac_mc_del_mc(mci->dev);
 
@@ -1731,8 +1728,6 @@
 		goto fail0;
 	}
 
-	atomic_notifier_chain_register(&x86_mce_decoder_chain,
-				       &sbridge_mce_dec);
 	return 0;
 
 fail0:
@@ -1861,8 +1856,11 @@
 
 	pci_rc = pci_register_driver(&sbridge_driver);
 
-	if (pci_rc >= 0)
+	if (pci_rc >= 0) {
+		atomic_notifier_chain_register(&x86_mce_decoder_chain,
+					       &sbridge_mce_dec);
 		return 0;
+	}
 
 	sbridge_printk(KERN_ERR, "Failed to register device with error %d.\n",
 		      pci_rc);
@@ -1877,6 +1875,9 @@
 static void __exit sbridge_exit(void)
 {
 	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	atomic_notifier_chain_unregister(&x86_mce_decoder_chain,
+					 &sbridge_mce_dec);
+
 	pci_unregister_driver(&sbridge_driver);
 }
 

[-- Attachment #3: README --]
[-- Type: text/plain, Size: 186 bytes --]

[PATCH] sb_edac.c, kernel linux-3.2-rc6.
Karandeep Chahal <karandeepchahal@gmail.com>

The sb_edac patch fixes an incorrect machine check notifier chain registration in the Sandy Bridge EDAC driver.

