From: Borislav Petkov <bp@alien8.de>
To: Nicholas Krause <xerofoify@gmail.com>
Cc: dougthompson@xmission.com, mchehab@osg.samsung.com,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] edac:Fix kernel panic regression in edac_mc_reset_delay_period
Date: Thu, 19 May 2016 23:44:55 +0200
Message-ID: <20160519214455.GE552@pd.tnic>
In-Reply-To: <1463687097-11306-1-git-send-email-xerofoify@gmail.com>
On Thu, May 19, 2016 at 03:44:57PM -0400, Nicholas Krause wrote:
> This fixes a kernel panic regression in the function
> edac_mc_reset_delay_period, as shown by this kernel panic
> trace:
> [ 58.402137] BUG: unable to handle kernel paging request at 0000000000015d10
> [ 58.410564] IP: [<ffffffff8109ab82>] queued_spin_lock_slowpath+0x132/0x170
> [ 58.418941] PGD 3ffcc8067 PUD 3ffc56067 PMD 0
> [ 58.428821] Oops: 0002 [#1] SMP
> [ 58.439076] Modules linked in: xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_addrtype iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables
> [ 58.468176] CPU: 1 PID: 2792 Comm: edactest Not tainted 4.6.0-dirty #1
                                      ^^^^^^^^
Ha, what is that program?
> [ 58.478878] Hardware name: HP ProLiant MicroServer, BIOS O41 10/01/2013
> [ 58.488590] task: ffff8803ff9a9300 ti: ffff8803ffbf0000 task.ti: ffff8803ffbf0000
> [ 58.499562] RIP: 0010:[<ffffffff8109ab82>] [<ffffffff8109ab82>] queued_spin_lock_slowpath+0x132/0x170
> [ 58.521850] RSP: 0018:ffff8803ffbf3cf8 EFLAGS: 00010002
> [ 58.532653] RAX: 0000000000002bfe RBX: 0000000000000082 RCX: 0000000000080000
> [ 58.545334] RDX: 0000000000015d10 RSI: 00000000affd0fc4 RDI: ffffffff81d39940
> [ 58.555376] RBP: ffff88040a97b848 R08: ffff88041ed15d00 R09: 0000000000000004
> [ 58.565813] R10: 000000000000000a R11: f000000000000000 R12: ffffffff81d39940
> [ 58.577911] R13: 000000000000c940 R14: ffff8803ffbf3d48 R15: ffff8803ffbf3f28
> [ 58.588311] FS: 00007f639468f780(0000) GS:ffff88041ed00000(0000) knlGS:00000000f7743680
> [ 58.598270] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 58.609814] CR2: 0000000000015d10 CR3: 00000003ffafa000 CR4: 00000000000006e0
> [ 58.620848] Stack:
> [ 58.630118] ffffffff81774d3f 000000000000000f ffffffff810ae889 ffff88040a97b820
> [ 58.640635] ffff8803ffbf3d90 0000000000002000 ffff88040c335c00 00000000000003e8
> [ 58.652220] ffffffff810aed20 0000000000000041 0000000200000000 ffff88040a97b800
> [ 58.662230] Call Trace:
> [ 58.672043] [<ffffffff81774d3f>] ? _raw_spin_lock_irqsave+0x1f/0x30
> [ 58.682221] [<ffffffff810ae889>] ? lock_timer_base.isra.34+0x49/0x60
> [ 58.693178] [<ffffffff810aed20>] ? del_timer+0x30/0x70
> [ 58.704839] [<ffffffff81075494>] ? try_to_grab_pending+0xa4/0x140
> [ 58.715206] [<ffffffff81075569>] ? mod_delayed_work_on+0x39/0x80
> [ 58.725250] [<ffffffff81684e90>] ? edac_mc_reset_delay_period+0x30/0x50
> [ 58.735572] [<ffffffff81685865>] ? edac_set_poll_msec+0x45/0x60
> [ 58.745346] [<ffffffff8107a43b>] ? param_attr_store+0x6b/0xe0
> [ 58.755254] [<ffffffff81079975>] ? module_attr_store+0x15/0x20
> [ 58.764869] [<ffffffff811f7192>] ? kernfs_fop_write+0x142/0x190
> [ 58.774516] [<ffffffff81187a1e>] ? __vfs_write+0x1e/0xe0
> [ 58.783565] [<ffffffff811879d4>] ? __vfs_read+0xa4/0xd0
> [ 58.792437] [<ffffffff811a47a7>] ? __alloc_fd+0x37/0x160
> [ 58.801108] [<ffffffff811887f0>] ? vfs_write+0xb0/0x1b0
> [ 58.809465] [<ffffffff81189bdb>] ? SyS_write+0x4b/0xb0
> [ 58.817707] [<ffffffff81774f5f>] ? entry_SYSCALL_64_fastpath+0x17/0x93
> [ 58.825626] Code: f8 66 c7 07 01 00 c3 66 90 f3 c3 48 89 c2 c1 e8 12 48 c1 ea 0c ff c8 83 e2 30 48 98 48 81 c2 00 5d 01 00 48 03 14 c5 40 24 d1 81 <4c> 89 02 41 8b 40 08 85 c0 75 0a f3 90 41 8b 40 08 85 c0 74 f6
> [ 58.852733] RIP [<ffffffff8109ab82>] queued_spin_lock_slowpath+0x132/0x170
> [ 58.861275] RSP <ffff8803ffbf3cf8>
> [ 58.869458] CR2: 0000000000015d10
> [ 58.877632] ---[ end trace 3f286bc71cca15d1 ]---
> [ 58.885869] Kernel panic - not syncing: Fatal exception
So I see the splat but the fix does not look correct... It looks more
like an uninitialized workqueue somewhere. How do you trigger this? By
writing some values into
/sys/module/edac_core/parameters/edac_mc_poll_msec ? I guess that's what
that edactest program does.
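For reference, a write like the one guessed at above can be as small as
the sketch below. This is not the edactest program from the trace, whose
source is not shown in this thread; write_poll_msec() is a hypothetical
helper, and the only thing taken from the thread is the parameter path.

```c
#include <errno.h>
#include <stdio.h>

/* Module parameter path mentioned in this thread. */
#define EDAC_POLL_PARAM "/sys/module/edac_core/parameters/edac_mc_poll_msec"

/*
 * Hypothetical helper: write a polling period in milliseconds to a
 * parameter file. Returns 0 on success or a negative errno value.
 */
static int write_poll_msec(const char *path, unsigned int msec)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;

	if (fprintf(f, "%u\n", msec) < 0)
		err = -EIO;

	/* Errors from the store callback usually surface at flush time. */
	if (fclose(f) != 0 && !err)
		err = -errno;

	return err;
}
```

Calling write_poll_msec(EDAC_POLL_PARAM, 2000) as root should go through
the same kernfs_fop_write -> param_attr_store -> edac_set_poll_msec path
that shows up in the call trace.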
Can I have your .config please?
...
Ok, I think I see it - we initialize the workqueues only when
->edac_check is defined. And you're probably using an EDAC driver which
doesn't define that function, thus the splat.
But which driver are you using? I don't see it in your module list. So
it is either compiled in or you've simply loaded edac_core.ko only.
If you want to write a proper fix, I'd give you a hint: look at
->op_state. That should be tested.
:-)
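To make the hint a bit more concrete, here is a minimal userspace sketch
of the situation, not the actual kernel code. struct mci_model, the
model_*() functions and the work_initialized flag are made up for
illustration; the two op_state values are named after the kernel's, as
far as I remember them, so treat them as an assumption too. It models a
controller whose work item is set up only when ->edac_check exists, and
a reset path that tests ->op_state before touching that work item.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; the real definitions live in the EDAC core. */
enum dev_op_state { OP_RUNNING_POLL, OP_RUNNING_INTERRUPT };

struct mci_model {
	void (*edac_check)(struct mci_model *mci); /* NULL: no polling */
	enum dev_op_state op_state;
	bool work_initialized;    /* models "delayed work was set up" */
	unsigned long delay_msec;
};

/* Placeholder polling callback for controllers that do poll. */
static void model_noop_check(struct mci_model *mci)
{
	(void)mci;
}

/* Registration: the work item is only set up when ->edac_check exists. */
static void model_add_mc(struct mci_model *mci, unsigned long msec)
{
	if (mci->edac_check) {
		mci->op_state = OP_RUNNING_POLL;
		mci->work_initialized = true;
		mci->delay_msec = msec;
	} else {
		mci->op_state = OP_RUNNING_INTERRUPT;
	}
}

/*
 * Reset path with the suggested test: only controllers that are really
 * polling get their delay changed. Returns how many were updated.
 */
static int model_reset_delay_period(struct mci_model *list, size_t n,
				    unsigned long msec)
{
	int updated = 0;

	for (size_t i = 0; i < n; i++) {
		if (list[i].op_state != OP_RUNNING_POLL)
			continue; /* work item was never initialized */

		list[i].delay_msec = msec;
		updated++;
	}

	return updated;
}
```

Without the ->op_state test, the loop would modify a work item that was
never initialized for the interrupt-driven controller, which is the kind
of thing the splat above suggests.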
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.