From: Thomas Gleixner <tglx@linutronix.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>,
Network Development <netdev@vger.kernel.org>,
Linux Wireless List <linux-wireless@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [run_timer_softirq] BUG: unable to handle kernel paging request at 0000000000010007
Date: Fri, 10 Nov 2017 22:29:59 +0100 (CET)
Message-ID: <alpine.DEB.2.20.1711102228030.2288@nanos>
In-Reply-To: <CA+55aFxz=_g38OsVXhLBMGXnQyZ3CvFk971e19z9eu+9Pu=4dA@mail.gmail.com>
On Fri, 10 Nov 2017, Linus Torvalds wrote:
> On Wed, Nov 8, 2017 at 9:19 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> >
> > Yes it's accessing the list. Here is the faddr2line output.
>
> Ok, so it's a corrupted timer list. Which is not a big surprise.
>
> It's
>
> next->pprev = pprev;
>
> in __hlist_del(), and the trapping instruction decodes as
>
> mov %rdx,0x8(%rax)
>
> with %rax having the value dead000000000200,
>
> Which is just LIST_POISON2.
>
> So we've deleted that entry twice - LIST_POISON2 is what hlist_del()
> sets pprev to after already deleting it once.
>
> Although in this case it might not be hlist_del(), because
> detach_timer() also sets entry->next to LIST_POISON2.
>
> Which is pretty bogus, we are supposed to use LIST_POISON1 for the
> "next" pointer. Oh well. Nobody cares, except for the list entry
> debugging code, which isn't run on the hlist cases.
>
> Adding Thomas Gleixner to the cc. It should not be possible to delete
> the same timer twice.
Right, it shouldn't.
Fengguang, can you please enable:
CONFIG_DEBUG_OBJECTS
CONFIG_DEBUG_OBJECTS_TIMERS
and try to reproduce? Debugobjects should hopefully catch that.
Thanks,
tglx
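
To illustrate the double-delete analysis quoted above, here is a minimal userspace sketch in C. It is not the kernel code: `struct hlist_node` is re-declared locally, `model_detach()` is a hypothetical helper, and the poison value is the x86-64 one quoted in the thread, so a 64-bit build is assumed. The helper unlinks a node and poisons `next` with LIST_POISON2, as detach_timer() is described as doing, and reports a second delete instead of performing the faulting store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace model of an hlist node; re-declared here, not the kernel header. */
struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev;
};

/* Poison value quoted in the thread (x86-64); assumes 64-bit pointers. */
#define LIST_POISON2 ((struct hlist_node *)(uintptr_t)0xdead000000000200ULL)

/*
 * Hypothetical helper: unlink n and poison n->next with LIST_POISON2.
 * Returns 0 on success, -1 if n was already deleted.
 */
static int model_detach(struct hlist_node *n)
{
    assert(n != NULL);

    struct hlist_node *next = n->next;
    struct hlist_node **pprev = n->pprev;

    if (next == LIST_POISON2) {
        /*
         * An unchecked delete would now execute next->pprev = pprev,
         * i.e. a store to LIST_POISON2 + offsetof(pprev), and fault.
         */
        printf("double delete: store would target %#llx\n",
               (unsigned long long)((uintptr_t)next +
                                    offsetof(struct hlist_node, pprev)));
        return -1;
    }

    *pprev = next;            /* predecessor now skips n */
    if (next)
        next->pprev = pprev;  /* successor points back past n */
    n->next = LIST_POISON2;   /* mark n as deleted */
    return 0;
}
```

Calling `model_detach()` twice on the same node returns 0 and then -1. With 8-byte pointers, `pprev` sits at offset 8 in the node, which corresponds to the `0x8(%rax)` displacement in the trapping instruction.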