From: Sean Young <sean@mess.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "devicetree@vger.kernel.org" <devicetree@vger.kernel.org>,
Ruslan Ruslichenko <rruslich@cisco.com>,
"linux-omap@vger.kernel.org" <linux-omap@vger.kernel.org>,
kernel@stlinux.com, wfg@linux.intel.com,
LKML <linux-kernel@vger.kernel.org>,
Mauro Carvalho Chehab <mchehab@infradead.org>,
linux-mediatek@lists.infradead.org,
Linux LED Subsystem <linux-leds@vger.kernel.org>,
"linux-input@vger.kernel.org" <linux-input@vger.kernel.org>,
linux-amlogic@lists.infradead.org,
Thomas Gleixner <tglx@linutronix.de>,
kernel test robot <fengguang.wu@intel.com>, LKP <lkp@01.org>,
Ingo Molnar <mingo@kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
Linux Media Mailing List <linux-media@vger.kernel.org>
Subject: Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c
Date: Sat, 25 Feb 2017 11:14:25 +0000
Message-ID: <20170225111424.GA7659@gofer.mess.org>
In-Reply-To: <CA+55aFytXj+TZ_TanbxcY0KgRTrV7Vvr=fWON8tioUGmYHYiNA@mail.gmail.com>
On Fri, Feb 24, 2017 at 11:15:51AM -0800, Linus Torvalds wrote:
> Added more relevant people. I've debugged the immediate problem below,
> but I think there's another problem that actually triggered this.
>
> On Fri, Feb 24, 2017 at 10:28 AM, kernel test robot
> <fengguang.wu@intel.com> wrote:
> >
> > 0day kernel testing robot got the below dmesg and the first bad commit is
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >
> > commit ff58d005cd10fcd372787cceac547e11cf706ff6
> > Merge: 5ab3566 9eeb0ed
> >
> > Merge tag 'media/v4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
> [...]
> > [ 4.664940] rc rc0: lirc_dev: driver ir-lirc-codec (rc-loopback) registered at minor = 0
> > [ 4.666322] BUG: unable to handle kernel NULL pointer dereference at 0000039c
> > [ 4.666675] IP: serial_ir_irq_handler+0x189/0x410
>
> This merge being fingered ends up being a subtle interaction with other changes.
>
> Those "other changes" are (again) the interrupt retrigger code that
> was reverted for 4.10, and then we tried to merge them again this
> merge window.
>
> Because the immediate cause is:
>
> > [ 4.666675] EIP: serial_ir_irq_handler+0x189/0x410
> > [ 4.666675] Call Trace:
> > [ 4.666675] <IRQ>
> > [ 4.666675] __handle_irq_event_percpu+0x57/0x100
> > [ 4.666675] handle_irq_event_percpu+0x1d/0x50
> > [ 4.666675] handle_irq_event+0x32/0x60
> > [ 4.666675] handle_edge_irq+0xa5/0x120
> > [ 4.666675] handle_irq+0x9d/0xd0
> > [ 4.666675] </IRQ>
> > [ 4.666675] do_IRQ+0x5f/0x130
> > [ 4.666675] common_interrupt+0x33/0x38
> > [ 4.666675] EIP: hardware_init_port+0x3f/0x190
> > [ 4.666675] EFLAGS: 00200246 CPU: 0
> > [ 4.666675] EAX: c718990f EBX: 00000000 ECX: 00000000 EDX: 000003f9
> > [ 4.666675] ESI: 000003f9 EDI: 000003f8 EBP: c0065d98 ESP: c0065d84
> > [ 4.666675] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > [ 4.666675] serial_ir_probe+0xbb/0x300
> > [ 4.666675] platform_drv_probe+0x48/0xb0
> ...
>
> ie an interrupt came in immediately after the request_irq(), before
> all the data was properly set up, which then causes the interrupt
> handler to take a fault because it tries to access some field that
> hasn't even been set up yet.
Oh dear. I've pointed out others making the same mistake when doing code
reviews; clearly I need to review my own code better.
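To make the ordering problem concrete, here is a small userspace model of it.
This is only a sketch and not the driver code: fake_request_irq(),
fake_irq_handler(), the fake_* structs and both probe functions are
hypothetical stand-ins, and the mock fires the handler synchronously to imitate
an interrupt that is already pending when the handler is installed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct fake_rc_dev {
	unsigned long timeout;
};

struct fake_serial_ir {
	struct fake_rc_dev *rcdev;	/* NULL until probe has set it up */
	int handled;			/* interrupts handled normally */
	int would_fault;		/* interrupts that saw rcdev == NULL */
};

typedef void (*fake_irq_handler_t)(void *dev_id);

/*
 * Model of request_irq(): an interrupt may fire as soon as the handler
 * is installed, so this mock simply calls the handler right away.
 */
static int fake_request_irq(fake_irq_handler_t handler, void *dev_id)
{
	assert(handler != NULL);
	handler(dev_id);
	return 0;
}

static void fake_irq_handler(void *dev_id)
{
	struct fake_serial_ir *ir = dev_id;

	if (ir->rcdev == NULL) {
		/* A handler that dereferences rcdev unconditionally oopses here. */
		ir->would_fault++;
		return;
	}
	(void)ir->rcdev->timeout;	/* safe: rcdev is valid */
	ir->handled++;
}

/* Problematic ordering: handler installed before rcdev is assigned. */
static void probe_irq_first(struct fake_serial_ir *ir,
			    struct fake_rc_dev *rcdev)
{
	fake_request_irq(fake_irq_handler, ir);
	ir->rcdev = rcdev;
}

/* Safer ordering: everything the handler reads is set up first. */
static void probe_state_first(struct fake_serial_ir *ir,
			      struct fake_rc_dev *rcdev)
{
	ir->rcdev = rcdev;
	fake_request_irq(fake_irq_handler, ir);
}
```

Calling probe_irq_first() on a zeroed struct fake_serial_ir leaves would_fault
at 1, while probe_state_first() leaves handled at 1. The NULL check in the mock
handler exists only so the model can count the bad case instead of crashing.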
>
> The code line is helpful, the faulting instruction is
>
> mov 0x39c(%rax),%eax <--- fault
> call ..
> mov someglobalvar,%edx
>
> which together with the supplied config file makes me able to match it
> up with the assembly generation around it:
>
> inb %dx, %al # tmp254, value
> andb $1, %al #, tmp255
> testb %al, %al # tmp255
> je .L233 #,
> .L215:
> movl serial_ir+8, %eax # serial_ir.rcdev, serial_ir.rcdev
> xorl %edx, %edx # _66->timeout
> movl 924(%eax), %eax # _66->timeout, _66->timeout
> call nsecs_to_jiffies #
> movl jiffies, %edx # jiffies, jiffies.33_70
> addl %eax, %edx # _69, tmp259
> movl $serial_ir+16, %eax #,
> call mod_timer #
> movl serial_ir+8, %eax # serial_ir.rcdev,
> call ir_raw_event_handle #
> movl $1, %eax #, <retval>
>
> so it's that "serial_ir.rcdev->timeout" access that faults. So this is
> the faulting source code:
>
> drivers/media/rc/serial_ir.c: 402
>
> mod_timer(&serial_ir.timeout_timer,
> jiffies + nsecs_to_jiffies(serial_ir.rcdev->timeout));
>
> ir_raw_event_handle(serial_ir.rcdev);
>
> return IRQ_HANDLED;
>
> and serial_ir.rcdev is NULL when it tries to look up the timeout.
The ir_raw_event_handle() call will also go bang if passed a NULL pointer, so
this problem existed before (since v4.10).
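As a sanity check on the decoding above: the displacement in
"movl 924(%eax), %eax" is 924 decimal, which is 0x39c, so a NULL rcdev gives
exactly the reported fault address. A minimal sketch of that arithmetic
follows. struct rc_dev_layout is a hypothetical layout padded so that timeout
lands at offset 924; it is not the real struct rc_dev, and member_address()
and null_deref_fault_address() are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: 924 bytes of unrelated fields, then timeout.
 * Only the offset matters for this illustration.
 */
struct rc_dev_layout {
	unsigned char other_fields[924];
	uint32_t timeout;
};

/* The offset from the disassembly, 924, is 0x39c in hex. */
static_assert(offsetof(struct rc_dev_layout, timeout) == 0x39c,
	      "timeout is expected at offset 0x39c in this model");

/* Address the CPU touches when loading a member: base plus offset. */
static uintptr_t member_address(uintptr_t base, size_t offset)
{
	return base + offset;
}

/* With a NULL (zero) base, the faulting address is just the offset. */
static uintptr_t null_deref_fault_address(void)
{
	return member_address(0, offsetof(struct rc_dev_layout, timeout));
}
```

null_deref_fault_address() returns 0x39c, matching the "NULL pointer
dereference at 0000039c" line in the report.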
Thanks for debugging this, I'll send a patch as a reply to this email.
Sean