From: Michel Thierry <michel.thierry@intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Daniel Vetter <daniel.vetter@intel.com>,
Jani Nikula <jani.nikula@linux.intel.com>,
David Airlie <airlied@linux.ie>, Ben Widawsky <ben@bwidawsk.net>,
Mika Kuoppala <mika.kuoppala@intel.com>
Cc: intel-gfx <intel-gfx@lists.freedesktop.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: NULL ptr dereference in current i915 driver
Date: Wed, 22 Apr 2015 17:11:37 +0100
Message-ID: <5537C839.2060702@intel.com>
In-Reply-To: <CA+55aFyTY1i9PtRH4DqSKWcpQog+EzXKftirHh_2-17k41szhQ@mail.gmail.com>
On 4/22/2015 12:36 AM, Linus Torvalds wrote:
> So I just got the appended NULL pointer de-reference when trying to
> look at a video from my GoPro.
>
> The code disassembles to
>
> 0: 81 fb 00 04 00 00 cmp $0x400,%ebx
> 6: 41 89 07 mov %eax,(%r15)
> 9: 74 78 je 0x83
> b: 48 8d 7c 24 18 lea 0x18(%rsp),%rdi
> 10: e8 6e b3 1b c1 callq 0xffffffffc11bb383
> 15: 84 c0 test %al,%al
> 17: 74 4a je 0x63
> 19: 48 85 ed test %rbp,%rbp
> 1c: 75 b5 jne 0xffffffffffffffd3
> 1e: 48 8b 04 24 mov (%rsp),%rax
> 22: 49 8b 84 c4 98 01 00 mov 0x198(%r12,%rax,8),%rax
> 29: 00
> 2a:* 48 8b 28 mov (%rax),%rbp <-- trapping instruction
> 2d: 65 ff 05 1f e8 ef 3f incl %gs:0x3fefe81f(%rip) # 0x3fefe853
> 34: 48 b8 00 00 00 00 00 movabs $0x160000000000,%rax
> 3b: 16 00 00
>
> which matches up with the asm code
>
> cmpl $1024, %ebx #, act_pte
> movl %eax, (%r15) # D.49217, *_26
> je .L118 #,
> .L110:
> leaq 24(%rsp), %rdi #, tmp156
> call __sg_page_iter_next #
> testb %al, %al # D.49219
> je .L119 #,
> testq %rbp, %rbp # pt_vaddr
> jne .L109 #,
> movq (%rsp), %rax # %sfp, act_pt
> movq 408(%r12,%rax,8), %rax # MEM[(struct i915_hw_ppgtt
> *)vm_8(D)].D.36998.pd.page
> movq (%rax), %rbp # _21->page, D.49221
> #APP
> # 72 "./arch/x86/include/asm/preempt.h" 1
> incl %gs:__preempt_count(%rip) # __preempt_count
> # 0 "" 2
> #NO_APP
> movabsq $24189255811072, %rax #, tmp150
>
> which in turn seems to come from the C code
>
> pt_vaddr =
> kmap_atomic(ppgtt->pd.page_table[act_pt]->page);
>
> (that "testq %rbp,%rbp; jne" just before the oopsing instruction group
> is that "if (pt_vaddr == NULL)" test).
>
> IOW, it looks like
>
> ppgtt->pd.page_table[act_pt]
>
> is NULL, and then trying to dereference ->page off of it is what
> oopses (the preempt-count increment that comes after is the
> "pagefault_disable()" in kmap_atomic, and the big constant we're
> loading into %rax is part of "page_address(page)").
>
> I have no idea why "ppgtt->pd.page_table[act_pt]" would be NULL, but
> clearly it can be. Can somebody who knows this code look into it? I've
> added a few people who have worked in this area recently, in addition
> to the usual maintainer list.
>
> Thanks,
>
> Linus
>
> ---
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffffc010c137>] gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: rfcomm fuse cmac ip6t_rpfilter ip6t_REJECT
> nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4
> nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_nat ebtable_broute
> bridge stp llc ebtable_filter ebtables ip6table_mangle
> ip6table_security ip6table_raw ip6table_filter ip6_tables
> iptable_mangle iptable_security iptable_raw bnep arc4 vfat fat
> x86_pkg_temp_thermal pn544_mei mei_phy coretemp pn544 hci nfc
> kvm_intel iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek
> snd_hda_codec_hdmi kvm snd_hda_codec_generic uvcvideo
> videobuf2_vmalloc videobuf2_memops microcode videobuf2_core
> snd_hda_intel v4l2_common hid_multitouch snd_hda_controller videodev
> btusb snd_hda_codec iwlmvm media snd_hwdep mac80211 btbcm snd_seq
> btintel bluetooth snd_seq_device joydev snd_pcm serio_raw
> i2c_i801 iwlwifi cfg80211 snd_hda_core sony_laptop snd_timer snd
> rfkill mei_me soundcore lpc_ich shpchp mei mfd_core dm_crypt
> crct10dif_pclmul i915 crc32_pclmul crc32c_intel i2c_algo_bit
> drm_kms_helper ghash_clmulni_intel drm i2c_core video
> CPU: 1 PID: 2697 Comm: chrome Not tainted 4.0.0-09362-g1fc149933fd4 #8
> Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
> task: ffff88010dc51b30 ti: ffff88003f328000 task.ti: ffff88003f328000
> RIP: 0010:[<ffffffffc010c137>] [<ffffffffc010c137>]
> gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> RSP: 0018:ffff88003f32b9a8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000075b1b
> RDX: ffff88007d848990 RSI: 0000000000000001 RDI: ffff88003f32b9c0
> RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88003f6f7e58
> R10: 000000000d836000 R11: 0000000000000000 R12: ffff8800d4164000
> R13: 0000000000000000 R14: 0000000000000001 R15: ffff88003f7bbffc
> FS: 00007f7f0ee94a00(0000) GS:ffff88011fa80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000005f607000 CR4: 00000000001407e0
> Stack:
> 0000000000000201 000002010dc51b30 0000000000000000 ffff88007d848990
> 0000004000075b1b ffff880100000001 0000000000000fe0 ffff880011215900
> 0000000000000000 ffff88006cc4c380 ffff88003f6f0000 0000000000000001
> Call Trace:
> ggtt_bind_vma+0x97/0x110 [i915]
> i915_vma_bind+0x40/0x410 [i915]
> swiotlb_map_sg_attrs+0x74/0x140
> i915_gem_object_do_pin+0x864/0x9f0 [i915]
> mutex_lock+0x9/0x30
> i915_gem_execbuffer_reserve_vma.isra.20+0x66/0x130 [i915]
> i915_gem_execbuffer_reserve+0x2ec/0x320 [i915]
> i915_gem_do_execbuffer.isra.27+0x5ee/0xf80 [i915]
> mutex_optimistic_spin+0x16e/0x1f0
> __mutex_lock_interruptible_slowpath+0x21/0x130
> shmem_fault+0x57/0x1c0
> drm_gem_object_lookup+0x14/0xa0 [drm]
> i915_gem_execbuffer2+0xb2/0x2a0 [i915]
> drm_ioctl+0x15a/0x580 [drm]
> current_fs_time+0x9/0x50
> do_vfs_ioctl+0x2e8/0x4f0
> file_has_perm+0x77/0x80
> syscall_trace_enter_phase1+0x116/0x140
> SyS_ioctl+0x79/0x90
> system_call_fastpath+0x12/0x6a
> Code: 00 81 fb 00 04 00 00 41 89 07 74 78 48 8d 7c 24 18 e8 6e b3 1b
> c1 84 c0 74 4a 48 85 ed 75 b5 48 8b 04 24 49 8b 84 c4 98 01 00 00 <48>
> 8b 28 65 ff 05 1f e8 ef 3f 48 b8 00 00 00 00 00 16 00 00 48
> RIP gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> RSP <ffff88003f32b9a8>
> CR2: 0000000000000000
>
Hi,
I see a possible VA re-allocation that could be the culprit, but the
change was committed just 2 days ago
(http://cgit.freedesktop.org/drm-intel/commit/?id=5c5f645773b6d147bf68c350674dc3ef4f8de83d).
-Michel
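
For readers following the analysis quoted above, here is a minimal
user-space sketch of the failure mode: an array of page-table pointers
in which one slot is still NULL, and a lookup that checks the slot
before reading ->page. All names (model_pd, model_pt, model_page,
model_pt_page, MODEL_NUM_PDES) are hypothetical stand-ins, not the i915
structures, and the sketch does not claim to be the fix that went into
the driver.

```c
#include <stddef.h>

#define MODEL_NUM_PDES 512 /* hypothetical size, not taken from the driver */

struct model_page { int unused; };            /* stand-in for struct page */
struct model_pt { struct model_page *page; }; /* stand-in for one page table */
struct model_pd {                             /* stand-in for the page directory */
	struct model_pt *page_table[MODEL_NUM_PDES];
};

/*
 * Return the backing page for slot act_pt, or NULL when the index is
 * out of range or the slot was never populated. The unguarded form,
 * pd->page_table[act_pt]->page, is the pattern that faults when the
 * slot holds NULL.
 */
static struct model_page *model_pt_page(const struct model_pd *pd,
					size_t act_pt)
{
	const struct model_pt *pt;

	if (act_pt >= MODEL_NUM_PDES)
		return NULL;

	pt = pd->page_table[act_pt];
	if (pt == NULL)
		return NULL;

	return pt->page;
}

/* Exercise a populated slot, an empty slot and an out-of-range index.
 * Returns 0 when all three behave as described above. */
static int model_selfcheck(void)
{
	static struct model_page page;
	static struct model_pt pt = { &page };
	static struct model_pd pd; /* static storage: every slot starts NULL */

	pd.page_table[3] = &pt;

	if (model_pt_page(&pd, 3) != &page)
		return 1;
	if (model_pt_page(&pd, 0) != NULL)
		return 2;
	if (model_pt_page(&pd, MODEL_NUM_PDES) != NULL)
		return 3;
	return 0;
}
```

A caller would treat a NULL return as "this range has no page table
yet" and bail out or warn instead of mapping. Whether the real driver
should guard at the lookup or guarantee allocation earlier is a design
question this sketch does not try to answer.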
Thread overview: 8+ messages
2015-04-21 23:36 NULL ptr dereference in current i915 driver Linus Torvalds
2015-04-21 23:36 ` Linus Torvalds
2015-04-22 16:11 ` Michel Thierry [this message]
2015-04-22 16:11 ` Michel Thierry
2015-04-22 16:45 ` [PATCH] drm/i915: Add checks to i915_bind_vma Mika Kuoppala
2015-04-22 16:45 ` Mika Kuoppala
2015-04-22 19:53 ` shuang.he
2015-04-23 13:14 ` Josh Boyer