From: Thomas Hellstrom <thellstrom@vmware.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Jerome Glisse <j.glisse@gmail.com>,
Markus Trippelsdorf <markus@trippelsdorf.de>,
"dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"airlied@linux.ie" <airlied@linux.ie>
Subject: Re: Radeon RS780 - BUG: unable to handle kernel NULL pointer dereference
Date: Mon, 08 Nov 2010 23:25:38 +0100
Message-ID: <4CD878E2.5050106@vmware.com>
In-Reply-To: <201011082159.00199.rjw@sisk.pl>
On 11/08/2010 09:58 PM, Rafael J. Wysocki wrote:
> On Monday, November 08, 2010, Jerome Glisse wrote:
>
>> On Mon, Nov 8, 2010 at 2:02 PM, Markus Trippelsdorf
>> <markus@trippelsdorf.de> wrote:
>>
>>> On Mon, Nov 08, 2010 at 07:43:02PM +0100, Markus Trippelsdorf wrote:
>>>
>>>> On Mon, Nov 08, 2010 at 06:07:37PM +0100, Markus Trippelsdorf wrote:
>>>>
>>>>> On Mon, Nov 08, 2010 at 06:02:21PM +0100, Markus Trippelsdorf wrote:
>>>>>
>>>>>> I can trigger a kernel crash on my system by simply loading this png
>>>>>> image with firefox:
>>>>>> http://mediaarchive.cern.ch/MediaArchive/Photo/Public/2010/1011251/1011251_01/1011251_01-A4-at-144-dpi.jpg
>>>>>>
>>>>> Sorry the above link is wrong, this is the right one (that triggers the
>>>>> crash):
>>>>> http://cdsweb.cern.ch/record/1305179/files/HI-150431-630470-huge.png
>>>>>
>>>> I triggered it a few more times and took the attached picture.
>>>> It points to the BUG() call at drivers/gpu/drm/ttm/ttm_bo.c:1628 .
>>>> (Sorry for the bad picture quality)
>>>>
>>> And here the same BUG in plaintext (should be a bit easier to read):
>>>
>>> Nov 8 19:28:23 arch kernel: ------------[ cut here ]------------
>>> Nov 8 19:28:23 arch kernel: kernel BUG at drivers/gpu/drm/ttm/ttm_bo.c:1628!
>>> Nov 8 19:28:23 arch kernel: invalid opcode: 0000 [#1] PREEMPT SMP
>>> Nov 8 19:28:23 arch kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:18.3/temp1_input
>>> Nov 8 19:28:23 arch kernel: CPU 1
>>> Nov 8 19:28:23 arch kernel: Pid: 1541, comm: X Not tainted 2.6.37-rc1-00116-g151f52f-dirty #31 M4A78T-E/System Product Name
>>> Nov 8 19:28:23 arch kernel: RIP: 0010:[<ffffffff8121f0ff>] [<ffffffff8121f0ff>] ttm_bo_init+0x30f/0x340
>>> Nov 8 19:28:23 arch kernel: RSP: 0018:ffff88011b0fbbe8 EFLAGS: 00010246
>>> Nov 8 19:28:23 arch kernel: RAX: ffff8800da881778 RBX: ffff8800da881620 RCX: ffff88011b15ed78
>>> Nov 8 19:28:23 arch kernel: RDX: ffff8800c1556040 RSI: ffff88011ff22770 RDI: 000000000017adfb
>>> Nov 8 19:28:23 arch kernel: RBP: ffff8800da881648 R08: 0000000000000000 R09: ffff8800c1556040
>>> Nov 8 19:28:23 arch kernel: R10: 000000000ff85205 R11: ffff8800dae19200 R12: 0000000000000001
>>> Nov 8 19:28:23 arch kernel: R13: ffff88011ff22528 R14: ffff88011ff22778 R15: 0000000000000000
>>> Nov 8 19:28:23 arch kernel: FS: 00007f2043043700(0000) GS:ffff8800dfc80000(0000) knlGS:0000000000000000
>>> Nov 8 19:28:23 arch kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> Nov 8 19:28:23 arch kernel: CR2: 00007f203d057000 CR3: 000000011b12b000 CR4: 00000000000006e0
>>> Nov 8 19:28:23 arch kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> Nov 8 19:28:23 arch kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Nov 8 19:28:23 arch kernel: Process X (pid: 1541, threadinfo ffff88011b0fa000, task ffff88011c959c20)
>>> Nov 8 19:28:23 arch kernel: Stack:
>>> Nov 8 19:28:23 arch kernel: 0000000000000000 ffff8800da881648 ffff88011b0fbd00 ffff8800da881600
>>> Nov 8 19:28:23 arch kernel: ffff88011ff22000 0000000000000000 0000000000000001 00000000fffffff4
>>> Nov 8 19:28:23 arch kernel: ffff88011b0fbd00 ffffffff8125294d 0000000000000000 ffffffff00000001
>>> Nov 8 19:28:23 arch kernel: Call Trace:
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8125294d>] ? radeon_bo_create+0x14d/0x250
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812526c0>] ? radeon_ttm_bo_destroy+0x0/0xb0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812671cc>] ? radeon_gem_object_create+0x8c/0x130
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81267634>] ? radeon_gem_create_ioctl+0x54/0xd0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff813ab26d>] ? sock_aio_read+0x10d/0x120
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8120963c>] ? drm_ioctl+0x39c/0x450
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812675e0>] ? radeon_gem_create_ioctl+0x0/0xd0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810dd2c9>] ? do_vfs_ioctl+0xa9/0x610
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810dd879>] ? sys_ioctl+0x49/0x80
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810ce24e>] ? sys_read+0x4e/0x90
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8102dc2b>] ? system_call_fastpath+0x16/0x1b
>>> Nov 8 19:28:23 arch kernel: Code: e8 fb ff ff 85 c0 0f 85 68 ff ff ff 48 8b 7c 24 08 89 04 24 e8 83 d9 ff ff 8b 04 24 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 48 c7 c7 60 a4 55 81 31 c0 e8 14 80 22 00 b8 ea ff ff ff
>>> Nov 8 19:28:23 arch kernel: RIP [<ffffffff8121f0ff>] ttm_bo_init+0x30f/0x340
>>> Nov 8 19:28:23 arch kernel: RSP <ffff88011b0fbbe8>
>>> Nov 8 19:28:23 arch kernel: ---[ end trace 328a9acba7691d6e ]---
>>> Nov 8 19:28:23 arch kernel: note: X[1541] exited with preempt_count 1
>>> Nov 8 19:28:23 arch kernel: BUG: scheduling while atomic: X/1541/0x10000002
>>> Nov 8 19:28:23 arch kernel: Pid: 1541, comm: X Tainted: G D 2.6.37-rc1-00116-g151f52f-dirty #31
>>> Nov 8 19:28:23 arch kernel: Call Trace:
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81447ad9>] ? schedule+0x639/0x850
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105826d>] ? __cond_resched+0x1d/0x30
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81447f2f>] ? _cond_resched+0x2f/0x40
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810b57fc>] ? unmap_vmas+0x82c/0x9c0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810bcb62>] ? exit_mmap+0xe2/0x1a0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105a705>] ? mmput+0x25/0xc0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105e734>] ? exit_mm+0x104/0x130
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81079ebf>] ? hrtimer_try_to_cancel+0x3f/0x80
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81089d0a>] ? acct_collect+0x9a/0x1a0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8106045a>] ? do_exit+0x5aa/0x760
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81447163>] ? printk+0x40/0x45
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105e33c>] ? kmsg_dump+0x7c/0x150
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81031fda>] ? oops_end+0x9a/0xe0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8102ee74>] ? do_invalid_op+0x84/0xa0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8121f0ff>] ? ttm_bo_init+0x30f/0x340
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810ddf50>] ? __pollwait+0x0/0x110
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8102e7d5>] ? invalid_op+0x15/0x20
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8121f0ff>] ? ttm_bo_init+0x30f/0x340
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8121efe3>] ? ttm_bo_init+0x1f3/0x340
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8125294d>] ? radeon_bo_create+0x14d/0x250
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812526c0>] ? radeon_ttm_bo_destroy+0x0/0xb0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812671cc>] ? radeon_gem_object_create+0x8c/0x130
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81267634>] ? radeon_gem_create_ioctl+0x54/0xd0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff813ab26d>] ? sock_aio_read+0x10d/0x120
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8120963c>] ? drm_ioctl+0x39c/0x450
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812675e0>] ? radeon_gem_create_ioctl+0x0/0xd0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810dd2c9>] ? do_vfs_ioctl+0xa9/0x610
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810dd879>] ? sys_ioctl+0x49/0x80
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810ce24e>] ? sys_read+0x4e/0x90
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8102dc2b>] ? system_call_fastpath+0x16/0x1b
>>> Nov 8 19:28:23 arch kernel: BUG: scheduling while atomic: X/1541/0x10000002
>>> Nov 8 19:28:23 arch kernel: Pid: 1541, comm: X Tainted: G D 2.6.37-rc1-00116-g151f52f-dirty #31
>>> Nov 8 19:28:23 arch kernel: Call Trace:
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81447ad9>] ? schedule+0x639/0x850
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105826d>] ? __cond_resched+0x1d/0x30
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81447f2f>] ? _cond_resched+0x2f/0x40
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810b57fc>] ? unmap_vmas+0x82c/0x9c0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810bcb62>] ? exit_mmap+0xe2/0x1a0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105a705>] ? mmput+0x25/0xc0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105e734>] ? exit_mm+0x104/0x130
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81079ebf>] ? hrtimer_try_to_cancel+0x3f/0x80
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81089d0a>] ? acct_collect+0x9a/0x1a0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8106045a>] ? do_exit+0x5aa/0x760
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81447163>] ? printk+0x40/0x45
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8105e33c>] ? kmsg_dump+0x7c/0x150
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81031fda>] ? oops_end+0x9a/0xe0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8102ee74>] ? do_invalid_op+0x84/0xa0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8121f0ff>] ? ttm_bo_init+0x30f/0x340
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810ddf50>] ? __pollwait+0x0/0x110
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8102e7d5>] ? invalid_op+0x15/0x20
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8121f0ff>] ? ttm_bo_init+0x30f/0x340
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8121efe3>] ? ttm_bo_init+0x1f3/0x340
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8125294d>] ? radeon_bo_create+0x14d/0x250
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812526c0>] ? radeon_ttm_bo_destroy+0x0/0xb0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812671cc>] ? radeon_gem_object_create+0x8c/0x130
>>> Nov 8 19:28:23 arch kernel: [<ffffffff81267634>] ? radeon_gem_create_ioctl+0x54/0xd0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff813ab26d>] ? sock_aio_read+0x10d/0x120
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8120963c>] ? drm_ioctl+0x39c/0x450
>>> Nov 8 19:28:23 arch kernel: [<ffffffff812675e0>] ? radeon_gem_create_ioctl+0x0/0xd0
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810dd2c9>] ? do_vfs_ioctl+0xa9/0x610
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810dd879>] ? sys_ioctl+0x49/0x80
>>> Nov 8 19:28:23 arch kernel: [<ffffffff810ce24e>] ? sys_read+0x4e/0x90
>>> Nov 8 19:28:23 arch kernel: [<ffffffff8102dc2b>] ? system_call_fastpath+0x16/0x1b
>>>
>>>
>> Thomas, this bug seems to point to a case where we end up trying to add
>> an entry at the same offset in the rb tree for addr_space_mm. After
>> carefully reviewing the locking around the rb tree modification and
>> addr_space_mm, I am fairly confident that no race can occur. Would you
>> have any idea what might go wrong here? I guess I would ultimately need
>> to dump the mm and rb tree state when the BUG gets triggered to try to
>> understand the state of things.
>>
> Hmm, why are you using BUG in there in the first place? Would it be _so_
> dangerous to continue that we just have to crash here?
>
> Rafael
>
BUGs in the TTM module are there to catch incorrect usage of the TTM
API, and the intention is that they should only happen during
development or stabilizing phases. In this case, we're probably seeing
the symptoms of memory corruption or a buggy range manager change.
/Thomas
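
For readers following the thread, the failure mode Jerome describes (a second entry inserted at an offset that is already in the tree) and the alternative Rafael hints at (reporting an error instead of crashing) can be illustrated with a small user-space sketch. This is not the TTM code: the names offset_node, offset_tree_insert and demo_duplicate_insert are hypothetical, and a plain unbalanced binary search tree stands in for the kernel rb tree.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical node type: one entry per buffer, keyed by its start offset. */
struct offset_node {
	unsigned long offset;
	struct offset_node *left, *right;
};

/*
 * Insert node into the tree rooted at *root.
 * Returns 0 on success, or -EEXIST if the offset is already present.
 * Assumption: an unbalanced binary search tree replaces the rb tree here.
 */
static int offset_tree_insert(struct offset_node **root,
			      struct offset_node *node)
{
	struct offset_node **cur = root;

	while (*cur) {
		if (node->offset < (*cur)->offset)
			cur = &(*cur)->left;
		else if (node->offset > (*cur)->offset)
			cur = &(*cur)->right;
		else
			return -EEXIST; /* duplicate offset: report, do not abort */
	}
	node->left = NULL;
	node->right = NULL;
	*cur = node;
	return 0;
}

/* Demo: the second insert reuses offset 0x1000 and is rejected. */
static int demo_duplicate_insert(void)
{
	struct offset_node *root = NULL;
	struct offset_node a = { .offset = 0x1000 };
	struct offset_node b = { .offset = 0x1000 };

	if (offset_tree_insert(&root, &a) != 0)
		return 1;
	return offset_tree_insert(&root, &b);
}
```

Returning an error keeps the caller alive but can hide the underlying corruption, which is the trade-off the reply above is weighing when it describes BUGs as a development-time check.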
Thread overview: 16+ messages
2010-11-08 17:02 Radeon RS780 - BUG: unable to handle kernel NULL pointer dereference Markus Trippelsdorf
2010-11-08 17:07 ` Markus Trippelsdorf
2010-11-08 18:43 ` Markus Trippelsdorf
2010-11-08 19:02 ` Markus Trippelsdorf
2010-11-08 19:36 ` Jerome Glisse
2010-11-08 20:53 ` Jerome Glisse
2010-11-08 20:58 ` Rafael J. Wysocki
2010-11-08 22:01 ` Jerome Glisse
2010-11-08 22:25 ` Thomas Hellstrom [this message]
2010-11-08 22:29 ` Thomas Hellstrom
2010-11-09 9:29 ` Markus Trippelsdorf
2010-11-09 9:53 ` Thomas Hellstrom
2010-11-09 10:07 ` Thomas Hellstrom
2010-11-09 10:32 ` Michel Dänzer
2010-11-09 10:37 ` Markus Trippelsdorf
2010-11-09 10:52 ` Michel Dänzer