From: Markus Trippelsdorf <markus@trippelsdorf.de>
To: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "Michel Dänzer" <michel@daenzer.net>, dri-devel@lists.freedesktop.org
Subject: Re: GPU lockup CP stall for more than 10000msec on latest vanilla git
Date: Wed, 19 Dec 2012 15:20:19 +0100
Message-ID: <20121219142019.GA24579@x4>
In-Reply-To: <50D1C7E4.1060701@canonical.com>

On 2012.12.19 at 14:57 +0100, Maarten Lankhorst wrote:
> On 18-12-12 17:12, Markus Trippelsdorf wrote:
> > With your supposed debugging BUG_ONs added I still get:
> >
> > Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
> > Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
> > Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
> > Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
> > Dec 18 17:01:15 x4 kernel: Call Trace:
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
> > Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
> > Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
> So the fence's kref refcount is zero here. This should be impossible
> and indicates a bug in refcounting somewhere, or possibly memory
> corruption.
>
> Let's first look at where things could go wrong.
>
> Accessing the sync_obj member requires fence_lock to be held, but the
> radeon code generally doesn't do that, hm...
>
> I think radeon_cs_sync_rings needs to take fence_lock during the
> iteration and then take a reference on the fence.
> radeon_crtc_page_flip and radeon_move_blit likewise take their fence
> reference without holding fence_lock.
>
> But that would probably still not explain why it crashes in
> radeon_vm_bo_invalidate shortly after, so it seems just as likely that
> it's operating on freed memory there or something.
>
> But none of the code touches refcounting for that bo, and I really
> don't see how I messed up anything there.
>
> I seem to be able to reproduce it if I add a hack, though. Can you
> test whether you get the exact same issues with this patch applied?
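
For readers following the locking discussion quoted above, here is a
minimal user-space sketch of the "take the lock, then take a reference"
pattern being proposed. It is not the radeon or TTM code: struct fence,
struct buffer, and every function name below are hypothetical stand-ins,
and a pthread mutex plays the role of fence_lock. The assert in
fence_ref() is only loosely analogous to the kref warning in the trace.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted fence; not the kernel's struct radeon_fence. */
struct fence {
    atomic_int refcount;
};

/* Hypothetical buffer object whose sync_obj is guarded by fence_lock. */
struct buffer {
    pthread_mutex_t fence_lock;
    struct fence *sync_obj;   /* may be cleared by another thread */
};

static struct fence *fence_create(void)
{
    struct fence *f = malloc(sizeof(*f));
    if (!f)
        abort();
    atomic_init(&f->refcount, 1);   /* creator holds the first reference */
    return f;
}

/* Taking a reference on an object whose count is already zero is a bug. */
static struct fence *fence_ref(struct fence *f)
{
    int old = atomic_fetch_add(&f->refcount, 1);
    assert(old > 0);
    return f;
}

/* Drops one reference; returns 1 if this call freed the fence. */
static int fence_unref(struct fence *f)
{
    int old = atomic_fetch_sub(&f->refcount, 1);
    assert(old > 0);
    if (old == 1) {
        free(f);
        return 1;
    }
    return 0;
}

/* Read sync_obj and take a reference while the lock is still held, so
 * the fence cannot be released between the read and the ref. */
static struct fence *buffer_get_fence(struct buffer *bo)
{
    struct fence *f = NULL;

    pthread_mutex_lock(&bo->fence_lock);
    if (bo->sync_obj)
        f = fence_ref(bo->sync_obj);
    pthread_mutex_unlock(&bo->fence_lock);
    return f;   /* caller must call fence_unref() when done */
}

/* Detach the fence under the lock, then drop the buffer's reference. */
static void buffer_clear_fence(struct buffer *bo)
{
    struct fence *old;

    pthread_mutex_lock(&bo->fence_lock);
    old = bo->sync_obj;
    bo->sync_obj = NULL;
    pthread_mutex_unlock(&bo->fence_lock);
    if (old)
        fence_unref(old);
}
```

A caller would obtain the fence with buffer_get_fence(), use it, and then
call fence_unref(). Reading sync_obj without the lock and referencing it
afterwards leaves a window in which the last reference can be dropped,
which is the kind of race the quoted analysis is worried about.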
Unfortunately, your patch doesn't apply:
markus@x4 linux % patch -p1 --dry-run < ~/maarten.patch
checking file drivers/gpu/drm/ttm/ttm_bo.c
Hunk #1 succeeded at 512 with fuzz 1.
Hunk #6 FAILED at 814.
1 out of 6 hunks FAILED
markus@x4 linux % git describe
v3.7-10833-g752451f
markus@x4 linux %
--
Markus