From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcin Slusarz
Subject: Re: nouveau/ttm: BUG in ttm_bo_release_list
Date: Sat, 18 Sep 2010 13:18:48 +0200
Message-ID: <20100918111848.GA2953@joi.lan>
References: <20100917174352.GA2770@joi.lan> <1284766761.14299.1.camel@araqiel>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <1284766761.14299.1.camel@araqiel>
Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org
To: Ben Skeggs
Cc: nouveau@lists.freedesktop.org, dri-devel@lists.freedesktop.org
List-Id: nouveau.vger.kernel.org

On Sat, Sep 18, 2010 at 09:39:21AM +1000, Ben Skeggs wrote:
> On Fri, 2010-09-17 at 19:43 +0200, Marcin Slusarz wrote:
> > Hi
> > Since upgrade from 2.6.35 to 2.6.36-rc3 (nouveau tree) I'm hitting this bug a couple of times a day:
> >
> > [ 2869.618504] ------------[ cut here ]------------
> > [ 2869.618532] kernel BUG at drivers/gpu/drm/ttm/ttm_bo.c:153!
> > [ 2869.618560] invalid opcode: 0000 [#1] PREEMPT SMP
> > [ 2869.618600] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/pm_status
> > [ 2869.618637] CPU 0
> > [ 2869.618649] Modules linked in: nouveau ttm drm_kms_helper snd_hda_codec_realtek snd_hda_intel snd_hda_codec
> > [ 2869.618730]
> > [ 2869.618742] Pid: 11, comm: kworker/0:1 Not tainted 2.6.36-rc3-nv+ #485 P6T SE/System Product Name
> > [ 2869.618781] RIP: 0010:[] [] ttm_bo_release_list+0x37/0xcf [ttm]
> > [ 2869.618830] RSP: 0018:ffff8801bfd85d40 EFLAGS: 00010202
> > [ 2869.618855] RAX: 0000000000000001 RBX: ffff8801b1e50c00 RCX: ffff8801b7500330
> > [ 2869.618887] RDX: 0000000000000001 RSI: 0000000000000037 RDI: ffff8801b1e50c44
> > [ 2869.618918] RBP: ffff8801bfd85d60 R08: 0000000000000002 R09: ffff8801bf683000
> > [ 2869.618950] R10: ffff8801b1e50c70 R11: 0000000000000000 R12: ffff8801b1e50c44
> > [ 2869.618981] R13: ffff8801b7500188 R14: 0000000000000000 R15: ffff8801b7500738
> > [ 2869.619013] FS: 0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
> > [ 2869.619048] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [ 2869.619074] CR2: 00007f2cc0f6b000 CR3: 00000001bfe8a000 CR4: 00000000000006f0
> > [ 2869.619106] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 2869.619137] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [ 2869.619168] Process kworker/0:1 (pid: 11, threadinfo ffff8801bfd84000, task ffff8801bfd88000)
> > [ 2869.619205] Stack:
> > [ 2869.619217] 0000000000000000 ffff8801b1e50c44 ffffffffa0082c67 ffff8801bdcb8ab0
> > [ 2869.619262] <0> ffff8801bfd85d80 ffffffff8121c83a ffff8801b1e50000 ffff8801b1e50c00
> > [ 2869.619318] <0> ffff8801bfd85dd0 ffffffffa00839f4 ffff880001e0ebc0 0000000000000001
> > [ 2869.619374] Call Trace:
> > [ 2869.619392] [] ? ttm_bo_release_list+0x0/0xcf [ttm]
> > [ 2869.619425] [] kref_put+0x43/0x4d
> > [ 2869.619451] [] ttm_bo_delayed_delete+0xa2/0xf9 [ttm]
> > [ 2869.619484] [] ? ttm_bo_delayed_workqueue+0x0/0x30 [ttm]
> > [ 2869.619517] [] ttm_bo_delayed_workqueue+0x1a/0x30 [ttm]
> > [ 2869.619551] [] process_one_work+0x29f/0x448
> > [ 2869.619580] [] worker_thread+0x1d6/0x349
> > [ 2869.619607] [] ? worker_thread+0x0/0x349
> > [ 2869.619635] [] kthread+0x7d/0x85
> > [ 2869.619661] [] kernel_thread_helper+0x4/0x10
> > [ 2869.619692] [] ? finish_task_switch+0x49/0xb2
> > [ 2869.619722] [] ? _raw_spin_unlock_irq+0x19/0x34
> > [ 2869.619752] [] ? restore_args+0x0/0x30
> > [ 2869.619779] [] ? kthread+0x0/0x85
> > [ 2869.619805] [] ? kernel_thread_helper+0x0/0x10
> > [ 2869.619832] Code: 48 8d 5f bc 48 83 ec 08 8b 07 4c 8b 6f c4 85 c0 74 04 0f 0b eb fe 8b 47 fc 85 c0 74 04 0f 0b eb fe 8b 87 a8 00 00 00 85 c0 74 04 <0f> 0b eb fe 48 83 bb 38 01 00 00 00 74 04 0f 0b eb fe 48 83 bb
> > [ 2869.620255] RIP [] ttm_bo_release_list+0x37/0xcf [ttm]
> > [ 2869.620295] RSP
> > [ 2869.629610] ---[ end trace 302a6257f0da8cdc ]---
> >
> > this is BUG_ON(atomic_read(&bo->cpu_writers));
> >
> > I'm on b642f07208988270ac402d2548d42dab1d5fec92 "drm/nv50: fix 100c90 write on nva3".
> > If you have any patches / ideas how to debug this, let me know.
>
> I've actually seen this a couple of times. It appears I was incorrectly
> blaming some patches I have in progress for it, the problem appeared to
> go away when I reverted them. Perhaps it's more random. I have no
> current ideas however.
>

Heh, Francisco provided the patch on IRC, which seems to fix this bug.
http://annarchy.freedesktop.org/~currojerez/nouveau_cpu_prep_validate.patch

I'm still testing it.

Marcin