From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>,
Daniel Vetter <daniel@ffwll.ch>,
intel-gfx@lists.freedesktop.org,
John Harrison <John.C.Harrison@Intel.com>,
Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: Re: [PATCH] drm/i915: Keep ring->active_list and ring->requests_list consistent
Date: Fri, 20 Mar 2015 15:32:52 +0100
Message-ID: <20150320143252.GO1349@phenom.ffwll.local>
In-Reply-To: <20150320133951.GF10812@nuc-i3427.alporthouse.com>
On Fri, Mar 20, 2015 at 01:39:51PM +0000, Chris Wilson wrote:
> On Fri, Mar 20, 2015 at 01:02:10PM +0000, Chris Wilson wrote:
> > On Fri, Mar 20, 2015 at 11:06:57AM +0100, Daniel Vetter wrote:
> > > On Thu, Mar 19, 2015 at 10:17:42PM +0000, Chris Wilson wrote:
> > > > On Thu, Mar 19, 2015 at 06:37:28PM +0100, Daniel Vetter wrote:
> > > > > On Wed, Mar 18, 2015 at 06:19:22PM +0000, Chris Wilson wrote:
> > > > > > WARNING: CPU: 0 PID: 1383 at drivers/gpu/drm/i915/i915_gem_evict.c:279 i915_gem_evict_vm+0x10c/0x140()
> > > > > > WARN_ON(!list_empty(&vm->active_list))
> > > > >
> > > > > How does this come about - we call gpu_idle before this seems to blow up,
> > > > > so all requests should be completed?
> > > >
> > > > Honestly, I couldn't figure it out either. I had an epiphany when I saw
> > > > that we could now have an empty request list but non-empty active list,
> > > > added a test to detect when that happens and shouted eureka when the
> > > > WARN fired. I could trigger the WARN in evict_vm pretty reliably, but
> > > > not since this patch. It could just be masking another bug.
> > >
> > > Can you perhaps double-check the theory by putting a
> > > WARN_ON(list_empty(active_list) != list_empty(request_list)) into
> > > gpu_idle? Ofc with this patch reverted so that the bug surfaces again.
> >
> > [ 5215.567573] [drm:i915_verify_lists] *ERROR* render ring: active list not empty, but no requests
> > [ 5215.567586] ------------[ cut here ]------------
> > [ 5215.567598] WARNING: CPU: 0 PID: 1304 at drivers/gpu/drm/i915/i915_gem.c:3166 i915_gpu_idle+0x88/0x90()
> > [ 5215.567602] WARN_ON(i915_verify_lists(dev))
> > [ 5215.567606] Modules linked in: ctr ccm arc4 ath9k ath9k_common ath9k_hw bnep ath mac80211 rfcomm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec uvcvideo snd_hwdep snd_pcm gpio_ich videobuf2_vmalloc dell_wmi cfg80211 videobuf2_memops sparse_keymap videobuf2_core dell_laptop snd_seq_midi v4l2_common dcdbas snd_seq_midi_event btusb videodev i8k snd_rawmidi snd_seq hid_multitouch coretemp bluetooth microcode snd_seq_device joydev snd_timer serio_raw snd shpchp soundcore wmi lpc_ich usbhid hid psmouse ahci libahci
> > [ 5215.567708] CPU: 0 PID: 1304 Comm: Xorg Tainted: G W OE 4.0.0-rc4+ #108
> > [ 5215.567713] Hardware name: Dell Inc. Inspiron 1090/Inspiron 1090, BIOS A06 08/23/2011
> > [ 5215.567718] 00000000 00000000 f46e1b98 c16b3e19 f46e1bd8 f46e1bc8 c1047f17 c1937e78
> > [ 5215.567733] f46e1bf4 00000518 c1937cec 00000c5e c14441e8 c14441e8 e733bdc8 00000000
> > [ 5215.567747] f6346c00 f46e1be0 c1047f83 00000009 f46e1bd8 c1937e78 f46e1bf4 f46e1c00
> > [ 5215.567762] Call Trace:
> > [ 5215.567776] [<c16b3e19>] dump_stack+0x41/0x52
> > [ 5215.567788] [<c1047f17>] warn_slowpath_common+0x87/0xc0
> > [ 5215.567797] [<c14441e8>] ? i915_gpu_idle+0x88/0x90
> > [ 5215.567805] [<c14441e8>] ? i915_gpu_idle+0x88/0x90
> > [ 5215.567815] [<c1047f83>] warn_slowpath_fmt+0x33/0x40
> > [ 5215.567823] [<c14441e8>] i915_gpu_idle+0x88/0x90
> > [ 5215.567833] [<c1439949>] i915_gem_evict_something+0x269/0x300
> > [ 5215.567843] [<c144754f>] i915_gem_object_do_pin+0x6ef/0xb20
> > [ 5215.567854] [<c14479c5>] i915_gem_object_pin+0x45/0x50
> > [ 5215.567864] [<c1439f08>] i915_gem_execbuffer_reserve_vma.isra.13+0x78/0x180
> > [ 5215.567874] [<c143a2e5>] i915_gem_execbuffer_reserve+0x2d5/0x320
> > [ 5215.567884] [<c11594cd>] ? __kmalloc+0x14d/0x190
> > [ 5215.567894] [<c143b6d9>] i915_gem_do_execbuffer.isra.17+0x5c9/0xdd0
> > [ 5215.567906] [<c112efdb>] ? vm_mmap_pgoff+0x7b/0xa0
> > [ 5215.567915] [<c11594cd>] ? __kmalloc+0x14d/0x190
> > [ 5215.567925] [<c143cfeb>] i915_gem_execbuffer2+0x8b/0x2c0
> > [ 5215.567934] [<c143cf60>] ? i915_gem_execbuffer+0x4e0/0x4e0
> > [ 5215.567944] [<c1401d67>] drm_ioctl+0x1b7/0x510
> > [ 5215.567954] [<c1120a9a>] ? balance_dirty_pages_ratelimited+0x1a/0x6a0
> > [ 5215.567963] [<c143cf60>] ? i915_gem_execbuffer+0x4e0/0x4e0
> > [ 5215.567975] [<c113cef9>] ? handle_mm_fault+0x329/0x1250
> > [ 5215.567984] [<c1401bb0>] ? drm_getmap+0xb0/0xb0
> > [ 5215.567994] [<c117d9ca>] do_vfs_ioctl+0x30a/0x530
> > [ 5215.568005] [<c10a9e92>] ? ktime_get_ts64+0x52/0x1a0
> > [ 5215.568095] [<c1185f62>] ? __fget_light+0x22/0x60
> > [ 5215.568136] [<c117dc50>] SyS_ioctl+0x60/0x90
> > [ 5215.568175] [<c16b9bc8>] sysenter_do_call+0x12/0x12
> > [ 5215.568198] ---[ end trace ab3f7e4953cb9eb6 ]---
> > [ 5215.568272] ------------[ cut here ]------------
> > [ 5215.568288] WARNING: CPU: 0 PID: 1304 at drivers/gpu/drm/i915/i915_gem_evict.c:283 i915_gem_evict_vm+0x10c/0x140()
> > [ 5215.568292] WARN_ON(!list_empty(&vm->active_list))
> > [ 5215.568296] Modules linked in: ctr ccm arc4 ath9k ath9k_common ath9k_hw bnep ath mac80211 rfcomm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec uvcvideo snd_hwdep snd_pcm gpio_ich videobuf2_vmalloc dell_wmi cfg80211 videobuf2_memops sparse_keymap videobuf2_core dell_laptop snd_seq_midi v4l2_common dcdbas snd_seq_midi_event btusb videodev i8k snd_rawmidi snd_seq hid_multitouch coretemp bluetooth microcode snd_seq_device joydev snd_timer serio_raw snd shpchp soundcore wmi lpc_ich usbhid hid psmouse ahci libahci
> > [ 5215.568383] CPU: 0 PID: 1304 Comm: Xorg Tainted: G W OE 4.0.0-rc4+ #108
> > [ 5215.568388] Hardware name: Dell Inc. Inspiron 1090/Inspiron 1090, BIOS A06 08/23/2011
> > [ 5215.568393] 00000000 00000000 f46e1cc0 c16b3e19 f46e1d00 f46e1cf0 c1047f17 c193712c
> > [ 5215.568407] f46e1d1c 00000518 c19370d0 0000011b c1439c6c c1439c6c f3b225b0 e733c3ec
> > [ 5215.568421] 00000001 f46e1d08 c1047f83 00000009 f46e1d00 c193712c f46e1d1c f46e1d28
> > [ 5215.568435] Call Trace:
> > [ 5215.568445] [<c16b3e19>] dump_stack+0x41/0x52
> > [ 5215.568455] [<c1047f17>] warn_slowpath_common+0x87/0xc0
> > [ 5215.568465] [<c1439c6c>] ? i915_gem_evict_vm+0x10c/0x140
> > [ 5215.568474] [<c1439c6c>] ? i915_gem_evict_vm+0x10c/0x140
> > [ 5215.568483] [<c1047f83>] warn_slowpath_fmt+0x33/0x40
> > [ 5215.568492] [<c1439c6c>] i915_gem_evict_vm+0x10c/0x140
> > [ 5215.568502] [<c143a236>] i915_gem_execbuffer_reserve+0x226/0x320
> > [ 5215.568511] [<c11594cd>] ? __kmalloc+0x14d/0x190
> > [ 5215.568521] [<c143b6d9>] i915_gem_do_execbuffer.isra.17+0x5c9/0xdd0
> > [ 5215.568532] [<c112efdb>] ? vm_mmap_pgoff+0x7b/0xa0
> > [ 5215.568541] [<c11594cd>] ? __kmalloc+0x14d/0x190
> > [ 5215.568550] [<c143cfeb>] i915_gem_execbuffer2+0x8b/0x2c0
> > [ 5215.568560] [<c143cf60>] ? i915_gem_execbuffer+0x4e0/0x4e0
> > [ 5215.568568] [<c1401d67>] drm_ioctl+0x1b7/0x510
> > [ 5215.568577] [<c1120a9a>] ? balance_dirty_pages_ratelimited+0x1a/0x6a0
> > [ 5215.568587] [<c143cf60>] ? i915_gem_execbuffer+0x4e0/0x4e0
> > [ 5215.568599] [<c113cef9>] ? handle_mm_fault+0x329/0x1250
> > [ 5215.568607] [<c1401bb0>] ? drm_getmap+0xb0/0xb0
> > [ 5215.568616] [<c117d9ca>] do_vfs_ioctl+0x30a/0x530
> > [ 5215.568626] [<c10a9e92>] ? ktime_get_ts64+0x52/0x1a0
> > [ 5215.568635] [<c1185f62>] ? __fget_light+0x22/0x60
> > [ 5215.568644] [<c117dc50>] SyS_ioctl+0x60/0x90
> > [ 5215.568653] [<c16b9bc8>] sysenter_do_call+0x12/0x12
> > [ 5215.568659] ---[ end trace ab3f7e4953cb9eb7 ]---
Ok, at least we have clear evidence now that the lists indeed seem to get
out of sync.
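For reference, the invariant that the verify check enforces boils down to comparing the emptiness of the two lists. Below is a minimal userspace sketch of that comparison, not the driver code: struct ring_model and lists_consistent() are hypothetical names, and plain counters stand in for the kernel's list_head lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring: counters instead of kernel list_heads. */
struct ring_model {
	size_t nr_active;   /* buffers still on the active list */
	size_t nr_requests; /* outstanding requests on the request list */
};

/*
 * Returns true when both lists are empty or both are non-empty. This is
 * the inverse of the condition that
 * WARN_ON(list_empty(active_list) != list_empty(request_list)) fires on.
 */
static bool lists_consistent(const struct ring_model *ring)
{
	assert(ring != NULL);

	bool active_empty = ring->nr_active == 0;
	bool requests_empty = ring->nr_requests == 0;

	return active_empty == requests_empty;
}
```

In this model the state reported in the log above (active list not empty, but no requests) is the case where lists_consistent() returns false.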
> Ah, so what it boils down to is that i915_gpu_idle() is a no-op here if
> list_empty(ring->request_list) [intel_ring_idle:2176].
>
> Missing link discovered, I think the bug fixed by the patch is indeed
> the same one that triggered the first WARN.
But if we do that short-circuiting in ring_idle, then all the requests
_should_ be completed. Which means retire_request_ring should move all
buffers to the inactive list, even when we do that before retiring
requests.
I'm still baffled and don't really understand what's going on here ...
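To make the puzzle concrete, here is a toy userspace model of an idle path with an early return on an empty request list. It is only a sketch under stated assumptions: toy_ring, toy_retire() and toy_ring_idle() are hypothetical names, not i915 functions, and the retire step is assumed to empty the active list whenever it runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not i915 code: counters replace list_heads. */
struct toy_ring {
	size_t nr_active;   /* buffers on the active list */
	size_t nr_requests; /* requests on the request list */
};

/* Assumed retire step: completing all requests also empties the active list. */
static void toy_retire(struct toy_ring *ring)
{
	ring->nr_requests = 0;
	ring->nr_active = 0;
}

/*
 * Idle path with the short-circuit discussed above. Returns true when
 * no active buffers remain once the function is done.
 */
static bool toy_ring_idle(struct toy_ring *ring)
{
	assert(ring != NULL);

	/* Short-circuit: with no requests, nothing is waited on or retired. */
	if (ring->nr_requests == 0)
		return ring->nr_active == 0;

	toy_retire(ring);
	return ring->nr_active == 0;
}
```

In this model the early return is harmless as long as the two lists empty together. A ring that somehow reaches nr_requests == 0 with nr_active != 0 stays that way after toy_ring_idle(), which matches the symptom in the traces but does not explain how the lists diverge in the first place.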
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch