* [Bug 25806] New: NV40 vertex corruption (kernel BO deletion too early?)
@ 2009-12-27 21:45 bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ
       [not found] ` <bug-25806-8800-V0hAGp6uBxMKqLRl/0Ahz6D7qz1kEfGD2LY78lusg7I@public.gmane.org/>
  0 siblings, 1 reply; 3+ messages in thread
From: bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ @ 2009-12-27 21:45 UTC (permalink / raw)
  To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

http://bugs.freedesktop.org/show_bug.cgi?id=25806

           Summary: NV40 vertex corruption (kernel BO deletion too early?)
           Product: Mesa
           Version: git
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/DRI/nouveau
        AssignedTo: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
        ReportedBy: luca.barbieri-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org


On my G71 system, several programs show vertex corruption: vertices are
corrupted or randomly go to infinity, producing spiked triangles or random
polygons. Affected programs include demos/engine, demos/dinoshade, Blender and
Extreme Tux Racer.

The system is running:
Linux 2.6.33-rc2
libdrm 2.4.17
Mesa HEAD (b46bcd8e7b37aa2e9159e126c1cc88234a3c2790)
Detected an NV40 generation card (0x049800a2)
64 MB GART aperture
256 MB VRAM

The problem goes away with any one of the following:
1. #define FORCE_SWTNL 1
2. Adding usleep(10000) at the end of nv40_draw_arrays
3. Making nouveau_screen_bo_del do nothing

It seems that Mesa deletes a buffer object used for vertex data while the GPU
is still reading from it. The kernel apparently performs the deletion without
waiting for the GPU to finish drawing, the memory (or GART mapping) is reused,
and corruption ensues.

From Gallium tracing, Mesa is sending vertex data in 64 KB buffers, which are
created, written, drawn and then recreated upon reuse (which seems correct
behavior).

In other words, it seems that the kernel is not holding an extra reference to
buffers that are referenced by an in-flight pushbuffer and dropping that
reference only once the GPU has finished drawing.
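
To make the behavior I would expect concrete, here is a minimal sketch of a
buffer that is pinned by an in-flight submission. It is only an illustration:
struct bo, submit() and retire() are hypothetical names, not the real nouveau
or TTM code, and the "GPU" is simulated by calling retire() by hand.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer object; not the real nouveau or TTM structure. */
struct bo {
    int refcount;   /* one for the userspace handle, one per in-flight submission */
    int *freed;     /* set to 1 when the storage is released, for observation */
};

struct bo *bo_new(int *freed_flag)
{
    struct bo *bo = malloc(sizeof(*bo));
    assert(bo != NULL);
    bo->refcount = 1;          /* reference owned by userspace */
    bo->freed = freed_flag;
    *freed_flag = 0;
    return bo;
}

void bo_ref(struct bo *bo) { bo->refcount++; }

void bo_unref(struct bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0) {
        *bo->freed = 1;        /* storage may be reused from here on */
        free(bo);
    }
}

/* Assumption: submitting a pushbuffer pins every buffer it references. */
void submit(struct bo *bo) { bo_ref(bo); }

/* Assumption: called once the GPU reports the submission as complete. */
void retire(struct bo *bo) { bo_unref(bo); }

/* Returns 1 if the buffer stayed alive until the simulated GPU finished. */
int demo_refcount(void)
{
    int freed;
    struct bo *bo = bo_new(&freed);

    submit(bo);
    bo_unref(bo);              /* userspace deletes the buffer right after drawing */
    int alive_while_in_flight = !freed;

    retire(bo);                /* last reference dropped, storage released */
    return alive_while_in_flight && freed;
}
```

With this scheme, deleting the buffer from userspace right after the draw call
only drops the userspace reference; the storage is released when the
submission retires.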

Is the kernel already supposed to do so?
If yes, something is broken. If things work for others, maybe my system is
somehow more prone to reusing memory or GART mappings, so they don't see that?

If no, then how are things supposed to work?

(BTW, not freeing buffers leads to X freezing and the kernel oopsing on my
machine upon saturating memory, but that's another issue)


-- 
Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
* [Bug 25806] NV40 vertex corruption (kernel BO deletion too early?)
       [not found] ` <bug-25806-8800-V0hAGp6uBxMKqLRl/0Ahz6D7qz1kEfGD2LY78lusg7I@public.gmane.org/>
@ 2009-12-27 22:13   ` bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ
  2009-12-27 22:25   ` bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ
  1 sibling, 0 replies; 3+ messages in thread
From: bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ @ 2009-12-27 22:13 UTC (permalink / raw)
  To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

http://bugs.freedesktop.org/show_bug.cgi?id=25806





--- Comment #1 from Luca Barbieri <luca.barbieri-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>  2009-12-27 14:13:14 PST ---
Upon further examination, the kernel does seem to have the required logic:
sending a pushbuffer creates a fence, which is put in bo->sync_obj, which is
then checked on deletion and if non-null, the buffer is put on a delayed
destroy list.
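
As I read it, the logic is roughly the following. This is a simplified sketch
under my own assumptions, not the actual kernel code: apart from the sync_obj
field, all names (bo_fence, bo_delete, delayed_destroy_run, the fixed-size
list) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DELAYED 8

struct fence { bool signaled; };

struct bo {
    struct fence *sync_obj;  /* fence of the last pushbuffer using this buffer, or NULL */
    bool destroyed;
};

struct bo *delayed[MAX_DELAYED];  /* hypothetical delayed destroy list */
size_t n_delayed;

/* Sending a pushbuffer creates a fence and stores it in bo->sync_obj. */
void bo_fence(struct bo *bo, struct fence *f) { bo->sync_obj = f; }

/* Deletion: a buffer with a fence attached is queued instead of destroyed. */
void bo_delete(struct bo *bo)
{
    if (bo->sync_obj != NULL) {
        assert(n_delayed < MAX_DELAYED);
        delayed[n_delayed++] = bo;
        return;
    }
    bo->destroyed = true;
}

/* Assumed cleanup pass: destroy queued buffers whose fence has signaled. */
void delayed_destroy_run(void)
{
    size_t kept = 0;
    for (size_t i = 0; i < n_delayed; i++) {
        struct bo *bo = delayed[i];
        if (bo->sync_obj->signaled)
            bo->destroyed = true;
        else
            delayed[kept++] = bo;
    }
    n_delayed = kept;
}

/* Returns 1 if destruction was deferred until the fence signaled. */
int demo_delayed_destroy(void)
{
    struct fence f = { .signaled = false };
    struct bo bo = { .sync_obj = NULL, .destroyed = false };

    bo_fence(&bo, &f);
    bo_delete(&bo);               /* GPU still busy: deletion is deferred */
    delayed_destroy_run();
    bool deferred = !bo.destroyed;

    f.signaled = true;            /* simulated GPU completion */
    delayed_destroy_run();
    return deferred && bo.destroyed;
}
```

In this model the buffer is only as safe as the fence: if the fence is
reported as signaled too early, the cleanup pass destroys the buffer while the
GPU may still be reading it.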

However, it seems to be somehow not working.

Maybe fencing is broken on my card? (i.e. the kernel thinks fences are signaled
when they aren't)
Or possibly fences are being signaled before the vertex shader is finished
running?

How can I test that fencing is working correctly?



* [Bug 25806] NV40 vertex corruption (kernel BO deletion too early?)
       [not found] ` <bug-25806-8800-V0hAGp6uBxMKqLRl/0Ahz6D7qz1kEfGD2LY78lusg7I@public.gmane.org/>
  2009-12-27 22:13   ` [Bug 25806] " bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ
@ 2009-12-27 22:25   ` bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ
  1 sibling, 0 replies; 3+ messages in thread
From: bugzilla-daemon-CC+yJ3UmIYqDUpFQwHEjaQ @ 2009-12-27 22:25 UTC (permalink / raw)
  To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

http://bugs.freedesktop.org/show_bug.cgi?id=25806


Francisco Jerez <currojerez-sGOZH3hwPm2sTnJN9+BGXg@public.gmane.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID




--- Comment #2 from Francisco Jerez <currojerez-sGOZH3hwPm2sTnJN9+BGXg@public.gmane.org>  2009-12-27 14:25:30 PST ---
(In reply to comment #1)
> Upon further examination, the kernel does seem to have the required logic:
> sending a pushbuffer creates a fence, which is put in bo->sync_obj, which is
> then checked on deletion and if non-null, the buffer is put on a delayed
> destroy list.
> 
> However, it seems to be somehow not working.
> 
> Maybe fencing is broken on my card? (i.e. the kernel thinks fences are signaled
> when they aren't)
> Or possibly fences are being signaled before the vertex shader is finished
> running?

That would be almost unprecedented... it's more likely that some caches in the
GPU aren't being flushed often enough (or maybe the ones in the CPU... a bug in
the kernel PAT code also used to cause the same symptoms, but that's hopefully
already fixed).

I'm marking this as invalid because that's the current policy; unfortunately,
we're already aware of too many Gallium bugs.

> 
> How can I test that fencing is working correctly?
> 

