Re: shmem + TTM oops - Thomas Hellström

From: "Thomas Hellström" <thellstrom@vmware.com>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: "Thomas Hellström" <thomas@tungstengraphics.com>,
	"Linux Kernel list" <linux-kernel@vger.kernel.org>,
	"Linux Memory Management" <linux-mm@kvack.org>
Subject: Re: shmem + TTM  oops
Date: Thu, 06 Aug 2009 22:16:44 +0200
Message-ID: <4A7B3A2C.60500@vmware.com>
In-Reply-To: <Pine.LNX.4.64.0908062045270.944@sister.anvils>

Hugh Dickins wrote:
> On Thu, 6 Aug 2009, Thomas Hellström wrote:
>   
>> Hi!
>> I've been debugging a strange problem for a while, and it'd be nice to have
>> some more eyes on this.
>>
>> When the TTM graphics memory manager decides it's using too much memory, it
>> copies the contents of the buffer to shmem objects and releases the buffers.
>> This is because shmem objects are pageable whereas TTM buffers are not. When
>> the TTM buffers are accessed in one way or another, it copies contents back.
>> This seems to work fairly well, but is not really optimal.
>>
>> When the X server is VT switched, TTM optionally switches out all buffers to
>> shmem objects, but when the contents are read back, some shmem objects have
>> a corrupted top-level swap entry directory. The member
>> shmem_inode_info::i_indirect[0] usually contains a value 0xffffff60 or
>> something similar, causing an oops in shmem_truncate_range() when the shmem
>> object is freed. Before that, readback seems to work OK. The corruption
>> happens after the X server VT switch, when TTM is supposed to be idle. The
>> shmem objects have been verified to have swap entry directories after all
>> buffer objects have been swapped out.
>>     
>
> Not a symptom I've ever come across: I agree strange.  A few questions:
>
> What architecture? I assume x86 32-bit; if so, what happens on 64-bit?
> if not x86, what is your PAGE_SIZE?
>
> What size are these objects i.e. how many pages?
>
> What release? I'm assuming 2.6.31-rc5 and various earlier.
>
> What slab allocator? what if you choose another (SLUB versus SLAB)?
> Please turn on all the slab/slub debugging you can.
>
> And you say i_indirect "usually contains a value 0xffffff60 or something
> similar": please give other examples of what you find there (if possible,
> with a rough idea of their frequency e.g. is 0xffffff60 the most common?).
>
> Does there appear to be corruption of any other nearby fields?
>
> Thanks.
>
>   
>> If anyone could shed some light on this, it would be very helpful. The relevant
>> TTM code is fairly straightforward and looks like this. The process that copies
>> out to shmem objects may not be the same process that copies in:
>>     
>
> I didn't notice anything wrong with your code; and it wouldn't
> be easy for it to corrupt that field of shmem_inode_info.
>
> Hugh
Hugh,

Thanks for looking at this.
After further debugging, it seems this is unrelated to the shmem code.
It looks like a (possibly misconfigured) hrtimer in the graphics driver
corrupts the shmem_inode_info data from within interrupt context, so
this appears to be a false alarm. The hrtimer was supposed to be idled
at VT switch, but apparently it was not.

Thanks,
Thomas
