Repeated corruption of file->f_ep

* Repeated corruption of file->f_ep_lock
@ 2005-09-17 11:27 David Woodhouse
  2005-09-17 13:11 ` Paul Mackerras
  2005-09-18 23:23 ` Gabriel Paubert
  0 siblings, 2 replies; 5+ messages in thread
From: David Woodhouse @ 2005-09-17 11:27 UTC (permalink / raw)
  To: ppc-dev list; +Cc: Davide Libenzi, viro

For a while I've been seeing occasional deadlocks on one CPU of a PPC
SMP machine:

_spin_lock(c8cbf250) CPU#1 NIP c02bb740 holder: cpu 2305 pc 00000000 (lock 24000484)

Further debugging shows that it's always due to file->f_ep_lock being
corrupted, and the deadlock happens when epoll is used on such a file.
The owner_cpu field is almost always 2305. However, it's not due to the
epoll code itself -- I've turned all three of the epoll syscalls into
sys_ni_syscall and it's still happening. I also added sanity checks for
(file->f_ep_lock.owner_cpu > 1) throughout fs/file_table.c, and I see it
happen ten or twenty times during a kernel compile.
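
The sanity check described above can be modelled in user space. This is only a sketch under stated assumptions: `struct dbg_lock` is a simplified stand-in for a debug spinlock (field names taken from the report, layout assumed), and `lock_owner_is_sane` and `ncpus` are hypothetical names, not kernel API. It also makes explicit that the decimal 2305 in the deadlock message is the same value as hex 901.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Simplified stand-in for a debug spinlock; the layout is an assumption. */
struct dbg_lock {
	uint32_t lock;
	uint32_t owner_pc;
	uint32_t owner_cpu;
};

/* The "cpu 2305" in the deadlock message is 0x901 in hex. */
static_assert(2305 == 0x901, "2305 decimal is 0x901");

/*
 * Hypothetical helper: a recorded owner must be a CPU that exists.
 * With ncpus == 2 this is the inverse of the "owner_cpu > 1" check.
 */
static int lock_owner_is_sane(const struct dbg_lock *l, uint32_t ncpus)
{
	return l->owner_cpu < ncpus;
}
```

On a two-CPU machine, `lock_owner_is_sane(&f_ep_lock, 2)` would return 0 for every corrupted lock reported in this thread.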

The previous and next members of 'struct file', which are f_ep_links and
f_mapping respectively, are always fine. It's just f_ep_lock which is
scribbled upon, and the scribble is fairly repeatable: 'owner_cpu' is
almost always set to 0x901 but occasionally 0x501, and the 'lock' field
has values like 20282484, 24042884, 28022484, 24042084, 22000424 (hex).
Do those numbers seem meaningful to anyone? Any clues as to where they
might be coming from?

During a kernel compile, the corruption is mostly detected in fget()
from vfs_fstat(), but also I've seen it once or twice in vfs_read() from
do_execve():

 File cb2f5b40 (fops d107c980) has corrupted f_epoll_lock!
 lock 24002484, owner_pc 0, owner_cpu 901
 f->private_data 00000000, f->f_ep_links (cb2f5bc8, cb2f5bc8), f->f_mapping cc21c1c8
 f->f_mapping->a_ops d107cad8
 Pid 16648, comm gcc
 File is /usr/bin/gcc
 Badness in dumpbadfile at fs/file_table.c:133
 Call trace:
  [c00059b8] check_bug_trap+0xa8/0x120
  [c0005c94] ProgramCheckException+0x264/0x4e0
  [c00050a8] ret_from_except_full+0x0/0x4c
  [c0080bb4] dumpbadfile+0x114/0x160
  [c007f9f0] vfs_read+0xa0/0x1c0
  [c008ef7c] kernel_read+0x3c/0x60
  [c0091810] do_execve+0x1e0/0x280
  [c0008594] sys_execve+0x64/0xd0
  [c0004980] ret_from_syscall+0x0/0x44

This is the Fedora Core kernel (currently 2.6.12.5). The 'owner_cpu > 1'
sanity check isn't applicable to 2.6.13, so I haven't yet tried to
reproduce the problem there.

-- 
dwmw2

* Re: Repeated corruption of file->f_ep_lock
  2005-09-17 11:27 Repeated corruption of file->f_ep_lock David Woodhouse
@ 2005-09-17 13:11 ` Paul Mackerras
  2005-09-17 18:12   ` David Woodhouse
  2005-09-18  1:23   ` Benjamin Herrenschmidt
  2005-09-18 23:23 ` Gabriel Paubert
  1 sibling, 2 replies; 5+ messages in thread
From: Paul Mackerras @ 2005-09-17 13:11 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ppc-dev list, Davide Libenzi, viro

David Woodhouse writes:

> The previous and next members of 'struct file', which are f_ep_links and
> f_mapping respectively, are always fine. It's just f_ep_lock which is
> scribbled upon, and the scribble is fairly repeatable: 'owner_cpu' is
> almost always set to 0x901 but occasionally 0x501, and the 'lock' field
> has values like 20282484, 24042884, 28022484, 24042084, 22000424 (hex).
> Do those numbers seem meaningful to anyone? Any clues as to where they
> might be coming from?

They look like part of an exception stack frame.  The 901 or 501 would
be the trap number; 500 for an external interrupt or 900 for a
decrementer interrupt, plus 1 which we use as a marker to say that
only the volatile registers have been saved in the frame.  The other
values (20282484 etc.) could possibly be condition register values.
That would fit with owner_cpu being 2 words past the lock field; the
trap field in struct pt_regs is 2 words past the ccr field.
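
The layout argument above can be checked mechanically. The following is a minimal sketch, assuming a simplified 32-bit model of struct pt_regs and of a debug spinlock; the `_model` struct names and the two helper functions are hypothetical, and the field order is an assumption to be verified against the real kernel headers. It shows that `trap` sits two words past `ccr`, just as `owner_cpu` sits two words past `lock`, and how 0x901 splits into vector 0x900 plus the marker bit.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Simplified 32-bit model of struct pt_regs (assumed field order). */
struct pt_regs_model {
	uint32_t gpr[32];
	uint32_t nip, msr, orig_gpr3, ctr, link, xer;
	uint32_t ccr;    /* condition register */
	uint32_t mq;
	uint32_t trap;   /* exception vector; low bit used as a marker */
	uint32_t dar, dsisr, result;
};

/* Simplified model of a debug spinlock (assumed field order). */
struct debug_spinlock_model {
	uint32_t lock;
	uint32_t owner_pc;
	uint32_t owner_cpu;
};

/* Both pairs of fields are exactly two 32-bit words apart. */
static_assert(offsetof(struct pt_regs_model, trap) -
	      offsetof(struct pt_regs_model, ccr) == 2 * sizeof(uint32_t),
	      "trap is two words past ccr");
static_assert(offsetof(struct debug_spinlock_model, owner_cpu) -
	      offsetof(struct debug_spinlock_model, lock) == 2 * sizeof(uint32_t),
	      "owner_cpu is two words past lock");

/* Exception vector with the low marker bits stripped: 0x901 -> 0x900. */
static uint32_t trap_vector(uint32_t trap)
{
	return trap & ~(uint32_t)0xF;
}

/* Non-zero when the "only volatile registers saved" marker is set. */
static int trap_is_partial_frame(uint32_t trap)
{
	return (int)(trap & 1u);
}
```

In this model the word between the two fields (`mq` in the frame, `owner_pc` in the lock) would be expected to read as zero, which matches the `owner_pc 0` seen in every report.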

Paul.

* Re: Repeated corruption of file->f_ep_lock
  2005-09-17 13:11 ` Paul Mackerras
@ 2005-09-17 18:12   ` David Woodhouse
  2005-09-18  1:23   ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 5+ messages in thread
From: David Woodhouse @ 2005-09-17 18:12 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: ppc-dev list, Davide Libenzi, viro

On Sat, 2005-09-17 at 23:11 +1000, Paul Mackerras wrote:
> They look like part of an exception stack frame. 

Aha, thanks. Now... any ideas about how they're getting into struct
file? It's always at the same offset in the struct, and it's not at the
same offset in the page each time -- this is happening all over the
place.

e.g....

Sep 17 11:09:34 peach kernel: File cb789760 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 11:09:34 peach kernel: lock 24042084, owner_pc 0, owner_cpu 901
--
Sep 17 11:11:07 peach kernel: File c70038c0 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 11:11:07 peach kernel: lock 24042484, owner_pc 0, owner_cpu 901
--
Sep 17 11:12:22 peach kernel: File c89c0320 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 11:12:22 peach kernel: lock 24042484, owner_pc 0, owner_cpu 901
--
Sep 17 12:02:02 peach kernel: File c90378a0 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:02:02 peach kernel: lock 24022484, owner_pc 0, owner_cpu 901
--
Sep 17 12:04:57 peach kernel: File c962b3e0 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:04:57 peach kernel: lock 28000484, owner_pc 0, owner_cpu 901
--
Sep 17 12:08:01 peach kernel: File cb245700 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:08:01 peach kernel: lock 24042884, owner_pc 0, owner_cpu 901
--
Sep 17 12:11:00 peach kernel: File c64e5aa0 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:11:00 peach kernel: lock 28000484, owner_pc 0, owner_cpu 901
--
Sep 17 12:13:15 peach kernel: File c0a1eac0 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:13:15 peach kernel: lock 28000484, owner_pc 0, owner_cpu 901
--
Sep 17 12:21:44 peach kernel: File caeb3360 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:21:44 peach kernel: lock 24042084, owner_pc 0, owner_cpu 901
--
Sep 17 12:24:25 peach kernel: File caeb3d60 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:24:25 peach kernel: lock 28000484, owner_pc 0, owner_cpu 901
--
Sep 17 12:25:08 peach kernel: File c6b24300 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:25:08 peach kernel: lock 28000484, owner_pc 0, owner_cpu 901
--
Sep 17 12:25:09 peach kernel: File ca6465c0 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:25:09 peach kernel: lock 24022484, owner_pc 0, owner_cpu 901
--
Sep 17 12:26:59 peach kernel: File c9037c60 (fops d107c980) has corrupted f_epoll_lock!
Sep 17 12:26:59 peach kernel: lock 28000484, owner_pc 0, owner_cpu 901
--
Sep 17 12:29:55 peach kernel: File ca64b120 (fops c031f44c) has corrupted f_epoll_lock!
Sep 17 12:29:55 peach kernel: lock 20000484, owner_pc 0, owner_cpu 901
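
The "not at the same offset in the page" observation can be confirmed from the addresses in the log. This is a small sketch, assuming 4 KiB pages; `MODEL_PAGE_SIZE`, `offset_in_page` and `reported_files` are hypothetical names, and the array holds only the first four struct file addresses from the log above.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

static_assert((MODEL_PAGE_SIZE & (MODEL_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two");

/* Offset of an address within its page. */
static uint32_t offset_in_page(uint32_t addr)
{
	return addr & (MODEL_PAGE_SIZE - 1);
}

/* First four struct file addresses from the log above. */
static const uint32_t reported_files[] = {
	0xcb789760u, 0xc70038c0u, 0xc89c0320u, 0xc90378a0u,
};
```

Because the corrupted field lies at a fixed offset inside struct file, its offset within the page varies in the same way as the struct's own offset.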



-- 
dwmw2

* Re: Repeated corruption of file->f_ep_lock
  2005-09-17 13:11 ` Paul Mackerras
  2005-09-17 18:12   ` David Woodhouse
@ 2005-09-18  1:23   ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 5+ messages in thread
From: Benjamin Herrenschmidt @ 2005-09-18  1:23 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: ppc-dev list, David Woodhouse, Davide Libenzi, viro

On Sat, 2005-09-17 at 23:11 +1000, Paul Mackerras wrote:
> David Woodhouse writes:
> 
> > The previous and next members of 'struct file', which are f_ep_links and
> > f_mapping respectively, are always fine. It's just f_ep_lock which is
> > scribbled upon, and the scribble is fairly repeatable: 'owner_cpu' is
> > almost always set to 0x901 but occasionally 0x501, and the 'lock' field
> > has values like 20282484, 24042884, 28022484, 24042084, 22000424 (hex).
> > Do those numbers seem meaningful to anyone? Any clues as to where they
> > might be coming from?
> 
> They look like part of an exception stack frame.  The 901 or 501 would
> be the trap number; 500 for an external interrupt or 900 for a
> decrementer interrupt, plus 1 which we use as a marker to say that
> only the volatile registers have been saved in the frame.  The other
> values (20282484 etc.) could possibly be condition register values.
> That would fit with owner_cpu being 2 words past the lock field; the
> trap field in struct pt_regs is 2 words past the ccr field.

Kernel stack overflow? Also, you could try using the DABR (Data Address
Breakpoint Register), if your CPU has one, to catch the corruption at
the instant it happens...

Ben.

* Re: Repeated corruption of file->f_ep_lock
  2005-09-17 11:27 Repeated corruption of file->f_ep_lock David Woodhouse
  2005-09-17 13:11 ` Paul Mackerras
@ 2005-09-18 23:23 ` Gabriel Paubert
  1 sibling, 0 replies; 5+ messages in thread
From: Gabriel Paubert @ 2005-09-18 23:23 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ppc-dev list, Davide Libenzi, viro

On Sat, Sep 17, 2005 at 12:27:17PM +0100, David Woodhouse wrote:
> For a while I've been seeing occasional deadlocks on one CPU of a PPC
> SMP machine:
> 
> _spin_lock(c8cbf250) CPU#1 NIP c02bb740 holder: cpu 2305 pc 00000000 (lock 24000484)
> 
> Further debugging shows that it's always due to file->f_ep_lock being
> corrupted, and the deadlock happens when epoll is used on such a file.
> The owner_cpu field is almost always 2305. However, it's not due to the
> epoll code itself -- I've turned all three of the epoll syscalls into
> sys_ni_syscall and it's still happening. I also added sanity checks for
> (file->f_ep_lock.owner_cpu > 1) throughout fs/file_table.c, and I see it
> happen ten or twenty times during a kernel compile.
> 
> The previous and next members of 'struct file', which are f_ep_links and
> f_mapping respectively, are always fine. It's just f_ep_lock which is
> scribbled upon, and the scribble is fairly repeatable: 'owner_cpu' is
> almost always set to 0x901 but occasionally 0x501, and the 'lock' field
> has values like 20282484, 24042884, 28022484, 24042084, 22000424 (hex).
> Do those numbers seem meaningful to anyone? Any clues as to where they
> might be coming from?

As Paul mentioned, these look very much like the contents of
a condition register, which is saved and restored in only a few places:
- on every exception/interrupt
- in some stack frames, when GCC decides that it needs to use some
of the 3 CR fields (out of 8) that must be preserved across function
calls. This is rather infrequent.
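
To see why the values look like a condition register, it helps to split one into its eight 4-bit fields. The sketch below assumes the usual PowerPC numbering (CR0 is the most significant nibble, and each field holds LT, GT, EQ, SO from high bit to low) and that CR2 to CR4 are the three callee-saved fields; all function names are hypothetical helpers for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Extract 4-bit field CRn; CR0 is the most significant nibble. */
static unsigned cr_field(uint32_t cr, unsigned n)
{
	assert(n < 8);
	return (cr >> (28 - 4 * n)) & 0xF;
}

/* Render one field as its LT/GT/EQ/SO bits, e.g. 0x2 -> "--E-". */
static void cr_field_flags(unsigned field, char out[5])
{
	out[0] = (field & 8) ? 'L' : '-';
	out[1] = (field & 4) ? 'G' : '-';
	out[2] = (field & 2) ? 'E' : '-';
	out[3] = (field & 1) ? 'S' : '-';
	out[4] = '\0';
}

/* Assumption: CR2, CR3 and CR4 are the fields preserved across calls. */
static int cr_field_is_nonvolatile(unsigned n)
{
	return n >= 2 && n <= 4;
}

/* Print all eight fields of a CR value, one per line. */
static void dump_cr(uint32_t cr)
{
	char flags[5];

	for (unsigned n = 0; n < 8; n++) {
		cr_field_flags(cr_field(cr, n), flags);
		printf("CR%u = %s%s\n", n, flags,
		       cr_field_is_nonvolatile(n) ? " (callee-saved)" : "");
	}
}
```

Every nibble in the reported lock values is 0, 2, 4 or 8, meaning at most one of LT, GT and EQ is set per field, which is the pattern compare instructions leave behind.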

> 
> During a kernel compile, the corruption is mostly detected in fget()
> from vfs_fstat(), but also I've seen it once or twice in vfs_read() from
> do_execve():
> 
>  File cb2f5b40 (fops d107c980) has corrupted f_epoll_lock!
>  lock 24002484, owner_pc 0, owner_cpu 901
>  f->private_data 00000000, f->f_ep_links (cb2f5bc8, cb2f5bc8), f->f_mapping cc21c1c8
>  f->f_mapping->a_ops d107cad8
>  Pid 16648, comm gcc
>  File is /usr/bin/gcc
>  Badness in dumpbadfile at fs/file_table.c:133
>  Call trace:
>   [c00059b8] check_bug_trap+0xa8/0x120
>   [c0005c94] ProgramCheckException+0x264/0x4e0
>   [c00050a8] ret_from_except_full+0x0/0x4c
>   [c0080bb4] dumpbadfile+0x114/0x160
>   [c007f9f0] vfs_read+0xa0/0x1c0
>   [c008ef7c] kernel_read+0x3c/0x60
>   [c0091810] do_execve+0x1e0/0x280
>   [c0008594] sys_execve+0x64/0xd0
>   [c0004980] ret_from_syscall+0x0/0x44

It's hard to imagine a stack overflow on such a short
call chain. The other idea I have is a backlink chain
corruption, but GCC generated code is not very sensitive
to it unless you use alloca()...

In this case, it would be very useful to also have the 
value of the stack pointer (r1) on each line in the call
backtrace. (The PPC ABI makes the call backtrace much more 
reliable than on x86, where the backtrace without frame pointer
is an educated guess at best).

	Regards,
	Gabriel
