public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* fs/pipe.c null pointer dereference
@ 2009-10-14 13:39 Earl Chew
  2009-10-14 14:53 ` Frans Pop
  0 siblings, 1 reply; 9+ messages in thread
From: Earl Chew @ 2009-10-14 13:39 UTC (permalink / raw)
  To: linux-kernel

I'm working on a 2.6.21 based kernel and received the following
oops last tonight:

> stopped custom tracer.
> Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
>  [<ffffffff802899a5>] pipe_rdwr_open+0x35/0x70
> PGD 17d198067 PUD 17c672067 PMD 0
> Oops: 0002 [1] PREEMPT SMP
> CPU 0
> Modules linked in: jffs2 cfi_cmdset_0001 cfi_util cfi_probe gen_probe physmap_lo e1000e fakephp amp_uio uio coretemp lm90 hwmon w83627ehf ipmi_watchdog ipmi_devintf ipmi_si ipmi_msghandler ppdev physmap mtdpart chipreg map_funcs mtdblock mtd_blkdevs mtdchar mtdcore
> Pid: 6928, comm: poll Not tainted 2.6.21-amp64c-10X-n2x-10X #1
> RIP: 0010:[<ffffffff802899a5>]  [<ffffffff802899a5>] pipe_rdwr_open+0x35/0x70
> RSP: 0018:ffff81017c583e48  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff81017c9bc490 RCX: ffffffff80e48c00
> RDX: ffff81017c583fd8 RSI: ffff81017c603040 RDI: ffff81000642bf40
> RBP: ffff81017cf2dec0 R08: ffff81017c582000 R09: 0000000000000082
> R10: ffff81017d1b6000 R11: ffffffff802985f0 R12: ffff81017c9bc550
> R13: ffffffff80289970 R14: ffff81017cea1e90 R15: ffff81017fc2c980
> FS:  0000000000000000(0000) GS:ffffffff806b30c0(0063) knlGS:00000000f7dc16c0
> CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 00000000f7dfb040 CR3: 000000017c5bc000 CR4: 00000000000006e0
> Process poll (pid: 6928, threadinfo ffff81017c582000, task ffff81017c603040)
> Stack:  ffff81017cf2dec0 ffff81017c9bc490 0000000000008000 ffffffff8028125c
>  ffff81017c603040 0000000000008000 ffff81017f68e000 0000000000008000
>  0000000000000004 00000000ffffff9c 0000000000000000 ffffffff8028143d
> Call Trace:
>  [<ffffffff8028125c>] __dentry_open+0x13c/0x230
>  [<ffffffff8028143d>] do_filp_open+0x2d/0x40
>  [<ffffffff802814aa>] do_sys_open+0x5a/0x100
>  [<ffffffff8021faf3>] sysenter_do_call+0x1b/0x67
> 
> 
> Code: 83 40 28 01 8b 45 38 a8 02 74 0b 48 8b 83 d8 01 00 00 83 40
> RIP  [<ffffffff802899a5>] pipe_rdwr_open+0x35/0x70
>  RSP <ffff81017c583e48>
> CR2: 0000000000000028

The null dereference is happening at the ++ operator in:

 > static int
 > pipe_rdwr_open(struct inode *inode, struct file *filp)
 > {
 >         mutex_lock(&inode->i_mutex);
 >         if (filp->f_mode & FMODE_READ)
 >                 inode->i_pipe->readers++;

The corresponding assembler is:

> 0000000000000190 <pipe_rdwr_open>:
>      190:       48 83 ec 18             sub    $0x18,%rsp
>      194:       4c 89 64 24 10          mov    %r12,0x10(%rsp)
>      199:       4c 8d a7 c0 00 00 00    lea    0xc0(%rdi),%r12
>      1a0:       48 89 1c 24             mov    %rbx,(%rsp)
>      1a4:       48 89 6c 24 08          mov    %rbp,0x8(%rsp)
>      1a9:       48 89 fb                mov    %rdi,%rbx
>      1ac:       48 89 f5                mov    %rsi,%rbp
>      1af:       4c 89 e7                mov    %r12,%rdi
>      1b2:       e8 00 00 00 00          callq  1b7 <pipe_rdwr_open+0x27>
>                         1b3: R_X86_64_PC32      mutex_lock+0xfffffffffffffffc
>      1b7:       8b 45 38                mov    0x38(%rbp),%eax
>      1ba:       a8 01                   test   $0x1,%al
>      1bc:       74 0e                   je     1cc <pipe_rdwr_open+0x3c>
>      1be:       48 8b 83 d8 01 00 00    mov    0x1d8(%rbx),%rax
>      1c5:       83 40 28 01             addl   $0x1,0x28(%rax)    <--------**** FAULT HERE ****

IOW i_pipe is NULL, apparently set by free_pipe_info()

I went trawling through the code to see if I could figure out
how this might have happened. The are mutexes of the form:

        mutex_lock(&inode->i_mutex);
          ...
        mutex_unlock(&inode->i_mutex);

throughout fs/pipe.c and fs/fifo.c so the above seems to be
an impossibility ;-)


Perhaps there is a potential window for failure in fs/fifo.c.

pipe_rdwr_open() is only accessible via rdwr_pipefifo_fops and
that is obtained via fs/fifo.c.

Looking at fs/fifo.c I see:

        mutex_lock(&inode->i_mutex);
          ...
        switch (filp->f_mode) {
        case FMODE_READ:
                ...
                 if (!pipe->writers) {
                       wait_for_partner(inode, filp, &pipe->w_counter);
                ...
        case FMODE_WRITE:
                ...
                 if (!pipe->readers) {
                         wait_for_partner(inode, filp, &pipe->r_counter);
                ...

        case FMODE_READ | FMODE_WRITE:
                 filp->f_op = &rdwr_pipefifo_fops;
                   ...
                 if (pipe->readers == 1 || pipe->writers == 1)
                         wake_up_partner(inode);
                 break;

        }
          ...
        mutex_unlock(&inode->i_mutex);

So it turns out that FMODE_READ|FMODE_WRITE does not block.

However, FMODE_READ alone or FMODE_WRITE alone may call
wait_for_partner(), which in turn calls pipe_wait(), which
in turn drops the mutex, then reacquires it:

> void pipe_wait(struct pipe_inode_info *pipe)
> {
 >          ...
>         if (pipe->inode)
>                 mutex_unlock(&pipe->inode->i_mutex);
 >          ...
>         if (pipe->inode)
>                 mutex_lock(&pipe->inode->i_mutex);
> }

So perhaps:

1. Process A calls fifo_open(FMODE_READ), then relinquishes
    the mutex at pipe_wait() (readers == 1, writers == 0)

2. Process B calls fifo_open(FMODE_WRITE|FMODE_READ) and completes
    (readers == 2, writers == 1)

3. Process A wakes, but finds signal pending, so goes to err_rd
    and drops readers to 1

  ... but I couldn't figure out a way for this to fail ...

Any other ideas?

Earl



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-14 13:39 fs/pipe.c null pointer dereference Earl Chew
@ 2009-10-14 14:53 ` Frans Pop
  2009-10-14 22:50   ` Earl Chew
  0 siblings, 1 reply; 9+ messages in thread
From: Frans Pop @ 2009-10-14 14:53 UTC (permalink / raw)
  To: Earl Chew; +Cc: linux-kernel

Earl Chew wrote:
> I'm working on a 2.6.21 based kernel and received the following
> oops last tonight:

2.6.21 is really rather ancient for the kernel community and I doubt anyone 
here will really want to look into issues with it, given that there's a 
realistic chance it has already been fixed since then. The current stable 
version is 2.6.31 after all.

Can you reproduce the oops with a current kernel? Have you checked if there 
have been any changes in the code you identified since 2.6.21?

Cheers,
FJP

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-14 14:53 ` Frans Pop
@ 2009-10-14 22:50   ` Earl Chew
  2009-10-15  0:11     ` Frans Pop
  2009-10-16  3:31     ` Earl Chew
  0 siblings, 2 replies; 9+ messages in thread
From: Earl Chew @ 2009-10-14 22:50 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-kernel

Frans Pop wrote:
> 2.6.21 is really rather ancient for the kernel community and I doubt anyone 
> here will really want to look into issues with it, given that there's a 
> realistic chance it has already been fixed since then. The current stable 
> version is 2.6.31 after all.
> 
> Can you reproduce the oops with a current kernel? Have you checked if there 
> have been any changes in the code you identified since 2.6.21?

I was mostly after some guidance. I'm happy to dig around to figure
out what the problem is and log a bug if indeed I find one.


Yes, I have searched for obvious patches but didn't turn up anything.

I have looked at the code for more recent 2.6.31 kernel on
lxr.linux.no and haven't turned up any appreciable difference.

I have given the matter more thought, and I believe it is unlikely
that the problem is in fs/fifo.c. As outlined in my previous
email, the code looks ok.


I'm a little confused about the switch between the fifo_open()
and the subsequent call to pipe_rdwr_open(). My understanding
is fifo_open() is called during open("/fifo", O_RDWR) which
fills in filp->f_op to point at rdwr_fifo_fops.

Thereafter, a call to close() will vector through filp->f_op
to pipe_rdwr_release().

I'm still puzzled as to how pipe_rdwr_open() is called. The
stack trace says:

Call Trace:
  [<ffffffff8028125c>] __dentry_open+0x13c/0x230
  [<ffffffff8028143d>] do_filp_open+0x2d/0x40
  [<ffffffff802814aa>] do_sys_open+0x5a/0x100
  [<ffffffff8021faf3>] sysenter_do_call+0x1b/0x67

I figure that the fileoperation for the fifo inode must be
revectored to rdwr_fifo_fops --- but I can't see that's
done exactly.

Any hints as to where to look for this ?


In any case, my current guess as to how this happens is that
there might be window of time between do_sys_open() ... pipe_rdwr_open()
where the inode is obtained and the fileoperation pointer is
followed to get to pipe_rdwr_open().

During that window of time, might it be possible for the fifo
to be closed and the underlying pipe released?


Where can I look to see if this is possible?

Earl



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-14 22:50   ` Earl Chew
@ 2009-10-15  0:11     ` Frans Pop
  2009-10-16  4:27       ` Earl Chew
  2009-10-16  3:31     ` Earl Chew
  1 sibling, 1 reply; 9+ messages in thread
From: Frans Pop @ 2009-10-15  0:11 UTC (permalink / raw)
  To: Earl Chew; +Cc: linux-kernel

On Thursday 15 October 2009, Earl Chew wrote:
> Frans Pop wrote:
> > 2.6.21 is really rather ancient for the kernel community and I doubt
> > anyone here will really want to look into issues with it, given that
> > there's a realistic chance it has already been fixed since then. The
> > current stable version is 2.6.31 after all.
> >
> > Can you reproduce the oops with a current kernel? Have you checked if
> > there have been any changes in the code you identified since 2.6.21?
>
> I was mostly after some guidance. I'm happy to dig around to figure
> out what the problem is and log a bug if indeed I find one.

OK. I'm afraid I can't help you there myself, but maybe with this 
additional info someone else will.

Good luck with the hunt.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-14 22:50   ` Earl Chew
  2009-10-15  0:11     ` Frans Pop
@ 2009-10-16  3:31     ` Earl Chew
  2009-10-16  3:39       ` Earl Chew
  1 sibling, 1 reply; 9+ messages in thread
From: Earl Chew @ 2009-10-16  3:31 UTC (permalink / raw)
  Cc: linux-kernel

Earl Chew wrote:
 > I have given the matter more thought, and I believe it is unlikely
 > that the problem is in fs/fifo.c. As outlined in my previous
 > email, the code looks ok.

I notice that the other place the rdwr_pipe_fops is set up is:

 > static struct inode * get_pipe_inode(void)
 > {
 >        ...
 >         inode->i_fop = &rdwr_pipe_fops;
 >

and that get_pipe_inode() is called from:

 > struct file *create_write_pipe(void)
 > {
 >        ...
 >         inode = get_pipe_inode();
 >         if (!inode)
 >                 goto err_file;

and that is called from do_pipe():

 > int do_pipe(int *fd)
 > {
 >        ...
 >         fw = create_write_pipe();
 >         if (IS_ERR(fw))
 >                 return PTR_ERR(fw);
 >         fr = create_read_pipe(fw);
 >         error = PTR_ERR(fr);
 >         if (IS_ERR(fr))
 >                 goto err_write_pipe;


... and do_pipe() is called from many places.


The stack trace I have shows:

 > Call Trace:
 > [<ffffffff802899a5>] pipe_rdwr_open+0x35/0x70
 > [<ffffffff8028125c>] __dentry_open+0x13c/0x230
 > [<ffffffff8028143d>] do_filp_open+0x2d/0x40
 > [<ffffffff802814aa>] do_sys_open+0x5a/0x100
 > [<ffffffff8021faf3>] sysenter_do_call+0x1b/0x67


How can it be possible that sys_open (ie open(2)) can get hold
of an inode whose i_fop->open() points at pipe_rdwr_open() ?

Is this possible via /proc/pid/fd/* ?

Yes, it looks likely:

> { echo y ; sleep 1 ; } | { while read ; do echo z$REPLY; done ; } &
> PID=$!
> OUT=$(ps -efl | grep 'sleep 1' | grep -v grep)
> OUT=${OUT%% *}
> echo n > /proc/$OUT/fd/1

This test prints zy and zn indicating that the 2nd echo was able
to open the writing end of the pipe and inject another character.

Earl



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-16  3:31     ` Earl Chew
@ 2009-10-16  3:39       ` Earl Chew
  0 siblings, 0 replies; 9+ messages in thread
From: Earl Chew @ 2009-10-16  3:39 UTC (permalink / raw)
  Cc: linux-kernel

Earl Chew wrote:
> Is this possible via /proc/pid/fd/* ?

The window for failure is small. It's easiest to reproduce
this problem by stalling pipe_rdwr_open() to open up the
window:

--- pipe.c.orig 2009-10-15 20:33:53.000000000 -0700
+++ pipe.c      2009-10-15 20:17:40.000000000 -0700
@@ -736,2 +736,3 @@
  {
+       msleep(100);
         mutex_lock(&inode->i_mutex);


With the failure window widened, it's easy to reproduce
the failure with:

--------------------------------------------------------------
#!/bin/sh

while : ; do
   { echo y ; sleep 1 ; } | { while read ; do echo z$REPLY; done ; } &
   PID=$!
   OUT=$(ps -efl | grep 'sleep 1' | grep -v grep |
        { read PID REST ; echo $PID; } )
   OUT="${OUT%% *}"
   DELAY=$((RANDOM * 1000 / 32768))
   usleep $((DELAY * 1000 + RANDOM % 1000 ))
   echo n > /proc/$OUT/fd/1
done
--------------------------------------------------------------



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-15  0:11     ` Frans Pop
@ 2009-10-16  4:27       ` Earl Chew
  2009-10-16  7:16         ` Pekka Enberg
  0 siblings, 1 reply; 9+ messages in thread
From: Earl Chew @ 2009-10-16  4:27 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-kernel

Frans Pop wrote:
> Good luck with the hunt.

I've logged a defect:

http://bugzilla.kernel.org/show_bug.cgi?id=14416

which includes a proposed fix and script to reproduce.

Earl




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-16  4:27       ` Earl Chew
@ 2009-10-16  7:16         ` Pekka Enberg
  2009-10-16 14:51           ` Earl Chew
  0 siblings, 1 reply; 9+ messages in thread
From: Pekka Enberg @ 2009-10-16  7:16 UTC (permalink / raw)
  To: Earl Chew
  Cc: Frans Pop, linux-kernel, Andrew Morton, Jens Axboe,
	Linus Torvalds

Hi Earl,

On Fri, Oct 16, 2009 at 7:27 AM, Earl Chew <earl_chew@agilent.com> wrote:
> Frans Pop wrote:
>>
>> Good luck with the hunt.
>
> I've logged a defect:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=14416
>
> which includes a proposed fix and script to reproduce.

Can you please send the patch to LKML so your fix doesn't get lost in
the noise? See Documentation/SubmittingPatches for details.

                        Pekka

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: fs/pipe.c null pointer dereference
  2009-10-16  7:16         ` Pekka Enberg
@ 2009-10-16 14:51           ` Earl Chew
  0 siblings, 0 replies; 9+ messages in thread
From: Earl Chew @ 2009-10-16 14:51 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Frans Pop, linux-kernel, Andrew Morton, Jens Axboe,
	Linus Torvalds

Pekka Enberg wrote:
> Can you please send the patch to LKML so your fix doesn't get lost in
> the noise? See Documentation/SubmittingPatches for details.

Done.

http://lkml.org/lkml/2009/10/16/146

Earl




^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-10-16 14:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-14 13:39 fs/pipe.c null pointer dereference Earl Chew
2009-10-14 14:53 ` Frans Pop
2009-10-14 22:50   ` Earl Chew
2009-10-15  0:11     ` Frans Pop
2009-10-16  4:27       ` Earl Chew
2009-10-16  7:16         ` Pekka Enberg
2009-10-16 14:51           ` Earl Chew
2009-10-16  3:31     ` Earl Chew
2009-10-16  3:39       ` Earl Chew

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox