From mboxrd@z Thu Jan 1 00:00:00 1970
From: Américo Wang
Subject: Re: [PATCH v4 1/1]: fs: pipe.c null pointer dereference + really sign off + unmangled diffs
Date: Wed, 21 Oct 2009 17:38:22 +0800
Message-ID: <2375c9f90910210238i744fec42gb4d9a1c5229696f0@mail.gmail.com>
References: <4AD8852E.2090302@agilent.com> <4ADCAC33.4070908@agilent.com> <4ADCDB9A.7050701@agilent.com> <4ADCEE6D.6050008@agilent.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: linux-kernel@vger.kernel.org, Al Viro , linux-fsdevel@vger.kernel.org
To: Earl Chew
Return-path:
Received: from qw-out-2122.google.com ([74.125.92.24]:59529 "EHLO qw-out-2122.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750919AbZJUJiS convert rfc822-to-8bit (ORCPT ); Wed, 21 Oct 2009 05:38:18 -0400
In-Reply-To: <4ADCEE6D.6050008@agilent.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Tue, Oct 20, 2009 at 6:55 AM, Earl Chew wrote:
> [ Exactly as before, but really sign off and tabs preserved ]
>
>
> This patch fixes a null pointer exception in pipe_rdwr_open() which
> generates the stack trace:
>
>
>> Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
>>  [] pipe_rdwr_open+0x35/0x70
>>  [] __dentry_open+0x13c/0x230
>>  [] do_filp_open+0x2d/0x40
>>  [] do_sys_open+0x5a/0x100
>>  [] sysenter_do_call+0x1b/0x67
>
>
> The failure mode is triggered by an attempt to open an anonymous
> pipe via /proc/pid/fd/* as exemplified by this script:
>
> =============================================================
> #!/bin/sh
> while : ; do
>   { echo y ; sleep 1 ; } | { while read ; do echo z$REPLY; done ; } &
>   PID=$!
>   OUT=$(ps -efl | grep 'sleep 1' | grep -v grep |
>        { read PID REST ; echo $PID; } )
>   OUT="${OUT%% *}"

You can use 'pgrep' here; it would save you most of this pipeline. Try:

pgrep -f 'sleep 1' -n

>   DELAY=$((RANDOM * 1000 / 32768))
>   usleep $((DELAY * 1000 + RANDOM % 1000 ))
>   echo n > /proc/$OUT/fd/1                 # Trigger defect
> done
> =============================================================
>

I'm afraid this still has very little chance of triggering it.
I tried it on my machine and didn't get any oops.

Writing the reproducer in C may work better.

> Note that the failure window is quite small and I could only
> reliably reproduce the defect by inserting a small delay
> in pipe_rdwr_open(). For example:
>
>  static int
>  pipe_rdwr_open(struct inode *inode, struct file *filp)
>  {
>        msleep(100);
>        mutex_lock(&inode->i_mutex);
>
>
> Although the defect was observed in pipe_rdwr_open(), I think it
> makes sense to replicate the change through all the pipe_*_open()
> functions.
>
> The core of the change is to verify that inode->i_pipe has not
> been released before attempting to manipulate it. If inode->i_pipe
> is no longer present, return ENOENT to indicate so.
>
> The comment about potentially using atomic_t for i_pipe->readers
> and i_pipe->writers has also been removed because it is no longer
> relevant in this context. The inode->i_mutex lock must be used so
> that inode->i_pipe can be dealt with correctly.

So, if I understand you correctly, you mean we have a small window
between calling sys_open() and fifo_open(). During this period we
don't hold i_mutex, so another process has a chance to release
that pipe and make i_pipe NULL. Right?

Hmm, sounds reasonable.
:-/

I'd like you to put the explanations into the code, as comments.

>
>
> Signed-off-by: Earl Chew

Adding fs-devel and Al to Cc.

>
>
> --- linux-2.6.21_mvlcge500/fs/pipe.c.orig       2009-10-15 20:33:53.000000000 -0700
> +++ linux-2.6.21_mvlcge500/fs/pipe.c    2009-10-15 21:21:25.000000000 -0700
> @@ -712,36 +712,55 @@ pipe_rdwr_release(struct inode *inode, s
>  static int
>  pipe_read_open(struct inode *inode, struct file *filp)
>  {
> -       /* We could have perhaps used atomic_t, but this and friends
> -          below are the only places.  So it doesn't seem worthwhile.  */
> +       int ret = -ENOENT;
> +
>         mutex_lock(&inode->i_mutex);
> -       inode->i_pipe->readers++;
> +
> +       if (inode->i_pipe) {
> +               ret = 0;
> +               inode->i_pipe->readers++;
> +       }
> +
>         mutex_unlock(&inode->i_mutex);
>
> -       return 0;
> +       return ret;
>  }
>
>  static int
>  pipe_write_open(struct inode *inode, struct file *filp)
>  {
> +       int ret = -ENOENT;
> +
>         mutex_lock(&inode->i_mutex);
> -       inode->i_pipe->writers++;
> +
> +       if (inode->i_pipe) {
> +               ret = 0;
> +               inode->i_pipe->writers++;
> +       }
> +
>         mutex_unlock(&inode->i_mutex);
>
> -       return 0;
> +       return ret;
>  }
>
>  static int
>  pipe_rdwr_open(struct inode *inode, struct file *filp)
>  {
> +       int ret = -ENOENT;
> +
>         mutex_lock(&inode->i_mutex);
> -       if (filp->f_mode & FMODE_READ)
> -               inode->i_pipe->readers++;
> -       if (filp->f_mode & FMODE_WRITE)
> -               inode->i_pipe->writers++;
> +
> +       if (inode->i_pipe) {
> +               ret = 0;
> +               if (filp->f_mode & FMODE_READ)
> +                       inode->i_pipe->readers++;
> +               if (filp->f_mode & FMODE_WRITE)
> +                       inode->i_pipe->writers++;
> +       }
> +
>         mutex_unlock(&inode->i_mutex);
>
> -       return 0;
> +       return ret;
>  }
>
>  /*
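
To illustrate the kind of in-code comment I am asking for, here is a
rough userspace sketch of the check-under-lock pattern from the patch.
It is not kernel code and not a proposed replacement for the diff:
model_inode, model_pipe, MODEL_FMODE_READ, MODEL_FMODE_WRITE and
model_pipe_rdwr_open() are hypothetical names made up for this example,
and a pthread mutex stands in for inode->i_mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel objects (assumptions). */
#define MODEL_FMODE_READ  1u
#define MODEL_FMODE_WRITE 2u

struct model_pipe {
	unsigned int readers;
	unsigned int writers;
};

struct model_inode {
	pthread_mutex_t i_mutex;	/* stands in for inode->i_mutex */
	struct model_pipe *i_pipe;	/* NULL once the pipe has been released */
};

/* Models the patched pipe_rdwr_open(): 0 on success, -ENOENT if the pipe is gone. */
static int model_pipe_rdwr_open(struct model_inode *inode, unsigned int f_mode)
{
	int ret = -ENOENT;

	pthread_mutex_lock(&inode->i_mutex);

	/*
	 * An open that arrives via /proc/<pid>/fd/* does not hold i_mutex
	 * on its way here, so another task may already have released the
	 * pipe and left i_pipe NULL. Check under the lock and report
	 * ENOENT instead of dereferencing a NULL pointer.
	 */
	if (inode->i_pipe) {
		ret = 0;
		if (f_mode & MODEL_FMODE_READ)
			inode->i_pipe->readers++;
		if (f_mode & MODEL_FMODE_WRITE)
			inode->i_pipe->writers++;
	}

	pthread_mutex_unlock(&inode->i_mutex);

	return ret;
}
```

Calling model_pipe_rdwr_open() on an inode whose i_pipe is still set
bumps the counters and returns 0; once i_pipe is NULL it returns
-ENOENT without touching anything. A comment like the one above the
NULL check is what I would like to see in each pipe_*_open() function.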