* [PATCH] bug in read_buf @ 2010-04-20 2:16 Neil Brown 2010-04-20 16:51 ` J. Bruce Fields 0 siblings, 1 reply; 8+ messages in thread From: Neil Brown @ 2010-04-20 2:16 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs Surely this can never have worked... which implies that the code has never been used? When read_buf is called to move over to the next page in the pagelist of an NFSv4 request, it sets argp->end to essentially a random number, certainly not an address within the page which argp->p now points to. So subsequent calls to READ_BUF will think there is much more than a page of spare space (the cast to u32 ensures an unsigned comparison) so we can expect to fall off the end of the second page. I guess we never ever receive requests with any operation starting beyond the first page! [[ I found this while looking at why fsstress over NFS over RDMA caused a bad memory dereference in READ32, suggesting that 'p' had a bad value. However it was ffff8801299188f0, which is not an "I've fallen off the end of the page" sort of value. So I think it must be a different bug :-( It is as if the page is being unmapped underneath us... ]] NeilBrown diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index e170317..34ccf81 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) argp->p = page_address(argp->pagelist[0]); argp->pagelist++; if (argp->pagelen < PAGE_SIZE) { - argp->end = p + (argp->pagelen>>2); + argp->end = argp->p + (argp->pagelen>>2); argp->pagelen = 0; } else { - argp->end = p + (PAGE_SIZE>>2); + argp->end = argp->p + (PAGE_SIZE>>2); argp->pagelen -= PAGE_SIZE; } memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp) argp->p = page_address(argp->pagelist[0]); argp->pagelist++; if (argp->pagelen < PAGE_SIZE) { - argp->end = p + (argp->pagelen>>2); + argp->end = argp->p + (argp->pagelen>>2); argp->pagelen = 0; } else { - argp->end = p + (PAGE_SIZE>>2); + argp->end = argp->p + (PAGE_SIZE>>2); argp->pagelen -= PAGE_SIZE; } } ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 2:16 [PATCH] bug in read_buf Neil Brown @ 2010-04-20 16:51 ` J. Bruce Fields 2010-04-20 19:24 ` William A. (Andy) Adamson 0 siblings, 1 reply; 8+ messages in thread From: J. Bruce Fields @ 2010-04-20 16:51 UTC (permalink / raw) To: Neil Brown; +Cc: linux-nfs On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > > Surely this can never have worked... which implies that the code has > never been used? > > When read_buf is called to move over to the next page in the pagelist > of an NFSv4 request, it sets argp->end to essentially a random > number, certainly not an address within the page which argp->p now > points to. So subsequent calls to READ_BUF will think there is much > more than a page of spare space (the cast to u32 ensures an unsigned > comparison) so we can expect to fall off the end of the second > page. Yipes, thanks. > I guess we never ever receive requests with any operation starting > beyond the first page! putfh-write-getattr, for example, is common enough. The write decoding should leave arg->end set correctly. But there are two read_buf()'s in decode_getattr(), and I can't see why we don't hit this bug on a write that leaves that final getattr exactly straddling a page boundary. --b. > [[ > I found this while looking at why fsstress over NFS over RDMA caused > a bad memory dereference in READ32, suggesting that 'p' had a bad > value. However it was ffff8801299188f0, which is not an "I've fallen > off the end of the page" sort of value. So I think it must be a > different bug :-( It is as if the page is being unmapped underneath > us... > ]] > NeilBrown > > > > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index e170317..34ccf81 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) > argp->p = page_address(argp->pagelist[0]); > argp->pagelist++; > if (argp->pagelen < PAGE_SIZE) { > - argp->end = p + (argp->pagelen>>2); > + argp->end = argp->p + (argp->pagelen>>2); > argp->pagelen = 0; > } else { > - argp->end = p + (PAGE_SIZE>>2); > + argp->end = argp->p + (PAGE_SIZE>>2); > argp->pagelen -= PAGE_SIZE; > } > memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); > @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp) > argp->p = page_address(argp->pagelist[0]); > argp->pagelist++; > if (argp->pagelen < PAGE_SIZE) { > - argp->end = p + (argp->pagelen>>2); > + argp->end = argp->p + (argp->pagelen>>2); > argp->pagelen = 0; > } else { > - argp->end = p + (PAGE_SIZE>>2); > + argp->end = argp->p + (PAGE_SIZE>>2); > argp->pagelen -= PAGE_SIZE; > } > } > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 16:51 ` J. Bruce Fields @ 2010-04-20 19:24 ` William A. (Andy) Adamson [not found] ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 8+ messages in thread From: William A. (Andy) Adamson @ 2010-04-20 19:24 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.org= > wrote: > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: >> >> Surely this can never have worked... which implies that the code has >> never been used? >> >> When read_buf is called to move over to the next page in the pagelis= t >> of an NFSv4 request, it sets argp->end to essentially a random >> number, certainly not an address within the page which argp->p now >> points to. =A0So subsequent calls to READ_BUF will think there is mu= ch >> more than a page of spare space (the cast to u32 ensures an unsigned >> comparison) so we can expect to fall off the end of the second >> page. > > Yipes, thanks. > >> I guess we never ever receive requests with any operation starting >> beyond the first page! > > putfh-write-getattr, for example, is common enough. =A0The write deco= ding > should leave arg->end set correctly. =A0But there are two read_buf()'= s in > decode_getattr(), and I can't see why we don't hit this bug on a writ= e > that leaves that final getattr exactly straddling a page boundary. The write data is dumped into the rq_vec which has non-contiguous pages. So the xdr_buf head only holds the putfh result, the short write response header (v4 stateid, offset, how, length, etc), and then the getattr. so there is plenty of space. -->Andy > > --b. > >> [[ >> I found this while looking at why fsstress over NFS over RDMA caused >> a bad memory dereference in READ32, suggesting that 'p' had a bad >> value. =A0However it was ffff8801299188f0, which is not an "I've fal= len >> off the end of the page" sort of value. =A0So I think it must be a >> different bug :-( =A0It is as if the page is being unmapped undernea= th >> us... >> ]] >> NeilBrown >> >> >> >> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> index e170317..34ccf81 100644 >> --- a/fs/nfsd/nfs4xdr.c >> +++ b/fs/nfsd/nfs4xdr.c >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compounda= rgs *argp, u32 nbytes) >> =A0 =A0 =A0 argp->p =3D page_address(argp->pagelist[0]); >> =A0 =A0 =A0 argp->pagelist++; >> =A0 =A0 =A0 if (argp->pagelen < PAGE_SIZE) { >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (argp->pagelen>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (argp->pagelen>>2)= ; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen =3D 0; >> =A0 =A0 =A0 } else { >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (PAGE_SIZE>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (PAGE_SIZE>>2); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen -=3D PAGE_SIZE; >> =A0 =A0 =A0 } >> =A0 =A0 =A0 memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compounda= rgs *argp) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->p =3D page_address= (argp->pagelist[0]); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelist++; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (argp->pagelen < PAGE= _SIZE) { >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D p + (argp->pagelen>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D argp->p + (argp->pagelen>>2); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pa= gelen =3D 0; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } else { >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D p + (PAGE_SIZE>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D argp->p + (PAGE_SIZE>>2); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pa= gelen -=3D PAGE_SIZE; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" = in > the body of a message to majordomo@vger.kernel.org > More majordomo info at =A0http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH] bug in read_buf [not found] ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2010-04-20 19:39 ` J. Bruce Fields 2010-04-21 22:35 ` J. Bruce Fields 2010-04-22 15:41 ` William A. (Andy) Adamson 0 siblings, 2 replies; 8+ messages in thread From: J. Bruce Fields @ 2010-04-20 19:39 UTC (permalink / raw) To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson wro= te: > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.o= rg> wrote: > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > >> > >> Surely this can never have worked... which implies that the code h= as > >> never been used? > >> > >> When read_buf is called to move over to the next page in the pagel= ist > >> of an NFSv4 request, it sets argp->end to essentially a random > >> number, certainly not an address within the page which argp->p now > >> points to. =C2=A0So subsequent calls to READ_BUF will think there = is much > >> more than a page of spare space (the cast to u32 ensures an unsign= ed > >> comparison) so we can expect to fall off the end of the second > >> page. > > > > Yipes, thanks. > > > >> I guess we never ever receive requests with any operation starting > >> beyond the first page! > > > > putfh-write-getattr, for example, is common enough. =C2=A0The write= decoding > > should leave arg->end set correctly. =C2=A0But there are two read_b= uf()'s in > > decode_getattr(), and I can't see why we don't hit this bug on a wr= ite > > that leaves that final getattr exactly straddling a page boundary. >=20 > The write data is dumped into the rq_vec which has non-contiguous > pages. So the xdr_buf head only holds the putfh result, the short > write response header (v4 stateid, offset, how, length, etc), and the= n > the getattr. so there is plenty of space. This is the server-side write-decoding, so you could see: rpc header | putfh | write ... data ... | getattr ^ | page boundary here --b. >=20 > -->Andy >=20 > > > > --b. > > > >> [[ > >> I found this while looking at why fsstress over NFS over RDMA caus= ed > >> a bad memory dereference in READ32, suggesting that 'p' had a bad > >> value. =C2=A0However it was ffff8801299188f0, which is not an "I'v= e fallen > >> off the end of the page" sort of value. =C2=A0So I think it must b= e a > >> different bug :-( =C2=A0It is as if the page is being unmapped und= erneath > >> us... > >> ]] > >> NeilBrown > >> > >> > >> > >> > >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > >> index e170317..34ccf81 100644 > >> --- a/fs/nfsd/nfs4xdr.c > >> +++ b/fs/nfsd/nfs4xdr.c > >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoun= dargs *argp, u32 nbytes) > >> =C2=A0 =C2=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]); > >> =C2=A0 =C2=A0 =C2=A0 argp->pagelist++; > >> =C2=A0 =C2=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (arg= p->pagelen>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p = + (argp->pagelen>>2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D= 0; > >> =C2=A0 =C2=A0 =C2=A0 } else { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAG= E_SIZE>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p = + (PAGE_SIZE>>2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D= PAGE_SIZE; > >> =C2=A0 =C2=A0 =C2=A0 } > >> =C2=A0 =C2=A0 =C2=A0 memcpy(((char*)p)+avail, argp->p, (nbytes - a= vail)); > >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoun= dargs *argp) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->pagelist++; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (argp->pagelen>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (argp->pagelen>= >2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D 0; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } else { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAGE_SIZE>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (PAGE_SIZE>>2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D PAGE_SIZE; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } > >> > >> > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-nfs= " in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at =C2=A0http://vger.kernel.org/majordomo-info.= html > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 19:39 ` J. Bruce Fields @ 2010-04-21 22:35 ` J. Bruce Fields 2010-04-21 22:36 ` J. Bruce Fields 2010-04-22 15:41 ` William A. (Andy) Adamson 1 sibling, 1 reply; 8+ messages in thread From: J. Bruce Fields @ 2010-04-21 22:35 UTC (permalink / raw) To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 03:39:44PM -0400, J. Bruce Fields wrote: > On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson w= rote: > > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses= =2Eorg> wrote: > > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > > >> > > >> Surely this can never have worked... which implies that the code= has > > >> never been used? > > >> > > >> When read_buf is called to move over to the next page in the pag= elist > > >> of an NFSv4 request, it sets argp->end to essentially a random > > >> number, certainly not an address within the page which argp->p n= ow > > >> points to. =C2=A0So subsequent calls to READ_BUF will think ther= e is much > > >> more than a page of spare space (the cast to u32 ensures an unsi= gned > > >> comparison) so we can expect to fall off the end of the second > > >> page. > > > > > > Yipes, thanks. > > > > > >> I guess we never ever receive requests with any operation starti= ng > > >> beyond the first page! > > > > > > putfh-write-getattr, for example, is common enough. =C2=A0The wri= te decoding > > > should leave arg->end set correctly. =C2=A0But there are two read= _buf()'s in > > > decode_getattr(), and I can't see why we don't hit this bug on a = write > > > that leaves that final getattr exactly straddling a page boundary= =2E > >=20 > > The write data is dumped into the rq_vec which has non-contiguous > > pages. So the xdr_buf head only holds the putfh result, the short > > write response header (v4 stateid, offset, how, length, etc), and t= hen > > the getattr. so there is plenty of space. >=20 > This is the server-side write-decoding, so you could see: >=20 >=20 > rpc header | putfh | write ... data ... | getattr > ^ > | > page boundary here Hm, I guess even when argp->end is wrong, argp->p is always set to something sane; so on the next READ_BUF(), when you hit the nbytes <=3D (u32)((char *)argp->end - (char *)argp->p case, you do p =3D argp->p; argp->p +=3D XDR_QUADLEN(nbytes); and p is something reasonable. "end" stays wrong, but that won't be a problem until you run past the end of the *next* page, which it would take a very unusual compound to do. --b. >=20 > --b. >=20 > >=20 > > -->Andy > >=20 > > > > > > --b. > > > > > >> [[ > > >> I found this while looking at why fsstress over NFS over RDMA ca= used > > >> a bad memory dereference in READ32, suggesting that 'p' had a ba= d > > >> value. =C2=A0However it was ffff8801299188f0, which is not an "I= 've fallen > > >> off the end of the page" sort of value. =C2=A0So I think it must= be a > > >> different bug :-( =C2=A0It is as if the page is being unmapped u= nderneath > > >> us... > > >> ]] > > >> NeilBrown > > >> > > >> > > >> > > >> > > >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > > >> index e170317..34ccf81 100644 > > >> --- a/fs/nfsd/nfs4xdr.c > > >> +++ b/fs/nfsd/nfs4xdr.c > > >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compo= undargs *argp, u32 nbytes) > > >> =C2=A0 =C2=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0])= ; > > >> =C2=A0 =C2=A0 =C2=A0 argp->pagelist++; > > >> =C2=A0 =C2=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (a= rgp->pagelen>>2); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->= p + (argp->pagelen>>2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D= 0; > > >> =C2=A0 =C2=A0 =C2=A0 } else { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (P= AGE_SIZE>>2); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->= p + (PAGE_SIZE>>2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -= =3D PAGE_SIZE; > > >> =C2=A0 =C2=A0 =C2=A0 } > > >> =C2=A0 =C2=A0 =C2=A0 memcpy(((char*)p)+avail, argp->p, (nbytes -= avail)); > > >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compo= undargs *argp) > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->pagelist++; > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (argp->pagelen>>2= ); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (argp->page= len>>2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D 0; > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } else { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAGE_SIZE>>2); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (PAGE_SIZE>= >2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D PAGE_SIZE; > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } > > >> > > >> > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-n= fs" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at =C2=A0http://vger.kernel.org/majordomo-inf= o.html > > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-21 22:35 ` J. Bruce Fields @ 2010-04-21 22:36 ` J. Bruce Fields 2010-04-21 23:08 ` Neil Brown 0 siblings, 1 reply; 8+ messages in thread From: J. Bruce Fields @ 2010-04-21 22:36 UTC (permalink / raw) To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs On Wed, Apr 21, 2010 at 06:35:27PM -0400, J. Bruce Fields wrote: > On Tue, Apr 20, 2010 at 03:39:44PM -0400, J. Bruce Fields wrote: > > On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson= wrote: > > > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fields= es.org> wrote: > > > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > > > >> > > > >> Surely this can never have worked... which implies that the co= de has > > > >> never been used? > > > >> > > > >> When read_buf is called to move over to the next page in the p= agelist > > > >> of an NFSv4 request, it sets argp->end to essentially a random > > > >> number, certainly not an address within the page which argp->p= now > > > >> points to. =C2=A0So subsequent calls to READ_BUF will think th= ere is much > > > >> more than a page of spare space (the cast to u32 ensures an un= signed > > > >> comparison) so we can expect to fall off the end of the second > > > >> page. > > > > > > > > Yipes, thanks. > > > > > > > >> I guess we never ever receive requests with any operation star= ting > > > >> beyond the first page! > > > > > > > > putfh-write-getattr, for example, is common enough. =C2=A0The w= rite decoding > > > > should leave arg->end set correctly. =C2=A0But there are two re= ad_buf()'s in > > > > decode_getattr(), and I can't see why we don't hit this bug on = a write > > > > that leaves that final getattr exactly straddling a page bounda= ry. > > >=20 > > > The write data is dumped into the rq_vec which has non-contiguous > > > pages. So the xdr_buf head only holds the putfh result, the short > > > write response header (v4 stateid, offset, how, length, etc), and= then > > > the getattr. so there is plenty of space. > >=20 > > This is the server-side write-decoding, so you could see: > >=20 > >=20 > > rpc header | putfh | write ... data ... | getattr > > ^ > > | > > page boundary here >=20 > Hm, I guess even when argp->end is wrong, argp->p is always set to > something sane; so on the next READ_BUF(), when you hit the >=20 > nbytes <=3D (u32)((char *)argp->end - (char *)argp->p >=20 > case, you do >=20 > p =3D argp->p; > argp->p +=3D XDR_QUADLEN(nbytes); >=20 > and p is something reasonable. "end" stays wrong, but that won't be = a > problem until you run past the end of the *next* page, which it would > take a very unusual compound to do. (Nevertheless: applied, for 2.6.34 and stable.) --b. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-21 22:36 ` J. Bruce Fields @ 2010-04-21 23:08 ` Neil Brown 0 siblings, 0 replies; 8+ messages in thread From: Neil Brown @ 2010-04-21 23:08 UTC (permalink / raw) To: J. Bruce Fields; +Cc: William A. (Andy) Adamson, linux-nfs On Wed, 21 Apr 2010 18:36:05 -0400 "J. Bruce Fields" <bfields@fieldses.org> wrote: > > Hm, I guess even when argp->end is wrong, argp->p is always set to > > something sane; so on the next READ_BUF(), when you hit the > > > > nbytes <= (u32)((char *)argp->end - (char *)argp->p > > > > case, you do > > > > p = argp->p; > > argp->p += XDR_QUADLEN(nbytes); > > > > and p is something reasonable. "end" stays wrong, but that won't be a > > problem until you run past the end of the *next* page, which it would > > take a very unusual compound to do. Yes, it would not be an easy bug to trigger ... it takes away some of the thrill of finding a bug when you discover that it only affects a corner case that never ever happens :-( > > (Nevertheless: applied, for 2.6.34 and stable.) Thanks. NeilBrown ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 19:39 ` J. Bruce Fields 2010-04-21 22:35 ` J. Bruce Fields @ 2010-04-22 15:41 ` William A. (Andy) Adamson 1 sibling, 0 replies; 8+ messages in thread From: William A. (Andy) Adamson @ 2010-04-22 15:41 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 3:39 PM, J. Bruce Fields <bfields@fieldses.org>= wrote: > On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson w= rote: >> On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.= org> wrote: >> > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: >> >> >> >> Surely this can never have worked... which implies that the code = has >> >> never been used? >> >> >> >> When read_buf is called to move over to the next page in the page= list >> >> of an NFSv4 request, it sets argp->end to essentially a random >> >> number, certainly not an address within the page which argp->p no= w >> >> points to. =A0So subsequent calls to READ_BUF will think there is= much >> >> more than a page of spare space (the cast to u32 ensures an unsig= ned >> >> comparison) so we can expect to fall off the end of the second >> >> page. >> > >> > Yipes, thanks. >> > >> >> I guess we never ever receive requests with any operation startin= g >> >> beyond the first page! >> > >> > putfh-write-getattr, for example, is common enough. =A0The write d= ecoding >> > should leave arg->end set correctly. =A0But there are two read_buf= ()'s in >> > decode_getattr(), and I can't see why we don't hit this bug on a w= rite >> > that leaves that final getattr exactly straddling a page boundary. >> >> The write data is dumped into the rq_vec which has non-contiguous >> pages. So the xdr_buf head only holds the putfh result, the short >> write response header (v4 stateid, offset, how, length, etc), and th= en >> the getattr. so there is plenty of space. > > This is the server-side write-decoding, so you could see: > > > =A0 =A0 =A0 =A0rpc header | putfh | write ... data ... | getattr > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ^ > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 | > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0page boundary here ulp - you're right. -->Andy > > --b. > >> >> -->Andy >> >> > >> > --b. >> > >> >> [[ >> >> I found this while looking at why fsstress over NFS over RDMA cau= sed >> >> a bad memory dereference in READ32, suggesting that 'p' had a bad >> >> value. =A0However it was ffff8801299188f0, which is not an "I've = fallen >> >> off the end of the page" sort of value. =A0So I think it must be = a >> >> different bug :-( =A0It is as if the page is being unmapped under= neath >> >> us... >> >> ]] >> >> NeilBrown >> >> >> >> >> >> >> >> >> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> >> index e170317..34ccf81 100644 >> >> --- a/fs/nfsd/nfs4xdr.c >> >> +++ b/fs/nfsd/nfs4xdr.c >> >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compou= ndargs *argp, u32 nbytes) >> >> =A0 =A0 =A0 argp->p =3D page_address(argp->pagelist[0]); >> >> =A0 =A0 =A0 argp->pagelist++; >> >> =A0 =A0 =A0 if (argp->pagelen < PAGE_SIZE) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (argp->pagelen>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (argp->pagelen>= >2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen =3D 0; >> >> =A0 =A0 =A0 } else { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (PAGE_SIZE>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (PAGE_SIZE>>2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen -=3D PAGE_SIZE; >> >> =A0 =A0 =A0 } >> >> =A0 =A0 =A0 memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); >> >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compou= ndargs *argp) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->p =3D page_addr= ess(argp->pagelist[0]); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelist++; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (argp->pagelen < P= AGE_SIZE) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D p + (argp->pagelen>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D argp->p + (argp->pagelen>>2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp-= >pagelen =3D 0; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } else { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D p + (PAGE_SIZE>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D argp->p + (PAGE_SIZE>>2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp-= >pagelen -=3D PAGE_SIZE; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> >> >> >> >> > -- >> > To unsubscribe from this list: send the line "unsubscribe linux-nf= s" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at =A0http://vger.kernel.org/majordomo-info.ht= ml >> > > ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-04-22 15:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-20 2:16 [PATCH] bug in read_buf Neil Brown
2010-04-20 16:51 ` J. Bruce Fields
2010-04-20 19:24 ` William A. (Andy) Adamson
[not found] ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-04-20 19:39 ` J. Bruce Fields
2010-04-21 22:35 ` J. Bruce Fields
2010-04-21 22:36 ` J. Bruce Fields
2010-04-21 23:08 ` Neil Brown
2010-04-22 15:41 ` William A. (Andy) Adamson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox