* [PATCH] bug in read_buf @ 2010-04-20 2:16 Neil Brown 2010-04-20 16:51 ` J. Bruce Fields 0 siblings, 1 reply; 8+ messages in thread From: Neil Brown @ 2010-04-20 2:16 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs Surely this can never have worked... which implies that the code has never been used? When read_buf is called to move over to the next page in the pagelist of an NFSv4 request, it sets argp->end to essentially a random number, certainly not an address within the page which argp->p now points to. So subsequent calls to READ_BUF will think there is much more than a page of spare space (the cast to u32 ensures an unsigned comparison) so we can expect to fall off the end of the second page. I guess we never ever receive requests with any operation starting beyond the first page! [[ I found this while looking at why fsstress over NFS over RDMA caused a bad memory dereference in READ32, suggesting that 'p' had a bad value. However it was ffff8801299188f0, which is not an "I've fallen off the end of the page" sort of value. So I think it must be a different bug :-( It is as if the page is being unmapped underneath us... ]] NeilBrown diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index e170317..34ccf81 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) argp->p = page_address(argp->pagelist[0]); argp->pagelist++; if (argp->pagelen < PAGE_SIZE) { - argp->end = p + (argp->pagelen>>2); + argp->end = argp->p + (argp->pagelen>>2); argp->pagelen = 0; } else { - argp->end = p + (PAGE_SIZE>>2); + argp->end = argp->p + (PAGE_SIZE>>2); argp->pagelen -= PAGE_SIZE; } memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp) argp->p = page_address(argp->pagelist[0]); argp->pagelist++; if (argp->pagelen < PAGE_SIZE) { - argp->end = p + (argp->pagelen>>2); + argp->end = argp->p + (argp->pagelen>>2); argp->pagelen = 0; } else { - argp->end = p + (PAGE_SIZE>>2); + argp->end = argp->p + (PAGE_SIZE>>2); argp->pagelen -= PAGE_SIZE; } } ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 2:16 [PATCH] bug in read_buf Neil Brown @ 2010-04-20 16:51 ` J. Bruce Fields 2010-04-20 19:24 ` William A. (Andy) Adamson 0 siblings, 1 reply; 8+ messages in thread From: J. Bruce Fields @ 2010-04-20 16:51 UTC (permalink / raw) To: Neil Brown; +Cc: linux-nfs On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > > Surely this can never have worked... which implies that the code has > never been used? > > When read_buf is called to move over to the next page in the pagelist > of an NFSv4 request, it sets argp->end to essentially a random > number, certainly not an address within the page which argp->p now > points to. So subsequent calls to READ_BUF will think there is much > more than a page of spare space (the cast to u32 ensures an unsigned > comparison) so we can expect to fall off the end of the second > page. Yipes, thanks. > I guess we never ever receive requests with any operation starting > beyond the first page! putfh-write-getattr, for example, is common enough. The write decoding should leave arg->end set correctly. But there are two read_buf()'s in decode_getattr(), and I can't see why we don't hit this bug on a write that leaves that final getattr exactly straddling a page boundary. --b. > [[ > I found this while looking at why fsstress over NFS over RDMA caused > a bad memory dereference in READ32, suggesting that 'p' had a bad > value. However it was ffff8801299188f0, which is not an "I've fallen > off the end of the page" sort of value. So I think it must be a > different bug :-( It is as if the page is being unmapped underneath > us... > ]] > NeilBrown > > > > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index e170317..34ccf81 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) > argp->p = page_address(argp->pagelist[0]); > argp->pagelist++; > if (argp->pagelen < PAGE_SIZE) { > - argp->end = p + (argp->pagelen>>2); > + argp->end = argp->p + (argp->pagelen>>2); > argp->pagelen = 0; > } else { > - argp->end = p + (PAGE_SIZE>>2); > + argp->end = argp->p + (PAGE_SIZE>>2); > argp->pagelen -= PAGE_SIZE; > } > memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); > @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp) > argp->p = page_address(argp->pagelist[0]); > argp->pagelist++; > if (argp->pagelen < PAGE_SIZE) { > - argp->end = p + (argp->pagelen>>2); > + argp->end = argp->p + (argp->pagelen>>2); > argp->pagelen = 0; > } else { > - argp->end = p + (PAGE_SIZE>>2); > + argp->end = argp->p + (PAGE_SIZE>>2); > argp->pagelen -= PAGE_SIZE; > } > } > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 16:51 ` J. Bruce Fields @ 2010-04-20 19:24 ` William A. (Andy) Adamson [not found] ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 8+ messages in thread From: William A. (Andy) Adamson @ 2010-04-20 19:24 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.org= > wrote: > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: >> >> Surely this can never have worked... which implies that the code has >> never been used? >> >> When read_buf is called to move over to the next page in the pagelis= t >> of an NFSv4 request, it sets argp->end to essentially a random >> number, certainly not an address within the page which argp->p now >> points to. =A0So subsequent calls to READ_BUF will think there is mu= ch >> more than a page of spare space (the cast to u32 ensures an unsigned >> comparison) so we can expect to fall off the end of the second >> page. > > Yipes, thanks. > >> I guess we never ever receive requests with any operation starting >> beyond the first page! > > putfh-write-getattr, for example, is common enough. =A0The write deco= ding > should leave arg->end set correctly. =A0But there are two read_buf()'= s in > decode_getattr(), and I can't see why we don't hit this bug on a writ= e > that leaves that final getattr exactly straddling a page boundary. The write data is dumped into the rq_vec which has non-contiguous pages. So the xdr_buf head only holds the putfh result, the short write response header (v4 stateid, offset, how, length, etc), and then the getattr. so there is plenty of space. -->Andy > > --b. > >> [[ >> I found this while looking at why fsstress over NFS over RDMA caused >> a bad memory dereference in READ32, suggesting that 'p' had a bad >> value. =A0However it was ffff8801299188f0, which is not an "I've fal= len >> off the end of the page" sort of value. =A0So I think it must be a >> different bug :-( =A0It is as if the page is being unmapped undernea= th >> us... >> ]] >> NeilBrown >> >> >> >> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> index e170317..34ccf81 100644 >> --- a/fs/nfsd/nfs4xdr.c >> +++ b/fs/nfsd/nfs4xdr.c >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compounda= rgs *argp, u32 nbytes) >> =A0 =A0 =A0 argp->p =3D page_address(argp->pagelist[0]); >> =A0 =A0 =A0 argp->pagelist++; >> =A0 =A0 =A0 if (argp->pagelen < PAGE_SIZE) { >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (argp->pagelen>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (argp->pagelen>>2)= ; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen =3D 0; >> =A0 =A0 =A0 } else { >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (PAGE_SIZE>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (PAGE_SIZE>>2); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen -=3D PAGE_SIZE; >> =A0 =A0 =A0 } >> =A0 =A0 =A0 memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compounda= rgs *argp) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->p =3D page_address= (argp->pagelist[0]); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelist++; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (argp->pagelen < PAGE= _SIZE) { >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D p + (argp->pagelen>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D argp->p + (argp->pagelen>>2); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pa= gelen =3D 0; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } else { >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D p + (PAGE_SIZE>>2); >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->end = =3D argp->p + (PAGE_SIZE>>2); >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pa= gelen -=3D PAGE_SIZE; >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" = in > the body of a message to majordomo@vger.kernel.org > More majordomo info at =A0http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH] bug in read_buf [not found] ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2010-04-20 19:39 ` J. Bruce Fields 2010-04-21 22:35 ` J. Bruce Fields 2010-04-22 15:41 ` William A. (Andy) Adamson 0 siblings, 2 replies; 8+ messages in thread From: J. Bruce Fields @ 2010-04-20 19:39 UTC (permalink / raw) To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson wro= te: > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.o= rg> wrote: > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > >> > >> Surely this can never have worked... which implies that the code h= as > >> never been used? > >> > >> When read_buf is called to move over to the next page in the pagel= ist > >> of an NFSv4 request, it sets argp->end to essentially a random > >> number, certainly not an address within the page which argp->p now > >> points to. =C2=A0So subsequent calls to READ_BUF will think there = is much > >> more than a page of spare space (the cast to u32 ensures an unsign= ed > >> comparison) so we can expect to fall off the end of the second > >> page. > > > > Yipes, thanks. > > > >> I guess we never ever receive requests with any operation starting > >> beyond the first page! > > > > putfh-write-getattr, for example, is common enough. =C2=A0The write= decoding > > should leave arg->end set correctly. =C2=A0But there are two read_b= uf()'s in > > decode_getattr(), and I can't see why we don't hit this bug on a wr= ite > > that leaves that final getattr exactly straddling a page boundary. >=20 > The write data is dumped into the rq_vec which has non-contiguous > pages. So the xdr_buf head only holds the putfh result, the short > write response header (v4 stateid, offset, how, length, etc), and the= n > the getattr. so there is plenty of space. This is the server-side write-decoding, so you could see: rpc header | putfh | write ... data ... | getattr ^ | page boundary here --b. >=20 > -->Andy >=20 > > > > --b. > > > >> [[ > >> I found this while looking at why fsstress over NFS over RDMA caus= ed > >> a bad memory dereference in READ32, suggesting that 'p' had a bad > >> value. =C2=A0However it was ffff8801299188f0, which is not an "I'v= e fallen > >> off the end of the page" sort of value. =C2=A0So I think it must b= e a > >> different bug :-( =C2=A0It is as if the page is being unmapped und= erneath > >> us... > >> ]] > >> NeilBrown > >> > >> > >> > >> > >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > >> index e170317..34ccf81 100644 > >> --- a/fs/nfsd/nfs4xdr.c > >> +++ b/fs/nfsd/nfs4xdr.c > >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compoun= dargs *argp, u32 nbytes) > >> =C2=A0 =C2=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]); > >> =C2=A0 =C2=A0 =C2=A0 argp->pagelist++; > >> =C2=A0 =C2=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (arg= p->pagelen>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p = + (argp->pagelen>>2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D= 0; > >> =C2=A0 =C2=A0 =C2=A0 } else { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAG= E_SIZE>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p = + (PAGE_SIZE>>2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D= PAGE_SIZE; > >> =C2=A0 =C2=A0 =C2=A0 } > >> =C2=A0 =C2=A0 =C2=A0 memcpy(((char*)p)+avail, argp->p, (nbytes - a= vail)); > >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compoun= dargs *argp) > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->pagelist++; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (argp->pagelen>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (argp->pagelen>= >2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D 0; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } else { > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAGE_SIZE>>2); > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (PAGE_SIZE>>2); > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D PAGE_SIZE; > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } > >> > >> > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-nfs= " in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at =C2=A0http://vger.kernel.org/majordomo-info.= html > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 19:39 ` J. Bruce Fields @ 2010-04-21 22:35 ` J. Bruce Fields 2010-04-21 22:36 ` J. Bruce Fields 2010-04-22 15:41 ` William A. (Andy) Adamson 1 sibling, 1 reply; 8+ messages in thread From: J. Bruce Fields @ 2010-04-21 22:35 UTC (permalink / raw) To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 03:39:44PM -0400, J. Bruce Fields wrote: > On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson w= rote: > > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses= =2Eorg> wrote: > > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > > >> > > >> Surely this can never have worked... which implies that the code= has > > >> never been used? > > >> > > >> When read_buf is called to move over to the next page in the pag= elist > > >> of an NFSv4 request, it sets argp->end to essentially a random > > >> number, certainly not an address within the page which argp->p n= ow > > >> points to. =C2=A0So subsequent calls to READ_BUF will think ther= e is much > > >> more than a page of spare space (the cast to u32 ensures an unsi= gned > > >> comparison) so we can expect to fall off the end of the second > > >> page. > > > > > > Yipes, thanks. > > > > > >> I guess we never ever receive requests with any operation starti= ng > > >> beyond the first page! > > > > > > putfh-write-getattr, for example, is common enough. =C2=A0The wri= te decoding > > > should leave arg->end set correctly. =C2=A0But there are two read= _buf()'s in > > > decode_getattr(), and I can't see why we don't hit this bug on a = write > > > that leaves that final getattr exactly straddling a page boundary= =2E > >=20 > > The write data is dumped into the rq_vec which has non-contiguous > > pages. So the xdr_buf head only holds the putfh result, the short > > write response header (v4 stateid, offset, how, length, etc), and t= hen > > the getattr. so there is plenty of space. >=20 > This is the server-side write-decoding, so you could see: >=20 >=20 > rpc header | putfh | write ... data ... | getattr > ^ > | > page boundary here Hm, I guess even when argp->end is wrong, argp->p is always set to something sane; so on the next READ_BUF(), when you hit the nbytes <=3D (u32)((char *)argp->end - (char *)argp->p case, you do p =3D argp->p; argp->p +=3D XDR_QUADLEN(nbytes); and p is something reasonable. "end" stays wrong, but that won't be a problem until you run past the end of the *next* page, which it would take a very unusual compound to do. --b. >=20 > --b. >=20 > >=20 > > -->Andy > >=20 > > > > > > --b. > > > > > >> [[ > > >> I found this while looking at why fsstress over NFS over RDMA ca= used > > >> a bad memory dereference in READ32, suggesting that 'p' had a ba= d > > >> value. =C2=A0However it was ffff8801299188f0, which is not an "I= 've fallen > > >> off the end of the page" sort of value. =C2=A0So I think it must= be a > > >> different bug :-( =C2=A0It is as if the page is being unmapped u= nderneath > > >> us... > > >> ]] > > >> NeilBrown > > >> > > >> > > >> > > >> > > >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > > >> index e170317..34ccf81 100644 > > >> --- a/fs/nfsd/nfs4xdr.c > > >> +++ b/fs/nfsd/nfs4xdr.c > > >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compo= undargs *argp, u32 nbytes) > > >> =C2=A0 =C2=A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0])= ; > > >> =C2=A0 =C2=A0 =C2=A0 argp->pagelist++; > > >> =C2=A0 =C2=A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (a= rgp->pagelen>>2); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->= p + (argp->pagelen>>2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D= 0; > > >> =C2=A0 =C2=A0 =C2=A0 } else { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (P= AGE_SIZE>>2); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->= p + (PAGE_SIZE>>2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -= =3D PAGE_SIZE; > > >> =C2=A0 =C2=A0 =C2=A0 } > > >> =C2=A0 =C2=A0 =C2=A0 memcpy(((char*)p)+avail, argp->p, (nbytes -= avail)); > > >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compo= undargs *argp) > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->p =3D page_address(argp->pagelist[0]); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 argp->pagelist++; > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 if (argp->pagelen < PAGE_SIZE) { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (argp->pagelen>>2= ); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (argp->page= len>>2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen =3D 0; > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } else { > > >> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D p + (PAGE_SIZE>>2); > > >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->end =3D argp->p + (PAGE_SIZE>= >2); > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 argp->pagelen -=3D PAGE_SIZE; > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 } > > >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } > > >> > > >> > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-n= fs" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at =C2=A0http://vger.kernel.org/majordomo-inf= o.html > > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-21 22:35 ` J. Bruce Fields @ 2010-04-21 22:36 ` J. Bruce Fields 2010-04-21 23:08 ` Neil Brown 0 siblings, 1 reply; 8+ messages in thread From: J. Bruce Fields @ 2010-04-21 22:36 UTC (permalink / raw) To: William A. (Andy) Adamson; +Cc: Neil Brown, linux-nfs On Wed, Apr 21, 2010 at 06:35:27PM -0400, J. Bruce Fields wrote: > On Tue, Apr 20, 2010 at 03:39:44PM -0400, J. Bruce Fields wrote: > > On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson= wrote: > > > On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fields= es.org> wrote: > > > > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: > > > >> > > > >> Surely this can never have worked... which implies that the co= de has > > > >> never been used? > > > >> > > > >> When read_buf is called to move over to the next page in the p= agelist > > > >> of an NFSv4 request, it sets argp->end to essentially a random > > > >> number, certainly not an address within the page which argp->p= now > > > >> points to. =C2=A0So subsequent calls to READ_BUF will think th= ere is much > > > >> more than a page of spare space (the cast to u32 ensures an un= signed > > > >> comparison) so we can expect to fall off the end of the second > > > >> page. > > > > > > > > Yipes, thanks. > > > > > > > >> I guess we never ever receive requests with any operation star= ting > > > >> beyond the first page! > > > > > > > > putfh-write-getattr, for example, is common enough. =C2=A0The w= rite decoding > > > > should leave arg->end set correctly. =C2=A0But there are two re= ad_buf()'s in > > > > decode_getattr(), and I can't see why we don't hit this bug on = a write > > > > that leaves that final getattr exactly straddling a page bounda= ry. > > >=20 > > > The write data is dumped into the rq_vec which has non-contiguous > > > pages. So the xdr_buf head only holds the putfh result, the short > > > write response header (v4 stateid, offset, how, length, etc), and= then > > > the getattr. so there is plenty of space. > >=20 > > This is the server-side write-decoding, so you could see: > >=20 > >=20 > > rpc header | putfh | write ... data ... | getattr > > ^ > > | > > page boundary here >=20 > Hm, I guess even when argp->end is wrong, argp->p is always set to > something sane; so on the next READ_BUF(), when you hit the >=20 > nbytes <=3D (u32)((char *)argp->end - (char *)argp->p >=20 > case, you do >=20 > p =3D argp->p; > argp->p +=3D XDR_QUADLEN(nbytes); >=20 > and p is something reasonable. "end" stays wrong, but that won't be = a > problem until you run past the end of the *next* page, which it would > take a very unusual compound to do. (Nevertheless: applied, for 2.6.34 and stable.) --b. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-21 22:36 ` J. Bruce Fields @ 2010-04-21 23:08 ` Neil Brown 0 siblings, 0 replies; 8+ messages in thread From: Neil Brown @ 2010-04-21 23:08 UTC (permalink / raw) To: J. Bruce Fields; +Cc: William A. (Andy) Adamson, linux-nfs On Wed, 21 Apr 2010 18:36:05 -0400 "J. Bruce Fields" <bfields@fieldses.org> wrote: > > Hm, I guess even when argp->end is wrong, argp->p is always set to > > something sane; so on the next READ_BUF(), when you hit the > > > > nbytes <= (u32)((char *)argp->end - (char *)argp->p > > > > case, you do > > > > p = argp->p; > > argp->p += XDR_QUADLEN(nbytes); > > > > and p is something reasonable. "end" stays wrong, but that won't be a > > problem until you run past the end of the *next* page, which it would > > take a very unusual compound to do. Yes, it would not be an easy bug to trigger ... it takes away some of the thrill of finding a bug when you discover that it only affects a corner case that never ever happens :-( > > (Nevertheless: applied, for 2.6.34 and stable.) Thanks. NeilBrown ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] bug in read_buf 2010-04-20 19:39 ` J. Bruce Fields 2010-04-21 22:35 ` J. Bruce Fields @ 2010-04-22 15:41 ` William A. (Andy) Adamson 1 sibling, 0 replies; 8+ messages in thread From: William A. (Andy) Adamson @ 2010-04-22 15:41 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Neil Brown, linux-nfs On Tue, Apr 20, 2010 at 3:39 PM, J. Bruce Fields <bfields@fieldses.org>= wrote: > On Tue, Apr 20, 2010 at 03:24:59PM -0400, William A. (Andy) Adamson w= rote: >> On Tue, Apr 20, 2010 at 12:51 PM, J. Bruce Fields <bfields@fieldses.= org> wrote: >> > On Tue, Apr 20, 2010 at 12:16:52PM +1000, Neil Brown wrote: >> >> >> >> Surely this can never have worked... which implies that the code = has >> >> never been used? >> >> >> >> When read_buf is called to move over to the next page in the page= list >> >> of an NFSv4 request, it sets argp->end to essentially a random >> >> number, certainly not an address within the page which argp->p no= w >> >> points to. =A0So subsequent calls to READ_BUF will think there is= much >> >> more than a page of spare space (the cast to u32 ensures an unsig= ned >> >> comparison) so we can expect to fall off the end of the second >> >> page. >> > >> > Yipes, thanks. >> > >> >> I guess we never ever receive requests with any operation startin= g >> >> beyond the first page! >> > >> > putfh-write-getattr, for example, is common enough. =A0The write d= ecoding >> > should leave arg->end set correctly. =A0But there are two read_buf= ()'s in >> > decode_getattr(), and I can't see why we don't hit this bug on a w= rite >> > that leaves that final getattr exactly straddling a page boundary. >> >> The write data is dumped into the rq_vec which has non-contiguous >> pages. So the xdr_buf head only holds the putfh result, the short >> write response header (v4 stateid, offset, how, length, etc), and th= en >> the getattr. so there is plenty of space. > > This is the server-side write-decoding, so you could see: > > > =A0 =A0 =A0 =A0rpc header | putfh | write ... data ... | getattr > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ^ > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 | > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0page boundary here ulp - you're right. -->Andy > > --b. > >> >> -->Andy >> >> > >> > --b. >> > >> >> [[ >> >> I found this while looking at why fsstress over NFS over RDMA cau= sed >> >> a bad memory dereference in READ32, suggesting that 'p' had a bad >> >> value. =A0However it was ffff8801299188f0, which is not an "I've = fallen >> >> off the end of the page" sort of value. =A0So I think it must be = a >> >> different bug :-( =A0It is as if the page is being unmapped under= neath >> >> us... >> >> ]] >> >> NeilBrown >> >> >> >> >> >> >> >> >> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> >> index e170317..34ccf81 100644 >> >> --- a/fs/nfsd/nfs4xdr.c >> >> +++ b/fs/nfsd/nfs4xdr.c >> >> @@ -161,10 +161,10 @@ static __be32 *read_buf(struct nfsd4_compou= ndargs *argp, u32 nbytes) >> >> =A0 =A0 =A0 argp->p =3D page_address(argp->pagelist[0]); >> >> =A0 =A0 =A0 argp->pagelist++; >> >> =A0 =A0 =A0 if (argp->pagelen < PAGE_SIZE) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (argp->pagelen>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (argp->pagelen>= >2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen =3D 0; >> >> =A0 =A0 =A0 } else { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D p + (PAGE_SIZE>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 argp->end =3D argp->p + (PAGE_SIZE>>2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelen -=3D PAGE_SIZE; >> >> =A0 =A0 =A0 } >> >> =A0 =A0 =A0 memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); >> >> @@ -1426,10 +1426,10 @@ nfsd4_decode_compound(struct nfsd4_compou= ndargs *argp) >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->p =3D page_addr= ess(argp->pagelist[0]); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->pagelist++; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (argp->pagelen < P= AGE_SIZE) { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D p + (argp->pagelen>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D argp->p + (argp->pagelen>>2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp-= >pagelen =3D 0; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } else { >> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D p + (PAGE_SIZE>>2); >> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp->e= nd =3D argp->p + (PAGE_SIZE>>2); >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 argp-= >pagelen -=3D PAGE_SIZE; >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 } >> >> >> >> >> > -- >> > To unsubscribe from this list: send the line "unsubscribe linux-nf= s" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at =A0http://vger.kernel.org/majordomo-info.ht= ml >> > > ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-04-22 15:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-20 2:16 [PATCH] bug in read_buf Neil Brown
2010-04-20 16:51 ` J. Bruce Fields
2010-04-20 19:24 ` William A. (Andy) Adamson
[not found] ` <g2k89c397151004201224wb35ae389g961523bbef23f452-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-04-20 19:39 ` J. Bruce Fields
2010-04-21 22:35 ` J. Bruce Fields
2010-04-21 22:36 ` J. Bruce Fields
2010-04-21 23:08 ` Neil Brown
2010-04-22 15:41 ` William A. (Andy) Adamson
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.