From mboxrd@z Thu Jan 1 00:00:00 1970 From: "J. Bruce Fields" Subject: Re: nfsd changes for 3.5 Date: Thu, 31 May 2012 16:53:00 -0400 Message-ID: <20120531205300.GG25955@fieldses.org> References: <20120531182457.GB25955@fieldses.org> <20120531200138.GD25955@fieldses.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Al Viro To: Linus Torvalds Return-path: Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Thu, May 31, 2012 at 01:17:26PM -0700, Linus Torvalds wrote: > On Thu, May 31, 2012 at 1:01 PM, J. Bruce Fields wrote: > > > > Right. =C2=A0By default it's 90 seconds before we'll give up on the= client. >=20 > So a slightly buggy client can basically DoS the server by getting a > delegation and then crashing or something. Everybody else that tries > to read that directory (not that file) will be dead in the water. > Definitely not good. >=20 > > I hate that too, and originally tried to avoid it with something li= ke: > > > > =C2=A0 =C2=A0 =C2=A0 =C2=A0retry: > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0acquire lock= s > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0lookup inode > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0ret =3D try_= to_break_deleg(inode); > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (ret) > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0drop locks > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0really_break_deleg(inode); > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0goto retry; > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0... do the r= eal work ... > > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0drop locks > > > > I felt like I was making already complicated code logic like rename= 's > > even harder to follow. >=20 > I do think it's the only thing we can reasonably do. OK, I can give that another try. Al, does that sound like the more sensible choice to you? Uh, that means ditching some work in my public git tree. Which I haven't rebased in years. So, a stupid process question; would you rather I: - continue to be strict about rebasing and apply a bunch of reverts? - ditch it and start over? #1 looks like a mess to me, so I guess #2's my default. Probably nobod= y will notice but me. > I'd love to have > some kind of per-dentry lock for unlink/rename, but we don't. > Long-term, we really do need to do something about the directory > locking, though, because it's also a huge problem for readdir() > concurrency. Or at least it used to be (samba in particular). Making > it an rwsem might help readdir a tiny amount, but I suspect people > actually depend on the mutex in readdir right now. Al called this all "highly non-trivial": http://marc.info/?l=3Dlinux-fsdevel&m=3D132726495726326&w=3D2 I don't know who'd have the cycles. --b. >=20 > > And those operations don't really know the inode till they acquire = the > > locks, so in pathological cases that could continue forever. >=20 > I suspect at some point you just have to say "screw it".