From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 7 Jul 2016 15:34:34 +0200
From: Greg Kurz
Message-ID: <20160707153434.1117e8a3@bahia.lan>
In-Reply-To: <20160707123540.GA15192@u-isr-cdi-08>
References: <146659832556.15781.17414806975641516683.stgit@bahia.lan>
 <20160704141655.GA5799@u-isr-cdi-08>
 <20160704170849.1654d6a0@bahia.lan>
 <20160707123540.GA15192@u-isr-cdi-08>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 0/3] fs/9p: fix setattr/getattr issues with open files
To: Dominique Martinet
Cc: Eric Van Hensbergen, Latchesar Ionkov, linux-kernel@vger.kernel.org,
 qemu-devel@nongnu.org, v9fs-developer@lists.sourceforge.net, Ron Minnich,
 "David S. Miller"

On Thu, 7 Jul 2016 14:35:40 +0200
Dominique Martinet wrote:

> Hi Greg,
>

Hi Dominique,

> Greg Kurz wrote on Mon, Jul 04, 2016 at 05:08:49PM +0200:
> > On Mon, 4 Jul 2016 16:16:55 +0200
> > Dominique Martinet wrote:
> >
> > > I *think* this introduces a race somewhere, I'm getting errors like:
> > > cat: f.05: No such file or directory
> > > cat: f.14: No such file or directory
> > > cat: f.13: No such file or directory
> > > cat: f.39: No such file or directory
> > > cat: f.05: No such file or directory
> > >
> > >
> > > when doing:
> > > for file in {01..50}; do touch f.${file}; done
> > > seq 1 1000 | xargs -n 1 -P 25 -I{} cat f.* > /dev/null
>
> Ok so, tested with the first two patches and I can't seem to hit any
> problem with the qemu server at least (I'd need more time to fix
> ganesha's 9p tcp/rdma server before I could blame the client in any way)
>

I'm not surprised: patch 1 simply adds a "fallback" lookup to the existing
code, and patch 2 changes this "fallback" lookup only. Bad things can come
with patch 3 because it really changes the lookup logic.

>
> The last patch looks good to me, I think it only makes an existing race
> more visible... What I think could happen is:
> process 1 has file open
> process 2 tries to open file, sees fid open
> process 1 closes file/clunk fids
> process 2 tries to clone now-clunked fid and gets ENOENT
>

I'll try to have a look with this scenario in mind.

>
> I'm afraid I just found out my hypervisor is no longer recent enough for
> gdb kernel scripts (gdb 7.6 and python 2.7.5 in el7 compared to the
> apparently required 7.7 and 2.7.6 respectively...), and I don't see
> anything obvious with just debug messages/adding a few printks (wasn't
> able to confirm where exactly that ENOENT comes from or if my theory is
> even close to the truth)
>
> I'd like to spend more time on it but don't think I'll be able to for a
> couple of weeks ; sorry about that.
>

No problem.
My plate is full anyway until I go on a one-month vacation, starting at the
end of July. And I'm currently targeting QEMU 2.8 for the server-side fixes:
we have plenty of time to fix this.

>
> Were you able to reproduce the problem?
>

Yes! I get it every time :)

> Thanks,

I really appreciate your assistance since v9fs-devel is really quiet these
days.

Cheers.

--
Greg
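
Illustration of the scenario quoted above: the interleaving Dominique suspects is a check-then-use race on a fid. Below is a minimal, deterministic sketch of it, not the v9fs client or the QEMU server code. `FidTable`, `racy_open`, and `simulate` are hypothetical names made up for this example, and the ENOENT on cloning a clunked fid follows the hypothesis in the thread rather than a confirmed trace.

```python
import errno


class FidTable:
    """Toy stand-in for a 9p server's fid table (hypothetical, not QEMU code)."""

    def __init__(self):
        self._fids = {}   # fid number -> path
        self._next = 0

    def attach(self, path):
        fid = self._next
        self._next += 1
        self._fids[fid] = path
        return fid

    def clone(self, fid):
        # Assumption from the thread: cloning a clunked fid fails with ENOENT.
        if fid not in self._fids:
            raise OSError(errno.ENOENT, "No such file or directory")
        return self.attach(self._fids[fid])

    def clunk(self, fid):
        del self._fids[fid]


def racy_open(table, open_fids, path, between=lambda: None):
    """Check-then-use: look up an already-open fid, then clone it."""
    fid = open_fids.get(path)      # process 2 sees the fid open
    if fid is None:
        return table.attach(path)
    between()                      # window in which process 1 may clunk it
    return table.clone(fid)        # clone of a possibly clunked fid


def simulate(clunk_in_window):
    table = FidTable()
    open_fids = {"f.05": table.attach("f.05")}   # process 1 has the file open

    def process1_close():
        table.clunk(open_fids.pop("f.05"))

    hook = process1_close if clunk_in_window else (lambda: None)
    try:
        racy_open(table, open_fids, "f.05", between=hook)
        return "ok"
    except OSError as e:
        return errno.errorcode[e.errno]


print(simulate(clunk_in_window=False))   # no interleaving
print(simulate(clunk_in_window=True))    # process 1 clunks inside the window
```

The `between` hook only forces the interleaving so that the sketch is reproducible; in the real setup the window would be hit by chance under the parallel `cat` load from the reproducer.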