From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S939624AbdAIS31 (ORCPT );
	Mon, 9 Jan 2017 13:29:27 -0500
Received: from 2.mo69.mail-out.ovh.net ([178.33.251.80]:57222 "EHLO
	2.mo69.mail-out.ovh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S939559AbdAIS3W (ORCPT );
	Mon, 9 Jan 2017 13:29:22 -0500
Date: Mon, 9 Jan 2017 19:29:15 +0100
From: Greg Kurz
To: Al Viro
Cc: Tuomas Tynkkynen , linux-fsdevel@vger.kernel.org,
	v9fs-developer@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [V9fs-developer] 9pfs hangs since 4.7
Message-ID: <20170109192915.3227d2ee@bahia.lan>
In-Reply-To: <20170107171910.GJ1555@ZenIV.linux.org.uk>
References: <20161124215023.02deb03c@duuni>
	<20170102102035.7d1cf903@duuni>
	<20170102162309.GZ1555@ZenIV.linux.org.uk>
	<20170104013355.4a8923b6@duuni>
	<20170104014753.GE1555@ZenIV.linux.org.uk>
	<20170104220447.74f2265d@duuni>
	<20170104230101.GG1555@ZenIV.linux.org.uk>
	<20170106145235.51630baf@bahia.lan>
	<20170107062647.GB12074@ZenIV.linux.org.uk>
	<20170107161045.742893b1@bahia.lan>
	<20170107171910.GJ1555@ZenIV.linux.org.uk>
X-Mailer: Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-redhat-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Ovh-Tracer-Id: 15456353921809684878
X-VR-SPAMSTATE: OK
X-VR-SPAMSCORE: -100
X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrfeelgedrvdeggdduudehucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddm
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 7 Jan 2017 17:19:10 +0000
Al Viro wrote:

> On Sat, Jan 07, 2017 at 04:10:45PM +0100, Greg Kurz wrote:
>
> > > virtqueue_push(), but pdu freeing is delayed until v9fs_flush() gets woken
> > > up.
> > > In the meanwhile, another request arrives into the slot freed by
> > > that virtqueue_push() and we are out of pdus.
> >
> > Indeed. Even if this doesn't seem to be the problem here, I guess this should
> > be fixed.
>
> FWIW, there's something that looks like an off-by-one in
> v9fs_device_realize_common():
>
>     /* initialize pdu allocator */
>     QLIST_INIT(&s->free_list);
>     QLIST_INIT(&s->active_list);
>     for (i = 0; i < (MAX_REQ - 1); i++) {
>         QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
>         s->pdus[i].s = s;
>         s->pdus[i].idx = i;
>     }
>
> Had been there since the original merge of 9p support into qemu - that code
> had moved around a bit, but it had never inserted s->pdus[MAX_REQ - 1] into
> free list. So your scenario with failing pdu_alloc() is still possible.

Indeed, this (MAX_REQ - 1) thing looks wrong. Thanks for pointing that out.

> In that log the total amount of pending requests has reached 128 for the
> first time right when the requests had stopped being handled and even
> though it had dropped below that shortly after, extra requests being put
> into queue had not been processed at all...
>
> I'm not familiar with qemu guts enough to tell if that's a plausible scenario,
> though... shouldn't subsequent queue insertions (after enough slots had been
> released) simply trigger virtio_queue_notify_vq() again? It *is* a bug
> (if we get a burst filling a previously empty queue all at once, there won't
> be any slots becoming freed), but that's obviously not the case here -
> slots were getting freed, after all.