From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A068C43334 for ; Thu, 14 Jul 2022 19:16:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238985AbiGNTQY (ORCPT ); Thu, 14 Jul 2022 15:16:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53242 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229504AbiGNTQW (ORCPT ); Thu, 14 Jul 2022 15:16:22 -0400 Received: from kylie.crudebyte.com (kylie.crudebyte.com [5.189.157.229]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7212D2DA88 for ; Thu, 14 Jul 2022 12:16:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=crudebyte.com; s=kylie; h=Content-Type:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Content-ID:Content-Description; bh=Tkxkd+pdE19XC0lUU8oQ8Peb4YvHVWYiASesMiTlbkA=; b=aBWehos5u3QQ6zPND9TqxbbuSv 21buZRhuZOirrfTY6IvbvL/WPWQzQwX8B4cD3bcuZhnPZVj8OIXGML0Fh4o5eyRw23XdAS0yQKOed SA/i014SKaKlo2kikH2XDdvpV9mxs0SwZU0HTA/Q/wmovTxmq9rSHuW7KJHD8gv3KhA2+QdOO7XFd EMqvd7iQGAWLqyXhhAr+4SO3SnCYMFg2HXL8vTAbAUrueqJFWkLnOnLCDud3BMRh80QZmF3iNM5VC GKLdHoULMZbL3W+GQwUUgiOgZ9GhwpDOFl74fOaGsbBG2H+k82Qi59NeAfmiJ7vNyAjEvoNuhWeyf eX5LIk7xzJgjWocySGfbSJQbMZER3NU0twvnzrMwspnCrX5Umydu+48kMT0y7SuDjXkhLbGH5jDbW YM2ObQWjVzvoZ5jBt1hNLOlhXZSWROftZo0A0QzWeY7uyUxE1Sf6upSn8juF6iJj7REUrhMccxwfN WVw2BJbTJne3buHSPGA5eyphwzSZyEEiv3m7yVZHw5Te8beP8XuLKfJv4H/S1uUjhLb5CCupDpvrS ZWNAwx03vnjuf5m8BfpfdlOL+FQTz2Cz4yT6007IPIUAlmcyfR6U2bwB5Y/K3MyIQUCruzqCs3rh5 X8mYCxX4OHKxUb/aOCrdCBcwQPhqOfYCkA/fyYvmo=; From: Christian Schoenebeck To: Kent Overstreet , Dominique Martinet Cc: linux-kernel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, Greg Kurz , Eric Van Hensbergen , Latchesar Ionkov Subject: Re: [RFC PATCH] 9p: forbid use of mempool for TFLUSH Date: Thu, 14 Jul 2022 21:16:14 +0200 Message-ID: <2229731.hRsvSkCM7u@silver> In-Reply-To: <20220713041700.2502404-1-asmadeus@codewreck.org> References: <12950409.o0bIpVV1Ut@silver> <20220713041700.2502404-1-asmadeus@codewreck.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mittwoch, 13. Juli 2022 06:17:01 CEST Dominique Martinet wrote: > TFLUSH is called while the thread still holds memory for the > request we're trying to flush, so mempool alloc can deadlock > there. With p9_msg_buf_size() rework the flush allocation is > small so just make it fail if allocation failed; all that does > is potentially leak the request we're flushing until its reply > finally does come.. or if it never does until umount. > > Signed-off-by: Dominique Martinet > --- Patch looks fine on first impression, but I'll postpone testing this. And yes, I also think that exempting Tflush should be fair. If 4k (soon) cannot be allocated, then you probably have worse problems than that. > Here's a concrete version of what I had in mind: literally just make > allocation fail if the initial alloc failed. > > I can't reproduce any bad hang with a sane server here, but we still > risk hanging with a bad server that ignores flushes as these are still > unkillable (someday I'll finish my async requests work...) > > So ultimately there are two things I'm not so happy about with mempools: > - this real possibility of client hangs if a server mishandles some > replies -- this might make fuzzing difficult in particular, I think it's Concrete example of such a mishap? > easier to deal with failed IO (as long as it fails all the way back to > userspace) than to hang forever. > I'm sure there are others who prefer to wait instead, but I think this > should at least have a timeout or something. Not sure if it was easy to pick an appropriate timeout value. I've seen things slowing down extremely with 9p after a while. But to be fair, these were on production machines with ancient kernel versions, so maybe already fixed. A proc interface would be useful though to be able to identify things like piling up too many fids and other performance related numbers. > - One of the reasons I wanted to drop the old request cache before is > that these caches are per mount/connection. If you have a dozen of > mounts that each cache 4 requests worth as here, with msize=1MB and two > buffers per request we're locking down 8 * 12 = 96 MB of ram just for > mounting. > That being said, as long as hanging is a real risk I'm not comfortable > sharing the mempools between all the clients either, so I'm not sure > what to suggest. Why would a shared mempool increase the chance of a hang or worsen its outcome? > Anyway, we're getting close to the next merge request and I don't have > time to look deeper into this; I'll be putting the mempool patches on > hold for next cycle at least and we can discuss again if Kent still > encounters allocation troubles with Christian's smaller buffers > optimization first. > These will very likely get in this cycle unless something bad happens, > I've finished retesting a bit without trouble here. > > > net/9p/client.c | 23 +++++++++++++++++++---- > 1 file changed, 19 insertions(+), 4 deletions(-) > > diff --git a/net/9p/client.c b/net/9p/client.c > index 928c9f88f204..f9c17fb79f35 100644 > --- a/net/9p/client.c > +++ b/net/9p/client.c > @@ -218,7 +218,7 @@ static int parse_opts(char *opts, struct p9_client > *clnt) return ret; > } > > -static void p9_fcall_init(struct p9_client *c, struct p9_fcall *fc, > +static int p9_fcall_init(struct p9_client *c, struct p9_fcall *fc, > int fc_idx, unsigned alloc_msize) > { > gfp_t gfp = GFP_NOFS|__GFP_NOWARN; > @@ -232,11 +232,13 @@ static void p9_fcall_init(struct p9_client *c, struct > p9_fcall *fc, if (alloc_msize < c->msize) > fc->sdata = kmalloc(alloc_msize, gfp); > > - if (!fc->sdata) { > + if (!fc->sdata && fc_idx >= 0) { > fc->sdata = mempool_alloc(&c->pools[fc_idx], gfp); > fc->used_mempool = true; > fc->capacity = c->msize; > } > + > + return fc->sdata == NULL; > } > > void p9_fcall_fini(struct p9_client *c, struct p9_fcall *fc, > @@ -280,6 +282,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, uint > t_size, uint r_size, struct p9_req_t *req = kmem_cache_alloc(p9_req_cache, > GFP_NOFS); int alloc_tsize; > int alloc_rsize; > + int fc_idx = 0; > int tag; > va_list apc; > > @@ -294,8 +297,19 @@ p9_tag_alloc(struct p9_client *c, int8_t type, uint > t_size, uint r_size, if (!req) > return ERR_PTR(-ENOMEM); > > - p9_fcall_init(c, &req->tc, 0, alloc_tsize); > - p9_fcall_init(c, &req->rc, 1, alloc_rsize); > + /* We cannot use the mempool for TFLUSH because flushes can be > + * allocated when the thread still holds memory for the request > + * we're flushing. A negative fc_idx will make p9_fcall_init > + * error out. > + */ > + if (type == P9_TFLUSH) { > + fc_idx = -2; > + } > + > + if (p9_fcall_init(c, &req->tc, fc_idx, alloc_tsize)) > + goto free_req; > + if (p9_fcall_init(c, &req->rc, fc_idx + 1, alloc_rsize)) > + goto free; > > p9pdu_reset(&req->tc); > p9pdu_reset(&req->rc); > @@ -334,6 +348,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, uint > t_size, uint r_size, free: > p9_fcall_fini(c, &req->tc, 0); > p9_fcall_fini(c, &req->rc, 1); > +free_req: > kmem_cache_free(p9_req_cache, req); > return ERR_PTR(-ENOMEM); > }