Date: Fri, 19 Jul 2019 08:26:54 -0400
From: Mike Snitzer
To: Ming Lei
Cc: Jens Axboe, linux-block@vger.kernel.org, "James E. J. Bottomley",
	"Martin K. Petersen", linux-scsi@vger.kernel.org, "Ewan D. Milne",
	Bart Van Assche, Hannes Reinecke, Christoph Hellwig,
	dm-devel@redhat.com, stable@vger.kernel.org
Subject: Re: [PATCH 1/2] blk-mq: add callback of .cleanup_rq
Message-ID: <20190719122654.GA7339@redhat.com>
In-Reply-To: <20190719013546.GA12004@ming.t460p>
References: <20190718032519.28306-1-ming.lei@redhat.com>
 <20190718032519.28306-2-ming.lei@redhat.com>
 <20190718145201.GA2305@redhat.com>
 <20190719013546.GA12004@ming.t460p>

On Thu, Jul 18 2019 at 9:35pm -0400,
Ming Lei wrote:

> On Thu, Jul 18, 2019 at 10:52:01AM -0400, Mike Snitzer wrote:
> > On Wed, Jul 17 2019 at 11:25pm -0400,
> > Ming Lei wrote:
> >
> > > dm-rq needs to free a request which has been dispatched but not completed
> > > by the underlying queue. However, the underlying queue may have allocated
> > > private data for this request in .queue_rq(), so dm-rq will leak the
> > > request's private part.
> >
> > No, SCSI (and blk-mq) will leak.  DM doesn't know anything about the
> > internal memory SCSI uses.  That memory is a SCSI implementation detail.
>
> It has nothing to do with dm-rq, which frees the request after BLK_STS_*RESOURCE
> is returned from blk_insert_cloned_request(); in this case it has to be
> the user who releases the request's private data.
>
> > Please fix the header to properly reflect which layer is doing the leaking.
>
> Fine.
>
> > > Add one new callback, .cleanup_rq(), to fix the memory leak issue.
> > >
> > > Another use case is to free the request when the hctx is dead during
> > > cpu hotplug.
> > >
> > > Cc: Ewan D. Milne
> > > Cc: Bart Van Assche
> > > Cc: Hannes Reinecke
> > > Cc: Christoph Hellwig
> > > Cc: Mike Snitzer
> > > Cc: dm-devel@redhat.com
> > > Cc:
> > > Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
> > > Signed-off-by: Ming Lei
> > > ---
> > >  drivers/md/dm-rq.c     |  1 +
> > >  include/linux/blk-mq.h | 13 +++++++++++++
> > >  2 files changed, 14 insertions(+)
> > >
> > > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> > > index c9e44ac1f9a6..21d5c1784d0c 100644
> > > --- a/drivers/md/dm-rq.c
> > > +++ b/drivers/md/dm-rq.c
> > > @@ -408,6 +408,7 @@ static int map_request(struct dm_rq_target_io *tio)
> > >  		ret = dm_dispatch_clone_request(clone, rq);
> > >  		if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) {
> > >  			blk_rq_unprep_clone(clone);
> > > +			blk_mq_cleanup_rq(clone);
> > >  			tio->ti->type->release_clone_rq(clone, &tio->info);
> > >  			tio->clone = NULL;
> > >  			return DM_MAPIO_REQUEUE;
> >
> > Requiring the upper layer driver (dm-rq) to explicitly call blk_mq_cleanup_rq()
> > seems wrong.  In this instance tio->ti->type->release_clone_rq()
> > (dm-mpath's multipath_release_clone) calls blk_put_request().  Why can't
> > blk_put_request(), or blk_mq_free_request(), call blk_mq_cleanup_rq()?
>
> I did think about doing it in blk_put_request(), but I wanted to avoid
> the small cost in the generic fast path, given that freeing a request after
> dispatch is very unusual; so far only nvme multipath and dm-rq do it that way.
>
> However, if no one objects to moving blk_mq_cleanup_rq() into blk_put_request()
> or blk_mq_free_request(), I am fine with doing that in V2.

I think it'd be a less fragile/nuanced way to extend the blk-mq interface.
Otherwise there is potential for other future drivers to hit the same leaks.

> > Not looked at the cpu hotplug case you mention, but my naive thought is
> > it'd be pretty weird to also sprinkle a call to blk_mq_cleanup_rq() from
> > that specific "dead hctx" code path.
>
> It isn't weird, and it is exactly what NVMe multipath is doing, please see
> nvme_failover_req().  It is just that nvme doesn't allocate request
> private data.
>
> Wrt. blk-mq cpu hotplug handling: after one hctx is dead, we can't dispatch
> requests to that hctx any more; however, a request is bound to its hctx at
> allocation time and that association can't (or can only with great difficulty)
> be changed.  Do you have any better idea for dealing with this issue?

No, as I prefaced before: "Not looked at the cpu hotplug case you mention".
As such I should've stayed silent ;)

But my point was we should hook off current interfaces rather than rely on a
new primary function call.

Mike
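
For readers following the thread, a minimal sketch of the interface being
discussed.  The quoted patch shows only the drivers/md/dm-rq.c hunk, so the
include/linux/blk-mq.h side (the .cleanup_rq callback and the
blk_mq_cleanup_rq() helper) and the alternative placement in
blk_mq_free_request() below are assumptions reconstructed from the discussion,
not the actual patch text.

/*
 * Sketch only -- the blk-mq.h hunk is not quoted above, so these
 * definitions are assumed, not copied from the patch.
 */

/* Assumed new optional callback added to struct blk_mq_ops:
 *
 *	void (*cleanup_rq)(struct request *rq);
 *
 * Called for a request that was handed to the driver via .queue_rq()
 * but is being freed without completing, so the driver (e.g. SCSI)
 * can release any private data it attached to the request.
 */

/* Assumed helper that the dm-rq hunk above calls: */
static inline void blk_mq_cleanup_rq(struct request *rq)
{
	if (rq->q->mq_ops->cleanup_rq)
		rq->q->mq_ops->cleanup_rq(rq);
}

/*
 * Mike's suggestion, sketched: invoke the hook from the generic free
 * path so callers such as dm-rq need not remember to call it.  Ming's
 * concern is the (small) extra cost added to the common fast path.
 */
void blk_mq_free_request(struct request *rq)
{
	blk_mq_cleanup_rq(rq);		/* assumed placement */
	/* ... existing request-freeing logic ... */
}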