From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linuxfoundation.org ([140.211.169.12]:44778 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934672AbdEWUMN (ORCPT ); Tue, 23 May 2017 16:12:13 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Bart Van Assche , Hannes Reinecke , Christoph Hellwig , Mike Snitzer Subject: [PATCH 4.11 029/197] dm mpath: avoid that path removal can trigger an infinite loop Date: Tue, 23 May 2017 22:06:30 +0200 Message-Id: <20170523200824.185443353@linuxfoundation.org> In-Reply-To: <20170523200821.666872592@linuxfoundation.org> References: <20170523200821.666872592@linuxfoundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: stable-owner@vger.kernel.org List-ID: 4.11-stable review patch. If anyone has any objections, please let me know. ------------------ From: Bart Van Assche commit 7083abbbfc4fa706ff72d27d33a5214881979336 upstream. If blk_get_request() fails, check whether the failure is due to a path being removed. If that is the case, fail the path by triggering a call to fail_path(). This avoids that the following scenario can be encountered while removing paths: * CPU usage of a kworker thread jumps to 100%. * Removing the DM device becomes impossible. Delay requeueing if blk_get_request() returns -EBUSY or -EWOULDBLOCK, and the queue is not dying, because in these cases immediate requeuing is inappropriate. Signed-off-by: Bart Van Assche Cc: Hannes Reinecke Cc: Christoph Hellwig Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman --- drivers/md/dm-mpath.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -489,6 +489,7 @@ static int multipath_clone_and_map(struc struct pgpath *pgpath; struct block_device *bdev; struct dm_mpath_io *mpio = get_mpio(map_context); + struct request_queue *q; struct request *clone; /* Do we need to select a new pgpath? */ @@ -511,12 +512,18 @@ static int multipath_clone_and_map(struc mpio->nr_bytes = nr_bytes; bdev = pgpath->path.dev->bdev; - - clone = blk_get_request(bdev_get_queue(bdev), - rq->cmd_flags | REQ_NOMERGE, - GFP_ATOMIC); + q = bdev_get_queue(bdev); + clone = blk_get_request(q, rq->cmd_flags | REQ_NOMERGE, GFP_ATOMIC); if (IS_ERR(clone)) { /* EBUSY, ENODEV or EWOULDBLOCK: requeue */ + bool queue_dying = blk_queue_dying(q); + DMERR_LIMIT("blk_get_request() returned %ld%s - requeuing", + PTR_ERR(clone), queue_dying ? " (path offline)" : ""); + if (queue_dying) { + atomic_inc(&m->pg_init_in_progress); + activate_or_offline_path(pgpath); + return DM_MAPIO_REQUEUE; + } return DM_MAPIO_DELAY_REQUEUE; } clone->bio = clone->biotail = NULL;