From: Benjamin Marzinski <bmarzins@redhat.com>
To: John Garry <john.g.garry@oracle.com>
Cc: hch@lst.de, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com,
	martin.petersen@oracle.com,
	james.bottomley@hansenpartnership.com, hare@suse.com,
	jmeneghi@redhat.com, linux-nvme@lists.infradead.org,
	linux-scsi@vger.kernel.org, michael.christie@oracle.com,
	snitzer@kernel.org, dm-devel@lists.linux.dev,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 07/24] scsi-multipath: clone each bio
Date: Mon, 2 Mar 2026 11:27:06 -0500
Message-ID: <aaW6Wp9AJV0emVs_@redhat.com>
In-Reply-To: <bfe3a30f-50c1-4ede-a424-f342b80bfdcf@oracle.com>

On Mon, Mar 02, 2026 at 12:12:54PM +0000, John Garry wrote:
> On 02/03/2026 03:21, Benjamin Marzinski wrote:
> > On Wed, Feb 25, 2026 at 03:36:10PM +0000, John Garry wrote:
> > > For failover handling, we must resubmit each bio.
> > >
> > > However, unlike NVMe, for SCSI there is no guarantee that any bio submitted
> > > is either all or none completed.
> > >
> > > As such, for SCSI, for failover handling we will take the approach to
> > > just re-submit the original bio. For this clone and submit each bio.
> > >
> > > Signed-off-by: John Garry <john.g.garry@oracle.com>
> > > ---
> > > drivers/scsi/scsi_multipath.c | 51 ++++++++++++++++++++++++++++++++++-
> > > include/scsi/scsi_multipath.h | 1 +
> > > 2 files changed, 51 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/scsi/scsi_multipath.c b/drivers/scsi/scsi_multipath.c
> > > index 4b7984e7e74ba..d79a92ec0cf6c 100644
> > > --- a/drivers/scsi/scsi_multipath.c
> > > +++ b/drivers/scsi/scsi_multipath.c
> > > @@ -89,6 +89,14 @@ module_param_call(iopolicy, scsi_set_iopolicy, scsi_get_iopolicy,
> > > MODULE_PARM_DESC(iopolicy,
> > > "Default multipath I/O policy; 'numa' (default), 'round-robin' or 'queue-depth'");
> > > +struct scsi_mpath_clone_bio {
> > > +	struct bio *master_bio;
> > > +	struct bio clone;
> > > +};
> >
> > If the only extra information you need for your clone bios is a pointer
> > to the original bio, I think you can just store that in bi_private. So
> > you shouldn't actually need to allocate any front pad for your bioset.
>
> Yes, seems a decent idea
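
To make the bi_private suggestion concrete, here is a rough userspace
model of the completion path. It is only a sketch: the struct below
mimics a few fields of the kernel's struct bio, and every model_* name
and the "completed" counter are made up for illustration. The point is
that the end_io handler can find the original bio through bi_private,
so no wrapper struct, container_of() or bioset front pad is needed.

```c
/*
 * Userspace model only: this struct mimics a few fields of the kernel's
 * struct bio. It is not the kernel definition.
 */
#include <assert.h>
#include <stddef.h>

struct bio {
	int bi_status;				/* 0 means success */
	void *bi_private;			/* owner-defined pointer */
	void (*bi_end_io)(struct bio *);	/* completion callback */
	int completed;				/* model-only completion counter */
};

/* Model of bio_endio(): count the completion and run the callback, if any. */
static void model_bio_endio(struct bio *bio)
{
	bio->completed++;
	if (bio->bi_end_io)
		bio->bi_end_io(bio);
}

/* Clone completion: the original bio comes straight from bi_private. */
static void model_clone_end_io(struct bio *clone)
{
	struct bio *master_bio = clone->bi_private;

	assert(master_bio != NULL);
	master_bio->bi_status = clone->bi_status;
	model_bio_endio(master_bio);
}

/* Point a freshly allocated clone back at the bio it was cloned from. */
static void model_init_clone(struct bio *clone, struct bio *master_bio)
{
	clone->bi_status = 0;
	clone->completed = 0;
	clone->bi_private = master_bio;
	clone->bi_end_io = model_clone_end_io;
}

/*
 * Complete a clone with the given status and return the status that the
 * original bio ends up with, or -1 if it did not complete exactly once.
 */
int model_complete_clone(int status)
{
	struct bio master = { 0 };
	struct bio clone;

	model_init_clone(&clone, &master);
	clone.bi_status = status;
	model_bio_endio(&clone);

	return master.completed == 1 ? master.bi_status : -1;
}
```

In the real patch the clone would still be released with bio_put()
before the original is completed; the model leaves that out because
both bios live on the stack.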
>
> >
> > > +
> > > +#define scsi_mpath_to_master_bio(clone) \
> > > +	container_of(clone, struct scsi_mpath_clone_bio, clone)
> > > +
> > > static int scsi_mpath_unique_lun_id(struct scsi_device *sdev)
> > > {
> > > struct scsi_mpath_device *scsi_mpath_dev = sdev->scsi_mpath_dev;
> >
> > > @@ -260,6 +269,39 @@ static int scsi_multipath_sdev_init(struct scsi_device *sdev)
> > > return 0;
> > > }
> > > +static void scsi_mpath_clone_end_io(struct bio *clone)
> > > +{
> > > +	struct scsi_mpath_clone_bio *scsi_mpath_clone_bio =
> > > +				scsi_mpath_to_master_bio(clone);
> > > +	struct bio *master_bio = scsi_mpath_clone_bio->master_bio;
> > > +
> > > +	master_bio->bi_status = clone->bi_status;
> > > +	bio_put(clone);
> > > +	bio_endio(master_bio);
> > > +}
> > > +
> > > +static struct bio *scsi_mpath_clone_bio(struct bio *bio)
> > > +{
> > > +	struct mpath_disk *mpath_disk = bio->bi_bdev->bd_disk->private_data;
> > > +	struct mpath_head *mpath_head = mpath_disk->mpath_head;
> > > +	struct scsi_mpath_clone_bio *scsi_mpath_clone_bio;
> > > +	struct scsi_mpath_head *scsi_mpath_head = mpath_head->drvdata;
> > > +	struct bio *clone;
> > > +
> > > +	clone = bio_alloc_clone(bio->bi_bdev, bio, GFP_NOWAIT,
> > > +				&scsi_mpath_head->bio_pool);
> >
> > Why use GFP_NOWAIT? It's more likely to fail than GFP_NOIO. If the bio
> > has REQ_NOWAIT set, I can see where you would need this, but otherwise,
> > I don't see why GFP_NOIO wouldn't be better here.
>
> Seems reasonable to try GFP_NOIO. Furthermore, we really can't tolerate the
> clone to fail. So, if it does, we should return an error pointer here and
> mpath_bdev_submit_bio() should error the original bio.
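
For reference, the allocation policy being discussed here can be
written down as a small userspace model. This is a sketch, not the
patch: the MODEL_* names are hypothetical stand-ins for REQ_NOWAIT,
GFP_NOWAIT/GFP_NOIO and the block status codes, and mapping the two
failure cases to "again" and "I/O error" is my assumption rather than
something agreed in this thread.

```c
/*
 * Userspace model only. The MODEL_* names are hypothetical stand-ins for
 * REQ_NOWAIT, GFP_NOWAIT/GFP_NOIO and the block layer status codes.
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_REQ_NOWAIT	(1u << 0)

enum model_gfp { MODEL_GFP_NOWAIT, MODEL_GFP_NOIO };
enum model_status { MODEL_SUBMITTED, MODEL_STS_AGAIN, MODEL_STS_IOERR };

/* Only a bio that asked not to block gets the non-blocking allocation. */
enum model_gfp model_clone_gfp(unsigned int bi_opf)
{
	return (bi_opf & MODEL_REQ_NOWAIT) ? MODEL_GFP_NOWAIT : MODEL_GFP_NOIO;
}

/*
 * What happens to the original bio. A failed clone is never dropped
 * silently: the original bio is completed with an error instead.
 * Which error to use for each case is an assumption of this sketch.
 */
enum model_status model_submit(unsigned int bi_opf, bool clone_ok)
{
	if (clone_ok)
		return MODEL_SUBMITTED;
	if (model_clone_gfp(bi_opf) == MODEL_GFP_NOWAIT)
		return MODEL_STS_AGAIN;
	return MODEL_STS_IOERR;
}

/* Usage example covering the four combinations. */
void model_submit_selfcheck(void)
{
	assert(model_submit(0, true) == MODEL_SUBMITTED);
	assert(model_submit(MODEL_REQ_NOWAIT, true) == MODEL_SUBMITTED);
	assert(model_submit(0, false) == MODEL_STS_IOERR);
	assert(model_submit(MODEL_REQ_NOWAIT, false) == MODEL_STS_AGAIN);
}
```

With a blocking allocation backed by a reserved pool, the non-NOWAIT
failure branch should be very hard to reach in practice, but the
submit path still needs to handle it.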
>
> >
> > > +	if (!clone)
> > > +		return NULL;
> > > +
> > > +	clone->bi_end_io = scsi_mpath_clone_end_io;
> > > +
> > > +	scsi_mpath_clone_bio = container_of(clone,
> > > +			struct scsi_mpath_clone_bio, clone);
> > > +	scsi_mpath_clone_bio->master_bio = bio;
> > > +
> > > +	return clone;
> > > +}
> > > +
> > > static enum mpath_iopolicy_e scsi_mpath_get_iopolicy(struct mpath_head *mpath_head)
> > > {
> > > struct scsi_mpath_head *scsi_mpath_head = mpath_head->drvdata;
> > > @@ -269,6 +311,7 @@ static enum mpath_iopolicy_e scsi_mpath_get_iopolicy(struct mpath_head *mpath_he
> > > struct mpath_head_template smpdt_pr = {
> > > .get_iopolicy = scsi_mpath_get_iopolicy,
> > > + .clone_bio = scsi_mpath_clone_bio,
> > > };
> > > static struct scsi_mpath_head *scsi_mpath_alloc_head(void)
> > > @@ -283,9 +326,13 @@ static struct scsi_mpath_head *scsi_mpath_alloc_head(void)
> > > ida_init(&scsi_mpath_head->ida);
> > > mutex_init(&scsi_mpath_head->lock);
> > > +	if (bioset_init(&scsi_mpath_head->bio_pool, SCSI_MAX_QUEUE_DEPTH,
> > > +			offsetof(struct scsi_mpath_clone_bio, clone),
> > > +			BIOSET_NEED_BVECS|BIOSET_PERCPU_CACHE))
> >
> > You don't need 4096 cached bios to guarantee forward progress. I don't
> > see why BIO_POOL_SIZE won't work fine here.
>
> Every bio which we are sent is cloned. And SCSI_MAX_QUEUE_DEPTH is used as
> the cached bio size - wouldn't it make sense to cache more than 2 bios?

IIRC, the reserved pool is there to guarantee forward progress under
memory pressure, so that if the system is short on memory and needs
to write out data to this multipath device in order to free up memory,
there will be enough resources to do that.

Under normal conditions, your new bios should be getting pulled from the
per-cpu cache anyways, since you set BIOSET_PERCPU_CACHE. That's going
to be the fastest way to get one.
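
As a rough illustration of the difference, here is a toy userspace
model of the allocation order as I understand it: per-cpu cache first,
then the regular allocator, and the reserved pool only when the
allocator fails. Everything named model_* or MODEL_* is made up for
this sketch, and MODEL_POOL_SIZE of 2 just mirrors the "2 bios"
mentioned above.

```c
/*
 * Toy userspace model of where a new bio comes from. This is not the
 * kernel's bioset code; all names here are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_POOL_SIZE	2	/* stands in for BIO_POOL_SIZE */

enum model_source {
	MODEL_FROM_CACHE,	/* per-cpu cache hit */
	MODEL_FROM_ALLOCATOR,	/* regular memory allocation */
	MODEL_FROM_RESERVE,	/* reserved pool element */
	MODEL_MUST_WAIT,	/* blocking caller waits for a free element */
};

struct model_bioset {
	int cache_free;		/* bios available in the per-cpu cache */
	int reserve_free;	/* elements left in the reserved pool */
	bool memory_pressure;	/* model-only: regular allocation fails */
};

enum model_source model_alloc(struct model_bioset *bs)
{
	if (bs->cache_free > 0) {
		bs->cache_free--;
		return MODEL_FROM_CACHE;
	}
	if (!bs->memory_pressure)
		return MODEL_FROM_ALLOCATOR;
	if (bs->reserve_free > 0) {
		bs->reserve_free--;
		return MODEL_FROM_RESERVE;
	}
	return MODEL_MUST_WAIT;
}

/* A completed bio that came from the reserve refills the reserve. */
void model_free_to_reserve(struct model_bioset *bs)
{
	if (bs->reserve_free < MODEL_POOL_SIZE)
		bs->reserve_free++;
}

/* Usage example: the reserve is only touched under memory pressure. */
void model_bioset_selfcheck(void)
{
	struct model_bioset bs = {
		.cache_free = 1,
		.reserve_free = MODEL_POOL_SIZE,
		.memory_pressure = false,
	};

	assert(model_alloc(&bs) == MODEL_FROM_CACHE);
	assert(model_alloc(&bs) == MODEL_FROM_ALLOCATOR);
	assert(bs.reserve_free == MODEL_POOL_SIZE);

	bs.memory_pressure = true;
	assert(model_alloc(&bs) == MODEL_FROM_RESERVE);
	assert(model_alloc(&bs) == MODEL_FROM_RESERVE);
	assert(model_alloc(&bs) == MODEL_MUST_WAIT);

	/* One in-flight bio completes, so the waiter can make progress. */
	model_free_to_reserve(&bs);
	assert(model_alloc(&bs) == MODEL_FROM_RESERVE);
}
```

In this model a larger reserve would only change how many bios can be
in flight during the memory-pressure phase; it has no effect on the
cache-hit path.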

-Ben

>
> > Also, since you are cloning
> > bios, they are sharing the original bio's iovecs, so you don't need
> > BIOSET_NEED_BVECS.
> >
>
> ok
>
> thanks!