From: James Bottomley <James.Bottomley@steeleye.com>
To: Patrick Mansfield <patmans@us.ibm.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>,
Lars Marowsky-Bree <lmb@suse.de>,
linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [RFC] Multi-path IO in 2.5/2.6 ?
Date: Mon, 09 Sep 2002 12:34:05 -0500
Message-ID: <200209091734.g89HY5p11796@localhost.localdomain>
In-Reply-To: Message from Patrick Mansfield <patmans@us.ibm.com> of "Mon, 09 Sep 2002 09:56:52 PDT." <20020909095652.A21245@eng2.beaverton.ibm.com>
patmans@us.ibm.com said:
> Using md or volume manager is wrong for non-failover usage, and
> somewhat bad for failover models; generic block layer is OK but it is
> wasted code for any lower layers that do not or cannot have multi-path
> IO (such as IDE).
What about block devices that could usefully use multi-path to achieve network
redundancy, like nbd? If it's in the block layer or above, they can be made to
work with minimal effort.
My basic point is that the utility of the feature transcends SCSI, so SCSI is
too low a layer for it.
I wouldn't be too sure even of the IDE case: IDE has a habit of copying SCSI
features when they become more mainstream (and thus cheaper). It wouldn't
surprise me to see multi-path as an adjunct to the IDE serial stuff.
> A major problem with multi-path in md or other volume manager is that
> we use multiple (block layer) queues for a single device, when we
> should be using a single queue. If we want to use all paths to a
> device (i.e. round robin across paths or such, not a failover model)
> this means the elevator code becomes inefficient, maybe even
> counterproductive. For disk arrays, this might not be bad, but for
> actual drives or even plugging single ported drives into a switch or
> bus with multiple initiators, this could lead to slower disk
> performance.
That's true today, but may not be true in 2.6. Suparna's bio splitting code
is aimed precisely at this and other software RAID cases.
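A minimal sketch of the elevator concern quoted above, assuming a toy model in
which every request covers one sector and an elevator merges only exactly
adjacent requests. The names merge_adjacent and round_robin are hypothetical
and are not kernel interfaces.

```python
def merge_adjacent(sectors):
    """Coalesce sector numbers into contiguous (start, end) extents,
    roughly what an elevator does when it merges back-to-back requests."""
    extents = []
    for s in sorted(sectors):
        if extents and extents[-1][1] == s:
            extents[-1][1] = s + 1      # extend the current extent
        else:
            extents.append([s, s + 1])  # start a new extent
    return [tuple(e) for e in extents]


def round_robin(sectors, n_paths):
    """Distribute requests across n_paths queues in arrival order."""
    queues = [[] for _ in range(n_paths)]
    for i, s in enumerate(sectors):
        queues[i % n_paths].append(s)
    return queues


requests = list(range(100, 108))  # eight sequential one-sector requests

# One queue for the device: everything merges into a single extent.
single = merge_adjacent(requests)

# One queue per path: each queue sees only every other sector.
per_path = [merge_adjacent(q) for q in round_robin(requests, 2)]

print(len(single))                    # 1
print(sum(len(q) for q in per_path))  # 8
```

In this model the single queue issues one merged request, while the two
per-path queues issue eight unmerged ones, because neither queue ever sees
two adjacent sectors.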
> In the current code, each path is allocated a Scsi_Device, including a
> request_queue_t, and a set of Scsi_Cmnd structures. Not only do we end
> up with a Scsi_Device for each path, we also have an upper level (sd,
> sg, st, or sr) driver attached to each Scsi_Device.
You can't really get away from this. Transfer parameters are negotiated at
the Scsi_Device level (i.e. per device path from HBA to controller), and LLDs
accept I/Os for Scsi_Devices. Whatever you do, you still need an entity that
performs most of the same functions as the Scsi_Device, so you might as well
keep Scsi_Device itself, since it works.
> For sd, this means if you have n paths to each SCSI device, you are
> limited to whatever limit sd has divided by n, right now 128 / n.
> Having four paths to a device is very reasonable, limiting us to 32
> devices, but with the overhead of 128 devices.
I really don't expect this to be true in 2.6.
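The arithmetic in the quoted paragraph can be sketched as follows, assuming
only the 128-device sd limit stated there. The name usable_devices is
hypothetical.

```python
SD_MAX_DEVICES = 128  # sd limit as quoted in the message


def usable_devices(n_paths, limit=SD_MAX_DEVICES):
    """Distinct devices addressable when each one consumes n_paths sd slots."""
    return limit // n_paths


print(usable_devices(1))  # 128
print(usable_devices(4))  # 32
```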
> Using a volume manager to implement multiple paths (again non-failover
> model) means that the queue_depth might be too large if the
> queue_depth (i.e. number of outstanding commands sent to the drive)
> is set as a per-device value - we can end up sending n * queue_depth
> commands to a device.
The queues tend to be in the controllers, not in the RAID devices, thus for a
dual path RAID device you usually have two caching controllers and thus twice
the queue depth (I know this isn't always the case, but it certainly is enough
of the time for me to argue that you should have the flexibility to queue per
path).
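A minimal sketch of the two cases, assuming each hardware queue holds the same
number of commands. The depth of 32 and the function names are hypothetical,
chosen only for illustration.

```python
def worst_case_in_flight(queue_depth, n_paths):
    """Commands that can be outstanding when every path is queued
    independently up to queue_depth."""
    return queue_depth * n_paths


def is_overcommitted(queue_depth, n_paths, n_queues):
    """True if per-path queueing can exceed what the target can hold,
    given n_queues hardware queues of queue_depth commands each."""
    return worst_case_in_flight(queue_depth, n_paths) > queue_depth * n_queues


DEPTH = 32  # hypothetical per-queue depth

# Single-ported drive reached over two paths: only one queue in the drive.
print(is_overcommitted(DEPTH, n_paths=2, n_queues=1))  # True

# Dual-path array with one caching controller (and one queue) per path.
print(is_overcommitted(DEPTH, n_paths=2, n_queues=2))  # False
```

The first case is the n * queue_depth problem from the quoted text; the second
is the dual-controller case, where queueing per path matches the hardware.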
> We could implement multi-path IO in the block layer, but if the only
> user is SCSI, this gains nothing compared to putting multi-path in the
> scsi layers. Creating block level interfaces that will work for future
> devices and/or future code is hard without already having the devices
> or code in place. Any block level interface still requires support in
> the underlying layers.
> I'm not against a block level interface, but I don't have ideas or
> code for such an implementation.
SCSI got into a lot of trouble by going down the "kernel doesn't have X
feature I need, so I'll just code it into the SCSI mid-layer instead" route.
I'm loth to accept something into SCSI that I don't think belongs there in
the long term.
Answer me this question:
- In the foreseeable future, does multi-path have uses other than SCSI?
I've got to say, I can't see a "no" to that one, so it fails the high level
bar to getting into the scsi subsystem. However, the kernel, as has been said
before, isn't a theoretical exercise in design, so is there a good expediency
argument (like "it will take one year to get all the features of the block
layer to arrive and I have a customer now")? Also, to go in under expediency,
the code must be readily removable against the day it can be redone correctly.
> Generic device naming consistency is a problem if multiple devices
> show up with the same id.
Patrick Mochel has an open task to come up with a solution to this.
> With the scsi layer multi-path, ide-scsi or usb-scsi could also do
> multi-path IO.
The "scsi is everything" approach got its wings shot off at the kernel summit,
and its death was subsequently confirmed in a protracted wrangle on lkml (I
can't remember the reference off the top of my head, but I'm sure others can).
James