From: Pasi Kärkkäinen
Subject: Re: [Multipath] Round-robin performance limit
Date: Sat, 22 Oct 2011 18:02:47 +0300
Message-ID: <20111022150247.GF12984@reaktio.net>
References: <20110502072528.GW32595@reaktio.net>
	<1304375238.3134.23.camel@denise.theartistscloset.com>
	<20110503050406.GA11442@us.ibm.com>
	<1304417561.25728.16.camel@denise.theartistscloset.com>
	<1317784026.31536.7.camel@denise.theartistscloset.com>
Reply-To: device-mapper development
To: device-mapper development
Cc: jsullivan@opensourcedevel.com

On Wed, Oct 05, 2011 at 03:54:35PM -0400, Adam Chasen wrote:
> John,
> I am limited in a similar fashion. I would much prefer to use multibus
> multipath, but was unable to achieve bandwidth which would exceed a
> single link even though it was spread over the 4 available links. Were
> you able to gain even a similar performance of the RAID0 setup with
> the multibus multipath?
>

Utilizing multiple links works with, for example, this setup:
- VMware ESXi 4.1 software iSCSI initiator.
- Dell EqualLogic iSCSI target.

The steps needed for ESXi are:
- Configure multiple VMkernel (vmkX) IP interfaces.
- Configure the ESXi iSCSI initiator to use (bind to) all the vmkX interfaces.
- Configure the path selection policy to be RR (RoundRobin).
- Configure multipath to switch paths after 3 IOs.

The same should work with Linux dm-multipath.

-- Pasi

> Thanks,
> Adam
>
> On Tue, Oct 4, 2011 at 11:07 PM, John A.
> Sullivan III wrote:
> > On Tue, 2011-10-04 at 16:19 -0400, Adam Chasen wrote:
> >> Unfortunately even with playing around with various settings, queues,
> >> and other techniques, I was never able to exceed the bandwidth of more
> >> than one of the Ethernet links when accessing a single multipathed
> >> LUN.
> >>
> >> When communicating with two different multipathed LUNs, which present
> >> as two different multipath devices, I can saturate two links, but it
> >> is still a one to one ratio of multipath devices to link saturation.
> >>
> >> After further research on multipathing, it appears people are using md
> >> raid to achieve multipathed devices. My initial testing of using raid0
> >> md-raid device produces the behavior I expect of multipathed devices.
> >> I can easily saturate both links during read operations.
> >>
> >> I feel using md-raid is a less elegant solution than using
> >> dm-multipath, but it will have to suffice until someone can provide me
> >> some additional guidance.
> >>
> >> Thanks,
> >> Adam
> > We recently changed from the RAID0 approach to multipath multibus.
> > RAID0 did seem to give more even performance over a variety of IO
> > patterns but it had a critical flaw. We could not use the snapshot
> > capabilities of the SAN because we could never be certain of
> > snapshotting the RAID0 disks in a transactionally consistent state. If
> > I have four disks in a RAID0 array and snapshot them all, how can I be
> > assured that I have not done something like written two of three stripes
> > and no parity? This was our singular reason for discarding RAID0 over
> > iSCSI for multipath multibus - John
> >
> >>
> >> On Mon, Oct 3, 2011 at 11:08 PM, Adam Chasen wrote:
> >> > Malahal,
> >> > After your mentioning bio vs request based I attempted to determine if
> >> > my kernel contains the request based mpath. It seems in 2.6.31 all
> >> > mpath was switched to request based.
> >> > I have a kernel 2.6.31+ (actually .35 and .38), so I believe I have
> >> > request-based mpath.
> >> >
> >> > All,
> >> > There also appears to be a new multipath configuration option
> >> > documented in the RHEL 6 beta documentation:
> >> > rr_min_io_rq    Specifies the number of I/O requests to route to a path
> >> > before switching to the next path in the current path group, using
> >> > request-based device-mapper-multipath. This setting should be used on
> >> > systems running current kernels. On systems running kernels older than
> >> > 2.6.31, use rr_min_io. The default value is 1.
> >> >
> >> > http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/html/DM_Multipath/config_file_multipath.html
> >> >
> >> > I have not tested using this setting vs rr_min_io yet or even if my
> >> > system supports the configuration directive.
> >> >
> >> > If I trust some of the claims of several VMware ESX iSCSI multipath
> >> > setups, it is possible (possibly using different software) to gain a
> >> > multiplicative throughput by adding additional Ethernet links. This
> >> > makes me hopeful that we can do this with open-iscsi and dm-multipath
> >> > as well.
> >> >
> >> > It could be something obvious I am missing, but it appears a lot of
> >> > people experience this same issue.
> >> >
> >> > Thanks,
> >> > Adam
> >> >
> >> > On Tue, May 3, 2011 at 6:12 AM, John A. Sullivan III wrote:
> >> >> On Mon, 2011-05-02 at 22:04 -0700, Malahal Naineni wrote:
> >> >>> John A. Sullivan III [jsullivan@opensourcedevel.com] wrote:
> >> >>> > I'm also very curious about your findings on rr_min_io. I cannot find
> >> >>> > my benchmarks but we tested various settings heavily. I do not recall
> >> >>> > if we saw more even scaling with 10 or 100. I remember being surprised
> >> >>> > that performance with it set to 1 was poor.
> >> >>> > I would have thought that,
> >> >>> > in a bonded environment, changing paths per iSCSI command would give
> >> >>> > optimal performance. Can anyone explain why it does not?
> >> >>>
> >> >>> rr_min_io of 1 will give poor performance if your multipath kernel
> >> >>> module doesn't support request based multipath. In those BIO based
> >> >>> multipath, multipath receives 4KB requests. Such requests can't be
> >> >>> coalesced if they are sent on different paths.
> >> >>
> >> >> Ah, that makes perfect sense and why 3 seems to be the magic number in
> >> >> Linux (4000 / 1460 (or whatever IP payload is)). Does that change with
> >> >> Jumbo frames? In fact, how would that be optimized in Linux?
> >> >>
> >> >> 9KB seems to be a reasonable common jumbo frame value for various
> >> >> vendors and that should contain two pages but, I would guess, Linux
> >> >> can't utilize it as each block must be independently acknowledged. Is
> >> >> that correct? Thus a frame size of a little over 4KB would be optimal
> >> >> for Linux?
> >> >>
> >> >> Would that mean that rr_min_io of 1 would become optimal? However, if
> >> >> each block needs to be acknowledged before the next is sent, I would
> >> >> think we are still latency bound, i.e., even if I can send four requests
> >> >> down four separate paths, I cannot send the second until the first has
> >> >> been acknowledged and since I can easily place four packets on the same
> >> >> path within the latency period of four packets, multibus gives me
> >> >> absolutely no performance advantage for a single iSCSI stream and only
> >> >> proves useful as I start multiplexing multiple iSCSI streams.
> >> >>
> >> >> Is that analysis correct? If so, what constitutes a separate iSCSI
> >> >> stream?
> >> >> Are two separate file requests from the same file systems to the
> >> >> same iSCSI device considered two iSCSI streams and thus can be
> >> >> multiplexed and benefit from multipath or are they considered all part
> >> >> of the same iSCSI stream? If they are considered one, do they become two
> >> >> if they reside on different partitions and thus different file systems?
> >> >> If not, then do we only see multibus performance gains between a single
> >> >> file system host and a single iSCSI host when we use virtualization each
> >> >> with their own iSCSI connection (as opposed to using iSCSI connections
> >> >> in the underlying host and exposing them to the virtual machines as
> >> >> local storage)?
> >> >>
> >> >> I hope I'm not hijacking this thread and realize I've asked some
> >> >> convoluted questions but optimizing multibus through bonded links for
> >> >> single large hosts is still a bit of a mystery to me. Thanks - John
> >> >>
> >> >> --
> >> >> dm-devel mailing list
> >> >> dm-devel@redhat.com
> >> >> https://www.redhat.com/mailman/listinfo/dm-devel
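
For the Linux side of the ESXi steps listed at the top of this message, a
minimal multipath.conf sketch could look like the fragment below. Treat it
as an untested assumption rather than a verified configuration: multibus
grouping and the round-robin selector are standard multipath-tools settings,
rr_min_io_rq is the option described in the quoted RHEL 6 text, and the
value 3 simply mirrors the "switch paths after 3 IOs" step used on ESXi.

```
# Hypothetical sketch only; the values are assumptions, not tested results.
defaults {
	# Put all paths into one path group so I/O is spread across every link.
	path_grouping_policy	multibus
	# Plain round-robin between the paths in the group.
	path_selector		"round-robin 0"
	# Requests sent down a path before switching (request-based
	# dm-multipath, kernels 2.6.31 and later).
	rr_min_io_rq		3
	# Equivalent knob for older BIO-based kernels.
	rr_min_io		3
}
```

Binding one iSCSI session to each network interface is a separate
open-iscsi step and is not shown here. Whether 3 is a good value on a
request-based kernel would need to be measured.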