* Re: Performance results with exofs
From: Boaz Harrosh @ 2010-06-07 16:07 UTC
To: sfaibish; +Cc: J. Bruce Fields, NFS list

On 06/07/2010 06:24 PM, sfaibish wrote:
> Boaz,
>
> You mentioned some preliminary performance results on NFS4.1 and pNFS
> during the pNFS call a few weeks back. I thought you had put them in an
> email, but I couldn't find that email. Could you re-send it to me, or
> summarize the results in a new email, for comparison with the block
> layout performance? Bruce is also interested, so I CC him as well. Thanks
>
> /Sorin
>

I have not published the document yet. It's stuck behind my dis-talent for
writing and the pnfs bugs du jour.

Basically, all machines:
- are connected by a 1 Gbit link.
- All clients do a dd write of an 8GB file from /dev/zero.
- 3of8 is the special raid-groups arrangement of exofs && objlayout,
  where out of 8 devices each file is striped over 3 devices in a
  round robin fashion. (*With a small dirty trick)

[single client]
1 osd     40 MB/s
2 osds    80 MB/s
4 osds   114 MB/s (saturation point of the 1 Gbit link)
8 osds   114 MB/s

[2 clients 8of8 osds]
226 MB/s

[4 clients 8of8 osds]
263 MB/s

[8 clients 8of8 osds]
252 MB/s

[1 client 3of8 osds]
114 MB/s

[2 clients 3of8 osds *]
226 MB/s

[4 clients 3of8 osds *]
417 MB/s

[8 clients 3of8 osds]
405 MB/s

Boaz
* Re: Performance results with exofs
From: Boaz Harrosh @ 2010-06-07 16:13 UTC
To: sfaibish; +Cc: J. Bruce Fields, NFS list

On 06/07/2010 07:07 PM, Boaz Harrosh wrote:
> Basically all machines:
> - connected by a 1 GBit link.
> - All clients doing a dd write of 8GB file from /dev/zero
> - 3of8 is the special raid-groups arrangement of exofs && objlayout
>   where out of 8 devices each file is striped over 3 devices in a
>   round robin fashion. (*With a small dirty trick)
>

- All tests over an *empty* filesystem.

Boaz
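
To make the scaling in the reported figures easier to read, the aggregate numbers can be turned into per-client averages and compared with the raw ceiling of a 1 Gbit link. This is only a sketch and not part of the original measurements: it assumes the figures are MB/s with 1 MB = 10^6 bytes, it ignores protocol overhead, and the names `results`, `LINK_MB_S` and `per_client` are made up for illustration.

```python
# Aggregate write throughput as reported in the thread, assumed to be MB/s.
# Keys are (layout, number of clients).
results = {
    ("8of8", 2): 226, ("8of8", 4): 263, ("8of8", 8): 252,
    ("3of8", 1): 114, ("3of8", 2): 226, ("3of8", 4): 417, ("3of8", 8): 405,
}

# Raw ceiling of one 1 Gbit/s link in MB/s (assumption: 1 MB = 10**6 bytes,
# no allowance for Ethernet/IP/TCP overhead).
LINK_MB_S = 1e9 / 8 / 1e6


def per_client(aggregate_mb_s, clients):
    """Average throughput each client achieved."""
    return aggregate_mb_s / clients


for (layout, clients), total in sorted(results.items()):
    avg = per_client(total, clients)
    print(f"{layout} {clients} clients: {total} MB/s total, "
          f"{avg:.1f} MB/s per client, {avg / LINK_MB_S:.0%} of line rate")
```

Read this way, the per-client average stays near the single-client figure for 2 clients in both layouts and falls off faster for 8of8 than for 3of8 as clients are added.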
* Re: Performance results with exofs
From: sfaibish @ 2010-06-07 17:28 UTC
To: Boaz Harrosh; +Cc: J. Bruce Fields, NFS list

Thanks.

/Sorin

-- 
Best Regards

Sorin Faibish
Corporate Distinguished Engineer
Network Storage Group

EMC²
where information lives

Phone: 508-435-1000 x 48545
Cellphone: 617-510-0422
Email : sfaibish@emc.com
* Re: Performance results with exofs
From: Boaz Harrosh @ 2010-06-07 17:29 UTC
To: sfaibish; +Cc: J. Bruce Fields, NFS list

Show me yours and I'll show you mine, ...! ;-)

Boaz

On 06/07/2010 08:28 PM, sfaibish wrote:
> Thanks.
>
> /Sorin
* Re: Performance results with exofs
From: sfaibish @ 2010-06-07 17:34 UTC
To: Boaz Harrosh; +Cc: J. Bruce Fields, NFS list

On Mon, 07 Jun 2010 13:29:41 -0400, Boaz Harrosh <bharrosh@panasas.com> wrote:
> Show me yours and I'll show you mine, ...! ;-)

Working on this. Hope that next week we will have something to share. :)

/Sorin
* Re: Performance results with exofs
From: J. Bruce Fields @ 2010-06-07 18:29 UTC
To: sfaibish; +Cc: Boaz Harrosh, NFS list

>> On 06/07/2010 07:07 PM, Boaz Harrosh wrote:
>>> I did not yet publish the Document. It's stuck behind my dis-talent for
>>> writing and the pnfs bugs du jour.

Untalented writing we can fix, as long as the details are there!

>>>
>>> Basically all machines:
>>> - connected by a 1 GBit link.
>>> - All clients doing a dd write of 8GB file from /dev/zero
>>> - 3of8 is the special raid-groups arrangement of exofs && objlayout
>>>   where out of 8 devices each file is striped over 3 devices in a
>>>   round robin fashion. (*With a small dirty trick)

Random stupid questions:

  - Why do you think the 3of8 arrangement is scaling better than
    the 8of8?
  - Have you tried any other workloads? (Perfectly reasonable
    that simple write throughput would be the first thing to
    check--I'm just curious.)

>>>
>>
>> - All tests over an *empty* filesystem.
>>
>>> [single client]
>>> 1 osd     40 MB/s
>>> 2 osds    80 MB/s
>>> 4 osds   114 MB/s (saturation point of the 1 Gbit link)
>>> 8 osds   114 MB/s
>>>
>>> [2 clients 8of8 osds]
>>> 226 MB/s
>>>
>>> [4 clients 8of8 osds]
>>> 263 MB/s
>>>
>>> [8 clients 8of8 osds]
>>> 252 MB/s
>>>
>>> [1 client 3of8 osds]
>>> 114 MB/s
>>>
>>> [2 clients 3of8 osds *]
>>> 226 MB/s
>>>
>>> [4 clients 3of8 osds *]
>>> 417 MB/s

If each osd has a single gigabit interface, and you're striping to 3 of
them, isn't that 417/3 == 139 MB/s each?

(Oh, I see: you must be writing to a different file from each client,
hence you are using all osd's even if each client is only using 3?)

--b.

>>>
>>> [8 clients 3of8 osds]
>>> 405 MB/s
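
The arithmetic behind Bruce's question can be written out as a short check. This is a sketch under stated assumptions: MB means 10^6 bytes, each osd has one 1 Gbit interface (as the question supposes), and load is spread evenly over the osds in use. The names `per_osd_load` and `LINK_MB_S` are illustrative only.

```python
LINK_MB_S = 125.0     # raw ceiling of one 1 Gbit/s interface (1 MB = 10**6 bytes)
AGGREGATE_MB_S = 417  # reported figure for 4 clients, 3of8 layout


def per_osd_load(aggregate_mb_s, osds_in_use):
    """Average write rate each osd must absorb, assuming an even spread."""
    return aggregate_mb_s / osds_in_use


for osds in (3, 8):
    load = per_osd_load(AGGREGATE_MB_S, osds)
    verdict = "exceeds" if load > LINK_MB_S else "fits within"
    print(f"{osds} osds in use: {load:.1f} MB/s each, "
          f"{verdict} a single gigabit link")
```

With only 3 osds in use the per-osd rate would be above what one gigabit interface can carry, which is why the figure only makes sense if the clients' files together cover more than 3 osds.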
* Re: Performance results with exofs
From: Boaz Harrosh @ 2010-06-07 18:41 UTC
To: J. Bruce Fields; +Cc: sfaibish, NFS list

On 06/07/2010 09:29 PM, J. Bruce Fields wrote:
>>> On 06/07/2010 07:07 PM, Boaz Harrosh wrote:
>>>> I did not yet publish the Document. It's stuck behind my dis-talent for
>>>> writing and the pnfs bugs du jour.
>
> Untalented writing we can fix, as long as the details are there!
>
>>>>
>>>> Basically all machines:
>>>> - connected by a 1 GBit link.
>>>> - All clients doing a dd write of 8GB file from /dev/zero
>>>> - 3of8 is the special raid-groups arrangement of exofs && objlayout
>>>>   where out of 8 devices each file is striped over 3 devices in a
>>>>   round robin fashion. (*With a small dirty trick)
>
> Random stupid questions:
>
>   - Why do you think the 3of8 arrangement is scaling better than
>     the 8of8?

It's a known problem with network storage clusters. With 8of8, all the
clients exercise all of the nodes at the same time, so they clash on the
network. With 3of8, each node can still saturate its link (3 was chosen
carefully from the first test), and some nodes talk to some OSDs while
others talk to other OSDs, so there is more chance of pairs * 1 Gbit at
the same time.

(The dirty trick I did was to insert dummy files so that the 4-client test
exercises all 8 devices. Otherwise the stupid exofs round robin algorithm
would only exercise 4+3 devices.)

>   - Have you tried any other workloads? (Perfectly reasonable
>     that simple write throughput would be the first thing to
>     check--I'm just curious.)

Never got to it. Busy with Bakeathon preparations. I would like to, very
much.

Thanks
Boaz

>
>>>>
>>>
>>> - All tests over an *empty* filesystem.
>>>
>>>> [single client]
>>>> 1 osd     40 MB/s
>>>> 2 osds    80 MB/s
>>>> 4 osds   114 MB/s (saturation point of the 1 Gbit link)
>>>> 8 osds   114 MB/s
>>>>
>>>> [2 clients 8of8 osds]
>>>> 226 MB/s
>>>>
>>>> [4 clients 8of8 osds]
>>>> 263 MB/s
>>>>
>>>> [8 clients 8of8 osds]
>>>> 252 MB/s
>>>>
>>>> [1 client 3of8 osds]
>>>> 114 MB/s
>>>>
>>>> [2 clients 3of8 osds *]
>>>> 226 MB/s
>>>>
>>>> [4 clients 3of8 osds *]
>>>> 417 MB/s
>
> If each osd has a single gigabit interface, and you're striping to 3 of
> them, isn't that 417/3 == 139 MB/s each?
>
> (Oh, I see: you must be writing to a different file from each client,
> hence you are using all osd's even if each client is only using 3?)
>
> --b.
>
>>>>
>>>> [8 clients 3of8 osds]
>>>> 405 MB/s
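
The effect of the dummy-file trick can be illustrated with a toy placement model. This is a sketch, not the exofs implementation: it assumes the k-th file created starts on device k mod 8 and is striped over the next 3 devices, which is only one plausible reading of "round robin". The real placement rule may differ, so the exact device count for back-to-back files may not match the figure given in the message. All names here are hypothetical.

```python
NUM_DEVICES = 8   # osds in the cluster
STRIPE_WIDTH = 3  # devices per file in the 3of8 layout


def devices_for_file(file_index, num_devices=NUM_DEVICES, width=STRIPE_WIDTH):
    """Devices a file is striped over.

    Assumption (not confirmed by the thread): the file created at position
    file_index starts on device file_index % num_devices.
    """
    return [(file_index + i) % num_devices for i in range(width)]


def devices_touched(file_indices):
    """Sorted list of every device used by at least one of the files."""
    touched = set()
    for k in file_indices:
        touched.update(devices_for_file(k))
    return sorted(touched)


consecutive = [0, 1, 2, 3]  # four clients create their files back to back
spaced = [0, 2, 4, 6]       # one dummy file created between the real ones

print("back to back:", devices_touched(consecutive))
print("with dummies:", devices_touched(spaced))
```

Under this model, files created back to back overlap heavily and leave some devices idle, while spacing the real files apart with dummies lets four clients cover all eight devices.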
* Re: Performance results with exofs
From: J. Bruce Fields @ 2010-06-07 18:49 UTC
To: Boaz Harrosh; +Cc: sfaibish, NFS list

On Mon, Jun 07, 2010 at 09:41:29PM +0300, Boaz Harrosh wrote:
> On 06/07/2010 09:29 PM, J. Bruce Fields wrote:
> >>> On 06/07/2010 07:07 PM, Boaz Harrosh wrote:
> >>>> I did not yet publish the Document. It's stuck behind my dis-talent for
> >>>> writing and the pnfs bugs du jour.
> >
> > Untalented writing we can fix, as long as the details are there!
> >
> >>>>
> >>>> Basically all machines:
> >>>> - connected by a 1 GBit link.
> >>>> - All clients doing a dd write of 8GB file from /dev/zero
> >>>> - 3of8 is the special raid-groups arrangement of exofs && objlayout
> >>>>   where out of 8 devices each file is striped over 3 devices in a
> >>>>   round robin fashion. (*With a small dirty trick)
> >
> > Random stupid questions:
> >
> >   - why do you think the 3of8 arrangement is scaling better than
> >     the 8of8?
>
> It's a known problem with a network storage cluster. What happens is
> that with 8of8 all the clients exercise all of the nodes at the same
> time so they are clashing on the network.

OK, so if two clients are both trying to send a stripe of data to the
same OSD data at the same time, absent a switch that could somehow
afford to queue up a full stripe-unit's worth of data, packets get lost?

(Also, out of curiosity: do you know of any papers or documentation that
describe that problem in more detail?)

--b.
* Re: Performance results with exofs
From: Boaz Harrosh @ 2010-06-08 5:26 UTC
To: J. Bruce Fields, Welch, Brent; +Cc: sfaibish, NFS list

On 06/07/2010 09:49 PM, J. Bruce Fields wrote:
>>
>> It's a known problem with a network storage cluster. What happens is
>> that with 8of8 all the clients exercise all of the nodes at the same
>> time so they are clashing on the network.
>
> OK, so if two clients are both trying to send a stripe of data to the
> same OSD data at the same time, absent a switch that could somehow
> afford to queue up a full stripe-unit's worth of data, packets get lost?
>

It's TCP, so they don't get lost per se; they just get queued up. And then
there is the TCP ramp-up and all that, you know.

We use a 64k stripe unit with, say, a raid of 4-8; that's 256k-1M bytes in
a stripe. I don't think a network buffer that big will help at all. It'll
just delay everything more. The best is a sound statistical network
strategy that lets the system even out overall. (Or not ...)

> (Also, out of curiosity: do you know of any papers or documentation that
> describe that problem in more detail?)
>

Personally, I'm privileged to learn from the best here at Panasas.

CC: Brent, can you recommend to Bruce some good papers about raid groups
and network SAN strategies?

> --b.

Boaz
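
The stripe-size arithmetic in the message above can be spelled out as follows. This is a sketch that assumes "64k" means 64 KiB and that a full stripe is simply stripe unit times the number of devices striped over; how parity or mirror devices are counted is not stated in the thread. `full_stripe_bytes` is a made-up helper name.

```python
STRIPE_UNIT = 64 * 1024  # 64k stripe unit, taken to mean 64 KiB


def full_stripe_bytes(width, stripe_unit=STRIPE_UNIT):
    """Bytes in one full stripe across `width` devices."""
    return width * stripe_unit


for width in (4, 8, 16):
    print(f"width {width:2d}: {full_stripe_bytes(width) // 1024} KiB per stripe")
```

Under these assumptions a width of 4 gives 256 KiB and a width of 8 gives 512 KiB; a 1 MiB stripe would correspond to a width of 16 at this stripe unit.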
* Re: Performance results with exofs
From: Benny Halevy @ 2010-06-08 6:54 UTC
To: J. Bruce Fields; +Cc: Boaz Harrosh, sfaibish, NFS list

On 2010-06-07 21:49, J. Bruce Fields wrote:
> On Mon, Jun 07, 2010 at 09:41:29PM +0300, Boaz Harrosh wrote:
>> It's a known problem with a network storage cluster. What happens is
>> that with 8of8 all the clients exercise all of the nodes at the same
>> time so they are clashing on the network.
>
> OK, so if two clients are both trying to send a stripe of data to the
> same OSD data at the same time, absent a switch that could somehow
> afford to queue up a full stripe-unit's worth of data, packets get lost?
>
> (Also, out of curiosity: do you know of any papers or documentation that
> describe that problem in more detail?)
>

A good place to start would be
http://www.pdl.cmu.edu/Incast/

Benny

> --b.
* Re: Performance results with exofs
From: sfaibish @ 2010-06-08 14:48 UTC
To: Benny Halevy, J. Bruce Fields; +Cc: Boaz Harrosh, NFS list

Problem solved; I sent Bruce 2 relevant papers from CMU and FAST 2009.

/Sorin

On Tue, 08 Jun 2010 02:54:53 -0400, Benny Halevy <bhalevy@panasas.com> wrote:
> A good place to start would be
> http://www.pdl.cmu.edu/Incast/
>
> Benny
* Re: Performance results with exofs
From: J. Bruce Fields @ 2010-06-08 23:15 UTC
To: sfaibish; +Cc: Benny Halevy, Boaz Harrosh, NFS list

On Tue, Jun 08, 2010 at 10:48:55AM -0400, sfaibish wrote:
> Problem solved; I sent Bruce 2 relevant papers from CMU and FAST 2009.

Thanks, all!

--b.
* Re: Performance results with exofs
From: Boaz Harrosh @ 2010-06-07 18:43 UTC
To: J. Bruce Fields; +Cc: sfaibish, NFS list

On 06/07/2010 09:29 PM, J. Bruce Fields wrote:
>>>> [4 clients 3of8 osds *]
>>>> 417 MB/s
>
> If each osd has a single gigabit interface, and you're striping to 3 of
> them, isn't that 417/3 == 139 MB/s each?
>
> (Oh, I see: you must be writing to a different file from each client,
> hence you are using all osd's even if each client is only using 3?)
>

Right, and that little trick from the previous email ;-)

Boaz

> --b.
>
>>>>
>>>> [8 clients 3of8 osds]
>>>> 405 MB/s
Thread overview: 13+ messages
[not found] <op.vdxrrgf1unckof@usensfaibisl2e.eng.emc.com>
2010-06-07 16:07 ` Performance results with exofs Boaz Harrosh
2010-06-07 16:13 ` Boaz Harrosh
2010-06-07 17:28 ` sfaibish
[not found] ` <op.vdxxhfqsunckof-sXut7+96orlxdPWQvOaHCoI83tS8F2Zb0E9HWUfgJXw@public.gmane.org>
2010-06-07 17:29 ` Boaz Harrosh
2010-06-07 17:34 ` sfaibish
2010-06-07 18:29 ` J. Bruce Fields
2010-06-07 18:41 ` Boaz Harrosh
2010-06-07 18:49 ` J. Bruce Fields
2010-06-08 5:26 ` Boaz Harrosh
2010-06-08 6:54 ` Benny Halevy
2010-06-08 14:48 ` sfaibish
[not found] ` <op.vdzkrtmqunckof-sXut7+96orlxdPWQvOaHCoI83tS8F2Zb0E9HWUfgJXw@public.gmane.org>
2010-06-08 23:15 ` J. Bruce Fields
2010-06-07 18:43 ` Boaz Harrosh