From: "Jim Schutt" <jaschut@sandia.gov>
To: Stefan Priebe <s.priebe@profihost.ag>
Cc: Mark Nelson <mark.nelson@inktank.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: OSD Hardware questions
Date: Wed, 27 Jun 2012 11:23:05 -0600
Message-ID: <4FEB4179.8050104@sandia.gov>
In-Reply-To: <4FEB2480.3080404@profihost.ag>

On 06/27/2012 09:19 AM, Stefan Priebe wrote:
> On 27.06.2012 16:55, Jim Schutt wrote:
>> This is my current best tuning for my hardware, which uses
>> 24 SAS drives/server, and 1 OSD/drive with a journal partition
>> on the outer tracks and btrfs for the data store.
>
> Which raid level do you use?

No RAID.  Each OSD directly accesses a single
disk, via a partition for the journal and a partition
for the btrfs file store for that OSD.

I've got my 24 drives spread across three 6 Gb/s SAS HBAs,
so I can sustain ~90 MB/s per drive with all drives active,
when writing to the outer tracks using dd.
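
As a rough cross-check of those numbers, here is a minimal sketch of the
aggregate streaming rate. It assumes the per-drive rate adds up linearly
and that the drives are split evenly across the HBAs (8 per HBA); the
even split and the variable names are illustrative assumptions, not
measurements.

```python
# Figures from the message: 24 drives, ~90 MB/s each, three HBAs.
drives = 24
per_drive_mb_s = 90
hbas = 3

# Assumption: throughput scales linearly with the number of active drives.
aggregate_mb_s = drives * per_drive_mb_s

# Assumption: drives are evenly distributed, i.e. 8 drives per HBA.
per_hba_mb_s = aggregate_mb_s / hbas

print(aggregate_mb_s)  # 2160
print(per_hba_mb_s)    # 720.0
```

That puts the raw dd ceiling at roughly 2.1 GB/s per server under these
assumptions.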

I want to rely on Ceph for data protection via replication.
At some point I expect to play around with the RAID0
support in btrfs to explore the performance relationship
between number of OSDs and size of each OSD, but haven't yet.

>
>> I'd be very curious to hear how these work for you.
>> My current testing load is streaming writes from
>> 166 linux clients, and the above tunings let me
>> sustain ~2 GB/s on each server (2x replication,
>> so 500 MB/s per server aggregate client bandwidth).
> 10GbE max speed should be around 1 GB/s. Am I missing something?

Hmmm, not sure.  My servers are limited by the bandwidth
of the SAS drives and HBAs.  So 2 GB/s aggregate disk
bandwidth is 1 GB/s for journals and 1 GB/s for data.
At 2x replication, that's 500 MB/s client data bandwidth.
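
The arithmetic above can be sketched as follows. This is only an
illustration: it assumes every replica is written once to the journal
and once to the data store, which is what the 1 GB/s + 1 GB/s split
implies. The function and parameter names are hypothetical and not part
of Ceph.

```python
def client_bandwidth_mb_s(disk_bw_mb_s, replication, journal_copies=1):
    """Client-visible write bandwidth for one server, in MB/s.

    Assumption: each byte of replica data costs (1 + journal_copies)
    bytes of disk bandwidth, one for the data store plus one per
    journal write.
    """
    data_store_bw = disk_bw_mb_s / (1 + journal_copies)
    # Each client byte is stored `replication` times across the cluster,
    # so on average a server spends that many bytes per client byte.
    return data_store_bw / replication


print(client_bandwidth_mb_s(2000, 2))  # 500.0
```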

>
>> I have dual-port 10 GbE NICs, and use one port
>> for the cluster and one for the clients. I use
>> jumbo frames because it freed up ~10% CPU cycles over
>> the default config of 1500-byte frames + GRO/GSO/etc
>> on the load I'm currently testing with.
> Do you have ntuple and lro on or off? Which kernel version do you use and which driver version? Intel cards?

# ethtool -k eth2
Offload parameters for eth2:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off

The NICs are Chelsio T4, but I'm not using any of the
TCP stateful offload features for this testing.
I don't know if they have ntuple support, but the
ethtool version I'm using (2.6.33) doesn't mention it.
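
For comparing offload settings between hosts, the `ethtool -k` listing
above can be turned into a dictionary. This is a minimal sketch that
assumes the plain "name: on|off" format shown here; the helper name
parse_offloads is hypothetical, and lines with any other value are
simply skipped.

```python
def parse_offloads(text):
    """Parse `ethtool -k` style output into {feature_name: bool}."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the header line ("Offload parameters for eth2:") and
        # anything that is not a plain on/off setting.
        if not sep or value not in ("on", "off"):
            continue
        settings[name.strip()] = (value == "on")
    return settings


sample = """\
Offload parameters for eth2:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
"""

offloads = parse_offloads(sample)
disabled = sorted(name for name, enabled in offloads.items() if not enabled)
print(disabled)
# ['large-receive-offload', 'udp-fragmentation-offload']
```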

For kernels I switch back and forth between the latest development
kernel from Linus's tree and the latest stable kernel, depending
on where the kernel development cycle is.  I usually switch
to the development kernel around -rc4 or so.

-- Jim

>
> Stefan
>
>



