From: Mark Nelson <mark.nelson@inktank.com>
To: Yann Dupont <Yann.Dupont@univ-nantes.fr>
Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>,
	Yehuda Sadeh <yehuda@inktank.com>,
	Stefan Majer <stefan.majer@gmail.com>,
	ceph-devel@vger.kernel.org
Subject: Re: poor OSD performance using kernel 3.4 => problem found
Date: Thu, 31 May 2012 11:14:39 -0500
Message-ID: <4FC798EF.3070500@inktank.com>
In-Reply-To: <4FC79193.1000604@univ-nantes.fr>

On 05/31/2012 10:43 AM, Yann Dupont wrote:
> On 31/05/2012 17:32, Mark Nelson wrote:
>> ceph osd pool get <pool> pg_num
>
> My setup is detailed in a previous mail, but as I changed some
> parameters this morning, here we go:
>
> root@chichibu:~# ceph osd pool get data pg_num
> PG_NUM: 576
> root@chichibu:~# ceph osd pool get rbd pg_num
> PG_NUM: 576
>
>
>
> The pg num is quite low because I started with small OSDs (9 OSDs with
> 200G each, on internal disks) when I formatted. I have since reduced to
> 8 OSDs (osd.4 is out), but with much larger (and faster) storage.
>
>
> Each of the 8 OSDs now has 5T on it; for the moment I am trying to keep
> the OSDs similar. Replication is set to 2.
>
>
> The fs is btrfs, formatted with big metadata (-l 64k -n 64k) and mounted
> with space_cache,compress=lzo,nobarrier,noatime.
>
> The journal is on tmpfs:
> osd journal = /dev/shm/journal
> osd journal size = 6144
>
> I know this is dangerous; remember, it's NOT a production system for the
> moment.
>
> No OSD is full; I don't have much data stored yet.
>
> Concerning the crush map, I'm not using the default one:
>
> The 8 nodes are in 3 different locations (some kilometers apart): 2 are
> in one place, 2 in another, and the last 4 in the main site.
>
> There is 10G between all the nodes and they are in the same VLAN, with no
> router involved (but there is (negligible?) latency between nodes).
>
> I try to group hosts together to avoid problems when I lose a location
> (an electrical problem, for example). I'm not sure I customized the
> crush map as I should have.
>
> Here is the map:
> # begin crush map
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 device4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
>
> # types
> type 0 osd
> type 1 host
> type 2 rack
> type 3 pool
>
> # buckets
> host karuizawa {
> id -5 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.2 weight 1.000
> }
> host hazelburn {
> id -6 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.3 weight 1.000
> }
> rack loire {
> id -3 # do not change unnecessarily
> # weight 2.000
> alg straw
> hash 0 # rjenkins1
> item karuizawa weight 1.000
> item hazelburn weight 1.000
> }
> host carsebridge {
> id -8 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.5 weight 1.000
> }
> host cameronbridge {
> id -9 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.6 weight 1.000
> }
> rack chantrerie {
> id -7 # do not change unnecessarily
> # weight 2.000
> alg straw
> hash 0 # rjenkins1
> item carsebridge weight 1.000
> item cameronbridge weight 1.000
> }
> host chichibu {
> id -2 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.0 weight 1.000
> }
> host glenesk {
> id -4 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.1 weight 1.000
> }
> host braeval {
> id -10 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.7 weight 1.000
> }
> host hanyu {
> id -11 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.8 weight 1.000
> }
> rack lombarderie {
> id -12 # do not change unnecessarily
> # weight 4.000
> alg straw
> hash 0 # rjenkins1
> item chichibu weight 1.000
> item glenesk weight 1.000
> item braeval weight 1.000
> item hanyu weight 1.000
> }
> pool default {
> id -1 # do not change unnecessarily
> # weight 8.000
> alg straw
> hash 0 # rjenkins1
> item loire weight 2.000
> item chantrerie weight 2.000
> item lombarderie weight 4.000
> }
>
> # rules
> rule data {
> ruleset 0
> type replicated
> min_size 1
> max_size 10
> step take default
> step chooseleaf firstn 0 type host
> step emit
> }
> rule metadata {
> ruleset 1
> type replicated
> min_size 1
> max_size 10
> step take default
> step chooseleaf firstn 0 type host
> step emit
> }
> rule rbd {
> ruleset 2
> type replicated
> min_size 1
> max_size 10
> step take default
> step chooseleaf firstn 0 type host
> step emit
> }
>
> # end crush map
>
> Hope it helps,
> cheers
>
>

Hi Yann,

You might want to start out by running sar/iostat/collectl on the OSD
nodes and seeing if anything looks funny during the slow test compared
to the fast one.  If that doesn't reveal much, you could run blktrace on
one of the OSDs during the tests and see if the IO to the disk looks
different.  I can help out if you want to send me your blktrace results.
Similarly, you could watch the network streams for both tests and see
if anything looks different there.
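
A minimal sketch of how the captures suggested above might be collected
on one OSD node, assuming the sysstat and blktrace packages are
installed.  The device path /dev/sdb, the 60-second window and the
osd-trace file prefix are placeholders for illustration, not values
taken from this thread; the function names are hypothetical too.

```shell
#!/bin/sh
# Placeholder values (assumptions): point DEV at the OSD's data disk and
# give each test run its own OUT prefix.
DEV=/dev/sdb
OUT=osd-trace

collect_iostat() {
    # Extended per-device statistics: one sample per second, 60 samples.
    iostat -x 1 60 > "$OUT.iostat.txt"
}

collect_net() {
    # Per-interface network counters over the same 60-second window.
    sar -n DEV 1 60 > "$OUT.sar-net.txt"
}

collect_blktrace() {
    # Record block-layer events on $DEV for 60 seconds (requires root).
    blktrace -d "$DEV" -o "$OUT" -w 60
}

summarize_blktrace() {
    # Turn the binary per-CPU trace files into readable text.
    blkparse -i "$OUT" > "$OUT.blkparse.txt"
}
```

Sourcing the file only defines the functions.  One way to use it is to
call them once during the slow test and once during the fast test, with
a different OUT prefix for each run, and then compare the two sets of
output files.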

Thanks!
Mark

