From: Yann Dupont <Yann.Dupont@univ-nantes.fr>
To: Gregory Farnum <greg@inktank.com>
Cc: Sam Just <sam.just@inktank.com>, ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: domino-style OSD crash
Date: Sat, 07 Jul 2012 10:19:44 +0200
Message-ID: <4FF7F120.3040708@univ-nantes.fr>
In-Reply-To: <CAPYLRzgb7KU5jBjqWS7GiYc2KqNUXXjcOv=kRoD4cEavotaX0Q@mail.gmail.com>
On 06/07/2012 19:01, Gregory Farnum wrote:
> On Fri, Jul 6, 2012 at 12:19 AM, Yann Dupont <Yann.Dupont@univ-nantes.fr> wrote:
>> On 05/07/2012 23:32, Gregory Farnum wrote:
>>
>> [...]
>>
>>>> ok, so as all nodes were identical, I have probably hit a btrfs bug (like
>>>> an erroneous out of space) at more or less the same time. And when 1 osd
>>>> was out,
>>
>> Oh, I didn't finish the sentence... When 1 osd was out, the missing data was
>> copied onto other nodes, probably speeding up the btrfs problem on those
>> nodes (I suspect erroneous out-of-space conditions)
> Ah. How full are/were the disks?
Most of the OSD volumes were below 50% full, and none was above 65% (all are 5 TB volumes):
osd.0 : 31%
osd.1 : 31%
osd.2 : 39%
osd.3 : 65%
no osd.4 :)
osd.5 : 35%
osd.6 : 60%
osd.7 : 42%
osd.8 : 34%
All the volumes were using btrfs with lzo compression.
[...]
>
> Oh, interesting. Are the broken nodes all on the same set of arrays?
>>
>> No. There are 4 completely independent raid arrays, in 4 different
>> locations. They are similar (same brand & model, but slightly different
>> disks, and 1 different firmware), and all arrays are multipathed. I don't
>> think the raid array is the problem. We have used those particular models
>> for 2 or 3 years, and in the logs I don't see any problem that could be
>> caused by the storage itself (like scsi or multipath errors)
> I must have misunderstood then. What did you mean by "1 Array for 2 OSD nodes"?
I have 8 osd nodes, in 4 different locations (several km apart). In each
location I have 2 nodes and 1 raid array.
At each location, the raid array has 16 2 TB disks and 2 controllers with
4x 8 Gb FC channels each. The 16 disks are organized in Raid 5 (8 disks
for one set, 7 disks for the other). Each raid set is primarily attached to 1
controller, and each osd node at the location has access to the
controller through 2 distinct paths.
There was no correlation between failed nodes & raid array.
Cheers,
--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr