From: Yann Dupont <Yann.Dupont@univ-nantes.fr>
To: Tommi Virtanen <tv@inktank.com>
Cc: Sam Just <sam.just@inktank.com>, ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: domino-style OSD crash
Date: Wed, 04 Jul 2012 10:06:36 +0200
Message-ID: <4FF3F98C.30602@univ-nantes.fr>
In-Reply-To: <CADvuQRGyp8j=XXStvOFc37Gy7RoWD1AQK5ih-BHudJ8hH7dT7g@mail.gmail.com>

On 03/07/2012 23:38, Tommi Virtanen wrote:
> On Tue, Jul 3, 2012 at 1:54 PM, Yann Dupont <Yann.Dupont@univ-nantes.fr> wrote:
>> If I manage to repair it, do you think the crashed FS as it is right now
>> would be valuable to you for future reference, since I saw you can't
>> reproduce the problem? I can make an archive (or a btrfs dump?), but it
>> will be quite big.
> At this point, it's more about the upstream developers (of btrfs etc)
> than us; we're on good terms with them but not experts on the on-disk
> format(s). You might want to send an email to the relevant mailing
> lists before wiping the disks.
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
Well, I probably wasn't clear enough. I said "crashed FS", but I 
meant ceph. The underlying FS (btrfs in this case) of one node 
(and only one) has PROBABLY crashed in the past, corrupting the 
ceph data on that node and then causing the other nodes to crash.

RIGHT now btrfs on this node is OK. I can access the filesystem without 
errors.

For the moment, 4 of the 8 nodes refuse to restart.
One of those 4 was the crashed node; the 3 others had no problem 
with the underlying fs as far as I can tell.

So I think the scenario is:

One node had a problem with btrfs, leading first to kernel problems, 
probably corruption (on disk, maybe in memory?), and ultimately to a 
kernel oops. Before that final kernel oops, bad data was 
transmitted to other (sane) nodes, leading to ceph-osd crashes on those 
nodes.

If you think this scenario is highly improbable in real life (that is, 
btrfs will probably be fixed for good, and then corruption can't 
happen), that's OK.

But I wonder whether this scenario can be triggered by other problems 
(power outage, out-of-memory condition, disk full, for example), so that 
bad data is again transmitted to other sane nodes.
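
As a toy illustration of the concern (a Python sketch, not Ceph code; the 
message layout and every function name here are hypothetical), the 
difference between a receiver that applies whatever a peer sends and one 
that verifies a checksum first looks like this:

```python
import zlib


def make_message(payload: bytes) -> dict:
    """Sender side: attach a CRC32 computed over the payload."""
    return {"payload": payload, "crc": zlib.crc32(payload)}


def corrupt(message: dict) -> dict:
    """Simulate a sick node: flip the first payload byte after the
    checksum has already been computed (payload must be non-empty)."""
    payload = message["payload"]
    damaged = bytes([payload[0] ^ 0xFF]) + payload[1:]
    return {"payload": damaged, "crc": message["crc"]}


def apply_unchecked(store: list, message: dict) -> None:
    """Receiver that trusts its peer: bad data lands in the local store."""
    store.append(message["payload"])


def apply_checked(store: list, message: dict) -> bool:
    """Receiver that verifies first: a mismatching payload is rejected."""
    if zlib.crc32(message["payload"]) != message["crc"]:
        return False
    store.append(message["payload"])
    return True
```

Note that a check like this only catches damage that happens after the 
checksum was computed. If the sending node's memory is already corrupt when 
it builds the message, the checksum will match the bad data.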

That's why I offered you a crashed ceph volume image (I shouldn't have 
talked about a crashed fs, sorry for the confusion).

Speaking of btrfs, there are a lot of fixes in btrfs between 3.4 and 
3.5rc. After the crash, I couldn't mount the btrfs volume. With 3.5rc I 
can, and there is no sign of a problem on it. It doesn't mean the data is 
safe there, but I think it's a sign that at least some bugs have been 
fixed in the btrfs code.

Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr


