Re: Problems after crash yesterday

From: "Jens Rehpöhler" <jens.rehpoehler@filoo.de>
To: ceph-devel@vger.kernel.org
Cc: sage@newdream.net
Subject: Re: Problems after crash yesterday
Date: Wed, 22 Feb 2012 10:53:41 +0100
Message-ID: <4F44BB25.10202@filoo.de>
In-Reply-To: <4F4370F9.5030807@filoo.de>

Some additions: in the meantime the cluster has reached this state:

2012-02-22 10:38:49.587403    pg v1044553: 2046 pgs: 2036 active+clean,
10 active+clean+inconsistent; 2110 GB data, 4061 GB used, 25732 GB /
29794 GB avail
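
For readers who want to track this over time, here is a minimal sketch of how the summary line above could be split into per-state PG counts. The function name parse_pg_summary and the regular expression are assumptions made for illustration; they are not part of any Ceph tool, and the line format is taken only from the output quoted in this mail.

```python
import re


def parse_pg_summary(line):
    """Return (total_pgs, {state: count}) from a 'pg v...' summary line."""
    # Collapse whitespace so that mail line wrapping does not matter.
    line = " ".join(line.split())
    match = re.search(r"(\d+) pgs: ([^;]+);", line)
    if match is None:
        raise ValueError("no pg summary found in line")
    total = int(match.group(1))
    states = {}
    for part in match.group(2).split(","):
        count, state = part.strip().split(" ", 1)
        states[state] = int(count)
    return total, states


# The summary line quoted above, rejoined into one string.
summary = (
    "2012-02-22 10:38:49.587403    pg v1044553: 2046 pgs: 2036 active+clean, "
    "10 active+clean+inconsistent; 2110 GB data, 4061 GB used, 25732 GB / "
    "29794 GB avail"
)

total, states = parse_pg_summary(summary)
# Everything that is not plain active+clean still needs attention.
unhealthy = {s: n for s, n in states.items() if s != "active+clean"}
print(total, unhealthy)
```

With the line from this mail, the only state left besides active+clean is active+clean+inconsistent, with 10 PGs.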

The active+recovering+remapped+backfill PG disappeared after a restart of a
crashed OSD.

The OSD crashed after issuing the command "ceph pg repair 106.3".

The repeating message is also there:

2012-02-22 10:52:36.198983   log 2012-02-22 10:52:32.182488 osd.3
10.10.10.8:6803/29916 302906 : [WRN] old request pg_log(0.ea epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
2012-02-22 10:52:36.198983   log 2012-02-22 10:52:32.182500 osd.3
10.10.10.8:6803/29916 302907 : [WRN] old request pg_log(2.e8 epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no
flag points reached
2012-02-22 10:52:36.198983   log 2012-02-22 10:52:33.182615 osd.3
10.10.10.8:6803/29916 302908 : [WRN] old request pg_log(0.ea epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
2012-02-22 10:52:36.198983   log 2012-02-22 10:52:33.182629 osd.3
10.10.10.8:6803/29916 302909 : [WRN] old request pg_log(2.e8 epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no
flag points reached
2012-02-22 10:52:36.198983   log 2012-02-22 10:52:34.182839 osd.3
10.10.10.8:6803/29916 302910 : [WRN] old request pg_log(0.ea epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
2012-02-22 10:52:36.198983   log 2012-02-22 10:52:34.182853 osd.3
10.10.10.8:6803/29916 302911 : [WRN] old request pg_log(2.e8 epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no
flag points reached
2012-02-22 10:52:36.198983   log 2012-02-22 10:52:35.183075 osd.3
10.10.10.8:6803/29916 302912 : [WRN] old request pg_log(0.ea epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
2012-02-22 10:52:36.198983   log 2012-02-22 10:52:35.183089 osd.3
10.10.10.8:6803/29916 302913 : [WRN] old request pg_log(2.e8 epoch 849
query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no
flag points reached

These requests seem to have been hanging since our crash.
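
A rough sketch of how the warnings above could be grouped to show that the same two requests keep repeating, and how old they are. The regular expression, the summarize helper, and the field names are assumptions based only on the log lines quoted in this mail, not on any documented Ceph log format.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, inferred from the quoted log lines only.
WARN_RE = re.compile(
    r"log (?P<logged>\S+ \S+) (?P<osd>osd\.\d+) .*\[WRN\] old request "
    r"pg_log\((?P<pg>\S+) .*received at (?P<received>\S+ \S+) "
    r"currently (?P<state>.+)"
)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def summarize(entries):
    """Count warnings per (osd, pg, state) and record the largest request age."""
    seen = Counter()
    max_age = {}
    for entry in entries:
        # Collapse whitespace so that mail line wrapping does not matter.
        match = WARN_RE.search(" ".join(entry.split()))
        if match is None:
            continue
        key = (match.group("osd"), match.group("pg"), match.group("state"))
        seen[key] += 1
        logged = datetime.strptime(match.group("logged"), TIME_FORMAT)
        received = datetime.strptime(match.group("received"), TIME_FORMAT)
        age = logged - received
        max_age[key] = max(max_age.get(key, age), age)
    return seen, max_age


# Three of the entries quoted above, rejoined into single strings.
entries = [
    "2012-02-22 10:52:36.198983   log 2012-02-22 10:52:32.182488 osd.3 "
    "10.10.10.8:6803/29916 302906 : [WRN] old request pg_log(0.ea epoch 849 "
    "query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started",
    "2012-02-22 10:52:36.198983   log 2012-02-22 10:52:32.182500 osd.3 "
    "10.10.10.8:6803/29916 302907 : [WRN] old request pg_log(2.e8 epoch 849 "
    "query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no "
    "flag points reached",
    "2012-02-22 10:52:36.198983   log 2012-02-22 10:52:33.182615 osd.3 "
    "10.10.10.8:6803/29916 302908 : [WRN] old request pg_log(0.ea epoch 849 "
    "query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started",
]

seen, max_age = summarize(entries)
for key, count in seen.items():
    print(key, count, max_age[key])
```

On the quoted lines this yields two distinct requests on osd.3 (PGs 0.ea and 2.e8), both received on 2012-02-20 at about 17:39:41 and still pending more than a day later.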

Finally, we see some scrub errors like this:

2012-02-22 10:47:35.049386 log 2012-02-22 10:47:25.310571 osd.4
10.10.10.10:6800/17745 34356 : [ERR] 16.4 osd.2: soid
ce7f1004/rb.0.0.00000000001a/headmissing attr _, missing attr snapset

Any advice?

Thanks,

Jens



On 21.02.2012 11:24, Jens Rehpöhler wrote:
> Hi Sage,
>
> sorry ... we have to disturb you again.
>
> After the node crash (Oli wrote about that) we have some problems.
>
> The recovery process is stuck at:
>
> 2012-02-21 11:20:15.948527    pg v986715: 2046 pgs: 2035 active+clean,
> 10 active+clean+inconsistent, 1 active+recovering+remapped+backfill;
> 1988 GB data, 3823 GB used, 25970 GB / 29794 GB avail; 1/1121879
> degraded (0.000%)
>
> We also see these messages every few seconds:
>
> 2012-02-21 11:20:15.106958   log 2012-02-21 11:20:05.765762 osd.3
> 10.10.10.8:6803/29916 131581 : [WRN] old request pg_log(0.ea epoch 849
> query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
> 2012-02-21 11:20:15.106958   log 2012-02-21 11:20:05.765775 osd.3
> 10.10.10.8:6803/29916 131582 : [WRN] old request pg_log(2.e8 epoch 849
> query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no
> flag points reached
> 2012-02-21 11:20:15.106958   log 2012-02-21 11:20:06.765912 osd.3
> 10.10.10.8:6803/29916 131583 : [WRN] old request pg_log(0.ea epoch 849
> query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
> 2012-02-21 11:20:15.106958   log 2012-02-21 11:20:06.765943 osd.3
> 10.10.10.8:6803/29916 131584 : [WRN] old request pg_log(2.e8 epoch 849
> query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no
> flag points reached
> 2012-02-21 11:20:15.106958   log 2012-02-21 11:20:07.766312 osd.3
> 10.10.10.8:6803/29916 131585 : [WRN] old request pg_log(0.ea epoch 849
> query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
> 2012-02-21 11:20:15.106958   log 2012-02-21 11:20:07.766324 osd.3
> 10.10.10.8:6803/29916 131586 : [WRN] old request pg_log(2.e8 epoch 849
> query_epoch 843) v2 received at 2012-02-20 17:39:41.774662 currently no
> flag points reached
> 2012-02-21 11:20:15.106958   log 2012-02-21 11:20:08.766467 osd.3
> 10.10.10.8:6803/29916 131587 : [WRN] old request pg_log(0.ea epoch 849
> query_epoch 843) v2 received at 2012-02-20 17:39:41.774507 currently started
>
> Any ideas how we can get the cluster back to a consistent state?
>
> Thank you!
>
> Jens


-- 
Kind regards

Jens Rehpöhler

----------------------------------------------------------------------
Filoo GmbH
Moltkestr. 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Managing directors: S.Grewing | J.Rehpöhler | Dr. C.Kunz
Phone: +49 5241 8673012 | Mobile: +49 151 54645798
Hotline: 07000-3378658 (14 Ct/min) Fax: +49 5241 8673020




Thread overview: 6+ messages
2012-02-21 10:24 Problems after crash yesterday Jens Rehpöhler
2012-02-22  9:53 ` Jens Rehpöhler [this message]
2012-02-22 17:12   ` Gregory Farnum
2012-02-22 20:25     ` Jens Rehpöhler
2012-02-24  5:14       ` Gregory Farnum
2012-02-27 23:32         ` Gregory Farnum
