From: Andras Pataki <apataki-0QEYAsm1mgjsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
To: Samuel Just <sjust-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: "ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org"
<ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org>,
"ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Inconsistent PGs that ceph pg repair does not fix
Date: Tue, 8 Sep 2015 17:50:38 +0000
Message-ID: <1441734638465.20865@simonsfoundation.org>
In-Reply-To: <D1E543BF.8DD2%apataki-0QEYAsm1mgjsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Hi Sam,
I saw that Ceph 0.94.3 is out and that it contains a fix for the issue below (http://tracker.ceph.com/issues/12577). I installed it on our cluster, but unfortunately it didn't resolve the problem. As before, I have a couple of inconsistent PGs, and when I run ceph pg repair on them, the OSD log says:
2015-09-08 11:21:53.930324 7f49c17ea700 0 log_channel(cluster) log [INF] : 2.439 repair starts
2015-09-08 11:27:57.708394 7f49c17ea700 -1 log_channel(cluster) log [ERR] : repair 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-09-08 11:28:32.359938 7f49c17ea700 -1 log_channel(cluster) log [ERR] : 2.439 repair 1 errors, 0 fixed
2015-09-08 11:28:32.364506 7f49c17ea700 0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
2015-09-08 11:29:18.650876 7f49c17ea700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-09-08 11:29:23.136109 7f49c17ea700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors
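For anyone collecting these errors across several OSD logs, here is a rough Python sketch that pulls the PG, object and the two digests out of lines like the ones above. The regular expression is an assumption based only on the log lines shown in this message; other Ceph releases may word the error differently. The names parse_digest_errors, on_disk and other are made up for illustration.

```python
import re

# Assumed pattern, derived only from the sample lines in this message.
DIGEST_ERR = re.compile(
    r"\[ERR\] : (?P<op>repair|deep-scrub) (?P<pg>\S+) (?P<obj>\S+) "
    r"on disk data digest (?P<on_disk>0x[0-9a-f]+) != (?P<other>0x[0-9a-f]+)"
)


def parse_digest_errors(lines):
    """Return one dict per digest-mismatch line found in `lines`."""
    found = []
    for line in lines:
        match = DIGEST_ERR.search(line)
        if match:
            found.append(match.groupdict())
    return found


# Sample input copied from the log excerpt above.
SAMPLE = [
    "2015-09-08 11:21:53.930324 7f49c17ea700 0 log_channel(cluster) log [INF] : 2.439 repair starts",
    "2015-09-08 11:27:57.708394 7f49c17ea700 -1 log_channel(cluster) log [ERR] : "
    "repair 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0",
    "2015-09-08 11:28:32.359938 7f49c17ea700 -1 log_channel(cluster) log [ERR] : 2.439 repair 1 errors, 0 fixed",
]

if __name__ == "__main__":
    for err in parse_digest_errors(SAMPLE):
        print(err["op"], err["pg"], err["obj"], err["on_disk"], err["other"])
```

Only the per-object mismatch line is matched; the "repair starts" and "1 errors, 0 fixed" summary lines are skipped on purpose.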
$ ceph tell osd.* version | grep version | sort | uniq -c
94 "version": "ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)"
Could you have another look?
Thanks,
Andras
________________________________________
From: Andras Pataki
Sent: Monday, August 3, 2015 4:09 PM
To: Samuel Just
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org; ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [ceph-users] Inconsistent PGs that ceph pg repair does not fix
Done: http://tracker.ceph.com/issues/12577
BTW, I'm using the latest release 0.94.2 on all machines.
Andras
On 8/3/15, 3:38 PM, "Samuel Just" <sjust-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>Hrm, that's certainly supposed to work. Can you make a bug? Be sure
>to note what version you are running (output of ceph-osd -v).
>-Sam
>
>On Mon, Aug 3, 2015 at 12:34 PM, Andras Pataki
><apataki-0QEYAsm1mgjsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
>> Summary: I am having problems with inconsistent PG's that the 'ceph pg
>> repair' command does not fix. Below are the details. Any help would be
>> appreciated.
>>
>> # Find the inconsistent PG's
>> ~# ceph pg dump | grep inconsistent
>> dumped all in format plain
>> 2.439 42080 00 017279507143 31033103 active+clean+inconsistent2015-08-03
>> 14:49:17.29288477323'2250145 77480:890566 [78,54]78 [78,54]78
>> 77323'22501452015-08-03 14:49:17.29253877323'2250145 2015-08-03
>> 14:49:17.292538
>> 2.8b9 40830 00 016669590823 30513051 active+clean+inconsistent2015-08-03
>> 14:46:05.14006377323'2249886 77473:897325 [7,72]7 [7,72]7
>> 77323'22498862015-08-03 14:22:47.83406377323'2249886 2015-08-03
>> 14:22:47.834063
>>
>> # Look at the first one:
>> ~# ceph pg deep-scrub 2.439
>> instructing pg 2.439 on osd.78 to deep-scrub
>>
>> # The logs of osd.78 show:
>> 2015-08-03 15:16:34.409738 7f09ec04a700 0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
>> 2015-08-03 15:16:51.364229 7f09ec04a700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
>> 2015-08-03 15:16:52.763977 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors
>>
>> # Finding the object in question:
>> ~# find ~ceph/osd/ceph-78/current/2.439_head -name 10000022d93.00000f0c* -ls
>> 21510412310 4100 -rw-r--r-- 1 root root 4194304 Jun 30 17:09 /var/lib/ceph/osd/ceph-78/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
>> ~# md5sum /var/lib/ceph/osd/ceph-78/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
>> 4e4523244deec051cfe53dd48489a5db  /var/lib/ceph/osd/ceph-78/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
>>
>> # The object on the backup osd:
>> ~# find ~ceph/osd/ceph-54/current/2.439_head -name 10000022d93.00000f0c* -ls
>> 6442614367 4100 -rw-r--r-- 1 root root 4194304 Jun 30 17:09 /var/lib/ceph/osd/ceph-54/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
>> ~# md5sum /var/lib/ceph/osd/ceph-54/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
>> 4e4523244deec051cfe53dd48489a5db  /var/lib/ceph/osd/ceph-54/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
>>
>> # They don't seem to be different.
>> # When I try repair:
>> ~# ceph pg repair 2.439
>> instructing pg 2.439 on osd.78 to repair
>>
>> # The osd.78 logs show:
>> 2015-08-03 15:19:21.775933 7f09ec04a700 0 log_channel(cluster) log [INF] : 2.439 repair starts
>> 2015-08-03 15:19:38.088673 7f09ec04a700 -1 log_channel(cluster) log [ERR] : repair 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
>> 2015-08-03 15:19:39.958019 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 repair 1 errors, 0 fixed
>> 2015-08-03 15:19:39.962406 7f09ec04a700 0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
>> 2015-08-03 15:19:56.510874 7f09ec04a700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
>> 2015-08-03 15:19:58.348083 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors
>>
>> The inconsistency is not fixed. Any hints of what should be done next?
>> I have tried a few things:
>> * Stop the primary OSD, remove the object from the filesystem, restart the
>> OSD and issue a repair. It didn't work - it says that one object is
>> missing, but did not copy it from the backup.
>> * I tried the same on the backup (remove the file) - it also didn't get
>> copied back from the primary in a repair.
>>
>> Any help would be appreciated.
>>
>> Thanks,
>>
>> Andras
>> apataki-0QEYAsm1mgjsfHDXvbKv3WD2FQJk+8+b@public.gmane.org
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
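As a footnote to the manual md5sum comparison in the quoted report above: the same check can be scripted when more than two replicas or many objects are involved. The following is a minimal Python sketch, assuming the replica files have been made readable from one host (for example by copying them off each OSD). The helper names file_md5 and replicas_match are hypothetical, and the demo uses throwaway temporary files rather than real OSD paths.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicas_match(paths):
    """Return (all_equal, {path: md5}) for the given replica files."""
    digests = {path: file_md5(path) for path in paths}
    return len(set(digests.values())) == 1, digests


if __name__ == "__main__":
    # Demo with temporary stand-in files, not real OSD object files.
    workdir = tempfile.mkdtemp()
    copies = []
    for name in ("replica_a", "replica_b"):
        path = os.path.join(workdir, name)
        with open(path, "wb") as handle:
            handle.write(b"\0" * 4096)
        copies.append(path)
    same, digests = replicas_match(copies)
    print("replicas identical:", same)
    for path, md5 in digests.items():
        print(md5, path)
```

Note that identical file contents on both OSDs, as seen in the report, only show that the replicas agree with each other; it does not by itself explain the digest mismatch that scrub reports.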