* Random blocks when accessing rbd images
@ 2011-12-15 15:07 Guido Winkelmann
2011-12-15 15:13 ` Wido den Hollander
2011-12-15 15:32 ` Stratos Psomadakis
0 siblings, 2 replies; 19+ messages in thread
From: Guido Winkelmann @ 2011-12-15 15:07 UTC (permalink / raw)
To: ceph-devel
Hi,
I've got a small ceph cluster with one mon, one mds and two osds (all on the
same machine, for now), that I want to use as a block- and file storage backend
for qemu machine virtualisation.
I found that read access to some of the rbd images, or parts of some of them
sometimes blocks indefinitely, usually after the image has been sitting around
untouched for a while, for example over night. This has the effect that virtual
machines that try to access their disks as well as rbd commands like "rbd cp"
will just hang indefinitely.
I found that these blocks can usually be "fixed" by restarting one of the
osds.
The last time this happened, ceph -s reported one of the osds to be in state
"active+clean+scrubbing". (I'm afraid I don't have the complete output from
ceph -s anymore.)
Does anybody have any idea what could be going wrong here?
Guido
* Re: Random blocks when accessing rbd images
2011-12-15 15:07 Random blocks when accessing rbd images Guido Winkelmann
@ 2011-12-15 15:13 ` Wido den Hollander
2011-12-15 15:32 ` Stratos Psomadakis
1 sibling, 0 replies; 19+ messages in thread
From: Wido den Hollander @ 2011-12-15 15:13 UTC (permalink / raw)
To: Guido Winkelmann; +Cc: ceph-devel
Hi,
On 12/15/2011 04:07 PM, Guido Winkelmann wrote:
> Hi,
>
> I've got a small ceph cluster with one mon, one mds and two osds (all on the
> same machine, for now), that I want to use as a block- and file storage backend
> for qemu machine virtualisation.
>
> I found that read access to some of the rbd images, or parts of some of them
> sometimes blocks indefinitely, usually after the image has been sitting around
> untouched for a while, for example over night. This has the effect that virtual
> machines that try to access their disks as well as rbd commands like "rbd cp"
> will just hang indefinitely.
>
> I found that these blocks can usually be "fixed" by restarting one of the
> osds.
>
> The last time this happened, ceph -s reported one of the osds to be in state
> "active+clean+scrubbing". (I'm afraid I don't have the complete output from
> ceph -s anymore.)
I've been seeing exactly the same behaviour, but I haven't yet been able to
dig into it any deeper.
As far as I know, a PG becomes unavailable for a short period while it is
being scrubbed. Since this scrub blocks or loops, the PG never becomes
available again, which blocks the virtual machine.
I saw this behaviour with v0.37 and v0.38; I'm upgrading to v0.39 to see
whether it still exists.
Wido
>
> Does anybody have any idea what could be going wrong here?
>
> Guido
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Random blocks when accessing rbd images
2011-12-15 15:07 Random blocks when accessing rbd images Guido Winkelmann
2011-12-15 15:13 ` Wido den Hollander
@ 2011-12-15 15:32 ` Stratos Psomadakis
2011-12-15 15:45 ` Guido Winkelmann
1 sibling, 1 reply; 19+ messages in thread
From: Stratos Psomadakis @ 2011-12-15 15:32 UTC (permalink / raw)
To: Guido Winkelmann; +Cc: ceph-devel
On 12/15/2011 05:07 PM, Guido Winkelmann wrote:
> Hi,
>
> I've got a small ceph cluster with one mon, one mds and two osds (all on the
> same machine, for now), that I want to use as a block- and file storage backend
> for qemu machine virtualisation.
>
> I found that read access to some of the rbd images, or parts of some of them
> sometimes blocks indefinitely, usually after the image has been sitting around
> untouched for a while, for example over night. This has the effect that virtual
> machines that try to access their disks as well as rbd commands like "rbd cp"
> will just hang indefinitely.
>
> I found that these blocks can usually be "fixed" by restarting one of the
> osds.
>
> The last time this happened, ceph -s reported one of the osds to be in state
> "active+clean+scrubbing". (I'm afraid I don't have the complete output from
> ceph -s anymore.)
>
> Does anybody have any idea what could be going wrong here?
I think it's fixed in v0.39
> Guido
--
Stratos Psomadakis
<psomas@grnet.gr>
* Re: Random blocks when accessing rbd images
2011-12-15 15:32 ` Stratos Psomadakis
@ 2011-12-15 15:45 ` Guido Winkelmann
2011-12-15 16:30 ` Samuel Just
2011-12-15 16:31 ` Martin Mailand
0 siblings, 2 replies; 19+ messages in thread
From: Guido Winkelmann @ 2011-12-15 15:45 UTC (permalink / raw)
To: ceph-devel
On Thursday, 15 December 2011, 17:32:25 you wrote:
> On 12/15/2011 05:07 PM, Guido Winkelmann wrote:
> > Hi,
> >
> > I've got a small ceph cluster with one mon, one mds and two osds (all on
> > the same machine, for now), that I want to use as a block- and file
> > storage backend for qemu machine virtualisation.
> >
> > I found that read access to some of the rbd images, or parts of some of
> > them sometimes blocks indefinitely, usually after the image has been
> > sitting around untouched for a while, for example over night. This has
> > the effect that virtual machines that try to access their disks as well
> > as rbd commands like "rbd cp" will just hang indefinitely.
> >
> > > I found that these blocks can usually be "fixed" by restarting one of
> > > the osds.
> >
> > The last time this happened, ceph -s reported one of the osds to be in
> > state "active+clean+scrubbing". (I'm afraid I don't have the complete
> > output from ceph -s anymore.)
> >
> > Does anybody have any idea what could be going wrong here?
>
> I think it's fixed in v0.39
I'm already using 0.39, so, no. (Should have mentioned that to start with...)
Guido
* Re: Random blocks when accessing rbd images
2011-12-15 15:45 ` Guido Winkelmann
@ 2011-12-15 16:30 ` Samuel Just
2011-12-15 16:33 ` Wido den Hollander
2011-12-15 16:44 ` Guido Winkelmann
2011-12-15 16:31 ` Martin Mailand
1 sibling, 2 replies; 19+ messages in thread
From: Samuel Just @ 2011-12-15 16:30 UTC (permalink / raw)
To: ceph-devel
'ceph pg dump' will tell you the status (active/clean/scrubbing/etc)
for each pg. Does the same pg remain in state active+clean+scrubbing
for more than 10 minutes?
-Sam
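The check suggested above (does the same PG stay in a scrubbing state across two looks at 'ceph pg dump'?) could be scripted roughly as follows. This is only a sketch: it assumes each PG row starts with the PG id and carries a '+'-joined state token such as active+clean+scrubbing, as in the dump output posted later in this thread. The function names are made up for illustration.

```python
import re

# A PG id looks like "<pool>.<hex>", e.g. "0.6" or "2.688".
PGID_RE = re.compile(r"^\d+\.[0-9a-f]+$")


def scrubbing_pgs(pg_dump_text):
    """Return the set of PG ids whose state includes 'scrubbing'."""
    found = set()
    for line in pg_dump_text.splitlines():
        fields = line.split()
        if not fields or not PGID_RE.match(fields[0]):
            continue  # skip headers, summaries and blank lines
        # The state is a '+'-joined token such as 'active+clean+scrubbing'.
        if any("scrubbing" in field.split("+") for field in fields[1:]):
            found.add(fields[0])
    return found


def stuck_scrubbing(earlier_dump, later_dump):
    """PGs that were scrubbing in both snapshots (hypothetical helper)."""
    return scrubbing_pgs(earlier_dump) & scrubbing_pgs(later_dump)
```

Feed it two saved copies of the 'ceph pg dump' output taken more than 10 minutes apart. Any PG id it returns was scrubbing at both points in time and is a candidate for the hang.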
* Re: Random blocks when accessing rbd images
2011-12-15 15:45 ` Guido Winkelmann
2011-12-15 16:30 ` Samuel Just
@ 2011-12-15 16:31 ` Martin Mailand
2011-12-15 16:51 ` Guido Winkelmann
2011-12-15 17:32 ` Guido Winkelmann
1 sibling, 2 replies; 19+ messages in thread
From: Martin Mailand @ 2011-12-15 16:31 UTC (permalink / raw)
To: Guido Winkelmann; +Cc: ceph-devel
Hi Guido,
I am running ceph version 0.39-37-g54758ab
(commit:54758abccf429122c1bc3bce6d01bc33f1cfe238) on my cluster and I do
not see this problem. Do you use the qemu rbd block driver or the kernel
mount?
How did you install ceph, via the packages?
-martin
* Re: Random blocks when accessing rbd images
2011-12-15 16:30 ` Samuel Just
@ 2011-12-15 16:33 ` Wido den Hollander
2011-12-15 16:38 ` Martin Mailand
2011-12-15 16:44 ` Guido Winkelmann
1 sibling, 1 reply; 19+ messages in thread
From: Wido den Hollander @ 2011-12-15 16:33 UTC (permalink / raw)
To: Samuel Just; +Cc: ceph-devel
On 12/15/2011 05:30 PM, Samuel Just wrote:
> 'ceph pg dump' will tell you the status (active/clean/scrubbing/etc)
> for each pg. Does the same pg remain in state active+clean+scrubbing
> for more than 10 minutes?
Yes, from what I've seen it will block indefinitely until you restart
one of the OSDs that are members of the PG.
Wido
* Re: Random blocks when accessing rbd images
2011-12-15 16:33 ` Wido den Hollander
@ 2011-12-15 16:38 ` Martin Mailand
2011-12-15 16:44 ` Martin Mailand
2011-12-15 16:45 ` Wido den Hollander
0 siblings, 2 replies; 19+ messages in thread
From: Martin Mailand @ 2011-12-15 16:38 UTC (permalink / raw)
To: Wido den Hollander; +Cc: Samuel Just, ceph-devel
Hi Wido,
But wasn't that fixed a few weeks ago?
-martin
* Re: Random blocks when accessing rbd images
2011-12-15 16:38 ` Martin Mailand
@ 2011-12-15 16:44 ` Martin Mailand
2011-12-16 16:17 ` Wido den Hollander
2011-12-15 16:45 ` Wido den Hollander
1 sibling, 1 reply; 19+ messages in thread
From: Martin Mailand @ 2011-12-15 16:44 UTC (permalink / raw)
To: Wido den Hollander; +Cc: Samuel Just, ceph-devel
Hi,
At least there is a patch that should have fixed it:
http://marc.info/?l=ceph-devel&m=131955913203561&w=2
* Re: Random blocks when accessing rbd images
2011-12-15 16:30 ` Samuel Just
2011-12-15 16:33 ` Wido den Hollander
@ 2011-12-15 16:44 ` Guido Winkelmann
2011-12-15 17:24 ` Stratos Psomadakis
1 sibling, 1 reply; 19+ messages in thread
From: Guido Winkelmann @ 2011-12-15 16:44 UTC (permalink / raw)
To: ceph-devel
On Thursday, 15 December 2011, 08:30:26 you wrote:
> 'ceph pg dump' will tell you the status (active/clean/scrubbing/etc)
> for each pg. Does the same pg remain in state active+clean+scrubbing
> for more than 10 minutes?
Well, I used ceph -s, which only gave me a summary, but there definitely was a
PG that was in active+clean+scrubbing for a long time (a lot longer than 10
minutes), and remained so until I restarted one of the osds.
Unfortunately I don't know how to reliably reproduce the problem, so I can't
check now...
Guido
* Re: Random blocks when accessing rbd images
2011-12-15 16:38 ` Martin Mailand
2011-12-15 16:44 ` Martin Mailand
@ 2011-12-15 16:45 ` Wido den Hollander
1 sibling, 0 replies; 19+ messages in thread
From: Wido den Hollander @ 2011-12-15 16:45 UTC (permalink / raw)
To: Martin Mailand; +Cc: ceph-devel
On 12/15/2011 05:38 PM, Martin Mailand wrote:
> Hi Wido,
> but wasn't that fixed a few weeks ago?
I'm not sure. I think I saw it not so long ago.
My cluster was just upgraded to 0.39, so everything got restarted. I'll
keep an eye out for blocking scrubs.
Wido
* Re: Random blocks when accessing rbd images
2011-12-15 16:31 ` Martin Mailand
@ 2011-12-15 16:51 ` Guido Winkelmann
2011-12-15 17:32 ` Guido Winkelmann
1 sibling, 0 replies; 19+ messages in thread
From: Guido Winkelmann @ 2011-12-15 16:51 UTC (permalink / raw)
To: Martin Mailand; +Cc: ceph-devel
On Thursday, 15 December 2011, 17:31:22 Martin Mailand wrote:
> Hi Guido,
> I am running ceph version 0.39-37-g54758ab
> (commit:54758abccf429122c1bc3bce6d01bc33f1cfe238) on my cluster and I do
> not see this problem. Do you use the qemu rbd block driver or the kernel
> mount?
> How did you install ceph, via the packages?
I downloaded the tarball from
http://ceph.newdream.net/download/, untarred it, and did the usual ./configure ;
make ; make install. If I were to do it again, though, I would use rpmbuild
instead.
The host system is CentOS 6, with the kernel upgraded to 3.1.1, btw.
Guido
* Re: Random blocks when accessing rbd images
2011-12-15 16:44 ` Guido Winkelmann
@ 2011-12-15 17:24 ` Stratos Psomadakis
2011-12-15 21:28 ` Samuel Just
0 siblings, 1 reply; 19+ messages in thread
From: Stratos Psomadakis @ 2011-12-15 17:24 UTC (permalink / raw)
To: Guido Winkelmann; +Cc: ceph-devel
On 12/15/2011 06:44 PM, Guido Winkelmann wrote:
> On Thursday, 15 December 2011, 08:30:26 you wrote:
>> 'ceph pg dump' will tell you the status (active/clean/scrubbing/etc)
>> for each pg. Does the same pg remain in state active+clean+scrubbing
>> for more than 10 minutes?
> Well, I used ceph -s, which only gave me a summary, but there definitely was a
> PG that was in active+clean+scrubbing for a long time (a lot longer than 10
> minutes), and remained so until I restarted one of the osds.
>
> Unfortunately I don't know how to reliably reproduce the problem, so I can't
> check now...
When I hit that bug, I was able to trigger it (more easily) by setting:
osd scrub max interval = 120
in the [osd] section in ceph.conf, forcing the cluster to send pg scrubs
more often.
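For anyone who wants to script that change on a test cluster, here is a rough sketch using Python's standard configparser. The function name is made up, the value 120 comes from the message above (assumed to be seconds), and it assumes the ceph.conf in question is plain enough INI for configparser to read. Note that writing the file back this way drops comments.

```python
import configparser
import io


def set_scrub_max_interval(conf_text, seconds):
    """Return conf_text with 'osd scrub max interval' set in [osd]."""
    parser = configparser.ConfigParser(
        interpolation=None,  # keep '%' characters in values untouched
        strict=False,        # tolerate repeated sections or options
        delimiters=("=",),   # only '=' separates keys from values
    )
    parser.read_string(conf_text)
    if not parser.has_section("osd"):
        parser.add_section("osd")
    parser.set("osd", "osd scrub max interval", str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

The OSDs presumably need a restart to pick up the new value, and the setting should be reverted once the bug has been reproduced.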
Now, if you stress the cluster a bit (some heavy I/O) and combine that with
single OSD restarts, I think you should be able to trigger it.
Btw, I was using the rbd in-kernel driver.
Some info from the debugging I did: I think that at some point after
finalizing_scrub is set to true, it turns out that last_update_applied
!= info.last_update, but for some reason the scrub operation is never
requeued by op_applied, so the PG stays stuck in the scrubbing state.
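To make the suspected sequence easier to follow, here is a toy model of it. This is NOT Ceph's code. Only the names finalizing_scrub, last_update_applied, last_update and op_applied come from the description above; the class, the requeue_on_apply switch and the control flow are invented purely for illustration.

```python
class ToyPG:
    """Toy model of the suspected hang; not Ceph's actual code."""

    def __init__(self, requeue_on_apply):
        self.last_update = 0          # newest update logged for the PG
        self.last_update_applied = 0  # newest update actually applied
        self.finalizing_scrub = False
        self.scrubbing = False
        self.requeue_on_apply = requeue_on_apply  # hypothetical switch

    def write(self):
        self.last_update += 1  # logged, but not yet applied

    def try_finish_scrub(self):
        # The scrub may only finish once every logged update is applied.
        if self.last_update_applied == self.last_update:
            self.finalizing_scrub = False
            self.scrubbing = False

    def start_scrub(self):
        self.scrubbing = True
        self.finalizing_scrub = True
        self.try_finish_scrub()

    def op_applied(self):
        self.last_update_applied = self.last_update
        # Suspected bug: if this requeue is skipped, nothing ever
        # re-checks the scrub and the PG stays in 'scrubbing'.
        if self.finalizing_scrub and self.requeue_on_apply:
            self.try_finish_scrub()


def run(requeue_on_apply):
    """Return True if the toy PG is still scrubbing at the end."""
    pg = ToyPG(requeue_on_apply)
    pg.write()        # an update is in flight
    pg.start_scrub()  # the scrub has to wait for it
    pg.op_applied()   # the update lands
    return pg.scrubbing
```

In this model run(True) ends with the scrub finished, while run(False) leaves the PG scrubbing forever, which matches the symptom reported in this thread.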
--
Stratos Psomadakis
<psomas@grnet.gr>
* Re: Random blocks when accessing rbd images
2011-12-15 16:31 ` Martin Mailand
2011-12-15 16:51 ` Guido Winkelmann
@ 2011-12-15 17:32 ` Guido Winkelmann
1 sibling, 0 replies; 19+ messages in thread
From: Guido Winkelmann @ 2011-12-15 17:32 UTC (permalink / raw)
To: ceph-devel
On Thursday, 15 December 2011, 17:31:22 Martin Mailand wrote:
> Hi Guido,
> I am running ceph version 0.39-37-g54758ab
> (commit:54758abccf429122c1bc3bce6d01bc33f1cfe238) on my cluster and I do
> not see this problem. Do you use the qemu rbd block driver or the kernel
> mount?
Both: the kernel mount for CephFS, and the qemu rbd driver for the actual qemu
volumes.
Guido
* Re: Random blocks when accessing rbd images
2011-12-15 17:24 ` Stratos Psomadakis
@ 2011-12-15 21:28 ` Samuel Just
0 siblings, 0 replies; 19+ messages in thread
From: Samuel Just @ 2011-12-15 21:28 UTC (permalink / raw)
To: ceph-devel
This is likely the problem. I'll try to reproduce it today. (I meant to
post this to the list the first time.)
-Sam
* Re: Random blocks when accessing rbd images
2011-12-15 16:44 ` Martin Mailand
@ 2011-12-16 16:17 ` Wido den Hollander
2011-12-16 21:17 ` Samuel Just
0 siblings, 1 reply; 19+ messages in thread
From: Wido den Hollander @ 2011-12-16 16:17 UTC (permalink / raw)
To: Martin Mailand; +Cc: ceph-devel
Hi,
On 12/15/2011 05:44 PM, Martin Mailand wrote:
> Hi,
> at least there is a patch that should have fixed it.
>
> http://marc.info/?l=ceph-devel&m=131955913203561&w=2
>
I'm still seeing this one:
2011-12-16 17:14:53.638722 pg v1170309: 7808 pgs: 7807 active+clean,
1 active+clean+scrubbing; 15279 MB data, 47262 MB used, 73838 GB / 74520
GB avail
In this case PG "2.688" is in scrubbing state and is staying that way.
I'm running v0.39, not the latest master.
Any suggestions to trace this one down?
Wido
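A status line like the one above can be broken down into per-state PG counts with a few lines of Python, which makes it easy to alert on anything other than active+clean. This is a sketch that assumes the summary format shown in this message ("N pgs: count state, count state; ..."). The function name is made up.

```python
import re


def pg_state_counts(status_line):
    """Map each PG state to its count from a 'pg v...' summary line."""
    # Capture everything between 'N pgs:' and the first ';'.
    match = re.search(r"\d+ pgs: ([^;]*);", status_line)
    if match is None:
        return {}
    counts = {}
    for part in match.group(1).split(","):
        count, state = part.split()
        counts[state] = int(count)
    return counts
```

For the line quoted above this yields 7807 PGs in active+clean and 1 in active+clean+scrubbing. It also copes with the line being wrapped, since the split ignores the kind of whitespace between the fields.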
* Re: Random blocks when accessing rbd images
2011-12-16 16:17 ` Wido den Hollander
@ 2011-12-16 21:17 ` Samuel Just
2011-12-18 14:26 ` Wido den Hollander
2011-12-22 13:59 ` Martin Mailand
0 siblings, 2 replies; 19+ messages in thread
From: Samuel Just @ 2011-12-16 21:17 UTC (permalink / raw)
To: ceph-devel
In master, 061e7619aacf60a828e0ce84a108d5a0bea247c6 may fix the
problem. If not, 5274e88d2cb8c0449a4ecd1ff0cf8bb0af2cfc97 includes
some asserts that may give us a clue as to how this is happening.
-Sam
* Re: Random blocks when accessing rbd images
2011-12-16 21:17 ` Samuel Just
@ 2011-12-18 14:26 ` Wido den Hollander
2011-12-22 13:59 ` Martin Mailand
1 sibling, 0 replies; 19+ messages in thread
From: Wido den Hollander @ 2011-12-18 14:26 UTC (permalink / raw)
To: Samuel Just; +Cc: ceph-devel
On 12/16/2011 10:17 PM, Samuel Just wrote:
> In master, 061e7619aacf60a828e0ce84a108d5a0bea247c6 may fix the
> problem. If not, 5274e88d2cb8c0449a4ecd1ff0cf8bb0af2cfc97 includes
> some asserts that may give us a clue as to how this is happening.
I've been running with bfbde5b18525406fc3b678751459e989ea5d4977 for over
24 hours now, and everything is still active+clean.
If it comes back I'll update this thread.
Thanks,
Wido
* Re: Random blocks when accessing rbd images
2011-12-16 21:17 ` Samuel Just
2011-12-18 14:26 ` Wido den Hollander
@ 2011-12-22 13:59 ` Martin Mailand
1 sibling, 0 replies; 19+ messages in thread
From: Martin Mailand @ 2011-12-22 13:59 UTC (permalink / raw)
To: Samuel Just; +Cc: ceph-devel
Hi Samuel
I think I am seeing it now.
root@s-brick-003:~# ceph pg dump|grep -i scrub
pg_stat objects mip degr unf kb bytes log disklog state v reported up acting last_scrub
0.6 0 0 0 0 0 0 0 0 active+clean+scrubbing 0'0 60'156 [6,2] [6,2] 0'0 2011-12-20 14:44:55.787529
root@s-brick-003:~# ceph -v
ceph version 0.39-171-gdcedda8 (commit:dcedda84d0e1f69af985c301276c67c1b11e7efc)
root@s-brick-003:~#
I also had an osd crash and hit this (Assertion:
./messages/MOSDRepScrub.h: 64: FAILED assert(v == 0)), see my other
email for more information.
-martin