Paxos and long-lasting deleted data

* Paxos and long-lasting deleted data
@ 2013-01-31 18:17 Andrey Korolyov
  2013-01-31 18:31 ` Gregory Farnum
  2013-01-31 18:41 ` Joao Eduardo Luis
  0 siblings, 2 replies; 10+ messages in thread
From: Andrey Korolyov @ 2013-01-31 18:17 UTC (permalink / raw)
  To: ceph-devel

Hi,

Please take a look. This data has remained for days and does not seem
likely to be deleted in the future either:

pool name     category         KB  objects  clones  degraded  unfound  rd  rd KB      wr      wr KB
data          -                 0        0       0         0        0   0      0       0          0
install       -          15736833     3856       0         0        0  16      3  464648   60970390
metadata      -                 0        0       0         0        0   0      0       0          0
prod-rack0    -         364027905    88895       0         0        0  32      0  267626  689034186
rbd           -           4194305     1027       0         0        0   4      1   11269   25165828
  total used      6900914368        93778
  total avail    18335469376
  total space    25236383744

for pool in $(rados lspools) ; do rbd ls -l $pool ; done | grep -v SIZE | awk '{ sum += $2 } END { print sum }'
rbd: pool data doesn't contain rbd images
rbd: pool metadata doesn't contain rbd images
526360

I have seen the same thing before, but not as pronounced as here. The
cluster was put through a moderate failure test, dropping one or two OSDs
at a time under I/O pressure, with a replication factor of three.
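
To make the size of the discrepancy concrete, the figures above can be
compared with a few lines of Python. This is only a rough sketch: it
assumes that every pool uses the replication factor of three mentioned
above and that "total used" counts raw space across all OSDs. The function
name unaccounted_kb is made up for illustration.

```python
# Per-pool figures copied from the `rados df` output above (all values in KB).
POOL_KB = {
    "data": 0,
    "install": 15736833,
    "metadata": 0,
    "prod-rack0": 364027905,
    "rbd": 4194305,
}
TOTAL_USED_KB = 6900914368

# Assumption: all pools are replicated three times, and "total used"
# reports raw space consumed across every OSD.
REPLICATION = 3


def unaccounted_kb(pool_kb, total_used_kb, replication):
    """Return the raw KB in use that the per-pool figures do not explain."""
    expected_raw_kb = replication * sum(pool_kb.values())
    return total_used_kb - expected_raw_kb


if __name__ == "__main__":
    gap_kb = unaccounted_kb(POOL_KB, TOTAL_USED_KB, REPLICATION)
    print(f"logical data in pools: {sum(POOL_KB.values())} KB")
    print(f"unexplained raw usage: {gap_kb} KB (~{gap_kb / 2**30:.2f} TiB)")
```

Under those assumptions, most of the used space is not explained by the
objects the pools report, which is the leftover data in question.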

* Re: Paxos and long-lasting deleted data
  2013-01-31 18:17 Paxos and long-lasting deleted data Andrey Korolyov
@ 2013-01-31 18:31 ` Gregory Farnum
  2013-01-31 18:50   ` Andrey Korolyov
  2013-01-31 18:41 ` Joao Eduardo Luis
  1 sibling, 1 reply; 10+ messages in thread
From: Gregory Farnum @ 2013-01-31 18:31 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: ceph-devel

Can you pastebin the output of "rados -p rbd ls"?

* Re: Paxos and long-lasting deleted data
  2013-01-31 18:31 ` Gregory Farnum
@ 2013-01-31 18:50   ` Andrey Korolyov
  2013-01-31 18:56     ` Gregory Farnum
  0 siblings, 1 reply; 10+ messages in thread
From: Andrey Korolyov @ 2013-01-31 18:50 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel, Joao Eduardo Luis

http://xdel.ru/downloads/ceph-log/rados-out.txt.gz


On Thu, Jan 31, 2013 at 10:31 PM, Gregory Farnum <greg@inktank.com> wrote:
> Can you pastebin the output of "rados -p rbd ls"?

> Just wondering if there was something else you wanted to discuss on your
> email given the email subject. Did you by any chance want to discuss
> anything regarding Paxos?

Sorry, please never mind. I was just thinking about Paxos-like behavior
and put that in the title on impulse, instead of ``osd data placement''.

* Re: Paxos and long-lasting deleted data
  2013-01-31 18:50   ` Andrey Korolyov
@ 2013-01-31 18:56     ` Gregory Farnum
  2013-01-31 19:18       ` Andrey Korolyov
  0 siblings, 1 reply; 10+ messages in thread
From: Gregory Farnum @ 2013-01-31 18:56 UTC (permalink / raw)
  To: Andrey Korolyov, Dan Mick, Josh Durgin; +Cc: ceph-devel

On Thu, Jan 31, 2013 at 10:50 AM, Andrey Korolyov <andrey@xdel.ru> wrote:
> http://xdel.ru/downloads/ceph-log/rados-out.txt.gz
>
>
> On Thu, Jan 31, 2013 at 10:31 PM, Gregory Farnum <greg@inktank.com> wrote:
>> Can you pastebin the output of "rados -p rbd ls"?


Well, that sure is a lot of rbd objects. Looks like a tool mismatch or
a bug in whatever version you were using. Can you describe how you got
into this state, what versions of the servers and client tools you
used, etc?
-Greg

* Re: Paxos and long-lasting deleted data
  2013-01-31 18:56     ` Gregory Farnum
@ 2013-01-31 19:18       ` Andrey Korolyov
  2013-02-03 19:45         ` Andrey Korolyov
  0 siblings, 1 reply; 10+ messages in thread
From: Andrey Korolyov @ 2013-01-31 19:18 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Dan Mick, Josh Durgin, ceph-devel

On Thu, Jan 31, 2013 at 10:56 PM, Gregory Farnum <greg@inktank.com> wrote:
> Well, that sure is a lot of rbd objects. Looks like a tool mismatch or
> a bug in whatever version you were using. Can you describe how you got
> into this state, what versions of the servers and client tools you
> used, etc?
> -Greg

That's relatively fresh data, moved into a brand-new cluster a couple of
days after the 0.56.1 release, and the tool and daemon versions were kept
consistently the same at all times. All of the garbage data belongs to the
same pool prefix (3.), the pool on which I recently put a bunch of VM
images. The cluster may have experienced split-brain for short periods
during crash tests with no workload at all, and it also went through
standard crash tests of OSD removal and re-addition under moderate
workload. Killed OSDs were returned before, during, and after the data
rearrangement triggered by the ``osd down'' timeout. Is it possible to
clean this up somehow without re-creating the pool?

* Re: Paxos and long-lasting deleted data
  2013-01-31 19:18       ` Andrey Korolyov
@ 2013-02-03 19:45         ` Andrey Korolyov
  2013-02-03 21:46           ` Gregory Farnum
  0 siblings, 1 reply; 10+ messages in thread
From: Andrey Korolyov @ 2013-02-03 19:45 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Dan Mick, Josh Durgin, ceph-devel

Just an update: this data remained after pool deletion, so there is
probably a way to delete the garbage bytes on a live pool without doing
any harm (I hope so), since in theory it can be separated from the
pool's actual data placement.

* Re: Paxos and long-lasting deleted data
  2013-02-03 19:45         ` Andrey Korolyov
@ 2013-02-03 21:46           ` Gregory Farnum
  2013-02-04  5:31             ` Andrey Korolyov
  0 siblings, 1 reply; 10+ messages in thread
From: Gregory Farnum @ 2013-02-03 21:46 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Dan Mick, Josh Durgin, ceph-devel

On Sunday, February 3, 2013 at 11:45 AM, Andrey Korolyov wrote:
> Just an update: this data remained after pool deletion, so there is
> probably a way to delete the garbage bytes on a live pool without doing
> any harm (I hope so), since in theory it can be separated from the
> pool's actual data placement.


What? You mean you deleted the pool and the data in use by the cluster didn't drop? If that's the case, check and see if it's still at the same level — pool deletes are asynchronous and throttled to prevent impacting client operations too much.
-Greg

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Paxos and long-lasting deleted data
  2013-02-03 21:46           ` Gregory Farnum
@ 2013-02-04  5:31             ` Andrey Korolyov
  2013-02-04  8:46               ` Chen, Xiaoxi
  0 siblings, 1 reply; 10+ messages in thread
From: Andrey Korolyov @ 2013-02-04  5:31 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Dan Mick, Josh Durgin, ceph-devel

On Mon, Feb 4, 2013 at 1:46 AM, Gregory Farnum <greg@inktank.com> wrote:
> What? You mean you deleted the pool and the data in use by the cluster didn't drop? If that's the case, check and see if it's still at the same level — pool deletes are asynchronous and throttled to prevent impacting client operations too much.

Yep, of course, that is exactly what I meant. I waited until the ``ceph -w''
values had been stable for a long period, then checked that a bunch of
files with the same prefix as the deleted pool remained, and then purged
them manually. I'm not sure whether this data was in use at the moment of
pool removal; as I mentioned above, it is just garbage produced during
periods when the cluster was heavily degraded.
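
The "wait until the values are stable" check described here could be
sketched roughly as below. This is not a Ceph tool: the function name, the
window size, and the tolerance are illustrative assumptions, and the
samples would have to be collected separately, for example by reading the
"total used" figure at regular intervals.

```python
def has_stabilized(samples_kb, window=5, tolerance_kb=0):
    """Return True if the last `window` samples differ by at most `tolerance_kb`.

    `samples_kb` is a list of usage readings in KB, oldest first.
    """
    if window < 2 or len(samples_kb) < window:
        # Not enough readings yet to say anything about stability.
        return False
    recent = samples_kb[-window:]
    return max(recent) - min(recent) <= tolerance_kb


if __name__ == "__main__":
    # Hypothetical readings: usage drops while the delete runs, then flattens.
    readings = [6900, 6400, 5900, 5900, 5900]
    print(has_stabilized(readings, window=3))
```

Since pool deletion is asynchronous and throttled, a short window can
report stability too early; a longer window or a longer sampling interval
is the safer choice before purging anything by hand.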

* Re: Paxos and long-lasting deleted data
  2013-02-04  5:31             ` Andrey Korolyov
@ 2013-02-04  8:46               ` Chen, Xiaoxi
  0 siblings, 0 replies; 10+ messages in thread
From: Chen, Xiaoxi @ 2013-02-04  8:46 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Gregory Farnum, Dan Mick, Josh Durgin, ceph-devel

I have hit the same issue. When I removed a pool that contained a lot of data, the delete finished and both ceph -w and iostat showed no activity,
but the 'used' field remained a large number (although less than the original value). Iterating over all the remaining pools and running ls on each resolved the inconsistency.



* Re: Paxos and long-lasting deleted data
  2013-01-31 18:17 Paxos and long-lasting deleted data Andrey Korolyov
  2013-01-31 18:31 ` Gregory Farnum
@ 2013-01-31 18:41 ` Joao Eduardo Luis
  1 sibling, 0 replies; 10+ messages in thread
From: Joao Eduardo Luis @ 2013-01-31 18:41 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: ceph-devel

Just wondering if there was something else you wanted to discuss on your
email given the email subject. Did you by any chance want to discuss
anything regarding Paxos?

   -Joao

