Trouble with paxos service for large PG count

All of lore.kernel.org
 help / color / mirror / Atom feed

* Trouble with paxos service for large PG count
@ 2013-04-01 21:14 Jim Schutt
  2013-04-02 13:41 ` Joao Eduardo Luis
  2013-04-02 15:42 ` Joao Eduardo Luis
  0 siblings, 2 replies; 8+ messages in thread
From: Jim Schutt @ 2013-04-01 21:14 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hi,

I've been having trouble starting a new file system
created using the current next branch (most recently,
commit 3b5f663f11).

I believe the trouble is related to how long it takes paxos to
process a pgmap proposal.

For a configuration with 1 mon, 1 mds, and 576 osds, using
pg_bits = 3 and debug paxos = 10, if I start just the monitor,
here's what I get when paxos processes the first non-trivial
pgmap proposal:

2013-04-01 14:04:16.330735 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:04:16.358973 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 4943990 bytes; ctx = 0x11e81f0
2013-04-01 14:04:16.359021 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes
2013-04-01 14:04:28.164876 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-01 14:04:28.226789 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-01 14:04:28.226885 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1260f50)
2013-04-01 14:04:28.226892 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 11.867914 to finish
2013-04-01 14:04:28.226941 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active

When I redo the test with pg_bits = 4 (27696 PGs total) I get this:

2013-04-01 14:25:16.993255 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:25:17.051729 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 9887726 bytes; ctx = 0x11e81f0
2013-04-01 14:25:17.051882 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 9887726 bytes
2013-04-01 14:25:17.051888 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-01 14:26:08.185143 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 9887726 bytes
2013-04-01 14:26:08.296351 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-01 14:26:08.408900 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-01 14:26:08.408928 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1260f50)
2013-04-01 14:26:08.408931 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 51.357088 to finish
2013-04-01 14:26:08.408974 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active

and at pg_bits = 5 (55392 PGs total) I get this:

2013-04-01 14:36:15.385280 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:36:15.500292 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 19775198 bytes; ctx = 0x11e81f0
2013-04-01 14:36:15.500433 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 19775198 bytes
2013-04-01 14:36:15.500438 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-01 14:39:43.645062 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 19775198 bytes
2013-04-01 14:39:43.875099 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-01 14:39:44.086097 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-01 14:39:44.086128 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1260f50)
2013-04-01 14:39:44.086136 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 208.585710 to finish
2013-04-01 14:39:44.086156 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active

For this last configuration, after collecting the above
I waited a bit, started all the OSDs, waited a bit longer,
then collected this:

2013-04-01 14:54:37.364686 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-01 14:54:37.433641 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos active c 1..27) queue_proposal bl 10629660 bytes; ctx = 0x11ece50
2013-04-01 14:54:37.433750 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued 28 10629660 bytes
2013-04-01 14:54:37.433755 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued list_proposals 1 in queue:
2013-04-01 14:55:38.684532 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) begin for 28 10629660 bytes
2013-04-01 14:55:38.814528 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..27) commit 28
2013-04-01 14:55:38.937087 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finishing proposal
2013-04-01 14:55:38.937120 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finish it (proposal = 0x1c6e3c0)
2013-04-01 14:55:38.937124 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal proposal took 61.503375 to finish
2013-04-01 14:55:38.937168 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active

It looks like finish_queued_proposal processing time is scaling
quadratically with the proposal length for pgmaps.

FWIW, I don't believe I saw any issues of this sort for
versions 0.58 and earlier.

Please let me know if there is any other information I can
provide that will help to help fix this.

Thanks -- Jim


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Trouble with paxos service for large PG count
  2013-04-01 21:14 Trouble with paxos service for large PG count Jim Schutt
@ 2013-04-02 13:41 ` Joao Eduardo Luis
  2013-04-02 15:37   ` Joao Eduardo Luis
  2013-04-02 15:42 ` Joao Eduardo Luis
  1 sibling, 1 reply; 8+ messages in thread
From: Joao Eduardo Luis @ 2013-04-02 13:41 UTC (permalink / raw)
  To: Jim Schutt; +Cc: ceph-devel@vger.kernel.org

Hi Jim,

One thing to keep in mind is that with the monitor's rework we now share 
the Paxos instance across all Paxos services.  That may slow things down 
a bit, given paxos proposals for different services are now queued and 
have to wait their turn.  But what's happening to you appears to be 
something completely different -- see below.

On 04/01/2013 10:14 PM, Jim Schutt wrote:
> [snip]
>
> For this last configuration, after collecting the above
> I waited a bit, started all the OSDs, waited a bit longer,
> then collected this:
>
> 2013-04-01 14:54:37.364686 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
> 2013-04-01 14:54:37.433641 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos active c 1..27) queue_proposal bl 10629660 bytes; ctx = 0x11ece50
> 2013-04-01 14:54:37.433750 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued 28 10629660 bytes
> 2013-04-01 14:54:37.433755 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) propose_queued list_proposals 1 in queue:
> 2013-04-01 14:55:38.684532 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..27) begin for 28 10629660 bytes
> 2013-04-01 14:55:38.814528 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..27) commit 28
> 2013-04-01 14:55:38.937087 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finishing proposal
> 2013-04-01 14:55:38.937120 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal finish it (proposal = 0x1c6e3c0)
> 2013-04-01 14:55:38.937124 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal proposal took 61.503375 to finish
> 2013-04-01 14:55:38.937168 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active
>
> It looks like finish_queued_proposal processing time is scaling
> quadratically with the proposal length for pgmaps.

Ah! The reason is for such a delay just became obvious, and it's due to 
a quite dumb mistake.  Basically, during Paxos::begin() we're running 
the whole transaction on the JSON formatter, and then outputting it with 
log level 30 -- we should be checking for the log level first to avoid 
spending valuable time on that, specially when transactions are huge.

Besides that, while looking into another bug, I noticed that there's a 
slight problem with the logic of monitor and, at a given point, each 
transaction we create ends up not only containing an incremental but 
also a full version, which is bound to slow things down and exacerbate 
what I just described in the previous paragraph.


   -Joao

>
> FWIW, I don't believe I saw any issues of this sort for
> versions 0.58 and earlier.
>
> Please let me know if there is any other information I can
> provide that will help to help fix this.
>
> Thanks -- Jim
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Trouble with paxos service for large PG count
  2013-04-02 13:41 ` Joao Eduardo Luis
@ 2013-04-02 15:37   ` Joao Eduardo Luis
  0 siblings, 0 replies; 8+ messages in thread
From: Joao Eduardo Luis @ 2013-04-02 15:37 UTC (permalink / raw)
  To: Jim Schutt; +Cc: ceph-devel@vger.kernel.org

On 04/02/2013 02:41 PM, Joao Eduardo Luis wrote:
> Hi Jim,
>
> One thing to keep in mind is that with the monitor's rework we now share
> the Paxos instance across all Paxos services.  That may slow things down
> a bit, given paxos proposals for different services are now queued and
> have to wait their turn.  But what's happening to you appears to be
> something completely different -- see below.
>
> On 04/01/2013 10:14 PM, Jim Schutt wrote:
>> [snip]
>>
>> For this last configuration, after collecting the above
>> I waited a bit, started all the OSDs, waited a bit longer,
>> then collected this:
>>
>> 2013-04-01 14:54:37.364686 7ffff328d700 10
>> mon.cs31@0(leader).paxosservice(pgmap) propose_pending
>> 2013-04-01 14:54:37.433641 7ffff328d700  5
>> mon.cs31@0(leader).paxos(paxos active c 1..27) queue_proposal bl
>> 10629660 bytes; ctx = 0x11ece50
>> 2013-04-01 14:54:37.433750 7ffff328d700  5
>> mon.cs31@0(leader).paxos(paxos preparing update c 1..27)
>> propose_queued 28 10629660 bytes
>> 2013-04-01 14:54:37.433755 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos preparing update c 1..27)
>> propose_queued list_proposals 1 in queue:
>> 2013-04-01 14:55:38.684532 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos preparing update c 1..27) begin for 28
>> 10629660 bytes
>> 2013-04-01 14:55:38.814528 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos updating c 1..27) commit 28
>> 2013-04-01 14:55:38.937087 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal
>> finishing proposal
>> 2013-04-01 14:55:38.937120 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal
>> finish it (proposal = 0x1c6e3c0)
>> 2013-04-01 14:55:38.937124 7ffff328d700 10
>> mon.cs31@0(leader).paxos(paxos active c 1..28) finish_queued_proposal
>> proposal took 61.503375 to finish
>> 2013-04-01 14:55:38.937168 7ffff328d700 10
>> mon.cs31@0(leader).paxosservice(pgmap) _active
>>
>> It looks like finish_queued_proposal processing time is scaling
>> quadratically with the proposal length for pgmaps.
>
> Ah! The reason is for such a delay just became obvious, and it's due to
> a quite dumb mistake.  Basically, during Paxos::begin() we're running
> the whole transaction on the JSON formatter, and then outputting it with
> log level 30 -- we should be checking for the log level first to avoid
> spending valuable time on that, specially when transactions are huge.

Well, this might not be right after all.

>
> Besides that, while looking into another bug, I noticed that there's a
> slight problem with the logic of monitor and, at a given point, each
> transaction we create ends up not only containing an incremental but
> also a full version, which is bound to slow things down and exacerbate
> what I just described in the previous paragraph.

But this would certainly explain the 10MB transaction on 'Paxos::begin'.

I'm now actively looking into this and created ticket #4620 on the 
tracker [1].

   -Joao

[1] - http://tracker.ceph.com/issues/4620

>
>
>    -Joao
>
>>
>> FWIW, I don't believe I saw any issues of this sort for
>> versions 0.58 and earlier.
>>
>> Please let me know if there is any other information I can
>> provide that will help to help fix this.
>>
>> Thanks -- Jim
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Trouble with paxos service for large PG count
  2013-04-01 21:14 Trouble with paxos service for large PG count Jim Schutt
  2013-04-02 13:41 ` Joao Eduardo Luis
@ 2013-04-02 15:42 ` Joao Eduardo Luis
  2013-04-02 18:18   ` Jim Schutt
  1 sibling, 1 reply; 8+ messages in thread
From: Joao Eduardo Luis @ 2013-04-02 15:42 UTC (permalink / raw)
  To: Jim Schutt; +Cc: ceph-devel@vger.kernel.org

On 04/01/2013 10:14 PM, Jim Schutt wrote:
> Hi,
>
> I've been having trouble starting a new file system
> created using the current next branch (most recently,
> commit 3b5f663f11).
>
> I believe the trouble is related to how long it takes paxos to
> process a pgmap proposal.
>
> For a configuration with 1 mon, 1 mds, and 576 osds, using
> pg_bits = 3 and debug paxos = 10, if I start just the monitor,
> here's what I get when paxos processes the first non-trivial
> pgmap proposal:
>

Just noticed one other thing.  With 'debug paxos = 10', you should have 
a whole bunch of output (the proposal's dump) after this:

> 2013-04-01 14:04:16.330735 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
> 2013-04-01 14:04:16.358973 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 4943990 bytes; ctx = 0x11e81f0
> 2013-04-01 14:04:16.359021 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
> 2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:

and before this:

> 2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes

for every snippet you sent on your previous email.

The code responsible for that shouldn't ever have made into master, and 
should be to blame for a great deal of the time spent.

Jim, can you confirm such output is present?


   -Joao

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Trouble with paxos service for large PG count
  2013-04-02 15:42 ` Joao Eduardo Luis
@ 2013-04-02 18:18   ` Jim Schutt
       [not found]     ` <CAGgG2xTdTOWvaPAK2FnxpzC1fsok50_HXXSgayEoD3J8SK6k3w@mail.gmail.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Schutt @ 2013-04-02 18:18 UTC (permalink / raw)
  To: Joao Eduardo Luis; +Cc: ceph-devel@vger.kernel.org

On 04/02/2013 09:42 AM, Joao Eduardo Luis wrote:

> On 04/01/2013 10:14 PM, Jim Schutt wrote:
>> Hi,
>>
>> I've been having trouble starting a new file system
>> created using the current next branch (most recently,
>> commit 3b5f663f11).
>>
>> I believe the trouble is related to how long it takes paxos to
>> process a pgmap proposal.
>>
>> For a configuration with 1 mon, 1 mds, and 576 osds, using
>> pg_bits = 3 and debug paxos = 10, if I start just the monitor,
>> here's what I get when paxos processes the first non-trivial
>> pgmap proposal:
>>
> 
> Just noticed one other thing.  With 'debug paxos = 10', you should have a whole bunch of output (the proposal's dump) after this:
> 
>> 2013-04-01 14:04:16.330735 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
>> 2013-04-01 14:04:16.358973 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 4943990 bytes; ctx = 0x11e81f0
>> 2013-04-01 14:04:16.359021 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
>> 2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
> 
> and before this:
> 
>> 2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes
> 
> for every snippet you sent on your previous email.
> 
> The code responsible for that shouldn't ever have made into master, and should be to blame for a great deal of the time spent.
> 
> Jim, can you confirm such output is present?

Yep, here's what I see:

2013-04-01 14:04:16.359021 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 4943990 bytes
2013-04-01 14:04:16.359025 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
-- entry #0
 proposed queued 0.000008 tx dump:
{ "ops": [
        { "op_num": 0,
          "type": "PUT",
          "prefix": "pgmap",
          "key": "full_1",
          "length": 46,
          "bl": "0000 : 05 04 28 00 00 00 01 00 00 00 00 00 00 00 00 00 : ..(.............\n0010 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 33 33 : ..............33\n0020 : 73 3f 9a 99 59 3f 40 e8 59 51 90 f0 a4 07       : s?..Y?@.YQ....\n"},
        { "op_num": 1,
          "type": "PUT",
          "prefix": "pgmap",
          "key": "full_latest",
          "length": 8,
          "bl": "0000 : 01 00 00 00 00 00 00 00                         : ........\n"},
        { "op_num": 2,
          "type": "PUT",
          "prefix": "pgmap",
          "key": "2",
          "length": 4943790,
          "bl": "0000 : 06 05 a8 6f 4b 00 02 00 00 00 00 00 00 00 18 36 : ...oK..........6

[snipped lots more of the bufferlist, then]

        { "op_num": 3,
          "type": "PUT",
          "prefix": "pgmap",
          "key": "last_committed",
          "length": 8,
          "bl": "0000 : 02 00 00 00 00 00 00 00                         : ........\n"}]}
2013-04-01 14:04:28.096284 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 4943990 bytes

As you suggested, I see that sort of thing for every "propose_queued list_proposals" in my logs

-- Jim

> 
> 
>   -Joao
> 
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

[parent not found: <CAGgG2xTdTOWvaPAK2FnxpzC1fsok50_HXXSgayEoD3J8SK6k3w@mail.gmail.com>]

* Re: Trouble with paxos service for large PG count
       [not found]     ` <CAGgG2xTdTOWvaPAK2FnxpzC1fsok50_HXXSgayEoD3J8SK6k3w@mail.gmail.com>
@ 2013-04-02 19:16       ` Jim Schutt
  2013-04-02 20:36         ` Jim Schutt
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Schutt @ 2013-04-02 19:16 UTC (permalink / raw)
  To: Joao Luis; +Cc: ceph-devel

On 04/02/2013 12:28 PM, Joao Luis wrote:
> Right.  I'll push a patch to bump that sort of output to 30 when I get home.

Thanks - but FWIW, I don't think it's the root cause of my
issue -- more below....
> 
> If you're willing, try reducing the paxos debug level to 0 and let us know
> if those delays in the proposal go away.

When I started debugging why I couldn't start new filesystems, several
days ago, I wasn't using any debugging.  It took me until last Friday
to track the issue into paxos, which is when I turned on its debugging.

Also, I've added the following bits of debug output:

diff --git a/src/mon/Paxos.cc b/src/mon/Paxos.cc
index bf3eb92..70a8b5d 100644
--- a/src/mon/Paxos.cc
+++ b/src/mon/Paxos.cc
@@ -494,6 +494,8 @@ void Paxos::begin(bufferlist& v)
   MonitorDBStore::Transaction t;
   t.put(get_name(), last_committed+1, new_value);
 
+  utime_t tstart = ceph_clock_now(g_ceph_context);
+
   dout(30) << __func__ << " transaction dump:\n";
   JSONFormatter f(true);
   t.dump(&f);
@@ -506,7 +508,14 @@ void Paxos::begin(bufferlist& v)
   f.flush(*_dout);
   *_dout << dendl;
 
+  dout(10) <<  __func__ << " - JSON formatting, debug_tx decode took "
+           << ceph_clock_now(g_ceph_context) - tstart << dendl;
+
+  tstart = ceph_clock_now(g_ceph_context);
   get_store()->apply_transaction(t);
+  dout(10) <<  __func__ << " - get_store()->apply_transaction took "
+           << ceph_clock_now(g_ceph_context) - tstart << dendl;
+
 
   if (mon->get_quorum().size() == 1) {
     // we're alone, take it easy
diff --git a/src/mon/PaxosService.cc b/src/mon/PaxosService.cc
index a66c5ec..808ce62 100644
--- a/src/mon/PaxosService.cc
+++ b/src/mon/PaxosService.cc
@@ -158,6 +158,8 @@ void PaxosService::propose_pending()
   encode_pending(&t);
   have_pending = false;
 
+  utime_t tstart = ceph_clock_now(g_ceph_context);
+
   dout(30) << __func__ << " transaction dump:\n";
   JSONFormatter f(true);
   t.dump(&f);
@@ -166,6 +168,9 @@ void PaxosService::propose_pending()
 
   t.encode(bl);
 
+  dout(10) <<  __func__ << " - JSON formatting, encode took "
+           << ceph_clock_now(g_ceph_context) - tstart << dendl;
+
   // apply to paxos
   proposing.set(1);
   paxos->propose_new_value(bl, new C_Committed(this));


which produces this:

2013-04-02 13:04:22.311743 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-02 13:04:22.426810 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending - JSON formatting, encode took 0.000170
2013-04-02 13:04:22.426874 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos active c 1..3) queue_proposal bl 19775198 bytes; ctx = 0x11e81f0
2013-04-02 13:04:22.426999 7ffff7fbe780  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued 4 19775198 bytes
2013-04-02 13:04:22.427003 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-02 13:07:51.190312 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 19775198 bytes
2013-04-02 13:07:51.190860 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) begin - JSON formatting, debug_tx decode took 0.000002
2013-04-02 13:07:51.402134 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) begin - get_store()->apply_transaction took 0.211275
2013-04-02 13:07:51.402178 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos updating c 1..3) commit 4
2013-04-02 13:07:51.635113 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finishing proposal
2013-04-02 13:07:51.635152 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal finish it (proposal = 0x1261340)
2013-04-02 13:07:51.635159 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos active c 1..4) finish_queued_proposal proposal took 209.208162 to finish
2013-04-02 13:07:51.635193 7ffff7fbe780 10 mon.cs31@0(leader).paxosservice(pgmap) _active


I thought I was timing the same JSON formatting in propose_pending()
and begin(), but they take much different amounts of time.
I don't understand why yet.....

Also, I just noticed the elapsed time between these two lines:

2013-04-02 13:04:22.427003 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) propose_queued list_proposals 1 in queue:
2013-04-02 13:07:51.190312 7ffff7fbe780 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..3) begin for 4 19775198 bytes

It accounts for nearly the whole of the delay that finish_queued_proposal()
is reporting.  Why did begin() take so long to start, I wonder ???

-- Jim


> 
>   -Joao



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Trouble with paxos service for large PG count
  2013-04-02 19:16       ` Jim Schutt
@ 2013-04-02 20:36         ` Jim Schutt
  2013-04-02 23:24           ` Joao Eduardo Luis
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Schutt @ 2013-04-02 20:36 UTC (permalink / raw)
  To: Jim Schutt; +Cc: Joao Luis, ceph-devel

On 04/02/2013 01:16 PM, Jim Schutt wrote:
> On 04/02/2013 12:28 PM, Joao Luis wrote:
>> Right.  I'll push a patch to bump that sort of output to 30 when I get home.
> 
> Thanks - but FWIW, I don't think it's the root cause of my
> issue -- more below....

OK, I see now that you're talking about the output from list_proposals().
Indeed, when I cause list_proposals() in propose_queued() to only
be called at paxos debug 30, the stall in finish_queued_proposal()
mostly goes away.  Now, with 55392 PGs I'm seeing this:

2013-04-02 14:21:19.401820 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
2013-04-02 14:21:19.535731 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos active c 1..33) queue_proposal bl 19870402 bytes; ctx = 0x11ecff0
2013-04-02 14:21:19.535917 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) propose_queued 34 19870402 bytes
2013-04-02 14:21:19.535925 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) begin for 34 19870402 bytes
2013-04-02 14:21:19.754326 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..33) commit 34
2013-04-02 14:21:19.975747 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finishing proposal
2013-04-02 14:21:19.975799 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finish it (proposal = 0x790c0c0)
2013-04-02 14:21:19.975802 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal proposal took 0.439887 to finish
2013-04-02 14:21:19.975913 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active

>>
>> If you're willing, try reducing the paxos debug level to 0 and let us know
>> if those delays in the proposal go away.
> 
> When I started debugging why I couldn't start new filesystems, several
> days ago, I wasn't using any debugging.  It took me until last Friday
> to track the issue into paxos, which is when I turned on its debugging.
> 

With that list_proposals() debug output issue out of the way, I'm
still having trouble starting up a new filesystem -- still working
to try to isolate the source of the issue....

Thanks -- Jim



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Trouble with paxos service for large PG count
  2013-04-02 20:36         ` Jim Schutt
@ 2013-04-02 23:24           ` Joao Eduardo Luis
  0 siblings, 0 replies; 8+ messages in thread
From: Joao Eduardo Luis @ 2013-04-02 23:24 UTC (permalink / raw)
  To: Jim Schutt; +Cc: ceph-devel

On 04/02/2013 09:36 PM, Jim Schutt wrote:
> On 04/02/2013 01:16 PM, Jim Schutt wrote:
>> On 04/02/2013 12:28 PM, Joao Luis wrote:
>>> Right.  I'll push a patch to bump that sort of output to 30 when I get home.
>>
>> Thanks - but FWIW, I don't think it's the root cause of my
>> issue -- more below....
>
> OK, I see now that you're talking about the output from list_proposals().
> Indeed, when I cause list_proposals() in propose_queued() to only
> be called at paxos debug 30, the stall in finish_queued_proposal()
> mostly goes away.  Now, with 55392 PGs I'm seeing this:
>
> 2013-04-02 14:21:19.401820 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) propose_pending
> 2013-04-02 14:21:19.535731 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos active c 1..33) queue_proposal bl 19870402 bytes; ctx = 0x11ecff0
> 2013-04-02 14:21:19.535917 7ffff328d700  5 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) propose_queued 34 19870402 bytes
> 2013-04-02 14:21:19.535925 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos preparing update c 1..33) begin for 34 19870402 bytes
> 2013-04-02 14:21:19.754326 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos updating c 1..33) commit 34
> 2013-04-02 14:21:19.975747 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finishing proposal
> 2013-04-02 14:21:19.975799 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal finish it (proposal = 0x790c0c0)
> 2013-04-02 14:21:19.975802 7ffff328d700 10 mon.cs31@0(leader).paxos(paxos active c 1..34) finish_queued_proposal proposal took 0.439887 to finish
> 2013-04-02 14:21:19.975913 7ffff328d700 10 mon.cs31@0(leader).paxosservice(pgmap) _active
>
>>>
>>> If you're willing, try reducing the paxos debug level to 0 and let us know
>>> if those delays in the proposal go away.
>>
>> When I started debugging why I couldn't start new filesystems, several
>> days ago, I wasn't using any debugging.  It took me until last Friday
>> to track the issue into paxos, which is when I turned on its debugging.
>>
>
> With that list_proposals() debug output issue out of the way, I'm
> still having trouble starting up a new filesystem -- still working
> to try to isolate the source of the issue....
>
> Thanks -- Jim
>
>

Good to know!

With regard to your troubles, what appears to go wrong?

   -Joao

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-04-02 23:25 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-01 21:14 Trouble with paxos service for large PG count Jim Schutt
2013-04-02 13:41 ` Joao Eduardo Luis
2013-04-02 15:37   ` Joao Eduardo Luis
2013-04-02 15:42 ` Joao Eduardo Luis
2013-04-02 18:18   ` Jim Schutt
     [not found]     ` <CAGgG2xTdTOWvaPAK2FnxpzC1fsok50_HXXSgayEoD3J8SK6k3w@mail.gmail.com>
2013-04-02 19:16       ` Jim Schutt
2013-04-02 20:36         ` Jim Schutt
2013-04-02 23:24           ` Joao Eduardo Luis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.