From: "Łukasz Chrustek" <skidoo@tlen.pl>
To: Sage Weil <sage@newdream.net>
Cc: ceph-devel@vger.kernel.org
Subject: Re: Problem with query and any operation on PGs
Date: Tue, 23 May 2017 16:43:58 +0200
Message-ID: <648333186.20170523164358@tlen.pl>
In-Reply-To: <alpine.DEB.2.11.1705231415400.3646@piezo.novalocal>

Hi,

> On Tue, 23 May 2017, Łukasz Chrustek wrote:
>> Hello,
>> 
>> After a terrible outage caused by the failure of a 10Gbit switch, the ceph
>> cluster went to HEALTH_ERR (three whole storage servers went offline at the
>> same time and didn't come back quickly). After cluster recovery two PGs went
>> into the incomplete state; I can't query them and can't do anything with them

> The thing where you can't query a PG is because the OSD is throttling 
> incoming work and the throttle is exhausted (the PG can't do work so it
> isn't making progress).  A workaround for jewel is to restart the OSD 
> serving the PG and do the query quickly after that (probably in a loop so
> that you catch it after it starts up but before the throttle is 
> exhausted again).  (In luminous this is fixed.)

Thank you for the clarification.

> Once you have the query output ('ceph tell $pgid query') you'll be able to
> tell what is preventing the PG from peering.

Hm... what kind of loop do you suggest? When I run ceph tell $pgid query
it hangs and never returns to the console.
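
One way to read the suggested loop, as a minimal sketch: restart the OSD that serves the PG, then retry the query with a short timeout until one attempt gets through. The systemd unit name ceph-osd@<id>, the use of coreutils timeout, the output path, and the default attempt count and timeout are assumptions, not something stated in this thread. The restart has to be run on the host that carries the OSD.

```shell
# Sketch only: restart one OSD, then retry the PG query until it answers.
# Assumed (not from the thread): systemd unit ceph-osd@<id>, coreutils
# `timeout`, and the defaults of 30 attempts with 5 seconds per attempt.
query_pg_after_restart() {
    pgid="$1"
    osd_id="$2"
    attempts="${3:-30}"
    per_try="${4:-5}"
    out="${TMPDIR:-/tmp}/pg-${pgid}-query.json"

    # Run this on the host that carries the OSD.
    systemctl restart "ceph-osd@${osd_id}" || return 1

    i=1
    while [ "$i" -le "$attempts" ]; do
        # `timeout` ends a hung query so the loop can try again.
        if timeout "$per_try" ceph tell "$pgid" query > "$out" 2>/dev/null; then
            echo "query answered on attempt $i, output saved to $out"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done

    echo "no answer from pg $pgid after $attempts attempts" >&2
    return 1
}
```

For pg 1.165 with acting OSD 37 this would be called as: query_pg_after_restart 1.165 37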

> You can identify the osd(s) hosting the pg with 'ceph pg map $pgid'.

There is something strange here for 1.165: how is it possible that acting
is 37, which is not in the up set [84,38,48]?

ceph pg map 1.165
osdmap e114855 pg 1.165 (1.165) -> up [84,38,48] acting [37]

The second one looks OK, but it can't be queried either:

[root@cc1 ~]# ceph pg map 1.60
osdmap e114855 pg 1.60 (1.60) -> up [66,84,40] acting [66,69,40]
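
For reference, the up set is what CRUSH computes for the PG from the current osdmap, while the acting set is the set of OSDs responsible for it right now. The two can differ while a PG is remapped, which matches the down+remapped+peering state reported for both PGs. A small sketch that pulls both sets out of a ceph pg map line and prints the acting OSDs missing from the up set follows. The helper name acting_not_up is hypothetical, and the field layout is assumed to match the output above.

```shell
# Read `ceph pg map` output on stdin and print, per line, the OSDs that
# are in the acting set but not in the up set (space separated).
# Assumed field layout: ... up [a,b,c] acting [x,y]
acting_not_up() {
    awk '{
        up = ""; acting = ""
        for (i = 1; i < NF; i++) {
            if ($i == "up")     up = $(i + 1)
            if ($i == "acting") acting = $(i + 1)
        }
        # Strip the square brackets around each set.
        gsub(/\[|\]/, "", up)
        gsub(/\[|\]/, "", acting)

        split("", in_up)              # clear the lookup table
        n = split(up, u, ",")
        m = split(acting, a, ",")
        for (j = 1; j <= n; j++) in_up[u[j]] = 1

        out = ""
        for (j = 1; j <= m; j++)
            if (!(a[j] in in_up)) out = out (out == "" ? "" : " ") a[j]
        print out
    }'
}

# ceph pg map 1.165 | acting_not_up    # prints 37 for the line above
# ceph pg map 1.60  | acting_not_up    # prints 69 for the line above
```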


Do I need to restart all three OSDs at the same time?

Can you advise how to unblock access to one of the pools for this kind
of command:

[root@cc1 ~]# rbd ls volumes
^C

The strace for this is here: https://pastebin.com/hpbDg6gP - this time it
hangs on a futex call. Are these two cases (the pg query hang and this
rbd ls problem) connected to each other?
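
A possible connection, offered as an assumption to check rather than a diagnosis: rbd ls reads a single object named rbd_directory in the pool, while rbd info reads the image's own header objects, so listing can hang if rbd_directory happens to live in one of the stuck PGs. The sketch below asks ceph osd map where that object lands and compares the result with a list of PG ids. The helper names and the assumed shape of the ceph osd map output line are hypothetical.

```shell
# Print the PG id that holds a given object.
# Assumed output shape of `ceph osd map <pool> <object>`:
#   ... object '<name>' -> pg <hash> (<pgid>) -> up (...) acting (...)
pg_for_object() {
    ceph osd map "$1" "$2" | sed -n 's/.* -> pg [^ ]* (\([^)]*\)).*/\1/p'
}

# Usage: object_in_stuck_pg <pool> <object> <pgid>...
# Returns 0 if the object maps to one of the listed PG ids.
object_in_stuck_pg() {
    pool="$1"
    obj="$2"
    shift 2
    pg=$(pg_for_object "$pool" "$obj")
    for stuck in "$@"; do
        if [ "$pg" = "$stuck" ]; then
            echo "$obj is in stuck pg $pg"
            return 0
        fi
    done
    echo "$obj is in pg ${pg:-unknown}, which is not in the given list"
    return 1
}
```

For the two PGs named in this thread the check would be: object_in_stuck_pg volumes rbd_directory 1.165 1.60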

If I find a solution for this, you will make my day (and night :) ).


Regards
Lukasz

> HTH!
> sage


>> that would get the cluster working again. Here is the strace of
>> this command: https://pastebin.com/HpNFvR8Z. But... this cluster isn't entirely down:
>> 
>> [root@cc1 ~]# rbd ls management-vms
>> os-mongodb1
>> os-mongodb1-database
>> os-gitlab-root
>> os-mongodb1-database2
>> os-wiki-root
>> [root@cc1 ~]# rbd ls volumes
>> ^C
>> [root@cc1 ~]#
>> 
>> and the same for all mon hosts (I'm not pasting all three here):
>> 
>> [root@cc1 ~]# rbd -m 192.168.128.1 list management-vms
>> os-mongodb1
>> os-mongodb1-database
>> os-gitlab-root
>> os-mongodb1-database2
>> os-wiki-root
>> [root@cc1 ~]# rbd -m 192.168.128.1 list volumes
>> ^C
>> [root@cc1 ~]#
>> 
>> and for all other pools from the list, except (the most important one)
>> volumes, I can list images.
>> 
>> Funny thing, I can get rbd info for a particular image:
>> 
>> [root@cc1 ~]# rbd info volumes/volume-197602d7-40f9-40ad-b286-cdec688b1497
>> rbd image 'volume-197602d7-40f9-40ad-b286-cdec688b1497':
>>         size 20480 MB in 1280 objects
>>         order 24 (16384 kB objects)
>>         block_name_prefix: rbd_data.64a21a0a9acf52
>>         format: 2
>>         features: layering
>>         flags:
>>         parent: images/37bdf0ca-f1f3-46ce-95b9-c04bb9ac8a53@snap
>>         overlap: 3072 MB
>> 
>> but I can't list the whole content of the volumes pool.
>> 
>> [root@cc1 ~]# ceph osd pool ls
>> volumes
>> images
>> backups
>> volumes-ssd-intel-s3700
>> management-vms
>> .rgw.root
>> .rgw.control
>> .rgw
>> .rgw.gc
>> .log
>> .users.uid
>> .rgw.buckets.index
>> .users
>> .rgw.buckets.extra
>> .rgw.buckets
>> volumes-cached
>> cache-ssd
>> 
>> here is ceph osd tree:
>> 
>> ID  WEIGHT    TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
>>  -7  20.88388 root ssd-intel-s3700
>> -11   3.19995     host ssd-stor1
>>  56   0.79999         osd.56            up  1.00000          1.00000
>>  57   0.79999         osd.57            up  1.00000          1.00000
>>  58   0.79999         osd.58            up  1.00000          1.00000
>>  59   0.79999         osd.59            up  1.00000          1.00000
>>  -9   2.12999     host ssd-stor2
>>  60   0.70999         osd.60            up  1.00000          1.00000
>>  61   0.70999         osd.61            up  1.00000          1.00000
>>  62   0.70999         osd.62            up  1.00000          1.00000
>>  -8   2.12999     host ssd-stor3
>>  63   0.70999         osd.63            up  1.00000          1.00000
>>  64   0.70999         osd.64            up  1.00000          1.00000
>>  65   0.70999         osd.65            up  1.00000          1.00000
>> -10   4.19998     host ssd-stor4
>>  25   0.70000         osd.25            up  1.00000          1.00000
>>  26   0.70000         osd.26            up  1.00000          1.00000
>>  27   0.70000         osd.27            up  1.00000          1.00000
>>  28   0.70000         osd.28            up  1.00000          1.00000
>>  29   0.70000         osd.29            up  1.00000          1.00000
>>  24   0.70000         osd.24            up  1.00000          1.00000
>> -12   3.41199     host ssd-stor5
>>  73   0.85300         osd.73            up  1.00000          1.00000
>>  74   0.85300         osd.74            up  1.00000          1.00000
>>  75   0.85300         osd.75            up  1.00000          1.00000
>>  76   0.85300         osd.76            up  1.00000          1.00000
>> -13   3.41199     host ssd-stor6
>>  77   0.85300         osd.77            up  1.00000          1.00000
>>  78   0.85300         osd.78            up  1.00000          1.00000
>>  79   0.85300         osd.79            up  1.00000          1.00000
>>  80   0.85300         osd.80            up  1.00000          1.00000
>> -15   2.39999     host ssd-stor7
>>  90   0.79999         osd.90            up  1.00000          1.00000
>>  91   0.79999         osd.91            up  1.00000          1.00000
>>  92   0.79999         osd.92            up  1.00000          1.00000
>>  -1 167.69969 root default
>>  -2  33.99994     host stor1
>>   6   3.39999         osd.6           down        0          1.00000
>>   7   3.39999         osd.7             up  1.00000          1.00000
>>   8   3.39999         osd.8             up  1.00000          1.00000
>>   9   3.39999         osd.9             up  1.00000          1.00000
>>  10   3.39999         osd.10          down        0          1.00000
>>  11   3.39999         osd.11          down        0          1.00000
>>  69   3.39999         osd.69            up  1.00000          1.00000
>>  70   3.39999         osd.70            up  1.00000          1.00000
>>  71   3.39999         osd.71          down        0          1.00000
>>  81   3.39999         osd.81            up  1.00000          1.00000
>>  -3  20.99991     host stor2
>>  13   2.09999         osd.13            up  1.00000          1.00000
>>  12   2.09999         osd.12            up  1.00000          1.00000
>>  14   2.09999         osd.14            up  1.00000          1.00000
>>  15   2.09999         osd.15            up  1.00000          1.00000
>>  16   2.09999         osd.16            up  1.00000          1.00000
>>  17   2.09999         osd.17            up  1.00000          1.00000
>>  18   2.09999         osd.18          down        0          1.00000
>>  19   2.09999         osd.19            up  1.00000          1.00000
>>  20   2.09999         osd.20            up  1.00000          1.00000
>>  21   2.09999         osd.21            up  1.00000          1.00000
>>  -4  25.00000     host stor3
>>  30   2.50000         osd.30            up  1.00000          1.00000
>>  31   2.50000         osd.31            up  1.00000          1.00000
>>  32   2.50000         osd.32            up  1.00000          1.00000
>>  33   2.50000         osd.33          down        0          1.00000
>>  34   2.50000         osd.34            up  1.00000          1.00000
>>  35   2.50000         osd.35            up  1.00000          1.00000
>>  66   2.50000         osd.66            up  1.00000          1.00000
>>  67   2.50000         osd.67            up  1.00000          1.00000
>>  68   2.50000         osd.68            up  1.00000          1.00000
>>  72   2.50000         osd.72          down        0          1.00000
>>  -5  25.00000     host stor4
>>  44   2.50000         osd.44            up  1.00000          1.00000
>>  45   2.50000         osd.45            up  1.00000          1.00000
>>  46   2.50000         osd.46          down        0          1.00000
>>  47   2.50000         osd.47            up  1.00000          1.00000
>>   0   2.50000         osd.0             up  1.00000          1.00000
>>   1   2.50000         osd.1             up  1.00000          1.00000
>>   2   2.50000         osd.2             up  1.00000          1.00000
>>   3   2.50000         osd.3             up  1.00000          1.00000
>>   4   2.50000         osd.4             up  1.00000          1.00000
>>   5   2.50000         osd.5             up  1.00000          1.00000
>>  -6  14.19991     host stor5
>>  48   1.79999         osd.48            up  1.00000          1.00000
>>  49   1.59999         osd.49            up  1.00000          1.00000
>>  50   1.79999         osd.50            up  1.00000          1.00000
>>  51   1.79999         osd.51          down        0          1.00000
>>  52   1.79999         osd.52            up  1.00000          1.00000
>>  53   1.79999         osd.53            up  1.00000          1.00000
>>  54   1.79999         osd.54            up  1.00000          1.00000
>>  55   1.79999         osd.55            up  1.00000          1.00000
>> -14  14.39999     host stor6
>>  82   1.79999         osd.82            up  1.00000          1.00000
>>  83   1.79999         osd.83            up  1.00000          1.00000
>>  84   1.79999         osd.84            up  1.00000          1.00000
>>  85   1.79999         osd.85            up  1.00000          1.00000
>>  86   1.79999         osd.86            up  1.00000          1.00000
>>  87   1.79999         osd.87            up  1.00000          1.00000
>>  88   1.79999         osd.88            up  1.00000          1.00000
>>  89   1.79999         osd.89            up  1.00000          1.00000
>> -16  12.59999     host stor7
>>  93   1.79999         osd.93            up  1.00000          1.00000
>>  94   1.79999         osd.94            up  1.00000          1.00000
>>  95   1.79999         osd.95            up  1.00000          1.00000
>>  96   1.79999         osd.96            up  1.00000          1.00000
>>  97   1.79999         osd.97            up  1.00000          1.00000
>>  98   1.79999         osd.98            up  1.00000          1.00000
>>  99   1.79999         osd.99            up  1.00000          1.00000
>> -17  21.49995     host stor8
>>  22   1.59999         osd.22            up  1.00000          1.00000
>>  23   1.59999         osd.23            up  1.00000          1.00000
>>  36   2.09999         osd.36            up  1.00000          1.00000
>>  37   2.09999         osd.37            up  1.00000          1.00000
>>  38   2.50000         osd.38            up  1.00000          1.00000
>>  39   2.50000         osd.39            up  1.00000          1.00000
>>  40   2.50000         osd.40            up  1.00000          1.00000
>>  41   2.50000         osd.41          down        0          1.00000
>>  42   2.50000         osd.42            up  1.00000          1.00000
>>  43   1.59999         osd.43            up  1.00000          1.00000
>> [root@cc1 ~]#
>> 
>> and ceph health detail:
>> 
>> ceph health detail | grep down
>> HEALTH_WARN 23 pgs backfilling; 23 pgs degraded; 2 pgs down; 2 pgs
>> peering; 2 pgs stuck inactive; 25 pgs stuck unclean; 23 pgs
>> undersized; recovery 176211/14148564 objects degraded (1.245%);
>> recovery 238972/14148564 objects misplaced (1.689%); noout flag(s) set
>> pg 1.60 is stuck inactive since forever, current state
>> down+remapped+peering, last acting [66,69,40]
>> pg 1.165 is stuck inactive since forever, current state
>> down+remapped+peering, last acting [37]
>> pg 1.60 is stuck unclean since forever, current state
>> down+remapped+peering, last acting [66,69,40]
>> pg 1.165 is stuck unclean since forever, current state
>> down+remapped+peering, last acting [37]
>> pg 1.165 is down+remapped+peering, acting [37]
>> pg 1.60 is down+remapped+peering, acting [66,69,40]
>> 
>> 
>> The problematic PGs are 1.165 and 1.60.
>> 
>> Please advise how to unblock the volumes pool and/or get these two PGs
>> working. Over the last night and day, while we tried to solve this issue,
>> these PGs are 100% empty of data.
>> 
>> 
>> 
>> 
>> -- 
>> Regards,
>>  Łukasz Chrustek
>> 



-- 
Regards,
 Łukasz Chrustek

