Greetings,

Has anyone seen this, or have ideas on how to fix it?

    mdsmap e18399: 3/3/3 up {0=b=up:resolve,1=a=up:resolve(laggy or crashed),2=a=up:resolve(laggy or crashed)}

Notice that the 2nd and 3rd mds entries have the same letter ("a"). I'm not sure how that happened; I'm guessing a typo in my ceph.conf.

Taking mds.a down doesn't help; b just stays in resolve. mds.a is only running as a single instance, even though it shows as up twice. When I take an mds down and start it back up, it goes through a couple of states and then sticks at resolve.

I've tried the method listed here, but can't see any change:
http://www.sebastien-han.fr/blog/2012/07/04/remove-a-mds-server-from-a-ceph-cluster/

I also tried "ceph mds stop X" as mentioned here:
http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/2585
but see the results below:

    athompson@ceph01:~$ sudo ceph mds stop 0
    mds.0 not active (up:resolve)
    athompson@ceph01:~$ sudo ceph mds stop 1
    mds.1 not active (up:resolve)
    athompson@ceph01:~$ sudo ceph mds stop 2
    mds.2 not active (up:resolve)

I've attached the output of `ceph mds dump -o -`.

Currently, mds.b.log is full of reset/connect messages, followed by the point where I issued a `service ceph stop mds` a few minutes ago (see attached).

Thanks,
Andrew.

-- 
Andrew Thompson
http://aktzero.com/