From: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
To: Dan Mick <dan.mick@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: librbd: error finding header
Date: Tue, 10 Jul 2012 18:55:25 +0930
Message-ID: <4FFBF505.4050400@bashkirtsev.com>
In-Reply-To: <4FFBB74F.2050702@inktank.com>

On 10/07/12 14:32, Dan Mick wrote:
>
>
> On 07/09/2012 08:27 PM, Vladimir Bashkirtsev wrote:
>> On 10/07/12 03:17, Dan Mick wrote:
>>> Well, it's not so much those; those are the objects that hold data
>>> blocks.  You're more interested in the objects whose names end in
>>> '.rbd'.  These are the header objects, one per image, and are
>>> interpreted by rbd info, but I'm concerned that one of them may not
>>> exist.
>> Right on the ball: .rbd for image concerned just does not exist. So how
>> can we recover from this? And why it has disappeared in first place? (I
>> guess latter may be related to some sort of bug)
>
> Don't know why it might have disappeared.  Recovery: no easy way. It's 
> possible that image header could be reconstructed, but there aren't 
> any tools written to do it (the header format is pretty uncomplicated).
Well... Then I need to either rebuild it manually or clean up the
image's remains to free up space. Given that the rbd tool refuses to do
anything without the .rbd object, the cleanup appears to be manual as well.

I have run rbd info on the rest of the images and excluded the rb.*
objects belonging to good images. Now I know the broken image has the
prefix rb.0.1, and technically I can clean out the objects belonging to
it. But rbd ls seems to pull the list of rbd images from somewhere: the
broken image must be removed from there as well. I am not sure where
that list is stored.
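
A minimal sketch of the exclusion step described above, assuming the
object names of the pool have already been listed into a Python list and
that data objects are named "<prefix>.<index>" with a prefix starting
with "rb." (as with rb.0.1 here). The function name and its inputs are
hypothetical, not part of any ceph tool:

```python
def orphan_prefixes(object_names, good_prefixes):
    """Count data objects whose block-name prefix belongs to no healthy image.

    object_names:  iterable of object names found in the pool (assumed input)
    good_prefixes: prefixes reported by `rbd info` for the healthy images
    """
    good = set(good_prefixes)
    orphans = {}
    for name in object_names:
        # Skip anything that is not a data object (e.g. "*.rbd" headers).
        if not name.startswith("rb."):
            continue
        # Assumed naming: everything before the last dot is the prefix.
        prefix, _, _index = name.rpartition(".")
        if prefix and prefix not in good:
            orphans[prefix] = orphans.get(prefix, 0) + 1
    return orphans


if __name__ == "__main__":
    names = [
        "rb.0.0.000000000000",
        "rb.0.1.000000000000",
        "rb.0.1.000000000001",
        "good.rbd",
    ]
    print(orphan_prefixes(names, ["rb.0.0"]))
```

Any prefix that shows up in the result has data objects but no header
that rbd info could read.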

Alternatively, how hard would it be to throw together a quick tool which
picks up these objects and reconstructs the .rbd header? Something tells
me that it should be relatively straightforward.

I have no pressing need to recover this image: I have pulled the backup
and it is now on its merry way. But for the future's sake we need to get
this one resolved: another day someone else will hit it.

----------------------------------

30 minutes later:

I have looked at the structure of rbd_obj_header_ondisk and it is really
quite simple. The image has no snapshots, which makes everything
straightforward. Order is the default 22; size is, well, unknown, but
finding the object with the highest index provides some guidance. I got
the rbd header from another image, changed the name and size using
hexedit, put it back, and voilà: the image is back and running. Not
quite sure about integrity, but at least now it will allow removing the
image cleanly.
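
The size guess can be sketched as follows. This assumes each data object
covers 2^order bytes and that the part of the object name after the last
dot is the object index in hexadecimal; both the function name and the
naming assumption are mine. Because objects only exist for regions that
were written, the result is a lower bound on the image size, not the
exact original value:

```python
def min_image_size(object_names, prefix, order=22):
    """Lower bound on image size in bytes from the highest object index seen.

    Assumes names of the form "<prefix>.<hex index>" and 2**order bytes
    per object (order 22 means 4 MiB objects).
    """
    highest = -1
    for name in object_names:
        head, _, suffix = name.rpartition(".")
        if head != prefix:
            continue
        try:
            index = int(suffix, 16)
        except ValueError:
            # Not a data object of this image; ignore it.
            continue
        highest = max(highest, index)
    # Indices start at 0, so (highest + 1) objects must fit in the image.
    return (highest + 1) << order


if __name__ == "__main__":
    names = ["rb.0.1.000000000000", "rb.0.1.0000000000ff"]
    print(min_image_size(names, "rb.0.1"))
```

With a highest index of 0xff and order 22 this gives 256 objects of
4 MiB each, so the size written into the header should be at least
1 GiB.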

>
> It certainly shouldn't have just happened.  Any idea what operations 
> might have been in progress when it did?
Obviously not. I have been running ceph for the last few months, trying
to get it off track, and until now had no major issues. The VM concerned
was running while I upgraded from 0.47.3 to 0.48. After that I asked the
list whether it is safe to live migrate a VM with rbd cache on. Josh
confirmed that it is safe to do so. So I live migrated the VM to another
host. No dramas. Everything still ran. Then I updated the hosts (a
rolling update again: migrating VMs away while rebooting hosts). I have
around 10 VMs (including heavily loaded ones) and all of them migrated
around without any issues. Then suddenly this VM refused to migrate.
While typing this I remembered that there was one issue between the
upgrade of ceph and the failure to migrate: one of the pgs turned
inconsistent. pg repair fixed it and I immediately forgot about it.
Could that be the reason why this .rbd disappeared? (I went to check
the logs, but logrotate had already removed them.)
