From: Mike Snitzer <snitzer@redhat.com>
To: Teng-Feng Yang <shinrairis@gmail.com>
Cc: device-mapper development <dm-devel@redhat.com>
Subject: Re: A bug in dm-persistent-data module which leads to dm-thin metadata corruption
Date: Fri, 7 Mar 2014 11:20:17 -0500
Message-ID: <20140307162016.GB31775@redhat.com>
In-Reply-To: <20140307151425.GB10613@debian>
On Fri, Mar 07 2014 at 10:14am -0500,
Joe Thornber <thornber@redhat.com> wrote:
> On Fri, Mar 07, 2014 at 12:00:07PM +0800, Teng-Feng Yang wrote:
> > Dear all,
> >
> > I experienced dm-thin metadata corruption a couple of days ago,
> > and I found that someone
> > reported similar corruption to dm-devel recently.
> > http://www.redhat.com/archives/dm-devel/2014-February/msg00157.html
> >
> > Since this issue leads to unrecoverable metadata corruption and
> > can be reproduced every time,
> > we added some traces, hoping to find the root cause. After
> > dumping the trace, I think we
> > may have found a bug in dm-persistent-data, and I will try my best to
> > explain it clearly below.
> >
> > When decreasing the reference count of a metadata block whose
> > reference count equals 3,
> > we call dm_btree_remove() to remove this entry from the B+tree
> > that keeps the reference count info
> > in the metadata device.
> >
> > The B+tree will try to rebalance the entries of the child nodes in each
> > node it traverses, and
> > the rebalance process consists of the following steps.
> >
> > (1) Find the corresponding children in the current node (shadow_current(s))
> > (2) Shadow the child blocks (issue BOP_INC)
> > (3) Redistribute keys among the children, and free children if necessary
> > (issue BOP_DEC)
> >
> > Since the update of a metadata block's reference count can be
> > recursive, we stash these
> > reference count update operations in smm->uncommitted and then process
> > them in a FILO fashion.
> > The problem is that step (3) can free a child that was created
> > in step (2), so the BOP_DEC issued
> > in step (3) will be carried out before the BOP_INC issued in step (2),
> > since these BOPs are processed in
> > FILO fashion. Once the BOP_DEC from step (3) tries to decrease the
> > reference count of the newly shadowed block,
> > it reports failure because the reference count equals 0 before
> > decreasing. It looks like we can solve this issue by processing
> > these BOPs in a FIFO fashion instead of FILO.
> >
> > Any comments would be appreciated.
>
> Dennis,
>
> That's a really impressive piece of analysis. I think you've found
> the issue.
>
> Could you try with this patch please and see if it fixes things?
Also, if you could share what you're using to (quickly?) reproduce
that'd be appreciated.
Thanks,
Mike
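
For readers following the thread, the ordering problem described in the
quoted analysis can be sketched with a toy model. This is only an
illustration and not the dm-persistent-data code: BOP_INC and BOP_DEC are
the operation names from the report, while apply_ops(), the dictionary of
reference counts, and block number 42 are hypothetical names chosen for the
example. It assumes, as the report does, that a decrement on a block whose
count is 0 is reported as a failure.

```python
from collections import deque

BOP_INC = "inc"
BOP_DEC = "dec"


def apply_ops(ops, order):
    """Apply stashed (op, block) pairs to a toy reference-count table.

    order is "filo" (last stashed, first applied) or "fifo".
    Returns (ref_counts, errors). Hypothetical helper, not a kernel API.
    """
    pending = deque(ops)
    ref_counts = {}
    errors = []
    while pending:
        # FILO takes the most recently stashed op, FIFO takes the oldest.
        op, block = pending.pop() if order == "filo" else pending.popleft()
        count = ref_counts.get(block, 0)
        if op == BOP_INC:
            ref_counts[block] = count + 1
        elif count == 0:
            # Decrementing a block that has no references yet is an error.
            errors.append(f"cannot decrement block {block}: ref count is 0")
        else:
            ref_counts[block] = count - 1
    return ref_counts, errors


# Step (2) shadows a child: BOP_INC on the new block (42 is made up).
# Step (3) frees that same child: BOP_DEC on the same block.
queued = [(BOP_INC, 42), (BOP_DEC, 42)]

filo_counts, filo_errors = apply_ops(queued, "filo")
fifo_counts, fifo_errors = apply_ops(queued, "fifo")

print("FILO:", filo_counts, filo_errors)
print("FIFO:", fifo_counts, fifo_errors)
```

In this model the FILO run hits the decrement first and records a failure,
while the FIFO run applies the increment first and ends with a count of 0
and no errors, which matches the behaviour the report describes.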