From: Alex Elder <elder@ieee.org>
To: Ilya Dryomov <idryomov@redhat.com>, ceph-devel@vger.kernel.org
Cc: Alex Elder <elder@linaro.org>
Subject: Re: [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0
Date: Mon, 26 Jan 2015 21:14:44 -0600 [thread overview]
Message-ID: <54C702A4.9030900@ieee.org> (raw)
In-Reply-To: <1421757669-38796-2-git-send-email-idryomov@redhat.com>
On 01/20/2015 06:41 AM, Ilya Dryomov wrote:
> The comment for rbd_dev_parent_get() said
>
> * We must get the reference before checking for the overlap to
> * coordinate properly with zeroing the parent overlap in
> * rbd_dev_v2_parent_info() when an image gets flattened. We
> * drop it again if there is no overlap.
>
> but the "drop it again if there is no overlap" part was missing from
> the implementation. This lead to absurd parent_ref values for images
> with parent_overlap == 0, as parent_ref was incremented for each
> img_request and virtually never decremented.
You're right about this. If the image had a parent with no
overlap this would leak a reference to the parent image. The
code should have said:
counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
if (counter > 0) {
if (rbd_dev->parent_overlap)
return true;
atomic_dec(&rbd_dev->parent_ref);
} else if (counter < 0) {
rbd_warn(rbd_dev, "parent reference overflow");
}
> Fix this by leveraging the fact that refresh path calls
> rbd_dev_v2_parent_info() under header_rwsem and use it for read in
> rbd_dev_parent_get(), instead of messing around with atomics. Get rid
> of barriers in rbd_dev_v2_parent_info() while at it - I don't see what
> they'd pair with now and I suspect we are in a pretty miserable
> situation as far as proper locking goes regardless.
The point of the memory barrier was to ensure that when parent_overlap
gets zeroed, this code sees the zero rather than the old non-zero
value. The atomic_inc_return_safe() call has an implicit memory
barrier to match the smp_mb() call. It allowed the synchronization
to occur without the use of a lock.
We're trying to atomically determine whether an image request needs
to be marked as layered, to know how to handle ENOENT on parent reads.
If it is a write to an image with a parent having a non-zero overlap,
it's layered, otherwise we can treat it as a simple request.
I think in this particular case, this is just an optimization,
trying very hard to avoid having to do layered image handling
if the parent has become flattened. I think that even if it
got old information (suggesting non-zero overlap) things would
behave correctly, just less efficiently.
Using the semaphore adds a lock to this path and therefore
implements whatever barriers are being removed. I'm not
sure how often this is hit--maybe the optimization isn't
buying much after all.
I am getting a little rusty on some of details of what
precisely happens when a layered image gets flattened.
But I think this looks OK. Maybe just watch for small
(perhaps insignificant) performance regressions with
this change in place...
Reviewed-by: Alex Elder <elder@linaro.org>
> Cc: stable@vger.kernel.org # 3.11+
> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
> ---
> drivers/block/rbd.c | 20 ++++++--------------
> 1 file changed, 6 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 31fa00f0d707..2990a1c75159 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -2098,32 +2098,26 @@ static void rbd_dev_parent_put(struct rbd_device *rbd_dev)
> * If an image has a non-zero parent overlap, get a reference to its
> * parent.
> *
> - * We must get the reference before checking for the overlap to
> - * coordinate properly with zeroing the parent overlap in
> - * rbd_dev_v2_parent_info() when an image gets flattened. We
> - * drop it again if there is no overlap.
> - *
> * Returns true if the rbd device has a parent with a non-zero
> * overlap and a reference for it was successfully taken, or
> * false otherwise.
> */
> static bool rbd_dev_parent_get(struct rbd_device *rbd_dev)
> {
> - int counter;
> + int counter = 0;
>
> if (!rbd_dev->parent_spec)
> return false;
>
> - counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
> - if (counter > 0 && rbd_dev->parent_overlap)
> - return true;
> -
> - /* Image was flattened, but parent is not yet torn down */
> + down_read(&rbd_dev->header_rwsem);
> + if (rbd_dev->parent_overlap)
> + counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
> + up_read(&rbd_dev->header_rwsem);
>
> if (counter < 0)
> rbd_warn(rbd_dev, "parent reference overflow");
>
> - return false;
> + return counter > 0;
> }
>
> /*
> @@ -4238,7 +4232,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
> */
> if (rbd_dev->parent_overlap) {
> rbd_dev->parent_overlap = 0;
> - smp_mb();
> rbd_dev_parent_put(rbd_dev);
> pr_info("%s: clone image has been flattened\n",
> rbd_dev->disk->disk_name);
> @@ -4284,7 +4277,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
> * treat it specially.
> */
> rbd_dev->parent_overlap = overlap;
> - smp_mb();
> if (!overlap) {
>
> /* A null parent_spec indicates it's the initial probe */
>
next prev parent reply other threads:[~2015-01-27 3:14 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-20 12:41 [PATCH 0/3] rbd: parent_overlap == 0 fixes Ilya Dryomov
2015-01-20 12:41 ` [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0 Ilya Dryomov
2015-01-21 1:22 ` Josh Durgin
2015-01-27 3:14 ` Alex Elder [this message]
2015-01-20 12:41 ` [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally Ilya Dryomov
2015-01-21 1:24 ` Josh Durgin
2015-01-27 3:40 ` Alex Elder
2015-01-20 12:41 ` [PATCH 3/3] rbd: do not treat standalone as flatten Ilya Dryomov
2015-01-21 1:25 ` Josh Durgin
2015-01-27 3:55 ` Alex Elder
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54C702A4.9030900@ieee.org \
--to=elder@ieee.org \
--cc=ceph-devel@vger.kernel.org \
--cc=elder@linaro.org \
--cc=idryomov@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.