* [PATCH 0/5] rbd: prep work for flattening images
@ 2013-05-11 17:42 Alex Elder
2013-05-11 17:43 ` [PATCH 1/5] rbd: get parent info on refresh Alex Elder
` (5 more replies)
0 siblings, 6 replies; 11+ messages in thread
From: Alex Elder @ 2013-05-11 17:42 UTC (permalink / raw)
To: ceph-devel
This series of patches prepares some parent request
code to make handling the event of a mapped clone
image getting flattened easier.
-Alex
[PATCH 1/5] rbd: get parent info on refresh
[PATCH 2/5] rbd: don't release write request until necessary
[PATCH 3/5] rbd: define rbd_dev_unparent()
[PATCH 4/5] rbd: define parent image request routines
[PATCH 5/5] rbd: reference count parent requests
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/5] rbd: get parent info on refresh
2013-05-11 17:42 [PATCH 0/5] rbd: prep work for flattening images Alex Elder
@ 2013-05-11 17:43 ` Alex Elder
2013-05-11 20:59 ` Josh Durgin
2013-05-11 17:44 ` [PATCH 2/5] rbd: don't release write request until necessary Alex Elder
` (4 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Alex Elder @ 2013-05-11 17:43 UTC (permalink / raw)
To: ceph-devel
Get parent info for format 2 images on every refresh (rather than
just during the initial probe). This will be needed to detect the
disappearance of the parent image in the event a mapped image
becomes unlayered (i.e., flattened).
Switch to using a non-zero parent overlap value rather than the
existence of a parent (a non-null parent_spec pointer) to determine
whether to mark a request layered. It will soon be possible for
a layered image to become unlayered while a request is in flight.
This means that the layered flag for an image request indicates that
there was a non-zero parent overlap at the time the image request
was created. The parent overlap can change thereafter, which may
lead to special handling at request submission or completion time.
Flesh out and fix the rbd_dev_v2_header_info() error handling path.
This and the next several pages are related to:
http://tracker.ceph.com/issues/3763
Signed-off-by: Alex Elder <elder@inktank.com>
---
drivers/block/rbd.c | 72
++++++++++++++++++++++++++++++++-------------------
1 file changed, 45 insertions(+), 27 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 3a8135f..06d49b5 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1871,7 +1871,7 @@ static struct rbd_img_request *rbd_img_request_create(
}
if (child_request)
img_request_child_set(img_request);
- if (rbd_dev->parent_spec)
+ if (rbd_dev->parent_overlap)
img_request_layered_set(img_request);
spin_lock_init(&img_request->completion_lock);
img_request->next_completion = 0;
@@ -4021,20 +4021,60 @@ static int rbd_dev_v2_header_info(struct
rbd_device *rbd_dev)
if (first_time) {
ret = rbd_dev_v2_header_onetime(rbd_dev);
if (ret)
- goto out;
+ goto out_err;
+ }
+
+ /*
+ * If the image supports layering, get the parent info. We
+ * need to probe the first time regardless. Thereafter we
+ * only need to if there's a parent, to see if it has
+ * disappeared due to the mapped image getting flattened.
+ */
+ if (rbd_dev->header.features & RBD_FEATURE_LAYERING &&
+ (first_time || rbd_dev->parent_spec)) {
+ bool warn;
+
+ ret = rbd_dev_v2_parent_info(rbd_dev);
+ if (ret)
+ goto out_err;
+
+ /*
+ * Print a warning if this is the initial probe and
+ * the image has a parent. Don't print it if the
+ * image now being probed is itself a parent. We
+ * can tell at this point because we won't know its
+ * pool name yet (just its pool id).
+ */
+ warn = rbd_dev->parent_spec && rbd_dev->spec->pool_name;
+ if (first_time && warn)
+ rbd_warn(rbd_dev, "WARNING: kernel layering "
+ "is EXPERIMENTAL!");
}
ret = rbd_dev_v2_image_size(rbd_dev);
if (ret)
- goto out;
+ goto out_err;
+
if (rbd_dev->spec->snap_id == CEPH_NOSNAP)
if (rbd_dev->mapping.size != rbd_dev->header.image_size)
rbd_dev->mapping.size = rbd_dev->header.image_size;
ret = rbd_dev_v2_snap_context(rbd_dev);
dout("rbd_dev_v2_snap_context returned %d\n", ret);
- if (ret)
+ if (!ret)
goto out;
+out_err:
+ rbd_dev->mapping.size = 0;
+ rbd_dev->header.image_size = 0;
+ rbd_dev->header.obj_order = 0;
+ rbd_dev->parent_overlap = 0;
+ rbd_spec_put(rbd_dev->parent_spec);
+ rbd_dev->parent_spec = NULL;
+ rbd_dev->header.stripe_count = 0;
+ rbd_dev->header.stripe_unit = 0;
+ rbd_dev->header.features = 0;
+ kfree(rbd_dev->header.object_prefix);
+ rbd_dev->header.object_prefix = NULL;
out:
up_write(&rbd_dev->header_rwsem);
@@ -4488,24 +4528,6 @@ static int rbd_dev_v2_header_onetime(struct
rbd_device *rbd_dev)
if (ret)
goto out_err;
- /* If the image supports layering, get the parent info */
-
- if (rbd_dev->header.features & RBD_FEATURE_LAYERING) {
- ret = rbd_dev_v2_parent_info(rbd_dev);
- if (ret)
- goto out_err;
- /*
- * Print a warning if this image has a parent.
- * Don't print it if the image now being probed
- * is itself a parent. We can tell at this point
- * because we won't know its pool name yet (just its
- * pool id).
- */
- if (rbd_dev->parent_spec && rbd_dev->spec->pool_name)
- rbd_warn(rbd_dev, "WARNING: kernel layering "
- "is EXPERIMENTAL!");
- }
-
/* If the image supports fancy striping, get its parameters */
if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) {
@@ -4517,11 +4539,7 @@ static int rbd_dev_v2_header_onetime(struct
rbd_device *rbd_dev)
return 0;
out_err:
- rbd_dev->parent_overlap = 0;
- rbd_spec_put(rbd_dev->parent_spec);
- rbd_dev->parent_spec = NULL;
- kfree(rbd_dev->header_name);
- rbd_dev->header_name = NULL;
+ rbd_dev->header.features = 0;
kfree(rbd_dev->header.object_prefix);
rbd_dev->header.object_prefix = NULL;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/5] rbd: don't release write request until necessary
2013-05-11 17:42 [PATCH 0/5] rbd: prep work for flattening images Alex Elder
2013-05-11 17:43 ` [PATCH 1/5] rbd: get parent info on refresh Alex Elder
@ 2013-05-11 17:44 ` Alex Elder
2013-05-11 17:44 ` [PATCH 3/5] rbd: define rbd_dev_unparent() Alex Elder
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Alex Elder @ 2013-05-11 17:44 UTC (permalink / raw)
To: ceph-devel
Previously when a layered write was going to involve a copyup
request, the original osd request was released before submitting the
parent full-object read. The osd request for the copyup would then
be allocated in rbd_img_obj_parent_read_full_callback().
Shortly we will be handling the event of mapped layered images
getting flattened, and when that occurs we need to resubmit the
original request. We therefore don't want to release the osd
request until we really konw we're going to replace it--in the
callback function.
Signed-off-by: Alex Elder <elder@inktank.com>
---
drivers/block/rbd.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 06d49b5..ac3f4e7 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2187,13 +2187,17 @@ rbd_img_obj_parent_read_full_callback(struct
rbd_img_request *img_request)
if (result)
goto out_err;
- /* Allocate the new copyup osd request for the original request */
-
+ /*
+ * The original osd request is of no use to use any more.
+ * We need a new one that can hold the two ops in a copyup
+ * request. Allocate the new copyup osd request for the
+ * original request, and release the old one.
+ */
result = -ENOMEM;
- rbd_assert(!orig_request->osd_req);
osd_req = rbd_osd_req_create_copyup(orig_request);
if (!osd_req)
goto out_err;
+ rbd_osd_req_destroy(orig_request->osd_req);
orig_request->osd_req = osd_req;
orig_request->copyup_pages = pages;
@@ -2269,15 +2273,6 @@ static int rbd_img_obj_parent_read_full(struct
rbd_obj_request *obj_request)
rbd_assert(rbd_dev->parent != NULL);
/*
- * First things first. The original osd request is of no
- * use to use any more, we'll need a new one that can hold
- * the two ops in a copyup request. We'll get that later,
- * but for now we can release the old one.
- */
- rbd_osd_req_destroy(obj_request->osd_req);
- obj_request->osd_req = NULL;
-
- /*
* Determine the byte range covered by the object in the
* child image to which the original request was to be sent.
*/
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/5] rbd: define rbd_dev_unparent()
2013-05-11 17:42 [PATCH 0/5] rbd: prep work for flattening images Alex Elder
2013-05-11 17:43 ` [PATCH 1/5] rbd: get parent info on refresh Alex Elder
2013-05-11 17:44 ` [PATCH 2/5] rbd: don't release write request until necessary Alex Elder
@ 2013-05-11 17:44 ` Alex Elder
2013-05-11 17:44 ` [PATCH 4/5] rbd: define parent image request routines Alex Elder
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Alex Elder @ 2013-05-11 17:44 UTC (permalink / raw)
To: ceph-devel
Define rbd_dev_unparent() to encapsulate cleaning up parent data
structures from a layered rbd image.
Signed-off-by: Alex Elder <elder@inktank.com>
---
drivers/block/rbd.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index ac3f4e7..607c536 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1835,6 +1835,17 @@ static void rbd_obj_request_destroy(struct kref
*kref)
kmem_cache_free(rbd_obj_request_cache, obj_request);
}
+/* It's OK to call this for a device with no parent */
+
+static void rbd_spec_put(struct rbd_spec *spec);
+static void rbd_dev_unparent(struct rbd_device *rbd_dev)
+{
+ rbd_dev_remove_parent(rbd_dev);
+ rbd_spec_put(rbd_dev->parent_spec);
+ rbd_dev->parent_spec = NULL;
+ rbd_dev->parent_overlap = 0;
+}
+
/*
* Caller is responsible for filling in the list of object requests
* that comprises the image request, and the Linux request pointer
@@ -4062,9 +4073,7 @@ out_err:
rbd_dev->mapping.size = 0;
rbd_dev->header.image_size = 0;
rbd_dev->header.obj_order = 0;
- rbd_dev->parent_overlap = 0;
- rbd_spec_put(rbd_dev->parent_spec);
- rbd_dev->parent_spec = NULL;
+ rbd_dev_unparent(rbd_dev);
rbd_dev->header.stripe_count = 0;
rbd_dev->header.stripe_unit = 0;
rbd_dev->header.features = 0;
@@ -4492,10 +4501,7 @@ static void rbd_dev_unprobe(struct rbd_device
*rbd_dev)
{
struct rbd_image_header *header;
- rbd_dev_remove_parent(rbd_dev);
- rbd_spec_put(rbd_dev->parent_spec);
- rbd_dev->parent_spec = NULL;
- rbd_dev->parent_overlap = 0;
+ rbd_dev_unparent(rbd_dev);
/* Free dynamic fields from the header, then zero it out */
@@ -4571,7 +4577,7 @@ static int rbd_dev_probe_parent(struct rbd_device
*rbd_dev)
return 0;
out_err:
if (parent) {
- rbd_spec_put(rbd_dev->parent_spec);
+ rbd_dev_unparent(rbd_dev);
kfree(rbd_dev->header_name);
rbd_dev_destroy(parent);
} else {
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/5] rbd: define parent image request routines
2013-05-11 17:42 [PATCH 0/5] rbd: prep work for flattening images Alex Elder
` (2 preceding siblings ...)
2013-05-11 17:44 ` [PATCH 3/5] rbd: define rbd_dev_unparent() Alex Elder
@ 2013-05-11 17:44 ` Alex Elder
2013-05-11 17:44 ` [PATCH 5/5] rbd: reference count parent requests Alex Elder
2013-05-11 21:08 ` [PATCH 0/5] rbd: prep work for flattening images Josh Durgin
5 siblings, 0 replies; 11+ messages in thread
From: Alex Elder @ 2013-05-11 17:44 UTC (permalink / raw)
To: ceph-devel
Define rbd_parent_request_create() and rbd_parent_request_destroy()
to handle the creation of parent image requests submitted for
layered image objects. For simplicity, let rbd_img_request_put()
handle dropping the reference to any image request (parent or not),
and call whichever destructor is appropriate on the last put.
Signed-off-by: Alex Elder <elder@inktank.com>
---
drivers/block/rbd.c | 78
++++++++++++++++++++++++++++++++++++---------------
1 file changed, 55 insertions(+), 23 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 607c536..72b53cd 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1357,13 +1357,18 @@ static void rbd_obj_request_put(struct
rbd_obj_request *obj_request)
kref_put(&obj_request->kref, rbd_obj_request_destroy);
}
+static bool img_request_child_test(struct rbd_img_request *img_request);
+static void rbd_parent_request_destroy(struct kref *kref);
static void rbd_img_request_destroy(struct kref *kref);
static void rbd_img_request_put(struct rbd_img_request *img_request)
{
rbd_assert(img_request != NULL);
dout("%s: img %p (was %d)\n", __func__, img_request,
atomic_read(&img_request->kref.refcount));
- kref_put(&img_request->kref, rbd_img_request_destroy);
+ if (img_request_child_test(img_request))
+ kref_put(&img_request->kref, rbd_parent_request_destroy);
+ else
+ kref_put(&img_request->kref, rbd_img_request_destroy);
}
static inline void rbd_img_obj_request_add(struct rbd_img_request
*img_request,
@@ -1480,6 +1485,12 @@ static void img_request_child_set(struct
rbd_img_request *img_request)
smp_mb();
}
+static void img_request_child_clear(struct rbd_img_request *img_request)
+{
+ clear_bit(IMG_REQ_CHILD, &img_request->flags);
+ smp_mb();
+}
+
static bool img_request_child_test(struct rbd_img_request *img_request)
{
smp_mb();
@@ -1854,8 +1865,7 @@ static void rbd_dev_unparent(struct rbd_device
*rbd_dev)
static struct rbd_img_request *rbd_img_request_create(
struct rbd_device *rbd_dev,
u64 offset, u64 length,
- bool write_request,
- bool child_request)
+ bool write_request)
{
struct rbd_img_request *img_request;
@@ -1880,8 +1890,6 @@ static struct rbd_img_request *rbd_img_request_create(
} else {
img_request->snap_id = rbd_dev->spec->snap_id;
}
- if (child_request)
- img_request_child_set(img_request);
if (rbd_dev->parent_overlap)
img_request_layered_set(img_request);
spin_lock_init(&img_request->completion_lock);
@@ -1916,12 +1924,46 @@ static void rbd_img_request_destroy(struct kref
*kref)
if (img_request_write_test(img_request))
ceph_put_snap_context(img_request->snapc);
- if (img_request_child_test(img_request))
- rbd_obj_request_put(img_request->obj_request);
-
kmem_cache_free(rbd_img_request_cache, img_request);
}
+static struct rbd_img_request *rbd_parent_request_create(
+ struct rbd_obj_request *obj_request,
+ u64 img_offset, u64 length)
+{
+ struct rbd_img_request *parent_request;
+ struct rbd_device *rbd_dev;
+
+ rbd_assert(obj_request->img_request);
+ rbd_dev = obj_request->img_request->rbd_dev;
+
+ parent_request = rbd_img_request_create(rbd_dev->parent,
+ img_offset, length, false);
+ if (!parent_request)
+ return NULL;
+
+ img_request_child_set(parent_request);
+ rbd_obj_request_get(obj_request);
+ parent_request->obj_request = obj_request;
+
+ return parent_request;
+}
+
+static void rbd_parent_request_destroy(struct kref *kref)
+{
+ struct rbd_img_request *parent_request;
+ struct rbd_obj_request *orig_request;
+
+ parent_request = container_of(kref, struct rbd_img_request, kref);
+ orig_request = parent_request->obj_request;
+
+ parent_request->obj_request = NULL;
+ rbd_obj_request_put(orig_request);
+ img_request_child_clear(parent_request);
+
+ rbd_img_request_destroy(kref);
+}
+
static bool rbd_img_obj_end_request(struct rbd_obj_request *obj_request)
{
struct rbd_img_request *img_request;
@@ -2313,13 +2355,10 @@ static int rbd_img_obj_parent_read_full(struct
rbd_obj_request *obj_request)
}
result = -ENOMEM;
- parent_request = rbd_img_request_create(rbd_dev->parent,
- img_offset, length,
- false, true);
+ parent_request = rbd_parent_request_create(obj_request,
+ img_offset, length);
if (!parent_request)
goto out_err;
- rbd_obj_request_get(obj_request);
- parent_request->obj_request = obj_request;
result = rbd_img_request_fill(parent_request, OBJ_REQUEST_PAGES, pages);
if (result)
@@ -2570,7 +2609,6 @@ out:
static void rbd_img_parent_read(struct rbd_obj_request *obj_request)
{
- struct rbd_device *rbd_dev;
struct rbd_img_request *img_request;
int result;
@@ -2579,20 +2617,14 @@ static void rbd_img_parent_read(struct
rbd_obj_request *obj_request)
rbd_assert(obj_request->result == (s32) -ENOENT);
rbd_assert(obj_request_type_valid(obj_request->type));
- rbd_dev = obj_request->img_request->rbd_dev;
- rbd_assert(rbd_dev->parent != NULL);
/* rbd_read_finish(obj_request, obj_request->length); */
- img_request = rbd_img_request_create(rbd_dev->parent,
+ img_request = rbd_parent_request_create(obj_request,
obj_request->img_offset,
- obj_request->length,
- false, true);
+ obj_request->length);
result = -ENOMEM;
if (!img_request)
goto out_err;
- rbd_obj_request_get(obj_request);
- img_request->obj_request = obj_request;
-
if (obj_request->type == OBJ_REQUEST_BIO)
result = rbd_img_request_fill(img_request, OBJ_REQUEST_BIO,
obj_request->bio_list);
@@ -2903,7 +2935,7 @@ static void rbd_request_fn(struct request_queue *q)
result = -ENOMEM;
img_request = rbd_img_request_create(rbd_dev, offset, length,
- write_request, false);
+ write_request);
if (!img_request)
goto end_request;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 5/5] rbd: reference count parent requests
2013-05-11 17:42 [PATCH 0/5] rbd: prep work for flattening images Alex Elder
` (3 preceding siblings ...)
2013-05-11 17:44 ` [PATCH 4/5] rbd: define parent image request routines Alex Elder
@ 2013-05-11 17:44 ` Alex Elder
2013-05-11 21:08 ` [PATCH 0/5] rbd: prep work for flattening images Josh Durgin
5 siblings, 0 replies; 11+ messages in thread
From: Alex Elder @ 2013-05-11 17:44 UTC (permalink / raw)
To: ceph-devel
Keep a reference count for uses of the parent information for an rbd
device.
An initial reference is set in rbd_img_request_create() if the
target image has a parent (with non-zero overlap). Each image
request for an image with a non-zero parent overlap gets another
reference when it's created, and that reference is dropped when the
request is destroyed.
The initial reference is dropped when the image gets torn down.
Signed-off-by: Alex Elder <elder@inktank.com>
---
drivers/block/rbd.c | 104
++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 102 insertions(+), 2 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 72b53cd..71705f3 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -55,6 +55,39 @@
#define SECTOR_SHIFT 9
#define SECTOR_SIZE (1ULL << SECTOR_SHIFT)
+/*
+ * Increment the given counter and return its updated value.
+ * If the counter is already 0 it will not be incremented.
+ * If the counter is already at its maximum value returns
+ * -EINVAL without updating it.
+ */
+static int atomic_inc_return_safe(atomic_t *v)
+{
+ unsigned int counter;
+
+ counter = (unsigned int)__atomic_add_unless(v, 1, 0);
+ if (counter <= (unsigned int)INT_MAX)
+ return (int)counter;
+
+ atomic_dec(v);
+
+ return -EINVAL;
+}
+
+/* Decrement the counter. Return the resulting value, or -EINVAL */
+static int atomic_dec_return_safe(atomic_t *v)
+{
+ int counter;
+
+ counter = atomic_dec_return(v);
+ if (counter >= 0)
+ return counter;
+
+ atomic_inc(v);
+
+ return -EINVAL;
+}
+
#define RBD_DRV_NAME "rbd"
#define RBD_DRV_NAME_LONG "rbd (rados block device)"
@@ -310,6 +343,7 @@ struct rbd_device {
struct rbd_spec *parent_spec;
u64 parent_overlap;
+ atomic_t parent_ref;
struct rbd_device *parent;
/* protects updating the header */
@@ -359,6 +393,7 @@ static ssize_t rbd_add(struct bus_type *bus, const
char *buf,
static ssize_t rbd_remove(struct bus_type *bus, const char *buf,
size_t count);
static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping);
+static void rbd_spec_put(struct rbd_spec *spec);
static struct bus_attribute rbd_bus_attrs[] = {
__ATTR(add, S_IWUSR, NULL, rbd_add),
@@ -1503,6 +1538,12 @@ static void img_request_layered_set(struct
rbd_img_request *img_request)
smp_mb();
}
+static void img_request_layered_clear(struct rbd_img_request *img_request)
+{
+ clear_bit(IMG_REQ_LAYERED, &img_request->flags);
+ smp_mb();
+}
+
static bool img_request_layered_test(struct rbd_img_request *img_request)
{
smp_mb();
@@ -1858,6 +1899,58 @@ static void rbd_dev_unparent(struct rbd_device
*rbd_dev)
}
/*
+ * Parent image reference counting is used to determine when an
+ * image's parent fields can be safely torn down--after there are no
+ * more in-flight requests to the parent image. When the last
+ * reference is dropped, cleaning them up is safe.
+ */
+static void rbd_dev_parent_put(struct rbd_device *rbd_dev)
+{
+ int counter;
+
+ if (!rbd_dev->parent_spec)
+ return;
+
+ counter = atomic_dec_return_safe(&rbd_dev->parent_ref);
+ if (counter > 0)
+ return;
+
+ /* Last reference; clean up parent data structures */
+
+ if (!counter)
+ rbd_dev_unparent(rbd_dev);
+ else
+ rbd_warn(rbd_dev, "parent reference underflow\n");
+}
+
+/*
+ * If an image has a non-zero parent overlap, get a reference to its
+ * parent.
+ *
+ * Returns true if the rbd device has a parent with a non-zero
+ * overlap and a reference for it was successfully taken, or
+ * false otherwise.
+ */
+static bool rbd_dev_parent_get(struct rbd_device *rbd_dev)
+{
+ int counter;
+
+ if (!rbd_dev->parent_spec)
+ return false;
+
+ counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
+ if (counter > 0 && rbd_dev->parent_overlap)
+ return true;
+
+ /* Image was flattened, but parent is not yet torn down */
+
+ if (counter < 0)
+ rbd_warn(rbd_dev, "parent reference overflow\n");
+
+ return false;
+}
+
+/*
* Caller is responsible for filling in the list of object requests
* that comprises the image request, and the Linux request pointer
* (if there is one).
@@ -1890,7 +1983,7 @@ static struct rbd_img_request *rbd_img_request_create(
} else {
img_request->snap_id = rbd_dev->spec->snap_id;
}
- if (rbd_dev->parent_overlap)
+ if (rbd_dev_parent_get(rbd_dev))
img_request_layered_set(img_request);
spin_lock_init(&img_request->completion_lock);
img_request->next_completion = 0;
@@ -1921,6 +2014,11 @@ static void rbd_img_request_destroy(struct kref
*kref)
rbd_img_obj_request_del(img_request, obj_request);
rbd_assert(img_request->obj_request_count == 0);
+ if (img_request_layered_test(img_request)) {
+ img_request_layered_clear(img_request);
+ rbd_dev_parent_put(img_request->rbd_dev);
+ }
+
if (img_request_write_test(img_request))
ceph_put_snap_context(img_request->snapc);
@@ -3492,6 +3590,7 @@ static struct rbd_device *rbd_dev_create(struct
rbd_client *rbdc,
spin_lock_init(&rbd_dev->lock);
rbd_dev->flags = 0;
+ atomic_set(&rbd_dev->parent_ref, 0);
INIT_LIST_HEAD(&rbd_dev->node);
init_rwsem(&rbd_dev->header_rwsem);
@@ -4533,7 +4632,7 @@ static void rbd_dev_unprobe(struct rbd_device
*rbd_dev)
{
struct rbd_image_header *header;
- rbd_dev_unparent(rbd_dev);
+ rbd_dev_parent_put(rbd_dev);
/* Free dynamic fields from the header, then zero it out */
@@ -4605,6 +4704,7 @@ static int rbd_dev_probe_parent(struct rbd_device
*rbd_dev)
if (ret < 0)
goto out_err;
rbd_dev->parent = parent;
+ atomic_set(&rbd_dev->parent_ref, 1);
return 0;
out_err:
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/5] rbd: get parent info on refresh
2013-05-11 17:43 ` [PATCH 1/5] rbd: get parent info on refresh Alex Elder
@ 2013-05-11 20:59 ` Josh Durgin
2013-05-11 21:52 ` Alex Elder
2013-05-13 17:58 ` [PATCH 1/5, v2] " Alex Elder
0 siblings, 2 replies; 11+ messages in thread
From: Josh Durgin @ 2013-05-11 20:59 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
On 05/11/2013 10:43 AM, Alex Elder wrote:
> Get parent info for format 2 images on every refresh (rather than
> just during the initial probe). This will be needed to detect the
> disappearance of the parent image in the event a mapped image
> becomes unlayered (i.e., flattened).
>
> Switch to using a non-zero parent overlap value rather than the
> existence of a parent (a non-null parent_spec pointer) to determine
> whether to mark a request layered. It will soon be possible for
> a layered image to become unlayered while a request is in flight.
>
> This means that the layered flag for an image request indicates that
> there was a non-zero parent overlap at the time the image request
> was created. The parent overlap can change thereafter, which may
> lead to special handling at request submission or completion time.
>
> Flesh out and fix the rbd_dev_v2_header_info() error handling path.
>
> This and the next several pages are related to:
> http://tracker.ceph.com/issues/3763
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
> drivers/block/rbd.c | 72
> ++++++++++++++++++++++++++++++++-------------------
> 1 file changed, 45 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 3a8135f..06d49b5 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -1871,7 +1871,7 @@ static struct rbd_img_request *rbd_img_request_create(
> }
> if (child_request)
> img_request_child_set(img_request);
> - if (rbd_dev->parent_spec)
> + if (rbd_dev->parent_overlap)
> img_request_layered_set(img_request);
> spin_lock_init(&img_request->completion_lock);
> img_request->next_completion = 0;
> @@ -4021,20 +4021,60 @@ static int rbd_dev_v2_header_info(struct
> rbd_device *rbd_dev)
> if (first_time) {
> ret = rbd_dev_v2_header_onetime(rbd_dev);
> if (ret)
> - goto out;
> + goto out_err;
> + }
> +
> + /*
> + * If the image supports layering, get the parent info. We
> + * need to probe the first time regardless. Thereafter we
> + * only need to if there's a parent, to see if it has
> + * disappeared due to the mapped image getting flattened.
> + */
> + if (rbd_dev->header.features & RBD_FEATURE_LAYERING &&
> + (first_time || rbd_dev->parent_spec)) {
> + bool warn;
> +
> + ret = rbd_dev_v2_parent_info(rbd_dev);
> + if (ret)
> + goto out_err;
> +
> + /*
> + * Print a warning if this is the initial probe and
> + * the image has a parent. Don't print it if the
> + * image now being probed is itself a parent. We
> + * can tell at this point because we won't know its
> + * pool name yet (just its pool id).
> + */
> + warn = rbd_dev->parent_spec && rbd_dev->spec->pool_name;
> + if (first_time && warn)
> + rbd_warn(rbd_dev, "WARNING: kernel layering "
> + "is EXPERIMENTAL!");
> }
>
> ret = rbd_dev_v2_image_size(rbd_dev);
> if (ret)
> - goto out;
> + goto out_err;
> +
> if (rbd_dev->spec->snap_id == CEPH_NOSNAP)
> if (rbd_dev->mapping.size != rbd_dev->header.image_size)
> rbd_dev->mapping.size = rbd_dev->header.image_size;
>
> ret = rbd_dev_v2_snap_context(rbd_dev);
> dout("rbd_dev_v2_snap_context returned %d\n", ret);
> - if (ret)
> + if (!ret)
> goto out;
> +out_err:
> + rbd_dev->mapping.size = 0;
> + rbd_dev->header.image_size = 0;
> + rbd_dev->header.obj_order = 0;
> + rbd_dev->parent_overlap = 0;
> + rbd_spec_put(rbd_dev->parent_spec);
> + rbd_dev->parent_spec = NULL;
> + rbd_dev->header.stripe_count = 0;
> + rbd_dev->header.stripe_unit = 0;
> + rbd_dev->header.features = 0;
> + kfree(rbd_dev->header.object_prefix);
> + rbd_dev->header.object_prefix = NULL;
There should probably be some warning here, since the device isn't
usable anymore after this.
I'm nervous about clearing these fields without locking around
their use. For example, rbd_request_fn() should skip new requests now
since mapping.size == 0, but it does not read the field safely, so
requests could still get through, and possibly use the new object
prefix (null) to create new objects that would not be cleaned up
by removing the image (since they used an irregular prefix).
There might be other less obvious effects for in-flight requests too.
Maybe a new flag indicating whether the header is valid would be
more useful than relying on individual fields being handled correctly
after an error reading the header occurs.
> out:
> up_write(&rbd_dev->header_rwsem);
>
> @@ -4488,24 +4528,6 @@ static int rbd_dev_v2_header_onetime(struct
> rbd_device *rbd_dev)
> if (ret)
> goto out_err;
>
> - /* If the image supports layering, get the parent info */
> -
> - if (rbd_dev->header.features & RBD_FEATURE_LAYERING) {
> - ret = rbd_dev_v2_parent_info(rbd_dev);
> - if (ret)
> - goto out_err;
> - /*
> - * Print a warning if this image has a parent.
> - * Don't print it if the image now being probed
> - * is itself a parent. We can tell at this point
> - * because we won't know its pool name yet (just its
> - * pool id).
> - */
> - if (rbd_dev->parent_spec && rbd_dev->spec->pool_name)
> - rbd_warn(rbd_dev, "WARNING: kernel layering "
> - "is EXPERIMENTAL!");
> - }
> -
> /* If the image supports fancy striping, get its parameters */
>
> if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) {
> @@ -4517,11 +4539,7 @@ static int rbd_dev_v2_header_onetime(struct
> rbd_device *rbd_dev)
>
> return 0;
> out_err:
> - rbd_dev->parent_overlap = 0;
> - rbd_spec_put(rbd_dev->parent_spec);
> - rbd_dev->parent_spec = NULL;
> - kfree(rbd_dev->header_name);
> - rbd_dev->header_name = NULL;
> + rbd_dev->header.features = 0;
> kfree(rbd_dev->header.object_prefix);
> rbd_dev->header.object_prefix = NULL;
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/5] rbd: prep work for flattening images
2013-05-11 17:42 [PATCH 0/5] rbd: prep work for flattening images Alex Elder
` (4 preceding siblings ...)
2013-05-11 17:44 ` [PATCH 5/5] rbd: reference count parent requests Alex Elder
@ 2013-05-11 21:08 ` Josh Durgin
5 siblings, 0 replies; 11+ messages in thread
From: Josh Durgin @ 2013-05-11 21:08 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
On 05/11/2013 10:42 AM, Alex Elder wrote:
> This series of patches prepares some parent request
> code to make handling the event of a mapped clone
> image getting flattened easier.
>
> -Alex
>
> [PATCH 1/5] rbd: get parent info on refresh
> [PATCH 2/5] rbd: don't release write request until necessary
> [PATCH 3/5] rbd: define rbd_dev_unparent()
> [PATCH 4/5] rbd: define parent image request routines
> [PATCH 5/5] rbd: reference count parent requests
2-5 look good.
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/5] rbd: get parent info on refresh
2013-05-11 20:59 ` Josh Durgin
@ 2013-05-11 21:52 ` Alex Elder
2013-05-13 17:58 ` [PATCH 1/5, v2] " Alex Elder
1 sibling, 0 replies; 11+ messages in thread
From: Alex Elder @ 2013-05-11 21:52 UTC (permalink / raw)
To: Josh Durgin; +Cc: ceph-devel
On 05/11/2013 03:59 PM, Josh Durgin wrote:
> On 05/11/2013 10:43 AM, Alex Elder wrote:
>> Get parent info for format 2 images on every refresh (rather than
>> just during the initial probe). This will be needed to detect the
>> disappearance of the parent image in the event a mapped image
>> becomes unlayered (i.e., flattened).
>>
>> Switch to using a non-zero parent overlap value rather than the
>> existence of a parent (a non-null parent_spec pointer) to determine
>> whether to mark a request layered. It will soon be possible for
>> a layered image to become unlayered while a request is in flight.
>>
>> This means that the layered flag for an image request indicates that
>> there was a non-zero parent overlap at the time the image request
>> was created. The parent overlap can change thereafter, which may
>> lead to special handling at request submission or completion time.
>>
>> Flesh out and fix the rbd_dev_v2_header_info() error handling path.
>>
>> This and the next several pages are related to:
>> http://tracker.ceph.com/issues/3763
>>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>> drivers/block/rbd.c | 72
>> ++++++++++++++++++++++++++++++++-------------------
>> 1 file changed, 45 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>> index 3a8135f..06d49b5 100644
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -1871,7 +1871,7 @@ static struct rbd_img_request
>> *rbd_img_request_create(
>> }
>> if (child_request)
>> img_request_child_set(img_request);
>> - if (rbd_dev->parent_spec)
>> + if (rbd_dev->parent_overlap)
>> img_request_layered_set(img_request);
>> spin_lock_init(&img_request->completion_lock);
>> img_request->next_completion = 0;
>> @@ -4021,20 +4021,60 @@ static int rbd_dev_v2_header_info(struct
>> rbd_device *rbd_dev)
>> if (first_time) {
>> ret = rbd_dev_v2_header_onetime(rbd_dev);
>> if (ret)
>> - goto out;
>> + goto out_err;
>> + }
>> +
>> + /*
>> + * If the image supports layering, get the parent info. We
>> + * need to probe the first time regardless. Thereafter we
>> + * only need to if there's a parent, to see if it has
>> + * disappeared due to the mapped image getting flattened.
>> + */
>> + if (rbd_dev->header.features & RBD_FEATURE_LAYERING &&
>> + (first_time || rbd_dev->parent_spec)) {
>> + bool warn;
>> +
>> + ret = rbd_dev_v2_parent_info(rbd_dev);
>> + if (ret)
>> + goto out_err;
>> +
>> + /*
>> + * Print a warning if this is the initial probe and
>> + * the image has a parent. Don't print it if the
>> + * image now being probed is itself a parent. We
>> + * can tell at this point because we won't know its
>> + * pool name yet (just its pool id).
>> + */
>> + warn = rbd_dev->parent_spec && rbd_dev->spec->pool_name;
>> + if (first_time && warn)
>> + rbd_warn(rbd_dev, "WARNING: kernel layering "
>> + "is EXPERIMENTAL!");
>> }
>>
>> ret = rbd_dev_v2_image_size(rbd_dev);
>> if (ret)
>> - goto out;
>> + goto out_err;
>> +
>> if (rbd_dev->spec->snap_id == CEPH_NOSNAP)
>> if (rbd_dev->mapping.size != rbd_dev->header.image_size)
>> rbd_dev->mapping.size = rbd_dev->header.image_size;
>>
>> ret = rbd_dev_v2_snap_context(rbd_dev);
>> dout("rbd_dev_v2_snap_context returned %d\n", ret);
>> - if (ret)
>> + if (!ret)
>> goto out;
>> +out_err:
>> + rbd_dev->mapping.size = 0;
>> + rbd_dev->header.image_size = 0;
>> + rbd_dev->header.obj_order = 0;
>> + rbd_dev->parent_overlap = 0;
>> + rbd_spec_put(rbd_dev->parent_spec);
>> + rbd_dev->parent_spec = NULL;
>> + rbd_dev->header.stripe_count = 0;
>> + rbd_dev->header.stripe_unit = 0;
>> + rbd_dev->header.features = 0;
>> + kfree(rbd_dev->header.object_prefix);
>> + rbd_dev->header.object_prefix = NULL;
>
> There should probably be some warning here, since the device isn't
> usable anymore after this.
That is a very good point. I obviously hadn't thought
this through. And looking again at how this is used (two
callers--probe and refresh), the refresh case really doesn't
do anything with errors--just issues a warning.
If refreshing due to a request from /sys/bus/.../refresh then
we could safely just ignore an error. But if doing so in
response to an event notification we probably need to shut
down the device--probably using a flag as you suggest--don't
you think? We could conceivably re-enable it again if we
could manage to retry and succeed after that (but I'm not
going to implement that today).
For starters, I will simply get rid of this block of code that
clears everything out, and rely on an eventual other error
to clean it all up in the process of tearing down data
structures.
Now about your suggestion of using a flag:
The fields that get updated here (if not the first time) are:
- parent info (parent spec and parent overlap)
- image size and object order
- snapshot context
(And now that I look I see the old parent spec leaks when we
update the parent info--another fix to come.)
If a flag were created to mark a header invalid, we'd need
to check it whenever these fields were referenced.
- parent_spec isn't really a problem, it is really only used
to determine whether to probe for a parent.
- parent overlap is used quite a few places but they're
basically in rbd_img_obj_request_submit() and in the
parent-related callback functions.
- image size isn't really used except to set mapping.size,
which is then used to set the device capacity, so
protecting that should be easy.
- The snapshot is used a little more than that, but it
may be that some higher-level organization (like trapping
this at object request submit time) might cover it.
I'll take a look at implementing this. But it's unfortunately
going to have to wait, I've run out of time today.
Thanks a lot for the review.
-Alex
> I'm nervous about clearing these fields without locking around
> their use. For example, rbd_request_fn() should skip new requests now
> since mapping.size == 0, but it does not read the field safely, so
> requests could still get through, and possibly use the new object
> prefix (null) to create new objects that would not be cleaned up
> by removing the image (since they used an irregular prefix).
>
> There might be other less obvious effects for in-flight requests too.
> Maybe a new flag indicating whether the header is valid would be
> more useful than relying on individual fields being handled correctly
> after an error reading the header occurs.
>
>> out:
>> up_write(&rbd_dev->header_rwsem);
>>
>> @@ -4488,24 +4528,6 @@ static int rbd_dev_v2_header_onetime(struct
>> rbd_device *rbd_dev)
>> if (ret)
>> goto out_err;
>>
>> - /* If the image supports layering, get the parent info */
>> -
>> - if (rbd_dev->header.features & RBD_FEATURE_LAYERING) {
>> - ret = rbd_dev_v2_parent_info(rbd_dev);
>> - if (ret)
>> - goto out_err;
>> - /*
>> - * Print a warning if this image has a parent.
>> - * Don't print it if the image now being probed
>> - * is itself a parent. We can tell at this point
>> - * because we won't know its pool name yet (just its
>> - * pool id).
>> - */
>> - if (rbd_dev->parent_spec && rbd_dev->spec->pool_name)
>> - rbd_warn(rbd_dev, "WARNING: kernel layering "
>> - "is EXPERIMENTAL!");
>> - }
>> -
>> /* If the image supports fancy striping, get its parameters */
>>
>> if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) {
>> @@ -4517,11 +4539,7 @@ static int rbd_dev_v2_header_onetime(struct
>> rbd_device *rbd_dev)
>>
>> return 0;
>> out_err:
>> - rbd_dev->parent_overlap = 0;
>> - rbd_spec_put(rbd_dev->parent_spec);
>> - rbd_dev->parent_spec = NULL;
>> - kfree(rbd_dev->header_name);
>> - rbd_dev->header_name = NULL;
>> + rbd_dev->header.features = 0;
>> kfree(rbd_dev->header.object_prefix);
>> rbd_dev->header.object_prefix = NULL;
>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/5, v2] rbd: get parent info on refresh
2013-05-11 20:59 ` Josh Durgin
2013-05-11 21:52 ` Alex Elder
@ 2013-05-13 17:58 ` Alex Elder
2013-05-13 19:34 ` Josh Durgin
1 sibling, 1 reply; 11+ messages in thread
From: Alex Elder @ 2013-05-13 17:58 UTC (permalink / raw)
To: Josh Durgin; +Cc: ceph-devel
Get parent info for format 2 images on every refresh (rather than
just during the initial probe). This will be needed to detect the
disappearance of the parent image in the event a mapped image
becomes unlayered (i.e., flattened). Avoid leaking the previous
parent spec on the second and subsequent times this information is
requested by dropping the previous one (if any) before updating it.
(Also, extract the pool id into a local variable before assigning
it into the parent spec.)
Switch to using a non-zero parent overlap value rather than the
existence of a parent (a non-null parent_spec pointer) to determine
whether to mark a request layered. It will soon be possible for
a layered image to become unlayered while a request is in flight.
This means that the layered flag for an image request indicates that
there was a non-zero parent overlap at the time the image request
was created. The parent overlap can change thereafter, which may
lead to special handling at request submission or completion time.
This and the next several patches are related to:
http://tracker.ceph.com/issues/3763
NOTE:
If an error occurs while refreshing the parent info (i.e.,
requesting it after initial probe), the old parent info will
persist. This is not really correct, and is a scenario that needs
to be addressed. For now we'll assert that the failure mode is
unlikely, but the issue has been documented in tracker issue 5040.
Signed-off-by: Alex Elder <elder@inktank.com>
---
v2: no longer clean up rbd_dev and header on error in refresh
drivers/block/rbd.c | 67
++++++++++++++++++++++++++++-----------------------
1 file changed, 37 insertions(+), 30 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index b67ecda..fcef63c 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1873,7 +1873,7 @@ static struct rbd_img_request *rbd_img_request_create(
}
if (child_request)
img_request_child_set(img_request);
- if (rbd_dev->parent_spec)
+ if (rbd_dev->parent_overlap)
img_request_layered_set(img_request);
spin_lock_init(&img_request->completion_lock);
img_request->next_completion = 0;
@@ -3613,6 +3613,7 @@ static int rbd_dev_v2_parent_info(struct
rbd_device *rbd_dev)
__le64 snapid;
void *p;
void *end;
+ u64 pool_id;
char *image_id;
u64 overlap;
int ret;
@@ -3643,18 +3644,19 @@ static int rbd_dev_v2_parent_info(struct
rbd_device *rbd_dev)
p = reply_buf;
end = reply_buf + ret;
ret = -ERANGE;
- ceph_decode_64_safe(&p, end, parent_spec->pool_id, out_err);
- if (parent_spec->pool_id == CEPH_NOPOOL)
+ ceph_decode_64_safe(&p, end, pool_id, out_err);
+ if (pool_id == CEPH_NOPOOL)
goto out; /* No parent? No problem. */
/* The ceph file layout needs to fit pool id in 32 bits */
ret = -EIO;
- if (parent_spec->pool_id > (u64)U32_MAX) {
+ if (pool_id > (u64)U32_MAX) {
rbd_warn(NULL, "parent pool id too large (%llu > %u)\n",
- (unsigned long long)parent_spec->pool_id, U32_MAX);
+ (unsigned long long)pool_id, U32_MAX);
goto out_err;
}
+ parent_spec->pool_id = pool_id;
image_id = ceph_extract_encoded_string(&p, end, NULL, GFP_KERNEL);
if (IS_ERR(image_id)) {
@@ -3666,6 +3668,7 @@ static int rbd_dev_v2_parent_info(struct
rbd_device *rbd_dev)
ceph_decode_64_safe(&p, end, overlap, out_err);
if (overlap) {
+ rbd_spec_put(rbd_dev->parent_spec);
rbd_dev->parent_spec = parent_spec;
parent_spec = NULL; /* rbd_dev now owns this */
rbd_dev->parent_overlap = overlap;
@@ -4034,17 +4037,43 @@ static int rbd_dev_v2_header_info(struct
rbd_device *rbd_dev)
goto out;
}
+ /*
+ * If the image supports layering, get the parent info. We
+ * need to probe the first time regardless. Thereafter we
+ * only need to if there's a parent, to see if it has
+ * disappeared due to the mapped image getting flattened.
+ */
+ if (rbd_dev->header.features & RBD_FEATURE_LAYERING &&
+ (first_time || rbd_dev->parent_spec)) {
+ bool warn;
+
+ ret = rbd_dev_v2_parent_info(rbd_dev);
+ if (ret)
+ goto out;
+
+ /*
+ * Print a warning if this is the initial probe and
+ * the image has a parent. Don't print it if the
+ * image now being probed is itself a parent. We
+ * can tell at this point because we won't know its
+ * pool name yet (just its pool id).
+ */
+ warn = rbd_dev->parent_spec && rbd_dev->spec->pool_name;
+ if (first_time && warn)
+ rbd_warn(rbd_dev, "WARNING: kernel layering "
+ "is EXPERIMENTAL!");
+ }
+
ret = rbd_dev_v2_image_size(rbd_dev);
if (ret)
goto out;
+
if (rbd_dev->spec->snap_id == CEPH_NOSNAP)
if (rbd_dev->mapping.size != rbd_dev->header.image_size)
rbd_dev->mapping.size = rbd_dev->header.image_size;
ret = rbd_dev_v2_snap_context(rbd_dev);
dout("rbd_dev_v2_snap_context returned %d\n", ret);
- if (ret)
- goto out;
out:
up_write(&rbd_dev->header_rwsem);
@@ -4498,24 +4527,6 @@ static int rbd_dev_v2_header_onetime(struct
rbd_device *rbd_dev)
if (ret)
goto out_err;
- /* If the image supports layering, get the parent info */
-
- if (rbd_dev->header.features & RBD_FEATURE_LAYERING) {
- ret = rbd_dev_v2_parent_info(rbd_dev);
- if (ret)
- goto out_err;
- /*
- * Print a warning if this image has a parent.
- * Don't print it if the image now being probed
- * is itself a parent. We can tell at this point
- * because we won't know its pool name yet (just its
- * pool id).
- */
- if (rbd_dev->parent_spec && rbd_dev->spec->pool_name)
- rbd_warn(rbd_dev, "WARNING: kernel layering "
- "is EXPERIMENTAL!");
- }
-
/* If the image supports fancy striping, get its parameters */
if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) {
@@ -4527,11 +4538,7 @@ static int rbd_dev_v2_header_onetime(struct
rbd_device *rbd_dev)
return 0;
out_err:
- rbd_dev->parent_overlap = 0;
- rbd_spec_put(rbd_dev->parent_spec);
- rbd_dev->parent_spec = NULL;
- kfree(rbd_dev->header_name);
- rbd_dev->header_name = NULL;
+ rbd_dev->header.features = 0;
kfree(rbd_dev->header.object_prefix);
rbd_dev->header.object_prefix = NULL;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/5, v2] rbd: get parent info on refresh
2013-05-13 17:58 ` [PATCH 1/5, v2] " Alex Elder
@ 2013-05-13 19:34 ` Josh Durgin
0 siblings, 0 replies; 11+ messages in thread
From: Josh Durgin @ 2013-05-13 19:34 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
On 05/13/2013 10:58 AM, Alex Elder wrote:
> Get parent info for format 2 images on every refresh (rather than
> just during the initial probe). This will be needed to detect the
> disappearance of the parent image in the event a mapped image
> becomes unlayered (i.e., flattened). Avoid leaking the previous
> parent spec on the second and subsequent times this information is
> requested by dropping the previous one (if any) before updating it.
> (Also, extract the pool id into a local variable before assigning
> it into the parent spec.)
>
> Switch to using a non-zero parent overlap value rather than the
> existence of a parent (a non-null parent_spec pointer) to determine
> whether to mark a request layered. It will soon be possible for
> a layered image to become unlayered while a request is in flight.
>
> This means that the layered flag for an image request indicates that
> there was a non-zero parent overlap at the time the image request
> was created. The parent overlap can change thereafter, which may
> lead to special handling at request submission or completion time.
>
> This and the next several patches are related to:
> http://tracker.ceph.com/issues/3763
>
> NOTE:
> If an error occurs while refreshing the parent info (i.e.,
> requesting it after initial probe), the old parent info will
> persist. This is not really correct, and is a scenario that needs
> to be addressed. For now we'll assert that the failure mode is
> unlikely, but the issue has been documented in tracker issue 5040.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
> v2: no longer clean up rbd_dev and header on error in refresh
>
> drivers/block/rbd.c | 67
> ++++++++++++++++++++++++++++-----------------------
> 1 file changed, 37 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index b67ecda..fcef63c 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -1873,7 +1873,7 @@ static struct rbd_img_request *rbd_img_request_create(
> }
> if (child_request)
> img_request_child_set(img_request);
> - if (rbd_dev->parent_spec)
> + if (rbd_dev->parent_overlap)
> img_request_layered_set(img_request);
> spin_lock_init(&img_request->completion_lock);
> img_request->next_completion = 0;
> @@ -3613,6 +3613,7 @@ static int rbd_dev_v2_parent_info(struct
> rbd_device *rbd_dev)
> __le64 snapid;
> void *p;
> void *end;
> + u64 pool_id;
> char *image_id;
> u64 overlap;
> int ret;
> @@ -3643,18 +3644,19 @@ static int rbd_dev_v2_parent_info(struct
> rbd_device *rbd_dev)
> p = reply_buf;
> end = reply_buf + ret;
> ret = -ERANGE;
> - ceph_decode_64_safe(&p, end, parent_spec->pool_id, out_err);
> - if (parent_spec->pool_id == CEPH_NOPOOL)
> + ceph_decode_64_safe(&p, end, pool_id, out_err);
> + if (pool_id == CEPH_NOPOOL)
> goto out; /* No parent? No problem. */
>
> /* The ceph file layout needs to fit pool id in 32 bits */
>
> ret = -EIO;
> - if (parent_spec->pool_id > (u64)U32_MAX) {
> + if (pool_id > (u64)U32_MAX) {
> rbd_warn(NULL, "parent pool id too large (%llu > %u)\n",
> - (unsigned long long)parent_spec->pool_id, U32_MAX);
> + (unsigned long long)pool_id, U32_MAX);
> goto out_err;
> }
> + parent_spec->pool_id = pool_id;
>
> image_id = ceph_extract_encoded_string(&p, end, NULL, GFP_KERNEL);
> if (IS_ERR(image_id)) {
> @@ -3666,6 +3668,7 @@ static int rbd_dev_v2_parent_info(struct
> rbd_device *rbd_dev)
> ceph_decode_64_safe(&p, end, overlap, out_err);
>
> if (overlap) {
> + rbd_spec_put(rbd_dev->parent_spec);
> rbd_dev->parent_spec = parent_spec;
> parent_spec = NULL; /* rbd_dev now owns this */
> rbd_dev->parent_overlap = overlap;
> @@ -4034,17 +4037,43 @@ static int rbd_dev_v2_header_info(struct
> rbd_device *rbd_dev)
> goto out;
> }
>
> + /*
> + * If the image supports layering, get the parent info. We
> + * need to probe the first time regardless. Thereafter we
> + * only need to if there's a parent, to see if it has
> + * disappeared due to the mapped image getting flattened.
> + */
> + if (rbd_dev->header.features & RBD_FEATURE_LAYERING &&
> + (first_time || rbd_dev->parent_spec)) {
> + bool warn;
> +
> + ret = rbd_dev_v2_parent_info(rbd_dev);
> + if (ret)
> + goto out;
> +
> + /*
> + * Print a warning if this is the initial probe and
> + * the image has a parent. Don't print it if the
> + * image now being probed is itself a parent. We
> + * can tell at this point because we won't know its
> + * pool name yet (just its pool id).
> + */
> + warn = rbd_dev->parent_spec && rbd_dev->spec->pool_name;
> + if (first_time && warn)
> + rbd_warn(rbd_dev, "WARNING: kernel layering "
> + "is EXPERIMENTAL!");
> + }
> +
> ret = rbd_dev_v2_image_size(rbd_dev);
> if (ret)
> goto out;
> +
> if (rbd_dev->spec->snap_id == CEPH_NOSNAP)
> if (rbd_dev->mapping.size != rbd_dev->header.image_size)
> rbd_dev->mapping.size = rbd_dev->header.image_size;
>
> ret = rbd_dev_v2_snap_context(rbd_dev);
> dout("rbd_dev_v2_snap_context returned %d\n", ret);
> - if (ret)
> - goto out;
> out:
> up_write(&rbd_dev->header_rwsem);
>
> @@ -4498,24 +4527,6 @@ static int rbd_dev_v2_header_onetime(struct
> rbd_device *rbd_dev)
> if (ret)
> goto out_err;
>
> - /* If the image supports layering, get the parent info */
> -
> - if (rbd_dev->header.features & RBD_FEATURE_LAYERING) {
> - ret = rbd_dev_v2_parent_info(rbd_dev);
> - if (ret)
> - goto out_err;
> - /*
> - * Print a warning if this image has a parent.
> - * Don't print it if the image now being probed
> - * is itself a parent. We can tell at this point
> - * because we won't know its pool name yet (just its
> - * pool id).
> - */
> - if (rbd_dev->parent_spec && rbd_dev->spec->pool_name)
> - rbd_warn(rbd_dev, "WARNING: kernel layering "
> - "is EXPERIMENTAL!");
> - }
> -
> /* If the image supports fancy striping, get its parameters */
>
> if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) {
> @@ -4527,11 +4538,7 @@ static int rbd_dev_v2_header_onetime(struct
> rbd_device *rbd_dev)
>
> return 0;
> out_err:
> - rbd_dev->parent_overlap = 0;
> - rbd_spec_put(rbd_dev->parent_spec);
> - rbd_dev->parent_spec = NULL;
> - kfree(rbd_dev->header_name);
> - rbd_dev->header_name = NULL;
> + rbd_dev->header.features = 0;
> kfree(rbd_dev->header.object_prefix);
> rbd_dev->header.object_prefix = NULL;
>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2013-05-13 19:35 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-11 17:42 [PATCH 0/5] rbd: prep work for flattening images Alex Elder
2013-05-11 17:43 ` [PATCH 1/5] rbd: get parent info on refresh Alex Elder
2013-05-11 20:59 ` Josh Durgin
2013-05-11 21:52 ` Alex Elder
2013-05-13 17:58 ` [PATCH 1/5, v2] " Alex Elder
2013-05-13 19:34 ` Josh Durgin
2013-05-11 17:44 ` [PATCH 2/5] rbd: don't release write request until necessary Alex Elder
2013-05-11 17:44 ` [PATCH 3/5] rbd: define rbd_dev_unparent() Alex Elder
2013-05-11 17:44 ` [PATCH 4/5] rbd: define parent image request routines Alex Elder
2013-05-11 17:44 ` [PATCH 5/5] rbd: reference count parent requests Alex Elder
2013-05-11 21:08 ` [PATCH 0/5] rbd: prep work for flattening images Josh Durgin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.