All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] rbd: separate reading from interpreting rbd header
@ 2012-08-06 18:15 Alex Elder
  2012-08-06 18:17 ` [PATCH 1/4] rbd: rearrange rbd_header_from_disk() Alex Elder
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Alex Elder @ 2012-08-06 18:15 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Right now rbd_read_header() both reads the header object for an rbd
image and decodes its contents.  This series rearranges this a bit so
this process is broken into into two steps--reading of the raw header
data and then separately decoding its contents.

[PATCH 1/4] rbd: rearrange rbd_header_from_disk()
[PATCH 2/4] rbd: return earlier in rbd_header_from_disk()
[PATCH 3/4] rbd: expand rbd_dev_ondisk_valid() checks
[PATCH 4/4] rbd: separate reading header from decoding it

					-Alex


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] rbd: rearrange rbd_header_from_disk()
  2012-08-06 18:15 [PATCH 0/4] rbd: separate reading from interpreting rbd header Alex Elder
@ 2012-08-06 18:17 ` Alex Elder
  2012-08-08  0:29   ` Josh Durgin
  2012-08-06 18:17 ` [PATCH 2/4] rbd: return earlier in rbd_header_from_disk() Alex Elder
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Alex Elder @ 2012-08-06 18:17 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

This just moves code around for the most part.  It was pulled out as
a separate patch to avoid cluttering up some upcoming patches which
are more substantive.  The point is basically to group everything
related to initializing the snapshot context together.

The only functional change is that rbd_header_from_disk() now
ensures the (in-core) header it is passed is zero-filled.  This
allows a simpler error handling path in rbd_header_from_disk().

Signed-off-by: Alex Elder <elder@inktank.com>
---
  drivers/block/rbd.c |   41 ++++++++++++++++++++++-------------------
  1 file changed, 22 insertions(+), 19 deletions(-)

Index: b/drivers/block/rbd.c
===================================================================
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -507,11 +507,14 @@ static int rbd_header_from_disk(struct r
  	if (snap_count > size / sizeof (header->snapc->snaps[0]))
  		return -EINVAL;

-	size = sizeof (struct ceph_snap_context);
-	size += snap_count * sizeof (header->snapc->snaps[0]);
-	header->snapc = kmalloc(size, GFP_KERNEL);
-	if (!header->snapc)
+	memset(header, 0, sizeof (*header));
+
+	size = sizeof (ondisk->block_name) + 1;
+	header->object_prefix = kmalloc(size, GFP_KERNEL);
+	if (!header->object_prefix)
  		return -ENOMEM;
+	memcpy(header->object_prefix, ondisk->block_name, size - 1);
+	header->object_prefix[size - 1] = '\0';

  	if (snap_count) {
  		header->snap_names_len = le64_to_cpu(ondisk->snap_names_len);
@@ -519,11 +522,12 @@ static int rbd_header_from_disk(struct r
  		header->snap_names = kmalloc(header->snap_names_len,
  					     GFP_KERNEL);
  		if (!header->snap_names)
-			goto err_snapc;
+			goto out_err;
+
  		size = snap_count * sizeof (*header->snap_sizes);
  		header->snap_sizes = kmalloc(size, GFP_KERNEL);
  		if (!header->snap_sizes)
-			goto err_names;
+			goto out_err;
  	} else {
  		WARN_ON(ondisk->snap_names_len);
  		header->snap_names_len = 0;
@@ -531,22 +535,23 @@ static int rbd_header_from_disk(struct r
  		header->snap_sizes = NULL;
  	}

-	size = sizeof (ondisk->block_name) + 1;
-	header->object_prefix = kmalloc(size, GFP_KERNEL);
-	if (!header->object_prefix)
-		goto err_sizes;
-	memcpy(header->object_prefix, ondisk->block_name, size - 1);
-	header->object_prefix[size - 1] = '\0';
-
  	header->image_size = le64_to_cpu(ondisk->image_size);
  	header->obj_order = ondisk->options.order;
  	header->crypt_type = ondisk->options.crypt_type;
  	header->comp_type = ondisk->options.comp_type;
+	header->total_snaps = snap_count;
+
+	/* Set up the snapshot context */
+
+	size = sizeof (struct ceph_snap_context);
+	size += snap_count * sizeof (header->snapc->snaps[0]);
+	header->snapc = kzalloc(size, GFP_KERNEL);
+	if (!header->snapc)
+		goto out_err;

  	atomic_set(&header->snapc->nref, 1);
  	header->snapc->seq = le64_to_cpu(ondisk->snap_seq);
  	header->snapc->num_snaps = snap_count;
-	header->total_snaps = snap_count;

  	if (snap_count && allocated_snaps == snap_count) {
  		int i;
@@ -565,16 +570,14 @@ static int rbd_header_from_disk(struct r

  	return 0;

-err_sizes:
+out_err:
  	kfree(header->snap_sizes);
  	header->snap_sizes = NULL;
-err_names:
  	kfree(header->snap_names);
  	header->snap_names = NULL;
  	header->snap_names_len = 0;
-err_snapc:
-	kfree(header->snapc);
-	header->snapc = NULL;
+	kfree(header->object_prefix);
+	header->object_prefix = NULL;

  	return -ENOMEM;
  }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/4] rbd: return earlier in rbd_header_from_disk()
  2012-08-06 18:15 [PATCH 0/4] rbd: separate reading from interpreting rbd header Alex Elder
  2012-08-06 18:17 ` [PATCH 1/4] rbd: rearrange rbd_header_from_disk() Alex Elder
@ 2012-08-06 18:17 ` Alex Elder
  2012-08-08  0:29   ` Josh Durgin
  2012-08-06 18:17 ` [PATCH 3/4] rbd: expand rbd_dev_ondisk_valid() checks Alex Elder
  2012-08-06 18:17 ` [PATCH 4/4] rbd: separate reading header from decoding it Alex Elder
  3 siblings, 1 reply; 11+ messages in thread
From: Alex Elder @ 2012-08-06 18:17 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

The only caller of rbd_header_from_disk() is rbd_read_header().
It passes as allocated_snaps the number of snapshots it will
have received from the server for the snapshot context that
rbd_header_from_disk() is to interpret.  The first time through
it provides 0--mainly to extract the number of snapshots from
the snapshot context header--so that it can allocate an
appropriately-sized buffer to receive the entire snapshot
context from the server in a second request.

rbd_header_from_disk() will not fill in the array of snapshot ids
unless the number in the snapshot matches the number the caller
had allocated.

This patch adjusts that logic a little further to be more efficient.
rbd_read_header() doesn't even examine the snapshot context unless
the snapshot count (stored in header->total_snaps) matches the
number of snapshots allocated.  So rbd_header_from_disk() doesn't
need to allocate or fill in the snapshot context field at all in
that case.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  drivers/block/rbd.c |   15 ++++++++++++---
  1 file changed, 12 insertions(+), 3 deletions(-)

Index: b/drivers/block/rbd.c
===================================================================
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -541,7 +541,14 @@ static int rbd_header_from_disk(struct r
  	header->comp_type = ondisk->options.comp_type;
  	header->total_snaps = snap_count;

-	/* Set up the snapshot context */
+	/*
+	 * If the number of snapshot ids provided by the caller
+	 * doesn't match the number in the entire context there's
+	 * no point in going further.  Caller will try again after
+	 * getting an updated snapshot context from the server.
+	 */
+	if (allocated_snaps != snap_count)
+		return 0;

  	size = sizeof (struct ceph_snap_context);
  	size += snap_count * sizeof (header->snapc->snaps[0]);
@@ -553,8 +560,10 @@ static int rbd_header_from_disk(struct r
  	header->snapc->seq = le64_to_cpu(ondisk->snap_seq);
  	header->snapc->num_snaps = snap_count;

-	if (snap_count && allocated_snaps == snap_count) {
-		int i;
+	/* Fill in the snapshot information */
+
+	if (snap_count) {
+		u32 i;

  		for (i = 0; i < snap_count; i++) {
  			header->snapc->snaps[i] =

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/4] rbd: expand rbd_dev_ondisk_valid() checks
  2012-08-06 18:15 [PATCH 0/4] rbd: separate reading from interpreting rbd header Alex Elder
  2012-08-06 18:17 ` [PATCH 1/4] rbd: rearrange rbd_header_from_disk() Alex Elder
  2012-08-06 18:17 ` [PATCH 2/4] rbd: return earlier in rbd_header_from_disk() Alex Elder
@ 2012-08-06 18:17 ` Alex Elder
  2012-08-08  0:31   ` Josh Durgin
  2012-08-06 18:17 ` [PATCH 4/4] rbd: separate reading header from decoding it Alex Elder
  3 siblings, 1 reply; 11+ messages in thread
From: Alex Elder @ 2012-08-06 18:17 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Add checks on the validity of the snap_count and snap_names_len
field values in rbd_dev_ondisk_valid().  This eliminates the
need to do them in rbd_header_from_disk().

Signed-off-by: Alex Elder <elder@inktank.com>
---
  drivers/block/rbd.c |   36 +++++++++++++++++++++++++++---------
  1 file changed, 27 insertions(+), 9 deletions(-)

Index: b/drivers/block/rbd.c
===================================================================
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -482,8 +482,31 @@ static void rbd_coll_release(struct kref

  static bool rbd_dev_ondisk_valid(struct rbd_image_header_ondisk *ondisk)
  {
-	return !memcmp(&ondisk->text,
-			RBD_HEADER_TEXT, sizeof (RBD_HEADER_TEXT));
+	size_t size;
+	u32 snap_count;
+
+	/* The header has to start with the magic rbd header text */
+	if (memcmp(&ondisk->text, RBD_HEADER_TEXT, sizeof (RBD_HEADER_TEXT)))
+		return false;
+
+	/*
+	 * The size of a snapshot header has to fit in a size_t, and
+	 * that valid limits the number of snapshots.
+	 */
+	snap_count = le32_to_cpu(ondisk->snap_count);
+	size = SIZE_MAX - sizeof (struct ceph_snap_context);
+	if (snap_count > size / sizeof (__le64))
+		return false;
+
+	/*
+	 * Not only that, but the size of the entire the snapshot
+	 * header must also be representable in a size_t.
+	 */
+	size -= snap_count * sizeof (__le64);
+	if ((u64) size < le64_to_cpu(ondisk->snap_names_len))
+		return false;
+
+	return true;
  }

  /*
@@ -500,15 +523,10 @@ static int rbd_header_from_disk(struct r
  	if (!rbd_dev_ondisk_valid(ondisk))
  		return -ENXIO;

-	snap_count = le32_to_cpu(ondisk->snap_count);
-
-	/* Make sure we don't overflow below */
-	size = SIZE_MAX - sizeof (struct ceph_snap_context);
-	if (snap_count > size / sizeof (header->snapc->snaps[0]))
-		return -EINVAL;
-
  	memset(header, 0, sizeof (*header));

+	snap_count = le32_to_cpu(ondisk->snap_count);
+
  	size = sizeof (ondisk->block_name) + 1;
  	header->object_prefix = kmalloc(size, GFP_KERNEL);
  	if (!header->object_prefix)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/4] rbd: separate reading header from decoding it
  2012-08-06 18:15 [PATCH 0/4] rbd: separate reading from interpreting rbd header Alex Elder
                   ` (2 preceding siblings ...)
  2012-08-06 18:17 ` [PATCH 3/4] rbd: expand rbd_dev_ondisk_valid() checks Alex Elder
@ 2012-08-06 18:17 ` Alex Elder
  2012-08-08  0:58   ` Josh Durgin
  3 siblings, 1 reply; 11+ messages in thread
From: Alex Elder @ 2012-08-06 18:17 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Right now rbd_read_header() both reads the header object for an rbd
image and decodes its contents.  It does this repeatedly if needed,
in order to ensure a complete and intact header is obtained.

Separate this process into two steps--reading of the raw header
data (in new function, rbd_dev_v1_header_read()) and separately
decoding its contents (in rbd_header_from_disk()).  As a result,
the latter function no longer requires its allocated_snaps argument.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  drivers/block/rbd.c |  132 
+++++++++++++++++++++++++++++-----------------------
  1 file changed, 76 insertions(+), 56 deletions(-)

Index: b/drivers/block/rbd.c
===================================================================
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -514,15 +514,11 @@ static bool rbd_dev_ondisk_valid(struct
   * header.
   */
  static int rbd_header_from_disk(struct rbd_image_header *header,
-				 struct rbd_image_header_ondisk *ondisk,
-				 u32 allocated_snaps)
+				 struct rbd_image_header_ondisk *ondisk)
  {
  	u32 snap_count;
  	size_t size;

-	if (!rbd_dev_ondisk_valid(ondisk))
-		return -ENXIO;
-
  	memset(header, 0, sizeof (*header));

  	snap_count = le32_to_cpu(ondisk->snap_count);
@@ -559,15 +555,6 @@ static int rbd_header_from_disk(struct r
  	header->comp_type = ondisk->options.comp_type;
  	header->total_snaps = snap_count;

-	/*
-	 * If the number of snapshot ids provided by the caller
-	 * doesn't match the number in the entire context there's
-	 * no point in going further.  Caller will try again after
-	 * getting an updated snapshot context from the server.
-	 */
-	if (allocated_snaps != snap_count)
-		return 0;
-
  	size = sizeof (struct ceph_snap_context);
  	size += snap_count * sizeof (header->snapc->snaps[0]);
  	header->snapc = kzalloc(size, GFP_KERNEL);
@@ -1630,61 +1617,94 @@ static void rbd_free_disk(struct rbd_dev
  }

  /*
- * reload the ondisk the header
+ * Read the complete header for the given rbd device.
+ *
+ * Returns a pointer to a dynamically-allocated buffer containing
+ * the complete and validated header.  Caller can pass the address
+ * of a variable that will be filled in with the version of the
+ * header object at the time it was read.
+ *
+ * Returns a pointer-coded errno if a failure occurs.
   */
-static int rbd_read_header(struct rbd_device *rbd_dev,
-			   struct rbd_image_header *header)
+static struct rbd_image_header_ondisk *
+rbd_dev_v1_header_read(struct rbd_device *rbd_dev, u64 *version)
  {
-	ssize_t rc;
-	struct rbd_image_header_ondisk *dh;
+	struct rbd_image_header_ondisk *ondisk = NULL;
  	u32 snap_count = 0;
-	u64 ver;
-	size_t len;
+	u64 names_size = 0;
+	u32 want_count;
+	int ret;

  	/*
-	 * First reads the fixed-size header to determine the number
-	 * of snapshots, then re-reads it, along with all snapshot
-	 * records as well as their stored names.
+	 * The complete header will include an array of its 64-bit
+	 * snapshot ids, followed by the names of those snapshots as
+	 * a contiguous block of null-terminated strings.  Note that
+	 * the number of snapshots could change by the time we read
+	 * it in, in which case we re-read it.
  	 */
-	len = sizeof (*dh);
-	while (1) {
-		dh = kmalloc(len, GFP_KERNEL);
-		if (!dh)
-			return -ENOMEM;
+	do {
+		size_t size;
+
+		kfree(ondisk);

-		rc = rbd_req_sync_read(rbd_dev,
-				       CEPH_NOSNAP,
+		size = sizeof (*ondisk);
+		size += snap_count * sizeof (struct rbd_image_snap_ondisk);
+		size += names_size;
+		ondisk = kmalloc(size, GFP_KERNEL);
+		if (!ondisk)
+			return ERR_PTR(-ENOMEM);
+
+		ret = rbd_req_sync_read(rbd_dev, CEPH_NOSNAP,
  				       rbd_dev->header_name,
-				       0, len,
-				       (char *)dh, &ver);
-		if (rc < 0)
-			goto out_dh;
-
-		rc = rbd_header_from_disk(header, dh, snap_count);
-		if (rc < 0) {
-			if (rc == -ENXIO)
-				pr_warning("unrecognized header format"
-					   " for image %s\n",
-					   rbd_dev->image_name);
-			goto out_dh;
+				       0, size,
+				       (char *) ondisk, version);
+
+		if (ret < 0)
+			goto out_err;
+		if (WARN_ON(ret < size)) {
+			ret = -ENXIO;
+			goto out_err;
+		}
+		if (!rbd_dev_ondisk_valid(ondisk)) {
+			ret = -ENXIO;
+			goto out_err;
  		}

-		if (snap_count == header->total_snaps)
-			break;
+		names_size = le64_to_cpu(ondisk->snap_names_len);
+		want_count = snap_count;
+		snap_count = le32_to_cpu(ondisk->snap_count);
+	} while (snap_count != want_count);

-		snap_count = header->total_snaps;
-		len = sizeof (*dh) +
-			snap_count * sizeof(struct rbd_image_snap_ondisk) +
-			header->snap_names_len;
+	return ondisk;

-		rbd_header_free(header);
-		kfree(dh);
-	}
-	header->obj_version = ver;
+out_err:
+	kfree(ondisk);
+
+	return ERR_PTR(ret);
+}
+
+/*
+ * reload the ondisk the header
+ */
+static int rbd_read_header(struct rbd_device *rbd_dev,
+			   struct rbd_image_header *header)
+{
+	struct rbd_image_header_ondisk *ondisk;
+	u64 ver = 0;
+	int ret;
+
+	ondisk = rbd_dev_v1_header_read(rbd_dev, &ver);
+	if (IS_ERR(ondisk))
+		return PTR_ERR(ondisk);
+	ret = rbd_header_from_disk(header, ondisk);
+	if (ret >= 0)
+		header->obj_version = ver;
+	else if (ret == -ENXIO)
+		pr_warning("unrecognized header format for image %s\n",
+			rbd_dev->image_name);
+	kfree(ondisk);

-out_dh:
-	kfree(dh);
-	return rc;
+	return ret;
  }

  /*

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] rbd: rearrange rbd_header_from_disk()
  2012-08-06 18:17 ` [PATCH 1/4] rbd: rearrange rbd_header_from_disk() Alex Elder
@ 2012-08-08  0:29   ` Josh Durgin
  0 siblings, 0 replies; 11+ messages in thread
From: Josh Durgin @ 2012-08-08  0:29 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel@vger.kernel.org

On 08/06/2012 11:17 AM, Alex Elder wrote:
> This just moves code around for the most part.  It was pulled out as
> a separate patch to avoid cluttering up some upcoming patches which
> are more substantive.  The point is basically to group everything
> related to initializing the snapshot context together.
>
> The only functional change is that rbd_header_from_disk() now
> ensures the (in-core) header it is passed is zero-filled.  This
> allows a simpler error handling path in rbd_header_from_disk().
>
> Signed-off-by: Alex Elder <elder@inktank.com>

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

> ---
>   drivers/block/rbd.c |   41 ++++++++++++++++++++++-------------------
>   1 file changed, 22 insertions(+), 19 deletions(-)
>
> Index: b/drivers/block/rbd.c
> ===================================================================
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -507,11 +507,14 @@ static int rbd_header_from_disk(struct r
>       if (snap_count > size / sizeof (header->snapc->snaps[0]))
>           return -EINVAL;
>
> -    size = sizeof (struct ceph_snap_context);
> -    size += snap_count * sizeof (header->snapc->snaps[0]);
> -    header->snapc = kmalloc(size, GFP_KERNEL);
> -    if (!header->snapc)
> +    memset(header, 0, sizeof (*header));
> +
> +    size = sizeof (ondisk->block_name) + 1;
> +    header->object_prefix = kmalloc(size, GFP_KERNEL);
> +    if (!header->object_prefix)
>           return -ENOMEM;
> +    memcpy(header->object_prefix, ondisk->block_name, size - 1);
> +    header->object_prefix[size - 1] = '\0';
>
>       if (snap_count) {
>           header->snap_names_len = le64_to_cpu(ondisk->snap_names_len);
> @@ -519,11 +522,12 @@ static int rbd_header_from_disk(struct r
>           header->snap_names = kmalloc(header->snap_names_len,
>                            GFP_KERNEL);
>           if (!header->snap_names)
> -            goto err_snapc;
> +            goto out_err;
> +
>           size = snap_count * sizeof (*header->snap_sizes);
>           header->snap_sizes = kmalloc(size, GFP_KERNEL);
>           if (!header->snap_sizes)
> -            goto err_names;
> +            goto out_err;
>       } else {
>           WARN_ON(ondisk->snap_names_len);
>           header->snap_names_len = 0;
> @@ -531,22 +535,23 @@ static int rbd_header_from_disk(struct r
>           header->snap_sizes = NULL;
>       }
>
> -    size = sizeof (ondisk->block_name) + 1;
> -    header->object_prefix = kmalloc(size, GFP_KERNEL);
> -    if (!header->object_prefix)
> -        goto err_sizes;
> -    memcpy(header->object_prefix, ondisk->block_name, size - 1);
> -    header->object_prefix[size - 1] = '\0';
> -
>       header->image_size = le64_to_cpu(ondisk->image_size);
>       header->obj_order = ondisk->options.order;
>       header->crypt_type = ondisk->options.crypt_type;
>       header->comp_type = ondisk->options.comp_type;
> +    header->total_snaps = snap_count;
> +
> +    /* Set up the snapshot context */
> +
> +    size = sizeof (struct ceph_snap_context);
> +    size += snap_count * sizeof (header->snapc->snaps[0]);
> +    header->snapc = kzalloc(size, GFP_KERNEL);
> +    if (!header->snapc)
> +        goto out_err;
>
>       atomic_set(&header->snapc->nref, 1);
>       header->snapc->seq = le64_to_cpu(ondisk->snap_seq);
>       header->snapc->num_snaps = snap_count;
> -    header->total_snaps = snap_count;
>
>       if (snap_count && allocated_snaps == snap_count) {
>           int i;
> @@ -565,16 +570,14 @@ static int rbd_header_from_disk(struct r
>
>       return 0;
>
> -err_sizes:
> +out_err:
>       kfree(header->snap_sizes);
>       header->snap_sizes = NULL;
> -err_names:
>       kfree(header->snap_names);
>       header->snap_names = NULL;
>       header->snap_names_len = 0;
> -err_snapc:
> -    kfree(header->snapc);
> -    header->snapc = NULL;
> +    kfree(header->object_prefix);
> +    header->object_prefix = NULL;
>
>       return -ENOMEM;
>   }


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] rbd: return earlier in rbd_header_from_disk()
  2012-08-06 18:17 ` [PATCH 2/4] rbd: return earlier in rbd_header_from_disk() Alex Elder
@ 2012-08-08  0:29   ` Josh Durgin
  0 siblings, 0 replies; 11+ messages in thread
From: Josh Durgin @ 2012-08-08  0:29 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel@vger.kernel.org

On 08/06/2012 11:17 AM, Alex Elder wrote:
> The only caller of rbd_header_from_disk() is rbd_read_header().
> It passes as allocated_snaps the number of snapshots it will
> have received from the server for the snapshot context that
> rbd_header_from_disk() is to interpret.  The first time through
> it provides 0--mainly to extract the number of snapshots from
> the snapshot context header--so that it can allocate an
> appropriately-sized buffer to receive the entire snapshot
> context from the server in a second request.
>
> rbd_header_from_disk() will not fill in the array of snapshot ids
> unless the number in the snapshot matches the number the caller
> had allocated.
>
> This patch adjusts that logic a little further to be more efficient.
> rbd_read_header() doesn't even examine the snapshot context unless
> the snapshot count (stored in header->total_snaps) matches the
> number of snapshots allocated.  So rbd_header_from_disk() doesn't
> need to allocate or fill in the snapshot context field at all in
> that case.
>
> Signed-off-by: Alex Elder <elder@inktank.com>

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

> ---
>   drivers/block/rbd.c |   15 ++++++++++++---
>   1 file changed, 12 insertions(+), 3 deletions(-)
>
> Index: b/drivers/block/rbd.c
> ===================================================================
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -541,7 +541,14 @@ static int rbd_header_from_disk(struct r
>       header->comp_type = ondisk->options.comp_type;
>       header->total_snaps = snap_count;
>
> -    /* Set up the snapshot context */
> +    /*
> +     * If the number of snapshot ids provided by the caller
> +     * doesn't match the number in the entire context there's
> +     * no point in going further.  Caller will try again after
> +     * getting an updated snapshot context from the server.
> +     */
> +    if (allocated_snaps != snap_count)
> +        return 0;
>
>       size = sizeof (struct ceph_snap_context);
>       size += snap_count * sizeof (header->snapc->snaps[0]);
> @@ -553,8 +560,10 @@ static int rbd_header_from_disk(struct r
>       header->snapc->seq = le64_to_cpu(ondisk->snap_seq);
>       header->snapc->num_snaps = snap_count;
>
> -    if (snap_count && allocated_snaps == snap_count) {
> -        int i;
> +    /* Fill in the snapshot information */
> +
> +    if (snap_count) {
> +        u32 i;
>
>           for (i = 0; i < snap_count; i++) {
>               header->snapc->snaps[i] =
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] rbd: expand rbd_dev_ondisk_valid() checks
  2012-08-06 18:17 ` [PATCH 3/4] rbd: expand rbd_dev_ondisk_valid() checks Alex Elder
@ 2012-08-08  0:31   ` Josh Durgin
  0 siblings, 0 replies; 11+ messages in thread
From: Josh Durgin @ 2012-08-08  0:31 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel@vger.kernel.org

On 08/06/2012 11:17 AM, Alex Elder wrote:
> Add checks on the validity of the snap_count and snap_names_len
> field values in rbd_dev_ondisk_valid().  This eliminates the
> need to do them in rbd_header_from_disk().
>
> Signed-off-by: Alex Elder <elder@inktank.com>

Just a small typo below. Looks good otherwise:

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

> ---
>   drivers/block/rbd.c |   36 +++++++++++++++++++++++++++---------
>   1 file changed, 27 insertions(+), 9 deletions(-)
>
> Index: b/drivers/block/rbd.c
> ===================================================================
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -482,8 +482,31 @@ static void rbd_coll_release(struct kref
>
>   static bool rbd_dev_ondisk_valid(struct rbd_image_header_ondisk *ondisk)
>   {
> -    return !memcmp(&ondisk->text,
> -            RBD_HEADER_TEXT, sizeof (RBD_HEADER_TEXT));
> +    size_t size;
> +    u32 snap_count;
> +
> +    /* The header has to start with the magic rbd header text */
> +    if (memcmp(&ondisk->text, RBD_HEADER_TEXT, sizeof (RBD_HEADER_TEXT)))
> +        return false;
> +
> +    /*
> +     * The size of a snapshot header has to fit in a size_t, and
> +     * that valid limits the number of snapshots.

typo

> +     */
> +    snap_count = le32_to_cpu(ondisk->snap_count);
> +    size = SIZE_MAX - sizeof (struct ceph_snap_context);
> +    if (snap_count > size / sizeof (__le64))
> +        return false;
> +
> +    /*
> +     * Not only that, but the size of the entire the snapshot
> +     * header must also be representable in a size_t.
> +     */
> +    size -= snap_count * sizeof (__le64);
> +    if ((u64) size < le64_to_cpu(ondisk->snap_names_len))
> +        return false;
> +
> +    return true;
>   }
>
>   /*
> @@ -500,15 +523,10 @@ static int rbd_header_from_disk(struct r
>       if (!rbd_dev_ondisk_valid(ondisk))
>           return -ENXIO;
>
> -    snap_count = le32_to_cpu(ondisk->snap_count);
> -
> -    /* Make sure we don't overflow below */
> -    size = SIZE_MAX - sizeof (struct ceph_snap_context);
> -    if (snap_count > size / sizeof (header->snapc->snaps[0]))
> -        return -EINVAL;
> -
>       memset(header, 0, sizeof (*header));
>
> +    snap_count = le32_to_cpu(ondisk->snap_count);
> +
>       size = sizeof (ondisk->block_name) + 1;
>       header->object_prefix = kmalloc(size, GFP_KERNEL);
>       if (!header->object_prefix)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] rbd: separate reading header from decoding it
  2012-08-06 18:17 ` [PATCH 4/4] rbd: separate reading header from decoding it Alex Elder
@ 2012-08-08  0:58   ` Josh Durgin
  2012-08-08  2:16     ` Alex Elder
  0 siblings, 1 reply; 11+ messages in thread
From: Josh Durgin @ 2012-08-08  0:58 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel@vger.kernel.org

On 08/06/2012 11:17 AM, Alex Elder wrote:
> Right now rbd_read_header() both reads the header object for an rbd
> image and decodes its contents.  It does this repeatedly if needed,
> in order to ensure a complete and intact header is obtained.
>
> Separate this process into two steps--reading of the raw header
> data (in new function, rbd_dev_v1_header_read()) and separately
> decoding its contents (in rbd_header_from_disk()).  As a result,
> the latter function no longer requires its allocated_snaps argument.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |  132
> +++++++++++++++++++++++++++++-----------------------
>   1 file changed, 76 insertions(+), 56 deletions(-)
>
> Index: b/drivers/block/rbd.c
> ===================================================================
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -514,15 +514,11 @@ static bool rbd_dev_ondisk_valid(struct
>    * header.
>    */
>   static int rbd_header_from_disk(struct rbd_image_header *header,
> -                 struct rbd_image_header_ondisk *ondisk,
> -                 u32 allocated_snaps)
> +                 struct rbd_image_header_ondisk *ondisk)
>   {
>       u32 snap_count;
>       size_t size;
>
> -    if (!rbd_dev_ondisk_valid(ondisk))
> -        return -ENXIO;
> -
>       memset(header, 0, sizeof (*header));
>
>       snap_count = le32_to_cpu(ondisk->snap_count);
> @@ -559,15 +555,6 @@ static int rbd_header_from_disk(struct r
>       header->comp_type = ondisk->options.comp_type;
>       header->total_snaps = snap_count;
>
> -    /*
> -     * If the number of snapshot ids provided by the caller
> -     * doesn't match the number in the entire context there's
> -     * no point in going further.  Caller will try again after
> -     * getting an updated snapshot context from the server.
> -     */
> -    if (allocated_snaps != snap_count)
> -        return 0;
> -
>       size = sizeof (struct ceph_snap_context);
>       size += snap_count * sizeof (header->snapc->snaps[0]);
>       header->snapc = kzalloc(size, GFP_KERNEL);
> @@ -1630,61 +1617,94 @@ static void rbd_free_disk(struct rbd_dev
>   }
>
>   /*
> - * reload the ondisk the header
> + * Read the complete header for the given rbd device.
> + *
> + * Returns a pointer to a dynamically-allocated buffer containing
> + * the complete and validated header.  Caller can pass the address
> + * of a variable that will be filled in with the version of the
> + * header object at the time it was read.
> + *
> + * Returns a pointer-coded errno if a failure occurs.
>    */
> -static int rbd_read_header(struct rbd_device *rbd_dev,
> -               struct rbd_image_header *header)
> +static struct rbd_image_header_ondisk *
> +rbd_dev_v1_header_read(struct rbd_device *rbd_dev, u64 *version)
>   {
> -    ssize_t rc;
> -    struct rbd_image_header_ondisk *dh;
> +    struct rbd_image_header_ondisk *ondisk = NULL;
>       u32 snap_count = 0;
> -    u64 ver;
> -    size_t len;
> +    u64 names_size = 0;
> +    u32 want_count;
> +    int ret;
>
>       /*
> -     * First reads the fixed-size header to determine the number
> -     * of snapshots, then re-reads it, along with all snapshot
> -     * records as well as their stored names.
> +     * The complete header will include an array of its 64-bit
> +     * snapshot ids, followed by the names of those snapshots as
> +     * a contiguous block of null-terminated strings.  Note that

minor nit, but '\0' is NUL, not NULL.

> +     * the number of snapshots could change by the time we read
> +     * it in, in which case we re-read it.
>        */
> -    len = sizeof (*dh);
> -    while (1) {
> -        dh = kmalloc(len, GFP_KERNEL);
> -        if (!dh)
> -            return -ENOMEM;
> +    do {
> +        size_t size;
> +
> +        kfree(ondisk);
>
> -        rc = rbd_req_sync_read(rbd_dev,
> -                       CEPH_NOSNAP,
> +        size = sizeof (*ondisk);
> +        size += snap_count * sizeof (struct rbd_image_snap_ondisk);
> +        size += names_size;
> +        ondisk = kmalloc(size, GFP_KERNEL);
> +        if (!ondisk)
> +            return ERR_PTR(-ENOMEM);
> +
> +        ret = rbd_req_sync_read(rbd_dev, CEPH_NOSNAP,
>                          rbd_dev->header_name,
> -                       0, len,
> -                       (char *)dh, &ver);
> -        if (rc < 0)
> -            goto out_dh;
> -
> -        rc = rbd_header_from_disk(header, dh, snap_count);
> -        if (rc < 0) {
> -            if (rc == -ENXIO)
> -                pr_warning("unrecognized header format"
> -                       " for image %s\n",
> -                       rbd_dev->image_name);
> -            goto out_dh;
> +                       0, size,
> +                       (char *) ondisk, version);
> +
> +        if (ret < 0)
> +            goto out_err;
> +        if (WARN_ON(ret < size)) {
> +            ret = -ENXIO;

Maybe -EIO, or with a pr_warning so we can distinguish between this and
the next check?

> +            goto out_err;
> +        }
> +        if (!rbd_dev_ondisk_valid(ondisk)) {
> +            ret = -ENXIO;
> +            goto out_err;
>           }
>
> -        if (snap_count == header->total_snaps)
> -            break;
> +        names_size = le64_to_cpu(ondisk->snap_names_len);
> +        want_count = snap_count;
> +        snap_count = le32_to_cpu(ondisk->snap_count);
> +    } while (snap_count != want_count);
>
> -        snap_count = header->total_snaps;
> -        len = sizeof (*dh) +
> -            snap_count * sizeof(struct rbd_image_snap_ondisk) +
> -            header->snap_names_len;
> +    return ondisk;
>
> -        rbd_header_free(header);
> -        kfree(dh);
> -    }
> -    header->obj_version = ver;
> +out_err:
> +    kfree(ondisk);
> +
> +    return ERR_PTR(ret);
> +}
> +
> +/*
> + * reload the ondisk the header
> + */
> +static int rbd_read_header(struct rbd_device *rbd_dev,
> +               struct rbd_image_header *header)
> +{
> +    struct rbd_image_header_ondisk *ondisk;
> +    u64 ver = 0;
> +    int ret;
> +
> +    ondisk = rbd_dev_v1_header_read(rbd_dev, &ver);
> +    if (IS_ERR(ondisk))
> +        return PTR_ERR(ondisk);
> +    ret = rbd_header_from_disk(header, ondisk);
> +    if (ret >= 0)
> +        header->obj_version = ver;
> +    else if (ret == -ENXIO)

This isn't returned from rbd_header_from_disk anymore, since you moved
the check into rbd_dev_v1_header_read. It seems like warnings should
be printed from rbd_dev_v1_header_read so more specific messages can be
given if you want to add them back in.

> +        pr_warning("unrecognized header format for image %s\n",
> +            rbd_dev->image_name);
> +    kfree(ondisk);
>
> -out_dh:
> -    kfree(dh);
> -    return rc;
> +    return ret;
>   }
>
>   /*


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] rbd: separate reading header from decoding it
  2012-08-08  0:58   ` Josh Durgin
@ 2012-08-08  2:16     ` Alex Elder
  2012-08-08  4:05       ` Josh Durgin
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Elder @ 2012-08-08  2:16 UTC (permalink / raw)
  To: Josh Durgin; +Cc: ceph-devel@vger.kernel.org

On 08/07/2012 05:58 PM, Josh Durgin wrote:
> On 08/06/2012 11:17 AM, Alex Elder wrote:
>> Right now rbd_read_header() both reads the header object for an rbd
>> image and decodes its contents.  It does this repeatedly if needed,
>> in order to ensure a complete and intact header is obtained.
>>
>> Separate this process into two steps--reading of the raw header
>> data (in new function, rbd_dev_v1_header_read()) and separately
>> decoding its contents (in rbd_header_from_disk()).  As a result,
>> the latter function no longer requires its allocated_snaps argument.
>>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>>   drivers/block/rbd.c |  132
>> +++++++++++++++++++++++++++++-----------------------
>>   1 file changed, 76 insertions(+), 56 deletions(-)
>>
>> Index: b/drivers/block/rbd.c
>> ===================================================================
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -514,15 +514,11 @@ static bool rbd_dev_ondisk_valid(struct
>>    * header.
>>    */
>>   static int rbd_header_from_disk(struct rbd_image_header *header,
>> -                 struct rbd_image_header_ondisk *ondisk,
>> -                 u32 allocated_snaps)
>> +                 struct rbd_image_header_ondisk *ondisk)
>>   {
>>       u32 snap_count;
>>       size_t size;
>>
>> -    if (!rbd_dev_ondisk_valid(ondisk))
>> -        return -ENXIO;
>> -
>>       memset(header, 0, sizeof (*header));
>>
>>       snap_count = le32_to_cpu(ondisk->snap_count);
>> @@ -559,15 +555,6 @@ static int rbd_header_from_disk(struct r
>>       header->comp_type = ondisk->options.comp_type;
>>       header->total_snaps = snap_count;
>>
>> -    /*
>> -     * If the number of snapshot ids provided by the caller
>> -     * doesn't match the number in the entire context there's
>> -     * no point in going further.  Caller will try again after
>> -     * getting an updated snapshot context from the server.
>> -     */
>> -    if (allocated_snaps != snap_count)
>> -        return 0;
>> -
>>       size = sizeof (struct ceph_snap_context);
>>       size += snap_count * sizeof (header->snapc->snaps[0]);
>>       header->snapc = kzalloc(size, GFP_KERNEL);
>> @@ -1630,61 +1617,94 @@ static void rbd_free_disk(struct rbd_dev
>>   }
>>
>>   /*
>> - * reload the ondisk the header
>> + * Read the complete header for the given rbd device.
>> + *
>> + * Returns a pointer to a dynamically-allocated buffer containing
>> + * the complete and validated header.  Caller can pass the address
>> + * of a variable that will be filled in with the version of the
>> + * header object at the time it was read.
>> + *
>> + * Returns a pointer-coded errno if a failure occurs.
>>    */
>> -static int rbd_read_header(struct rbd_device *rbd_dev,
>> -               struct rbd_image_header *header)
>> +static struct rbd_image_header_ondisk *
>> +rbd_dev_v1_header_read(struct rbd_device *rbd_dev, u64 *version)
>>   {
>> -    ssize_t rc;
>> -    struct rbd_image_header_ondisk *dh;
>> +    struct rbd_image_header_ondisk *ondisk = NULL;
>>       u32 snap_count = 0;
>> -    u64 ver;
>> -    size_t len;
>> +    u64 names_size = 0;
>> +    u32 want_count;
>> +    int ret;
>>
>>       /*
>> -     * First reads the fixed-size header to determine the number
>> -     * of snapshots, then re-reads it, along with all snapshot
>> -     * records as well as their stored names.
>> +     * The complete header will include an array of its 64-bit
>> +     * snapshot ids, followed by the names of those snapshots as
>> +     * a contiguous block of null-terminated strings.  Note that
>
> minor nit, but '\0' is NUL, not NULL.

Maybe "null" is either/both?  (I actually try to avoid this wording
for just this reason...  I use "...terminating '\0'...")

I'll use "NUL" instead of "null."

>> +     * the number of snapshots could change by the time we read
>> +     * it in, in which case we re-read it.
>>        */
>> -    len = sizeof (*dh);
>> -    while (1) {
>> -        dh = kmalloc(len, GFP_KERNEL);
>> -        if (!dh)
>> -            return -ENOMEM;
>> +    do {
>> +        size_t size;
>> +
>> +        kfree(ondisk);
>>
>> -        rc = rbd_req_sync_read(rbd_dev,
>> -                       CEPH_NOSNAP,
>> +        size = sizeof (*ondisk);
>> +        size += snap_count * sizeof (struct rbd_image_snap_ondisk);
>> +        size += names_size;
>> +        ondisk = kmalloc(size, GFP_KERNEL);
>> +        if (!ondisk)
>> +            return ERR_PTR(-ENOMEM);
>> +
>> +        ret = rbd_req_sync_read(rbd_dev, CEPH_NOSNAP,
>>                          rbd_dev->header_name,
>> -                       0, len,
>> -                       (char *)dh, &ver);
>> -        if (rc < 0)
>> -            goto out_dh;
>> -
>> -        rc = rbd_header_from_disk(header, dh, snap_count);
>> -        if (rc < 0) {
>> -            if (rc == -ENXIO)
>> -                pr_warning("unrecognized header format"
>> -                       " for image %s\n",
>> -                       rbd_dev->image_name);
>> -            goto out_dh;
>> +                       0, size,
>> +                       (char *) ondisk, version);
>> +
>> +        if (ret < 0)
>> +            goto out_err;
>> +        if (WARN_ON(ret < size)) {
>> +            ret = -ENXIO;
>
> Maybe -EIO, or with a pr_warning so we can distinguish between this and
> the next check?

I originally had a BUG_ON() here, but wasn't sure whether
a sync read actually could return a short-but-not-erroneous
read.  I think it could though, for a bogus and short header
object.  (And of course, this is a new check, which was not
present before.)

A short header object is invalid in somewhat the same way as a
header containing bad information, so I think the return value
should be the same.  I'll add this here:

                         pr_warning("short header read for image %s"
                                         " (want %d got %d)\n",
                                 rbd_dev->image_name, size, ret);

>
>> +            goto out_err;
>> +        }
>> +        if (!rbd_dev_ondisk_valid(ondisk)) {
>> +            ret = -ENXIO;
>> +            goto out_err;
>>           }
>>
>> -        if (snap_count == header->total_snaps)
>> -            break;
>> +        names_size = le64_to_cpu(ondisk->snap_names_len);
>> +        want_count = snap_count;
>> +        snap_count = le32_to_cpu(ondisk->snap_count);
>> +    } while (snap_count != want_count);
>>
>> -        snap_count = header->total_snaps;
>> -        len = sizeof (*dh) +
>> -            snap_count * sizeof(struct rbd_image_snap_ondisk) +
>> -            header->snap_names_len;
>> +    return ondisk;
>>
>> -        rbd_header_free(header);
>> -        kfree(dh);
>> -    }
>> -    header->obj_version = ver;
>> +out_err:
>> +    kfree(ondisk);
>> +
>> +    return ERR_PTR(ret);
>> +}
>> +
>> +/*
>> + * reload the ondisk the header
>> + */
>> +static int rbd_read_header(struct rbd_device *rbd_dev,
>> +               struct rbd_image_header *header)
>> +{
>> +    struct rbd_image_header_ondisk *ondisk;
>> +    u64 ver = 0;
>> +    int ret;
>> +
>> +    ondisk = rbd_dev_v1_header_read(rbd_dev, &ver);
>> +    if (IS_ERR(ondisk))
>> +        return PTR_ERR(ondisk);
>> +    ret = rbd_header_from_disk(header, ondisk);
>> +    if (ret >= 0)
>> +        header->obj_version = ver;
>> +    else if (ret == -ENXIO)
>
> This isn't returned from rbd_header_from_disk anymore, since you moved
> the check into rbd_dev_v1_header_read. It seems like warnings should
> be printed from rbd_dev_v1_header_read so more specific messages can be
> given if you want to add them back in.

I hadn't even noticed that.  rbd_header_from_disk() only returns 0
or -ENOMEM now.

I'll move that "unrecognized header format" warning into
rbd_dev_v1_header_read(), and will reword it "invalid header
format..."

The other errors that could occur are -ENOMEM, or the
result code from the read operation if it was an error.
Unless someone insists it's necessary, I will not be
adding warnings for them.

Do you want me to re-post my updated patch, or is my plan
described here enough to get your signoff?

					-Alex

>
>> +        pr_warning("unrecognized header format for image %s\n",
>> +            rbd_dev->image_name);
>> +    kfree(ondisk);
>>
>> -out_dh:
>> -    kfree(dh);
>> -    return rc;
>> +    return ret;
>>   }
>>
>>   /*
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] rbd: separate reading header from decoding it
  2012-08-08  2:16     ` Alex Elder
@ 2012-08-08  4:05       ` Josh Durgin
  0 siblings, 0 replies; 11+ messages in thread
From: Josh Durgin @ 2012-08-08  4:05 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel@vger.kernel.org

On 08/07/2012 07:16 PM, Alex Elder wrote:
> On 08/07/2012 05:58 PM, Josh Durgin wrote:
>> On 08/06/2012 11:17 AM, Alex Elder wrote:
>>> Right now rbd_read_header() both reads the header object for an rbd
>>> image and decodes its contents.  It does this repeatedly if needed,
>>> in order to ensure a complete and intact header is obtained.
>>>
>>> Separate this process into two steps--reading of the raw header
>>> data (in new function, rbd_dev_v1_header_read()) and separately
>>> decoding its contents (in rbd_header_from_disk()).  As a result,
>>> the latter function no longer requires its allocated_snaps argument.
>>>
>>> Signed-off-by: Alex Elder <elder@inktank.com>
>>> ---
>>>   drivers/block/rbd.c |  132
>>> +++++++++++++++++++++++++++++-----------------------
>>>   1 file changed, 76 insertions(+), 56 deletions(-)
>>>
>>> Index: b/drivers/block/rbd.c
>>> ===================================================================
>>> --- a/drivers/block/rbd.c
>>> +++ b/drivers/block/rbd.c
>>> @@ -514,15 +514,11 @@ static bool rbd_dev_ondisk_valid(struct
>>>    * header.
>>>    */
>>>   static int rbd_header_from_disk(struct rbd_image_header *header,
>>> -                 struct rbd_image_header_ondisk *ondisk,
>>> -                 u32 allocated_snaps)
>>> +                 struct rbd_image_header_ondisk *ondisk)
>>>   {
>>>       u32 snap_count;
>>>       size_t size;
>>>
>>> -    if (!rbd_dev_ondisk_valid(ondisk))
>>> -        return -ENXIO;
>>> -
>>>       memset(header, 0, sizeof (*header));
>>>
>>>       snap_count = le32_to_cpu(ondisk->snap_count);
>>> @@ -559,15 +555,6 @@ static int rbd_header_from_disk(struct r
>>>       header->comp_type = ondisk->options.comp_type;
>>>       header->total_snaps = snap_count;
>>>
>>> -    /*
>>> -     * If the number of snapshot ids provided by the caller
>>> -     * doesn't match the number in the entire context there's
>>> -     * no point in going further.  Caller will try again after
>>> -     * getting an updated snapshot context from the server.
>>> -     */
>>> -    if (allocated_snaps != snap_count)
>>> -        return 0;
>>> -
>>>       size = sizeof (struct ceph_snap_context);
>>>       size += snap_count * sizeof (header->snapc->snaps[0]);
>>>       header->snapc = kzalloc(size, GFP_KERNEL);
>>> @@ -1630,61 +1617,94 @@ static void rbd_free_disk(struct rbd_dev
>>>   }
>>>
>>>   /*
>>> - * reload the ondisk the header
>>> + * Read the complete header for the given rbd device.
>>> + *
>>> + * Returns a pointer to a dynamically-allocated buffer containing
>>> + * the complete and validated header.  Caller can pass the address
>>> + * of a variable that will be filled in with the version of the
>>> + * header object at the time it was read.
>>> + *
>>> + * Returns a pointer-coded errno if a failure occurs.
>>>    */
>>> -static int rbd_read_header(struct rbd_device *rbd_dev,
>>> -               struct rbd_image_header *header)
>>> +static struct rbd_image_header_ondisk *
>>> +rbd_dev_v1_header_read(struct rbd_device *rbd_dev, u64 *version)
>>>   {
>>> -    ssize_t rc;
>>> -    struct rbd_image_header_ondisk *dh;
>>> +    struct rbd_image_header_ondisk *ondisk = NULL;
>>>       u32 snap_count = 0;
>>> -    u64 ver;
>>> -    size_t len;
>>> +    u64 names_size = 0;
>>> +    u32 want_count;
>>> +    int ret;
>>>
>>>       /*
>>> -     * First reads the fixed-size header to determine the number
>>> -     * of snapshots, then re-reads it, along with all snapshot
>>> -     * records as well as their stored names.
>>> +     * The complete header will include an array of its 64-bit
>>> +     * snapshot ids, followed by the names of those snapshots as
>>> +     * a contiguous block of null-terminated strings.  Note that
>>
>> minor nit, but '\0' is NUL, not NULL.
>
> Maybe "null" is either/both?  (I actually try to avoid this wording
> for just this reason...  I use "...terminating '\0'...")
>
> I'll use "NUL" instead of "null."
>
>>> +     * the number of snapshots could change by the time we read
>>> +     * it in, in which case we re-read it.
>>>        */
>>> -    len = sizeof (*dh);
>>> -    while (1) {
>>> -        dh = kmalloc(len, GFP_KERNEL);
>>> -        if (!dh)
>>> -            return -ENOMEM;
>>> +    do {
>>> +        size_t size;
>>> +
>>> +        kfree(ondisk);
>>>
>>> -        rc = rbd_req_sync_read(rbd_dev,
>>> -                       CEPH_NOSNAP,
>>> +        size = sizeof (*ondisk);
>>> +        size += snap_count * sizeof (struct rbd_image_snap_ondisk);
>>> +        size += names_size;
>>> +        ondisk = kmalloc(size, GFP_KERNEL);
>>> +        if (!ondisk)
>>> +            return ERR_PTR(-ENOMEM);
>>> +
>>> +        ret = rbd_req_sync_read(rbd_dev, CEPH_NOSNAP,
>>>                          rbd_dev->header_name,
>>> -                       0, len,
>>> -                       (char *)dh, &ver);
>>> -        if (rc < 0)
>>> -            goto out_dh;
>>> -
>>> -        rc = rbd_header_from_disk(header, dh, snap_count);
>>> -        if (rc < 0) {
>>> -            if (rc == -ENXIO)
>>> -                pr_warning("unrecognized header format"
>>> -                       " for image %s\n",
>>> -                       rbd_dev->image_name);
>>> -            goto out_dh;
>>> +                       0, size,
>>> +                       (char *) ondisk, version);
>>> +
>>> +        if (ret < 0)
>>> +            goto out_err;
>>> +        if (WARN_ON(ret < size)) {
>>> +            ret = -ENXIO;
>>
>> Maybe -EIO, or with a pr_warning so we can distinguish between this and
>> the next check?
>
> I originally had a BUG_ON() here, but wasn't sure whether
> a sync read actually could return a short-but-not-erroneous
> read.  I think it could though, for a bogus and short header
> object.  (And of course, this is a new check, which was not
> present before.)

It can be a short read just as you described.

> A short header object is invalid in somewhat the same way as a
> header containing bad information, so I think the return value
> should be the same.  I'll add this here:
>
>                          pr_warning("short header read for image %s"
>                                          " (want %d got %d)\n",
>                                  rbd_dev->image_name, size, ret);
>

ok

>>
>>> +            goto out_err;
>>> +        }
>>> +        if (!rbd_dev_ondisk_valid(ondisk)) {
>>> +            ret = -ENXIO;
>>> +            goto out_err;
>>>           }
>>>
>>> -        if (snap_count == header->total_snaps)
>>> -            break;
>>> +        names_size = le64_to_cpu(ondisk->snap_names_len);
>>> +        want_count = snap_count;
>>> +        snap_count = le32_to_cpu(ondisk->snap_count);
>>> +    } while (snap_count != want_count);
>>>
>>> -        snap_count = header->total_snaps;
>>> -        len = sizeof (*dh) +
>>> -            snap_count * sizeof(struct rbd_image_snap_ondisk) +
>>> -            header->snap_names_len;
>>> +    return ondisk;
>>>
>>> -        rbd_header_free(header);
>>> -        kfree(dh);
>>> -    }
>>> -    header->obj_version = ver;
>>> +out_err:
>>> +    kfree(ondisk);
>>> +
>>> +    return ERR_PTR(ret);
>>> +}
>>> +
>>> +/*
>>> + * reload the ondisk the header
>>> + */
>>> +static int rbd_read_header(struct rbd_device *rbd_dev,
>>> +               struct rbd_image_header *header)
>>> +{
>>> +    struct rbd_image_header_ondisk *ondisk;
>>> +    u64 ver = 0;
>>> +    int ret;
>>> +
>>> +    ondisk = rbd_dev_v1_header_read(rbd_dev, &ver);
>>> +    if (IS_ERR(ondisk))
>>> +        return PTR_ERR(ondisk);
>>> +    ret = rbd_header_from_disk(header, ondisk);
>>> +    if (ret >= 0)
>>> +        header->obj_version = ver;
>>> +    else if (ret == -ENXIO)
>>
>> This isn't returned from rbd_header_from_disk anymore, since you moved
>> the check into rbd_dev_v1_header_read. It seems like warnings should
>> be printed from rbd_dev_v1_header_read so more specific messages can be
>> given if you want to add them back in.
>
> I hadn't even noticed that.  rbd_header_from_disk() only returns 0
> or -ENOMEM now.
>
> I'll move that "unrecognized header format" warning into
> rbd_dev_v1_header_read(), and will reword it "invalid header
> format..."
>
> The other errors that could occur are -ENOMEM, or the
> result code from the read operation if it was an error.
> Unless someone insists it's necessary, I will not be
> adding warnings for them.
>
> Do you want me to re-post my updated patch, or is my plan
> described here enough to get your signoff?

Your plan is good enough.
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

Josh

>
>                      -Alex
>
>>
>>> +        pr_warning("unrecognized header format for image %s\n",
>>> +            rbd_dev->image_name);
>>> +    kfree(ondisk);
>>>
>>> -out_dh:
>>> -    kfree(dh);
>>> -    return rc;
>>> +    return ret;
>>>   }
>>>
>>>   /*
>>
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-08-08  4:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-06 18:15 [PATCH 0/4] rbd: separate reading from interpreting rbd header Alex Elder
2012-08-06 18:17 ` [PATCH 1/4] rbd: rearrange rbd_header_from_disk() Alex Elder
2012-08-08  0:29   ` Josh Durgin
2012-08-06 18:17 ` [PATCH 2/4] rbd: return earlier in rbd_header_from_disk() Alex Elder
2012-08-08  0:29   ` Josh Durgin
2012-08-06 18:17 ` [PATCH 3/4] rbd: expand rbd_dev_ondisk_valid() checks Alex Elder
2012-08-08  0:31   ` Josh Durgin
2012-08-06 18:17 ` [PATCH 4/4] rbd: separate reading header from decoding it Alex Elder
2012-08-08  0:58   ` Josh Durgin
2012-08-08  2:16     ` Alex Elder
2012-08-08  4:05       ` Josh Durgin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.