All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH REPOST 0/3] rbd: no need for file mapping calculation
@ 2013-01-04 14:51 Alex Elder
  2013-01-04 14:52 ` [PATCH REPOST 1/3] rbd: pull in ceph_calc_raw_layout() Alex Elder
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Alex Elder @ 2013-01-04 14:51 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Currently every osd request submitted by the rbd code undergoes a
file mapping operation, which is common with what the ceph file system
uses.  But some analysis shows that there is no need to do this for
rbd, because it already takes care of its own blocking of image data
into distinct objects.  Removing this simplifies things.  I especially
think removing this improves things conceptually, removing a complex
mapping operation from the I/O path.

					-Alex

[PATCH REPOST 1/3] rbd: pull in ceph_calc_raw_layout()
[PATCH REPOST 2/3] rbd: open code rbd_calc_raw_layout()
[PATCH REPOST 3/3] rbd: don't bother calculating file mapping

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH REPOST 1/3] rbd: pull in ceph_calc_raw_layout()
  2013-01-04 14:51 [PATCH REPOST 0/3] rbd: no need for file mapping calculation Alex Elder
@ 2013-01-04 14:52 ` Alex Elder
  2013-01-04 14:53 ` [PATCH REPOST 2/3] rbd: open code rbd_calc_raw_layout() Alex Elder
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Alex Elder @ 2013-01-04 14:52 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

This is the first in a series of patches aimed at eliminating
the use of ceph_calc_raw_layout() by rbd.

It simply pulls in a copy of that function and renames it
rbd_calc_raw_layout().

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index a57d6e9..46b52a4 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1101,6 +1101,40 @@ static void rbd_layout_init(struct
ceph_file_layout *layout, u64 pool_id)
 	layout->fl_pg_pool = cpu_to_le32((u32) pool_id);
 }

+int rbd_calc_raw_layout(struct ceph_file_layout *layout,
+			u64 off, u64 *plen, u64 *bno,
+			struct ceph_osd_request *req,
+			struct ceph_osd_req_op *op)
+{
+	u64 orig_len = *plen;
+	u64 objoff, objlen;    /* extent in object */
+	int r;
+
+	/* object extent? */
+	r = ceph_calc_file_object_mapping(layout, off, orig_len, bno,
+					  &objoff, &objlen);
+	if (r < 0)
+		return r;
+	if (objlen < orig_len) {
+		*plen = objlen;
+		dout(" skipping last %llu, final file extent %llu~%llu\n",
+		     orig_len - *plen, off, *plen);
+	}
+
+	if (op->op == CEPH_OSD_OP_READ || op->op == CEPH_OSD_OP_WRITE) {
+		op->extent.offset = objoff;
+		op->extent.length = objlen;
+	}
+	req->r_num_pages = calc_pages_for(off, *plen);
+	req->r_page_alignment = off & ~PAGE_MASK;
+	if (op->op == CEPH_OSD_OP_WRITE)
+		op->payload_len = *plen;
+
+	dout("calc_layout bno=%llx %llu~%llu (%d pages)\n",
+	     *bno, objoff, objlen, req->r_num_pages);
+	return 0;
+}
+
 /*
  * Send ceph osd request
  */
@@ -1167,7 +1201,7 @@ static int rbd_do_request(struct request *rq,
 	osd_req->r_oid_len = strlen(osd_req->r_oid);

 	rbd_layout_init(&osd_req->r_file_layout, rbd_dev->spec->pool_id);
-	ret = ceph_calc_raw_layout(&osd_req->r_file_layout,
+	ret = rbd_calc_raw_layout(&osd_req->r_file_layout,
 				ofs, &len, &bno, osd_req, op);
 	rbd_assert(ret == 0);

-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH REPOST 2/3] rbd: open code rbd_calc_raw_layout()
  2013-01-04 14:51 [PATCH REPOST 0/3] rbd: no need for file mapping calculation Alex Elder
  2013-01-04 14:52 ` [PATCH REPOST 1/3] rbd: pull in ceph_calc_raw_layout() Alex Elder
@ 2013-01-04 14:53 ` Alex Elder
  2013-01-04 14:53 ` [PATCH REPOST 3/3] rbd: don't bother calculating file mapping Alex Elder
  2013-01-17  2:34 ` [PATCH REPOST 0/3] rbd: no need for file mapping calculation Josh Durgin
  3 siblings, 0 replies; 5+ messages in thread
From: Alex Elder @ 2013-01-04 14:53 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

This patch gets rid of rbd_calc_raw_layout() by simply open coding
it in its one caller.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   55
+++++++++++++++++----------------------------------
 1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 46b52a4..e6db737 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1032,7 +1032,7 @@ static struct ceph_osd_req_op
*rbd_create_rw_op(int opcode, u32 payload_len)
 		return NULL;
 	/*
 	 * op extent offset and length will be set later on
-	 * in calc_raw_layout()
+	 * after ceph_calc_file_object_mapping().
 	 */
 	op->op = opcode;
 	op->payload_len = payload_len;
@@ -1101,40 +1101,6 @@ static void rbd_layout_init(struct
ceph_file_layout *layout, u64 pool_id)
 	layout->fl_pg_pool = cpu_to_le32((u32) pool_id);
 }

-int rbd_calc_raw_layout(struct ceph_file_layout *layout,
-			u64 off, u64 *plen, u64 *bno,
-			struct ceph_osd_request *req,
-			struct ceph_osd_req_op *op)
-{
-	u64 orig_len = *plen;
-	u64 objoff, objlen;    /* extent in object */
-	int r;
-
-	/* object extent? */
-	r = ceph_calc_file_object_mapping(layout, off, orig_len, bno,
-					  &objoff, &objlen);
-	if (r < 0)
-		return r;
-	if (objlen < orig_len) {
-		*plen = objlen;
-		dout(" skipping last %llu, final file extent %llu~%llu\n",
-		     orig_len - *plen, off, *plen);
-	}
-
-	if (op->op == CEPH_OSD_OP_READ || op->op == CEPH_OSD_OP_WRITE) {
-		op->extent.offset = objoff;
-		op->extent.length = objlen;
-	}
-	req->r_num_pages = calc_pages_for(off, *plen);
-	req->r_page_alignment = off & ~PAGE_MASK;
-	if (op->op == CEPH_OSD_OP_WRITE)
-		op->payload_len = *plen;
-
-	dout("calc_layout bno=%llx %llu~%llu (%d pages)\n",
-	     *bno, objoff, objlen, req->r_num_pages);
-	return 0;
-}
-
 /*
  * Send ceph osd request
  */
@@ -1158,6 +1124,8 @@ static int rbd_do_request(struct request *rq,
 	struct ceph_osd_request *osd_req;
 	int ret;
 	u64 bno;
+	u64 obj_off = 0;
+	u64 obj_len = 0;
 	struct timespec mtime = CURRENT_TIME;
 	struct rbd_request *rbd_req;
 	struct ceph_osd_client *osdc;
@@ -1201,9 +1169,22 @@ static int rbd_do_request(struct request *rq,
 	osd_req->r_oid_len = strlen(osd_req->r_oid);

 	rbd_layout_init(&osd_req->r_file_layout, rbd_dev->spec->pool_id);
-	ret = rbd_calc_raw_layout(&osd_req->r_file_layout,
-				ofs, &len, &bno, osd_req, op);
+	ret = ceph_calc_file_object_mapping(&osd_req->r_file_layout, ofs, len,
+						&bno, &obj_off, &obj_len);
 	rbd_assert(ret == 0);
+	if (obj_len < len) {
+		dout(" skipping last %llu, final file extent %llu~%llu\n",
+		     len - obj_len, ofs, obj_len);
+		len = obj_len;
+	}
+	if (op->op == CEPH_OSD_OP_READ || op->op == CEPH_OSD_OP_WRITE) {
+		op->extent.offset = obj_off;
+		op->extent.length = obj_len;
+		if (op->op == CEPH_OSD_OP_WRITE)
+			op->payload_len = obj_len;
+	}
+	osd_req->r_num_pages = calc_pages_for(ofs, len);
+	osd_req->r_page_alignment = ofs & ~PAGE_MASK;

 	ceph_osdc_build_request(osd_req, ofs, len, 1, op,
 				snapc, snapid, &mtime);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH REPOST 3/3] rbd: don't bother calculating file mapping
  2013-01-04 14:51 [PATCH REPOST 0/3] rbd: no need for file mapping calculation Alex Elder
  2013-01-04 14:52 ` [PATCH REPOST 1/3] rbd: pull in ceph_calc_raw_layout() Alex Elder
  2013-01-04 14:53 ` [PATCH REPOST 2/3] rbd: open code rbd_calc_raw_layout() Alex Elder
@ 2013-01-04 14:53 ` Alex Elder
  2013-01-17  2:34 ` [PATCH REPOST 0/3] rbd: no need for file mapping calculation Josh Durgin
  3 siblings, 0 replies; 5+ messages in thread
From: Alex Elder @ 2013-01-04 14:53 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

When rbd_do_request() has a request to process it initializes a ceph
file layout structure and uses it to compute offsets and limits for
the range of the request using ceph_calc_file_object_mapping().

The layout used is fixed, and is based on RBD_MAX_OBJ_ORDER (30).
It sets the layout's object size and stripe unit to be 1 GB (2^30),
and sets the stripe count to be 1.

The job of ceph_calc_file_object_mapping() is to determine which
of a sequence of objects will contain data covered by range, and
within that object, at what offset the range starts.  It also
truncates the length of the range at the end of the selected object
if necessary.

This is needed for ceph fs, but for rbd it really serves no purpose.
It does its own blocking of images into objects, echo of which is
(1 << obj_order) in size, and as a result it ignores the "bno"
value returned by ceph_calc_file_object_mapping().  In addition,
by the point a request has reached this function, it is already
destined for a single rbd object, and its length will not exceed
that object's extent.  Because of this, and because the mapping will
result in blocking up the range using an integer multiple of the
image's object order, ceph_calc_file_object_mapping() will never
change the offset or length values defined by the request.

In other words, this call is a big no-op for rbd data requests.

There is one exception.  We read the header object using this
function, and in that case we will not have already limited the
request size.  However, the header is a single object (not a file or
rbd image), and should not be broken into pieces anyway.  So in fact
we should *not* be calling ceph_calc_file_object_mapping() when
operating on the header object.

So...

Don't call ceph_calc_file_object_mapping() in rbd_do_request(),
because useless for image data and incorrect to do sofor the image
header.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index e6db737..072608e 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1123,9 +1123,6 @@ static int rbd_do_request(struct request *rq,
 {
 	struct ceph_osd_request *osd_req;
 	int ret;
-	u64 bno;
-	u64 obj_off = 0;
-	u64 obj_len = 0;
 	struct timespec mtime = CURRENT_TIME;
 	struct rbd_request *rbd_req;
 	struct ceph_osd_client *osdc;
@@ -1169,19 +1166,12 @@ static int rbd_do_request(struct request *rq,
 	osd_req->r_oid_len = strlen(osd_req->r_oid);

 	rbd_layout_init(&osd_req->r_file_layout, rbd_dev->spec->pool_id);
-	ret = ceph_calc_file_object_mapping(&osd_req->r_file_layout, ofs, len,
-						&bno, &obj_off, &obj_len);
-	rbd_assert(ret == 0);
-	if (obj_len < len) {
-		dout(" skipping last %llu, final file extent %llu~%llu\n",
-		     len - obj_len, ofs, obj_len);
-		len = obj_len;
-	}
+
 	if (op->op == CEPH_OSD_OP_READ || op->op == CEPH_OSD_OP_WRITE) {
-		op->extent.offset = obj_off;
-		op->extent.length = obj_len;
+		op->extent.offset = ofs;
+		op->extent.length = len;
 		if (op->op == CEPH_OSD_OP_WRITE)
-			op->payload_len = obj_len;
+			op->payload_len = len;
 	}
 	osd_req->r_num_pages = calc_pages_for(ofs, len);
 	osd_req->r_page_alignment = ofs & ~PAGE_MASK;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH REPOST 0/3] rbd: no need for file mapping calculation
  2013-01-04 14:51 [PATCH REPOST 0/3] rbd: no need for file mapping calculation Alex Elder
                   ` (2 preceding siblings ...)
  2013-01-04 14:53 ` [PATCH REPOST 3/3] rbd: don't bother calculating file mapping Alex Elder
@ 2013-01-17  2:34 ` Josh Durgin
  3 siblings, 0 replies; 5+ messages in thread
From: Josh Durgin @ 2013-01-17  2:34 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel@vger.kernel.org

On 01/04/2013 06:51 AM, Alex Elder wrote:
> Currently every osd request submitted by the rbd code undergoes a
> file mapping operation, which is common with what the ceph file system
> uses.  But some analysis shows that there is no need to do this for
> rbd, because it already takes care of its own blocking of image data
> into distinct objects.  Removing this simplifies things.  I especially
> think removing this improves things conceptually, removing a complex
> mapping operation from the I/O path.
>
> 					-Alex
>
> [PATCH REPOST 1/3] rbd: pull in ceph_calc_raw_layout()
> [PATCH REPOST 2/3] rbd: open code rbd_calc_raw_layout()
> [PATCH REPOST 3/3] rbd: don't bother calculating file mapping

We'll want to use similar methods later for fancier rbd striping with 
format 2 images, but that'll take more restructuring later anyway.
This is fine for now.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-01-17  2:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-04 14:51 [PATCH REPOST 0/3] rbd: no need for file mapping calculation Alex Elder
2013-01-04 14:52 ` [PATCH REPOST 1/3] rbd: pull in ceph_calc_raw_layout() Alex Elder
2013-01-04 14:53 ` [PATCH REPOST 2/3] rbd: open code rbd_calc_raw_layout() Alex Elder
2013-01-04 14:53 ` [PATCH REPOST 3/3] rbd: don't bother calculating file mapping Alex Elder
2013-01-17  2:34 ` [PATCH REPOST 0/3] rbd: no need for file mapping calculation Josh Durgin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.