* [PATCH 1/7] libceph: isolate message page field manipulation
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
@ 2013-03-09 16:02 ` Alex Elder
2013-03-09 16:02 ` [PATCH 2/7] libceph: set page info with byte length Alex Elder
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:02 UTC (permalink / raw)
To: ceph-devel
Define a function ceph_msg_data_set_pages(), which more clearly
abstracts the assignment page-related fields for data in a ceph
message structure. Use this new function in the osd client and mds
client.
Ideally, these fields would never be set more than once (with
BUG_ON() calls to guarantee that). At the moment though the osd
client sets these every time it receives a message, and in the event
of a communication problem this can happen more than once. (This
will be resolved shortly, but setting up these helpers first makes
it all a bit easier to work with.)
Rearrange the field order in a ceph_msg structure to group those
that are used to define the possible data payloads.
This partially resolves:
http://tracker.ceph.com/issues/4263
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
---
fs/ceph/mds_client.c | 4 ++--
include/linux/ceph/messenger.h | 22 +++++++++++++---------
net/ceph/messenger.c | 11 +++++++++++
net/ceph/osd_client.c | 10 ++++------
4 files changed, 30 insertions(+), 17 deletions(-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 3549078..2b2991c 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1721,8 +1721,8 @@ static struct ceph_msg
*create_request_message(struct ceph_mds_client *mdsc,
msg->front.iov_len = p - msg->front.iov_base;
msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
- msg->pages = req->r_pages;
- msg->page_count = req->r_num_pages;
+ ceph_msg_data_set_pages(msg, req->r_pages, req->r_num_pages, 0);
+
msg->hdr.data_len = cpu_to_le32(req->r_data_len);
msg->hdr.data_off = cpu_to_le16(0);
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 6c11874..aa463b9 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -74,21 +74,22 @@ struct ceph_msg {
struct ceph_msg_footer footer; /* footer */
struct kvec front; /* unaligned blobs of message */
struct ceph_buffer *middle;
- struct page **pages; /* data payload. NOT OWNER. */
- unsigned page_count; /* size of page array */
- unsigned page_alignment; /* io offset in first page */
- struct ceph_pagelist *pagelist; /* instead of pages */
-
- struct ceph_connection *con;
- struct list_head list_head;
- struct kref kref;
+ struct page **pages; /* data payload. NOT OWNER. */
+ unsigned int page_alignment; /* io offset in first page */
+ unsigned int page_count; /* # pages in array or list */
+ struct ceph_pagelist *pagelist; /* instead of pages */
#ifdef CONFIG_BLOCK
+ unsigned int bio_seg; /* current bio segment */
struct bio *bio; /* instead of pages/pagelist */
struct bio *bio_iter; /* bio iterator */
- unsigned int bio_seg; /* current bio segment */
#endif /* CONFIG_BLOCK */
struct ceph_pagelist *trail; /* the trailing part of the data */
+
+ struct ceph_connection *con;
+ struct list_head list_head; /* links for connection lists */
+
+ struct kref kref;
bool front_is_vmalloc;
bool more_to_follow;
bool needs_out_seq;
@@ -218,6 +219,9 @@ extern void ceph_msg_revoke_incoming(struct ceph_msg
*msg);
extern void ceph_con_keepalive(struct ceph_connection *con);
+extern void ceph_msg_data_set_pages(struct ceph_msg *msg, struct page
**pages,
+ unsigned int page_count, size_t alignment);
+
extern struct ceph_msg *ceph_msg_new(int type, int front_len, gfp_t flags,
bool can_fail);
extern void ceph_msg_kfree(struct ceph_msg *m);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index ce1669f..cec39cb 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2689,6 +2689,17 @@ void ceph_con_keepalive(struct ceph_connection *con)
}
EXPORT_SYMBOL(ceph_con_keepalive);
+void ceph_msg_data_set_pages(struct ceph_msg *msg, struct page **pages,
+ unsigned int page_count, size_t alignment)
+{
+ /* BUG_ON(msg->pages); */
+ /* BUG_ON(msg->page_count); */
+
+ msg->pages = pages;
+ msg->page_count = page_count;
+ msg->page_alignment = alignment & ~PAGE_MASK;
+}
+EXPORT_SYMBOL(ceph_msg_data_set_pages);
/*
* construct a new message with given type, size
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 202af14..a09d571 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1760,11 +1760,10 @@ int ceph_osdc_start_request(struct
ceph_osd_client *osdc,
if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
unsigned int page_count;
- req->r_request->pages = osd_data->pages;
page_count = calc_pages_for((u64)osd_data->alignment,
(u64)osd_data->length);
- req->r_request->page_count = page_count;
- req->r_request->page_alignment = osd_data->alignment;
+ ceph_msg_data_set_pages(req->r_request, osd_data->pages,
+ page_count, osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
req->r_request->bio = osd_data->bio;
@@ -2135,9 +2134,8 @@ static struct ceph_msg *get_reply(struct
ceph_connection *con,
}
page_count = calc_pages_for((u64)osd_data->alignment,
(u64)osd_data->length);
- m->pages = osd_data->pages;
- m->page_count = page_count;
- m->page_alignment = osd_data->alignment;
+ ceph_msg_data_set_pages(m, osd_data->pages,
+ osd_data->num_pages, osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
m->bio = osd_data->bio;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 2/7] libceph: set page info with byte length
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
2013-03-09 16:02 ` [PATCH 1/7] libceph: isolate message page field manipulation Alex Elder
@ 2013-03-09 16:02 ` Alex Elder
2013-03-09 16:02 ` [PATCH 3/7] libceph: isolate other message data fields Alex Elder
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:02 UTC (permalink / raw)
To: ceph-devel
When setting page array information for message data, provide the
byte length rather than the page count ceph_msg_data_set_pages().
Signed-off-by: Alex Elder <elder@inktank.com>
---
fs/ceph/mds_client.c | 2 +-
include/linux/ceph/messenger.h | 2 +-
net/ceph/messenger.c | 4 ++--
net/ceph/osd_client.c | 14 ++++----------
4 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 2b2991c..bfc28ee 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1721,7 +1721,7 @@ static struct ceph_msg
*create_request_message(struct ceph_mds_client *mdsc,
msg->front.iov_len = p - msg->front.iov_base;
msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
- ceph_msg_data_set_pages(msg, req->r_pages, req->r_num_pages, 0);
+ ceph_msg_data_set_pages(msg, req->r_pages, req->r_data_len, 0);
msg->hdr.data_len = cpu_to_le32(req->r_data_len);
msg->hdr.data_off = cpu_to_le16(0);
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index aa463b9..e6d20e8 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -220,7 +220,7 @@ extern void ceph_msg_revoke_incoming(struct ceph_msg
*msg);
extern void ceph_con_keepalive(struct ceph_connection *con);
extern void ceph_msg_data_set_pages(struct ceph_msg *msg, struct page
**pages,
- unsigned int page_count, size_t alignment);
+ size_t length, size_t alignment);
extern struct ceph_msg *ceph_msg_new(int type, int front_len, gfp_t flags,
bool can_fail);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index cec39cb..fc59fcc 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2690,13 +2690,13 @@ void ceph_con_keepalive(struct ceph_connection *con)
EXPORT_SYMBOL(ceph_con_keepalive);
void ceph_msg_data_set_pages(struct ceph_msg *msg, struct page **pages,
- unsigned int page_count, size_t alignment)
+ size_t length, size_t alignment)
{
/* BUG_ON(msg->pages); */
/* BUG_ON(msg->page_count); */
msg->pages = pages;
- msg->page_count = page_count;
+ msg->page_count = calc_pages_for((u64)alignment, (u64)length);
msg->page_alignment = alignment & ~PAGE_MASK;
}
EXPORT_SYMBOL(ceph_msg_data_set_pages);
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index a09d571..f29beda 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1758,12 +1758,9 @@ int ceph_osdc_start_request(struct
ceph_osd_client *osdc,
osd_data = &req->r_data_out;
if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
- unsigned int page_count;
-
- page_count = calc_pages_for((u64)osd_data->alignment,
- (u64)osd_data->length);
+ BUG_ON(osd_data->length > (u64) SIZE_MAX);
ceph_msg_data_set_pages(req->r_request, osd_data->pages,
- page_count, osd_data->alignment);
+ osd_data->length, osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
req->r_request->bio = osd_data->bio;
@@ -2119,8 +2116,6 @@ static struct ceph_msg *get_reply(struct
ceph_connection *con,
struct ceph_osd_data *osd_data = &req->r_data_in;
if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
- unsigned int page_count;
-
if (osd_data->pages &&
unlikely(osd_data->length < data_len)) {
@@ -2132,10 +2127,9 @@ static struct ceph_msg *get_reply(struct
ceph_connection *con,
m = NULL;
goto out;
}
- page_count = calc_pages_for((u64)osd_data->alignment,
- (u64)osd_data->length);
+ BUG_ON(osd_data->length > (u64) SIZE_MAX);
ceph_msg_data_set_pages(m, osd_data->pages,
- osd_data->num_pages, osd_data->alignment);
+ osd_data->length, osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
m->bio = osd_data->bio;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 3/7] libceph: isolate other message data fields
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
2013-03-09 16:02 ` [PATCH 1/7] libceph: isolate message page field manipulation Alex Elder
2013-03-09 16:02 ` [PATCH 2/7] libceph: set page info with byte length Alex Elder
@ 2013-03-09 16:02 ` Alex Elder
2013-03-09 16:03 ` [PATCH 4/7] ceph: only set message data pointers if non-empty Alex Elder
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:02 UTC (permalink / raw)
To: ceph-devel
Define ceph_msg_data_set_pagelist(), ceph_msg_data_set_bio(), and
ceph_msg_data_set_trail() to clearly abstract the assignment the
remaining data-related fields in a ceph message structure. Use the
new functions in the osd client and mds client.
This partially resolves:
http://tracker.ceph.com/issues/4263
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
---
fs/ceph/mds_client.c | 2 +-
include/linux/ceph/messenger.h | 5 +++++
net/ceph/messenger.c | 28 ++++++++++++++++++++++++++++
net/ceph/osd_client.c | 6 +++---
4 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index bfc28ee..ae9438f 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2603,7 +2603,7 @@ static void send_mds_reconnect(struct
ceph_mds_client *mdsc,
goto fail;
}
- reply->pagelist = pagelist;
+ ceph_msg_data_set_pagelist(reply, pagelist);
if (recon_state.flock)
reply->hdr.version = cpu_to_le16(2);
reply->hdr.data_len = cpu_to_le32(pagelist->length);
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index e6d20e8..9d9be46 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -221,6 +221,11 @@ extern void ceph_con_keepalive(struct
ceph_connection *con);
extern void ceph_msg_data_set_pages(struct ceph_msg *msg, struct page
**pages,
size_t length, size_t alignment);
+extern void ceph_msg_data_set_pagelist(struct ceph_msg *msg,
+ struct ceph_pagelist *pagelist);
+extern void ceph_msg_data_set_bio(struct ceph_msg *msg, struct bio *bio);
+extern void ceph_msg_data_set_trail(struct ceph_msg *msg,
+ struct ceph_pagelist *trail);
extern struct ceph_msg *ceph_msg_new(int type, int front_len, gfp_t flags,
bool can_fail);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index fc59fcc..d118353 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2701,6 +2701,34 @@ void ceph_msg_data_set_pages(struct ceph_msg
*msg, struct page **pages,
}
EXPORT_SYMBOL(ceph_msg_data_set_pages);
+void ceph_msg_data_set_pagelist(struct ceph_msg *msg,
+ struct ceph_pagelist *pagelist)
+{
+ /* BUG_ON(!pagelist); */
+ /* BUG_ON(msg->pagelist); */
+
+ msg->pagelist = pagelist;
+}
+EXPORT_SYMBOL(ceph_msg_data_set_pagelist);
+
+void ceph_msg_data_set_bio(struct ceph_msg *msg, struct bio *bio)
+{
+ /* BUG_ON(!bio); */
+ /* BUG_ON(msg->bio); */
+
+ msg->bio = bio;
+}
+EXPORT_SYMBOL(ceph_msg_data_set_bio);
+
+void ceph_msg_data_set_trail(struct ceph_msg *msg, struct ceph_pagelist
*trail)
+{
+ /* BUG_ON(!trail); */
+ /* BUG_ON(msg->trail); */
+
+ msg->trail = trail;
+}
+EXPORT_SYMBOL(ceph_msg_data_set_trail);
+
/*
* construct a new message with given type, size
* the new msg has a ref count of 1.
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index f29beda..387e312 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1763,12 +1763,12 @@ int ceph_osdc_start_request(struct
ceph_osd_client *osdc,
osd_data->length, osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
- req->r_request->bio = osd_data->bio;
+ ceph_msg_data_set_bio(req->r_request, osd_data->bio);
#endif
} else {
BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_NONE);
}
- req->r_request->trail = &req->r_trail;
+ ceph_msg_data_set_trail(req->r_request, &req->r_trail);
register_request(osdc, req);
@@ -2132,7 +2132,7 @@ static struct ceph_msg *get_reply(struct
ceph_connection *con,
osd_data->length, osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
- m->bio = osd_data->bio;
+ ceph_msg_data_set_bio(m, osd_data->bio);
#endif
}
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 4/7] ceph: only set message data pointers if non-empty
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
` (2 preceding siblings ...)
2013-03-09 16:02 ` [PATCH 3/7] libceph: isolate other message data fields Alex Elder
@ 2013-03-09 16:03 ` Alex Elder
2013-03-09 16:03 ` [PATCH 5/7] libceph: record message data byte length Alex Elder
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:03 UTC (permalink / raw)
To: ceph-devel
Change it so we only assign outgoing data information for
messages if there is outgoing data to send.
This then allows us to add a few more (currently commented-out)
assertions.
This is related to:
http://tracker.ceph.com/issues/4284
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
---
fs/ceph/mds_client.c | 13 ++++++++++---
net/ceph/messenger.c | 4 ++++
net/ceph/osd_client.c | 9 ++++++---
3 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index ae9438f..4bef0ab 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1721,7 +1721,11 @@ static struct ceph_msg
*create_request_message(struct ceph_mds_client *mdsc,
msg->front.iov_len = p - msg->front.iov_base;
msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
- ceph_msg_data_set_pages(msg, req->r_pages, req->r_data_len, 0);
+ if (req->r_data_len) {
+ /* outbound data set only by ceph_sync_setxattr() */
+ BUG_ON(!req->r_pages);
+ ceph_msg_data_set_pages(msg, req->r_pages, req->r_data_len, 0);
+ }
msg->hdr.data_len = cpu_to_le32(req->r_data_len);
msg->hdr.data_off = cpu_to_le16(0);
@@ -2603,10 +2607,13 @@ static void send_mds_reconnect(struct
ceph_mds_client *mdsc,
goto fail;
}
- ceph_msg_data_set_pagelist(reply, pagelist);
if (recon_state.flock)
reply->hdr.version = cpu_to_le16(2);
- reply->hdr.data_len = cpu_to_le32(pagelist->length);
+ if (pagelist->length) {
+ /* set up outbound data if we have any */
+ reply->hdr.data_len = cpu_to_le32(pagelist->length);
+ ceph_msg_data_set_pagelist(reply, pagelist);
+ }
ceph_con_send(&session->s_con, reply);
mutex_unlock(&session->s_mutex);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index d118353..1965d78 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2692,6 +2692,8 @@ EXPORT_SYMBOL(ceph_con_keepalive);
void ceph_msg_data_set_pages(struct ceph_msg *msg, struct page **pages,
size_t length, size_t alignment)
{
+ /* BUG_ON(!pages); */
+ /* BUG_ON(!length); */
/* BUG_ON(msg->pages); */
/* BUG_ON(msg->page_count); */
@@ -2705,6 +2707,7 @@ void ceph_msg_data_set_pagelist(struct ceph_msg *msg,
struct ceph_pagelist *pagelist)
{
/* BUG_ON(!pagelist); */
+ /* BUG_ON(!pagelist->length); */
/* BUG_ON(msg->pagelist); */
msg->pagelist = pagelist;
@@ -2723,6 +2726,7 @@ EXPORT_SYMBOL(ceph_msg_data_set_bio);
void ceph_msg_data_set_trail(struct ceph_msg *msg, struct ceph_pagelist
*trail)
{
/* BUG_ON(!trail); */
+ /* BUG_ON(!trail->length); */
/* BUG_ON(msg->trail); */
msg->trail = trail;
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 387e312..4402e91 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1759,8 +1759,10 @@ int ceph_osdc_start_request(struct
ceph_osd_client *osdc,
osd_data = &req->r_data_out;
if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
BUG_ON(osd_data->length > (u64) SIZE_MAX);
- ceph_msg_data_set_pages(req->r_request, osd_data->pages,
- osd_data->length, osd_data->alignment);
+ if (osd_data->length)
+ ceph_msg_data_set_pages(req->r_request,
+ osd_data->pages, osd_data->length,
+ osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
ceph_msg_data_set_bio(req->r_request, osd_data->bio);
@@ -1768,7 +1770,8 @@ int ceph_osdc_start_request(struct ceph_osd_client
*osdc,
} else {
BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_NONE);
}
- ceph_msg_data_set_trail(req->r_request, &req->r_trail);
+ if (req->r_trail.length)
+ ceph_msg_data_set_trail(req->r_request, &req->r_trail);
register_request(osdc, req);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 5/7] libceph: record message data byte length
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
` (3 preceding siblings ...)
2013-03-09 16:03 ` [PATCH 4/7] ceph: only set message data pointers if non-empty Alex Elder
@ 2013-03-09 16:03 ` Alex Elder
2013-03-09 16:03 ` [PATCH 6/7] libceph: set response data fields earlier Alex Elder
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:03 UTC (permalink / raw)
To: ceph-devel
Record the number of bytes of data in a page array rather than the
number of pages in the array. It can be assumed that the page array
is of sufficient size to hold the number of bytes indicated (and
offset by the indicated alignment).
Signed-off-by: Alex Elder <elder@inktank.com>
---
include/linux/ceph/messenger.h | 2 +-
net/ceph/messenger.c | 20 +++++++++-----------
2 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 9d9be46..1991a6f 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -77,7 +77,7 @@ struct ceph_msg {
struct page **pages; /* data payload. NOT OWNER. */
unsigned int page_alignment; /* io offset in first page */
- unsigned int page_count; /* # pages in array or list */
+ size_t length; /* # data bytes in array or list */
struct ceph_pagelist *pagelist; /* instead of pages */
#ifdef CONFIG_BLOCK
unsigned int bio_seg; /* current bio segment */
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 1965d78..f48e2af 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -809,11 +809,10 @@ static void prepare_write_message(struct
ceph_connection *con)
m->bio_iter = NULL;
#endif
- dout("prepare_write_message %p seq %lld type %d len %d+%d+%d %d pgs\n",
+ dout("prepare_write_message %p seq %lld type %d len %d+%d+%d (%zd)\n",
m, con->out_seq, le16_to_cpu(m->hdr.type),
le32_to_cpu(m->hdr.front_len), le32_to_cpu(m->hdr.middle_len),
- le32_to_cpu(m->hdr.data_len),
- m->page_count);
+ le32_to_cpu(m->hdr.data_len), m->length);
BUG_ON(le32_to_cpu(m->hdr.front_len) != m->front.iov_len);
/* tag + hdr + front + middle */
@@ -1091,9 +1090,8 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
const size_t trail_len = (msg->trail ? msg->trail->length : 0);
const size_t trail_off = data_len - trail_len;
- dout("write_partial_msg_pages %p msg %p page %d/%d offset %d\n",
- con, msg, con->out_msg_pos.page, msg->page_count,
- con->out_msg_pos.page_pos);
+ dout("write_partial_msg_pages %p msg %p page %d offset %d\n",
+ con, msg, con->out_msg_pos.page, con->out_msg_pos.page_pos);
/*
* Iterate through each page that contains data to be
@@ -2695,10 +2693,10 @@ void ceph_msg_data_set_pages(struct ceph_msg
*msg, struct page **pages,
/* BUG_ON(!pages); */
/* BUG_ON(!length); */
/* BUG_ON(msg->pages); */
- /* BUG_ON(msg->page_count); */
+ /* BUG_ON(msg->length); */
msg->pages = pages;
- msg->page_count = calc_pages_for((u64)alignment, (u64)length);
+ msg->length = length;
msg->page_alignment = alignment & ~PAGE_MASK;
}
EXPORT_SYMBOL(ceph_msg_data_set_pages);
@@ -2906,7 +2904,7 @@ void ceph_msg_last_put(struct kref *kref)
ceph_buffer_put(m->middle);
m->middle = NULL;
}
- m->page_count = 0;
+ m->length = 0;
m->pages = NULL;
if (m->pagelist) {
@@ -2926,8 +2924,8 @@ EXPORT_SYMBOL(ceph_msg_last_put);
void ceph_msg_dump(struct ceph_msg *msg)
{
- pr_debug("msg_dump %p (front_max %d page_count %d)\n", msg,
- msg->front_max, msg->page_count);
+ pr_debug("msg_dump %p (front_max %d length %zd)\n", msg,
+ msg->front_max, msg->length);
print_hex_dump(KERN_DEBUG, "header: ",
DUMP_PREFIX_OFFSET, 16, 1,
&msg->hdr, sizeof(msg->hdr), true);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 6/7] libceph: set response data fields earlier
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
` (4 preceding siblings ...)
2013-03-09 16:03 ` [PATCH 5/7] libceph: record message data byte length Alex Elder
@ 2013-03-09 16:03 ` Alex Elder
2013-03-09 16:03 ` [PATCH 7/7] libceph: activate message data assignment checks Alex Elder
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:03 UTC (permalink / raw)
To: ceph-devel
When an incoming message is destined for the osd client, the
messenger calls the osd client's alloc_msg method. That function
looks up which request has the tid matching the incoming message,
and returns the request message that was preallocated to receive the
response. The response message is therefore known before the
request is even started.
Between the start of the request and the receipt of the response,
the request and its data fields will not change, so there's no
reason we need to hold off setting them. In fact it's preferable
to set them just once because it's more obvious that they're
unchanging.
So set up the fields describing where incoming data is to land in a
response message at the beginning of ceph_osdc_start_request().
Define a helper function that sets these fields, and use it to
set the fields for both outgoing data in the request message and
incoming data in the response.
This resolves:
http://tracker.ceph.com/issues/4284
Signed-off-by: Alex Elder <elder@inktank.com>
---
net/ceph/osd_client.c | 43 ++++++++++++++++++++-----------------------
1 file changed, 20 insertions(+), 23 deletions(-)
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 4402e91..37d8961 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1744,32 +1744,36 @@ bad:
return;
}
-/*
- * Register request, send initial attempt.
- */
-int ceph_osdc_start_request(struct ceph_osd_client *osdc,
- struct ceph_osd_request *req,
- bool nofail)
+static void ceph_osdc_msg_data_set(struct ceph_msg *msg,
+ struct ceph_osd_data *osd_data)
{
- int rc = 0;
- struct ceph_osd_data *osd_data;
-
- /* Set up outgoing data */
-
- osd_data = &req->r_data_out;
if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
BUG_ON(osd_data->length > (u64) SIZE_MAX);
if (osd_data->length)
- ceph_msg_data_set_pages(req->r_request,
- osd_data->pages, osd_data->length,
- osd_data->alignment);
+ ceph_msg_data_set_pages(msg, osd_data->pages,
+ osd_data->length, osd_data->alignment);
#ifdef CONFIG_BLOCK
} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
- ceph_msg_data_set_bio(req->r_request, osd_data->bio);
+ ceph_msg_data_set_bio(msg, osd_data->bio);
#endif
} else {
BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_NONE);
}
+}
+
+/*
+ * Register request, send initial attempt.
+ */
+int ceph_osdc_start_request(struct ceph_osd_client *osdc,
+ struct ceph_osd_request *req,
+ bool nofail)
+{
+ int rc = 0;
+
+ /* Set up response incoming data and request outgoing data fields */
+
+ ceph_osdc_msg_data_set(req->r_reply, &req->r_data_in);
+ ceph_osdc_msg_data_set(req->r_request, &req->r_data_out);
if (req->r_trail.length)
ceph_msg_data_set_trail(req->r_request, &req->r_trail);
@@ -2130,13 +2134,6 @@ static struct ceph_msg *get_reply(struct
ceph_connection *con,
m = NULL;
goto out;
}
- BUG_ON(osd_data->length > (u64) SIZE_MAX);
- ceph_msg_data_set_pages(m, osd_data->pages,
- osd_data->length, osd_data->alignment);
-#ifdef CONFIG_BLOCK
- } else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
- ceph_msg_data_set_bio(m, osd_data->bio);
-#endif
}
}
*skip = 0;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 7/7] libceph: activate message data assignment checks
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
` (5 preceding siblings ...)
2013-03-09 16:03 ` [PATCH 6/7] libceph: set response data fields earlier Alex Elder
@ 2013-03-09 16:03 ` Alex Elder
2013-03-09 16:14 ` [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
2013-03-11 19:03 ` Josh Durgin
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:03 UTC (permalink / raw)
To: ceph-devel
The mds client no longer tries to assign zero-length message data,
and the osd client no longer sets its data info more than once.
This allows us to activate assertions in the messenger to verify
these things never happen.
This resolves both of these:
http://tracker.ceph.com/issues/4263
http://tracker.ceph.com/issues/4284
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
---
net/ceph/messenger.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index f48e2af..e75a03d 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2690,10 +2690,10 @@ EXPORT_SYMBOL(ceph_con_keepalive);
void ceph_msg_data_set_pages(struct ceph_msg *msg, struct page **pages,
size_t length, size_t alignment)
{
- /* BUG_ON(!pages); */
- /* BUG_ON(!length); */
- /* BUG_ON(msg->pages); */
- /* BUG_ON(msg->length); */
+ BUG_ON(!pages);
+ BUG_ON(!length);
+ BUG_ON(msg->pages);
+ BUG_ON(msg->length);
msg->pages = pages;
msg->length = length;
@@ -2704,9 +2704,9 @@ EXPORT_SYMBOL(ceph_msg_data_set_pages);
void ceph_msg_data_set_pagelist(struct ceph_msg *msg,
struct ceph_pagelist *pagelist)
{
- /* BUG_ON(!pagelist); */
- /* BUG_ON(!pagelist->length); */
- /* BUG_ON(msg->pagelist); */
+ BUG_ON(!pagelist);
+ BUG_ON(!pagelist->length);
+ BUG_ON(msg->pagelist);
msg->pagelist = pagelist;
}
@@ -2714,8 +2714,8 @@ EXPORT_SYMBOL(ceph_msg_data_set_pagelist);
void ceph_msg_data_set_bio(struct ceph_msg *msg, struct bio *bio)
{
- /* BUG_ON(!bio); */
- /* BUG_ON(msg->bio); */
+ BUG_ON(!bio);
+ BUG_ON(msg->bio);
msg->bio = bio;
}
@@ -2723,9 +2723,9 @@ EXPORT_SYMBOL(ceph_msg_data_set_bio);
void ceph_msg_data_set_trail(struct ceph_msg *msg, struct ceph_pagelist
*trail)
{
- /* BUG_ON(!trail); */
- /* BUG_ON(!trail->length); */
- /* BUG_ON(msg->trail); */
+ BUG_ON(!trail);
+ BUG_ON(!trail->length);
+ BUG_ON(msg->trail);
msg->trail = trail;
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH 0/7, v2] libceph: abstract message data information
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
` (6 preceding siblings ...)
2013-03-09 16:03 ` [PATCH 7/7] libceph: activate message data assignment checks Alex Elder
@ 2013-03-09 16:14 ` Alex Elder
2013-03-11 19:03 ` Josh Durgin
8 siblings, 0 replies; 10+ messages in thread
From: Alex Elder @ 2013-03-09 16:14 UTC (permalink / raw)
To: ceph-devel
On 03/09/2013 10:00 AM, Alex Elder wrote:
> Over half of these have been reviewed already, but I'm re-posting
> them because I've rebased them to be on top of the series I just
> posted (that is, these are based on branch "review/wip-cleanup"
> or 60b789f libceph: record byte count not page count), and because
> I've added two more patches since the last time I sent them out.
>
> I won't bother summarizing them again this time, their individual
> descriptions should be good enough.
These patches are available in the branch "review/wip-abstract-2"
of the ceph-client git repository.
-Alex
-Alex
>
> The patches below marked with "***" are still in need of review.
>
> [PATCH 1/7] libceph: isolate message page field manipulation
> [PATCH 2/7] libceph: set page info with byte length ***
> [PATCH 3/7] libceph: isolate other message data fields
> [PATCH 4/7] ceph: only set message data pointers if non-empty
> [PATCH 5/7] libceph: record message data byte length ***
> [PATCH 6/7] libceph: set response data fields earlier ***
> [PATCH 7/7] libceph: activate message data assignment checks
>
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/7, v2] libceph: abstract message data information
2013-03-09 16:00 [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
` (7 preceding siblings ...)
2013-03-09 16:14 ` [PATCH 0/7, v2] libceph: abstract message data information Alex Elder
@ 2013-03-11 19:03 ` Josh Durgin
8 siblings, 0 replies; 10+ messages in thread
From: Josh Durgin @ 2013-03-11 19:03 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
On 03/09/2013 08:00 AM, Alex Elder wrote:
> Over half of these have been reviewed already, but I'm re-posting
> them because I've rebased them to be on top of the series I just
> posted (that is, these are based on branch "review/wip-cleanup"
> or 60b789f libceph: record byte count not page count), and because
> I've added two more patches since the last time I sent them out.
>
> I won't bother summarizing them again this time, their individual
> descriptions should be good enough.
>
> -Alex
>
> The patches below marked with "***" are still in need of review.
>
> [PATCH 1/7] libceph: isolate message page field manipulation
> [PATCH 2/7] libceph: set page info with byte length ***
> [PATCH 3/7] libceph: isolate other message data fields
> [PATCH 4/7] ceph: only set message data pointers if non-empty
> [PATCH 5/7] libceph: record message data byte length ***
> [PATCH 6/7] libceph: set response data fields earlier ***
> [PATCH 7/7] libceph: activate message data assignment checks
These look good.
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
^ permalink raw reply [flat|nested] 10+ messages in thread