* [PATCH 0/5] libceph: page mapping in the messenger
@ 2012-03-12 22:33 Alex Elder
2012-03-12 22:40 ` [PATCH 1/5] libceph: use kernel_sendpage() for sending zeroes Alex Elder
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Alex Elder @ 2012-03-12 22:33 UTC (permalink / raw)
To: ceph-devel
This series fixes up some things in the messenger code related
to how it uses pages. A new ceph_tcp_sendpage() helper is
created to match the existing ceph_tcp_sendmsg() interface,
and it is used rather than the sendmsg interface when sending
zeroes over the wire for a revoked message. The helper is
also used for sending message data. Also, the address mapping
to the zero page gets eliminated, thereby making the zero page
get treated no differently than any other page when it is the
source of a send operation. Finally there is a little code
cleanup in the message sending code.
-Alex
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/5] libceph: use kernel_sendpage() for sending zeroes
2012-03-12 22:33 [PATCH 0/5] libceph: page mapping in the messenger Alex Elder
@ 2012-03-12 22:40 ` Alex Elder
2012-03-12 22:40 ` [PATCH 2/5] libceph: only call kernel_sendpage() via helper Alex Elder
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-03-12 22:40 UTC (permalink / raw)
To: ceph-devel
If a message queued for send gets revoked, zeroes are sent over the
wire instead of any unsent data. This is done by constructing a
message and passing it to kernel_sendmsg() via ceph_tcp_sendmsg().
Since we are already working with a page in this case we can use
the sendpage interface instead. Create a new ceph_tcp_sendpage()
helper that sets up flags to match the way ceph_tcp_sendmsg()
does now.
Signed-off-by: Alex Elder <elder@dreamhost.com>
---
net/ceph/messenger.c | 20 +++++++++++++++-----
1 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 589b768..9207a8c 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -321,6 +321,19 @@ static int ceph_tcp_sendmsg(struct socket *sock,
struct kvec *iov,
return r;
}
+static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
+ int offset, size_t size, int more)
+{
+ int flags = MSG_DONTWAIT | MSG_NOSIGNAL | (more ? MSG_MORE : MSG_EOR);
+ int ret;
+
+ ret = kernel_sendpage(sock, page, offset, size, flags);
+ if (ret == -EAGAIN)
+ ret = 0;
+
+ return ret;
+}
+
/*
* Shutdown/close the socket for the given connection.
@@ -944,12 +957,9 @@ static int write_partial_skip(struct
ceph_connection *con)
int ret;
while (con->out_skip > 0) {
- struct kvec iov = {
- .iov_base = zero_page_address,
- .iov_len = min(con->out_skip, (int)PAGE_CACHE_SIZE)
- };
+ size_t size = min(con->out_skip, (int) PAGE_CACHE_SIZE);
- ret = ceph_tcp_sendmsg(con->sock, &iov, 1, iov.iov_len, 1);
+ ret = ceph_tcp_sendpage(con->sock, zero_page, 0, size, 1);
if (ret <= 0)
goto out;
con->out_skip -= ret;
--
1.7.5.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/5] libceph: only call kernel_sendpage() via helper
2012-03-12 22:33 [PATCH 0/5] libceph: page mapping in the messenger Alex Elder
2012-03-12 22:40 ` [PATCH 1/5] libceph: use kernel_sendpage() for sending zeroes Alex Elder
@ 2012-03-12 22:40 ` Alex Elder
2012-03-12 22:40 ` [PATCH 3/5] libceph: get rid of zero_page_address Alex Elder
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-03-12 22:40 UTC (permalink / raw)
To: ceph-devel
Make ceph_tcp_sendpage() be the only place kernel_sendpage() is
used, by using this helper in write_partial_msg_pages().
Signed-off-by: Alex Elder <elder@dreamhost.com>
---
net/ceph/messenger.c | 8 ++------
1 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 9207a8c..adca1e6 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -904,17 +904,13 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
con->out_msg->footer.data_crc = cpu_to_le32(crc);
con->out_msg_pos.did_page_crc = true;
}
- ret = kernel_sendpage(con->sock, page,
+ ret = ceph_tcp_sendpage(con->sock, page,
con->out_msg_pos.page_pos + page_shift,
- len,
- MSG_DONTWAIT | MSG_NOSIGNAL |
- MSG_MORE);
+ len, 1);
if (do_datacrc && kaddr != zero_page_address)
kunmap(page);
- if (ret == -EAGAIN)
- ret = 0;
if (ret <= 0)
goto out;
--
1.7.5.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/5] libceph: get rid of zero_page_address
2012-03-12 22:33 [PATCH 0/5] libceph: page mapping in the messenger Alex Elder
2012-03-12 22:40 ` [PATCH 1/5] libceph: use kernel_sendpage() for sending zeroes Alex Elder
2012-03-12 22:40 ` [PATCH 2/5] libceph: only call kernel_sendpage() via helper Alex Elder
@ 2012-03-12 22:40 ` Alex Elder
2012-03-12 22:40 ` [PATCH 4/5] libceph: rename "page_shift" variable to something sensible Alex Elder
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-03-12 22:40 UTC (permalink / raw)
To: ceph-devel
There's not a lot of benefit to zero_page_address, which basically
holds a mapping of the zero page through the life of the messenger
module. Even with our own mapping, the sendpage interface where
it's used may need to kmap() it again. It's almost certain to
be in low memory anyway.
So stop treating the zero page specially in write_partial_msg_pages()
and just get rid of zero_page_address entirely.
Signed-off-by: Alex Elder <elder@dreamhost.com>
---
net/ceph/messenger.c | 11 ++---------
1 files changed, 2 insertions(+), 9 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index adca1e6..4f1714c 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -61,7 +61,6 @@ static char addr_str[ADDR_STR_COUNT][MAX_ADDR_STR_LEN];
static atomic_t addr_str_seq = ATOMIC_INIT(0);
static struct page *zero_page; /* used in certain error cases */
-static void *zero_page_address; /* kernel virtual addr of zero_page */
const char *ceph_pr_addr(const struct sockaddr_storage *ss)
{
@@ -111,9 +110,6 @@ void _ceph_msgr_exit(void)
ceph_msgr_wq = NULL;
}
- BUG_ON(zero_page_address == NULL);
- zero_page_address = NULL;
-
BUG_ON(zero_page == NULL);
kunmap(zero_page);
page_cache_release(zero_page);
@@ -126,9 +122,6 @@ int ceph_msgr_init(void)
zero_page = ZERO_PAGE(0);
page_cache_get(zero_page);
- BUG_ON(zero_page_address != NULL);
- zero_page_address = kmap(zero_page);
-
ceph_msgr_wq = alloc_workqueue("ceph-msgr", WQ_NON_REENTRANT, 0);
if (ceph_msgr_wq)
return 0;
@@ -889,7 +882,7 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
} else {
page = zero_page;
if (do_datacrc)
- kaddr = zero_page_address;
+ kaddr = kmap(page);
}
len = min_t(int, max_write - con->out_msg_pos.page_pos,
total_max_write);
@@ -908,7 +901,7 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
con->out_msg_pos.page_pos + page_shift,
len, 1);
- if (do_datacrc && kaddr != zero_page_address)
+ if (do_datacrc)
kunmap(page);
if (ret <= 0)
--
1.7.5.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 4/5] libceph: rename "page_shift" variable to something sensible
2012-03-12 22:33 [PATCH 0/5] libceph: page mapping in the messenger Alex Elder
` (2 preceding siblings ...)
2012-03-12 22:40 ` [PATCH 3/5] libceph: get rid of zero_page_address Alex Elder
@ 2012-03-12 22:40 ` Alex Elder
2012-03-12 22:40 ` [PATCH 5/5] libceph: isolate kmap() call in write_partial_msg_pages() Alex Elder
2012-03-13 5:31 ` [PATCH 0/5] libceph: page mapping in the messenger Sage Weil
5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-03-12 22:40 UTC (permalink / raw)
To: ceph-devel
In write_partial_msg_pages() there is a local variable used to
track the starting offset within a bio segment to use. Its name,
"page_shift" defies the Linux convention of using that name for
log-base-2(page size).
Since it's only used in the bio case rename it "bio_offset". Use it
along with the page_pos field to compute the memory offset when
computing CRC's in that function. This makes the bio case match the
others more closely.
Signed-off-by: Alex Elder <elder@dreamhost.com>
---
net/ceph/messenger.c | 11 ++++++-----
1 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 4f1714c..2bf9ab4 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -837,7 +837,7 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
struct page *page = NULL;
void *kaddr = NULL;
int max_write = PAGE_SIZE;
- int page_shift = 0;
+ int bio_offset = 0;
total_max_write = data_len - trail_len -
con->out_msg_pos.data_pos;
@@ -874,9 +874,9 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
bv = bio_iovec_idx(msg->bio_iter, msg->bio_seg);
page = bv->bv_page;
- page_shift = bv->bv_offset;
+ bio_offset = bv->bv_offset;
if (do_datacrc)
- kaddr = kmap(page) + page_shift;
+ kaddr = kmap(page);
max_write = bv->bv_len;
#endif
} else {
@@ -888,17 +888,18 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
total_max_write);
if (do_datacrc && !con->out_msg_pos.did_page_crc) {
+ void *base;
u32 crc;
- void *base = kaddr + con->out_msg_pos.page_pos;
u32 tmpcrc = le32_to_cpu(con->out_msg->footer.data_crc);
BUG_ON(kaddr == NULL);
+ base = kaddr + con->out_msg_pos.page_pos + bio_offset;
crc = crc32c(tmpcrc, base, len);
con->out_msg->footer.data_crc = cpu_to_le32(crc);
con->out_msg_pos.did_page_crc = true;
}
ret = ceph_tcp_sendpage(con->sock, page,
- con->out_msg_pos.page_pos + page_shift,
+ con->out_msg_pos.page_pos + bio_offset,
len, 1);
if (do_datacrc)
--
1.7.5.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 5/5] libceph: isolate kmap() call in write_partial_msg_pages()
2012-03-12 22:33 [PATCH 0/5] libceph: page mapping in the messenger Alex Elder
` (3 preceding siblings ...)
2012-03-12 22:40 ` [PATCH 4/5] libceph: rename "page_shift" variable to something sensible Alex Elder
@ 2012-03-12 22:40 ` Alex Elder
2012-03-13 5:31 ` [PATCH 0/5] libceph: page mapping in the messenger Sage Weil
5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-03-12 22:40 UTC (permalink / raw)
To: ceph-devel
In write_partial_msg_pages(), every case now does an identical call
to kmap(page). Instead, just call it once inside the CRC-computing
block where it's needed. Move the definition of kaddr inside that
block, and make it a (char *) to ensure portable pointer arithmetic.
We still don't kunmap() it until after the sendpage() call, in case
that also ends up needing to use the mapping.
Signed-off-by: Alex Elder <elder@dreamhost.com>
---
net/ceph/messenger.c | 13 ++-----------
1 files changed, 2 insertions(+), 11 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 2bf9ab4..f0993af 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -835,7 +835,6 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
while (data_len > con->out_msg_pos.data_pos) {
struct page *page = NULL;
- void *kaddr = NULL;
int max_write = PAGE_SIZE;
int bio_offset = 0;
@@ -856,18 +855,12 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
page = list_first_entry(&msg->trail->head,
struct page, lru);
- if (do_datacrc)
- kaddr = kmap(page);
max_write = PAGE_SIZE;
} else if (msg->pages) {
page = msg->pages[con->out_msg_pos.page];
- if (do_datacrc)
- kaddr = kmap(page);
} else if (msg->pagelist) {
page = list_first_entry(&msg->pagelist->head,
struct page, lru);
- if (do_datacrc)
- kaddr = kmap(page);
#ifdef CONFIG_BLOCK
} else if (msg->bio) {
struct bio_vec *bv;
@@ -875,14 +868,10 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
bv = bio_iovec_idx(msg->bio_iter, msg->bio_seg);
page = bv->bv_page;
bio_offset = bv->bv_offset;
- if (do_datacrc)
- kaddr = kmap(page);
max_write = bv->bv_len;
#endif
} else {
page = zero_page;
- if (do_datacrc)
- kaddr = kmap(page);
}
len = min_t(int, max_write - con->out_msg_pos.page_pos,
total_max_write);
@@ -891,7 +880,9 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
void *base;
u32 crc;
u32 tmpcrc = le32_to_cpu(con->out_msg->footer.data_crc);
+ char *kaddr;
+ kaddr = kmap(page);
BUG_ON(kaddr == NULL);
base = kaddr + con->out_msg_pos.page_pos + bio_offset;
crc = crc32c(tmpcrc, base, len);
--
1.7.5.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 0/5] libceph: page mapping in the messenger
2012-03-12 22:33 [PATCH 0/5] libceph: page mapping in the messenger Alex Elder
` (4 preceding siblings ...)
2012-03-12 22:40 ` [PATCH 5/5] libceph: isolate kmap() call in write_partial_msg_pages() Alex Elder
@ 2012-03-13 5:31 ` Sage Weil
5 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-03-13 5:31 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
Looks good to me,
Reviewed-by: Sage Weil <sage@newdream.net>
On Mon, 12 Mar 2012, Alex Elder wrote:
> This series fixes up some things in the messenger code related
> to how it uses pages. A new ceph_tcp_sendpage() helper is
> created to match the existing ceph_tcp_sendmsg() interface,
> and it is used rather than the sendmsg interface when sending
> zeroes over the wire for a revoked message. The helper is
> also used for sending message data. Also, the address mapping
> to the zero page gets eliminated, thereby making the zero page
> get treated no differently than any other page when it is the
> source of a send operation. Finally there is a little code
> cleanup in the message sending code.
>
> -Alex
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-03-13 5:31 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-12 22:33 [PATCH 0/5] libceph: page mapping in the messenger Alex Elder
2012-03-12 22:40 ` [PATCH 1/5] libceph: use kernel_sendpage() for sending zeroes Alex Elder
2012-03-12 22:40 ` [PATCH 2/5] libceph: only call kernel_sendpage() via helper Alex Elder
2012-03-12 22:40 ` [PATCH 3/5] libceph: get rid of zero_page_address Alex Elder
2012-03-12 22:40 ` [PATCH 4/5] libceph: rename "page_shift" variable to something sensible Alex Elder
2012-03-12 22:40 ` [PATCH 5/5] libceph: isolate kmap() call in write_partial_msg_pages() Alex Elder
2012-03-13 5:31 ` [PATCH 0/5] libceph: page mapping in the messenger Sage Weil
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.