* [RFC] rebase Ian Campbell's skb fragment tracking to 3.2
@ 2013-01-25 14:27 Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 1/7] net: add support for per-paged-fragment destructors Alex Bligh
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
This patch set, generated by Mel Gorman, rebases Ian Campbell's
fragment tracking code to 3.2. Obviously this does not rebase
it to the current development head, but I thought this
series might be useful to someone.
The reason for this patch set can be found in the following
links:
http://www.spinics.net/lists/linux-nfs/msg34783.html
http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
The original problem is here (from 2008!):
http://marc.info/?l=linux-nfs&m=122424132729720&w=2
I've not copied every maintainer as this is really only for
information given it's not rebased to head.
* [RFC] [PATCH 1/7] net: add support for per-paged-fragment destructors
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
From: Mel Gorman <mgorman@suse.de>
Entities which care about the complete lifecycle of pages which they inject
into the network stack via an skb paged fragment can choose to set this
destructor in order to receive a callback when the stack is really finished
with a page (including all clones, retransmits, pull-ups etc etc).
This destructor will always be propagated alongside the struct page when
copying skb_frag_t->page. This is the reason I chose to embed the destructor in
a "struct { } page" within the skb_frag_t, rather than as a separate field,
since it allows existing code which propagates ->frags[N].page to Just
Work(tm).
When the destructor is present the page reference counting is done slightly
differently. No references are held by the network stack on the struct page (it
is up to the caller to manage this as necessary); instead the network stack will
track references via the count embedded in the destructor structure. When this
reference count reaches zero the destructor will be called and the caller
can take the necessary steps to release the page (i.e. release the struct page
reference itself).
The intention is that callers can use this callback to delay completion to
_their_ callers until the network stack has completely released the page, in
order to prevent use-after-free or modification of data pages which are still
in use by the stack.
It is allowable (indeed expected) for a caller to share a single destructor
instance between multiple pages injected into the stack e.g. a group of pages
included in a single higher level operation might share a destructor which is
used to complete that higher level operation.
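The counting scheme described above can be sketched in userspace as follows. This is only a model, not the kernel code: struct frag_destructor stands in for struct skb_frag_destructor, C11 atomics stand in for atomic_t, and struct write_op, write_op_done() and simulate_shared_destructor() are hypothetical names invented for the illustration. It also assumes the owner holds one initial reference of its own, which the patch description does not spell out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for struct skb_frag_destructor. */
struct frag_destructor {
	atomic_int ref;              /* references held on behalf of the pages */
	int (*destroy)(void *data);  /* called when ref drops to zero */
	void *data;
};

static void frag_destructor_ref(struct frag_destructor *d)
{
	assert(d != NULL);           /* the patch uses BUG_ON() here */
	atomic_fetch_add(&d->ref, 1);
}

static void frag_destructor_unref(struct frag_destructor *d)
{
	if (d == NULL)
		return;
	/* atomic_fetch_sub() returns the old value; 1 means last reference */
	if (atomic_fetch_sub(&d->ref, 1) == 1)
		d->destroy(d->data);
}

/* Hypothetical higher level operation covering several pages. */
struct write_op {
	struct frag_destructor destructor;
	int completed;               /* how many times the destructor fired */
};

static int write_op_done(void *data)
{
	struct write_op *op = data;

	op->completed++;
	return 0;
}

/*
 * Share one destructor between npages (>= 1) pages. Returns 1 if the
 * destructor fired exactly once, and only after the last page was released.
 */
static int simulate_shared_destructor(int npages)
{
	struct write_op op = { .completed = 0 };
	int early;

	atomic_init(&op.destructor.ref, 1);        /* owner's reference (assumed) */
	op.destructor.destroy = write_op_done;
	op.destructor.data = &op;

	for (int i = 0; i < npages; i++)
		frag_destructor_ref(&op.destructor);   /* one ref per injected page */
	frag_destructor_unref(&op.destructor);     /* owner is done queueing */

	for (int i = 0; i < npages - 1; i++)
		frag_destructor_unref(&op.destructor); /* stack releases pages */
	early = op.completed;                      /* one page still outstanding */

	frag_destructor_unref(&op.destructor);     /* last page released */
	return early == 0 && op.completed == 1;
}
```

In this model the completion of the higher level operation is tied to the last release, whichever page that happens to be, which is the property the shared destructor is meant to provide.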
NB: a small number of drivers use skb_frag_t independently of struct sk_buff so
this patch is slightly larger than necessary. I did consider leaving skb_frag_t
alone and defining a new (but similar) structure to be used in the struct
sk_buff itself. This would also have the advantage of more clearly separating
the two uses, which is useful since there are now special reference counting
accessors for skb_frag_t within a struct sk_buff but not (necessarily) for
those used outside of an skb.
Signed-off-by: Ian Campbell <ian.campbell <at> citrix.com>
Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: "James E.J. Bottomley" <JBottomley <at> parallels.com>
Cc: Dimitris Michailidis <dm <at> chelsio.com>
Cc: Casey Leedom <leedom <at> chelsio.com>
Cc: Yevgeny Petrilin <yevgenyp <at> mellanox.co.il>
Cc: Eric Dumazet <eric.dumazet <at> gmail.com>
Cc: "Michał Mirosław" <mirq-linux <at> rere.qmqm.pl>
Cc: netdev <at> vger.kernel.org
Cc: linux-scsi <at> vger.kernel.org
Signed-off-by: Alex Bligh <alex@alex.org.uk>
---
include/linux/skbuff.h | 44 ++++++++++++++++++++++++++++++++++++++++++++
net/core/skbuff.c | 17 +++++++++++++++++
2 files changed, 61 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fe86488..2619a61 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -139,9 +139,16 @@ struct sk_buff;
typedef struct skb_frag_struct skb_frag_t;
+struct skb_frag_destructor {
+ atomic_t ref;
+ int (*destroy)(void *data);
+ void *data;
+};
+
struct skb_frag_struct {
struct {
struct page *p;
+ struct skb_frag_destructor *destructor;
} page;
#if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
__u32 page_offset;
@@ -1160,6 +1167,31 @@ static inline int skb_pagelen(const struct sk_buff *skb)
}
/**
+ * skb_frag_set_destructor - set destructor for a paged fragment
+ * @skb: buffer containing fragment to be initialised
+ * @i: paged fragment index to initialise
+ * @destroy: the destructor to use for this fragment
+ *
+ * Sets @destroy as the destructor to be called when all references to
+ * the frag @i in @skb (tracked over skb_clone, retransmit, pull-ups,
+ * etc) are released.
+ *
+ * When a destructor is set then reference counting is performed on
+ * @destroy->ref. When the ref reaches zero then @destroy->destroy
+ * will be called. The caller is responsible for holding and managing
+ * any other references (such as the struct page reference count).
+ *
+ * This function must be called before any use of skb_frag_ref() or
+ * skb_frag_unref().
+ */
+static inline void skb_frag_set_destructor(struct sk_buff *skb, int i,
+ struct skb_frag_destructor *destroy)
+{
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+ frag->page.destructor = destroy;
+}
+
+/**
* __skb_fill_page_desc - initialise a paged fragment in an skb
* @skb: buffer containing fragment to be initialised
* @i: paged fragment index to initialise
@@ -1178,6 +1210,7 @@ static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
frag->page.p = page;
+ frag->page.destructor = NULL;
frag->page_offset = off;
skb_frag_size_set(frag, size);
}
@@ -1704,6 +1737,9 @@ static inline struct page *skb_frag_page(const skb_frag_t *frag)
return frag->page.p;
}
+extern void skb_frag_destructor_ref(struct skb_frag_destructor *destroy);
+extern void skb_frag_destructor_unref(struct skb_frag_destructor *destroy);
+
/**
* __skb_frag_ref - take an addition reference on a paged fragment.
* @frag: the paged fragment
@@ -1712,6 +1748,10 @@ static inline struct page *skb_frag_page(const skb_frag_t *frag)
*/
static inline void __skb_frag_ref(skb_frag_t *frag)
{
+ if (unlikely(frag->page.destructor)) {
+ skb_frag_destructor_ref(frag->page.destructor);
+ return;
+ }
get_page(skb_frag_page(frag));
}
@@ -1735,6 +1775,10 @@ static inline void skb_frag_ref(struct sk_buff *skb, int f)
*/
static inline void __skb_frag_unref(skb_frag_t *frag)
{
+ if (unlikely(frag->page.destructor)) {
+ skb_frag_destructor_unref(frag->page.destructor);
+ return;
+ }
put_page(skb_frag_page(frag));
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3c30ee4..8b46cad 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -303,6 +303,23 @@ struct sk_buff *dev_alloc_skb(unsigned int length)
}
EXPORT_SYMBOL(dev_alloc_skb);
+void skb_frag_destructor_ref(struct skb_frag_destructor *destroy)
+{
+ BUG_ON(destroy == NULL);
+ atomic_inc(&destroy->ref);
+}
+EXPORT_SYMBOL(skb_frag_destructor_ref);
+
+void skb_frag_destructor_unref(struct skb_frag_destructor *destroy)
+{
+ if (destroy == NULL)
+ return;
+
+ if (atomic_dec_and_test(&destroy->ref))
+ destroy->destroy(destroy->data);
+}
+EXPORT_SYMBOL(skb_frag_destructor_unref);
+
static void skb_drop_list(struct sk_buff **listp)
{
struct sk_buff *list = *listp;
--
1.7.9.5
* [RFC] [PATCH 2/7] net: only allow paged fragments with the same destructor to be coalesced.
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
From: Ian Campbell <ian.campbell at>
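The rule in the subject line can be modelled in userspace roughly as below. This is a sketch only: struct frag is a simplified stand-in for skb_frag_t, the frags array stands in for skb_shinfo(skb)->frags, and demo_can_coalesce() is a hypothetical helper that exists purely to exercise the check. Only pointer identity of the destructor matters here, so struct frag_destructor carries a placeholder member.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder body; only the address of a destructor is compared. */
struct frag_destructor {
	int placeholder;
};

/* Simplified stand-in for skb_frag_t. */
struct frag {
	const void *page;                          /* stand-in for struct page * */
	const struct frag_destructor *destructor;
	unsigned int page_offset;
	unsigned int size;
};

/*
 * Model of skb_can_coalesce() with the extra destructor argument: new data
 * may extend the previous fragment only if it is on the same page, starts
 * where that fragment ends, and carries the same destructor.
 */
static int can_coalesce(const struct frag *frags, int i, const void *page,
			const struct frag_destructor *destroy,
			unsigned int off)
{
	if (i) {
		const struct frag *last = &frags[i - 1];

		return page == last->page &&
		       off == last->page_offset + last->size &&
		       last->destructor == destroy;
	}
	return 0;
}

/* Hypothetical demo: try to append to a 100 byte fragment owned by 'a'. */
static int demo_can_coalesce(int same_destructor)
{
	static char page[4096];
	struct frag_destructor a, b;
	struct frag frags[1] = { { page, &a, 0, 100 } };

	return can_coalesce(frags, 1, page, same_destructor ? &a : &b, 100);
}
```

Without the destructor comparison, data owned by one destructor could be merged into a fragment tracked by another (or by none), and the release of that data would be accounted to the wrong owner.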
Signed-off-by: Ian Campbell <ian.campbell <at> citrix.com>
Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: Alexey Kuznetsov <kuznet <at> ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas <at> netcore.fi>
Cc: James Morris <jmorris <at> namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji <at> linux-ipv6.org>
Cc: Patrick McHardy <kaber <at> trash.net>
Cc: Eric Dumazet <eric.dumazet <at> gmail.com>
Cc: "Michał Mirosław" <mirq-linux <at> rere.qmqm.pl>
Cc: netdev <at> vger.kernel.org
Signed-off-by: Alex Bligh <alex@alex.org.uk>
---
include/linux/skbuff.h | 7 +++++--
net/core/skbuff.c | 1 +
net/ipv4/ip_output.c | 2 +-
net/ipv4/tcp.c | 4 ++--
4 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2619a61..8a8eecd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1970,13 +1970,16 @@ static inline int skb_add_data(struct sk_buff *skb,
}
static inline int skb_can_coalesce(struct sk_buff *skb, int i,
- const struct page *page, int off)
+ const struct page *page,
+ const struct skb_frag_destructor *destroy,
+ int off)
{
if (i) {
const struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i - 1];
return page == skb_frag_page(frag) &&
- off == frag->page_offset + skb_frag_size(frag);
+ off == frag->page_offset + skb_frag_size(frag) &&
+ frag->page.destructor == destroy;
}
return 0;
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 8b46cad..425cd5a 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2276,6 +2276,7 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
*/
if (!to ||
!skb_can_coalesce(tgt, to, skb_frag_page(fragfrom),
+ fragfrom->page.destructor,
fragfrom->page_offset)) {
merge = -1;
} else {
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 0bc95f3..3252e06 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1229,7 +1229,7 @@ ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
i = skb_shinfo(skb)->nr_frags;
if (len > size)
len = size;
- if (skb_can_coalesce(skb, i, page, offset)) {
+ if (skb_can_coalesce(skb, i, page, NULL, offset)) {
skb_frag_size_add(&skb_shinfo(skb)->frags[i-1], len);
} else if (i < MAX_SKB_FRAGS) {
get_page(page);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 34f5db1..018de0c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -804,7 +804,7 @@ new_segment:
copy = size;
i = skb_shinfo(skb)->nr_frags;
- can_coalesce = skb_can_coalesce(skb, i, page, offset);
+ can_coalesce = skb_can_coalesce(skb, i, page, NULL, offset);
if (!can_coalesce && i >= MAX_SKB_FRAGS) {
tcp_mark_push(tp, skb);
goto new_segment;
@@ -1008,7 +1008,7 @@ new_segment:
struct page *page = TCP_PAGE(sk);
int off = TCP_OFF(sk);
- if (skb_can_coalesce(skb, i, page, off) &&
+ if (skb_can_coalesce(skb, i, page, NULL, off) &&
off != PAGE_SIZE) {
/* We can extend the last page
* fragment. */
--
1.7.9.5
* [RFC] [PATCH 3/7] net: add paged frag destructor support to kernel_sendpage.
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
From: Ian Campbell <ian.campbell at>
Signed-off-by: Ian Campbell <ian.campbell <at> citrix.com>
Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: Alexey Kuznetsov <kuznet <at> ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas <at> netcore.fi>
Cc: James Morris <jmorris <at> namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji <at> linux-ipv6.org>
Cc: Patrick McHardy <kaber <at> trash.net>
Cc: Trond Myklebust <Trond.Myklebust <at> netapp.com>
Cc: Greg Kroah-Hartman <gregkh <at> suse.de>
Cc: drbd-user <at> lists.linbit.com
Cc: devel <at> driverdev.osuosl.org
Cc: cluster-devel <at> redhat.com
Cc: ocfs2-devel <at> oss.oracle.com
Cc: netdev <at> vger.kernel.org
Cc: ceph-devel <at> vger.kernel.org
Cc: rds-devel <at> oss.oracle.com
Cc: linux-nfs <at> vger.kernel.org
[since v2:
Use skb_frag_set_destructor
since v1:
Drop sendpage_destructor and just add an argument to sendpage protocol hooks
]
Signed-off-by: Alex Bligh <alex@alex.org.uk>
---
drivers/block/drbd/drbd_main.c | 1 +
drivers/scsi/iscsi_tcp.c | 4 ++--
drivers/scsi/iscsi_tcp.h | 3 ++-
drivers/staging/pohmelfs/trans.c | 3 ++-
drivers/target/iscsi/iscsi_target_util.c | 3 ++-
fs/dlm/lowcomms.c | 4 ++--
fs/ocfs2/cluster/tcp.c | 1 +
include/linux/net.h | 6 +++++-
include/net/inet_common.h | 4 +++-
include/net/ip.h | 4 +++-
include/net/sock.h | 8 +++++---
include/net/tcp.h | 4 +++-
net/ceph/messenger.c | 2 +-
net/core/sock.c | 6 +++++-
net/ipv4/af_inet.c | 9 ++++++---
net/ipv4/ip_output.c | 6 ++++--
net/ipv4/tcp.c | 24 ++++++++++++++++--------
net/ipv4/udp.c | 11 ++++++-----
net/ipv4/udp_impl.h | 5 +++--
net/rds/tcp_send.c | 1 +
net/socket.c | 11 +++++++----
net/sunrpc/svcsock.c | 6 +++---
net/sunrpc/xprtsock.c | 2 +-
23 files changed, 84 insertions(+), 44 deletions(-)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 0358e55..49c7346 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2584,6 +2584,7 @@ static int _drbd_send_page(struct drbd_conf *mdev, struct page *page,
set_fs(KERNEL_DS);
do {
sent = mdev->data.socket->ops->sendpage(mdev->data.socket, page,
+ NULL,
offset, len,
msg_flags);
if (sent == -EAGAIN) {
diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 7c34d8e..3884ae1 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -284,8 +284,8 @@ static int iscsi_sw_tcp_xmit_segment(struct iscsi_tcp_conn *tcp_conn,
if (!segment->data) {
sg = segment->sg;
offset += segment->sg_offset + sg->offset;
- r = tcp_sw_conn->sendpage(sk, sg_page(sg), offset,
- copy, flags);
+ r = tcp_sw_conn->sendpage(sk, sg_page(sg), NULL,
+ offset, copy, flags);
} else {
struct msghdr msg = { .msg_flags = flags };
struct kvec iov = {
diff --git a/drivers/scsi/iscsi_tcp.h b/drivers/scsi/iscsi_tcp.h
index 666fe09..1e23265 100644
--- a/drivers/scsi/iscsi_tcp.h
+++ b/drivers/scsi/iscsi_tcp.h
@@ -52,7 +52,8 @@ struct iscsi_sw_tcp_conn {
uint32_t sendpage_failures_cnt;
uint32_t discontiguous_hdr_cnt;
- ssize_t (*sendpage)(struct socket *, struct page *, int, size_t, int);
+ ssize_t (*sendpage)(struct socket *, struct page *,
+ struct skb_frag_destructor *, int, size_t, int);
};
struct iscsi_sw_tcp_host {
diff --git a/drivers/staging/pohmelfs/trans.c b/drivers/staging/pohmelfs/trans.c
index 06c1a74..96a7921 100644
--- a/drivers/staging/pohmelfs/trans.c
+++ b/drivers/staging/pohmelfs/trans.c
@@ -104,7 +104,8 @@ static int netfs_trans_send_pages(struct netfs_trans *t, struct netfs_state *st)
msg.msg_flags = MSG_WAITALL | (attached_pages == 1 ? 0 :
MSG_MORE);
- err = kernel_sendpage(st->socket, page, 0, size, msg.msg_flags);
+ err = kernel_sendpage(st->socket, page, NULL,
+ 0, size, msg.msg_flags);
if (err <= 0) {
printk("%s: %d/%d failed to send transaction page: t: %p, gen: %u, size: %u, err: %d.\n",
__func__, i, t->page_num, t, t->gen, size, err);
diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index 02348f7..d0c984b 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -1315,7 +1315,8 @@ send_hdr:
u32 sub_len = min_t(u32, data_len, space);
send_pg:
tx_sent = conn->sock->ops->sendpage(conn->sock,
- sg_page(sg), sg->offset + offset, sub_len, 0);
+ sg_page(sg), NULL,
+ sg->offset + offset, sub_len, 0);
if (tx_sent != sub_len) {
if (tx_sent == -EAGAIN) {
pr_err("tcp_sendpage() returned"
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 990626e..98ace05 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1342,8 +1342,8 @@ static void send_to_sock(struct connection *con)
ret = 0;
if (len) {
- ret = kernel_sendpage(con->sock, e->page, offset, len,
- msg_flags);
+ ret = kernel_sendpage(con->sock, e->page, NULL,
+ offset, len, msg_flags);
if (ret == -EAGAIN || ret == 0) {
if (ret == -EAGAIN &&
test_bit(SOCK_ASYNC_NOSPACE, &con->sock->flags) &&
diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c
index 044e7b5..e13851e 100644
--- a/fs/ocfs2/cluster/tcp.c
+++ b/fs/ocfs2/cluster/tcp.c
@@ -983,6 +983,7 @@ static void o2net_sendpage(struct o2net_sock_container *sc,
mutex_lock(&sc->sc_send_lock);
ret = sc->sc_sock->ops->sendpage(sc->sc_sock,
virt_to_page(kmalloced_virt),
+ NULL,
(long)kmalloced_virt & ~PAGE_MASK,
size, MSG_DONTWAIT);
mutex_unlock(&sc->sc_send_lock);
diff --git a/include/linux/net.h b/include/linux/net.h
index b299230..db562ba 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -157,6 +157,7 @@ struct kiocb;
struct sockaddr;
struct msghdr;
struct module;
+struct skb_frag_destructor;
struct proto_ops {
int family;
@@ -203,6 +204,7 @@ struct proto_ops {
int (*mmap) (struct file *file, struct socket *sock,
struct vm_area_struct * vma);
ssize_t (*sendpage) (struct socket *sock, struct page *page,
+ struct skb_frag_destructor *destroy,
int offset, size_t size, int flags);
ssize_t (*splice_read)(struct socket *sock, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len, unsigned int flags);
@@ -273,7 +275,9 @@ extern int kernel_getsockopt(struct socket *sock, int level, int optname,
char *optval, int *optlen);
extern int kernel_setsockopt(struct socket *sock, int level, int optname,
char *optval, unsigned int optlen);
-extern int kernel_sendpage(struct socket *sock, struct page *page, int offset,
+extern int kernel_sendpage(struct socket *sock, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset,
size_t size, int flags);
extern int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg);
extern int kernel_sock_shutdown(struct socket *sock,
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index 22fac98..91cd8d0 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -21,7 +21,9 @@ extern int inet_dgram_connect(struct socket *sock, struct sockaddr * uaddr,
extern int inet_accept(struct socket *sock, struct socket *newsock, int flags);
extern int inet_sendmsg(struct kiocb *iocb, struct socket *sock,
struct msghdr *msg, size_t size);
-extern ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
+extern ssize_t inet_sendpage(struct socket *sock, struct page *page,
+ struct skb_frag_destructor *frag,
+ int offset,
size_t size, int flags);
extern int inet_recvmsg(struct kiocb *iocb, struct socket *sock,
struct msghdr *msg, size_t size, int flags);
diff --git a/include/net/ip.h b/include/net/ip.h
index eca0ef7..d34030c 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -114,7 +114,9 @@ extern int ip_append_data(struct sock *sk, struct flowi4 *fl4,
struct rtable **rt,
unsigned int flags);
extern int ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb);
-extern ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
+extern ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4,
+ struct page *page,
+ struct skb_frag_destructor *destroy,
int offset, size_t size, int flags);
extern struct sk_buff *__ip_make_skb(struct sock *sk,
struct flowi4 *fl4,
diff --git a/include/net/sock.h b/include/net/sock.h
index 32e3937..e0c92e6 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -776,6 +776,7 @@ struct proto {
size_t len, int noblock, int flags,
int *addr_len);
int (*sendpage)(struct sock *sk, struct page *page,
+ struct skb_frag_destructor *destroy,
int offset, size_t size, int flags);
int (*bind)(struct sock *sk,
struct sockaddr *uaddr, int addr_len);
@@ -1164,9 +1165,10 @@ extern int sock_no_mmap(struct file *file,
struct socket *sock,
struct vm_area_struct *vma);
extern ssize_t sock_no_sendpage(struct socket *sock,
- struct page *page,
- int offset, size_t size,
- int flags);
+ struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset, size_t size,
+ int flags);
/*
* Functions to fill in entries in struct proto_ops when a protocol
diff --git a/include/net/tcp.h b/include/net/tcp.h
index bb18c4d..19ef50a 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -322,7 +322,9 @@ extern void *tcp_v4_tw_get_peer(struct sock *sk);
extern int tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);
extern int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
size_t size);
-extern int tcp_sendpage(struct sock *sk, struct page *page, int offset,
+extern int tcp_sendpage(struct sock *sk, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset,
size_t size, int flags);
extern int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg);
extern int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index ad5b708..69f049b 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -851,7 +851,7 @@ static int write_partial_msg_pages(struct ceph_connection *con)
cpu_to_le32(crc32c(tmpcrc, base, len));
con->out_msg_pos.did_page_crc = 1;
}
- ret = kernel_sendpage(con->sock, page,
+ ret = kernel_sendpage(con->sock, page, NULL,
con->out_msg_pos.page_pos + page_shift,
len,
MSG_DONTWAIT | MSG_NOSIGNAL |
diff --git a/net/core/sock.c b/net/core/sock.c
index b23f174..ba889f7 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1857,7 +1857,9 @@ int sock_no_mmap(struct file *file, struct socket *sock, struct vm_area_struct *
}
EXPORT_SYMBOL(sock_no_mmap);
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags)
+ssize_t sock_no_sendpage(struct socket *sock, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset, size_t size, int flags)
{
ssize_t res;
struct msghdr msg = {.msg_flags = flags};
@@ -1867,6 +1869,8 @@ ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, siz
iov.iov_len = size;
res = kernel_sendmsg(sock, &msg, &iov, 1, size);
kunmap(page);
+ /* kernel_sendmsg copies so we can destroy immediately */
+ skb_frag_destructor_unref(destroy);
return res;
}
EXPORT_SYMBOL(sock_no_sendpage);
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 1b5096a..99f7fd0 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -745,7 +745,9 @@ int inet_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
}
EXPORT_SYMBOL(inet_sendmsg);
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
+ssize_t inet_sendpage(struct socket *sock, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset,
size_t size, int flags)
{
struct sock *sk = sock->sk;
@@ -758,8 +760,9 @@ ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
return -EAGAIN;
if (sk->sk_prot->sendpage)
- return sk->sk_prot->sendpage(sk, page, offset, size, flags);
- return sock_no_sendpage(sock, page, offset, size, flags);
+ return sk->sk_prot->sendpage(sk, page, destroy,
+ offset, size, flags);
+ return sock_no_sendpage(sock, page, destroy, offset, size, flags);
}
EXPORT_SYMBOL(inet_sendpage);
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 3252e06..753dc7b 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1116,6 +1116,7 @@ int ip_append_data(struct sock *sk, struct flowi4 *fl4,
}
ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
+ struct skb_frag_destructor *destroy,
int offset, size_t size, int flags)
{
struct inet_sock *inet = inet_sk(sk);
@@ -1229,11 +1230,12 @@ ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
i = skb_shinfo(skb)->nr_frags;
if (len > size)
len = size;
- if (skb_can_coalesce(skb, i, page, NULL, offset)) {
+ if (skb_can_coalesce(skb, i, page, destroy, offset)) {
skb_frag_size_add(&skb_shinfo(skb)->frags[i-1], len);
} else if (i < MAX_SKB_FRAGS) {
- get_page(page);
skb_fill_page_desc(skb, i, page, offset, len);
+ skb_frag_set_destructor(skb, i, destroy);
+ skb_frag_ref(skb, i);
} else {
err = -EMSGSIZE;
goto error;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 018de0c..56ef323 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -757,7 +757,10 @@ static int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
return mss_now;
}
-static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffset,
+static ssize_t do_tcp_sendpages(struct sock *sk,
+ struct page **pages,
+ struct skb_frag_destructor **destructors,
+ int poffset,
size_t psize, int flags)
{
struct tcp_sock *tp = tcp_sk(sk);
@@ -783,6 +786,8 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffse
while (psize > 0) {
struct sk_buff *skb = tcp_write_queue_tail(sk);
struct page *page = pages[poffset / PAGE_SIZE];
+ struct skb_frag_destructor *destroy =
+ destructors ? destructors[poffset / PAGE_SIZE] : NULL;
int copy, i, can_coalesce;
int offset = poffset % PAGE_SIZE;
int size = min_t(size_t, psize, PAGE_SIZE - offset);
@@ -804,7 +809,7 @@ new_segment:
copy = size;
i = skb_shinfo(skb)->nr_frags;
- can_coalesce = skb_can_coalesce(skb, i, page, NULL, offset);
+ can_coalesce = skb_can_coalesce(skb, i, page, destroy, offset);
if (!can_coalesce && i >= MAX_SKB_FRAGS) {
tcp_mark_push(tp, skb);
goto new_segment;
@@ -815,8 +820,9 @@ new_segment:
if (can_coalesce) {
skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
} else {
- get_page(page);
skb_fill_page_desc(skb, i, page, offset, copy);
+ skb_frag_set_destructor(skb, i, destroy);
+ skb_frag_ref(skb, i);
}
skb->len += copy;
@@ -871,18 +877,20 @@ out_err:
return sk_stream_error(sk, flags, err);
}
-int tcp_sendpage(struct sock *sk, struct page *page, int offset,
- size_t size, int flags)
+int tcp_sendpage(struct sock *sk, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset, size_t size, int flags)
{
ssize_t res;
if (!(sk->sk_route_caps & NETIF_F_SG) ||
!(sk->sk_route_caps & NETIF_F_ALL_CSUM))
- return sock_no_sendpage(sk->sk_socket, page, offset, size,
- flags);
+ return sock_no_sendpage(sk->sk_socket, page, destroy,
+ offset, size, flags);
lock_sock(sk);
- res = do_tcp_sendpages(sk, &page, offset, size, flags);
+ res = do_tcp_sendpages(sk, &page, &destroy,
+ offset, size, flags);
release_sock(sk);
return res;
}
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 5a65eea..1ee4728 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1028,8 +1028,9 @@ do_confirm:
}
EXPORT_SYMBOL(udp_sendmsg);
-int udp_sendpage(struct sock *sk, struct page *page, int offset,
- size_t size, int flags)
+int udp_sendpage(struct sock *sk, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset, size_t size, int flags)
{
struct inet_sock *inet = inet_sk(sk);
struct udp_sock *up = udp_sk(sk);
@@ -1057,11 +1058,11 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
}
ret = ip_append_page(sk, &inet->cork.fl.u.ip4,
- page, offset, size, flags);
+ page, destroy, offset, size, flags);
if (ret == -EOPNOTSUPP) {
release_sock(sk);
- return sock_no_sendpage(sk->sk_socket, page, offset,
- size, flags);
+ return sock_no_sendpage(sk->sk_socket, page, destroy,
+ offset, size, flags);
}
if (ret < 0) {
udp_flush_pending_frames(sk);
diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h
index aaad650..4923d82 100644
--- a/net/ipv4/udp_impl.h
+++ b/net/ipv4/udp_impl.h
@@ -23,8 +23,9 @@ extern int compat_udp_getsockopt(struct sock *sk, int level, int optname,
#endif
extern int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
size_t len, int noblock, int flags, int *addr_len);
-extern int udp_sendpage(struct sock *sk, struct page *page, int offset,
- size_t size, int flags);
+extern int udp_sendpage(struct sock *sk, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset, size_t size, int flags);
extern int udp_queue_rcv_skb(struct sock * sk, struct sk_buff *skb);
extern void udp_destroy_sock(struct sock *sk);
diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c
index 1b4fd68..71503ad 100644
--- a/net/rds/tcp_send.c
+++ b/net/rds/tcp_send.c
@@ -119,6 +119,7 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
while (sg < rm->data.op_nents) {
ret = tc->t_sock->ops->sendpage(tc->t_sock,
sg_page(&rm->data.op_sg[sg]),
+ NULL,
rm->data.op_sg[sg].offset + off,
rm->data.op_sg[sg].length - off,
MSG_DONTWAIT|MSG_NOSIGNAL);
diff --git a/net/socket.c b/net/socket.c
index 2877647..cbd5728 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -795,7 +795,7 @@ static ssize_t sock_sendpage(struct file *file, struct page *page,
if (more)
flags |= MSG_MORE;
- return kernel_sendpage(sock, page, offset, size, flags);
+ return kernel_sendpage(sock, page, NULL, offset, size, flags);
}
static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
@@ -3352,15 +3352,18 @@ int kernel_setsockopt(struct socket *sock, int level, int optname,
}
EXPORT_SYMBOL(kernel_setsockopt);
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
+int kernel_sendpage(struct socket *sock, struct page *page,
+ struct skb_frag_destructor *destroy,
+ int offset,
size_t size, int flags)
{
sock_update_classid(sock->sk);
if (sock->ops->sendpage)
- return sock->ops->sendpage(sock, page, offset, size, flags);
+ return sock->ops->sendpage(sock, page, destroy,
+ offset, size, flags);
- return sock_no_sendpage(sock, page, offset, size, flags);
+ return sock_no_sendpage(sock, page, destroy, offset, size, flags);
}
EXPORT_SYMBOL(kernel_sendpage);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 71bed1c..8517cd3 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -185,7 +185,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
/* send head */
if (slen == xdr->head[0].iov_len)
flags = 0;
- len = kernel_sendpage(sock, headpage, headoffset,
+ len = kernel_sendpage(sock, headpage, NULL, headoffset,
xdr->head[0].iov_len, flags);
if (len != xdr->head[0].iov_len)
goto out;
@@ -198,7 +198,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
while (pglen > 0) {
if (slen == size)
flags = 0;
- result = kernel_sendpage(sock, *ppage, base, size, flags);
+ result = kernel_sendpage(sock, *ppage, NULL, base, size, flags);
if (result > 0)
len += result;
if (result != size)
@@ -212,7 +212,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
/* send tail */
if (xdr->tail[0].iov_len) {
- result = kernel_sendpage(sock, tailpage, tailoffset,
+ result = kernel_sendpage(sock, tailpage, NULL, tailoffset,
xdr->tail[0].iov_len, 0);
if (result > 0)
len += result;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 55472c4..38b2fec 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -408,7 +408,7 @@ static int xs_send_pagedata(struct socket *sock, struct xdr_buf *xdr, unsigned i
remainder -= len;
if (remainder != 0 || more)
flags |= MSG_MORE;
- err = sock->ops->sendpage(sock, *ppage, base, len, flags);
+ err = sock->ops->sendpage(sock, *ppage, NULL, base, len, flags);
if (remainder == 0 || err != len)
break;
sent += err;
--
1.7.9.5
* [RFC] [PATCH 4/7] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack.
2013-01-25 14:27 [RFC] rebase Ian Campbell's skb fragment tracking to 3.2 Alex Bligh
` (2 preceding siblings ...)
2013-01-25 14:27 ` [RFC] [PATCH 3/7] net: add paged frag destructor support to kernel_sendpage Alex Bligh
@ 2013-01-25 14:27 ` Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 5/7] net: move skb frag kmap functions to skbuff.h Alex Bligh
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
From: Ian Campbell <ian.campbell <at> citrix.com>
This prevents an issue where an ACK is delayed, a retransmit is queued (either
at the RPC or TCP level) and the ACK arrives before the retransmission hits the
wire. If this happens to an NFS WRITE RPC then the write() system call
completes and the userspace process can continue, potentially modifying data
referenced by the retransmission before the retransmission occurs.
Signed-off-by: Ian Campbell <ian.campbell <at> citrix.com>
Acked-by: Trond Myklebust <Trond.Myklebust <at> netapp.com>
Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: Neil Brown <neilb <at> suse.de>
Cc: "J. Bruce Fields" <bfields <at> fieldses.org>
Cc: linux-nfs <at> vger.kernel.org
Cc: netdev <at> vger.kernel.org
[since v1:
Push down from NFS layer into RPC layer
]
Signed-off-by: Alex Bligh <alex@alex.org.uk>
---
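Note for readers (not part of the commit): the reference-counting idea
described above can be sketched in userspace as follows. This is only an
illustration under stated assumptions. The names frag_destructor, request,
request_pages_released and demo_delayed_completion are hypothetical stand-ins
and not kernel symbols; the struct merely mirrors the ref/destroy/data fields
that this patch initialises in xprt_request_init(), and the "done" flag stands
in for waking the RPC task.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the series' struct skb_frag_destructor. */
struct frag_destructor {
    atomic_int ref;
    int (*destroy)(void *data);
    void *data;
};

/* Hypothetical request; "done" models waking the waiting RPC task. */
struct request {
    struct frag_destructor destructor;
    bool done;
};

/* Called once the last reference to the pages has been dropped. */
static int request_pages_released(void *data)
{
    struct request *req = data;

    req->done = true;
    return 0;
}

static void request_init(struct request *req)
{
    atomic_init(&req->destructor.ref, 1);   /* reference held by the request */
    req->destructor.destroy = request_pages_released;
    req->destructor.data = req;
    req->done = false;
}

static void frag_destructor_ref(struct frag_destructor *d)
{
    atomic_fetch_add(&d->ref, 1);
}

static void frag_destructor_unref(struct frag_destructor *d)
{
    /* fetch_sub returns the old value: 1 means this was the last reference */
    if (atomic_fetch_sub(&d->ref, 1) == 1)
        d->destroy(d->data);
}

/* Returns true if completion waited for the last fragment reference. */
static bool demo_delayed_completion(void)
{
    struct request req;
    bool early;

    request_init(&req);
    frag_destructor_ref(&req.destructor);    /* original transmission */
    frag_destructor_ref(&req.destructor);    /* queued retransmission */

    frag_destructor_unref(&req.destructor);  /* reply decoded, request drops its ref */
    frag_destructor_unref(&req.destructor);  /* original skb freed */
    early = req.done;
    assert(!early);                          /* retransmission still holds the pages */

    frag_destructor_unref(&req.destructor);  /* retransmission freed */
    return !early && req.done;
}
```

Calling demo_delayed_completion() walks through the delayed-ACK case: the
request is only marked done after the retransmission has released its
reference, which is the ordering the patch is trying to guarantee.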
include/linux/sunrpc/xdr.h | 2 ++
include/linux/sunrpc/xprt.h | 5 ++++-
net/sunrpc/clnt.c | 27 ++++++++++++++++++++++-----
net/sunrpc/svcsock.c | 3 ++-
net/sunrpc/xprt.c | 13 +++++++++++++
net/sunrpc/xprtsock.c | 3 ++-
6 files changed, 45 insertions(+), 8 deletions(-)
diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index a20970e..172f81e 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -16,6 +16,7 @@
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/scatterlist.h>
+#include <linux/skbuff.h>
/*
* Buffer adjustment
@@ -57,6 +58,7 @@ struct xdr_buf {
tail[1]; /* Appended after page data */
struct page ** pages; /* Array of contiguous pages */
+ struct skb_frag_destructor *destructor;
unsigned int page_base, /* Start of page data */
page_len, /* Length of page data */
flags; /* Flags for data disposition */
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 15518a1..75131eb 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -92,7 +92,10 @@ struct rpc_rqst {
/* A cookie used to track the
state of the transport
connection */
-
+ struct skb_frag_destructor destructor; /* SKB paged fragment
+ * destructor for
+ * transmitted pages*/
+
/*
* Partial send handling
*/
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index f0268ea..06f363f 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -61,6 +61,7 @@ static void call_reserve(struct rpc_task *task);
static void call_reserveresult(struct rpc_task *task);
static void call_allocate(struct rpc_task *task);
static void call_decode(struct rpc_task *task);
+static void call_complete(struct rpc_task *task);
static void call_bind(struct rpc_task *task);
static void call_bind_status(struct rpc_task *task);
static void call_transmit(struct rpc_task *task);
@@ -1115,6 +1116,8 @@ rpc_xdr_encode(struct rpc_task *task)
(char *)req->rq_buffer + req->rq_callsize,
req->rq_rcvsize);
+ req->rq_snd_buf.destructor = &req->destructor;
+
p = rpc_encode_header(task);
if (p == NULL) {
printk(KERN_INFO "RPC: couldn't encode RPC header, exit EIO\n");
@@ -1278,6 +1281,7 @@ call_connect_status(struct rpc_task *task)
static void
call_transmit(struct rpc_task *task)
{
+ struct rpc_rqst *req = task->tk_rqstp;
dprint_status(task);
task->tk_action = call_status;
@@ -1311,8 +1315,8 @@ call_transmit(struct rpc_task *task)
call_transmit_status(task);
if (rpc_reply_expected(task))
return;
- task->tk_action = rpc_exit_task;
- rpc_wake_up_queued_task(&task->tk_xprt->pending, task);
+ task->tk_action = call_complete;
+ skb_frag_destructor_unref(&req->destructor);
}
/*
@@ -1385,7 +1389,8 @@ call_bc_transmit(struct rpc_task *task)
return;
}
- task->tk_action = rpc_exit_task;
+ task->tk_action = call_complete;
+ skb_frag_destructor_unref(&req->destructor);
if (task->tk_status < 0) {
printk(KERN_NOTICE "RPC: Could not send backchannel reply "
"error: %d\n", task->tk_status);
@@ -1425,7 +1430,6 @@ call_bc_transmit(struct rpc_task *task)
"error: %d\n", task->tk_status);
break;
}
- rpc_wake_up_queued_task(&req->rq_xprt->pending, task);
}
#endif /* CONFIG_SUNRPC_BACKCHANNEL */
@@ -1591,12 +1595,14 @@ call_decode(struct rpc_task *task)
return;
}
- task->tk_action = rpc_exit_task;
+ task->tk_action = call_complete;
if (decode) {
task->tk_status = rpcauth_unwrap_resp(task, decode, req, p,
task->tk_msg.rpc_resp);
}
+ rpc_sleep_on(&req->rq_xprt->pending, task, NULL);
+ skb_frag_destructor_unref(&req->destructor);
dprintk("RPC: %5u call_decode result %d\n", task->tk_pid,
task->tk_status);
return;
@@ -1611,6 +1617,17 @@ out_retry:
}
}
+/*
+ * 8. Wait for pages to be released by the network stack.
+ */
+static void
+call_complete(struct rpc_task *task)
+{
+ dprintk("RPC: %5u call_complete result %d\n",
+ task->tk_pid, task->tk_status);
+ task->tk_action = rpc_exit_task;
+}
+
static __be32 *
rpc_encode_header(struct rpc_task *task)
{
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 8517cd3..3f31181 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -198,7 +198,8 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
while (pglen > 0) {
if (slen == size)
flags = 0;
- result = kernel_sendpage(sock, *ppage, NULL, base, size, flags);
+ result = kernel_sendpage(sock, *ppage, xdr->destructor,
+ base, size, flags);
if (result > 0)
len += result;
if (result != size)
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index c64c0ef..5f28bc7 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1101,6 +1101,16 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
xprt->xid = net_random();
}
+static int xprt_complete_skb_pages(void *calldata)
+{
+ struct rpc_task *task = calldata;
+ struct rpc_rqst *req = task->tk_rqstp;
+
+ dprintk("RPC: %5u completing skb pages\n", task->tk_pid);
+ rpc_wake_up_queued_task(&req->rq_xprt->pending, task);
+ return 0;
+}
+
static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
{
struct rpc_rqst *req = task->tk_rqstp;
@@ -1113,6 +1123,9 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
req->rq_xid = xprt_alloc_xid(xprt);
req->rq_release_snd_buf = NULL;
xprt_reset_majortimeo(req);
+ atomic_set(&req->destructor.ref, 1);
+ req->destructor.destroy = &xprt_complete_skb_pages;
+ req->destructor.data = task;
dprintk("RPC: %5u reserved req %p xid %08x\n", task->tk_pid,
req, ntohl(req->rq_xid));
}
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 38b2fec..5406977 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -408,7 +408,8 @@ static int xs_send_pagedata(struct socket *sock, struct xdr_buf *xdr, unsigned i
remainder -= len;
if (remainder != 0 || more)
flags |= MSG_MORE;
- err = sock->ops->sendpage(sock, *ppage, NULL, base, len, flags);
+ err = sock->ops->sendpage(sock, *ppage, xdr->destructor,
+ base, len, flags);
if (remainder == 0 || err != len)
break;
sent += err;
--
1.7.9.5
* [RFC] [PATCH 5/7] net: move skb frag kmap functions to skbuff.h
2013-01-25 14:27 [RFC] rebase Ian Campbell's skb fragment tracking to 3.2 Alex Bligh
` (3 preceding siblings ...)
2013-01-25 14:27 ` [RFC] [PATCH 4/7] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack Alex Bligh
@ 2013-01-25 14:27 ` Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 6/7] net: add skb_frag_k(un)map convenience functions Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 7/7] net: return a *const* struct page from skb_frag_page Alex Bligh
6 siblings, 0 replies; 8+ messages in thread
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
From: Ian Campbell <ian.campbell <at> citrix.com>
The usage is open-coded in drivers/scsi/fcoe/fcoe.c, and net/appletalk/ddp.c
uses an out-of-directory local include of "../core/kmap_skb.h".
Rename functions k(un)map_skb_frag to skb_frag_k(un)map_atomic to avoid
confusion with the soon-to-be-introduced skb_frag_k(un)map.
Signed-off-by: Ian Campbell <ian.campbell <at> citrix.com>
Cc: Robert Love <robert.w.love <at> intel.com>
Cc: "James E.J. Bottomley" <JBottomley <at> parallels.com>
Cc: Arnaldo Carvalho de Melo <acme <at> ghostprotocols.net>
Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: Eric Dumazet <eric.dumazet <at> gmail.com>
Cc: "Michał Mirosław" <mirq-linux <at> rere.qmqm.pl>
Cc: Tom Herbert <therbert <at> google.com>
Cc: devel <at> open-fcoe.org
Cc: linux-scsi <at> vger.kernel.org
Cc: linux-kernel <at> vger.kernel.org
Cc: netdev <at> vger.kernel.org
Signed-off-by: Alex Bligh <alex@alex.org.uk>
---
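Note for readers (not part of the commit): a minimal userspace sketch of the
map/use/unmap pattern that the renamed helpers are used in. Everything here is
a mock under stated assumptions: mock_frag, mock_frag_kmap_atomic,
mock_frag_kunmap_atomic, mock_frag_copy and PAGE_SZ are hypothetical names, and
the "mapping" is just a counter rather than a real highmem mapping. The point
it tries to show is that the map helper takes the fragment while the unmap
helper takes the address that the map helper returned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096    /* hypothetical page size for the mock */

/* Mock of a paged fragment: a page plus an offset and size within it. */
struct mock_frag {
    unsigned char *page;
    size_t page_offset;
    size_t size;
};

static int mapped;      /* number of outstanding mock mappings */

/* Stand-in for skb_frag_kmap_atomic(): returns the start of the page. */
static void *mock_frag_kmap_atomic(const struct mock_frag *frag)
{
    mapped++;
    return frag->page;
}

/* Stand-in for skb_frag_kunmap_atomic(): takes the mapped address. */
static void mock_frag_kunmap_atomic(void *vaddr)
{
    (void)vaddr;
    assert(mapped > 0); /* every unmap must pair with an earlier map */
    mapped--;
}

/* Copy bytes out of a fragment, following the shape of skb_copy_bits(). */
static size_t mock_frag_copy(const struct mock_frag *frag, void *to, size_t len)
{
    unsigned char *vaddr;
    size_t copy = frag->size < len ? frag->size : len;

    vaddr = mock_frag_kmap_atomic(frag);
    memcpy(to, vaddr + frag->page_offset, copy);
    mock_frag_kunmap_atomic(vaddr);
    return copy;
}

/* Returns 1 if the copy succeeded and no mapping was left outstanding. */
static int demo_frag_copy(void)
{
    static unsigned char page[PAGE_SZ];
    struct mock_frag frag = { page, 100, 5 };
    char out[8] = { 0 };

    memcpy(page + 100, "hello", 5);
    return mock_frag_copy(&frag, out, sizeof(out)) == 5 &&
           memcmp(out, "hello", 5) == 0 && mapped == 0;
}
```

The fragment offset is applied by the caller after mapping, which matches how
the converted call sites add frag->page_offset to the returned address.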
drivers/scsi/fcoe/fcoe.c | 7 +++----
include/linux/skbuff.h | 19 +++++++++++++++++++
net/appletalk/ddp.c | 5 ++---
net/core/kmap_skb.h | 19 -------------------
net/core/skbuff.c | 35 +++++++++++++++++------------------
5 files changed, 41 insertions(+), 44 deletions(-)
diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index 8d67467..70ab372 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1458,6 +1458,7 @@ int fcoe_xmit(struct fc_lport *lport, struct fc_frame *fp)
struct ethhdr *eh;
struct fcoe_crc_eof *cp;
struct sk_buff *skb;
+ skb_frag_t *frag;
struct fcoe_dev_stats *stats;
struct fc_frame_header *fh;
unsigned int hlen; /* header length implies the version */
@@ -1504,14 +1505,12 @@ int fcoe_xmit(struct fc_lport *lport, struct fc_frame *fp)
/* copy port crc and eof to the skb buff */
if (skb_is_nonlinear(skb)) {
- skb_frag_t *frag;
if (fcoe_alloc_paged_crc_eof(skb, tlen)) {
kfree_skb(skb);
return -ENOMEM;
}
frag = &skb_shinfo(skb)->frags[skb_shinfo(skb)->nr_frags - 1];
- cp = kmap_atomic(skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ)
- + frag->page_offset;
+ cp = skb_frag_kmap_atomic(frag) + frag->page_offset;
} else {
cp = (struct fcoe_crc_eof *)skb_put(skb, tlen);
}
@@ -1521,7 +1520,7 @@ int fcoe_xmit(struct fc_lport *lport, struct fc_frame *fp)
cp->fcoe_crc32 = cpu_to_le32(~crc);
if (skb_is_nonlinear(skb)) {
- kunmap_atomic(cp, KM_SKB_DATA_SOFTIRQ);
+ skb_frag_kunmap_atomic(cp);
cp = NULL;
}
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8a8eecd..84a54d6 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -29,6 +29,7 @@
#include <linux/rcupdate.h>
#include <linux/dmaengine.h>
#include <linux/hrtimer.h>
+#include <linux/highmem.h>
#include <linux/dma-mapping.h>
/* Don't change this without changing skb_csum_unnecessary! */
@@ -1848,6 +1849,24 @@ static inline void skb_frag_set_page(struct sk_buff *skb, int f,
__skb_frag_set_page(&skb_shinfo(skb)->frags[f], page);
}
+static inline void *skb_frag_kmap_atomic(const skb_frag_t *frag)
+{
+#ifdef CONFIG_HIGHMEM
+ BUG_ON(in_irq());
+
+ local_bh_disable();
+#endif
+ return kmap_atomic(skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ);
+}
+
+static inline void skb_frag_kunmap_atomic(void *vaddr)
+{
+ kunmap_atomic(vaddr, KM_SKB_DATA_SOFTIRQ);
+#ifdef CONFIG_HIGHMEM
+ local_bh_enable();
+#endif
+}
+
/**
* skb_frag_dma_map - maps a paged fragment via the DMA API
* @dev: the device to map the fragment to
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index bfa9ab9..ecc6f63 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -63,7 +63,6 @@
#include <net/tcp_states.h>
#include <net/route.h>
#include <linux/atalk.h>
-#include "../core/kmap_skb.h"
struct datalink_proto *ddp_dl, *aarp_dl;
static const struct proto_ops atalk_dgram_ops;
@@ -960,10 +959,10 @@ static unsigned long atalk_sum_skb(const struct sk_buff *skb, int offset,
if (copy > len)
copy = len;
- vaddr = kmap_skb_frag(frag);
+ vaddr = skb_frag_kmap_atomic(frag);
sum = atalk_sum_partial(vaddr + frag->page_offset +
offset - start, copy, sum);
- kunmap_skb_frag(vaddr);
+ skb_frag_kunmap_atomic(vaddr);
if (!(len -= copy))
return sum;
diff --git a/net/core/kmap_skb.h b/net/core/kmap_skb.h
deleted file mode 100644
index 81e1ed7..0000000
--- a/net/core/kmap_skb.h
+++ /dev/null
@@ -1,19 +0,0 @@
-#include <linux/highmem.h>
-
-static inline void *kmap_skb_frag(const skb_frag_t *frag)
-{
-#ifdef CONFIG_HIGHMEM
- BUG_ON(in_irq());
-
- local_bh_disable();
-#endif
- return kmap_atomic(skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ);
-}
-
-static inline void kunmap_skb_frag(void *vaddr)
-{
- kunmap_atomic(vaddr, KM_SKB_DATA_SOFTIRQ);
-#ifdef CONFIG_HIGHMEM
- local_bh_enable();
-#endif
-}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 425cd5a..047f38f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -69,8 +69,6 @@
#include <asm/system.h>
#include <trace/events/skb.h>
-#include "kmap_skb.h"
-
static struct kmem_cache *skbuff_head_cache __read_mostly;
static struct kmem_cache *skbuff_fclone_cache __read_mostly;
@@ -676,10 +674,10 @@ int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
}
return -ENOMEM;
}
- vaddr = kmap_skb_frag(&skb_shinfo(skb)->frags[i]);
+ vaddr = skb_frag_kmap_atomic(&skb_shinfo(skb)->frags[i]);
memcpy(page_address(page),
vaddr + f->page_offset, skb_frag_size(f));
- kunmap_skb_frag(vaddr);
+ skb_frag_kunmap_atomic(vaddr);
page->private = (unsigned long)head;
head = page;
}
@@ -1459,15 +1457,16 @@ int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len)
end = start + skb_frag_size(&skb_shinfo(skb)->frags[i]);
if ((copy = end - offset) > 0) {
u8 *vaddr;
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
if (copy > len)
copy = len;
- vaddr = kmap_skb_frag(&skb_shinfo(skb)->frags[i]);
+ vaddr = skb_frag_kmap_atomic(frag);
memcpy(to,
- vaddr + skb_shinfo(skb)->frags[i].page_offset+
- offset - start, copy);
- kunmap_skb_frag(vaddr);
+ vaddr + frag->page_offset + offset - start,
+ copy);
+ skb_frag_kunmap_atomic(vaddr);
if ((len -= copy) == 0)
return 0;
@@ -1772,10 +1771,10 @@ int skb_store_bits(struct sk_buff *skb, int offset, const void *from, int len)
if (copy > len)
copy = len;
- vaddr = kmap_skb_frag(frag);
+ vaddr = skb_frag_kmap_atomic(frag);
memcpy(vaddr + frag->page_offset + offset - start,
from, copy);
- kunmap_skb_frag(vaddr);
+ skb_frag_kunmap_atomic(vaddr);
if ((len -= copy) == 0)
return 0;
@@ -1846,10 +1845,10 @@ __wsum skb_checksum(const struct sk_buff *skb, int offset,
if (copy > len)
copy = len;
- vaddr = kmap_skb_frag(frag);
+ vaddr = skb_frag_kmap_atomic(frag);
csum2 = csum_partial(vaddr + frag->page_offset +
offset - start, copy, 0);
- kunmap_skb_frag(vaddr);
+ skb_frag_kunmap_atomic(vaddr);
csum = csum_block_add(csum, csum2, pos);
if (!(len -= copy))
return csum;
@@ -1921,12 +1920,12 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
if (copy > len)
copy = len;
- vaddr = kmap_skb_frag(frag);
+ vaddr = skb_frag_kmap_atomic(frag);
csum2 = csum_partial_copy_nocheck(vaddr +
frag->page_offset +
offset - start, to,
copy, 0);
- kunmap_skb_frag(vaddr);
+ skb_frag_kunmap_atomic(vaddr);
csum = csum_block_add(csum, csum2, pos);
if (!(len -= copy))
return csum;
@@ -2447,7 +2446,7 @@ next_skb:
if (abs_offset < block_limit) {
if (!st->frag_data)
- st->frag_data = kmap_skb_frag(frag);
+ st->frag_data = skb_frag_kmap_atomic(frag);
*data = (u8 *) st->frag_data + frag->page_offset +
(abs_offset - st->stepped_offset);
@@ -2456,7 +2455,7 @@ next_skb:
}
if (st->frag_data) {
- kunmap_skb_frag(st->frag_data);
+ skb_frag_kunmap_atomic(st->frag_data);
st->frag_data = NULL;
}
@@ -2465,7 +2464,7 @@ next_skb:
}
if (st->frag_data) {
- kunmap_skb_frag(st->frag_data);
+ skb_frag_kunmap_atomic(st->frag_data);
st->frag_data = NULL;
}
@@ -2493,7 +2492,7 @@ EXPORT_SYMBOL(skb_seq_read);
void skb_abort_seq_read(struct skb_seq_state *st)
{
if (st->frag_data)
- kunmap_skb_frag(st->frag_data);
+ skb_frag_kunmap_atomic(st->frag_data);
}
EXPORT_SYMBOL(skb_abort_seq_read);
--
1.7.9.5
* [RFC] [PATCH 6/7] net: add skb_frag_k(un)map convenience functions.
2013-01-25 14:27 [RFC] rebase Ian Campbell's skb fragment tracking to 3.2 Alex Bligh
` (4 preceding siblings ...)
2013-01-25 14:27 ` [RFC] [PATCH 5/7] net: move skb frag kmap functions to skbuff.h Alex Bligh
@ 2013-01-25 14:27 ` Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 7/7] net: return a *const* struct page from skb_frag_page Alex Bligh
6 siblings, 0 replies; 8+ messages in thread
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
From: Ian Campbell <ian.campbell <at> citrix.com>
Signed-off-by: Ian Campbell <ian.campbell <at> citrix.com>
Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: Eric Dumazet <eric.dumazet <at> gmail.com>
Cc: "Michał Mirosław" <mirq-linux <at> rere.qmqm.pl>
Cc: Tom Herbert <therbert <at> google.com>
Cc: Neil Horman <nhorman <at> tuxdriver.com>
Cc: Koki Sanagi <sanagi.koki <at> jp.fujitsu.com>
Cc: linux-kernel <at> vger.kernel.org
Cc: netdev <at> vger.kernel.org
Signed-off-by: Alex Bligh <alex@alex.org.uk>
---
include/linux/skbuff.h | 22 ++++++++++++++++++++++
net/core/datagram.c | 20 ++++++++------------
2 files changed, 30 insertions(+), 12 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 84a54d6..698e4c1 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1868,6 +1868,28 @@ static inline void skb_frag_kunmap_atomic(void *vaddr)
}
/**
+ * skb_frag_kmap - kmaps a paged fragment
+ * @frag: the paged fragment
+ *
+ * kmap()s the paged fragment @frag and returns the virtual address.
+ */
+static inline void *skb_frag_kmap(const skb_frag_t *frag)
+{
+ return kmap(skb_frag_page(frag));
+}
+
+/**
+ * skb_frag_kunmap - kunmaps a paged fragment
+ * @frag: the paged fragment
+ *
+ * kunmap()s the paged fragment @frag.
+ */
+static inline void skb_frag_kunmap(const skb_frag_t *frag)
+{
+ kunmap(skb_frag_page(frag));
+}
+
+/**
* skb_frag_dma_map - maps a paged fragment via the DMA API
* @dev: the device to map the fragment to
* @frag: the paged fragment to map
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 68bbf9f..ac763d1 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -332,14 +332,13 @@ int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
if ((copy = end - offset) > 0) {
int err;
u8 *vaddr;
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
err = memcpy_toiovec(to, vaddr + frag->page_offset +
offset - start, copy);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
if (!(len -= copy))
@@ -418,14 +417,13 @@ int skb_copy_datagram_const_iovec(const struct sk_buff *skb, int offset,
if ((copy = end - offset) > 0) {
int err;
u8 *vaddr;
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
err = memcpy_toiovecend(to, vaddr + frag->page_offset +
offset - start, to_offset, copy);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
if (!(len -= copy))
@@ -508,15 +506,14 @@ int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
if ((copy = end - offset) > 0) {
int err;
u8 *vaddr;
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
err = memcpy_fromiovecend(vaddr + frag->page_offset +
offset - start,
from, from_offset, copy);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
@@ -594,16 +591,15 @@ static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
__wsum csum2;
int err = 0;
u8 *vaddr;
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
csum2 = csum_and_copy_to_user(vaddr +
frag->page_offset +
offset - start,
to, copy, 0, &err);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
*csump = csum_block_add(*csump, csum2, pos);
--
1.7.9.5
* [RFC] [PATCH 7/7] net: return a *const* struct page from skb_frag_page.
2013-01-25 14:27 [RFC] rebase Ian Campbell's skb fragment tracking to 3.2 Alex Bligh
` (5 preceding siblings ...)
2013-01-25 14:27 ` [RFC] [PATCH 6/7] net: add skb_frag_k(un)map convenience functions Alex Bligh
@ 2013-01-25 14:27 ` Alex Bligh
6 siblings, 0 replies; 8+ messages in thread
From: Alex Bligh @ 2013-01-25 14:27 UTC (permalink / raw)
To: netdev
Cc: Stefano Stabellini, Ian Campbell, Alex Bligh, Trond Myklebust,
Mel Gorman
From: Mel Gorman <mgorman@suse.de>
This attempts to catch bare uses of get/put_page (which take a non-const struct
page) on skb paged fragments.
Add __skb_frag_page for those callers which really need a non-const reference
to the page.
Signed-off-by: Ian Campbell <ian.campbell <at> citrix.com>
Cc: "David S. Miller" <davem <at> davemloft.net>
Cc: Eric Dumazet <eric.dumazet <at> gmail.com>
Cc: "Michał Mirosław" <mirq-linux <at> rere.qmqm.pl>
Cc: netdev <at> vger.kernel.org
Signed-off-by: Alex Bligh <alex@alex.org.uk>
---
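Note for readers (not part of the commit): a small userspace sketch of why a
const return type helps catch bare get/put_page-style calls. The types and
functions below (mock_page, mock_frag, mock_frag_page, mock_frag_page_raw,
mock_get_page) are hypothetical stand-ins; mock_frag_page_raw plays the role
that __skb_frag_page plays in this patch, renamed because identifiers with a
leading double underscore are reserved in ordinary C programs.

```c
#include <assert.h>

struct mock_page {
    int refcount;
};

struct mock_frag {
    struct mock_page *page;
};

/* Read-only accessor, in the spirit of skb_frag_page() after this patch. */
static const struct mock_page *mock_frag_page(const struct mock_frag *frag)
{
    return frag->page;
}

/* Escape hatch for the few callers that really need a writable page. */
static struct mock_page *mock_frag_page_raw(const struct mock_frag *frag)
{
    return frag->page;
}

/* Takes a non-const page, like get_page(). */
static void mock_get_page(struct mock_page *page)
{
    assert(page);
    page->refcount++;
}

static void mock_frag_ref(struct mock_frag *frag)
{
    /*
     * Writing mock_get_page(mock_frag_page(frag)) here would make the
     * compiler complain that the const qualifier is discarded, which is
     * the early warning the const return type is meant to provide.
     */
    mock_get_page(mock_frag_page_raw(frag));
}

/* Returns the page refcount after taking one extra reference. */
static int demo_const_page(void)
{
    struct mock_page page = { 1 };
    struct mock_frag frag = { &page };

    mock_frag_ref(&frag);
    return mock_frag_page(&frag)->refcount;
}
```

Read-only users keep calling the const accessor, so only the call sites that
were deliberately switched to the raw accessor can modify the page.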
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 4 ++--
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 2 +-
drivers/net/ethernet/broadcom/bnx2.c | 2 +-
drivers/net/ethernet/intel/e1000/e1000_main.c | 2 +-
drivers/net/ethernet/jme.c | 2 +-
drivers/net/ethernet/sun/cassini.c | 2 +-
drivers/net/ethernet/sun/niu.c | 2 +-
drivers/net/xen-netback/netback.c | 2 +-
drivers/net/xen-netfront.c | 4 ++--
drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 2 +-
drivers/scsi/cxgbi/libcxgbi.c | 2 +-
drivers/scsi/fcoe/fcoe_transport.c | 2 +-
include/linux/skbuff.h | 30 +++++++++++++++++--------
net/core/skbuff.c | 6 +++--
net/core/user_dma.c | 2 +-
net/ipv4/tcp.c | 2 +-
16 files changed, 41 insertions(+), 27 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 014504d..2373cb89 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -538,8 +538,8 @@ static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space,
if (length == 0) {
/* don't need this page */
- skb_fill_page_desc(toskb, i, skb_frag_page(frag),
- 0, PAGE_SIZE);
+ skb_fill_page_desc(toskb, i, __skb_frag_page(frag),
+ 0, PAGE_SIZE);/* XXX */
--skb_shinfo(skb)->nr_frags;
} else {
size = min(length, (unsigned) PAGE_SIZE);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 4115be5..26e4bd7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -328,7 +328,7 @@ static int ipoib_dma_map_tx(struct ib_device *ca,
for (i = 0; i < skb_shinfo(skb)->nr_frags; ++i) {
const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
mapping[i + off] = ib_dma_map_page(ca,
- skb_frag_page(frag),
+ __skb_frag_page(frag),
frag->page_offset, skb_frag_size(frag),
DMA_TO_DEVICE);
if (unlikely(ib_dma_mapping_error(ca, mapping[i + off])))
diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 965c723..2927700 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -2930,7 +2930,7 @@ bnx2_reuse_rx_skb_pages(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr,
shinfo = skb_shinfo(skb);
shinfo->nr_frags--;
- page = skb_frag_page(&shinfo->frags[shinfo->nr_frags]);
+ page = __skb_frag_page(&shinfo->frags[shinfo->nr_frags]);
__skb_frag_set_page(&shinfo->frags[shinfo->nr_frags], NULL);
cons_rx_pg->page = page;
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index cf480b5..c1d6f18 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -2916,7 +2916,7 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
* Avoid terminating buffers within evenly-aligned
* dwords. */
bufend = (unsigned long)
- page_to_phys(skb_frag_page(frag));
+ page_to_phys(__skb_frag_page(frag));
bufend += offset + size - 1;
if (unlikely(adapter->pcix_82544 &&
!(bufend & 4) &&
diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index 76b8457..833b322 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -2036,7 +2036,7 @@ jme_map_tx_skb(struct jme_adapter *jme, struct sk_buff *skb, int idx)
ctxbi = txbi + ((idx + i + 2) & (mask));
jme_fill_tx_map(jme->pdev, ctxdesc, ctxbi,
- skb_frag_page(frag),
+ __skb_frag_page(frag),
frag->page_offset, skb_frag_size(frag), hidma);
}
diff --git a/drivers/net/ethernet/sun/cassini.c b/drivers/net/ethernet/sun/cassini.c
index fd40988..61f6e93 100644
--- a/drivers/net/ethernet/sun/cassini.c
+++ b/drivers/net/ethernet/sun/cassini.c
@@ -2841,7 +2841,7 @@ static inline int cas_xmit_tx_ringN(struct cas *cp, int ring,
ctrl, 0);
entry = TX_DESC_NEXT(ring, entry);
- addr = cas_page_map(skb_frag_page(fragp));
+ addr = cas_page_map(__skb_frag_page(fragp));
memcpy(tx_tiny_buf(cp, ring, entry),
addr + fragp->page_offset + len - tabort,
tabort);
diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index 73c7081..85fe5c2 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -6730,7 +6730,7 @@ static netdev_tx_t niu_start_xmit(struct sk_buff *skb,
const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
len = skb_frag_size(frag);
- mapping = np->ops->map_page(np->device, skb_frag_page(frag),
+ mapping = np->ops->map_page(np->device, __skb_frag_page(frag),
frag->page_offset, len,
DMA_TO_DEVICE);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 15e332d..43ff39b 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -525,7 +525,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
for (i = 0; i < nr_frags; i++) {
netbk_gop_frag_copy(vif, skb, npo,
- skb_frag_page(&skb_shinfo(skb)->frags[i]),
+ __skb_frag_page(&skb_shinfo(skb)->frags[i]),
skb_frag_size(&skb_shinfo(skb)->frags[i]),
skb_shinfo(skb)->frags[i].page_offset,
&head);
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 226faab..6d3d36b 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -770,7 +770,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
__skb_fill_page_desc(skb, nr_frags,
- skb_frag_page(nfrag),
+ __skb_frag_page(nfrag),
rx->offset, rx->status);
skb->data_len += rx->status;
@@ -954,7 +954,7 @@ err:
}
NETFRONT_SKB_CB(skb)->page =
- skb_frag_page(&skb_shinfo(skb)->frags[0]);
+ __skb_frag_page(&skb_shinfo(skb)->frags[0]);
NETFRONT_SKB_CB(skb)->offset = rx->offset;
len = rx->status;
diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index 8c6156a..cd45c0f 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -322,7 +322,7 @@ static int bnx2fc_xmit(struct fc_lport *lport, struct fc_frame *fp)
return -ENOMEM;
}
frag = &skb_shinfo(skb)->frags[skb_shinfo(skb)->nr_frags - 1];
- cp = kmap_atomic(skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ)
+ cp = kmap_atomic(__skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ)
+ frag->page_offset;
} else {
cp = (struct fcoe_crc_eof *)skb_put(skb, tlen);
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index c10f74a..eba623a 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -1949,7 +1949,7 @@ int cxgbi_conn_init_pdu(struct iscsi_task *task, unsigned int offset,
/* data fits in the skb's headroom */
for (i = 0; i < tdata->nr_frags; i++, frag++) {
- char *src = kmap_atomic(frag->page,
+ char *src = kmap_atomic(__skb_frag_page(frag),
KM_SOFTIRQ0);
memcpy(dst, src+frag->offset, frag->size);
diff --git a/drivers/scsi/fcoe/fcoe_transport.c b/drivers/scsi/fcoe/fcoe_transport.c
index bd97b22..3cc35f3 100644
--- a/drivers/scsi/fcoe/fcoe_transport.c
+++ b/drivers/scsi/fcoe/fcoe_transport.c
@@ -210,7 +210,7 @@ u32 fcoe_fc_crc(struct fc_frame *fp)
while (len > 0) {
clen = min(len, PAGE_SIZE - (off & ~PAGE_MASK));
data = kmap_atomic(
- skb_frag_page(frag) + (off >> PAGE_SHIFT),
+ __skb_frag_page(frag) + (off >> PAGE_SHIFT),
KM_SKB_DATA_SOFTIRQ);
crc = crc32(crc, data + (off & ~PAGE_MASK), clen);
kunmap_atomic(data, KM_SKB_DATA_SOFTIRQ);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 698e4c1..4adf576 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1728,12 +1728,24 @@ static inline void netdev_free_page(struct net_device *dev, struct page *page)
}
/**
- * skb_frag_page - retrieve the page refered to by a paged fragment
+ * __skb_frag_page - retrieve the page referred to by a paged fragment
* @frag: the paged fragment
*
- * Returns the &struct page associated with @frag.
+ * Returns the &struct page associated with @frag. Where possible you
+ * should use skb_frag_page() which returns a const &struct page.
*/
-static inline struct page *skb_frag_page(const skb_frag_t *frag)
+static inline struct page *__skb_frag_page(const skb_frag_t *frag)
+{
+ return frag->page.p;
+}
+
+/**
+ * skb_frag_page - retrieve the page referred to by a paged fragment
+ * @frag: the paged fragment
+ *
+ * Returns the &struct page associated with @frag as a const.
+ */
+static inline const struct page *skb_frag_page(const skb_frag_t *frag)
{
return frag->page.p;
}
@@ -1753,7 +1765,7 @@ static inline void __skb_frag_ref(skb_frag_t *frag)
skb_frag_destructor_ref(frag->page.destructor);
return;
}
- get_page(skb_frag_page(frag));
+ get_page(__skb_frag_page(frag));
}
/**
@@ -1780,7 +1792,7 @@ static inline void __skb_frag_unref(skb_frag_t *frag)
skb_frag_destructor_unref(frag->page.destructor);
return;
}
- put_page(skb_frag_page(frag));
+ put_page(__skb_frag_page(frag));
}
/**
@@ -1856,7 +1868,7 @@ static inline void *skb_frag_kmap_atomic(const skb_frag_t *frag)
local_bh_disable();
#endif
- return kmap_atomic(skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ);
+ return kmap_atomic(__skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ);
}
static inline void skb_frag_kunmap_atomic(void *vaddr)
@@ -1875,7 +1887,7 @@ static inline void skb_frag_kunmap_atomic(void *vaddr)
*/
static inline void *skb_frag_kmap(const skb_frag_t *frag)
{
- return kmap(skb_frag_page(frag));
+ return kmap(__skb_frag_page(frag));
}
/**
@@ -1886,7 +1898,7 @@ static inline void *skb_frag_kmap(const skb_frag_t *frag)
*/
static inline void skb_frag_kunmap(const skb_frag_t *frag)
{
- kunmap(skb_frag_page(frag));
+ kunmap(__skb_frag_page(frag));
}
/**
@@ -1905,7 +1917,7 @@ static inline dma_addr_t skb_frag_dma_map(struct device *dev,
size_t offset, size_t size,
enum dma_data_direction dir)
{
- return dma_map_page(dev, skb_frag_page(frag),
+ return dma_map_page(dev, __skb_frag_page(frag),
frag->page_offset + offset, size, dir);
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 047f38f..5f20a48 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1653,7 +1653,8 @@ static int __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
for (seg = 0; seg < skb_shinfo(skb)->nr_frags; seg++) {
const skb_frag_t *f = &skb_shinfo(skb)->frags[seg];
- if (__splice_segment(skb_frag_page(f),
+ /* XXX */
+ if (__splice_segment(__skb_frag_page(f),
f->page_offset, skb_frag_size(f),
offset, len, skb, spd, 0, sk, pipe))
return 1;
@@ -2965,7 +2966,8 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
if (copy > len)
copy = len;
- sg_set_page(&sg[elt], skb_frag_page(frag), copy,
+ /* XXX */
+ sg_set_page(&sg[elt], __skb_frag_page(frag), copy,
frag->page_offset+offset-start);
elt++;
if (!(len -= copy))
diff --git a/net/core/user_dma.c b/net/core/user_dma.c
index 1b5fefd..1c51151 100644
--- a/net/core/user_dma.c
+++ b/net/core/user_dma.c
@@ -79,7 +79,7 @@ int dma_skb_copy_datagram_iovec(struct dma_chan *chan,
end = start + skb_frag_size(frag);
copy = end - offset;
if (copy > 0) {
- struct page *page = skb_frag_page(frag);
+ struct page *page = __skb_frag_page(frag); /* XXX */
if (copy > len)
copy = len;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 56ef323..28afcf7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3037,7 +3037,7 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
for (i = 0; i < shi->nr_frags; ++i) {
const struct skb_frag_struct *f = &shi->frags[i];
- struct page *page = skb_frag_page(f);
+ struct page *page = __skb_frag_page(f); /* XXX */
sg_set_page(&sg, page, skb_frag_size(f), f->page_offset);
if (crypto_hash_update(desc, &sg, skb_frag_size(f)))
return 1;
--
1.7.9.5
end of thread, other threads:[~2013-01-25 14:37 UTC | newest]
Thread overview: 8+ messages
2013-01-25 14:27 [RFC] rebase Ian Campbell's skb fragment tracking to 3.2 Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 1/7] net: add support for per-paged-fragment destructors Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 2/7] net: only allow paged fragments with the same destructor to be coalesced Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 3/7] net: add paged frag destructor support to kernel_sendpage Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 4/7] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 5/7] net: move skb frag kmap functions to skbuff.h Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 6/7] net: add skb_frag_k(un)map convenience functions Alex Bligh
2013-01-25 14:27 ` [RFC] [PATCH 7/7] net: return a *const* struct page from skb_frag_page Alex Bligh