* [PATCH 46/75] s2io: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev; +Cc: linux-kernel, Ian Campbell, Jon Mason
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/net/s2io.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index 277d48b..8437a05 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -4190,10 +4190,10 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
if (!frag->size)
continue;
txdp++;
- txdp->Buffer_Pointer = (u64)pci_map_page(sp->pdev, frag->page,
- frag->page_offset,
- frag->size,
- PCI_DMA_TODEVICE);
+ txdp->Buffer_Pointer = (u64)skb_frag_dma_map(&sp->pdev->dev,
+ frag, 0,
+ frag->size,
+ PCI_DMA_TODEVICE);
txdp->Control_1 = TXD_BUFFER0_SIZE(frag->size);
if (offload_type == SKB_GSO_UDP)
txdp->Control_1 |= TXD_UFO_EN;
--
1.7.2.5
^ permalink raw reply related
* [PATCH 54/75] tehuti: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev; +Cc: linux-kernel, Ian Campbell, Alexander Indenbaum, Andy Gospodarek
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Alexander Indenbaum <baum@tehutinetworks.net>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/net/tehuti.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tehuti.c b/drivers/net/tehuti.c
index 749bbf1..b2c11db 100644
--- a/drivers/net/tehuti.c
+++ b/drivers/net/tehuti.c
@@ -1497,9 +1497,9 @@ bdx_tx_map_skb(struct bdx_priv *priv, struct sk_buff *skb,
frag = &skb_shinfo(skb)->frags[i];
db->wptr->len = frag->size;
- db->wptr->addr.dma =
- pci_map_page(priv->pdev, frag->page, frag->page_offset,
- frag->size, PCI_DMA_TODEVICE);
+ db->wptr->addr.dma = skb_frag_dma_map(&priv->pdev->dev, frag,
+ 0, frag->size,
+ PCI_DMA_TODEVICE);
pbl++;
pbl->len = CPU_CHIP_SWAP32(db->wptr->len);
--
1.7.2.5
^ permalink raw reply related
* [PATCH 55/75] tg3: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, Ian Campbell, Matt Carlson, Michael Chan,
Grant Likely, devicetree-discuss
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: devicetree-discuss@lists.ozlabs.org
---
drivers/net/tg3.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index dc3fbf6..68a1a2c 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -6280,10 +6280,8 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
len = frag->size;
- mapping = pci_map_page(tp->pdev,
- frag->page,
- frag->page_offset,
- len, PCI_DMA_TODEVICE);
+ mapping = skb_frag_dma_map(&tp->pdev->dev, frag, 0,
+ len, PCI_DMA_TODEVICE);
tnapi->tx_buffers[entry].skb = NULL;
dma_unmap_addr_set(&tnapi->tx_buffers[entry], mapping,
--
1.7.2.5
^ permalink raw reply related
* [PATCH 59/75] virtionet: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, Ian Campbell, Rusty Russell, Michael S. Tsirkin,
virtualization
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/net/virtio_net.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0c7321c..52667a8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -149,7 +149,7 @@ static void set_skb_frag(struct sk_buff *skb, struct page *page,
f = &skb_shinfo(skb)->frags[i];
f->size = min((unsigned)PAGE_SIZE - offset, *len);
f->page_offset = offset;
- f->page = page;
+ __skb_frag_set_page(f, page);
skb->data_len += f->size;
skb->len += f->size;
--
1.7.2.5
^ permalink raw reply related
* [PATCH 60/75] vmxnet3: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev; +Cc: linux-kernel, Ian Campbell, Shreyas Bhatewara, VMware, Inc.
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
Cc: "VMware, Inc." <pv-drivers@vmware.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/net/vmxnet3/vmxnet3_drv.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 1cbacb3..1acb5c8 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -654,7 +654,7 @@ vmxnet3_append_frag(struct sk_buff *skb, struct Vmxnet3_RxCompDesc *rcd,
BUG_ON(skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS);
- frag->page = rbi->page;
+ __skb_frag_set_page(frag, rbi->page);
frag->page_offset = 0;
frag->size = rcd->len;
skb->data_len += frag->size;
@@ -748,9 +748,9 @@ vmxnet3_map_pkt(struct sk_buff *skb, struct vmxnet3_tx_ctx *ctx,
tbi = tq->buf_info + tq->tx_ring.next2fill;
tbi->map_type = VMXNET3_MAP_PAGE;
- tbi->dma_addr = pci_map_page(adapter->pdev, frag->page,
- frag->page_offset, frag->size,
- PCI_DMA_TODEVICE);
+ tbi->dma_addr = skb_frag_dma_map(&adapter->pdev->dev, frag,
+ 0, frag->size,
+ PCI_DMA_TODEVICE);
tbi->len = frag->size;
--
1.7.2.5
^ permalink raw reply related
* [PATCH 61/75] vxge: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev; +Cc: linux-kernel, Ian Campbell, Jon Mason
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/net/vxge/vxge-main.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/vxge/vxge-main.c b/drivers/net/vxge/vxge-main.c
index 178348a2..4baec97 100644
--- a/drivers/net/vxge/vxge-main.c
+++ b/drivers/net/vxge/vxge-main.c
@@ -923,9 +923,9 @@ vxge_xmit(struct sk_buff *skb, struct net_device *dev)
if (!frag->size)
continue;
- dma_pointer = (u64) pci_map_page(fifo->pdev, frag->page,
- frag->page_offset, frag->size,
- PCI_DMA_TODEVICE);
+ dma_pointer = (u64)skb_frag_dma_map(&fifo->pdev->dev, frag,
+ 0, frag->size,
+ PCI_DMA_TODEVICE);
if (unlikely(pci_dma_mapping_error(fifo->pdev, dma_pointer)))
goto _exit2;
--
1.7.2.5
^ permalink raw reply related
* Re: [PATCH/RFC v3 0/75] enable SKB paged fragment lifetime visibility
From: David Miller @ 2011-08-19 13:29 UTC (permalink / raw)
To: Ian.Campbell-Sxgqhf6Nn4DQT0dZR+AlfA
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1313760393.5010.356.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>
From: Ian Campbell <Ian.Campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
Date: Fri, 19 Aug 2011 14:26:33 +0100
> This is v3 of my series to enable visibility into SKB paged fragment's
Please tone down the patch count :-/ I'm not going to review anything more
than ~20 or so patches at a time from any one person.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH 71/75] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, Ian Campbell, David S. Miller, Neil Brown,
J. Bruce Fields, linux-nfs
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
This prevents an issue where an ACK is delayed, a retransmit is queued (either
at the RPC or TCP level) and the ACK arrives before the retransmission hits the
wire. If this happens to an NFS WRITE RPC then the write() system call
completes and the userspace process can continue, potentially modifying data
referenced by the retransmission before the retransmission occurs.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
[since v1:
Push down from NFS layer into RPM layer
]
---
include/linux/sunrpc/xdr.h | 2 ++
include/linux/sunrpc/xprt.h | 5 ++++-
net/sunrpc/clnt.c | 27 ++++++++++++++++++++++-----
net/sunrpc/svcsock.c | 3 ++-
net/sunrpc/xprt.c | 13 +++++++++++++
net/sunrpc/xprtsock.c | 3 ++-
6 files changed, 45 insertions(+), 8 deletions(-)
diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index a20970e..172f81e 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -16,6 +16,7 @@
#include <asm/byteorder.h>
#include <asm/unaligned.h>
#include <linux/scatterlist.h>
+#include <linux/skbuff.h>
/*
* Buffer adjustment
@@ -57,6 +58,7 @@ struct xdr_buf {
tail[1]; /* Appended after page data */
struct page ** pages; /* Array of contiguous pages */
+ struct skb_frag_destructor *destructor;
unsigned int page_base, /* Start of page data */
page_len, /* Length of page data */
flags; /* Flags for data disposition */
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 15518a1..75131eb 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -92,7 +92,10 @@ struct rpc_rqst {
/* A cookie used to track the
state of the transport
connection */
-
+ struct skb_frag_destructor destructor; /* SKB paged fragment
+ * destructor for
+ * transmitted pages*/
+
/*
* Partial send handling
*/
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index c5347d2..919538d 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -61,6 +61,7 @@ static void call_reserve(struct rpc_task *task);
static void call_reserveresult(struct rpc_task *task);
static void call_allocate(struct rpc_task *task);
static void call_decode(struct rpc_task *task);
+static void call_complete(struct rpc_task *task);
static void call_bind(struct rpc_task *task);
static void call_bind_status(struct rpc_task *task);
static void call_transmit(struct rpc_task *task);
@@ -1113,6 +1114,8 @@ rpc_xdr_encode(struct rpc_task *task)
(char *)req->rq_buffer + req->rq_callsize,
req->rq_rcvsize);
+ req->rq_snd_buf.destructor = &req->destructor;
+
p = rpc_encode_header(task);
if (p == NULL) {
printk(KERN_INFO "RPC: couldn't encode RPC header, exit EIO\n");
@@ -1276,6 +1279,7 @@ call_connect_status(struct rpc_task *task)
static void
call_transmit(struct rpc_task *task)
{
+ struct rpc_rqst *req = task->tk_rqstp;
dprint_status(task);
task->tk_action = call_status;
@@ -1309,8 +1313,8 @@ call_transmit(struct rpc_task *task)
call_transmit_status(task);
if (rpc_reply_expected(task))
return;
- task->tk_action = rpc_exit_task;
- rpc_wake_up_queued_task(&task->tk_xprt->pending, task);
+ task->tk_action = call_complete;
+ skb_frag_destructor_unref(&req->destructor);
}
/*
@@ -1383,7 +1387,8 @@ call_bc_transmit(struct rpc_task *task)
return;
}
- task->tk_action = rpc_exit_task;
+ task->tk_action = call_complete;
+ skb_frag_destructor_unref(&req->destructor);
if (task->tk_status < 0) {
printk(KERN_NOTICE "RPC: Could not send backchannel reply "
"error: %d\n", task->tk_status);
@@ -1423,7 +1428,6 @@ call_bc_transmit(struct rpc_task *task)
"error: %d\n", task->tk_status);
break;
}
- rpc_wake_up_queued_task(&req->rq_xprt->pending, task);
}
#endif /* CONFIG_SUNRPC_BACKCHANNEL */
@@ -1589,12 +1593,14 @@ call_decode(struct rpc_task *task)
return;
}
- task->tk_action = rpc_exit_task;
+ task->tk_action = call_complete;
if (decode) {
task->tk_status = rpcauth_unwrap_resp(task, decode, req, p,
task->tk_msg.rpc_resp);
}
+ rpc_sleep_on(&req->rq_xprt->pending, task, NULL);
+ skb_frag_destructor_unref(&req->destructor);
dprintk("RPC: %5u call_decode result %d\n", task->tk_pid,
task->tk_status);
return;
@@ -1609,6 +1615,17 @@ out_retry:
}
}
+/*
+ * 8. Wait for pages to be released by the network stack.
+ */
+static void
+call_complete(struct rpc_task *task)
+{
+ dprintk("RPC: %5u call_complete result %d\n",
+ task->tk_pid, task->tk_status);
+ task->tk_action = rpc_exit_task;
+}
+
static __be32 *
rpc_encode_header(struct rpc_task *task)
{
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 852a258..3685cad 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -196,7 +196,8 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
while (pglen > 0) {
if (slen == size)
flags = 0;
- result = kernel_sendpage(sock, *ppage, NULL, base, size, flags);
+ result = kernel_sendpage(sock, *ppage, xdr->destructor,
+ base, size, flags);
if (result > 0)
len += result;
if (result != size)
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index f4385e4..925aa0c 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1103,6 +1103,16 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
xprt->xid = net_random();
}
+static int xprt_complete_skb_pages(void *calldata)
+{
+ struct rpc_task *task = calldata;
+ struct rpc_rqst *req = task->tk_rqstp;
+
+ dprintk("RPC: %5u completing skb pages\n", task->tk_pid);
+ rpc_wake_up_queued_task(&req->rq_xprt->pending, task);
+ return 0;
+}
+
static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
{
struct rpc_rqst *req = task->tk_rqstp;
@@ -1115,6 +1125,9 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
req->rq_xid = xprt_alloc_xid(xprt);
req->rq_release_snd_buf = NULL;
xprt_reset_majortimeo(req);
+ atomic_set(&req->destructor.ref, 1);
+ req->destructor.destroy = &xprt_complete_skb_pages;
+ req->destructor.data = task;
dprintk("RPC: %5u reserved req %p xid %08x\n", task->tk_pid,
req, ntohl(req->rq_xid));
}
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index f79e40e..af3a106 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -408,7 +408,8 @@ static int xs_send_pagedata(struct socket *sock, struct xdr_buf *xdr, unsigned i
remainder -= len;
if (remainder != 0 || more)
flags |= MSG_MORE;
- err = sock->ops->sendpage(sock, *ppage, NULL, base, len, flags);
+ err = sock->ops->sendpage(sock, *ppage, xdr->destructor,
+ base, len, flags);
if (remainder == 0 || err != len)
break;
sent += err;
--
1.7.2.5
^ permalink raw reply related
* [PATCH 74/75] net: add skb_frag_k(un)map convenience functions.
From: Ian Campbell @ 2011-08-19 13:27 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, Ian Campbell, David S. Miller, Eric Dumazet,
Michał Mirosław, Tom Herbert, Neil Horman, Koki Sanagi
In-Reply-To: <1313760393.5010.356.camel@zakaz.uk.xensource.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Tom Herbert <therbert@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
---
include/linux/skbuff.h | 22 ++++++++++++++++++++++
net/core/datagram.c | 20 ++++++++------------
2 files changed, 30 insertions(+), 12 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 73d8f7a..0328428 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1839,6 +1839,28 @@ static inline void skb_frag_kunmap_atomic(void *vaddr)
}
/**
+ * skb_frag_kmap - kmaps a paged fragment
+ * @frag: the paged fragment
+ *
+ * kmap()s the paged fragment @frag and returns the virtual address.
+ */
+static inline void *skb_frag_kmap(skb_frag_t *frag)
+{
+ return kmap(skb_frag_page(frag));
+}
+
+/**
+ * skb_frag_kunmap - kunmaps a paged fragment
+ * @frag: the paged fragment
+ *
+ * kunmap()s the paged fragment @frag.
+ */
+static inline void skb_frag_kunmap(skb_frag_t *frag)
+{
+ kunmap(skb_frag_page(frag));
+}
+
+/**
* skb_frag_dma_map - maps a paged fragment via the DMA API
* @device: the device to map the fragment to
* @frag: the paged fragment to map
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 6449bed..f0dcaa2 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -332,14 +332,13 @@ int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
int err;
u8 *vaddr;
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
err = memcpy_toiovec(to, vaddr + frag->page_offset +
offset - start, copy);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
if (!(len -= copy))
@@ -418,14 +417,13 @@ int skb_copy_datagram_const_iovec(const struct sk_buff *skb, int offset,
int err;
u8 *vaddr;
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
err = memcpy_toiovecend(to, vaddr + frag->page_offset +
offset - start, to_offset, copy);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
if (!(len -= copy))
@@ -508,15 +506,14 @@ int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
int err;
u8 *vaddr;
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
err = memcpy_fromiovecend(vaddr + frag->page_offset +
offset - start,
from, from_offset, copy);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
@@ -594,16 +591,15 @@ static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
int err = 0;
u8 *vaddr;
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
- struct page *page = skb_frag_page(frag);
if (copy > len)
copy = len;
- vaddr = kmap(page);
+ vaddr = skb_frag_kmap(frag);
csum2 = csum_and_copy_to_user(vaddr +
frag->page_offset +
offset - start,
to, copy, 0, &err);
- kunmap(page);
+ skb_frag_kunmap(frag);
if (err)
goto fault;
*csump = csum_block_add(*csump, csum2, pos);
--
1.7.2.5
^ permalink raw reply related
* Re: [PATCH/RFC v3 0/75] enable SKB paged fragment lifetime visibility
From: David Miller @ 2011-08-19 13:34 UTC (permalink / raw)
To: Ian.Campbell; +Cc: netdev, linux-nfs
In-Reply-To: <20110819.062958.1560313540665582388.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Fri, 19 Aug 2011 06:29:58 -0700 (PDT)
> From: Ian Campbell <Ian.Campbell@citrix.com>
> Date: Fri, 19 Aug 2011 14:26:33 +0100
>
>> This is v3 of my series to enable visibility into SKB paged fragment's
>
> Please tone down the patch count :-/ I'm not going to review anything more
> than ~20 or so patches at a time from any one person.
Also none of your patches will even apply to net-next GIT since nearly
all the ethernet drivers have been moved under drivers/net/ethernet
^ permalink raw reply
* Re: [PATCH 68/75] hv: netvsc: convert to SKB paged frag API.
From: Dan Carpenter @ 2011-08-19 13:37 UTC (permalink / raw)
To: Ian Campbell
Cc: devel, Abhishek Kane, netdev, Haiyang Zhang, Greg Kroah-Hartman,
linux-kernel
In-Reply-To: <1313760467-8598-68-git-send-email-ian.campbell@citrix.com>
On Fri, Aug 19, 2011 at 02:27:40PM +0100, Ian Campbell wrote:
> /* usually there's just one frag */
> - skb_frag_set_page(skb, 0, gl->frags[0].page);
> + skb_frag_set_page(skb, 0, gl->frags[0].page.p); /* XXX */
There are bunch of new /* XXX */ comments. What are they for?
regards,
dan carpenter
^ permalink raw reply
* Re: [PATCH/RFC v3 0/75] enable SKB paged fragment lifetime visibility
From: Ian Campbell @ 2011-08-19 13:45 UTC (permalink / raw)
To: David Miller; +Cc: netdev@vger.kernel.org, linux-nfs@vger.kernel.org
In-Reply-To: <20110819.063410.475175569086272526.davem@davemloft.net>
On Fri, 2011-08-19 at 14:34 +0100, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Fri, 19 Aug 2011 06:29:58 -0700 (PDT)
>
> > From: Ian Campbell <Ian.Campbell@citrix.com>
> > Date: Fri, 19 Aug 2011 14:26:33 +0100
> >
> >> This is v3 of my series to enable visibility into SKB paged fragment's
> >
> > Please tone down the patch count :-/ I'm not going to review anything more
> > than ~20 or so patches at a time from any one person.
Unfortunately I didn't see any middle ground between one massive patch
munging drivers with lots of different maintainers and one patch per
maintainer (which is loads).
I'll resend in batches of ~20. After the first patch all the "convert to
SKB paged frag API" patches are independent.
Would you be willing to just review patches 1-20 of this series? In
which case I could avoid spamming the list with those again now.
> Also none of your patches will even apply to net-next GIT since nearly
> all the ethernet drivers have been moved under drivers/net/ethernet
Yes, I noted that in my intro mail and will rebase for v4. Since it's
mostly just been code motion I think review in the old place will still
be largely valid.
Ian.
^ permalink raw reply
* Re: [PATCH 68/75] hv: netvsc: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:46 UTC (permalink / raw)
To: Dan Carpenter
Cc: devel@driverdev.osuosl.org, Abhishek Kane, netdev@vger.kernel.org,
Haiyang Zhang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org
In-Reply-To: <20110819133739.GA3715@shale.localdomain>
On Fri, 2011-08-19 at 14:37 +0100, Dan Carpenter wrote:
> On Fri, Aug 19, 2011 at 02:27:40PM +0100, Ian Campbell wrote:
> > /* usually there's just one frag */
> > - skb_frag_set_page(skb, 0, gl->frags[0].page);
> > + skb_frag_set_page(skb, 0, gl->frags[0].page.p); /* XXX */
>
> There are bunch of new /* XXX */ comments. What are they for?
They are a reminder referring to the dilemma I mentioned in the intro
mail. Basically that ".p" is ugly because gl->frags[0].page isn't
actually a paged fragment, it's just that this driver uses skb_frag_t in
its internal datastructures.
Ian.
^ permalink raw reply
* [PATCH] net: add the comment for skb->l4_hash
From: Changli Gao @ 2011-08-19 13:45 UTC (permalink / raw)
To: David S. Miller; +Cc: Eric Dumazet, Tom Herbert, netdev
---
include/linux/skbuff.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f902c33..10109a0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -322,6 +322,8 @@ typedef unsigned char *sk_buff_data_t;
* @queue_mapping: Queue mapping for multiqueue devices
* @ndisc_nodetype: router type (from link layer)
* @ooo_okay: allow the mapping of a socket to a queue to be changed
+ * @l4_hash: indicate rxhash is a canonical 4-tuple hash over transport
+ * ports.
* @dma_cookie: a cookie to one of several possible DMA operations
* done by skb DMA functions
* @secmark: security marking
^ permalink raw reply related
* Re: [PATCH 68/75] hv: netvsc: convert to SKB paged frag API.
From: Ian Campbell @ 2011-08-19 13:50 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, Hank Janssen, Haiyang Zhang, Greg Kroah-Hartman,
K. Y. Srinivasan, Abhishek Kane, devel
In-Reply-To: <1313760467-8598-68-git-send-email-ian.campbell@citrix.com>
On Fri, 2011-08-19 at 14:27 +0100, Ian Campbell wrote:
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Cc: Hank Janssen <hjanssen@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Greg Kroah-Hartman <gregkh@suse.de>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Abhishek Kane <v-abkane@microsoft.com>
> Cc: devel@driverdev.osuosl.org
> Cc: linux-kernel@vger.kernel.org
> ---
> drivers/net/cxgb4/sge.c | 14 +++++-----
> drivers/net/cxgb4vf/sge.c | 18 ++++++------
> drivers/net/mlx4/en_rx.c | 2 +-
> drivers/scsi/cxgbi/libcxgbi.c | 2 +-
> drivers/staging/hv/netvsc_drv.c | 2 +-
> include/linux/skbuff.h | 54 ++++++++++++++++++++++++++++++++++++---
> net/core/skbuff.c | 26 +++++++++++++++++-
> 7 files changed, 93 insertions(+), 25 deletions(-)
Somewhere along the line I've managed to rebase/merge the "net: add
support for per-paged-fragment destructors" patch into the preceding
patch :-( Everything here apart from the drivers/staging/hv/netvsc_drv.c
change should be a changeset with the following comment. I'll fix for
the next post.
net: add support for per-paged-fragment destructors
Entities which care about the complete lifecycle of pages which they inject
into the network stack via an skb paged fragment can choose to set this
destructor in order to receive a callback when the stack is really finished
with a page (including all clones, retransmits, pull-ups etc etc).
This destructor will always be propagated alongside the struct page when
copying skb_frag_t->page. This is the reason I chose to embed the destructor in
a "struct { } page" within the skb_frag_t, rather than as a separate field,
since it allows existing code which propagates ->frags[N].page to Just
Work(tm).
When the destructor is present the page reference counting is done slightly
differently. No references are held by the network stack on the struct page (it
is up to the caller to manage this as necessary) instead the network stack will
track references via the count embedded in the destructor structure. When this
reference count reaches zero then the destructor will be called and the caller
can take the necesary steps to release the page (i.e. release the struct page
reference itself).
The intention is that callers can use this callback to delay completion to
_their_ callers until the network stack has completely released the page, in
order to prevent use-after-free or modification of data pages which are still
in use by the stack.
It is allowable (indeed expected) for a caller to share a single destructor
instance between multiple pages injected into the stack e.g. a group of pages
included in a single higher level operation might share a destructor which is
used to complete that higher level operation.
NB: a small number of drivers use skb_frag_t independently of struct sk_buff so
this patch is slightly larger than necessary. I did consider leaving skb_frag_t
alone and defining a new (but similar) structure to be used in the struct
sk_buff itself. This would also have the advantage of more clearly separating
the two uses, which is useful since there are now special reference counting
accessors for skb_frag_t within a struct sk_buff but not (necessarily) for
those used outside of an skb.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
>
> diff --git a/drivers/net/cxgb4/sge.c b/drivers/net/cxgb4/sge.c
> index f1813b5..3e7c4b3 100644
> --- a/drivers/net/cxgb4/sge.c
> +++ b/drivers/net/cxgb4/sge.c
> @@ -1416,7 +1416,7 @@ static inline void copy_frags(struct sk_buff *skb,
> unsigned int n;
>
> /* usually there's just one frag */
> - skb_frag_set_page(skb, 0, gl->frags[0].page);
> + skb_frag_set_page(skb, 0, gl->frags[0].page.p); /* XXX */
> ssi->frags[0].page_offset = gl->frags[0].page_offset + offset;
> ssi->frags[0].size = gl->frags[0].size - offset;
> ssi->nr_frags = gl->nfrags;
> @@ -1425,7 +1425,7 @@ static inline void copy_frags(struct sk_buff *skb,
> memcpy(&ssi->frags[1], &gl->frags[1], n * sizeof(skb_frag_t));
>
> /* get a reference to the last page, we don't own it */
> - get_page(gl->frags[n].page);
> + get_page(gl->frags[n].page.p); /* XXX */
> }
>
> /**
> @@ -1482,7 +1482,7 @@ static void t4_pktgl_free(const struct pkt_gl *gl)
> const skb_frag_t *p;
>
> for (p = gl->frags, n = gl->nfrags - 1; n--; p++)
> - put_page(p->page);
> + put_page(p->page.p); /* XXX */
> }
>
> /*
> @@ -1635,7 +1635,7 @@ static void restore_rx_bufs(const struct pkt_gl *si, struct sge_fl *q,
> else
> q->cidx--;
> d = &q->sdesc[q->cidx];
> - d->page = si->frags[frags].page;
> + d->page = si->frags[frags].page.p; /* XXX */
> d->dma_addr |= RX_UNMAPPED_BUF;
> q->avail++;
> }
> @@ -1717,7 +1717,7 @@ static int process_responses(struct sge_rspq *q, int budget)
> for (frags = 0, fp = si.frags; ; frags++, fp++) {
> rsd = &rxq->fl.sdesc[rxq->fl.cidx];
> bufsz = get_buf_size(rsd);
> - fp->page = rsd->page;
> + fp->page.p = rsd->page; /* XXX */
> fp->page_offset = q->offset;
> fp->size = min(bufsz, len);
> len -= fp->size;
> @@ -1734,8 +1734,8 @@ static int process_responses(struct sge_rspq *q, int budget)
> get_buf_addr(rsd),
> fp->size, DMA_FROM_DEVICE);
>
> - si.va = page_address(si.frags[0].page) +
> - si.frags[0].page_offset;
> + si.va = page_address(si.frags[0].page.p) +
> + si.frags[0].page_offset; /* XXX */
>
> prefetch(si.va);
>
> diff --git a/drivers/net/cxgb4vf/sge.c b/drivers/net/cxgb4vf/sge.c
> index 6d6060e..3688423 100644
> --- a/drivers/net/cxgb4vf/sge.c
> +++ b/drivers/net/cxgb4vf/sge.c
> @@ -1397,7 +1397,7 @@ struct sk_buff *t4vf_pktgl_to_skb(const struct pkt_gl *gl,
> skb_copy_to_linear_data(skb, gl->va, pull_len);
>
> ssi = skb_shinfo(skb);
> - skb_frag_set_page(skb, 0, gl->frags[0].page);
> + skb_frag_set_page(skb, 0, gl->frags[0].page.p); /* XXX */
> ssi->frags[0].page_offset = gl->frags[0].page_offset + pull_len;
> ssi->frags[0].size = gl->frags[0].size - pull_len;
> if (gl->nfrags > 1)
> @@ -1410,7 +1410,7 @@ struct sk_buff *t4vf_pktgl_to_skb(const struct pkt_gl *gl,
> skb->truesize += skb->data_len;
>
> /* Get a reference for the last page, we don't own it */
> - get_page(gl->frags[gl->nfrags - 1].page);
> + get_page(gl->frags[gl->nfrags - 1].page.p); /* XXX */
> }
>
> out:
> @@ -1430,7 +1430,7 @@ void t4vf_pktgl_free(const struct pkt_gl *gl)
>
> frag = gl->nfrags - 1;
> while (frag--)
> - put_page(gl->frags[frag].page);
> + put_page(gl->frags[frag].page.p); /* XXX */
> }
>
> /**
> @@ -1450,7 +1450,7 @@ static inline void copy_frags(struct sk_buff *skb,
> unsigned int n;
>
> /* usually there's just one frag */
> - skb_frag_set_page(skb, 0, gl->frags[0].page);
> + skb_frag_set_page(skb, 0, gl->frags[0].page.p); /* XXX */
> si->frags[0].page_offset = gl->frags[0].page_offset + offset;
> si->frags[0].size = gl->frags[0].size - offset;
> si->nr_frags = gl->nfrags;
> @@ -1460,7 +1460,7 @@ static inline void copy_frags(struct sk_buff *skb,
> memcpy(&si->frags[1], &gl->frags[1], n * sizeof(skb_frag_t));
>
> /* get a reference to the last page, we don't own it */
> - get_page(gl->frags[n].page);
> + get_page(gl->frags[n].page.p); /* XXX */
> }
>
> /**
> @@ -1613,7 +1613,7 @@ static void restore_rx_bufs(const struct pkt_gl *gl, struct sge_fl *fl,
> else
> fl->cidx--;
> sdesc = &fl->sdesc[fl->cidx];
> - sdesc->page = gl->frags[frags].page;
> + sdesc->page = gl->frags[frags].page.p; /* XXX */
> sdesc->dma_addr |= RX_UNMAPPED_BUF;
> fl->avail++;
> }
> @@ -1701,7 +1701,7 @@ int process_responses(struct sge_rspq *rspq, int budget)
> BUG_ON(rxq->fl.avail == 0);
> sdesc = &rxq->fl.sdesc[rxq->fl.cidx];
> bufsz = get_buf_size(sdesc);
> - fp->page = sdesc->page;
> + fp->page.p = sdesc->page; /* XXX */
> fp->page_offset = rspq->offset;
> fp->size = min(bufsz, len);
> len -= fp->size;
> @@ -1719,8 +1719,8 @@ int process_responses(struct sge_rspq *rspq, int budget)
> dma_sync_single_for_cpu(rspq->adapter->pdev_dev,
> get_buf_addr(sdesc),
> fp->size, DMA_FROM_DEVICE);
> - gl.va = (page_address(gl.frags[0].page) +
> - gl.frags[0].page_offset);
> + gl.va = (page_address(gl.frags[0].page.p) +
> + gl.frags[0].page_offset); /* XXX */
> prefetch(gl.va);
>
> /*
> diff --git a/drivers/net/mlx4/en_rx.c b/drivers/net/mlx4/en_rx.c
> index 0c26ee7..bd4e86e 100644
> --- a/drivers/net/mlx4/en_rx.c
> +++ b/drivers/net/mlx4/en_rx.c
> @@ -418,7 +418,7 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
> break;
>
> /* Save page reference in skb */
> - __skb_frag_set_page(&skb_frags_rx[nr], skb_frags[nr].page);
> + __skb_frag_set_page(&skb_frags_rx[nr], skb_frags[nr].page.p); /* XXX */
> skb_frags_rx[nr].size = skb_frags[nr].size;
> skb_frags_rx[nr].page_offset = skb_frags[nr].page_offset;
> dma = be64_to_cpu(rx_desc->data[nr].addr);
> diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
> index 0debb06..43cc6c6 100644
> --- a/drivers/scsi/cxgbi/libcxgbi.c
> +++ b/drivers/scsi/cxgbi/libcxgbi.c
> @@ -1823,7 +1823,7 @@ static int sgl_read_to_frags(struct scatterlist *sg, unsigned int sgoffset,
> return -EINVAL;
> }
>
> - frags[i].page = page;
> + frags[i].page.p = page;
> frags[i].page_offset = sg->offset + sgoffset;
> frags[i].size = copy;
> i++;
> diff --git a/drivers/staging/hv/netvsc_drv.c b/drivers/staging/hv/netvsc_drv.c
> index 61989f0..4fc7b3e 100644
> --- a/drivers/staging/hv/netvsc_drv.c
> +++ b/drivers/staging/hv/netvsc_drv.c
> @@ -171,7 +171,7 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
> for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> skb_frag_t *f = &skb_shinfo(skb)->frags[i];
>
> - packet->page_buf[i+2].pfn = page_to_pfn(f->page);
> + packet->page_buf[i+2].pfn = page_to_pfn(skb_frag_page(f));
> packet->page_buf[i+2].offset = f->page_offset;
> packet->page_buf[i+2].len = f->size;
> }
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index c015e61..c6a3d4b 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -134,8 +134,17 @@ struct sk_buff;
>
> typedef struct skb_frag_struct skb_frag_t;
>
> +struct skb_frag_destructor {
> + atomic_t ref;
> + int (*destroy)(void *data);
> + void *data;
> +};
> +
> struct skb_frag_struct {
> - struct page *page;
> + struct {
> + struct page *p;
> + struct skb_frag_destructor *destructor;
> + } page;
> #if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
> __u32 page_offset;
> __u32 size;
> @@ -1128,6 +1137,31 @@ static inline int skb_pagelen(const struct sk_buff *skb)
> }
>
> /**
> + * skb_frag_set_destructor - set destructor for a paged fragment
> + * @skb: buffer containing fragment to be initialised
> + * @i: paged fragment index to initialise
> + * @destroy: the destructor to use for this fragment
> + *
> + * Sets @destroy as the destructor to be called when all references to
> + * the frag @i in @skb (tracked over skb_clone, retransmit, pull-ups,
> + * etc) are released.
> + *
> + * When a destructor is set then reference counting is performed on
> + * @destroy->ref. When the ref reaches zero then @destroy->destroy
> + * will be called. The caller is responsible for holding and managing
> + * any other references (such a the struct page reference count).
> + *
> + * This function must be called before any use of skb_frag_ref() or
> + * skb_frag_unref().
> + */
> +static inline void skb_frag_set_destructor(struct sk_buff *skb, int i,
> + struct skb_frag_destructor *destroy)
> +{
> + skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
> + frag->page.destructor = destroy;
> +}
> +
> +/**
> * __skb_fill_page_desc - initialise a paged fragment in an skb
> * @skb: buffer containing fragment to be initialised
> * @i: paged fragment index to initialise
> @@ -1145,7 +1179,8 @@ static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
> {
> skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
>
> - frag->page = page;
> + frag->page.p = page;
> + frag->page.destructor = NULL;
> frag->page_offset = off;
> frag->size = size;
> }
> @@ -1669,9 +1704,12 @@ static inline void netdev_free_page(struct net_device *dev, struct page *page)
> */
> static inline struct page *skb_frag_page(const skb_frag_t *frag)
> {
> - return frag->page;
> + return frag->page.p;
> }
>
> +extern void skb_frag_destructor_ref(struct skb_frag_destructor *destroy);
> +extern void skb_frag_destructor_unref(struct skb_frag_destructor *destroy);
> +
> /**
> * __skb_frag_ref - take an addition reference on a paged fragment.
> * @frag: the paged fragment
> @@ -1680,6 +1718,10 @@ static inline struct page *skb_frag_page(const skb_frag_t *frag)
> */
> static inline void __skb_frag_ref(skb_frag_t *frag)
> {
> + if (unlikely(frag->page.destructor)) {
> + skb_frag_destructor_ref(frag->page.destructor);
> + return;
> + }
> get_page(skb_frag_page(frag));
> }
>
> @@ -1703,6 +1745,10 @@ static inline void skb_frag_ref(struct sk_buff *skb, int f)
> */
> static inline void __skb_frag_unref(skb_frag_t *frag)
> {
> + if (unlikely(frag->page.destructor)) {
> + skb_frag_destructor_unref(frag->page.destructor);
> + return;
> + }
> put_page(skb_frag_page(frag));
> }
>
> @@ -1755,7 +1801,7 @@ static inline void *skb_frag_address_safe(const skb_frag_t *frag)
> */
> static inline void __skb_frag_set_page(skb_frag_t *frag, struct page *page)
> {
> - frag->page = page;
> + frag->page.p = page;
> __skb_frag_ref(frag);
> }
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 418f3435..906a3e7 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -292,6 +292,23 @@ struct sk_buff *dev_alloc_skb(unsigned int length)
> }
> EXPORT_SYMBOL(dev_alloc_skb);
>
> +void skb_frag_destructor_ref(struct skb_frag_destructor *destroy)
> +{
> + BUG_ON(destroy == NULL);
> + atomic_inc(&destroy->ref);
> +}
> +EXPORT_SYMBOL(skb_frag_destructor_ref);
> +
> +void skb_frag_destructor_unref(struct skb_frag_destructor *destroy)
> +{
> + if (destroy == NULL)
> + return;
> +
> + if (atomic_dec_and_test(&destroy->ref))
> + destroy->destroy(destroy->data);
> +}
> +EXPORT_SYMBOL(skb_frag_destructor_unref);
> +
> static void skb_drop_list(struct sk_buff **listp)
> {
> struct sk_buff *list = *listp;
> @@ -642,14 +659,19 @@ static int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
>
> /* skb frags release userspace buffers */
> for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
> - put_page(skb_shinfo(skb)->frags[i].page);
> + skb_frag_unref(skb, i);
>
> uarg->callback(uarg);
>
> /* skb frags point to kernel buffers */
> for (i = skb_shinfo(skb)->nr_frags; i > 0; i--) {
> skb_shinfo(skb)->frags[i - 1].page_offset = 0;
> - skb_shinfo(skb)->frags[i - 1].page = head;
> + skb_shinfo(skb)->frags[i - 1].page.p = head;
> + /*
> + * The original destructor is called as necessary when
> + * releasing userspace buffers.
> + */
> + skb_shinfo(skb)->frags[i - 1].page.destructor = NULL;
> head = (struct page *)head->private;
> }
> return 0;
^ permalink raw reply
* Re: [PATCH] net: add the comment for skb->l4_hash
From: David Miller @ 2011-08-19 13:51 UTC (permalink / raw)
To: xiaosuo; +Cc: eric.dumazet, therbert, netdev
In-Reply-To: <1313761541-1109-1-git-send-email-xiaosuo@gmail.com>
Please fully resubmit with a proper signoff.
Thank you.
^ permalink raw reply
* Re: Traceroute and "ping" sockets: some questions
From: Dmitry Butskoy @ 2011-08-19 13:56 UTC (permalink / raw)
To: netdev
In-Reply-To: <20110819.062249.2234872106788628654.davem@davemloft.net>
David Miller wrote:
>>> Why limit? He can spam with UDP socket just as easily at any rate
>>> he pleases,
>>>
>> Yes, but most cases such UDP is "one-way" spam (until services like
>> "echo 7/udp" are enabled).
>> For icmp echo, we normally receive icmp replies, hence it is
>> bidirectional crap. Which was not present before.
>>
> Well, replace UDP with TCP syn flood.
>
Well.
Why then there is "net/ipv4/ping_group_range" restrictions, with default
values (low=1, high=0) which denies this way even for root?
Regards,
Dmitry
^ permalink raw reply
* Re: [PATCH] net: add the comment for skb->l4_hash
From: Changli Gao @ 2011-08-19 14:00 UTC (permalink / raw)
To: David Miller; +Cc: eric.dumazet, therbert, netdev
In-Reply-To: <20110819.065119.890658681263456287.davem@davemloft.net>
On Fri, Aug 19, 2011 at 9:51 PM, David Miller <davem@davemloft.net> wrote:
>
>
> Please fully resubmit with a proper signoff.
>
Oh, sorry. when editing this patch, my poor vim showed an empty line
instead, so I removed it by mistake. I'll resubmit it.
--
Regards,
Changli Gao(xiaosuo@gmail.com)
^ permalink raw reply
* Re: Traceroute and "ping" sockets: some questions
From: David Miller @ 2011-08-19 14:00 UTC (permalink / raw)
To: buc; +Cc: netdev
In-Reply-To: <4E4E6B8A.1030707@odu.neva.ru>
From: Dmitry Butskoy <buc@odusz.so-cdu.ru>
Date: Fri, 19 Aug 2011 17:56:26 +0400
> Why then there is "net/ipv4/ping_group_range" restrictions, with
> default values (low=1, high=0) which denies this way even for root?
While we shake out the bugs and better understand the security
implications of this feature. One this is resolved we can deprecate
this.
^ permalink raw reply
* Re: [PATCH/RFC v3 0/75] enable SKB paged fragment lifetime visibility
From: David Miller @ 2011-08-19 14:04 UTC (permalink / raw)
To: Ian.Campbell; +Cc: netdev, linux-nfs
In-Reply-To: <20110819.063410.475175569086272526.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Fri, 19 Aug 2011 06:34:10 -0700 (PDT)
> From: David Miller <davem@davemloft.net>
> Date: Fri, 19 Aug 2011 06:29:58 -0700 (PDT)
>
>> From: Ian Campbell <Ian.Campbell@citrix.com>
>> Date: Fri, 19 Aug 2011 14:26:33 +0100
>>
>>> This is v3 of my series to enable visibility into SKB paged fragment's
>>
>> Please tone down the patch count :-/ I'm not going to review anything more
>> than ~20 or so patches at a time from any one person.
>
> Also none of your patches will even apply to net-next GIT since nearly
> all the ethernet drivers have been moved under drivers/net/ethernet
Ian, please acknowledge my grievances here. I see you replying to other
people, but not to me and I'm the one who has to process all of this
stuff eventually.
If you want this series to be taken seriously:
1) Make your patches against net-next GIT, none of your driver patches will
apply because they have all been moved around to different locations
under drivers/net
2) Submit this in a _SANE_ way. This means, get the first patch that adds
the new interfaces approved and merged. Then slowly and carefuly submit
small, reasonably sized, sets of patches that convert the drivers over.
Otherwise there is no way I'm even looking at this stuff, let alone actually
apply it.
Realize that every time you patch bomb rediculiously like this I have
to click and classify every single patch in your bomb at
http://patchwork.ozlabs.org/project/netdev/list/
^ permalink raw reply
* Re: [PATCH net-next v4 af-packet 0/2] Enhance af-packet to provide (near zero)lossless packet capture functionality.
From: chetan loke @ 2011-08-19 14:12 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20110819.033311.1204611705865122620.davem@davemloft.net>
On Fri, Aug 19, 2011 at 6:33 AM, David Miller <davem@davemloft.net> wrote:
> From: Chetan Loke <loke.chetan@gmail.com>
> Date: Thu, 11 Aug 2011 22:30:58 -0400
>
>> This patch attempts to:
>> 1)Improve network capture visibility by increasing packet density
>> 2)Assist in analyzing multiple(aggregated) capture ports.
>
> Just some general comments before I say something about the actual
> code.
>
> Claiming loss-less, or almost loss-less, packet capture is something
> which is highly dependent upon the hardware and traffic load.
Totally agree. I have added the text in [Patch 2/2], bullet 1.3 -
capture 99% traffic as seen by the kernel. So lossless is relative to
'the traffic processed by the kernel' and not to the (inline)hardware.
But yes, the patch subject itself could be misleading. Not trying to
argue but just wanted to make sure we are all on the same page.
> So I want you to describe this as what it is, an improvement. More
> specifically it's just a more flexible mmap ring buffer scheme that
> reduces the wastage incurred by the fixed buffer size scheme we
> have currently.
>
Oki. I will change the cover-subject to 'Enhance af-packet to provide
a flexible mmap ring buffer scheme eliminating fixed frame-size
requirement'.
> Next, you need to use a different subject line for the two patches.
> Right now you say the same thing in both, and someone scanning the
> changesets headers will have no idea what is different about one
> change vs. the other.
>
Done.
Chetan Loke
^ permalink raw reply
* [PATCH] net: add the comment for skb->l4_hash
From: Changli Gao @ 2011-08-19 14:14 UTC (permalink / raw)
To: David S. Miller; +Cc: Eric Dumazet, Tom Herbert, netdev, Changli Gao
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
include/linux/skbuff.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f902c33..10109a0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -322,6 +322,8 @@ typedef unsigned char *sk_buff_data_t;
* @queue_mapping: Queue mapping for multiqueue devices
* @ndisc_nodetype: router type (from link layer)
* @ooo_okay: allow the mapping of a socket to a queue to be changed
+ * @l4_hash: indicate rxhash is a canonical 4-tuple hash over transport
+ * ports.
* @dma_cookie: a cookie to one of several possible DMA operations
* done by skb DMA functions
* @secmark: security marking
^ permalink raw reply related
* Re: [PATCH] net: add the comment for skb->l4_hash
From: David Miller @ 2011-08-19 14:19 UTC (permalink / raw)
To: xiaosuo; +Cc: eric.dumazet, therbert, netdev
In-Reply-To: <1313763252-3235-1-git-send-email-xiaosuo@gmail.com>
From: Changli Gao <xiaosuo@gmail.com>
Date: Fri, 19 Aug 2011 22:14:12 +0800
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
It was obvious to me this time, but please in the future indicate in
"[PATCH]" that a change is targetted for net-next as this one was.
Applied, thanks.
^ permalink raw reply
* [PATCH v2] net-next: add the comment for skb->l4_rxhash
From: Changli Gao @ 2011-08-19 14:20 UTC (permalink / raw)
To: David S. Miller; +Cc: Eric Dumazet, Tom Herbert, netdev, Changli Gao
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
include/linux/skbuff.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f902c33..10109a0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -322,6 +322,8 @@ typedef unsigned char *sk_buff_data_t;
* @queue_mapping: Queue mapping for multiqueue devices
* @ndisc_nodetype: router type (from link layer)
* @ooo_okay: allow the mapping of a socket to a queue to be changed
+ * @l4_rxhash: indicate rxhash is a canonical 4-tuple hash over transport
+ * ports.
* @dma_cookie: a cookie to one of several possible DMA operations
* done by skb DMA functions
* @secmark: security marking
^ permalink raw reply related
* Re: [PATCH net-next v4 af-packet 2/2] Enhance af-packet to provide (near zero)lossless packet capture functionality.
From: chetan loke @ 2011-08-19 14:22 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20110819.033613.1959565749346388734.davem@davemloft.net>
On Fri, Aug 19, 2011 at 6:36 AM, David Miller <davem@davemloft.net> wrote:
> From: Chetan Loke <loke.chetan@gmail.com>
> Date: Thu, 11 Aug 2011 22:31:00 -0400
>
>> +struct kbdq_ft_ops {
>> + int num_ops;
>> + void (*ft_ops[2])(void *, void *);
>> +};
> ...
>> + struct kbdq_ft_ops kfops;
> ...
>> +static void prb_init_ft_ops(struct kbdq_core *p1,
>> + union tpacket_req_u *req_u)
>> +{
>> + p1->kfops.ft_ops[p1->kfops.num_ops++] = prb_fill_vlan_info;
>> +
>> + if (req_u->req3.tp_feature_req_word) {
>> + if (req_u->req3.tp_feature_req_word & TP_FT_REQ_FILL_RXHASH)
>> + p1->kfops.ft_ops[p1->kfops.num_ops++] = prb_fill_rxhash;
>> + else
>> + p1->kfops.ft_ops[p1->kfops.num_ops++] =
>> + prb_clear_rxhash;
>> + }
>> +}
>
> It is a lot cheaper to just test the flags in-line than do indirect calls.
> Indirect calls are very expensive on many cpus.
>
> In fact, since the first op (prb_fill_vlan_info) is unconditional, we eat
> the indirect call cost for absolutely no reason at all.
Oki. Will test the flags in-line in prb_run_all_ft_ops().
>
> This kfops stuff was not present in your previous changes. And I'm
> going to tell you that if you keep adding things each revision instead
> of just fixing the specific items you've received feedback about, a
> set of changes this invasive and of this size will never get merged.
>
First, thanks(to you and others) for reviewing the huge patchset. I
did think about breaking it into multiple patches but since it
involves changes in a single file it was a little difficult. And I
didn't want to create a separate file for this.
> Please resist the urge to further tinker with the code, and just
> address the feedback we give you.
>
Sorry won't happen. These changes should have been present in v3 when
I added the feature-request word because that was the intent of that
word. It was a screw up on my part.
Chetan Loke
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox