* Single packet receiving in multiple ring buffers
From: Keyur Chudgar @ 2008-05-22 1:35 UTC (permalink / raw)
To: netdev
Hi,
I have a question about packets that are received into multiple ring
buffers from a hardware interface when the packet size is larger than a
single ring buffer entry.
The way I have generally seen this done is:
1. At initialization, allocate an skb of MTU size for each ring buffer entry
2. If a packet arrives in multiple ring buffers, copy those buffers into
a single new skb
3. Pass the single skb to netif_rx
The copy in step 2 can be done either with memcpy or with DMA, but
either way a copy is needed.
Does anybody know of a way to avoid the copy when a single packet is
received in multiple ring buffers?
Thanks,
- Keyur
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-22 6:40 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: netdev
Hi.
On Wed, May 21, 2008 at 06:35:18PM -0700, Keyur Chudgar (kchudgar.linux@gmail.com) wrote:
> The way I have generally seen this done is:
> 1. At initialization, allocate an skb of MTU size for each ring buffer entry
> 2. If a packet arrives in multiple ring buffers, copy those buffers into
> a single new skb
> 3. Pass the single skb to netif_rx
>
> The copy in step 2 can be done either with memcpy or with DMA, but
> either way a copy is needed.
>
> Does anybody know of a way to avoid the copy when a single packet is
> received in multiple ring buffers?
The card can DMA into a list of pages, and those pages can then be
attached to the skb. It is also possible to allocate a new skb, put the
remainder of the data there and chain the skbs together, but that is not
the preferred way; use the skb's array of page fragments instead.
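As a rough sketch of the page-based scheme (an illustration only: rx_desc, its fields and RX_HDR_ROOM are hypothetical names, and a real driver also unmaps DMA and handles per-slot allocation failures):

```c
/* Sketch: build one skb from N hardware descriptors, each of which
 * DMAed into a page that was pre-posted to the ring.  No payload is
 * copied; the pages become skb fragments.
 */
static struct sk_buff *build_rx_skb(struct rx_desc *desc, int nr_bufs,
				    unsigned int total_len)
{
	struct sk_buff *skb;
	unsigned int len;
	int i;

	skb = netdev_alloc_skb(desc->netdev, RX_HDR_ROOM);
	if (!skb)
		return NULL;

	for (i = 0; i < nr_bufs; i++) {
		len = min(total_len, (unsigned int)PAGE_SIZE);
		/* attach the page as fragment i: zero-copy */
		skb_fill_page_desc(skb, i, desc[i].page, 0, len);
		skb->len      += len;
		skb->data_len += len;
		skb->truesize += PAGE_SIZE;
		total_len     -= len;
	}
	return skb;
}
```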
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-22 18:14 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: Keyur Chudgar, netdev
On Thu, May 22, 2008 at 11:00:23AM -0700, Keyur Chudgar (kchudgar@amcc.com) wrote:
> Hi,
>
> Following is what I would like to understand. I am working with an
> Ethernet MAC that is capable of chaining multiple fixed-size buffers
> together to form a single packet. For example, a 2k packet would fit
> into two 1k hardware buffers. So, at initialization time, I configure
> the MAC buffer descriptor in the following way. I allocate two skbs
> (i.e. dev_alloc_skb(1024)) and pass the skb->data pointers to the
A nitpick: I hope that is not NIC initialization time, since an skb can
live a very long time in the stack, and if you statically bind a single
skb to a hardware descriptor at NIC init time you will quickly run short
of skbs.
> MAC buffer descriptors as shown in the diagram below.
>
> ---------------      -------------------
> | skb1->data  |----> | First part of   |
> ---------------      | packet in buf1  |
>                      -------------------
>
> ---------------      -------------------
> | skb2->data  |----> | Second part of  |
> ---------------      | same packet in  |
>                      | buf2            |
>                      -------------------
>
> So, when a packet is received by the MAC, it will give me (the rx
> portion of the driver) the two descriptors associated with that single
> 2k packet. Here, I would like to avoid copying. To this end, what
> options do I have available at the driver level so that the stack gets
> a single skb with multiple chained buffers? In short, I would like to
> represent a single packet comprised of multiple skbs. So, when the
> stack frees this packet, it will have to have a way of freeing all of
> the underlying skbs.
>
> Will your previous suggestion apply to the above scenario? If not, do
> I have any other options than to do memcpy on Rx, in the driver?
You have to chain the second skb into the frag_list of the first one.
But that is not the preferred solution; better to attach a page to each
hardware descriptor and put the pages from each part into a single skb
via skb_fill_page_desc().
Or you can play with parts of pages for DMA if your hardware does not
allow DMA of large regions from a single descriptor (as in your example
above, where each HW descriptor is limited to 1k).
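The frag_list chaining mentioned in the first sentence looks roughly like this (a minimal sketch only; the helper name is hypothetical):

```c
/* Sketch: chain skb2 behind skb1 via the frag_list, so the stack sees
 * one packet and frees both skbs when it frees the head.  Not the
 * preferred scheme; page fragments are cheaper for the stack to handle.
 */
static void chain_rx_skbs(struct sk_buff *skb1, struct sk_buff *skb2)
{
	skb_shinfo(skb1)->frag_list = skb2;
	skb1->len      += skb2->len;
	skb1->data_len += skb2->len;	/* bytes not in the linear area */
	skb1->truesize += skb2->truesize;
}
```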
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-22 18:49 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: Keyur Chudgar, netdev
On Thu, May 22, 2008 at 11:27:27AM -0700, Keyur Chudgar (kchudgar@amcc.com) wrote:
> > Better attach a page to each hardware descriptor and put pages from
> > each part into a single skb via skb_fill_page_desc().
>
> This is a good idea. But I can see that if I do this, then all the
> data in the skb lives in the fraglist, and skb->data contains nothing.
> If that is the case, then the stack (or anybody else) would not be
> able to use the helpers that work on the data pointer, like
> skb_push(), skb_pull() and skb_put(), or even the header pointers like
> h, nh and mac.
> Can you please give some thoughts on this?
You have to copy part of the data (up to and including the transport
headers) into skb->data.
Or you can DMA part of the packet into skb->data and the rest into
attached pages, which reside in skb_shinfo(skb)->frags (set up via
skb_fill_page_desc()).
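The split-DMA setup described here can be sketched as follows (rx_desc, its fields and HDR_LEN are hypothetical names; a real driver also tracks ring indices and unmaps the buffers on completion):

```c
/* Sketch: program the first hardware buffer of each packet slot to
 * point at skb->data and the following buffers at pages, so headers
 * are DMAed into the linear area and large-packet overflow lands in
 * pages that later become skb frags.
 */
static void setup_packet_slot(struct device *dev, struct rx_desc *desc,
			      struct sk_buff *skb, struct page **pages,
			      int nr_pages)
{
	int i;

	/* headers (and small packets) go straight into the linear area */
	desc[0].dma = dma_map_single(dev, skb->data, HDR_LEN,
				     DMA_FROM_DEVICE);
	desc[0].len = HDR_LEN;

	/* overflow for large packets goes into pages */
	for (i = 0; i < nr_pages; i++) {
		desc[i + 1].dma = dma_map_page(dev, pages[i], 0, PAGE_SIZE,
					       DMA_FROM_DEVICE);
		desc[i + 1].len = PAGE_SIZE;
	}
}
```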
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: David Miller @ 2008-05-22 18:57 UTC (permalink / raw)
To: johnpol; +Cc: kchudgar, kchudgar.linux, netdev
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Thu, 22 May 2008 22:49:07 +0400
> You have to copy part of the data (up to and including transport headers)
> into skb->data.
You don't necessarily have to. The input paths of the networking
stack will pull into skb->data, as needed, using
pskb_may_pull() calls.
I did some tests with the NIU driver on receive, and to be
honest there was nearly zero performance gain from preemptively
copying into skb->data from the paged SKB area before
passing the packet in via netif_receive_skb().
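The on-demand pull pattern looks roughly like this in a protocol input routine (a sketch; example_rcv is a hypothetical name — the real IP input path does this in ip_rcv()):

```c
/* Sketch: a protocol input routine pulls headers into the linear area
 * on demand; if the first bytes still sit in page fragments,
 * pskb_may_pull() copies just enough of them into skb->data.
 */
static int example_rcv(struct sk_buff *skb)
{
	struct iphdr *iph;

	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
		goto drop;		/* packet shorter than an IP header */

	iph = ip_hdr(skb);		/* now safe: the header is linear */
	if (!pskb_may_pull(skb, iph->ihl * 4))
		goto drop;		/* options, if any, made linear too */
	return 0;
drop:
	kfree_skb(skb);
	return -1;
}
```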
* Re: Single packet receiving in multiple ring buffers
From: David Miller @ 2008-05-23 7:09 UTC (permalink / raw)
To: kchudgar; +Cc: johnpol, kchudgar.linux, netdev
From: "Keyur Chudgar" <kchudgar@amcc.com>
Date: Thu, 22 May 2008 12:41:07 -0700
> Was this test run for only driver level or for some forwarding
> application?
Just stream data transfer tests.
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-23 12:00 UTC (permalink / raw)
To: David Miller; +Cc: kchudgar, kchudgar.linux, netdev
On Thu, May 22, 2008 at 11:57:55AM -0700, David Miller (davem@davemloft.net) wrote:
> > You have to copy part of the data (up to and including transport headers)
> > into skb->data.
>
> You don't necessarily have to. The input paths of the networking
> stack will pull into the skb->data, as needed, using
> pskb_may_pull() calls.
>
> I did some tests with the NIU driver on receive, and to be
> honest there was nearly zero performance gain from pre-copying
> into skb->data from the paged SKB area preemptively before
> passing the packet in via netif_receive_skb().
Well, yes, essentially it should not make a major difference who copies
those bits into skb->data when/if needed, although the MAC header has to
be there, otherwise eth_type_trans() will not work: it does a plain
skb_pull(). If eth_type_trans() is not used (i.e. there is some other
way to know the protocol number), things should be OK without the copy.
So, from a performance viewpoint it looks like the best decision is to
have the first part of the packet in skb->data and the rest in attached
pages.
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-25 13:00 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: David Miller, kchudgar.linux, netdev
On Fri, May 23, 2008 at 09:08:10AM -0700, Keyur Chudgar (kchudgar@amcc.com) wrote:
> > So, from a performance viewpoint it looks like the best decision is
> > to have the first part of the packet in skb->data and the rest in
> > attached pages.
>
> If we follow this approach and have the first part of the packet in
> skb->data and the rest in attached pages, we will need to copy from a
> page (which was initially programmed into a hardware buffer) into
> skb->data (which is newly allocated) every time, even if the packet
> fit in a single buffer. We cannot intermix skb->data pointers and page
> addresses in the buffers programmed into the hardware.
I meant that you initialize the hardware buffers to point first to
skb->data and then to pages, not always to pages. In this case the
controller will DMA data into skb->data and the rest of the packet into
the attached pages. If it is not convenient to use such a scheme, you
can always put the data into pages and then copy a tiny bit into
skb->data to be able to specify the IP layer protocol; all the rest the
system will do for itself via pskb_may_copy() if needed.
Essentially, copying such a small amount of data, like the 40-60 bytes
of all the headers, is so cheap that you will be unlikely to detect it
among register access, allocation noise, TCP state processing and the
eventual copy to userspace, but you can of course still try to optimize
it out.
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: David Miller @ 2008-05-25 13:04 UTC (permalink / raw)
To: johnpol; +Cc: kchudgar, kchudgar.linux, netdev
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Sun, 25 May 2008 17:00:24 +0400
> If it is not convenient to use such a scheme, you can always put the
> data into pages and then copy a tiny bit into skb->data to be able to
> specify the IP layer protocol; all the rest the system will do for
> itself via pskb_may_copy() if needed.
The network input uses "pskb_may_pull()" not "pskb_may_copy()" :-)
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-25 13:31 UTC (permalink / raw)
To: David Miller; +Cc: kchudgar, kchudgar.linux, netdev
On Sun, May 25, 2008 at 06:04:34AM -0700, David Miller (davem@davemloft.net) wrote:
> > If it is not convenient to use such a scheme, you can always put
> > the data into pages and then copy a tiny bit into skb->data to be
> > able to specify the IP layer protocol; all the rest the system will
> > do for itself via pskb_may_copy() if needed.
>
> The network input uses "pskb_may_pull()" not "pskb_may_copy()" :-)
For those who like the pskb_may_copy() name, here is a patch :)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index bbd8d00..1fb5723 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -990,6 +990,11 @@ static inline int pskb_may_pull(struct sk_buff *skb, unsigned int len)
 	return __pskb_pull_tail(skb, len-skb_headlen(skb)) != NULL;
 }
 
+static inline int pskb_may_copy(struct sk_buff *skb, unsigned int len)
+{
+	return pskb_may_pull(skb, len);
+}
+
 /**
  * skb_headroom - bytes at buffer head
  * @skb: buffer to check
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-27 17:41 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: David Miller, kchudgar.linux, netdev
Hi.
On Tue, May 27, 2008 at 10:36:08AM -0700, Keyur Chudgar (kchudgar@amcc.com) wrote:
> Thanks for the suggestions. I will try both the schemes you mentioned
> below and see how it goes.
Hope it helps :)
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-29 14:45 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: David Miller, kchudgar.linux, netdev
Hi.
On Thu, May 29, 2008 at 07:14:45AM -0700, Keyur Chudgar (kchudgar@amcc.com) wrote:
> I cannot program some hardware buffers with skb->data and others with
> pages, because I don't know at runtime what size packets will come in.
> If I program the first buffer with skb->data and the other two with
> pages, what if the packet that arrives only fits in the first buffer?
> The other two buffers are wasted in this case.
Does your hardware support MTU (or something like the e1000, which
supports packets with sizes up to the next power of two over the MTU)?
> > If it is not convenient to use such a scheme, you can always put
> > the data into pages and then copy a tiny bit into skb->data to be
> > able to specify the IP layer protocol; all the rest the system will
> > do for itself via pskb_may_copy() if needed.
>
> What is the minimum size of the pages that I can allocate? Is it 4K?
> If that is the case, then I may be wasting a lot of memory on smaller
> packets.
For small enough packets you can allocate a new skb of the exact size
and copy the data from the old page into the new skb. That will be much
faster, since sockets will be charged with a smaller size per packet.
But usually you have to preallocate at least MTU-sized buffers.
Page size differs per arch; on x86 it is 4k by default.
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-29 15:35 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: David Miller, kchudgar.linux, netdev
On Thu, May 29, 2008 at 08:06:25AM -0700, Keyur Chudgar (kchudgar@amcc.com) wrote:
> > Does your hardware support MTU (or something like the e1000, which
> > supports packets with sizes up to the next power of two over the
> > MTU)?
>
> Yes, it does support MTU. The problem I am discussing is for jumbo
> frames only.
Then I do not exactly understand the problem. You allocate, say, a
1500-byte skb and a number of pages (use PAGE_SIZE for the arch the
driver is loaded on); depending on the arch you may even use a single
page for the whole packet. Then set up the descriptors according to the
sizes (the first one will have a size of 1500 bytes, the second and
third ones will be PAGE_SIZE).
With jumbo frames you will waste some descriptors and some pages, but
small-sized traffic will only hit the descriptor associated with
skb->data, so you detach the pages from that skb, submit it to the
stack, then allocate a new skb and attach the pages to it. If the size
is small enough you can allocate a new small skb and copy the data there
(I recall this is called copybreak) and submit it to the stack instead
of the one the DMA completed into.
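The copybreak decision itself is simple; here is a runnable userspace sketch of just the size test (the COPYBREAK value and buffer handling are illustrative assumptions, not from any particular driver):

```c
#include <stdlib.h>
#include <string.h>

#define COPYBREAK 256	/* illustrative threshold, not from a real driver */

/* Sketch of the copybreak idea in plain C: for small packets, copy the
 * data into an exact-size buffer so the big pre-posted receive buffer
 * can be reused; for large packets, hand the receive buffer itself up.
 * Returns the buffer to pass up; *consumed tells the caller whether the
 * original receive buffer was handed upward (and must be replaced).
 */
char *rx_copybreak(char *rx_buf, size_t len, int *consumed)
{
	if (len < COPYBREAK) {
		char *small = malloc(len);
		if (!small)
			return NULL;
		memcpy(small, rx_buf, len);	/* cheap for tiny packets */
		*consumed = 0;			/* rx_buf can be reposted */
		return small;
	}
	*consumed = 1;				/* rx_buf handed upward */
	return rx_buf;
}
```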
> > Page size is different on each arch, for x86 it is 4k by default.
> OK.
You can check PAGE_SIZE for the particular arch the NIC is used on.
--
Evgeniy Polyakov
* Re: Single packet receiving in multiple ring buffers
From: Evgeniy Polyakov @ 2008-05-29 16:41 UTC (permalink / raw)
To: Keyur Chudgar; +Cc: David Miller, kchudgar.linux, netdev
On Thu, May 29, 2008 at 09:09:30AM -0700, Keyur Chudgar (kchudgar@amcc.com) wrote:
> > With jumbo frames you will waste some descriptors and some pages,
> > but small-sized traffic will only hit the descriptor associated with
> > skb->data, so you detach the pages from that skb, submit it to the
> > stack, then allocate a new skb and attach the pages to it. If the
> > size is small enough you can allocate a new small skb and copy the
> > data there (I recall this is called copybreak) and submit it to the
> > stack instead of the one the DMA completed into.
>
> This will be a good idea. Can you let me know how I can detach and
> attach the pages from/to an skb?
There is no simple function to detach a page, but attaching is done via
skb_fill_page_desc(). You can check how sky2 does it, for example.
--
Evgeniy Polyakov