* TX_RING and VLAN : (packet size is too long 1518 > 1514)
@ 2014-02-23 11:03 Mathias Kretschmer
2014-02-23 13:58 ` Daniel Borkmann
0 siblings, 1 reply; 7+ messages in thread
From: Mathias Kretschmer @ 2014-02-23 11:03 UTC (permalink / raw)
To: dborkman; +Cc: jbrouer, netdev
Hi Daniel, all
we're running into the above error when sending full-sized VLAN-tagged
frames via TX_RING (Kernel 3.10.31, but same code seems to be present in
3.14).
For the regular send() calls, there seems to be code in place to handle
VLAN-tagged frames, but in tpacket_fill_skb() there is no such check.
I came up with fix for this, but can only verify this tomorrow.
If it works, I'd send you a patch, but someone with more insights into
the networking code should probably fix this properly ;-)
Cheers,
Mathias
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514)
2014-02-23 11:03 TX_RING and VLAN : (packet size is too long 1518 > 1514) Mathias Kretschmer
@ 2014-02-23 13:58 ` Daniel Borkmann
2014-02-24 9:03 ` Mathias Kretschmer
2014-02-24 21:40 ` Setting skb->protocol efficiently for AF_PACKETs via TX_RING Mathias Kretschmer
0 siblings, 2 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-02-23 13:58 UTC (permalink / raw)
To: Mathias Kretschmer; +Cc: jbrouer, netdev
On 02/23/2014 12:03 PM, Mathias Kretschmer wrote:
> Hi Daniel, all
>
> we're running into the above error when sending full-sized VLAN-tagged frames via TX_RING (Kernel 3.10.31, but same code seems to be present in 3.14).
>
> For the regular send() calls, there seems to be code in place to handle VLAN-tagged frames, but in tpacket_fill_skb() there is no such check.
>
> I came up with fix for this, but can only verify this tomorrow.
> If it works, I'd send you a patch, but someone with more insights into the networking code should probably fix this properly ;-)
Looks like the logic was added only to packet_snd() path when there was no TX_RING
implementation available yet.
Does that patch work for you?
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 48a6a93..9deb991 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2257,8 +2257,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
if (unlikely(!(dev->flags & IFF_UP)))
goto out_put;
- reserve = dev->hard_header_len;
-
+ reserve = dev->hard_header_len + VLAN_HLEN;
size_max = po->tx_ring.frame_size
- (po->tp_hdrlen - sizeof(struct sockaddr_ll));
@@ -2285,8 +2284,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
goto out_status;
tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
- addr, hlen);
-
+ addr, hlen);
if (unlikely(tp_len < 0)) {
if (po->tp_loss) {
__packet_set_status(po, ph,
@@ -2300,6 +2298,21 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
goto out_status;
}
}
+ if (tp_len > (dev->mtu + dev->hard_header_len)) {
+ struct ethhdr *ehdr;
+ /* Earlier code assumed this would be a VLAN pkt,
+ * double-check this now that we have the actual
+ * packet in hand.
+ */
+
+ skb_reset_mac_header(skb);
+ ehdr = eth_hdr(skb);
+ if (ehdr->h_proto != htons(ETH_P_8021Q)) {
+ status = TP_STATUS_WRONG_FORMAT;
+ err = -EMSGSIZE;
+ goto out_status;
+ }
+ }
packet_pick_tx_queue(dev, skb);
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514)
2014-02-23 13:58 ` Daniel Borkmann
@ 2014-02-24 9:03 ` Mathias Kretschmer
2014-02-24 22:33 ` Daniel Borkmann
2014-02-24 21:40 ` Setting skb->protocol efficiently for AF_PACKETs via TX_RING Mathias Kretschmer
1 sibling, 1 reply; 7+ messages in thread
From: Mathias Kretschmer @ 2014-02-24 9:03 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: jbrouer, netdev
Hi Daniel,
I've manually merged your patch into 3.10.32 since this is what I'm using right now.
Yes, it works here. We can now send full-sized VLAN-tagged frames via TX_RING.
Another question in this context:
Is there a convention regarding sbk->protocol for VLAN-tagged frames ? Should it be
8021Q (VLAN) or should it indicate the 'inner' protocol (i.e IPV4) ?
Depending on the use case, I can find pros and cons for either one.
Cheers,
Mathias
'On 02/23/2014 02:58 PM, Daniel Borkmann wrote:
> On 02/23/2014 12:03 PM, Mathias Kretschmer wrote:
>> Hi Daniel, all
>>
>> we're running into the above error when sending full-sized VLAN-tagged frames via
>> TX_RING (Kernel 3.10.31, but same code seems to be present in 3.14).
>>
>> For the regular send() calls, there seems to be code in place to handle
>> VLAN-tagged frames, but in tpacket_fill_skb() there is no such check.
>>
>> I came up with fix for this, but can only verify this tomorrow.
>> If it works, I'd send you a patch, but someone with more insights into the
>> networking code should probably fix this properly ;-)
>
> Looks like the logic was added only to packet_snd() path when there was no TX_RING
> implementation available yet.
>
> Does that patch work for you?
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 48a6a93..9deb991 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2257,8 +2257,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr
> *msg)
> if (unlikely(!(dev->flags & IFF_UP)))
> goto out_put;
>
> - reserve = dev->hard_header_len;
> -
> + reserve = dev->hard_header_len + VLAN_HLEN;
> size_max = po->tx_ring.frame_size
> - (po->tp_hdrlen - sizeof(struct sockaddr_ll));
>
> @@ -2285,8 +2284,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr
> *msg)
> goto out_status;
>
> tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
> - addr, hlen);
> -
> + addr, hlen);
> if (unlikely(tp_len < 0)) {
> if (po->tp_loss) {
> __packet_set_status(po, ph,
> @@ -2300,6 +2298,21 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr
> *msg)
> goto out_status;
> }
> }
> + if (tp_len > (dev->mtu + dev->hard_header_len)) {
> + struct ethhdr *ehdr;
> + /* Earlier code assumed this would be a VLAN pkt,
> + * double-check this now that we have the actual
> + * packet in hand.
> + */
> +
> + skb_reset_mac_header(skb);
> + ehdr = eth_hdr(skb);
> + if (ehdr->h_proto != htons(ETH_P_8021Q)) {
> + status = TP_STATUS_WRONG_FORMAT;
> + err = -EMSGSIZE;
> + goto out_status;
> + }
> + }
>
> packet_pick_tx_queue(dev, skb);
>
--
Dr. Mathias Kretschmer, Head of Competence Center
Fraunhofer FOKUS Network Research
A Schloss Birlinghoven, 53754 Sankt Augustin, Germany
T +49-2241-14-3466, F +49-2241-14-1050
E mathias.kretschmer@fokus.fraunhofer.de
W http://www.fokus.fraunhofer.de/en/net
---> Visit us @ „The Rural Broadband Day 2014”, Feb 13th, Fraunhofer FOKUS, Berlin
---> Visit us @ “Cebit 2014”, March 10th – 14th, Trade Fair, Hannover
^ permalink raw reply [flat|nested] 7+ messages in thread
* Setting skb->protocol efficiently for AF_PACKETs via TX_RING
2014-02-23 13:58 ` Daniel Borkmann
2014-02-24 9:03 ` Mathias Kretschmer
@ 2014-02-24 21:40 ` Mathias Kretschmer
1 sibling, 0 replies; 7+ messages in thread
From: Mathias Kretschmer @ 2014-02-24 21:40 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: jbrouer, netdev
Hi all,
for our application, basically a light weight (wireless) MPLS switch, we
are looking for an efficient way to push Ethernet Frames onto NICs via
TX_RING.
One challenge here is that we might be heavily mixing EtherTypes (MPLS
unicast & multicast, as well as signalling traffic).
As far as I understand, the fastest method to push packets onto the NIC
is to not specify a protocol in the socket() call and also to omit the
protocol for the bind() call.
Hence, the last option to specify the protocol would be the send() call.
The beauty about the TX_RING implementation is that one can push a large
batch of frames towards the kernel - as long as the EtherType/protocol
is identical...
In out case, this means fragmenting a nice large send() into potentially
single frame send() calls.
So, essentially, we're looking for an efficient method to set
skb->protocol from the EtherType of the respective Ethernet frame.
Looking into the tpacket_fill_skb() function, it seems that the most
efficient way accomplish this would be to set skb->protocol after
importing the hard_header into the skb:
skb_push(skb, dev->hard_header_len);
err = skb_store_bits(skb, 0, data,
dev->hard_header_len);
if (unlikely(err))
return err;
if (skb->protocol == 0) { /* protocol not yet set bu other means */
struct ethhdr *ehdr = (struct ethhdr*) data;
skb->protocol = ehdr->h_proto;
}
This way tpacket_fill_skb() would return with a fully valid skb for
further processing. And, this would only be done if skb->protocol was
not set before via socket(), bind() or send().
Hence, there is no need to break a large send() call down into small
fragments, while the 'skb->protocol = ehdr->h_proto' assignment would
have to be done in any case. Using the send() method this would require
an additional overhead since we'd have to fill in a msghdr and pass it
along.
Needless to say, we tried it and it seems to work fine (on AMD geode and
also on a more modern Intel Atom).
We'd be open to alternative solutions. Otherwise we would propose those
few lines to be added to the af_packet code.
Best regards,
Mathias
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514)
2014-02-24 9:03 ` Mathias Kretschmer
@ 2014-02-24 22:33 ` Daniel Borkmann
2014-02-26 7:28 ` Mathias Kretschmer
0 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2014-02-24 22:33 UTC (permalink / raw)
To: Mathias Kretschmer; +Cc: jbrouer, netdev
On 02/24/2014 10:03 AM, Mathias Kretschmer wrote:
...
> I've manually merged your patch into 3.10.32 since this is what I'm using right now.
>
> Yes, it works here. We can now send full-sized VLAN-tagged frames via TX_RING.
Ok, thanks, I will submit it some time this week, after some more
testing from my side as well.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514)
2014-02-24 22:33 ` Daniel Borkmann
@ 2014-02-26 7:28 ` Mathias Kretschmer
2014-02-26 10:33 ` Daniel Borkmann
0 siblings, 1 reply; 7+ messages in thread
From: Mathias Kretschmer @ 2014-02-26 7:28 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: jbrouer, netdev
On 02/24/2014 11:33 PM, Daniel Borkmann wrote:
> On 02/24/2014 10:03 AM, Mathias Kretschmer wrote:
> ...
>> I've manually merged your patch into 3.10.32 since this is what I'm
>> using right now.
>>
>> Yes, it works here. We can now send full-sized VLAN-tagged frames via
>> TX_RING.
>
> Ok, thanks, I will submit it some time this week, after some more
> testing from my side as well.
Thank you. It would be great if this fix would also make it into
longterm kernels.
-Mathias
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514)
2014-02-26 7:28 ` Mathias Kretschmer
@ 2014-02-26 10:33 ` Daniel Borkmann
0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-02-26 10:33 UTC (permalink / raw)
To: Mathias Kretschmer; +Cc: jbrouer, netdev
On 02/26/2014 08:28 AM, Mathias Kretschmer wrote:
> On 02/24/2014 11:33 PM, Daniel Borkmann wrote:
>> On 02/24/2014 10:03 AM, Mathias Kretschmer wrote:
>> ...
>>> I've manually merged your patch into 3.10.32 since this is what I'm
>>> using right now.
>>>
>>> Yes, it works here. We can now send full-sized VLAN-tagged frames via
>>> TX_RING.
>>
>> Ok, thanks, I will submit it some time this week, after some more
>> testing from my side as well.
>
> Thank you. It would be great if this fix would also make it into longterm kernels.
Will send it out today or tomorrow.
Thanks,
Daniel
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-02-26 10:33 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-23 11:03 TX_RING and VLAN : (packet size is too long 1518 > 1514) Mathias Kretschmer
2014-02-23 13:58 ` Daniel Borkmann
2014-02-24 9:03 ` Mathias Kretschmer
2014-02-24 22:33 ` Daniel Borkmann
2014-02-26 7:28 ` Mathias Kretschmer
2014-02-26 10:33 ` Daniel Borkmann
2014-02-24 21:40 ` Setting skb->protocol efficiently for AF_PACKETs via TX_RING Mathias Kretschmer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).