* TX_RING and VLAN : (packet size is too long 1518 > 1514) @ 2014-02-23 11:03 Mathias Kretschmer 2014-02-23 13:58 ` Daniel Borkmann 0 siblings, 1 reply; 7+ messages in thread From: Mathias Kretschmer @ 2014-02-23 11:03 UTC (permalink / raw) To: dborkman; +Cc: jbrouer, netdev Hi Daniel, all we're running into the above error when sending full-sized VLAN-tagged frames via TX_RING (Kernel 3.10.31, but same code seems to be present in 3.14). For the regular send() calls, there seems to be code in place to handle VLAN-tagged frames, but in tpacket_fill_skb() there is no such check. I came up with fix for this, but can only verify this tomorrow. If it works, I'd send you a patch, but someone with more insights into the networking code should probably fix this properly ;-) Cheers, Mathias ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514) 2014-02-23 11:03 TX_RING and VLAN : (packet size is too long 1518 > 1514) Mathias Kretschmer @ 2014-02-23 13:58 ` Daniel Borkmann 2014-02-24 9:03 ` Mathias Kretschmer 2014-02-24 21:40 ` Setting skb->protocol efficiently for AF_PACKETs via TX_RING Mathias Kretschmer 0 siblings, 2 replies; 7+ messages in thread From: Daniel Borkmann @ 2014-02-23 13:58 UTC (permalink / raw) To: Mathias Kretschmer; +Cc: jbrouer, netdev On 02/23/2014 12:03 PM, Mathias Kretschmer wrote: > Hi Daniel, all > > we're running into the above error when sending full-sized VLAN-tagged frames via TX_RING (Kernel 3.10.31, but same code seems to be present in 3.14). > > For the regular send() calls, there seems to be code in place to handle VLAN-tagged frames, but in tpacket_fill_skb() there is no such check. > > I came up with fix for this, but can only verify this tomorrow. > If it works, I'd send you a patch, but someone with more insights into the networking code should probably fix this properly ;-) Looks like the logic was added only to packet_snd() path when there was no TX_RING implementation available yet. Does that patch work for you? diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 48a6a93..9deb991 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2257,8 +2257,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) if (unlikely(!(dev->flags & IFF_UP))) goto out_put; - reserve = dev->hard_header_len; - + reserve = dev->hard_header_len + VLAN_HLEN; size_max = po->tx_ring.frame_size - (po->tp_hdrlen - sizeof(struct sockaddr_ll)); @@ -2285,8 +2284,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) goto out_status; tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto, - addr, hlen); - + addr, hlen); if (unlikely(tp_len < 0)) { if (po->tp_loss) { __packet_set_status(po, ph, @@ -2300,6 +2298,21 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) goto out_status; } } + if (tp_len > (dev->mtu + dev->hard_header_len)) { + struct ethhdr *ehdr; + /* Earlier code assumed this would be a VLAN pkt, + * double-check this now that we have the actual + * packet in hand. + */ + + skb_reset_mac_header(skb); + ehdr = eth_hdr(skb); + if (ehdr->h_proto != htons(ETH_P_8021Q)) { + status = TP_STATUS_WRONG_FORMAT; + err = -EMSGSIZE; + goto out_status; + } + } packet_pick_tx_queue(dev, skb); ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514) 2014-02-23 13:58 ` Daniel Borkmann @ 2014-02-24 9:03 ` Mathias Kretschmer 2014-02-24 22:33 ` Daniel Borkmann 2014-02-24 21:40 ` Setting skb->protocol efficiently for AF_PACKETs via TX_RING Mathias Kretschmer 1 sibling, 1 reply; 7+ messages in thread From: Mathias Kretschmer @ 2014-02-24 9:03 UTC (permalink / raw) To: Daniel Borkmann; +Cc: jbrouer, netdev Hi Daniel, I've manually merged your patch into 3.10.32 since this is what I'm using right now. Yes, it works here. We can now send full-sized VLAN-tagged frames via TX_RING. Another question in this context: Is there a convention regarding sbk->protocol for VLAN-tagged frames ? Should it be 8021Q (VLAN) or should it indicate the 'inner' protocol (i.e IPV4) ? Depending on the use case, I can find pros and cons for either one. Cheers, Mathias 'On 02/23/2014 02:58 PM, Daniel Borkmann wrote: > On 02/23/2014 12:03 PM, Mathias Kretschmer wrote: >> Hi Daniel, all >> >> we're running into the above error when sending full-sized VLAN-tagged frames via >> TX_RING (Kernel 3.10.31, but same code seems to be present in 3.14). >> >> For the regular send() calls, there seems to be code in place to handle >> VLAN-tagged frames, but in tpacket_fill_skb() there is no such check. >> >> I came up with fix for this, but can only verify this tomorrow. >> If it works, I'd send you a patch, but someone with more insights into the >> networking code should probably fix this properly ;-) > > Looks like the logic was added only to packet_snd() path when there was no TX_RING > implementation available yet. > > Does that patch work for you? > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c > index 48a6a93..9deb991 100644 > --- a/net/packet/af_packet.c > +++ b/net/packet/af_packet.c > @@ -2257,8 +2257,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr > *msg) > if (unlikely(!(dev->flags & IFF_UP))) > goto out_put; > > - reserve = dev->hard_header_len; > - > + reserve = dev->hard_header_len + VLAN_HLEN; > size_max = po->tx_ring.frame_size > - (po->tp_hdrlen - sizeof(struct sockaddr_ll)); > > @@ -2285,8 +2284,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr > *msg) > goto out_status; > > tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto, > - addr, hlen); > - > + addr, hlen); > if (unlikely(tp_len < 0)) { > if (po->tp_loss) { > __packet_set_status(po, ph, > @@ -2300,6 +2298,21 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr > *msg) > goto out_status; > } > } > + if (tp_len > (dev->mtu + dev->hard_header_len)) { > + struct ethhdr *ehdr; > + /* Earlier code assumed this would be a VLAN pkt, > + * double-check this now that we have the actual > + * packet in hand. > + */ > + > + skb_reset_mac_header(skb); > + ehdr = eth_hdr(skb); > + if (ehdr->h_proto != htons(ETH_P_8021Q)) { > + status = TP_STATUS_WRONG_FORMAT; > + err = -EMSGSIZE; > + goto out_status; > + } > + } > > packet_pick_tx_queue(dev, skb); > -- Dr. Mathias Kretschmer, Head of Competence Center Fraunhofer FOKUS Network Research A Schloss Birlinghoven, 53754 Sankt Augustin, Germany T +49-2241-14-3466, F +49-2241-14-1050 E mathias.kretschmer@fokus.fraunhofer.de W http://www.fokus.fraunhofer.de/en/net ---> Visit us @ „The Rural Broadband Day 2014”, Feb 13th, Fraunhofer FOKUS, Berlin ---> Visit us @ “Cebit 2014”, March 10th – 14th, Trade Fair, Hannover ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514) 2014-02-24 9:03 ` Mathias Kretschmer @ 2014-02-24 22:33 ` Daniel Borkmann 2014-02-26 7:28 ` Mathias Kretschmer 0 siblings, 1 reply; 7+ messages in thread From: Daniel Borkmann @ 2014-02-24 22:33 UTC (permalink / raw) To: Mathias Kretschmer; +Cc: jbrouer, netdev On 02/24/2014 10:03 AM, Mathias Kretschmer wrote: ... > I've manually merged your patch into 3.10.32 since this is what I'm using right now. > > Yes, it works here. We can now send full-sized VLAN-tagged frames via TX_RING. Ok, thanks, I will submit it some time this week, after some more testing from my side as well. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514) 2014-02-24 22:33 ` Daniel Borkmann @ 2014-02-26 7:28 ` Mathias Kretschmer 2014-02-26 10:33 ` Daniel Borkmann 0 siblings, 1 reply; 7+ messages in thread From: Mathias Kretschmer @ 2014-02-26 7:28 UTC (permalink / raw) To: Daniel Borkmann; +Cc: jbrouer, netdev On 02/24/2014 11:33 PM, Daniel Borkmann wrote: > On 02/24/2014 10:03 AM, Mathias Kretschmer wrote: > ... >> I've manually merged your patch into 3.10.32 since this is what I'm >> using right now. >> >> Yes, it works here. We can now send full-sized VLAN-tagged frames via >> TX_RING. > > Ok, thanks, I will submit it some time this week, after some more > testing from my side as well. Thank you. It would be great if this fix would also make it into longterm kernels. -Mathias ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: TX_RING and VLAN : (packet size is too long 1518 > 1514) 2014-02-26 7:28 ` Mathias Kretschmer @ 2014-02-26 10:33 ` Daniel Borkmann 0 siblings, 0 replies; 7+ messages in thread From: Daniel Borkmann @ 2014-02-26 10:33 UTC (permalink / raw) To: Mathias Kretschmer; +Cc: jbrouer, netdev On 02/26/2014 08:28 AM, Mathias Kretschmer wrote: > On 02/24/2014 11:33 PM, Daniel Borkmann wrote: >> On 02/24/2014 10:03 AM, Mathias Kretschmer wrote: >> ... >>> I've manually merged your patch into 3.10.32 since this is what I'm >>> using right now. >>> >>> Yes, it works here. We can now send full-sized VLAN-tagged frames via >>> TX_RING. >> >> Ok, thanks, I will submit it some time this week, after some more >> testing from my side as well. > > Thank you. It would be great if this fix would also make it into longterm kernels. Will send it out today or tomorrow. Thanks, Daniel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Setting skb->protocol efficiently for AF_PACKETs via TX_RING 2014-02-23 13:58 ` Daniel Borkmann 2014-02-24 9:03 ` Mathias Kretschmer @ 2014-02-24 21:40 ` Mathias Kretschmer 1 sibling, 0 replies; 7+ messages in thread From: Mathias Kretschmer @ 2014-02-24 21:40 UTC (permalink / raw) To: Daniel Borkmann; +Cc: jbrouer, netdev Hi all, for our application, basically a light weight (wireless) MPLS switch, we are looking for an efficient way to push Ethernet Frames onto NICs via TX_RING. One challenge here is that we might be heavily mixing EtherTypes (MPLS unicast & multicast, as well as signalling traffic). As far as I understand, the fastest method to push packets onto the NIC is to not specify a protocol in the socket() call and also to omit the protocol for the bind() call. Hence, the last option to specify the protocol would be the send() call. The beauty about the TX_RING implementation is that one can push a large batch of frames towards the kernel - as long as the EtherType/protocol is identical... In out case, this means fragmenting a nice large send() into potentially single frame send() calls. So, essentially, we're looking for an efficient method to set skb->protocol from the EtherType of the respective Ethernet frame. Looking into the tpacket_fill_skb() function, it seems that the most efficient way accomplish this would be to set skb->protocol after importing the hard_header into the skb: skb_push(skb, dev->hard_header_len); err = skb_store_bits(skb, 0, data, dev->hard_header_len); if (unlikely(err)) return err; if (skb->protocol == 0) { /* protocol not yet set bu other means */ struct ethhdr *ehdr = (struct ethhdr*) data; skb->protocol = ehdr->h_proto; } This way tpacket_fill_skb() would return with a fully valid skb for further processing. And, this would only be done if skb->protocol was not set before via socket(), bind() or send(). Hence, there is no need to break a large send() call down into small fragments, while the 'skb->protocol = ehdr->h_proto' assignment would have to be done in any case. Using the send() method this would require an additional overhead since we'd have to fill in a msghdr and pass it along. Needless to say, we tried it and it seems to work fine (on AMD geode and also on a more modern Intel Atom). We'd be open to alternative solutions. Otherwise we would propose those few lines to be added to the af_packet code. Best regards, Mathias ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-02-26 10:33 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-02-23 11:03 TX_RING and VLAN : (packet size is too long 1518 > 1514) Mathias Kretschmer 2014-02-23 13:58 ` Daniel Borkmann 2014-02-24 9:03 ` Mathias Kretschmer 2014-02-24 22:33 ` Daniel Borkmann 2014-02-26 7:28 ` Mathias Kretschmer 2014-02-26 10:33 ` Daniel Borkmann 2014-02-24 21:40 ` Setting skb->protocol efficiently for AF_PACKETs via TX_RING Mathias Kretschmer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).