From: Jan-Bernd Themann <ossthema@de.ibm.com>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Thomas Klein <tklein@de.ibm.com>,
Jan-Bernd Themann <themann@de.ibm.com>,
netdev <netdev@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-ppc <linuxppc-dev@ozlabs.org>,
Christoph Raisch <raisch@de.ibm.com>,
Marcus Eder <meder@de.ibm.com>,
Stefan Roscher <stefan.roscher@de.ibm.com>
Subject: Re: [RFC 1/3] lro: Generic LRO for TCP traffic
Date: Thu, 12 Jul 2007 13:54:42 +0200
Message-ID: <200707121354.43407.ossthema@de.ibm.com>
In-Reply-To: <20070712080137.GA25699@2ka.mipt.ru>
Hi Evgeniy
On Thursday 12 July 2007 10:01, Evgeniy Polyakov wrote:
> > +
> > + if (tcph->cwr || tcph->ece || tcph->urg || !tcph->ack || tcph->psh
> > + || tcph->rst || tcph->syn || tcph->fin)
> > + return -1;
>
> I think you do not want to break lro frame because of push flag - it is
> pretty common flag, which does not break processing (and I'm not sure if
> it has any special meaning these days).
>
> > + if (INET_ECN_is_ce(ipv4_get_dsfield(iph)))
> > + return -1;
> > +
> > + if (tcph->doff != TCPH_LEN_WO_OPTIONS
> > + && tcph->doff != TCPH_LEN_W_TIMESTAMP)
> > + return -1;
> > +
> > + /* check tcp options (only timestamp allowed) */
> > + if (tcph->doff == TCPH_LEN_W_TIMESTAMP) {
> > + u32 *topt = (u32 *)(tcph + 1);
> > +
> > + if (*topt != htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16)
> > + | (TCPOPT_TIMESTAMP << 8)
> > + | TCPOLEN_TIMESTAMP))
> > + return -1;
> > +
> > + /* timestamp should be in right order */
> > + topt++;
> > + if (lro_desc && (ntohl(lro_desc->tcp_rcv_tsval) > ntohl(*topt)))
> > + return -1;
>
> This still does not handle wrapping over 32 bits.
> What about
> if (lro_desc && after(ntohl(lro_desc->tcp_rcv_tsval), ntohl(*topt)))
> return -1;
Looks good, I will change that.
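For reference, a minimal userspace sketch of why the plain ">" comparison
breaks at the 32-bit wrap and the modular comparison does not. The names
ts_after() and ts_out_of_order() are made up for this illustration; in the
patch itself the kernel's after() helper would be used on host-order values.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's after() helper: true if a is
 * later than b in 32-bit modular arithmetic. Relies on the usual
 * two's-complement conversion to int32_t.
 */
static bool ts_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/*
 * True if the new timestamp value went backwards relative to the last
 * one seen for this flow. Both arguments are in host byte order.
 */
static bool ts_out_of_order(uint32_t last_tsval, uint32_t new_tsval)
{
	return ts_after(last_tsval, new_tsval);
}
```

With last_tsval = 0xFFFFFFF0 and new_tsval = 0x00000010 the plain ">" test
would reject the packet, while ts_out_of_order() treats it as in order.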
>
> > + /* timestamp reply should not be zero */
> > + topt++;
> > + if (*topt == 0)
> > + return -1;
> > + }
> > +
> > + return 0;
> > +}
>
> > +static struct net_lro_desc *lro_get_desc(struct net_lro_mgr *mgr,
> > + struct net_lro_desc *lro_arr,
> > + struct iphdr *iph,
> > + struct tcphdr *tcph)
> > +{
> > + struct net_lro_desc *lro_desc = NULL;
> > + struct net_lro_desc *tmp;
> > + int max_desc = mgr->max_desc;
> > + int i;
> > +
> > + for (i = 0; i < max_desc; i++) {
> > + tmp = &lro_arr[i];
> > + if (tmp->active)
> > + if (!lro_check_tcp_conn(tmp, iph, tcph)) {
> > + lro_desc = tmp;
> > + goto out;
> > + }
> > + }
>
> Ugh... What about tree structure or hash here?
Our initial version was based on the following assumptions (remember, only 8 elements):
- Given a quota of 64 packets, it makes no sense to use huge arrays for LRO descriptors
  (in our case 8 elements seem to work fine).
- A small array lets us benefit from caching effects and branch prediction,
  and use the built-in cacheline prefetch.
I guess the array mechanism can be improved (for example, finding free entries), but for an
initial version, meant to show how the rest of the stack behaves with LRO,
it might be OK this way.
Do you think a tree or hash would improve on this with existing
caching designs (for a small number of elements)?
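One possible improvement to the array approach, sketched in userspace C
under simplifying assumptions: a single pass that looks for a matching
active descriptor and remembers the first free slot along the way, so the
second loop is not needed. struct flow_desc, flow_get_desc() and MAX_DESC
are hypothetical names for this illustration, not the structures from the
patch, and the match is reduced to a plain 4-tuple compare.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define MAX_DESC 8	/* the small descriptor count discussed above */

/* Hypothetical, simplified descriptor: only the connection 4-tuple. */
struct flow_desc {
	bool     active;
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/*
 * Return the active descriptor matching the 4-tuple, else the first
 * free descriptor, else NULL if every slot is in use by another flow.
 */
static struct flow_desc *flow_get_desc(struct flow_desc *arr, size_t n,
				       uint32_t saddr, uint32_t daddr,
				       uint16_t sport, uint16_t dport)
{
	struct flow_desc *free_slot = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		struct flow_desc *d = &arr[i];

		if (!d->active) {
			/* remember only the first free slot we pass */
			if (!free_slot)
				free_slot = d;
			continue;
		}
		if (d->saddr == saddr && d->daddr == daddr &&
		    d->sport == sport && d->dport == dport)
			return d;
	}
	return free_slot;
}
```

For 8 entries the whole array stays within a few cachelines, so this
remains a short linear scan; whether it beats a hash for such a small
table would have to be measured.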
>
> > + for (i = 0; i < max_desc; i++) {
> > + if(!lro_arr[i].active) {
> > + lro_desc = &lro_arr[i];
> > + goto out;
> > + }
> > + }
> > +
> > +out:
> > + return lro_desc;
> > +}
>
> > +int __lro_proc_skb(struct net_lro_mgr *lro_mgr, struct sk_buff *skb,
> > + struct vlan_group *vgrp, u16 vlan_tag, void *priv)
> > +{
> > + struct net_lro_desc *lro_desc;
> > + struct iphdr *iph;
> > + struct tcphdr *tcph;
> > +
> > + if (!lro_mgr->get_ip_tcp_hdr
> > + || lro_mgr->get_ip_tcp_hdr(skb, &iph, &tcph, priv))
> > + goto out;
> > +
> > + lro_desc = lro_get_desc(lro_mgr, lro_mgr->lro_arr, iph, tcph);
> > + if (!lro_desc)
> > + goto out;
>
> There is no protection of the descriptor array from accessing from
> different CPUs. Is it forbidden to share net_lro_mgr structure?
>
Currently we assume that netpoll runs for a given device on only one CPU
at a time, and that if there are multiple receive queues that can be processed
in parallel, the traffic is usually sorted per receive queue. It would make
sense to use a separate LRO "Manager" for each queue (this would also speed up the lookup).
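A rough sketch of the per-queue idea, assuming each receive queue is only
ever polled by one CPU at a time: every queue embeds its own manager, so the
descriptor array is never shared and needs no locking. All names here
(rx_queue_sketch, lro_mgr_sketch, rx_queues_init, ...) are hypothetical and
not taken from the patch or from any driver.

```c
#include <stddef.h>
#include <string.h>

#define DESC_PER_QUEUE 8	/* assumed per-queue descriptor count */

/* Hypothetical stand-ins for the descriptor and manager types. */
struct lro_desc_sketch {
	int active;
};

struct lro_mgr_sketch {
	struct lro_desc_sketch desc[DESC_PER_QUEUE];
	int max_desc;
};

/* Each receive queue owns a private manager; nothing is shared. */
struct rx_queue_sketch {
	int id;
	struct lro_mgr_sketch lro;
};

/* Give every queue an empty descriptor array of its own. */
static void rx_queues_init(struct rx_queue_sketch *q, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		memset(&q[i], 0, sizeof(q[i]));
		q[i].id = (int)i;
		q[i].lro.max_desc = DESC_PER_QUEUE;
	}
}

/* The poll routine of a queue would only ever use its own manager. */
static struct lro_mgr_sketch *rx_queue_lro_mgr(struct rx_queue_sketch *q)
{
	return &q->lro;
}
```

Since the traffic is sorted per queue, each manager also sees fewer
concurrent flows, which keeps the linear lookup short.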
Regards,
Jan-Bernd