* [PATCH 0/2] tun: add the RFS support @ 2013-12-22 10:54 Zhi Yong Wu 2013-12-22 10:54 ` [PATCH 1/2] net: Allow setting sock flow hash without a sock Zhi Yong Wu ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Zhi Yong Wu @ 2013-12-22 10:54 UTC (permalink / raw) To: therbert; +Cc: netdev, edumazet, Zhi Yong Wu From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> HI, folks Since Tom Herbert's hash related patchset was modified and got merged, his pachset about adding support for RFS on tun flows also need to get adjusted accordingly. I tried to update them, and before i will start to do some perf tests, i hope to get one correct code base, so it's time to post them out now. Any constructive comments are welcome, thanks. Tom Herbert (2): net: Allow setting sock flow hash without a sock tun: Add support for RFS on tun flows drivers/net/tun.c | 37 +++++++++++++++++++++++++++++++++++-- include/net/sock.h | 18 ++++++++++++++---- 2 files changed, 49 insertions(+), 6 deletions(-) -- 1.7.6.5 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/2] net: Allow setting sock flow hash without a sock 2013-12-22 10:54 [PATCH 0/2] tun: add the RFS support Zhi Yong Wu @ 2013-12-22 10:54 ` Zhi Yong Wu 2013-12-22 10:54 ` [PATCH 2/2] tun: Add support for RFS on tun flows Zhi Yong Wu 2013-12-31 18:32 ` [PATCH 0/2] tun: add the RFS support David Miller 2 siblings, 0 replies; 15+ messages in thread From: Zhi Yong Wu @ 2013-12-22 10:54 UTC (permalink / raw) To: therbert; +Cc: netdev, edumazet, Zhi Yong Wu From: Tom Herbert <therbert@google.com> This patch adds sock_rps_record_flow_hash and sock_rps_reset_flow_hash which take a hash value as an argument and sets the sock_flow_table accordingly. This allows the table to be populated in cases where flow is being tracked outside of a sock structure. sock_rps_record_flow and sock_rps_reset_flow call this function where the hash is taken from sk_rxhash. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> --- include/net/sock.h | 18 ++++++++++++++---- 1 files changed, 14 insertions(+), 4 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 2ef3c3e..8ee90ad 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -820,30 +820,40 @@ static inline int sk_backlog_rcv(struct sock *sk, struct sk_buff *skb) return sk->sk_backlog_rcv(sk, skb); } -static inline void sock_rps_record_flow(const struct sock *sk) +static inline void sock_rps_record_flow_hash(__u32 hash) { #ifdef CONFIG_RPS struct rps_sock_flow_table *sock_flow_table; rcu_read_lock(); sock_flow_table = rcu_dereference(rps_sock_flow_table); - rps_record_sock_flow(sock_flow_table, sk->sk_rxhash); + rps_record_sock_flow(sock_flow_table, hash); rcu_read_unlock(); #endif } -static inline void sock_rps_reset_flow(const struct sock *sk) +static inline void sock_rps_reset_flow_hash(__u32 hash) { #ifdef CONFIG_RPS struct rps_sock_flow_table *sock_flow_table; rcu_read_lock(); sock_flow_table = rcu_dereference(rps_sock_flow_table); - rps_reset_sock_flow(sock_flow_table, sk->sk_rxhash); + rps_reset_sock_flow(sock_flow_table, hash); rcu_read_unlock(); #endif } +static inline void sock_rps_record_flow(const struct sock *sk) +{ + sock_rps_record_flow_hash(sk->sk_rxhash); +} + +static inline void sock_rps_reset_flow(const struct sock *sk) +{ + sock_rps_reset_flow_hash(sk->sk_rxhash); +} + static inline void sock_rps_save_rxhash(struct sock *sk, const struct sk_buff *skb) { -- 1.7.6.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/2] tun: Add support for RFS on tun flows 2013-12-22 10:54 [PATCH 0/2] tun: add the RFS support Zhi Yong Wu 2013-12-22 10:54 ` [PATCH 1/2] net: Allow setting sock flow hash without a sock Zhi Yong Wu @ 2013-12-22 10:54 ` Zhi Yong Wu 2014-01-02 4:44 ` Jason Wang 2013-12-31 18:32 ` [PATCH 0/2] tun: add the RFS support David Miller 2 siblings, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2013-12-22 10:54 UTC (permalink / raw) To: therbert; +Cc: netdev, edumazet, Zhi Yong Wu From: Tom Herbert <therbert@google.com> This patch adds support so that the rps_flow_tables (RFS) can be programmed using the tun flows which are already set up to track flows for the purposes of queue selection. On the receive path (corresponding to select_queue and tun_net_xmit) the rxhash is saved in the flow_entry. The original code only does flow lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit if num_queues == 1 (select_queue is not called from dev_queue_xmit->netdev_pick_tx in that case). The flow is recorded (processing CPU) in tun_flow_update (TX path), and reset when flow is deleted. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> --- drivers/net/tun.c | 37 +++++++++++++++++++++++++++++++++++-- 1 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index a17a701..3cf0457 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -152,6 +152,7 @@ struct tun_flow_entry { struct tun_struct *tun; u32 rxhash; + u32 rps_rxhash; int queue_index; unsigned long updated; }; @@ -220,6 +221,7 @@ static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun, rxhash, queue_index); e->updated = jiffies; e->rxhash = rxhash; + e->rps_rxhash = 0; e->queue_index = queue_index; e->tun = tun; hlist_add_head_rcu(&e->hash_link, head); @@ -232,6 +234,7 @@ static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e) { tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n", e->rxhash, e->queue_index); + sock_rps_reset_flow_hash(e->rps_rxhash); hlist_del_rcu(&e->hash_link); kfree_rcu(e, rcu); --tun->flow_count; @@ -325,6 +328,7 @@ static void tun_flow_update(struct tun_struct *tun, u32 rxhash, /* TODO: keep queueing to old queue until it's empty? */ e->queue_index = queue_index; e->updated = jiffies; + sock_rps_record_flow_hash(e->rps_rxhash); } else { spin_lock_bh(&tun->lock); if (!tun_flow_find(head, rxhash) && @@ -341,6 +345,18 @@ unlock: rcu_read_unlock(); } +/** + * Save the hash received in the stack receive path and update the + * flow_hash table accordingly. + */ +static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash) +{ + if (unlikely(e->rps_rxhash != hash)) { + sock_rps_reset_flow_hash(e->rps_rxhash); + e->rps_rxhash = hash; + } +} + /* We try to identify a flow through its rxhash first. The reason that * we do not check rxq no. is because some cards(e.g 82599), chooses * the rxq based on the txq where the last packet of the flow comes. As @@ -361,9 +377,10 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb) txq = skb_get_hash(skb); if (txq) { e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq); - if (e) + if (e) { txq = e->queue_index; - else + tun_flow_save_rps_rxhash(e, txq); + } else /* use multiply and shift instead of expensive divide */ txq = ((u64)txq * numqueues) >> 32; } else if (likely(skb_rx_queue_recorded(skb))) { @@ -728,6 +745,22 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) if (txq >= tun->numqueues) goto drop; + if (tun->numqueues == 1) { + /* Select queue was not called for the skbuff, so we extract the + * RPS hash and save it into the flow_table here. + */ + __u32 rxhash; + + rxhash = skb_get_hash(skb); + if (rxhash) { + struct tun_flow_entry *e; + e = tun_flow_find(&tun->flows[tun_hashfn(rxhash)], + rxhash); + if (e) + tun_flow_save_rps_rxhash(e, rxhash); + } + } + tun_debug(KERN_INFO, tun, "tun_net_xmit %d\n", skb->len); BUG_ON(!tfile); -- 1.7.6.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] tun: Add support for RFS on tun flows 2013-12-22 10:54 ` [PATCH 2/2] tun: Add support for RFS on tun flows Zhi Yong Wu @ 2014-01-02 4:44 ` Jason Wang 2014-01-02 5:07 ` Zhi Yong Wu 0 siblings, 1 reply; 15+ messages in thread From: Jason Wang @ 2014-01-02 4:44 UTC (permalink / raw) To: Zhi Yong Wu, therbert; +Cc: netdev, edumazet, Zhi Yong Wu On 12/22/2013 06:54 PM, Zhi Yong Wu wrote: > From: Tom Herbert <therbert@google.com> > > This patch adds support so that the rps_flow_tables (RFS) can be > programmed using the tun flows which are already set up to track flows > for the purposes of queue selection. > > On the receive path (corresponding to select_queue and tun_net_xmit) the > rxhash is saved in the flow_entry. The original code only does flow > lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit > if num_queues == 1 (select_queue is not called from > dev_queue_xmit->netdev_pick_tx in that case). > > The flow is recorded (processing CPU) in tun_flow_update (TX path), and > reset when flow is deleted. > > Signed-off-by: Tom Herbert <therbert@google.com> > Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> > --- > drivers/net/tun.c | 37 +++++++++++++++++++++++++++++++++++-- > 1 files changed, 35 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c > index a17a701..3cf0457 100644 > --- a/drivers/net/tun.c > +++ b/drivers/net/tun.c > @@ -152,6 +152,7 @@ struct tun_flow_entry { > struct tun_struct *tun; > > u32 rxhash; > + u32 rps_rxhash; > int queue_index; > unsigned long updated; > }; > @@ -220,6 +221,7 @@ static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun, > rxhash, queue_index); > e->updated = jiffies; > e->rxhash = rxhash; > + e->rps_rxhash = 0; > e->queue_index = queue_index; > e->tun = tun; > hlist_add_head_rcu(&e->hash_link, head); > @@ -232,6 +234,7 @@ static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e) > { > tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n", > e->rxhash, e->queue_index); > + sock_rps_reset_flow_hash(e->rps_rxhash); > hlist_del_rcu(&e->hash_link); > kfree_rcu(e, rcu); > --tun->flow_count; > @@ -325,6 +328,7 @@ static void tun_flow_update(struct tun_struct *tun, u32 rxhash, > /* TODO: keep queueing to old queue until it's empty? */ > e->queue_index = queue_index; > e->updated = jiffies; > + sock_rps_record_flow_hash(e->rps_rxhash); > } else { > spin_lock_bh(&tun->lock); > if (!tun_flow_find(head, rxhash) && > @@ -341,6 +345,18 @@ unlock: > rcu_read_unlock(); > } > > +/** > + * Save the hash received in the stack receive path and update the > + * flow_hash table accordingly. > + */ > +static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash) > +{ > + if (unlikely(e->rps_rxhash != hash)) { > + sock_rps_reset_flow_hash(e->rps_rxhash); > + e->rps_rxhash = hash; > + } > +} > + > /* We try to identify a flow through its rxhash first. The reason that > * we do not check rxq no. is because some cards(e.g 82599), chooses > * the rxq based on the txq where the last packet of the flow comes. As > @@ -361,9 +377,10 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb) > txq = skb_get_hash(skb); > if (txq) { > e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq); > - if (e) > + if (e) { > txq = e->queue_index; > - else > + tun_flow_save_rps_rxhash(e, txq); > + } else This looks wrong, txq is the queue index here. We should do this before txq = e->queue_index. Did you test the patch? You can simply verify the basic function by checking whether the rx softirq were done in the same cpu where vhost is running. > /* use multiply and shift instead of expensive divide */ > txq = ((u64)txq * numqueues) >> 32; > } else if (likely(skb_rx_queue_recorded(skb))) { > @@ -728,6 +745,22 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) > if (txq >= tun->numqueues) > goto drop; > > + if (tun->numqueues == 1) { > + /* Select queue was not called for the skbuff, so we extract the > + * RPS hash and save it into the flow_table here. > + */ > + __u32 rxhash; > + > + rxhash = skb_get_hash(skb); > + if (rxhash) { > + struct tun_flow_entry *e; > + e = tun_flow_find(&tun->flows[tun_hashfn(rxhash)], > + rxhash); > + if (e) > + tun_flow_save_rps_rxhash(e, rxhash); > + } > + } > + > tun_debug(KERN_INFO, tun, "tun_net_xmit %d\n", skb->len); > > BUG_ON(!tfile); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] tun: Add support for RFS on tun flows 2014-01-02 4:44 ` Jason Wang @ 2014-01-02 5:07 ` Zhi Yong Wu 2014-01-02 5:41 ` Jason Wang 0 siblings, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2014-01-02 5:07 UTC (permalink / raw) To: Jason Wang; +Cc: Tom Herbert, Linux Netdev List, Eric Dumazet, Zhi Yong Wu On Thu, Jan 2, 2014 at 12:44 PM, Jason Wang <jasowang@redhat.com> wrote: > On 12/22/2013 06:54 PM, Zhi Yong Wu wrote: >> From: Tom Herbert <therbert@google.com> >> >> This patch adds support so that the rps_flow_tables (RFS) can be >> programmed using the tun flows which are already set up to track flows >> for the purposes of queue selection. >> >> On the receive path (corresponding to select_queue and tun_net_xmit) the >> rxhash is saved in the flow_entry. The original code only does flow >> lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit >> if num_queues == 1 (select_queue is not called from >> dev_queue_xmit->netdev_pick_tx in that case). >> >> The flow is recorded (processing CPU) in tun_flow_update (TX path), and >> reset when flow is deleted. >> >> Signed-off-by: Tom Herbert <therbert@google.com> >> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >> --- >> drivers/net/tun.c | 37 +++++++++++++++++++++++++++++++++++-- >> 1 files changed, 35 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/net/tun.c b/drivers/net/tun.c >> index a17a701..3cf0457 100644 >> --- a/drivers/net/tun.c >> +++ b/drivers/net/tun.c >> @@ -152,6 +152,7 @@ struct tun_flow_entry { >> struct tun_struct *tun; >> >> u32 rxhash; >> + u32 rps_rxhash; >> int queue_index; >> unsigned long updated; >> }; >> @@ -220,6 +221,7 @@ static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun, >> rxhash, queue_index); >> e->updated = jiffies; >> e->rxhash = rxhash; >> + e->rps_rxhash = 0; >> e->queue_index = queue_index; >> e->tun = tun; >> hlist_add_head_rcu(&e->hash_link, head); >> @@ -232,6 +234,7 @@ static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e) >> { >> tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n", >> e->rxhash, e->queue_index); >> + sock_rps_reset_flow_hash(e->rps_rxhash); >> hlist_del_rcu(&e->hash_link); >> kfree_rcu(e, rcu); >> --tun->flow_count; >> @@ -325,6 +328,7 @@ static void tun_flow_update(struct tun_struct *tun, u32 rxhash, >> /* TODO: keep queueing to old queue until it's empty? */ >> e->queue_index = queue_index; >> e->updated = jiffies; >> + sock_rps_record_flow_hash(e->rps_rxhash); >> } else { >> spin_lock_bh(&tun->lock); >> if (!tun_flow_find(head, rxhash) && >> @@ -341,6 +345,18 @@ unlock: >> rcu_read_unlock(); >> } >> >> +/** >> + * Save the hash received in the stack receive path and update the >> + * flow_hash table accordingly. >> + */ >> +static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash) >> +{ >> + if (unlikely(e->rps_rxhash != hash)) { >> + sock_rps_reset_flow_hash(e->rps_rxhash); >> + e->rps_rxhash = hash; >> + } >> +} >> + >> /* We try to identify a flow through its rxhash first. The reason that >> * we do not check rxq no. is because some cards(e.g 82599), chooses >> * the rxq based on the txq where the last packet of the flow comes. As >> @@ -361,9 +377,10 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb) >> txq = skb_get_hash(skb); >> if (txq) { >> e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq); >> - if (e) >> + if (e) { >> txq = e->queue_index; >> - else >> + tun_flow_save_rps_rxhash(e, txq); >> + } else > > This looks wrong, txq is the queue index here. We should do this before > txq = e->queue_index. Good catch, will try to post one patch to fix it after the perf tests are done, thanks. > > Did you test the patch? You can simply verify the basic function by No so far, be planning to do its perf tests, do you have any good suggestions? > checking whether the rx softirq were done in the same cpu where vhost is > running. >> /* use multiply and shift instead of expensive divide */ >> txq = ((u64)txq * numqueues) >> 32; >> } else if (likely(skb_rx_queue_recorded(skb))) { >> @@ -728,6 +745,22 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) >> if (txq >= tun->numqueues) >> goto drop; >> >> + if (tun->numqueues == 1) { >> + /* Select queue was not called for the skbuff, so we extract the >> + * RPS hash and save it into the flow_table here. >> + */ >> + __u32 rxhash; >> + >> + rxhash = skb_get_hash(skb); >> + if (rxhash) { >> + struct tun_flow_entry *e; >> + e = tun_flow_find(&tun->flows[tun_hashfn(rxhash)], >> + rxhash); >> + if (e) >> + tun_flow_save_rps_rxhash(e, rxhash); >> + } >> + } >> + >> tun_debug(KERN_INFO, tun, "tun_net_xmit %d\n", skb->len); >> >> BUG_ON(!tfile); > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] tun: Add support for RFS on tun flows 2014-01-02 5:07 ` Zhi Yong Wu @ 2014-01-02 5:41 ` Jason Wang 2014-01-02 6:35 ` Zhi Yong Wu 0 siblings, 1 reply; 15+ messages in thread From: Jason Wang @ 2014-01-02 5:41 UTC (permalink / raw) To: Zhi Yong Wu; +Cc: Tom Herbert, Linux Netdev List, Eric Dumazet, Zhi Yong Wu On 01/02/2014 01:07 PM, Zhi Yong Wu wrote: > On Thu, Jan 2, 2014 at 12:44 PM, Jason Wang <jasowang@redhat.com> wrote: >> On 12/22/2013 06:54 PM, Zhi Yong Wu wrote: >>> From: Tom Herbert <therbert@google.com> >>> >>> This patch adds support so that the rps_flow_tables (RFS) can be >>> programmed using the tun flows which are already set up to track flows >>> for the purposes of queue selection. >>> >>> On the receive path (corresponding to select_queue and tun_net_xmit) the >>> rxhash is saved in the flow_entry. The original code only does flow >>> lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit >>> if num_queues == 1 (select_queue is not called from >>> dev_queue_xmit->netdev_pick_tx in that case). >>> >>> The flow is recorded (processing CPU) in tun_flow_update (TX path), and >>> reset when flow is deleted. >>> >>> Signed-off-by: Tom Herbert <therbert@google.com> >>> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >>> --- >>> drivers/net/tun.c | 37 +++++++++++++++++++++++++++++++++++-- >>> 1 files changed, 35 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c >>> index a17a701..3cf0457 100644 >>> --- a/drivers/net/tun.c >>> +++ b/drivers/net/tun.c >>> @@ -152,6 +152,7 @@ struct tun_flow_entry { >>> struct tun_struct *tun; >>> >>> u32 rxhash; >>> + u32 rps_rxhash; >>> int queue_index; >>> unsigned long updated; >>> }; >>> @@ -220,6 +221,7 @@ static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun, >>> rxhash, queue_index); >>> e->updated = jiffies; >>> e->rxhash = rxhash; >>> + e->rps_rxhash = 0; >>> e->queue_index = queue_index; >>> e->tun = tun; >>> hlist_add_head_rcu(&e->hash_link, head); >>> @@ -232,6 +234,7 @@ static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e) >>> { >>> tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n", >>> e->rxhash, e->queue_index); >>> + sock_rps_reset_flow_hash(e->rps_rxhash); >>> hlist_del_rcu(&e->hash_link); >>> kfree_rcu(e, rcu); >>> --tun->flow_count; >>> @@ -325,6 +328,7 @@ static void tun_flow_update(struct tun_struct *tun, u32 rxhash, >>> /* TODO: keep queueing to old queue until it's empty? */ >>> e->queue_index = queue_index; >>> e->updated = jiffies; >>> + sock_rps_record_flow_hash(e->rps_rxhash); >>> } else { >>> spin_lock_bh(&tun->lock); >>> if (!tun_flow_find(head, rxhash) && >>> @@ -341,6 +345,18 @@ unlock: >>> rcu_read_unlock(); >>> } >>> >>> +/** >>> + * Save the hash received in the stack receive path and update the >>> + * flow_hash table accordingly. >>> + */ >>> +static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash) >>> +{ >>> + if (unlikely(e->rps_rxhash != hash)) { >>> + sock_rps_reset_flow_hash(e->rps_rxhash); >>> + e->rps_rxhash = hash; >>> + } >>> +} >>> + >>> /* We try to identify a flow through its rxhash first. The reason that >>> * we do not check rxq no. is because some cards(e.g 82599), chooses >>> * the rxq based on the txq where the last packet of the flow comes. As >>> @@ -361,9 +377,10 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb) >>> txq = skb_get_hash(skb); >>> if (txq) { >>> e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq); >>> - if (e) >>> + if (e) { >>> txq = e->queue_index; >>> - else >>> + tun_flow_save_rps_rxhash(e, txq); >>> + } else >> This looks wrong, txq is the queue index here. We should do this before >> txq = e->queue_index. > Good catch, will try to post one patch to fix it after the perf tests > are done, thanks. >> Did you test the patch? You can simply verify the basic function by > No so far, be planning to do its perf tests, do you have any good suggestions? Maybe the first step is to check whether it works as expected, as I said whether the rx softirq were done in the expected CPU. Then you can run a complete round of tests (e.g TCP_RR, TCP_STREAM with different message sizes and several sessions in parallel), to see whether it help. You may run the performance test for single queue first to make sure there's no regression first. Thanks >> checking whether the rx softirq were done in the same cpu where vhost is >> running. >>> /* use multiply and shift instead of expensive divide */ >>> txq = ((u64)txq * numqueues) >> 32; >>> } else if (likely(skb_rx_queue_recorded(skb))) { >>> @@ -728,6 +745,22 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) >>> if (txq >= tun->numqueues) >>> goto drop; >>> >>> + if (tun->numqueues == 1) { >>> + /* Select queue was not called for the skbuff, so we extract the >>> + * RPS hash and save it into the flow_table here. >>> + */ >>> + __u32 rxhash; >>> + >>> + rxhash = skb_get_hash(skb); >>> + if (rxhash) { >>> + struct tun_flow_entry *e; >>> + e = tun_flow_find(&tun->flows[tun_hashfn(rxhash)], >>> + rxhash); >>> + if (e) >>> + tun_flow_save_rps_rxhash(e, rxhash); >>> + } >>> + } >>> + >>> tun_debug(KERN_INFO, tun, "tun_net_xmit %d\n", skb->len); >>> >>> BUG_ON(!tfile); > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] tun: Add support for RFS on tun flows 2014-01-02 5:41 ` Jason Wang @ 2014-01-02 6:35 ` Zhi Yong Wu 0 siblings, 0 replies; 15+ messages in thread From: Zhi Yong Wu @ 2014-01-02 6:35 UTC (permalink / raw) To: Jason Wang; +Cc: Tom Herbert, Linux Netdev List, Eric Dumazet, Zhi Yong Wu On Thu, Jan 2, 2014 at 1:41 PM, Jason Wang <jasowang@redhat.com> wrote: > On 01/02/2014 01:07 PM, Zhi Yong Wu wrote: >> On Thu, Jan 2, 2014 at 12:44 PM, Jason Wang <jasowang@redhat.com> wrote: >>> On 12/22/2013 06:54 PM, Zhi Yong Wu wrote: >>>> From: Tom Herbert <therbert@google.com> >>>> >>>> This patch adds support so that the rps_flow_tables (RFS) can be >>>> programmed using the tun flows which are already set up to track flows >>>> for the purposes of queue selection. >>>> >>>> On the receive path (corresponding to select_queue and tun_net_xmit) the >>>> rxhash is saved in the flow_entry. The original code only does flow >>>> lookup in select_queue, so this patch adds a flow lookup in tun_net_xmit >>>> if num_queues == 1 (select_queue is not called from >>>> dev_queue_xmit->netdev_pick_tx in that case). >>>> >>>> The flow is recorded (processing CPU) in tun_flow_update (TX path), and >>>> reset when flow is deleted. >>>> >>>> Signed-off-by: Tom Herbert <therbert@google.com> >>>> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >>>> --- >>>> drivers/net/tun.c | 37 +++++++++++++++++++++++++++++++++++-- >>>> 1 files changed, 35 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c >>>> index a17a701..3cf0457 100644 >>>> --- a/drivers/net/tun.c >>>> +++ b/drivers/net/tun.c >>>> @@ -152,6 +152,7 @@ struct tun_flow_entry { >>>> struct tun_struct *tun; >>>> >>>> u32 rxhash; >>>> + u32 rps_rxhash; >>>> int queue_index; >>>> unsigned long updated; >>>> }; >>>> @@ -220,6 +221,7 @@ static struct tun_flow_entry *tun_flow_create(struct tun_struct *tun, >>>> rxhash, queue_index); >>>> e->updated = jiffies; >>>> e->rxhash = rxhash; >>>> + e->rps_rxhash = 0; >>>> e->queue_index = queue_index; >>>> e->tun = tun; >>>> hlist_add_head_rcu(&e->hash_link, head); >>>> @@ -232,6 +234,7 @@ static void tun_flow_delete(struct tun_struct *tun, struct tun_flow_entry *e) >>>> { >>>> tun_debug(KERN_INFO, tun, "delete flow: hash %u index %u\n", >>>> e->rxhash, e->queue_index); >>>> + sock_rps_reset_flow_hash(e->rps_rxhash); >>>> hlist_del_rcu(&e->hash_link); >>>> kfree_rcu(e, rcu); >>>> --tun->flow_count; >>>> @@ -325,6 +328,7 @@ static void tun_flow_update(struct tun_struct *tun, u32 rxhash, >>>> /* TODO: keep queueing to old queue until it's empty? */ >>>> e->queue_index = queue_index; >>>> e->updated = jiffies; >>>> + sock_rps_record_flow_hash(e->rps_rxhash); >>>> } else { >>>> spin_lock_bh(&tun->lock); >>>> if (!tun_flow_find(head, rxhash) && >>>> @@ -341,6 +345,18 @@ unlock: >>>> rcu_read_unlock(); >>>> } >>>> >>>> +/** >>>> + * Save the hash received in the stack receive path and update the >>>> + * flow_hash table accordingly. >>>> + */ >>>> +static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash) >>>> +{ >>>> + if (unlikely(e->rps_rxhash != hash)) { >>>> + sock_rps_reset_flow_hash(e->rps_rxhash); >>>> + e->rps_rxhash = hash; >>>> + } >>>> +} >>>> + >>>> /* We try to identify a flow through its rxhash first. The reason that >>>> * we do not check rxq no. is because some cards(e.g 82599), chooses >>>> * the rxq based on the txq where the last packet of the flow comes. As >>>> @@ -361,9 +377,10 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb) >>>> txq = skb_get_hash(skb); >>>> if (txq) { >>>> e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq); >>>> - if (e) >>>> + if (e) { >>>> txq = e->queue_index; >>>> - else >>>> + tun_flow_save_rps_rxhash(e, txq); >>>> + } else >>> This looks wrong, txq is the queue index here. We should do this before >>> txq = e->queue_index. >> Good catch, will try to post one patch to fix it after the perf tests >> are done, thanks. >>> Did you test the patch? You can simply verify the basic function by >> No so far, be planning to do its perf tests, do you have any good suggestions? > > Maybe the first step is to check whether it works as expected, as I said > whether the rx softirq were done in the expected CPU. Then you can run a > complete round of tests (e.g TCP_RR, TCP_STREAM with different message > sizes and several sessions in parallel), to see whether it help. You may > run the performance test for single queue first to make sure there's no > regression first. It looks like what i plan to do, anyway, thanks. > > Thanks >>> checking whether the rx softirq were done in the same cpu where vhost is >>> running. >>>> /* use multiply and shift instead of expensive divide */ >>>> txq = ((u64)txq * numqueues) >> 32; >>>> } else if (likely(skb_rx_queue_recorded(skb))) { >>>> @@ -728,6 +745,22 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) >>>> if (txq >= tun->numqueues) >>>> goto drop; >>>> >>>> + if (tun->numqueues == 1) { >>>> + /* Select queue was not called for the skbuff, so we extract the >>>> + * RPS hash and save it into the flow_table here. >>>> + */ >>>> + __u32 rxhash; >>>> + >>>> + rxhash = skb_get_hash(skb); >>>> + if (rxhash) { >>>> + struct tun_flow_entry *e; >>>> + e = tun_flow_find(&tun->flows[tun_hashfn(rxhash)], >>>> + rxhash); >>>> + if (e) >>>> + tun_flow_save_rps_rxhash(e, rxhash); >>>> + } >>>> + } >>>> + >>>> tun_debug(KERN_INFO, tun, "tun_net_xmit %d\n", skb->len); >>>> >>>> BUG_ON(!tfile); >> >> > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2013-12-22 10:54 [PATCH 0/2] tun: add the RFS support Zhi Yong Wu 2013-12-22 10:54 ` [PATCH 1/2] net: Allow setting sock flow hash without a sock Zhi Yong Wu 2013-12-22 10:54 ` [PATCH 2/2] tun: Add support for RFS on tun flows Zhi Yong Wu @ 2013-12-31 18:32 ` David Miller 2013-12-31 22:33 ` Tom Herbert 2 siblings, 1 reply; 15+ messages in thread From: David Miller @ 2013-12-31 18:32 UTC (permalink / raw) To: zwu.kernel; +Cc: therbert, netdev, edumazet, wuzhy From: Zhi Yong Wu <zwu.kernel@gmail.com> Date: Sun, 22 Dec 2013 18:54:30 +0800 > Since Tom Herbert's hash related patchset was modified and got merged, > his pachset about adding support for RFS on tun flows also need to get > adjusted accordingly. I tried to update them, and before i will start > to do some perf tests, i hope to get one correct code base, so it's time > to post them out now. Any constructive comments are welcome, thanks. Series applied to net-next, thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2013-12-31 18:32 ` [PATCH 0/2] tun: add the RFS support David Miller @ 2013-12-31 22:33 ` Tom Herbert 2014-01-01 6:44 ` Zhi Yong Wu 2014-01-13 13:29 ` Zhi Yong Wu 0 siblings, 2 replies; 15+ messages in thread From: Tom Herbert @ 2013-12-31 22:33 UTC (permalink / raw) To: David Miller; +Cc: Zhi Yong Wu, Linux Netdev List, Eric Dumazet, Zhi Yong Wu Zhi, Thanks for following up on these patches. It would still be nice to have performance numbers to show the impact. These will be helpful for the next task of integrating RFS into virtio-net. Tom On Tue, Dec 31, 2013 at 10:32 AM, David Miller <davem@davemloft.net> wrote: > From: Zhi Yong Wu <zwu.kernel@gmail.com> > Date: Sun, 22 Dec 2013 18:54:30 +0800 > >> Since Tom Herbert's hash related patchset was modified and got merged, >> his pachset about adding support for RFS on tun flows also need to get >> adjusted accordingly. I tried to update them, and before i will start >> to do some perf tests, i hope to get one correct code base, so it's time >> to post them out now. Any constructive comments are welcome, thanks. > > Series applied to net-next, thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2013-12-31 22:33 ` Tom Herbert @ 2014-01-01 6:44 ` Zhi Yong Wu 2014-01-02 18:13 ` Tom Herbert 2014-01-13 13:29 ` Zhi Yong Wu 1 sibling, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2014-01-01 6:44 UTC (permalink / raw) To: Tom Herbert; +Cc: David Miller, Linux Netdev List, Eric Dumazet, Zhi Yong Wu On Wed, Jan 1, 2014 at 6:33 AM, Tom Herbert <therbert@google.com> wrote: > Zhi, > > Thanks for following up on these patches. It would still be nice to > have performance numbers to show the impact. These will be helpful for > the next task of integrating RFS into virtio-net. Will try to do them after holiday, if you have any special test cases, please let me know, thanks. > > Tom > > On Tue, Dec 31, 2013 at 10:32 AM, David Miller <davem@davemloft.net> wrote: >> From: Zhi Yong Wu <zwu.kernel@gmail.com> >> Date: Sun, 22 Dec 2013 18:54:30 +0800 >> >>> Since Tom Herbert's hash related patchset was modified and got merged, >>> his pachset about adding support for RFS on tun flows also need to get >>> adjusted accordingly. I tried to update them, and before i will start >>> to do some perf tests, i hope to get one correct code base, so it's time >>> to post them out now. Any constructive comments are welcome, thanks. >> >> Series applied to net-next, thanks. -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2014-01-01 6:44 ` Zhi Yong Wu @ 2014-01-02 18:13 ` Tom Herbert 2014-01-04 10:53 ` Zhi Yong Wu 0 siblings, 1 reply; 15+ messages in thread From: Tom Herbert @ 2014-01-02 18:13 UTC (permalink / raw) To: Zhi Yong Wu Cc: David Miller, Linux Netdev List, Eric Dumazet, Zhi Yong Wu, Jerry Chu +hkchu I believe Jerry Chu was using a little uspace program to turn tun into a network interface so that netperf and others can easily be run. Jerry, can you share how you did that? Thanks, Tom On Tue, Dec 31, 2013 at 10:44 PM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: > On Wed, Jan 1, 2014 at 6:33 AM, Tom Herbert <therbert@google.com> wrote: >> Zhi, >> >> Thanks for following up on these patches. It would still be nice to >> have performance numbers to show the impact. These will be helpful for >> the next task of integrating RFS into virtio-net. > Will try to do them after holiday, if you have any special test cases, > please let me know, thanks. > >> >> Tom >> >> On Tue, Dec 31, 2013 at 10:32 AM, David Miller <davem@davemloft.net> wrote: >>> From: Zhi Yong Wu <zwu.kernel@gmail.com> >>> Date: Sun, 22 Dec 2013 18:54:30 +0800 >>> >>>> Since Tom Herbert's hash related patchset was modified and got merged, >>>> his pachset about adding support for RFS on tun flows also need to get >>>> adjusted accordingly. I tried to update them, and before i will start >>>> to do some perf tests, i hope to get one correct code base, so it's time >>>> to post them out now. Any constructive comments are welcome, thanks. >>> >>> Series applied to net-next, thanks. > > > > -- > Regards, > > Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2014-01-02 18:13 ` Tom Herbert @ 2014-01-04 10:53 ` Zhi Yong Wu 0 siblings, 0 replies; 15+ messages in thread From: Zhi Yong Wu @ 2014-01-04 10:53 UTC (permalink / raw) To: Tom Herbert, Jerry Chu Cc: David Miller, Linux Netdev List, Eric Dumazet, Zhi Yong Wu HI, Jerry Would you like to share it? It is really appreciated if you'd like. Can you let me know your opinion? thanks. On Fri, Jan 3, 2014 at 2:13 AM, Tom Herbert <therbert@google.com> wrote: > +hkchu > > I believe Jerry Chu was using a little uspace program to turn tun into > a network interface so that netperf and others can easily be run. > Jerry, can you share how you did that? > > Thanks, > Tom > > On Tue, Dec 31, 2013 at 10:44 PM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: >> On Wed, Jan 1, 2014 at 6:33 AM, Tom Herbert <therbert@google.com> wrote: >>> Zhi, >>> >>> Thanks for following up on these patches. It would still be nice to >>> have performance numbers to show the impact. These will be helpful for >>> the next task of integrating RFS into virtio-net. >> Will try to do them after holiday, if you have any special test cases, >> please let me know, thanks. >> >>> >>> Tom >>> >>> On Tue, Dec 31, 2013 at 10:32 AM, David Miller <davem@davemloft.net> wrote: >>>> From: Zhi Yong Wu <zwu.kernel@gmail.com> >>>> Date: Sun, 22 Dec 2013 18:54:30 +0800 >>>> >>>>> Since Tom Herbert's hash related patchset was modified and got merged, >>>>> his pachset about adding support for RFS on tun flows also need to get >>>>> adjusted accordingly. I tried to update them, and before i will start >>>>> to do some perf tests, i hope to get one correct code base, so it's time >>>>> to post them out now. Any constructive comments are welcome, thanks. >>>> >>>> Series applied to net-next, thanks. >> >> >> >> -- >> Regards, >> >> Zhi Yong Wu -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2013-12-31 22:33 ` Tom Herbert 2014-01-01 6:44 ` Zhi Yong Wu @ 2014-01-13 13:29 ` Zhi Yong Wu 2014-01-13 16:49 ` Tom Herbert 1 sibling, 1 reply; 15+ messages in thread From: Zhi Yong Wu @ 2014-01-13 13:29 UTC (permalink / raw) To: Tom Herbert; +Cc: David Miller, Linux Netdev List, Eric Dumazet, Zhi Yong Wu On Wed, Jan 1, 2014 at 6:33 AM, Tom Herbert <therbert@google.com> wrote: > Zhi, HI, Tom > > Thanks for following up on these patches. It would still be nice to > have performance numbers to show the impact. These will be helpful for > the next task of integrating RFS into virtio-net. I don't get why RFS need to be integrated into virtio-net since virtio-net has supported mq. Can you give some clue? thanks. > > Tom > > On Tue, Dec 31, 2013 at 10:32 AM, David Miller <davem@davemloft.net> wrote: >> From: Zhi Yong Wu <zwu.kernel@gmail.com> >> Date: Sun, 22 Dec 2013 18:54:30 +0800 >> >>> Since Tom Herbert's hash related patchset was modified and got merged, >>> his pachset about adding support for RFS on tun flows also need to get >>> adjusted accordingly. I tried to update them, and before i will start >>> to do some perf tests, i hope to get one correct code base, so it's time >>> to post them out now. Any constructive comments are welcome, thanks. >> >> Series applied to net-next, thanks. -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2014-01-13 13:29 ` Zhi Yong Wu @ 2014-01-13 16:49 ` Tom Herbert 2014-01-14 3:28 ` Zhi Yong Wu 0 siblings, 1 reply; 15+ messages in thread From: Tom Herbert @ 2014-01-13 16:49 UTC (permalink / raw) To: Zhi Yong Wu; +Cc: David Miller, Linux Netdev List, Eric Dumazet, Zhi Yong Wu On Mon, Jan 13, 2014 at 5:29 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: > On Wed, Jan 1, 2014 at 6:33 AM, Tom Herbert <therbert@google.com> wrote: >> Zhi, > HI, Tom >> >> Thanks for following up on these patches. It would still be nice to >> have performance numbers to show the impact. These will be helpful for >> the next task of integrating RFS into virtio-net. > I don't get why RFS need to be integrated into virtio-net since > virtio-net has supported mq. Can you give some clue? thanks. > In this case RFS would be the mechanism for selecting the RX queue. We can either use the tun approach which is to match the RX queue to the TX queue for a flow, or expose it as accelerated RFS to the guest. Fine grained control of the queue selection should have value. >> >> Tom >> >> On Tue, Dec 31, 2013 at 10:32 AM, David Miller <davem@davemloft.net> wrote: >>> From: Zhi Yong Wu <zwu.kernel@gmail.com> >>> Date: Sun, 22 Dec 2013 18:54:30 +0800 >>> >>>> Since Tom Herbert's hash related patchset was modified and got merged, >>>> his pachset about adding support for RFS on tun flows also need to get >>>> adjusted accordingly. I tried to update them, and before i will start >>>> to do some perf tests, i hope to get one correct code base, so it's time >>>> to post them out now. Any constructive comments are welcome, thanks. >>> >>> Series applied to net-next, thanks. > > > > -- > Regards, > > Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] tun: add the RFS support 2014-01-13 16:49 ` Tom Herbert @ 2014-01-14 3:28 ` Zhi Yong Wu 0 siblings, 0 replies; 15+ messages in thread From: Zhi Yong Wu @ 2014-01-14 3:28 UTC (permalink / raw) To: Tom Herbert; +Cc: David Miller, Linux Netdev List, Eric Dumazet, Zhi Yong Wu On Tue, Jan 14, 2014 at 12:49 AM, Tom Herbert <therbert@google.com> wrote: > On Mon, Jan 13, 2014 at 5:29 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: >> On Wed, Jan 1, 2014 at 6:33 AM, Tom Herbert <therbert@google.com> wrote: >>> Zhi, >> HI, Tom >>> >>> Thanks for following up on these patches. It would still be nice to >>> have performance numbers to show the impact. These will be helpful for >>> the next task of integrating RFS into virtio-net. >> I don't get why RFS need to be integrated into virtio-net since >> virtio-net has supported mq. Can you give some clue? thanks. >> > In this case RFS would be the mechanism for selecting the RX queue. We > can either use the tun approach which is to match the RX queue to the > TX queue for a flow, or expose it as accelerated RFS to the guest. > Fine grained control of the queue selection should have value. Thanks for your explanation. > >>> >>> Tom >>> >>> On Tue, Dec 31, 2013 at 10:32 AM, David Miller <davem@davemloft.net> wrote: >>>> From: Zhi Yong Wu <zwu.kernel@gmail.com> >>>> Date: Sun, 22 Dec 2013 18:54:30 +0800 >>>> >>>>> Since Tom Herbert's hash related patchset was modified and got merged, >>>>> his pachset about adding support for RFS on tun flows also need to get >>>>> adjusted accordingly. I tried to update them, and before i will start >>>>> to do some perf tests, i hope to get one correct code base, so it's time >>>>> to post them out now. Any constructive comments are welcome, thanks. >>>> >>>> Series applied to net-next, thanks. >> >> >> >> -- >> Regards, >> >> Zhi Yong Wu -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2014-01-14 3:28 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-12-22 10:54 [PATCH 0/2] tun: add the RFS support Zhi Yong Wu 2013-12-22 10:54 ` [PATCH 1/2] net: Allow setting sock flow hash without a sock Zhi Yong Wu 2013-12-22 10:54 ` [PATCH 2/2] tun: Add support for RFS on tun flows Zhi Yong Wu 2014-01-02 4:44 ` Jason Wang 2014-01-02 5:07 ` Zhi Yong Wu 2014-01-02 5:41 ` Jason Wang 2014-01-02 6:35 ` Zhi Yong Wu 2013-12-31 18:32 ` [PATCH 0/2] tun: add the RFS support David Miller 2013-12-31 22:33 ` Tom Herbert 2014-01-01 6:44 ` Zhi Yong Wu 2014-01-02 18:13 ` Tom Herbert 2014-01-04 10:53 ` Zhi Yong Wu 2014-01-13 13:29 ` Zhi Yong Wu 2014-01-13 16:49 ` Tom Herbert 2014-01-14 3:28 ` Zhi Yong Wu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).