* [PATCH] netpoll: allow execution of multiple rx_hooks per interface @ 2010-01-06 20:54 Daniel Borkmann 2010-01-07 3:54 ` Matt Mackall 0 siblings, 1 reply; 13+ messages in thread From: Daniel Borkmann @ 2010-01-06 20:54 UTC (permalink / raw) To: linux-kernel; +Cc: Jeff Moyer, netdev, netdev, David Miller [-- Attachment #1.1: Type: text/plain, Size: 578 bytes --] Hi, this patch allows the registration and _execution_ of multiple netpoll rx_hooks per interface. Currently, it is possible to register multiple netpoll structures to one interface, _but_ only one single rx_hook (from the netpoll struct that has been registered last) can be executed, which was an oversight in the implementation [1]. So, this patch fixes it. I've sucessfully tested it within 2.6.32.2 with the registration of multiple rx_hook clients for several times. I'd appreciate comments / feedback. Thanks, Daniel [1] http://lwn.net/Articles/140852/ [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1.2: netpoll_multiple_rx_hooks.patch --] [-- Type: text/x-patch; name="netpoll_multiple_rx_hooks.patch", Size: 4264 bytes --] Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com> diff -Nur a/include/linux/netpoll.h b/include/linux/netpoll.h --- a/include/linux/netpoll.h 2010-01-05 23:52:58.000000000 +0100 +++ b/include/linux/netpoll.h 2010-01-06 00:12:54.000000000 +0100 @@ -21,6 +21,7 @@ __be32 local_ip, remote_ip; u16 local_port, remote_port; u8 remote_mac[ETH_ALEN]; + struct netpoll *next; /* rx list for net device */ }; struct netpoll_info { diff -Nur a/net/core/netpoll.c b/net/core/netpoll.c --- a/net/core/netpoll.c 2010-01-05 23:53:07.000000000 +0100 +++ b/net/core/netpoll.c 2010-01-06 20:44:59.000000000 +0100 @@ -493,13 +493,13 @@ int __netpoll_rx(struct sk_buff *skb) { - int proto, len, ulen; + int proto, len, ulen, hits; struct iphdr *iph; struct udphdr *uh; struct netpoll_info *npi = skb->dev->npinfo; - struct netpoll *np = npi->rx_np; + struct netpoll **np = &npi->rx_np; - if (!np) + if (!(*np)) goto out; if (skb->dev->type != ARPHRD_ETHER) goto out; @@ -551,16 +551,23 @@ goto out; if (checksum_udp(skb, uh, ulen, iph->saddr, iph->daddr)) goto out; - if (np->local_ip && np->local_ip != iph->daddr) - goto out; - if (np->remote_ip && np->remote_ip != iph->saddr) - goto out; - if (np->local_port && np->local_port != ntohs(uh->dest)) - goto out; - np->rx_hook(np, ntohs(uh->source), - (char *)(uh+1), - ulen - sizeof(struct udphdr)); + for (hits = 0; (*np) != NULL; np = &((*np)->next)) { + if ((*np)->local_ip && (*np)->local_ip != iph->daddr) + continue; + if ((*np)->remote_ip && (*np)->remote_ip != iph->saddr) + continue; + if ((*np)->local_port && (*np)->local_port != ntohs(uh->dest)) + continue; + + (*np)->rx_hook((*np), ntohs(uh->source), + (char *)(uh+1), + ulen - sizeof(struct udphdr)); + hits++; + } + + if (!hits) + goto out; kfree_skb(skb); return 1; @@ -679,6 +686,45 @@ return -1; } +static int netpoll_rx_list_register(struct netpoll **np_head, + struct netpoll *np) +{ + while ((*np_head) != NULL) { + if ((*np_head) == np) + return -EEXIST; + np_head = &((*np_head)->next); + } + + np->next = *np_head; + rcu_assign_pointer(*np_head, np); + return 0; +} + +static int netpoll_rx_list_unregister(struct netpoll **np_head, + struct netpoll *np) +{ + while ((*np_head) != NULL) { + if ((*np_head) == np) { + rcu_assign_pointer(*np_head, np->next); + return 0; + } + np_head = &((*np_head)->next); + } + return -ENOENT; +} + +static void netpoll_rx_list_nullify(struct netpoll **np_head) +{ + struct netpoll *np = NULL; + + while ((*np_head) != NULL) { + np = (*np_head); + np_head = &((*np_head)->next); + np->dev = NULL; + np->next = NULL; + } +} + int netpoll_setup(struct netpoll *np) { struct net_device *ndev = NULL; @@ -695,6 +741,7 @@ return -ENODEV; } + np->next = NULL; np->dev = ndev; if (!ndev->npinfo) { npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL); @@ -785,7 +832,7 @@ if (np->rx_hook) { spin_lock_irqsave(&npinfo->rx_lock, flags); npinfo->rx_flags |= NETPOLL_RX_ENABLED; - npinfo->rx_np = np; + netpoll_rx_list_register(&npinfo->rx_np, np); spin_unlock_irqrestore(&npinfo->rx_lock, flags); } @@ -801,9 +848,14 @@ return 0; release: - if (!ndev->npinfo) + if (!ndev->npinfo) { + spin_lock_irqsave(&npinfo->rx_lock, flags); + netpoll_rx_list_nullify(&npinfo->rx_np); + spin_unlock_irqrestore(&npinfo->rx_lock, flags); + kfree(npinfo); - np->dev = NULL; + } + dev_put(ndev); return err; } @@ -823,10 +875,11 @@ if (np->dev) { npinfo = np->dev->npinfo; if (npinfo) { - if (npinfo->rx_np == np) { + if (npinfo->rx_np) { spin_lock_irqsave(&npinfo->rx_lock, flags); - npinfo->rx_np = NULL; - npinfo->rx_flags &= ~NETPOLL_RX_ENABLED; + netpoll_rx_list_unregister(&npinfo->rx_np, np); + if (!npinfo->rx_np) + npinfo->rx_flags &= ~NETPOLL_RX_ENABLED; spin_unlock_irqrestore(&npinfo->rx_lock, flags); } @@ -845,7 +898,10 @@ dev_put(np->dev); } + np->next = NULL; np->dev = NULL; + + synchronize_rcu(); } int netpoll_trap(void) [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 261 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-06 20:54 [PATCH] netpoll: allow execution of multiple rx_hooks per interface Daniel Borkmann @ 2010-01-07 3:54 ` Matt Mackall 2010-01-07 9:02 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Matt Mackall @ 2010-01-07 3:54 UTC (permalink / raw) To: Daniel Borkmann; +Cc: linux-kernel, Jeff Moyer, netdev, netdev, David Miller On Wed, 2010-01-06 at 21:54 +0100, Daniel Borkmann wrote: > Hi, > > this patch allows the registration and _execution_ of multiple netpoll > rx_hooks per interface. Currently, it is possible to register multiple > netpoll structures to one interface, _but_ only one single rx_hook (from > the netpoll struct that has been registered last) can be executed, which > was an oversight in the implementation [1]. > So, this patch fixes it. I've sucessfully tested it within 2.6.32.2 with > the registration of multiple rx_hook clients for several times. I'd > appreciate comments / feedback. (grumbles about cc:) Please inline patches so they can be reviewed easily in reply. - struct netpoll *np = npi->rx_np; + struct netpoll **np = &npi->rx_np; - if (!np) + if (!(*np)) This makes everything horrible. Can you avoid the double indirection? Using a list head might be a good answer. -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-07 3:54 ` Matt Mackall @ 2010-01-07 9:02 ` David Miller 2010-01-07 19:06 ` Daniel Borkmann 0 siblings, 1 reply; 13+ messages in thread From: David Miller @ 2010-01-07 9:02 UTC (permalink / raw) To: mpm; +Cc: danborkmann, linux-kernel, jmoyer, netdev, netdev From: Matt Mackall <mpm@selenic.com> Date: Wed, 06 Jan 2010 21:54:05 -0600 > Please inline patches so they can be reviewed easily in reply. > > > - struct netpoll *np = npi->rx_np; > + struct netpoll **np = &npi->rx_np; > > - if (!np) > + if (!(*np)) > > This makes everything horrible. Can you avoid the double indirection? > Using a list head might be a good answer. > Agreed on all counts. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-07 9:02 ` David Miller @ 2010-01-07 19:06 ` Daniel Borkmann 2010-01-08 0:20 ` Daniel Borkmann 0 siblings, 1 reply; 13+ messages in thread From: Daniel Borkmann @ 2010-01-07 19:06 UTC (permalink / raw) To: David Miller; +Cc: mpm, danborkmann, linux-kernel, jmoyer, netdev, netdev [-- Attachment #1: Type: text/plain, Size: 897 bytes --] David Miller wrote: > From: Matt Mackall <mpm@selenic.com> > Date: Wed, 06 Jan 2010 21:54:05 -0600 > >> Please inline patches so they can be reviewed easily in reply. >> >> >> - struct netpoll *np = npi->rx_np; >> + struct netpoll **np = &npi->rx_np; >> >> - if (!np) >> + if (!(*np)) >> >> This makes everything horrible. Can you avoid the double indirection? >> Using a list head might be a good answer. >> > > Agreed on all counts. > Agreed on the double indirection, I'll fix it. I've already considered the list_head structure, but then I was the opinion that a double linked list might not be necessary for this, so I did it that way ... (compare: kernel notifier by Alan Cox). If you insist on that I'll fix it of course ;) Best regards, Daniel P.s.: Sorry Matt for not CCing. I mainly took those addresses from Jeffs post. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 261 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-07 19:06 ` Daniel Borkmann @ 2010-01-08 0:20 ` Daniel Borkmann 2010-01-11 23:21 ` Matt Mackall 0 siblings, 1 reply; 13+ messages in thread From: Daniel Borkmann @ 2010-01-08 0:20 UTC (permalink / raw) To: David Miller; +Cc: mpm, linux-kernel, jmoyer, netdev, netdev [-- Attachment #1: Type: text/plain, Size: 10281 bytes --] Daniel Borkmann wrote: > David Miller wrote: >> From: Matt Mackall <mpm@selenic.com> >> Date: Wed, 06 Jan 2010 21:54:05 -0600 >> >>> Please inline patches so they can be reviewed easily in reply. >>> >>> >>> - struct netpoll *np = npi->rx_np; >>> + struct netpoll **np = &npi->rx_np; >>> >>> - if (!np) >>> + if (!(*np)) >>> >>> This makes everything horrible. Can you avoid the double indirection? >>> Using a list head might be a good answer. >>> >> Agreed on all counts. >> > > Agreed on the double indirection, I'll fix it. > > I've already considered the list_head structure, but then I was the opinion > that a double linked list might not be necessary for this, so I did it that > way ... (compare: kernel notifier by Alan Cox). If you insist on that I'll > fix it of course ;) So, here's the list head implementation. Tested on both of my machines with several rx_hook clients. Best regards, Daniel Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com> diff -Nur a/include/linux/netpoll.h b/include/linux/netpoll.h --- a/include/linux/netpoll.h 2010-01-05 23:52:58.000000000 +0100 +++ b/include/linux/netpoll.h 2010-01-07 23:19:25.000000000 +0100 @@ -21,15 +21,20 @@ __be32 local_ip, remote_ip; u16 local_port, remote_port; u8 remote_mac[ETH_ALEN]; + + struct list_head rx; /* rx_np list element */ }; struct netpoll_info { atomic_t refcnt; + int rx_flags; spinlock_t rx_lock; - struct netpoll *rx_np; /* netpoll that registered an rx_hook */ + struct list_head rx_np; /* netpolls that registered an rx_hook */ + struct sk_buff_head arp_tx; /* list of arp requests to reply to */ struct sk_buff_head txq; + struct delayed_work tx_work; }; @@ -51,7 +56,7 @@ unsigned long flags; int ret = 0; - if (!npinfo || (!npinfo->rx_np && !npinfo->rx_flags)) + if (!npinfo || (list_empty(&npinfo->rx_np) && !npinfo->rx_flags)) return 0; spin_lock_irqsave(&npinfo->rx_lock, flags); @@ -67,7 +72,7 @@ { struct netpoll_info *npinfo = skb->dev->npinfo; - return npinfo && (npinfo->rx_np || npinfo->rx_flags); + return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags); } static inline int netpoll_receive_skb(struct sk_buff *skb) diff -Nur a/net/core/netpoll.c b/net/core/netpoll.c --- a/net/core/netpoll.c 2010-01-05 23:53:07.000000000 +0100 +++ b/net/core/netpoll.c 2010-01-08 00:59:19.000000000 +0100 @@ -407,107 +407,119 @@ __be32 sip, tip; unsigned char *sha; struct sk_buff *send_skb; - struct netpoll *np = NULL; - - if (npinfo->rx_np && npinfo->rx_np->dev == skb->dev) - np = npinfo->rx_np; - if (!np) - return; - - /* No arp on this interface */ - if (skb->dev->flags & IFF_NOARP) - return; + struct netpoll *np, *tmp; + unsigned long flags; - if (!pskb_may_pull(skb, arp_hdr_len(skb->dev))) + if (list_empty(&npinfo->rx_np)) return; - skb_reset_network_header(skb); - skb_reset_transport_header(skb); - arp = arp_hdr(skb); - - if ((arp->ar_hrd != htons(ARPHRD_ETHER) && - arp->ar_hrd != htons(ARPHRD_IEEE802)) || - arp->ar_pro != htons(ETH_P_IP) || - arp->ar_op != htons(ARPOP_REQUEST)) - return; + spin_lock_irqsave(&npinfo->rx_lock, flags); + list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) { + if (np->dev != skb->dev) + continue; + + /* No arp on this interface */ + if (skb->dev->flags & IFF_NOARP) + continue; + + if (!pskb_may_pull(skb, arp_hdr_len(skb->dev))) + continue; + + skb_reset_network_header(skb); + skb_reset_transport_header(skb); + arp = arp_hdr(skb); + + if ((arp->ar_hrd != htons(ARPHRD_ETHER) && + arp->ar_hrd != htons(ARPHRD_IEEE802)) || + arp->ar_pro != htons(ETH_P_IP) || + arp->ar_op != htons(ARPOP_REQUEST)) + continue; + + arp_ptr = (unsigned char *)(arp+1); + /* save the location of the src hw addr */ + sha = arp_ptr; + arp_ptr += skb->dev->addr_len; + memcpy(&sip, arp_ptr, 4); + arp_ptr += 4; + /* + * if we actually cared about dst hw addr, + * it would get copied here + */ + arp_ptr += skb->dev->addr_len; + memcpy(&tip, arp_ptr, 4); - arp_ptr = (unsigned char *)(arp+1); - /* save the location of the src hw addr */ - sha = arp_ptr; - arp_ptr += skb->dev->addr_len; - memcpy(&sip, arp_ptr, 4); - arp_ptr += 4; - /* if we actually cared about dst hw addr, it would get copied here */ - arp_ptr += skb->dev->addr_len; - memcpy(&tip, arp_ptr, 4); - - /* Should we ignore arp? */ - if (tip != np->local_ip || - ipv4_is_loopback(tip) || ipv4_is_multicast(tip)) - return; + /* Should we ignore arp? */ + if (tip != np->local_ip || + ipv4_is_loopback(tip) || ipv4_is_multicast(tip)) + continue; + + size = arp_hdr_len(skb->dev); + send_skb = find_skb(np, size + LL_ALLOCATED_SPACE(np->dev), + LL_RESERVED_SPACE(np->dev)); + + if (!send_skb) + continue; + + skb_reset_network_header(send_skb); + arp = (struct arphdr *) skb_put(send_skb, size); + send_skb->dev = skb->dev; + send_skb->protocol = htons(ETH_P_ARP); + + /* Fill the device header for the ARP frame */ + if (dev_hard_header(send_skb, skb->dev, ptype, + sha, np->dev->dev_addr, + send_skb->len) < 0) { + kfree_skb(send_skb); + continue; + } - size = arp_hdr_len(skb->dev); - send_skb = find_skb(np, size + LL_ALLOCATED_SPACE(np->dev), - LL_RESERVED_SPACE(np->dev)); + /* + * Fill out the arp protocol part. + * + * we only support ethernet device type, + * which (according to RFC 1390) should + * always equal 1 (Ethernet). + */ - if (!send_skb) - return; + arp->ar_hrd = htons(np->dev->type); + arp->ar_pro = htons(ETH_P_IP); + arp->ar_hln = np->dev->addr_len; + arp->ar_pln = 4; + arp->ar_op = htons(type); + + arp_ptr = (unsigned char *)(arp + 1); + memcpy(arp_ptr, np->dev->dev_addr, np->dev->addr_len); + arp_ptr += np->dev->addr_len; + memcpy(arp_ptr, &tip, 4); + arp_ptr += 4; + memcpy(arp_ptr, sha, np->dev->addr_len); + arp_ptr += np->dev->addr_len; + memcpy(arp_ptr, &sip, 4); - skb_reset_network_header(send_skb); - arp = (struct arphdr *) skb_put(send_skb, size); - send_skb->dev = skb->dev; - send_skb->protocol = htons(ETH_P_ARP); - - /* Fill the device header for the ARP frame */ - if (dev_hard_header(send_skb, skb->dev, ptype, - sha, np->dev->dev_addr, - send_skb->len) < 0) { - kfree_skb(send_skb); - return; + netpoll_send_skb(np, send_skb); } - - /* - * Fill out the arp protocol part. - * - * we only support ethernet device type, - * which (according to RFC 1390) should always equal 1 (Ethernet). - */ - - arp->ar_hrd = htons(np->dev->type); - arp->ar_pro = htons(ETH_P_IP); - arp->ar_hln = np->dev->addr_len; - arp->ar_pln = 4; - arp->ar_op = htons(type); - - arp_ptr=(unsigned char *)(arp + 1); - memcpy(arp_ptr, np->dev->dev_addr, np->dev->addr_len); - arp_ptr += np->dev->addr_len; - memcpy(arp_ptr, &tip, 4); - arp_ptr += 4; - memcpy(arp_ptr, sha, np->dev->addr_len); - arp_ptr += np->dev->addr_len; - memcpy(arp_ptr, &sip, 4); - - netpoll_send_skb(np, send_skb); + spin_unlock_irqrestore(&npinfo->rx_lock, flags); } int __netpoll_rx(struct sk_buff *skb) { int proto, len, ulen; + int hits = 0; struct iphdr *iph; struct udphdr *uh; - struct netpoll_info *npi = skb->dev->npinfo; - struct netpoll *np = npi->rx_np; + struct netpoll_info *npinfo = skb->dev->npinfo; + struct netpoll *np, *tmp; - if (!np) + if (list_empty(&npinfo->rx_np)) goto out; + if (skb->dev->type != ARPHRD_ETHER) goto out; /* check if netpoll clients need ARP */ if (skb->protocol == htons(ETH_P_ARP) && atomic_read(&trapped)) { - skb_queue_tail(&npi->arp_tx, skb); + skb_queue_tail(&npinfo->arp_tx, skb); return 1; } @@ -551,16 +563,23 @@ goto out; if (checksum_udp(skb, uh, ulen, iph->saddr, iph->daddr)) goto out; - if (np->local_ip && np->local_ip != iph->daddr) - goto out; - if (np->remote_ip && np->remote_ip != iph->saddr) - goto out; - if (np->local_port && np->local_port != ntohs(uh->dest)) - goto out; - np->rx_hook(np, ntohs(uh->source), - (char *)(uh+1), - ulen - sizeof(struct udphdr)); + list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) { + if (np->local_ip && np->local_ip != iph->daddr) + continue; + if (np->remote_ip && np->remote_ip != iph->saddr) + continue; + if (np->local_port && np->local_port != ntohs(uh->dest)) + continue; + + np->rx_hook(np, ntohs(uh->source), + (char *)(uh+1), + ulen - sizeof(struct udphdr)); + hits++; + } + + if (!hits) + goto out; kfree_skb(skb); return 1; @@ -684,6 +703,7 @@ struct net_device *ndev = NULL; struct in_device *in_dev; struct netpoll_info *npinfo; + struct netpoll *npe, *tmp; unsigned long flags; int err; @@ -704,7 +724,7 @@ } npinfo->rx_flags = 0; - npinfo->rx_np = NULL; + INIT_LIST_HEAD(&npinfo->rx_np); spin_lock_init(&npinfo->rx_lock); skb_queue_head_init(&npinfo->arp_tx); @@ -785,7 +805,7 @@ if (np->rx_hook) { spin_lock_irqsave(&npinfo->rx_lock, flags); npinfo->rx_flags |= NETPOLL_RX_ENABLED; - npinfo->rx_np = np; + list_add_tail(&np->rx, &npinfo->rx_np); spin_unlock_irqrestore(&npinfo->rx_lock, flags); } @@ -801,9 +821,16 @@ return 0; release: - if (!ndev->npinfo) + if (!ndev->npinfo) { + spin_lock_irqsave(&npinfo->rx_lock, flags); + list_for_each_entry_safe(npe, tmp, &npinfo->rx_np, rx) { + npe->dev = NULL; + } + spin_unlock_irqrestore(&npinfo->rx_lock, flags); + kfree(npinfo); - np->dev = NULL; + } + dev_put(ndev); return err; } @@ -823,10 +850,11 @@ if (np->dev) { npinfo = np->dev->npinfo; if (npinfo) { - if (npinfo->rx_np == np) { + if (!list_empty(&npinfo->rx_np)) { spin_lock_irqsave(&npinfo->rx_lock, flags); - npinfo->rx_np = NULL; - npinfo->rx_flags &= ~NETPOLL_RX_ENABLED; + list_del(&np->rx); + if (list_empty(&npinfo->rx_np)) + npinfo->rx_flags &= ~NETPOLL_RX_ENABLED; spin_unlock_irqrestore(&npinfo->rx_lock, flags); } [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 261 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-08 0:20 ` Daniel Borkmann @ 2010-01-11 23:21 ` Matt Mackall 2010-01-11 23:59 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Matt Mackall @ 2010-01-11 23:21 UTC (permalink / raw) To: Daniel Borkmann; +Cc: David Miller, linux-kernel, jmoyer, netdev, netdev On Fri, 2010-01-08 at 01:20 +0100, Daniel Borkmann wrote: > Daniel Borkmann wrote: > > David Miller wrote: > >> From: Matt Mackall <mpm@selenic.com> > >> Date: Wed, 06 Jan 2010 21:54:05 -0600 > >> > >>> Please inline patches so they can be reviewed easily in reply. > >>> > >>> > >>> - struct netpoll *np = npi->rx_np; > >>> + struct netpoll **np = &npi->rx_np; > >>> > >>> - if (!np) > >>> + if (!(*np)) > >>> > >>> This makes everything horrible. Can you avoid the double indirection? > >>> Using a list head might be a good answer. > >>> > >> Agreed on all counts. > >> > > > > Agreed on the double indirection, I'll fix it. > > > > I've already considered the list_head structure, but then I was the opinion > > that a double linked list might not be necessary for this, so I did it that > > way ... (compare: kernel notifier by Alan Cox). If you insist on that I'll > > fix it of course ;) > > So, here's the list head implementation. Tested on both of my machines with several > rx_hook clients. Looks pretty good. Dave? Acked-by: Matt Mackall <mpm@selenic.com> -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-11 23:21 ` Matt Mackall @ 2010-01-11 23:59 ` David Miller 2010-01-12 0:03 ` Matt Mackall 0 siblings, 1 reply; 13+ messages in thread From: David Miller @ 2010-01-11 23:59 UTC (permalink / raw) To: mpm; +Cc: danborkmann, linux-kernel, jmoyer, netdev, netdev From: Matt Mackall <mpm@selenic.com> Date: Mon, 11 Jan 2010 17:21:48 -0600 > Looks pretty good. Dave? > > Acked-by: Matt Mackall <mpm@selenic.com> I don't like the loop for RX ARP processing. The packet contents aren't going to change, so doing basic packet validation inside of the "for each RX client" loop of arp_reply() doesn't make any sense. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-11 23:59 ` David Miller @ 2010-01-12 0:03 ` Matt Mackall 2010-01-12 0:09 ` Daniel Borkmann 0 siblings, 1 reply; 13+ messages in thread From: Matt Mackall @ 2010-01-12 0:03 UTC (permalink / raw) To: David Miller; +Cc: danborkmann, linux-kernel, jmoyer, netdev, netdev On Mon, 2010-01-11 at 15:59 -0800, David Miller wrote: > From: Matt Mackall <mpm@selenic.com> > Date: Mon, 11 Jan 2010 17:21:48 -0600 > > > Looks pretty good. Dave? > > > > Acked-by: Matt Mackall <mpm@selenic.com> > > I don't like the loop for RX ARP processing. > > The packet contents aren't going to change, so doing basic > packet validation inside of the "for each RX client" loop > of arp_reply() doesn't make any sense. True. Dan, please help our poor compilers with some manual loop invariant motion. -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-12 0:03 ` Matt Mackall @ 2010-01-12 0:09 ` Daniel Borkmann 2010-01-13 0:27 ` Daniel Borkmann 0 siblings, 1 reply; 13+ messages in thread From: Daniel Borkmann @ 2010-01-12 0:09 UTC (permalink / raw) To: Matt Mackall Cc: David Miller, danborkmann, linux-kernel, jmoyer, netdev, netdev [-- Attachment #1: Type: text/plain, Size: 650 bytes --] Matt Mackall wrote: > On Mon, 2010-01-11 at 15:59 -0800, David Miller wrote: >> From: Matt Mackall <mpm@selenic.com> >> Date: Mon, 11 Jan 2010 17:21:48 -0600 >> >>> Looks pretty good. Dave? >>> >>> Acked-by: Matt Mackall <mpm@selenic.com> >> I don't like the loop for RX ARP processing. >> >> The packet contents aren't going to change, so doing basic >> packet validation inside of the "for each RX client" loop >> of arp_reply() doesn't make any sense. > > True. Dan, please help our poor compilers with some manual loop > invariant motion. Okay, true. I'll fix this by tomorrow and resend the patch. Best regards, Daniel [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 261 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-12 0:09 ` Daniel Borkmann @ 2010-01-13 0:27 ` Daniel Borkmann 2010-01-13 13:57 ` Jeff Moyer 2010-01-14 4:41 ` David Miller 0 siblings, 2 replies; 13+ messages in thread From: Daniel Borkmann @ 2010-01-13 0:27 UTC (permalink / raw) To: Matt Mackall Cc: David Miller, danborkmann, linux-kernel, jmoyer, netdev, netdev [-- Attachment #1: Type: text/plain, Size: 9308 bytes --] Daniel Borkmann wrote: > Matt Mackall wrote: >> On Mon, 2010-01-11 at 15:59 -0800, David Miller wrote: >>> From: Matt Mackall <mpm@selenic.com> >>> Date: Mon, 11 Jan 2010 17:21:48 -0600 >>> >>>> Looks pretty good. Dave? >>>> >>>> Acked-by: Matt Mackall <mpm@selenic.com> >>> I don't like the loop for RX ARP processing. >>> >>> The packet contents aren't going to change, so doing basic >>> packet validation inside of the "for each RX client" loop >>> of arp_reply() doesn't make any sense. >> True. Dan, please help our poor compilers with some manual loop >> invariant motion. > > Okay, true. I'll fix this by tomorrow and resend the patch. Here the fix of the RX ARP processing routine. Content that isn't going to change is out-of-loop. Successfully tested on my machines. Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com> diff -Nur a/include/linux/netpoll.h b/include/linux/netpoll.h --- a/include/linux/netpoll.h 2010-01-05 23:52:58.000000000 +0100 +++ b/include/linux/netpoll.h 2010-01-07 23:19:25.000000000 +0100 @@ -21,15 +21,20 @@ __be32 local_ip, remote_ip; u16 local_port, remote_port; u8 remote_mac[ETH_ALEN]; + + struct list_head rx; /* rx_np list element */ }; struct netpoll_info { atomic_t refcnt; + int rx_flags; spinlock_t rx_lock; - struct netpoll *rx_np; /* netpoll that registered an rx_hook */ + struct list_head rx_np; /* netpolls that registered an rx_hook */ + struct sk_buff_head arp_tx; /* list of arp requests to reply to */ struct sk_buff_head txq; + struct delayed_work tx_work; }; @@ -51,7 +56,7 @@ unsigned long flags; int ret = 0; - if (!npinfo || (!npinfo->rx_np && !npinfo->rx_flags)) + if (!npinfo || (list_empty(&npinfo->rx_np) && !npinfo->rx_flags)) return 0; spin_lock_irqsave(&npinfo->rx_lock, flags); @@ -67,7 +72,7 @@ { struct netpoll_info *npinfo = skb->dev->npinfo; - return npinfo && (npinfo->rx_np || npinfo->rx_flags); + return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags); } static inline int netpoll_receive_skb(struct sk_buff *skb) diff -Nur a/net/core/netpoll.c b/net/core/netpoll.c --- a/net/core/netpoll.c 2010-01-05 23:53:07.000000000 +0100 +++ b/net/core/netpoll.c 2010-01-12 23:36:48.000000000 +0100 @@ -407,11 +407,24 @@ __be32 sip, tip; unsigned char *sha; struct sk_buff *send_skb; - struct netpoll *np = NULL; + struct netpoll *np, *tmp; + unsigned long flags; + int hits = 0; + + if (list_empty(&npinfo->rx_np)) + return; + + /* Before checking the packet, we do some early + inspection whether this is interesting at all */ + spin_lock_irqsave(&npinfo->rx_lock, flags); + list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) { + if (np->dev == skb->dev) + hits++; + } + spin_unlock_irqrestore(&npinfo->rx_lock, flags); - if (npinfo->rx_np && npinfo->rx_np->dev == skb->dev) - np = npinfo->rx_np; - if (!np) + /* No netpoll struct is using this dev */ + if (!hits) return; /* No arp on this interface */ @@ -437,77 +450,91 @@ arp_ptr += skb->dev->addr_len; memcpy(&sip, arp_ptr, 4); arp_ptr += 4; - /* if we actually cared about dst hw addr, it would get copied here */ + /* If we actually cared about dst hw addr, + it would get copied here */ arp_ptr += skb->dev->addr_len; memcpy(&tip, arp_ptr, 4); /* Should we ignore arp? */ - if (tip != np->local_ip || - ipv4_is_loopback(tip) || ipv4_is_multicast(tip)) + if (ipv4_is_loopback(tip) || ipv4_is_multicast(tip)) return; size = arp_hdr_len(skb->dev); - send_skb = find_skb(np, size + LL_ALLOCATED_SPACE(np->dev), - LL_RESERVED_SPACE(np->dev)); - - if (!send_skb) - return; - - skb_reset_network_header(send_skb); - arp = (struct arphdr *) skb_put(send_skb, size); - send_skb->dev = skb->dev; - send_skb->protocol = htons(ETH_P_ARP); - - /* Fill the device header for the ARP frame */ - if (dev_hard_header(send_skb, skb->dev, ptype, - sha, np->dev->dev_addr, - send_skb->len) < 0) { - kfree_skb(send_skb); - return; - } - /* - * Fill out the arp protocol part. - * - * we only support ethernet device type, - * which (according to RFC 1390) should always equal 1 (Ethernet). - */ + spin_lock_irqsave(&npinfo->rx_lock, flags); + list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) { + if (tip != np->local_ip) + continue; + + send_skb = find_skb(np, size + LL_ALLOCATED_SPACE(np->dev), + LL_RESERVED_SPACE(np->dev)); + if (!send_skb) + continue; + + skb_reset_network_header(send_skb); + arp = (struct arphdr *) skb_put(send_skb, size); + send_skb->dev = skb->dev; + send_skb->protocol = htons(ETH_P_ARP); + + /* Fill the device header for the ARP frame */ + if (dev_hard_header(send_skb, skb->dev, ptype, + sha, np->dev->dev_addr, + send_skb->len) < 0) { + kfree_skb(send_skb); + continue; + } - arp->ar_hrd = htons(np->dev->type); - arp->ar_pro = htons(ETH_P_IP); - arp->ar_hln = np->dev->addr_len; - arp->ar_pln = 4; - arp->ar_op = htons(type); - - arp_ptr=(unsigned char *)(arp + 1); - memcpy(arp_ptr, np->dev->dev_addr, np->dev->addr_len); - arp_ptr += np->dev->addr_len; - memcpy(arp_ptr, &tip, 4); - arp_ptr += 4; - memcpy(arp_ptr, sha, np->dev->addr_len); - arp_ptr += np->dev->addr_len; - memcpy(arp_ptr, &sip, 4); + /* + * Fill out the arp protocol part. + * + * we only support ethernet device type, + * which (according to RFC 1390) should + * always equal 1 (Ethernet). + */ - netpoll_send_skb(np, send_skb); + arp->ar_hrd = htons(np->dev->type); + arp->ar_pro = htons(ETH_P_IP); + arp->ar_hln = np->dev->addr_len; + arp->ar_pln = 4; + arp->ar_op = htons(type); + + arp_ptr = (unsigned char *)(arp + 1); + memcpy(arp_ptr, np->dev->dev_addr, np->dev->addr_len); + arp_ptr += np->dev->addr_len; + memcpy(arp_ptr, &tip, 4); + arp_ptr += 4; + memcpy(arp_ptr, sha, np->dev->addr_len); + arp_ptr += np->dev->addr_len; + memcpy(arp_ptr, &sip, 4); + + netpoll_send_skb(np, send_skb); + + /* If there are several rx_hooks for the same address, + we're fine by sending a single reply */ + break; + } + spin_unlock_irqrestore(&npinfo->rx_lock, flags); } int __netpoll_rx(struct sk_buff *skb) { int proto, len, ulen; + int hits = 0; struct iphdr *iph; struct udphdr *uh; - struct netpoll_info *npi = skb->dev->npinfo; - struct netpoll *np = npi->rx_np; + struct netpoll_info *npinfo = skb->dev->npinfo; + struct netpoll *np, *tmp; - if (!np) + if (list_empty(&npinfo->rx_np)) goto out; + if (skb->dev->type != ARPHRD_ETHER) goto out; /* check if netpoll clients need ARP */ if (skb->protocol == htons(ETH_P_ARP) && atomic_read(&trapped)) { - skb_queue_tail(&npi->arp_tx, skb); + skb_queue_tail(&npinfo->arp_tx, skb); return 1; } @@ -551,16 +578,23 @@ goto out; if (checksum_udp(skb, uh, ulen, iph->saddr, iph->daddr)) goto out; - if (np->local_ip && np->local_ip != iph->daddr) - goto out; - if (np->remote_ip && np->remote_ip != iph->saddr) - goto out; - if (np->local_port && np->local_port != ntohs(uh->dest)) - goto out; - np->rx_hook(np, ntohs(uh->source), - (char *)(uh+1), - ulen - sizeof(struct udphdr)); + list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) { + if (np->local_ip && np->local_ip != iph->daddr) + continue; + if (np->remote_ip && np->remote_ip != iph->saddr) + continue; + if (np->local_port && np->local_port != ntohs(uh->dest)) + continue; + + np->rx_hook(np, ntohs(uh->source), + (char *)(uh+1), + ulen - sizeof(struct udphdr)); + hits++; + } + + if (!hits) + goto out; kfree_skb(skb); return 1; @@ -684,6 +718,7 @@ struct net_device *ndev = NULL; struct in_device *in_dev; struct netpoll_info *npinfo; + struct netpoll *npe, *tmp; unsigned long flags; int err; @@ -704,7 +739,7 @@ } npinfo->rx_flags = 0; - npinfo->rx_np = NULL; + INIT_LIST_HEAD(&npinfo->rx_np); spin_lock_init(&npinfo->rx_lock); skb_queue_head_init(&npinfo->arp_tx); @@ -785,7 +820,7 @@ if (np->rx_hook) { spin_lock_irqsave(&npinfo->rx_lock, flags); npinfo->rx_flags |= NETPOLL_RX_ENABLED; - npinfo->rx_np = np; + list_add_tail(&np->rx, &npinfo->rx_np); spin_unlock_irqrestore(&npinfo->rx_lock, flags); } @@ -801,9 +836,16 @@ return 0; release: - if (!ndev->npinfo) + if (!ndev->npinfo) { + spin_lock_irqsave(&npinfo->rx_lock, flags); + list_for_each_entry_safe(npe, tmp, &npinfo->rx_np, rx) { + npe->dev = NULL; + } + spin_unlock_irqrestore(&npinfo->rx_lock, flags); + kfree(npinfo); - np->dev = NULL; + } + dev_put(ndev); return err; } @@ -823,10 +865,11 @@ if (np->dev) { npinfo = np->dev->npinfo; if (npinfo) { - if (npinfo->rx_np == np) { + if (!list_empty(&npinfo->rx_np)) { spin_lock_irqsave(&npinfo->rx_lock, flags); - npinfo->rx_np = NULL; - npinfo->rx_flags &= ~NETPOLL_RX_ENABLED; + list_del(&np->rx); + if (list_empty(&npinfo->rx_np)) + npinfo->rx_flags &= ~NETPOLL_RX_ENABLED; spin_unlock_irqrestore(&npinfo->rx_lock, flags); } [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 261 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-13 0:27 ` Daniel Borkmann @ 2010-01-13 13:57 ` Jeff Moyer 2010-01-13 16:53 ` Daniel Borkmann 2010-01-14 4:41 ` David Miller 1 sibling, 1 reply; 13+ messages in thread From: Jeff Moyer @ 2010-01-13 13:57 UTC (permalink / raw) To: Daniel Borkmann; +Cc: Matt Mackall, David Miller, linux-kernel, netdev, netdev Daniel Borkmann <danborkmann@googlemail.com> writes: > Daniel Borkmann wrote: >> Matt Mackall wrote: >>> On Mon, 2010-01-11 at 15:59 -0800, David Miller wrote: >>>> From: Matt Mackall <mpm@selenic.com> >>>> Date: Mon, 11 Jan 2010 17:21:48 -0600 >>>> >>>>> Looks pretty good. Dave? >>>>> >>>>> Acked-by: Matt Mackall <mpm@selenic.com> >>>> I don't like the loop for RX ARP processing. >>>> >>>> The packet contents aren't going to change, so doing basic >>>> packet validation inside of the "for each RX client" loop >>>> of arp_reply() doesn't make any sense. >>> True. Dan, please help our poor compilers with some manual loop >>> invariant motion. >> >> Okay, true. I'll fix this by tomorrow and resend the patch. > > Here the fix of the RX ARP processing routine. Content that isn't > going to change is out-of-loop. > Successfully tested on my machines. Against what tree does this patch apply? It doesn't apply to Linus's git tree. Also, in the future, could you use the -p option to diff so we can see what function or data structure is being modified? It really helps in reviewing. Thanks! Jeff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-13 13:57 ` Jeff Moyer @ 2010-01-13 16:53 ` Daniel Borkmann 0 siblings, 0 replies; 13+ messages in thread From: Daniel Borkmann @ 2010-01-13 16:53 UTC (permalink / raw) To: Jeff Moyer; +Cc: Matt Mackall, David Miller, linux-kernel, netdev, netdev 2010/1/13 Jeff Moyer <jmoyer@redhat.com>: > Daniel Borkmann <danborkmann@googlemail.com> writes: > >> Daniel Borkmann wrote: >>> Matt Mackall wrote: >>>> On Mon, 2010-01-11 at 15:59 -0800, David Miller wrote: >>>>> From: Matt Mackall <mpm@selenic.com> >>>>> Date: Mon, 11 Jan 2010 17:21:48 -0600 >>>>> >>>>>> Looks pretty good. Dave? >>>>>> >>>>>> Acked-by: Matt Mackall <mpm@selenic.com> >>>>> I don't like the loop for RX ARP processing. >>>>> >>>>> The packet contents aren't going to change, so doing basic >>>>> packet validation inside of the "for each RX client" loop >>>>> of arp_reply() doesn't make any sense. >>>> True. Dan, please help our poor compilers with some manual loop >>>> invariant motion. >>> >>> Okay, true. I'll fix this by tomorrow and resend the patch. >> >> Here the fix of the RX ARP processing routine. Content that isn't >> going to change is out-of-loop. >> Successfully tested on my machines. > > Against what tree does this patch apply? It doesn't apply to Linus's > git tree. Also, in the future, could you use the -p option to diff so > we can see what function or data structure is being modified? It really > helps in reviewing. As mentioned earlier in this thread... when I started developing the patch I used 2.6.32.2 (latest stable from kernel.org), which might be already "old" ... sorry for that. Ok, next time I will use -p, too. For the moment, it was diff with -Nur. Shall I resend? Best regards, Daniel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netpoll: allow execution of multiple rx_hooks per interface 2010-01-13 0:27 ` Daniel Borkmann 2010-01-13 13:57 ` Jeff Moyer @ 2010-01-14 4:41 ` David Miller 1 sibling, 0 replies; 13+ messages in thread From: David Miller @ 2010-01-14 4:41 UTC (permalink / raw) To: danborkmann; +Cc: mpm, linux-kernel, jmoyer, netdev, netdev From: Daniel Borkmann <danborkmann@googlemail.com> Date: Wed, 13 Jan 2010 01:27:30 +0100 > Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com> Applied, thanks Daniel. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-01-14 4:41 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-06 20:54 [PATCH] netpoll: allow execution of multiple rx_hooks per interface Daniel Borkmann 2010-01-07 3:54 ` Matt Mackall 2010-01-07 9:02 ` David Miller 2010-01-07 19:06 ` Daniel Borkmann 2010-01-08 0:20 ` Daniel Borkmann 2010-01-11 23:21 ` Matt Mackall 2010-01-11 23:59 ` David Miller 2010-01-12 0:03 ` Matt Mackall 2010-01-12 0:09 ` Daniel Borkmann 2010-01-13 0:27 ` Daniel Borkmann 2010-01-13 13:57 ` Jeff Moyer 2010-01-13 16:53 ` Daniel Borkmann 2010-01-14 4:41 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).