Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next] tunnel: use the correct endian for some fields
From: Cong Wang @ 2012-11-15  8:51 UTC (permalink / raw)
  To: netdev; +Cc: Nicolas Dichtel, David S. Miller, Cong Wang

Fengguang reported:

net/ipv6/ip6_tunnel.c:1571:33: sparse: incorrect type in assignment (different base types)
net/ipv6/ip6_tunnel.c:1571:33:    expected restricted __be32 [usertype] flowinfo
net/ipv6/ip6_tunnel.c:1571:33:    got unsigned int

for these fields, we need to use the correct endian wrapers.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>

---

diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 64686e1..54477d8 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -864,7 +864,7 @@ static void ipip_netlink_parms(struct nlattr *data[],
 		parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]);
 
 	if (data[IFLA_IPTUN_LOCAL])
-		parms->iph.saddr = nla_get_u32(data[IFLA_IPTUN_LOCAL]);
+		parms->iph.saddr = nla_get_be32(data[IFLA_IPTUN_LOCAL]);
 
 	if (data[IFLA_IPTUN_REMOTE])
 		parms->iph.daddr = nla_get_u32(data[IFLA_IPTUN_REMOTE]);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index ab4d056..bf3a549 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1568,7 +1568,7 @@ static void ip6_tnl_netlink_parms(struct nlattr *data[],
 		parms->encap_limit = nla_get_u8(data[IFLA_IPTUN_ENCAP_LIMIT]);
 
 	if (data[IFLA_IPTUN_FLOWINFO])
-		parms->flowinfo = nla_get_u32(data[IFLA_IPTUN_FLOWINFO]);
+		parms->flowinfo = nla_get_be32(data[IFLA_IPTUN_FLOWINFO]);
 
 	if (data[IFLA_IPTUN_FLAGS])
 		parms->flags = nla_get_u32(data[IFLA_IPTUN_FLAGS]);
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 7bd2a06..e137750 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1240,7 +1240,7 @@ static void ipip6_netlink_parms(struct nlattr *data[],
 		parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]);
 
 	if (data[IFLA_IPTUN_LOCAL])
-		parms->iph.saddr = nla_get_u32(data[IFLA_IPTUN_LOCAL]);
+		parms->iph.saddr = nla_get_be32(data[IFLA_IPTUN_LOCAL]);
 
 	if (data[IFLA_IPTUN_REMOTE])
 		parms->iph.daddr = nla_get_u32(data[IFLA_IPTUN_REMOTE]);
@@ -1337,7 +1337,7 @@ static int ipip6_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	    nla_put_u8(skb, IFLA_IPTUN_TOS, parm->iph.tos) ||
 	    nla_put_u8(skb, IFLA_IPTUN_PMTUDISC,
 		       !!(parm->iph.frag_off & htons(IP_DF))) ||
-	    nla_put_u16(skb, IFLA_IPTUN_FLAGS, parm->i_flags))
+	    nla_put_be16(skb, IFLA_IPTUN_FLAGS, parm->i_flags))
 		goto nla_put_failure;
 	return 0;
 

^ permalink raw reply related

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: Ian Campbell @ 2012-11-15  8:51 UTC (permalink / raw)
  To: ANNIE LI
  Cc: Pasi Kärkkäinen, xen-devel@lists.xensource.com,
	netdev@vger.kernel.org, konrad.wilk@oracle.com
In-Reply-To: <50A4AA06.8080900@oracle.com>

On Thu, 2012-11-15 at 08:38 +0000, ANNIE LI wrote:
> 
> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > Hello,
> >
> > On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >> This patch implements persistent grants for xen-netfront/netback. This
> >> mechanism maintains page pools in netback/netfront, these page pools is used to
> >> save grant pages which are mapped. This way improve performance which is wasted
> >> when doing grant operations.
> >>
> >> Current netback/netfront does map/unmap grant operations frequently when
> >> transmitting/receiving packets, and grant operations costs much cpu clock. In
> >> this patch, netfront/netback maps grant pages when needed and then saves them
> >> into a page pool for future use. All these pages will be unmapped when
> >> removing/releasing the net device.
> >>
> > Do you have performance numbers available already? with/without persistent grants?
> I have some simple netperf/netserver test result with/without persistent 
> grants,
> 
> Following is result of with persistent grant patch,

> Guests, Sum,      Avg,     Min,     Max
>   1,  15106.4,  15106.4, 15106.36, 15106.36
>   2,  13052.7,  6526.34,  6261.81,  6790.86
>   3,  12675.1,  6337.53,  6220.24,  6454.83
>   4,  13194,  6596.98,  6274.70,  6919.25

Are these pairs of guests or individual ones?

I think the really interesting cases are when you get up to larger
numbers of guests, aren't they? ISTR that for blkio things got most
interesting WRT persistent grants at the dozens of guests stage. Do you
have any numbers for those?

Have you run any tests other than netperf?

Do you have numbers for a a persistent capable backend with a
non-persistent frontend and vice versa?


> 
> 
> Following are result of without persistent patch
> 
> Guests, Sum,     Avg,    Min,        Max
>   1,  10864.1,  10864.1, 10864.10, 10864.10
>   2,  10898.5,  5449.24,  4862.08,  6036.40
>   3,  10734.5,  5367.26,  5261.43,  5473.08
>   4,  10924,    5461.99,  5314.84,  5609.14

^ permalink raw reply

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: Ian Campbell @ 2012-11-15  8:53 UTC (permalink / raw)
  To: Annie Li
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com
In-Reply-To: <1352962987-541-1-git-send-email-annie.li@oracle.com>

On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
> 
> This patch is based on linux3.4-rc3. I hit netperf/netserver failure
> on linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure
> whether thisnetperf/netserver failure connects compound page commit in
> v3.7-rc1, but I did hit BUG_ON with debug patch from thread
> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html

Do you think you could cook up a netfront fix similar in principal to
the 6a8ed462f16b8455eec5ae00eb6014159a6721f0 fix for netback?

Ian.

^ permalink raw reply

* [PATCH net-next 1/3] ipip: fix sparse warnings in ipip_netlink_parms()
From: Nicolas Dichtel @ 2012-11-15  8:53 UTC (permalink / raw)
  To: netdev; +Cc: davem, Nicolas Dichtel

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/ipv4/ipip.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 64686e1..c26c171 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -864,10 +864,10 @@ static void ipip_netlink_parms(struct nlattr *data[],
 		parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]);
 
 	if (data[IFLA_IPTUN_LOCAL])
-		parms->iph.saddr = nla_get_u32(data[IFLA_IPTUN_LOCAL]);
+		parms->iph.saddr = nla_get_be32(data[IFLA_IPTUN_LOCAL]);
 
 	if (data[IFLA_IPTUN_REMOTE])
-		parms->iph.daddr = nla_get_u32(data[IFLA_IPTUN_REMOTE]);
+		parms->iph.daddr = nla_get_be32(data[IFLA_IPTUN_REMOTE]);
 
 	if (data[IFLA_IPTUN_TTL]) {
 		parms->iph.ttl = nla_get_u8(data[IFLA_IPTUN_TTL]);
-- 
1.7.12

^ permalink raw reply related

* [PATCH net-next 2/3] sit: fix sparse warnings
From: Nicolas Dichtel @ 2012-11-15  8:53 UTC (permalink / raw)
  To: netdev; +Cc: davem, Nicolas Dichtel
In-Reply-To: <1352969616-30476-1-git-send-email-nicolas.dichtel@6wind.com>

Note that i_flags is defined as be16, but is used here like an u16. The test
in ipip6_tunnel_create() was just moved into this function.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/ipv6/sit.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 7bd2a06..ca6c2c8 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -228,7 +228,7 @@ static int ipip6_tunnel_create(struct net_device *dev)
 		goto out;
 	ipip6_tunnel_clone_6rd(dev, sitn);
 
-	if (t->parms.i_flags & SIT_ISATAP)
+	if ((__force u16)t->parms.i_flags & SIT_ISATAP)
 		dev->priv_flags |= IFF_ISATAP;
 
 	err = register_netdevice(dev);
@@ -1240,10 +1240,10 @@ static void ipip6_netlink_parms(struct nlattr *data[],
 		parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]);
 
 	if (data[IFLA_IPTUN_LOCAL])
-		parms->iph.saddr = nla_get_u32(data[IFLA_IPTUN_LOCAL]);
+		parms->iph.saddr = nla_get_be32(data[IFLA_IPTUN_LOCAL]);
 
 	if (data[IFLA_IPTUN_REMOTE])
-		parms->iph.daddr = nla_get_u32(data[IFLA_IPTUN_REMOTE]);
+		parms->iph.daddr = nla_get_be32(data[IFLA_IPTUN_REMOTE]);
 
 	if (data[IFLA_IPTUN_TTL]) {
 		parms->iph.ttl = nla_get_u8(data[IFLA_IPTUN_TTL]);
@@ -1258,7 +1258,7 @@ static void ipip6_netlink_parms(struct nlattr *data[],
 		parms->iph.frag_off = htons(IP_DF);
 
 	if (data[IFLA_IPTUN_FLAGS])
-		parms->i_flags = nla_get_u16(data[IFLA_IPTUN_FLAGS]);
+		parms->i_flags = nla_get_be16(data[IFLA_IPTUN_FLAGS]);
 }
 
 static int ipip6_newlink(struct net *src_net, struct net_device *dev,
@@ -1337,7 +1337,7 @@ static int ipip6_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	    nla_put_u8(skb, IFLA_IPTUN_TOS, parm->iph.tos) ||
 	    nla_put_u8(skb, IFLA_IPTUN_PMTUDISC,
 		       !!(parm->iph.frag_off & htons(IP_DF))) ||
-	    nla_put_u16(skb, IFLA_IPTUN_FLAGS, parm->i_flags))
+	    nla_put_be16(skb, IFLA_IPTUN_FLAGS, parm->i_flags))
 		goto nla_put_failure;
 	return 0;
 
-- 
1.7.12

^ permalink raw reply related

* [PATCH net-next 3/3] ip6tnl: fix sparse warnings in ip6_tnl_netlink_parms()
From: Nicolas Dichtel @ 2012-11-15  8:53 UTC (permalink / raw)
  To: netdev; +Cc: davem, Nicolas Dichtel
In-Reply-To: <1352969616-30476-1-git-send-email-nicolas.dichtel@6wind.com>

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/ipv6/ip6_tunnel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index ab4d056..bf3a549 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1568,7 +1568,7 @@ static void ip6_tnl_netlink_parms(struct nlattr *data[],
 		parms->encap_limit = nla_get_u8(data[IFLA_IPTUN_ENCAP_LIMIT]);
 
 	if (data[IFLA_IPTUN_FLOWINFO])
-		parms->flowinfo = nla_get_u32(data[IFLA_IPTUN_FLOWINFO]);
+		parms->flowinfo = nla_get_be32(data[IFLA_IPTUN_FLOWINFO]);
 
 	if (data[IFLA_IPTUN_FLAGS])
 		parms->flags = nla_get_u32(data[IFLA_IPTUN_FLAGS]);
-- 
1.7.12

^ permalink raw reply related

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: Ian Campbell @ 2012-11-15  8:56 UTC (permalink / raw)
  To: Annie Li
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com
In-Reply-To: <1352962987-541-1-git-send-email-annie.li@oracle.com>

On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:

> 
> This patch implements persistent grants for xen-netfront/netback. This
> mechanism maintains page pools in netback/netfront, these page pools
> is used to save grant pages which are mapped. This way improve
> performance which is wasted when doing grant operations. 

Please can you send a patch against xen-unstable.hg to document this new
protocol variant in xen/include/public/io/netif.h. This header is a bit
under-documented but lets not let it fall further behind (if you want to
go further and document the existing features, in the style of the
current blkif.h, then that would be awesome!).

You may also want to provide a similar patch to Linux's copy which is in
linux/include/xen/interface/io/netif.h

Ian.

^ permalink raw reply

* Re: [PATCH net-next] tunnel: use the correct endian for some fields
From: Nicolas Dichtel @ 2012-11-15  8:57 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, David S. Miller
In-Reply-To: <1352969482-29836-1-git-send-email-amwang@redhat.com>

Le 15/11/2012 09:51, Cong Wang a écrit :
> Fengguang reported:
>
> net/ipv6/ip6_tunnel.c:1571:33: sparse: incorrect type in assignment (different base types)
> net/ipv6/ip6_tunnel.c:1571:33:    expected restricted __be32 [usertype] flowinfo
> net/ipv6/ip6_tunnel.c:1571:33:    got unsigned int
>
> for these fields, we need to use the correct endian wrapers.
>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>
Just one minute before my patch.
Your patch does not fix all warnings (i_flags & SIT_ISATAP in ip6_tunnel.c).

^ permalink raw reply

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: ANNIE LI @ 2012-11-15  9:02 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Pasi Kärkkäinen, xen-devel@lists.xensource.com,
	netdev@vger.kernel.org, konrad.wilk@oracle.com
In-Reply-To: <1352969519.3499.30.camel@zakaz.uk.xensource.com>



On 2012-11-15 16:51, Ian Campbell wrote:
> On Thu, 2012-11-15 at 08:38 +0000, ANNIE LI wrote:
>> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>>> Hello,
>>>
>>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>>> This patch implements persistent grants for xen-netfront/netback. This
>>>> mechanism maintains page pools in netback/netfront, these page pools is used to
>>>> save grant pages which are mapped. This way improve performance which is wasted
>>>> when doing grant operations.
>>>>
>>>> Current netback/netfront does map/unmap grant operations frequently when
>>>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>>>> this patch, netfront/netback maps grant pages when needed and then saves them
>>>> into a page pool for future use. All these pages will be unmapped when
>>>> removing/releasing the net device.
>>>>
>>> Do you have performance numbers available already? with/without persistent grants?
>> I have some simple netperf/netserver test result with/without persistent
>> grants,
>>
>> Following is result of with persistent grant patch,
>> Guests, Sum,      Avg,     Min,     Max
>>    1,  15106.4,  15106.4, 15106.36, 15106.36
>>    2,  13052.7,  6526.34,  6261.81,  6790.86
>>    3,  12675.1,  6337.53,  6220.24,  6454.83
>>    4,  13194,  6596.98,  6274.70,  6919.25
> Are these pairs of guests or individual ones?

They are pairs of guests.

>
> I think the really interesting cases are when you get up to larger
> numbers of guests, aren't they?
Right.
> ISTR that for blkio things got most
> interesting WRT persistent grants at the dozens of guests stage. Do you
> have any numbers for those?
No, I will run more test with more gusets.
>
> Have you run any tests other than netperf?
No, I didn't.
>
> Do you have numbers for a a persistent capable backend with a
> non-persistent frontend and vice versa?
I did it, but the test only runs with 4 guests too,will test with more 
guests.

Thanks
Annie
>
>
>>
>> Following are result of without persistent patch
>>
>> Guests, Sum,     Avg,    Min,        Max
>>    1,  10864.1,  10864.1, 10864.10, 10864.10
>>    2,  10898.5,  5449.24,  4862.08,  6036.40
>>    3,  10734.5,  5367.26,  5261.43,  5473.08
>>    4,  10924,    5461.99,  5314.84,  5609.14
>

^ permalink raw reply

* Re: [PATCH net-next] tunnel: use the correct endian for some fields
From: Cong Wang @ 2012-11-15  9:04 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: netdev, David S. Miller
In-Reply-To: <50A4AE82.3040602@6wind.com>

On Thu, 2012-11-15 at 09:57 +0100, Nicolas Dichtel wrote:
> Le 15/11/2012 09:51, Cong Wang a écrit :
> > Fengguang reported:
> >
> > net/ipv6/ip6_tunnel.c:1571:33: sparse: incorrect type in assignment (different base types)
> > net/ipv6/ip6_tunnel.c:1571:33:    expected restricted __be32 [usertype] flowinfo
> > net/ipv6/ip6_tunnel.c:1571:33:    got unsigned int
> >
> > for these fields, we need to use the correct endian wrapers.
> >
> > Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> > Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> > Cc: David S. Miller <davem@davemloft.net>
> > Signed-off-by: Cong Wang <amwang@redhat.com>
> Just one minute before my patch.
> Your patch does not fix all warnings (i_flags & SIT_ISATAP in ip6_tunnel.c).

Yeah, then ignore this patch. :)

^ permalink raw reply

* Re: [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
From: Ian Campbell @ 2012-11-15  9:10 UTC (permalink / raw)
  To: Annie Li
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com
In-Reply-To: <1352963066-570-1-git-send-email-annie.li@oracle.com>

On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
> This patch implements persistent grant in netback driver. Tx and rx
> share the same page pool, this pool will be split into two parts
> in next patch.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netback/common.h    |   18 +++-
>  drivers/net/xen-netback/interface.c |   22 ++++
>  drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  4 files changed, 239 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 94b79c3..a85cac6 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,8 +45,19 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
> 
> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \

BLOCK?

> +                       (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
> +
>  struct xen_netbk;
> 
> +struct persistent_entry {
> +       grant_ref_t forgranted;
> +       struct page *fpage;
> +       struct gnttab_map_grant_ref map;
> +};

Isn't this duplicating a bunch of infrastructure which is also in
blkback? Can we put it into some common helpers please?

> +
>  struct xenvif {
>         /* Unique identifier for this interface. */
>         domid_t          domid;
> @@ -75,6 +86,7 @@ struct xenvif {
> 
>         /* Internal feature information. */
>         u8 can_queue:1;     /* can queue packets for receiver? */
> +       u8 persistent_grant:1;
> 
>         /*
>          * Allow xenvif_start_xmit() to peek ahead in the rx request
> @@ -98,6 +110,9 @@ struct xenvif {
>         struct net_device *dev;
> 
>         wait_queue_head_t waiting_to_free;
> +
> +       struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];

What is the per-vif memory overhead of this array?

> +static struct persistent_entry*
> +get_per_gnt(struct persistent_entry **pers_entry,
> +           unsigned int count, grant_ref_t gref)
> +{
> +       int i;
> +
> +       for (i = 0; i < count; i++)
> +               if (gref == pers_entry[i]->forgranted)
> +                       return pers_entry[i];

Isn't this linear scan rather expensive? I think Roger implemented some
sort of hash lookup for blkback which I think is required here too (and
should be easy if you make that code common).

> +
> +       return NULL;
> +}
> +

> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 gop->source.domid = vif->domid;
>                 gop->source.offset = txreq.offset;
> 
> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               if (!vif->persistent_grant)
> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               else
> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);

page_address doesn't return any sort of frame number, does it? This is
rather confusing...

> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>                 val = 0;
>         vif->csum = !val;
> 
> -       /* Map the shared frame, irq etc. */
> +       if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
> +                        "%u", &val) < 0)
> +               val = 0;
> +       vif->persistent_grant = !!val;
> +
> +/* Map the shared frame, irq etc. */

Please run the patches through checkpatch.pl

>         err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>         if (err) {
>                 xenbus_dev_fatal(dev, err,
> --
> 1.7.3.4
> 

^ permalink raw reply

* Re: [PATCH 2/4] xen/netback: Split one page pool into two(tx/rx) page pool.
From: Ian Campbell @ 2012-11-15  9:15 UTC (permalink / raw)
  To: Annie Li
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com
In-Reply-To: <1352963089-599-1-git-send-email-annie.li@oracle.com>

On Thu, 2012-11-15 at 07:04 +0000, Annie Li wrote:
> For tx path, this implementation simplifies the work of searching out
> grant page from page pool based on grant reference.

It's still a linear search though, and it doesn't look much simpler to
me:
  	for (i = 0; i < count; i++) {
		if (tx_pool)
			vif = netbk->gnttab_tx_vif[i];
		else
			vif = netbk->gnttab_rx_vif[i];

		pers_entry = vif->persistent_gnt;
		gnt_count = &vif->persistent_gntcnt;
		gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
becomes:
	for (i = 0; i < count; i++) {
		if (tx_pool) {
			vif = netbk->gnttab_tx_vif[i];
			gnt_count = &vif->persistent_tx_gntcnt;
			gnt_total = XEN_NETIF_TX_RING_SIZE;
			pers_entry = vif->persistent_tx_gnt;
		} else {
			vif = netbk->gnttab_rx_vif[i];
			gnt_count = &vif->persistent_rx_gntcnt;
			gnt_total = 2*XEN_NETIF_RX_RING_SIZE;
			pers_entry = vif->persistent_rx_gnt;
		}

> @@ -111,8 +109,16 @@ struct xenvif {
>  
>  	wait_queue_head_t waiting_to_free;
>  
> -	struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
> -	unsigned int persistent_gntcnt;
> +	struct persistent_entry *persistent_tx_gnt[XEN_NETIF_TX_RING_SIZE];
> +
> +	/*
> +	 * 2*XEN_NETIF_RX_RING_SIZE is for the case of each head/fragment page

Shouldn't that been incorporated into MAXIMUM_OUTSTANDING_BLOCK_REQS
(sic) too?

> +	 * using 2 copy operations.
> +	 */
> +	struct persistent_entry *persistent_rx_gnt[2*XEN_NETIF_RX_RING_SIZE];

What is the per-vif memory overhead after this change?

Ian.

^ permalink raw reply

* Re: drivers/net/ethernet/amd/ni65.c:1243:8-14: WARNING: PTR_RET can be used
From: Jeff Kirsher @ 2012-11-15  9:20 UTC (permalink / raw)
  To: Xie ChanglongX; +Cc: fengguang.wu, yuanhan.liu
In-Reply-To: <20121115082912.GB55738@bee>

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

On Thu, 2012-11-15 at 16:29 +0800, Xie ChanglongX wrote:
> Hi Jeff,
> 
> FYI, there are coccinelle warnings in
> 
> tree:   git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux master
> head:   9924a1992a86ebdb7ca36ef790d2ba0da506296c
> commit: b955f6ca776f3bab3d1e2c5fb1d247b203cbda14 amd: Move AMD (Lance) chipset drivers
> date:   1 year, 3 months ago
> 
> + drivers/net/ethernet/amd/ni65.c:1243:8-14: WARNING: PTR_RET can be used
> 
> Please consider folding the attached diff :-)
> 
> ---
> 0-DAY kernel build testing backend         Open Source Technology Center
> Fengguang Wu, Yuanhan Liu                              Intel Corporation

Go ahead and submit your patch to netdev@vger.kernel.org, I am sure the
AMD maintainer/community will be glad to consider your patch.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: Wei Liu @ 2012-11-15  9:35 UTC (permalink / raw)
  To: ANNIE LI
  Cc: netdev, xen-devel@lists.xensource.com, Ian Campbell,
	Konrad Rzeszutek Wilk
In-Reply-To: <50A4AA06.8080900@oracle.com>


[-- Attachment #1.1: Type: text/plain, Size: 1799 bytes --]

On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com> wrote:

>
>
> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>
>> Hello,
>>
>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>
>>> This patch implements persistent grants for xen-netfront/netback. This
>>> mechanism maintains page pools in netback/netfront, these page pools is
>>> used to
>>> save grant pages which are mapped. This way improve performance which is
>>> wasted
>>> when doing grant operations.
>>>
>>> Current netback/netfront does map/unmap grant operations frequently when
>>> transmitting/receiving packets, and grant operations costs much cpu
>>> clock. In
>>> this patch, netfront/netback maps grant pages when needed and then saves
>>> them
>>> into a page pool for future use. All these pages will be unmapped when
>>> removing/releasing the net device.
>>>
>>>  Do you have performance numbers available already? with/without
>> persistent grants?
>>
> I have some simple netperf/netserver test result with/without persistent
> grants,
>
> Following is result of with persistent grant patch,
>
> Guests, Sum,      Avg,     Min,     Max
>  1,  15106.4,  15106.4, 15106.36, 15106.36
>  2,  13052.7,  6526.34,  6261.81,  6790.86
>  3,  12675.1,  6337.53,  6220.24,  6454.83
>  4,  13194,  6596.98,  6274.70,  6919.25
>
>
> Following are result of without persistent patch
>
> Guests, Sum,     Avg,    Min,        Max
>  1,  10864.1,  10864.1, 10864.10, 10864.10
>  2,  10898.5,  5449.24,  4862.08,  6036.40
>  3,  10734.5,  5367.26,  5261.43,  5473.08
>  4,  10924,    5461.99,  5314.84,  5609.14
>
>
>
Interesting results. Have you tested how good it is on a 10G nic, i.e.
guest sending packets
through physical network to another host.


Wei.

[-- Attachment #1.2: Type: text/html, Size: 2488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply

* Re: [PATCH net-next 1/3] ipip: fix sparse warnings in ipip_netlink_parms()
From: Nicolas Dichtel @ 2012-11-15  9:41 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: netdev, davem
In-Reply-To: <1352969616-30476-1-git-send-email-nicolas.dichtel@6wind.com>

Le 15/11/2012 09:53, Nicolas Dichtel a écrit :
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
I forget "Reported-by: Fengguang Wu <fengguang.wu@intel.com>"

Should I resend the serie?

^ permalink raw reply

* [PATCH] bonding: rlb mode of bond should not alter ARP originating via bridge
From: Zheng Li @ 2012-11-15  9:47 UTC (permalink / raw)
  To: netdev, fubar, andy; +Cc: linux-kernel, davem, joe.jin, zheng.x.li

ARP traffic passing through a bridge and out via the bond (when the bond is a 
port of the bridge) should not have its source MAC address adjusted by the 
receive load balance code in rlb_arp_xmit.

Signed-off-by: Zheng Li <zheng.x.li@oracle.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: "David S. Miller" <davem@davemloft.net>

---
 drivers/net/bonding/bond_alb.c |    6 ++++++
 drivers/net/bonding/bonding.h  |   13 +++++++++++++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index e15cc11..75f6f0d 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -694,6 +694,12 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
 	struct arp_pkt *arp = arp_pkt(skb);
 	struct slave *tx_slave = NULL;
 
+	/* Only modify ARP's MAC if it originates locally;
+	 * don't change ARPs arriving via a bridge.
+	 */
+	if (!bond_slave_has_mac(bond, arp->mac_src))
+		return NULL;
+
 	if (arp->op_code == htons(ARPOP_REPLY)) {
 		/* the arp must be sent on the selected
 		* rx channel
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index f8af2fc..6dded56 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -22,6 +22,7 @@
 #include <linux/in6.h>
 #include <linux/netpoll.h>
 #include <linux/inetdevice.h>
+#include <linux/etherdevice.h>
 #include "bond_3ad.h"
 #include "bond_alb.h"
 
@@ -450,6 +451,18 @@ static inline void bond_destroy_proc_dir(struct bond_net *bn)
 }
 #endif
 
+static inline struct slave *bond_slave_has_mac(struct bonding *bond,
+					       const u8 *mac)
+{
+	int i = 0;
+	struct slave *tmp;
+
+	bond_for_each_slave(bond, tmp, i)
+		if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr))
+			return tmp;
+
+	return NULL;
+}
 
 /* exported from bond_main.c */
 extern int bond_net_id;
-- 
1.7.6.5

^ permalink raw reply related

* Re: [Xen-devel] [PATCH 1/4] xen/netback: implements persistent grant with one page pool.
From: Roger Pau Monné @ 2012-11-15  9:57 UTC (permalink / raw)
  To: Annie Li
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com, Ian Campbell
In-Reply-To: <1352963066-570-1-git-send-email-annie.li@oracle.com>

On 15/11/12 08:04, Annie Li wrote:
> This patch implements persistent grant in netback driver. Tx and rx
> share the same page pool, this pool will be split into two parts
> in next patch.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netback/common.h    |   18 +++-
>  drivers/net/xen-netback/interface.c |   22 ++++
>  drivers/net/xen-netback/netback.c   |  212 +++++++++++++++++++++++++++++++----
>  drivers/net/xen-netback/xenbus.c    |   14 ++-
>  4 files changed, 239 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 94b79c3..a85cac6 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -45,8 +45,19 @@
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
> 
> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
> +                       (XEN_NETIF_TX_RING_SIZE + XEN_NETIF_RX_RING_SIZE)
> +
>  struct xen_netbk;
> 
> +struct persistent_entry {
> +       grant_ref_t forgranted;
> +       struct page *fpage;
> +       struct gnttab_map_grant_ref map;
> +};

This should be common with the blkback implementation, I think we should
move some structures/functions from blkback to a common place. When I
implementated some functions in blkback I though they could be reused by
other backends that wanted to use persistent grants, so I keep them free
of blkback specific structures.

>  struct xenvif {
>         /* Unique identifier for this interface. */
>         domid_t          domid;
> @@ -75,6 +86,7 @@ struct xenvif {
> 
>         /* Internal feature information. */
>         u8 can_queue:1;     /* can queue packets for receiver? */
> +       u8 persistent_grant:1;
> 
>         /*
>          * Allow xenvif_start_xmit() to peek ahead in the rx request
> @@ -98,6 +110,9 @@ struct xenvif {
>         struct net_device *dev;
> 
>         wait_queue_head_t waiting_to_free;
> +
> +       struct persistent_entry *persistent_gnt[MAXIMUM_OUTSTANDING_BLOCK_REQS];
> +       unsigned int persistent_gntcnt;

This should be a red-black tree, which has the property of a search time
<= O(log n), using an array is more expensive in terms of memory and has
a worse search time O(n), this is specially interesting for netback,
which can have twice as much persistent grants as blkback (because two
rings are used).

Take a look at the following functions from blkback; foreach_grant,
add_persistent_gnt and get_persistent_gnt. They are generic functions to
deal with persistent grants.

>  };
> 
>  static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
> @@ -105,9 +120,6 @@ static inline struct xenbus_device *xenvif_to_xenbus_device(struct xenvif *vif)
>         return to_xenbus_device(vif->dev->dev.parent);
>  }
> 
> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> -
>  struct xenvif *xenvif_alloc(struct device *parent,
>                             domid_t domid,
>                             unsigned int handle);
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index b7d41f8..226d159 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -300,6 +300,8 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
>                 return ERR_PTR(err);
>         }
> 
> +       vif->persistent_gntcnt = 0;
> +
>         netdev_dbg(dev, "Successfully created xenvif\n");
>         return vif;
>  }
> @@ -343,6 +345,23 @@ err:
>         return err;
>  }
> 
> +void xenvif_free_grants(struct persistent_entry **pers_entry,
> +                       unsigned int count)
> +{
> +       int i, ret;
> +       struct gnttab_unmap_grant_ref unmap;
> +
> +       for (i = 0; i < count; i++) {
> +               gnttab_set_unmap_op(&unmap,
> +                       (unsigned long)page_to_pfn(pers_entry[i]->fpage),
> +                       GNTMAP_host_map,
> +                       pers_entry[i]->map.handle);
> +               ret = gnttab_unmap_refs(&unmap, &pers_entry[i]->fpage,
> +                                       1, false);

This is not correct, you should call gnttab_set_unmap_op on a batch of
grants (up to BLKIF_MAX_SEGMENTS_PER_REQUEST), and then call
gnttab_unmap_refs on all of them. Here is a simple example (take a look
at blkback.c function xen_blkif_schedule to see an example with a
red-black tree, I think this part of the code should also be made common):

for (i = 0, segs_to_unmap = 0; i < count; i++) {
	gnttab_set_unmap_op(&unmap[segs_to_unmap],
		(unsigned long)page_to_pfn(pers_entry[i]->fpage),
		GNTMAP_host_map,
		pers_entry[i]->map.handle);
	pages[segs_to_unmap] =
		(unsigned long)page_to_pfn(pers_entry[i]->fpage);
	if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST ||
		(i + 1) == count) {
		ret = gnttab_unmap_refs(unmap, NULL, pages,
			segs_to_unmap);
		BUG_ON(ret);
		segs_to_unmap == 0;
	}
}

> +               BUG_ON(ret);
> +       }
> +}
> +
>  void xenvif_disconnect(struct xenvif *vif)
>  {
>         struct net_device *dev = vif->dev;
> @@ -366,6 +385,9 @@ void xenvif_disconnect(struct xenvif *vif)
>         unregister_netdev(vif->dev);
> 
>         xen_netbk_unmap_frontend_rings(vif);
> +       if (vif->persistent_grant)
> +               xenvif_free_grants(vif->persistent_gnt,
> +                                  vif->persistent_gntcnt);
> 
>         free_netdev(vif->dev);
>  }
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 2596401..a26d3fc 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -80,6 +80,8 @@ union page_ext {
>         void *mapping;
>  };
> 
> +struct xenvif;
> +
>  struct xen_netbk {
>         wait_queue_head_t wq;
>         struct task_struct *task;
> @@ -102,6 +104,7 @@ struct xen_netbk {
> 
>         struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
>         struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
> +       struct xenvif *gnttab_tx_vif[MAX_PENDING_REQS];
> 
>         u16 pending_ring[MAX_PENDING_REQS];
> 
> @@ -111,12 +114,139 @@ struct xen_netbk {
>          * straddles two buffers in the frontend.
>          */
>         struct gnttab_copy grant_copy_op[2*XEN_NETIF_RX_RING_SIZE];
> +       struct xenvif *gnttab_rx_vif[2*XEN_NETIF_RX_RING_SIZE];
>         struct netbk_rx_meta meta[2*XEN_NETIF_RX_RING_SIZE];
>  };
> 
>  static struct xen_netbk *xen_netbk;
>  static int xen_netbk_group_nr;
> 
> +static struct persistent_entry*
> +get_per_gnt(struct persistent_entry **pers_entry,
> +           unsigned int count, grant_ref_t gref)
> +{
> +       int i;
> +
> +       for (i = 0; i < count; i++)
> +               if (gref == pers_entry[i]->forgranted)
> +                       return pers_entry[i];
> +
> +       return NULL;
> +}

This should be replaced with common code shared with all persistent
backends implementations.

> +
> +static void*
> +map_new_gnt(struct persistent_entry **pers_entry, unsigned int count,
> +           grant_ref_t ref, domid_t domid)
> +{
> +       struct gnttab_map_grant_ref *map;
> +       struct page *page;
> +       unsigned long vaddr;
> +       unsigned long pfn;
> +       uint32_t flags;
> +       int ret = 0;
> +
> +       pers_entry[count] = (struct persistent_entry *)
> +                           kmalloc(sizeof(struct persistent_entry),
> +                                   GFP_KERNEL);
> +       if (!pers_entry[count])
> +               return ERR_PTR(-ENOMEM);
> +
> +       map = &pers_entry[count]->map;
> +       page = alloc_page(GFP_KERNEL);
> +       if (!page) {
> +               kfree(pers_entry[count]);
> +               pers_entry[count] = NULL;
> +               return ERR_PTR(-ENOMEM);
> +       }
> +
> +       pers_entry[count]->fpage = page;
> +       pfn = page_to_pfn(page);
> +       vaddr = (unsigned long)pfn_to_kaddr(pfn);
> +       flags = GNTMAP_host_map;
> +
> +       gnttab_set_map_op(map, vaddr, flags, ref, domid);
> +       ret = gnttab_map_refs(map, NULL, &page, 1);
> +       BUG_ON(ret);

This is highly inefficient, one of the points of using gnttab_set_map_op
is that you can queue a bunch of grants, and then map them at the same
time using gnttab_map_refs, but here you are using it to map a single
grant at a time. You should instead see how much grants you need to map
to complete the request and map them all at the same time.

> +
> +       pers_entry[count]->forgranted = ref;
> +
> +       return page_address(page);
> +}
> +
> +static void*
> +get_ref_page(struct persistent_entry **pers_entry, unsigned int *count,
> +            grant_ref_t ref, domid_t domid, unsigned int total)
> +{
> +       struct persistent_entry *per_gnt;
> +       void *vaddr;
> +
> +       per_gnt = get_per_gnt(pers_entry, *count, ref);
> +
> +       if (per_gnt != NULL)
> +               return page_address(per_gnt->fpage);
> +       else {
> +               BUG_ON(*count >= total);
> +               vaddr = map_new_gnt(pers_entry, *count, ref, domid);
> +               if (IS_ERR_OR_NULL(vaddr))
> +                       return vaddr;
> +               *count += 1;
> +               return vaddr;
> +       }
> +}
> +
> +static int
> +grant_memory_copy_op(unsigned int cmd, void *vuop, unsigned int count,
> +                    struct xen_netbk *netbk, bool tx_pool)
> +{
> +       int i;
> +       struct xenvif *vif;
> +       struct gnttab_copy *uop = vuop;
> +       unsigned int *gnt_count;
> +       unsigned int gnt_total;
> +       struct persistent_entry **pers_entry;
> +       int ret = 0;
> +
> +       BUG_ON(cmd != GNTTABOP_copy);
> +       for (i = 0; i < count; i++) {
> +               if (tx_pool)
> +                       vif = netbk->gnttab_tx_vif[i];
> +               else
> +                       vif = netbk->gnttab_rx_vif[i];
> +
> +               pers_entry = vif->persistent_gnt;
> +               gnt_count = &vif->persistent_gntcnt;
> +               gnt_total = MAXIMUM_OUTSTANDING_BLOCK_REQS;
> +
> +               if (vif->persistent_grant) {
> +                       void *saddr, *daddr;
> +
> +                       saddr = uop[i].source.domid == DOMID_SELF ?
> +                               (void *) uop[i].source.u.gmfn :
> +                               get_ref_page(pers_entry, gnt_count,
> +                                            uop[i].source.u.ref,
> +                                            uop[i].source.domid,
> +                                            gnt_total);
> +                       if (IS_ERR_OR_NULL(saddr))
> +                               return -ENOMEM;
> +
> +                       daddr = uop[i].dest.domid == DOMID_SELF ?
> +                               (void *) uop[i].dest.u.gmfn :
> +                               get_ref_page(pers_entry, gnt_count,
> +                                            uop[i].dest.u.ref,
> +                                            uop[i].dest.domid,
> +                                            gnt_total);
> +                       if (IS_ERR_OR_NULL(daddr))
> +                               return -ENOMEM;
> +
> +                       memcpy(daddr+uop[i].dest.offset,
> +                              saddr+uop[i].source.offset, uop[i].len);
> +               } else
> +                       ret = HYPERVISOR_grant_table_op(cmd, &uop[i], 1);
> +       }
> +
> +       return ret;
> +}
> +
>  void xen_netbk_add_xenvif(struct xenvif *vif)
>  {
>         int i;
> @@ -387,13 +517,15 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
>   * Set up the grant operations for this fragment. If it's a flipping
>   * interface, we also set up the unmap request from here.
>   */
> -static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> -                               struct netrx_pending_operations *npo,
> -                               struct page *page, unsigned long size,
> -                               unsigned long offset, int *head)
> +static int netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> +                              struct netrx_pending_operations *npo,
> +                              struct page *page, unsigned long size,
> +                              unsigned long offset, int *head,
> +                              struct xenvif **grxvif)
>  {
>         struct gnttab_copy *copy_gop;
>         struct netbk_rx_meta *meta;
> +       int count = 0;
>         /*
>          * These variables are used iff get_page_ext returns true,
>          * in which case they are guaranteed to be initialized.
> @@ -425,6 +557,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>                         bytes = MAX_BUFFER_OFFSET - npo->copy_off;
> 
>                 copy_gop = npo->copy + npo->copy_prod++;
> +               *grxvif = vif;
> +               grxvif++;
> +               count++;
> +
>                 copy_gop->flags = GNTCOPY_dest_gref;
>                 if (foreign) {
>                         struct xen_netbk *netbk = &xen_netbk[group];
> @@ -438,7 +574,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>                 } else {
>                         void *vaddr = page_address(page);
>                         copy_gop->source.domid = DOMID_SELF;
> -                       copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
> +                       if (!vif->persistent_grant)
> +                               copy_gop->source.u.gmfn = virt_to_mfn(vaddr);
> +                       else
> +                               copy_gop->source.u.gmfn = (unsigned long)vaddr;
>                 }
>                 copy_gop->source.offset = offset;
>                 copy_gop->dest.domid = vif->domid;
> @@ -460,6 +599,7 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>                 *head = 0; /* There must be something in this buffer now. */
> 
>         }
> +       return count;
>  }
> 
>  /*
> @@ -474,8 +614,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>   * zero GSO descriptors (for non-GSO packets) or one descriptor (for
>   * frontend-side LRO).
>   */
> -static int netbk_gop_skb(struct sk_buff *skb,
> -                        struct netrx_pending_operations *npo)
> +static int netbk_gop_skb(struct xen_netbk *netbk,
> +                        struct sk_buff *skb,
> +                        struct netrx_pending_operations *npo,
> +                        struct xenvif **grxvif)
>  {
>         struct xenvif *vif = netdev_priv(skb->dev);
>         int nr_frags = skb_shinfo(skb)->nr_frags;
> @@ -518,17 +660,19 @@ static int netbk_gop_skb(struct sk_buff *skb,
>                 if (data + len > skb_tail_pointer(skb))
>                         len = skb_tail_pointer(skb) - data;
> 
> -               netbk_gop_frag_copy(vif, skb, npo,
> -                                   virt_to_page(data), len, offset, &head);
> +               grxvif += netbk_gop_frag_copy(vif, skb, npo,
> +                                             virt_to_page(data), len,
> +                                             offset, &head, grxvif);
> +
>                 data += len;
>         }
> 
>         for (i = 0; i < nr_frags; i++) {
> -               netbk_gop_frag_copy(vif, skb, npo,
> -                                   skb_frag_page(&skb_shinfo(skb)->frags[i]),
> -                                   skb_frag_size(&skb_shinfo(skb)->frags[i]),
> -                                   skb_shinfo(skb)->frags[i].page_offset,
> -                                   &head);
> +               grxvif += netbk_gop_frag_copy(vif, skb, npo,
> +                               skb_frag_page(&skb_shinfo(skb)->frags[i]),
> +                               skb_frag_size(&skb_shinfo(skb)->frags[i]),
> +                               skb_shinfo(skb)->frags[i].page_offset,
> +                               &head, grxvif);
>         }
> 
>         return npo->meta_prod - old_meta_prod;
> @@ -593,6 +737,8 @@ struct skb_cb_overlay {
>  static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  {
>         struct xenvif *vif = NULL, *tmp;
> +       struct xenvif **grxvif = netbk->gnttab_rx_vif;
> +       int old_copy_prod = 0;
>         s8 status;
>         u16 irq, flags;
>         struct xen_netif_rx_response *resp;
> @@ -619,7 +765,9 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>                 nr_frags = skb_shinfo(skb)->nr_frags;
> 
>                 sco = (struct skb_cb_overlay *)skb->cb;
> -               sco->meta_slots_used = netbk_gop_skb(skb, &npo);
> +               sco->meta_slots_used = netbk_gop_skb(netbk, skb, &npo, grxvif);
> +               grxvif += npo.copy_prod - old_copy_prod;
> +               old_copy_prod = npo.copy_prod;
> 
>                 count += nr_frags + 1;
> 
> @@ -636,8 +784,10 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>                 return;
> 
>         BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op));
> -       ret = HYPERVISOR_grant_table_op(GNTTABOP_copy, &netbk->grant_copy_op,
> -                                       npo.copy_prod);
> +       ret = grant_memory_copy_op(GNTTABOP_copy,
> +                                  &netbk->grant_copy_op,
> +                                  npo.copy_prod, netbk,
> +                                  false);
>         BUG_ON(ret != 0);
> 
>         while ((skb = __skb_dequeue(&rxq)) != NULL) {
> @@ -918,7 +1068,8 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>                                                   struct xenvif *vif,
>                                                   struct sk_buff *skb,
>                                                   struct xen_netif_tx_request *txp,
> -                                                 struct gnttab_copy *gop)
> +                                                 struct gnttab_copy *gop,
> +                                                 struct xenvif **gtxvif)
>  {
>         struct skb_shared_info *shinfo = skb_shinfo(skb);
>         skb_frag_t *frags = shinfo->frags;
> @@ -944,7 +1095,11 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>                 gop->source.domid = vif->domid;
>                 gop->source.offset = txp->offset;
> 
> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               if (!vif->persistent_grant)
> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               else
> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
> +
>                 gop->dest.domid = DOMID_SELF;
>                 gop->dest.offset = txp->offset;
> 
> @@ -952,6 +1107,9 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
>                 gop->flags = GNTCOPY_source_gref;
> 
>                 gop++;
> +               *gtxvif = vif;
> +               gtxvif++;
> +
> 
>                 memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp));
>                 xenvif_get(vif);
> @@ -1218,6 +1376,7 @@ static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
>  static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>  {
>         struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop;
> +       struct xenvif **gtxvif = netbk->gnttab_tx_vif;
>         struct sk_buff *skb;
>         int ret;
> 
> @@ -1338,7 +1497,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 gop->source.domid = vif->domid;
>                 gop->source.offset = txreq.offset;
> 
> -               gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               if (!vif->persistent_grant)
> +                       gop->dest.u.gmfn = virt_to_mfn(page_address(page));
> +               else
> +                       gop->dest.u.gmfn = (unsigned long)page_address(page);
> +
>                 gop->dest.domid = DOMID_SELF;
>                 gop->dest.offset = txreq.offset;
> 
> @@ -1346,6 +1509,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 gop->flags = GNTCOPY_source_gref;
> 
>                 gop++;
> +               *gtxvif++ = vif;
> 
>                 memcpy(&netbk->pending_tx_info[pending_idx].req,
>                        &txreq, sizeof(txreq));
> @@ -1369,12 +1533,13 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
>                 netbk->pending_cons++;
> 
>                 request_gop = xen_netbk_get_requests(netbk, vif,
> -                                                    skb, txfrags, gop);
> +                                                    skb, txfrags, gop, gtxvif);
>                 if (request_gop == NULL) {
>                         kfree_skb(skb);
>                         netbk_tx_err(vif, &txreq, idx);
>                         continue;
>                 }
> +               gtxvif += request_gop - gop;
>                 gop = request_gop;
> 
>                 vif->tx.req_cons = idx;
> @@ -1467,8 +1632,9 @@ static void xen_netbk_tx_action(struct xen_netbk *netbk)
> 
>         if (nr_gops == 0)
>                 return;
> -       ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,
> -                                       netbk->tx_copy_ops, nr_gops);
> +       ret = grant_memory_copy_op(GNTTABOP_copy,
> +                                  netbk->tx_copy_ops, nr_gops,
> +                                  netbk, true);
>         BUG_ON(ret);
> 
>         xen_netbk_tx_submit(netbk);
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index 410018c..938e908 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -114,6 +114,13 @@ static int netback_probe(struct xenbus_device *dev,
>                         goto abort_transaction;
>                 }
> 
> +               err = xenbus_printf(xbt, dev->nodename,
> +                                   "feature-persistent-grants", "%u", 1);
> +               if (err) {
> +                       message = "writing feature-persistent-grants";
> +                       goto abort_transaction;
> +               }
> +
>                 err = xenbus_transaction_end(xbt, 0);
>         } while (err == -EAGAIN);
> 
> @@ -453,7 +460,12 @@ static int connect_rings(struct backend_info *be)
>                 val = 0;
>         vif->csum = !val;
> 
> -       /* Map the shared frame, irq etc. */
> +       if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-persistent-grants",
> +                        "%u", &val) < 0)

In block devices "feature-persistent" is used, so I think that for
clearness it should be announced the same way in net.

> +               val = 0;
> +       vif->persistent_grant = !!val;
> +
> +/* Map the shared frame, irq etc. */
>         err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref, evtchn);
>         if (err) {
>                 xenbus_dev_fatal(dev, err,
> --
> 1.7.3.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply

* Re: [patch net] net: correct check in dev_addr_del()
From: Jiri Pirko @ 2012-11-15 10:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, eric.dumazet, shemminger, john.r.fastabend
In-Reply-To: <20121114.215254.973344009331632746.davem@davemloft.net>

Thu, Nov 15, 2012 at 03:52:54AM CET, davem@davemloft.net wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Wed, 14 Nov 2012 13:51:04 +0100
>
>> Check (ha->addr == dev->dev_addr) is always true because dev_addr_init()
>> sets this. Correct the check to behave properly on addr removal.
>> 
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>
>I'm pretty sure this is very intentional.
>
>It's trying to prevent deletion of the implicit dev->dev_addr
>entry.  But it will allow decementing the reference count to
>1, but no further.
>
>I'm not applying this.

Please look at dev_addr_init(), line 266:
dev->dev_addr = ha->addr;

and that is never changed.
Therefore check (ha->addr == dev->dev_addr) in dev_addr_del() is always
true and has no meaning.

dev_addr_del() should check if the address passed in "const unsigned
char *addr" function arg is the same as in ha->addr (dev->dev_addr) and
prevent to remove it in that case. And that is what this patch does.

^ permalink raw reply

* Re: [Xen-devel] [PATCH 3/4] Xen/netfront: Implement persistent grant in netfront.
From: Roger Pau Monné @ 2012-11-15 10:52 UTC (permalink / raw)
  To: Annie Li
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com, Ian Campbell
In-Reply-To: <1352963114-628-1-git-send-email-annie.li@oracle.com>

On 15/11/12 08:05, Annie Li wrote:
> Tx/rx page pool are maintained. New grant is mapped and put into
> pool, unmap only happens when releasing/removing device.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netfront.c |  372 +++++++++++++++++++++++++++++++++++++-------
>  1 files changed, 315 insertions(+), 57 deletions(-)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 0ebbb19..17b81c0 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -79,6 +79,13 @@ struct netfront_stats {
>         struct u64_stats_sync   syncp;
>  };
> 
> +struct gnt_list {
> +       grant_ref_t             gref;
> +       struct                  page *gnt_pages;
> +       void                    *gnt_target;
> +       struct                  gnt_list *tail;
> +};

This could also be shared with blkfront.

> +
>  struct netfront_info {
>         struct list_head list;
>         struct net_device *netdev;
> @@ -109,6 +116,10 @@ struct netfront_info {
>         grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
>         unsigned tx_skb_freelist;
> 
> +       struct gnt_list *tx_grant[NET_TX_RING_SIZE];
> +       struct gnt_list *tx_gnt_list;
> +       unsigned int tx_gnt_cnt;

I don't understand this, why do you need both an array and a list? I'm
not familiar with net code, so I don't know if this is some kind of
special netfront thing?

Anyway if you have to use a list I would recommend using one of the list
constructions that's already in the kernel, it simplifies the code and
makes it more easy to understand than creating your own list structure.

> +
>         spinlock_t   rx_lock ____cacheline_aligned_in_smp;
>         struct xen_netif_rx_front_ring rx;
>         int rx_ring_ref;
> @@ -126,6 +137,10 @@ struct netfront_info {
>         grant_ref_t gref_rx_head;
>         grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
> 
> +       struct gnt_list *rx_grant[NET_RX_RING_SIZE];
> +       struct gnt_list *rx_gnt_list;
> +       unsigned int rx_gnt_cnt;

Same comment above here.

> +
>         unsigned long rx_pfn_array[NET_RX_RING_SIZE];
>         struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
>         struct mmu_update rx_mmu[NET_RX_RING_SIZE];
> @@ -134,6 +149,7 @@ struct netfront_info {
>         struct netfront_stats __percpu *stats;
> 
>         unsigned long rx_gso_checksum_fixup;
> +       u8 persistent_gnt:1;
>  };
> 
>  struct netfront_rx_info {
> @@ -194,6 +210,16 @@ static grant_ref_t xennet_get_rx_ref(struct netfront_info *np,
>         return ref;
>  }
> 
> +static struct gnt_list *xennet_get_rx_grant(struct netfront_info *np,
> +                                           RING_IDX ri)
> +{
> +       int i = xennet_rxidx(ri);
> +       struct gnt_list *gntlist = np->rx_grant[i];
> +       np->rx_grant[i] = NULL;

Ok, I think I get why do you need both an array and a list, is that
because netfront doesn't have some kind of shadow ring to keep track of
issued requests?

So each issued request has an associated gnt_list with the list of used
grants? If so it would be good to add a comment about it.

> +       return gntlist;
> +}
> +
>  #ifdef CONFIG_SYSFS
>  static int xennet_sysfs_addif(struct net_device *netdev);
>  static void xennet_sysfs_delif(struct net_device *netdev);
> @@ -231,6 +257,68 @@ static void xennet_maybe_wake_tx(struct net_device *dev)
>                 netif_wake_queue(dev);
>  }
> 
> +static grant_ref_t xennet_alloc_rx_ref(struct net_device *dev,
> +                                      unsigned long mfn, void *vaddr,
> +                                      unsigned int id,
> +                                      grant_ref_t ref)
> +{
> +       struct netfront_info *np = netdev_priv(dev);
> +       grant_ref_t gnt_ref;
> +       struct gnt_list *gnt_list_entry;
> +
> +       if (np->persistent_gnt && np->rx_gnt_cnt) {
> +               gnt_list_entry = np->rx_gnt_list;
> +               np->rx_gnt_list = np->rx_gnt_list->tail;
> +               np->rx_gnt_cnt--;
> +
> +               gnt_list_entry->gnt_target = vaddr;
> +               gnt_ref = gnt_list_entry->gref;
> +               np->rx_grant[id] = gnt_list_entry;
> +       } else {
> +               struct page *page;
> +
> +               BUG_ON(!np->persistent_gnt && np->rx_gnt_cnt);
> +               if (!ref)
> +                       gnt_ref =
> +                               gnttab_claim_grant_reference(&np->gref_rx_head);
> +               else
> +                       gnt_ref = ref;
> +               BUG_ON((signed short)gnt_ref < 0);
> +
> +               if (np->persistent_gnt) {

So you are only using persistent grants if the backend also supports
them. Have you benchmarked the performance of a persistent frontend with
a non-persistent backend. In the block case, usign a persistent frontend
with a non-persistent backend let to an overall performance improvement,
so blkfront uses persistent grants even if blkback doesn't support them.
Take a look at the following graph:

http://xenbits.xen.org/people/royger/persistent_grants/nonpers_read.png

> +                       page = alloc_page(GFP_KERNEL);
> +                       if (!page) {
> +                               if (!ref)
> +                                       gnttab_release_grant_reference(
> +                                                       &np->gref_rx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +                       mfn = pfn_to_mfn(page_to_pfn(page));
> +
> +                       gnt_list_entry = kmalloc(sizeof(struct gnt_list),
> +                                                GFP_KERNEL);
> +                       if (!gnt_list_entry) {
> +                               __free_page(page);
> +                               if (!ref)
> +                                       gnttab_release_grant_reference(
> +                                                       &np->gref_rx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +                       gnt_list_entry->gref = gnt_ref;
> +                       gnt_list_entry->gnt_pages = page;
> +                       gnt_list_entry->gnt_target = vaddr;
> +
> +                       np->rx_grant[id] = gnt_list_entry;
> +               }
> +
> +               gnttab_grant_foreign_access_ref(gnt_ref, np->xbdev->otherend_id,
> +                                               mfn, 0);
> +       }
> +       np->grant_rx_ref[id] = gnt_ref;
> +
> +       return gnt_ref;
> +}
> +
>  static void xennet_alloc_rx_buffers(struct net_device *dev)
>  {
>         unsigned short id;
> @@ -240,8 +328,6 @@ static void xennet_alloc_rx_buffers(struct net_device *dev)
>         int i, batch_target, notify;
>         RING_IDX req_prod = np->rx.req_prod_pvt;
>         grant_ref_t ref;
> -       unsigned long pfn;
> -       void *vaddr;
>         struct xen_netif_rx_request *req;
> 
>         if (unlikely(!netif_carrier_ok(dev)))
> @@ -306,19 +392,16 @@ no_skb:
>                 BUG_ON(np->rx_skbs[id]);
>                 np->rx_skbs[id] = skb;
> 
> -               ref = gnttab_claim_grant_reference(&np->gref_rx_head);
> -               BUG_ON((signed short)ref < 0);
> -               np->grant_rx_ref[id] = ref;
> +               page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
> 
> -               pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
> -               vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0]));
> +               ref = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
> +                                         page_address(page), id, 0);
> +               if ((signed short)ref < 0) {
> +                       __skb_queue_tail(&np->rx_batch, skb);
> +                       break;
> +               }
> 
>                 req = RING_GET_REQUEST(&np->rx, req_prod + i);
> -               gnttab_grant_foreign_access_ref(ref,
> -                                               np->xbdev->otherend_id,
> -                                               pfn_to_mfn(pfn),
> -                                               0);
> -
>                 req->id = id;
>                 req->gref = ref;
>         }
> @@ -375,17 +458,30 @@ static void xennet_tx_buf_gc(struct net_device *dev)
> 
>                         id  = txrsp->id;
>                         skb = np->tx_skbs[id].skb;
> -                       if (unlikely(gnttab_query_foreign_access(
> -                               np->grant_tx_ref[id]) != 0)) {
> -                               printk(KERN_ALERT "xennet_tx_buf_gc: warning "
> -                                      "-- grant still in use by backend "
> -                                      "domain.\n");
> -                               BUG();
> +
> +                       if (np->persistent_gnt) {
> +                               struct gnt_list *gnt_list_entry;
> +
> +                               gnt_list_entry = np->tx_grant[id];
> +                               BUG_ON(!gnt_list_entry);
> +
> +                               gnt_list_entry->tail = np->tx_gnt_list;
> +                               np->tx_gnt_list = gnt_list_entry;
> +                               np->tx_gnt_cnt++;
> +                       } else {
> +                               if (unlikely(gnttab_query_foreign_access(
> +                                       np->grant_tx_ref[id]) != 0)) {
> +                                       printk(KERN_ALERT "xennet_tx_buf_gc: warning "
> +                                              "-- grant still in use by backend "
> +                                              "domain.\n");
> +                                       BUG();
> +                               }
> +
> +                               gnttab_end_foreign_access_ref(
> +                                       np->grant_tx_ref[id], GNTMAP_readonly);

If I've read the code correctly, you are giving this frame both
read/write permissions to the other end on xennet_alloc_tx_ref, but then
you are only removing the read permissions? (see comment below on the
xennet_alloc_tx_ref function).

> +                               gnttab_release_grant_reference(
> +                                     &np->gref_tx_head, np->grant_tx_ref[id]);
>                         }
> -                       gnttab_end_foreign_access_ref(
> -                               np->grant_tx_ref[id], GNTMAP_readonly);
> -                       gnttab_release_grant_reference(
> -                               &np->gref_tx_head, np->grant_tx_ref[id]);
>                         np->grant_tx_ref[id] = GRANT_INVALID_REF;
>                         add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id);
>                         dev_kfree_skb_irq(skb);
> @@ -409,6 +505,59 @@ static void xennet_tx_buf_gc(struct net_device *dev)
>         xennet_maybe_wake_tx(dev);
>  }
> 
> +static grant_ref_t xennet_alloc_tx_ref(struct net_device *dev,
> +                                      unsigned long mfn,
> +                                      unsigned int id)
> +{
> +       struct netfront_info *np = netdev_priv(dev);
> +       grant_ref_t ref;
> +       struct page *granted_page;
> +
> +       if (np->persistent_gnt && np->tx_gnt_cnt) {
> +               struct gnt_list *gnt_list_entry;
> +
> +               gnt_list_entry = np->tx_gnt_list;
> +               np->tx_gnt_list = np->tx_gnt_list->tail;
> +               np->tx_gnt_cnt--;
> +
> +               ref = gnt_list_entry->gref;
> +               np->tx_grant[id] = gnt_list_entry;
> +       } else {
> +               struct gnt_list *gnt_list_entry;
> +
> +               BUG_ON(!np->persistent_gnt && np->tx_gnt_cnt);
> +               ref = gnttab_claim_grant_reference(&np->gref_tx_head);
> +               BUG_ON((signed short)ref < 0);
> +
> +               if (np->persistent_gnt) {
> +                       granted_page = alloc_page(GFP_KERNEL);
> +                       if (!granted_page) {
> +                               gnttab_release_grant_reference(
> +                                                       &np->gref_tx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +
> +                       mfn = pfn_to_mfn(page_to_pfn(granted_page));
> +                       gnt_list_entry = kmalloc(sizeof(struct gnt_list),
> +                                                GFP_KERNEL);
> +                       if (!gnt_list_entry) {
> +                               __free_page(granted_page);
> +                               gnttab_release_grant_reference(
> +                                                       &np->gref_tx_head, ref);
> +                               return -ENOMEM;
> +                       }
> +
> +                       gnt_list_entry->gref = ref;
> +                       gnt_list_entry->gnt_pages = granted_page;
> +                       np->tx_grant[id] = gnt_list_entry;
> +               }
> +               gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
> +                                               mfn, 0);

If you are not always using persistent grants I guess you need to give
read only permissions to this frame (GNTMAP_readonly). Also, for keeping
things in logical order, isn't it best that this function comes before
xennet_tx_buf_gc?

> +       }
> +
> +       return ref;
> +}
> +
> @@ -1132,8 +1357,10 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
>                 }
> 
>                 skb = np->rx_skbs[id];
> -               mfn = gnttab_end_foreign_transfer_ref(ref);
> -               gnttab_release_grant_reference(&np->gref_rx_head, ref);
> +               if (!np->persistent_gnt) {
> +                       mfn = gnttab_end_foreign_transfer_ref(ref);
> +                       gnttab_release_grant_reference(&np->gref_rx_head, ref);
> +               }
>                 np->grant_rx_ref[id] = GRANT_INVALID_REF;
> 
>                 if (0 == mfn) {
> @@ -1607,6 +1834,13 @@ again:
>                 goto abort_transaction;
>         }
> 
> +       err = xenbus_printf(xbt, dev->nodename, "feature-persistent-grants",
> +                           "%u", info->persistent_gnt);

As in netback, I think "feature-persistent" should be used.

> +       if (err) {
> +               message = "writing feature-persistent-grants";
> +               xenbus_dev_fatal(dev, err, "%s", message);
> +       }
> +
>         err = xenbus_transaction_end(xbt, 0);
>         if (err) {
>                 if (err == -EAGAIN)
> @@ -1634,6 +1868,7 @@ static int xennet_connect(struct net_device *dev)
>         grant_ref_t ref;
>         struct xen_netif_rx_request *req;
>         unsigned int feature_rx_copy;
> +       int ret, val;
> 
>         err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
>                            "feature-rx-copy", "%u", &feature_rx_copy);
> @@ -1646,6 +1881,13 @@ static int xennet_connect(struct net_device *dev)
>                 return -ENODEV;
>         }
> 
> +       err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
> +                          "feature-persistent-grants", "%u", &val);
> +       if (err != 1)
> +               val = 0;
> +
> +       np->persistent_gnt = !!val;
> +
>         err = talk_to_netback(np->xbdev, np);
>         if (err)
>                 return err;
> @@ -1657,9 +1899,24 @@ static int xennet_connect(struct net_device *dev)
>         spin_lock_bh(&np->rx_lock);
>         spin_lock_irq(&np->tx_lock);
> 
> +       np->tx_gnt_cnt = 0;
> +       np->rx_gnt_cnt = 0;
> +
>         /* Step 1: Discard all pending TX packet fragments. */
>         xennet_release_tx_bufs(np);
> 
> +       if (np->persistent_gnt) {
> +               struct gnt_list *gnt_list_entry;
> +
> +               while (np->rx_gnt_list) {
> +                       gnt_list_entry = np->rx_gnt_list;
> +                       np->rx_gnt_list = np->rx_gnt_list->tail;
> +                       gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
> +                       __free_page(gnt_list_entry->gnt_pages);
> +                       kfree(gnt_list_entry);
> +               }
> +       }
> +
>         /* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
>         for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
>                 skb_frag_t *frag;
> @@ -1673,10 +1930,11 @@ static int xennet_connect(struct net_device *dev)
> 
>                 frag = &skb_shinfo(skb)->frags[0];
>                 page = skb_frag_page(frag);
> -               gnttab_grant_foreign_access_ref(
> -                       ref, np->xbdev->otherend_id,
> -                       pfn_to_mfn(page_to_pfn(page)),
> -                       0);
> +               ret = xennet_alloc_rx_ref(dev, pfn_to_mfn(page_to_pfn(page)),
> +                                         page_address(page), requeue_idx, ref);
> +               if ((signed short)ret < 0)
> +                       break;
> +
>                 req->gref = ref;
>                 req->id   = requeue_idx;
> 
> --
> 1.7.3.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: Roger Pau Monné @ 2012-11-15 10:56 UTC (permalink / raw)
  To: ANNIE LI
  Cc: Pasi Kärkkäinen, netdev@vger.kernel.org,
	xen-devel@lists.xensource.com, Ian Campbell,
	konrad.wilk@oracle.com
In-Reply-To: <50A4AA06.8080900@oracle.com>

On 15/11/12 09:38, ANNIE LI wrote:
> 
> 
> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>> Hello,
>>
>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>> This patch implements persistent grants for xen-netfront/netback. This
>>> mechanism maintains page pools in netback/netfront, these page pools is used to
>>> save grant pages which are mapped. This way improve performance which is wasted
>>> when doing grant operations.
>>>
>>> Current netback/netfront does map/unmap grant operations frequently when
>>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>>> this patch, netfront/netback maps grant pages when needed and then saves them
>>> into a page pool for future use. All these pages will be unmapped when
>>> removing/releasing the net device.
>>>
>> Do you have performance numbers available already? with/without persistent grants?
> I have some simple netperf/netserver test result with/without persistent 
> grants,
> 
> Following is result of with persistent grant patch,
> 
> Guests, Sum,      Avg,     Min,     Max
>   1,  15106.4,  15106.4, 15106.36, 15106.36
>   2,  13052.7,  6526.34,  6261.81,  6790.86
>   3,  12675.1,  6337.53,  6220.24,  6454.83
>   4,  13194,  6596.98,  6274.70,  6919.25
> 
> 
> Following are result of without persistent patch
> 
> Guests, Sum,     Avg,    Min,        Max
>   1,  10864.1,  10864.1, 10864.10, 10864.10
>   2,  10898.5,  5449.24,  4862.08,  6036.40
>   3,  10734.5,  5367.26,  5261.43,  5473.08
>   4,  10924,    5461.99,  5314.84,  5609.14

In the block case, performance improvement is seen when using a large
number of guests, could you perform the same benchmark increasing the
number of guests to 15?

^ permalink raw reply

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: ANNIE LI @ 2012-11-15 11:12 UTC (permalink / raw)
  To: Wei Liu
  Cc: netdev, xen-devel@lists.xensource.com, Ian Campbell,
	Konrad Rzeszutek Wilk
In-Reply-To: <CAOsiSVU+9fkGQhVhnrx=xxUD8hej55XXJGpKGWAxe1USPEhiEQ@mail.gmail.com>



On 2012-11-15 17:35, Wei Liu wrote:
>
> On Thu, Nov 15, 2012 at 4:38 PM, ANNIE LI <annie.li@oracle.com 
> <mailto:annie.li@oracle.com>> wrote:
>
>
>
>     On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>
>         Hello,
>
>         On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>
>             This patch implements persistent grants for
>             xen-netfront/netback. This
>             mechanism maintains page pools in netback/netfront, these
>             page pools is used to
>             save grant pages which are mapped. This way improve
>             performance which is wasted
>             when doing grant operations.
>
>             Current netback/netfront does map/unmap grant operations
>             frequently when
>             transmitting/receiving packets, and grant operations costs
>             much cpu clock. In
>             this patch, netfront/netback maps grant pages when needed
>             and then saves them
>             into a page pool for future use. All these pages will be
>             unmapped when
>             removing/releasing the net device.
>
>         Do you have performance numbers available already?
>         with/without persistent grants?
>
>     I have some simple netperf/netserver test result with/without
>     persistent grants,
>
>     Following is result of with persistent grant patch,
>
>     Guests, Sum,      Avg,     Min,     Max
>      1,  15106.4,  15106.4, 15106.36, 15106.36
>      2,  13052.7,  6526.34,  6261.81,  6790.86
>      3,  12675.1,  6337.53,  6220.24,  6454.83
>      4,  13194,  6596.98,  6274.70,  6919.25
>
>
>     Following are result of without persistent patch
>
>     Guests, Sum,     Avg,    Min,        Max
>      1,  10864.1,  10864.1, 10864.10, 10864.10
>      2,  10898.5,  5449.24,  4862.08,  6036.40
>      3,  10734.5,  5367.26,  5261.43,  5473.08
>      4,  10924,    5461.99,  5314.84,  5609.14
>
>
>
> Interesting results. Have you tested how good it is on a 10G nic, i.e. 
> guest sending packets
> through physical network to another host.
Not yet.

Thanks
Annie
>
>
> Wei.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: ANNIE LI @ 2012-11-15 11:14 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Pasi Kärkkäinen, netdev@vger.kernel.org,
	xen-devel@lists.xensource.com, Ian Campbell,
	konrad.wilk@oracle.com
In-Reply-To: <50A4CA51.8080208@citrix.com>



On 2012-11-15 18:56, Roger Pau Monné wrote:
> On 15/11/12 09:38, ANNIE LI wrote:
>>
>> On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
>>> Hello,
>>>
>>> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
>>>> This patch implements persistent grants for xen-netfront/netback. This
>>>> mechanism maintains page pools in netback/netfront, these page pools is used to
>>>> save grant pages which are mapped. This way improve performance which is wasted
>>>> when doing grant operations.
>>>>
>>>> Current netback/netfront does map/unmap grant operations frequently when
>>>> transmitting/receiving packets, and grant operations costs much cpu clock. In
>>>> this patch, netfront/netback maps grant pages when needed and then saves them
>>>> into a page pool for future use. All these pages will be unmapped when
>>>> removing/releasing the net device.
>>>>
>>> Do you have performance numbers available already? with/without persistent grants?
>> I have some simple netperf/netserver test result with/without persistent
>> grants,
>>
>> Following is result of with persistent grant patch,
>>
>> Guests, Sum,      Avg,     Min,     Max
>>    1,  15106.4,  15106.4, 15106.36, 15106.36
>>    2,  13052.7,  6526.34,  6261.81,  6790.86
>>    3,  12675.1,  6337.53,  6220.24,  6454.83
>>    4,  13194,  6596.98,  6274.70,  6919.25
>>
>>
>> Following are result of without persistent patch
>>
>> Guests, Sum,     Avg,    Min,        Max
>>    1,  10864.1,  10864.1, 10864.10, 10864.10
>>    2,  10898.5,  5449.24,  4862.08,  6036.40
>>    3,  10734.5,  5367.26,  5261.43,  5473.08
>>    4,  10924,    5461.99,  5314.84,  5609.14
> In the block case, performance improvement is seen when using a large
> number of guests, could you perform the same benchmark increasing the
> number of guests to 15?
Sure, but my current local environment does not meet such test 
requirement, let me find an environment and then start such test.

Thanks
Annie
>

^ permalink raw reply

* Re: [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: ANNIE LI @ 2012-11-15 11:14 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com
In-Reply-To: <1352969589.3499.31.camel@zakaz.uk.xensource.com>



On 2012-11-15 16:53, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>> This patch is based on linux3.4-rc3. I hit netperf/netserver failure
>> on linux latest version v3.7-rc1, v3.7-rc2 and v3.7-rc4. Not sure
>> whether thisnetperf/netserver failure connects compound page commit in
>> v3.7-rc1, but I did hit BUG_ON with debug patch from thread
>> http://lists.xen.org/archives/html/xen-devel/2012-10/msg00893.html
> Do you think you could cook up a netfront fix similar in principal to
> the 6a8ed462f16b8455eec5ae00eb6014159a6721f0 fix for netback?

Ok, let me have a try to do similar thing in netfront.

Thanks
Annie
>
> Ian.
>
>

^ permalink raw reply

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: ANNIE LI @ 2012-11-15 11:14 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	konrad.wilk@oracle.com
In-Reply-To: <1352969771.3499.34.camel@zakaz.uk.xensource.com>



On 2012-11-15 16:56, Ian Campbell wrote:
> On Thu, 2012-11-15 at 07:03 +0000, Annie Li wrote:
>
>> This patch implements persistent grants for xen-netfront/netback. This
>> mechanism maintains page pools in netback/netfront, these page pools
>> is used to save grant pages which are mapped. This way improve
>> performance which is wasted when doing grant operations.
> Please can you send a patch against xen-unstable.hg to document this new
> protocol variant in xen/include/public/io/netif.h. This header is a bit
> under-documented but lets not let it fall further behind (if you want to
> go further and document the existing features, in the style of the
> current blkif.h, then that would be awesome!).
Ok, no problem.
>
> You may also want to provide a similar patch to Linux's copy which is in
> linux/include/xen/interface/io/netif.h
Ok, will do that.

Thanks
Annie
>
> Ian.
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: Ian Campbell @ 2012-11-15 11:15 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: ANNIE LI, Pasi Kärkkäinen, netdev@vger.kernel.org,
	xen-devel@lists.xensource.com, konrad.wilk@oracle.com
In-Reply-To: <50A4CA51.8080208@citrix.com>

On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:
> On 15/11/12 09:38, ANNIE LI wrote:
> > 
> > 
> > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> >> Hello,
> >>
> >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> >>> This patch implements persistent grants for xen-netfront/netback. This
> >>> mechanism maintains page pools in netback/netfront, these page pools is used to
> >>> save grant pages which are mapped. This way improve performance which is wasted
> >>> when doing grant operations.
> >>>
> >>> Current netback/netfront does map/unmap grant operations frequently when
> >>> transmitting/receiving packets, and grant operations costs much cpu clock. In
> >>> this patch, netfront/netback maps grant pages when needed and then saves them
> >>> into a page pool for future use. All these pages will be unmapped when
> >>> removing/releasing the net device.
> >>>
> >> Do you have performance numbers available already? with/without persistent grants?
> > I have some simple netperf/netserver test result with/without persistent 
> > grants,
> > 
> > Following is result of with persistent grant patch,
> > 
> > Guests, Sum,      Avg,     Min,     Max
> >   1,  15106.4,  15106.4, 15106.36, 15106.36
> >   2,  13052.7,  6526.34,  6261.81,  6790.86
> >   3,  12675.1,  6337.53,  6220.24,  6454.83
> >   4,  13194,  6596.98,  6274.70,  6919.25
> > 
> > 
> > Following are result of without persistent patch
> > 
> > Guests, Sum,     Avg,    Min,        Max
> >   1,  10864.1,  10864.1, 10864.10, 10864.10
> >   2,  10898.5,  5449.24,  4862.08,  6036.40
> >   3,  10734.5,  5367.26,  5261.43,  5473.08
> >   4,  10924,    5461.99,  5314.84,  5609.14
> 
> In the block case, performance improvement is seen when using a large
> number of guests, could you perform the same benchmark increasing the
> number of guests to 15?

It would also be nice to see some analysis of the numbers which justify
why this change is a good one without every reviewer having to evaluate
the raw data themselves. In fact this should really be part of the
commit message.

Ian.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox