Netdev List

* Re: [PATCH 2/2] net/netlabel: Avoid call to genlmsg_cancel
From: Paul Moore @ 2011-01-28 15:23 UTC
  To: Julia Lawall; +Cc: kernel-janitors, David S. Miller, netdev, linux-kernel
In-Reply-To: <Pine.LNX.4.64.1101281541030.8546@pc-004.diku.dk>

On Fri, 2011-01-28 at 15:58 +0100, Julia Lawall wrote:
> On Fri, 28 Jan 2011, Paul Moore wrote:
> 
> > On Fri, 2011-01-28 at 15:17 +0100, Julia Lawall wrote:
> > > genlmsg_cancel subtracts some constants from its second argument before
> > > calling nlmsg_cancel.  nlmsg_cancel then calls nlmsg_trim on the same
> > > arguments.  nlmsg_trim tests for NULL before doing any computation, but a
> > > NULL second argument to genlmsg_cancel is no longer NULL due to the initial
> > > subtraction.  Nothing else happens in this execution, so the call to
> > > genlmsg_cancel is simply unnecessary in this case.
> > > 
> > > The semantic match that finds this problem is as follows:
> > > (http://coccinelle.lip6.fr/)
> > > 
> > > // <smpl>
> > > @@
> > > expression data;
> > > @@
> > > 
> > > if (data == NULL) { ...
> > > * genlmsg_cancel(..., data);
> > >   ...
> > >   return ...;
> > > }
> > > // </smpl>
> > > 
> > > Signed-off-by: Julia Lawall <julia@diku.dk>
> > 
> > In all of the cases below, these functions are called multiple times to
> > generate data chunks (additional netlink attributes) which are appended
> > to an existing skbuff.  I believe that the calls to genlmsg_cancel() are
> > still needed to help cleanup in the case where the functions fail on the
> > Nth call.
> > 
> > If I'm wrong, feel free to enlighten me.
> 
> Perhaps something is needed, but I don't see how the current code can 
> work.  The call is genlmsg_cancel(cb_arg->skb, NULL) in each case.

Ah yes, you're right.  You will have to forgive me as it has been quite
a while since I have looked at NetLabel's netlink code.

You also might consider putting a NULL check in genlmsg_cancel() similar
to the check in nlmsg_trim(); that seems like a worthwhile addition.
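
A minimal sketch of what such a guard might look like, modeled in userspace C rather than taken from the kernel tree. The sketch_* names, the integer-address model of the buffer, and the header sizes 16 and 4 are assumptions made only for illustration.

```c
#include <stdint.h>

/* Illustrative stand-ins for NLMSG_HDRLEN and GENL_HDRLEN (assumed values). */
#define SKETCH_NLMSG_HDRLEN 16u
#define SKETCH_GENL_HDRLEN   4u

/* Hypothetical, reduced model of an sk_buff: addresses are plain integers. */
struct sketch_skb {
	uintptr_t data;   /* address of the first byte of the buffer */
	unsigned int len; /* number of bytes currently in the buffer */
};

/* Models nlmsg_trim() plus skb_trim(): shrink the buffer back to 'mark'. */
static void sketch_nlmsg_trim(struct sketch_skb *skb, uintptr_t mark)
{
	unsigned int len;

	if (!mark)
		return;
	len = (unsigned int)(mark - skb->data);
	if (skb->len > len)
		skb->len = len;
}

/* Models genlmsg_cancel() with the proposed guard in front of the subtraction. */
static void sketch_genlmsg_cancel(struct sketch_skb *skb, uintptr_t hdr)
{
	if (hdr) /* a NULL header means nothing was added, so nothing to undo */
		sketch_nlmsg_trim(skb, hdr - SKETCH_GENL_HDRLEN - SKETCH_NLMSG_HDRLEN);
}
```

With the guard in place, the NULL case no longer depends on how the wrapped-around length happens to compare with skb->len.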

> The definition of genlmsg_cancel is:
> 
> static inline void genlmsg_cancel(struct sk_buff *skb, void *hdr)
> {
> 	nlmsg_cancel(skb, hdr - GENL_HDRLEN - NLMSG_HDRLEN);
> }
> 
> Now the second argument to nlmsg_cancel is essentially a negative integer 
> (or a very large pointer).
> 
> nlmsg_cancel will call nlmsg_trim, which is defined as follows:
> 
> static inline void nlmsg_trim(struct sk_buff *skb, const void *mark)
> {
> 	if (mark)
> 		skb_trim(skb, (unsigned char *) mark - skb->data);
> }
> 
> I guess that the subtraction is going to result in an even larger negative 
> number.  The whole process is likely to end in doing nothing in the 
> definition of skb_trim, which is as follows:
> 
> void skb_trim(struct sk_buff *skb, unsigned int len)
> {
> 	if (skb->len > len)
> 		__skb_trim(skb, len);
> }
> 
> since the result of casting a negative number to unsigned is likely to be 
> larger than skb->len.
> 
> 
> > > ---
> > >  net/netlabel/netlabel_cipso_v4.c  |    2 +-
> > >  net/netlabel/netlabel_mgmt.c      |    4 ++--
> > >  net/netlabel/netlabel_unlabeled.c |    2 +-
> > >  3 files changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c
> > > index 5f14c84..0a1f77b 100644
> > > --- a/net/netlabel/netlabel_cipso_v4.c
> > > +++ b/net/netlabel/netlabel_cipso_v4.c
> > > @@ -635,7 +635,7 @@ static int netlbl_cipsov4_listall_cb(struct cipso_v4_doi *doi_def, void *arg)
> > >  			   cb_arg->seq, &netlbl_cipsov4_gnl_family,
> > >  			   NLM_F_MULTI, NLBL_CIPSOV4_C_LISTALL);
> > >  	if (data == NULL)
> > > -		goto listall_cb_failure;
> > > +		return ret_val;
> > >  
> > >  	ret_val = nla_put_u32(cb_arg->skb, NLBL_CIPSOV4_A_DOI, doi_def->doi);
> > >  	if (ret_val != 0)
> > > diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c
> > > index 998e85e..daaa01d 100644
> > > --- a/net/netlabel/netlabel_mgmt.c
> > > +++ b/net/netlabel/netlabel_mgmt.c
> > > @@ -452,7 +452,7 @@ static int netlbl_mgmt_listall_cb(struct netlbl_dom_map *entry, void *arg)
> > >  			   cb_arg->seq, &netlbl_mgmt_gnl_family,
> > >  			   NLM_F_MULTI, NLBL_MGMT_C_LISTALL);
> > >  	if (data == NULL)
> > > -		goto listall_cb_failure;
> > > +		return ret_val;
> > >  
> > >  	ret_val = netlbl_mgmt_listentry(cb_arg->skb, entry);
> > >  	if (ret_val != 0)
> > > @@ -617,7 +617,7 @@ static int netlbl_mgmt_protocols_cb(struct sk_buff *skb,
> > >  			   &netlbl_mgmt_gnl_family, NLM_F_MULTI,
> > >  			   NLBL_MGMT_C_PROTOCOLS);
> > >  	if (data == NULL)
> > > -		goto protocols_cb_failure;
> > > +		return ret_val;
> > >  
> > >  	ret_val = nla_put_u32(skb, NLBL_MGMT_A_PROTOCOL, protocol);
> > >  	if (ret_val != 0)
> > > diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c
> > > index e2b0a68..b5d3945 100644
> > > --- a/net/netlabel/netlabel_unlabeled.c
> > > +++ b/net/netlabel/netlabel_unlabeled.c
> > > @@ -1141,7 +1141,7 @@ static int netlbl_unlabel_staticlist_gen(u32 cmd,
> > >  			   cb_arg->seq, &netlbl_unlabel_gnl_family,
> > >  			   NLM_F_MULTI, cmd);
> > >  	if (data == NULL)
> > > -		goto list_cb_failure;
> > > +		return ret_val;
> > >  
> > >  	if (iface->ifindex > 0) {
> > >  		dev = dev_get_by_index(&init_net, iface->ifindex);
> > > 
> > 
> > -- 
> > paul moore
> > linux @ hp
> > 
> > 

-- 
paul moore
linux @ hp




* Re: [PATCH 2/2] net/netlabel: Avoid call to genlmsg_cancel
From: Julia Lawall @ 2011-01-28 14:58 UTC
  To: Paul Moore; +Cc: kernel-janitors, David S. Miller, netdev, linux-kernel
In-Reply-To: <1296225364.5511.6.camel@sifl>

On Fri, 28 Jan 2011, Paul Moore wrote:

> On Fri, 2011-01-28 at 15:17 +0100, Julia Lawall wrote:
> > genlmsg_cancel subtracts some constants from its second argument before
> > calling nlmsg_cancel.  nlmsg_cancel then calls nlmsg_trim on the same
> > arguments.  nlmsg_trim tests for NULL before doing any computation, but a
> > NULL second argument to genlmsg_cancel is no longer NULL due to the initial
> > subtraction.  Nothing else happens in this execution, so the call to
> > genlmsg_cancel is simply unnecessary in this case.
> > 
> > The semantic match that finds this problem is as follows:
> > (http://coccinelle.lip6.fr/)
> > 
> > // <smpl>
> > @@
> > expression data;
> > @@
> > 
> > if (data == NULL) { ...
> > * genlmsg_cancel(..., data);
> >   ...
> >   return ...;
> > }
> > // </smpl>
> > 
> > Signed-off-by: Julia Lawall <julia@diku.dk>
> 
> In all of the cases below, these functions are called multiple times to
> generate data chunks (additional netlink attributes) which are appended
> to an existing skbuff.  I believe that the calls to genlmsg_cancel() are
> still needed to help cleanup in the case where the functions fail on the
> Nth call.
> 
> If I'm wrong, feel free to enlighten me.

Perhaps something is needed, but I don't see how the current code can 
work.  The call is genlmsg_cancel(cb_arg->skb, NULL) in each case.

The definition of genlmsg_cancel is:

static inline void genlmsg_cancel(struct sk_buff *skb, void *hdr)
{
	nlmsg_cancel(skb, hdr - GENL_HDRLEN - NLMSG_HDRLEN);
}

Now the second argument to nlmsg_cancel is essentially a negative integer 
(or a very large pointer).

nlmsg_cancel will call nlmsg_trim, which is defined as follows:

static inline void nlmsg_trim(struct sk_buff *skb, const void *mark)
{
	if (mark)
		skb_trim(skb, (unsigned char *) mark - skb->data);
}

I guess that the subtraction is going to result in an even larger negative 
number.  The whole process is likely to end in doing nothing in the 
definition of skb_trim, which is as follows:

void skb_trim(struct sk_buff *skb, unsigned int len)
{
	if (skb->len > len)
		__skb_trim(skb, len);
}

since the result of casting a negative number to unsigned is likely to be 
larger than skb->len.
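
The chain described above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: the demo_* names are hypothetical, buffer addresses are modeled as integers to avoid doing arithmetic on a real NULL pointer, and the header sizes 16 and 4 are illustrative stand-ins for NLMSG_HDRLEN and GENL_HDRLEN.

```c
#include <stdint.h>

/* Illustrative stand-ins for the kernel constants (assumed values). */
#define DEMO_NLMSG_HDRLEN 16u
#define DEMO_GENL_HDRLEN   4u

/* Hypothetical, reduced model of an sk_buff. */
struct demo_skb {
	uintptr_t data;   /* address of the start of the buffer, as an integer */
	unsigned int len; /* bytes currently in the buffer */
};

/* Mirrors skb_trim(): only ever shrinks. Returns 1 if it trimmed. */
static int demo_skb_trim(struct demo_skb *skb, unsigned int len)
{
	if (skb->len > len) {
		skb->len = len;
		return 1;
	}
	return 0;
}

/* Mirrors nlmsg_trim(): the NULL test sees the already-adjusted mark. */
static int demo_nlmsg_trim(struct demo_skb *skb, uintptr_t mark)
{
	if (mark)
		return demo_skb_trim(skb, (unsigned int)(mark - skb->data));
	return 0;
}

/* Mirrors genlmsg_cancel(): the header lengths are subtracted first. */
static int demo_genlmsg_cancel(struct demo_skb *skb, uintptr_t hdr)
{
	return demo_nlmsg_trim(skb, hdr - DEMO_GENL_HDRLEN - DEMO_NLMSG_HDRLEN);
}
```

For a buffer at address 0x1000 holding 100 bytes, demo_genlmsg_cancel(&skb, 0) passes the NULL test with a wrapped-around mark, computes a length far above 100, and leaves the buffer untouched, while a valid header address trims the buffer back to the start of that message.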


> > ---
> >  net/netlabel/netlabel_cipso_v4.c  |    2 +-
> >  net/netlabel/netlabel_mgmt.c      |    4 ++--
> >  net/netlabel/netlabel_unlabeled.c |    2 +-
> >  3 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c
> > index 5f14c84..0a1f77b 100644
> > --- a/net/netlabel/netlabel_cipso_v4.c
> > +++ b/net/netlabel/netlabel_cipso_v4.c
> > @@ -635,7 +635,7 @@ static int netlbl_cipsov4_listall_cb(struct cipso_v4_doi *doi_def, void *arg)
> >  			   cb_arg->seq, &netlbl_cipsov4_gnl_family,
> >  			   NLM_F_MULTI, NLBL_CIPSOV4_C_LISTALL);
> >  	if (data == NULL)
> > -		goto listall_cb_failure;
> > +		return ret_val;
> >  
> >  	ret_val = nla_put_u32(cb_arg->skb, NLBL_CIPSOV4_A_DOI, doi_def->doi);
> >  	if (ret_val != 0)
> > diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c
> > index 998e85e..daaa01d 100644
> > --- a/net/netlabel/netlabel_mgmt.c
> > +++ b/net/netlabel/netlabel_mgmt.c
> > @@ -452,7 +452,7 @@ static int netlbl_mgmt_listall_cb(struct netlbl_dom_map *entry, void *arg)
> >  			   cb_arg->seq, &netlbl_mgmt_gnl_family,
> >  			   NLM_F_MULTI, NLBL_MGMT_C_LISTALL);
> >  	if (data == NULL)
> > -		goto listall_cb_failure;
> > +		return ret_val;
> >  
> >  	ret_val = netlbl_mgmt_listentry(cb_arg->skb, entry);
> >  	if (ret_val != 0)
> > @@ -617,7 +617,7 @@ static int netlbl_mgmt_protocols_cb(struct sk_buff *skb,
> >  			   &netlbl_mgmt_gnl_family, NLM_F_MULTI,
> >  			   NLBL_MGMT_C_PROTOCOLS);
> >  	if (data == NULL)
> > -		goto protocols_cb_failure;
> > +		return ret_val;
> >  
> >  	ret_val = nla_put_u32(skb, NLBL_MGMT_A_PROTOCOL, protocol);
> >  	if (ret_val != 0)
> > diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c
> > index e2b0a68..b5d3945 100644
> > --- a/net/netlabel/netlabel_unlabeled.c
> > +++ b/net/netlabel/netlabel_unlabeled.c
> > @@ -1141,7 +1141,7 @@ static int netlbl_unlabel_staticlist_gen(u32 cmd,
> >  			   cb_arg->seq, &netlbl_unlabel_gnl_family,
> >  			   NLM_F_MULTI, cmd);
> >  	if (data == NULL)
> > -		goto list_cb_failure;
> > +		return ret_val;
> >  
> >  	if (iface->ifindex > 0) {
> >  		dev = dev_get_by_index(&init_net, iface->ifindex);
> > 
> 
> -- 
> paul moore
> linux @ hp
> 
> 


* Re: [PATCH 2/2] net/netlabel: Avoid call to genlmsg_cancel
From: Paul Moore @ 2011-01-28 14:36 UTC
  To: Julia Lawall; +Cc: kernel-janitors, David S. Miller, netdev, linux-kernel
In-Reply-To: <1296224232-8115-2-git-send-email-julia@diku.dk>

On Fri, 2011-01-28 at 15:17 +0100, Julia Lawall wrote:
> genlmsg_cancel subtracts some constants from its second argument before
> calling nlmsg_cancel.  nlmsg_cancel then calls nlmsg_trim on the same
> arguments.  nlmsg_trim tests for NULL before doing any computation, but a
> NULL second argument to genlmsg_cancel is no longer NULL due to the initial
> subtraction.  Nothing else happens in this execution, so the call to
> genlmsg_cancel is simply unnecessary in this case.
> 
> The semantic match that finds this problem is as follows:
> (http://coccinelle.lip6.fr/)
> 
> // <smpl>
> @@
> expression data;
> @@
> 
> if (data == NULL) { ...
> * genlmsg_cancel(..., data);
>   ...
>   return ...;
> }
> // </smpl>
> 
> Signed-off-by: Julia Lawall <julia@diku.dk>

In all of the cases below, these functions are called multiple times to
generate data chunks (additional netlink attributes) which are appended
to an existing skbuff.  I believe that the calls to genlmsg_cancel() are
still needed to help cleanup in the case where the functions fail on the
Nth call.

If I'm wrong, feel free to enlighten me.

> ---
>  net/netlabel/netlabel_cipso_v4.c  |    2 +-
>  net/netlabel/netlabel_mgmt.c      |    4 ++--
>  net/netlabel/netlabel_unlabeled.c |    2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c
> index 5f14c84..0a1f77b 100644
> --- a/net/netlabel/netlabel_cipso_v4.c
> +++ b/net/netlabel/netlabel_cipso_v4.c
> @@ -635,7 +635,7 @@ static int netlbl_cipsov4_listall_cb(struct cipso_v4_doi *doi_def, void *arg)
>  			   cb_arg->seq, &netlbl_cipsov4_gnl_family,
>  			   NLM_F_MULTI, NLBL_CIPSOV4_C_LISTALL);
>  	if (data == NULL)
> -		goto listall_cb_failure;
> +		return ret_val;
>  
>  	ret_val = nla_put_u32(cb_arg->skb, NLBL_CIPSOV4_A_DOI, doi_def->doi);
>  	if (ret_val != 0)
> diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c
> index 998e85e..daaa01d 100644
> --- a/net/netlabel/netlabel_mgmt.c
> +++ b/net/netlabel/netlabel_mgmt.c
> @@ -452,7 +452,7 @@ static int netlbl_mgmt_listall_cb(struct netlbl_dom_map *entry, void *arg)
>  			   cb_arg->seq, &netlbl_mgmt_gnl_family,
>  			   NLM_F_MULTI, NLBL_MGMT_C_LISTALL);
>  	if (data == NULL)
> -		goto listall_cb_failure;
> +		return ret_val;
>  
>  	ret_val = netlbl_mgmt_listentry(cb_arg->skb, entry);
>  	if (ret_val != 0)
> @@ -617,7 +617,7 @@ static int netlbl_mgmt_protocols_cb(struct sk_buff *skb,
>  			   &netlbl_mgmt_gnl_family, NLM_F_MULTI,
>  			   NLBL_MGMT_C_PROTOCOLS);
>  	if (data == NULL)
> -		goto protocols_cb_failure;
> +		return ret_val;
>  
>  	ret_val = nla_put_u32(skb, NLBL_MGMT_A_PROTOCOL, protocol);
>  	if (ret_val != 0)
> diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c
> index e2b0a68..b5d3945 100644
> --- a/net/netlabel/netlabel_unlabeled.c
> +++ b/net/netlabel/netlabel_unlabeled.c
> @@ -1141,7 +1141,7 @@ static int netlbl_unlabel_staticlist_gen(u32 cmd,
>  			   cb_arg->seq, &netlbl_unlabel_gnl_family,
>  			   NLM_F_MULTI, cmd);
>  	if (data == NULL)
> -		goto list_cb_failure;
> +		return ret_val;
>  
>  	if (iface->ifindex > 0) {
>  		dev = dev_get_by_index(&init_net, iface->ifindex);
> 

-- 
paul moore
linux @ hp


* Re: [PATCH 1/2] net/wireless/nl80211.c: Avoid call to genlmsg_cancel
From: Johannes Berg @ 2011-01-28 14:20 UTC
  To: Julia Lawall
  Cc: kernel-janitors, John W. Linville, David S. Miller,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <Pine.LNX.4.64.1101281515410.8546-h8frIrHk2441Y/6SN6b7/w@public.gmane.org>

On Fri, 2011-01-28 at 15:16 +0100, Julia Lawall wrote:

> > But why did you call the label differently? :)
> 
> Because out is already used in this case, and I didn't want to change all 
> of the other occurrences of nla_put_failure.  It's a bit sloppy though, 
> because this code is the actual nla_put_failure.  I can change it if you 
> prefer.

Oh, and I could've seen that from the patch itself too, I just missed
it, sorry.

johannes



* [PATCH 2/2] net/netlabel: Avoid call to genlmsg_cancel
From: Julia Lawall @ 2011-01-28 14:17 UTC
  To: Paul Moore; +Cc: kernel-janitors, David S. Miller, netdev, linux-kernel

genlmsg_cancel subtracts some constants from its second argument before
calling nlmsg_cancel.  nlmsg_cancel then calls nlmsg_trim on the same
arguments.  nlmsg_trim tests for NULL before doing any computation, but a
NULL second argument to genlmsg_cancel is no longer NULL due to the initial
subtraction.  Nothing else happens in this execution, so the call to
genlmsg_cancel is simply unnecessary in this case.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression data;
@@

if (data == NULL) { ...
* genlmsg_cancel(..., data);
  ...
  return ...;
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>

---
 net/netlabel/netlabel_cipso_v4.c  |    2 +-
 net/netlabel/netlabel_mgmt.c      |    4 ++--
 net/netlabel/netlabel_unlabeled.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c
index 5f14c84..0a1f77b 100644
--- a/net/netlabel/netlabel_cipso_v4.c
+++ b/net/netlabel/netlabel_cipso_v4.c
@@ -635,7 +635,7 @@ static int netlbl_cipsov4_listall_cb(struct cipso_v4_doi *doi_def, void *arg)
 			   cb_arg->seq, &netlbl_cipsov4_gnl_family,
 			   NLM_F_MULTI, NLBL_CIPSOV4_C_LISTALL);
 	if (data == NULL)
-		goto listall_cb_failure;
+		return ret_val;
 
 	ret_val = nla_put_u32(cb_arg->skb, NLBL_CIPSOV4_A_DOI, doi_def->doi);
 	if (ret_val != 0)
diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c
index 998e85e..daaa01d 100644
--- a/net/netlabel/netlabel_mgmt.c
+++ b/net/netlabel/netlabel_mgmt.c
@@ -452,7 +452,7 @@ static int netlbl_mgmt_listall_cb(struct netlbl_dom_map *entry, void *arg)
 			   cb_arg->seq, &netlbl_mgmt_gnl_family,
 			   NLM_F_MULTI, NLBL_MGMT_C_LISTALL);
 	if (data == NULL)
-		goto listall_cb_failure;
+		return ret_val;
 
 	ret_val = netlbl_mgmt_listentry(cb_arg->skb, entry);
 	if (ret_val != 0)
@@ -617,7 +617,7 @@ static int netlbl_mgmt_protocols_cb(struct sk_buff *skb,
 			   &netlbl_mgmt_gnl_family, NLM_F_MULTI,
 			   NLBL_MGMT_C_PROTOCOLS);
 	if (data == NULL)
-		goto protocols_cb_failure;
+		return ret_val;
 
 	ret_val = nla_put_u32(skb, NLBL_MGMT_A_PROTOCOL, protocol);
 	if (ret_val != 0)
diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c
index e2b0a68..b5d3945 100644
--- a/net/netlabel/netlabel_unlabeled.c
+++ b/net/netlabel/netlabel_unlabeled.c
@@ -1141,7 +1141,7 @@ static int netlbl_unlabel_staticlist_gen(u32 cmd,
 			   cb_arg->seq, &netlbl_unlabel_gnl_family,
 			   NLM_F_MULTI, cmd);
 	if (data == NULL)
-		goto list_cb_failure;
+		return ret_val;
 
 	if (iface->ifindex > 0) {
 		dev = dev_get_by_index(&init_net, iface->ifindex);


* [PATCH 1/2] net/wireless/nl80211.c: Avoid call to genlmsg_cancel
From: Julia Lawall @ 2011-01-28 14:17 UTC
  To: Johannes Berg
  Cc: kernel-janitors, John W. Linville, David S. Miller,
	linux-wireless, netdev, linux-kernel

genlmsg_cancel subtracts some constants from its second argument before
calling nlmsg_cancel.  nlmsg_cancel then calls nlmsg_trim on the same
arguments.  nlmsg_trim tests for NULL before doing any computation, but a
NULL second argument to genlmsg_cancel is no longer NULL due to the initial
subtraction.  Nothing else happens in this execution, so the call to
genlmsg_cancel is simply unnecessary in this case.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression data;
@@

if (data == NULL) { ...
* genlmsg_cancel(..., data);
  ...
  return ...;
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>

---
 net/wireless/nl80211.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 9b62710..864ddfb 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -2718,7 +2718,7 @@ static int nl80211_get_mesh_config(struct sk_buff *skb,
 	hdr = nl80211hdr_put(msg, info->snd_pid, info->snd_seq, 0,
 			     NL80211_CMD_GET_MESH_CONFIG);
 	if (!hdr)
-		goto nla_put_failure;
+		goto out;
 	pinfoattr = nla_nest_start(msg, NL80211_ATTR_MESH_CONFIG);
 	if (!pinfoattr)
 		goto nla_put_failure;
@@ -2759,6 +2759,7 @@ static int nl80211_get_mesh_config(struct sk_buff *skb,
 
  nla_put_failure:
 	genlmsg_cancel(msg, hdr);
+ out:
 	nlmsg_free(msg);
 	return -ENOBUFS;
 }
@@ -2954,7 +2955,7 @@ static int nl80211_get_reg(struct sk_buff *skb, struct genl_info *info)
 	hdr = nl80211hdr_put(msg, info->snd_pid, info->snd_seq, 0,
 			     NL80211_CMD_GET_REG);
 	if (!hdr)
-		goto nla_put_failure;
+		goto put_failure;
 
 	NLA_PUT_STRING(msg, NL80211_ATTR_REG_ALPHA2,
 		cfg80211_regdomain->alpha2);
@@ -3001,6 +3002,7 @@ static int nl80211_get_reg(struct sk_buff *skb, struct genl_info *info)
 
 nla_put_failure:
 	genlmsg_cancel(msg, hdr);
+put_failure:
 	nlmsg_free(msg);
 	err = -EMSGSIZE;
 out:
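
The label layout this diff moves toward can be sketched in userspace C as follows. This is not the nl80211 code: demo_build_reply, fail_at, and cancelled are hypothetical names, and the counter merely stands in for the call to genlmsg_cancel(msg, hdr).

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical handler with stacked cleanup labels.
 * fail_at == 1 simulates a failed header put, fail_at == 2 a failed
 * attribute put, and any other value simulates success.
 */
static int demo_build_reply(int fail_at, int *cancelled)
{
	char *msg, *hdr;

	*cancelled = 0;
	msg = malloc(64);
	if (!msg)
		return -ENOMEM;

	hdr = (fail_at == 1) ? NULL : msg; /* step 1: put the header */
	if (!hdr)
		goto free_msg;             /* nothing was written, skip the cancel */

	if (fail_at == 2)                  /* step 2: put an attribute */
		goto cancel_hdr;

	free(msg); /* success: a real handler would send the message instead */
	return 0;

cancel_hdr:
	*cancelled = 1;                    /* stand-in for genlmsg_cancel(msg, hdr) */
free_msg:
	free(msg);
	return -ENOBUFS;
}
```

Only failures that occur after the header exists pass through the cancel step; a failed header put jumps straight to the free.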


* Re: [PATCH 1/2] net/wireless/nl80211.c: Avoid call to genlmsg_cancel
From: Julia Lawall @ 2011-01-28 14:16 UTC
  To: Johannes Berg
  Cc: kernel-janitors, John W. Linville, David S. Miller,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <1296223267.5118.7.camel@jlt3.sipsolutions.net>

On Fri, 28 Jan 2011, Johannes Berg wrote:

> On Fri, 2011-01-28 at 15:17 +0100, Julia Lawall wrote:
> 
> > diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
> > index 9b62710..864ddfb 100644
> > --- a/net/wireless/nl80211.c
> > +++ b/net/wireless/nl80211.c
> > @@ -2718,7 +2718,7 @@ static int nl80211_get_mesh_config(struct sk_buff *skb,
> >  	hdr = nl80211hdr_put(msg, info->snd_pid, info->snd_seq, 0,
> >  			     NL80211_CMD_GET_MESH_CONFIG);
> >  	if (!hdr)
> > -		goto nla_put_failure;
> > +		goto out;
> 
> 
> > @@ -2954,7 +2955,7 @@ static int nl80211_get_reg(struct sk_buff *skb, struct genl_info *info)
> >  	hdr = nl80211hdr_put(msg, info->snd_pid, info->snd_seq, 0,
> >  			     NL80211_CMD_GET_REG);
> >  	if (!hdr)
> > -		goto nla_put_failure;
> > +		goto put_failure;
> >  
> >  	NLA_PUT_STRING(msg, NL80211_ATTR_REG_ALPHA2,
> >  		cfg80211_regdomain->alpha2);
> 
> Seems fine. Actually, since the message is freed anyhow, the call to
> genlmsg_cancel is *completely* unnecessary, I just put it in to make it
> nest better and not rely on it not having side effects.
> 
> But why did you call the label differently? :)

Because out is already used in this case, and I didn't want to change all 
of the other occurrences of nla_put_failure.  It's a bit sloppy though, 
because this code is the actual nla_put_failure.  I can change it if you 
prefer.

julia


* Re: [PATCH 1/2] net/wireless/nl80211.c: Avoid call to genlmsg_cancel
From: Johannes Berg @ 2011-01-28 14:01 UTC
  To: Julia Lawall
  Cc: kernel-janitors, John W. Linville, David S. Miller,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <1296224232-8115-1-git-send-email-julia@diku.dk>

On Fri, 2011-01-28 at 15:17 +0100, Julia Lawall wrote:

> diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
> index 9b62710..864ddfb 100644
> --- a/net/wireless/nl80211.c
> +++ b/net/wireless/nl80211.c
> @@ -2718,7 +2718,7 @@ static int nl80211_get_mesh_config(struct sk_buff *skb,
>  	hdr = nl80211hdr_put(msg, info->snd_pid, info->snd_seq, 0,
>  			     NL80211_CMD_GET_MESH_CONFIG);
>  	if (!hdr)
> -		goto nla_put_failure;
> +		goto out;


> @@ -2954,7 +2955,7 @@ static int nl80211_get_reg(struct sk_buff *skb, struct genl_info *info)
>  	hdr = nl80211hdr_put(msg, info->snd_pid, info->snd_seq, 0,
>  			     NL80211_CMD_GET_REG);
>  	if (!hdr)
> -		goto nla_put_failure;
> +		goto put_failure;
>  
>  	NLA_PUT_STRING(msg, NL80211_ATTR_REG_ALPHA2,
>  		cfg80211_regdomain->alpha2);

Seems fine. Actually, since the message is freed anyhow, the call to
genlmsg_cancel is *completely* unnecessary, I just put it in to make it
nest better and not rely on it not having side effects.

But why did you call the label differently? :)

johannes



* Re: [PATCH V10 01/15] time: Introduce timekeeping_inject_offset
From: Arnd Bergmann @ 2011-01-28 12:54 UTC
  To: Peter Zijlstra
  Cc: Richard Cochran, John Stultz, Richard Cochran, linux-kernel,
	linux-api, netdev, Alan Cox, Christoph Lameter, David Miller,
	Krzysztof Halasa, Rodolfo Giometti, Thomas Gleixner,
	Benjamin Herrenschmidt, H. Peter Anvin, Ingo Molnar,
	Mike Frysinger, Paul Mackerras, Russell King
In-Reply-To: <1296218630.15234.334.camel@laptop>

On Friday 28 January 2011, Peter Zijlstra wrote:
> > 
> > The problem is step 6. The output of git format-patch does not work when
> > sending with mutt. The easiest solution is to send with git send-email,
> > which does the same as mutt -H, but gets it right.
> 
> I use: formail -s sendmail -t < patches.mbox, but then, I use quilt mail
> to generate the mbox, not git.

I think in that case, quilt generates the correct 'From' headers both in
the actual email headers and in the body, so you can use any standard
email client to send it out.

	Arnd


* Re: [PATCH V10 01/15] time: Introduce timekeeping_inject_offset
From: Peter Zijlstra @ 2011-01-28 12:43 UTC
  To: Arnd Bergmann
  Cc: Richard Cochran, John Stultz, Richard Cochran, linux-kernel,
	linux-api, netdev, Alan Cox, Christoph Lameter, David Miller,
	Krzysztof Halasa, Rodolfo Giometti, Thomas Gleixner,
	Benjamin Herrenschmidt, H. Peter Anvin, Ingo Molnar,
	Mike Frysinger, Paul Mackerras, Russell King
In-Reply-To: <201101281305.37370.arnd@arndb.de>

On Fri, 2011-01-28 at 13:05 +0100, Arnd Bergmann wrote:
> On Friday 28 January 2011, Richard Cochran wrote:
> > I would like to get to the bottom of this. Here is what I did:
> > 
> >    1. Saved your patch to disk in mbox format using Mutt.
> >    2. git am
> >    3. ... rebase, rebase, rebase, ...
> >    4. git format-patch [options] 1234..abcd
> >    5. Edit cover letter
> >    6. for x in 00*; do mutt -H $x; done
> > 
> > Git format-patch places the "From: John Stultz <john.stultz@linaro.org>"
> > line with the other mail headers, and so I guess mutt just faithfully
> > preserves this.
> > 
> > I don't like having to remember to fix this manually. There must be a
> > better way...
> 
> The problem is step 6. The output of git format-patch does not work when
> sending with mutt. The easiest solution is to send with git send-email,
> which does the same as mutt -H, but gets it right.

I use: formail -s sendmail -t < patches.mbox, but then, I use quilt mail
to generate the mbox, not git.


* Re: STMMAC driver: NFS Problem on 2.6.37
From: Shiraz Hashim @ 2011-01-28 12:43 UTC
  To: Chuck Lever
  Cc: Deepak SIKRI, Armando VISCONTI, Trond Myklebust, netdev,
	Linux NFS Mailing List, Viresh KUMAR, Peppe CAVALLARO, amitgoel
In-Reply-To: <EFFBD485-8B7E-44D1-A8D2-61E73BF42DF9-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

Hello Chuck,

On Wed, Jan 26, 2011 at 02:04:21AM +0800, Chuck Lever wrote:
> See analysis in line.
> 
> On Jan 25, 2011, at 6:56 AM, deepaksi wrote:

[...]

> > We have made the following observations:
> > 1. The PHY auto-negotiation process seems to take a long time, and
> > as soon as the link comes up the RPC ping request times out
> > and we receive the "Unable to reach ICMP" error. The timeout
> > error is the same even if no network cable is connected during
> > an NFS boot.
> >
> > 2. We tried changing the rate at which the work queue is
> > scheduled in the PHY framework: instead of scheduling it every HZ
> > (1 sec), we changed it to HZ/10. We did not receive
> > the error. This probably reduced the delay before the PHY framework
> > informs the kernel that the link is up.
> >
> > 3. We tried another network card and did an NFS boot. The
> > only relevant difference we could find was the auto-negotiation
> > time.
> 
> Can you post a similar debugging dump of a root mount that succeeds
> using a different network card?

The following is the NFS boot log with a PCIe-based e1000e NIC.

....
....
[    1.570000] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[    1.580000] e1000: Copyright (c) 1999-2006 Intel Corporation.
[    1.590000] e1000e: Intel(R) PRO/1000 Network Driver - 1.2.7-k2
[    1.590000] e1000e: Copyright (c) 1999 - 2010 Intel Corporation.
[    1.600000] e1000e 0000:01:00.0: Disabling ASPM  L1
[    1.600000] PCI: enabling device 0000:01:00.0 (0140 -> 0142)
[    1.610000] e1000e 0000:01:00.0: (unregistered net_device): Failed to initialize MSI interrupts.  Falling back to legacy interrupts.
[    1.850000] e1000e 0000:01:00.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:15:17:ec:02:ff
[    1.860000] e1000e 0000:01:00.0: eth0: Intel(R) PRO/1000 Network Connection
[    1.870000] e1000e 0000:01:00.0: eth0: MAC: 1, PHY: 4, PBA No: d50861-004
[    1.870000] Intel(R) Gigabit Ethernet Network Driver - version 2.1.0-k2
[    1.880000] Copyright (c) 2007-2009 Intel Corporation.
[    1.880000] Intel(R) Virtual Function Network Driver - version 1.0.0-k0
[    1.890000] Copyright (c) 2009 Intel Corporation.
[    1.900000] CAN device driver interface
[    1.900000] STMMAC driver:
[    1.900000] 	platform registration... 
[    1.910000] 	done!
[    1.910000] 	DWMAC1000 - user ID: 0x10, Synopsys ID: 0x35
[    1.920000] 	Enhanced descriptor structure
[    1.920000] 	eth1 - (dev. name: stmmaceth - id: 0, IRQ #65
[    1.920000] 	IO base addr: 0xd00f0000)
[    1.940000] STMMAC MII Bus: probed
[    1.940000] eth1: PHY ID 20005c7a at 5 IRQ -1 (0:05) active
[    1.950000] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.950000] spear-ehci spear-ehci.0: SPEAr EHCI
[    1.960000] spear-ehci spear-ehci.0: new USB bus registered, assigned bus number 1
[    2.020000] spear-ehci spear-ehci.0: irq 96, io mem 0xe4800000
[    2.040000] spear-ehci spear-ehci.0: USB 0.0 started, EHCI 1.00
[    2.040000] hub 1-0:1.0: USB hub found
[    2.050000] hub 1-0:1.0: 1 port detected
[    2.050000] spear-ehci spear-ehci.1: SPEAr EHCI
[    2.060000] spear-ehci spear-ehci.1: new USB bus registered, assigned bus number 2
[    2.120000] spear-ehci spear-ehci.1: irq 98, io mem 0xe5800000
[    2.140000] spear-ehci spear-ehci.1: USB 0.0 started, EHCI 1.00
[    2.140000] hub 2-0:1.0: USB hub found
[    2.150000] hub 2-0:1.0: 1 port detected
[    2.150000] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    2.160000] spear-ohci spear-ohci.0: SPEAr OHCI
[    2.160000] spear-ohci spear-ohci.0: new USB bus registered, assigned bus number 3
[    2.200000] spear-ohci spear-ohci.0: irq 97, io mem 0xe4000000
[    2.260000] hub 3-0:1.0: USB hub found
[    2.270000] hub 3-0:1.0: 1 port detected
[    2.270000] spear-ohci spear-ohci.1: SPEAr OHCI
[    2.280000] spear-ohci spear-ohci.1: new USB bus registered, assigned bus number 4
[    2.310000] spear-ohci spear-ohci.1: irq 99, io mem 0xe5000000
[    2.370000] hub 4-0:1.0: USB hub found
[    2.380000] hub 4-0:1.0: 1 port detected
[    2.380000] Initializing USB Mass Storage driver...
[    2.390000] usbcore: registered new interface driver usb-storage
[    2.390000] USB Mass Storage support registered.
[    2.400000] usbcore: registered new interface driver usbtest
[    2.400000] designware_udc designware_udc: Device Synopsys UDC probed csr d00fe000: plug d01c4000
[    2.410000] zero gadget: Gadget Zero, version: Cinco de Mayo 2008
[    2.420000] zero gadget: zero ready
[    2.420000] designware_udc designware_udc: reg gadget driver 'zero'
[    2.430000] mice: PS/2 mouse device common for all mice
[    2.440000] input: Spear Keyboard as /devices/platform/keyboard/input/input0
[    2.470000] usbcore: registered new interface driver usbtouchscreen
[    2.470000] input: STMPE610 Touchscreen as /devices/ssp-pl022/spi0.0/input/input1
[    2.670000] stmpe610-spi spi0.0: Detected Touch Screen with chip id: ffff and version: ff
[    2.690000] rtc-spear rtc-spear: rtc core: registered rtc-spear as rtc0
[    2.730000] i2c /dev entries driver
[    2.730000] cortexa9-wdt cortexa9-wdt: registration successful
[    2.740000] dw_dmac.0: DesignWare DMA Controller, 8 channels
[    2.750000] dw_dmac.1: DesignWare DMA Controller, 8 channels
[    2.760000] sdhci: Secure Digital Host Controller Interface driver
[    2.770000] sdhci: Copyright(c) Pierre Ossman
[    2.780000] mmc0: SDHCI controller on sdhci [platform] using DMA
[    2.780000] IPv4 over IPv4 tunneling driver
[    2.790000] TCP cubic registered
[    2.790000] NET: Registered protocol family 17
[    2.800000] can: controller area network core (rev 20090105 abi 8)
[    2.810000] NET: Registered protocol family 29
[    2.810000] can: raw protocol (rev 20090105)
[    2.810000] can: broadcast manager protocol (rev 20090105 t)
[    2.820000] rtc-spear rtc-spear: hctosys: invalid date/time
[    4.520000] IP-Config: Complete:
[    4.520000]      device=eth0, addr=192.168.1.10, mask=255.255.255.0, gw=192.168.1.1,
[    4.530000]      host=192.168.1.10, domain=, nis-domain=(none),
[    4.530000]      bootserver=192.168.1.1, rootserver=192.168.1.1, rootpath=
[    4.540000] Root-NFS: nfsroot=/opt/STM/STLinux-2.4/devkit/armv7/target
[    4.550000] NFS: nfs mount opts='nolock,addr=192.168.1.1'
[    4.550000] NFS:   parsing nfs mount option 'nolock'
[    4.560000] NFS:   parsing nfs mount option 'addr=192.168.1.1'
[    4.560000] NFS: MNTPATH: '/opt/STM/STLinux-2.4/devkit/armv7/target'
[    4.570000] NFS: sending MNT request for 192.168.1.1:/opt/STM/STLinux-2.4/devkit/armv7/target
[    4.580000] Calling rpc_create
[    4.580000] RPC:       set up xprt to 192.168.1.1 (autobind) via tcp
[    4.590000] RPC:       created transport cfb13800 with 16 slots
[    4.590000] xprt_create_transport: RPC:       created transport cfb13800 with 16 slots
[    4.600000] RPC:       creating mount client for 192.168.1.1 (xprt cfb13800)
[    4.610000] RPC:       creating UNIX authenticator for client cfaae800
[    4.620000] Calling rpc_ping
[    4.620000] RPC:       new task initialized, procpid 1
[    4.620000] RPC:       allocated task cfa93100
[    4.630000] RPC:     1 __rpc_execute flags=0x680
[    4.630000] RPC:     1 call_start mount3 proc NULL (sync)
[    4.640000] RPC:     1 call_reserve (status 0)
[    4.640000] RPC:     1 reserved req cfb20000 xid a026435b
[    4.650000] RPC:     1 call_reserveresult (status 0)
[    4.650000] RPC:     1 call_refresh (status 0)
[    4.660000] RPC:     1 holding NULL cred c0492798
[    4.660000] RPC:     1 refreshing NULL cred c0492798
[    4.670000] RPC:     1 call_refreshresult (status 0)
[    4.670000] RPC:     1 call_allocate (status 0)
[    4.680000] RPC:     1 allocated buffer of size 92 at cfb14000
[    4.680000] RPC:     1 call_bind (status 0)
[    4.690000] RPC:     1 rpcb_getport_async(192.168.1.1, 100005, 3, 6)
[    4.690000] RPC:     1 sleep_on(queue "xprt_binding" time 4294937765)
[    4.700000] RPC:     1 added to queue cfb138a4 "xprt_binding"
[    4.710000] RPC:     1 setting alarm for 60000 ms
[    4.710000] RPC:     1 rpcb_getport_async: trying rpcbind version 2
[    4.720000] Calling rpc_create
[    4.720000] RPC:       set up xprt to 192.168.1.1 (port 111) via tcp
[    4.730000] RPC:       created transport cfb14800 with 16 slots
[    4.730000] xprt_create_transport: RPC:       created transport cfb14800 with 16 slots
[    4.740000] RPC:       creating rpcbind client for 192.168.1.1 (xprt cfb14800)
[    4.750000] RPC:       creating UNIX authenticator for client cfaaea00
[    4.750000] rpc_create returns 0xcfaaea00
[    4.760000] RPC:       new task initialized, procpid 1
[    4.760000] RPC:       allocated task cfa93180
[    4.770000] RPC:       rpc_release_client(cfaaea00)
[    4.770000] RPC:     1 sync task going to sleep
[    4.780000] RPC:     2 __rpc_execute flags=0x681
[    4.780000] RPC:     2 call_start rpcbind2 proc GETPORT (async)
[    4.790000] RPC:     2 call_reserve (status 0)
[    4.790000] RPC:     2 reserved req cfb21000 xid aa41d674
[    4.800000] RPC:     2 call_reserveresult (status 0)
[    4.800000] RPC:     2 call_refresh (status 0)
[    4.810000] RPC:     2 looking up UNIX cred
[    4.810000] RPC:       looking up UNIX cred
[    4.820000] RPC:       allocating UNIX cred for uid 0 gid 0
[    4.820000] RPC:     2 refreshing UNIX cred cfa93200
[    4.830000] RPC:     2 call_refreshresult (status 0)
[    4.830000] RPC:     2 call_allocate (status 0)
[    4.840000] RPC:     2 allocated buffer of size 412 at cfb15000
[    4.840000] RPC:     2 call_bind (status 0)
[    4.850000] RPC:     2 call_connect xprt cfb14800 is not connected
[    4.850000] RPC:     2 xprt_connect xprt cfb14800 is not connected
[    4.860000] RPC:     2 sleep_on(queue "xprt_pending" time 4294937782)
[    4.860000] RPC:     2 added to queue cfb149dc "xprt_pending"
[    4.870000] RPC:     2 setting alarm for 60000 ms

[    4.880000] RPC:       xs_connect scheduled xprt cfb14800
[    4.880000] RPC:       xs_bind 0.0.0.0:0: ok (0)
[    4.890000] RPC:       worker connecting xprt cfb14800 via tcp to 192.168.1.1 (port 111)
[    4.890000] RPC:       cfb14800 connect status 115 connected 0 sock state 2

[    5.870000] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

[    5.900000] RPC:       xs_tcp_state_change client cfb14800...
[    5.900000] RPC:       state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
[    5.910000] RPC:     2 __rpc_wake_up_task (now 4294937887)
[    5.910000] RPC:     2 disabling timer
[    5.920000] RPC:     2 removed from queue cfb149dc "xprt_pending"
[    5.920000] RPC:       __rpc_wake_up_task done
[    5.930000] RPC:     2 __rpc_execute flags=0x681
[    5.930000] RPC:     2 xprt_connect_status: retrying
[    5.940000] RPC:     2 call_connect_status (status -11)
[    5.940000] RPC:     2 call_transmit (status 0)
[    5.950000] RPC:     2 xprt_prepare_transmit
[    5.950000] RPC:     2 rpc_xdr_encode (status 0)
[    5.960000] RPC:     2 marshaling UNIX cred cfa93200
[    5.960000] RPC:     2 using AUTH_UNIX cred cfa93200 to wrap rpc data
[    5.970000] RPC:     2 encoding PMAP_GETPORT call (100005, 3, 6, 0)
[    5.980000] RPC:     2 xprt_transmit(92)
[    5.980000] RPC:       xs_tcp_send_request(92) = 92
[    5.980000] RPC:       xs_tcp_data_ready...
[    5.990000] RPC:       xs_tcp_data_recv started
[    5.990000] RPC:       reading TCP record fragment of length 28
[    6.000000] RPC:       reading XID (4 bytes)
[    6.000000] RPC:       reading request with XID aa41d674
[    6.010000] RPC:       reading CALL/REPLY flag (4 bytes)
[    6.010000] RPC:       read reply XID aa41d674
[    6.020000] RPC:       XID aa41d674 read 20 bytes
[    6.020000] RPC:       xprt = cfb14800, tcp_copied = 28, tcp_offset = 28, tcp_reclen = 28
[    6.030000] RPC:     2 xid aa41d674 complete (28 bytes received)
[    6.040000] RPC:       xs_tcp_data_recv done
[    6.040000] RPC:     2 xmit complete
[    6.050000] RPC:       wake_up_next(cfb14974 "xprt_resend")
[    6.050000] RPC:       wake_up_next(cfb1490c "xprt_sending")
[    6.060000] RPC:     2 call_status (status 28)
[    6.060000] RPC:     2 call_decode (status 28)
[    6.070000] RPC:     2 validating UNIX cred cfa93200
[    6.070000] RPC:     2 using AUTH_UNIX cred cfa93200 to unwrap rpc data
[    6.080000] RPC:     2 PMAP_GETPORT result: 48734
[    6.080000] RPC:     2 call_decode result 0
[    6.090000] RPC:       setting port for xprt cfb13800 to 48734
[    6.090000] RPC:     2 rpcb_getport_done(status 0, port 48734)
[    6.100000] RPC:     2 return 0, status 0
[    6.100000] RPC:     2 release task
[    6.110000] RPC:       freeing buffer of size 412 at cfb15000
[    6.110000] RPC:     2 release request cfb21000
[    6.120000] RPC:       wake_up_next(cfb14a44 "xprt_backlog")
[    6.120000] RPC:       rpc_release_client(cfaaea00)
[    6.130000] RPC:       destroying rpcbind client for 192.168.1.1
[    6.130000] RPC:       destroying transport cfb14800
[    6.140000] RPC:       xs_destroy xprt cfb14800
[    6.140000] RPC:       xs_close xprt cfb14800
[    6.150000] RPC:       disconnected transport cfb14800
[    6.150000] RPC:     2 freeing task
[    6.160000] RPC:     1 __rpc_wake_up_task (now 4294937912)
[    6.160000] RPC:     1 disabling timer
[    6.160000] RPC:     1 removed from queue cfb138a4 "xprt_binding"
[    6.170000] RPC:       __rpc_wake_up_task done

....
....

> > Have there been changes in the kernel's RPC framework w.r.t. the RPC
> > ping timeout? This problem was not there in previous kernels.
> 
> There have been changes in the RPC socket code around how it manages
> recovery from failed attempts to connect.  We also have new logic
> now in the RPC client that causes RPC ping to fail immediately if a
> host can't be reached.
> 
> Thanks for your efforts so far.  It would be helpful if you could
> bisect to determine which commit(s) introduced this RPC client
> behavior (or any related changes to your network driver behavior).

We will do so in the coming days. In the meantime, please let us know
if you have a particular suspect in mind.

-- 
regards
Shiraz
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [net-next-2.6 2/3] igb: add support for VF Transmit rate limit using iproute2
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Lior Levy, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217779-30133-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Lior Levy <lior.levy@intel.com>

Implement the igb_ndo_set_vf_bw function, which the iproute2 tool uses
to set a VF transmit rate limit. In addition, update the
igb_ndo_get_vf_config function to report the configured rate limit to
the user.

The rate limit can be configured only while the link is up.
Its value can range from 0 to the actual link speed, measured in Mbps.
A value of '0' disables the rate limit for that specific VF.

iproute2 usage is 'ip link set ethX vf Y rate Z'.
The new rate takes effect as soon as the command completes.
To view the current rate limit, use 'ip link show ethX'.

The rates are zeroed only upon driver reload or a link speed change.

This feature is supported only on 82576 devices.

Signed-off-by: Lior Levy <lior.levy@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/igb/e1000_defines.h |    7 +++
 drivers/net/igb/e1000_regs.h    |    4 ++
 drivers/net/igb/igb.h           |    2 +
 drivers/net/igb/igb_main.c      |   96 ++++++++++++++++++++++++++++++++++++++-
 4 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/drivers/net/igb/e1000_defines.h b/drivers/net/igb/e1000_defines.h
index 6319ed9..0bd543f 100644
--- a/drivers/net/igb/e1000_defines.h
+++ b/drivers/net/igb/e1000_defines.h
@@ -770,4 +770,11 @@
 #define E1000_PCIEMISC_LX_DECISION      0x00000080 /* Lx power decision based
                                                       on DMA coal */
 
+/* Tx Rate-Scheduler Config fields */
+#define E1000_RTTBCNRC_RS_ENA          0x80000000
+#define E1000_RTTBCNRC_RF_DEC_MASK     0x00003FFF
+#define E1000_RTTBCNRC_RF_INT_SHIFT    14
+#define E1000_RTTBCNRC_RF_INT_MASK     \
+	(E1000_RTTBCNRC_RF_DEC_MASK << E1000_RTTBCNRC_RF_INT_SHIFT)
+
 #endif
diff --git a/drivers/net/igb/e1000_regs.h b/drivers/net/igb/e1000_regs.h
index 8ac83c5..a6485a1 100644
--- a/drivers/net/igb/e1000_regs.h
+++ b/drivers/net/igb/e1000_regs.h
@@ -106,6 +106,10 @@
 
 #define E1000_RQDPC(_n) (0x0C030 + ((_n) * 0x40))
 
+/* TX Rate Limit Registers */
+#define E1000_RTTDQSEL 0x3604  /* Tx Desc Plane Queue Select - WO */
+#define E1000_RTTBCNRC 0x36B0  /* Tx BCN Rate-Scheduler Config - WO */
+
 /* Split and Replication RX Control - RW */
 #define E1000_RXPBS    0x02404  /* Rx Packet Buffer Size - RW */
 /*
diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h
index 92a4ef0..bbc5ebf 100644
--- a/drivers/net/igb/igb.h
+++ b/drivers/net/igb/igb.h
@@ -77,6 +77,7 @@ struct vf_data_storage {
 	unsigned long last_nack;
 	u16 pf_vlan; /* When set, guest VLAN config not allowed. */
 	u16 pf_qos;
+	u16 tx_rate;
 };
 
 #define IGB_VF_FLAG_CTS            0x00000001 /* VF is clear to send data */
@@ -323,6 +324,7 @@ struct igb_adapter {
 	u16 rx_ring_count;
 	unsigned int vfs_allocated_count;
 	struct vf_data_storage *vf_data;
+	int vf_rate_link_speed;
 	u32 rss_queues;
 	u32 wvbr;
 };
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index cb6bf7b..6b17317 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -150,6 +150,7 @@ static int igb_ndo_set_vf_vlan(struct net_device *netdev,
 static int igb_ndo_set_vf_bw(struct net_device *netdev, int vf, int tx_rate);
 static int igb_ndo_get_vf_config(struct net_device *netdev, int vf,
 				 struct ifla_vf_info *ivi);
+static void igb_check_vf_rate_limit(struct igb_adapter *);
 
 #ifdef CONFIG_PM
 static int igb_suspend(struct pci_dev *, pm_message_t);
@@ -3511,6 +3512,7 @@ static void igb_watchdog_task(struct work_struct *work)
 			netif_carrier_on(netdev);
 
 			igb_ping_all_vfs(adapter);
+			igb_check_vf_rate_limit(adapter);
 
 			/* link state has changed, schedule phy info update */
 			if (!test_bit(__IGB_DOWN, &adapter->state))
@@ -6599,9 +6601,99 @@ static int igb_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 	return igb_set_vf_mac(adapter, vf, mac);
 }
 
+static int igb_link_mbps(int internal_link_speed)
+{
+	switch (internal_link_speed) {
+	case SPEED_100:
+		return 100;
+	case SPEED_1000:
+		return 1000;
+	default:
+		return 0;
+	}
+}
+
+static void igb_set_vf_rate_limit(struct e1000_hw *hw, int vf, int tx_rate,
+                                  int link_speed)
+{
+	int rf_dec, rf_int;
+	u32 bcnrc_val;
+
+	if (tx_rate != 0) {
+		/* Calculate the rate factor values to set */
+		rf_int = link_speed / tx_rate;
+		rf_dec = (link_speed - (rf_int * tx_rate));
+		rf_dec = (rf_dec * (1 << E1000_RTTBCNRC_RF_INT_SHIFT)) /
+		         tx_rate;
+
+		bcnrc_val = E1000_RTTBCNRC_RS_ENA;
+		bcnrc_val |= ((rf_int << E1000_RTTBCNRC_RF_INT_SHIFT) &
+		               E1000_RTTBCNRC_RF_INT_MASK);
+		bcnrc_val |= (rf_dec & E1000_RTTBCNRC_RF_DEC_MASK);
+	} else
+		bcnrc_val = 0;
+
+	wr32(E1000_RTTDQSEL, vf); /* vf X uses queue X */
+	wr32(E1000_RTTBCNRC, bcnrc_val);
+}
+
+static void igb_check_vf_rate_limit(struct igb_adapter *adapter)
+{
+	int actual_link_speed, i;
+	bool reset_rate = false;
+
+	/* VF TX rate limit was not set or not supported */
+	if ((adapter->vf_rate_link_speed == 0) ||
+	    (adapter->hw.mac.type != e1000_82576))
+		return;
+
+	actual_link_speed = igb_link_mbps(adapter->link_speed);
+	if (actual_link_speed != adapter->vf_rate_link_speed) {
+		reset_rate = true;
+		adapter->vf_rate_link_speed = 0;
+		dev_info(&adapter->pdev->dev,
+		         "Link speed has been changed. VF Transmit "
+		         "rate is disabled\n");
+	}
+
+	for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		if (reset_rate)
+			adapter->vf_data[i].tx_rate = 0;
+
+		igb_set_vf_rate_limit(&adapter->hw, i,
+		                      adapter->vf_data[i].tx_rate,
+		                      actual_link_speed);
+	}
+}
+
 static int igb_ndo_set_vf_bw(struct net_device *netdev, int vf, int tx_rate)
 {
-	return -EOPNOTSUPP;
+	struct igb_adapter *adapter = netdev_priv(netdev);
+	struct e1000_hw *hw = &adapter->hw;
+	int actual_link_speed;
+
+	if (hw->mac.type != e1000_82576)
+		return -EOPNOTSUPP;
+
+	actual_link_speed = igb_link_mbps(adapter->link_speed);
+	if ((vf >= adapter->vfs_allocated_count) ||
+	    (!(rd32(E1000_STATUS) & E1000_STATUS_LU)) ||
+	    (tx_rate < 0) || (tx_rate > actual_link_speed))
+		return -EINVAL;
+
+	adapter->vf_rate_link_speed = actual_link_speed;
+	adapter->vf_data[vf].tx_rate = (u16)tx_rate;
+	igb_set_vf_rate_limit(hw, vf, tx_rate, actual_link_speed);
+
+	if (tx_rate != 0)
+		dev_info(&adapter->pdev->dev,
+		         "Setting Transmit rate of %d Mbps for VF %d\n",
+		         tx_rate, vf);
+	else
+		dev_info(&adapter->pdev->dev,
+		         "Transmit rate limit for VF %d is disabled\n", vf);
+
+	return 0;
 }
 
 static int igb_ndo_get_vf_config(struct net_device *netdev,
@@ -6612,7 +6704,7 @@ static int igb_ndo_get_vf_config(struct net_device *netdev,
 		return -EINVAL;
 	ivi->vf = vf;
 	memcpy(&ivi->mac, adapter->vf_data[vf].vf_mac_addresses, ETH_ALEN);
-	ivi->tx_rate = 0;
+	ivi->tx_rate = adapter->vf_data[vf].tx_rate;
 	ivi->vlan = adapter->vf_data[vf].pf_vlan;
 	ivi->qos = adapter->vf_data[vf].pf_qos;
 	return 0;
-- 
1.7.3.5


^ permalink raw reply related
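
The rate-factor arithmetic in igb_set_vf_rate_limit() from the patch
above can be tried outside the kernel. The following is a minimal
user-space sketch, not the driver code: the helper name
vf_rate_to_bcnrc() is hypothetical, the field constants are copied from
the patch, and the range check that the patch performs in
igb_ndo_set_vf_bw() is folded into the helper as a simplifying
assumption. The value it returns is what the patch would write to
E1000_RTTBCNRC.

```c
#include <stdint.h>

/* Field layout copied from the patch's additions to e1000_defines.h. */
#define RS_ENA        0x80000000u
#define RF_DEC_MASK   0x00003FFFu
#define RF_INT_SHIFT  14
#define RF_INT_MASK   (RF_DEC_MASK << RF_INT_SHIFT)

/*
 * Hypothetical helper: build the rate-scheduler register value for a
 * link speed and a requested VF rate, both in Mbps. The rate factor is
 * link_speed / tx_rate, stored as an integer part and a 14-bit binary
 * fraction. Returns 0 (limit disabled) for a rate of 0; out-of-range
 * rates are also mapped to 0 here, which is an assumption of this sketch.
 */
static uint32_t vf_rate_to_bcnrc(int link_speed, int tx_rate)
{
	uint32_t rf_int, rf_dec;

	if (tx_rate <= 0 || tx_rate > link_speed)
		return 0;

	rf_int = (uint32_t)link_speed / (uint32_t)tx_rate;
	rf_dec = (uint32_t)link_speed % (uint32_t)tx_rate;  /* remainder */
	rf_dec = (rf_dec << RF_INT_SHIFT) / (uint32_t)tx_rate;

	return RS_ENA |
	       ((rf_int << RF_INT_SHIFT) & RF_INT_MASK) |
	       (rf_dec & RF_DEC_MASK);
}
```

For a 1000 Mbps link and a 300 Mbps limit the factor is 3.33: rf_int is
3, rf_dec is 100 * 16384 / 300 = 5461, and the helper returns
0x8000D555.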

* [net-next-2.6 3/3] ixgbe: Adding 100MB FULL support in ethtool
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Atita Shirwaikar, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217779-30133-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Atita Shirwaikar <atita.shirwaikar@intel.com>

The driver currently does not report 100 Mbps full-duplex support in
ethtool. Add it for X540 devices, and report SPEED_100 when the link
runs at that speed.

Signed-off-by: Atita Shirwaikar <atita.shirwaikar@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe_ethtool.c |   34 ++++++++++++++++++++++++++++++++--
 drivers/net/ixgbe/ixgbe_main.c    |    5 ++++-
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 2002ea8..309272f 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -152,7 +152,17 @@ static int ixgbe_get_settings(struct net_device *netdev,
 		ecmd->supported |= (SUPPORTED_1000baseT_Full |
 		                    SUPPORTED_Autoneg);
 
+		switch (hw->mac.type) {
+		case ixgbe_mac_X540:
+			ecmd->supported |= SUPPORTED_100baseT_Full;
+			break;
+		default:
+			break;
+		}
+
 		ecmd->advertising = ADVERTISED_Autoneg;
+		if (hw->phy.autoneg_advertised & IXGBE_LINK_SPEED_100_FULL)
+			ecmd->advertising |= ADVERTISED_100baseT_Full;
 		if (hw->phy.autoneg_advertised & IXGBE_LINK_SPEED_10GB_FULL)
 			ecmd->advertising |= ADVERTISED_10000baseT_Full;
 		if (hw->phy.autoneg_advertised & IXGBE_LINK_SPEED_1GB_FULL)
@@ -167,6 +177,15 @@ static int ixgbe_get_settings(struct net_device *netdev,
 			ecmd->advertising |= (ADVERTISED_10000baseT_Full |
 					      ADVERTISED_1000baseT_Full);
 
+		switch (hw->mac.type) {
+		case ixgbe_mac_X540:
+			if (!(ecmd->advertising & ADVERTISED_100baseT_Full))
+				ecmd->advertising |= (ADVERTISED_100baseT_Full);
+			break;
+		default:
+			break;
+		}
+
 		if (hw->phy.media_type == ixgbe_media_type_copper) {
 			ecmd->supported |= SUPPORTED_TP;
 			ecmd->advertising |= ADVERTISED_TP;
@@ -271,8 +290,19 @@ static int ixgbe_get_settings(struct net_device *netdev,
 
 	hw->mac.ops.check_link(hw, &link_speed, &link_up, false);
 	if (link_up) {
-		ecmd->speed = (link_speed == IXGBE_LINK_SPEED_10GB_FULL) ?
-		               SPEED_10000 : SPEED_1000;
+		switch (link_speed) {
+		case IXGBE_LINK_SPEED_10GB_FULL:
+			ecmd->speed = SPEED_10000;
+			break;
+		case IXGBE_LINK_SPEED_1GB_FULL:
+			ecmd->speed = SPEED_1000;
+			break;
+		case IXGBE_LINK_SPEED_100_FULL:
+			ecmd->speed = SPEED_100;
+			break;
+		default:
+			break;
+		}
 		ecmd->duplex = DUPLEX_FULL;
 	} else {
 		ecmd->speed = -1;
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 602078b..b923e42 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6095,7 +6095,10 @@ static void ixgbe_watchdog_task(struct work_struct *work)
 			       (link_speed == IXGBE_LINK_SPEED_10GB_FULL ?
 			       "10 Gbps" :
 			       (link_speed == IXGBE_LINK_SPEED_1GB_FULL ?
-			       "1 Gbps" : "unknown speed")),
+			       "1 Gbps" :
+			       (link_speed == IXGBE_LINK_SPEED_100_FULL ?
+			       "100 Mbps" :
+			       "unknown speed"))),
 			       ((flow_rx && flow_tx) ? "RX/TX" :
 			       (flow_rx ? "RX" :
 			       (flow_tx ? "TX" : "None"))));
-- 
1.7.3.5


^ permalink raw reply related
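
The patch above replaces a two-way conditional with a switch so that a
third link speed can be reported. A minimal stand-alone sketch of that
mapping follows. The function names are hypothetical, and the
LINK_SPEED_* values are placeholders assumed for this sketch (the
driver's own header defines the real IXGBE_LINK_SPEED_* constants).
Returning -1 for an unrecognised speed is a choice made here; the patch
itself leaves ecmd->speed untouched in its default case.

```c
#include <stdint.h>

/* Placeholder bit values, assumed for illustration only. */
enum {
	LINK_SPEED_UNKNOWN   = 0x0000,
	LINK_SPEED_100_FULL  = 0x0008,
	LINK_SPEED_1GB_FULL  = 0x0020,
	LINK_SPEED_10GB_FULL = 0x0080
};

/* Hypothetical helper: link speed flag to Mbps, or -1 if unknown. */
static int link_speed_to_mbps(uint32_t link_speed)
{
	switch (link_speed) {
	case LINK_SPEED_10GB_FULL:
		return 10000;
	case LINK_SPEED_1GB_FULL:
		return 1000;
	case LINK_SPEED_100_FULL:
		return 100;
	default:
		return -1;
	}
}

/* Hypothetical helper: same mapping, as the watchdog message text. */
static const char *link_speed_to_str(uint32_t link_speed)
{
	switch (link_speed) {
	case LINK_SPEED_10GB_FULL:
		return "10 Gbps";
	case LINK_SPEED_1GB_FULL:
		return "1 Gbps";
	case LINK_SPEED_100_FULL:
		return "100 Mbps";
	default:
		return "unknown speed";
	}
}
```

Keeping both mappings in switch form means a further speed needs one
new case in each, rather than another level of nested conditionals.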

* [net-next-2.6 1/3] igb: Enable PF side of SR-IOV support for i350 devices
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Carolyn Wyborny, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217779-30133-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Carolyn Wyborny <carolyn.wyborny@intel.com>

This patch completes SR-IOV support for i350 devices by enabling the
PF side. The VF side has already been committed.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/igb/e1000_82575.c |   10 ++++++++--
 drivers/net/igb/e1000_mbx.c   |   38 ++++++++++++++++++--------------------
 drivers/net/igb/igb_main.c    |    9 +++++++--
 3 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/drivers/net/igb/e1000_82575.c b/drivers/net/igb/e1000_82575.c
index c1552b6..65c1833 100644
--- a/drivers/net/igb/e1000_82575.c
+++ b/drivers/net/igb/e1000_82575.c
@@ -238,9 +238,15 @@ static s32 igb_get_invariants_82575(struct e1000_hw *hw)
 		size = 14;
 	nvm->word_size = 1 << size;
 
-	/* if 82576 then initialize mailbox parameters */
-	if (mac->type == e1000_82576)
+	/* if part supports SR-IOV then initialize mailbox parameters */
+	switch (mac->type) {
+	case e1000_82576:
+	case e1000_i350:
 		igb_init_mbx_params_pf(hw);
+		break;
+	default:
+		break;
+	}
 
 	/* setup PHY parameters */
 	if (phy->media_type != e1000_media_type_copper) {
diff --git a/drivers/net/igb/e1000_mbx.c b/drivers/net/igb/e1000_mbx.c
index c474cdb..78d48c7 100644
--- a/drivers/net/igb/e1000_mbx.c
+++ b/drivers/net/igb/e1000_mbx.c
@@ -422,26 +422,24 @@ s32 igb_init_mbx_params_pf(struct e1000_hw *hw)
 {
 	struct e1000_mbx_info *mbx = &hw->mbx;
 
-	if (hw->mac.type == e1000_82576) {
-		mbx->timeout = 0;
-		mbx->usec_delay = 0;
-
-		mbx->size = E1000_VFMAILBOX_SIZE;
-
-		mbx->ops.read = igb_read_mbx_pf;
-		mbx->ops.write = igb_write_mbx_pf;
-		mbx->ops.read_posted = igb_read_posted_mbx;
-		mbx->ops.write_posted = igb_write_posted_mbx;
-		mbx->ops.check_for_msg = igb_check_for_msg_pf;
-		mbx->ops.check_for_ack = igb_check_for_ack_pf;
-		mbx->ops.check_for_rst = igb_check_for_rst_pf;
-
-		mbx->stats.msgs_tx = 0;
-		mbx->stats.msgs_rx = 0;
-		mbx->stats.reqs = 0;
-		mbx->stats.acks = 0;
-		mbx->stats.rsts = 0;
-	}
+	mbx->timeout = 0;
+	mbx->usec_delay = 0;
+
+	mbx->size = E1000_VFMAILBOX_SIZE;
+
+	mbx->ops.read = igb_read_mbx_pf;
+	mbx->ops.write = igb_write_mbx_pf;
+	mbx->ops.read_posted = igb_read_posted_mbx;
+	mbx->ops.write_posted = igb_write_posted_mbx;
+	mbx->ops.check_for_msg = igb_check_for_msg_pf;
+	mbx->ops.check_for_ack = igb_check_for_ack_pf;
+	mbx->ops.check_for_rst = igb_check_for_rst_pf;
+
+	mbx->stats.msgs_tx = 0;
+	mbx->stats.msgs_rx = 0;
+	mbx->stats.reqs = 0;
+	mbx->stats.acks = 0;
+	mbx->stats.rsts = 0;
 
 	return 0;
 }
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 200cc32..cb6bf7b 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -2287,9 +2287,14 @@ static int __devinit igb_sw_init(struct igb_adapter *adapter)
 
 	spin_lock_init(&adapter->stats64_lock);
 #ifdef CONFIG_PCI_IOV
-	if (hw->mac.type == e1000_82576)
+	switch (hw->mac.type) {
+	case e1000_82576:
+	case e1000_i350:
 		adapter->vfs_allocated_count = (max_vfs > 7) ? 7 : max_vfs;
-
+		break;
+	default:
+		break;
+	}
 #endif /* CONFIG_PCI_IOV */
 	adapter->rss_queues = min_t(u32, IGB_MAX_RX_QUEUES, num_online_cpus());
 
-- 
1.7.3.5


^ permalink raw reply related
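
The igb_sw_init() hunk above gates VF allocation on the MAC type and
caps the count at 7. A minimal sketch of that decision outside the
driver is shown below. The enum, the helper names and SKETCH_MAX_VFS
are hypothetical; returning 0 for parts without PF-side SR-IOV is an
assumption of the sketch, since the patch simply leaves
vfs_allocated_count unchanged in that case.

```c
/* Hypothetical stand-ins for the driver's MAC type enumeration. */
enum sketch_mac_type {
	SKETCH_MAC_82575,
	SKETCH_MAC_82576,
	SKETCH_MAC_82580,
	SKETCH_MAC_I350
};

#define SKETCH_MAX_VFS 7u  /* cap used by the patch */

/* Nonzero when the part has PF-side SR-IOV support after the patch. */
static int mac_supports_sriov_pf(enum sketch_mac_type type)
{
	switch (type) {
	case SKETCH_MAC_82576:
	case SKETCH_MAC_I350:
		return 1;
	default:
		return 0;
	}
}

/* Number of VFs to allocate for a requested max_vfs value. */
static unsigned int vfs_to_allocate(enum sketch_mac_type type,
				    unsigned int max_vfs)
{
	if (!mac_supports_sriov_pf(type))
		return 0;
	return (max_vfs > SKETCH_MAX_VFS) ? SKETCH_MAX_VFS : max_vfs;
}
```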

* [net-next-2.6 0/3][pull request] Intel Wired LAN Driver Updates
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, bphilips

The following series adds ixgbe ethtool support for 100MB FULL, igb
PF-side SR-IOV support for i350 devices, and igb VF transmit rate
limiting using iproute2.

The following are changes since commit a4daad6b0923030fbd3b00a01f570e4c3eef446b:
  net: Pre-COW metrics for TCP.

and are available in the git repository at:
  master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6 master

Atita Shirwaikar (1):
  ixgbe: Adding 100MB FULL support in ethtool

Carolyn Wyborny (1):
  igb: Enable PF side of SR-IOV support for i350 devices

Lior Levy (1):
  igb: add support for VF Transmit rate limit using iproute2

 drivers/net/igb/e1000_82575.c     |   10 +++-
 drivers/net/igb/e1000_defines.h   |    7 +++
 drivers/net/igb/e1000_mbx.c       |   38 ++++++-------
 drivers/net/igb/e1000_regs.h      |    4 ++
 drivers/net/igb/igb.h             |    2 +
 drivers/net/igb/igb_main.c        |  105 +++++++++++++++++++++++++++++++++++--
 drivers/net/ixgbe/ixgbe_ethtool.c |   34 +++++++++++-
 drivers/net/ixgbe/ixgbe_main.c    |    5 ++-
 8 files changed, 176 insertions(+), 29 deletions(-)

-- 
1.7.3.5


^ permalink raw reply

* [net-2.6 v2 7/7] ixgbe: update version string
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217743-30093-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

This synchronizes the version string with that of the latest
SourceForge driver, which shares its functionality.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 1495b74..83e13a3 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -52,7 +52,7 @@ char ixgbe_driver_name[] = "ixgbe";
 static const char ixgbe_driver_string[] =
 			      "Intel(R) 10 Gigabit PCI Express Network Driver";
 
-#define DRV_VERSION "3.0.12-k2"
+#define DRV_VERSION "3.2.9-k2"
 const char ixgbe_driver_version[] = DRV_VERSION;
 static char ixgbe_copyright[] = "Copyright (c) 1999-2010 Intel Corporation.";
 
-- 
1.7.3.5


^ permalink raw reply related

* [net-2.6 v2 5/7] ixgbe: DDP last buffer size work around
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Amir Hanania, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217743-30093-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Amir Hanania <amir.hanania@intel.com>

We found a hardware erratum on 82599 hardware that can lead to buffer
overwriting if the last buffer in FCoE DDP is exactly PAGE_SIZE.
If this is the case, add one more buffer and set the last buffer size
to 1, so that there is no HW access to that buffer.

Please see the 82599 Specification Update for more information.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe_fcoe.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_fcoe.c b/drivers/net/ixgbe/ixgbe_fcoe.c
index 6342d48..ffac3f6 100644
--- a/drivers/net/ixgbe/ixgbe_fcoe.c
+++ b/drivers/net/ixgbe/ixgbe_fcoe.c
@@ -254,6 +254,25 @@ int ixgbe_fcoe_ddp_get(struct net_device *netdev, u16 xid,
 	/* only the last buffer may have non-full bufflen */
 	lastsize = thisoff + thislen;
 
+	/*
+	 * lastsize can not be PAGE_SIZE.
+	 * If it is then adding another buffer with lastsize = 1.
+	 * Since lastsize is 1 there will be no HW access to this buffer.
+	 */
+	if (lastsize == PAGE_SIZE) {
+		if (j == (IXGBE_BUFFCNT_MAX - 1)) {
+			e_err(drv, "xid=%x:%d,%d,%d:addr=%llx "
+				"not enough descriptors only since lastsize "
+				"is PAGE_SIZE\n",
+				xid, i, j, dmacount, (u64)addr);
+			goto out_noddp_free;
+		}
+
+		ddp->udl[j+1] = ddp->udl[j];
+		j++;
+		lastsize = 1;
+	}
+
 	fcbuff = (IXGBE_FCBUFF_4KB << IXGBE_FCBUFF_BUFFSIZE_SHIFT);
 	fcbuff |= ((j & 0xff) << IXGBE_FCBUFF_BUFFCNT_SHIFT);
 	fcbuff |= (firstoff << IXGBE_FCBUFF_OFFSET_SHIFT);
-- 
1.7.3.5


^ permalink raw reply related
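
The workaround in the patch above can be isolated into a small
function for experimentation. This is a sketch under stated
assumptions, not the driver code: ddp_fix_last_buffer() is a
hypothetical name, the descriptor list is modelled as a plain array of
64-bit addresses, and SKETCH_PAGE_SIZE and SKETCH_BUFFCNT_MAX are
assumed stand-ins for the kernel's PAGE_SIZE and the driver's
IXGBE_BUFFCNT_MAX.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE   4096u  /* assumed page size */
#define SKETCH_BUFFCNT_MAX 256u   /* assumed descriptor list capacity */

/*
 * If the last buffer would be exactly one page, duplicate the last
 * descriptor into the next slot and declare a last-buffer size of 1.
 * *j is the index of the last used descriptor. Returns 0 on success
 * (including when no change is needed) and -1 when no spare
 * descriptor is left, in which case DDP has to be abandoned.
 */
static int ddp_fix_last_buffer(uint64_t *udl, unsigned int *j,
			       unsigned int *lastsize)
{
	if (*lastsize != SKETCH_PAGE_SIZE)
		return 0;

	if (*j == SKETCH_BUFFCNT_MAX - 1)
		return -1;

	udl[*j + 1] = udl[*j];
	(*j)++;
	*lastsize = 1;
	return 0;
}
```

The cost of the workaround is one descriptor, so a transfer that
already fills the list and ends on a page boundary cannot use DDP.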

* [net-2.6 v2 6/7] ixgbe: cleanup variable initialization
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217743-30093-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

The ixgbe_fcoe_ddp_get function did not initialize its addr variable,
which produced a compiler warning.  This patch initializes it to 0.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe_fcoe.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_fcoe.c b/drivers/net/ixgbe/ixgbe_fcoe.c
index ffac3f6..24d74ca 100644
--- a/drivers/net/ixgbe/ixgbe_fcoe.c
+++ b/drivers/net/ixgbe/ixgbe_fcoe.c
@@ -165,7 +165,7 @@ int ixgbe_fcoe_ddp_get(struct net_device *netdev, u16 xid,
 	unsigned int thisoff = 0;
 	unsigned int thislen = 0;
 	u32 fcbuff, fcdmarw, fcfltrw;
-	dma_addr_t addr;
+	dma_addr_t addr = 0;
 
 	if (!netdev || !sgl)
 		return 0;
-- 
1.7.3.5


^ permalink raw reply related

* [net-2.6 v2 4/7] ixgbe: limit VF access to network traffic
From: Jeff Kirsher @ 2011-01-28 12:29 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217743-30093-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change fixes VM pool allocation issues based on MAC address
filtering and limits the scope of VF access to promiscuous mode.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe_common.c |    3 +++
 drivers/net/ixgbe/ixgbe_sriov.c  |    2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_common.c b/drivers/net/ixgbe/ixgbe_common.c
index d5ede2d..ebbda7d 100644
--- a/drivers/net/ixgbe/ixgbe_common.c
+++ b/drivers/net/ixgbe/ixgbe_common.c
@@ -1370,6 +1370,9 @@ s32 ixgbe_init_rx_addrs_generic(struct ixgbe_hw *hw)
 		hw_dbg(hw, " New MAC Addr =%pM\n", hw->mac.addr);
 
 		hw->mac.ops.set_rar(hw, 0, hw->mac.addr, 0, IXGBE_RAH_AV);
+
+		/*  clear VMDq pool/queue selection for RAR 0 */
+		hw->mac.ops.clear_vmdq(hw, 0, IXGBE_CLEAR_VMDQ_ALL);
 	}
 	hw->addr_ctrl.overflow_promisc = 0;
 
diff --git a/drivers/net/ixgbe/ixgbe_sriov.c b/drivers/net/ixgbe/ixgbe_sriov.c
index 47b1573..187b3a1 100644
--- a/drivers/net/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ixgbe/ixgbe_sriov.c
@@ -110,12 +110,10 @@ static int ixgbe_set_vf_vlan(struct ixgbe_adapter *adapter, int add, int vid,
 	return adapter->hw.mac.ops.set_vfta(&adapter->hw, vid, vf, (bool)add);
 }
 
-
 static void ixgbe_set_vmolr(struct ixgbe_hw *hw, u32 vf, bool aupe)
 {
 	u32 vmolr = IXGBE_READ_REG(hw, IXGBE_VMOLR(vf));
 	vmolr |= (IXGBE_VMOLR_ROMPE |
-		  IXGBE_VMOLR_ROPE |
 		  IXGBE_VMOLR_BAM);
 	if (aupe)
 		vmolr |= IXGBE_VMOLR_AUPE;
-- 
1.7.3.5


^ permalink raw reply related

* [net-2.6 v2 2/7] ixgbe: fix variable set but not used warnings by gcc 4.6
From: Jeff Kirsher @ 2011-01-28 12:28 UTC (permalink / raw)
  To: davem; +Cc: Emil Tantilov, netdev, gospo, bphilips, Jeff Kirsher
In-Reply-To: <1296217743-30093-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

Caught with gcc 4.6 -Wunused-but-set-variable

Remove unused napi_vectors variable.

Fix the use of reset_bit in ixgbe_reset_hw_X540()

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe_main.c |    3 ---
 drivers/net/ixgbe/ixgbe_x540.c |    6 +++---
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 602078b..44a1cf0 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -4863,16 +4863,13 @@ static int ixgbe_alloc_q_vectors(struct ixgbe_adapter *adapter)
 {
 	int q_idx, num_q_vectors;
 	struct ixgbe_q_vector *q_vector;
-	int napi_vectors;
 	int (*poll)(struct napi_struct *, int);
 
 	if (adapter->flags & IXGBE_FLAG_MSIX_ENABLED) {
 		num_q_vectors = adapter->num_msix_vectors - NON_Q_VECTORS;
-		napi_vectors = adapter->num_rx_queues;
 		poll = &ixgbe_clean_rxtx_many;
 	} else {
 		num_q_vectors = 1;
-		napi_vectors = 1;
 		poll = &ixgbe_poll;
 	}
 
diff --git a/drivers/net/ixgbe/ixgbe_x540.c b/drivers/net/ixgbe/ixgbe_x540.c
index 3a89239..f2518b0 100644
--- a/drivers/net/ixgbe/ixgbe_x540.c
+++ b/drivers/net/ixgbe/ixgbe_x540.c
@@ -133,17 +133,17 @@ static s32 ixgbe_reset_hw_X540(struct ixgbe_hw *hw)
 	}
 
 	ctrl = IXGBE_READ_REG(hw, IXGBE_CTRL);
-	IXGBE_WRITE_REG(hw, IXGBE_CTRL, (ctrl | IXGBE_CTRL_RST));
+	IXGBE_WRITE_REG(hw, IXGBE_CTRL, (ctrl | reset_bit));
 	IXGBE_WRITE_FLUSH(hw);
 
 	/* Poll for reset bit to self-clear indicating reset is complete */
 	for (i = 0; i < 10; i++) {
 		udelay(1);
 		ctrl = IXGBE_READ_REG(hw, IXGBE_CTRL);
-		if (!(ctrl & IXGBE_CTRL_RST))
+		if (!(ctrl & reset_bit))
 			break;
 	}
-	if (ctrl & IXGBE_CTRL_RST) {
+	if (ctrl & reset_bit) {
 		status = IXGBE_ERR_RESET_FAILED;
 		hw_dbg(hw, "Reset polling failed to complete.\n");
 	}
-- 
1.7.3.5
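
For readers following the ixgbe_reset_hw_X540() change, here is a minimal user-space sketch of the bounded poll for a self-clearing reset bit. It is not driver code: `struct fake_ctrl`, `fake_ctrl_read()` and `poll_reset_done()` are hypothetical stand-ins for the CTRL register and IXGBE_READ_REG(), and the udelay() between reads is left out. The point is that the same `reset_bit` value that was written must also be the one tested in the loop and in the final failure check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a control register with self-clearing bits. */
struct fake_ctrl {
	uint32_t value;         /* current register contents */
	uint32_t self_clearing; /* bits the "hardware" clears on its own */
	int reads_left;         /* reads that still return the bits set */
};

/* Simulated register read: the bits clear once reads_left is used up. */
static uint32_t fake_ctrl_read(struct fake_ctrl *reg)
{
	if (reg->reads_left == 0)
		reg->value &= ~reg->self_clearing;
	else
		reg->reads_left--;
	return reg->value;
}

/*
 * Poll until reset_bit self-clears, at most max_polls times.
 * Returns 0 when the bit cleared, -1 when it was still set at the end.
 */
static int poll_reset_done(struct fake_ctrl *reg, uint32_t reset_bit,
			   int max_polls)
{
	uint32_t ctrl = reg->value;
	int i;

	assert(max_polls >= 0);
	for (i = 0; i < max_polls; i++) {
		ctrl = fake_ctrl_read(reg);
		if (!(ctrl & reset_bit))
			break;
	}
	return (ctrl & reset_bit) ? -1 : 0;
}
```

Testing a different constant than the one written would make the loop watch the wrong bit whenever the two differ, which is the situation the patch avoids.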


^ permalink raw reply related

* [net-2.6 v2 1/7] e1000: add support for Marvell Alaska M88E1118R PHY
From: Jeff Kirsher @ 2011-01-28 12:28 UTC (permalink / raw)
  To: davem
  Cc: Florian Fainelli, netdev, gospo, bphilips, Dirk Brandewie,
	Jeff Kirsher
In-Reply-To: <1296217743-30093-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Florian Fainelli <ffainelli@freebox.fr>

This patch adds support for Marvell Alaska M88E1118R PHY chips. Support for
other M88* PHYs is already there, so there is nothing more to add than its
PHY id.

CC: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Florian Fainelli <ffainelli@freebox.fr>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/e1000/e1000_hw.c |    4 +++-
 drivers/net/e1000/e1000_hw.h |    1 +
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/net/e1000/e1000_hw.c b/drivers/net/e1000/e1000_hw.c
index aed223b..7501d97 100644
--- a/drivers/net/e1000/e1000_hw.c
+++ b/drivers/net/e1000/e1000_hw.c
@@ -124,6 +124,7 @@ static s32 e1000_set_phy_type(struct e1000_hw *hw)
 	case M88E1000_I_PHY_ID:
 	case M88E1011_I_PHY_ID:
 	case M88E1111_I_PHY_ID:
+	case M88E1118_E_PHY_ID:
 		hw->phy_type = e1000_phy_m88;
 		break;
 	case IGP01E1000_I_PHY_ID:
@@ -3222,7 +3223,8 @@ static s32 e1000_detect_gig_phy(struct e1000_hw *hw)
 		break;
 	case e1000_ce4100:
 		if ((hw->phy_id == RTL8211B_PHY_ID) ||
-		    (hw->phy_id == RTL8201N_PHY_ID))
+		    (hw->phy_id == RTL8201N_PHY_ID) ||
+		    (hw->phy_id == M88E1118_E_PHY_ID))
 			match = true;
 		break;
 	case e1000_82541:
diff --git a/drivers/net/e1000/e1000_hw.h b/drivers/net/e1000/e1000_hw.h
index 196eeda..c70b23d 100644
--- a/drivers/net/e1000/e1000_hw.h
+++ b/drivers/net/e1000/e1000_hw.h
@@ -2917,6 +2917,7 @@ struct e1000_host_command_info {
 #define M88E1000_14_PHY_ID M88E1000_E_PHY_ID
 #define M88E1011_I_REV_4   0x04
 #define M88E1111_I_PHY_ID  0x01410CC0
+#define M88E1118_E_PHY_ID  0x01410E40
 #define L1LXT971A_PHY_ID   0x001378E0
 
 #define RTL8211B_PHY_ID    0x001CC910
-- 
1.7.3.5


^ permalink raw reply related

* [net-2.6 v2 0/7][pull request] Intel Wired LAN Driver Updates
From: Jeff Kirsher @ 2011-01-28 12:28 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, bphilips

The following series contains the addition of a PHY id for e1000
and several ixgbe fixes.

v2- fixed tab/space issue in patch 3 of series

The following are changes since commit 8f2771f2b85aea4d0f9a0137ad3b63d1173c0962:
  ipv6: Remove route peer binding assertions.

and are available in the git repository at:
  master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-2.6 master

Alexander Duyck (1):
  ixgbe: limit VF access to network traffic

Amir Hanania (1):
  ixgbe: DDP last buffer size work around

Don Skidmore (3):
  ixgbe: fix for 82599 erratum on Header Splitting.
  ixgbe: cleanup variable initialization
  ixgbe: update version string

Emil Tantilov (1):
  ixgbe: fix variable set but not used warnings by gcc 4.6

Florian Fainelli (1):
  e1000: add support for Marvell Alaska M88E1118R PHY

 drivers/net/e1000/e1000_hw.c     |    4 +++-
 drivers/net/e1000/e1000_hw.h     |    1 +
 drivers/net/ixgbe/ixgbe_common.c |    3 +++
 drivers/net/ixgbe/ixgbe_fcoe.c   |   21 ++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_main.c   |   16 ++++++++++------
 drivers/net/ixgbe/ixgbe_sriov.c  |    2 --
 drivers/net/ixgbe/ixgbe_x540.c   |    6 +++---
 7 files changed, 40 insertions(+), 13 deletions(-)

-- 
1.7.3.5


^ permalink raw reply

* [net-2.6 v2 3/7] ixgbe: fix for 82599 erratum on Header Splitting
From: Jeff Kirsher @ 2011-01-28 12:28 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, bphilips, netdev, Jeff Kirsher, gospo, stable
In-Reply-To: <1296217743-30093-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

We have found a hardware erratum on 82599 hardware that can lead to
unpredictable behavior when Header Splitting mode is enabled.  So
we are no longer enabling this feature on affected hardware.

Please see the 82599 Specification Update for more information.

CC: stable@kernel.org
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe_main.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 44a1cf0..1495b74 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -3176,9 +3176,16 @@ static void ixgbe_set_rx_buffer_len(struct ixgbe_adapter *adapter)
 	u32 mhadd, hlreg0;
 
 	/* Decide whether to use packet split mode or not */
+	/* On by default */
+	adapter->flags |= IXGBE_FLAG_RX_PS_ENABLED;
+
 	/* Do not use packet split if we're in SR-IOV Mode */
-	if (!adapter->num_vfs)
-		adapter->flags |= IXGBE_FLAG_RX_PS_ENABLED;
+	if (adapter->num_vfs)
+		adapter->flags &= ~IXGBE_FLAG_RX_PS_ENABLED;
+
+	/* Disable packet split due to 82599 erratum #45 */
+	if (hw->mac.type == ixgbe_mac_82599EB)
+		adapter->flags &= ~IXGBE_FLAG_RX_PS_ENABLED;
 
 	/* Set the RX buffer length according to the mode */
 	if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
-- 
1.7.3.5
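
The decision order in the hunk above (on by default, off for SR-IOV, off for 82599) can be read as a small pure function. This is only an illustrative sketch: `enum mac_type`, `FLAG_RX_PS_ENABLED` and `decide_packet_split()` are hypothetical names that mirror, but are not, the driver's own definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's MAC types and flag bit. */
enum mac_type { MAC_82598EB, MAC_82599EB, MAC_X540 };
#define FLAG_RX_PS_ENABLED (1u << 0)

/*
 * Return the flags with the packet split bit decided:
 * enabled by default, then cleared when VFs are in use (SR-IOV)
 * or when the MAC is an 82599 (header splitting erratum).
 */
static unsigned int decide_packet_split(unsigned int flags, int num_vfs,
					enum mac_type mac)
{
	assert(num_vfs >= 0);

	flags |= FLAG_RX_PS_ENABLED;            /* on by default */
	if (num_vfs)
		flags &= ~FLAG_RX_PS_ENABLED;   /* no packet split with SR-IOV */
	if (mac == MAC_82599EB)
		flags &= ~FLAG_RX_PS_ENABLED;   /* erratum workaround */
	return flags;
}
```

Setting the flag first and then clearing it per condition keeps each exclusion independent, so a further exclusion could be added without touching the existing checks.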

^ permalink raw reply related

* Re: Realtek r8168C / r8169 driver VLAN TAG stripping
From: Anand Raj Manickam @ 2011-01-28 12:16 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev, Hayes, Ivan Vecera
In-Reply-To: <20110128120624.GA8100@electric-eye.fr.zoreil.com>

On Fri, Jan 28, 2011 at 5:36 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
> Added Ivan to the Cc:. He has got a 8168c with XID 1c4000c0 and may tell if
> hardware VLAN works for him or not.

Keeping my fingers crossed for Ivan's results :)


> Anand Raj Manickam <anandrm@gmail.com> :
>> On Thu, Jan 27, 2011 at 10:20 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
>> > Anand Raj Manickam <anandrm@gmail.com> :
>> >> On Thu, Jan 27, 2011 at 8:37 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
>> >> > Anand Raj Manickam <anandrm@gmail.com> :
>> > [...]
>> >> > - ip addr show
>> >>
>> >> 3: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>> >>     link/ether 00:17:54:00:f6:62 brd ff:ff:ff:ff:ff:ff
>> >>     inet 172.16.1.1/16 brd 172.16.255.255 scope global eth0
>> >>     inet6 fe80::217:54ff:fe00:f662/64 scope link
>> >>        valid_lft forever preferred_lft forever
>> >>
>> >> 8: eth0.50@eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc noqueue
>> >>     link/ether 00:17:54:00:f6:62 brd ff:ff:ff:ff:ff:ff
>> >>     inet 172.16.10.10/24 brd 172.16.10.255 scope global eth0.50
>> >>     inet6 fe80::217:54ff:fe00:f662/64 scope link
>> >>        valid_lft forever preferred_lft forever
>> >
>> > Could you try again after issuing :
>> >
>> > ip addr del 172.16.1.1/16 brd 172.16.255.255 dev eth0
>>
>>
>> I did try this NO luck ;-(
>>
>> > then send the unabbreviated "ip addr show" and "ip route show all" if
>> > things do not perform better.
>> >
>>
>>  ip addr show
>> 1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>     inet 127.0.0.1/8 scope host lo
>>     inet6 ::1/128 scope host
>>        valid_lft forever preferred_lft forever
>> 2: sit0: <NOARP> mtu 1480 qdisc noop
>>     link/sit 0.0.0.0 brd 0.0.0.0
>> 3: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>>     link/ether 00:17:54:00:f6:62 brd ff:ff:ff:ff:ff:ff
>>     inet6 fe80::217:54ff:fe00:f662/64 scope link
>>        valid_lft forever preferred_lft forever
>> 4: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
>>     link/ether 00:17:54:00:f6:63 brd ff:ff:ff:ff:ff:ff
>> 5: eth2: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>>     link/ether 00:30:67:09:2c:b9 brd ff:ff:ff:ff:ff:ff
>>     inet 10.1.1.2/24 brd 10.1.1.255 scope global eth2
>>     inet6 fe80::230:67ff:fe09:2cb9/64 scope link
>>        valid_lft forever preferred_lft forever
>> 6: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
>>     link/ether 00:17:54:00:65:6b brd ff:ff:ff:ff:ff:ff
>> 7: eth4: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
>>     link/ether 00:17:54:00:65:6a brd ff:ff:ff:ff:ff:ff
>>     inet 192.168.138.155/24 brd 192.168.138.255 scope global eth4
>>     inet6 fe80::217:54ff:fe00:656a/64 scope link
>>        valid_lft forever preferred_lft forever
>> 8: eth0.50@eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc noqueue
>>     link/ether 00:17:54:00:f6:62 brd ff:ff:ff:ff:ff:ff
>>     inet 172.16.10.10/24 brd 172.16.10.255 scope global eth0.50
>>     inet6 fe80::217:54ff:fe00:f662/64 scope link
>>        valid_lft forever preferred_lft forever
>
> (mostly sequential hardware mac addresses)
>
> Which Arkino product is it ? Quad (+1) port switch / hub ? AK1140 ?
>
> Forget the "ip route show all" for now.
>
> [...]
>> >> The same config works on forcedeth
>> >
>> > What do you call "same config" ?
>>
>> The Same setup below works on forcedeth driver
>
> So you can remove any single 8168 adapter from eth[0134], replace it with
> an external (non-LOM) forcedeth, keep the three remaining 8168s and it
> works correctly ?
>
> If your setup includes a card that contains several 8168 chipsets behind
> some kind of bridge, it is not exactly the same setup as a single (LOM ?)
> forcedeth network adapter.

It is an onboard chipset, so it cannot be replaced :-(

>
> [...]
>> >
>> > I am mildly convinced that your config is simple enough to isolate a
>> > driver level vlan problem.
>>
>> The reason why I am sure it is in the driver / chipset is this ..
> [printk removed]
>
> OK. This is evidence.
>
> Reading my rev1.0 8168c datasheet from may 2007, when there is no tx
> offload, no checksumming, the tx descriptor layout should be the same
> as the perennial 8169 tx descriptor layout.
>
> Either (1) the VLAN registers and descriptor layout are different for this
> chipset or (2) something prevents the register / descriptor write (read ?)
> to be completely effective or (3) there is something beyond the 8168 or
> (4) there is a 8168 hardware bug.
>
> 1 : Hayes may answer. You can give Realtek's own driver a try btw.
> 2 : Seen before. It could be a software or a (non-8168) hardware one.
>    I have no idea if your hardware setup includes a single card with
>    four ports or four independent cards with their own 8168 or worse.
> 3 : See the hardware setup part of (2).
> 4 : I don't hope so. Hayes may answer as well.


>
> --
> Ueimor
>

^ permalink raw reply

* Re: Network performance with small packets
From: Michael S. Tsirkin @ 2011-01-28 12:16 UTC (permalink / raw)
  To: Shirley Ma; +Cc: David Miller, steved, kvm, netdev
In-Reply-To: <1296163838.1640.53.camel@localhost.localdomain>

On Thu, Jan 27, 2011 at 01:30:38PM -0800, Shirley Ma wrote:
> On Thu, 2011-01-27 at 13:02 -0800, David Miller wrote:
> > > Interesting. Could this be a variant of the now famous
> > bufferbloat then?
> > 
> > Sigh, bufferbloat is the new global warming... :-/ 
> 
> Yep, some places become colder, some other places become warmer; Same as
> BW results, sometimes faster, sometimes slower. :)
> 
> Shirley

OK, so thinking about it more, maybe the issue is this:
tx becomes full. We process one request and interrupt the guest,
then it adds one request and the queue is full again.

Maybe the following will help it stabilize?
By itself it does nothing, but if you set
all the parameters to a huge value we will
only interrupt when we see an empty ring.
Which might be too much: pls try other values
in the middle: e.g. make bufs half the ring,
or bytes some small value, or packets some
small value etc.
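
To make the threshold idea above concrete, here is a small user-space model of the per-packet check. It is a sketch under assumptions, not vhost code: `struct coalesce_limits`, `struct coalesce_state`, `coalesce_should_signal()` and `count_signals()` are hypothetical names, each packet is assumed to use one buffer, and the counters are never reset after a signal, so once a limit is exceeded every later packet in the same run signals too.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical thresholds, one per tunable (packets, bytes, buffers). */
struct coalesce_limits {
	int packets;
	int bytes;
	int bufs;
};

/* Running totals for one pass over the TX ring. */
struct coalesce_state {
	int packets;
	int bytes;
	int bufs;
};

/* Account for one sent packet; true if any total exceeds its limit. */
static bool coalesce_should_signal(struct coalesce_state *st,
				   const struct coalesce_limits *lim,
				   int len, int bufs)
{
	st->packets += 1;
	st->bytes += len;
	st->bufs += bufs;
	return st->packets > lim->packets ||
	       st->bytes > lim->bytes ||
	       st->bufs > lim->bufs;
}

/* Count the signals raised by a burst of n packets of len bytes each. */
static int count_signals(const struct coalesce_limits *lim, int n, int len)
{
	struct coalesce_state st = { 0, 0, 0 };
	int signals = 0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		if (coalesce_should_signal(&st, lim, len, 1))
			signals++;
	return signals;
}
```

With all limits at 0 the model signals on every packet, which matches "by itself it does nothing"; with all limits set very high it never signals inside the loop.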

Warning: completely untested.

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index aac05bc..6769cdc 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -32,6 +32,13 @@
  * Using this limit prevents one virtqueue from starving others. */
 #define VHOST_NET_WEIGHT 0x80000
 
+int tx_bytes_coalesce = 0;
+module_param(tx_bytes_coalesce, int, 0644);
+int tx_bufs_coalesce = 0;
+module_param(tx_bufs_coalesce, int, 0644);
+int tx_packets_coalesce = 0;
+module_param(tx_packets_coalesce, int, 0644);
+
 enum {
 	VHOST_NET_VQ_RX = 0,
 	VHOST_NET_VQ_TX = 1,
@@ -127,6 +134,9 @@ static void handle_tx(struct vhost_net *net)
 	int err, wmem;
 	size_t hdr_size;
 	struct socket *sock;
+	int bytes_coalesced = 0;
+	int bufs_coalesced = 0;
+	int packets_coalesced = 0;
 
 	/* TODO: check that we are running from vhost_worker? */
 	sock = rcu_dereference_check(vq->private_data, 1);
@@ -196,14 +206,26 @@ static void handle_tx(struct vhost_net *net)
 		if (err != len)
 			pr_debug("Truncated TX packet: "
 				 " len %d != %zd\n", err, len);
-		vhost_add_used_and_signal(&net->dev, vq, head, 0);
 		total_len += len;
+		packets_coalesced += 1;
+		bytes_coalesced += len;
+		bufs_coalesced += in;
+		if (unlikely(packets_coalesced > tx_packets_coalesce ||
+			     bytes_coalesced > tx_bytes_coalesce ||
+			     bufs_coalesced > tx_bufs_coalesce))
+			vhost_add_used_and_signal(&net->dev, vq, head, 0);
+		else
+			vhost_add_used(vq, head, 0);
 		if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
 			vhost_poll_queue(&vq->poll);
 			break;
 		}
 	}
 
+	if (likely(packets_coalesced > tx_packets_coalesce ||
+		   bytes_coalesced > tx_bytes_coalesce ||
+		   bufs_coalesced > tx_bufs_coalesce))
+		vhost_signal(&net->dev, vq);
 	mutex_unlock(&vq->mutex);
 }
 

^ permalink raw reply related
