Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH 2/6] netfilter: nf_tables: fix port natting in little endian archs
From: Pablo Neira Ayuso @ 2015-01-10 18:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1420915808-7160-1-git-send-email-pablo@netfilter.org>

From: leroy christophe <christophe.leroy@c-s.fr>

Make sure this fetches 16-bits port data from the register.
Remove casting to make sparse happy, not needed anymore.

Signed-off-by: leroy christophe <christophe.leroy@c-s.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/nft_redir_ipv4.c |    8 ++++----
 net/ipv6/netfilter/nft_redir_ipv6.c |    8 ++++----
 net/netfilter/nft_nat.c             |    8 ++++----
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/netfilter/nft_redir_ipv4.c b/net/ipv4/netfilter/nft_redir_ipv4.c
index ff2d23d..6ecfce6 100644
--- a/net/ipv4/netfilter/nft_redir_ipv4.c
+++ b/net/ipv4/netfilter/nft_redir_ipv4.c
@@ -27,10 +27,10 @@ static void nft_redir_ipv4_eval(const struct nft_expr *expr,
 
 	memset(&mr, 0, sizeof(mr));
 	if (priv->sreg_proto_min) {
-		mr.range[0].min.all = (__force __be16)
-					data[priv->sreg_proto_min].data[0];
-		mr.range[0].max.all = (__force __be16)
-					data[priv->sreg_proto_max].data[0];
+		mr.range[0].min.all =
+			*(__be16 *)&data[priv->sreg_proto_min].data[0];
+		mr.range[0].max.all =
+			*(__be16 *)&data[priv->sreg_proto_max].data[0];
 		mr.range[0].flags |= NF_NAT_RANGE_PROTO_SPECIFIED;
 	}
 
diff --git a/net/ipv6/netfilter/nft_redir_ipv6.c b/net/ipv6/netfilter/nft_redir_ipv6.c
index 2433a6b..11820b6 100644
--- a/net/ipv6/netfilter/nft_redir_ipv6.c
+++ b/net/ipv6/netfilter/nft_redir_ipv6.c
@@ -27,10 +27,10 @@ static void nft_redir_ipv6_eval(const struct nft_expr *expr,
 
 	memset(&range, 0, sizeof(range));
 	if (priv->sreg_proto_min) {
-		range.min_proto.all = (__force __be16)
-					data[priv->sreg_proto_min].data[0];
-		range.max_proto.all = (__force __be16)
-					data[priv->sreg_proto_max].data[0];
+		range.min_proto.all =
+			*(__be16 *)&data[priv->sreg_proto_min].data[0];
+		range.max_proto.all =
+			*(__be16 *)&data[priv->sreg_proto_max].data[0];
 		range.flags |= NF_NAT_RANGE_PROTO_SPECIFIED;
 	}
 
diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c
index afe2b0b..aff54fb1 100644
--- a/net/netfilter/nft_nat.c
+++ b/net/netfilter/nft_nat.c
@@ -65,10 +65,10 @@ static void nft_nat_eval(const struct nft_expr *expr,
 	}
 
 	if (priv->sreg_proto_min) {
-		range.min_proto.all = (__force __be16)
-					data[priv->sreg_proto_min].data[0];
-		range.max_proto.all = (__force __be16)
-					data[priv->sreg_proto_max].data[0];
+		range.min_proto.all =
+			*(__be16 *)&data[priv->sreg_proto_min].data[0];
+		range.max_proto.all =
+			*(__be16 *)&data[priv->sreg_proto_max].data[0];
 		range.flags |= NF_NAT_RANGE_PROTO_SPECIFIED;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 4/6] netfilter: nfnetlink: validate nfnetlink header from batch
From: Pablo Neira Ayuso @ 2015-01-10 18:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1420915808-7160-1-git-send-email-pablo@netfilter.org>

Make sure there is enough room for the nfnetlink header in the
netlink messages that are part of the batch. There is a similar
check in netlink_rcv_skb().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nfnetlink.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 13c2e17..c6619d4 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -321,7 +321,8 @@ replay:
 		nlh = nlmsg_hdr(skb);
 		err = 0;
 
-		if (nlh->nlmsg_len < NLMSG_HDRLEN) {
+		if (nlmsg_len(nlh) < sizeof(struct nfgenmsg) ||
+		    skb->len < nlh->nlmsg_len) {
 			err = -EINVAL;
 			goto ack;
 		}
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 6/6] netfilter: nf_tables: fix flush ruleset chain dependencies
From: Pablo Neira Ayuso @ 2015-01-10 18:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1420915808-7160-1-git-send-email-pablo@netfilter.org>

Jumping between chains doesn't mix well with flush ruleset. Rules
from a different chain and set elements may still refer to us.

[  353.373791] ------------[ cut here ]------------
[  353.373845] kernel BUG at net/netfilter/nf_tables_api.c:1159!
[  353.373896] invalid opcode: 0000 [#1] SMP
[  353.373942] Modules linked in: intel_powerclamp uas iwldvm iwlwifi
[  353.374017] CPU: 0 PID: 6445 Comm: 31c3.nft Not tainted 3.18.0 #98
[  353.374069] Hardware name: LENOVO 5129CTO/5129CTO, BIOS 6QET47WW (1.17 ) 07/14/2010
[...]
[  353.375018] Call Trace:
[  353.375046]  [<ffffffff81964c31>] ? nf_tables_commit+0x381/0x540
[  353.375101]  [<ffffffff81949118>] nfnetlink_rcv+0x3d8/0x4b0
[  353.375150]  [<ffffffff81943fc5>] netlink_unicast+0x105/0x1a0
[  353.375200]  [<ffffffff8194438e>] netlink_sendmsg+0x32e/0x790
[  353.375253]  [<ffffffff818f398e>] sock_sendmsg+0x8e/0xc0
[  353.375300]  [<ffffffff818f36b9>] ? move_addr_to_kernel.part.20+0x19/0x70
[  353.375357]  [<ffffffff818f44f9>] ? move_addr_to_kernel+0x19/0x30
[  353.375410]  [<ffffffff819016d2>] ? verify_iovec+0x42/0xd0
[  353.375459]  [<ffffffff818f3e10>] ___sys_sendmsg+0x3f0/0x400
[  353.375510]  [<ffffffff810615fa>] ? native_sched_clock+0x2a/0x90
[  353.375563]  [<ffffffff81176697>] ? acct_account_cputime+0x17/0x20
[  353.375616]  [<ffffffff8110dc78>] ? account_user_time+0x88/0xa0
[  353.375667]  [<ffffffff818f4bbd>] __sys_sendmsg+0x3d/0x80
[  353.375719]  [<ffffffff81b184f4>] ? int_check_syscall_exit_work+0x34/0x3d
[  353.375776]  [<ffffffff818f4c0d>] SyS_sendmsg+0xd/0x20
[  353.375823]  [<ffffffff81b1826d>] system_call_fastpath+0x16/0x1b

Release objects in this order: rules -> sets -> chains -> tables, to
make sure no references to chains are held anymore.

Reported-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.biz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 129a8da..3b3ddb4 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -713,16 +713,12 @@ static int nft_flush_table(struct nft_ctx *ctx)
 	struct nft_chain *chain, *nc;
 	struct nft_set *set, *ns;
 
-	list_for_each_entry_safe(chain, nc, &ctx->table->chains, list) {
+	list_for_each_entry(chain, &ctx->table->chains, list) {
 		ctx->chain = chain;
 
 		err = nft_delrule_by_chain(ctx);
 		if (err < 0)
 			goto out;
-
-		err = nft_delchain(ctx);
-		if (err < 0)
-			goto out;
 	}
 
 	list_for_each_entry_safe(set, ns, &ctx->table->sets, list) {
@@ -735,6 +731,14 @@ static int nft_flush_table(struct nft_ctx *ctx)
 			goto out;
 	}
 
+	list_for_each_entry_safe(chain, nc, &ctx->table->chains, list) {
+		ctx->chain = chain;
+
+		err = nft_delchain(ctx);
+		if (err < 0)
+			goto out;
+	}
+
 	err = nft_deltable(ctx);
 out:
 	return err;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 5/6] netfilter: nfnetlink: relax strict multicast group check from netlink_bind
From: Pablo Neira Ayuso @ 2015-01-10 18:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1420915808-7160-1-git-send-email-pablo@netfilter.org>

Relax the checking that was introduced in 97840cb ("netfilter:
nfnetlink: fix insufficient validation in nfnetlink_bind") when the
subscription bitmask is used. Existing userspace code code may request
to listen to all of the existing netlink groups by setting an all to one
subscription group bitmask. Netlink already validates subscription via
setsockopt() for us.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nfnetlink.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index c6619d4..1aa7049 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -470,7 +470,7 @@ static int nfnetlink_bind(int group)
 	int type;

 	if (group <= NFNLGRP_NONE || group > NFNLGRP_MAX)
-		return -EINVAL;
+		return 0;

 	type = nfnl_group2type[group];

-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 0/6] netfilter/ipvs fixes for net
From: Pablo Neira Ayuso @ 2015-01-10 18:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains netfilter/ipvs fixes, they are:

1) Small fix for the FTP helper in IPVS, a diff variable may be left
   unset when CONFIG_IP_VS_IPV6 is set. Patch from Dan Carpenter.

2) Fix nf_tables port NAT in little endian archs, patch from leroy
   christophe.

3) Fix race condition between conntrack confirmation and flush from
   userspace. This is the second reincarnation to resolve this problem.

4) Make sure inner messages in the batch come with the nfnetlink header.

5) Relax strict check from nfnetlink_bind() that may break old userspace
   applications using all 1s group mask.

6) Schedule removal of chains once no sets and rules refer to them in
   the new nf_tables ruleset flush command. Reported by Asbjoern Sloth
   Toennesen.

Note that this batch comes later than usual because of the short
winter holidays.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Have a happy new year, thanks!

----------------------------------------------------------------

The following changes since commit ac9a3d84e121196263636f2d38d439a45888005a:

  be2net: Fix incorrect setting of tunnel offload flag in netdev features (2014-12-18 12:51:29 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

for you to fetch changes up to a2f18db0c68fec96631c10cad9384c196e9008ac:

  netfilter: nf_tables: fix flush ruleset chain dependencies (2015-01-06 22:27:48 +0100)

----------------------------------------------------------------
Dan Carpenter (1):
      ipvs: uninitialized data with IP_VS_IPV6

Pablo Neira Ayuso (5):
      Merge tag 'ipvs2-for-v3.19' of https://git.kernel.org/.../horms/ipvs-next into ipvs-next
      netfilter: conntrack: fix race between confirmation and flush
      netfilter: nfnetlink: validate nfnetlink header from batch
      netfilter: nfnetlink: relax strict multicast group check from netlink_bind
      netfilter: nf_tables: fix flush ruleset chain dependencies

leroy christophe (1):
      netfilter: nf_tables: fix port natting in little endian archs

 net/ipv4/netfilter/nft_redir_ipv4.c |    8 ++++----
 net/ipv6/netfilter/nft_redir_ipv6.c |    8 ++++----
 net/netfilter/ipvs/ip_vs_ftp.c      |   10 +++++-----
 net/netfilter/nf_conntrack_core.c   |   20 +++++++++-----------
 net/netfilter/nf_tables_api.c       |   14 +++++++++-----
 net/netfilter/nfnetlink.c           |    5 +++--
 net/netfilter/nft_nat.c             |    8 ++++----
 7 files changed, 38 insertions(+), 35 deletions(-)

^ permalink raw reply

* [PATCH 1/6] ipvs: uninitialized data with IP_VS_IPV6
From: Pablo Neira Ayuso @ 2015-01-10 18:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1420915808-7160-1-git-send-email-pablo@netfilter.org>

From: Dan Carpenter <dan.carpenter@oracle.com>

The app_tcp_pkt_out() function expects "*diff" to be set and ends up
using uninitialized data if CONFIG_IP_VS_IPV6 is turned on.

The same issue is there in app_tcp_pkt_in().  Thanks to Julian Anastasov
for noticing that.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 net/netfilter/ipvs/ip_vs_ftp.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index 1d5341f..5d3daae 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -183,6 +183,8 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
 	struct nf_conn *ct;
 	struct net *net;
 
+	*diff = 0;
+
 #ifdef CONFIG_IP_VS_IPV6
 	/* This application helper doesn't work with IPv6 yet,
 	 * so turn this into a no-op for IPv6 packets
@@ -191,8 +193,6 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
 		return 1;
 #endif
 
-	*diff = 0;
-
 	/* Only useful for established sessions */
 	if (cp->state != IP_VS_TCP_S_ESTABLISHED)
 		return 1;
@@ -322,6 +322,9 @@ static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
 	struct ip_vs_conn *n_cp;
 	struct net *net;
 
+	/* no diff required for incoming packets */
+	*diff = 0;
+
 #ifdef CONFIG_IP_VS_IPV6
 	/* This application helper doesn't work with IPv6 yet,
 	 * so turn this into a no-op for IPv6 packets
@@ -330,9 +333,6 @@ static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
 		return 1;
 #endif
 
-	/* no diff required for incoming packets */
-	*diff = 0;
-
 	/* Only useful for established sessions */
 	if (cp->state != IP_VS_TCP_S_ESTABLISHED)
 		return 1;
-- 
1.7.10.4


^ permalink raw reply related

* [PATCH 3/6] netfilter: conntrack: fix race between confirmation and flush
From: Pablo Neira Ayuso @ 2015-01-10 18:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1420915808-7160-1-git-send-email-pablo@netfilter.org>

Commit 5195c14c8b27c ("netfilter: conntrack: fix race in
__nf_conntrack_confirm against get_next_corpse") aimed to resolve the
race condition between the confirmation (packet path) and the flush
command (from control plane). However, it introduced a crash when
several packets race to add a new conntrack, which seems easier to
reproduce when nf_queue is in place.

Fix this race, in __nf_conntrack_confirm(), by removing the CT
from unconfirmed list before checking the DYING bit. In case
race occured, re-add the CT to the dying list

This patch also changes the verdict from NF_ACCEPT to NF_DROP when
we lose race. Basically, the confirmation happens for the first packet
that we see in a flow. If you just invoked conntrack -F once (which
should be the common case), then this is likely to be the first packet
of the flow (unless you already called flush anytime soon in the past).
This should be hard to trigger, but better drop this packet, otherwise
we leave things in inconsistent state since the destination will likely
reply to this packet, but it will find no conntrack, unless the origin
retransmits.

The change of the verdict has been discussed in:
https://www.marc.info/?l=linux-netdev&m=141588039530056&w=2

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_core.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index a116748..46d1b26 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -611,16 +611,15 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	 */
 	NF_CT_ASSERT(!nf_ct_is_confirmed(ct));
 	pr_debug("Confirming conntrack %p\n", ct);
-	/* We have to check the DYING flag inside the lock to prevent
-	   a race against nf_ct_get_next_corpse() possibly called from
-	   user context, else we insert an already 'dead' hash, blocking
-	   further use of that particular connection -JM */
+	/* We have to check the DYING flag after unlink to prevent
+	 * a race against nf_ct_get_next_corpse() possibly called from
+	 * user context, else we insert an already 'dead' hash, blocking
+	 * further use of that particular connection -JM.
+	 */
+	nf_ct_del_from_dying_or_unconfirmed_list(ct);

-	if (unlikely(nf_ct_is_dying(ct))) {
-		nf_conntrack_double_unlock(hash, reply_hash);
-		local_bh_enable();
-		return NF_ACCEPT;
-	}
+	if (unlikely(nf_ct_is_dying(ct)))
+		goto out;

 	/* See if there's one in the list already, including reverse:
 	   NAT could have grabbed it without realizing, since we're
@@ -636,8 +635,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 		    zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)))
 			goto out;

-	nf_ct_del_from_dying_or_unconfirmed_list(ct);
-
 	/* Timer relative to confirmation time, not original
 	   setting time, otherwise we'd get timer wrap in
 	   weird delay cases. */
@@ -673,6 +670,7 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	return NF_ACCEPT;

 out:
+	nf_ct_add_to_dying_list(ct);
 	nf_conntrack_double_unlock(hash, reply_hash);
 	NF_CT_STAT_INC(net, insert_failed);
 	local_bh_enable();
-- 
1.7.10.4

^ permalink raw reply related

* Re: Clarification regarding IFLA_BRPORT_LEARNING_SYNC and aging of fdb entries learnt via br_fdb_external_learn_add()
From: Scott Feldman @ 2015-01-10 20:28 UTC (permalink / raw)
  To: Arad, Ronen; +Cc: Netdev, Jiri Pirko, Siva Mannem, marichika4
In-Reply-To: <E4CD12F19ABA0C4D8729E087A761DC3505DDA81C@ORSMSX101.amr.corp.intel.com>

Perfect, I think we have a good working set of use-cases.  Thanks for
adding those.  Your case 3 seems do-able without too much work since
we already know which ones where externally added, just need another
per-bridge-port flag to turn off bridging aging of externally learned
entries.  This will address the performance issue you (and B
Viswanath) raised.  What about the entry stats, from the user's
'bridge -s fdb show" perspective for the bridge fdb entries?  Will
these numbers match expectations?  I think case 1 and case 4 provide a
coherent stats view.  Case 3 seems to be lacking in this regard.

On Fri, Jan 9, 2015 at 10:51 PM, Arad, Ronen <ronen.arad@intel.com> wrote:
> [...]
>>> It is indeed simpler. However, if the overhead of reading hit bits from the
>>HW
>>> and updating freshness of entries using br_fdb_external_learn_add() is too
>>> expensive, it would force such platforms to disable learning sync
>>altogether.
>>> Therefore, I believe aging offload flag (could be sufficient at bridge
>>level)
>>> and external aging interval (possibly longer than the software aging
>>interval)
>>> will encourage drivers to use leaning sync.
>>>>-scott
>>
>>I'm not sure I follow that last part.
>>
>>Can we list out the use-cases to see what's missing?
>>
>>Case 1: bridge ages out learned_sync'ed entries
>>
>>bridge port learning: off
>>offload port learning: on
>>offload port learning_sync: on
>>
>>Driver calls br_fdb_external_learn_add() periodically to refresh
>>bridge fdb entry
>>to keep it from going stale.
>>Bridge removes aged out fdb entries (and indirectly tells offload
>>device to do the same).
>>
>>Case 2: offload device/bridge age out entries independently
>>
>>bridge port learning: on
>>offload port learning: on
>>offload port learning_sync: off
>>
>>Bridge ages out its stale learned entries, independent of offload device.
>>Offload device ages out its stale learned entries, independent of bridge.
>>
>>Case 3: ?
>>
>>Please help me finish the use-case list so we can see what's missing.
>
>
> Case 3: offload device ages out external entries and notifies bridge
>
> bridge port learning: on or off (Bridge only learns from packets seen (Rx/Tx))
> offload port learning: on
> offload port learning_sync: on
> bridge aging of external learn: off
> offload device aging: on
>
> Switch port/device driver ages entries (could be by HW aging or soft aging in
> driver/firmware),
> notifies bridge about aged entries using br_fdb_externallearn_del().
> This allows efficient HW aging and batched notification at a pace independent
> of bridge aging interval.
> User still enjoys a single VLAN-aware FDB within the bridge module and having
> all entries in one place. Externally learned entries are identified as such by
> iproute2 "bridge fdb show" command. Device does not have to implement
> ndo_bridge_fdb_dump() for each offload port as the bridge module provides it
> for the common FDB.
>
> Case 4: bridge ages owned and external entries at different intervals
>
> bridge port learning: on (Effectively only for Rx/Tx traffic seen by
>                           software bridge)
> offload port learning: on (transient traffic and RxTx, overlap with bridge
>                            learned entries possible)
> offload port learning_sync: on
> bridge aging of external learn: on
> offload device aging: off
> bridge aging interval for owned entries: T1
> bridge aging interval for external entries: T2 (Typically T2 > T1)
>
> This allows for fine-tuning the overhead of periodic updates of entries
> freshness from offload port device.
>
> The bottom line of cases 3-4 is that it is desirable to use the common bridge
> FDB as long as bridge aging of externally learned entries could be controlled
> by the offload device: Either be at a longer interval or disabled.
>
>>
>>-scott
>>--
>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>the body of a message to majordomo@vger.kernel.org
>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Oded Gabbay @ 2015-01-10 20:39 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet
  Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Willem de Bruijn, Bridgman, John, Elifaz, Dana

Hi,

Commit d75b1ade567ffab085e8adbbdacf0092d10cd09c breaks my "Qualcomm Atheros
AR8161 Gigabit Ethernet (rev 10)" Ethernet controller, which is handled by
the alx network driver.

ogabbay@odedg-ubuntu:~$ lspci -s 01:00.0 -k
01:00.0 Ethernet controller:
	Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
	Subsystem: Qualcomm Atheros Device 1071
	Kernel driver in use: alx

I have this controller on a mobile platform of AMD APU Kaveri, which I use
to test amdkfd, and from 3.19-rc1 the network stopped working when trying to
transfer files through scp or nfs.

I bisected the kernel (from 3.18.0 to 3.19-rc1) and reached this commit.

Here is the log of the bisect:

git bisect start
# bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672

# good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d

# bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495

# good: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag
'backlight-for-linus-3.19' of
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
git bisect good e28870f9b3e92cd3570925089c6bb789c2603bc4

# bad: [450fa21942fe2c37f0c9f52d1a33bbc081eee288] sh_eth: Remove redundant
alignment adjustment
git bisect bad 450fa21942fe2c37f0c9f52d1a33bbc081eee288

# bad: [5c8d19da950861d0482abc0ac3481acca34b008f] e100e: use
netdev_rss_key_fill() helper
git bisect bad 5c8d19da950861d0482abc0ac3481acca34b008f

# good: [bf515fb11ab539c76d04f0e3c5216ed41f41d81f] Merge tag
'mac80211-next-for-john-2014-11-04' of
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
git bisect good bf515fb11ab539c76d04f0e3c5216ed41f41d81f

# bad: [2c99cd914d4fed9160d98849c9dd38034616768e] Merge branch 'amd-xgbe-next'
git bisect bad 2c99cd914d4fed9160d98849c9dd38034616768e

# good: [3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2] net: dsa: Add support for
reading switch registers with ethtool
git bisect good 3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2

# bad: [8ce0c8254f15229aa99fc6c04141f28c446e5f8c] Merge branch 'master' of
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
git bisect bad 8ce0c8254f15229aa99fc6c04141f28c446e5f8c

# good: [f0c65567b3c1b23f79e8a49139580a3872a68d1f] Merge branch
'sunvnet-multi-tx-queue'
git bisect good f0c65567b3c1b23f79e8a49139580a3872a68d1f

# bad: [547f2735c20023d7b50a791b1b17cacb652e9237] Merge branch 'mlx4-next'
git bisect bad 547f2735c20023d7b50a791b1b17cacb652e9237

# good: [4cdb1e2e3d3495423db558d3bb7ed11d66aabce7] net: shrink struct
softnet_data
git bisect good 4cdb1e2e3d3495423db558d3bb7ed11d66aabce7

# bad: [0a98455666ec87378148a1dde97f1ce5baf75a64] net/mlx4_core: Protect
port type setting by mutex
git bisect bad 0a98455666ec87378148a1dde97f1ce5baf75a64

# bad: [6e8066999800d90d52af5c84ac49ebf683d14cdc] net/mlx4_core: Prevent VF
from changing port configuration
git bisect bad 6e8066999800d90d52af5c84ac49ebf683d14cdc

# bad: [d75b1ade567ffab085e8adbbdacf0092d10cd09c] net: less interrupt
masking in NAPI
git bisect bad d75b1ade567ffab085e8adbbdacf0092d10cd09c

# first bad commit: [d75b1ade567ffab085e8adbbdacf0092d10cd09c]
net: less interrupt masking in NAPI

Could you please solve this issue as it renders my board quite useless.

If you need more info, please ask.

Thanks,
	Oded

^ permalink raw reply

* Re: [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Eric Dumazet @ 2015-01-10 20:58 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David S. Miller, Eric Dumazet, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Willem de Bruijn, Bridgman, John,
	Elifaz, Dana
In-Reply-To: <54B18DFF.80400@amd.com>

On Sat, 2015-01-10 at 22:39 +0200, Oded Gabbay wrote:
> Hi,
> 
> Commit d75b1ade567ffab085e8adbbdacf0092d10cd09c breaks my "Qualcomm Atheros
> AR8161 Gigabit Ethernet (rev 10)" Ethernet controller, which is handled by
> the alx network driver.
> 
> ogabbay@odedg-ubuntu:~$ lspci -s 01:00.0 -k
> 01:00.0 Ethernet controller:
> 	Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
> 	Subsystem: Qualcomm Atheros Device 1071
> 	Kernel driver in use: alx
> 
> I have this controller on a mobile platform of AMD APU Kaveri, which I use
> to test amdkfd, and from 3.19-rc1 the network stopped working when trying to
> transfer files through scp or nfs.
> 
> I bisected the kernel (from 3.18.0 to 3.19-rc1) and reached this commit.
> 
> Here is the log of the bisect:
> 
> git bisect start
> # bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
> git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672
> 
> # good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
> git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d
> 
> # bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495
> 
> # good: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag
> 'backlight-for-linus-3.19' of
> git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
> git bisect good e28870f9b3e92cd3570925089c6bb789c2603bc4
> 
> # bad: [450fa21942fe2c37f0c9f52d1a33bbc081eee288] sh_eth: Remove redundant
> alignment adjustment
> git bisect bad 450fa21942fe2c37f0c9f52d1a33bbc081eee288
> 
> # bad: [5c8d19da950861d0482abc0ac3481acca34b008f] e100e: use
> netdev_rss_key_fill() helper
> git bisect bad 5c8d19da950861d0482abc0ac3481acca34b008f
> 
> # good: [bf515fb11ab539c76d04f0e3c5216ed41f41d81f] Merge tag
> 'mac80211-next-for-john-2014-11-04' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
> git bisect good bf515fb11ab539c76d04f0e3c5216ed41f41d81f
> 
> # bad: [2c99cd914d4fed9160d98849c9dd38034616768e] Merge branch 'amd-xgbe-next'
> git bisect bad 2c99cd914d4fed9160d98849c9dd38034616768e
> 
> # good: [3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2] net: dsa: Add support for
> reading switch registers with ethtool
> git bisect good 3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2
> 
> # bad: [8ce0c8254f15229aa99fc6c04141f28c446e5f8c] Merge branch 'master' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
> git bisect bad 8ce0c8254f15229aa99fc6c04141f28c446e5f8c
> 
> # good: [f0c65567b3c1b23f79e8a49139580a3872a68d1f] Merge branch
> 'sunvnet-multi-tx-queue'
> git bisect good f0c65567b3c1b23f79e8a49139580a3872a68d1f
> 
> # bad: [547f2735c20023d7b50a791b1b17cacb652e9237] Merge branch 'mlx4-next'
> git bisect bad 547f2735c20023d7b50a791b1b17cacb652e9237
> 
> # good: [4cdb1e2e3d3495423db558d3bb7ed11d66aabce7] net: shrink struct
> softnet_data
> git bisect good 4cdb1e2e3d3495423db558d3bb7ed11d66aabce7
> 
> # bad: [0a98455666ec87378148a1dde97f1ce5baf75a64] net/mlx4_core: Protect
> port type setting by mutex
> git bisect bad 0a98455666ec87378148a1dde97f1ce5baf75a64
> 
> # bad: [6e8066999800d90d52af5c84ac49ebf683d14cdc] net/mlx4_core: Prevent VF
> from changing port configuration
> git bisect bad 6e8066999800d90d52af5c84ac49ebf683d14cdc
> 
> # bad: [d75b1ade567ffab085e8adbbdacf0092d10cd09c] net: less interrupt
> masking in NAPI
> git bisect bad d75b1ade567ffab085e8adbbdacf0092d10cd09c
> 
> # first bad commit: [d75b1ade567ffab085e8adbbdacf0092d10cd09c]
> net: less interrupt masking in NAPI
> 
> Could you please solve this issue as it renders my board quite useless.
> 

Thanks for the report and bisection !

Could you try following fix ?

diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
index e398eda07298..209c40765e0d 100644
--- a/drivers/net/ethernet/atheros/alx/main.c
+++ b/drivers/net/ethernet/atheros/alx/main.c
@@ -272,7 +272,7 @@ static int alx_poll(struct napi_struct *napi, int budget)
 		   alx_clean_rx_irq(alx, budget);
 
 	if (!complete)
-		return 1;
+		return budget;
 
 	napi_complete(&alx->napi);
 

^ permalink raw reply related

* Re: [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Eric Dumazet @ 2015-01-10 21:10 UTC (permalink / raw)
  To: Oded Gabbay, Johannes Berg
  Cc: David S. Miller, Eric Dumazet, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Willem de Bruijn, Bridgman, John,
	Elifaz, Dana
In-Reply-To: <1420923485.5947.85.camel@edumazet-glaptop2.roam.corp.google.com>

On Sat, 2015-01-10 at 12:58 -0800, Eric Dumazet wrote:
> On Sat, 2015-01-10 at 22:39 +0200, Oded Gabbay wrote:
> > Hi,
> > 
> > Commit d75b1ade567ffab085e8adbbdacf0092d10cd09c breaks my "Qualcomm Atheros
> > AR8161 Gigabit Ethernet (rev 10)" Ethernet controller, which is handled by
> > the alx network driver.
> > 
> > ogabbay@odedg-ubuntu:~$ lspci -s 01:00.0 -k
> > 01:00.0 Ethernet controller:
> > 	Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
> > 	Subsystem: Qualcomm Atheros Device 1071
> > 	Kernel driver in use: alx
> > 
> > I have this controller on a mobile platform of AMD APU Kaveri, which I use
> > to test amdkfd, and from 3.19-rc1 the network stopped working when trying to
> > transfer files through scp or nfs.
> > 
> > I bisected the kernel (from 3.18.0 to 3.19-rc1) and reached this commit.
> > 
> > Here is the log of the bisect:
> > 
> > git bisect start
> > # bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
> > git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672
> > 
> > # good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
> > git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d
> > 
> > # bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge
> > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> > git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495
> > 
> > # good: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag
> > 'backlight-for-linus-3.19' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
> > git bisect good e28870f9b3e92cd3570925089c6bb789c2603bc4
> > 
> > # bad: [450fa21942fe2c37f0c9f52d1a33bbc081eee288] sh_eth: Remove redundant
> > alignment adjustment
> > git bisect bad 450fa21942fe2c37f0c9f52d1a33bbc081eee288
> > 
> > # bad: [5c8d19da950861d0482abc0ac3481acca34b008f] e100e: use
> > netdev_rss_key_fill() helper
> > git bisect bad 5c8d19da950861d0482abc0ac3481acca34b008f
> > 
> > # good: [bf515fb11ab539c76d04f0e3c5216ed41f41d81f] Merge tag
> > 'mac80211-next-for-john-2014-11-04' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
> > git bisect good bf515fb11ab539c76d04f0e3c5216ed41f41d81f
> > 
> > # bad: [2c99cd914d4fed9160d98849c9dd38034616768e] Merge branch 'amd-xgbe-next'
> > git bisect bad 2c99cd914d4fed9160d98849c9dd38034616768e
> > 
> > # good: [3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2] net: dsa: Add support for
> > reading switch registers with ethtool
> > git bisect good 3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2
> > 
> > # bad: [8ce0c8254f15229aa99fc6c04141f28c446e5f8c] Merge branch 'master' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
> > git bisect bad 8ce0c8254f15229aa99fc6c04141f28c446e5f8c
> > 
> > # good: [f0c65567b3c1b23f79e8a49139580a3872a68d1f] Merge branch
> > 'sunvnet-multi-tx-queue'
> > git bisect good f0c65567b3c1b23f79e8a49139580a3872a68d1f
> > 
> > # bad: [547f2735c20023d7b50a791b1b17cacb652e9237] Merge branch 'mlx4-next'
> > git bisect bad 547f2735c20023d7b50a791b1b17cacb652e9237
> > 
> > # good: [4cdb1e2e3d3495423db558d3bb7ed11d66aabce7] net: shrink struct
> > softnet_data
> > git bisect good 4cdb1e2e3d3495423db558d3bb7ed11d66aabce7
> > 
> > # bad: [0a98455666ec87378148a1dde97f1ce5baf75a64] net/mlx4_core: Protect
> > port type setting by mutex
> > git bisect bad 0a98455666ec87378148a1dde97f1ce5baf75a64
> > 
> > # bad: [6e8066999800d90d52af5c84ac49ebf683d14cdc] net/mlx4_core: Prevent VF
> > from changing port configuration
> > git bisect bad 6e8066999800d90d52af5c84ac49ebf683d14cdc
> > 
> > # bad: [d75b1ade567ffab085e8adbbdacf0092d10cd09c] net: less interrupt
> > masking in NAPI
> > git bisect bad d75b1ade567ffab085e8adbbdacf0092d10cd09c
> > 
> > # first bad commit: [d75b1ade567ffab085e8adbbdacf0092d10cd09c]
> > net: less interrupt masking in NAPI
> > 
> > Could you please solve this issue as it renders my board quite useless.
> > 
> 
> Thanks for the report and bisection !
> 
> Could you try following fix ?
> 
> diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
> index e398eda07298..209c40765e0d 100644
> --- a/drivers/net/ethernet/atheros/alx/main.c
> +++ b/drivers/net/ethernet/atheros/alx/main.c
> @@ -272,7 +272,7 @@ static int alx_poll(struct napi_struct *napi, int budget)
>  		   alx_clean_rx_irq(alx, budget);
>  
>  	if (!complete)
> -		return 1;
> +		return budget;
>  
>  	napi_complete(&alx->napi);
>  
> 
> 
> 

BTW this driver has other issues :

complete = alx_clean_tx_irq(alx) &&
            alx_clean_rx_irq(alx, budget);

Means that under TX completion pressure (alx_clean_tx_irq(alx) return
false), we never dequeue packets from RX rings.

^ permalink raw reply

* Re: [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Oded Gabbay @ 2015-01-10 21:30 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Eric Dumazet, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Willem de Bruijn, Bridgman, John,
	Elifaz, Dana
In-Reply-To: <1420923485.5947.85.camel@edumazet-glaptop2.roam.corp.google.com>



On 01/10/2015 10:58 PM, Eric Dumazet wrote:
> On Sat, 2015-01-10 at 22:39 +0200, Oded Gabbay wrote:
>> Hi,
>>
>> Commit d75b1ade567ffab085e8adbbdacf0092d10cd09c breaks my "Qualcomm Atheros
>> AR8161 Gigabit Ethernet (rev 10)" Ethernet controller, which is handled by
>> the alx network driver.
>>
>> ogabbay@odedg-ubuntu:~$ lspci -s 01:00.0 -k
>> 01:00.0 Ethernet controller:
>> 	Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
>> 	Subsystem: Qualcomm Atheros Device 1071
>> 	Kernel driver in use: alx
>>
>> I have this controller on a mobile platform of AMD APU Kaveri, which I use
>> to test amdkfd, and from 3.19-rc1 the network stopped working when trying to
>> transfer files through scp or nfs.
>>
>> I bisected the kernel (from 3.18.0 to 3.19-rc1) and reached this commit.
>>
>> Here is the log of the bisect:
>>
>> git bisect start
>> # bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
>> git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672
>>
>> # good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
>> git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d
>>
>> # bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>> git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495
>>
>> # good: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag
>> 'backlight-for-linus-3.19' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
>> git bisect good e28870f9b3e92cd3570925089c6bb789c2603bc4
>>
>> # bad: [450fa21942fe2c37f0c9f52d1a33bbc081eee288] sh_eth: Remove redundant
>> alignment adjustment
>> git bisect bad 450fa21942fe2c37f0c9f52d1a33bbc081eee288
>>
>> # bad: [5c8d19da950861d0482abc0ac3481acca34b008f] e100e: use
>> netdev_rss_key_fill() helper
>> git bisect bad 5c8d19da950861d0482abc0ac3481acca34b008f
>>
>> # good: [bf515fb11ab539c76d04f0e3c5216ed41f41d81f] Merge tag
>> 'mac80211-next-for-john-2014-11-04' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
>> git bisect good bf515fb11ab539c76d04f0e3c5216ed41f41d81f
>>
>> # bad: [2c99cd914d4fed9160d98849c9dd38034616768e] Merge branch 'amd-xgbe-next'
>> git bisect bad 2c99cd914d4fed9160d98849c9dd38034616768e
>>
>> # good: [3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2] net: dsa: Add support for
>> reading switch registers with ethtool
>> git bisect good 3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2
>>
>> # bad: [8ce0c8254f15229aa99fc6c04141f28c446e5f8c] Merge branch 'master' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
>> git bisect bad 8ce0c8254f15229aa99fc6c04141f28c446e5f8c
>>
>> # good: [f0c65567b3c1b23f79e8a49139580a3872a68d1f] Merge branch
>> 'sunvnet-multi-tx-queue'
>> git bisect good f0c65567b3c1b23f79e8a49139580a3872a68d1f
>>
>> # bad: [547f2735c20023d7b50a791b1b17cacb652e9237] Merge branch 'mlx4-next'
>> git bisect bad 547f2735c20023d7b50a791b1b17cacb652e9237
>>
>> # good: [4cdb1e2e3d3495423db558d3bb7ed11d66aabce7] net: shrink struct
>> softnet_data
>> git bisect good 4cdb1e2e3d3495423db558d3bb7ed11d66aabce7
>>
>> # bad: [0a98455666ec87378148a1dde97f1ce5baf75a64] net/mlx4_core: Protect
>> port type setting by mutex
>> git bisect bad 0a98455666ec87378148a1dde97f1ce5baf75a64
>>
>> # bad: [6e8066999800d90d52af5c84ac49ebf683d14cdc] net/mlx4_core: Prevent VF
>> from changing port configuration
>> git bisect bad 6e8066999800d90d52af5c84ac49ebf683d14cdc
>>
>> # bad: [d75b1ade567ffab085e8adbbdacf0092d10cd09c] net: less interrupt
>> masking in NAPI
>> git bisect bad d75b1ade567ffab085e8adbbdacf0092d10cd09c
>>
>> # first bad commit: [d75b1ade567ffab085e8adbbdacf0092d10cd09c]
>> net: less interrupt masking in NAPI
>>
>> Could you please solve this issue as it renders my board quite useless.
>>
> 
> Thanks for the report and bisection !
> 
> Could you try following fix ?
> 
> diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
> index e398eda07298..209c40765e0d 100644
> --- a/drivers/net/ethernet/atheros/alx/main.c
> +++ b/drivers/net/ethernet/atheros/alx/main.c
> @@ -272,7 +272,7 @@ static int alx_poll(struct napi_struct *napi, int budget)
>  		   alx_clean_rx_irq(alx, budget);
>  
>  	if (!complete)
> -		return 1;
> +		return budget;
>  
>  	napi_complete(&alx->napi);
>  
> 
> 
> 
> 
Yes, no problem.
I will update on the result.

	Oded

^ permalink raw reply

* Re: [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Oded Gabbay @ 2015-01-10 21:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Eric Dumazet, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Willem de Bruijn, Bridgman, John,
	Elifaz, Dana
In-Reply-To: <54B199EB.1080004@amd.com>



On 01/10/2015 11:30 PM, Oded Gabbay wrote:
> 
> 
> On 01/10/2015 10:58 PM, Eric Dumazet wrote:
>> On Sat, 2015-01-10 at 22:39 +0200, Oded Gabbay wrote:
>>> Hi,
>>>
>>> Commit d75b1ade567ffab085e8adbbdacf0092d10cd09c breaks my "Qualcomm Atheros
>>> AR8161 Gigabit Ethernet (rev 10)" Ethernet controller, which is handled by
>>> the alx network driver.
>>>
>>> ogabbay@odedg-ubuntu:~$ lspci -s 01:00.0 -k
>>> 01:00.0 Ethernet controller:
>>> 	Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
>>> 	Subsystem: Qualcomm Atheros Device 1071
>>> 	Kernel driver in use: alx
>>>
>>> I have this controller on a mobile platform of AMD APU Kaveri, which I use
>>> to test amdkfd, and from 3.19-rc1 the network stopped working when trying to
>>> transfer files through scp or nfs.
>>>
>>> I bisected the kernel (from 3.18.0 to 3.19-rc1) and reached this commit.
>>>
>>> Here is the log of the bisect:
>>>
>>> git bisect start
>>> # bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
>>> git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672
>>>
>>> # good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
>>> git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d
>>>
>>> # bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge
>>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
>>> git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495
>>>
>>> # good: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag
>>> 'backlight-for-linus-3.19' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
>>> git bisect good e28870f9b3e92cd3570925089c6bb789c2603bc4
>>>
>>> # bad: [450fa21942fe2c37f0c9f52d1a33bbc081eee288] sh_eth: Remove redundant
>>> alignment adjustment
>>> git bisect bad 450fa21942fe2c37f0c9f52d1a33bbc081eee288
>>>
>>> # bad: [5c8d19da950861d0482abc0ac3481acca34b008f] e100e: use
>>> netdev_rss_key_fill() helper
>>> git bisect bad 5c8d19da950861d0482abc0ac3481acca34b008f
>>>
>>> # good: [bf515fb11ab539c76d04f0e3c5216ed41f41d81f] Merge tag
>>> 'mac80211-next-for-john-2014-11-04' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
>>> git bisect good bf515fb11ab539c76d04f0e3c5216ed41f41d81f
>>>
>>> # bad: [2c99cd914d4fed9160d98849c9dd38034616768e] Merge branch 'amd-xgbe-next'
>>> git bisect bad 2c99cd914d4fed9160d98849c9dd38034616768e
>>>
>>> # good: [3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2] net: dsa: Add support for
>>> reading switch registers with ethtool
>>> git bisect good 3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2
>>>
>>> # bad: [8ce0c8254f15229aa99fc6c04141f28c446e5f8c] Merge branch 'master' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
>>> git bisect bad 8ce0c8254f15229aa99fc6c04141f28c446e5f8c
>>>
>>> # good: [f0c65567b3c1b23f79e8a49139580a3872a68d1f] Merge branch
>>> 'sunvnet-multi-tx-queue'
>>> git bisect good f0c65567b3c1b23f79e8a49139580a3872a68d1f
>>>
>>> # bad: [547f2735c20023d7b50a791b1b17cacb652e9237] Merge branch 'mlx4-next'
>>> git bisect bad 547f2735c20023d7b50a791b1b17cacb652e9237
>>>
>>> # good: [4cdb1e2e3d3495423db558d3bb7ed11d66aabce7] net: shrink struct
>>> softnet_data
>>> git bisect good 4cdb1e2e3d3495423db558d3bb7ed11d66aabce7
>>>
>>> # bad: [0a98455666ec87378148a1dde97f1ce5baf75a64] net/mlx4_core: Protect
>>> port type setting by mutex
>>> git bisect bad 0a98455666ec87378148a1dde97f1ce5baf75a64
>>>
>>> # bad: [6e8066999800d90d52af5c84ac49ebf683d14cdc] net/mlx4_core: Prevent VF
>>> from changing port configuration
>>> git bisect bad 6e8066999800d90d52af5c84ac49ebf683d14cdc
>>>
>>> # bad: [d75b1ade567ffab085e8adbbdacf0092d10cd09c] net: less interrupt
>>> masking in NAPI
>>> git bisect bad d75b1ade567ffab085e8adbbdacf0092d10cd09c
>>>
>>> # first bad commit: [d75b1ade567ffab085e8adbbdacf0092d10cd09c]
>>> net: less interrupt masking in NAPI
>>>
>>> Could you please solve this issue as it renders my board quite useless.
>>>
>>
>> Thanks for the report and bisection !
>>
>> Could you try following fix ?
>>
>> diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
>> index e398eda07298..209c40765e0d 100644
>> --- a/drivers/net/ethernet/atheros/alx/main.c
>> +++ b/drivers/net/ethernet/atheros/alx/main.c
>> @@ -272,7 +272,7 @@ static int alx_poll(struct napi_struct *napi, int budget)
>>  		   alx_clean_rx_irq(alx, budget);
>>  
>>  	if (!complete)
>> -		return 1;
>> +		return budget;
>>  
>>  	napi_complete(&alx->napi);
>>  
>>
>>
>>
>>
> Yes, no problem.
> I will update on the result.
> 
> 	Oded
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

Hi Eric,
Your patch fixed the problem.
Thanks for the quick help!

	Oded

^ permalink raw reply

* Re: [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Eric Dumazet @ 2015-01-10 21:50 UTC (permalink / raw)
  To: Oded Gabbay, Johannes Berg
  Cc: David S. Miller, Eric Dumazet, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Willem de Bruijn, Bridgman, John,
	Elifaz, Dana
In-Reply-To: <54B199EB.1080004@amd.com>

On Sat, 2015-01-10 at 23:30 +0200, Oded Gabbay wrote:

> Yes, no problem.
> I will update on the result.

Please try this more complete patch, solving the TX pressure problem as
well, and not lying about NAPI budget. thanks !


diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
index e398eda07298..5f05b387c0a7 100644
--- a/drivers/net/ethernet/atheros/alx/main.c
+++ b/drivers/net/ethernet/atheros/alx/main.c
@@ -184,15 +184,16 @@ static void alx_schedule_reset(struct alx_priv *alx)
 	schedule_work(&alx->reset_wk);
 }
 
-static bool alx_clean_rx_irq(struct alx_priv *alx, int budget)
+static int alx_clean_rx_irq(struct alx_priv *alx, int budget)
 {
 	struct alx_rx_queue *rxq = &alx->rxq;
 	struct alx_rrd *rrd;
 	struct alx_buffer *rxb;
 	struct sk_buff *skb;
 	u16 length, rfd_cleaned = 0;
+	int work = 0;
 
-	while (budget > 0) {
+	while (work < budget) {
 		rrd = &rxq->rrd[rxq->rrd_read_idx];
 		if (!(rrd->word3 & cpu_to_le32(1 << RRD_UPDATED_SHIFT)))
 			break;
@@ -243,7 +244,7 @@ static bool alx_clean_rx_irq(struct alx_priv *alx, int budget)
 		}
 
 		napi_gro_receive(&alx->napi, skb);
-		budget--;
+		work++;
 
 next_pkt:
 		if (++rxq->read_idx == alx->rx_ringsz)
@@ -258,21 +259,22 @@ next_pkt:
 	if (rfd_cleaned)
 		alx_refill_rx_ring(alx, GFP_ATOMIC);
 
-	return budget > 0;
+	return work;
 }
 
 static int alx_poll(struct napi_struct *napi, int budget)
 {
 	struct alx_priv *alx = container_of(napi, struct alx_priv, napi);
 	struct alx_hw *hw = &alx->hw;
-	bool complete = true;
 	unsigned long flags;
+	bool tx_complete;
+	int work;
 
-	complete = alx_clean_tx_irq(alx) &&
-		   alx_clean_rx_irq(alx, budget);
+	tx_complete = alx_clean_tx_irq(alx);
+	work = alx_clean_rx_irq(alx, budget);
 
-	if (!complete)
-		return 1;
+	if (!tx_complete || work == budget)
+		return budget;
 
 	napi_complete(&alx->napi);
 
@@ -284,7 +286,7 @@ static int alx_poll(struct napi_struct *napi, int budget)
 
 	alx_post_write(hw);
 
-	return 0;
+	return work;
 }
 
 static irqreturn_t alx_intr_handle(struct alx_priv *alx, u32 intr)

^ permalink raw reply related

* Re: Re: [PATCH] net: less interrupt masking in NAPI
From: Oded Gabbay @ 2015-01-10 20:27 UTC (permalink / raw)
  To: davem, eric.dumazet; +Cc: netdev, willemb
In-Reply-To: <20141103.122538.387451917276174830.davem@davemloft.net>

On 2014/11/4 1:25, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@xxxxxxxxx>
> Date: Sun, 02 Nov 2014 06:19:33 -0800
> 
>> From: Eric Dumazet <edumazet@xxxxxxxxxx>
>>
>> net_rx_action() can mask irqs a single time to transfert sd->poll_list
>> into a private list, for a very short duration.
>>
>> Then, napi_complete() can avoid masking irqs again,
>> and net_rx_action() only needs to mask irq again in slow path.
>>
>> This patch removes 2 couples of irq mask/unmask per typical NAPI run,
>> more if multiple napi were triggered.
>>
>> Note this also allows to give control back to caller (do_softirq())
>> more often, so that other softirq handlers can be called a bit earlier,
>> or ksoftirqd can be wakeup earlier under pressure.
>>
>> This was developed while testing an alternative to RX interrupt
>> mitigation to reduce latencies while keeping or improving GRO
>> aggregation on fast NIC.
>>
>> Idea is to test napi->gro_list at the end of a napi->poll() and
>> reschedule one NAPI poll, but after servicing a full round of
>> softirqs (timers, TX, rcu, ...). This will be allowed only if softirq
>> is currently serviced by idle task or ksoftirqd, and resched not needed.
>>
>> Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
> 
> Also applied, thanks Eric.

Hi,
Unfortunately, this patch breaks my "Qualcomm Atheros AR8161 Gigabit 
Ethernet (rev 10)" Ethernet controller, which is handled by the alx 
network driver.

ogabbay@odedg-ubuntu:~$ lspci -s 01:00.0 -k
01:00.0 Ethernet controller: 
	Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
	Subsystem: Qualcomm Atheros Device 1071
	Kernel driver in use: alx

I have this controller on a mobile platform which I use to test amdkfd 
on Kaveri AMD APU and from 3.19-rc1, the network stopped working when 
trying to transfer files through scp or nfs.

I bisected the kernel (from 3.18.0 to 3.19-rc1) and reached this patch.

Here is the log of the bisect:

git bisect start
# bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672

# good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d

# bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495

# good: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag 'backlight-for-linus-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
git bisect good e28870f9b3e92cd3570925089c6bb789c2603bc4

# bad: [450fa21942fe2c37f0c9f52d1a33bbc081eee288] sh_eth: Remove redundant alignment adjustment
git bisect bad 450fa21942fe2c37f0c9f52d1a33bbc081eee288

# bad: [5c8d19da950861d0482abc0ac3481acca34b008f] e100e: use netdev_rss_key_fill() helper
git bisect bad 5c8d19da950861d0482abc0ac3481acca34b008f

# good: [bf515fb11ab539c76d04f0e3c5216ed41f41d81f] Merge tag 'mac80211-next-for-john-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
git bisect good bf515fb11ab539c76d04f0e3c5216ed41f41d81f

# bad: [2c99cd914d4fed9160d98849c9dd38034616768e] Merge branch 'amd-xgbe-next'
git bisect bad 2c99cd914d4fed9160d98849c9dd38034616768e

# good: [3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2] net: dsa: Add support for reading switch registers with ethtool
git bisect good 3d762a0f0ab9cb4a6b5993db3ce56c92f9f90ab2

# bad: [8ce0c8254f15229aa99fc6c04141f28c446e5f8c] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
git bisect bad 8ce0c8254f15229aa99fc6c04141f28c446e5f8c

# good: [f0c65567b3c1b23f79e8a49139580a3872a68d1f] Merge branch 'sunvnet-multi-tx-queue'
git bisect good f0c65567b3c1b23f79e8a49139580a3872a68d1f

# bad: [547f2735c20023d7b50a791b1b17cacb652e9237] Merge branch 'mlx4-next'
git bisect bad 547f2735c20023d7b50a791b1b17cacb652e9237

# good: [4cdb1e2e3d3495423db558d3bb7ed11d66aabce7] net: shrink struct softnet_data
git bisect good 4cdb1e2e3d3495423db558d3bb7ed11d66aabce7

# bad: [0a98455666ec87378148a1dde97f1ce5baf75a64] net/mlx4_core: Protect port type setting by mutex
git bisect bad 0a98455666ec87378148a1dde97f1ce5baf75a64

# bad: [6e8066999800d90d52af5c84ac49ebf683d14cdc] net/mlx4_core: Prevent VF from changing port configuration
git bisect bad 6e8066999800d90d52af5c84ac49ebf683d14cdc

# bad: [d75b1ade567ffab085e8adbbdacf0092d10cd09c] net: less interrupt masking in NAPI
git bisect bad d75b1ade567ffab085e8adbbdacf0092d10cd09c

# first bad commit: [d75b1ade567ffab085e8adbbdacf0092d10cd09c] 
net: less interrupt masking in NAPI

Could you please solve this issue as it renders my board quite useless.

If you need more info, please ask.

Thanks,
	Oded

^ permalink raw reply

* Re: [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Oded Gabbay @ 2015-01-10 22:05 UTC (permalink / raw)
  To: Eric Dumazet, Johannes Berg
  Cc: David S. Miller, Eric Dumazet, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Willem de Bruijn, Bridgman, John,
	Elifaz, Dana
In-Reply-To: <1420926614.5947.89.camel@edumazet-glaptop2.roam.corp.google.com>



On 01/10/2015 11:50 PM, Eric Dumazet wrote:
> On Sat, 2015-01-10 at 23:30 +0200, Oded Gabbay wrote:
> 
>> Yes, no problem.
>> I will update on the result.
> 
> Please try this more complete patch, solving the TX pressure problem as
> well, and not lying about NAPI budget. thanks !
> 
> 
> diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
> index e398eda07298..5f05b387c0a7 100644
> --- a/drivers/net/ethernet/atheros/alx/main.c
> +++ b/drivers/net/ethernet/atheros/alx/main.c
> @@ -184,15 +184,16 @@ static void alx_schedule_reset(struct alx_priv *alx)
>  	schedule_work(&alx->reset_wk);
>  }
>  
> -static bool alx_clean_rx_irq(struct alx_priv *alx, int budget)
> +static int alx_clean_rx_irq(struct alx_priv *alx, int budget)
>  {
>  	struct alx_rx_queue *rxq = &alx->rxq;
>  	struct alx_rrd *rrd;
>  	struct alx_buffer *rxb;
>  	struct sk_buff *skb;
>  	u16 length, rfd_cleaned = 0;
> +	int work = 0;
>  
> -	while (budget > 0) {
> +	while (work < budget) {
>  		rrd = &rxq->rrd[rxq->rrd_read_idx];
>  		if (!(rrd->word3 & cpu_to_le32(1 << RRD_UPDATED_SHIFT)))
>  			break;
> @@ -243,7 +244,7 @@ static bool alx_clean_rx_irq(struct alx_priv *alx, int budget)
>  		}
>  
>  		napi_gro_receive(&alx->napi, skb);
> -		budget--;
> +		work++;
>  
>  next_pkt:
>  		if (++rxq->read_idx == alx->rx_ringsz)
> @@ -258,21 +259,22 @@ next_pkt:
>  	if (rfd_cleaned)
>  		alx_refill_rx_ring(alx, GFP_ATOMIC);
>  
> -	return budget > 0;
> +	return work;
>  }
>  
>  static int alx_poll(struct napi_struct *napi, int budget)
>  {
>  	struct alx_priv *alx = container_of(napi, struct alx_priv, napi);
>  	struct alx_hw *hw = &alx->hw;
> -	bool complete = true;
>  	unsigned long flags;
> +	bool tx_complete;
> +	int work;
>  
> -	complete = alx_clean_tx_irq(alx) &&
> -		   alx_clean_rx_irq(alx, budget);
> +	tx_complete = alx_clean_tx_irq(alx);
> +	work = alx_clean_rx_irq(alx, budget);
>  
> -	if (!complete)
> -		return 1;
> +	if (!tx_complete || work == budget)
> +		return budget;
>  
>  	napi_complete(&alx->napi);
>  
> @@ -284,7 +286,7 @@ static int alx_poll(struct napi_struct *napi, int budget)
>  
>  	alx_post_write(hw);
>  
> -	return 0;
> +	return work;
>  }
>  
>  static irqreturn_t alx_intr_handle(struct alx_priv *alx, u32 intr)
> 
> 
Hi,
Checked it and its working.
Thanks again.
Please note that I only use this Ethernet controller for NFS and SCP, it is
not used for development of networking software of any kind, so maybe a more
extensive testing is needed.

	Oded

^ permalink raw reply

* Re: [BUG] 3.19-rc1 net: less interrupt masking in NAPI
From: Oded Gabbay @ 2015-01-10 22:08 UTC (permalink / raw)
  To: Eric Dumazet, Johannes Berg
  Cc: David S. Miller, Eric Dumazet, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Willem de Bruijn, Bridgman, John,
	Elifaz, Dana
In-Reply-To: <54B1A22C.2020301@amd.com>



On 01/11/2015 12:05 AM, Oded Gabbay wrote:
> 
> 
> On 01/10/2015 11:50 PM, Eric Dumazet wrote:
>> On Sat, 2015-01-10 at 23:30 +0200, Oded Gabbay wrote:
>>
>>> Yes, no problem.
>>> I will update on the result.
>>
>> Please try this more complete patch, solving the TX pressure problem as
>> well, and not lying about NAPI budget. thanks !
>>
>>
>> diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
>> index e398eda07298..5f05b387c0a7 100644
>> --- a/drivers/net/ethernet/atheros/alx/main.c
>> +++ b/drivers/net/ethernet/atheros/alx/main.c
>> @@ -184,15 +184,16 @@ static void alx_schedule_reset(struct alx_priv *alx)
>>  	schedule_work(&alx->reset_wk);
>>  }
>>  
>> -static bool alx_clean_rx_irq(struct alx_priv *alx, int budget)
>> +static int alx_clean_rx_irq(struct alx_priv *alx, int budget)
>>  {
>>  	struct alx_rx_queue *rxq = &alx->rxq;
>>  	struct alx_rrd *rrd;
>>  	struct alx_buffer *rxb;
>>  	struct sk_buff *skb;
>>  	u16 length, rfd_cleaned = 0;
>> +	int work = 0;
>>  
>> -	while (budget > 0) {
>> +	while (work < budget) {
>>  		rrd = &rxq->rrd[rxq->rrd_read_idx];
>>  		if (!(rrd->word3 & cpu_to_le32(1 << RRD_UPDATED_SHIFT)))
>>  			break;
>> @@ -243,7 +244,7 @@ static bool alx_clean_rx_irq(struct alx_priv *alx, int budget)
>>  		}
>>  
>>  		napi_gro_receive(&alx->napi, skb);
>> -		budget--;
>> +		work++;
>>  
>>  next_pkt:
>>  		if (++rxq->read_idx == alx->rx_ringsz)
>> @@ -258,21 +259,22 @@ next_pkt:
>>  	if (rfd_cleaned)
>>  		alx_refill_rx_ring(alx, GFP_ATOMIC);
>>  
>> -	return budget > 0;
>> +	return work;
>>  }
>>  
>>  static int alx_poll(struct napi_struct *napi, int budget)
>>  {
>>  	struct alx_priv *alx = container_of(napi, struct alx_priv, napi);
>>  	struct alx_hw *hw = &alx->hw;
>> -	bool complete = true;
>>  	unsigned long flags;
>> +	bool tx_complete;
>> +	int work;
>>  
>> -	complete = alx_clean_tx_irq(alx) &&
>> -		   alx_clean_rx_irq(alx, budget);
>> +	tx_complete = alx_clean_tx_irq(alx);
>> +	work = alx_clean_rx_irq(alx, budget);
>>  
>> -	if (!complete)
>> -		return 1;
>> +	if (!tx_complete || work == budget)
>> +		return budget;
>>  
>>  	napi_complete(&alx->napi);
>>  
>> @@ -284,7 +286,7 @@ static int alx_poll(struct napi_struct *napi, int budget)
>>  
>>  	alx_post_write(hw);
>>  
>> -	return 0;
>> +	return work;
>>  }
>>  
>>  static irqreturn_t alx_intr_handle(struct alx_priv *alx, u32 intr)
>>
>>
> Hi,
> Checked it and its working.
> Thanks again.
> Please note that I only use this Ethernet controller for NFS and SCP, it is
> not used for development of networking software of any kind, so maybe a more
> extensive testing is needed.
> 
> 	Oded
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

But in any case, you can add:
Tested-by: Oded Gabbay <oded.gabbay@amd.com>

	Oded

^ permalink raw reply

* /proc/net/dev regression
From: Carlos R. Mafra @ 2015-01-10 23:25 UTC (permalink / raw)
  To: LKML; +Cc: Hauke Mehrtens, John W. Linville, netdev

I use a dockapp called 'wmnet' [1] to monitor the speed of
my internet connection and after the kernel v3.18 it does no
longer work properly (it still doesn't work in v3.19-rc3)

I bisected the problem and the culprit is this commit:

commit 6e094bd805a9b6ad2f5421125db8f604a166616c
Author: Rafał Miłecki <zajec5@gmail.com>
Date:   Fri Sep 5 00:18:48 2014 +0200

    bcma: move code for core registration into separate function
    
    This cleans code a bit and will us to register cores in other places as
    well. The only difference with this patch is using "core_index" for
    setting device name.
    
    
The problem is caused by the different name that my wireless connection
receives after this patch:

wlp3s0b1 (after the patch)
wlp3s0   (before the patch)

because it affects the display of information in the file /proc/net/dev.

Before the patch the fields are all aligned:

[mafra@linux-g29b:wmnet]$ cat net_dev_good.txt
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:   35916     332    0    0    0     0          0         0    35916     332    0    0    0     0       0          0
wlp3s0: 6406428    5794    0    0    0     0          0         0   426813    3778    0    0    0     0       0          0

but after the patch the fields are misaligned:

[mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0

And for some reason this change confuses 'wmnet'. Reading the source code
of 'wmnet' I found that it reads the packets as follows,

	totalpackets_in = strtoul(&buffer[15], NULL, 10);

I am not sure if 'wmnet' could do this better (any suggestions?),
but the fact is that it was working before and now it is not.

Any help is greatly appreciated.

[1] wmnet can be found in http://repo.or.cz/w/dockapps.git

^ permalink raw reply

* Re: /proc/net/dev regression
From: Al Viro @ 2015-01-11  0:27 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev
In-Reply-To: <20150110232518.GA3212@linux-g29b.site>

On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> Inter-|   Receive                                                |  Transmit
>  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
>     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> 
> And for some reason this change confuses 'wmnet'. Reading the source code
> of 'wmnet' I found that it reads the packets as follows,
> 
> 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> 
> I am not sure if 'wmnet' could do this better (any suggestions?),

*snort*

well, yes - it's called scanf().  And if one is really, really nervous
about the overhead of <gasp> parsing a bunch of integers (as if fopen/
fgets/fclose alone won't cost enough to make constantly calling that
sucker a bad idea), just use ptr + <something> - 6 instead of
&buffer[<something>] in there.  That thing has just found where the
colon was (and replaced it with NUL), so dealing with "the first field
turned out to be too long and shifted everything past it" isn't hard.

> but the fact is that it was working before and now it is not.

True.  Mind you, the real issue is that this code expects the interface
names to be never longer than 6 characters, but then /proc/net/dev layout
strongly suggests that.  Hell knows; it is a regression and it does
break real-world userland code.  The only way to avoid that, AFAICS, is
to prohibit interface names longer than 6 chars ;-/

Lovely combination of crappy ABI (procfs file layout), crappy userland
code relying on details of said ABI out of sheer laziness and triggering
kernel change producing bloody long interface names...

Incidentally, sufficiently long interface name will produce other fun issues
for a docked app - it simply won't fit into 64x64 square on screen ;-)

^ permalink raw reply

* Re: /proc/net/dev regression
From: Carlos R. Mafra @ 2015-01-11  0:58 UTC (permalink / raw)
  To: Al Viro; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev
In-Reply-To: <20150111002706.GC22149@ZenIV.linux.org.uk>

On Sun, 11 Jan 2015 at  0:27:06 +0000, Al Viro wrote:
> On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> > [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> > Inter-|   Receive                                                |  Transmit
> >  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> >     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> > wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> > 
> > And for some reason this change confuses 'wmnet'. Reading the source code
> > of 'wmnet' I found that it reads the packets as follows,
> > 
> > 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> > 
> > I am not sure if 'wmnet' could do this better (any suggestions?),
> 
> *snort*
> 
> well, yes - it's called scanf().  And if one is really, really nervous
> about the overhead of <gasp> parsing a bunch of integers (as if fopen/
> fgets/fclose alone won't cost enough to make constantly calling that
> sucker a bad idea), just use ptr + <something> - 6 instead of
> &buffer[<something>] in there.  That thing has just found where the
> colon was (and replaced it with NUL), so dealing with "the first field
> turned out to be too long and shifted everything past it" isn't hard.

Alright! Thank you.

> > but the fact is that it was working before and now it is not.
> 
> True.  Mind you, the real issue is that this code expects the interface
> names to be never longer than 6 characters, but then /proc/net/dev layout
> strongly suggests that.  Hell knows; it is a regression and it does
> break real-world userland code.  The only way to avoid that, AFAICS, is
> to prohibit interface names longer than 6 chars ;-/
> 
> Lovely combination of crappy ABI (procfs file layout), crappy userland
> code relying on details of said ABI out of sheer laziness and triggering
> kernel change producing bloody long interface names...

I won't mind just changing the crappy and fragile wmnet code and moving on.
I have already lost the 2 hours bisecting this anyway.

Thank you very much for your advice.

^ permalink raw reply

* Re: /proc/net/dev regression
From: Al Viro @ 2015-01-11  1:00 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev
In-Reply-To: <20150111002706.GC22149@ZenIV.linux.org.uk>

On Sun, Jan 11, 2015 at 12:27:06AM +0000, Al Viro wrote:
> On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> > [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> > Inter-|   Receive                                                |  Transmit
> >  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> >     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> > wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> > 
> > And for some reason this change confuses 'wmnet'. Reading the source code
> > of 'wmnet' I found that it reads the packets as follows,
> > 
> > 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> > 
> > I am not sure if 'wmnet' could do this better (any suggestions?),
> 
> *snort*
> 
> well, yes - it's called scanf().  And if one is really, really nervous
> about the overhead of <gasp> parsing a bunch of integers (as if fopen/
> fgets/fclose alone won't cost enough to make constantly calling that
> sucker a bad idea), just use ptr + <something> - 6 instead of
> &buffer[<something>] in there.  That thing has just found where the
> colon was (and replaced it with NUL), so dealing with "the first field
> turned out to be too long and shifted everything past it" isn't hard.
> 
> > but the fact is that it was working before and now it is not.
> 
> True.  Mind you, the real issue is that this code expects the interface
> names to be never longer than 6 characters, but then /proc/net/dev layout
> strongly suggests that.  Hell knows; it is a regression and it does
> break real-world userland code.  The only way to avoid that, AFAICS, is
> to prohibit interface names longer than 6 chars ;-/
> 
> Lovely combination of crappy ABI (procfs file layout), crappy userland
> code relying on details of said ABI out of sheer laziness and triggering
> kernel change producing bloody long interface names...
> 
> Incidentally, sufficiently long interface name will produce other fun issues
> for a docked app - it simply won't fit into 64x64 square on screen ;-)

Mind you, assuming that columns will align is obviously broken - the producing
side of that thing is
        seq_printf(seq, "%6s: %7llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu "
                   "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n",
                   dev->name, stats->rx_bytes, stats->rx_packets,
                   stats->rx_errors,
                   stats->rx_dropped + stats->rx_missed_errors,
                   stats->rx_fifo_errors,
                   stats->rx_length_errors + stats->rx_over_errors +
                    stats->rx_crc_errors + stats->rx_frame_errors,
                   stats->rx_compressed, stats->multicast,
                   stats->tx_bytes, stats->tx_packets,
                   stats->tx_errors, stats->tx_dropped,
                   stats->tx_fifo_errors, stats->collisions,
                   stats->tx_carrier_errors +
                    stats->tx_aborted_errors +
                    stats->tx_window_errors +
                    stats->tx_heartbeat_errors,
                   stats->tx_compressed);
To start with, expecting the ->rx_bytes to remain a 7-digit number is somewhat,
er, odd.  Long interace names be damned, the columns will not stay aligned,
no matter what.  Unless your interface vanishes as soon as it has sent
or received 10 megabytes, that is...

^ permalink raw reply

* Re: /proc/net/dev regression
From: Carlos R. Mafra @ 2015-01-11  1:33 UTC (permalink / raw)
  To: Al Viro; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev
In-Reply-To: <20150111010036.GD22149@ZenIV.linux.org.uk>

On Sun, 11 Jan 2015 at  1:00:36 +0000, Al Viro wrote:
> On Sun, Jan 11, 2015 at 12:27:06AM +0000, Al Viro wrote:
> > On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> > > [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> > > Inter-|   Receive                                                |  Transmit
> > >  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> > >     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> > > wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> > > 
> > > And for some reason this change confuses 'wmnet'. Reading the source code
> > > of 'wmnet' I found that it reads the packets as follows,
> > > 
> > > 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> > > 
> > > I am not sure if 'wmnet' could do this better (any suggestions?),
> > 
> > *snort*
> > 
> > well, yes - it's called scanf().  And if one is really, really nervous
> > about the overhead of <gasp> parsing a bunch of integers (as if fopen/
> > fgets/fclose alone won't cost enough to make constantly calling that
> > sucker a bad idea), just use ptr + <something> - 6 instead of
> > &buffer[<something>] in there.  That thing has just found where the
> > colon was (and replaced it with NUL), so dealing with "the first field
> > turned out to be too long and shifted everything past it" isn't hard.
> > 
> > > but the fact is that it was working before and now it is not.
> > 
> > True.  Mind you, the real issue is that this code expects the interface
> > names to be never longer than 6 characters, but then /proc/net/dev layout
> > strongly suggests that.  Hell knows; it is a regression and it does
> > break real-world userland code.  The only way to avoid that, AFAICS, is
> > to prohibit interface names longer than 6 chars ;-/
> > 
> > Lovely combination of crappy ABI (procfs file layout), crappy userland
> > code relying on details of said ABI out of sheer laziness and triggering
> > kernel change producing bloody long interface names...
> > 
> > Incidentally, sufficiently long interface name will produce other fun issues
> > for a docked app - it simply won't fit into 64x64 square on screen ;-)
> 
> Mind you, assuming that columns will align is obviously broken - the producing
> side of that thing is
>         seq_printf(seq, "%6s: %7llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu "
>                    "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n",
>                    dev->name, stats->rx_bytes, stats->rx_packets,
>                    stats->rx_errors,
>                    stats->rx_dropped + stats->rx_missed_errors,
>                    stats->rx_fifo_errors,
>                    stats->rx_length_errors + stats->rx_over_errors +
>                     stats->rx_crc_errors + stats->rx_frame_errors,
>                    stats->rx_compressed, stats->multicast,
>                    stats->tx_bytes, stats->tx_packets,
>                    stats->tx_errors, stats->tx_dropped,
>                    stats->tx_fifo_errors, stats->collisions,
>                    stats->tx_carrier_errors +
>                     stats->tx_aborted_errors +
>                     stats->tx_window_errors +
>                     stats->tx_heartbeat_errors,
>                    stats->tx_compressed);
> To start with, expecting the ->rx_bytes to remain a 7-digit number is somewhat,
> er, odd.  Long interace names be damned, the columns will not stay aligned,
> no matter what.  Unless your interface vanishes as soon as it has sent
> or received 10 megabytes, that is...

I think the problem with wmnet is not that it was expecting the fields
to be aligned because it never had problems before (when definitely more
than 10 megabytes were received, wmnet is crappy but not _that_ crappy).

I think the problem really was here,

	totalbytes_in = strtoul(&buffer[7], NULL, 10);

After the patch the device name is 8 characters long and &buffer[7]
overlaps with the name instead of reading the bytes. Before the
patch is was fine because the call to strtoul() seems correct in the
sense that it would read everything until the NULL. So more than 10
megabytes was still ok.

So I guess I was wrong when suggesting that the problem was the
alignment.

^ permalink raw reply

* Re: /proc/net/dev regression
From: Al Viro @ 2015-01-11  1:39 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev
In-Reply-To: <20150111013335.GA5753@linux-g29b.site>

On Sun, Jan 11, 2015 at 01:33:35AM +0000, Carlos R. Mafra wrote:

> I think the problem with wmnet is not that it was expecting the fields
> to be aligned because it never had problems before (when definitely more
> than 10 megabytes were received, wmnet is crappy but not _that_ crappy).
> 
> I think the problem really was here,
> 
> 	totalbytes_in = strtoul(&buffer[7], NULL, 10);
> 
> After the patch the device name is 8 characters long and &buffer[7]
> overlaps with the name instead of reading the bytes. Before the
> patch is was fine because the call to strtoul() seems correct in the
> sense that it would read everything until the NULL. So more than 10
> megabytes was still ok.
> 
> So I guess I was wrong when suggesting that the problem was the
> alignment.

Several lines below there's this:
                        totalpackets_out = strtoul(&buffer[74], NULL, 10);
                        if (totalpackets_out != lastpackets_out) {
                                totalbytes_out = strtoul(&buffer[66], NULL, 10);
                                diffpackets_out += totalpackets_out - lastpackets_out;
                                diffbytes_out += totalbytes_out - lastbytes_out;
                                lastpackets_out = totalpackets_out;
                                lastbytes_out = totalbytes_out;
                                tx = True;
                        }

So I'm afraid it *is* that crappy.  This function really should use scanf();
note that updateStats_ipchains() in the same file does just that (well,
fgets()+sscanf() for fsck knows what reason).  And I'd be careful with all
those %d, actually - it's not _that_ hard to get more than 4Gb sent.
scanf formats really ought to match the kernel-side (seq_)printf one...

^ permalink raw reply

* RE: [PATCH v4] can: Convert to runtime_pm
From: Appana Durga Kedareswara Rao @ 2015-01-11  5:34 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: linux-can@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Soren Brinkmann,
	grant.likely@linaro.org, wg@grandegger.com
In-Reply-To: <54AD267C.4060004@pengutronix.de>

Hi Marc,

> -----Original Message-----
> From: Marc Kleine-Budde [mailto:mkl@pengutronix.de]
> Sent: Wednesday, January 07, 2015 5:59 PM
> To: Appana Durga Kedareswara Rao; wg@grandegger.com; Michal Simek;
> grant.likely@linaro.org
> Cc: linux-can@vger.kernel.org; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; Appana Durga Kedareswara Rao; Soren Brinkmann
> Subject: Re: [PATCH v4] can: Convert to runtime_pm
>
> On 12/23/2014 01:25 PM, Kedareswara rao Appana wrote:
> > Instead of enabling/disabling clocks at several locations in the
> > driver, use the runtime_pm framework. This consolidates the actions
> > for runtime PM in the appropriate callbacks and makes the driver more
> > readable and mantainable.
> >
> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
> > ---
> > Chnages for v4:
> >  - Updated with the review comments.
> > Changes for v3:
> >   - Converted the driver to use runtime_pm.
> > Changes for v2:
> >   - Removed the struct platform_device* from suspend/resume
> >     as suggest by Lothar.
> >
> >  drivers/net/can/xilinx_can.c |  123
> > +++++++++++++++++++++++++-----------------
> >  1 files changed, 74 insertions(+), 49 deletions(-)
> >
> > diff --git a/drivers/net/can/xilinx_can.c
> > b/drivers/net/can/xilinx_can.c index 6c67643..c71f683 100644
> > --- a/drivers/net/can/xilinx_can.c
> > +++ b/drivers/net/can/xilinx_can.c
> > @@ -32,6 +32,7 @@
> >  #include <linux/can/dev.h>
> >  #include <linux/can/error.h>
> >  #include <linux/can/led.h>
> > +#include <linux/pm_runtime.h>
> >
> >  #define DRIVER_NAME        "xilinx_can"
> >
> > @@ -138,7 +139,7 @@ struct xcan_priv {
> >     u32 (*read_reg)(const struct xcan_priv *priv, enum xcan_reg reg);
> >     void (*write_reg)(const struct xcan_priv *priv, enum xcan_reg reg,
> >                     u32 val);
> > -   struct net_device *dev;
> > +   struct device *dev;
> >     void __iomem *reg_base;
> >     unsigned long irq_flags;
> >     struct clk *bus_clk;
> > @@ -842,6 +843,13 @@ static int xcan_open(struct net_device *ndev)
> >     struct xcan_priv *priv = netdev_priv(ndev);
> >     int ret;
> >
> > +   ret = pm_runtime_get_sync(priv->dev);
> > +   if (ret < 0) {
> > +           netdev_err(ndev, "%s: pm_runtime_get failed\r(%d)\n\r",
> > +                           __func__, ret);
> > +           return ret;
> > +   }
> > +
> >     ret = request_irq(ndev->irq, xcan_interrupt, priv->irq_flags,
> >                     ndev->name, ndev);
> >     if (ret < 0) {
> > @@ -849,29 +857,17 @@ static int xcan_open(struct net_device *ndev)
> >             goto err;
> >     }
> >
> > -   ret = clk_prepare_enable(priv->can_clk);
> > -   if (ret) {
> > -           netdev_err(ndev, "unable to enable device clock\n");
> > -           goto err_irq;
> > -   }
> > -
> > -   ret = clk_prepare_enable(priv->bus_clk);
> > -   if (ret) {
> > -           netdev_err(ndev, "unable to enable bus clock\n");
> > -           goto err_can_clk;
> > -   }
> > -
> >     /* Set chip into reset mode */
> >     ret = set_reset_mode(ndev);
> >     if (ret < 0) {
> >             netdev_err(ndev, "mode resetting failed!\n");
> > -           goto err_bus_clk;
> > +           goto err_irq;
> >     }
> >
> >     /* Common open */
> >     ret = open_candev(ndev);
> >     if (ret)
> > -           goto err_bus_clk;
> > +           goto err_irq;
> >
> >     ret = xcan_chip_start(ndev);
> >     if (ret < 0) {
> > @@ -887,13 +883,11 @@ static int xcan_open(struct net_device *ndev)
> >
> >  err_candev:
> >     close_candev(ndev);
> > -err_bus_clk:
> > -   clk_disable_unprepare(priv->bus_clk);
> > -err_can_clk:
> > -   clk_disable_unprepare(priv->can_clk);
> >  err_irq:
> >     free_irq(ndev->irq, ndev);
> >  err:
> > +   pm_runtime_put(priv->dev);
> > +
> >     return ret;
> >  }
> >
> > @@ -910,12 +904,11 @@ static int xcan_close(struct net_device *ndev)
> >     netif_stop_queue(ndev);
> >     napi_disable(&priv->napi);
> >     xcan_chip_stop(ndev);
> > -   clk_disable_unprepare(priv->bus_clk);
> > -   clk_disable_unprepare(priv->can_clk);
> >     free_irq(ndev->irq, ndev);
> >     close_candev(ndev);
> >
> >     can_led_event(ndev, CAN_LED_EVENT_STOP);
> > +   pm_runtime_put(priv->dev);
> >
> >     return 0;
> >  }
> > @@ -934,27 +927,20 @@ static int xcan_get_berr_counter(const struct
> net_device *ndev,
> >     struct xcan_priv *priv = netdev_priv(ndev);
> >     int ret;
> >
> > -   ret = clk_prepare_enable(priv->can_clk);
> > -   if (ret)
> > -           goto err;
> > -
> > -   ret = clk_prepare_enable(priv->bus_clk);
> > -   if (ret)
> > -           goto err_clk;
> > +   ret = pm_runtime_get_sync(priv->dev);
> > +   if (ret < 0) {
> > +           netdev_err(ndev, "%s: pm_runtime_get failed\r(%d)\n\r",
> > +                           __func__, ret);
>
> Please remove the \r from the error messages.
>

Ok

> > +           return ret;
> > +   }
> >
> >     bec->txerr = priv->read_reg(priv, XCAN_ECR_OFFSET) &
> XCAN_ECR_TEC_MASK;
> >     bec->rxerr = ((priv->read_reg(priv, XCAN_ECR_OFFSET) &
> >                     XCAN_ECR_REC_MASK) >> XCAN_ESR_REC_SHIFT);
> >
> > -   clk_disable_unprepare(priv->bus_clk);
> > -   clk_disable_unprepare(priv->can_clk);
> > +   pm_runtime_put(priv->dev);
> >
> >     return 0;
> > -
> > -err_clk:
> > -   clk_disable_unprepare(priv->can_clk);
> > -err:
> > -   return ret;
> >  }
> >
> >
> > @@ -967,15 +953,45 @@ static const struct net_device_ops
> > xcan_netdev_ops = {
> >
> >  /**
> >   * xcan_suspend - Suspend method for the driver
> > - * @dev:   Address of the platform_device structure
> > + * @dev:   Address of the device structure
> >   *
> >   * Put the driver into low power mode.
> > - * Return: 0 always
> > + * Return: 0 on success and failure value on error
> >   */
> >  static int __maybe_unused xcan_suspend(struct device *dev)  {
> > -   struct platform_device *pdev = dev_get_drvdata(dev);
> > -   struct net_device *ndev = platform_get_drvdata(pdev);
> > +   if (!device_may_wakeup(dev))
> > +           return pm_runtime_force_suspend(dev);
> > +
> > +   return 0;
> > +}
> > +
> > +/**
> > + * xcan_resume - Resume from suspend
> > + * @dev:   Address of the device structure
> > + *
> > + * Resume operation after suspend.
> > + * Return: 0 on success and failure value on error  */ static int
> > +__maybe_unused xcan_resume(struct device *dev) {
> > +   if (!device_may_wakeup(dev))
> > +           return pm_runtime_force_resume(dev);
> > +
> > +   return 0;
> > +
> > +}
> > +
> > +/**
> > + * xcan_runtime_suspend - Runtime suspend method for the driver
> > + * @dev:   Address of the device structure
> > + *
> > + * Put the driver into low power mode.
> > + * Return: 0 always
> > + */
> > +static int __maybe_unused xcan_runtime_suspend(struct device *dev) {
> > +   struct net_device *ndev = dev_get_drvdata(dev);
> >     struct xcan_priv *priv = netdev_priv(ndev);
> >
> >     if (netif_running(ndev)) {
> > @@ -993,16 +1009,15 @@ static int __maybe_unused xcan_suspend(struct
> > device *dev)  }
> >
> >  /**
> > - * xcan_resume - Resume from suspend
> > - * @dev:   Address of the platformdevice structure
> > + * xcan_runtime_resume - Runtime resume from suspend
> > + * @dev:   Address of the device structure
> >   *
> >   * Resume operation after suspend.
> >   * Return: 0 on success and failure value on error
> >   */
> > -static int __maybe_unused xcan_resume(struct device *dev)
> > +static int __maybe_unused xcan_runtime_resume(struct device *dev)
> >  {
> > -   struct platform_device *pdev = dev_get_drvdata(dev);
> > -   struct net_device *ndev = platform_get_drvdata(pdev);
> > +   struct net_device *ndev = dev_get_drvdata(dev);
> >     struct xcan_priv *priv = netdev_priv(ndev);
> >     int ret;
>
> Some more context:
>
> >     ret = clk_enable(priv->bus_clk);
> >     if (ret) {
> >             dev_err(dev, "Cannot enable clock.\n");
> >             return ret;
> >     }
> >     ret = clk_enable(priv->can_clk);
> >     if (ret) {
> >             dev_err(dev, "Cannot enable clock.\n");
> >             clk_disable_unprepare(priv->bus_clk);
>
> This disable_unprepare looks wrong, should be a disable only.
>

Ok

> >             return ret;
> >     }
> >
> >     priv->write_reg(priv, XCAN_MSR_OFFSET, 0);
> >     priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_CEN_MASK);
> >
> >     if (netif_running(ndev)) {
> >             priv->can.state = CAN_STATE_ERROR_ACTIVE;
>
> What happens if the device was not in ACTIVE state prior to the
> runtime_suspend?
>

I am not sure about the state of CAN at this point of time.
I just followed what other drivers are following for run time  suspend :).

> >             netif_device_attach(ndev);
> >             netif_start_queue(ndev);
> >     }
> >
> >     return 0;
> > }
>
>
> >
> > @@ -1020,9 +1035,9 @@ static int __maybe_unused xcan_resume(struct
> > device *dev)
> >
> >     priv->write_reg(priv, XCAN_MSR_OFFSET, 0);
> >     priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_CEN_MASK);
> > -   priv->can.state = CAN_STATE_ERROR_ACTIVE;
> >
> >     if (netif_running(ndev)) {
> > +           priv->can.state = CAN_STATE_ERROR_ACTIVE;
> >             netif_device_attach(ndev);
> >             netif_start_queue(ndev);
> >     }
> > @@ -1030,7 +1045,10 @@ static int __maybe_unused xcan_resume(struct
> device *dev)
> >     return 0;
> >  }
> >
> > -static SIMPLE_DEV_PM_OPS(xcan_dev_pm_ops, xcan_suspend,
> xcan_resume);
> > +static const struct dev_pm_ops xcan_dev_pm_ops = {
> > +   SET_SYSTEM_SLEEP_PM_OPS(xcan_suspend, xcan_resume)
> > +   SET_PM_RUNTIME_PM_OPS(xcan_runtime_suspend,
> xcan_runtime_resume,
> > +NULL) };
> >
> >  /**
> >   * xcan_probe - Platform registration call @@ -1071,7 +1089,7 @@
> > static int xcan_probe(struct platform_device *pdev)
> >             return -ENOMEM;
> >
> >     priv = netdev_priv(ndev);
> > -   priv->dev = ndev;
> > +   priv->dev = &pdev->dev;
> >     priv->can.bittiming_const = &xcan_bittiming_const;
> >     priv->can.do_set_mode = xcan_do_set_mode;
> >     priv->can.do_get_berr_counter = xcan_get_berr_counter; @@ -
> 1137,15
> > +1155,22 @@ static int xcan_probe(struct platform_device *pdev)
> >
> >     netif_napi_add(ndev, &priv->napi, xcan_rx_poll, rx_max);
> >
> > +   pm_runtime_set_active(&pdev->dev);
> > +   pm_runtime_irq_safe(&pdev->dev);
> > +   pm_runtime_enable(&pdev->dev);
> > +   pm_runtime_get_sync(&pdev->dev);
> Check error values?

Ok
> > +
> >     ret = register_candev(ndev);
> >     if (ret) {
> >             dev_err(&pdev->dev, "fail to register failed (err=%d)\n",
> ret);
> > +           pm_runtime_put(priv->dev);
>
> Please move the pm_runtime_put into the common error exit path.
>

Ok

> >             goto err_unprepare_disable_busclk;
> >     }
> >
> >     devm_can_led_init(ndev);
> > -   clk_disable_unprepare(priv->bus_clk);
> > -   clk_disable_unprepare(priv->can_clk);
> > +
> > +   pm_runtime_put(&pdev->dev);
> > +
> >     netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo
> depth:%d\n",
> >                     priv->reg_base, ndev->irq, priv->can.clock.freq,
> >                     priv->tx_max);
> >
>
> I think you have to convert the _remove() function, too. Have a look at the
> gpio-zynq.c driver:
>
> > static int zynq_gpio_remove(struct platform_device *pdev) {
> >     struct zynq_gpio *gpio = platform_get_drvdata(pdev);
> >
> >     pm_runtime_get_sync(&pdev->dev);
>
> However I don't understand why the get_sync() is here. Maybe Sören can
> help?

I converted the remove function to use the run-time PM and .
Below is the remove code snippet.

       ret = pm_runtime_get_sync(&pdev->dev);
        if (ret < 0) {
                netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n",
                                __func__, ret);
                return ret;
        }

        if (set_reset_mode(ndev) < 0)
                netdev_err(ndev, "mode resetting failed!\n");

        unregister_candev(ndev);
        netif_napi_del(&priv->napi);
        free_candev(ndev);
        pm_runtime_disable(&pdev->dev);

Observed the below things in the /sys/kernel/debug/clk/clk_summary.

                                Modprobe        ifconfig up    ifconfig down   rmmod  Modprobe  ifconfig up   ifconfig down  rmmod  Modprobe  ifconfig up   ifconfig down  rmmod

Clk_prepare count:                         1                            1               1               1       2        2              2               2       3       3               3               3

Clk_enable count:                 0             1               0               1       2       2               2               2       3       3               3               3

Regards,
Kedar.

>
> >     gpiochip_remove(&gpio->chip);
> >     clk_disable_unprepare(gpio->clk);
> >     device_set_wakeup_capable(&pdev->dev, 0);
> >     return 0;
> > }
>
> regards,
> Marc
>
> --
> Pengutronix e.K.                  | Marc Kleine-Budde           |
> Industrial Linux Solutions        | Phone: +49-231-2826-924     |
> Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
> Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



This email and any attachments are intended for the sole use of the named recipient(s) and contain(s) confidential information that may be proprietary, privileged or copyrighted under applicable law. If you are not the intended recipient, do not read, copy, or forward this email message or any attachments. Delete this email message and any attachments immediately.


^ permalink raw reply

* Re: [PATCH RESEND 2/2] wlcore: align member-assigns in a structure-copy block
From: Eliad Peller @ 2015-01-11 10:22 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Giel van Schijndel, LKML, John W. Linville, Arik Nemtsov,
	open list:TI WILINK WIRELES..., open list:NETWORKING DRIVERS
In-Reply-To: <87vbkfga32.fsf-HodKDYzPHsUD5k0oWYwrnHL1okKdlPRT@public.gmane.org>

On Fri, Jan 9, 2015 at 7:03 PM, Kalle Valo <kvalo-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> Giel van Schijndel <me-sZ9Uef1cvPWHXe+LvDLADg@public.gmane.org> writes:
>
>> This highlights the differences (e.g. the bug fixed in the previous
>> commit).
>>
>> Signed-off-by: Giel van Schijndel <me-sZ9Uef1cvPWHXe+LvDLADg@public.gmane.org>
>> ---
>>  drivers/net/wireless/ti/wlcore/acx.c | 22 +++++++++++-----------
>>  1 file changed, 11 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ti/wlcore/acx.c b/drivers/net/wireless/ti/wlcore/acx.c
>> index f28fa3b..93a2fa8 100644
>> --- a/drivers/net/wireless/ti/wlcore/acx.c
>> +++ b/drivers/net/wireless/ti/wlcore/acx.c
>> @@ -1715,17 +1715,17 @@ int wl12xx_acx_config_hangover(struct wl1271 *wl)
>>               goto out;
>>       }
>>
>> -     acx->recover_time = cpu_to_le32(conf->recover_time);
>> -     acx->hangover_period = conf->hangover_period;
>> -     acx->dynamic_mode = conf->dynamic_mode;
>> -     acx->early_termination_mode = conf->early_termination_mode;
>> -     acx->max_period = conf->max_period;
>> -     acx->min_period = conf->min_period;
>> -     acx->increase_delta = conf->increase_delta;
>> -     acx->decrease_delta = conf->decrease_delta;
>> -     acx->quiet_time = conf->quiet_time;
>> -     acx->increase_time = conf->increase_time;
>> -     acx->window_size = conf->window_size;
>> +     acx->recover_time               = cpu_to_le32(conf->recover_time);
>> +     acx->hangover_period            = conf->hangover_period;
>> +     acx->dynamic_mode               = conf->dynamic_mode;
>> +     acx->early_termination_mode     = conf->early_termination_mode;
>> +     acx->max_period                 = conf->max_period;
>> +     acx->min_period                 = conf->min_period;
>> +     acx->increase_delta             = conf->increase_delta;
>> +     acx->decrease_delta             = conf->decrease_delta;
>> +     acx->quiet_time                 = conf->quiet_time;
>> +     acx->increase_time              = conf->increase_time;
>> +     acx->window_size                = conf->window_size;
>
> I would like to get an ACK from one of the wlcore developers if I should
> apply this (or not).
>
I don't have a strong opinion here.
However, it looks pretty much redundant to take a random blob (which
was just fixed by a correct patch) and re-indent it.
The rest of the file doesn't follow this style, so i don't see a good
reason to apply it here.

I agree such indentation have some benefit, but it won't help with the
more common use case (of copy-paste error) of copying the wrong field
(i.e. D->a = S->b instead of D->a = S->a).
For these cases the macros suggested by Arend and Johannes will do the
trick. However i usually dislike such macros, as they tend to break
some IDE features (e.g. auto completion).
Maybe we can come up with some nice spatch to catch these cases.

Just my 2c.

Eliad.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox