Netdev List
 help / color / mirror / Atom feed
* [PATCH net] net/ipv4: Use non-atomic allocation of udp offloads structure instance
From: Or Gerlitz @ 2014-01-29 16:08 UTC (permalink / raw)
  To: davem; +Cc: netdev, dan.carpenter, Or Gerlitz

Since udp_add_offload() can be called from non-sleepable context e.g
under this call tree from the vxlan driver use case:

  vxlan_socket_create() <-- holds the spinlock
  -> vxlan_notify_add_rx_port()
     -> udp_add_offload()  <-- schedules

we should allocate the udp_offloads structure in atomic manner.

Fixes: b582ef0 ('net: Add GRO support for UDP encapsulating protocols')
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 net/ipv4/udp_offload.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 25f5cee..2ffea6f 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -101,7 +101,7 @@ out:
 int udp_add_offload(struct udp_offload *uo)
 {
 	struct udp_offload_priv __rcu **head = &udp_offload_base;
-	struct udp_offload_priv *new_offload = kzalloc(sizeof(*new_offload), GFP_KERNEL);
+	struct udp_offload_priv *new_offload = kzalloc(sizeof(*new_offload), GFP_ATOMIC);
 
 	if (!new_offload)
 		return -ENOMEM;
-- 
1.7.1

^ permalink raw reply related

* [PATCH net] net/vxlan: Go over all candidate streams for GRO matching
From: Or Gerlitz @ 2014-01-29 16:10 UTC (permalink / raw)
  To: davem; +Cc: netdev, therbert, edumazet, Or Gerlitz

The loop in vxlan_gro_receive() over the current set of candidates for 
coalescing was wrongly aborted once a match was found. In rare cases,
this can cause a false-positives matching in the next layer GRO checks.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/vxlan.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 026a313..3d6cd1c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -596,10 +596,8 @@ static struct sk_buff **vxlan_gro_receive(struct sk_buff **head, struct sk_buff
 			NAPI_GRO_CB(p)->same_flow = 0;
 			continue;
 		}
-		goto found;
 	}
 
-found:
 	type = eh->h_proto;
 
 	rcu_read_lock();
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH stable 3.11+] can: use private sk reference to detect originating socket
From: Eric Dumazet @ 2014-01-29 16:12 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: David Miller, Linux Netdev List, Andre Naujoks
In-Reply-To: <52E905C8.1060408@hartkopp.net>

On Wed, 2014-01-29 at 14:44 +0100, Oliver Hartkopp wrote:

> ps. I would suggest to move the BUG() in WARN_ON_ONCE() to fulfill the
> requirement to detect a possible misuse but not kill the entire machine.
> BUG() is intended to be used when there's no way to continue any reasonable
> operation. Detecting and fixing this issue half a year after the "temporary
> sanity check" has been integrated is a bad example for using BUG() IMO.

I find worrying it took 6 months for CAN developers to hit this problem.

A WARN_ON_ONCE() has almost no chance being noticed in the embedded
world, and makes no pressure on developers to fix the bugs.

Patch is 6 months old, and will be reverted when we have confidence that
all offenders are fixed. Your feedback shows that we shall wait 10 years
just to make sure, and maybe we wont revert it.

All BUG() in the kernel have the property to potentially halt a
perfectly working machine, at a bad time, yes. But they make us
build a better kernel.

^ permalink raw reply

* Re: [PATCH net] net/ipv4: Use non-atomic allocation of udp offloads structure instance
From: Eric Dumazet @ 2014-01-29 16:20 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: davem, netdev, dan.carpenter
In-Reply-To: <1391011739-8597-1-git-send-email-ogerlitz@mellanox.com>

On Wed, 2014-01-29 at 18:08 +0200, Or Gerlitz wrote:
> Since udp_add_offload() can be called from non-sleepable context e.g
> under this call tree from the vxlan driver use case:
> 
>   vxlan_socket_create() <-- holds the spinlock
>   -> vxlan_notify_add_rx_port()
>      -> udp_add_offload()  <-- schedules
> 
> we should allocate the udp_offloads structure in atomic manner.
> 
> Fixes: b582ef0 ('net: Add GRO support for UDP encapsulating protocols')
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> ---
>  net/ipv4/udp_offload.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 25f5cee..2ffea6f 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -101,7 +101,7 @@ out:
>  int udp_add_offload(struct udp_offload *uo)
>  {
>  	struct udp_offload_priv __rcu **head = &udp_offload_base;
> -	struct udp_offload_priv *new_offload = kzalloc(sizeof(*new_offload), GFP_KERNEL);
> +	struct udp_offload_priv *new_offload = kzalloc(sizeof(*new_offload), GFP_ATOMIC);
>  
>  	if (!new_offload)
>  		return -ENOMEM;

Could you also fix all the rcu_dereference() calls ?

Try CONFIG_PROVE_RCU=y

^ permalink raw reply

* Re: [PATCH net] net/ipv4: Use non-atomic allocation of udp offloads structure instance
From: Or Gerlitz @ 2014-01-29 16:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev, dan.carpenter, Shlomo Pongratz
In-Reply-To: <1391012411.28432.55.camel@edumazet-glaptop2.roam.corp.google.com>

On 29/01/2014 18:20, Eric Dumazet wrote:
> Could you also fix all the rcu_dereference() calls ?
>
> Try CONFIG_PROVE_RCU=y

Thanks Eric, I tried this out now and got one alarm on udp_add_offload, 
which I will loop up how to fix,
do you see anything else?


===============================
[ INFO: suspicious RCU usage. ]
3.13.0+ #109 Not tainted
-------------------------------
net/ipv4/udp_offload.c:112 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
4 locks held by kworker/2:2/2752:
  #0:  (vxlan){.+.+.+}, at: [<ffffffff8108bd5a>] 
process_one_work+0x15a/0x710
  #1:  ((&vxlan->sock_work)){+.+.+.}, at: [<ffffffff8108bd5a>] 
process_one_work+0x15a/0x710
  #2:  (&(&vn->sock_lock)->rlock){+.+...}, at: [<ffffffffa03caae3>] 
vxlan_sock_add+0x323/0xb70 [vxlan]
  #3:  (udp_offload_lock){+.+...}, at: [<ffffffff814d22e1>] 
udp_add_offload+0x41/0xd0

stack backtrace:
CPU: 2 PID: 2752 Comm: kworker/2:2 Not tainted 3.13.0+ #109
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
Workqueue: vxlan vxlan_sock_work [vxlan]
  0000000000000001 ffff8800d1949bf8 ffffffff81528412 0000000000000002
  ffff88021ef2e060 ffff8800d1949c28 ffffffff810b8373 ffff880218f59160
  0000000000000000 ffffffff81aa3540 ffff88021dcaf000 ffff8800d1949c48
Call Trace:
  [<ffffffff81528412>] dump_stack+0x51/0x77
  [<ffffffff810b8373>] lockdep_rcu_suspicious+0x103/0x140
  [<ffffffff814d234a>] udp_add_offload+0xaa/0xd0
  [<ffffffffa03cb0ad>] vxlan_sock_add+0x8ed/0xb70 [vxlan]
  [<ffffffffa03caa77>] ? vxlan_sock_add+0x2b7/0xb70 [vxlan]
  [<ffffffffa03ce390>] ? vxlan_xmit+0x650/0x650 [vxlan]
  [<ffffffffa03cb330>] ? vxlan_sock_add+0xb70/0xb70 [vxlan]
  [<ffffffff81291f7c>] ? delay_tsc+0x6c/0xd0
  [<ffffffffa03cb439>] vxlan_sock_work+0x109/0x260 [vxlan]
  [<ffffffffa03cb330>] ? vxlan_sock_add+0xb70/0xb70 [vxlan]
  [<ffffffff8108bdd1>] process_one_work+0x1d1/0x710
  [<ffffffff8108bd5a>] ? process_one_work+0x15a/0x710
  [<ffffffff8108c432>] worker_thread+0x122/0x400
  [<ffffffff8108c310>] ? process_one_work+0x710/0x710
  [<ffffffff810943fe>] kthread+0xde/0x100
  [<ffffffff81094320>] ? __init_kthread_worker+0x70/0x70
  [<ffffffff8152f3bc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81094320>] ? __init_kthread_worker+0x70/0x70

^ permalink raw reply

* Re: [PATCH v2 1/4] phy: provide accessors for 'advertising' and 'supported' fields
From: Florian Fainelli @ 2014-01-29 17:14 UTC (permalink / raw)
  To: Max Filippov
  Cc: linux-xtensa@linux-xtensa.org, netdev,
	linux-kernel@vger.kernel.org, Chris Zankel, Marc Gauthier,
	David S. Miller, Ben Hutchings
In-Reply-To: <1390975218-13863-2-git-send-email-jcmvbkbc@gmail.com>

2014-01-28 Max Filippov <jcmvbkbc@gmail.com>:
> Many network drivers directly modify phy_device::advertising and
> phy_device::supported. Provide accessors to these fields to better
> isolate phylib from its users.
>
> Suggested-by: Ben Hutchings <ben@decadent.org.uk>
> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>

After giving some more thought to this patch, I am not sure this
really adds anything, struct phy_device is already exposed to drivers,
and those drivers have been able to modify phydev->supported and
phydev->advertising to suit their needs.

> ---
> Changes v1->v2:
> - new patch
>
>  include/linux/phy.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index 48a4dc3..2ae58f8 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -559,6 +559,18 @@ static inline int phy_read_status(struct phy_device *phydev) {
>         return phydev->drv->read_status(phydev);
>  }
>
> +static inline void phy_update_advert(struct phy_device *phydev, u32 clear,
> +                                    u32 set)
> +{
> +       phydev->advertising = (phydev->advertising & ~clear) | set;
> +}
> +
> +static inline void phy_update_supported(struct phy_device *phydev, u32 clear,
> +                                       u32 set)
> +{
> +       phydev->supported = (phydev->supported & ~clear) | set;
> +}
> +
>  int genphy_setup_forced(struct phy_device *phydev);
>  int genphy_restart_aneg(struct phy_device *phydev);
>  int genphy_config_aneg(struct phy_device *phydev);
> --
> 1.8.1.4
>



-- 
Florian

^ permalink raw reply

* Re: [PATCH 0/2] [BUG FIXES - 3.10.27] sit: More backports
From: Willem de Bruijn @ 2014-01-29 17:21 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: Steven Rostedt, linux-kernel, netdev, stable, Clark Williams,
	Luis Claudio R. Goncalves, John Kacur
In-Reply-To: <52E9267C.90403@6wind.com>

> From d101450583c3a472a2a94904cfe13fd4e7d2f519 Mon Sep 17 00:00:00 2001
> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Date: Wed, 29 Jan 2014 16:40:30 +0100
> Subject: [PATCH] sit: fix double free of fb_tunnel_dev on exit
>
> This problem was fixed upstream by commit 9434266f2c64 ("sit: fix use after free
> of fb_tunnel_dev").
> The upstream patch depends on upstream commit 5e6700b3bf98 ("sit: add support of
> x-netns"), which was not backported into 3.10 branch.
>
> First, explain the problem: when the sit module is unloaded, sit_cleanup() is
> called.
> rmmod sit
> => sit_cleanup()
>   => rtnl_link_unregister()
>     => __rtnl_kill_links()
>       => for_each_netdev(net, dev) {
>         if (dev->rtnl_link_ops == ops)
>                 ops->dellink(dev, &list_kill);
>         }
> At this point, the FB device is deleted (and all sit tunnels).
>   => unregister_pernet_device()
>     => unregister_pernet_operations()
>       => ops_exit_list()
>         => sit_exit_net()
>           => sit_destroy_tunnels()
>           In this function, no tunnel is found.
>           => unregister_netdevice_queue(sitn->fb_tunnel_dev, &list);
> We delete the FB device a second time here!
>
> Because we cannot simply remove the second deletion (sit_exit_net() must remove
> the FB device when a netns is deleted), we add an rtnl ops which delete all sit
> device excepting the FB device and thus we can keep the explicit deletion in
> sit_exit_net().
>
> CC: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> ---
>  net/ipv6/sit.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
> index 0491264b8bfc..620d326e8fdd 100644
> --- a/net/ipv6/sit.c
> +++ b/net/ipv6/sit.c
> @@ -1507,6 +1507,15 @@ static const struct nla_policy ipip6_policy[IFLA_IPTUN_MAX + 1] = {
>  #endif
>  };
>
> +static void ipip6_dellink(struct net_device *dev, struct list_head *head)
> +{
> +       struct net *net = dev_net(dev);
> +       struct sit_net *sitn = net_generic(net, sit_net_id);
> +
> +       if (dev != sitn->fb_tunnel_dev)
> +               unregister_netdevice_queue(dev, head);
> +}
> +
>  static struct rtnl_link_ops sit_link_ops __read_mostly = {
>         .kind           = "sit",
>         .maxtype        = IFLA_IPTUN_MAX,
> @@ -1517,6 +1526,7 @@ static struct rtnl_link_ops sit_link_ops __read_mostly = {
>         .changelink     = ipip6_changelink,
>         .get_size       = ipip6_get_size,
>         .fill_info      = ipip6_fill_info,
> +       .dellink        = ipip6_dellink,
>  };
>
>  static struct xfrm_tunnel sit_handler __read_mostly = {
> --
> 1.8.4.1


This looks good to me. It is the same as the backport "sit: fix use
after free of fb_tunnel_dev" (9434266f2c64), minus the small code
cleanup at the end of that patch that is not relevant to 3.10.27 (do
not define sit_net *sitn in sit_exit_net if it is not used there. But in
3.10.27 it is used, in unregister_netdevice_queue).

^ permalink raw reply

* Re: [PATCH net] net/ipv4: Use non-atomic allocation of udp offloads structure instance
From: Eric Dumazet @ 2014-01-29 17:31 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: davem, netdev, dan.carpenter, Shlomo Pongratz
In-Reply-To: <52E92F92.9080307@mellanox.com>

On Wed, 2014-01-29 at 18:42 +0200, Or Gerlitz wrote:
> On 29/01/2014 18:20, Eric Dumazet wrote:
> > Could you also fix all the rcu_dereference() calls ?
> >
> > Try CONFIG_PROVE_RCU=y
> 
> Thanks Eric, I tried this out now and got one alarm on udp_add_offload, 
> which I will loop up how to fix,
> do you see anything else?
> 
> 

You should have same issue in udp_del_offload()


(Lockdep triggers once only)

> ===============================
> [ INFO: suspicious RCU usage. ]
> 3.13.0+ #109 Not tainted
> -------------------------------
> net/ipv4/udp_offload.c:112 suspicious rcu_dereference_check() usage!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 1, debug_locks = 0
> 4 locks held by kworker/2:2/2752:
>   #0:  (vxlan){.+.+.+}, at: [<ffffffff8108bd5a>] 
> process_one_work+0x15a/0x710
>   #1:  ((&vxlan->sock_work)){+.+.+.}, at: [<ffffffff8108bd5a>] 
> process_one_work+0x15a/0x710
>   #2:  (&(&vn->sock_lock)->rlock){+.+...}, at: [<ffffffffa03caae3>] 
> vxlan_sock_add+0x323/0xb70 [vxlan]
>   #3:  (udp_offload_lock){+.+...}, at: [<ffffffff814d22e1>] 
> udp_add_offload+0x41/0xd0
> 
> stack backtrace:
> CPU: 2 PID: 2752 Comm: kworker/2:2 Not tainted 3.13.0+ #109
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> Workqueue: vxlan vxlan_sock_work [vxlan]
>   0000000000000001 ffff8800d1949bf8 ffffffff81528412 0000000000000002
>   ffff88021ef2e060 ffff8800d1949c28 ffffffff810b8373 ffff880218f59160
>   0000000000000000 ffffffff81aa3540 ffff88021dcaf000 ffff8800d1949c48
> Call Trace:
>   [<ffffffff81528412>] dump_stack+0x51/0x77
>   [<ffffffff810b8373>] lockdep_rcu_suspicious+0x103/0x140
>   [<ffffffff814d234a>] udp_add_offload+0xaa/0xd0
>   [<ffffffffa03cb0ad>] vxlan_sock_add+0x8ed/0xb70 [vxlan]
>   [<ffffffffa03caa77>] ? vxlan_sock_add+0x2b7/0xb70 [vxlan]
>   [<ffffffffa03ce390>] ? vxlan_xmit+0x650/0x650 [vxlan]
>   [<ffffffffa03cb330>] ? vxlan_sock_add+0xb70/0xb70 [vxlan]
>   [<ffffffff81291f7c>] ? delay_tsc+0x6c/0xd0
>   [<ffffffffa03cb439>] vxlan_sock_work+0x109/0x260 [vxlan]
>   [<ffffffffa03cb330>] ? vxlan_sock_add+0xb70/0xb70 [vxlan]
>   [<ffffffff8108bdd1>] process_one_work+0x1d1/0x710
>   [<ffffffff8108bd5a>] ? process_one_work+0x15a/0x710
>   [<ffffffff8108c432>] worker_thread+0x122/0x400
>   [<ffffffff8108c310>] ? process_one_work+0x710/0x710
>   [<ffffffff810943fe>] kthread+0xde/0x100
>   [<ffffffff81094320>] ? __init_kthread_worker+0x70/0x70
>   [<ffffffff8152f3bc>] ret_from_fork+0x7c/0xb0
>   [<ffffffff81094320>] ? __init_kthread_worker+0x70/0x70
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [rfc] b43: possible missing break in b43_wa_all
From: Dave Jones @ 2014-01-29 17:35 UTC (permalink / raw)
  To: netdev; +Cc: stefano.brivio

I'm going through all the 'missing break in switch' bugs in coverity, and
most of them look like intentional/false positives, but this one looks odd to me.

It seems the intent is to B43_WARN_ON only if we don't know the phy, so it seems
strange to do it after we've just initialized type 7.

If this is agreed to be a bug, I'll submit the below patch properly,
but I want to be sure this isn't something like "type 7 is work in progress"
It's been this way since 2007 though. I guess no-one runs with CONFIG_B43_DEBUG on.

	Dave

diff --git a/drivers/net/wireless/b43/wa.c b/drivers/net/wireless/b43/wa.c
index 9b1a038be08b..47f0ec887e1f 100644
--- a/drivers/net/wireless/b43/wa.c
+++ b/drivers/net/wireless/b43/wa.c
@@ -587,6 +587,7 @@ void b43_wa_all(struct b43_wldev *dev)
 			b43_wa_txpuoff_rxpuon(dev);
 			b43_wa_txlna_gain(dev);
 			b43_wa_rssi_adc(dev);
+			break;
 		default:
 			B43_WARN_ON(1);
 		}

^ permalink raw reply related

* Re: [PATCH v2 2/4] net: ethoc: don't advertise gigabit speed on attached PHY
From: Max Filippov @ 2014-01-29 18:32 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Marc Gauthier, Ben Hutchings, LKML, David S. Miller, Chris Zankel,
	linux-xtensa@linux-xtensa.org, netdev
In-Reply-To: <CAGVrzcboHp8-qHZseGOVm14u1-cTcOjRZGExFxNu_nbK__XCSA@mail.gmail.com>

On Wed, Jan 29, 2014 at 9:12 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> On Jan 28, 2014 11:01 PM, "Max Filippov" <jcmvbkbc@gmail.com> wrote:
>>
>> On Wed, Jan 29, 2014 at 10:47 AM, Florian Fainelli <f.fainelli@gmail.com>
>> wrote:
>> > Hi Max,
>> >
>> > Le 28/01/2014 22:00, Max Filippov a écrit :
>> >
>> >> OpenCores 10/100 Mbps MAC does not support speeds above 100 Mbps, but
>> >> does
>> >> not disable advertisement when PHY supports them. This results in
>> >> non-functioning network when the MAC is connected to a gigabit PHY
>> >> connected
>> >> to a gigabit switch.
>> >>
>> >> The fix is to disable gigabit speed advertisement on attached PHY
>> >> unconditionally.
>> >>
>> >> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
>> >> ---
>> >> Changes v1->v2:
>> >> - disable both gigabit advertisement and support.
>> >>
>> >>   drivers/net/ethernet/ethoc.c | 8 ++++++++
>> >>   1 file changed, 8 insertions(+)
>> >>
>> >> diff --git a/drivers/net/ethernet/ethoc.c
>> >> b/drivers/net/ethernet/ethoc.c
>> >> index 4de8cfd..5643b2d 100644
>> >> --- a/drivers/net/ethernet/ethoc.c
>> >> +++ b/drivers/net/ethernet/ethoc.c
>> >> @@ -688,6 +688,14 @@ static int ethoc_mdio_probe(struct net_device
>> >> *dev)
>> >>         }
>> >>
>> >>         priv->phy = phy;
>> >> +       phy_update_advert(phy,
>> >> +                         ADVERTISED_1000baseT_Full |
>> >> +                         ADVERTISED_1000baseT_Half, 0);
>> >> +       phy_start_aneg(phy);
>> >
>> >
>> > This does not look necessary, you should not have to call
>> > phy_start_aneg()
>> > because the PHY state machine is not yet started, at best this calls
>> > does
>> > nothing.
>>
>> This call actually makes the whole thing work, because otherwise once
>> gigabit
>> support is cleared from the supported mask genphy_config_advert does not
>> update gigabit advertisement register, leaving it enabled.
>
> OK, then we need to figure out what is wrong with ethoc since this is
> unusual.

Maybe they boot up with gigabit advertisement disabled in their PHY
and thus they don't see the problem?

> Other drivers do the following:
>
> - connect to the PHY
> - phydev->supported = PHY_BASIC_FEATURES
> - phydev->advertising &= phydev->supported
> - start the PHY state machine
>
> And they work just fine. Is the PHY driver you are bound to the "Generic
> PHY" or something else which does something funky in config_aneg()?

It's marvell 88E1111 from the KC-705 board, but the behaviour doesn't
change if I disable it and the generic phy is used.

-- 
Thanks.
-- Max

^ permalink raw reply

* Re: [PATCH] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get (v3)
From: Pablo Neira Ayuso @ 2014-01-29 19:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrey Vagin, netfilter-devel, netfilter, coreteam, netdev,
	linux-kernel, vvs, Florian Westphal, Patrick McHardy,
	Jozsef Kadlecsik, David S. Miller, Cyrill Gorcunov
In-Reply-To: <1389558074.31367.187.camel@edumazet-glaptop2.roam.corp.google.com>

On Sun, Jan 12, 2014 at 12:21:14PM -0800, Eric Dumazet wrote:
> On Sun, 2014-01-12 at 21:50 +0400, Andrey Vagin wrote:
> > Lets look at destroy_conntrack:
> > 
> > hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
> > ...
> > nf_conntrack_free(ct)
> > 	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
> > 
> > net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
> > 
> > The hash is protected by rcu, so readers look up conntracks without
> > locks.
> > A conntrack is removed from the hash, but in this moment a few readers
> > still can use the conntrack. Then this conntrack is released and another
> > thread creates conntrack with the same address and the equal tuple.
> > After this a reader starts to validate the conntrack:
> > * It's not dying, because a new conntrack was created
> > * nf_ct_tuple_equal() returns true.
> ...
> 
> 
> > v2: move nf_ct_is_confirmed into the unlikely() annotation
> > v3: Eric suggested to fix refcnt, so that it becomes zero before adding
> >     in a hash, but we can't find a way how to do that. Another way is to
> >     interpret the confirm bit as part of a search key and check it in
> >     ____nf_conntrack_find() too.
> > 
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Florian Westphal <fw@strlen.de>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Cc: Patrick McHardy <kaber@trash.net>
> > Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> > Signed-off-by: Andrey Vagin <avagin@openvz.org>
> > ---
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Applied, thanks everyone!

^ permalink raw reply

* Re: [PATCH stable 3.11+] can: use private sk reference to detect originating socket
From: Oliver Hartkopp @ 2014-01-29 19:30 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Linux Netdev List, Andre Naujoks,
	linux-can@vger.kernel.org
In-Reply-To: <1391010437.28432.39.camel@edumazet-glaptop2.roam.corp.google.com>



On 29.01.2014 16:47, Eric Dumazet wrote:
> On Wed, 2014-01-29 at 07:30 -0800, Eric Dumazet wrote:
>> On Wed, 2014-01-29 at 07:02 -0800, Eric Dumazet wrote:
>>
>>> Thats how every protocol does this, with variants.
>>>
>>> Check l2tp_sock_wfree()/l2tp_skb_set_owner_w() for an example
>>
>> Also keep in mind this patch is needed for all kernels, not only 3.11+
>>
>> So better keep it as simple as possible, to ease backports.
> 
> Please try the following patch.
> 

Hello Eric,

there were at least two problems with your patch:

1. Only the stuff in net/can was addressed (missing drivers/net/can)

2. The check for sk and the additional skb_orphan() is not needed as the
can_skb_set_owner() is invoked only after 'fresh' skb allocation or after
clone/skb_orphan() - so we know the skb state very good.

I moved your suggested inline functions to include/linux/can/skb.h where
all users (net/can and drivers) can access them.

So what has been done:

- can_skb_set_owner() is invoked on new created skbuffs (tx path)
- can_skb_set_owner() is invoked on orphaned/cloned skbs (echo in drvs)

I did some tests with real CAN hardware and everything seems fine.

Please take a look if it looks correct to you.

If so I'll send a proper patch for it.

Regards,
Oliver

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 13a9098..17a0a00 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -325,17 +325,20 @@ void can_put_echo_skb(struct sk_buff *skb, struct net_device *dev,
 	if (!priv->echo_skb[idx]) {
 		struct sock *srcsk = skb->sk;
 
-		if (atomic_read(&skb->users) != 1) {
-			struct sk_buff *old_skb = skb;
+		if (skb_shared(skb)) {
+			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
 
-			skb = skb_clone(old_skb, GFP_ATOMIC);
-			kfree_skb(old_skb);
-			if (!skb)
+			if (likely(nskb))
+				consume_skb(skb);
+			else {
+				kfree_skb(skb);
 				return;
+			}
+			skb = nskb;
 		} else
 			skb_orphan(skb);
 
-		skb->sk = srcsk;
+		can_skb_set_owner(skb, srcsk);
 
 		/* make settings for echo to reduce code in irq context */
 		skb->protocol = htons(ETH_P_CAN);
diff --git a/drivers/net/can/janz-ican3.c b/drivers/net/can/janz-ican3.c
index e24e669..ee48cb1 100644
--- a/drivers/net/can/janz-ican3.c
+++ b/drivers/net/can/janz-ican3.c
@@ -18,6 +18,7 @@
 #include <linux/netdevice.h>
 #include <linux/can.h>
 #include <linux/can/dev.h>
+#include <linux/can/skb.h>
 #include <linux/can/error.h>
 
 #include <linux/mfd/janz.h>
@@ -1135,18 +1136,20 @@ static void ican3_put_echo_skb(struct ican3_dev *mod, struct sk_buff *skb)
 {
 	struct sock *srcsk = skb->sk;
 
-	if (atomic_read(&skb->users) != 1) {
-		struct sk_buff *old_skb = skb;
+	if (skb_shared(skb)) {
+		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
 
-		skb = skb_clone(old_skb, GFP_ATOMIC);
-		kfree_skb(old_skb);
-		if (!skb)
+		if (likely(nskb))
+			consume_skb(skb);
+		else {
+			kfree_skb(skb);
 			return;
-	} else {
+		}
+		skb = nskb;
+	} else
 		skb_orphan(skb);
-	}
 
-	skb->sk = srcsk;
+	can_skb_set_owner(skb, srcsk);
 
 	/* save this skb for tx interrupt echo handling */
 	skb_queue_tail(&mod->echoq, skb);
diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c
index 0a2a5ee..f764f00 100644
--- a/drivers/net/can/vcan.c
+++ b/drivers/net/can/vcan.c
@@ -46,6 +46,7 @@
 #include <linux/if_ether.h>
 #include <linux/can.h>
 #include <linux/can/dev.h>
+#include <linux/can/skb.h>
 #include <linux/slab.h>
 #include <net/rtnetlink.h>
 
@@ -118,12 +119,22 @@ static netdev_tx_t vcan_tx(struct sk_buff *skb, struct net_device *dev)
 	if (loop) {
 		struct sock *srcsk = skb->sk;
 
-		skb = skb_share_check(skb, GFP_ATOMIC);
-		if (!skb)
-			return NETDEV_TX_OK;
+		if (skb_shared(skb)) {
+			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
+			if (likely(nskb))
+				consume_skb(skb);
+			else {
+				kfree_skb(skb);
+				return NETDEV_TX_OK;
+			}
+			skb = nskb;
+		} else
+			skb_orphan(skb);
+
+		can_skb_set_owner(skb, srcsk);
 
 		/* receive with packet counting */
-		skb->sk = srcsk;
 		vcan_rx(skb, dev);
 	} else {
 		/* no looped packets => no counting */
diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h
index 2f0543f..0429d36 100644
--- a/include/linux/can/skb.h
+++ b/include/linux/can/skb.h
@@ -11,7 +11,9 @@
 #define CAN_SKB_H
 
 #include <linux/types.h>
+#include <linux/skbuff.h>
 #include <linux/can.h>
+#include <net/sock.h>
 
 /*
  * The struct can_skb_priv is used to transport additional information along
@@ -42,4 +44,16 @@ static inline void can_skb_reserve(struct sk_buff *skb)
 	skb_reserve(skb, sizeof(struct can_skb_priv));
 }
 
+static inline void can_skb_destructor(struct sk_buff *skb)
+{
+	sock_put(skb->sk);
+}
+
+static inline void can_skb_set_owner(struct sk_buff *skb, struct sock *sk)
+{
+	sock_hold(sk);
+	skb->destructor = can_skb_destructor;
+	skb->sk = sk;
+}
+
 #endif /* CAN_SKB_H */
diff --git a/net/can/af_can.c b/net/can/af_can.c
index d249874..7e4101b 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -290,7 +290,7 @@ int can_send(struct sk_buff *skb, int loop)
 				return -ENOMEM;
 			}
 
-			newskb->sk = skb->sk;
+			can_skb_set_owner(newskb, skb->sk);
 			newskb->ip_summed = CHECKSUM_UNNECESSARY;
 			newskb->pkt_type = PACKET_BROADCAST;
 		}
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 3fc737b..dcb75c0 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -268,7 +268,7 @@ static void bcm_can_tx(struct bcm_op *op)
 
 	/* send with loopback */
 	skb->dev = dev;
-	skb->sk = op->sk;
+	can_skb_set_owner(skb, op->sk);
 	can_send(skb, 1);
 
 	/* update statistics */
@@ -1223,7 +1223,7 @@ static int bcm_tx_send(struct msghdr *msg, int ifindex, struct sock *sk)
 
 	can_skb_prv(skb)->ifindex = dev->ifindex;
 	skb->dev = dev;
-	skb->sk  = sk;
+	can_skb_set_owner(skb, sk);
 	err = can_send(skb, 1); /* send with loopback */
 	dev_put(dev);
 


^ permalink raw reply related

* Re: [PATCH] iproute: Fix Netid value for multi-families output
From: François-Xavier Le Bail @ 2014-01-29 19:29 UTC (permalink / raw)
  To: Stephen Hemminger, Linux Netdev List, Pavel Emelyanov
In-Reply-To: <52E74D0F.4050806@parallels.com>

On Tue, 1/28/14, Pavel Emelyanov <xemul@parallels.com> wrote:

> When requesting simultaneous output of TCP and UDP sockets
> the netid field shows "tcp" always.

[...]

> Reported-by: François-Xavier Le Bail <fx.lebail@yahoo.com>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

> ---

Tested-by: François-Xavier Le Bail <fx.lebail@yahoo.com>

^ permalink raw reply

* Re: [PATCH] iproute: Properly handle protocol level diag module absence
From: François-Xavier Le Bail @ 2014-01-29 19:39 UTC (permalink / raw)
  To: Stephen Hemminger, Linux Netdev List, Pavel Emelyanov
In-Reply-To: <52E7E990.7050501@parallels.com>

On Tue, 1/28/14, Pavel Emelyanov <xemul@parallels.com> wrote:

> When *_diag module is missing in the kernel, the ss tool should go
> ad read legacry /proc/* files.

> This is the case when all *_diag stuff is missing, but in case the
> inet_diag.ko is loaded, but (tcp|udp)_diag.ko is not, the ss tool
> doesn't notice this and produces empty output. The reason for that
> is -- error from the inet_diag module (which means, that e.g. the
> udp_diag is missing) is reported in the NLMSG_DONE message body.

> That said, we need to check the NLMSG_DONE's message return code
> and act respectively.

> Reported-by: François-Xavier Le Bail <fx.lebail@yahoo.com>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

> ---

Tested-by: François-Xavier Le Bail <fx.lebail@yahoo.com>

^ permalink raw reply

* Fwd: RFC 7112 on Implications of Oversized IPv6 Header Chains
From: Fernando Gont @ 2014-01-29 19:25 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20140129173044.D475C7FC17B@rfc-editor.org>

Folks,

FYI. This one has important implications -- it allows stateless
filtering in IPv6 (otherwise not really possible)



-------- Original Message --------
Subject: RFC 7112 on Implications of Oversized IPv6 Header Chains
Date: Wed, 29 Jan 2014 09:30:44 -0800 (PST)
From: rfc-editor@rfc-editor.org
To: ietf-announce@ietf.org, rfc-dist@rfc-editor.org
CC: drafts-update-ref@iana.org, ipv6@ietf.org, rfc-editor@rfc-editor.org

A new Request for Comments is now available in online RFC libraries.


        RFC 7112

        Title:      Implications of Oversized IPv6 Header
                    Chains
        Author:     F. Gont, V. Manral,
                    R. Bonica
        Status:     Standards Track
        Stream:     IETF
        Date:       January 2014
        Mailbox:    fgont@si6networks.com,
                    vishwas@ionosnetworks.com,
                    rbonica@juniper.net
        Pages:      8
        Characters: 15897
        Updates:    RFC 2460

        I-D Tag:    draft-ietf-6man-oversized-header-chain-09.txt

        URL:        http://www.rfc-editor.org/rfc/rfc7112.txt

The IPv6 specification allows IPv6 Header Chains of an arbitrary
size.  The specification also allows options that can, in turn,
extend each of the headers.  In those scenarios in which the IPv6
Header Chain or options are unusually long and packets are
fragmented, or scenarios in which the fragment size is very small,
the First Fragment of a packet may fail to include the entire IPv6
Header Chain.  This document discusses the interoperability and
security problems of such traffic, and updates RFC 2460 such that the
First Fragment of a packet is required to contain the entire IPv6
Header Chain.

This document is a product of the IPv6 Maintenance Working Group of the
IETF.

This is now a Proposed Standard.

STANDARDS TRACK: This document specifies an Internet standards track
protocol for the Internet community,and requests discussion and suggestions
for improvements.  Please refer to the current edition of the Internet
Official Protocol Standards (STD 1) for the standardization state and
status of this protocol.  Distribution of this memo is unlimited.

This announcement is sent to the IETF-Announce and rfc-dist lists.
To subscribe or unsubscribe, see
  http://www.ietf.org/mailman/listinfo/ietf-announce
  http://mailman.rfc-editor.org/mailman/listinfo/rfc-dist

For searching the RFC series, see
http://www.rfc-editor.org/search/rfc_search.php
For downloading RFCs, see http://www.rfc-editor.org/rfc.html

Requests for special distribution should be addressed to either the
author of the RFC in question, or to rfc-editor@rfc-editor.org.  Unless
specifically noted otherwise on the RFC itself, all RFCs are for
unlimited distribution.


The RFC Editor Team
Association Management Solutions, LLC


--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------




-- 
Fernando Gont
e-mail: fernando@gont.com.ar || fgont@si6networks.com
PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1

^ permalink raw reply

* Re: [PATCH stable 3.11+] can: use private sk reference to detect originating socket
From: Eric Dumazet @ 2014-01-29 19:55 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: David Miller, Linux Netdev List, Andre Naujoks,
	linux-can@vger.kernel.org
In-Reply-To: <52E956E6.7090002@hartkopp.net>

On Wed, 2014-01-29 at 20:30 +0100, Oliver Hartkopp wrote:
> 
> On 29.01.2014 16:47, Eric Dumazet wrote:
> > On Wed, 2014-01-29 at 07:30 -0800, Eric Dumazet wrote:
> >> On Wed, 2014-01-29 at 07:02 -0800, Eric Dumazet wrote:
> >>
> >>> Thats how every protocol does this, with variants.
> >>>
> >>> Check l2tp_sock_wfree()/l2tp_skb_set_owner_w() for an example
> >>
> >> Also keep in mind this patch is needed for all kernels, not only 3.11+
> >>
> >> So better keep it as simple as possible, to ease backports.
> > 
> > Please try the following patch.
> > 
> 
> Hello Eric,
> 
> there were at least two problems with your patch:
> 
> 1. Only the stuff in net/can was addressed (missing drivers/net/can)
> 
> 2. The check for sk and the additional skb_orphan() is not needed as the
> can_skb_set_owner() is invoked only after 'fresh' skb allocation or after
> clone/skb_orphan() - so we know the skb state very good.

Yep, but then you added skb_orphan() calls. when skb is not shared.

> 
> I moved your suggested inline functions to include/linux/can/skb.h where
> all users (net/can and drivers) can access them.
> 
> So what has been done:
> 
> - can_skb_set_owner() is invoked on new created skbuffs (tx path)
> - can_skb_set_owner() is invoked on orphaned/cloned skbs (echo in drvs)
> 
> I did some tests with real CAN hardware and everything seems fine.
> 
> Please take a look if it looks correct to you.
> 
> If so I'll send a proper patch for it.
> 
> Regards,
> Oliver
> 
> diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
> index 13a9098..17a0a00 100644
> --- a/drivers/net/can/dev.c
> +++ b/drivers/net/can/dev.c
> @@ -325,17 +325,20 @@ void can_put_echo_skb(struct sk_buff *skb, struct net_device *dev,
>  	if (!priv->echo_skb[idx]) {
>  		struct sock *srcsk = skb->sk;
>  

This is still buggy.

You should not consume_skb() before having 
a ref count on skb->sk.


> -		if (atomic_read(&skb->users) != 1) {
> -			struct sk_buff *old_skb = skb;
> +		if (skb_shared(skb)) {
> +			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
>  
> -			skb = skb_clone(old_skb, GFP_ATOMIC);
> -			kfree_skb(old_skb);
> -			if (!skb)
> +			if (likely(nskb))
> +				consume_skb(skb);
> +			else {
> +				kfree_skb(skb);
>  				return;
> +			}
> +			skb = nskb;
>  		} else
>  			skb_orphan(skb);
>  
> -		skb->sk = srcsk;
> +		can_skb_set_owner(skb, srcsk);

Only after this point you can do the :

consume_skb(original_skb);



>  
>  		/* make settings for echo to reduce code in irq context */
>  		skb->protocol = htons(ETH_P_CAN);
> diff --git a/drivers/net/can/janz-ican3.c b/drivers/net/can/janz-ican3.c
> index e24e669..ee48cb1 100644
> --- a/drivers/net/can/janz-ican3.c
> +++ b/drivers/net/can/janz-ican3.c
> @@ -18,6 +18,7 @@
>  #include <linux/netdevice.h>
>  #include <linux/can.h>
>  #include <linux/can/dev.h>
> +#include <linux/can/skb.h>
>  #include <linux/can/error.h>
>  
>  #include <linux/mfd/janz.h>
> @@ -1135,18 +1136,20 @@ static void ican3_put_echo_skb(struct ican3_dev *mod, struct sk_buff *skb)
>  {
>  	struct sock *srcsk = skb->sk;
>  

Same problem here.

> -	if (atomic_read(&skb->users) != 1) {
> -		struct sk_buff *old_skb = skb;
> +	if (skb_shared(skb)) {
> +		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
>  
> -		skb = skb_clone(old_skb, GFP_ATOMIC);
> -		kfree_skb(old_skb);
> -		if (!skb)
> +		if (likely(nskb))
> +			consume_skb(skb);
> +		else {
> +			kfree_skb(skb);
>  			return;
> -	} else {
> +		}
> +		skb = nskb;
> +	} else
>  		skb_orphan(skb);
> -	}
>  
> -	skb->sk = srcsk;
> +	can_skb_set_owner(skb, srcsk);
>  
>  	/* save this skb for tx interrupt echo handling */
>  	skb_queue_tail(&mod->echoq, skb);
> diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c
> index 0a2a5ee..f764f00 100644
> --- a/drivers/net/can/vcan.c
> +++ b/drivers/net/can/vcan.c
> @@ -46,6 +46,7 @@
>  #include <linux/if_ether.h>
>  #include <linux/can.h>
>  #include <linux/can/dev.h>
> +#include <linux/can/skb.h>
>  #include <linux/slab.h>
>  #include <net/rtnetlink.h>
>  
> @@ -118,12 +119,22 @@ static netdev_tx_t vcan_tx(struct sk_buff *skb, struct net_device *dev)
>  	if (loop) {
>  		struct sock *srcsk = skb->sk;
>  

Same problem

> -		skb = skb_share_check(skb, GFP_ATOMIC);
> -		if (!skb)
> -			return NETDEV_TX_OK;
> +		if (skb_shared(skb)) {
> +			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
> +
> +			if (likely(nskb))
> +				consume_skb(skb);
> +			else {
> +				kfree_skb(skb);
> +				return NETDEV_TX_OK;
> +			}
> +			skb = nskb;
> +		} else
> +			skb_orphan(skb);
> +
> +		can_skb_set_owner(skb, srcsk);
>  
>  		/* receive with packet counting */
> -		skb->sk = srcsk;
>  		vcan_rx(skb, dev);
>  	} else {
>  		/* no looped packets => no counting */


I think you really should have a helper instead of copying this 3 times.


/*
 * like skb_share_check(), but transfert the skb->sk ownership
 */
static inline struct sk_buff *can_skb_share_check(struct sk_buff *skb)
{
	if (skb_shared(skb)) {
		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);

		if (likely(nskb)) {
			if (skb->sk)
				can_skb_set_owner(nskb, skb->sk);
			consume_skb(skb);
			return nskb;
		}
		kfree_skb(skb);
		return NULL;
	}
	return skb;
}



^ permalink raw reply

* [PATCH 0/4] OpenCores 10/100 MAC ethtool operations
From: Max Filippov @ 2014-01-29 19:56 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: David S. Miller, Florian Fainelli, Marc Gauthier, Max Filippov

Hello David, Florian and everybody,

this series implements ethtool callbacks for the ethoc driver as was
requested by Florian.

Max Filippov (4):
  net: ethoc: implement basic ethtool operations
  net: ethoc: implement ethtool get/set settings
  net: ethoc: implement ethtool get registers
  net: ethoc: implement ethtool get/set ring parameters

 drivers/net/ethernet/ethoc.c | 85 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)

-- 
1.8.1.4

^ permalink raw reply

* [PATCH 1/4] net: ethoc: implement basic ethtool operations
From: Max Filippov @ 2014-01-29 19:56 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: David S. Miller, Florian Fainelli, Marc Gauthier, Max Filippov
In-Reply-To: <1391025397-14965-1-git-send-email-jcmvbkbc@gmail.com>

The following methods are implemented:
- get link state (standard implementation);
- get timestamping info (standard implementation).

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
---
 drivers/net/ethernet/ethoc.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
index 5854d41..6de6352 100644
--- a/drivers/net/ethernet/ethoc.c
+++ b/drivers/net/ethernet/ethoc.c
@@ -900,6 +900,11 @@ out:
 	return NETDEV_TX_OK;
 }
 
+const struct ethtool_ops ethoc_ethtool_ops = {
+	.get_link = ethtool_op_get_link,
+	.get_ts_info = ethtool_op_get_ts_info,
+};
+
 static const struct net_device_ops ethoc_netdev_ops = {
 	.ndo_open = ethoc_open,
 	.ndo_stop = ethoc_stop,
@@ -1148,6 +1153,7 @@ static int ethoc_probe(struct platform_device *pdev)
 	netdev->netdev_ops = &ethoc_netdev_ops;
 	netdev->watchdog_timeo = ETHOC_TIMEOUT;
 	netdev->features |= 0;
+	netdev->ethtool_ops = &ethoc_ethtool_ops;
 
 	/* setup NAPI */
 	netif_napi_add(netdev, &priv->napi, ethoc_poll, 64);
-- 
1.8.1.4

^ permalink raw reply related

* [PATCH 2/4] net: ethoc: implement ethtool get/set settings
From: Max Filippov @ 2014-01-29 19:56 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: David S. Miller, Florian Fainelli, Marc Gauthier, Max Filippov
In-Reply-To: <1391025397-14965-1-git-send-email-jcmvbkbc@gmail.com>

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
---
 drivers/net/ethernet/ethoc.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
index 6de6352..9518023 100644
--- a/drivers/net/ethernet/ethoc.c
+++ b/drivers/net/ethernet/ethoc.c
@@ -900,7 +900,31 @@ out:
 	return NETDEV_TX_OK;
 }
 
+static int ethoc_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+{
+	struct ethoc *priv = netdev_priv(dev);
+	struct phy_device *phydev = priv->phy;
+
+	if (!phydev)
+		return -ENODEV;
+
+	return phy_ethtool_gset(phydev, cmd);
+}
+
+static int ethoc_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+{
+	struct ethoc *priv = netdev_priv(dev);
+	struct phy_device *phydev = priv->phy;
+
+	if (!phydev)
+		return -ENODEV;
+
+	return phy_ethtool_sset(phydev, cmd);
+}
+
 const struct ethtool_ops ethoc_ethtool_ops = {
+	.get_settings = ethoc_get_settings,
+	.set_settings = ethoc_set_settings,
 	.get_link = ethtool_op_get_link,
 	.get_ts_info = ethtool_op_get_ts_info,
 };
-- 
1.8.1.4

^ permalink raw reply related

* [PATCH 3/4] net: ethoc: implement ethtool get registers
From: Max Filippov @ 2014-01-29 19:56 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: David S. Miller, Florian Fainelli, Marc Gauthier, Max Filippov
In-Reply-To: <1391025397-14965-1-git-send-email-jcmvbkbc@gmail.com>

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
---
 drivers/net/ethernet/ethoc.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
index 9518023..0bf297b 100644
--- a/drivers/net/ethernet/ethoc.c
+++ b/drivers/net/ethernet/ethoc.c
@@ -52,6 +52,7 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
 #define	ETH_HASH0	0x48
 #define	ETH_HASH1	0x4c
 #define	ETH_TXCTRL	0x50
+#define	ETH_END		0x54
 
 /* mode register */
 #define	MODER_RXEN	(1 <<  0) /* receive enable */
@@ -922,9 +923,28 @@ static int ethoc_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 	return phy_ethtool_sset(phydev, cmd);
 }
 
+static int ethoc_get_regs_len(struct net_device *netdev)
+{
+	return ETH_END;
+}
+
+static void ethoc_get_regs(struct net_device *dev, struct ethtool_regs *regs,
+			   void *p)
+{
+	struct ethoc *priv = netdev_priv(dev);
+	u32 *regs_buff = p;
+	unsigned i;
+
+	regs->version = 0;
+	for (i = 0; i < ETH_END / sizeof(u32); ++i)
+		regs_buff[i] = ethoc_read(priv, i * sizeof(u32));
+}
+
 const struct ethtool_ops ethoc_ethtool_ops = {
 	.get_settings = ethoc_get_settings,
 	.set_settings = ethoc_set_settings,
+	.get_regs_len = ethoc_get_regs_len,
+	.get_regs = ethoc_get_regs,
 	.get_link = ethtool_op_get_link,
 	.get_ts_info = ethtool_op_get_ts_info,
 };
-- 
1.8.1.4

^ permalink raw reply related

* [PATCH 4/4] net: ethoc: implement ethtool get/set ring parameters
From: Max Filippov @ 2014-01-29 19:56 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: David S. Miller, Florian Fainelli, Marc Gauthier, Max Filippov
In-Reply-To: <1391025397-14965-1-git-send-email-jcmvbkbc@gmail.com>

TX and RX rings share memory and descriptors. Maximal number of
descriptors reported is one less than the total available nuber of
descriptors. For the set operation the requested number of TX descriptors
is rounded down to the nearest power of two (driver logic requirement), the
rest are RX descriptors. If less RX descriptors is requested the rest is
left unused.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
---
 drivers/net/ethernet/ethoc.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
index 0bf297b..c873946 100644
--- a/drivers/net/ethernet/ethoc.c
+++ b/drivers/net/ethernet/ethoc.c
@@ -181,6 +181,7 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
  * @membase:	pointer to buffer memory region
  * @dma_alloc:	dma allocated buffer size
  * @io_region_size:	I/O memory region size
+ * @num_bd:	number of buffer descriptors
  * @num_tx:	number of send buffers
  * @cur_tx:	last send buffer written
  * @dty_tx:	last buffer actually sent
@@ -201,6 +202,7 @@ struct ethoc {
 	int dma_alloc;
 	resource_size_t io_region_size;
 
+	unsigned int num_bd;
 	unsigned int num_tx;
 	unsigned int cur_tx;
 	unsigned int dty_tx;
@@ -940,12 +942,44 @@ static void ethoc_get_regs(struct net_device *dev, struct ethtool_regs *regs,
 		regs_buff[i] = ethoc_read(priv, i * sizeof(u32));
 }
 
+static void ethoc_get_ringparam(struct net_device *dev,
+				struct ethtool_ringparam *ring)
+{
+	struct ethoc *priv = netdev_priv(dev);
+
+	ring->rx_max_pending = priv->num_bd - 1;
+	ring->rx_mini_max_pending = 0;
+	ring->rx_jumbo_max_pending = 0;
+	ring->tx_max_pending = priv->num_bd - 1;
+
+	ring->rx_pending = priv->num_rx;
+	ring->rx_mini_pending = 0;
+	ring->rx_jumbo_pending = 0;
+	ring->tx_pending = priv->num_tx;
+}
+
+static int ethoc_set_ringparam(struct net_device *dev,
+			       struct ethtool_ringparam *ring)
+{
+	struct ethoc *priv = netdev_priv(dev);
+
+	if (netif_running(dev))
+		return -EBUSY;
+	priv->num_tx = rounddown_pow_of_two(ring->tx_pending);
+	priv->num_rx = priv->num_bd - priv->num_tx;
+	if (priv->num_rx > ring->rx_pending)
+		priv->num_rx = ring->rx_pending;
+	return 0;
+}
+
 const struct ethtool_ops ethoc_ethtool_ops = {
 	.get_settings = ethoc_get_settings,
 	.set_settings = ethoc_set_settings,
 	.get_regs_len = ethoc_get_regs_len,
 	.get_regs = ethoc_get_regs,
 	.get_link = ethtool_op_get_link,
+	.get_ringparam = ethoc_get_ringparam,
+	.set_ringparam = ethoc_set_ringparam,
 	.get_ts_info = ethtool_op_get_ts_info,
 };
 
@@ -1077,6 +1111,7 @@ static int ethoc_probe(struct platform_device *pdev)
 		ret = -ENODEV;
 		goto error;
 	}
+	priv->num_bd = num_bd;
 	/* num_tx must be a power of two */
 	priv->num_tx = rounddown_pow_of_two(num_bd >> 1);
 	priv->num_rx = num_bd - priv->num_tx;
-- 
1.8.1.4

^ permalink raw reply related

* Re: [PATCH 0/4] OpenCores 10/100 MAC ethtool operations
From: Florian Fainelli @ 2014-01-29 20:01 UTC (permalink / raw)
  To: Max Filippov
  Cc: netdev, linux-kernel@vger.kernel.org, David S. Miller,
	Marc Gauthier
In-Reply-To: <1391025397-14965-1-git-send-email-jcmvbkbc@gmail.com>

2014-01-29 Max Filippov <jcmvbkbc@gmail.com>:
> Hello David, Florian and everybody,
>
> this series implements ethtool callbacks for the ethoc driver as was
> requested by Florian.

Thanks Max, this does look good to me now:

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

>
> Max Filippov (4):
>   net: ethoc: implement basic ethtool operations
>   net: ethoc: implement ethtool get/set settings
>   net: ethoc: implement ethtool get registers
>   net: ethoc: implement ethtool get/set ring parameters
>
>  drivers/net/ethernet/ethoc.c | 85 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 85 insertions(+)
>
> --
> 1.8.1.4
>



-- 
Florian

^ permalink raw reply

* Re: [PATCH stable 3.11+] can: use private sk reference to detect originating socket
From: Oliver Hartkopp @ 2014-01-29 20:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Linux Netdev List, Andre Naujoks,
	linux-can@vger.kernel.org
In-Reply-To: <1391025348.28432.63.camel@edumazet-glaptop2.roam.corp.google.com>

On 29.01.2014 20:55, Eric Dumazet wrote:

>> 2. The check for sk and the additional skb_orphan() is not needed as the
>> can_skb_set_owner() is invoked only after 'fresh' skb allocation or after
>> clone/skb_orphan() - so we know the skb state very good.
> 
> Yep, but then you added skb_orphan() calls. when skb is not shared.

I did that e.g. in the sources placed in driver/net/can because I assume that
I can not be sure that I always get skbs created by AF_CAN.

E.g. one could also use AF_PACKET to send CAN frames to the CAN interfaces.

Btw. assuming AF_PACKET does it correct too with the destructor I will omit
the skb_orphan() / can_skb_set_owner() sequence in drivers/net/can in the next
attempt.

Tnx
Oliver

> 
> I think you really should have a helper instead of copying this 3 times.
> 

ok


^ permalink raw reply

* Re: [PATCH 0/2] [BUG FIXES - 3.10.27] sit: More backports
From: Steven Rostedt @ 2014-01-29 20:48 UTC (permalink / raw)
  To: nicolas.dichtel
  Cc: linux-kernel, netdev, stable, Clark Williams,
	Luis Claudio R. Goncalves, John Kacur, Willem de Bruijn
In-Reply-To: <52E9267C.90403@6wind.com>

On Wed, 29 Jan 2014 17:04:12 +0100
Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
 
> Your patch serie seems to be the good way to go (note that patch 1/2 does not
> compile) but I think the fix is smaller because we don't have x-netns.
> 
> Here is my proposal, if you agree, I will send the same patch for ip6_tunnnel,
> which have the netns leak.
> 

Hold on. Seems that the kernels that were being tested in QA had more
code than what I was testing. Clark had backported "sit: fix use after
free of fb_tunnel_dev" and that was what was causing the
unlist_netdevice() to be missed.

When I started working on vanilla 3.10.27 as well, I first did my
original patch (which just removes the call to
unregister_netdevice_queue() from sit_exit_net()). I asked to have that
added to our kernel for testing, and they told me it was already there
via Clark's backport. Then I did the full backport as well and looked
for the leak. I'm now thinking that the full backport is not needed as
that was what caused the leak.

According to commit 9434266f2c645d4fcf62a03a8e36ad8075e37943 "sit: fix
use after free of fb_tunnel_dev", it states:

    Bug: The fallback device is created in sit_init_net and assumed to
    be freed in sit_exit_net. First, it is dereferenced in that
    function, in sit_destroy_tunnels:
    
            struct net *net = dev_net(sitn->fb_tunnel_dev);
    
    Prior to this, rtnl_unlink_register has removed all devices that
    match rtnl_link_ops == sit_link_ops.

    Commit 205983c43700 added the line
    
    +       sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops;
    
    which cases the fallback device to match here and be freed before it
    is last dereferenced.

Commit 205983c43700 was backported to 3.10, but without commit
5e6700b3bf98 "sit: add support of x-netns" which was what added the

  net = dev_net(sitn->fb_tunnel_dev);

Which looks to me that the only reason I need to port back commit
5e6700b3bf98 is if I add the full backport of 9434266f2c645d4f.

Seems to me that my original patch may be good enough. The one that I
said this series obsoletes.

Note, I've talked with the people that are doing the testing, and I'm
having them revert all changes except for that one fix and rerun the
tests again. I should know the results by tomorrow.

Let me know if "sit: fix use after free of fb_tunnel_dev" still needs
to be backported due to some other way that the fallback device can be
dereferenced after being freed.

-- Steve

^ permalink raw reply

* Re: [PATCH stable 3.11+] can: use private sk reference to detect originating socket
From: Oliver Hartkopp @ 2014-01-29 21:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Linux Netdev List, Andre Naujoks,
	linux-can@vger.kernel.org
In-Reply-To: <52E9675A.4070201@hartkopp.net>

So here we are:

Introduced can_create_echo_skb() which creates a skb which is properly
owned to be send back (echo'ed) into the network stack. 

Regards,
Oliver

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 13a9098..fc59bc6 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -323,19 +323,10 @@ void can_put_echo_skb(struct sk_buff *skb, struct net_device *dev,
 	}
 
 	if (!priv->echo_skb[idx]) {
-		struct sock *srcsk = skb->sk;
 
-		if (atomic_read(&skb->users) != 1) {
-			struct sk_buff *old_skb = skb;
-
-			skb = skb_clone(old_skb, GFP_ATOMIC);
-			kfree_skb(old_skb);
-			if (!skb)
-				return;
-		} else
-			skb_orphan(skb);
-
-		skb->sk = srcsk;
+		skb = can_create_echo_skb(skb);
+		if (!skb)
+			return;
 
 		/* make settings for echo to reduce code in irq context */
 		skb->protocol = htons(ETH_P_CAN);
diff --git a/drivers/net/can/janz-ican3.c b/drivers/net/can/janz-ican3.c
index e24e669..2124c679 100644
--- a/drivers/net/can/janz-ican3.c
+++ b/drivers/net/can/janz-ican3.c
@@ -18,6 +18,7 @@
 #include <linux/netdevice.h>
 #include <linux/can.h>
 #include <linux/can/dev.h>
+#include <linux/can/skb.h>
 #include <linux/can/error.h>
 
 #include <linux/mfd/janz.h>
@@ -1133,20 +1134,9 @@ static void ican3_handle_message(struct ican3_dev *mod, struct ican3_msg *msg)
  */
 static void ican3_put_echo_skb(struct ican3_dev *mod, struct sk_buff *skb)
 {
-	struct sock *srcsk = skb->sk;
-
-	if (atomic_read(&skb->users) != 1) {
-		struct sk_buff *old_skb = skb;
-
-		skb = skb_clone(old_skb, GFP_ATOMIC);
-		kfree_skb(old_skb);
-		if (!skb)
-			return;
-	} else {
-		skb_orphan(skb);
-	}
-
-	skb->sk = srcsk;
+	skb = can_create_echo_skb(skb);
+	if (!skb)
+		return;
 
 	/* save this skb for tx interrupt echo handling */
 	skb_queue_tail(&mod->echoq, skb);
diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c
index 0a2a5ee..4e94057 100644
--- a/drivers/net/can/vcan.c
+++ b/drivers/net/can/vcan.c
@@ -46,6 +46,7 @@
 #include <linux/if_ether.h>
 #include <linux/can.h>
 #include <linux/can/dev.h>
+#include <linux/can/skb.h>
 #include <linux/slab.h>
 #include <net/rtnetlink.h>
 
@@ -109,25 +110,23 @@ static netdev_tx_t vcan_tx(struct sk_buff *skb, struct net_device *dev)
 			stats->rx_packets++;
 			stats->rx_bytes += cfd->len;
 		}
-		kfree_skb(skb);
+		consume_skb(skb);
 		return NETDEV_TX_OK;
 	}
 
 	/* perform standard echo handling for CAN network interfaces */
 
 	if (loop) {
-		struct sock *srcsk = skb->sk;
 
-		skb = skb_share_check(skb, GFP_ATOMIC);
+		skb = can_create_echo_skb(skb);
 		if (!skb)
 			return NETDEV_TX_OK;
 
 		/* receive with packet counting */
-		skb->sk = srcsk;
 		vcan_rx(skb, dev);
 	} else {
 		/* no looped packets => no counting */
-		kfree_skb(skb);
+		consume_skb(skb);
 	}
 	return NETDEV_TX_OK;
 }
diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h
index 2f0543f..f9bbbb4 100644
--- a/include/linux/can/skb.h
+++ b/include/linux/can/skb.h
@@ -11,7 +11,9 @@
 #define CAN_SKB_H
 
 #include <linux/types.h>
+#include <linux/skbuff.h>
 #include <linux/can.h>
+#include <net/sock.h>
 
 /*
  * The struct can_skb_priv is used to transport additional information along
@@ -42,4 +44,40 @@ static inline void can_skb_reserve(struct sk_buff *skb)
 	skb_reserve(skb, sizeof(struct can_skb_priv));
 }
 
+static inline void can_skb_destructor(struct sk_buff *skb)
+{
+	sock_put(skb->sk);
+}
+
+static inline void can_skb_set_owner(struct sk_buff *skb, struct sock *sk)
+{
+	if (sk) {
+		sock_hold(sk);
+		skb->destructor = can_skb_destructor;
+		skb->sk = sk;
+	}
+}
+
+/*
+ * returns an unshared skb owned by the original sock to be echo'ed back
+ */
+static inline struct sk_buff *can_create_echo_skb(struct sk_buff *skb)
+{
+	if (skb_shared(skb)) {
+		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
+		if (likely(nskb)) {
+			can_skb_set_owner(nskb, skb->sk);
+			consume_skb(skb);
+			return nskb;
+		} else {
+			kfree_skb(skb);
+			return NULL;
+		}
+	}
+
+	/* we can assume to have an unshared skb with proper owner */
+	return skb;
+}
+
 #endif /* CAN_SKB_H */
diff --git a/net/can/af_can.c b/net/can/af_can.c
index d249874..a27f8aa 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -57,6 +57,7 @@
 #include <linux/skbuff.h>
 #include <linux/can.h>
 #include <linux/can/core.h>
+#include <linux/can/skb.h>
 #include <linux/ratelimit.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
@@ -290,7 +291,7 @@ int can_send(struct sk_buff *skb, int loop)
 				return -ENOMEM;
 			}
 
-			newskb->sk = skb->sk;
+			can_skb_set_owner(newskb, skb->sk);
 			newskb->ip_summed = CHECKSUM_UNNECESSARY;
 			newskb->pkt_type = PACKET_BROADCAST;
 		}
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 3fc737b..dcb75c0 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -268,7 +268,7 @@ static void bcm_can_tx(struct bcm_op *op)
 
 	/* send with loopback */
 	skb->dev = dev;
-	skb->sk = op->sk;
+	can_skb_set_owner(skb, op->sk);
 	can_send(skb, 1);
 
 	/* update statistics */
@@ -1223,7 +1223,7 @@ static int bcm_tx_send(struct msghdr *msg, int ifindex, struct sock *sk)
 
 	can_skb_prv(skb)->ifindex = dev->ifindex;
 	skb->dev = dev;
-	skb->sk  = sk;
+	can_skb_set_owner(skb, sk);
 	err = can_send(skb, 1); /* send with loopback */
 	dev_put(dev);
 




On 29.01.2014 21:40, Oliver Hartkopp wrote:
> On 29.01.2014 20:55, Eric Dumazet wrote:
> 
>>> 2. The check for sk and the additional skb_orphan() is not needed as the
>>> can_skb_set_owner() is invoked only after 'fresh' skb allocation or after
>>> clone/skb_orphan() - so we know the skb state very good.
>>
>> Yep, but then you added skb_orphan() calls. when skb is not shared.
> 
> I did that e.g. in the sources placed in driver/net/can because I assume that
> I can not be sure that I always get skbs created by AF_CAN.
> 
> E.g. one could also use AF_PACKET to send CAN frames to the CAN interfaces.
> 
> Btw. assuming AF_PACKET does it correct too with the destructor I will omit
> the skb_orphan() / can_skb_set_owner() sequence in drivers/net/can in the next
> attempt.
> 
> Tnx
> Oliver
> 
>>
>> I think you really should have a helper instead of copying this 3 times.
>>
> 
> ok
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-can" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox