Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH v2 net 0/2] Alow tg3/bnx recevie full sized 802.1ad frames
From: David Miller @ 2014-10-01 20:45 UTC (permalink / raw)
  To: vyasevich
  Cc: netdev, prashant, mchan, sony.chacko, Dept-HSGLinuxNICDev,
	vyasevic
In-Reply-To: <1412120377-11125-1-git-send-email-vyasevic@redhat.com>

From: Vladislav Yasevich <vyasevich@gmail.com>
Date: Tue, 30 Sep 2014 19:39:35 -0400

> bnx2 and tg3 drivers drop packets that exceed device mtu and that
> do not have a vlan tag.  The idea is to catch ethernet frames
> that are too long.  This also ends up dropping 802.1ad tagged
> frames since the encapsulation protocol is different.
> We should not be dropping this packets.
> 
> v2:  rebased and consolidated tg3 and bnx patches.

Applied, thanks Vlad.

^ permalink raw reply

* Re: [PATCH net 0/2] r8152: patches about firmware
From: David Miller @ 2014-10-01 20:47 UTC (permalink / raw)
  To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel, linux-usb
In-Reply-To: <1394712342-15778-51-Taiwan-albertk@realtek.com>

From: Hayes Wang <hayeswang@realtek.com>
Date: Wed, 1 Oct 2014 13:25:09 +0800

> The patches fix the issues when the firmware exists.
> 
> For the multiple OS, the firmware may be loaded by the
> driver of the other OS. And the Linux driver has influences
> on it.

Series applied, thanks.

^ permalink raw reply

* Re: [Patch net-next] net_sched: avoid calling tcf_unbind_filter() in call_rcu callback
From: John Fastabend @ 2014-10-01 20:50 UTC (permalink / raw)
  To: John Fastabend, Cong Wang; +Cc: netdev, davem
In-Reply-To: <542B5679.3060601@gmail.com>

On 09/30/2014 06:18 PM, John Fastabend wrote:
> On 09/30/2014 05:53 PM, John Fastabend wrote:
>> On 09/30/2014 04:07 PM, Cong Wang wrote:
>>> This fixes the following crash:
>>>
>>> [   63.976822] general protection fault: 0000 [#1] PREEMPT SMP
>>> DEBUG_PAGEALLOC
>>> [   63.980094] CPU: 1 PID: 15 Comm: ksoftirqd/1 Not tainted
>>> 3.17.0-rc6+ #648
>>> [   63.980094] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>>> [   63.980094] task: ffff880117dea690 ti: ffff880117dfc000 task.ti:
>>> ffff880117dfc000
>>> [   63.980094] RIP: 0010:[<ffffffff817e6d07>]  [<ffffffff817e6d07>]
>>> u32_destroy_key+0x27/0x6d
>>> [   63.980094] RSP: 0018:ffff880117dffcc0  EFLAGS: 00010202
>>> [   63.980094] RAX: ffff880117dea690 RBX: ffff8800d02e0820 RCX:
>>> 0000000000000000
>>> [   63.980094] RDX: 0000000000000001 RSI: 0000000000000002 RDI:
>>> 6b6b6b6b6b6b6b6b
>>> [   63.980094] RBP: ffff880117dffcd0 R08: 0000000000000000 R09:
>>> 0000000000000000
>>> [   63.980094] R10: 00006c0900006ba8 R11: 00006ba100006b9d R12:
>>> 0000000000000001
>>> [   63.980094] R13: ffff8800d02e0898 R14: ffffffff817e6d4d R15:
>>> ffff880117387a30
>>> [   63.980094] FS:  0000000000000000(0000) GS:ffff88011a800000(0000)
>>> knlGS:0000000000000000
>>> [   63.980094] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [   63.980094] CR2: 00007f07e6732fed CR3: 000000011665b000 CR4:
>>> 00000000000006e0
>>> [   63.980094] Stack:
>>> [   63.980094]  ffff88011a9cd300 ffffffff82051ac0 ffff880117dffce0
>>> ffffffff817e6d68
>>> [   63.980094]  ffff880117dffd70 ffffffff810cb4c7 ffffffff810cb3cd
>>> ffff880117dfffd8
>>> [   63.980094]  ffff880117dea690 ffff880117dea690 ffff880117dfffd8
>>> 000000000000000a
>>> [   63.980094] Call Trace:
>>> [   63.980094]  [<ffffffff817e6d68>] u32_delete_key_freepf_rcu+0x1b/0x1d
>>> [   63.980094]  [<ffffffff810cb4c7>] rcu_process_callbacks+0x3bb/0x691
>>> [   63.980094]  [<ffffffff810cb3cd>] ? rcu_process_callbacks+0x2c1/0x691
>>> [   63.980094]  [<ffffffff817e6d4d>] ? u32_destroy_key+0x6d/0x6d
>>> [   63.980094]  [<ffffffff810780a4>] __do_softirq+0x142/0x323
>>> [   63.980094]  [<ffffffff810782a8>] run_ksoftirqd+0x23/0x53
>>> [   63.980094]  [<ffffffff81092126>] smpboot_thread_fn+0x203/0x221
>>> [   63.980094]  [<ffffffff81091f23>] ? smpboot_unpark_thread+0x33/0x33
>>> [   63.980094]  [<ffffffff8108e44d>] kthread+0xc9/0xd1
>>> [   63.980094]  [<ffffffff819e00ea>] ? do_wait_for_common+0xf8/0x125
>>> [   63.980094]  [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61
>>> [   63.980094]  [<ffffffff819e43ec>] ret_from_fork+0x7c/0xb0
>>> [   63.980094]  [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61
>>>
>>> tp could be freed in call_rcu callback too, the order is not guaranteed.
>>>
>>> Cc: John Fastabend <john.r.fastabend@intel.com>
>>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>>> ---
>>
>> Thanks for catching this. What if we just drop tcf_exts_result
>> I can't see how its being used anymore. It appears to just be passed
>> around the ./net/sched files for some historic reason that is lost on
>> me. Would you mind testing a patch if I sent it out?
>>
>> Maybe Jamal can shed some light?
>>
> 
> Sorry I should say its not needed to pass to the actions,
> tcf_exts_exec(). It _is_ needed here to get the class setup
> correct. And the tcf_exts_exec() stuff is a separate patch.
> 
> Thanks again.
> 
> Acked-by: John Fastabend <john.r.fastabend@intel.com>
> 
> 
>>

Its worth noting why this is safe. Any running schedulers will either
read the valid class field or it will be zeroed.

All schedulers today when the class is 0 do a lookup using the
same call used by the tcf_exts_bind(). So even if we have a running
classifier hit the null class pointer it will do a lookup and get
to the same result. This is particularly fragile at the moment because
the only way to verify this is to audit the schedulers call sites.

I think we need a helper to ensure the code doesn't get broken in
a subtle way in the future. At very least it should be documented.
I'll try to draft a follow up patch to use a helper routine for
this and document it.

Similar patches are needed for basic, fw, route, rsvp, and tcindex.

.John

^ permalink raw reply

* netlink NETLINK_ROUTE  failure & Can the kernel really handle IPv6 properly
From: Ulf Samuelsson @ 2014-10-01 20:28 UTC (permalink / raw)
  To: Linux Netdev List

Trying to write a kernel module handing NETLINK_ROUTE messages for a 
3.4.x kernel.

After building the module, it fails to load on my PowerPC P2020 target.
Reason is that the "netlink_kernel_create" fails.

---------------------------------------
struct *sock    nls;

...
static int __init my_init (...)
{
     ...
     nls = netlink_kernel_create( &init_net, NETLINK_ROUTE, 0, my_rcv, 
NULL, THIS_MODULE);
     if (!nls) {
             return -ENOMEM;
     }
...
}
----------------------------------------
netlick_kernel_create returns 0, so the kernel module fails.

Is there anything wrong wth the call to netlink_kernel_create ?

Goggling show that some code uses "__net_init".
Not sure what that means.

Other code uses a mutex, instead of NULL.
Does NETLINK_ROUTE require a mutex?

All ideas are welcome-

==============
BTW, the problem I am trying to solve is how to connect to an I/F with 
an IPv6 link-local address.

An existing kernel module waits for a NETDEV_UP event, and then tries to 
communicate
with the link-local address.

This will fail, because (according to a colleague) the I/F enters a 
"tentative" state,
where it is trying to decide if it is unique or not.
It will remain in that state for 1-2 seconds, and only afterwards is the 
link-local address
available for normal use.

The guys writing the module, claim that the kernel is using NETDEV_UP.
There is very little code in the kernel using NETLINK_ROUTE, even in 
latest stable.
It is using NETDEV_UP.

If my colleague is right, the kernel really cannot handle IPv6 
link-local addresses properly.

-- 
Best Regards
Ulf Samuelsson

^ permalink raw reply

* Re: [PATCH] net: Add ndo_gso_check
From: Or Gerlitz @ 2014-10-01 20:58 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck, John Fastabend
  Cc: Jeff Kirsher, David Miller, Linux Netdev List, Thomas Graf,
	Pravin Shelar, Andy Zhou
In-Reply-To: <CA+mtBx-sP6g9hXAYGfHKCPYkCtjgCFM0srueh1uhoNnC2ACr1g@mail.gmail.com>

On Tue, Sep 30, 2014 at 6:34 PM, Tom Herbert <therbert@google.com> wrote:
> On Tue, Sep 30, 2014 at 7:30 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Tue, Sep 30, 2014 at 12:38 AM, Tom Herbert <therbert@google.com> wrote:
>>> On Mon, Sep 29, 2014 at 2:10 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>> On Mon, Sep 29, 2014 at 11:53 PM, Tom Herbert <therbert@google.com> wrote:
>>>>> On Mon, Sep 29, 2014 at 12:59 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>>>> On Mon, Sep 29, 2014 at 6:50 AM, Tom Herbert <therbert@google.com> wrote:

>> I'm not sure how exactly this (inner protocol being Ethernet and inner
>> header lengths)
>> is going to work to differentiate between VXLAN and NVGRE (or @ least
>> the GRE-ing done by OVS on guest Ethernet frames).

> GSO processing for VXLAN and NVGRE should be identical. They both have
> a four byte header that needs to be copied per packet and both only
> carry Ethernet frames.

I know and indeed in the mlx4 case, the HW (ConnectX3-pro NIC) that
supports VXLAN offloads also supports NVGRE, but generally speaking, I
can't commit for other vendors HW/driver which support VXLAN today.


> This would be equivalent to adding more protocol specific GSO feature
> bits. I still don't see how this will scale. The number of protocols
> that we might want to encapsulate over UDP is vast-- even before FOU
> adding possibility of encapsulating any IP protocol in UDP. And, as
> already pointed out, devices might have other arbitrary limitations
> such as length of inner headers that wouldn't easily be represented in features.


> Also, this does not benefit the stack or drivers that already support
> generic SKB_GSO_UDP_TUNNEL mechanism.

But again, we have 4-5 drivers that are

1. upstream
2. support GSO offloading under VXLAN
3. advertize GSO_UDP_TUNNEL

As a starting point to stabilize the system (3.18 which has FOU and
friends, merge window coming up while we are @ LPC) we can

1. add GSO_UDP_VXLAN_TUNNEL type

2. ask each maintainer to decide if they want to advertize the generic tunnel
or only the vxlan specific one

3. color udp tunneled skb's with the tunnel type being vxlan or something else

have today's code that decides if to do SW GSO or HW GSO choose according
to the skb color and the driver posted value.

> Would any other driver maintainers like to chime in on this?

Alex? John?

^ permalink raw reply

* Re: [PATCH] net: Add ndo_gso_check
From: Jesse Gross @ 2014-10-01 21:12 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Tom Herbert, Alexander Duyck, John Fastabend, Jeff Kirsher,
	David Miller, Linux Netdev List, Thomas Graf, Pravin Shelar,
	Andy Zhou
In-Reply-To: <CAJ3xEMhc-97+3ZDwqXzX2N+=voO4QvJ5_YXn3sp8r8E9YyqotQ@mail.gmail.com>

On Wed, Oct 1, 2014 at 1:58 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> Also, this does not benefit the stack or drivers that already support
>> generic SKB_GSO_UDP_TUNNEL mechanism.
>
> But again, we have 4-5 drivers that are
>
> 1. upstream
> 2. support GSO offloading under VXLAN
> 3. advertize GSO_UDP_TUNNEL
>
> As a starting point to stabilize the system (3.18 which has FOU and
> friends, merge window coming up while we are @ LPC) we can
>
> 1. add GSO_UDP_VXLAN_TUNNEL type
>
> 2. ask each maintainer to decide if they want to advertize the generic tunnel
> or only the vxlan specific one
>
> 3. color udp tunneled skb's with the tunnel type being vxlan or something else

There is no point in doing this. We should create a generic mechanism
that allows for drivers to assert some restrictions. For existing
drivers we can just provide an implementation of restrictions that is
pretty conservative and ask driver writers to take a look and loosen
it if possible. This is both cleaner and just as safe.

^ permalink raw reply

* Re: Why include/net/tcp_states.h is not exported to user-space?
From: Cong Wang @ 2014-10-01 21:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: David Miller, netdev
In-Reply-To: <20141001172500.GH2799@kernel.org>

On Wed, Oct 1, 2014 at 10:25 AM, Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
> Em Tue, Sep 30, 2014 at 11:12:04AM -0700, Cong Wang escreveu:
>> These states are needed for user-space to specify ->idiag_states.
>> Currently both libnl and ss make their own copy of these definitions.
>> Of course, new sock diag is not just limited to TCP, so we probably
>> need to rename them to SS_* before exporting them.
>>
>> I can send a patch if you agree.
>
> One would have to check what are the current best practices for doing
> that, i.e. probably this would have to go via
> include/uapi//linux/tcp_states.h? Or take the time and create a new
> header for the inet_diag.h stuff as this is not TCP specific for quite a
> while?

Exactly. The problem is, even in kernel, other protocols already use
these TCP_* states. Maybe we should really define SS_* first, and then
let each protocol define their own based on SS_*, including inet_diag?

^ permalink raw reply

* Re: netlink NETLINK_ROUTE  failure & Can the kernel really handle IPv6 properly
From: Hannes Frederic Sowa @ 2014-10-01 21:30 UTC (permalink / raw)
  To: Ulf Samuelsson, Linux Netdev List
In-Reply-To: <542C63FE.9080807@emagii.com>

Hello,

On Wed, Oct 1, 2014, at 22:28, Ulf Samuelsson wrote:
> BTW, the problem I am trying to solve is how to connect to an I/F with 
> an IPv6 link-local address.
> 
> An existing kernel module waits for a NETDEV_UP event, and then tries to 
> communicate
> with the link-local address.
> 
> This will fail, because (according to a colleague) the I/F enters a 
> "tentative" state,
> where it is trying to decide if it is unique or not.
> It will remain in that state for 1-2 seconds, and only afterwards is the 
> link-local address
> available for normal use.
> 
> The guys writing the module, claim that the kernel is using NETDEV_UP.
> There is very little code in the kernel using NETLINK_ROUTE, even in 
> latest stable.
> It is using NETDEV_UP.
> 
> If my colleague is right, the kernel really cannot handle IPv6 
> link-local addresses properly.

Sorry, I cannot really follow you, can you send example code or be a bit
more precise?

Thanks,
Hannes

^ permalink raw reply

* [PATCH net-next] net: phy: adjust fixed_phy_register() return value
From: Petri Gynther @ 2014-10-01 21:45 UTC (permalink / raw)
  To: netdev; +Cc: davem, f.fainelli, thomas.petazzoni

Adjust fixed_phy_register() to return struct phy_device *, so that
it becomes easy to use fixed PHYs without device tree support:

  phydev = fixed_phy_register(PHY_POLL, &fixed_phy_status, NULL);
  fixed_phy_set_link_update(phydev, fixed_phy_link_update);
  phy_connect_direct(netdev, phydev, handler_fn, phy_interface);

This change is a prerequisite for modifying bcmgenet driver to work
without a device tree on Broadcom's MIPS-based 7xxx platforms.

Signed-off-by: Petri Gynther <pgynther@google.com>
---
 drivers/net/phy/fixed.c   | 16 ++++++++--------
 drivers/of/of_mdio.c      |  7 +++++--
 include/linux/phy_fixed.h | 14 +++++++-------
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/net/phy/fixed.c b/drivers/net/phy/fixed.c
index 5b19fbb..47872caa 100644
--- a/drivers/net/phy/fixed.c
+++ b/drivers/net/phy/fixed.c
@@ -233,9 +233,9 @@ EXPORT_SYMBOL_GPL(fixed_phy_del);
 static int phy_fixed_addr;
 static DEFINE_SPINLOCK(phy_fixed_addr_lock);
 
-int fixed_phy_register(unsigned int irq,
-		       struct fixed_phy_status *status,
-		       struct device_node *np)
+struct phy_device *fixed_phy_register(unsigned int irq,
+				      struct fixed_phy_status *status,
+				      struct device_node *np)
 {
 	struct fixed_mdio_bus *fmb = &platform_fmb;
 	struct phy_device *phy;
@@ -246,19 +246,19 @@ int fixed_phy_register(unsigned int irq,
 	spin_lock(&phy_fixed_addr_lock);
 	if (phy_fixed_addr == PHY_MAX_ADDR) {
 		spin_unlock(&phy_fixed_addr_lock);
-		return -ENOSPC;
+		return ERR_PTR(-ENOSPC);
 	}
 	phy_addr = phy_fixed_addr++;
 	spin_unlock(&phy_fixed_addr_lock);
 
 	ret = fixed_phy_add(PHY_POLL, phy_addr, status);
 	if (ret < 0)
-		return ret;
+		return ERR_PTR(ret);
 
 	phy = get_phy_device(fmb->mii_bus, phy_addr, false);
 	if (!phy || IS_ERR(phy)) {
 		fixed_phy_del(phy_addr);
-		return -EINVAL;
+		return ERR_PTR(-EINVAL);
 	}
 
 	of_node_get(np);
@@ -269,10 +269,10 @@ int fixed_phy_register(unsigned int irq,
 		phy_device_free(phy);
 		of_node_put(np);
 		fixed_phy_del(phy_addr);
-		return ret;
+		return ERR_PTR(ret);
 	}
 
-	return 0;
+	return phy;
 }
 
 static int __init fixed_mdio_bus_init(void)
diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
index a85d800..907baa9 100644
--- a/drivers/of/of_mdio.c
+++ b/drivers/of/of_mdio.c
@@ -286,6 +286,7 @@ int of_phy_register_fixed_link(struct device_node *np)
 	struct device_node *fixed_link_node;
 	const __be32 *fixed_link_prop;
 	int len;
+	struct phy_device *phy;
 
 	/* New binding */
 	fixed_link_node = of_get_child_by_name(np, "fixed-link");
@@ -299,7 +300,8 @@ int of_phy_register_fixed_link(struct device_node *np)
 		status.asym_pause = of_property_read_bool(fixed_link_node,
 							  "asym-pause");
 		of_node_put(fixed_link_node);
-		return fixed_phy_register(PHY_POLL, &status, np);
+		phy = fixed_phy_register(PHY_POLL, &status, np);
+		return (!phy || IS_ERR(phy));
 	}
 
 	/* Old binding */
@@ -310,7 +312,8 @@ int of_phy_register_fixed_link(struct device_node *np)
 		status.speed = be32_to_cpu(fixed_link_prop[2]);
 		status.pause = be32_to_cpu(fixed_link_prop[3]);
 		status.asym_pause = be32_to_cpu(fixed_link_prop[4]);
-		return fixed_phy_register(PHY_POLL, &status, np);
+		phy = fixed_phy_register(PHY_POLL, &status, np);
+		return (!phy || IS_ERR(phy));
 	}
 
 	return -ENODEV;
diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
index 9411386..f2ca1b4 100644
--- a/include/linux/phy_fixed.h
+++ b/include/linux/phy_fixed.h
@@ -14,9 +14,9 @@ struct device_node;
 #ifdef CONFIG_FIXED_PHY
 extern int fixed_phy_add(unsigned int irq, int phy_id,
 			 struct fixed_phy_status *status);
-extern int fixed_phy_register(unsigned int irq,
-			      struct fixed_phy_status *status,
-			      struct device_node *np);
+extern struct phy_device *fixed_phy_register(unsigned int irq,
+					     struct fixed_phy_status *status,
+					     struct device_node *np);
 extern void fixed_phy_del(int phy_addr);
 extern int fixed_phy_set_link_update(struct phy_device *phydev,
 			int (*link_update)(struct net_device *,
@@ -27,11 +27,11 @@ static inline int fixed_phy_add(unsigned int irq, int phy_id,
 {
 	return -ENODEV;
 }
-static inline int fixed_phy_register(unsigned int irq,
-				     struct fixed_phy_status *status,
-				     struct device_node *np)
+static inline struct phy_device *fixed_phy_register(unsigned int irq,
+						struct fixed_phy_status *status,
+						struct device_node *np)
 {
-	return -ENODEV;
+	return ERR_PTR(-ENODEV);
 }
 static inline int fixed_phy_del(int phy_addr)
 {
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related

* Small fixes/changes for RDS
From: Herton R. Krzesinski @ 2014-10-01 21:49 UTC (permalink / raw)
  To: Chien Yen
  Cc: rds-devel, netdev, linux-kernel, David S. Miller, rmanes,
	dledford

Hi,

I got a report of one issue within RDS (after investigation it was a double
free), and I'm sending the fix (patch 3/3) which reporter said it works (no more
WARNING triggered on a specially instrumented kernel). The report/test was done
on a very old kernel (RHEL 5, 2.6.18 based with backports), but the problem the
patch handles still exists and should not change. Besides that, while
reviewing some of the code but being unable to reproduce with rds_tcp, I
noticed two small improvements/fixes which are in patches 1 and 2.

thanks,
--
[]'s
Herton

^ permalink raw reply

* [PATCH 1/3] net/rds: call rds_conn_drop instead of open code it at rds_connect_complete
From: Herton R. Krzesinski @ 2014-10-01 21:49 UTC (permalink / raw)
  To: Chien Yen
  Cc: rds-devel, netdev, linux-kernel, David S. Miller, rmanes,
	dledford
In-Reply-To: <1412200194-28902-1-git-send-email-herton@redhat.com>

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
---
 net/rds/threads.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/rds/threads.c b/net/rds/threads.c
index 65eaefc..dc2402e 100644
--- a/net/rds/threads.c
+++ b/net/rds/threads.c
@@ -78,8 +78,7 @@ void rds_connect_complete(struct rds_connection *conn)
 				"current state is %d\n",
 				__func__,
 				atomic_read(&conn->c_state));
-		atomic_set(&conn->c_state, RDS_CONN_ERROR);
-		queue_work(rds_wq, &conn->c_down_w);
+		rds_conn_drop(conn);
 		return;
 	}
 
-- 
1.9.3

^ permalink raw reply related

* [PATCH 2/3] net/rds: do proper house keeping if connection fails in rds_tcp_conn_connect
From: Herton R. Krzesinski @ 2014-10-01 21:49 UTC (permalink / raw)
  To: Chien Yen
  Cc: rds-devel, netdev, linux-kernel, David S. Miller, rmanes,
	dledford
In-Reply-To: <1412200194-28902-1-git-send-email-herton@redhat.com>

I see two problems if we consider the sock->ops->connect attempt to fail in
rds_tcp_conn_connect. The first issue is that for example we don't remove the
previously added rds_tcp_connection item to rds_tcp_tc_list at
rds_tcp_set_callbacks, which means that on a next reconnect attempt for the
same rds_connection, when rds_tcp_conn_connect is called we can again call
rds_tcp_set_callbacks, resulting in duplicated items on rds_tcp_tc_list,
leading to list corruption: to avoid this just make sure we call
properly rds_tcp_restore_callbacks before we exit. The second issue
is that we should also release the sock properly, by setting sock = NULL
only if we are returning without error.

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
---
 net/rds/tcp_connect.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
index a65ee78..f9f564a 100644
--- a/net/rds/tcp_connect.c
+++ b/net/rds/tcp_connect.c
@@ -106,11 +106,14 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
 	rds_tcp_set_callbacks(sock, conn);
 	ret = sock->ops->connect(sock, (struct sockaddr *)&dest, sizeof(dest),
 				 O_NONBLOCK);
-	sock = NULL;

 	rdsdebug("connect to address %pI4 returned %d\n", &conn->c_faddr, ret);
 	if (ret == -EINPROGRESS)
 		ret = 0;
+	if (ret == 0)
+		sock = NULL;
+	else
+		rds_tcp_restore_callbacks(sock, conn->c_transport_data);

 out:
 	if (sock)
-- 
1.9.3

^ permalink raw reply related

* [PATCH 3/3] net/rds: fix possible double free on sock tear down
From: Herton R. Krzesinski @ 2014-10-01 21:49 UTC (permalink / raw)
  To: Chien Yen
  Cc: rds-devel, netdev, linux-kernel, David S. Miller, rmanes,
	dledford
In-Reply-To: <1412200194-28902-1-git-send-email-herton@redhat.com>

I got a report of a double free happening at RDS slab cache. One
suspicion was that may be somewhere we were doing a sock_hold/sock_put
on an already freed sock. Thus after providing a kernel with the
following change:

 static inline void sock_hold(struct sock *sk)
 {
-       atomic_inc(&sk->sk_refcnt);
+       if (!atomic_inc_not_zero(&sk->sk_refcnt))
+               WARN(1, "Trying to hold sock already gone: %p (family: %hd)\n",
+                       sk, sk->sk_family);
 }

The warning successfuly triggered:

Trying to hold sock already gone: ffff81f6dda61280 (family: 21)
WARNING: at include/net/sock.h:350 sock_hold()
Call Trace:
<IRQ>  [<ffffffff8adac135>] :rds:rds_send_remove_from_sock+0xf0/0x21b
[<ffffffff8adad35c>] :rds:rds_send_drop_acked+0xbf/0xcf
[<ffffffff8addf546>] :rds_rdma:rds_ib_recv_tasklet_fn+0x256/0x2dc
[<ffffffff8009899a>] tasklet_action+0x8f/0x12b
[<ffffffff800125a2>] __do_softirq+0x89/0x133
[<ffffffff8005f30c>] call_softirq+0x1c/0x28
[<ffffffff8006e644>] do_softirq+0x2c/0x7d
[<ffffffff8006e4d4>] do_IRQ+0xee/0xf7
[<ffffffff8005e625>] ret_from_intr+0x0/0xa
<EOI>

Looking at the call chain above, the only way I think this would be
possible is if somewhere we already released the same socket->sock which
is assigned to the rds_message at rds_send_remove_from_sock. Which seems
only possible to happen after the tear down done on rds_release.

rds_release properly calls rds_send_drop_to to drop the socket from any
rds_message, and some proper synchronization is in place to avoid race
with rds_send_drop_acked/rds_send_remove_from_sock. However, I still see
a very narrow window where it may be possible we touch a sock already
released: when rds_release races with rds_send_drop_acked, we check
RDS_MSG_ON_CONN to avoid cleanup on the same rds_message, but in this
specific case we don't clear rm->m_rs. In this case, it seems we could
then go on at rds_send_drop_to and after it returns, the sock is freed
by last sock_put on rds_release, with concurrently we being at
rds_send_remove_from_sock; then at some point in the loop at
rds_send_remove_from_sock we process an rds_message which didn't have
rm->m_rs unset for a freed sock, and a possible sock_hold on an sock
already gone at rds_release happens.

This hopefully address the described condition above and avoids a double
free on "second last" sock_put. In addition, I removed the comment about
socket destruction on top of rds_send_drop_acked: we call rds_send_drop_to
in rds_release and we should have things properly serialized there, thus
I can't see the comment being accurate there.

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
---
 net/rds/send.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/rds/send.c b/net/rds/send.c
index 2371816..0a64541 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -593,8 +593,11 @@ static void rds_send_remove_from_sock(struct list_head *messages, int status)
 				sock_put(rds_rs_to_sk(rs));
 			}
 			rs = rm->m_rs;
-			sock_hold(rds_rs_to_sk(rs));
+			if (rs)
+				sock_hold(rds_rs_to_sk(rs));
 		}
+		if (!rs)
+			goto unlock_and_drop;
 		spin_lock(&rs->rs_lock);

 		if (test_and_clear_bit(RDS_MSG_ON_SOCK, &rm->m_flags)) {
@@ -638,9 +641,6 @@ unlock_and_drop:
  * queue. This means that in the TCP case, the message may not have been
  * assigned the m_ack_seq yet - but that's fine as long as tcp_is_acked
  * checks the RDS_MSG_HAS_ACK_SEQ bit.
- *
- * XXX It's not clear to me how this is safely serialized with socket
- * destruction.  Maybe it should bail if it sees SOCK_DEAD.
  */
 void rds_send_drop_acked(struct rds_connection *conn, u64 ack,
 			 is_acked_func is_acked)
@@ -711,6 +711,9 @@ void rds_send_drop_to(struct rds_sock *rs, struct sockaddr_in *dest)
 		 */
 		if (!test_and_clear_bit(RDS_MSG_ON_CONN, &rm->m_flags)) {
 			spin_unlock_irqrestore(&conn->c_lock, flags);
+			spin_lock_irqsave(&rm->m_rs_lock, flags);
+			rm->m_rs = NULL;
+			spin_unlock_irqrestore(&rm->m_rs_lock, flags);
 			continue;
 		}
 		list_del_init(&rm->m_conn_item);
-- 
1.9.3

^ permalink raw reply related

* Re: [PATCH net-next] net: phy: adjust fixed_phy_register() return value
From: Florian Fainelli @ 2014-10-01 21:56 UTC (permalink / raw)
  To: Petri Gynther, netdev; +Cc: davem, thomas.petazzoni
In-Reply-To: <20141001214509.2BF4F10070D@puck.mtv.corp.google.com>

On 10/01/2014 02:45 PM, Petri Gynther wrote:
> Adjust fixed_phy_register() to return struct phy_device *, so that
> it becomes easy to use fixed PHYs without device tree support:
> 
>   phydev = fixed_phy_register(PHY_POLL, &fixed_phy_status, NULL);
>   fixed_phy_set_link_update(phydev, fixed_phy_link_update);
>   phy_connect_direct(netdev, phydev, handler_fn, phy_interface);
> 
> This change is a prerequisite for modifying bcmgenet driver to work
> without a device tree on Broadcom's MIPS-based 7xxx platforms.

Without a Device Tree enabled platform, you still have the
fixed_phy_add() API to register a fixed PHY at a known location (the
second parameter), did we someone broke that with Thomas' patches?

You could for instance register a fixed PHY at address 1 for the
GENET/MoCA instance to find it.

Other than that, I am not opposed to the change, since it allows us to
eventually get rid of fixed_phy_add() in the future.

Thanks!

> 
> Signed-off-by: Petri Gynther <pgynther@google.com>
> ---
>  drivers/net/phy/fixed.c   | 16 ++++++++--------
>  drivers/of/of_mdio.c      |  7 +++++--
>  include/linux/phy_fixed.h | 14 +++++++-------
>  3 files changed, 20 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/net/phy/fixed.c b/drivers/net/phy/fixed.c
> index 5b19fbb..47872caa 100644
> --- a/drivers/net/phy/fixed.c
> +++ b/drivers/net/phy/fixed.c
> @@ -233,9 +233,9 @@ EXPORT_SYMBOL_GPL(fixed_phy_del);
>  static int phy_fixed_addr;
>  static DEFINE_SPINLOCK(phy_fixed_addr_lock);
>  
> -int fixed_phy_register(unsigned int irq,
> -		       struct fixed_phy_status *status,
> -		       struct device_node *np)
> +struct phy_device *fixed_phy_register(unsigned int irq,
> +				      struct fixed_phy_status *status,
> +				      struct device_node *np)
>  {
>  	struct fixed_mdio_bus *fmb = &platform_fmb;
>  	struct phy_device *phy;
> @@ -246,19 +246,19 @@ int fixed_phy_register(unsigned int irq,
>  	spin_lock(&phy_fixed_addr_lock);
>  	if (phy_fixed_addr == PHY_MAX_ADDR) {
>  		spin_unlock(&phy_fixed_addr_lock);
> -		return -ENOSPC;
> +		return ERR_PTR(-ENOSPC);
>  	}
>  	phy_addr = phy_fixed_addr++;
>  	spin_unlock(&phy_fixed_addr_lock);
>  
>  	ret = fixed_phy_add(PHY_POLL, phy_addr, status);
>  	if (ret < 0)
> -		return ret;
> +		return ERR_PTR(ret);
>  
>  	phy = get_phy_device(fmb->mii_bus, phy_addr, false);
>  	if (!phy || IS_ERR(phy)) {
>  		fixed_phy_del(phy_addr);
> -		return -EINVAL;
> +		return ERR_PTR(-EINVAL);
>  	}
>  
>  	of_node_get(np);
> @@ -269,10 +269,10 @@ int fixed_phy_register(unsigned int irq,
>  		phy_device_free(phy);
>  		of_node_put(np);
>  		fixed_phy_del(phy_addr);
> -		return ret;
> +		return ERR_PTR(ret);
>  	}
>  
> -	return 0;
> +	return phy;
>  }
>  
>  static int __init fixed_mdio_bus_init(void)
> diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
> index a85d800..907baa9 100644
> --- a/drivers/of/of_mdio.c
> +++ b/drivers/of/of_mdio.c
> @@ -286,6 +286,7 @@ int of_phy_register_fixed_link(struct device_node *np)
>  	struct device_node *fixed_link_node;
>  	const __be32 *fixed_link_prop;
>  	int len;
> +	struct phy_device *phy;
>  
>  	/* New binding */
>  	fixed_link_node = of_get_child_by_name(np, "fixed-link");
> @@ -299,7 +300,8 @@ int of_phy_register_fixed_link(struct device_node *np)
>  		status.asym_pause = of_property_read_bool(fixed_link_node,
>  							  "asym-pause");
>  		of_node_put(fixed_link_node);
> -		return fixed_phy_register(PHY_POLL, &status, np);
> +		phy = fixed_phy_register(PHY_POLL, &status, np);
> +		return (!phy || IS_ERR(phy));
>  	}
>  
>  	/* Old binding */
> @@ -310,7 +312,8 @@ int of_phy_register_fixed_link(struct device_node *np)
>  		status.speed = be32_to_cpu(fixed_link_prop[2]);
>  		status.pause = be32_to_cpu(fixed_link_prop[3]);
>  		status.asym_pause = be32_to_cpu(fixed_link_prop[4]);
> -		return fixed_phy_register(PHY_POLL, &status, np);
> +		phy = fixed_phy_register(PHY_POLL, &status, np);
> +		return (!phy || IS_ERR(phy));
>  	}
>  
>  	return -ENODEV;
> diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
> index 9411386..f2ca1b4 100644
> --- a/include/linux/phy_fixed.h
> +++ b/include/linux/phy_fixed.h
> @@ -14,9 +14,9 @@ struct device_node;
>  #ifdef CONFIG_FIXED_PHY
>  extern int fixed_phy_add(unsigned int irq, int phy_id,
>  			 struct fixed_phy_status *status);
> -extern int fixed_phy_register(unsigned int irq,
> -			      struct fixed_phy_status *status,
> -			      struct device_node *np);
> +extern struct phy_device *fixed_phy_register(unsigned int irq,
> +					     struct fixed_phy_status *status,
> +					     struct device_node *np);
>  extern void fixed_phy_del(int phy_addr);
>  extern int fixed_phy_set_link_update(struct phy_device *phydev,
>  			int (*link_update)(struct net_device *,
> @@ -27,11 +27,11 @@ static inline int fixed_phy_add(unsigned int irq, int phy_id,
>  {
>  	return -ENODEV;
>  }
> -static inline int fixed_phy_register(unsigned int irq,
> -				     struct fixed_phy_status *status,
> -				     struct device_node *np)
> +static inline struct phy_device *fixed_phy_register(unsigned int irq,
> +						struct fixed_phy_status *status,
> +						struct device_node *np)
>  {
> -	return -ENODEV;
> +	return ERR_PTR(-ENODEV);
>  }
>  static inline int fixed_phy_del(int phy_addr)
>  {
> 

^ permalink raw reply

* Re: [PATCH net-next] net: phy: adjust fixed_phy_register() return value
From: Petri Gynther @ 2014-10-01 22:16 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev, David Miller, thomas.petazzoni
In-Reply-To: <542C7875.8090102@gmail.com>

Hi Florian,

On Wed, Oct 1, 2014 at 2:56 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 10/01/2014 02:45 PM, Petri Gynther wrote:
>> Adjust fixed_phy_register() to return struct phy_device *, so that
>> it becomes easy to use fixed PHYs without device tree support:
>>
>>   phydev = fixed_phy_register(PHY_POLL, &fixed_phy_status, NULL);
>>   fixed_phy_set_link_update(phydev, fixed_phy_link_update);
>>   phy_connect_direct(netdev, phydev, handler_fn, phy_interface);
>>
>> This change is a prerequisite for modifying bcmgenet driver to work
>> without a device tree on Broadcom's MIPS-based 7xxx platforms.
>
> Without a Device Tree enabled platform, you still have the
> fixed_phy_add() API to register a fixed PHY at a known location (the
> second parameter), did we someone broke that with Thomas' patches?
>

If a driver uses fixed_phy_add() directly, it then has to call
get_phy_device() and phy_device_register() as well. But, all that is
already handled nicely inside fixed_phy_register(). And,
fixed_phy_register() also handles the fixed PHY address allocation
automatically.

In my opinion, this change makes it really easy to use fixed PHYs from
non-OF code, as I showed in the change description.

-- Petri

> You could for instance register a fixed PHY at address 1 for the
> GENET/MoCA instance to find it.
>
> Other than that, I am not opposed to the change, since it allows us to
> eventually get rid of fixed_phy_add() in the future.
>
> Thanks!
>
>>
>> Signed-off-by: Petri Gynther <pgynther@google.com>
>> ---
>>  drivers/net/phy/fixed.c   | 16 ++++++++--------
>>  drivers/of/of_mdio.c      |  7 +++++--
>>  include/linux/phy_fixed.h | 14 +++++++-------
>>  3 files changed, 20 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/net/phy/fixed.c b/drivers/net/phy/fixed.c
>> index 5b19fbb..47872caa 100644
>> --- a/drivers/net/phy/fixed.c
>> +++ b/drivers/net/phy/fixed.c
>> @@ -233,9 +233,9 @@ EXPORT_SYMBOL_GPL(fixed_phy_del);
>>  static int phy_fixed_addr;
>>  static DEFINE_SPINLOCK(phy_fixed_addr_lock);
>>
>> -int fixed_phy_register(unsigned int irq,
>> -                    struct fixed_phy_status *status,
>> -                    struct device_node *np)
>> +struct phy_device *fixed_phy_register(unsigned int irq,
>> +                                   struct fixed_phy_status *status,
>> +                                   struct device_node *np)
>>  {
>>       struct fixed_mdio_bus *fmb = &platform_fmb;
>>       struct phy_device *phy;
>> @@ -246,19 +246,19 @@ int fixed_phy_register(unsigned int irq,
>>       spin_lock(&phy_fixed_addr_lock);
>>       if (phy_fixed_addr == PHY_MAX_ADDR) {
>>               spin_unlock(&phy_fixed_addr_lock);
>> -             return -ENOSPC;
>> +             return ERR_PTR(-ENOSPC);
>>       }
>>       phy_addr = phy_fixed_addr++;
>>       spin_unlock(&phy_fixed_addr_lock);
>>
>>       ret = fixed_phy_add(PHY_POLL, phy_addr, status);
>>       if (ret < 0)
>> -             return ret;
>> +             return ERR_PTR(ret);
>>
>>       phy = get_phy_device(fmb->mii_bus, phy_addr, false);
>>       if (!phy || IS_ERR(phy)) {
>>               fixed_phy_del(phy_addr);
>> -             return -EINVAL;
>> +             return ERR_PTR(-EINVAL);
>>       }
>>
>>       of_node_get(np);
>> @@ -269,10 +269,10 @@ int fixed_phy_register(unsigned int irq,
>>               phy_device_free(phy);
>>               of_node_put(np);
>>               fixed_phy_del(phy_addr);
>> -             return ret;
>> +             return ERR_PTR(ret);
>>       }
>>
>> -     return 0;
>> +     return phy;
>>  }
>>
>>  static int __init fixed_mdio_bus_init(void)
>> diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
>> index a85d800..907baa9 100644
>> --- a/drivers/of/of_mdio.c
>> +++ b/drivers/of/of_mdio.c
>> @@ -286,6 +286,7 @@ int of_phy_register_fixed_link(struct device_node *np)
>>       struct device_node *fixed_link_node;
>>       const __be32 *fixed_link_prop;
>>       int len;
>> +     struct phy_device *phy;
>>
>>       /* New binding */
>>       fixed_link_node = of_get_child_by_name(np, "fixed-link");
>> @@ -299,7 +300,8 @@ int of_phy_register_fixed_link(struct device_node *np)
>>               status.asym_pause = of_property_read_bool(fixed_link_node,
>>                                                         "asym-pause");
>>               of_node_put(fixed_link_node);
>> -             return fixed_phy_register(PHY_POLL, &status, np);
>> +             phy = fixed_phy_register(PHY_POLL, &status, np);
>> +             return (!phy || IS_ERR(phy));
>>       }
>>
>>       /* Old binding */
>> @@ -310,7 +312,8 @@ int of_phy_register_fixed_link(struct device_node *np)
>>               status.speed = be32_to_cpu(fixed_link_prop[2]);
>>               status.pause = be32_to_cpu(fixed_link_prop[3]);
>>               status.asym_pause = be32_to_cpu(fixed_link_prop[4]);
>> -             return fixed_phy_register(PHY_POLL, &status, np);
>> +             phy = fixed_phy_register(PHY_POLL, &status, np);
>> +             return (!phy || IS_ERR(phy));
>>       }
>>
>>       return -ENODEV;
>> diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
>> index 9411386..f2ca1b4 100644
>> --- a/include/linux/phy_fixed.h
>> +++ b/include/linux/phy_fixed.h
>> @@ -14,9 +14,9 @@ struct device_node;
>>  #ifdef CONFIG_FIXED_PHY
>>  extern int fixed_phy_add(unsigned int irq, int phy_id,
>>                        struct fixed_phy_status *status);
>> -extern int fixed_phy_register(unsigned int irq,
>> -                           struct fixed_phy_status *status,
>> -                           struct device_node *np);
>> +extern struct phy_device *fixed_phy_register(unsigned int irq,
>> +                                          struct fixed_phy_status *status,
>> +                                          struct device_node *np);
>>  extern void fixed_phy_del(int phy_addr);
>>  extern int fixed_phy_set_link_update(struct phy_device *phydev,
>>                       int (*link_update)(struct net_device *,
>> @@ -27,11 +27,11 @@ static inline int fixed_phy_add(unsigned int irq, int phy_id,
>>  {
>>       return -ENODEV;
>>  }
>> -static inline int fixed_phy_register(unsigned int irq,
>> -                                  struct fixed_phy_status *status,
>> -                                  struct device_node *np)
>> +static inline struct phy_device *fixed_phy_register(unsigned int irq,
>> +                                             struct fixed_phy_status *status,
>> +                                             struct device_node *np)
>>  {
>> -     return -ENODEV;
>> +     return ERR_PTR(-ENODEV);
>>  }
>>  static inline int fixed_phy_del(int phy_addr)
>>  {
>>
>

^ permalink raw reply

* [PATCH net-next] net: avoid one atomic operation in skb_clone()
From: Eric Dumazet @ 2014-10-01 22:27 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

Fast clone cloning can actually avoid an atomic_inc(), if we
guarantee prior clone_ref value is 1.

This requires a change kfree_skbmem(), to perform the 
atomic_dec_and_test() on clone_ref before setting fclone to
SKB_FCLONE_UNAVAILABLE.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/skbuff.c |   23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a8cebb40699c..f77e64873caf 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -541,13 +541,20 @@ static void kfree_skbmem(struct sk_buff *skb)
 	case SKB_FCLONE_CLONE:
 		fclones = container_of(skb, struct sk_buff_fclones, skb2);
 
-		/* The clone portion is available for
-		 * fast-cloning again.
+		/* Warning : We must perform the atomic_dec_and_test() before
+		 * setting skb->fclone back to SKB_FCLONE_UNAVAILABLE, otherwise
+		 * skb_clone() could set clone_ref to 2 before our decrement.
+		 * Anyway, if we are going to free the structure, no need to
+		 * rewrite skb->fclone.
 		 */
-		skb->fclone = SKB_FCLONE_UNAVAILABLE;
-
-		if (atomic_dec_and_test(&fclones->fclone_ref))
+		if (atomic_dec_and_test(&fclones->fclone_ref)) {
 			kmem_cache_free(skbuff_fclone_cache, fclones);
+		} else {
+			/* The clone portion is available for
+			 * fast-cloning again.
+			 */
+			skb->fclone = SKB_FCLONE_UNAVAILABLE;
+		}
 		break;
 	}
 }
@@ -869,7 +876,11 @@ struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t gfp_mask)
 	if (skb->fclone == SKB_FCLONE_ORIG &&
 	    n->fclone == SKB_FCLONE_UNAVAILABLE) {
 		n->fclone = SKB_FCLONE_CLONE;
-		atomic_inc(&fclones->fclone_ref);
+		/* As our fastclone was free, clone_ref must be 1 at this point.
+		 * We could use atomic_inc() here, but it is faster
+		 * to set the final value.
+		 */
+		atomic_set(&fclones->fclone_ref, 2);
 	} else {
 		if (skb_pfmemalloc(skb))
 			gfp_mask |= __GFP_MEMALLOC;

^ permalink raw reply related

* Re: [PATCH 1/1 linux-next] net/dccp/proto.c: add __init to dccp_mib_init
From: David Miller @ 2014-10-01 22:34 UTC (permalink / raw)
  To: fabf; +Cc: linux-kernel, gerrit, dccp, netdev
In-Reply-To: <1412138883-2476-1-git-send-email-fabf@skynet.be>

From: Fabian Frederick <fabf@skynet.be>
Date: Wed,  1 Oct 2014 06:48:03 +0200

> dccp_mib_init is only called by __init dccp_init in same module.
> 
> Signed-off-by: Fabian Frederick <fabf@skynet.be>

Applied.

^ permalink raw reply

* Re: [PATCH 1/1 linux-next] net/dccp/ccid.c: add __init to ccid_activate
From: David Miller @ 2014-10-01 22:34 UTC (permalink / raw)
  To: fabf; +Cc: linux-kernel, gerrit, dccp, netdev
In-Reply-To: <1412139126-3794-1-git-send-email-fabf@skynet.be>

From: Fabian Frederick <fabf@skynet.be>
Date: Wed,  1 Oct 2014 06:52:06 +0200

> ccid_activate is only called by __init ccid_initialize_builtins in same module.
> 
> Signed-off-by: Fabian Frederick <fabf@skynet.be>

Applied.

^ permalink raw reply

* Re: [PATCH] net: Add ndo_gso_check
From: Tom Herbert @ 2014-10-01 23:06 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Alexander Duyck, John Fastabend, Jeff Kirsher, David Miller,
	Linux Netdev List, Thomas Graf, Pravin Shelar, Andy Zhou
In-Reply-To: <CAJ3xEMhc-97+3ZDwqXzX2N+=voO4QvJ5_YXn3sp8r8E9YyqotQ@mail.gmail.com>

On Wed, Oct 1, 2014 at 1:58 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Tue, Sep 30, 2014 at 6:34 PM, Tom Herbert <therbert@google.com> wrote:
>> On Tue, Sep 30, 2014 at 7:30 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> On Tue, Sep 30, 2014 at 12:38 AM, Tom Herbert <therbert@google.com> wrote:
>>>> On Mon, Sep 29, 2014 at 2:10 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>>> On Mon, Sep 29, 2014 at 11:53 PM, Tom Herbert <therbert@google.com> wrote:
>>>>>> On Mon, Sep 29, 2014 at 12:59 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>>>>> On Mon, Sep 29, 2014 at 6:50 AM, Tom Herbert <therbert@google.com> wrote:
>
>>> I'm not sure how exactly this (inner protocol being Ethernet and inner
>>> header lengths)
>>> is going to work to differentiate between VXLAN and NVGRE (or @ least
>>> the GRE-ing done by OVS on guest Ethernet frames).
>
>> GSO processing for VXLAN and NVGRE should be identical. They both have
>> a four byte header that needs to be copied per packet and both only
>> carry Ethernet frames.
>
> I know and indeed in the mlx4 case, the HW (ConnectX3-pro NIC) that
> supports VXLAN offloads also supports NVGRE, but generally speaking, I
> can't commit for other vendors HW/driver which support VXLAN today.
>
>
>> This would be equivalent to adding more protocol specific GSO feature
>> bits. I still don't see how this will scale. The number of protocols
>> that we might want to encapsulate over UDP is vast-- even before FOU
>> adding possibility of encapsulating any IP protocol in UDP. And, as
>> already pointed out, devices might have other arbitrary limitations
>> such as length of inner headers that wouldn't easily be represented in features.
>
>
>> Also, this does not benefit the stack or drivers that already support
>> generic SKB_GSO_UDP_TUNNEL mechanism.
>
> But again, we have 4-5 drivers that are
>
> 1. upstream
> 2. support GSO offloading under VXLAN
> 3. advertize GSO_UDP_TUNNEL
>
> As a starting point to stabilize the system (3.18 which has FOU and
> friends, merge window coming up while we are @ LPC) we can
>
> 1. add GSO_UDP_VXLAN_TUNNEL type
>
> 2. ask each maintainer to decide if they want to advertize the generic tunnel
> or only the vxlan specific one
>
> 3. color udp tunneled skb's with the tunnel type being vxlan or something else
>
> have today's code that decides if to do SW GSO or HW GSO choose according
> to the skb color and the driver posted value.
>
Solution #4: apply this patch and implement the check functions as
needed in those 4 or 5 drivers. If a device can only do VXLAN/NVGRE
then I believe the check function is something like:

bool mydev_gso_check(struct sk_buff *skb, struct net_device *dev)
{
        if ((skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL) &&
            ((skb->inner_protocol_type != ENCAP_TYPE_ETHER ||
              skb->protocol != htons(ETH_P_TEB) ||
              skb_inner_mac_header(skb) - skb_transport_header(skb) != 12)
                return false;

        return true;
}

>> Would any other driver maintainers like to chime in on this?
>
> Alex? John?

^ permalink raw reply

* Re: randconfig build error with next-20141001, in drivers/i2c/algos/i2c-algo-bit.c
From: Randy Dunlap @ 2014-10-01 23:26 UTC (permalink / raw)
  To: Jim Davis, Stephen Rothwell, linux-next, linux-kernel, wsa, khali,
	Paul Gortmaker, linux-i2c, netdev@vger.kernel.org, linux-can,
	linux-media, Hans Verkuil
In-Reply-To: <CA+r1ZhiL1y9aeLeJjpd_1DtzOG_oyoPg7XsTPJ9G-XY5G2DfCQ@mail.gmail.com>

On 10/01/14 14:37, Jim Davis wrote:
> Building with the attached random configuration file,

Also:
warning: (CAN_PEAK_PCIEC && SFC && IGB && VIDEO_TW68 && DRM && FB_DDC && FB_VIA) selects I2C_ALGOBIT which has unmet direct dependencies (I2C)

> drivers/i2c/algos/i2c-algo-bit.c: In function ‘i2c_bit_add_bus’:
> drivers/i2c/algos/i2c-algo-bit.c:658:33: error: ‘i2c_add_adapter’
> undeclared (first use in this function)
>   return __i2c_bit_add_bus(adap, i2c_add_adapter);
>                                  ^
> drivers/i2c/algos/i2c-algo-bit.c:658:33: note: each undeclared
> identifier is reported only once for each function it appears in
> drivers/i2c/algos/i2c-algo-bit.c: In function ‘i2c_bit_add_numbered_bus’:
> drivers/i2c/algos/i2c-algo-bit.c:664:33: error:
> ‘i2c_add_numbered_adapter’ undeclared (first use in this function)
>   return __i2c_bit_add_bus(adap, i2c_add_numbered_adapter);
>                                  ^
>   CC      net/openvswitch/actions.o
> drivers/i2c/algos/i2c-algo-bit.c: In function ‘i2c_bit_add_bus’:
> drivers/i2c/algos/i2c-algo-bit.c:659:1: warning: control reaches end of non-void
>  function [-Wreturn-type]
>  }
>  ^
> drivers/i2c/algos/i2c-algo-bit.c: In function ‘i2c_bit_add_numbered_bus’:
> drivers/i2c/algos/i2c-algo-bit.c:665:1: warning: control reaches end of non-void
>  function [-Wreturn-type]
>  }
>  ^
> make[3]: *** [drivers/i2c/algos/i2c-algo-bit.o] Error 1

In drivers/media/pci/tw68/Kconfig, VIDEO_TW68 should depend on I2C in order
to make it safe to select I2C_ALGOBIT.

In drivers/net/can/sja1000/Kconfig, CAN_PEAK_PCIEC should depend on I2C
instead of selecting I2C (and change the help text).


-- 
~Randy

^ permalink raw reply

* Re: [Patch net-next] net_sched: avoid calling tcf_unbind_filter() in call_rcu callback
From: Cong Wang @ 2014-10-01 23:36 UTC (permalink / raw)
  To: John Fastabend; +Cc: John Fastabend, Cong Wang, netdev, David Miller
In-Reply-To: <542C6910.4070904@intel.com>

On Wed, Oct 1, 2014 at 1:50 PM, John Fastabend
<john.r.fastabend@intel.com> wrote:
>
> Its worth noting why this is safe. Any running schedulers will either
> read the valid class field or it will be zeroed.
>
> All schedulers today when the class is 0 do a lookup using the
> same call used by the tcf_exts_bind(). So even if we have a running
> classifier hit the null class pointer it will do a lookup and get
> to the same result. This is particularly fragile at the moment because
> the only way to verify this is to audit the schedulers call sites.

Yeah, maybe David can squash this into changelog.

>
> I think we need a helper to ensure the code doesn't get broken in
> a subtle way in the future. At very least it should be documented.
> I'll try to draft a follow up patch to use a helper routine for
> this and document it.
>
> Similar patches are needed for basic, fw, route, rsvp, and tcindex.
>

Please do fix them. Meanwhile, I am trying to refactor these pieces
of ugly code.

Thanks.

^ permalink raw reply

* fec_main: context imbalance in 'fec_set_features'
From: Fabio Estevam @ 2014-10-02  0:15 UTC (permalink / raw)
  To: Frank Li, Duan Fugang-B38611
  Cc: David S. Miller, netdev@vger.kernel.org, linux-sparse

Hi,

sparse complains the following:

drivers/net/ethernet/freescale/fec_main.c:2835:12: warning: context
imbalance in 'fec_set_features' - different lock contexts for basic
block

The code looks like this:

static int fec_set_features(struct net_device *netdev,
    netdev_features_t features)
{
    struct fec_enet_private *fep = netdev_priv(netdev);
    netdev_features_t changed = features ^ netdev->features;

    /* Quiesce the device if necessary */
    if (netif_running(netdev) && changed & FEATURES_NEED_QUIESCE) {
        napi_disable(&fep->napi);
        netif_tx_lock_bh(netdev);
        fec_stop(netdev);
    }

    netdev->features = features;

    /* Receive checksum has been changed */
    if (changed & NETIF_F_RXCSUM) {
        if (features & NETIF_F_RXCSUM)
            fep->csum_flags |= FLAG_RX_CSUM_ENABLED;
        else
            fep->csum_flags &= ~FLAG_RX_CSUM_ENABLED;
    }

    /* Resume the device after updates */
    if (netif_running(netdev) && changed & FEATURES_NEED_QUIESCE) {
        fec_restart(netdev);
        netif_tx_wake_all_queues(netdev);
        netif_tx_unlock_bh(netdev);
        napi_enable(&fep->napi);
    }

    return 0;
}

What would be the proper way to fix this warning?

Thanks,

Fabio Estevam

^ permalink raw reply

* Re: [PATCH net-next] net: phy: adjust fixed_phy_register() return value
From: Florian Fainelli @ 2014-10-02  0:33 UTC (permalink / raw)
  To: Petri Gynther; +Cc: netdev, David Miller, thomas.petazzoni
In-Reply-To: <CAGXr9JFhD+tguddQ3c9Lv+akbJo2d0Fp0_JBx7pU3A+L=b6RHw@mail.gmail.com>

On 10/01/2014 03:16 PM, Petri Gynther wrote:
> Hi Florian,
> 
> On Wed, Oct 1, 2014 at 2:56 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>> On 10/01/2014 02:45 PM, Petri Gynther wrote:
>>> Adjust fixed_phy_register() to return struct phy_device *, so that
>>> it becomes easy to use fixed PHYs without device tree support:
>>>
>>>   phydev = fixed_phy_register(PHY_POLL, &fixed_phy_status, NULL);
>>>   fixed_phy_set_link_update(phydev, fixed_phy_link_update);
>>>   phy_connect_direct(netdev, phydev, handler_fn, phy_interface);
>>>
>>> This change is a prerequisite for modifying bcmgenet driver to work
>>> without a device tree on Broadcom's MIPS-based 7xxx platforms.
>>
>> Without a Device Tree enabled platform, you still have the
>> fixed_phy_add() API to register a fixed PHY at a known location (the
>> second parameter), did we someone broke that with Thomas' patches?
>>
> 
> If a driver uses fixed_phy_add() directly, it then has to call
> get_phy_device() and phy_device_register() as well. But, all that is
> already handled nicely inside fixed_phy_register(). And,
> fixed_phy_register() also handles the fixed PHY address allocation
> automatically.

Sure, the usual model we had before Thomas' patches was to specifically
format phy_id to make sure we would bind to the "fixed-0" MDIO bus
instead of another one, see drivers/net/ethernet/ti/cpmac.c for instance.

> 
> In my opinion, this change makes it really easy to use fixed PHYs from
> non-OF code, as I showed in the change description.

Agreed, and I think it will allow us to get rid of the manual allocation
API eventually.

I let Thomas comment on whether he likes this or not.

> 
> -- Petri
> 
>> You could for instance register a fixed PHY at address 1 for the
>> GENET/MoCA instance to find it.
>>
>> Other than that, I am not opposed to the change, since it allows us to
>> eventually get rid of fixed_phy_add() in the future.
>>
>> Thanks!
>>
>>>
>>> Signed-off-by: Petri Gynther <pgynther@google.com>
>>> ---
>>>  drivers/net/phy/fixed.c   | 16 ++++++++--------
>>>  drivers/of/of_mdio.c      |  7 +++++--
>>>  include/linux/phy_fixed.h | 14 +++++++-------
>>>  3 files changed, 20 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/net/phy/fixed.c b/drivers/net/phy/fixed.c
>>> index 5b19fbb..47872caa 100644
>>> --- a/drivers/net/phy/fixed.c
>>> +++ b/drivers/net/phy/fixed.c
>>> @@ -233,9 +233,9 @@ EXPORT_SYMBOL_GPL(fixed_phy_del);
>>>  static int phy_fixed_addr;
>>>  static DEFINE_SPINLOCK(phy_fixed_addr_lock);
>>>
>>> -int fixed_phy_register(unsigned int irq,
>>> -                    struct fixed_phy_status *status,
>>> -                    struct device_node *np)
>>> +struct phy_device *fixed_phy_register(unsigned int irq,
>>> +                                   struct fixed_phy_status *status,
>>> +                                   struct device_node *np)
>>>  {
>>>       struct fixed_mdio_bus *fmb = &platform_fmb;
>>>       struct phy_device *phy;
>>> @@ -246,19 +246,19 @@ int fixed_phy_register(unsigned int irq,
>>>       spin_lock(&phy_fixed_addr_lock);
>>>       if (phy_fixed_addr == PHY_MAX_ADDR) {
>>>               spin_unlock(&phy_fixed_addr_lock);
>>> -             return -ENOSPC;
>>> +             return ERR_PTR(-ENOSPC);
>>>       }
>>>       phy_addr = phy_fixed_addr++;
>>>       spin_unlock(&phy_fixed_addr_lock);
>>>
>>>       ret = fixed_phy_add(PHY_POLL, phy_addr, status);
>>>       if (ret < 0)
>>> -             return ret;
>>> +             return ERR_PTR(ret);
>>>
>>>       phy = get_phy_device(fmb->mii_bus, phy_addr, false);
>>>       if (!phy || IS_ERR(phy)) {
>>>               fixed_phy_del(phy_addr);
>>> -             return -EINVAL;
>>> +             return ERR_PTR(-EINVAL);
>>>       }
>>>
>>>       of_node_get(np);
>>> @@ -269,10 +269,10 @@ int fixed_phy_register(unsigned int irq,
>>>               phy_device_free(phy);
>>>               of_node_put(np);
>>>               fixed_phy_del(phy_addr);
>>> -             return ret;
>>> +             return ERR_PTR(ret);
>>>       }
>>>
>>> -     return 0;
>>> +     return phy;
>>>  }
>>>
>>>  static int __init fixed_mdio_bus_init(void)
>>> diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
>>> index a85d800..907baa9 100644
>>> --- a/drivers/of/of_mdio.c
>>> +++ b/drivers/of/of_mdio.c
>>> @@ -286,6 +286,7 @@ int of_phy_register_fixed_link(struct device_node *np)
>>>       struct device_node *fixed_link_node;
>>>       const __be32 *fixed_link_prop;
>>>       int len;
>>> +     struct phy_device *phy;
>>>
>>>       /* New binding */
>>>       fixed_link_node = of_get_child_by_name(np, "fixed-link");
>>> @@ -299,7 +300,8 @@ int of_phy_register_fixed_link(struct device_node *np)
>>>               status.asym_pause = of_property_read_bool(fixed_link_node,
>>>                                                         "asym-pause");
>>>               of_node_put(fixed_link_node);
>>> -             return fixed_phy_register(PHY_POLL, &status, np);
>>> +             phy = fixed_phy_register(PHY_POLL, &status, np);
>>> +             return (!phy || IS_ERR(phy));
>>>       }
>>>
>>>       /* Old binding */
>>> @@ -310,7 +312,8 @@ int of_phy_register_fixed_link(struct device_node *np)
>>>               status.speed = be32_to_cpu(fixed_link_prop[2]);
>>>               status.pause = be32_to_cpu(fixed_link_prop[3]);
>>>               status.asym_pause = be32_to_cpu(fixed_link_prop[4]);
>>> -             return fixed_phy_register(PHY_POLL, &status, np);
>>> +             phy = fixed_phy_register(PHY_POLL, &status, np);
>>> +             return (!phy || IS_ERR(phy));
>>>       }
>>>
>>>       return -ENODEV;
>>> diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
>>> index 9411386..f2ca1b4 100644
>>> --- a/include/linux/phy_fixed.h
>>> +++ b/include/linux/phy_fixed.h
>>> @@ -14,9 +14,9 @@ struct device_node;
>>>  #ifdef CONFIG_FIXED_PHY
>>>  extern int fixed_phy_add(unsigned int irq, int phy_id,
>>>                        struct fixed_phy_status *status);
>>> -extern int fixed_phy_register(unsigned int irq,
>>> -                           struct fixed_phy_status *status,
>>> -                           struct device_node *np);
>>> +extern struct phy_device *fixed_phy_register(unsigned int irq,
>>> +                                          struct fixed_phy_status *status,
>>> +                                          struct device_node *np);
>>>  extern void fixed_phy_del(int phy_addr);
>>>  extern int fixed_phy_set_link_update(struct phy_device *phydev,
>>>                       int (*link_update)(struct net_device *,
>>> @@ -27,11 +27,11 @@ static inline int fixed_phy_add(unsigned int irq, int phy_id,
>>>  {
>>>       return -ENODEV;
>>>  }
>>> -static inline int fixed_phy_register(unsigned int irq,
>>> -                                  struct fixed_phy_status *status,
>>> -                                  struct device_node *np)
>>> +static inline struct phy_device *fixed_phy_register(unsigned int irq,
>>> +                                             struct fixed_phy_status *status,
>>> +                                             struct device_node *np)
>>>  {
>>> -     return -ENODEV;
>>> +     return ERR_PTR(-ENODEV);
>>>  }
>>>  static inline int fixed_phy_del(int phy_addr)
>>>  {
>>>
>>

^ permalink raw reply

* Re: [PATCH net-next] net: avoid one atomic operation in skb_clone()
From: David Miller @ 2014-10-02  1:27 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1412202435.16704.68.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 01 Oct 2014 15:27:15 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> Fast clone cloning can actually avoid an atomic_inc(), if we
> guarantee prior clone_ref value is 1.
> 
> This requires a change kfree_skbmem(), to perform the 
> atomic_dec_and_test() on clone_ref before setting fclone to
> SKB_FCLONE_UNAVAILABLE.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH v3 1/1] net: fec: implement rx_copybreak to improve rx performance
From: David Miller @ 2014-10-02  1:28 UTC (permalink / raw)
  To: b38611; +Cc: b20596, netdev, bhutchings, shawn.guo, romieu, eric.dumazet
In-Reply-To: <1412040485-19130-1-git-send-email-b38611@freescale.com>

From: Fugang Duan <b38611@freescale.com>
Date: Tue, 30 Sep 2014 09:28:05 +0800

> - Copy short frames and keep the buffers mapped, re-allocate skb instead of
>   memory copy for long frames.
> - Add support for setting/getting rx_copybreak using generic ethtool tunable
> 
> Changes V3:
> * As Eric Dumazet's suggestion that removing the copybreak module parameter
>   and only keep the ethtool API support for rx_copybreak.
> 
> Changes V2:
> * Implements rx_copybreak
> * Rx_copybreak provides module parameter to change this value
> * Add tunable_ops support for rx_copybreak
> 
> Signed-off-by: Fugang Duan <B38611@freescale.com>
> Signed-off-by: Frank Li <Frank.Li@freescale.com>

Applied, thanks.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox