* Re: [PATCH 02/14] net: dsa: Report known silicon revisions for Marvell 88E6060
From: Sergei Shtylyov @ 2014-10-23 16:00 UTC (permalink / raw)
To: Guenter Roeck, netdev
Cc: David S. Miller, Florian Fainelli, Andrew Lunn, linux-kernel
In-Reply-To: <5449008C.4050505@roeck-us.net>
On 10/23/2014 05:20 PM, Guenter Roeck wrote:
>>> Report known silicon revisions when probing Marvell 88E6060 switches.
>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>> ---
>>> drivers/net/dsa/mv88e6060.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>> diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
>>> index 05b0ca3..c29aebe 100644
>>> --- a/drivers/net/dsa/mv88e6060.c
>>> +++ b/drivers/net/dsa/mv88e6060.c
>>> @@ -69,8 +69,11 @@ static char *mv88e6060_probe(struct device *host_dev,
>>> int sw_addr)
>>>
>>> ret = mdiobus_read(bus, sw_addr + REG_PORT(0), 0x03);
>>> if (ret >= 0) {
>>> - ret &= 0xfff0;
>>> if (ret == 0x0600)
>>> + return "Marvell 88E6060 (A0)";
>>> + if (ret == 0x0601 || ret == 0x0602)
>> *else* *if*.
>>> + return "Marvell 88E6060 (B0)";
>>> + if ((ret & 0xfff0) == 0x0600)
>> Likewise?
> The if case returns, so the else would just introduce an unnecessary
> additional level of indentation.
Not really.
> I think nowadays even checkpatch
> complains about an unnecessary else after return.
You're right about the *return* though. I should have stayed silent.
> Thanks,
> Guenter
WBR, Sergei
^ permalink raw reply
* Re: localed stuck in recent 3.18 git in copy_net_ns?
From: Paul E. McKenney @ 2014-10-23 15:33 UTC (permalink / raw)
To: Yanko Kaneti
Cc: Josh Boyer, Eric W. Biederman, Cong Wang, Kevin Fenzi, netdev,
Linux-Kernel@Vger. Kernel. Org
In-Reply-To: <20141023122750.GP4977@linux.vnet.ibm.com>
On Thu, Oct 23, 2014 at 05:27:50AM -0700, Paul E. McKenney wrote:
> On Thu, Oct 23, 2014 at 09:09:26AM +0300, Yanko Kaneti wrote:
> > On Wed, 2014-10-22 at 16:24 -0700, Paul E. McKenney wrote:
> > > On Thu, Oct 23, 2014 at 01:40:32AM +0300, Yanko Kaneti wrote:
> > > > On Wed-10/22/14-2014 15:33, Josh Boyer wrote:
> > > > > On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
> > > > > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > [ . . . ]
> > >
> > > > > > Don't get me wrong -- the fact that this kthread appears to
> > > > > > have
> > > > > > blocked within rcu_barrier() for 120 seconds means that
> > > > > > something is
> > > > > > most definitely wrong here. I am surprised that there are no
> > > > > > RCU CPU
> > > > > > stall warnings, but perhaps the blockage is in the callback
> > > > > > execution
> > > > > > rather than grace-period completion. Or something is
> > > > > > preventing this
> > > > > > kthread from starting up after the wake-up callback executes.
> > > > > > Or...
> > > > > >
> > > > > > Is this thing reproducible?
> > > > >
> > > > > I've added Yanko on CC, who reported the backtrace above and can
> > > > > recreate it reliably. Apparently reverting the RCU merge commit
> > > > > (d6dd50e) and rebuilding the latest after that does not show the
> > > > > issue. I'll let Yanko explain more and answer any questions you
> > > > > have.
> > > >
> > > > - It is reproducible
> > > > - I've done another build here to double check and its definitely
> > > > the rcu merge
> > > > that's causing it.
> > > >
> > > > Don't think I'll be able to dig deeper, but I can do testing if
> > > > needed.
> > >
> > > Please! Does the following patch help?
> >
> > Nope, doesn't seem to make a difference to the modprobe ppp_generic
> > test
>
> Well, I was hoping. I will take a closer look at the RCU merge commit
> and see what suggests itself. I am likely to ask you to revert specific
> commits, if that works for you.
Well, rather than reverting commits, could you please try testing the
following commits?
11ed7f934cb8 (rcu: Make nocb leader kthreads process pending callbacks after spawning)
73a860cd58a1 (rcu: Replace flush_signals() with WARN_ON(signal_pending()))
c847f14217d5 (rcu: Avoid misordering in nocb_leader_wait())
For whatever it is worth, I am guessing this one.
a53dd6a65668 (rcutorture: Add RCU-tasks tests to default rcutorture list)
If any of the above fail, this one should also fail.
Also, could you please send along your .config?
Thanx, Paul
^ permalink raw reply
* Re: Possible wireless issue introduced in next-20140930
From: Murilo Opsfelder Araujo @ 2014-10-23 15:26 UTC (permalink / raw)
To: Mike Galbraith
Cc: linux-kernel, netdev, linux-wireless, Larry Finger,
John W. Linville, Thadeu Cascardo
In-Reply-To: <1414041815.5228.37.camel@marge.simpson.net>
On 10/23/2014 03:23 AM, Mike Galbraith wrote:
> On Thu, 2014-10-23 at 01:17 -0200, Murilo Opsfelder Araujo wrote:
>> Hello, everyone.
>>
>> With next-20140930 my laptop does not work, i.e. after I enter my login
>> and password in KDM, the entire system becomes unresponsive and I need
>> to reset it in order to reboot (it does not even show the KDE splash
>> screen).
>>
>> It was working pretty fine with next-20140926.
>>
>> I've also tested with next-20141022 and v3.18-rc1 and no luck.
>>
>> git bisect pointed me to the commit below [1]. My wireless card is a
>> RTL8191SEvA [2].
>
> Mine is RTL8191SEvB.
>
> I was going to bisect RTL8191SE regression when I got a chance, but you
> beat me to it.
>
>> commit 38506ecefab911785d5e1aa5889f6eeb462e0954
>> Author: Larry Finger <Larry.Finger@lwfinger.net>
>> Date: Mon Sep 22 09:39:19 2014 -0500
>>
>> rtlwifi: rtl_pci: Start modification for new drivers
>
> Did you confirm that bisection result, ie revert it at HEAD of whichever
> tree you bisected?
>
> The revert (master) removed some warnings on the way to kaboom here, but
> the drivers is still toxic. My log follows in case it's the same thing.
> I can't go hunting atm, maybe during the weekend if the problem hasn't
> evaporate by then.
next-20141023 does not work as well.
With commit 38506ecefab911785d5e1aa5889f6eeb462e0954 reverted, kernel
blows up very early in boot.
Cascardo (CC:) helped me to investigate and it seems that when
rtlpriv->cfg->ops->get_btc_status() is called, it is pointing to a NULL
function.
With the changes below, written by Cascardo, I could get rid of
oops/panic and system booted normally. But there was no wifi network
available (like if wifi card was disabled).
diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/sw.c
b/drivers/net/wireless/rtlwifi/rtl8192se/sw.c
index 1bff2a0..807f0f7 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192se/sw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192se/sw.c
@@ -253,6 +253,11 @@ static void rtl92s_deinit_sw_vars(struct
ieee80211_hw *hw)
}
}
+static bool rtl92s_get_btc_status(void)
+{
+ return false;
+}
+
static struct rtl_hal_ops rtl8192se_hal_ops = {
.init_sw_vars = rtl92s_init_sw_vars,
.deinit_sw_vars = rtl92s_deinit_sw_vars,
@@ -294,6 +299,7 @@ static struct rtl_hal_ops rtl8192se_hal_ops = {
.set_bbreg = rtl92s_phy_set_bb_reg,
.get_rfreg = rtl92s_phy_query_rf_reg,
.set_rfreg = rtl92s_phy_set_rf_reg,
+ .get_btc_status = rtl92s_get_btc_status,
};
static struct rtl_mod_params rtl92se_mod_params = {
--
Murilo
^ permalink raw reply related
* Re: [RFC PATCH v3 net-next] net: ipv6: Add a sysctl to make optimistic addresses useful candidates
From: Nicolas Dichtel @ 2014-10-23 15:12 UTC (permalink / raw)
To: Erik Kline, netdev; +Cc: ben, lorenzo, hannes
In-Reply-To: <1414058399-19695-1-git-send-email-ek@google.com>
Le 23/10/2014 11:59, Erik Kline a écrit :
> Add a sysctl that causes an interface's optimistic addresses
> to be considered equivalent to other non-deprecated addresses
> for source address selection purposes. Preferred addresses
> will still take precedence over optimistic addresses, subject
> to other ranking in the source address selection algorithm.
>
> This is useful where different interfaces are connected to
> different networks from different ISPs (e.g., a cell network
> and a home wifi network).
>
> The current behaviour complies with RFC 3484/6724, and it
> makes sense if the host has only one interface, or has
> multiple interfaces on the same network (same or cooperating
> administrative domain(s), but not in the multiple distinct
> networks case.
>
> For example, if a mobile device has an IPv6 address on an LTE
> network and then connects to IPv6-enabled wifi, while the wifi
> IPv6 address is undergoing DAD, IPv6 connections will try use
> the wifi default route with the LTE IPv6 address, and will get
> stuck until they time out.
>
> Also, because optimistic nodes can receive frames, issue
> an RTM_NEWADDR as soon as DAD starts. If DAD fails, a separate
> RTM_DELADDR is always sent.
>
> Also: add an entry in ip-sysctl.txt for optimistic_dad.
>
> Signed-off-by: Erik Kline <ek@google.com>
I agree with the principle of the patch. This option is useful.
Thank you,
Nicolas
^ permalink raw reply
* Re: [RFC] tcp md5 use of alloc_percpu
From: Crestez Dan Leonard @ 2014-10-23 14:46 UTC (permalink / raw)
To: Eric Dumazet, Jonathan Toppins; +Cc: David Ahern, netdev
In-Reply-To: <1414070490.2094.39.camel@edumazet-glaptop2.roam.corp.google.com>
On Thu, Oct 23, 2014 at 4:21 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2014-10-23 at 02:58 -0400, Jonathan Toppins wrote:
>
>> > + if (!pool) {
>> > + pool = kzalloc_node(sizeof(*pool), GFP_KERNEL,
>> GFP_DMA | GFP_KERNEL
>> This memory will possibly be used in a DMA correct? (thinking crypto
>> hardware offload)
>
> I am not sure this is the case, but this certainly can be added.
As far as I know what GFP_DMA actually does is restrict allocation to
low memory addresses under 24 bits for very old devices. There is also
GFP_DMA32 which restricts to addresses under 32 bytes (for device
which don't support 64 bit addresses).
This kind of stuff only belongs in device drivers where the exact
hardware limitations are known. I don't think crypto devices with this
kind of limitations can be exposed through the crypto API that the
md5sig feature uses.
--
Regards,
Leonard
^ permalink raw reply
* Re: [RFC] tcp md5 use of alloc_percpu
From: Eric Dumazet @ 2014-10-23 14:43 UTC (permalink / raw)
To: Jonathan Toppins; +Cc: David Ahern, Crestez Dan Leonard, netdev
In-Reply-To: <1414070490.2094.39.camel@edumazet-glaptop2.roam.corp.google.com>
On Thu, 2014-10-23 at 06:21 -0700, Eric Dumazet wrote:
> On Thu, 2014-10-23 at 02:58 -0400, Jonathan Toppins wrote:
>
> > > + if (!pool) {
> > > + pool = kzalloc_node(sizeof(*pool), GFP_KERNEL,
> > GFP_DMA | GFP_KERNEL
> > This memory will possibly be used in a DMA correct? (thinking crypto
> > hardware offload)
>
> I am not sure this is the case, but this certainly can be added.
>
Yes, this is not the case.
The real problem is because sg_set_buf() does the following :
sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
So it assumes a memory range is not spanning multiple pages.
So maybe a simpler patch would be :
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 1bec4e76d88c5852d8ba3392b22aa58d6453ab4d..53e355199437b00836635c5919f2f1a1ae237271 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2886,10 +2886,17 @@ static void __tcp_free_md5sig_pool(struct tcp_md5sig_pool __percpu *pool)
static void __tcp_alloc_md5sig_pool(void)
{
- int cpu;
struct tcp_md5sig_pool __percpu *pool;
+ size_t sz;
+ int cpu;
- pool = alloc_percpu(struct tcp_md5sig_pool);
+ /* sg_set_buf() assumes a contiguous memory area, but alloc_percpu()
+ * can use vmalloc(). So make sure we ask an alignment so that
+ * tcp_md5sig_pool is located in a single page.
+ */
+ BUILD_BUG_ON(sizeof(struct tcp_md5sig_pool) > PAGE_SIZE);
+ sz = roundup_pow_of_two(sizeof(struct tcp_md5sig_pool));
+ pool = __alloc_percpu(sz, sz);
if (!pool)
return;
^ permalink raw reply related
* ixgbe driver fails occasionally since ee98b577e7711d5890ded2c7b05578a29512bd39
From: Scott Harrison @ 2014-10-23 14:06 UTC (permalink / raw)
To: netdev
Hi,
I was asked to raise this issue here.
https://bugzilla.kernel.org/show_bug.cgi?id=86591
lspci ->
03:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
With the fibre 10Gbs SFP occasionally on reboot we get
[ 15.104726] ixgbe 0000:03:00.0 eth0: detected SFP+: 5
[ 19.735155] ixgbe 0000:03:00.0 eth0: setup link failed with code -14
Probably about 1 in 10 boots. With the 1Gbs Copper SFP it works fine.
kernel 3.14.8 worked fine, but on upgrading to 3.16.1 it fails in this way.
I have narrowed it down to the commit
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=ee98b577e7711d5890ded2c7b05578a29512bd39
I don't know enough about the hardware to understand why the change
sometimes causes the failure, and was hoping someone else would. Reverting
this commit on 3.16.1 fixes the issue.
If you need any more information please ask.
Scott.
--
Software Engineer
Cisco.com - http://www.cisco.com
This email may contain confidential and privileged material for the sole use of
the intended recipient. Any review, use, distribution or disclosure by others
is strictly prohibited. If you are not the intended recipient (or authorised to
receive for the recipient), please contact the sender by reply email and delete
all copies of this message.
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
^ permalink raw reply
* Re: [PATCH 10/14] net: dsa/mv88e6352: Implement EEPROM access functions
From: Andrew Lunn @ 2014-10-23 13:54 UTC (permalink / raw)
To: Guenter Roeck
Cc: netdev, David S. Miller, Florian Fainelli, Andrew Lunn,
linux-kernel
In-Reply-To: <1414037002-25528-11-git-send-email-linux@roeck-us.net>
On Wed, Oct 22, 2014 at 09:03:18PM -0700, Guenter Roeck wrote:
> MV88E6352 supports read and write access to its configuration eeprom.
Hi Guenter
I don't have the datasheet for the MV88E6352. Is the EEPROM built in,
or external on an i2c bus?
> +static int mv88e6352_get_eeprom_len(struct dsa_switch *ds)
> +{
> + return 0x200;
> +}
How do you handle the case of it being external and not populated.
Would it not be better to try to detect it here, and return 0 if it
does not exist?
Andrew
^ permalink raw reply
* Re: [PATHC] net: napi_reuse_skb() should check pfmemalloc
From: Roman Gushchin @ 2014-10-23 13:49 UTC (permalink / raw)
To: Eric Dumazet
Cc: alexander.h.duyck@intel.com, netdev@vger.kernel.org,
sassmann@kpanic.de, e1000-devel@lists.sourceforge.net,
peter.p.waskiewicz.jr@intel.com, bruce.w.allan@intel.com,
jesse.brandeburg@intel.com, davem@davemloft.net,
john.ronciak@intel.com, gregkh@linuxfoundation.org,
linux-kernel@vger.kernel.org
In-Reply-To: <1414071030.2094.46.camel@edumazet-glaptop2.roam.corp.google.com>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
--
@klamm
23.10.2014, 17:30, "Eric Dumazet" <eric.dumazet@gmail.com>:
> From: Eric Dumazet <edumazet@google.com>
>
> Do not reuse skb if it was pfmemalloc tainted, otherwise
> future frame might be dropped anyway.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
> net/core/dev.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b793e3521a36..945bbd001359 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4157,6 +4157,10 @@ EXPORT_SYMBOL(napi_gro_receive);
>
> static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb)
> {
> + if (unlikely(skb->pfmemalloc)) {
> + consume_skb(skb);
> + return;
> + }
> __skb_pull(skb, skb_headlen(skb));
> /* restore the reserve we had after netdev_alloc_skb_ip_align() */
> skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN - skb_headroom(skb));
------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Re: [PATCH 06/14] net: dsa: Add support for hardware monitoring
From: Andrew Lunn @ 2014-10-23 13:47 UTC (permalink / raw)
To: Guenter Roeck
Cc: Florian Fainelli, netdev, David S. Miller, Andrew Lunn,
linux-kernel@vger.kernel.org
In-Reply-To: <54488CE1.2000106@roeck-us.net>
> >>+static DEVICE_ATTR_RO(temp1_input);
> >
> >You probably want the number of temperature sensors to come from the
> >switch driver, and support arbitrary number of temperature sensors?
>
> In that case we would need the number of sensors, pass a sensor index,
> and create a dynamic number of attributes. The code would get much more
> complex without real benefit unless there is such a chip
We are two different use cases here:
One switch chip with multiple temperature sensors
Multiple switch chips, each with its own temperature sensor.
I don't know of any hardware using either of these uses cases, but
they could appear in the future.
If we are not doing the generic implementation now, how about avoiding
an ABI break in the future, by hard coding the sensors with .0.0 on
the end?
Andrew
^ permalink raw reply
* Re: [PATCH 1/2] net: dsa: Error out on tagging protocol mismatches
From: Andrew Lunn @ 2014-10-23 13:32 UTC (permalink / raw)
To: Florian Fainelli; +Cc: Andrew Lunn, davem, netdev, alexander.h.duyck
In-Reply-To: <544841EC.4070103@gmail.com>
On Wed, Oct 22, 2014 at 04:46:52PM -0700, Florian Fainelli wrote:
> On 10/22/2014 04:35 PM, Andrew Lunn wrote:
> > If there is a mismatch between enabled tagging protocols and the
> > protocol the switch supports, error out, rather than continue with a
> > situation which is unlikely to work.
> >
> > Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> > cc: alexander.h.duyck@intel.com
> > ---
> > net/dsa/dsa.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> > index 22f34cf4cb27..8a31bd81a315 100644
> > --- a/net/dsa/dsa.c
> > +++ b/net/dsa/dsa.c
> > @@ -175,7 +175,8 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
> > break;
> > #endif
> > default:
> > - break;
> > + ret = -ENOPROTOOPT;
> > + goto out;
> > }
>
> This prevents using a switch driver without tagging, which is something
> that you might want to do (link setup, ethtool stats, EEE etc...).
Hi Florian
I didn't know that was a use case.
So i assume such a driver would use DSA_TAG_PROTO_NONE? So all i need
to do is add that as a case value to the switch statement, and we
should cover that use case, and still be able to detect a mismatch.
v2 patch soon.
Andrew
^ permalink raw reply
* [PATHC] net: napi_reuse_skb() should check pfmemalloc
From: Eric Dumazet @ 2014-10-23 13:30 UTC (permalink / raw)
To: Roman Gushchin
Cc: jeffrey.t.kirsher@intel.com, jesse.brandeburg@intel.com,
bruce.w.allan@intel.com, carolyn.wyborny@intel.com,
donald.c.skidmore@intel.com, gregory.v.rose@intel.com,
peter.p.waskiewicz.jr@intel.com, alexander.h.duyck@intel.com,
john.ronciak@intel.com, tushar.n.dave@intel.com,
davem@davemloft.net, sassmann@kpanic.de,
gregkh@linuxfoundation.org, e1000-devel@lists.sourceforge.net,
netdev@vger.kernel.org, "linux-kernel@vger.kernel.or
In-Reply-To: <1811414063272@webcorp02e.yandex-team.ru>
From: Eric Dumazet <edumazet@google.com>
Do not reuse skb if it was pfmemalloc tainted, otherwise
future frame might be dropped anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/dev.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index b793e3521a36..945bbd001359 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4157,6 +4157,10 @@ EXPORT_SYMBOL(napi_gro_receive);
static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb)
{
+ if (unlikely(skb->pfmemalloc)) {
+ consume_skb(skb);
+ return;
+ }
__skb_pull(skb, skb_headlen(skb));
/* restore the reserve we had after netdev_alloc_skb_ip_align() */
skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN - skb_headroom(skb));
^ permalink raw reply related
* Re: [PATCH 06/14] net: dsa: Add support for hardware monitoring
From: Guenter Roeck @ 2014-10-23 13:27 UTC (permalink / raw)
To: Richard Cochran
Cc: Florian Fainelli, netdev, David S. Miller, Andrew Lunn,
linux-kernel@vger.kernel.org
In-Reply-To: <20141023082444.GA4122@localhost.localdomain>
On 10/23/2014 01:24 AM, Richard Cochran wrote:
> On Wed, Oct 22, 2014 at 10:06:41PM -0700, Guenter Roeck wrote:
>> On 10/22/2014 09:37 PM, Florian Fainelli wrote:
>>> You probably want the number of temperature sensors to come from the
>>> switch driver, and support arbitrary number of temperature sensors?
>>
>> In that case we would need the number of sensors, pass a sensor index,
>> and create a dynamic number of attributes. The code would get much more
>> complex without real benefit unless there is such a chip - and if there is,
>> we can still change the code at that time. I would prefer to keep it simple
>> for now.
>
> Better to do it correctly, right from the start. There *will* be such
> a chip one day, and the person wanting to add it will appreciate the
> solid foundation (even if that person ends up being you ;).
>
That is really a matter of opinion; others say one should only introduce
complex infrastructure into the kernel if and when needed. I don't mind
changing the code to anticipate multiple sensors, but as I said it would
be more complex, obviously I would only have limited means to test it,
and, yes, by nature I tend to be one of the people who prefer to keep
things simple. Before I jump into doing this, I would prefer to get
guidance from the maintainer. David ?
Thanks,
Guenter
^ permalink raw reply
* Re: [RFC] tcp md5 use of alloc_percpu
From: Eric Dumazet @ 2014-10-23 13:21 UTC (permalink / raw)
To: Jonathan Toppins; +Cc: David Ahern, Crestez Dan Leonard, netdev
In-Reply-To: <5448A72D.1050806@cumulusnetworks.com>
On Thu, 2014-10-23 at 02:58 -0400, Jonathan Toppins wrote:
> > + if (!pool) {
> > + pool = kzalloc_node(sizeof(*pool), GFP_KERNEL,
> GFP_DMA | GFP_KERNEL
> This memory will possibly be used in a DMA correct? (thinking crypto
> hardware offload)
I am not sure this is the case, but this certainly can be added.
^ permalink raw reply
* Re: [PATCH 02/14] net: dsa: Report known silicon revisions for Marvell 88E6060
From: Guenter Roeck @ 2014-10-23 13:20 UTC (permalink / raw)
To: Sergei Shtylyov, netdev
Cc: David S. Miller, Florian Fainelli, Andrew Lunn, linux-kernel
In-Reply-To: <5448F9B4.9010209@cogentembedded.com>
On 10/23/2014 05:51 AM, Sergei Shtylyov wrote:
> Hello.
>
> On 10/23/2014 8:03 AM, Guenter Roeck wrote:
>
>> Report known silicon revisions when probing Marvell 88E6060 switches.
>
>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>> ---
>> drivers/net/dsa/mv88e6060.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>
>> diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
>> index 05b0ca3..c29aebe 100644
>> --- a/drivers/net/dsa/mv88e6060.c
>> +++ b/drivers/net/dsa/mv88e6060.c
>> @@ -69,8 +69,11 @@ static char *mv88e6060_probe(struct device *host_dev, int sw_addr)
>>
>> ret = mdiobus_read(bus, sw_addr + REG_PORT(0), 0x03);
>> if (ret >= 0) {
>> - ret &= 0xfff0;
>> if (ret == 0x0600)
>> + return "Marvell 88E6060 (A0)";
>> + if (ret == 0x0601 || ret == 0x0602)
>
> *else* *if*.
>
>> + return "Marvell 88E6060 (B0)";
>> + if ((ret & 0xfff0) == 0x0600)
>
> Likewise?
>
The if case returns, so the else would just introduce an unnecessary
additional level of indentation. I think nowadays even checkpatch
complains about an unnecessary else after return.
Thanks,
Guenter
^ permalink raw reply
* Re: [RFC] tcp md5 use of alloc_percpu
From: Crestez Dan Leonard @ 2014-10-23 13:03 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Ahern, netdev
In-Reply-To: <1414042688.2094.30.camel@edumazet-glaptop2.roam.corp.google.com>
On 10/23/2014 08:38 AM, Eric Dumazet wrote:
> On Wed, 2014-10-22 at 22:23 -0700, Eric Dumazet wrote:
> My updated patch would be :
>
> net/ipv4/tcp.c | 66 +++++++++++++++++++----------------------------
> 1 file changed, 28 insertions(+), 38 deletions(-)
>
I can confirm that this patch fixes my original issue.
I am working with a kernel based on 3.10 so I had to integrate 71cea17ed39fdf1c0634f530ddc6a2c2fc601c2b as well.
Regards,
Leonard
^ permalink raw reply
* [PATCH for-net 2/2] net/mlx4_core: Call synchronize_irq() before freeing EQ buffer
From: Eli Cohen @ 2014-10-23 12:57 UTC (permalink / raw)
To: davem; +Cc: netdev, ogerlitz, Eli Cohen
In-Reply-To: <1414069047-30865-1-git-send-email-eli@mellanox.com>
After moving the EQ ownership to software effectively destroying it, call
synchronize_irq() to ensure that any handler routines running on other CPU
cores finish execution. Only then free the EQ buffer.
The same thing is done when we destroy a CQ which is one of the sources
generating interrupts. In the case of CQ we want to avoid completion handlers
on a CQ that was destroyed. In the case we do the same to avoid receiving
asynchronous events after the EQ has been destroyed and its buffers freed.
Signed-off-by: Eli Cohen <eli@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/eq.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index a49c9d11d8a5..49290a405903 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -1026,6 +1026,7 @@ static void mlx4_free_eq(struct mlx4_dev *dev,
pr_cont("\n");
}
}
+ synchronize_irq(eq->irq);
mlx4_mtt_cleanup(dev, &eq->mtt);
for (i = 0; i < npages; ++i)
--
2.1.2
^ permalink raw reply related
* [PATCH for-net 1/2] net/mlx5_core: Call synchronize_irq() before freeing EQ buffer
From: Eli Cohen @ 2014-10-23 12:57 UTC (permalink / raw)
To: davem; +Cc: netdev, ogerlitz, Eli Cohen
In-Reply-To: <1414069047-30865-1-git-send-email-eli@mellanox.com>
After destroying the EQ, the object responsible for generating interrupts, call
synchronize_irq() to ensure that any handler routines running on other CPU
cores finish execution. Only then free the EQ buffer. This patch solves a very
rare case when we get panic on driver unload.
The same thing is done when we destroy a CQ which is one of the sources
generating interrupts. In the case of CQ we want to avoid completion handlers
on a CQ that was destroyed. In the case we do the same to avoid receiving
asynchronous events after the EQ has been destroyed and its buffers freed.
Signed-off-by: Eli Cohen <eli@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index ed53291468f3..a278238a2db6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -420,6 +420,7 @@ int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
if (err)
mlx5_core_warn(dev, "failed to destroy a previously created eq: eqn %d\n",
eq->eqn);
+ synchronize_irq(table->msix_arr[eq->irqn].vector);
mlx5_buf_free(dev, &eq->buf);
return err;
--
2.1.2
^ permalink raw reply related
* [PATCH for-net 0/2] irq sync fixes
From: Eli Cohen @ 2014-10-23 12:57 UTC (permalink / raw)
To: davem; +Cc: netdev, ogerlitz, Eli Cohen
Hi Dave,
This two patch series fixes a race where an interrupt handler could access a
freed memory.
Eli
Eli Cohen (2):
net/mlx5_core: Call synchronize_irq() before freeing EQ buffer
net/mlx4_core: Call synchronize_irq() before freeing EQ buffer
drivers/net/ethernet/mellanox/mlx4/eq.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 1 +
2 files changed, 2 insertions(+)
--
2.1.2
^ permalink raw reply
* Re: [PATCH 02/14] net: dsa: Report known silicon revisions for Marvell 88E6060
From: Sergei Shtylyov @ 2014-10-23 12:51 UTC (permalink / raw)
To: Guenter Roeck, netdev
Cc: David S. Miller, Florian Fainelli, Andrew Lunn, linux-kernel
In-Reply-To: <1414037002-25528-3-git-send-email-linux@roeck-us.net>
Hello.
On 10/23/2014 8:03 AM, Guenter Roeck wrote:
> Report known silicon revisions when probing Marvell 88E6060 switches.
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> drivers/net/dsa/mv88e6060.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
> diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
> index 05b0ca3..c29aebe 100644
> --- a/drivers/net/dsa/mv88e6060.c
> +++ b/drivers/net/dsa/mv88e6060.c
> @@ -69,8 +69,11 @@ static char *mv88e6060_probe(struct device *host_dev, int sw_addr)
>
> ret = mdiobus_read(bus, sw_addr + REG_PORT(0), 0x03);
> if (ret >= 0) {
> - ret &= 0xfff0;
> if (ret == 0x0600)
> + return "Marvell 88E6060 (A0)";
> + if (ret == 0x0601 || ret == 0x0602)
*else* *if*.
> + return "Marvell 88E6060 (B0)";
> + if ((ret & 0xfff0) == 0x0600)
Likewise?
> return "Marvell 88E6060";
> }
WBR, Sergei
^ permalink raw reply
* Re: [PATCH net-next v2 5/6] ethernet: renesas: remove unnecessary check
From: Sergei Shtylyov @ 2014-10-23 12:29 UTC (permalink / raw)
To: Varka Bhadram, netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-6-git-send-email-varkab@cdac.in>
On 10/23/2014 5:58 AM, Varka Bhadram wrote:
> devm_ioremap_resource checks platform_get_resource() return value.
> We can remove the duplicate check here.
> Signed-off-by: Varka Bhadram <varkab@cdac.in>
> ---
> drivers/net/ethernet/renesas/sh_eth.c | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index 60e9c2c..ffb49f3 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
[...]
> @@ -2781,8 +2777,6 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
> pm_runtime_enable(&pdev->dev);
> pm_runtime_get_sync(&pdev->dev);
>
> - /* The sh Ether-specific entries in the device structure. */
> - ndev->base_addr = res->start;
> devno = pdev->id;
> if (devno < 0)
> devno = 0;
> @@ -2806,6 +2800,9 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
> goto out_release;
> }
>
> + /* The sh Ether-specific entries in the device structure. */
No need to move the comment, at least not there.
> + ndev->base_addr = res->start;
> +
> spin_lock_init(&mdp->lock);
> mdp->pdev = pdev;
WBR, Sergei
^ permalink raw reply
* Re: localed stuck in recent 3.18 git in copy_net_ns?
From: Paul E. McKenney @ 2014-10-23 12:27 UTC (permalink / raw)
To: Yanko Kaneti
Cc: Josh Boyer, Eric W. Biederman, Cong Wang, Kevin Fenzi, netdev,
Linux-Kernel@Vger. Kernel. Org
In-Reply-To: <1414044566.2031.1.camel@declera.com>
On Thu, Oct 23, 2014 at 09:09:26AM +0300, Yanko Kaneti wrote:
> On Wed, 2014-10-22 at 16:24 -0700, Paul E. McKenney wrote:
> > On Thu, Oct 23, 2014 at 01:40:32AM +0300, Yanko Kaneti wrote:
> > > On Wed-10/22/14-2014 15:33, Josh Boyer wrote:
> > > > On Wed, Oct 22, 2014 at 2:55 PM, Paul E. McKenney
> > > > <paulmck@linux.vnet.ibm.com> wrote:
> >
> > [ . . . ]
> >
> > > > > Don't get me wrong -- the fact that this kthread appears to
> > > > > have
> > > > > blocked within rcu_barrier() for 120 seconds means that
> > > > > something is
> > > > > most definitely wrong here. I am surprised that there are no
> > > > > RCU CPU
> > > > > stall warnings, but perhaps the blockage is in the callback
> > > > > execution
> > > > > rather than grace-period completion. Or something is
> > > > > preventing this
> > > > > kthread from starting up after the wake-up callback executes.
> > > > > Or...
> > > > >
> > > > > Is this thing reproducible?
> > > >
> > > > I've added Yanko on CC, who reported the backtrace above and can
> > > > recreate it reliably. Apparently reverting the RCU merge commit
> > > > (d6dd50e) and rebuilding the latest after that does not show the
> > > > issue. I'll let Yanko explain more and answer any questions you
> > > > have.
> > >
> > > - It is reproducible
> > > - I've done another build here to double check and its definitely
> > > the rcu merge
> > > that's causing it.
> > >
> > > Don't think I'll be able to dig deeper, but I can do testing if
> > > needed.
> >
> > Please! Does the following patch help?
>
> Nope, doesn't seem to make a difference to the modprobe ppp_generic
> test
Well, I was hoping. I will take a closer look at the RCU merge commit
and see what suggests itself. I am likely to ask you to revert specific
commits, if that works for you.
Thanx, Paul
> INFO: task kworker/u16:6:101 blocked for more than 120 seconds.
> Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> kworker/u16:6 D ffff88022067cec0 11680 101 2 0x00000000
> Workqueue: netns cleanup_net
> ffff8802206939e8 0000000000000096 ffff88022067cec0 00000000001d5f00
> ffff880220693fd8 00000000001d5f00 ffff880223263480 ffff88022067cec0
> ffffffff82c51d60 7fffffffffffffff ffffffff81ee2698 ffffffff81ee2690
> Call Trace:
> [<ffffffff8185e289>] schedule+0x29/0x70
> [<ffffffff818634ac>] schedule_timeout+0x26c/0x410
> [<ffffffff81028c4a>] ? native_sched_clock+0x2a/0xa0
> [<ffffffff81107afc>] ? mark_held_locks+0x7c/0xb0
> [<ffffffff81864530>] ? _raw_spin_unlock_irq+0x30/0x50
> [<ffffffff81107c8d>] ? trace_hardirqs_on_caller+0x15d/0x200
> [<ffffffff8185fcbc>] wait_for_completion+0x10c/0x150
> [<ffffffff810e5430>] ? wake_up_state+0x20/0x20
> [<ffffffff8112a799>] _rcu_barrier+0x159/0x200
> [<ffffffff8112a895>] rcu_barrier+0x15/0x20
> [<ffffffff81718f0f>] netdev_run_todo+0x6f/0x310
> [<ffffffff8170dad5>] ? rollback_registered_many+0x265/0x2e0
> [<ffffffff81725f7e>] rtnl_unlock+0xe/0x10
> [<ffffffff8170f936>] default_device_exit_batch+0x156/0x180
> [<ffffffff810fd8f0>] ? abort_exclusive_wait+0xb0/0xb0
> [<ffffffff817079e3>] ops_exit_list.isra.1+0x53/0x60
> [<ffffffff81708590>] cleanup_net+0x100/0x1f0
> [<ffffffff810ccff8>] process_one_work+0x218/0x850
> [<ffffffff810ccf5f>] ? process_one_work+0x17f/0x850
> [<ffffffff810cd717>] ? worker_thread+0xe7/0x4a0
> [<ffffffff810cd69b>] worker_thread+0x6b/0x4a0
> [<ffffffff810cd630>] ? process_one_work+0x850/0x850
> [<ffffffff810d39eb>] kthread+0x10b/0x130
> [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
> [<ffffffff810d38e0>] ? kthread_create_on_node+0x250/0x250
> [<ffffffff8186527c>] ret_from_fork+0x7c/0xb0
> [<ffffffff810d38e0>] ? kthread_create_on_node+0x250/0x250
> 4 locks held by kworker/u16:6/101:
> #0: ("%s""netns"){.+.+.+}, at: [<ffffffff810ccf5f>] process_one_work+0x17f/0x850
> #1: (net_cleanup_work){+.+.+.}, at: [<ffffffff810ccf5f>] process_one_work+0x17f/0x850
> #2: (net_mutex){+.+.+.}, at: [<ffffffff8170851c>] cleanup_net+0x8c/0x1f0
> #3: (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff8112a675>] _rcu_barrier+0x35/0x200
> INFO: task modprobe:1139 blocked for more than 120 seconds.
> Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> modprobe D ffff880213ac1a40 13112 1139 1138 0x00000080
> ffff880036ab3be8 0000000000000096 ffff880213ac1a40 00000000001d5f00
> ffff880036ab3fd8 00000000001d5f00 ffff880223264ec0 ffff880213ac1a40
> ffff880213ac1a40 ffffffff81f8fb48 0000000000000246 ffff880213ac1a40
> Call Trace:
> [<ffffffff8185e831>] schedule_preempt_disabled+0x31/0x80
> [<ffffffff81860083>] mutex_lock_nested+0x183/0x440
> [<ffffffff817083af>] ? register_pernet_subsys+0x1f/0x50
> [<ffffffff817083af>] ? register_pernet_subsys+0x1f/0x50
> [<ffffffffa06f3000>] ? 0xffffffffa06f3000
> [<ffffffff817083af>] register_pernet_subsys+0x1f/0x50
> [<ffffffffa06f3048>] br_init+0x48/0xd3 [bridge]
> [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> [<ffffffff81153c52>] load_module+0x20c2/0x2870
> [<ffffffff8114ec30>] ? store_uevent+0x70/0x70
> [<ffffffff8110ac76>] ? lock_release_non_nested+0x3c6/0x3d0
> [<ffffffff811544e7>] SyS_init_module+0xe7/0x140
> [<ffffffff81865329>] system_call_fastpath+0x12/0x17
> 1 lock held by modprobe/1139:
> #0: (net_mutex){+.+.+.}, at: [<ffffffff817083af>]
> register_pernet_subsys+0x1f/0x50
> INFO: task modprobe:1209 blocked for more than 120 seconds.
> Not tainted 3.18.0-0.rc1.git2.3.fc22.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> modprobe D ffff8800c5324ec0 13368 1209 1151 0x00000080
> ffff88020d14bbe8 0000000000000096 ffff8800c5324ec0 00000000001d5f00
> ffff88020d14bfd8 00000000001d5f00 ffff880223280000 ffff8800c5324ec0
> ffff8800c5324ec0 ffffffff81f8fb48 0000000000000246 ffff8800c5324ec0
> Call Trace:
> [<ffffffff8185e831>] schedule_preempt_disabled+0x31/0x80
> [<ffffffff81860083>] mutex_lock_nested+0x183/0x440
> [<ffffffff817083fd>] ? register_pernet_device+0x1d/0x70
> [<ffffffff817083fd>] ? register_pernet_device+0x1d/0x70
> [<ffffffffa070f000>] ? 0xffffffffa070f000
> [<ffffffff817083fd>] register_pernet_device+0x1d/0x70
> [<ffffffffa070f020>] ppp_init+0x20/0x1000 [ppp_generic]
> [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> [<ffffffff81153c52>] load_module+0x20c2/0x2870
> [<ffffffff8114ec30>] ? store_uevent+0x70/0x70
> [<ffffffff8110ac76>] ? lock_release_non_nested+0x3c6/0x3d0
> [<ffffffff811544e7>] SyS_init_module+0xe7/0x140
> [<ffffffff81865329>] system_call_fastpath+0x12/0x17
> 1 lock held by modprobe/1209:
> #0: (net_mutex){+.+.+.}, at: [<ffffffff817083fd>] register_pernet_device+0x1d/0x70
>
>
> > Thanx, Paul
> >
> > ---------------------------------------------------------------------
> > ---
> >
> > rcu: More on deadlock between CPU hotplug and expedited grace periods
> >
> > Commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and
> > expedited grace periods) was incomplete. Although it did eliminate
> > deadlocks involving synchronize_sched_expedited()'s acquisition of
> > cpu_hotplug.lock via get_online_cpus(), it did nothing about the
> > similar
> > deadlock involving acquisition of this same lock via
> > put_online_cpus().
> > This deadlock became apparent with testing involving hibernation.
> >
> > This commit therefore changes put_online_cpus() acquisition of this
> > lock
> > to be conditional, and increments a new cpu_hotplug.puts_pending
> > field
> > in case of acquisition failure. Then cpu_hotplug_begin() checks for
> > this
> > new field being non-zero, and applies any changes to
> > cpu_hotplug.refcount.
> >
> > Reported-by: Jiri Kosina <jkosina@suse.cz>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Tested-by: Jiri Kosina <jkosina@suse.cz>
> >
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 356450f09c1f..90a3d017b90c 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -64,6 +64,8 @@ static struct {
> > * an ongoing cpu hotplug operation.
> > */
> > int refcount;
> > + /* And allows lockless put_online_cpus(). */
> > + atomic_t puts_pending;
> >
> > #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > struct lockdep_map dep_map;
> > @@ -113,7 +115,11 @@ void put_online_cpus(void)
> > {
> > if (cpu_hotplug.active_writer == current)
> > return;
> > - mutex_lock(&cpu_hotplug.lock);
> > + if (!mutex_trylock(&cpu_hotplug.lock)) {
> > + atomic_inc(&cpu_hotplug.puts_pending);
> > + cpuhp_lock_release();
> > + return;
> > + }
> >
> > if (WARN_ON(!cpu_hotplug.refcount))
> > cpu_hotplug.refcount++; /* try to fix things up */
> > @@ -155,6 +161,12 @@ void cpu_hotplug_begin(void)
> > cpuhp_lock_acquire();
> > for (;;) {
> > mutex_lock(&cpu_hotplug.lock);
> > + if (atomic_read(&cpu_hotplug.puts_pending)) {
> > + int delta;
> > +
> > + delta = atomic_xchg(&cpu_hotplug.puts_pending, 0);
> > + cpu_hotplug.refcount -= delta;
> > + }
> > if (likely(!cpu_hotplug.refcount))
> > break;
> > __set_current_state(TASK_UNINTERRUPTIBLE);
> >
> >
>
^ permalink raw reply
* Re: [PATCH net-next v2 2/6] ethernet: wiznet: remove unnecessary check
From: Sergei Shtylyov @ 2014-10-23 12:23 UTC (permalink / raw)
To: Varka Bhadram, netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-3-git-send-email-varkab@cdac.in>
Hello.
On 10/23/2014 5:58 AM, Varka Bhadram wrote:
> devm_ioremap_resource checks platform_get_resource() return value.
> We can remove the duplicate check here.
> Signed-off-by: Varka Bhadram <varkab@cdac.in>
> ---
> drivers/net/ethernet/wiznet/w5300.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
Don't do 2 patches with the same subject. Either merge them, or include
the driver name (which is different in both cases) in the subject.
WBR, Sergei
^ permalink raw reply
* Re: [PATCH 1/3] xen-netback: make feature-rx-notify mandatory
From: David Vrabel @ 2014-10-23 11:52 UTC (permalink / raw)
To: Wei Liu; +Cc: netdev, xen-devel, Ian Campbell
In-Reply-To: <20141023114453.GH9188@zion.uk.xensource.com>
On 23/10/14 12:44, Wei Liu wrote:
> On Thu, Oct 23, 2014 at 12:37:54PM +0100, David Vrabel wrote:
>> On 23/10/14 12:32, Wei Liu wrote:
>>> On Thu, Oct 23, 2014 at 12:16:29PM +0100, Wei Liu wrote:
>>>> On Wed, Oct 22, 2014 at 02:08:53PM +0100, David Vrabel wrote:
>>>>> Frontends that do not provide feature-rx-notify may stall because
>>>>> netback depends on the notification from frontend to wake the guest Rx
>>>>> thread (even if can_queue is false).
>>>>>
>>>>> This could be fixed but feature-rx-notify was introduced in 2006 and I
>>>>> am not aware of any frontends that do not implement this.
>>>>>
>>>>> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
>>>>
>>>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>>>
>>> While I can understand this patch by itself, can you elaborate a little
>>> bit on how it affects later patches? Because what I'm thinking is that
>>> this patch is not for stable while other two should go to stable.
>>
>> >From the cover letter:
>>
>> "The first patch is a prerequite. Removing support for frontends with
>> feature-rx-notify makes it easier to reason about the correctness of
>> netback since it no longer has to support this outdated and broken
>> mode."
>>
>
> I saw that.
>
> I think you should make it a little bit clearer.
I'm not sure how I can make this any clearer. Perhaps you should wander
over to my desk to discuss this in person?
> The queue is not guaranteed to stop if we keep this feature.
>
>> The other patches do not meet the stable kernel requirements (they're
>> too long one thing).
>>
>
> Does length matter? I surely had written long patch for stable.
>From Documentation/stable_kernel_rules.txt:
"- It cannot be bigger than 100 lines, with context."
David
^ permalink raw reply
* Re: [PATCH 3/3] xen-netback: reintroduce guest Rx stall detection
From: Wei Liu @ 2014-10-23 11:49 UTC (permalink / raw)
To: David Vrabel; +Cc: netdev, xen-devel, Ian Campbell, Wei Liu
In-Reply-To: <1413983335-8307-4-git-send-email-david.vrabel@citrix.com>
On Wed, Oct 22, 2014 at 02:08:55PM +0100, David Vrabel wrote:
> If a frontend not receiving packets it is useful to detect this and
> turn off the carrier so packets are dropped early instead of being
> queued and drained when they expire.
>
> A to-guest queue is stalled if it doesn't have enough free slots for a
> an extended period of time (default 60 s).
>
> If at least one queue is stalled, the carrier is turned off (in the
> expectation that the other queues will soon stall as well). The
> carrier is only turned on once all queues are ready.
>
> When the frontend connects, all the queues start in the stalled state
> and only become ready once the frontend queues enough Rx requests.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox