* [PATCH] stmmac: fix memory barriers
From: Pavel Machek @ 2016-12-18 20:38 UTC (permalink / raw)
To: LinoSanfilippo, peppe.cavallaro, alexandre.torgue, davem,
linux-kernel, netdev, niklas.cassel, Joao.Pinto
Fix up memory barriers in stmmac driver. They are meant to protect
against DMA engine, so smp_ variants are certainly wrong, and dma_
variants are preferable.
Signed-off-by: Pavel Machek <pavel@denx.de>
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index a340fc8..8816515 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -334,7 +334,7 @@ static void dwmac4_rd_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
* descriptors for the same frame has to be set before, to
* avoid race condition.
*/
- wmb();
+ dma_wmb();
p->des3 = cpu_to_le32(tdes3);
}
@@ -377,7 +377,7 @@ static void dwmac4_rd_prepare_tso_tx_desc(struct dma_desc *p, int is_fs,
* descriptors for the same frame has to be set before, to
* avoid race condition.
*/
- wmb();
+ dma_wmb();
p->des3 = cpu_to_le32(tdes3);
}
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index ce97e52..f0d8632 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -350,7 +350,7 @@ static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
* descriptors for the same frame has to be set before, to
* avoid race condition.
*/
- wmb();
+ dma_wmb();
p->des0 = cpu_to_le32(tdes0);
}
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3e40578..bb40382 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2125,7 +2125,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
* descriptor and then barrier is needed to make sure that
* all is coherent before granting the DMA engine.
*/
- smp_wmb();
+ dma_wmb();
if (netif_msg_pktdata(priv)) {
pr_info("%s: curr=%d dirty=%d f=%d, e=%d, f_p=%p, nfrags %d\n",
@@ -2338,7 +2338,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
* descriptor and then barrier is needed to make sure that
* all is coherent before granting the DMA engine.
*/
- smp_wmb();
+ dma_wmb();
}
netdev_sent_queue(dev, skb->len);
@@ -2443,14 +2443,14 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
netif_dbg(priv, rx_status, priv->dev,
"refill entry #%d\n", entry);
}
- wmb();
+ dma_wmb();
if (unlikely(priv->synopsys_id >= DWMAC_CORE_4_00))
priv->hw->desc->init_rx_desc(p, priv->use_riwt, 0, 0);
else
priv->hw->desc->set_rx_owner(p);
- wmb();
+ dma_wmb();
entry = STMMAC_GET_ENTRY(entry, DMA_RX_SIZE);
}
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply related
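The ordering rule behind this patch (fill in every descriptor field first, then hand the descriptor to the DMA engine by setting the OWN bit) can be sketched in userspace C. This is only a model under stated assumptions: a C11 release fence stands in for dma_wmb(), and struct model_desc, MODEL_DESC_OWN and both helper functions are hypothetical names, not the stmmac definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout; not the real struct dma_desc. */
struct model_desc {
	uint32_t buf_addr;	/* buffer address field */
	uint32_t len;		/* buffer length field */
	_Atomic uint32_t ctrl;	/* control word carrying the OWN bit */
};

#define MODEL_DESC_OWN 0x80000000u	/* assumed position of the OWN bit */

/* Producer side: plays the role of the driver preparing a TX descriptor. */
static void model_prepare_tx_desc(struct model_desc *d, uint32_t addr,
				  uint32_t len)
{
	assert(d != NULL);
	d->buf_addr = addr;
	d->len = len;
	/* Stand-in for dma_wmb(): earlier stores become visible first. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->ctrl, MODEL_DESC_OWN, memory_order_relaxed);
}

/* Consumer side: plays the role of the DMA engine polling the OWN bit. */
static int model_device_fetch(struct model_desc *d, uint32_t *addr,
			      uint32_t *len)
{
	assert(d != NULL && addr != NULL && len != NULL);
	if (!(atomic_load_explicit(&d->ctrl, memory_order_relaxed) &
	      MODEL_DESC_OWN))
		return 0;	/* descriptor still owned by the producer */
	atomic_thread_fence(memory_order_acquire);
	*addr = d->buf_addr;
	*len = d->len;
	return 1;
}
```

The fence sits between the field writes and the OWN store for the same reason dma_wmb() sits before the final des0/des3 write in the hunks above. The analogy is loose: C11 fences order accesses between CPU threads, while dma_wmb() orders CPU writes to coherent memory as observed by a device.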
* Re: [PATCHv2 1/5] sh_eth: add generic wake-on-lan support via magic packet
From: Sergei Shtylyov @ 2016-12-18 20:26 UTC (permalink / raw)
To: Niklas Söderlund, Simon Horman, netdev, linux-renesas-soc
Cc: Geert Uytterhoeven
In-Reply-To: <20161212160931.6478-2-niklas.soderlund+renesas@ragnatech.se>
Hello.
On 12/12/2016 07:09 PM, Niklas Söderlund wrote:
> Add generic functionality to support Wake-on-Lan using MagicPacket which
> are supported by at least a few versions of sh_eth. Only add
> functionality for WoL, no specific sh_eth version are marked to support
Versions.
> WoL yet.
>
> WoL is enabled in the suspend callback by setting MagicPacket detection
> and disabling all interrupts except MagicPacket. In the resume path the
> driver needs to reset the hardware to rearm the WoL logic, this prevents
> the driver from simply restoring the registers and to take advantage of
> that sh_eth was not suspended to reduce resume time. To reset the
> hardware the driver close and reopens the device just like it would do
Closes.
> in a normal suspend/resume scenario without WoL enabled, but it both
> close and open the device in the resume callback since the device needs
Closes and opens.
> to be open for WoL to work.
> One quirk needed for WoL is that the module clock needs to be prevented
> from being switched off by Runtime PM. To keep the clock alive the
I tried to find the code in question and failed, getting muddled in the
RPM maze. Could you point at this code for my education? :-)
> suspend callback need to call clk_enable() directly to increase the
My main concern is why we need to manipulate the clock directly --
can't you call RPM to achieve the same effect?
> usage count of the clock. Then when Runtime PM decreases the clock usage
> count it won't reach 0 and be switched off.
You mean it does this even though we don't call pm_runtime_put_sync()
as done in sh_eth_close()?
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
[...]
MBR, Sergei
^ permalink raw reply
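The counting argument in the quoted commit message (an extra clk_enable() keeps the usage count above zero when Runtime PM drops its reference) can be modelled in a few lines of userspace C. All names here are hypothetical; this is not the clk framework, and it says nothing about whether going through Runtime PM would be the better route, which is the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock with an enable (usage) count. */
struct model_clk {
	int enable_count;
};

/* Models clk_enable(): take one more reference on the clock. */
static void model_clk_enable(struct model_clk *c)
{
	assert(c != NULL);
	c->enable_count++;
}

/* Models clk_disable(): drop one reference; the clock stops at zero. */
static void model_clk_disable(struct model_clk *c)
{
	assert(c != NULL && c->enable_count > 0);
	c->enable_count--;
}

/* The clock keeps running while at least one reference is held. */
static bool model_clk_is_running(const struct model_clk *c)
{
	assert(c != NULL);
	return c->enable_count > 0;
}
```

Starting from a count of 1 (device open), an extra model_clk_enable() in the suspend path means the single model_clk_disable() issued on behalf of Runtime PM leaves the count at 1, so the clock stays on for WoL.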
* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Pavel Machek @ 2016-12-18 20:16 UTC (permalink / raw)
To: Lino Sanfilippo
Cc: Francois Romieu, bh74.an, ks.giri, vipul.pandya, peppe.cavallaro,
alexandre.torgue, davem, linux-kernel, netdev
In-Reply-To: <a2cbcfcd-f377-565c-a21c-3daa3abce519@gmx.de>
Hi!
> > For the same reason it's broken if it races with the transmit path: it
> > can release driver resources while the transmit path uses these.
> >
> > Btw the points below may not matter/hurt much for a proof of concept
> > but they would need to be addressed as well:
> > 1) unchecked (and avoidable) extra error paths due to stmmac_release()
> > 2) racy cancel_work_sync. Low probability as it is, an irq + error could
> > take place right after cancel_work_sync
>
> It was indeed only meant as a proof of concept. Nevertheless the race is not
> good, since one can run into it when faking the tx error for testing purposes.
> So below is a slightly improved version of the restart handling.
> It's not meant as a final version either. But maybe we can use it as a starting
> point.
Certainly works better than the version we currently have in tree. I'm
running it in a loop, and it survived 10 minutes of testing so
far. (Previous version killed the hardware at first iteration.)
> Again the patch is only compile tested.
Tested-by: Pavel Machek <pavel@denx.de>
Thanks!
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply
* Re: regression: ath_tx_edma_tasklet() Illegal idle entry in RCU read-side critical section
From: Paul E. McKenney @ 2016-12-18 20:14 UTC (permalink / raw)
To: Valo, Kalle
Cc: Tobias Klausmann, Gabriel C, lkml, ath9k-devel,
linux-wireless@vger.kernel.org, ath9k-devel@lists.ath9k.org,
netdev@vger.kernel.org, nbd@nbd.name
In-Reply-To: <87pokpc8ng.fsf@qca.qualcomm.com>
On Sun, Dec 18, 2016 at 07:57:42PM +0000, Valo, Kalle wrote:
> Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> writes:
>
> > A patch for this is already floating on the ML for a while now latest:
> > (ath9k: do not return early to fix rcu unlocking)
>
> It's here:
>
> https://patchwork.kernel.org/patch/9472709/
Feel free to add:
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Thanx, Paul
> > Hopefully Kalle will include it in one of his upcoming pull requests.
>
> Yes, I'll try to get it to 4.10-rc2.
>
> --
> Kalle Valo
^ permalink raw reply
* Re: wl1251 & mac address & calibration data
From: Arend Van Spriel @ 2016-12-18 20:08 UTC (permalink / raw)
To: Pali Rohár
Cc: Daniel Wagner, Luis R. Rodriguez, Tom Gundersen, Johannes Berg,
Ming Lei, Mimi Zohar, Bjorn Andersson, Rafał Miłecki,
Kalle Valo, Sebastian Reichel, Pavel Machek, Michal Kazior,
Ivaylo Dimitrov, Aaro Koskinen, Tony Lindgren, linux-wireless,
Network Development, linux-kernel@vger.kernel.org
In-Reply-To: <201612181309.01298@pali>
On 18-12-2016 13:09, Pali Rohár wrote:
> On Sunday 18 December 2016 12:54:00 Arend Van Spriel wrote:
>> On 18-12-2016 12:04, Pali Rohár wrote:
>>> On Sunday 18 December 2016 11:49:53 Arend Van Spriel wrote:
>>>> On 16-12-2016 11:40, Pali Rohár wrote:
>>>>> On Friday 16 December 2016 08:25:44 Daniel Wagner wrote:
>>>>>> On 12/16/2016 03:03 AM, Luis R. Rodriguez wrote:
>>>>>>> For the new API a solution for "fallback mechanisms" should be
>>>>>>> clean though and I am looking to stay as far as possible from
>>>>>>> the existing mess. A solution to help both the old API and new
>>>>>>> API is possible for the "fallback mechanism" though -- but for
>>>>>>> that I can only refer you at this point to some of Daniel
>>>>>>> Wagner and Tom Gundersen's firmwared daemon prospect. It
>>>>>>> should help pave the way for a clean solution and help address
>>>>>>> other stupid issues.
>>>>>>
>>>>>> The firmwared project is hosted here
>>>>>>
>>>>>> https://github.com/teg/firmwared
>>>>>>
>>>>>> As Luis pointed out, firmwared relies on
>>>>>> FW_LOADER_USER_HELPER_FALLBACK, which is not enabled by default.
>>>>>
>>>>> I know. But it does not mean that I cannot enable this option at
>>>>> kernel compile time.
>>>>>
>>>>> Bigger problem is that currently request_firmware() first tries to
>>>>> load firmware directly from VFS and after that (if that fails)
>>>>> falls back to the user helper.
>>>>>
>>>>> So I would need to extend kernel firmware code with new function
>>>>> (or flag) to not use VFS and try only user mode helper.
>>>>
>>>> Why do you need the user-mode helper anyway. This is all static
>>>> data, right?
>>>
>>> Those are static data, but device specific!
>>
>> So what?
>>
>>>> So why not cook up a firmware file in user-space once and put
>>>> it in /lib/firmware for the driver to request directly.
>>>
>>> 1. Violates FHS
>>
>> How?
>>
>>> 2. Does not work for readonly /, readonly /lib, readonly
>>> /lib/firmware
>>
>> Que?
>>
>>> 3. Backup & restore of rootfs between same devices does not work
>>> (as rootfs now contains device specific data).
>>
>> True.
>>
>>> 4. Sharing one rootfs (either via nfs or other technology) does not
>>> work for more devices (even in state when rootfs is used only by
>>> one device at one time).
>>
>> Indeed.
>>
>>> And it is common that N900 developers have rootfs in laptop and via
>>> usb (cdc_ether) exports it over nfs to N900 device and boot
>>> system. It basically break booting from one nfs-exported rootfs,
>>> as that export become model specific...
>>
>> These are all your choices and more a logistic issue. If your take is
>> that udev is the way to solve those, fine by me.
>>
>>>> Seems a bit
>>>> overkill to have a {e,}udev or whatever daemon running if the
>>>> result is always the same. Just my 2 cents.
>>>
>>> No it is not. It will break couple of other things in Linux and
>>> device
>>
>> Now I am curious. What "couple of other things" will be broken.
>>
>>> and model specific calibration data should not be in /lib/firmware!
>>> That directory is used for firmware files, not calibration.
>>
>> What is "firmware"? Really. These are binary blobs required to make
>> the device work. And guess what, your device needs calibration data.
>> Why make the distinction.
>>
>> Regards,
>> Arend
>
> File wl1251-nvs.bin is provided by linux-firmware package and contains
> default data which should be overriden by model specific calibrated
> data.
Ah. Someone thought it was a good idea to provide the "one ring to rule
them all". Nice.
> But overwriting that one file is not possible as the next update of
> linux-firmware package will overwrite it back. It breaks any normal usage
> of package management.
>
> Also it is ridiculously broken by design if some "boot" files needs to
> be overwritten to initialize hardware properly. To not break booting you
> need to overwrite that file before first boot. But without booting
> device you cannot read calibration data. So some hack with autoreboot
> after boot is needed. And how to detect that we have real overwritten
> calibration data and not default one from linux-firmware? Any heuristic
> or checks will be broken here. And no, nothing like you need to reboot
> your device now (and similar concept) from windows world is not
> accepted.
Well. After reading and creating calibration data you could just rebind
the driver to the device to have it probed again. But yeah, the default
one from linux-firmware should never have been there in the first place.
> "firmware" is one for chip. Any N900 device with wl1251 chip needs
> exactly same firmware "wl1251-fw.bin". But every N900 needs different
> calibration data which is not firmware.
Ok. This is exactly why Luis is giving the new API different name just
calling it "data".
Regards,
Arend
^ permalink raw reply
* Re: regression: ath_tx_edma_tasklet() Illegal idle entry in RCU read-side critical section
From: Tobias Klausmann @ 2016-12-18 20:04 UTC (permalink / raw)
To: Valo, Kalle
Cc: paulmck@linux.vnet.ibm.com, Gabriel C, lkml, ath9k-devel,
linux-wireless@vger.kernel.org, ath9k-devel@lists.ath9k.org,
netdev@vger.kernel.org, nbd@nbd.name
In-Reply-To: <87pokpc8ng.fsf@qca.qualcomm.com>
On 18.12.2016 20:57, Valo, Kalle wrote:
> Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> writes:
>
>> A patch for this is already floating on the ML for a while now latest:
>> (ath9k: do not return early to fix rcu unlocking)
> It's here:
>
> https://patchwork.kernel.org/patch/9472709/
Good to know!
>
>> Hopefully Kalle will include it in one of his upcoming pull requests.
> Yes, I'll try to get it to 4.10-rc2.
Thanks for the update!
^ permalink raw reply
* Re: regression: ath_tx_edma_tasklet() Illegal idle entry in RCU read-side critical section
From: Valo, Kalle @ 2016-12-18 19:57 UTC (permalink / raw)
To: Tobias Klausmann
Cc: Gabriel C, netdev@vger.kernel.org, linux-wireless@vger.kernel.org,
ath9k-devel, lkml, ath9k-devel@lists.ath9k.org,
paulmck@linux.vnet.ibm.com, nbd@nbd.name
In-Reply-To: <58b67d5b-0275-f80f-479f-78cf748b4319@mni.thm.de>
Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> writes:
> A patch for this is already floating on the ML for a while now latest:
> (ath9k: do not return early to fix rcu unlocking)
It's here:
https://patchwork.kernel.org/patch/9472709/
> Hopefully Kalle will include it in one of his upcoming pull requests.
Yes, I'll try to get it to 4.10-rc2.
--
Kalle Valo
^ permalink raw reply
* [PATCH net v2] ipvlan: fix crash when master is set in loopback mode
From: Mahesh Bandewar @ 2016-12-18 19:00 UTC (permalink / raw)
To: netdev, Eric Dumazet, David Miller; +Cc: Mahesh Bandewar
From: Mahesh Bandewar <maheshb@google.com>
In an IPvlan setup when master is set in loopback mode e.g.
ethtool -K eth0 set loopback on
where eth0 is master device for IPvlan setup.
The failure actually happens while processing multicast packets
but that's a result of unconditionally queueing packets without
ensuring ether-header is part of the linear part of skb.
This patch forces this check at the reception and drops packets
which fail this check before queuing them.
------------[ cut here ]------------
kernel BUG at include/linux/skbuff.h:1737!
Call Trace:
[<ffffffff921fbbc2>] dev_forward_skb+0x92/0xd0
[<ffffffffc031ac65>] ipvlan_process_multicast+0x395/0x4c0 [ipvlan]
[<ffffffffc031a9a7>] ? ipvlan_process_multicast+0xd7/0x4c0 [ipvlan]
[<ffffffff91cdfea7>] ? process_one_work+0x147/0x660
[<ffffffff91cdff09>] process_one_work+0x1a9/0x660
[<ffffffff91cdfea7>] ? process_one_work+0x147/0x660
[<ffffffff91ce086d>] worker_thread+0x11d/0x360
[<ffffffff91ce0750>] ? rescuer_thread+0x350/0x350
[<ffffffff91ce960b>] kthread+0xdb/0xe0
[<ffffffff91c05c70>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff91ce9530>] ? flush_kthread_worker+0xc0/0xc0
[<ffffffff92348b7a>] ret_from_fork+0x9a/0xd0
[<ffffffff91ce9530>] ? flush_kthread_worker+0xc0/0xc0
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
---
v1->v2: commit log update
drivers/net/ipvlan/ipvlan_core.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index b4e990743e1d..4294fc1f5564 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -660,6 +660,9 @@ rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb)
if (!port)
return RX_HANDLER_PASS;
+ if (unlikely(!pskb_may_pull(skb, sizeof(struct ethhdr))))
+ goto out;
+
switch (port->mode) {
case IPVLAN_MODE_L2:
return ipvlan_handle_mode_l2(pskb, port);
@@ -672,6 +675,8 @@ rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb)
/* Should not reach here */
WARN_ONCE(true, "ipvlan_handle_frame() called for mode = [%hx]\n",
port->mode);
+
+out:
kfree_skb(skb);
return RX_HANDLER_CONSUMED;
}
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related
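The check added by this patch can be illustrated with a small userspace model: before anything reads the Ethernet header, make sure that many bytes are present in the linear area, and drop the packet otherwise. struct model_skb, model_may_pull() and model_handle_frame() are hypothetical names. The model only tracks lengths, whereas the real pskb_may_pull() actually moves data from fragments into the linear area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ETH_HLEN 14	/* size of an Ethernet header in bytes */

/* Hypothetical stand-in for the two sk_buff lengths that matter here. */
struct model_skb {
	size_t len;	/* total bytes in the packet */
	size_t headlen;	/* bytes available in the linear area */
};

enum model_result { MODEL_CONSUMED_DROPPED, MODEL_HANDLED };

/* Returns true if 'need' bytes are (or can be made) linear. */
static bool model_may_pull(struct model_skb *skb, size_t need)
{
	assert(skb != NULL);
	if (need <= skb->headlen)
		return true;	/* already linear */
	if (need > skb->len)
		return false;	/* packet is too short altogether */
	skb->headlen = need;	/* model of pulling from the fragments */
	return true;
}

/* Mirrors the shape of the patched handler: check first, then dispatch. */
static enum model_result model_handle_frame(struct model_skb *skb)
{
	if (!model_may_pull(skb, MODEL_ETH_HLEN))
		return MODEL_CONSUMED_DROPPED;	/* the "goto out" path */
	return MODEL_HANDLED;
}
```

A packet with an empty linear area but enough total data passes after the model pull, while a packet shorter than an Ethernet header is dropped before it can be queued.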
* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Pavel Machek @ 2016-12-18 18:30 UTC (permalink / raw)
To: Lino Sanfilippo
Cc: Francois Romieu, bh74.an, ks.giri, vipul.pandya, peppe.cavallaro,
alexandre.torgue, davem, linux-kernel, netdev
In-Reply-To: <a2cbcfcd-f377-565c-a21c-3daa3abce519@gmx.de>
Hi!
> > - e1efa87241272104d6a12c8b9fcdc4f62634d447
>
> Yep, a sync of the dma descriptors before the hardware gets ownership of the tx tail
> idx is missing in the stmmac, too.
I can reproduce failure with 4.4 fairly easily. I tried with dma_
variant of barriers, and it failed, too:
[ 1018.410012] stmmac: early irq
[ 1023.939702] fpga_cmd_read:wait_event timed out!
[ 1033.128692] ------------[ cut here ]------------
[ 1033.133329] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:303
dev_watchdog+0x264/0x284()
[ 1033.141748] NETDEV WATCHDOG: eth0 (socfpga-dwmac): transmit queue 0
timed out
[ 1033.148861] Modules linked in:
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply
* Re: mlx4: Bug in XDP_TX + 16 rx-queues
From: Martin KaFai Lau @ 2016-12-18 18:14 UTC (permalink / raw)
To: Tariq Toukan
Cc: Saeed Mahameed, Tariq Toukan, netdev@vger.kernel.org,
Alexei Starovoitov
In-Reply-To: <fd3b36fa-8365-3873-b3e5-fff47939c330@gmail.com>
On Sun, Dec 18, 2016 at 12:31:30PM +0200, Tariq Toukan wrote:
> Hi Martin,
>
>
> On 17/12/2016 12:18 PM, Martin KaFai Lau wrote:
> >Hi All,
> >
> >I have been debugging with XDP_TX and 16 rx-queues.
> >
> >1) When 16 rx-queues is used and an XDP prog is doing XDP_TX,
> >it seems that the packet cannot be XDP_TX out if the pkt
> >is received from some particular CPUs (/rx-queues).
> Does the rx_xdp_tx_full counter increase?
The rx_xdp_tx_full counter did not increase. A capture of
ethtool -S eth0:
[root@kerneltest003.14.prn2 ~]# ethtool -S eth0 | egrep 'rx.*_xdp_tx.*:'
rx_xdp_tx: 1024
rx_xdp_tx_full: 0
rx0_xdp_tx: 64
rx0_xdp_tx_full: 0
rx1_xdp_tx: 64
rx1_xdp_tx_full: 0
rx2_xdp_tx: 64
rx2_xdp_tx_full: 0
rx3_xdp_tx: 64
rx3_xdp_tx_full: 0
rx4_xdp_tx: 64
rx4_xdp_tx_full: 0
rx5_xdp_tx: 64
rx5_xdp_tx_full: 0
rx6_xdp_tx: 64
rx6_xdp_tx_full: 0
rx7_xdp_tx: 64
rx7_xdp_tx_full: 0
rx8_xdp_tx: 64
rx8_xdp_tx_full: 0
rx9_xdp_tx: 63
rx9_xdp_tx_full: 0
rx10_xdp_tx: 65
rx10_xdp_tx_full: 0
rx11_xdp_tx: 64
rx11_xdp_tx_full: 0
rx12_xdp_tx: 64
rx12_xdp_tx_full: 0
rx13_xdp_tx: 64
rx13_xdp_tx_full: 0
rx14_xdp_tx: 64
rx14_xdp_tx_full: 0
rx15_xdp_tx: 64
rx15_xdp_tx_full: 0
> Does the problem repro if you turn off PFC?
> ethtool -A <intf> rx off tx off
Turning pause off does not help.
> >
> >2) If 8 rx-queues is used, it does not have problem.
> >
> >3) The 16 rx-queues problem also went away after reverting these
> >two patches:
> >15fca2c8eb41 net/mlx4_en: Add ethtool statistics for XDP cases
> >67f8b1dcb9ee net/mlx4_en: Refactor the XDP forwarding rings scheme
> >
> >4) I can reproduce the problem by running samples/bpf/xdp_tx_iptunnel at
> >the receiver side. The sender side sends out TCP packets with
> >source port ranging from 1 to 1024. At the sender side also, do
> >a tcpdump to capture the ip-tunnel packet reflected by xdp_tx_iptunnel.
> >With 8 rx-queues, I can get all 1024 packets back. With 16 rx-queues,
> >I can only get 512 packets back. It is a 40 CPUs machine.
> >I also checked the rx*_xdp_tx counters (from ethtool -S eth0) to ensure
> >the xdp prog has XDP_TX-ed it out.
> So all packets were transmitted (according to rx*_xdp_tx), and only half
> of them were received on the other side?
Correct. The XDP program 'samples/bpf/xdp_tx_iptunnel' received,
processed and sent out 1024 packets. The rx*_xdp_tx also showed all of the
1024 packets. However, only half of them reached the other side (as
observed in the tcpdump) when 16 rx-queues were used.
Thanks,
--Martin
^ permalink raw reply
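As a cross-check of the capture above, the per-ring rxN_xdp_tx values can be summed and compared with the aggregate rx_xdp_tx line. The sketch below hard-codes the sixteen values from the ethtool output; the function and array names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* rx0_xdp_tx .. rx15_xdp_tx, copied from the ethtool -S capture. */
static const uint64_t model_rx_xdp_tx[16] = {
	64, 64, 64, 64, 64, 64, 64, 64,
	64, 63, 65, 64, 64, 64, 64, 64,
};

/* Adds up n per-ring counters. */
static uint64_t model_sum_counters(const uint64_t *per_ring, size_t n)
{
	uint64_t total = 0;

	assert(per_ring != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += per_ring[i];
	return total;
}
```

The sum is 1024, matching rx_xdp_tx, so every ring reports its share as transmitted. The 512 packets missing at the sender are therefore lost after the point where these counters are incremented.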
* Re: [PATCH net] ipvlan: fix crash
From: Mahesh Bandewar (महेश बंडेवार) @ 2016-12-18 18:10 UTC (permalink / raw)
To: David Miller; +Cc: mahesh, linux-netdev, Eric Dumazet
In-Reply-To: <20161217.235452.1604434909844387069.davem@davemloft.net>
On Sat, Dec 17, 2016 at 8:54 PM, David Miller <davem@davemloft.net> wrote:
> From: Mahesh Bandewar <mahesh@bandewar.net>
> Date: Sat, 17 Dec 2016 18:16:19 -0800
>
>> diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
>> index b4e990743e1d..4294fc1f5564 100644
>> --- a/drivers/net/ipvlan/ipvlan_core.c
>> +++ b/drivers/net/ipvlan/ipvlan_core.c
>> @@ -660,6 +660,9 @@ rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb)
>> if (!port)
>> return RX_HANDLER_PASS;
>>
>> + if (unlikely(!pskb_may_pull(skb, sizeof(struct ethhdr))))
>> + goto out;
>> +
>> switch (port->mode) {
>
> ipvlan only allows non-loopback ethernet devices to register
> this RX handler.
>
> Such situations being tested here should therefore be completely
> impossible.
>
Yes, correct. This happens when the master device is set in loopback mode.
> Every such device must send the SKB through eth_type_trans(), which
> unconditionally accesses the ethernet header, therefore it must
> be pulled into the linear SKB area already, long before this RX
> handler is invoked.
>
> If this really can legitimately happen, you must explain how so.
>
OK, will update the commit log.
> Just showing the crash that later happens in some (completely
> unrelated BTW) ipvlan multicast workqueue handling function, is
> really an insufficient commit log message for a bug like this.
^ permalink raw reply
* [PATCH iproute2 1/1] tc: updated man page to reflect filter-id use in filter GET command.
From: Roman Mashak @ 2016-12-18 17:25 UTC (permalink / raw)
To: stephen; +Cc: netdev, jhs, daniel, xiyou.wangcong, Roman Mashak
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
man/man8/tc.8 | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/man/man8/tc.8 b/man/man8/tc.8
index 583b72f..f96911a 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -32,7 +32,8 @@ tc \- show / manipulate traffic control settings
\fIDEV\fR
.B [ parent
\fIqdisc-id\fR
-.B | root ] protocol
+.B | root ] [ handle \fIfilter-id\fR ]
+.B protocol
\fIprotocol\fR
.B prio
\fIpriority\fR filtertype
@@ -577,7 +578,8 @@ it is created.
.TP
get
-Displays a single filter given the interface, parent ID, priority, protocol and handle ID.
+Displays a single filter given the interface \fIDEV\fR, \fIqdisc-id\fR,
+\fIpriority\fR, \fIprotocol\fR and \fIfilter-id\fR.
.TP
show
--
1.9.1
^ permalink raw reply related
* [PATCH iproute2 1/1] tc: fixed man page fonts for keywords and variable values
From: Roman Mashak @ 2016-12-18 17:25 UTC (permalink / raw)
To: stephen; +Cc: netdev, jhs, daniel, xiyou.wangcong, Roman Mashak
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
man/man8/tc.8 | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/man/man8/tc.8 b/man/man8/tc.8
index 8a47a2b..583b72f 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -5,58 +5,58 @@ tc \- show / manipulate traffic control settings
.B tc
.RI "[ " OPTIONS " ]"
.B qdisc [ add | change | replace | link | delete ] dev
-DEV
+\fIDEV\fR
.B
[ parent
-qdisc-id
+\fIqdisc-id\fR
.B | root ]
.B [ handle
-qdisc-id ] qdisc
+\fIqdisc-id\fR ] qdisc
[ qdisc specific parameters ]
.P
.B tc
.RI "[ " OPTIONS " ]"
.B class [ add | change | replace | delete ] dev
-DEV
+\fIDEV\fR
.B parent
-qdisc-id
+\fIqdisc-id\fR
.B [ classid
-class-id ] qdisc
+\fIclass-id\fR ] qdisc
[ qdisc specific parameters ]
.P
.B tc
.RI "[ " OPTIONS " ]"
.B filter [ add | change | replace | delete | get ] dev
-DEV
+\fIDEV\fR
.B [ parent
-qdisc-id
+\fIqdisc-id\fR
.B | root ] protocol
-protocol
+\fIprotocol\fR
.B prio
-priority filtertype
+\fIpriority\fR filtertype
[ filtertype specific parameters ]
.B flowid
-flow-id
+\fIflow-id\fR
.B tc
.RI "[ " OPTIONS " ]"
.RI "[ " FORMAT " ]"
.B qdisc show [ dev
-DEV
+\fIDEV\fR
.B ]
.P
.B tc
.RI "[ " OPTIONS " ]"
.RI "[ " FORMAT " ]"
.B class show dev
-DEV
+\fIDEV\fR
.P
.B tc
.RI "[ " OPTIONS " ]"
.B filter show dev
-DEV
+\fIDEV\fR
.P
.ti 8
@@ -294,14 +294,14 @@ In the absence of classful qdiscs, classless qdiscs can only be attached at
the root of a device. Full syntax:
.P
.B tc qdisc add dev
-DEV
+\fIDEV\fR
.B root
QDISC QDISC-PARAMETERS
To remove, issue
.P
.B tc qdisc del dev
-DEV
+\fIDEV\fR
.B root
The
@@ -386,7 +386,7 @@ Type of Service
Some qdiscs have built in rules for classifying packets based on the TOS field.
.TP
skb->priority
-Userspace programs can encode a class-id in the 'skb->priority' field using
+Userspace programs can encode a \fIclass-id\fR in the 'skb->priority' field using
the SO_PRIORITY option.
.P
Each node within the tree can have its own filters but higher level filters
@@ -554,7 +554,7 @@ must be passed, either by passing its ID or by attaching directly to the root of
When creating a qdisc or a filter, it can be named with the
.B handle
parameter. A class is named with the
-.B classid
+.B \fBclassid\fR
parameter.
.TP
--
1.9.1
^ permalink raw reply related
* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Pavel Machek @ 2016-12-18 17:23 UTC (permalink / raw)
To: Lino Sanfilippo
Cc: Francois Romieu, bh74.an, ks.giri, vipul.pandya, peppe.cavallaro,
alexandre.torgue, davem, linux-kernel, netdev
In-Reply-To: <a2cbcfcd-f377-565c-a21c-3daa3abce519@gmx.de>
Hi!
> > - e1efa87241272104d6a12c8b9fcdc4f62634d447
>
> Yep, a sync of the dma descriptors before the hardware gets ownership of the tx tail
> idx is missing in the stmmac, too.
Thanks for the hint. Actually, the driver uses smp_wmb() which is
completely crazy, and probably misses rmb() in clean_tx, too. Anyway,
I did not notice there are dma_ variants, too... we clearly need them.
Thanks and best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply
* Re: regression: ath_tx_edma_tasklet() Illegal idle entry in RCU read-side critical section
From: Tobias Klausmann @ 2016-12-18 16:17 UTC (permalink / raw)
To: paulmck, Gabriel C
Cc: lkml, ath9k-devel, linux-wireless, ath9k-devel, netdev, nbd,
kvalo
In-Reply-To: <20161218155938.GP3924@linux.vnet.ibm.com>
Hi,
A patch for this is already floating on the ML for a while now latest:
(ath9k: do not return early to fix rcu unlocking)
Hopefully Kalle will include it in one of his upcoming pull requests.
Greetings,
Tobias
On 18.12.2016 16:59, Paul E. McKenney wrote:
> On Sun, Dec 18, 2016 at 02:52:48PM +0100, Gabriel C wrote:
>> Hello,
>>
>> while testing kernel 4.9 I run into a weird issue with the ath9k driver.
>>
>> I can boot the box in console mode and it stay up sometime but is not usable.
> Looks to me like someone forgot an rcu_read_unlock() somewhere. Given that
> the unmatched rcu_read_lock() appears in ath_tx_edma_tasklet(), perhaps
> that is also where the missing rcu_read_unlock() is. And sure enough,
> in the middle of this function we have the following:
>
> fifo_list = &txq->txq_fifo[txq->txq_tailidx];
> if (list_empty(fifo_list)) {
> ath_txq_unlock(sc, txq);
> return;
> }
>
> This will of course return while still in an RCU read-side critical
> section. The caller cannot tell the difference between a return here
> and falling off the end of the function, so this is likely the bug.
> Or one of the bugs, anyway. Copying the author and committer for
> their thoughts.
>
> Please try the patch at the end of this email.
>
> Thanx, Paul
>
>> from dmesg :
>>
>> ===============================
>> [ INFO: suspicious RCU usage. ]
>> 4.9-fw1 #1 Tainted: G I
>> -------------------------------
>> kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!
>>
>> other info that might help us debug this:
>>
>>
>> RCU used illegally from idle CPU!
>> rcu_scheduler_active = 1, debug_locks = 1
>> RCU used illegally from extended quiescent state!
>> 1 lock held by swapper/0/0:
>> #0: (rcu_read_lock){......}, at: [<ffffffffa0ee0240>] ath_tx_edma_tasklet+0x0/0x460 [ath9k]
>>
>> stack backtrace:
>> CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 4.9-fw1 #1
>> Hardware name: FUJITSU PRIMERGY TX200 S5 /D2709, BIOS 6.00 Rev. 1.14.2709 02/04/2013
>> ffff88043ee03f38 ffffffff812cf0f3 ffffffff81a11540 0000000000000001
>> ffff88043ee03f68 ffffffff810b7865 ffffffff81a55d58 ffff88043efcedc0
>> ffff88083cb1ca00 00000000000000d1 ffff88043ee03f88 ffffffff810dbfe8
>> Call Trace:
>> <IRQ>
>> [<ffffffff812cf0f3>] dump_stack+0x86/0xc3
>> [<ffffffff810b7865>] lockdep_rcu_suspicious+0xc5/0x100
>> [<ffffffff810dbfe8>] rcu_eqs_enter_common.constprop.62+0x128/0x130
>> [<ffffffff810ddc78>] rcu_irq_exit+0x38/0x70
>> [<ffffffff81067ec4>] irq_exit+0x74/0xd0
>> [<ffffffff8101e561>] do_IRQ+0x71/0x130
>> [<ffffffff8158700c>] common_interrupt+0x8c/0x8c
>> <EOI>
>> [<ffffffff81472836>] ? cpuidle_enter_state+0x156/0x220
>> [<ffffffff81472922>] cpuidle_enter+0x12/0x20
>> [<ffffffff810ad23e>] call_cpuidle+0x1e/0x40
>> [<ffffffff810ad46d>] cpu_startup_entry+0x11d/0x210
>> [<ffffffff8157892c>] rest_init+0x12c/0x140
>> [<ffffffff81d02ec3>] start_kernel+0x40f/0x41c
>> [<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
>> [<ffffffff81d02299>] x86_64_start_reservations+0x2a/0x2c
>> [<ffffffff81d02386>] x86_64_start_kernel+0xeb/0xf8
> ------------------------------------------------------------------------
>
> commit 5a16fed76936184a7ac22e466cf39bd8bb5ee65e
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date: Sun Dec 18 07:49:00 2016 -0800
>
> drivers/ath: Add missing rcu_read_unlock() to ath_tx_edma_tasklet()
>
> Commit d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb where possible")
> added rcu_read_lock() and rcu_read_unlock() around the body of
> ath_tx_edma_tasklet(), but failed to add the needed rcu_read_unlock()
> before a "return" in the middle of this function. This commit therefore
> adds the missing rcu_read_unlock().
>
> Reported-by: Gabriel C <nix.or.die@gmail.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Felix Fietkau <nbd@nbd.name>
> Cc: Kalle Valo <kvalo@qca.qualcomm.com>
> Cc: QCA ath9k Development <ath9k-devel@qca.qualcomm.com>
> Cc: <linux-wireless@vger.kernel.org>
> Cc: <ath9k-devel@lists.ath9k.org>
>
> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
> index 52bfbb988611..857d5ae09a1d 100644
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -2787,6 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
> fifo_list = &txq->txq_fifo[txq->txq_tailidx];
> if (list_empty(fifo_list)) {
> ath_txq_unlock(sc, txq);
> + rcu_read_unlock();
> return;
> }
>
>
* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-18 16:15 UTC (permalink / raw)
To: Francois Romieu, Pavel Machek
Cc: bh74.an, ks.giri, vipul.pandya, peppe.cavallaro, alexandre.torgue,
davem, linux-kernel, netdev
In-Reply-To: <20161218001507.GA5343@electric-eye.fr.zoreil.com>
Hi,
On 18.12.2016 01:15, Francois Romieu wrote:
> Pavel Machek <pavel@ucw.cz> :
> [...]
>> Won't this up/down the interface, in a way userspace can observe?
>
> It won't up/down the interface as it doesn't exactly mimic what the
> network code does (there's more than rtnl_lock).
>
Right. Userspace won't see link down/up, but it will see carrier off/on.
But this is AFAIK acceptable for a rare situation like a tx error.
> For the same reason it's broken if it races with the transmit path: it
> can release driver resources while the transmit path uses these.
>
> Btw the points below may not matter/hurt much for a proof of concept
> but they would need to be addressed as well:
> 1) unchecked (and avoidable) extra error paths due to stmmac_release()
> 2) racy cancel_work_sync. Low probability as it is, an irq + error could
> take place right after cancel_work_sync
It was indeed only meant as a proof of concept. Nevertheless the race is not
good, since one can run into it when faking the tx error for testing purposes.
So below is a slightly improved version of the restart handling.
It's not meant as a final version either. But maybe we can use it as a starting
point.
> Lino, have you considered via-rhine.c since its "move work from irq to
> workqueue context" changes that started in
> 7ab87ff4c770eed71e3777936299292739fcd0fe [*] ?
> It's a shameless plug - originated in r8169.c - but it should be rather
> close to what the sxgbe and friends require. Thought / opinion ?
>
Not really. There are a few drivers that I usually look into if I want to know
how certain things are done correctly (e.g. the sky2 or the intel drivers) because
I think they are well implemented.
But you seem to have put some thought into various race condition problems
in the via-rhine driver and I can imagine that sxgbe and stmmac still have some
of these issues, too.
> [*] Including fixes/changes in:
> - 3a5a883a8a663b930908cae4abe5ec913b9b2fd2
Ok, the issues you fixed here are concerning the tx_curr and tx_dirty
pointers. For now this is not needed in stmmac and sxgbe since the
tx completion handlers in both drivers are not lock-free like in
the via-rhine.c but are synchronized with xmit by means of the xmit_lock.
> - e1efa87241272104d6a12c8b9fcdc4f62634d447
Yep, a sync of the dma descriptors before the hardware gets ownership of the tx tail
idx is missing in the stmmac, too.
> - 810f19bcb862f8889b27e0c9d9eceac9593925dd
> - e45af497950a89459a0c4b13ffd91e1729fffef4
> - a926592f5e4e900f3fa903298c4619a131e60963
I think we should use netif_tx_disable() instead of netif_stop_queue(), too, in
case of restart to avoid a possible schedule of the xmit function while we restart.
So this is also part of the new patch.
Again the patch is only compile tested.
Regards,
Lino
---
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 95 +++++++++++++++--------
2 files changed, 63 insertions(+), 33 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index eab04ae..9c240d7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -131,6 +131,7 @@ struct stmmac_priv {
u32 rx_tail_addr;
u32 tx_tail_addr;
u32 mss;
+ struct work_struct tx_err_work;
#ifdef CONFIG_DEBUG_FS
struct dentry *dbgfs_dir;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3e40578..5762750 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1403,37 +1403,6 @@ static inline void stmmac_disable_dma_irq(struct stmmac_priv *priv)
}
/**
- * stmmac_tx_err - to manage the tx error
- * @priv: driver private structure
- * Description: it cleans the descriptors and restarts the transmission
- * in case of transmission errors.
- */
-static void stmmac_tx_err(struct stmmac_priv *priv)
-{
- int i;
- netif_stop_queue(priv->dev);
-
- priv->hw->dma->stop_tx(priv->ioaddr);
- dma_free_tx_skbufs(priv);
- for (i = 0; i < DMA_TX_SIZE; i++)
- if (priv->extend_desc)
- priv->hw->desc->init_tx_desc(&priv->dma_etx[i].basic,
- priv->mode,
- (i == DMA_TX_SIZE - 1));
- else
- priv->hw->desc->init_tx_desc(&priv->dma_tx[i],
- priv->mode,
- (i == DMA_TX_SIZE - 1));
- priv->dirty_tx = 0;
- priv->cur_tx = 0;
- netdev_reset_queue(priv->dev);
- priv->hw->dma->start_tx(priv->ioaddr);
-
- priv->dev->stats.tx_errors++;
- netif_wake_queue(priv->dev);
-}
-
-/**
* stmmac_dma_interrupt - DMA ISR
* @priv: driver private structure
* Description: this is the DMA ISR. It is called by the main ISR.
@@ -1466,7 +1435,7 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
priv->xstats.threshold = tc;
}
} else if (unlikely(status == tx_hard_error))
- stmmac_tx_err(priv);
+ schedule_work(&priv->tx_err_work);
}
/**
@@ -1902,6 +1871,7 @@ static int stmmac_release(struct net_device *dev)
if (priv->lpi_irq > 0)
free_irq(priv->lpi_irq, dev);
+ cancel_work_sync(&priv->tx_err_work);
/* Stop TX/RX DMA and clear the descriptors */
priv->hw->dma->stop_tx(priv->ioaddr);
priv->hw->dma->stop_rx(priv->ioaddr);
@@ -1920,9 +1890,67 @@ static int stmmac_release(struct net_device *dev)
stmmac_release_ptp(priv);
+
return 0;
}
+static void stmmac_shutdown(struct net_device *dev)
+{
+ struct stmmac_priv *priv = netdev_priv(dev);
+
+ /* make sure xmit is not scheduled any more */
+ netif_tx_disable(dev);
+
+ if (priv->eee_enabled)
+ del_timer_sync(&priv->eee_ctrl_timer);
+
+ /* Stop and disconnect the PHY */
+ if (dev->phydev) {
+ phy_stop(dev->phydev);
+ phy_disconnect(dev->phydev);
+ }
+
+ napi_disable(&priv->napi);
+
+ del_timer_sync(&priv->txtimer);
+
+ /* Free the IRQ lines */
+ free_irq(dev->irq, dev);
+ if (priv->wol_irq != dev->irq)
+ free_irq(priv->wol_irq, dev);
+ if (priv->lpi_irq > 0)
+ free_irq(priv->lpi_irq, dev);
+
+ /* Stop TX/RX DMA and clear the descriptors */
+ priv->hw->dma->stop_tx(priv->ioaddr);
+ priv->hw->dma->stop_rx(priv->ioaddr);
+
+ /* Release and free the Rx/Tx resources */
+ free_dma_desc_resources(priv);
+
+ /* Disable the MAC Rx/Tx */
+ stmmac_set_mac(priv->ioaddr, false);
+
+ netif_carrier_off(dev);
+
+#ifdef CONFIG_DEBUG_FS
+ stmmac_exit_fs(dev);
+#endif
+
+ stmmac_release_ptp(priv);
+}
+
+static void stmmac_tx_err_work(struct work_struct *work)
+{
+ struct stmmac_priv *priv = container_of(work, struct stmmac_priv,
+ tx_err_work);
+ /* restart netdev */
+ rtnl_lock();
+ stmmac_shutdown(priv->dev);
+ stmmac_open(priv->dev);
+ rtnl_unlock();
+}
+
/**
* stmmac_tso_allocator - close entry point of the driver
* @priv: driver private structure
@@ -2688,7 +2716,7 @@ static void stmmac_tx_timeout(struct net_device *dev)
struct stmmac_priv *priv = netdev_priv(dev);
/* Clear Tx resources and restart transmitting again */
- stmmac_tx_err(priv);
+ schedule_work(&priv->tx_err_work);
}
/**
@@ -3338,6 +3366,7 @@ int stmmac_dvr_probe(struct device *device,
netif_napi_add(ndev, &priv->napi, stmmac_poll, 64);
spin_lock_init(&priv->lock);
+ INIT_WORK(&priv->tx_err_work, stmmac_tx_err_work);
ret = register_netdev(ndev);
if (ret) {
--
1.9.1
* Re: [PATCH net-next v4 2/2] net: stmmac: dwmac-meson8b: make the RGMII TX delay configurable
From: Martin Blumenstingl @ 2016-12-18 16:13 UTC (permalink / raw)
To: David Miller
Cc: netdev, devicetree, linux-amlogic, robh+dt, mark.rutland, carlo,
khilman, peppe.cavallaro, alexandre.torgue, linux-arm-kernel
In-Reply-To: <20161218.104950.1013829528388480468.davem@davemloft.net>
On Sun, Dec 18, 2016 at 4:49 PM, David Miller <davem@davemloft.net> wrote:
> From: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Date: Sat, 17 Dec 2016 19:21:19 +0100
>
>> Prior to this patch we were using a hardcoded RGMII TX clock delay of
>> 2ns (= 1/4 cycle of the 125MHz RGMII TX clock). This value works for
>> many boards, but unfortunately not for all (due to the way the actual
>> circuit is designed, sometimes because the TX delay is enabled in the
>> PHY, etc.). Making the TX delay on the MAC side configurable allows us
>> to support all possible hardware combinations.
>>
>> This allows fixing a compatibility issue on some boards, where the
>> RTL8211F PHY is configured to generate the TX delay. We can now turn
>> off the TX delay in the MAC, because otherwise we would be applying the
>> delay twice (which results in non-working TX traffic).
>>
>> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>> Tested-by: Neil Armstrong <narmstrong@baylibre.com>
>
> Is this really the safest thing to do?
>
> If you say the existing hard-coded setting of 1/4 cycle works on most
> boards, and what you're trying to do is override it with an OF
> property value for boards where the existing setting does not work,
> then you _must_ use a default value that corresponds to what the
> existing code does now when you don't see this new OF property.
it's a bit more complicated in reality: 1/4 cycle works when the TX
delay of the RTL8211F PHY is turned off (until recently it was always
enabled for phy-mode RGMII).
> So please retain the current behavior of the 1/4 cycle TX delay
> setting when you don't see the amlogic,tx-delay-ns property.
>
> I really think you risk breaking existing boards by not doing so,
> unless you can have this patch tested on every such board that exists
> and I don't think you really can feasibly and rigorously do that.
there's a patch in my follow-up series which adds the 2ns to the .dts
for all RGMII based boards: [0] (and I would keep these even if we had
a default value, just to make it explicit and thus easier to
understand for other people).
however, we can add the 2ns default back (I can do this if you want -
Rob Herring was unhappy with the missing documentation of this default
value [1] - so note to myself: take care of that as well). but then we
have to decide when to apply this default value: only when we're in
RGMII mode or also in any of the RGMII_*ID modes?
please let me know how we should proceed
Regards,
Martin
[0] http://lists.infradead.org/pipermail/linux-amlogic/2016-December/001838.html
[1] http://lists.infradead.org/pipermail/linux-amlogic/2016-November/001817.html
* Re: regression: ath_tx_edma_tasklet() Illegal idle entry in RCU read-side critical section
From: Paul E. McKenney @ 2016-12-18 15:59 UTC (permalink / raw)
To: Gabriel C
Cc: lkml, ath9k-devel, linux-wireless, ath9k-devel, netdev, nbd,
kvalo
In-Reply-To: <23a2a3ab-974a-ed26-6afa-aafab9bb972e@gmail.com>
On Sun, Dec 18, 2016 at 02:52:48PM +0100, Gabriel C wrote:
> Hello,
>
> while testing kernel 4.9 I ran into a weird issue with the ath9k driver.
>
> I can boot the box in console mode and it stays up for some time, but it is not usable.
Looks to me like someone forgot an rcu_read_unlock() somewhere. Given that
the unmatched rcu_read_lock() appears in ath_tx_edma_tasklet(), perhaps
that is also where the missing rcu_read_unlock() is. And sure enough,
in the middle of this function we have the following:
fifo_list = &txq->txq_fifo[txq->txq_tailidx];
if (list_empty(fifo_list)) {
ath_txq_unlock(sc, txq);
return;
}
This will of course return while still in an RCU read-side critical
section. The caller cannot tell the difference between a return here
and falling off the end of the function, so this is likely the bug.
Or one of the bugs, anyway. Copying the author and committer for
their thoughts.
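The unbalanced exit path described above can be sketched in user space.
This is only a model, not the ath9k code: model_rcu_read_lock() and
model_rcu_read_unlock() are hypothetical stand-ins that merely count
nesting depth, and the two tasklet functions are invented to show the
shape of the bug and of the fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for rcu_read_lock()/rcu_read_unlock(): they only
 * track nesting depth so that an unbalanced exit path becomes visible. */
static int model_rcu_depth;

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Buggy shape: the early return skips the unlock. */
static void tasklet_buggy(bool fifo_empty)
{
	model_rcu_read_lock();
	if (fifo_empty)
		return;	/* read-side critical section is left open */
	/* ... process completed frames ... */
	model_rcu_read_unlock();
}

/* Fixed shape: every exit path drops the read-side lock. */
static void tasklet_fixed(bool fifo_empty)
{
	model_rcu_read_lock();
	if (fifo_empty) {
		model_rcu_read_unlock();
		return;
	}
	/* ... process completed frames ... */
	model_rcu_read_unlock();
}
```

Calling tasklet_buggy(true) leaves model_rcu_depth at 1, which is the
user-space analogue of going idle while still inside a read-side
critical section; tasklet_fixed() returns with depth 0 on both paths.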
Please try the patch at the end of this email.
Thanx, Paul
> from dmesg :
>
> ===============================
> [ INFO: suspicious RCU usage. ]
> 4.9-fw1 #1 Tainted: G I
> -------------------------------
> kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!
>
> other info that might help us debug this:
>
>
> RCU used illegally from idle CPU!
> rcu_scheduler_active = 1, debug_locks = 1
> RCU used illegally from extended quiescent state!
> 1 lock held by swapper/0/0:
> #0: (rcu_read_lock){......}, at: [<ffffffffa0ee0240>] ath_tx_edma_tasklet+0x0/0x460 [ath9k]
>
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 4.9-fw1 #1
> Hardware name: FUJITSU PRIMERGY TX200 S5 /D2709, BIOS 6.00 Rev. 1.14.2709 02/04/2013
> ffff88043ee03f38 ffffffff812cf0f3 ffffffff81a11540 0000000000000001
> ffff88043ee03f68 ffffffff810b7865 ffffffff81a55d58 ffff88043efcedc0
> ffff88083cb1ca00 00000000000000d1 ffff88043ee03f88 ffffffff810dbfe8
> Call Trace:
> <IRQ>
> [<ffffffff812cf0f3>] dump_stack+0x86/0xc3
> [<ffffffff810b7865>] lockdep_rcu_suspicious+0xc5/0x100
> [<ffffffff810dbfe8>] rcu_eqs_enter_common.constprop.62+0x128/0x130
> [<ffffffff810ddc78>] rcu_irq_exit+0x38/0x70
> [<ffffffff81067ec4>] irq_exit+0x74/0xd0
> [<ffffffff8101e561>] do_IRQ+0x71/0x130
> [<ffffffff8158700c>] common_interrupt+0x8c/0x8c
> <EOI>
> [<ffffffff81472836>] ? cpuidle_enter_state+0x156/0x220
> [<ffffffff81472922>] cpuidle_enter+0x12/0x20
> [<ffffffff810ad23e>] call_cpuidle+0x1e/0x40
> [<ffffffff810ad46d>] cpu_startup_entry+0x11d/0x210
> [<ffffffff8157892c>] rest_init+0x12c/0x140
> [<ffffffff81d02ec3>] start_kernel+0x40f/0x41c
> [<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
> [<ffffffff81d02299>] x86_64_start_reservations+0x2a/0x2c
> [<ffffffff81d02386>] x86_64_start_kernel+0xeb/0xf8
------------------------------------------------------------------------
commit 5a16fed76936184a7ac22e466cf39bd8bb5ee65e
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Sun Dec 18 07:49:00 2016 -0800
drivers/ath: Add missing rcu_read_unlock() to ath_tx_edma_tasklet()
Commit d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb where possible")
added rcu_read_lock() and rcu_read_unlock() around the body of
ath_tx_edma_tasklet(), but failed to add the needed rcu_read_unlock()
before a "return" in the middle of this function. This commit therefore
adds the missing rcu_read_unlock().
Reported-by: Gabriel C <nix.or.die@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Cc: QCA ath9k Development <ath9k-devel@qca.qualcomm.com>
Cc: <linux-wireless@vger.kernel.org>
Cc: <ath9k-devel@lists.ath9k.org>
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 52bfbb988611..857d5ae09a1d 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -2787,6 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
fifo_list = &txq->txq_fifo[txq->txq_tailidx];
if (list_empty(fifo_list)) {
ath_txq_unlock(sc, txq);
+ rcu_read_unlock();
return;
}
* Re: [PATCH net-next v4 2/2] net: stmmac: dwmac-meson8b: make the RGMII TX delay configurable
From: David Miller @ 2016-12-18 15:49 UTC (permalink / raw)
To: martin.blumenstingl
Cc: netdev, devicetree, linux-amlogic, robh+dt, mark.rutland, carlo,
khilman, peppe.cavallaro, alexandre.torgue, linux-arm-kernel
In-Reply-To: <20161217182119.4037-3-martin.blumenstingl@googlemail.com>
From: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Date: Sat, 17 Dec 2016 19:21:19 +0100
> Prior to this patch we were using a hardcoded RGMII TX clock delay of
> 2ns (= 1/4 cycle of the 125MHz RGMII TX clock). This value works for
> many boards, but unfortunately not for all (due to the way the actual
> circuit is designed, sometimes because the TX delay is enabled in the
> PHY, etc.). Making the TX delay on the MAC side configurable allows us
> to support all possible hardware combinations.
>
> This allows fixing a compatibility issue on some boards, where the
> RTL8211F PHY is configured to generate the TX delay. We can now turn
> off the TX delay in the MAC, because otherwise we would be applying the
> delay twice (which results in non-working TX traffic).
>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Is this really the safest thing to do?
If you say the existing hard-coded setting of 1/4 cycle works on most
boards, and what you're trying to do is override it with an OF
property value for boards where the existing setting does not work,
then you _must_ use a default value that corresponds to what the
existing code does now when you don't see this new OF property.
So please retain the current behavior of the 1/4 cycle TX delay
setting when you don't see the amlogic,tx-delay-ns property.
I really think you risk breaking existing boards by not doing so,
unless you can have this patch tested on every such board that exists
and I don't think you really can feasibly and rigorously do that.
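A minimal sketch of the fallback being asked for, assuming the property
lookup has already happened elsewhere. The helper resolve_tx_delay_ns()
and its has_prop/prop_ns parameters are hypothetical and are not part of
the dwmac-meson8b driver; only the 125MHz clock and the 1/4 cycle (2ns)
figure come from the patch description.

```c
#include <assert.h>	/* assert() is used in the usage note below */
#include <stdbool.h>
#include <stdint.h>

/* 125 MHz RGMII TX clock: period = 1000 / 125 = 8 ns, so 1/4 cycle = 2 ns. */
#define RGMII_TX_CLK_MHZ	125u
#define DEFAULT_TX_DELAY_NS	((1000u / RGMII_TX_CLK_MHZ) / 4u)

/*
 * Hypothetical helper: has_prop says whether amlogic,tx-delay-ns was
 * present, prop_ns is its value. Without the property the previous
 * hard-coded behaviour (1/4 cycle) is kept.
 */
static uint32_t resolve_tx_delay_ns(bool has_prop, uint32_t prop_ns)
{
	return has_prop ? prop_ns : DEFAULT_TX_DELAY_NS;
}
```

With this shape, assert(resolve_tx_delay_ns(false, 0) == 2) holds for
boards without the property, while a board that sets the property to 0
gets the MAC-side delay turned off.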
Thanks.
* Re: [PATCH] qed: fix memory leak of a qed_spq_entry on error failure paths
From: David Miller @ 2016-12-18 15:37 UTC (permalink / raw)
To: Yuval.Mintz; +Cc: colin.king, netdev, linux-kernel, Ariel.Elior, Tomer.Tayar
In-Reply-To: <BL2PR07MB2306151726E6A6E95702776B8D9E0@BL2PR07MB2306.namprd07.prod.outlook.com>
From: "Mintz, Yuval" <Yuval.Mintz@cavium.com>
Date: Sun, 18 Dec 2016 06:33:50 +0000
>> From: Colin Ian King <colin.king@canonical.com>
>>
>> A qed_spq_entry entry is allocated by qed_sp_init_request but is not kfree'd
>> if an error occurs, causing a memory leak. Fix this by kfree'ing it and also
>> setting *pp_ent to NULL to be safe.
>>
>> Found with static analysis by CoverityScan, CIDs 1389468-1389470
>>
>> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ...
>> +err:
>> + kfree(*pp_ent);
>> + *pp_ent = NULL;
>> +
>> + return rc;
>> }
>
> Hi Colin - thanks for this.
> It would have been preferable to return the previously allocated spq entry.
> I.e., do:
>
> +err:
> + qed_spq_return_entry(p_hwfn, *pp_ent);
> + *pp_ent = NULL;
> + return rc;
Looking at this last night, I came to the same conclusion.
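The difference between freeing an entry and returning it can be sketched
as follows, assuming the entries come from a driver-managed free list
(which is what the suggestion to use qed_spq_return_entry() implies, but
is not shown in this thread). All names below (struct pool, pool_get(),
pool_return(), init_request()) are hypothetical and are not the qed API.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

struct entry {
	struct entry *next;
};

/* Hypothetical fixed pool with an intrusive free list. */
struct pool {
	struct entry slots[POOL_SIZE];
	struct entry *free_list;
	int free_count;
};

static void pool_init(struct pool *p)
{
	p->free_list = NULL;
	p->free_count = 0;
	for (int i = 0; i < POOL_SIZE; i++) {
		p->slots[i].next = p->free_list;
		p->free_list = &p->slots[i];
		p->free_count++;
	}
}

static struct entry *pool_get(struct pool *p)
{
	struct entry *e = p->free_list;

	if (e) {
		p->free_list = e->next;
		p->free_count--;
	}
	return e;
}

static void pool_return(struct pool *p, struct entry *e)
{
	assert(e != NULL);
	e->next = p->free_list;
	p->free_list = e;
	p->free_count++;
}

/* Error path hands the entry back to the pool instead of freeing it. */
static int init_request(struct pool *p, struct entry **pp_ent, int fail)
{
	int rc = 0;

	*pp_ent = pool_get(p);
	if (!*pp_ent)
		return -1;
	if (fail) {
		rc = -22;	/* stand-in for a later setup failure */
		goto err;
	}
	return 0;
err:
	pool_return(p, *pp_ent);
	*pp_ent = NULL;
	return rc;
}
```

After a failed init_request() the pool is back at its full count and
*pp_ent is NULL, so the caller cannot reuse a stale pointer.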
* regression: ath_tx_edma_tasklet() Illegal idle entry in RCU read-side critical section
From: Gabriel C @ 2016-12-18 13:52 UTC (permalink / raw)
To: lkml; +Cc: ath9k-devel, linux-wireless, ath9k-devel, Paul E. McKenney,
netdev
Hello,
while testing kernel 4.9 I ran into a weird issue with the ath9k driver.
I can boot the box in console mode and it stays up for some time, but it is not usable.
from dmesg :
===============================
[ INFO: suspicious RCU usage. ]
4.9-fw1 #1 Tainted: G I
-------------------------------
kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
#0: (rcu_read_lock){......}, at: [<ffffffffa0ee0240>] ath_tx_edma_tasklet+0x0/0x460 [ath9k]
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 4.9-fw1 #1
Hardware name: FUJITSU PRIMERGY TX200 S5 /D2709, BIOS 6.00 Rev. 1.14.2709 02/04/2013
ffff88043ee03f38 ffffffff812cf0f3 ffffffff81a11540 0000000000000001
ffff88043ee03f68 ffffffff810b7865 ffffffff81a55d58 ffff88043efcedc0
ffff88083cb1ca00 00000000000000d1 ffff88043ee03f88 ffffffff810dbfe8
Call Trace:
<IRQ>
[<ffffffff812cf0f3>] dump_stack+0x86/0xc3
[<ffffffff810b7865>] lockdep_rcu_suspicious+0xc5/0x100
[<ffffffff810dbfe8>] rcu_eqs_enter_common.constprop.62+0x128/0x130
[<ffffffff810ddc78>] rcu_irq_exit+0x38/0x70
[<ffffffff81067ec4>] irq_exit+0x74/0xd0
[<ffffffff8101e561>] do_IRQ+0x71/0x130
[<ffffffff8158700c>] common_interrupt+0x8c/0x8c
<EOI>
[<ffffffff81472836>] ? cpuidle_enter_state+0x156/0x220
[<ffffffff81472922>] cpuidle_enter+0x12/0x20
[<ffffffff810ad23e>] call_cpuidle+0x1e/0x40
[<ffffffff810ad46d>] cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] rest_init+0x12c/0x140
[<ffffffff81d02ec3>] start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] x86_64_start_kernel+0xeb/0xf8
...
perf: interrupt took too long (2766 > 2500), lowering kernel.perf_event_max_sample_rate to 72000
perf: interrupt took too long (3510 > 3457), lowering kernel.perf_event_max_sample_rate to 56000
perf: interrupt took too long (4689 > 4387), lowering kernel.perf_event_max_sample_rate to 42000
perf: interrupt took too long (5980 > 5861), lowering kernel.perf_event_max_sample_rate to 33000
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-15): P0
(detected by 5, t=65002 jiffies, g=3241, c=3240, q=8520)
swapper/0 R running task 0 0 0 0x00000000
ffffffff81a03e90 ffffffff8139bf30 ffffffff81ae30b8 00000000810253a9
ffff88083cb1e600 ffffffff81ae30a0 0000000000000002 ffffffff81ae30b8
ffffffff81ae2fe0 ffffffff81a03ed0 ffffffff81472814 0000001823671b47
Call Trace:
[<ffffffff8139bf30>] ? acpi_idle_enter+0x116/0x1fb
[<ffffffff81472814>] ? cpuidle_enter_state+0x134/0x220
[<ffffffff81472922>] ? cpuidle_enter+0x12/0x20
[<ffffffff810ad23e>] ? call_cpuidle+0x1e/0x40
[<ffffffff810ad46d>] ? cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
swapper/0 R running task 0 0 0 0x00000000
ffffffff81a03e90 ffffffff8139bf30 ffffffff81ae30b8 00000000810253a9
ffff88083cb1e600 ffffffff81ae30a0 0000000000000002 ffffffff81ae30b8
ffffffff81ae2fe0 ffffffff81a03ed0 ffffffff81472814 0000001823671b47
Call Trace:
[<ffffffff8139bf30>] ? acpi_idle_enter+0x116/0x1fb
[<ffffffff81472814>] ? cpuidle_enter_state+0x134/0x220
[<ffffffff81472922>] ? cpuidle_enter+0x12/0x20
[<ffffffff810ad23e>] ? call_cpuidle+0x1e/0x40
[<ffffffff810ad46d>] ? cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
perf: interrupt took too long (7746 > 7475), lowering kernel.perf_event_max_sample_rate to 25000
systemd-hostnamed.service: State 'stop-sigterm' timed out. Killing.
systemd-hostnamed.service: Killing process 1507 (systemd-hostnam) with signal SIGKILL.
perf: interrupt took too long (10065 > 9682), lowering kernel.perf_event_max_sample_rate to 19000
perf: interrupt took too long (12596 > 12581), lowering kernel.perf_event_max_sample_rate to 15000
INFO: task systemd-hostnam:1507 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
systemd-hostnam D 0 1507 1 0x00000002
ffff88043a29f200 000000000000c460 ffff88043ab0a1c0 ffff88043cdc0000
ffff88043f9d6718 ffffc9000b67fb88 ffffffff8157ff6e ffffc9000b67fbf8
ffff88043ab0abc0 ffff88043f9d6718 0000000000000000 ffff88043ab0a1c0
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81584e82>] schedule_timeout+0x222/0x3a0
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff810b5de9>] ? get_lock_stats+0x19/0x50
[<ffffffff81585d17>] ? _raw_spin_unlock_irq+0x27/0x50
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81580ffa>] wait_for_common+0xca/0x180
[<ffffffff8108e150>] ? wake_up_q+0x80/0x80
[<ffffffff815810c8>] wait_for_completion+0x18/0x20
[<ffffffff810d82e5>] __wait_rcu_gp+0xc5/0x100
[<ffffffff810dbccd>] synchronize_rcu.part.53+0x2d/0x50
[<ffffffff810dc7e0>] ? __call_rcu.constprop.59+0x270/0x270
[<ffffffff810d8210>] ? rcu_panic+0x20/0x20
[<ffffffff81580f69>] ? wait_for_common+0x39/0x180
[<ffffffff810dbd17>] synchronize_rcu+0x27/0x90
[<ffffffff811fd887>] namespace_unlock+0x47/0x60
[<ffffffff81200639>] drop_collected_mounts+0x89/0x90
[<ffffffff8120246b>] ? put_mnt_ns+0x1b/0x30
[<ffffffff8120246b>] put_mnt_ns+0x1b/0x30
[<ffffffff81085798>] free_nsproxy+0x18/0xb0
[<ffffffff8108593e>] switch_task_namespaces+0x5e/0x70
[<ffffffff8108595b>] exit_task_namespaces+0xb/0x10
[<ffffffff8106652e>] do_exit+0x2de/0xb30
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81066e00>] do_group_exit+0x40/0xc0
[<ffffffff81066e8f>] SyS_exit_group+0xf/0x10
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
=============================================
systemd-hostnamed.service: Processes still around after SIGKILL. Ignoring.
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-15): P0
(detected by 9, t=260007 jiffies, g=3241, c=3240, q=12143)
swapper/0 R running task 0 0 0 0x00000000
ffffffff81a11540 000000000000001f 0000000000000007 ffffffff817d2a6f
ffffffff817a0867 ffffffffffffffcf ffffffff81472836 0000000000000010
0000000000000212 ffffffff81a03ea0 ffffffff81085cb8 ffffffff81085c70
Call Trace:
[<ffffffff8139bf30>] ? acpi_idle_enter+0x116/0x1fb
[<ffffffff81472814>] ? cpuidle_enter_state+0x134/0x220
[<ffffffff81472922>] ? cpuidle_enter+0x12/0x20
[<ffffffff810ad23e>] ? call_cpuidle+0x1e/0x40
[<ffffffff810ad46d>] ? cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
swapper/0 R running task 0 0 0 0x00000000
ffffffff81a03e90 ffffffff8139bf30 ffffffff81ae30b8 00000000810253a9
ffff88083cb1e600 ffffffff81ae30a0 0000000000000002 ffffffff81ae30b8
ffffffff81ae2fe0 ffffffff81a03ed0 ffffffff81472814 000000458b315be4
Call Trace:
[<ffffffff8139bf30>] ? acpi_idle_enter+0x116/0x1fb
[<ffffffff81472814>] ? cpuidle_enter_state+0x134/0x220
[<ffffffff81472922>] ? cpuidle_enter+0x12/0x20
[<ffffffff810ad23e>] ? call_cpuidle+0x1e/0x40
[<ffffffff810ad46d>] ? cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
systemd-hostnamed.service: State 'stop-final-sigterm' timed out. Killing.
systemd-hostnamed.service: Killing process 1507 (systemd-hostnam) with signal SIGKILL.
INFO: task systemd-hostnam:1507 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
systemd-hostnam D 0 1507 1 0x00000002
ffff88043a29f200 000000000000c460 ffff88043ab0a1c0 ffff88043cdc0000
ffff88043f9d6718 ffffc9000b67fb88 ffffffff8157ff6e ffffc9000b67fbf8
ffff88043ab0abc0 ffff88043f9d6718 0000000000000000 ffff88043ab0a1c0
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81584e82>] schedule_timeout+0x222/0x3a0
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff810b5de9>] ? get_lock_stats+0x19/0x50
[<ffffffff81585d17>] ? _raw_spin_unlock_irq+0x27/0x50
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81580ffa>] wait_for_common+0xca/0x180
[<ffffffff8108e150>] ? wake_up_q+0x80/0x80
[<ffffffff815810c8>] wait_for_completion+0x18/0x20
[<ffffffff810d82e5>] __wait_rcu_gp+0xc5/0x100
[<ffffffff810dbccd>] synchronize_rcu.part.53+0x2d/0x50
[<ffffffff810dc7e0>] ? __call_rcu.constprop.59+0x270/0x270
[<ffffffff810d8210>] ? rcu_panic+0x20/0x20
[<ffffffff81580f69>] ? wait_for_common+0x39/0x180
[<ffffffff810dbd17>] synchronize_rcu+0x27/0x90
[<ffffffff811fd887>] namespace_unlock+0x47/0x60
[<ffffffff81200639>] drop_collected_mounts+0x89/0x90
[<ffffffff8120246b>] ? put_mnt_ns+0x1b/0x30
[<ffffffff8120246b>] put_mnt_ns+0x1b/0x30
[<ffffffff81085798>] free_nsproxy+0x18/0xb0
[<ffffffff8108593e>] switch_task_namespaces+0x5e/0x70
[<ffffffff8108595b>] exit_task_namespaces+0xb/0x10
[<ffffffff8106652e>] do_exit+0x2de/0xb30
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81066e00>] do_group_exit+0x40/0xc0
[<ffffffff81066e8f>] SyS_exit_group+0xf/0x10
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
=============================================
perf: interrupt took too long (15815 > 15745), lowering kernel.perf_event_max_sample_rate to 12000
INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P0 } 67953 jiffies s: 1039 root: 0x0/T
blocking rcu_node structures:
systemd-hostnamed.service: Processes still around after final SIGKILL. Entering failed mode.
systemd-hostnamed.service: Unit entered failed state.
systemd-hostnamed.service: Failed with result 'timeout'.
Starting system activity accounting tool...
INFO: task NetworkManager:1475 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
NetworkManager D 0 1475 1 0x00000000
ffff88083a8eaf80 000000000000d4d0 ffff88083955c380 ffff8804371d4380
ffff88043f3d6718 ffffc90007f57640 ffffffff8157ff6e ffffc90007f57608
ffff88083955cd80 ffff88043f3d6718 0000000000000000 ffff88083955c380
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff810da9d4>] _synchronize_rcu_expedited+0x344/0x350
[<ffffffff810da5c0>] ? rcu_momentary_dyntick_idle+0xa0/0xa0
[<ffffffff810acd10>] ? wake_atomic_t_function+0x50/0x50
[<ffffffff810da5c0>] ? rcu_momentary_dyntick_idle+0xa0/0xa0
[<ffffffff810dacf0>] ? rcu_seq_end+0x40/0x40
[<ffffffff810daca7>] synchronize_rcu_expedited+0x17/0x20
[<ffffffff814aaf6c>] synchronize_net+0x2c/0x30
[<ffffffff814daf8c>] dev_deactivate_many+0x2cc/0x2e0
[<ffffffff814a6971>] __dev_close_many+0x71/0xe0
[<ffffffff814a6b21>] __dev_close+0x31/0x50
[<ffffffff814b16a8>] __dev_change_flags+0x98/0x160
[<ffffffff814b1794>] dev_change_flags+0x24/0x60
[<ffffffff810253a9>] ? sched_clock+0x9/0x10
[<ffffffff814c3c96>] do_setlink+0x2e6/0xcc0
[<ffffffff810b9b64>] ? __lock_acquire+0x454/0x1b00
[<ffffffff813081c1>] ? nla_parse+0x31/0x120
[<ffffffff814c6750>] rtnl_newlink+0x5c0/0x860
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff810b5de9>] ? get_lock_stats+0x19/0x50
[<ffffffff814c6a6f>] rtnetlink_rcv_msg+0x7f/0x1e0
[<ffffffff8158178a>] ? mutex_lock_nested+0x2fa/0x430
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814c69f0>] ? rtnl_newlink+0x860/0x860
[<ffffffff814e7eef>] netlink_rcv_skb+0x9f/0xc0
[<ffffffff814c3295>] rtnetlink_rcv+0x25/0x30
[<ffffffff814e7865>] netlink_unicast+0x155/0x1f0
[<ffffffff814e7cad>] netlink_sendmsg+0x2dd/0x360
[<ffffffff8148d682>] sock_sendmsg+0x12/0x20
[<ffffffff8148ddfc>] ___sys_sendmsg+0x2ac/0x2c0
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff811fb60b>] ? __fget+0x10b/0x1f0
[<ffffffff811fb500>] ? expand_files+0x2a0/0x2a0
[<ffffffff811fb730>] ? __fget_light+0x20/0x60
[<ffffffff8148ed60>] __sys_sendmsg+0x40/0x70
[<ffffffff8148ed9d>] SyS_sendmsg+0xd/0x20
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
2 locks held by nano/2071:
#0: (&tty->ldisc_sem){++++.+}, at: [<ffffffff8158549d>] ldsem_down_read+0x2d/0x40
#1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff813c4363>] n_tty_read+0xb3/0x8e0
=============================================
INFO: task systemd-hostnam:1507 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
systemd-hostnam D 0 1507 1 0x00000002
ffff88043a29f200 000000000000c460 ffff88043ab0a1c0 ffff88043cdc0000
ffff88043f9d6718 ffffc9000b67fb88 ffffffff8157ff6e ffffc9000b67fbf8
ffff88043ab0abc0 ffff88043f9d6718 0000000000000000 ffff88043ab0a1c0
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81584e82>] schedule_timeout+0x222/0x3a0
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff810b5de9>] ? get_lock_stats+0x19/0x50
[<ffffffff81585d17>] ? _raw_spin_unlock_irq+0x27/0x50
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81580ffa>] wait_for_common+0xca/0x180
[<ffffffff8108e150>] ? wake_up_q+0x80/0x80
[<ffffffff815810c8>] wait_for_completion+0x18/0x20
[<ffffffff810d82e5>] __wait_rcu_gp+0xc5/0x100
[<ffffffff810dbccd>] synchronize_rcu.part.53+0x2d/0x50
[<ffffffff810dc7e0>] ? __call_rcu.constprop.59+0x270/0x270
[<ffffffff810d8210>] ? rcu_panic+0x20/0x20
[<ffffffff81580f69>] ? wait_for_common+0x39/0x180
[<ffffffff810dbd17>] synchronize_rcu+0x27/0x90
[<ffffffff811fd887>] namespace_unlock+0x47/0x60
[<ffffffff81200639>] drop_collected_mounts+0x89/0x90
[<ffffffff8120246b>] ? put_mnt_ns+0x1b/0x30
[<ffffffff8120246b>] put_mnt_ns+0x1b/0x30
[<ffffffff81085798>] free_nsproxy+0x18/0xb0
[<ffffffff8108593e>] switch_task_namespaces+0x5e/0x70
[<ffffffff8108595b>] exit_task_namespaces+0xb/0x10
[<ffffffff8106652e>] do_exit+0x2de/0xb30
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81066e00>] do_group_exit+0x40/0xc0
[<ffffffff81066e8f>] SyS_exit_group+0xf/0x10
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
2 locks held by nano/2071:
#0: (&tty->ldisc_sem){++++.+}, at: [<ffffffff8158549d>] ldsem_down_read+0x2d/0x40
#1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff813c4363>] n_tty_read+0xb3/0x8e0
=============================================
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-15): P0
(detected by 5, t=455012 jiffies, g=3241, c=3240, q=17768)
swapper/0 R running task 0 0 0 0x00000000
ffffffff81a11540 000000000000001f 0000000000000007 ffffffff817d2a6f
ffffffff817a0867 ffffffffffffffcf ffffffff81472836 0000000000000010
0000000000000216 ffffffff81a03ea0 0000000000000018 00000072f2ea5c0d
Call Trace:
[<ffffffff81472836>] ? cpuidle_enter_state+0x156/0x220
[<ffffffff815804eb>] ? schedule+0x3b/0x90
[<ffffffff81580943>] ? schedule_preempt_disabled+0x13/0x20
[<ffffffff810ad4c9>] ? cpu_startup_entry+0x179/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
swapper/0 R running task 0 0 0 0x00000000
ffff880439c31c80 0000000000003392 ffffffff81a11540 ffff88043cd40000
ffff88043efd6718 ffffffff81a03ec8 ffffffff8157ff6e 00000000001d6700
ffffffff81a11f48 ffff88043efd6718 0000000000000000 ffffffff81a11540
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff811736c7>] ? quiet_vmstat+0x47/0x50
[<ffffffff81026654>] ? arch_cpu_idle_enter+0x24/0x30
[<ffffffff810ad46d>] ? cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
perf: interrupt took too long (19983 > 19768), lowering kernel.perf_event_max_sample_rate to 10000
......
INFO: task sd-resolve:1445 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sd-resolve D 0 1445 1 0x00000000
ffff88043a299300 00000000000078cb ffff88043c56c380 ffff88043cdc4380
ffff88043fdd6718 ffffc90008093c90 ffffffff8157ff6e ffff88043c56cff0
ffff88043c56cd80 ffff88043fdd6718 0000000000000000 ffff88043c56c380
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815815ef>] ? mutex_lock_nested+0x15f/0x430
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81580943>] schedule_preempt_disabled+0x13/0x20
[<ffffffff81581630>] mutex_lock_nested+0x1a0/0x430
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814e4870>] ? netlink_deliver_tap+0x90/0x2b0
[<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
[<ffffffff814e7865>] netlink_unicast+0x155/0x1f0
[<ffffffff814e7cad>] netlink_sendmsg+0x2dd/0x360
[<ffffffff8148d682>] sock_sendmsg+0x12/0x20
[<ffffffff8148e952>] SyS_sendto+0xf2/0x170
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff8100201a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
1 lock held by sudo/2214:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
=============================================
INFO: task NetworkManager:1475 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
NetworkManager D 0 1475 1 0x00000000
ffff88083a8eaf80 000000000000d4d0 ffff88083955c380 ffff8804371d4380
ffff88043f3d6718 ffffc90007f57640 ffffffff8157ff6e ffffc90007f57608
ffff88083955cd80 ffff88043f3d6718 0000000000000000 ffff88083955c380
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff810da9d4>] _synchronize_rcu_expedited+0x344/0x350
[<ffffffff810da5c0>] ? rcu_momentary_dyntick_idle+0xa0/0xa0
[<ffffffff810acd10>] ? wake_atomic_t_function+0x50/0x50
[<ffffffff810da5c0>] ? rcu_momentary_dyntick_idle+0xa0/0xa0
[<ffffffff810dacf0>] ? rcu_seq_end+0x40/0x40
[<ffffffff810daca7>] synchronize_rcu_expedited+0x17/0x20
[<ffffffff814aaf6c>] synchronize_net+0x2c/0x30
[<ffffffff814daf8c>] dev_deactivate_many+0x2cc/0x2e0
[<ffffffff814a6971>] __dev_close_many+0x71/0xe0
[<ffffffff814a6b21>] __dev_close+0x31/0x50
[<ffffffff814b16a8>] __dev_change_flags+0x98/0x160
[<ffffffff814b1794>] dev_change_flags+0x24/0x60
[<ffffffff810253a9>] ? sched_clock+0x9/0x10
[<ffffffff814c3c96>] do_setlink+0x2e6/0xcc0
[<ffffffff810b9b64>] ? __lock_acquire+0x454/0x1b00
[<ffffffff813081c1>] ? nla_parse+0x31/0x120
[<ffffffff814c6750>] rtnl_newlink+0x5c0/0x860
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff810b5de9>] ? get_lock_stats+0x19/0x50
[<ffffffff814c6a6f>] rtnetlink_rcv_msg+0x7f/0x1e0
[<ffffffff8158178a>] ? mutex_lock_nested+0x2fa/0x430
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814c69f0>] ? rtnl_newlink+0x860/0x860
[<ffffffff814e7eef>] netlink_rcv_skb+0x9f/0xc0
[<ffffffff814c3295>] rtnetlink_rcv+0x25/0x30
[<ffffffff814e7865>] netlink_unicast+0x155/0x1f0
[<ffffffff814e7cad>] netlink_sendmsg+0x2dd/0x360
[<ffffffff8148d682>] sock_sendmsg+0x12/0x20
[<ffffffff8148ddfc>] ___sys_sendmsg+0x2ac/0x2c0
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff811fb60b>] ? __fget+0x10b/0x1f0
[<ffffffff811fb500>] ? expand_files+0x2a0/0x2a0
[<ffffffff811fb730>] ? __fget_light+0x20/0x60
[<ffffffff8148ed60>] __sys_sendmsg+0x40/0x70
[<ffffffff8148ed9d>] SyS_sendmsg+0xd/0x20
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
1 lock held by sudo/2214:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
=============================================
INFO: task systemd-hostnam:1507 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
systemd-hostnam D 0 1507 1 0x00000002
ffff88043a29f200 000000000000c460 ffff88043ab0a1c0 ffff88043cdc0000
ffff88043f9d6718 ffffc9000b67fb88 ffffffff8157ff6e ffffc9000b67fbf8
ffff88043ab0abc0 ffff88043f9d6718 0000000000000000 ffff88043ab0a1c0
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81584e82>] schedule_timeout+0x222/0x3a0
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff810b5de9>] ? get_lock_stats+0x19/0x50
[<ffffffff81585d17>] ? _raw_spin_unlock_irq+0x27/0x50
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81580ffa>] wait_for_common+0xca/0x180
[<ffffffff8108e150>] ? wake_up_q+0x80/0x80
[<ffffffff815810c8>] wait_for_completion+0x18/0x20
[<ffffffff810d82e5>] __wait_rcu_gp+0xc5/0x100
[<ffffffff810dbccd>] synchronize_rcu.part.53+0x2d/0x50
[<ffffffff810dc7e0>] ? __call_rcu.constprop.59+0x270/0x270
[<ffffffff810d8210>] ? rcu_panic+0x20/0x20
[<ffffffff81580f69>] ? wait_for_common+0x39/0x180
[<ffffffff810dbd17>] synchronize_rcu+0x27/0x90
[<ffffffff811fd887>] namespace_unlock+0x47/0x60
[<ffffffff81200639>] drop_collected_mounts+0x89/0x90
[<ffffffff8120246b>] ? put_mnt_ns+0x1b/0x30
[<ffffffff8120246b>] put_mnt_ns+0x1b/0x30
[<ffffffff81085798>] free_nsproxy+0x18/0xb0
[<ffffffff8108593e>] switch_task_namespaces+0x5e/0x70
[<ffffffff8108595b>] exit_task_namespaces+0xb/0x10
[<ffffffff8106652e>] do_exit+0x2de/0xb30
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81066e00>] do_group_exit+0x40/0xc0
[<ffffffff81066e8f>] SyS_exit_group+0xf/0x10
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
1 lock held by sudo/2214:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
=============================================
INFO: task kworker/2:7:1630 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/2:7 D 0 1630 2 0x00000000
Workqueue: events wait_rcu_exp_gp
ffff880439c31c80 000000000000affb ffff8804371d4380 ffff88043bddc380
ffff88043f3d6718 ffffc9000c157be8 ffffffff8157ff6e ffffc9000c157bb0
ffff8804371d4d80 ffff88043f3d6718 0000000000000000 ffff8804371d4380
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81584e4b>] schedule_timeout+0x1eb/0x3a0
[<ffffffff810e1410>] ? del_timer_sync+0xd0/0xd0
[<ffffffff810acdf7>] ? prepare_to_swait+0x67/0x90
[<ffffffff810daff5>] wait_rcu_exp_gp+0x305/0xa10
[<ffffffff8107d62c>] process_one_work+0x24c/0x4d0
[<ffffffff8107d5c6>] ? process_one_work+0x1e6/0x4d0
[<ffffffff8107d8f6>] worker_thread+0x46/0x4f0
[<ffffffff8107d8b0>] ? process_one_work+0x4d0/0x4d0
[<ffffffff810840fe>] kthread+0xee/0x110
[<ffffffff81084010>] ? kthread_park+0x60/0x60
[<ffffffff815868ea>] ret_from_fork+0x2a/0x40
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
1 lock held by sudo/2214:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
=============================================
INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P0 } 264561 jiffies s: 1039 root: 0x0/T
blocking rcu_node structures:
session-c1.scope: Stopping timed out. Killing.
session-c1.scope: Killing process 2214 (sudo) with signal SIGKILL.
gpm.service: State 'stop-sigterm' timed out. Killing.
gpm.service: Killing process 1461 (gpm) with signal SIGKILL.
perf: interrupt took too long (25099 > 24978), lowering kernel.perf_event_max_sample_rate to 7000
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-15): P0
(detected by 1, t=650017 jiffies, g=3241, c=3240, q=28138)
swapper/0 R running task 0 0 0 0x00000000
ffffffff81a03e90 ffffffff8139bf30 ffffffff81ae30b8 00000000810253a9
ffff88083cb1e600 ffffffff81ae30a0 0000000000000002 ffffffff81ae30b8
ffffffff81ae2fe0 ffffffff81a03ed0 ffffffff81472814 000000a05ac4422f
Call Trace:
[<ffffffff8139bf30>] ? acpi_idle_enter+0x116/0x1fb
[<ffffffff81472814>] ? cpuidle_enter_state+0x134/0x220
[<ffffffff81472922>] ? cpuidle_enter+0x12/0x20
[<ffffffff810ad23e>] ? call_cpuidle+0x1e/0x40
[<ffffffff810ad46d>] ? cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
swapper/0 R running task 0 0 0 0x00000000
ffffffff81a03e90 ffffffff8139bf30 ffffffff81ae30b8 00000000810253a9
ffff88083cb1e600 ffffffff81ae30a0 0000000000000002 ffffffff81ae30b8
ffffffff81ae2fe0 ffffffff81a03ed0 ffffffff81472814 000000a05ac4422f
Call Trace:
[<ffffffff8139bf30>] ? acpi_idle_enter+0x116/0x1fb
[<ffffffff81472814>] ? cpuidle_enter_state+0x134/0x220
[<ffffffff81472922>] ? cpuidle_enter+0x12/0x20
[<ffffffff810ad23e>] ? call_cpuidle+0x1e/0x40
[<ffffffff810ad46d>] ? cpu_startup_entry+0x11d/0x210
[<ffffffff8157892c>] ? rest_init+0x12c/0x140
[<ffffffff81d02ec3>] ? start_kernel+0x40f/0x41c
[<ffffffff81d02120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff81d02299>] ? x86_64_start_reservations+0x2a/0x2c
[<ffffffff81d02386>] ? x86_64_start_kernel+0xeb/0xf8
session-c1.scope: Still around after SIGKILL. Ignoring.
Stopped Session c1 of user root.
session-c1.scope: Unit entered failed state.
gpm.service: Processes still around after SIGKILL. Ignoring.
Removed slice User Slice of root.
Stopping Login Service...
Stopping Permit User Sessions...
Stopped Permit User Sessions.
Stopped target Remote File Systems.
Stopped target Network.
Stopping Network Manager...
Stopping WPA supplicant...
INFO: task sd-resolve:1445 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sd-resolve D 0 1445 1 0x00000000
ffff88043a299300 00000000000078cb ffff88043c56c380 ffff88043cdc4380
ffff88043fdd6718 ffffc90008093c90 ffffffff8157ff6e ffff88043c56cff0
ffff88043c56cd80 ffff88043fdd6718 0000000000000000 ffff88043c56c380
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815815ef>] ? mutex_lock_nested+0x15f/0x430
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81580943>] schedule_preempt_disabled+0x13/0x20
[<ffffffff81581630>] mutex_lock_nested+0x1a0/0x430
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814c3286>] ? rtnetlink_rcv+0x16/0x30
[<ffffffff814e4870>] ? netlink_deliver_tap+0x90/0x2b0
[<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
[<ffffffff814e7865>] netlink_unicast+0x155/0x1f0
[<ffffffff814e7cad>] netlink_sendmsg+0x2dd/0x360
[<ffffffff8148d682>] sock_sendmsg+0x12/0x20
[<ffffffff8148e952>] SyS_sendto+0xf2/0x170
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff8100201a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
1 lock held by wpa_supplicant/1512:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
1 lock held by sudo/2214:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
=============================================
INFO: task gpm:1461 blocked for more than 120 seconds.
Tainted: G I 4.9-fw1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
gpm D 0 1461 1 0x00000002
ffff880439d59300 0000000000001482 ffff8804371bc380 ffff88083c8e8000
ffff88083efd6718 ffffc9000b523b78 ffffffff8157ff6e ffffc9000b523be8
ffff8804371bcd80 ffff88083efd6718 0000000000000000 ffff8804371bc380
Call Trace:
[<ffffffff8157ff6e>] ? __schedule+0x2ce/0x810
[<ffffffff815804eb>] schedule+0x3b/0x90
[<ffffffff81584e82>] schedule_timeout+0x222/0x3a0
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff812fe1f7>] ? debug_smp_processor_id+0x17/0x20
[<ffffffff810b5de9>] ? get_lock_stats+0x19/0x50
[<ffffffff81585d17>] ? _raw_spin_unlock_irq+0x27/0x50
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81580ffa>] wait_for_common+0xca/0x180
[<ffffffff8108e150>] ? wake_up_q+0x80/0x80
[<ffffffff815810c8>] wait_for_completion+0x18/0x20
[<ffffffff810d82e5>] __wait_rcu_gp+0xc5/0x100
[<ffffffff810dbccd>] synchronize_rcu.part.53+0x2d/0x50
[<ffffffff810dc7e0>] ? __call_rcu.constprop.59+0x270/0x270
[<ffffffff810d8210>] ? rcu_panic+0x20/0x20
[<ffffffff81580f69>] ? wait_for_common+0x39/0x180
[<ffffffff810dbd17>] synchronize_rcu+0x27/0x90
[<ffffffff81459c5e>] mousedev_release+0x4e/0x70
[<ffffffff811dc51a>] __fput+0xba/0x200
[<ffffffff811dc699>] ____fput+0x9/0x10
[<ffffffff81082680>] task_work_run+0x80/0xb0
[<ffffffff81066533>] do_exit+0x2e3/0xb30
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
[<ffffffff810b92cf>] ? trace_hardirqs_on_caller+0xef/0x200
[<ffffffff81066e00>] do_group_exit+0x40/0xc0
[<ffffffff81066e8f>] SyS_exit_group+0xf/0x10
[<ffffffff81586681>] entry_SYSCALL_64_fastpath+0x1f/0xc2
[<ffffffff812fe213>] ? __this_cpu_preempt_check+0x13/0x20
Showing all locks held in the system:
2 locks held by khungtaskd/108:
#0: (rcu_read_lock){......}, at: [<ffffffff811269af>] watchdog+0x9f/0x490
#1: (tasklist_lock){.+.+..}, at: [<ffffffff810b760d>] debug_show_all_locks+0x3d/0x1a0
1 lock held by sd-resolve/1445:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by NetworkManager/1475:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
#1: (rcu_preempt_state.exp_mutex){+.+...}, at: [<ffffffff810da7d9>] _synchronize_rcu_expedited+0x149/0x350
1 lock held by wpa_supplicant/1512:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
2 locks held by kworker/2:7/1630:
#0: ("events"){.+.+.+}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
#1: ((&rew.rew_work)){+.+...}, at: [<ffffffff8107d5c6>] process_one_work+0x1e6/0x4d0
1 lock held by sudo/2214:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c3286>] rtnetlink_rcv+0x16/0x30
=============================================
The full log can be found here:
http://ftp.frugalware.org/pub/other/people/crazy/journalctl-4.9-log
lspci -vv for the card:
02:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)
Subsystem: Qualcomm Atheros AR93xx Wireless Network Adapter
Physical Slot: 6
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 25
Region 0: Memory at b0220000 (64-bit, non-prefetchable) [size=128K]
[virtual] Expansion ROM at b0200000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 25.000W
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <64us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [140 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [300 v1] Device Serial Number 00-00-00-00-00-00-00-00
Kernel driver in use: ath9k
Kernel modules: ath9k
Also, when the ath9k driver is disabled or blacklisted, everything seems normal.
Please let me know if you need more information.
Best Regards,
Gabrile C
^ permalink raw reply
* Re: [PATCH net-next v3 0/4] Fix OdroidC2 Gigabit Tx link issue
From: Martin Blumenstingl @ 2016-12-18 13:37 UTC (permalink / raw)
To: Florian Fainelli, jbrunet
Cc: David Miller, netdev, devicetree, carlo, khilman, peppe.cavallaro,
alexandre.torgue, neolynx, andrew, narmstrong, linux-amlogic,
linux-arm-kernel, linux-kernel
In-Reply-To: <162595c2-d403-0070-3399-de03c1653065@gmail.com>
Hi Florian, Hi Jerome,
On Wed, Nov 30, 2016 at 2:15 AM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 11/29/2016 05:13 PM, David Miller wrote:
>> From: Florian Fainelli <f.fainelli@gmail.com>
>> Date: Tue, 29 Nov 2016 16:43:20 -0800
>>
>>> On 11/29/2016 04:38 PM, David Miller wrote:
>>>> From: Jerome Brunet <jbrunet@baylibre.com>
>>>> Date: Mon, 28 Nov 2016 10:46:45 +0100
>>>>
>>>>> This patchset fixes an issue with the OdroidC2 board (DWMAC + RTL8211F).
>>>>> The platform seems to enter LPI on the Rx path too often while performing
>>>>> relatively high TX transfers. This eventually breaks the link (both Tx and
>>>>> Rx), and requires bringing the interface down and up again to get the Rx
>>>>> path working again.
>>>>>
>>>>> The root cause of this issue is not fully understood yet but disabling EEE
>>>>> advertisement on the PHY prevents this feature from being negotiated.
>>>>> With this change, the link is stable and reliable, with the expected
>>>>> throughput performance.
>>>>>
>>>>> The patchset adds options in the generic phy driver to disable EEE
>>>>> advertisement, through device tree. The way it is done is very similar
>>>>> to the handling of the max-speed property.
>>>>
>>>> Patches 1-3 applied to net-next, thanks.
>>>
>>> Meh, there was a v4 submitted shortly after, and I objected to the whole
>>> idea of using that kind of Device Tree properties to disable EEE, we can
>>> send reverts though..
>>
>> Sorry, I lost this in all the discussion, I can revert.
>
> Yeah, I can understand why, these freaking PHYs tend to generate a lot
> of noise and discussion...
>
>>
>> Just send me a revert of the entire merge commit
>> a152c91889556df17ca6d8ea134fb2cb4ac9f893 with a short
>> description of why and I'll apply it.
>
> OK, I will talk with Jerome first to make sure that we are in agreement
> with the solution to deploy to fix the OdroidC2 problem first.
Simply because I'm curious: what was the outcome of your discussion?
Can we stay with the current solution, or are any changes required?
Regards,
Martin
^ permalink raw reply
* Re: [PATCH iproute2 2/2] tc/m_tunnel_key: Add dest UDP port to tunnel key action
From: Hadar Hen Zion @ 2016-12-18 7:41 UTC (permalink / raw)
To: Simon Horman; +Cc: Stephen Hemminger, netdev, Or Gerlitz, Roi Dayan, Amir Vadai
In-Reply-To: <20161215135312.GB7104@penelope.horms.nl>
On 12/15/2016 3:53 PM, Simon Horman wrote:
> On Thu, Dec 15, 2016 at 02:03:36PM +0100, Simon Horman wrote:
>> On Tue, Dec 13, 2016 at 10:07:47AM +0200, Hadar Hen Zion wrote:
>>> Enhance tunnel key action parameters by adding destination UDP port.
>>>
>>> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
>>> Reviewed-by: Roi Dayan <roid@mellanox.com>
>> Hi,
>>
>> this looks good to me but could you also update tc/m_tunnel_key.c:usage(); ?
> It seems that I was a bit hasty here as I now see that Stephen has
> indicated that he has applied this series. I also notice that
> patch 1/2 of this series also misses updating usage(). Let me know
> if sending some follow-up patches is the best way forwards.
Yes, you are right, I'll send follow-up patches.
Thanks,
Hadar
^ permalink raw reply
* Re: wl1251 & mac address & calibration data
From: Pali Rohár @ 2016-12-18 12:09 UTC (permalink / raw)
To: Arend Van Spriel
Cc: Daniel Wagner, Luis R. Rodriguez, Tom Gundersen, Johannes Berg,
Ming Lei, Mimi Zohar, Bjorn Andersson, Rafał Miłecki,
Kalle Valo, Sebastian Reichel, Pavel Machek, Michal Kazior,
Ivaylo Dimitrov, Aaro Koskinen, Tony Lindgren, linux-wireless,
Network Development, linux-kernel@vger.kernel.org
In-Reply-To: <83b2e9a4-f990-68a8-241e-375e46448d47@broadcom.com>
[-- Attachment #1: Type: Text/Plain, Size: 4450 bytes --]
On Sunday 18 December 2016 12:54:00 Arend Van Spriel wrote:
> On 18-12-2016 12:04, Pali Rohár wrote:
> > On Sunday 18 December 2016 11:49:53 Arend Van Spriel wrote:
> >> On 16-12-2016 11:40, Pali Rohár wrote:
> >>> On Friday 16 December 2016 08:25:44 Daniel Wagner wrote:
> >>>> On 12/16/2016 03:03 AM, Luis R. Rodriguez wrote:
> >>>>> For the new API a solution for "fallback mechanisms" should be
> >>>>> clean though and I am looking to stay as far as possible from
> >>>>> the existing mess. A solution to help both the old API and new
> >>>>> API is possible for the "fallback mechanism" though -- but for
> >>>>> that I can only refer you at this point to some of Daniel
> >>>>> Wagner and Tom Gundersen's firmwared daemon prospect. It
> >>>>> should help pave the way for a clean solution and help address
> >>>>> other stupid issues.
> >>>>
> >>>> The firmwared project is hosted here
> >>>>
> >>>> https://github.com/teg/firmwared
> >>>>
> >>>> As Luis pointed out, firmwared relies on
> >>>> FW_LOADER_USER_HELPER_FALLBACK, which is not enabled by default.
> >>>
> >>> I know. But it does not mean that I cannot enable this option at
> >>> kernel compile time.
> >>>
> >>> The bigger problem is that request_firmware() currently first
> >>> tries to load firmware directly from VFS and only after that (if
> >>> it fails) falls back to the user helper.
> >>>
> >>> So I would need to extend kernel firmware code with new function
> >>> (or flag) to not use VFS and try only user mode helper.
> >>
> >> Why do you need the user-mode helper anyway. This is all static
> >> data, right?
> >
> > Those are static data, but device specific!
>
> So what?
>
> >> So why not cook up a firmware file in user-space once and put
> >> it in /lib/firmware for the driver to request directly.
> >
> > 1. Violates FHS
>
> How?
>
> > 2. Does not work for readonly /, readonly /lib, readonly
> > /lib/firmware
>
> Que?
>
> > 3. Backup & restore of rootfs between same devices does not work
> > (as rootfs now contains device specific data).
>
> True.
>
> > 4. Sharing one rootfs (either via nfs or other technology) does not
> > work for more devices (even in state when rootfs is used only by
> > one device at one time).
>
> Indeed.
>
> > And it is common that N900 developers have rootfs in laptop and via
> > usb (cdc_ether) exports it over nfs to N900 device and boot
> > system. It basically break booting from one nfs-exported rootfs,
> > as that export become model specific...
>
> These are all your choices and more a logistics issue. If your take
> is that udev is the way to solve those, fine by me.
>
> >> Seems a bit
> >> overkill to have a {e,}udev or whatever daemon running if the
> >> result is always the same. Just my 2 cents.
> >
> > No it is not. It will break couple of other things in Linux and
> > device
>
> Now I am curious. What "couple of other things" will be broken.
>
> > and model specific calibration data should not be in /lib/firmware!
> > That directory is used for firmware files, not calibration.
>
> What is "firmware"? Really. These are binary blobs required to make
> the device work. And guess what, your device needs calibration data.
> Why make the distinction.
>
> Regards,
> Arend
The file wl1251-nvs.bin is provided by the linux-firmware package and
contains default data, which should be overridden by model-specific
calibrated data.
But overwriting that one file is not an option, as the next update of
the linux-firmware package will overwrite it again. That breaks any
normal use of package management.
It is also ridiculously broken by design if some "boot" files need to
be overwritten to initialize the hardware properly. To not break
booting, you need to overwrite that file before the first boot. But
without booting the device you cannot read the calibration data, so
some hack with an automatic reboot after boot is needed. And how do we
detect that we have real, overwritten calibration data and not the
default data from linux-firmware? Any heuristic or check will be broken
here. And no, anything like "you need to reboot your device now" (and
similar concepts) from the Windows world is not acceptable.
"Firmware" is per chip. Every N900 device with a wl1251 chip needs
exactly the same firmware, "wl1251-fw.bin". But every N900 needs
different calibration data, which is not firmware.
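To illustrate the lookup order discussed above, here is a rough
userspace C model (not kernel code) of a loader that tries the direct
filesystem load first and consults the user-mode helper only when that
fails. The skip_direct parameter stands in for the "new function (or
flag)" mentioned earlier in the thread. The names fw_env, fw_source and
model_request are invented for this sketch and are not part of the
kernel firmware API.

```c
/* Userspace model only; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fw_source { FW_NONE, FW_DIRECT, FW_USER_HELPER };

struct fw_env {
	bool direct_has_file;  /* a file exists in the firmware search path */
	bool helper_enabled;   /* models FW_LOADER_USER_HELPER_FALLBACK=y */
	bool helper_has_data;  /* user-space helper can supply the data */
};

/*
 * Order as described in the thread: direct load first, helper only on
 * failure. skip_direct models the proposed "helper only" behaviour.
 */
static enum fw_source model_request(const struct fw_env *env, bool skip_direct)
{
	assert(env != NULL);

	if (!skip_direct && env->direct_has_file)
		return FW_DIRECT;
	if (env->helper_enabled && env->helper_has_data)
		return FW_USER_HELPER;
	return FW_NONE;
}
```

With the default wl1251-nvs.bin present on disk, model_request(&env,
false) yields FW_DIRECT, so a helper holding the device-specific data
is never asked; in this model only skip_direct = true reaches it.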
--
Pali Rohár
pali.rohar@gmail.com
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: wl1251 & mac address & calibration data
From: Arend Van Spriel @ 2016-12-18 11:54 UTC (permalink / raw)
To: Pali Rohár
Cc: Daniel Wagner, Luis R. Rodriguez, Tom Gundersen, Johannes Berg,
Ming Lei, Mimi Zohar, Bjorn Andersson, Rafał Miłecki,
Kalle Valo, Sebastian Reichel, Pavel Machek, Michal Kazior,
Ivaylo Dimitrov, Aaro Koskinen, Tony Lindgren, linux-wireless,
Network Development, linux-kernel@vger.kernel.org
In-Reply-To: <201612181204.52928@pali>
On 18-12-2016 12:04, Pali Rohár wrote:
> On Sunday 18 December 2016 11:49:53 Arend Van Spriel wrote:
>> On 16-12-2016 11:40, Pali Rohár wrote:
>>> On Friday 16 December 2016 08:25:44 Daniel Wagner wrote:
>>>> On 12/16/2016 03:03 AM, Luis R. Rodriguez wrote:
>>>>> For the new API a solution for "fallback mechanisms" should be
>>>>> clean though and I am looking to stay as far as possible from the
>>>>> existing mess. A solution to help both the old API and new API is
>>>>> possible for the "fallback mechanism" though -- but for that I
>>>>> can only refer you at this point to some of Daniel Wagner and
>>>>> Tom Gundersen's firmwared daemon prospect. It should help pave
>>>>> the way for a clean solution and help address other stupid
>>>>> issues.
>>>>
>>>> The firmwared project is hosted here
>>>>
>>>> https://github.com/teg/firmwared
>>>>
>>>> As Luis pointed out, firmwared relies on
>>>> FW_LOADER_USER_HELPER_FALLBACK, which is not enabled by default.
>>>
>>> I know. But it does not mean that I cannot enable this option at
>>> kernel compile time.
>>>
>>> The bigger problem is that request_firmware() currently first tries
>>> to load firmware directly from VFS and only after that (if it fails)
>>> falls back to the user helper.
>>>
>>> So I would need to extend kernel firmware code with new function
>>> (or flag) to not use VFS and try only user mode helper.
>>
>> Why do you need the user-mode helper anyway. This is all static data,
>> right?
>
> Those are static data, but device specific!
So what?
>> So why not cook up a firmware file in user-space once and put
>> it in /lib/firmware for the driver to request directly.
>
> 1. Violates FHS
How?
> 2. Does not work for readonly /, readonly /lib, readonly /lib/firmware
Que?
> 3. Backup & restore of rootfs between same devices does not work (as
> rootfs now contains device specific data).
True.
> 4. Sharing one rootfs (either via nfs or other technology) does not work
> for more devices (even in state when rootfs is used only by one device
> at one time).
Indeed.
> And it is common that N900 developers have rootfs in laptop and via usb
> (cdc_ether) exports it over nfs to N900 device and boot system. It
> basically break booting from one nfs-exported rootfs, as that export
> become model specific...
These are all your choices and more a logistics issue. If your take is
that udev is the way to solve those, fine by me.
>> Seems a bit
>> overkill to have a {e,}udev or whatever daemon running if the result
>> is always the same. Just my 2 cents.
>
> No it is not. It will break couple of other things in Linux and device
Now I am curious. What "couple of other things" will be broken?
> and model specific calibration data should not be in /lib/firmware! That
> directory is used for firmware files, not calibration.
What is "firmware"? Really. These are binary blobs required to make the
device work. And guess what, your device needs calibration data. Why
make the distinction?
Regards,
Arend