Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH RFC net-next 7/7] netdevsim: Add simple FIB resource controller via devlink
From: Jakub Kicinski @ 2018-03-25  6:35 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, roopa, shm, jiri, idosch, David Ahern
In-Reply-To: <1b5db863-7949-e4a2-7f22-6dd79ecb8089@cumulusnetworks.com>

On Sat, 24 Mar 2018 09:02:45 -0600, David Ahern wrote:
> >> diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
> >> index 09388c06171d..449b2a1a1800 100644
> >> --- a/drivers/net/netdevsim/Makefile
> >> +++ b/drivers/net/netdevsim/Makefile
> >> @@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
> >>  netdevsim-objs += \
> >>  	bpf.o
> >>  endif
> >> +
> >> +ifneq ($(CONFIG_NET_DEVLINK),)  
> > 
> > Hm.  Don't you need MAY_USE_DEVLINK dependency perhaps?  
> 
> mlxsw uses CONFIG_NET_DEVLINK in its Makefile.
> 
> MAY_USE_DEVLINK seems to only be used in Kconfig files. Not clear to me
> why it is needed at all.

NETDEVSIM=y && DEVLINK=m

^ permalink raw reply

* Re: syzbot rcu/debugobjects warning
From: Joel Fernandes @ 2018-03-25  6:29 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paul McKenney, LKML, Todd Poynor,
	open list:BPF (Safe dynamic programs and tools), Ben Hutchings,
	Greg Kroah-Hartman, Guillaume Nault
In-Reply-To: <alpine.DEB.2.21.1803232133450.1481@nanos.tec.linutronix.de>

On Fri, Mar 23, 2018 at 1:41 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 23 Mar 2018, Joel Fernandes wrote:
>> On Fri, Mar 23, 2018 at 2:11 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> > On Thu, 22 Mar 2018, Joel Fernandes wrote:
>> Sorry. Here is the raw crash log: https://pastebin.com/raw/puvh0cXE
>> (The kernel logs are toward the end with the above).
>
> And that is interesting:
>
> [  150.629667]  <IRQ> [  150.631700]  [<ffffffff81d96069>] dump_stack+0xc1/0x128
> [  150.637051]  [<ffffffff81dfefb6>] ? __debug_object_init+0x526/0xc40
> [  150.643431]  [<ffffffff8142fbd1>] panic+0x1bc/0x3a8
> [  150.648416]  [<ffffffff8142fa15>] ? percpu_up_read_preempt_enable.constprop.53+0xd7/0xd7
> [  150.656611]  [<ffffffff81430835>] ? load_image_and_restore+0xf9/0xf9
> [  150.663070]  [<ffffffff81269efd>] ? vprintk_default+0x1d/0x30
> [  150.668925]  [<ffffffff81131879>] ? __warn+0x1a9/0x1e0
> [  150.674170]  [<ffffffff81dfefb6>] ? __debug_object_init+0x526/0xc40
> [  150.680543]  [<ffffffff81131894>] __warn+0x1c4/0x1e0
> [  150.685614]  [<ffffffff81131afc>] warn_slowpath_null+0x2c/0x40
> [  150.691972]  [<ffffffff81dfefb6>] __debug_object_init+0x526/0xc40
> [  150.698174]  [<ffffffff81dfea90>] ? debug_object_fixup+0x30/0x30
> [  150.704283]  [<ffffffff81dff709>] debug_object_init_on_stack+0x19/0x20
> [  150.710917]  [<ffffffff81287a93>] __wait_rcu_gp+0x93/0x1b0
> [  150.716508]  [<ffffffff81290251>] synchronize_rcu.part.65+0x101/0x110
> [  150.723054]  [<ffffffff81290150>] ? rcu_pm_notify+0xc0/0xc0
> [  150.728735]  [<ffffffff81292bc0>] ? __call_rcu.constprop.72+0x910/0x910
> [  150.735459]  [<ffffffff81235221>] ? __lock_is_held+0xa1/0xf0
> [  150.741223]  [<ffffffff81290287>] synchronize_rcu+0x27/0x90
>
> So this calls synchronize_rcu from a rcu callback. That's a nono. This is
> on the back of an interrupt in softirq context and __wait_rcu_gp() can
> sleep, which is obviously a bad idea in softirq context....
>
> Cc'ed netdev ....
>
> And that also explains the debug object splat because this is not running
> on the task stack. It's running on the softirq stack ....
>
> [  150.746908]  [<ffffffff83588b35>] __l2tp_session_unhash+0x3d5/0x550
> [  150.753281]  [<ffffffff8358891f>] ? __l2tp_session_unhash+0x1bf/0x550
> [  150.759828]  [<ffffffff8114596a>] ? __local_bh_enable_ip+0x6a/0xd0
> [  150.766123]  [<ffffffff8358ddb0>] ? l2tp_udp_encap_recv+0xd90/0xd90
> [  150.772497]  [<ffffffff83588e97>] l2tp_tunnel_closeall+0x1e7/0x3a0
> [  150.778782]  [<ffffffff835897be>] l2tp_tunnel_destruct+0x30e/0x5a0
> [  150.785067]  [<ffffffff8358965a>] ? l2tp_tunnel_destruct+0x1aa/0x5a0
> [  150.791537]  [<ffffffff835894b0>] ? l2tp_tunnel_del_work+0x460/0x460
> [  150.797997]  [<ffffffff82ee8053>] __sk_destruct+0x53/0x570
> [  150.803588]  [<ffffffff81293918>] rcu_process_callbacks+0x898/0x1300
> [  150.810048]  [<ffffffff812939f7>] ? rcu_process_callbacks+0x977/0x1300
> [  150.816684]  [<ffffffff82ee8000>] ? __sk_dst_check+0x240/0x240
> [  150.822625]  [<ffffffff838be5d6>] __do_softirq+0x206/0x951
> [  150.828223]  [<ffffffff81147315>] irq_exit+0x165/0x190
> [  150.833557]  [<ffffffff838bd1eb>] smp_apic_timer_interrupt+0x7b/0xa0
> [  150.840018]  [<ffffffff838b9470>] apic_timer_interrupt+0xa0/0xb0
> [  150.846132]  <EOI> [  150.848166]  [<ffffffff838b6756>] ? native_safe_halt+0x6/0x10
> [  150.854036]  [<ffffffff8123bf2d>] ? trace_hardirqs_on+0xd/0x10
> [  150.859973]  [<ffffffff838b5d85>] default_idle+0x55/0x360
> [  150.865478]  [<ffffffff8106be0a>] arch_cpu_idle+0xa/0x10
> [  150.870896]  [<ffffffff838b6b96>] default_idle_call+0x36/0x60
> [  150.876751]  [<ffffffff81226cb0>] cpu_startup_entry+0x2b0/0x380
> [  150.882787]  [<ffffffff81226a00>] ? cpu_in_idle+0x20/0x20
> [  150.888291]  [<ffffffff812d2343>] ? clockevents_register_device+0x123/0x200
> [  150.895358]  [<ffffffff810b0693>] start_secondary+0x303/0x3e0
> [  150.901209]  [<ffffffff810b0390>] ? set_cpu_sibling_map+0x11f0/0x11f0

Thomas, thanks a lot. It appears this issue will not happen on
mainline since from commit  765924e362d1  (subject "l2tp: don't close
sessions in l2tp_tunnel_destruct()"), l2tp_tunnel_closeall is no
longer called from l2tp_tunnel_destruct. From that commit message it
seems one of the motivations is to solve scheduling from atomic issue.

However for this change to be applied to android-4.9 and/or 4.9
stable, it depends on several other l2p patches and they aren't
straight forward cherry-picks from mainline (and I don't have much
background with this driver).

v3.16.56 stable seems to be further along with l2tp than v4.9.89, in
that it atleast has more of the upstream patches adapted for it, that
the above patch depends on. Since this also related to stable, I am
CC'ing Greg kh and Ben.

Here are some of the commits in 3.16 stable that I couldn't find
applied to v4.9 stable. The above fix quotes the below patches as
dependencies so they would need to be stable backported. Also CC'ing
Guillaume since he authored the above mentioned fix.

0c15ddabbcf l2tp: don't register sessions in l2tp_session_create()
a3c5d5b70f4e l2tp: fix race condition in l2tp_tunnel_delete
5b216e8dcda2 l2tp: prevent creation of sessions on terminated tunnels
76ff5e22f1e0 l2tp: hold tunnel while looking up sessions in l2tp_netlink
ceb8f6b23a38 l2tp: define parameters of l2tp_session_get*() as "const"
0295d020b63f l2tp: initialise session's refcount before making it reachable
29a77518927e l2tp: take reference on sessions being dumped
b301c9b7782f l2tp: take a reference on sessions used in genetlink handlers

By the way I think the reason why scheduling while atomic checks
didn't show up is because the debugobjects warning caused a panic
first, before that could happen.

- Joel

PS: There's also 12d656af4e3d2 ("l2tp: Avoid schedule while atomic in
exit_net") which was fixing a call to synchronize_rcu in the same
path/function, but the caller originated from l2tp_exit_net. But this
patch is already in the stable trees. I am just mentioning it here for
completeness.

^ permalink raw reply

* Re: linux-next on x60: network manager often complains "network is disabled" after resume
From: Pavel Machek @ 2018-03-25  6:19 UTC (permalink / raw)
  To: Dan Williams
  Cc: Woody Suwalski, Rafael J. Wysocki, kernel list,
	Linux-pm mailing list, Netdev list
In-Reply-To: <1521551568.16848.5.camel@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1453 bytes --]

> > > Ok, what does 'nmcli dev' and 'nmcli radio' show?

> > 
> > Broken state.
> > 
> > pavel@amd:~$ nmcli dev
> > DEVICE  TYPE      STATE        CONNECTION
> > eth1    ethernet  unavailable  --
> > lo      loopback  unmanaged    --
> > wlan0   wifi      unmanaged    --
> 
> If the state is "unmanaged" on resume, that would indicate a problem
> with sleep/wake and likely not a kernel network device issue.
> 
> We should probably move this discussion to the NM lists to debug
> further.  Before you suspend, run "nmcli gen log level trace" to turn
> on full debug logging, then reproduce the issue, and send a pointer to
> those logs (scrubbed for anything you consider sensitive) to the NM
> mailing list.

Hmm :-)

root@amd:/data/pavel# nmcli gen log level trace
Error: Unknown log level 'trace'
root@amd:/data/pavel# nmcli gen log level help
Error: Unknown log level 'help'
root@amd:/data/pavel# nmcli gen log level
Error: value for 'level' argument is required.
root@amd:/data/pavel# nmcli gen log level debug
root@amd:/data/pavel# cat /var/log/sys/log

Where do I get the logs? I don't see much in the syslog...

And.. It seems that it is "every other suspend". One resume results in
broken network, one in working one, one in broken one...

Thanks,
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* [net PATCH v2] net: sched, fix OOO packets with pfifo_fast
From: John Fastabend @ 2018-03-25  5:25 UTC (permalink / raw)
  To: eric.dumazet; +Cc: xiyou.wangcong, jiri, davem, netdev

After the qdisc lock was dropped in pfifo_fast we allow multiple
enqueue threads and dequeue threads to run in parallel. On the
enqueue side the skb bit ooo_okay is used to ensure all related
skbs are enqueued in-order. On the dequeue side though there is
no similar logic. What we observe is with fewer queues than CPUs
it is possible to re-order packets when two instances of
__qdisc_run() are running in parallel. Each thread will dequeue
a skb and then whichever thread calls the ndo op first will
be sent on the wire. This doesn't typically happen because
qdisc_run() is usually triggered by the same core that did the
enqueue. However, drivers will trigger __netif_schedule()
when queues are transitioning from stopped to awake using the
netif_tx_wake_* APIs. When this happens netif_schedule() calls
qdisc_run() on the same CPU that did the netif_tx_wake_* which
is usually done in the interrupt completion context. This CPU
is selected with the irq affinity which is unrelated to the
enqueue operations.

To resolve this we add a RUNNING bit to the qdisc to ensure
only a single dequeue per qdisc is running. Enqueue and dequeue
operations can still run in parallel and also on multi queue
NICs we can still have a dequeue in-flight per qdisc, which
is typically per CPU.

Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
Reported-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/net/sch_generic.h |    1 +
 net/sched/sch_generic.c   |   17 +++++++++++++----
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 2092d33..8da3267 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -30,6 +30,7 @@ struct qdisc_rate_table {
 enum qdisc_state_t {
 	__QDISC_STATE_SCHED,
 	__QDISC_STATE_DEACTIVATED,
+	__QDISC_STATE_RUNNING,
 };
 
 struct qdisc_size_table {
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 7e3fbe9..39c144b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -373,24 +373,33 @@ bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
  */
 static inline bool qdisc_restart(struct Qdisc *q, int *packets)
 {
+	bool more, validate, nolock = q->flags & TCQ_F_NOLOCK;
 	spinlock_t *root_lock = NULL;
 	struct netdev_queue *txq;
 	struct net_device *dev;
 	struct sk_buff *skb;
-	bool validate;
 
 	/* Dequeue packet */
+	if (nolock && test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))
+		return false;
+
 	skb = dequeue_skb(q, &validate, packets);
-	if (unlikely(!skb))
+	if (unlikely(!skb)) {
+		if (nolock)
+			clear_bit(__QDISC_STATE_RUNNING, &q->state);
 		return false;
+	}
 
-	if (!(q->flags & TCQ_F_NOLOCK))
+	if (!nolock)
 		root_lock = qdisc_lock(q);
 
 	dev = qdisc_dev(q);
 	txq = skb_get_tx_queue(dev, skb);
 
-	return sch_direct_xmit(skb, q, dev, txq, root_lock, validate);
+	more = sch_direct_xmit(skb, q, dev, txq, root_lock, validate);
+	if (nolock)
+		clear_bit(__QDISC_STATE_RUNNING, &q->state);
+	return more;
 }
 
 void __qdisc_run(struct Qdisc *q)

^ permalink raw reply related

* Re: [PATCH net-next RFC V1 5/5] net: mdio: Add a driver for InES time stamping IP core.
From: Richard Cochran @ 2018-03-25  4:51 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, devicetree, David Miller, Florian Fainelli, Mark Rutland,
	Miroslav Lichvar, Rob Herring, Willem de Bruijn
In-Reply-To: <20180324184858.GF31941@lunn.ch>

On Sat, Mar 24, 2018 at 07:48:58PM +0100, Andrew Lunn wrote:
> As far as i can see, you have three basic problems:
> 
> 1) How do you associate the PTP device to the netdev?
> 2) How do you get the information you need to configure the PTP device

Yes, yes.

> 3) How do you limit the MAC/PHY to what the PTP device can do.

Hm, I don't think this is important.
 
> phylib does have all this information. It is phylib that calls the MAC
> with link speed information. When the MAC connects to the PHY, it
> passes the MII mode, and when the PHY requests the MII mode changes,
> phylib knows. The MAC has to call phy_init_eee() to see if the PHY is
> EEE capable. phylib also tells the MAC what speeds the PHY is capable
> off, so it is in the position to mask out speeds the PTP device does
> not support, etc.

Right, so phylib can operate on phydev->attached_dev->mdiots;

> So i really think you need to cleanly integrate into phylib and
> phylink.

So I think I've done that, more or less, but I'd like to hear your
ideas on how to make it cleaner...

Thanks,
Richard

^ permalink raw reply

* RE
From: Ms. Ella Golan @ 2018-03-24 17:45 UTC (permalink / raw)
  To: Recipients

I am Ms.Ella Golan, I am the Executive Vice President Banking Division with FIRST INTERNATIONAL BANK OF ISRAEL LTD (FIBI). I am getting in touch with you regarding an extremely important and urgent matter. If you would oblige me the opportunity, I shall provide you with details upon your response.

Faithfully,
Ms.Ella Golan

^ permalink raw reply

* RE: [PATCH net-next,1/2] hv_netvsc: Fix the return status in RX path
From: Haiyang Zhang @ 2018-03-25  0:41 UTC (permalink / raw)
  To: Michael Kelley (EOSG), davem@davemloft.net,
	netdev@vger.kernel.org
  Cc: KY Srinivasan, Stephen Hemminger, olaf@aepfle.de,
	vkuznets@redhat.com, devel@linuxdriverproject.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <DM5PR2101MB103003AA72955A6C6AD44D0CDCAF0@DM5PR2101MB1030.namprd21.prod.outlook.com>



> -----Original Message-----
> From: Michael Kelley (EOSG)
> Sent: Saturday, March 24, 2018 12:48 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>; davem@davemloft.net;
> netdev@vger.kernel.org
> Cc: KY Srinivasan <kys@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; olaf@aepfle.de; vkuznets@redhat.com;
> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH net-next,1/2] hv_netvsc: Fix the return status in RX path
> 
> > -----Original Message-----
> > From: linux-kernel-owner@vger.kernel.org
> > <linux-kernel-owner@vger.kernel.org> On Behalf Of Haiyang Zhang
> > Sent: Thursday, March 22, 2018 12:01 PM
> > To: davem@davemloft.net; netdev@vger.kernel.org
> > Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan
> > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> > olaf@aepfle.de; vkuznets@redhat.com; devel@linuxdriverproject.org;
> > linux-kernel@vger.kernel.org
> > Subject: [PATCH net-next,1/2] hv_netvsc: Fix the return status in RX
> > path
> >
> > From: Haiyang Zhang <haiyangz@microsoft.com>
> >
> > As defined in hyperv_net.h, the NVSP_STAT_SUCCESS is one not zero.
> > Some functions returns 0 when it actually means NVSP_STAT_SUCCESS.
> > This patch fixes them.
> >
> > In netvsc_receive(), it puts the last RNDIS packet's receive status
> > for all packets in a vmxferpage which may contain multiple RNDIS
> > packets.
> > This patch puts NVSP_STAT_FAIL in the receive completion if one of the
> > packets in a vmxferpage fails.
> 
> This patch changes the status field that is being reported back to the Hyper-V
> host in the receive completion message in
> enq_receive_complete().   The current code reports 0 on success,
> and with the patch, it will report 1 on success.  So does this change affect
> anything on the Hyper-V side?  Or is Hyper-V just ignoring
> the value?   If this change doesn't have any impact on the
> interactions with Hyper-V, perhaps it would be good to explain why in the
> commit message.

Here is the definition of each status code for NetVSP. 
enum {
        NVSP_STAT_NONE = 0,
        NVSP_STAT_SUCCESS,
        NVSP_STAT_FAIL,
        NVSP_STAT_PROTOCOL_TOO_NEW,
        NVSP_STAT_PROTOCOL_TOO_OLD,
        NVSP_STAT_INVALID_RNDIS_PKT,
        NVSP_STAT_BUSY,
        NVSP_STAT_PROTOCOL_UNSUPPORTED,
        NVSP_STAT_MAX,
};

Existing code returns NVSP_STAT_NONE = 0, and with this patch
we return NVSP_STAT_SUCCESS = 1. 
Based on testing, either way works for now. But for correctness
and future stability (e.g. host side becomes more stringent), we
should follow the protocol.

Thanks,
- Haiyang

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] cxgb4: collect hardware dump in second kernel
From: Eric W. Biederman @ 2018-03-25  0:17 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: Rahul Lakkireddy, netdev, linux-fsdevel, kexec, linux-kernel,
	indranil, nirranjan, stephen, ganeshgr, akpm, torvalds, davem,
	viro
In-Reply-To: <20180324221849.GW14312@siri.cascardo.eti.br>

Thadeu Lima de Souza Cascardo <cascardo@debian.org> writes:

> On Sat, Mar 24, 2018 at 04:26:34PM +0530, Rahul Lakkireddy wrote:
>> Register callback to collect hardware/firmware dumps in second kernel
>> before hardware/firmware is initialized.  The dumps for each device
>> will be available under /sys/kernel/crashdd/cxgb4/ directory in second
>> kernel.
>> 
>> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
>> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
>> ---
>> v2:
>> - No Changes.
>> 
>> Changes since rfc v2:
>> - Update comments and commit message for sysfs change.
>> 
>> rfc v2:
>> - Updated dump registration to the new API in patch 1.
> [...]
>> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
>> index e880be8e3c45..265cb026f868 100644
>> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
>> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
>> @@ -5527,6 +5527,18 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  	if (err)
>>  		goto out_free_adapter;
>>  
>> +	if (is_kdump_kernel()) {
>> +		/* Collect hardware state and append to
>> +		 * /sys/kernel/crashdd/cxgb4/ directory
>> +		 */
>> +		err = cxgb4_cudbg_crashdd_add_dump(adapter);
>> +		if (err) {
>> +			dev_warn(adapter->pdev_dev,
>> +				 "Fail collecting crash device dump, err: %d. Continuing\n",
>> +				 err);
>> +			err = 0;
>> +		}
>> +	}
>>  
>
> The problem I see with this approach is that you require that the driver
> is built into the kdump kernel (or present as a module in the kdump
> initramfs), and that you will probe the device during the collection of
> the dumps.

Compared to doing something in a crashing kernel anything in the kdump
kernel is a walk in the park.  Nothing is trustable in a crashing
kernel.

> IMHO, if you are going to require the device to be probed by the same
> driver during kdump, you might just as well use the device object itself
> to present the crash data. I think that's what Stephen Hemminger meant
> when he said to use sysfs. No need at all for any special crashdd. Just
> add an attribute or attribute group to the device object.

Doing something with the device model might make sense.  I am not
certain it does.  It is quite possible the device is in such a weird
state that the device driver fails to initialize.  That doesn't
mean the device driver can't scrape the registers and present
meaningful information to the rest of the system.

Whatever you do with capturing the state needs to happen early before
the driver initializes and stomps on the relevant state.

I don't expect there is much for the driver model to do, unless we wish
to do something explicitly before the normal device probe methods
happen.  What we need is the infrastructure for catching what gets
read from the driver and placing it in the core dump.

> Otherwise, as Eric Biederman pointed out, you should just add that data
> into the vmcore before you kexec, so you don't even need to look at a
> different file, and the driver does not even need to be present in the
> kdump kernel.

No.  I do mean before a kexec on panic happens.  Doing anything with
gathering this kind of information before kexec on panic is a very very
very very bad idea that will almost certainly make crash dumps less
reliable.  Don't even think about doing extra work on the crash dump
path.  Not ever.  No.  No.  No.  No.

The reason we use kexec on panic instead just creating a core dump
in the kernel is that many have tried and no one has gotten the kernel
to create crash dumps when things go wrong and it matters.  Meanwhile
kexec on panic works more often than not.

I mean that /proc/vmcore is a device that is used to gather up the bits
of the crashing kernel and to present it in a format that is easy to
read/save.  The tools read /proc/vmcore.

The driver or whatever is gathering this information absolutely needs to
be in the kdump kernel.

Eric

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] cxgb4: collect hardware dump in second kernel
From: Thadeu Lima de Souza Cascardo @ 2018-03-24 22:18 UTC (permalink / raw)
  To: Rahul Lakkireddy
  Cc: indranil-ut6Up61K2wZBDgjK7y7TUQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	nirranjan-ut6Up61K2wZBDgjK7y7TUQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ,
	ganeshgr-ut6Up61K2wZBDgjK7y7TUQ,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w
In-Reply-To: <f06cca21650affa10d83a28dbd7fb577792e35a9.1521888444.git.rahul.lakkireddy-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>

On Sat, Mar 24, 2018 at 04:26:34PM +0530, Rahul Lakkireddy wrote:
> Register callback to collect hardware/firmware dumps in second kernel
> before hardware/firmware is initialized.  The dumps for each device
> will be available under /sys/kernel/crashdd/cxgb4/ directory in second
> kernel.
> 
> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
> Signed-off-by: Ganesh Goudar <ganeshgr-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
> ---
> v2:
> - No Changes.
> 
> Changes since rfc v2:
> - Update comments and commit message for sysfs change.
> 
> rfc v2:
> - Updated dump registration to the new API in patch 1.
[...]
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> index e880be8e3c45..265cb026f868 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> @@ -5527,6 +5527,18 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	if (err)
>  		goto out_free_adapter;
>  
> +	if (is_kdump_kernel()) {
> +		/* Collect hardware state and append to
> +		 * /sys/kernel/crashdd/cxgb4/ directory
> +		 */
> +		err = cxgb4_cudbg_crashdd_add_dump(adapter);
> +		if (err) {
> +			dev_warn(adapter->pdev_dev,
> +				 "Fail collecting crash device dump, err: %d. Continuing\n",
> +				 err);
> +			err = 0;
> +		}
> +	}
>  

The problem I see with this approach is that you require that the driver
is built into the kdump kernel (or present as a module in the kdump
initramfs), and that you will probe the device during the collection of
the dumps.

IMHO, if you are going to require the device to be probed by the same
driver during kdump, you might just as well use the device object itself
to present the crash data. I think that's what Stephen Hemminger meant
when he said to use sysfs. No need at all for any special crashdd. Just
add an attribute or attribute group to the device object.

Otherwise, as Eric Biederman pointed out, you should just add that data
into the vmcore before you kexec, so you don't even need to look at a
different file, and the driver does not even need to be present in the
kdump kernel.

Cascardo.

>  	if (!is_t4(adapter->params.chip)) {
>  		s_qpp = (QUEUESPERPAGEPF0_S +
> -- 
> 2.14.1

^ permalink raw reply

* [PATCH net-next 6/6] r8169: change argument type of rtl8169_net_suspend and __rtl8169_resume
From: Heiner Kallweit @ 2018-03-24 22:18 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <6160c32d-3905-683a-7117-d2b11d310a34@gmail.com>

Both functions can be simplified by changing the argument type to
struct rtl8169_private *.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index dd84cc3a..58d84e48 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7776,15 +7776,13 @@ rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	pm_runtime_put_noidle(&pdev->dev);
 }
 
-static void rtl8169_net_suspend(struct net_device *dev)
+static void rtl8169_net_suspend(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-
-	if (!netif_running(dev))
+	if (!netif_running(tp->dev))
 		return;
 
-	netif_device_detach(dev);
-	netif_stop_queue(dev);
+	netif_device_detach(tp->dev);
+	netif_stop_queue(tp->dev);
 
 	rtl_lock_work(tp);
 	napi_disable(&tp->napi);
@@ -7800,16 +7798,14 @@ static int rtl8169_suspend(struct device *device)
 {
 	struct rtl8169_private *tp = dev_get_drvdata(device);
 
-	rtl8169_net_suspend(tp->dev);
+	rtl8169_net_suspend(tp);
 
 	return 0;
 }
 
-static void __rtl8169_resume(struct net_device *dev)
+static void __rtl8169_resume(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-
-	netif_device_attach(dev);
+	netif_device_attach(tp->dev);
 
 	rtl_pll_power_up(tp);
 
@@ -7828,7 +7824,7 @@ static int rtl8169_resume(struct device *device)
 	rtl8169_init_phy(tp);
 
 	if (netif_running(tp->dev))
-		__rtl8169_resume(tp->dev);
+		__rtl8169_resume(tp);
 
 	return 0;
 }
@@ -7847,7 +7843,7 @@ static int rtl8169_runtime_suspend(struct device *device)
 	__rtl8169_set_wol(tp, WAKE_ANY);
 	rtl_unlock_work(tp);
 
-	rtl8169_net_suspend(tp->dev);
+	rtl8169_net_suspend(tp);
 
 	/* Update counters before going runtime suspend */
 	rtl8169_rx_missed(tp->dev);
@@ -7872,7 +7868,7 @@ static int rtl8169_runtime_resume(struct device *device)
 
 	rtl8169_init_phy(tp);
 
-	__rtl8169_resume(tp->dev);
+	__rtl8169_resume(tp);
 
 	return 0;
 }
@@ -7929,7 +7925,7 @@ static void rtl_shutdown(struct pci_dev *pdev)
 {
 	struct rtl8169_private *tp = pci_get_drvdata(pdev);
 
-	rtl8169_net_suspend(tp->dev);
+	rtl8169_net_suspend(tp);
 
 	/* Restore original MAC address */
 	rtl_rar_set(tp, tp->dev->perm_addr);
-- 
2.16.2

^ permalink raw reply related

* [PATCH net-next 5/6] r8169: change type of driver_data
From: Heiner Kallweit @ 2018-03-24 22:18 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <6160c32d-3905-683a-7117-d2b11d310a34@gmail.com>

Several functions accessing the device driver_data field don't need the
net_device. All needed parameters can be accessed via struct
rtl8169_private, therefore change type of driver_data accordingly.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 51 +++++++++++++++---------------------
 1 file changed, 21 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 2d68306e..dd84cc3a 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7798,10 +7798,9 @@ static void rtl8169_net_suspend(struct net_device *dev)
 
 static int rtl8169_suspend(struct device *device)
 {
-	struct pci_dev *pdev = to_pci_dev(device);
-	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = dev_get_drvdata(device);
 
-	rtl8169_net_suspend(dev);
+	rtl8169_net_suspend(tp->dev);
 
 	return 0;
 }
@@ -7824,23 +7823,19 @@ static void __rtl8169_resume(struct net_device *dev)
 
 static int rtl8169_resume(struct device *device)
 {
-	struct pci_dev *pdev = to_pci_dev(device);
-	struct net_device *dev = pci_get_drvdata(pdev);
-	struct rtl8169_private *tp = netdev_priv(dev);
+	struct rtl8169_private *tp = dev_get_drvdata(device);
 
 	rtl8169_init_phy(tp);
 
-	if (netif_running(dev))
-		__rtl8169_resume(dev);
+	if (netif_running(tp->dev))
+		__rtl8169_resume(tp->dev);
 
 	return 0;
 }
 
 static int rtl8169_runtime_suspend(struct device *device)
 {
-	struct pci_dev *pdev = to_pci_dev(device);
-	struct net_device *dev = pci_get_drvdata(pdev);
-	struct rtl8169_private *tp = netdev_priv(dev);
+	struct rtl8169_private *tp = dev_get_drvdata(device);
 
 	if (!tp->TxDescArray) {
 		rtl_pll_power_down(tp);
@@ -7852,10 +7847,10 @@ static int rtl8169_runtime_suspend(struct device *device)
 	__rtl8169_set_wol(tp, WAKE_ANY);
 	rtl_unlock_work(tp);
 
-	rtl8169_net_suspend(dev);
+	rtl8169_net_suspend(tp->dev);
 
 	/* Update counters before going runtime suspend */
-	rtl8169_rx_missed(dev);
+	rtl8169_rx_missed(tp->dev);
 	rtl8169_update_counters(tp);
 
 	return 0;
@@ -7863,10 +7858,9 @@ static int rtl8169_runtime_suspend(struct device *device)
 
 static int rtl8169_runtime_resume(struct device *device)
 {
-	struct pci_dev *pdev = to_pci_dev(device);
-	struct net_device *dev = pci_get_drvdata(pdev);
-	struct rtl8169_private *tp = netdev_priv(dev);
-	rtl_rar_set(tp, dev->dev_addr);
+	struct rtl8169_private *tp = dev_get_drvdata(device);
+
+	rtl_rar_set(tp, tp->dev->dev_addr);
 
 	if (!tp->TxDescArray)
 		return 0;
@@ -7878,17 +7872,16 @@ static int rtl8169_runtime_resume(struct device *device)
 
 	rtl8169_init_phy(tp);
 
-	__rtl8169_resume(dev);
+	__rtl8169_resume(tp->dev);
 
 	return 0;
 }
 
 static int rtl8169_runtime_idle(struct device *device)
 {
-	struct pci_dev *pdev = to_pci_dev(device);
-	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = dev_get_drvdata(device);
 
-	if (!netif_running(dev) || !netif_carrier_ok(dev))
+	if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev))
 		pm_schedule_suspend(device, 10000);
 
 	return -EBUSY;
@@ -7934,13 +7927,12 @@ static void rtl_wol_shutdown_quirk(struct rtl8169_private *tp)
 
 static void rtl_shutdown(struct pci_dev *pdev)
 {
-	struct net_device *dev = pci_get_drvdata(pdev);
-	struct rtl8169_private *tp = netdev_priv(dev);
+	struct rtl8169_private *tp = pci_get_drvdata(pdev);
 
-	rtl8169_net_suspend(dev);
+	rtl8169_net_suspend(tp->dev);
 
 	/* Restore original MAC address */
-	rtl_rar_set(tp, dev->perm_addr);
+	rtl_rar_set(tp, tp->dev->perm_addr);
 
 	rtl8169_hw_reset(tp);
 
@@ -7957,15 +7949,14 @@ static void rtl_shutdown(struct pci_dev *pdev)
 
 static void rtl_remove_one(struct pci_dev *pdev)
 {
-	struct net_device *dev = pci_get_drvdata(pdev);
-	struct rtl8169_private *tp = netdev_priv(dev);
+	struct rtl8169_private *tp = pci_get_drvdata(pdev);
 
 	if (r8168_check_dash(tp))
 		rtl8168_driver_stop(tp);
 
 	netif_napi_del(&tp->napi);
 
-	unregister_netdev(dev);
+	unregister_netdev(tp->dev);
 
 	rtl_release_firmware(tp);
 
@@ -7973,7 +7964,7 @@ static void rtl_remove_one(struct pci_dev *pdev)
 		pm_runtime_get_noresume(&pdev->dev);
 
 	/* restore original MAC address */
-	rtl_rar_set(tp, dev->perm_addr);
+	rtl_rar_set(tp, tp->dev->perm_addr);
 }
 
 static const struct net_device_ops rtl_netdev_ops = {
@@ -8361,7 +8352,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (rc < 0)
 		return rc;
 
-	pci_set_drvdata(pdev, dev);
+	pci_set_drvdata(pdev, tp);
 
 	netif_info(tp, probe, dev, "%s at 0x%p, %pM, XID %08x IRQ %d\n",
 		   rtl_chip_infos[chipset].name, tp->mmio_addr, dev->dev_addr,
-- 
2.16.2

^ permalink raw reply related

* [PATCH net-next 4/6] r8169: change argument type of interrupt handler
From: Heiner Kallweit @ 2018-03-24 22:18 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <6160c32d-3905-683a-7117-d2b11d310a34@gmail.com>

The interrupt handler doesn't need the net_device, therefore change the
argument type to struct rtl8169_private *.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index a0e32557..2d68306e 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7445,8 +7445,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, u32 budget
 
 static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 {
-	struct net_device *dev = dev_instance;
-	struct rtl8169_private *tp = netdev_priv(dev);
+	struct rtl8169_private *tp = dev_instance;
 	int handled = 0;
 	u16 status;
 
@@ -7618,7 +7617,7 @@ static int rtl8169_close(struct net_device *dev)
 
 	cancel_work_sync(&tp->wk.work);
 
-	pci_free_irq(pdev, 0, dev);
+	pci_free_irq(pdev, 0, tp);
 
 	dma_free_coherent(&pdev->dev, R8169_RX_RING_BYTES, tp->RxDescArray,
 			  tp->RxPhyAddr);
@@ -7673,7 +7672,7 @@ static int rtl_open(struct net_device *dev)
 
 	rtl_request_firmware(tp);
 
-	retval = pci_request_irq(pdev, 0, rtl8169_interrupt, NULL, dev,
+	retval = pci_request_irq(pdev, 0, rtl8169_interrupt, NULL, tp,
 				 dev->name);
 	if (retval < 0)
 		goto err_release_fw_2;
-- 
2.16.2

^ permalink raw reply related

* [PATCH net-next 3/6] r8169: change argument type of PHY-related functions
From: Heiner Kallweit @ 2018-03-24 22:18 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <6160c32d-3905-683a-7117-d2b11d310a34@gmail.com>

Some PHY-related functions functions can be a little streamlined and
simplified by taking a struct rtl8169_private * argument.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 0f1c19e8..a0e32557 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -4370,10 +4370,8 @@ static void rtl8106e_hw_phy_config(struct rtl8169_private *tp)
 	rtl_eri_write(tp, 0x1d0, ERIAR_MASK_0011, 0x0000, ERIAR_EXGMAC);
 }
 
-static void rtl_hw_phy_config(struct net_device *dev)
+static void rtl_hw_phy_config(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-
 	rtl8169_print_mac_version(tp);
 
 	switch (tp->mac_version) {
@@ -4546,8 +4544,7 @@ DECLARE_RTL_COND(rtl_phy_reset_cond)
 	return tp->phy_reset_pending(tp);
 }
 
-static void rtl8169_phy_reset(struct net_device *dev,
-			      struct rtl8169_private *tp)
+static void rtl8169_phy_reset(struct rtl8169_private *tp)
 {
 	tp->phy_reset_enable(tp);
 	rtl_msleep_loop_wait_low(tp, &rtl_phy_reset_cond, 1, 100);
@@ -4559,9 +4556,9 @@ static bool rtl_tbi_enabled(struct rtl8169_private *tp)
 	    (RTL_R8(tp, PHYstatus) & TBI_Enable);
 }
 
-static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
+static void rtl8169_init_phy(struct rtl8169_private *tp)
 {
-	rtl_hw_phy_config(dev);
+	rtl_hw_phy_config(tp);
 
 	if (tp->mac_version <= RTL_GIGA_MAC_VER_06) {
 		dprintk("Set MAC Reg C+CR Offset 0x82h = 0x01h\n");
@@ -4580,9 +4577,9 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
 		rtl_writephy(tp, 0x0b, 0x0000); //w 0x0b 15 0 0
 	}
 
-	rtl8169_phy_reset(dev, tp);
+	rtl8169_phy_reset(tp);
 
-	rtl8169_set_speed(dev, AUTONEG_ENABLE, SPEED_1000, DUPLEX_FULL,
+	rtl8169_set_speed(tp->dev, AUTONEG_ENABLE, SPEED_1000, DUPLEX_FULL,
 			  ADVERTISED_10baseT_Half | ADVERTISED_10baseT_Full |
 			  ADVERTISED_100baseT_Half | ADVERTISED_100baseT_Full |
 			  (tp->mii.supports_gmii ?
@@ -4590,7 +4587,7 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
 			   ADVERTISED_1000baseT_Full : 0));
 
 	if (rtl_tbi_enabled(tp))
-		netif_info(tp, link, dev, "TBI auto-negotiating\n");
+		netif_info(tp, link, tp->dev, "TBI auto-negotiating\n");
 }
 
 static void rtl_rar_set(struct rtl8169_private *tp, u8 *addr)
@@ -7687,7 +7684,7 @@ static int rtl_open(struct net_device *dev)
 
 	napi_enable(&tp->napi);
 
-	rtl8169_init_phy(dev, tp);
+	rtl8169_init_phy(tp);
 
 	__rtl8169_set_features(dev, dev->features);
 
@@ -7832,7 +7829,7 @@ static int rtl8169_resume(struct device *device)
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct rtl8169_private *tp = netdev_priv(dev);
 
-	rtl8169_init_phy(dev, tp);
+	rtl8169_init_phy(tp);
 
 	if (netif_running(dev))
 		__rtl8169_resume(dev);
@@ -7880,7 +7877,7 @@ static int rtl8169_runtime_resume(struct device *device)
 	tp->saved_wolopts = 0;
 	rtl_unlock_work(tp);
 
-	rtl8169_init_phy(dev, tp);
+	rtl8169_init_phy(tp);
 
 	__rtl8169_resume(dev);
 
-- 
2.16.2

^ permalink raw reply related

* [PATCH net-next 2/6] r8169: change argument type of counter functions
From: Heiner Kallweit @ 2018-03-24 22:18 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <6160c32d-3905-683a-7117-d2b11d310a34@gmail.com>

These counter functions don't deal with the net_device, so change the
argument type to struct rtl8169_private *.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 74435ba9..0f1c19e8 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -2155,9 +2155,8 @@ DECLARE_RTL_COND(rtl_counters_cond)
 	return RTL_R32(tp, CounterAddrLow) & (CounterReset | CounterDump);
 }
 
-static bool rtl8169_do_counters(struct net_device *dev, u32 counter_cmd)
+static bool rtl8169_do_counters(struct rtl8169_private *tp, u32 counter_cmd)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
 	dma_addr_t paddr = tp->counters_phys_addr;
 	u32 cmd;
 
@@ -2170,10 +2169,8 @@ static bool rtl8169_do_counters(struct net_device *dev, u32 counter_cmd)
 	return rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
 }
 
-static bool rtl8169_reset_counters(struct net_device *dev)
+static bool rtl8169_reset_counters(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-
 	/*
 	 * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
 	 * tally counters.
@@ -2181,13 +2178,11 @@ static bool rtl8169_reset_counters(struct net_device *dev)
 	if (tp->mac_version < RTL_GIGA_MAC_VER_19)
 		return true;
 
-	return rtl8169_do_counters(dev, CounterReset);
+	return rtl8169_do_counters(tp, CounterReset);
 }
 
-static bool rtl8169_update_counters(struct net_device *dev)
+static bool rtl8169_update_counters(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-
 	/*
 	 * Some chips are unable to dump tally counters when the receiver
 	 * is disabled.
@@ -2195,12 +2190,11 @@ static bool rtl8169_update_counters(struct net_device *dev)
 	if ((RTL_R8(tp, ChipCmd) & CmdRxEnb) == 0)
 		return true;
 
-	return rtl8169_do_counters(dev, CounterDump);
+	return rtl8169_do_counters(tp, CounterDump);
 }
 
-static bool rtl8169_init_counter_offsets(struct net_device *dev)
+static bool rtl8169_init_counter_offsets(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
 	struct rtl8169_counters *counters = tp->counters;
 	bool ret = false;
 
@@ -2223,10 +2217,10 @@ static bool rtl8169_init_counter_offsets(struct net_device *dev)
 		return true;
 
 	/* If both, reset and update fail, propagate to caller. */
-	if (rtl8169_reset_counters(dev))
+	if (rtl8169_reset_counters(tp))
 		ret = true;
 
-	if (rtl8169_update_counters(dev))
+	if (rtl8169_update_counters(tp))
 		ret = true;
 
 	tp->tc_offset.tx_errors = counters->tx_errors;
@@ -2249,7 +2243,7 @@ static void rtl8169_get_ethtool_stats(struct net_device *dev,
 	pm_runtime_get_noresume(d);
 
 	if (pm_runtime_active(d))
-		rtl8169_update_counters(dev);
+		rtl8169_update_counters(tp);
 
 	pm_runtime_put_noidle(d);
 
@@ -7617,7 +7611,7 @@ static int rtl8169_close(struct net_device *dev)
 	pm_runtime_get_sync(&pdev->dev);
 
 	/* Update counters before going down */
-	rtl8169_update_counters(dev);
+	rtl8169_update_counters(tp);
 
 	rtl_lock_work(tp);
 	clear_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags);
@@ -7701,7 +7695,7 @@ static int rtl_open(struct net_device *dev)
 
 	rtl_hw_start(tp);
 
-	if (!rtl8169_init_counter_offsets(dev))
+	if (!rtl8169_init_counter_offsets(tp))
 		netif_warn(tp, hw, dev, "counter reset/update failed\n");
 
 	netif_start_queue(dev);
@@ -7770,7 +7764,7 @@ rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	 * from tally counters.
 	 */
 	if (pm_runtime_active(&pdev->dev))
-		rtl8169_update_counters(dev);
+		rtl8169_update_counters(tp);
 
 	/*
 	 * Subtract values fetched during initalization.
@@ -7866,7 +7860,7 @@ static int rtl8169_runtime_suspend(struct device *device)
 
 	/* Update counters before going runtime suspend */
 	rtl8169_rx_missed(dev);
-	rtl8169_update_counters(dev);
+	rtl8169_update_counters(tp);
 
 	return 0;
 }
-- 
2.16.2

^ permalink raw reply related

* [PATCH net-next 1/6] r8169: change argument type of callback hw_start in struct rtl8169_private
From: Heiner Kallweit @ 2018-03-24 22:18 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <6160c32d-3905-683a-7117-d2b11d310a34@gmail.com>

Change argument type of callback hw_start to be more in line with the
other callbacks and to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 41 +++++++++++++-----------------------
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 630409e0..74435ba9 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -821,7 +821,7 @@ struct rtl8169_private {
 	int (*get_link_ksettings)(struct net_device *,
 				  struct ethtool_link_ksettings *);
 	void (*phy_reset_enable)(struct rtl8169_private *tp);
-	void (*hw_start)(struct net_device *);
+	void (*hw_start)(struct rtl8169_private *tp);
 	unsigned int (*phy_reset_pending)(struct rtl8169_private *tp);
 	unsigned int (*link_ok)(struct rtl8169_private *tp);
 	int (*do_ioctl)(struct rtl8169_private *tp, struct mii_ioctl_data *data, int cmd);
@@ -5358,12 +5358,9 @@ static void rtl_set_rx_tx_config_registers(struct rtl8169_private *tp)
 		(InterFrameGap << TxInterFrameGapShift));
 }
 
-static void rtl_hw_start(struct net_device *dev)
+static void rtl_hw_start(struct  rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-
-	tp->hw_start(dev);
-
+	tp->hw_start(tp);
 	rtl_irq_enable_all(tp);
 }
 
@@ -5472,14 +5469,11 @@ static void rtl_set_rx_mode(struct net_device *dev)
 	RTL_W32(tp, RxConfig, tmp);
 }
 
-static void rtl_hw_start_8169(struct net_device *dev)
+static void rtl_hw_start_8169(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-	struct pci_dev *pdev = tp->pci_dev;
-
 	if (tp->mac_version == RTL_GIGA_MAC_VER_05) {
 		RTL_W16(tp, CPlusCmd, RTL_R16(tp, CPlusCmd) | PCIMulRW);
-		pci_write_config_byte(pdev, PCI_CACHE_LINE_SIZE, 0x08);
+		pci_write_config_byte(tp->pci_dev, PCI_CACHE_LINE_SIZE, 0x08);
 	}
 
 	RTL_W8(tp, Cfg9346, Cfg9346_Unlock);
@@ -5537,7 +5531,7 @@ static void rtl_hw_start_8169(struct net_device *dev)
 
 	RTL_W32(tp, RxMissed, 0);
 
-	rtl_set_rx_mode(dev);
+	rtl_set_rx_mode(tp->dev);
 
 	/* no early-rx interrupts */
 	RTL_W16(tp, MultiIntr, RTL_R16(tp, MultiIntr) & 0xf000);
@@ -6325,10 +6319,8 @@ static void rtl_hw_start_8168ep_3(struct rtl8169_private *tp)
 	r8168_mac_ocp_write(tp, 0xe860, data);
 }
 
-static void rtl_hw_start_8168(struct net_device *dev)
+static void rtl_hw_start_8168(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-
 	RTL_W8(tp, Cfg9346, Cfg9346_Unlock);
 
 	RTL_W8(tp, MaxTxPacketSize, TxPacketMax);
@@ -6453,7 +6445,7 @@ static void rtl_hw_start_8168(struct net_device *dev)
 
 	default:
 		printk(KERN_ERR PFX "%s: unknown chipset (mac_version = %d).\n",
-			dev->name, tp->mac_version);
+		       tp->dev->name, tp->mac_version);
 		break;
 	}
 
@@ -6461,7 +6453,7 @@ static void rtl_hw_start_8168(struct net_device *dev)
 
 	RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
 
-	rtl_set_rx_mode(dev);
+	rtl_set_rx_mode(tp->dev);
 
 	RTL_W16(tp, MultiIntr, RTL_R16(tp, MultiIntr) & 0xf000);
 }
@@ -6600,17 +6592,14 @@ static void rtl_hw_start_8106(struct rtl8169_private *tp)
 	rtl_pcie_state_l2l3_enable(tp, false);
 }
 
-static void rtl_hw_start_8101(struct net_device *dev)
+static void rtl_hw_start_8101(struct rtl8169_private *tp)
 {
-	struct rtl8169_private *tp = netdev_priv(dev);
-	struct pci_dev *pdev = tp->pci_dev;
-
 	if (tp->mac_version >= RTL_GIGA_MAC_VER_30)
 		tp->event_slow &= ~RxFIFOOver;
 
 	if (tp->mac_version == RTL_GIGA_MAC_VER_13 ||
 	    tp->mac_version == RTL_GIGA_MAC_VER_16)
-		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL,
+		pcie_capability_set_word(tp->pci_dev, PCI_EXP_DEVCTL,
 					 PCI_EXP_DEVCTL_NOSNOOP_EN);
 
 	RTL_W8(tp, Cfg9346, Cfg9346_Unlock);
@@ -6668,7 +6657,7 @@ static void rtl_hw_start_8101(struct net_device *dev)
 
 	RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
 
-	rtl_set_rx_mode(dev);
+	rtl_set_rx_mode(tp->dev);
 
 	RTL_R8(tp, IntrMask);
 
@@ -6880,7 +6869,7 @@ static void rtl_reset_work(struct rtl8169_private *tp)
 	rtl8169_init_ring_indexes(tp);
 
 	napi_enable(&tp->napi);
-	rtl_hw_start(dev);
+	rtl_hw_start(tp);
 	netif_wake_queue(dev);
 	rtl8169_check_link_status(dev, tp);
 }
@@ -7710,7 +7699,7 @@ static int rtl_open(struct net_device *dev)
 
 	rtl_pll_power_up(tp);
 
-	rtl_hw_start(dev);
+	rtl_hw_start(tp);
 
 	if (!rtl8169_init_counter_offsets(dev))
 		netif_warn(tp, hw, dev, "counter reset/update failed\n");
@@ -8017,7 +8006,7 @@ static const struct net_device_ops rtl_netdev_ops = {
 };
 
 static const struct rtl_cfg_info {
-	void (*hw_start)(struct net_device *);
+	void (*hw_start)(struct rtl8169_private *);
 	unsigned int region;
 	unsigned int align;
 	u16 event_slow;
-- 
2.16.2

^ permalink raw reply related

* [PATCH net-next 0/6] r8169: patch set with further smaller improvements
From: Heiner Kallweit @ 2018-03-24 22:11 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org

Patch set with further changes w/o functional change aiming at
streamlining and simplifying the code.

Heiner Kallweit (6):
  r8169: change argument type of callback hw_start in struct rtl8169_private
  r8169: change argument type of counter functions
  r8169: change argument type of PHY-related functions
  r8169: change argument type of interrupt handler
  r8169: change type of driver_data
  r8169: change argument type of rtl8169_net_suspend and __rtl8169_resume

 drivers/net/ethernet/realtek/r8169.c | 170 ++++++++++++++---------------------
 1 file changed, 68 insertions(+), 102 deletions(-)

-- 
2.16.2

^ permalink raw reply

* Re: [net PATCH] net: sched, fix OOO packets with pfifo_fast
From: John Fastabend @ 2018-03-24 22:10 UTC (permalink / raw)
  To: Eric Dumazet, xiyou.wangcong, jiri, davem; +Cc: netdev
In-Reply-To: <71d325ad-5270-0491-b858-75674d9fe09a@gmail.com>

On 03/24/2018 02:15 PM, Eric Dumazet wrote:
> 
> 
> On 03/24/2018 01:13 PM, John Fastabend wrote:
>> After the qdisc lock was dropped in pfifo_fast we allow multiple
>> enqueue threads and dequeue threads to run in parallel. On the
>> enqueue side the skb bit ooo_okay is used to ensure all related
>> skbs are enqueued in-order. On the dequeue side though there is
>> no similar logic. What we observe is with fewer queues than CPUs
>> it is possible to re-order packets when two instances of
>> __qdisc_run() are running in parallel. Each thread will dequeue
>> a skb and then whichever thread calls the ndo op first will
>> be sent on the wire. This doesn't typically happen because
>> qdisc_run() is usually triggered by the same core that did the
>> enqueue. However, drivers will trigger __netif_schedule()
>> when queues are transitioning from stopped to awake using the
>> netif_tx_wake_* APIs. When this happens netif_schedule() calls
>> qdisc_run() on the same CPU that did the netif_tx_wake_* which
>> is usually done in the interrupt completion context. This CPU
>> is selected with the irq affinity which is unrelated to the
>> enqueue operations.
>>
>> To resolve this we add a RUNNING bit to the qdisc to ensure
>> only a single dequeue per qdisc is running. Enqueue and dequeue
>> operations can still run in parallel and also on multi queue
>> NICs we can still have a dequeue in-flight per qdisc, which
>> is typically per CPU.
>>
>> Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
>> Reported-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> ---
>>  include/net/sch_generic.h |    1 +
>>  net/sched/sch_generic.c   |   13 ++++++++++---
>>  2 files changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>> index 2092d33..8da3267 100644
>> --- a/include/net/sch_generic.h
>> +++ b/include/net/sch_generic.h
>> @@ -30,6 +30,7 @@ struct qdisc_rate_table {
>>  enum qdisc_state_t {
>>  	__QDISC_STATE_SCHED,
>>  	__QDISC_STATE_DEACTIVATED,
>> +	__QDISC_STATE_RUNNING,
>>  };
>>  
>>  struct qdisc_size_table {
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index 7e3fbe9..29a1b47 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -377,12 +377,17 @@ static inline bool qdisc_restart(struct Qdisc *q, int *packets)
>>  	struct netdev_queue *txq;
>>  	struct net_device *dev;
>>  	struct sk_buff *skb;
>> -	bool validate;
>> +	bool more, validate;
>>  
>>  	/* Dequeue packet */
>> +	if (test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))
>> +		return false;
>> +

[...]

> 
> This adds a pair of atomic operations in fast path, only for pfifo_fast sake.
> 

Yeah, we can wrap these in a `if (TCQ_F_NOLOCK)` to avoid it in cases
its not needed. Alternatively, for net we could turn off NOLOCK in
pfifo_fast and fix it net-next with something more complete.

> qdisc_restart() name is misleading, this is used from __qdisc_run()
> 
> 

I'll change it in net-next.

.John

^ permalink raw reply

* Re: switchdev and dsa
From: Andrew Lunn @ 2018-03-24 21:31 UTC (permalink / raw)
  To: Ran Shalit; +Cc: netdev
In-Reply-To: <CAJ2oMhLQKUkfpQQE3mzcGB4pnGUY8kFo+PRLxhCPKvNm0moWqA@mail.gmail.com>

> I use older kernel (2.6), because these is the Linux from provider chip (TI).

TI really only provides 2.6? I strongly suggest you choose a different
SoC vendor.

> Is there an example maybe for this which someone can provide ?

https://elixir.bootlin.com/linux/v2.6.39.4/source/arch/arm/mach-orion5x/rd88f5181l-ge-setup.c

	Andrew

^ permalink raw reply

* Re: [net PATCH] net: sched, fix OOO packets with pfifo_fast
From: Eric Dumazet @ 2018-03-24 21:15 UTC (permalink / raw)
  To: John Fastabend, xiyou.wangcong, jiri, davem; +Cc: netdev
In-Reply-To: <20180324201338.7661.44440.stgit@john-Precision-Tower-5810>



On 03/24/2018 01:13 PM, John Fastabend wrote:
> After the qdisc lock was dropped in pfifo_fast we allow multiple
> enqueue threads and dequeue threads to run in parallel. On the
> enqueue side the skb bit ooo_okay is used to ensure all related
> skbs are enqueued in-order. On the dequeue side though there is
> no similar logic. What we observe is with fewer queues than CPUs
> it is possible to re-order packets when two instances of
> __qdisc_run() are running in parallel. Each thread will dequeue
> a skb and then whichever thread calls the ndo op first will
> be sent on the wire. This doesn't typically happen because
> qdisc_run() is usually triggered by the same core that did the
> enqueue. However, drivers will trigger __netif_schedule()
> when queues are transitioning from stopped to awake using the
> netif_tx_wake_* APIs. When this happens netif_schedule() calls
> qdisc_run() on the same CPU that did the netif_tx_wake_* which
> is usually done in the interrupt completion context. This CPU
> is selected with the irq affinity which is unrelated to the
> enqueue operations.
> 
> To resolve this we add a RUNNING bit to the qdisc to ensure
> only a single dequeue per qdisc is running. Enqueue and dequeue
> operations can still run in parallel and also on multi queue
> NICs we can still have a dequeue in-flight per qdisc, which
> is typically per CPU.
> 
> Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
> Reported-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  include/net/sch_generic.h |    1 +
>  net/sched/sch_generic.c   |   13 ++++++++++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 2092d33..8da3267 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -30,6 +30,7 @@ struct qdisc_rate_table {
>  enum qdisc_state_t {
>  	__QDISC_STATE_SCHED,
>  	__QDISC_STATE_DEACTIVATED,
> +	__QDISC_STATE_RUNNING,
>  };
>  
>  struct qdisc_size_table {
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 7e3fbe9..29a1b47 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -377,12 +377,17 @@ static inline bool qdisc_restart(struct Qdisc *q, int *packets)
>  	struct netdev_queue *txq;
>  	struct net_device *dev;
>  	struct sk_buff *skb;
> -	bool validate;
> +	bool more, validate;
>  
>  	/* Dequeue packet */
> +	if (test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))
> +		return false;
> +
>  	skb = dequeue_skb(q, &validate, packets);
> -	if (unlikely(!skb))
> +	if (unlikely(!skb)) {
> +		clear_bit(__QDISC_STATE_RUNNING, &q->state);
>  		return false;
> +	}
>  
>  	if (!(q->flags & TCQ_F_NOLOCK))
>  		root_lock = qdisc_lock(q);
> @@ -390,7 +395,9 @@ static inline bool qdisc_restart(struct Qdisc *q, int *packets)
>  	dev = qdisc_dev(q);
>  	txq = skb_get_tx_queue(dev, skb);
>  
> -	return sch_direct_xmit(skb, q, dev, txq, root_lock, validate);
> +	more = sch_direct_xmit(skb, q, dev, txq, root_lock, validate);
> +	clear_bit(__QDISC_STATE_RUNNING, &q->state);
> +	return more;
>  }
>  
>  void __qdisc_run(struct Qdisc *q)
> 


This adds a pair of atomic operations in fast path, only for pfifo_fast sake.

qdisc_restart() name is misleading, this is used from __qdisc_run()

^ permalink raw reply

* Re: switchdev and dsa
From: Ran Shalit @ 2018-03-24 21:11 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20180324205148.GZ24361@lunn.ch>

On Sat, Mar 24, 2018 at 11:51 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> On Sat, Mar 24, 2018 at 11:35:38PM +0300, Ran Shalit wrote:
>> On Sat, Mar 24, 2018 at 10:02 PM, Andrew Lunn <andrew@lunn.ch> wrote:
>> > On Sat, Mar 24, 2018 at 07:12:15PM +0300, Ran Shalit wrote:
>> >> Hello,
>> >>
>> >> I am new with switchdev and dsa.
>> >> I would please like to ask if configuring a device tree/board file
>> >> with a switch (for example marvell switch) will provide all ports
>> >> behins the switch as valid interface (ethX) ports in userspace
>> >> ifconfig command ?
>> >
>> > Hi Ranran
>> >
>> > Yes, that is the idea.
>> >
>>
>> I see that in marvell's switch, for example with managed cpu, all
>> ports are disabled.
>> Where in dsa driver the are the ports enabled ? I can't see it in
>> drivers. I use 88E6176.
>
> You need to explain in more detail what you mean by a port being
> enabled?
>

I see in DS that bits 0-1 in port control register (offset 0x4) are
responsible for port enabled/disabled in switch.
But now I think I understand how it works.
I do see that  mv88e6123_61_65_setup() , do set these bits 0-1 to '1'
 (00- disabled, 11- forwarding)

Yet, I would like to ask one more question about it.
I use older kernel (2.6), because these is the Linux from provider chip (TI).
I did not find any example how to configure marvell switch  in board
file, but only in device tree. (we use 88E6176, but any other switch
example would help).
Is there an example maybe for this which someone can provide ?

Thanks again,
ranran
>         Andrew

^ permalink raw reply

* Re: [PATCH 0/7] Netfilter fixes for net
From: David Miller @ 2018-03-24 21:10 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <20180324203423.4513-1-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Sat, 24 Mar 2018 21:34:16 +0100

> The following patchset contains Netfilter fixes for your net tree,
> they are:
 ...
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thank you.

^ permalink raw reply

* [PATCH] net: qmi_wwan: add BroadMobi BM806U 2020:2033
From: Pawel Dembicki @ 2018-03-24 21:08 UTC (permalink / raw)
  To: linux-usb; +Cc: Pawel Dembicki, Bjørn Mork, netdev, linux-kernel

BroadMobi BM806U is an Qualcomm MDM9225 based 3G/4G modem.
Tested hardware BM806U is mounted on D-Link DWR-921-C3 router.
The USB id is added to qmi_wwan.c to allow QMI communication with
the BM806U.

Tested on 4.14 kernel and OpenWRT.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 76ac480..7ced288 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1180,6 +1180,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x19d2, 0x2002, 4)},	/* ZTE (Vodafone) K3765-Z */
 	{QMI_FIXED_INTF(0x2001, 0x7e19, 4)},	/* D-Link DWM-221 B1 */
 	{QMI_FIXED_INTF(0x2001, 0x7e35, 4)},	/* D-Link DWM-222 */
+	{QMI_FIXED_INTF(0x2020, 0x2033, 4)},	/* BroadMobi BM806U */
 	{QMI_FIXED_INTF(0x0f3d, 0x68a2, 8)},    /* Sierra Wireless MC7700 */
 	{QMI_FIXED_INTF(0x114f, 0x68a2, 8)},    /* Sierra Wireless MC7750 */
 	{QMI_FIXED_INTF(0x1199, 0x68a2, 8)},	/* Sierra Wireless MC7710 in QMI mode */
-- 
2.7.4

^ permalink raw reply related

* Re: switchdev and dsa
From: Andrew Lunn @ 2018-03-24 20:51 UTC (permalink / raw)
  To: Ran Shalit; +Cc: netdev
In-Reply-To: <CAJ2oMhKde9+U+ts1_8-8+ft3k1S=YVpdcqHztfrFHYQRFQod0g@mail.gmail.com>

On Sat, Mar 24, 2018 at 11:35:38PM +0300, Ran Shalit wrote:
> On Sat, Mar 24, 2018 at 10:02 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> > On Sat, Mar 24, 2018 at 07:12:15PM +0300, Ran Shalit wrote:
> >> Hello,
> >>
> >> I am new with switchdev and dsa.
> >> I would please like to ask if configuring a device tree/board file
> >> with a switch (for example marvell switch) will provide all ports
> >> behins the switch as valid interface (ethX) ports in userspace
> >> ifconfig command ?
> >
> > Hi Ranran
> >
> > Yes, that is the idea.
> >
> 
> I see that in marvell's switch, for example with managed cpu, all
> ports are disabled.
> Where in dsa driver the are the ports enabled ? I can't see it in
> drivers. I use 88E6176.

You need to explain in more detail what you mean by a port being
enabled?

	Andrew

^ permalink raw reply

* Re: switchdev and dsa
From: Ran Shalit @ 2018-03-24 20:35 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20180324190250.GI31941@lunn.ch>

On Sat, Mar 24, 2018 at 10:02 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> On Sat, Mar 24, 2018 at 07:12:15PM +0300, Ran Shalit wrote:
>> Hello,
>>
>> I am new with switchdev and dsa.
>> I would please like to ask if configuring a device tree/board file
>> with a switch (for example marvell switch) will provide all ports
>> behins the switch as valid interface (ethX) ports in userspace
>> ifconfig command ?
>
> Hi Ranran
>
> Yes, that is the idea.
>

I see that in marvell's switch, for example with managed cpu, all
ports are disabled.
Where in dsa driver the are the ports enabled ? I can't see it in
drivers. I use 88E6176.


> https://www.netdevconf.org/2.1/session.html?lunn_didelot_fainelli
>
>         Andrew

Hi Andrew,

Thanks a lot for the information and slides.

Ranran

^ permalink raw reply

* [PATCH 7/7] netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6}
From: Pablo Neira Ayuso @ 2018-03-24 20:34 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20180324203423.4513-1-pablo@netfilter.org>

From: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>

skb_header_pointer will copy data into a buffer if data is non linear,
otherwise it will return a pointer in the linear section of the data.
nf_sk_lookup_slow_v{4,6} always copies data of size udphdr but later
accesses memory within the size of tcphdr (th->doff) in case of TCP
packets. This causes a crash when running with KASAN with the following
call stack -

BUG: KASAN: stack-out-of-bounds in xt_socket_lookup_slow_v4+0x524/0x718
net/netfilter/xt_socket.c:178
Read of size 2 at addr ffffffe3d417a87c by task syz-executor/28971
CPU: 2 PID: 28971 Comm: syz-executor Tainted: G    B   W  O    4.9.65+ #1
Call trace:
[<ffffff9467e8d390>] dump_backtrace+0x0/0x428 arch/arm64/kernel/traps.c:76
[<ffffff9467e8d7e0>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226
[<ffffff946842d9b8>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffff946842d9b8>] dump_stack+0xd4/0x124 lib/dump_stack.c:51
[<ffffff946811d4b0>] print_address_description+0x68/0x258 mm/kasan/report.c:248
[<ffffff946811d8c8>] kasan_report_error mm/kasan/report.c:347 [inline]
[<ffffff946811d8c8>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371
[<ffffff946811df44>] kasan_report+0x5c/0x70 mm/kasan/report.c:372
[<ffffff946811bebc>] check_memory_region_inline mm/kasan/kasan.c:308 [inline]
[<ffffff946811bebc>] __asan_load2+0x84/0x98 mm/kasan/kasan.c:739
[<ffffff94694d6f04>] __tcp_hdrlen include/linux/tcp.h:35 [inline]
[<ffffff94694d6f04>] xt_socket_lookup_slow_v4+0x524/0x718 net/netfilter/xt_socket.c:178

Fix this by copying data into appropriate size headers based on protocol.

Fixes: a583636a83ea ("inet: refactor inet[6]_lookup functions to take skb")
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/nf_socket_ipv4.c | 6 ++++--
 net/ipv6/netfilter/nf_socket_ipv6.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/netfilter/nf_socket_ipv4.c b/net/ipv4/netfilter/nf_socket_ipv4.c
index e9293bdebba0..4824b1e183a1 100644
--- a/net/ipv4/netfilter/nf_socket_ipv4.c
+++ b/net/ipv4/netfilter/nf_socket_ipv4.c
@@ -108,10 +108,12 @@ struct sock *nf_sk_lookup_slow_v4(struct net *net, const struct sk_buff *skb,
 	int doff = 0;
 
 	if (iph->protocol == IPPROTO_UDP || iph->protocol == IPPROTO_TCP) {
-		struct udphdr _hdr, *hp;
+		struct tcphdr _hdr;
+		struct udphdr *hp;
 
 		hp = skb_header_pointer(skb, ip_hdrlen(skb),
-					sizeof(_hdr), &_hdr);
+					iph->protocol == IPPROTO_UDP ?
+					sizeof(*hp) : sizeof(_hdr), &_hdr);
 		if (hp == NULL)
 			return NULL;
 
diff --git a/net/ipv6/netfilter/nf_socket_ipv6.c b/net/ipv6/netfilter/nf_socket_ipv6.c
index ebb2bf84232a..f14de4b6d639 100644
--- a/net/ipv6/netfilter/nf_socket_ipv6.c
+++ b/net/ipv6/netfilter/nf_socket_ipv6.c
@@ -116,9 +116,11 @@ struct sock *nf_sk_lookup_slow_v6(struct net *net, const struct sk_buff *skb,
 	}
 
 	if (tproto == IPPROTO_UDP || tproto == IPPROTO_TCP) {
-		struct udphdr _hdr, *hp;
+		struct tcphdr _hdr;
+		struct udphdr *hp;
 
-		hp = skb_header_pointer(skb, thoff, sizeof(_hdr), &_hdr);
+		hp = skb_header_pointer(skb, thoff, tproto == IPPROTO_UDP ?
+					sizeof(*hp) : sizeof(_hdr), &_hdr);
 		if (hp == NULL)
 			return NULL;
 
-- 
2.11.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox