Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net 0/2] net: dsa: bcm_sf2: Couple of fixes
From: David Miller @ 2018-10-11 22:20 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, andrew, vivien.didelot
In-Reply-To: <05eb558e-b20c-fab0-e111-f73948069e5a@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 11 Oct 2018 15:02:54 -0700

> On 10/10/2018 10:53 PM, David Miller wrote:
>> From: Florian Fainelli <f.fainelli@gmail.com>
>> Date: Tue,  9 Oct 2018 16:48:56 -0700
>> 
>>> Here are two fixes for the bcm_sf2 driver that were found during
>>> testing unbind and analysing another issue during system
>>> suspend/resume.
>> 
>> Series applied and queued up for -stable, thanks.
> 
> Did you push that series yet?
> 
> git pull
> remote: Counting objects: 171, done.
> remote: Compressing objects: 100% (71/71), done.
> remote: Total 171 (delta 144), reused 118 (delta 100)
> Receiving objects: 100% (171/171), 32.80 KiB | 16.40 MiB/s, done.
> Resolving deltas: 100% (144/144), completed with 61 local objects.
> From https://git.kernel.org/pub/scm/linux/kernel/git/davem/net
>    52b5d6f5dcf0..052858663db3  master     -> davem-net/master
> 
> git rebase net/master sf2-fixes
> First, rewinding head to replay your work on top of it...
> Applying: net: dsa: bcm_sf2: Fix unbind ordering
> Applying: net: dsa: bcm_sf2: Call setup during switch resume

I applied them to net-next, sorry about that.

I'll add them to 'net' too.

Done.

^ permalink raw reply

* Re: [PATCH net 0/2] net: dsa: bcm_sf2: Couple of fixes
From: Florian Fainelli @ 2018-10-11 22:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, andrew, vivien.didelot
In-Reply-To: <20181010.225335.776879235761704804.davem@davemloft.net>

On 10/10/2018 10:53 PM, David Miller wrote:
> From: Florian Fainelli <f.fainelli@gmail.com>
> Date: Tue,  9 Oct 2018 16:48:56 -0700
> 
>> Here are two fixes for the bcm_sf2 driver that were found during
>> testing unbind and analysing another issue during system
>> suspend/resume.
> 
> Series applied and queued up for -stable, thanks.

Did you push that series yet?

git pull
remote: Counting objects: 171, done.
remote: Compressing objects: 100% (71/71), done.
remote: Total 171 (delta 144), reused 118 (delta 100)
Receiving objects: 100% (171/171), 32.80 KiB | 16.40 MiB/s, done.
Resolving deltas: 100% (144/144), completed with 61 local objects.
>From https://git.kernel.org/pub/scm/linux/kernel/git/davem/net
   52b5d6f5dcf0..052858663db3  master     -> davem-net/master

git rebase net/master sf2-fixes
First, rewinding head to replay your work on top of it...
Applying: net: dsa: bcm_sf2: Fix unbind ordering
Applying: net: dsa: bcm_sf2: Call setup during switch resume
-- 
Florian

^ permalink raw reply

* Some suggestions for tc-tests
From: Cong Wang @ 2018-10-11 21:53 UTC (permalink / raw)
  To: Roman Mashak; +Cc: Jamal Hadi Salim, Linux Kernel Network Developers

Hi,

I tried to run tc-tests with some old iproute2 package, it is painful.
I'd suggest the following improvements:

1. Create veth pair devices by its own. The most important thing for
tc-tests is to automate everything, it is not friendly for users to
create their own veth pair named v0p0 to just run the tests. tc-tests
should be able to create a veth pair with random names and clean up
them once it is finished.

2. Test iproute2 version or capability. Apparently my iproute2 doesn't
support tc filter chain yet, this makes many tests failed. Ideally,
each test should be able to check if the iproute2 supports the thing
it wants to test, if not just skip it, at least by default.

3. Is there anything in the tests that can be done only with Python3?
If we could lower the requirement to Python2, then it would be easier
to setup and run these tests.

Thanks!

^ permalink raw reply

* Re: [PATCH] virtio_net: enable tx after resuming from suspend
From: ake @ 2018-10-12  4:30 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, David S. Miller, virtualization, netdev,
	linux-kernel
In-Reply-To: <a156fd7e-e36e-e4e9-c240-3117ecc5b1b4@redhat.com>



On 2018年10月11日 22:06, Jason Wang wrote:
> 
> 
> On 2018年10月11日 18:22, ake wrote:
>>
>> On 2018年10月11日 18:44, Jason Wang wrote:
>>>
>>> On 2018年10月11日 15:51, Ake Koomsin wrote:
>>>> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>>>> disabled the virtio tx before going to suspend to avoid a use after
>>>> free.
>>>> However, after resuming, it causes the virtio_net device to lose its
>>>> network connectivity.
>>>>
>>>> To solve the issue, we need to enable tx after resuming.
>>>>
>>>> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during
>>>> reset")
>>>> Signed-off-by: Ake Koomsin <ake@igel.co.jp>
>>>> ---
>>>>    drivers/net/virtio_net.c | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>> index dab504ec5e50..3453d80f5f81 100644
>>>> --- a/drivers/net/virtio_net.c
>>>> +++ b/drivers/net/virtio_net.c
>>>> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct
>>>> virtio_device *vdev)
>>>>        }
>>>>          netif_device_attach(vi->dev);
>>>> +    netif_start_queue(vi->dev);
>>> I believe this is duplicated with netif_tx_wake_all_queues() in
>>> netif_device_attach() above?
>> Thank you for your review.
>>
>> If both netif_tx_wake_all_queues() and netif_start_queue() result in
>> clearing __QUEUE_STATE_DRV_XOFF, then is it possible that some
>> conditions in netif_device_attach() is not satisfied?
> 
> Yes, maybe. One case I can see now is when the device is down, in this
> case netif_device_attach() won't try to wakeup the queue.
> 
>>   Without
>> netif_start_queue(), the virtio_net device does not resume properly
>> after waking up.
> 
> How do you trigger the issue? Just do suspend/resume?

Yes, simply suspend and resume.

Here is how I trigger the issue:

1) Start the Virtual Machine Manager GUI program.
2) Create a guest Linux OS. Make sure that the guest OS kernel is
   >= 4.12. Make sure that it uses virtio_net as its network device.
   In addition, make sure that the video adapter is VGA. Otherwise,
   waking up with the virtual power button does not work.
3) After installing the guest OS, log in, and test the network
   connectivity by ping the host machine.
4) Suspend. After this, the screen is blank.
5) Resume by hitting the virtual power button. The login screen
   appears again.
6) Log in again. The guest loses its network connection.

In my test:
Guest: Ubuntu 16.04/18.04 with kernel 4.15.0-36-generic
Host: Ubuntu 16.04 with kernel 4.15.0-36-generic/4.4.0-137-generic

>>
>> Is it better to report this as a bug first?
> 
> Nope, you're very welcome to post patch directly.
> 
>> If I am to do more
>> investigation, what areas should I look into?
> 
> As you've figured out, you can start with why netif_tx_wake_all_queues()
> were not executed?
> 
> (Btw, does the issue disappear if you move netif_tx_disable() under the
> check of netif_running() in virtnet_freeze_down()?)

The issue disappears if I move netif_tx_disable() under the check of
netif_running() in virtnet_freeze_down(). Moving netif_tx_disable()
is probably better as its logic is consistent with
netif_device_attach() implementation. If you are OK with this idea,
I will submit another patch.

> Thanks
> 
>>
>> Best Regards
>> Ake Koomsin
>>
> 

Best Regards

^ permalink raw reply

* [PATCH iproute2 net-next] ipneigh: support for NTF_EXT_LEARNED flag on neigh entries
From: Roopa Prabhu @ 2018-10-11 20:45 UTC (permalink / raw)
  To: dsahern; +Cc: netdev

From: Roopa Prabhu <roopa@cumulusnetworks.com>

Adds new option extern_learn to set NTF_EXT_LEARNED flag
on neigh entries.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 ip/ipneigh.c            | 7 ++++++-
 man/man8/ip-neighbour.8 | 9 ++++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/ip/ipneigh.c b/ip/ipneigh.c
index 165546e..042d01f 100644
--- a/ip/ipneigh.c
+++ b/ip/ipneigh.c
@@ -48,7 +48,7 @@ static void usage(void)
 {
 	fprintf(stderr, "Usage: ip neigh { add | del | change | replace }\n"
 			"                { ADDR [ lladdr LLADDR ] [ nud STATE ] | proxy ADDR } [ dev DEV ]\n");
-	fprintf(stderr, "                                 [ router ]\n\n");
+	fprintf(stderr, "                                 [ router ] [ extern_learn ]\n\n");
 	fprintf(stderr, "       ip neigh { show | flush } [ proxy ] [ to PREFIX ] [ dev DEV ] [ nud STATE ]\n");
 	fprintf(stderr, "                                 [ vrf NAME ]\n\n");
 	fprintf(stderr, "STATE := { permanent | noarp | stale | reachable | none |\n"
@@ -142,6 +142,8 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
 			req.ndm.ndm_flags |= NTF_PROXY;
 		} else if (strcmp(*argv, "router") == 0) {
 			req.ndm.ndm_flags |= NTF_ROUTER;
+		} else if (matches(*argv, "extern_learn") == 0) {
+			req.ndm.ndm_flags |= NTF_EXT_LEARNED;
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
 			dev = *argv;
@@ -354,6 +356,9 @@ int print_neigh(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (r->ndm_flags & NTF_PROXY)
 		print_null(PRINT_ANY, "proxy", " %s", "proxy");
 
+	if (r->ndm_flags & NTF_EXT_LEARNED)
+		print_null(PRINT_ANY, "extern_learn", " %s ", "extern_learn");
+
 	if (show_stats) {
 		if (tb[NDA_CACHEINFO])
 			print_cacheinfo(RTA_DATA(tb[NDA_CACHEINFO]));
diff --git a/man/man8/ip-neighbour.8 b/man/man8/ip-neighbour.8
index db286d1..4a672bb 100644
--- a/man/man8/ip-neighbour.8
+++ b/man/man8/ip-neighbour.8
@@ -24,7 +24,8 @@ ip-neighbour \- neighbour/arp tables management.
 .IR ADDR " } [ "
 .B  dev
 .IR DEV " ] [ "
-.BR router " ] "
+.BR router " ] [ "
+.BR extern_learn " ]"
 
 .ti -8
 .BR "ip neigh" " { " show " | " flush " } [ " proxy " ] [ " to
@@ -85,6 +86,12 @@ indicates whether we are proxying for this neigbour entry
 indicates whether neigbour is a router
 
 .TP
+.BI extern_learn
+this neigh entry was learned externally. This option can be used to
+indicate to the kernel that this is a controller learnt dynamic entry.
+Kernel will not gc such an entry.
+
+.TP
 .BI lladdr " LLADDRESS"
 the link layer address of the neighbour.
 .I LLADDRESS
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH net-next] rxrpc: Remove set but not used variable 'ioc'
From: David Howells @ 2018-10-11 20:38 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: dhowells, YueHaibing, davem, linux-afs, netdev, kernel-janitors
In-Reply-To: <20181011083807.tvhfguimyw7glrtp@mwanda>

Dan Carpenter <dan.carpenter@oracle.com> wrote:

> I told him to do that because it wasn't a bugfix...  Probably just Fixes
> is the right thing to use though.

But the correct fix *is* a bug fix.

David

^ permalink raw reply

* [PATCH net-next 2/2] net: phy: simplify handling of PHY_RESUMING in state machine
From: Heiner Kallweit @ 2018-10-11 20:37 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <a3f84d67-c7f1-82d5-1956-bd348ab2b5c0@gmail.com>

Simplify code for handling state PHY_RESUMING, no functional change
intended.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c | 43 ++++++++++++++-----------------------------
 1 file changed, 14 insertions(+), 29 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 696955d38..d03bdbbd1 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1059,41 +1059,26 @@ void phy_state_machine(struct work_struct *work)
 	case PHY_RESUMING:
 		if (AUTONEG_ENABLE == phydev->autoneg) {
 			err = phy_aneg_done(phydev);
-			if (err < 0)
+			if (err < 0) {
 				break;
-
-			/* err > 0 if AN is done.
-			 * Otherwise, it's 0, and we're  still waiting for AN
-			 */
-			if (err > 0) {
-				err = phy_read_status(phydev);
-				if (err)
-					break;
-
-				if (phydev->link) {
-					phydev->state = PHY_RUNNING;
-					phy_link_up(phydev);
-				} else	{
-					phydev->state = PHY_NOLINK;
-					phy_link_down(phydev, false);
-				}
-			} else {
+			} else if (!err) {
 				phydev->state = PHY_AN;
 				phydev->link_timeout = PHY_AN_TIMEOUT;
-			}
-		} else {
-			err = phy_read_status(phydev);
-			if (err)
 				break;
-
-			if (phydev->link) {
-				phydev->state = PHY_RUNNING;
-				phy_link_up(phydev);
-			} else	{
-				phydev->state = PHY_NOLINK;
-				phy_link_down(phydev, false);
 			}
 		}
+
+		err = phy_read_status(phydev);
+		if (err)
+			break;
+
+		if (phydev->link) {
+			phydev->state = PHY_RUNNING;
+			phy_link_up(phydev);
+		} else	{
+			phydev->state = PHY_NOLINK;
+			phy_link_down(phydev, false);
+		}
 		break;
 	}
 
-- 
2.19.1

^ permalink raw reply related

* [PATCH net-next 1/2] net: phy: improve handling of PHY_RUNNING in state machine
From: Heiner Kallweit @ 2018-10-11 20:36 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <a3f84d67-c7f1-82d5-1956-bd348ab2b5c0@gmail.com>

Handling of state PHY_RUNNING seems to be more complex than it needs
to be. If not polling, then we don't have to do anything, we'll
receive an interrupt and go to state PHY_CHANGELINK once the link
goes down. If polling and link is down, we don't have to go the
extra mile over PHY_CHANGELINK and call phy_read_status() again
but can set status PHY_NOLINK directly.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c | 29 +++++++++--------------------
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 704428211..696955d38 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -941,7 +941,6 @@ void phy_state_machine(struct work_struct *work)
 	bool needs_aneg = false, do_suspend = false;
 	enum phy_state old_state;
 	int err = 0;
-	int old_link;
 
 	mutex_lock(&phydev->lock);
 
@@ -1025,26 +1024,16 @@ void phy_state_machine(struct work_struct *work)
 		}
 		break;
 	case PHY_RUNNING:
-		/* Only register a CHANGE if we are polling and link changed
-		 * since latest checking.
-		 */
-		if (phy_polling_mode(phydev)) {
-			old_link = phydev->link;
-			err = phy_read_status(phydev);
-			if (err)
-				break;
+		if (!phy_polling_mode(phydev))
+			break;
 
-			if (old_link != phydev->link)
-				phydev->state = PHY_CHANGELINK;
-		}
-		/*
-		 * Failsafe: check that nobody set phydev->link=0 between two
-		 * poll cycles, otherwise we won't leave RUNNING state as long
-		 * as link remains down.
-		 */
-		if (!phydev->link && phydev->state == PHY_RUNNING) {
-			phydev->state = PHY_CHANGELINK;
-			phydev_err(phydev, "no link in PHY_RUNNING\n");
+		err = phy_read_status(phydev);
+		if (err)
+			break;
+
+		if (!phydev->link) {
+			phydev->state = PHY_NOLINK;
+			phy_link_down(phydev, true);
 		}
 		break;
 	case PHY_CHANGELINK:
-- 
2.19.1

^ permalink raw reply related

* [PATCH net-next 0/2] net: phy: improve and simplify state machine
From: Heiner Kallweit @ 2018-10-11 20:35 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org

Improve / simplify handling of states PHY_RUNNING and PHY_RESUMING in
phylib state machine.

Heiner Kallweit (2):
  net: phy: improve handling of PHY_RUNNING in state machine
  net: phy: simplify handling of PHY_RESUMING in state machine

 drivers/net/phy/phy.c | 72 ++++++++++++++-----------------------------
 1 file changed, 23 insertions(+), 49 deletions(-)

-- 
2.19.1

^ permalink raw reply

* Fwd: [PATCH stable 4.4 V2 0/6] fix SegmentSmack in stable branch (CVE-2018-5390)
From: salil GK @ 2018-10-12  3:53 UTC (permalink / raw)
  To: maowenan, netdev, dwmw2, eric.dumazet, davem, stable,
	linux-kernel, gregkh
In-Reply-To: <CAPACB-yF16qtiYn9e41_iFNRUzHB349NEiAGubUvW-kiUwpRMQ@mail.gmail.com>

I was looking for fix for CVE-2018-5390 and CVE-2018-5390) in 4.18.x.
Will these fix be available in 4.18 train ?

PS: Sorry for sending again - I got a rejection message as my previous
message contains html tags !

Thanks
~S

On Oct 11, 2018 7:38 PM, "Greg KH" <gregkh@linux-foundation.org> wrote:

On Wed, Sep 26, 2018 at 10:21:21PM +0200, Greg KH wrote:
> On Tue, Sep 25, 2018 at 10:10:15PM +0800, maowenan wrote:
> > Hi Greg:
> >
> > can you review this patch set?
>
> It is still in the queue, don't worry.  It will take some more time to
> properly review and test it.
>
> Ideally you could get someone else to test this and provide a
> "tested-by:" tag for it?

All now queued up, let's see what breaks :)

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH stable 4.4 V2 0/6] fix SegmentSmack in stable branch (CVE-2018-5390)
From: maowenan @ 2018-10-12  3:39 UTC (permalink / raw)
  To: salil GK, gregkh
  Cc: netdev, dwmw2, eric.dumazet, davem, stable, linux-kernel,
	maowenan
In-Reply-To: <CAPACB-yF16qtiYn9e41_iFNRUzHB349NEiAGubUvW-kiUwpRMQ@mail.gmail.com>



On 2018/10/12 10:28, salil GK wrote:
> I was looking for fix for CVE-2018-5390 and CVE-2018-5390) in 4.18.x. Will these fix be available in 4.18 train ?

The fixes of CVE-2018-5390 have already existed in stable 4.18. These fixes only available with < 4.9 that don't using RB tree.

58152ec tcp: add tcp_ooo_try_coalesce() helper
8541b21 tcp: call tcp_drop() from tcp_data_queue_ofo()
3d4bf93 tcp: detect malicious patterns in tcp_collapse_ofo_queue()
f4a3313 tcp: avoid collapses in tcp_prune_queue() if possible
72cd43b tcp: free batches of packets in tcp_prune_ofo_queue()


> 
> Thanks
> ~S
> 
> On Oct 11, 2018 7:38 PM, "Greg KH" <gregkh@linux-foundation.org <mailto:gregkh@linux-foundation.org>> wrote:
> 
>     On Wed, Sep 26, 2018 at 10:21:21PM +0200, Greg KH wrote:
>     > On Tue, Sep 25, 2018 at 10:10:15PM +0800, maowenan wrote:
>     > > Hi Greg:
>     > >
>     > > can you review this patch set?
>     >
>     > It is still in the queue, don't worry.  It will take some more time to
>     > properly review and test it.
>     >
>     > Ideally you could get someone else to test this and provide a
>     > "tested-by:" tag for it?
> 
>     All now queued up, let's see what breaks :)
> 
>     thanks,
> 
>     greg k-h
> 
> 

^ permalink raw reply

* [net  1/1] tipc: initialize broadcast link stale counter correctly
From: Jon Maloy @ 2018-10-11 20:02 UTC (permalink / raw)
  To: davem, netdev
  Cc: gordan.mihaljevic, tung.q.nguyen, hoang.h.le, jon.maloy,
	canh.d.luu, ying.xue, tipc-discussion

In the commit referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
unicast links to the affected node.

Unfortunately, this 'improvement' introduced two bugs, which each and
one alone cause only limited problems, but combined lead to seemingly
stochastic unicast link resets, depending on the amount of broadcast
traffic transmitted.

The first issue, a missing initialization of the 'tolerance' field of
the receiver broadcast link, was recently fixed by commit 047491ea334a
("tipc: set link tolerance correctly in broadcast link").

Ths second issue, where we omit to reset the 'stale_cnt' field of
the same link after a 'stale' period is over, leads to this counter
accumulating over time, and in the absence of the 'tolerance' criteria
leads to the above described symptoms. This commit adds the missing
initialization.

Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet
retransmission")

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/link.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index f6552e4..201c3b5 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1041,6 +1041,7 @@ static int tipc_link_retrans(struct tipc_link *l, struct tipc_link *r,
 	if (r->last_retransm != buf_seqno(skb)) {
 		r->last_retransm = buf_seqno(skb);
 		r->stale_limit = jiffies + msecs_to_jiffies(r->tolerance);
+		r->stale_cnt = 0;
 	} else if (++r->stale_cnt > 99 && time_after(jiffies, r->stale_limit)) {
 		link_retransmit_failure(l, skb);
 		if (link_is_bc_sndlink(l))
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: Jamal Hadi Salim @ 2018-10-11 19:54 UTC (permalink / raw)
  To: David Ahern, Sowmini Varadhan, Stephen Hemminger
  Cc: David Ahern, netdev, davem
In-Reply-To: <199349c3-fd14-1401-86d1-814776372917@gmail.com>

On 2018-10-11 2:44 p.m., David Ahern wrote:
> On 10/11/18 12:05 PM, Jamal Hadi Salim wrote:
>> On 2018-10-11 1:04 p.m., David Ahern wrote:

> 
> I meant the general API of users passing filter arguments as attributes
> to the dump (or values in the header) -- KIND, MASTER, device index,
> etc. This is an existing API and existing capability.
> 

which i referred to as use-case driven. It is not unreasonable
to optimize for the most common - but every time somebody
comes up with something new you need to patch the kernel.


> I disagree with your overall premise of bpf the end-all hammer. It is a
> tool but not the only tool. For starters, you are proposing building the
> message, run the filter on it, and potentially back the message up to
> drop the recently added piece because the filter does not want it
> included. That is still wasting a lot of cpu cycles to build and drop. I
> am thinking about scale to 1 million routes -- I do not need the dump
> loop building a message for 1 million entries only to drop 99% of them.
> That is crazy.
> 

My earlier suggestion was for somewhere before the skb is formed.
In the vicinity of xxx_fill_info()
The "create skb and drop" kind works already today with some
acrobatics needed for some cases with cbpf.

Is it unfeasible to add an ebpf hook at that point and ask a user
supplied code "is this ok to send?" - this is no different
than doing a "get by key" operation where key/filter is any arbitrary
construction of fields rtm understands (including the ones you
provided like table index) that are passed in the user program.
Classical "select" mechanism in database tables

> The way the kernel manages route tables says I should pass in the table
> id as it is a major differentiator on what is returned. From there
> lookup the specific table (I need to fix this part per my response to
> Andrew), and then only walk it. The existing semantics, capabilities
> that exist for other dump commands is the most efficient for some of
> these high level, big hammer filters.
> 

Sure.

> What you want gets into the tiniest of details and yes the imagination
> can go wild with combinations of filter options. So maybe this scanning
> of post-built messages is reasonable *after* the high level sorting is done.
> 

That doesnt require any change.

cheers,
jamal

^ permalink raw reply

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: David Miller @ 2018-10-11 19:43 UTC (permalink / raw)
  To: sowmini.varadhan; +Cc: dsahern, jhs, stephen, dsahern, netdev
In-Reply-To: <20181011193248.GD17841@oracle.com>

From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Thu, 11 Oct 2018 15:32:48 -0400

> Without getting into Ahern's patchset, which he obviously feels 
> quite passionately about..
> 
> On (10/11/18 12:28), David Miller wrote:
>> 
>> Once you've composed the message, the whole point of filtering is lost.
> 
> it would be nice to apply the filter *before* constructing the skb, 
> but afaict most things in BPF today only operate on sk_buffs. How should
> we use *BPF on something other than an sk_buff?

Personally I'm not going to spend cycles on that.

What's important to me in the short term is that David's patch set is
an appropriate way to add filtering, using existing facilities and
mechanisms that already exist for that purpose.

If people want to explore a possible eBPF mechanism for the future,
with an emphasis on "future", feel free to explore to your heart's
content.

But that doesn't exist in any form whatsoever, so that's not what we
should be talking about here.

^ permalink raw reply

* [PATCH net-next v2] vxlan: support NTF_USE refresh of fdb entries
From: Roopa Prabhu @ 2018-10-11 19:35 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This makes use of NTF_USE in vxlan driver consistent
with bridge driver.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
v2: fix patch prefix

 drivers/net/vxlan.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index fb0cdbb..018406c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -697,6 +697,7 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 			    __be16 port, __be32 src_vni, __be32 vni,
 			    __u32 ifindex, __u8 ndm_flags)
 {
+	__u8 fdb_flags = (ndm_flags & ~NTF_USE);
 	struct vxlan_rdst *rd = NULL;
 	struct vxlan_fdb *f;
 	int notify = 0;
@@ -714,8 +715,8 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 			f->updated = jiffies;
 			notify = 1;
 		}
-		if (f->flags != ndm_flags) {
-			f->flags = ndm_flags;
+		if (f->flags != fdb_flags) {
+			f->flags = fdb_flags;
 			f->updated = jiffies;
 			notify = 1;
 		}
@@ -737,6 +738,9 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 				return rc;
 			notify |= rc;
 		}
+
+		if (ndm_flags & NTF_USE)
+			f->used = jiffies;
 	} else {
 		if (!(flags & NLM_F_CREATE))
 			return -ENOENT;
@@ -748,7 +752,7 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 
 		netdev_dbg(vxlan->dev, "add %pM -> %pIS\n", mac, ip);
 		rc = vxlan_fdb_create(vxlan, mac, ip, state, port, src_vni,
-				      vni, ifindex, ndm_flags, &f);
+				      vni, ifindex, fdb_flags, &f);
 		if (rc < 0)
 			return rc;
 		notify = 1;
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH cumulus-4.18.y] vxlan: support NTF_USE refresh of fdb entries
From: Roopa Prabhu @ 2018-10-11 19:34 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <1539286408-25597-1-git-send-email-roopa@cumulusnetworks.com>

On Thu, Oct 11, 2018 at 12:33 PM Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This makes use of NTF_USE in vxlan driver consistent
> with bridge driver.
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---

pls ignore. wrong patch prefix :)

^ permalink raw reply

* [PATCH cumulus-4.18.y] vxlan: support NTF_USE refresh of fdb entries
From: Roopa Prabhu @ 2018-10-11 19:33 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This makes use of NTF_USE in vxlan driver consistent
with bridge driver.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 drivers/net/vxlan.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index fb0cdbb..018406c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -697,6 +697,7 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 			    __be16 port, __be32 src_vni, __be32 vni,
 			    __u32 ifindex, __u8 ndm_flags)
 {
+	__u8 fdb_flags = (ndm_flags & ~NTF_USE);
 	struct vxlan_rdst *rd = NULL;
 	struct vxlan_fdb *f;
 	int notify = 0;
@@ -714,8 +715,8 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 			f->updated = jiffies;
 			notify = 1;
 		}
-		if (f->flags != ndm_flags) {
-			f->flags = ndm_flags;
+		if (f->flags != fdb_flags) {
+			f->flags = fdb_flags;
 			f->updated = jiffies;
 			notify = 1;
 		}
@@ -737,6 +738,9 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 				return rc;
 			notify |= rc;
 		}
+
+		if (ndm_flags & NTF_USE)
+			f->used = jiffies;
 	} else {
 		if (!(flags & NLM_F_CREATE))
 			return -ENOENT;
@@ -748,7 +752,7 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 
 		netdev_dbg(vxlan->dev, "add %pM -> %pIS\n", mac, ip);
 		rc = vxlan_fdb_create(vxlan, mac, ip, state, port, src_vni,
-				      vni, ifindex, ndm_flags, &f);
+				      vni, ifindex, fdb_flags, &f);
 		if (rc < 0)
 			return rc;
 		notify = 1;
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: Sowmini Varadhan @ 2018-10-11 19:32 UTC (permalink / raw)
  To: David Miller; +Cc: dsahern, jhs, stephen, dsahern, netdev
In-Reply-To: <20181011.122847.1015772734131510783.davem@davemloft.net>

Without getting into Ahern's patchset, which he obviously feels 
quite passionately about..

On (10/11/18 12:28), David Miller wrote:
> 
> Once you've composed the message, the whole point of filtering is lost.

it would be nice to apply the filter *before* constructing the skb, 
but afaict most things in BPF today only operate on sk_buffs. How should
we use *BPF on something other than an sk_buff?

--Sowmini

^ permalink raw reply

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: David Miller @ 2018-10-11 19:28 UTC (permalink / raw)
  To: dsahern; +Cc: jhs, sowmini.varadhan, stephen, dsahern, netdev
In-Reply-To: <199349c3-fd14-1401-86d1-814776372917@gmail.com>

From: David Ahern <dsahern@gmail.com>
Date: Thu, 11 Oct 2018 12:44:49 -0600

> I disagree with your overall premise of bpf the end-all hammer. It is a
> tool but not the only tool. For starters, you are proposing building the
> message, run the filter on it, and potentially back the message up to
> drop the recently added piece because the filter does not want it
> included. That is still wasting a lot of cpu cycles to build and drop. I
> am thinking about scale to 1 million routes -- I do not need the dump
> loop building a message for 1 million entries only to drop 99% of them.
> That is crazy.

I completely agree.

Once you've composed the message, the whole point of filtering is lost.

^ permalink raw reply

* Re: [PATCH net] net: udp: fix handling of CHECKSUM_COMPLETE packets
From: Eric Dumazet @ 2018-10-11 19:20 UTC (permalink / raw)
  To: Sean Tranchetti, davem, netdev
In-Reply-To: <1539282601-19795-1-git-send-email-stranche@codeaurora.org>



On 10/11/2018 11:30 AM, Sean Tranchetti wrote:
> Current handling of CHECKSUM_COMPLETE packets by the UDP stack is
> incorrect for any packet that has an incorrect checksum value.
> 
> udp4/6_csum_init() will both make a call to
> __skb_checksum_validate_complete() to initialize/validate the csum
> field when receiving a CHECKSUM_COMPLETE packet. When this packet
> fails validation, skb->csum will be overwritten with the pseudoheader
> checksum so the packet can be fully validated by software, but the
> skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way
> the stack can later warn the user about their hardware spewing bad
> checksums. Unfortunately, leaving the SKB in this state can cause
> problems later on in the checksum calculation.
> 
> Since the the packet is still marked as CHECKSUM_COMPLETE,
> udp_csum_pull_header() will SUBTRACT the checksum of the UDP header
> from skb->csum instead of adding it, leaving us with a garbage value
> in that field. Once we try to copy the packet to userspace in the
> udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg()
> to checksum the packet data and add it in the garbage skb->csum value
> to perform our final validation check.
> 
> Since the value we're validating is not the proper checksum, it's possible
> that the folded value could come out to 0, causing us not to drop the
> packet. Instead, we believe that the packet was checksummed incorrectly
> by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt
> to warn the user with netdev_rx_csum_fault(skb->dev);
> 
> Unfortunately, since this is the UDP path, skb->dev has been overwritten
> by skb->dev_scratch and is no longer a valid pointer, so we end up
> reading invalid memory.
> 
> This patch addresses this problem in two ways:
> 	1) Remove the invalid netdev_rx_csum_fault(skb->dev) call from
> 	   skb_copy_and_csum_datagram_msg(). Since this is used by UDP
> 	   where skb->dev is invalid, trying to warn doesn't make sense.
> 
> 	2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init().
> 	   If we receive a packet that's CHECKSUM_COMPLETE that fails
> 	   verification (i.e. skb->csum_valid == 0), check who performed
> 	   the calculation. It's possible that the checksum was done in
> 	   software by the network stack earlier (such as Netfilter's
> 	   CONNTRACK module), and if that says the checksum is bad,
> 	   we can drop the packet immediately instead of waiting until
> 	   we try and copy it to userspace. Otherwise, we need to
> 	   mark the SKB as CHECKSUM_NONE, since the skb->csum field
> 	   no longer contains the full packet checksum after the
> 	   call to __skb_checksum_validate_complete().
> 
> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>

It would be nice if you could provide a Fixes: tag, so that we know what kind
of backport work is needed.

You could also CC the patch author, it is always a nice thing to do.

If really we could emit a message using skb->dev after skb has been queued
and RCU section gone, this always has been buggy, and maybe we should not
even try to use skb->dev when warning is sent.

Alternative would be to use dev index and perform a lookup to find the device,
if device still exists.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next] nfp: replace long license headers with SPDX
From: David Miller @ 2018-10-11 19:16 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: netdev, oss-drivers, simon.horman, edwin.peer, nick.viljoen
In-Reply-To: <20181011155742.19633-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Thu, 11 Oct 2018 08:57:42 -0700

> Replace the repeated license text with SDPX identifiers.
> While at it bump the Copyright dates for files we touched
> this year.
> 
> Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com>
> Reviewed-by: Simon Horman <simon.horman@netronome.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: phy: sfp: remove sfp_mutex's definition
From: David Miller @ 2018-10-11 19:10 UTC (permalink / raw)
  To: bigeasy; +Cc: netdev, tglx, andrew, f.fainelli
In-Reply-To: <20181011150621.8466-1-bigeasy@linutronix.de>

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 11 Oct 2018 17:06:21 +0200

> The sfp_mutex variable is defined but never used in this file. Not even
> in the commit that introduced that variable.
> 
> Remove sfp_mutex, it has no purpose.
> 
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Applied.

^ permalink raw reply

* [PATCH net-next] net: fddi: skfp: Remove unused macros 'PNMI_GET_ID' and 'PNMI_SET_ID'
From: YueHaibing @ 2018-10-12  2:37 UTC (permalink / raw)
  To: davem, natechancellor; +Cc: linux-kernel, netdev, YueHaibing

The two PNMI macros are never used

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/fddi/skfp/h/cmtdef.h | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/net/fddi/skfp/h/cmtdef.h b/drivers/net/fddi/skfp/h/cmtdef.h
index a12f464..448d66c 100644
--- a/drivers/net/fddi/skfp/h/cmtdef.h
+++ b/drivers/net/fddi/skfp/h/cmtdef.h
@@ -655,14 +655,6 @@ void dump_hex(char *p, int len);
 #ifndef PNMI_INIT
 #define	PNMI_INIT(smc)	/* Nothing */
 #endif
-#ifndef PNMI_GET_ID
-#define PNMI_GET_ID( smc, ndis_oid, buf, len, BytesWritten, BytesNeeded ) \
-		( 1 ? (-1) : (-1) )
-#endif
-#ifndef PNMI_SET_ID
-#define PNMI_SET_ID( smc, ndis_oid, buf, len, BytesRead, BytesNeeded, \
-		set_type) ( 1 ? (-1) : (-1) )
-#endif
 
 /*
  * SMT_PANIC defines
-- 
2.7.0

^ permalink raw reply related

* Re: [PATCH 1/1] net: socionext: clear rx irq correctly
From: David Miller @ 2018-10-11 19:04 UTC (permalink / raw)
  To: ilias.apalodimas
  Cc: netdev, jaswinder.singh, ard.biesheuvel, masami.hiramatsu
In-Reply-To: <1539260906-29596-1-git-send-email-ilias.apalodimas@linaro.org>

From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: Thu, 11 Oct 2018 15:28:26 +0300

> commit 63ae7949e94a ("net: socionext: Use descriptor info instead of MMIO reads on Rx")
> removed constant mmio reads from the driver and started using a descriptor 
> field to check if packet should be processed.
> This lead the napi rx handler being constantly called while no packets
> needed processing and ksoftirq getting 100% cpu usage. Issue one mmio read 
> to clear the irq correcty after processing packets
> 
> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Applied.

^ permalink raw reply

* Re: [RFC PATCH 2/2] net/ncsi: Configure multi-package, multi-channel modes with failover
From: Samuel Mendoza-Jonas @ 2018-10-12  2:24 UTC (permalink / raw)
  To: Justin.Lee1, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <e20a7a3da4ca4aec913800355cd6cdaf@AUSX13MPS302.AMER.DELL.COM>

On Thu, 2018-10-11 at 22:09 +0000, Justin.Lee1@Dell.com wrote:
> Hi Sam,
> 
> Please see my comments below.
> 
> Thanks,
> Justin
>  
>  
> > On Wed, 2018-10-10 at 22:36 +0000, Justin.Lee1@Dell.com wrote:
> > > Hi Samuel,
> > > 
> > > I am still testing your change and have some comments below.
> > > 
> > > Thanks,
> > > Justin
> > > 
> > > 
> > > > This patch extends the ncsi-netlink interface with two new commands and
> > > > three new attributes to configure multiple packages and/or channels at
> > > > once, and configure specific failover modes.
> > > > 
> > > > NCSI_CMD_SET_PACKAGE mask and NCSI_CMD_SET_CHANNEL_MASK set a whitelist
> > > > of packages or channels allowed to be configured with the
> > > > NCSI_ATTR_PACKAGE_MASK and NCSI_ATTR_CHANNEL_MASK attributes
> > > > respectively. If one of these whitelists is set only packages or
> > > > channels matching the whitelist are considered for the channel queue in
> > > > ncsi_choose_active_channel().
> > > > 
> > > > These commands may also use the NCSI_ATTR_MULTI_FLAG to signal that
> > > > multiple packages or channels may be configured simultaneously. NCSI
> > > > hardware arbitration (HWA) must be available in order to enable
> > > > multi-package mode. Multi-channel mode is always available.
> > > > 
> > > > If the NCSI_ATTR_CHANNEL_ID attribute is present in the
> > > > NCSI_CMD_SET_CHANNEL_MASK command the it sets the preferred channel as
> > > > with the NCSI_CMD_SET_INTERFACE command. The combination of preferred
> > > > channel and channel whitelist defines a primary channel and the allowed
> > > > failover channels.
> > > > If the NCSI_ATTR_MULTI_FLAG attribute is also present then the preferred
> > > > channel is configured for Tx/Rx and the other channels are enabled only
> > > > for Rx.
> > > > 
> > > > Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
> > > > ---
> > > >  include/uapi/linux/ncsi.h |  16 +++
> > > >  net/ncsi/internal.h       |  11 +-
> > > >  net/ncsi/ncsi-aen.c       |   2 +-
> > > >  net/ncsi/ncsi-manage.c    | 138 ++++++++++++++++--------
> > > >  net/ncsi/ncsi-netlink.c   | 217 +++++++++++++++++++++++++++++++++-----
> > > >  net/ncsi/ncsi-rsp.c       |   2 +-
> > > >  6 files changed, 312 insertions(+), 74 deletions(-)
> > > > 
> > > > diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
> > > > index 4c292ecbb748..035fba1693f9 100644
> > > > --- a/include/uapi/linux/ncsi.h
> > > > +++ b/include/uapi/linux/ncsi.h
> > > > @@ -23,6 +23,13 @@
> > > >   *	optionally the preferred NCSI_ATTR_CHANNEL_ID.
> > > >   * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
> > > >   *	Requires NCSI_ATTR_IFINDEX.
> > > > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > > > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > > > + *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > > > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > > > + *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > > > + *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > > > + *	the primary channel.
> > > >   * @NCSI_CMD_MAX: highest command number
> > > >   */
> > > 
> > > There are some typo in the description.
> > > * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > >  *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > >  * @NCSI_CMD_SET_CHANNEL_MASK: set a whitelist of allowed channels.
> > >  *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > >  *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > >  *	the primary channel.
> > 
> > Haha, yes I threw that in at the end, thanks for catching it.
> > 
> > > >  enum ncsi_nl_commands {
> > > > @@ -30,6 +37,8 @@ enum ncsi_nl_commands {
> > > >  	NCSI_CMD_PKG_INFO,
> > > >  	NCSI_CMD_SET_INTERFACE,
> > > >  	NCSI_CMD_CLEAR_INTERFACE,
> > > > +	NCSI_CMD_SET_PACKAGE_MASK,
> > > > +	NCSI_CMD_SET_CHANNEL_MASK,
> > > >  
> > > >  	__NCSI_CMD_AFTER_LAST,
> > > >  	NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
> > > > @@ -43,6 +52,10 @@ enum ncsi_nl_commands {
> > > >   * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
> > > >   * @NCSI_ATTR_PACKAGE_ID: package ID
> > > >   * @NCSI_ATTR_CHANNEL_ID: channel ID
> > > > + * @NCSI_ATTR_MULTI_FLAG: flag to signal that multi-mode should be enabled with
> > > > + *	NCSI_CMD_SET_PACKAGE_MASK or NCSI_CMD_SET_CHANNEL_MASK.
> > > > + * @NCSI_ATTR_PACKAGE_MASK: 32-bit mask of allowed packages.
> > > > + * @NCSI_ATTR_CHANNEL_MASK: 32-bit mask of allowed channels.
> > > >   * @NCSI_ATTR_MAX: highest attribute number
> > > >   */
> > > >  enum ncsi_nl_attrs {
> > > > @@ -51,6 +64,9 @@ enum ncsi_nl_attrs {
> > > >  	NCSI_ATTR_PACKAGE_LIST,
> > > >  	NCSI_ATTR_PACKAGE_ID,
> > > >  	NCSI_ATTR_CHANNEL_ID,
> > > > +	NCSI_ATTR_MULTI_FLAG,
> > > > +	NCSI_ATTR_PACKAGE_MASK,
> > > > +	NCSI_ATTR_CHANNEL_MASK,
> > > 
> > > Is there a case that we might set these two masks at the same time?
> > > If not, maybe we can just have one generic MASK attribute.
> > > 
> > 
> > I thought of this too: not yet, but I wonder if we might in the future.
> > I'll have a think about it.
> > 
> > > >  
> > > >  	__NCSI_ATTR_AFTER_LAST,
> > > >  	NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
> > > > diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
> > > > index 3d0a33b874f5..8437474d0a78 100644
> > > > --- a/net/ncsi/internal.h
> > > > +++ b/net/ncsi/internal.h
> > > > @@ -213,6 +213,10 @@ struct ncsi_package {
> > > >  	unsigned int         channel_num; /* Number of channels     */
> > > >  	struct list_head     channels;    /* List of chanels        */
> > > >  	struct list_head     node;        /* Form list of packages  */
> > > > +
> > > > +	bool                 multi_channel; /* Enable multiple channels  */
> > > > +	u32                  channel_whitelist; /* Channels to configure */
> > > > +	struct ncsi_channel  *preferred_channel; /* Primary channel      */
> > > >  };
> > > >  
> > > >  struct ncsi_request {
> > > > @@ -280,8 +284,6 @@ struct ncsi_dev_priv {
> > > >  	unsigned int        package_num;     /* Number of packages         */
> > > >  	struct list_head    packages;        /* List of packages           */
> > > >  	struct ncsi_channel *hot_channel;    /* Channel was ever active    */
> > > > -	struct ncsi_package *force_package;  /* Force a specific package   */
> > > > -	struct ncsi_channel *force_channel;  /* Force a specific channel   */
> > > >  	struct ncsi_request requests[256];   /* Request table              */
> > > >  	unsigned int        request_id;      /* Last used request ID       */
> > > >  #define NCSI_REQ_START_IDX	1
> > > > @@ -294,6 +296,9 @@ struct ncsi_dev_priv {
> > > >  	struct list_head    node;            /* Form NCSI device list      */
> > > >  #define NCSI_MAX_VLAN_VIDS	15
> > > >  	struct list_head    vlan_vids;       /* List of active VLAN IDs */
> > > > +
> > > > +	bool                multi_package;   /* Enable multiple packages   */
> > > > +	u32                 package_whitelist; /* Packages to configure    */
> > > >  };
> > > >  
> > > >  struct ncsi_cmd_arg {
> > > > @@ -345,6 +350,8 @@ struct ncsi_request *ncsi_alloc_request(struct ncsi_dev_priv *ndp,
> > > >  void ncsi_free_request(struct ncsi_request *nr);
> > > >  struct ncsi_dev *ncsi_find_dev(struct net_device *dev);
> > > >  int ncsi_process_next_channel(struct ncsi_dev_priv *ndp);
> > > > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > > > +			  struct ncsi_channel *channel);
> > > >  
> > > >  /* Packet handlers */
> > > >  u32 ncsi_calculate_checksum(unsigned char *data, int len);
> > > > diff --git a/net/ncsi/ncsi-aen.c b/net/ncsi/ncsi-aen.c
> > > > index 65f47a648be3..eac56aee30c4 100644
> > > > --- a/net/ncsi/ncsi-aen.c
> > > > +++ b/net/ncsi/ncsi-aen.c
> > > > @@ -86,7 +86,7 @@ static int ncsi_aen_handler_lsc(struct ncsi_dev_priv *ndp,
> > > >  	    !(state == NCSI_CHANNEL_ACTIVE && !(data & 0x1)))
> > > >  		return 0;
> > > >  
> > > > -	if (state == NCSI_CHANNEL_ACTIVE)
> > > > +	if (state == NCSI_CHANNEL_ACTIVE && ncsi_channel_is_last(ndp, nc))
> > > >  		ndp->flags |= NCSI_DEV_RESHUFFLE;
> > > >  
> > > >  	ncsi_stop_channel_monitor(nc);
> > > > diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> > > > index 665bee25ec44..6a55df700bcb 100644
> > > > --- a/net/ncsi/ncsi-manage.c
> > > > +++ b/net/ncsi/ncsi-manage.c
> > > > @@ -27,6 +27,24 @@
> > > >  LIST_HEAD(ncsi_dev_list);
> > > >  DEFINE_SPINLOCK(ncsi_dev_lock);
> > > >  
> > > > +/* Returns true if the given channel is the last channel available */
> > > > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > > > +			  struct ncsi_channel *channel)
> > > > +{
> > > > +	struct ncsi_package *np;
> > > > +	struct ncsi_channel *nc;
> > > > +
> > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > > > +		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > > > +			if (nc == channel)
> > > > +				continue;
> > > > +			if (nc->state == NCSI_CHANNEL_ACTIVE)
> > > > +				return false;
> > > > +		}
> > > > +
> > > > +	return true;
> > > > +}
> > > > +
> > > >  static void ncsi_report_link(struct ncsi_dev_priv *ndp, bool force_down)
> > > >  {
> > > >  	struct ncsi_dev *nd = &ndp->ndev;
> > > > @@ -266,6 +284,7 @@ struct ncsi_package *ncsi_add_package(struct ncsi_dev_priv *ndp,
> > > >  	np->ndp = ndp;
> > > >  	spin_lock_init(&np->lock);
> > > >  	INIT_LIST_HEAD(&np->channels);
> > > > +	np->channel_whitelist = UINT_MAX;
> > > >  
> > > >  	spin_lock_irqsave(&ndp->lock, flags);
> > > >  	tmp = ncsi_find_package(ndp, id);
> > > > @@ -633,6 +652,34 @@ static int set_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc,
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +/* Determine if a given channel should be the Tx channel */
> > > > +bool ncsi_channel_is_tx(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc)
> > > > +{
> > > > +	struct ncsi_package *np = nc->package;
> > > > +	struct ncsi_channel *channel;
> > > > +	struct ncsi_channel_mode *ncm;
> 
> Learn from Dave. All local variable declarations from longest to shortest
> line. :)
> 
> > > > +
> > > > +	NCSI_FOR_EACH_CHANNEL(np, channel) {
> > > > +		ncm = &channel->modes[NCSI_MODE_TX_ENABLE];
> > > > +		/* Another channel is already Tx */
> > > > +		if (ncm->enable)
> > > > +			return false;
> > > > +	}
> > > > +
> > > > +	if (!np->preferred_channel)
> > > > +		return true;
> > > > +
> > > > +	if (np->preferred_channel == nc)
> > > > +		return true;
> > > > +
> > > > +	/* The preferred channel is not in the queue and not active */
> > > > +	if (list_empty(&np->preferred_channel->link) &&
> > > > +	    np->preferred_channel->state != NCSI_CHANNEL_ACTIVE)
> > > > +		return true;
> > > > +
> > > > +	return false;
> > > > +}
> > > > +
> > > >  static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > > >  {
> > > >  	struct ncsi_dev *nd = &ndp->ndev;
> > > > @@ -745,18 +792,22 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > > >  		} else if (nd->state == ncsi_dev_state_config_ebf) {
> > > >  			nca.type = NCSI_PKT_CMD_EBF;
> > > >  			nca.dwords[0] = nc->caps[NCSI_CAP_BC].cap;
> > > > -			nd->state = ncsi_dev_state_config_ecnt;
> > > > +			if (ncsi_channel_is_tx(ndp, nc))
> > > > +				nd->state = ncsi_dev_state_config_ecnt;
> > > > +			else
> > > > +				nd->state = ncsi_dev_state_config_ec;
> > > >  #if IS_ENABLED(CONFIG_IPV6)
> > > >  			if (ndp->inet6_addr_num > 0 &&
> > > >  			    (nc->caps[NCSI_CAP_GENERIC].cap &
> > > >  			     NCSI_CAP_GENERIC_MC))
> > > >  				nd->state = ncsi_dev_state_config_egmf;
> > > > -			else
> > > > -				nd->state = ncsi_dev_state_config_ecnt;
> > > >  		} else if (nd->state == ncsi_dev_state_config_egmf) {
> > > >  			nca.type = NCSI_PKT_CMD_EGMF;
> > > >  			nca.dwords[0] = nc->caps[NCSI_CAP_MC].cap;
> > > > -			nd->state = ncsi_dev_state_config_ecnt;
> > > > +			if (ncsi_channel_is_tx(ndp, nc))
> > > > +				nd->state = ncsi_dev_state_config_ecnt;
> > > > +			else
> > > > +				nd->state = ncsi_dev_state_config_ec;
> > > >  #endif /* CONFIG_IPV6 */
> > > >  		} else if (nd->state == ncsi_dev_state_config_ecnt) {
> > > >  			nca.type = NCSI_PKT_CMD_ECNT;
> > > > @@ -840,43 +891,35 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > > >  
> > > >  static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > > >  {
> > > > -	struct ncsi_package *np, *force_package;
> > > > -	struct ncsi_channel *nc, *found, *hot_nc, *force_channel;
> > > > +	struct ncsi_package *np;
> > > > +	struct ncsi_channel *nc, *found, *hot_nc;
> > > >  	struct ncsi_channel_mode *ncm;
> > > > -	unsigned long flags;
> > > > +	unsigned long flags, cflags;
> > > > +	bool with_link;
> 
> All local variable declarations from longest to shortest
> line.
> 
> > > >  
> > > >  	spin_lock_irqsave(&ndp->lock, flags);
> > > >  	hot_nc = ndp->hot_channel;
> > > > -	force_channel = ndp->force_channel;
> > > > -	force_package = ndp->force_package;
> > > >  	spin_unlock_irqrestore(&ndp->lock, flags);
> > > >  
> > > > -	/* Force a specific channel whether or not it has link if we have been
> > > > -	 * configured to do so
> > > > -	 */
> > > > -	if (force_package && force_channel) {
> > > > -		found = force_channel;
> > > > -		ncm = &found->modes[NCSI_MODE_LINK];
> > > > -		if (!(ncm->data[2] & 0x1))
> > > > -			netdev_info(ndp->ndev.dev,
> > > > -				    "NCSI: Channel %u forced, but it is link down\n",
> > > > -				    found->id);
> > > > -		goto out;
> > > > -	}
> > > > -
> > > > -	/* The search is done once an inactive channel with up
> > > > -	 * link is found.
> > > > +	/* By default the search is done once an inactive channel with up
> > > > +	 * link is found, unless a preferred channel is set.
> > > > +	 * If multi_package or multi_channel are configured all channels in the
> > > > +	 * whitelist with link are added to the channel queue.
> > > >  	 */
> > > >  	found = NULL;
> > > > +	with_link = false;
> > > >  	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > > > -		if (ndp->force_package && np != ndp->force_package)
> > > > +		if (!(ndp->package_whitelist & (0x1 << np->id)))
> > > >  			continue;
> > > >  		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > > > -			spin_lock_irqsave(&nc->lock, flags);
> > > > +			if (!(np->channel_whitelist & (0x1 << nc->id)))
> > > > +				continue;
> > > > +
> > > > +			spin_lock_irqsave(&nc->lock, cflags);
> > > >  
> > > >  			if (!list_empty(&nc->link) ||
> > > >  			    nc->state != NCSI_CHANNEL_INACTIVE) {
> > > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > > > +				spin_unlock_irqrestore(&nc->lock, cflags);
> > > >  				continue;
> > > >  			}
> > > >  
> > > > @@ -888,32 +931,42 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > > >  
> > > >  			ncm = &nc->modes[NCSI_MODE_LINK];
> > > >  			if (ncm->data[2] & 0x1) {
> > > 
> > > This data will not be updated if the channel monitor for it is not running.
> > > If I move the cable from the current configured channel to the other channel,
> > > NC-SI module will not detect the link status as the other channel is not configured
> > > and AEN will not happen.
> > > Is it per design that NC-SI module will always use the first interface with the link?
> > 
> > We'll check the link state of every channel at init, and per-package on
> > suspend events, but you're right that we only get AENs right now from
> > configured channels. There's probably no reason why we could at least
> > enable AENs on all channels even if we don't use them for network, maybe
> > during the package probe step.
> > 
> 
> To receive the AEN packet, the channel (RX) needs to be enabled.
> It seems that we need to send Get Link Status command to all channels before choosing
> The active channel.

We mostly already do a GLS before this; either we've just started the
NCSI device in which case we sent a GLS to every package as part of
probing, or some other event has occured and we've gone through
ncsi_suspend_channel() which will do ncsi_dev_state_suspend_gls.
However there are some gaps here, for example the
ncsi_stop_dev()/ncsi_start_dev() path won't cause GLS commands to be sent
so we'll have stale information. Continued below..

> 
> > > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > > >  				found = nc;
> > > > -				goto out;
> > > > +				with_link = true;
> > > > +
> > > > +				spin_lock_irqsave(&ndp->lock, flags);
> > > > +				list_add_tail_rcu(&found->link,
> > > > +						  &ndp->channel_queue);
> > > > +				spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +
> > > > +				netdev_dbg(ndp->ndev.dev,
> > > > +					   "NCSI: Channel %u added to queue (link %s)\n",
> > > > +					   found->id,
> > > > +					   ncm->data[2] & 0x1 ? "up" : "down");
> > > >  			}
> 
> If multi_channel is enabled, the interface without link still needs to be processed.
> Something likes below?
> 			} else if (np->multi_channel) {
> 				found = nc;
> 
> 				spin_lock_irqsave(&ndp->lock, flags);
> 				list_add_tail_rcu(&found->link,
> 						  &ndp->channel_queue);
> 				spin_unlock_irqrestore(&ndp->lock, flags);
> 
> 				netdev_dbg(ndp->ndev.dev,
> 					   "NCSI: pkg %u ch %u added to queue (link %s)\n",
> 					   found->package->id,
> 					   found->id,
> 					   ncm->data[2] & 0x1 ? "up" : "down");
> 			}

Yes we should just configure every channel in the whitelist if
multi_channel is set - this gives us AENs for that channel and we'll
recognise when it goes link up.

It would be good to have a better way of "bouncing" the NCSI
configuration in these cases though, especially to include a re-probe of
channel states. I'll include something like that for this series.

> 
> > > > +			spin_unlock_irqrestore(&nc->lock, cflags);
> > > >  
> > > > -			spin_unlock_irqrestore(&nc->lock, flags);
> > > > +			if (with_link && !np->multi_channel)
> > > > +				break;
> > > >  		}
> > > > +		if (with_link && !ndp->multi_package)
> > > > +			break;
> > > >  	}
> > > >  
> > > > -	if (!found) {
> > > > +	if (!with_link && found) {
> > > > +		netdev_info(ndp->ndev.dev,
> > > > +			    "NCSI: No channel with link found, configuring channel %u\n",
> > > > +			    found->id);
> > > > +		spin_lock_irqsave(&ndp->lock, flags);
> > > > +		list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > > > +		spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +	} else if (!found) {
> > > >  		netdev_warn(ndp->ndev.dev,
> > > > -			    "NCSI: No channel found with link\n");
> > > > +			    "NCSI: No channel found to configure!\n");
> > > >  		ncsi_report_link(ndp, true);
> > > >  		return -ENODEV;
> > > >  	}
> > > >  
> > > > -	ncm = &found->modes[NCSI_MODE_LINK];
> > > > -	netdev_dbg(ndp->ndev.dev,
> > > > -		   "NCSI: Channel %u added to queue (link %s)\n",
> > > > -		   found->id, ncm->data[2] & 0x1 ? "up" : "down");
> > > > -
> > > > -out:
> > > > -	spin_lock_irqsave(&ndp->lock, flags);
> > > > -	list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > > > -	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > -
> > > >  	return ncsi_process_next_channel(ndp);
> > > >  }
> > > >  
> > > > @@ -1428,6 +1481,7 @@ struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
> > > >  	INIT_LIST_HEAD(&ndp->channel_queue);
> > > >  	INIT_LIST_HEAD(&ndp->vlan_vids);
> > > >  	INIT_WORK(&ndp->work, ncsi_dev_work);
> > > > +	ndp->package_whitelist = UINT_MAX;
> > > >  
> > > >  	/* Initialize private NCSI device */
> > > >  	spin_lock_init(&ndp->lock);
> > > > diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
> > > > index 32cb7751d216..33a091e6f466 100644
> > > > --- a/net/ncsi/ncsi-netlink.c
> > > > +++ b/net/ncsi/ncsi-netlink.c
> > > 
> > > Is the following missed in the patch?
> > > static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
> > > ...
> > > 	[NCSI_ATTR_MULTI_FLAG] =	{ .type = NLA_FLAG },
> > > 	[NCSI_ATTR_PACKAGE_MASK] =	{ .type = NLA_U32 },
> > > 	[NCSI_ATTR_CHANNEL_MASK] =	{ .type = NLA_U32 },
> > 
> > Oops, added.
> > 
> > > > @@ -67,7 +67,7 @@ static int ncsi_write_channel_info(struct sk_buff *skb,
> > > >  	nla_put_u32(skb, NCSI_CHANNEL_ATTR_LINK_STATE, m->data[2]);
> > > >  	if (nc->state == NCSI_CHANNEL_ACTIVE)
> > > >  		nla_put_flag(skb, NCSI_CHANNEL_ATTR_ACTIVE);
> > > > -	if (ndp->force_channel == nc)
> > > > +	if (nc == nc->package->preferred_channel)
> > > >  		nla_put_flag(skb, NCSI_CHANNEL_ATTR_FORCED);
> > > >  
> > > >  	nla_put_u32(skb, NCSI_CHANNEL_ATTR_VERSION_MAJOR, nc->version.version);
> > > > @@ -112,7 +112,7 @@ static int ncsi_write_package_info(struct sk_buff *skb,
> > > >  		if (!pnest)
> > > >  			return -ENOMEM;
> > > >  		nla_put_u32(skb, NCSI_PKG_ATTR_ID, np->id);
> > > > -		if (ndp->force_package == np)
> > > > +		if ((0x1 << np->id) == ndp->package_whitelist)
> > > >  			nla_put_flag(skb, NCSI_PKG_ATTR_FORCED);
> > > >  		cnest = nla_nest_start(skb, NCSI_PKG_ATTR_CHANNEL_LIST);
> > > >  		if (!cnest) {
> > > > @@ -288,45 +288,54 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> > > >  	package = NULL;
> > > >  
> > > > -	spin_lock_irqsave(&ndp->lock, flags);
> > > > -
> > > >  	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > > >  		if (np->id == package_id)
> > > >  			package = np;
> > > >  	if (!package) {
> > > >  		/* The user has set a package that does not exist */
> > > > -		spin_unlock_irqrestore(&ndp->lock, flags);
> > > >  		return -ERANGE;
> > > >  	}
> > > >  
> > > >  	channel = NULL;
> > > > -	if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > > > -		/* Allow any channel */
> > > > -		channel_id = NCSI_RESERVED_CHANNEL;
> > > > -	} else {
> > > > +	if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > > >  		channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> > > >  		NCSI_FOR_EACH_CHANNEL(package, nc)
> > > > -			if (nc->id == channel_id)
> > > > +			if (nc->id == channel_id) {
> > > >  				channel = nc;
> > > > +				break;
> > > > +			}
> > > > +		if (!channel) {
> > > > +			netdev_info(ndp->ndev.dev,
> > > > +				    "NCSI: Channel %u does not exist!\n",
> > > > +				    channel_id);
> > > > +			return -ERANGE;
> > > > +		}
> > > >  	}
> > > >  
> > > > -	if (channel_id != NCSI_RESERVED_CHANNEL && !channel) {
> > > > -		/* The user has set a channel that does not exist on this
> > > > -		 * package
> > > > -		 */
> > > > -		spin_unlock_irqrestore(&ndp->lock, flags);
> > > > -		netdev_info(ndp->ndev.dev, "NCSI: Channel %u does not exist!\n",
> > > > -			    channel_id);
> > > > -		return -ERANGE;
> > > > -	}
> > > > -
> > > > -	ndp->force_package = package;
> > > > -	ndp->force_channel = channel;
> > > > +	spin_lock_irqsave(&ndp->lock, flags);
> > > > +	ndp->package_whitelist = 0x1 << package->id;
> > > > +	ndp->multi_package = false;
> > > >  	spin_unlock_irqrestore(&ndp->lock, flags);
> > > >  
> > > > -	netdev_info(ndp->ndev.dev, "Set package 0x%x, channel 0x%x%s as preferred\n",
> > > > -		    package_id, channel_id,
> > > > -		    channel_id == NCSI_RESERVED_CHANNEL ? " (any)" : "");
> > > > +	spin_lock_irqsave(&package->lock, flags);
> > > > +	package->multi_channel = false;
> > > > +	if (channel) {
> > > > +		package->channel_whitelist = 0x1 << channel->id;
> > > > +		package->preferred_channel = channel;
> > > > +	} else {
> > > > +		/* Allow any channel */
> > > > +		package->channel_whitelist = UINT_MAX;
> > > > +		package->preferred_channel = NULL;
> > > > +	}
> > > > +	spin_unlock_irqrestore(&package->lock, flags);
> > > > +
> > > > +	if (channel)
> > > > +		netdev_info(ndp->ndev.dev,
> > > > +			    "Set package 0x%x, channel 0x%x as preferred\n",
> > > > +			    package_id, channel_id);
> > > > +	else
> > > > +		netdev_info(ndp->ndev.dev, "Set package 0x%x as preferred\n",
> > > > +			    package_id);
> > > >  
> > > >  	/* Bounce the NCSI channel to set changes */
> > > >  	ncsi_stop_dev(&ndp->ndev);
> > > > @@ -338,6 +347,7 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  {
> > > >  	struct ncsi_dev_priv *ndp;
> > > > +	struct ncsi_package *np;
> > > >  	unsigned long flags;
> > > >  
> > > >  	if (!info || !info->attrs)
> > > > @@ -351,11 +361,19 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  	if (!ndp)
> > > >  		return -ENODEV;
> > > >  
> > > > -	/* Clear any override */
> > > > +	/* Reset any whitelists and disable multi mode */
> > > >  	spin_lock_irqsave(&ndp->lock, flags);
> > > > -	ndp->force_package = NULL;
> > > > -	ndp->force_channel = NULL;
> > > > +	ndp->package_whitelist = UINT_MAX;
> > > > +	ndp->multi_package = false;
> > > >  	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +
> > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > > > +		spin_lock_irqsave(&np->lock, flags);
> > > > +		np->multi_channel = false;
> > > > +		np->channel_whitelist = UINT_MAX;
> > > > +		np->preferred_channel = NULL;
> > > > +		spin_unlock_irqrestore(&np->lock, flags);
> > > > +	}
> > > >  	netdev_info(ndp->ndev.dev, "NCSI: Cleared preferred package/channel\n");
> > > >  
> > > >  	/* Bounce the NCSI channel to set changes */
> > > > @@ -365,6 +383,137 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +static int ncsi_set_package_mask_nl(struct sk_buff *msg,
> > > > +				    struct genl_info *info)
> > > > +{
> > > > +	struct ncsi_dev_priv *ndp;
> > > > +	unsigned long flags;
> > > > +	int rc;
> > > > +
> > > > +	if (!info || !info->attrs)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_IFINDEX])
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_PACKAGE_MASK])
> > > > +		return -EINVAL;
> > > > +
> > > > +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > > > +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > > > +	if (!ndp)
> > > > +		return -ENODEV;
> > > > +
> > > > +	spin_lock_irqsave(&ndp->lock, flags);
> > > > +	ndp->package_whitelist =
> > > > +		nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_MASK]);
> > > > +
> > > > +	if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > > > +		if (ndp->flags & NCSI_DEV_HWA) {
> > > > +			ndp->multi_package = true;
> > > > +			rc = 0;
> > > > +		} else {
> > > > +			netdev_err(ndp->ndev.dev,
> > > > +				   "NCSI: Can't use multiple packages without HWA\n");
> > > > +			rc = -EPERM;
> > > > +		}
> > > > +	} else {
> > > > +		rc = 0;
> > > > +	}
> 
> If the flag is not set, do we need to clear the multi_package flag? It is possible that it is
> user's intension to disable the multi-mode.
> 	} else {
> 		ndp->multi_package = false;
> 		rc = 0;
> 	}

Yep, good catch (although technically multi_package could not have been
set true in the past if HWA is not available).

> 
> > > > +
> > > > +	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +
> > > > +	if (!rc) {
> > > > +		/* Bounce the NCSI channel to set changes */
> > > > +		ncsi_stop_dev(&ndp->ndev);
> > > > +		ncsi_start_dev(&ndp->ndev);
> > > 
> > > Is it possible to delay the restart? If we have two packages, we might send
> > > set_package_mask command once and set_channel_mask command twice.
> > > We will see the unnecessary reconfigurations in a very short period time.
> > 
> > We could probably do with a better way of bouncing the link here too. I
> > suppose we could schedule the actual link 'bounce' to occur a second or
> > two later, and delay if we receive new configuration in that time.
> > Perhaps something outside of the scope of this patch though.
> > 
> > > > +	}
> > > > +
> > > > +	return rc;
> > > > +}
> > > > +
> > > > +static int ncsi_set_channel_mask_nl(struct sk_buff *msg,
> > > > +				    struct genl_info *info)
> > > > +{
> > > > +	struct ncsi_package *np, *package;
> > > > +	struct ncsi_channel *nc, *channel;
> > > > +	struct ncsi_dev_priv *ndp;
> > > > +	unsigned long flags;
> > > > +	u32 package_id, channel_id;
> 
> All local variable declarations from longest to shortest
> line.
> 
> > > > +
> > > > +	if (!info || !info->attrs)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_IFINDEX])
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_PACKAGE_ID])
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_CHANNEL_MASK])
> > > > +		return -EINVAL;
> > > > +
> > > > +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > > > +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > > > +	if (!ndp)
> > > > +		return -ENODEV;
> > > > +
> > > > +	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> > > > +	package = NULL;
> > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > > > +		if (np->id == package_id) {
> > > > +			package = np;
> > > > +			break;
> > > > +		}
> > > > +	if (!package)
> > > > +		return -ERANGE;
> > > > +
> > > > +	spin_lock_irqsave(&package->lock, flags);
> > > > +
> > > > +	channel = NULL;
> > > > +	if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > > > +		channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> > > > +		NCSI_FOR_EACH_CHANNEL(np, nc)
> > > > +			if (nc->id == channel_id) {
> > > > +				channel = nc;
> > > > +				break;
> > > > +			}
> > > > +		if (!channel) {
> > > > +			spin_unlock_irqrestore(&package->lock, flags);
> > > > +			return -ERANGE;
> > > > +		}
> > > > +		netdev_dbg(ndp->ndev.dev,
> > > > +			   "NCSI: Channel %u set as preferred channel\n",
> > > > +			   channel->id);
> > > > +	}
> > > > +
> > > > +	package->channel_whitelist =
> > > > +		nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_MASK]);
> > > > +	if (package->channel_whitelist == 0)
> > > > +		netdev_dbg(ndp->ndev.dev,
> > > > +			   "NCSI: Package %u set to all channels disabled\n",
> > > > +			   package->id);
> > > > +
> > > > +	package->preferred_channel = channel;
> > > > +
> > > > +	if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > > > +		package->multi_channel = true;
> > > > +		netdev_info(ndp->ndev.dev,
> > > > +			    "NCSI: Multi-channel enabled on package %u\n",
> > > > +			    package_id);
> > > > +	} else {
> > > > +		package->multi_channel = false;
> > > > +	}
> > > > +
> > > > +	spin_unlock_irqrestore(&package->lock, flags);
> > > > +
> > > > +	/* Bounce the NCSI channel to set changes */
> > > > +	ncsi_stop_dev(&ndp->ndev);
> > > > +	ncsi_start_dev(&ndp->ndev);
> > > 
> > > Same question as set_package_mask function.
> > > Is it possible to delay the restart? If we have two packages, we might send
> > > set_package_mask command once and set_channel_mask command twice.
> > > We will see the unnecessary reconfigurations in a very short period time.
> > > 
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  static const struct genl_ops ncsi_ops[] = {
> > > >  	{
> > > >  		.cmd = NCSI_CMD_PKG_INFO,
> > > > @@ -385,6 +534,18 @@ static const struct genl_ops ncsi_ops[] = {
> > > >  		.doit = ncsi_clear_interface_nl,
> > > >  		.flags = GENL_ADMIN_PERM,
> > > >  	},
> > > > +	{
> > > > +		.cmd = NCSI_CMD_SET_PACKAGE_MASK,
> > > > +		.policy = ncsi_genl_policy,
> > > > +		.doit = ncsi_set_package_mask_nl,
> > > > +		.flags = GENL_ADMIN_PERM,
> > > > +	},
> > > > +	{
> > > > +		.cmd = NCSI_CMD_SET_CHANNEL_MASK,
> > > > +		.policy = ncsi_genl_policy,
> > > > +		.doit = ncsi_set_channel_mask_nl,
> > > > +		.flags = GENL_ADMIN_PERM,
> > > > +	},
> > > >  };
> > > >  
> > > >  static struct genl_family ncsi_genl_family __ro_after_init = {
> > > > diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> > > > index d66b34749027..02ce7626b579 100644
> > > > --- a/net/ncsi/ncsi-rsp.c
> > > > +++ b/net/ncsi/ncsi-rsp.c
> > > > @@ -241,7 +241,7 @@ static int ncsi_rsp_handler_dcnt(struct ncsi_request *nr)
> > > >  	if (!ncm->enable)
> > > >  		return 0;
> > > >  
> > > > -	ncm->enable = 1;
> > > > +	ncm->enable = 0;
> > > >  	return 0;
> > > >  }
> > > >  
> > > > -- 
> > > > 2.19.0

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox