Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH V4 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
From: Jason Wang @ 2019-08-07 14:02 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: mst, kvm, virtualization, netdev, linux-kernel, linux-mm
In-Reply-To: <20190807120738.GB1557@ziepe.ca>


On 2019/8/7 下午8:07, Jason Gunthorpe wrote:
> On Wed, Aug 07, 2019 at 03:06:15AM -0400, Jason Wang wrote:
>> We used to use RCU to synchronize MMU notifier with worker. This leads
>> calling synchronize_rcu() in invalidate_range_start(). But on a busy
>> system, there would be many factors that may slow down the
>> synchronize_rcu() which makes it unsuitable to be called in MMU
>> notifier.
>>
>> So this patch switches use seqlock counter to track whether or not the
>> map was used. The counter was increased when vq try to start or finish
>> uses the map. This means, when it was even, we're sure there's no
>> readers and MMU notifier is synchronized. When it was odd, it means
>> there's a reader we need to wait it to be even again then we are
>> synchronized. Consider the read critical section is pretty small the
>> synchronization should be done very fast.
>>
>> Reported-by: Michael S. Tsirkin <mst@redhat.com>
>> Fixes: 7f466032dc9e ("vhost: access vq metadata through kernel virtual address")
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>   drivers/vhost/vhost.c | 141 ++++++++++++++++++++++++++----------------
>>   drivers/vhost/vhost.h |   7 ++-
>>   2 files changed, 90 insertions(+), 58 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index cfc11f9ed9c9..57bfbb60d960 100644
>> +++ b/drivers/vhost/vhost.c
>> @@ -324,17 +324,16 @@ static void vhost_uninit_vq_maps(struct vhost_virtqueue *vq)
>>   
>>   	spin_lock(&vq->mmu_lock);
>>   	for (i = 0; i < VHOST_NUM_ADDRS; i++) {
>> -		map[i] = rcu_dereference_protected(vq->maps[i],
>> -				  lockdep_is_held(&vq->mmu_lock));
>> +		map[i] = vq->maps[i];
>>   		if (map[i]) {
>>   			vhost_set_map_dirty(vq, map[i], i);
>> -			rcu_assign_pointer(vq->maps[i], NULL);
>> +			vq->maps[i] = NULL;
>>   		}
>>   	}
>>   	spin_unlock(&vq->mmu_lock);
>>   
>> -	/* No need for synchronize_rcu() or kfree_rcu() since we are
>> -	 * serialized with memory accessors (e.g vq mutex held).
>> +	/* No need for synchronization since we are serialized with
>> +	 * memory accessors (e.g vq mutex held).
>>   	 */
>>   
>>   	for (i = 0; i < VHOST_NUM_ADDRS; i++)
>> @@ -362,6 +361,40 @@ static bool vhost_map_range_overlap(struct vhost_uaddr *uaddr,
>>   	return !(end < uaddr->uaddr || start > uaddr->uaddr - 1 + uaddr->size);
>>   }
>>   
>> +static void inline vhost_vq_access_map_begin(struct vhost_virtqueue *vq)
>> +{
>> +	write_seqcount_begin(&vq->seq);
>> +}
>> +
>> +static void inline vhost_vq_access_map_end(struct vhost_virtqueue *vq)
>> +{
>> +	write_seqcount_end(&vq->seq);
>> +}
> The write side of a seqlock only provides write barriers. Access to
>
> 	map = vq->maps[VHOST_ADDR_USED];
>
> Still needs a read side barrier, and then I think this will be no
> better than a normal spinlock.
>
> It also doesn't seem like this algorithm even needs a seqlock, as this
> is just a one bit flag


Right, so then I tend to use spinlock first for correctness.


>
> atomic_set_bit(using map)
> smp_mb__after_atomic()
> .. maps [...]
> atomic_clear_bit(using map)
>
>
> map = NULL;
> smp_mb__before_atomic();
> while (atomic_read_bit(using map))
>     relax()
>
> Again, not clear this could be faster than a spinlock when the
> barriers are correct...


Yes, for next release we may want to use the idea from Michael like to 
mitigate the impact of mb.

https://lwn.net/Articles/775871/

Thanks


>
> Jason

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Andy Lutomirski @ 2019-08-07 13:52 UTC (permalink / raw)
  To: Lorenz Bauer
  Cc: Andy Lutomirski, Alexei Starovoitov, Song Liu, Kees Cook,
	Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team,
	Jann Horn, Greg KH, Linux API, LSM List
In-Reply-To: <CACAyw9_fVZFW_x4uyTAiRfeH6oq1KHv0uB2wO84u5JZyD+Unaw@mail.gmail.com>


> On Aug 7, 2019, at 2:03 AM, Lorenz Bauer <lmb@cloudflare.com> wrote:
> 
>> On Wed, 7 Aug 2019 at 06:24, Andy Lutomirski <luto@kernel.org> wrote:
>> a) Those that, by design, control privileged operations.  This
>> includes most attach calls, but it also includes allow_ptr_leaks,
>> bpf_probe_read(), and quite a few other things.  It also includes all
>> of the by_id calls, I think, unless some clever modification to the
>> way they worked would isolate different users' objects.  I think that
>> persistent objects can do pretty much everything that by_id users
>> would need, so this isn't a big deal.
> 
> Slightly OT, since this is an implementation question: GET_MAP_FD_BY_ID
> is useful to iterate a nested map. This isn't covered by rights to
> persistent objects,
> so it would need some thought.
> 
> 

A call to get an fd to a map referenced by a map to which you already have an fd seems reasonable to me. The new fd would inherit the old fd’s access mode.

^ permalink raw reply

* Re: [PATCH] net: fec: Allow the driver to be built for ARM64 SoCs such as i.MX8
From: Schrempf Frieder @ 2019-08-07 13:25 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: David S. Miller, Ioana Radulescu, Claudiu Manoil, Thomas Gleixner,
	Yangbo Lu, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <CAOMZO5C61NjX5=7FJj7WpQW=cSvBRi4ADKonUp3CRXtUkSqwCQ@mail.gmail.com>

On 07.08.19 14:20, Fabio Estevam wrote:
> Hi Frieder,
> 
> On Wed, Aug 7, 2019 at 9:04 AM Schrempf Frieder
> <frieder.schrempf@kontron.de> wrote:
>>
>> From: Frieder Schrempf <frieder.schrempf@kontron.de>
>>
>> The FEC ethernet controller is used in some ARM64 SoCs such as i.MX8.
>> To make use of it, append ARM64 to the list of dependencies.
> 
> ARCH_MXC is also used by i.MX8, so there is no need for such change.

You're right of course. I somehow managed to mess up my defconfig. I 
started over with a clean config from arch/arm64/configs/defconfig and 
everything seems fine now. Sorry for the noise.

> 
> By the way: arch/arm64/configs/defconfig has CONFIG_FEC=y by default.
> 

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: sja1105: remove set but not used variables 'tx_vid' and 'rx_vid'
From: Vladimir Oltean @ 2019-08-07 13:20 UTC (permalink / raw)
  To: YueHaibing
  Cc: David S. Miller, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	lkml, netdev
In-Reply-To: <CA+h21hrrWGrw4kiTfjowWvQ-B6sNPLAcgTaaadA02ZAmYw1SjQ@mail.gmail.com>

On Wed, 7 Aug 2019 at 16:19, Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Wed, 7 Aug 2019 at 16:09, YueHaibing <yuehaibing@huawei.com> wrote:
> >
> > Fixes gcc '-Wunused-but-set-variable' warning:
> >
> > drivers/net/dsa/sja1105/sja1105_main.c: In function sja1105_fdb_dump:
> > drivers/net/dsa/sja1105/sja1105_main.c:1226:14: warning:
> >  variable tx_vid set but not used [-Wunused-but-set-variable]
> > drivers/net/dsa/sja1105/sja1105_main.c:1226:6: warning:
> >  variable rx_vid set but not used [-Wunused-but-set-variable]
> >
> > They are not used since commit 6d7c7d948a2e ("net: dsa:
> > sja1105: Fix broken learning with vlan_filtering disabled")
> >
> > Reported-by: Hulk Robot <hulkci@huawei.com>
> > Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> > ---
>
> Reviewed-by: Vladimir Oltean <olteanv@gmail.com>

This patch should also go to the "net" tree.

>
> >  drivers/net/dsa/sja1105/sja1105_main.c | 4 ----
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
> > index d073baf..df976b25 100644
> > --- a/drivers/net/dsa/sja1105/sja1105_main.c
> > +++ b/drivers/net/dsa/sja1105/sja1105_main.c
> > @@ -1223,12 +1223,8 @@ static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
> >  {
> >         struct sja1105_private *priv = ds->priv;
> >         struct device *dev = ds->dev;
> > -       u16 rx_vid, tx_vid;
> >         int i;
> >
> > -       rx_vid = dsa_8021q_rx_vid(ds, port);
> > -       tx_vid = dsa_8021q_tx_vid(ds, port);
> > -
> >         for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
> >                 struct sja1105_l2_lookup_entry l2_lookup = {0};
> >                 u8 macaddr[ETH_ALEN];
> > --
> > 2.7.4
> >
> >

^ permalink raw reply

* Re: [PATCH 06/16] net: phy: adin: support PHY mode converters
From: Andrew Lunn @ 2019-08-07 13:20 UTC (permalink / raw)
  To: Ardelean, Alexandru
  Cc: davem@davemloft.net, hkallweit1@gmail.com, mark.rutland@arm.com,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, f.fainelli@gmail.com, robh+dt@kernel.org
In-Reply-To: <7747cb845a9122004b9565f444b4719170f74b35.camel@analog.com>

> Is it ok if we defer the solution for this drivers/patchset?

Yes, not a problem if phy-mode means phy-mode.

     Andrew

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: sja1105: remove set but not used variables 'tx_vid' and 'rx_vid'
From: Vladimir Oltean @ 2019-08-07 13:19 UTC (permalink / raw)
  To: YueHaibing
  Cc: David S. Miller, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	lkml, netdev
In-Reply-To: <20190807130856.60792-1-yuehaibing@huawei.com>

On Wed, 7 Aug 2019 at 16:09, YueHaibing <yuehaibing@huawei.com> wrote:
>
> Fixes gcc '-Wunused-but-set-variable' warning:
>
> drivers/net/dsa/sja1105/sja1105_main.c: In function sja1105_fdb_dump:
> drivers/net/dsa/sja1105/sja1105_main.c:1226:14: warning:
>  variable tx_vid set but not used [-Wunused-but-set-variable]
> drivers/net/dsa/sja1105/sja1105_main.c:1226:6: warning:
>  variable rx_vid set but not used [-Wunused-but-set-variable]
>
> They are not used since commit 6d7c7d948a2e ("net: dsa:
> sja1105: Fix broken learning with vlan_filtering disabled")
>
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>

>  drivers/net/dsa/sja1105/sja1105_main.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
> index d073baf..df976b25 100644
> --- a/drivers/net/dsa/sja1105/sja1105_main.c
> +++ b/drivers/net/dsa/sja1105/sja1105_main.c
> @@ -1223,12 +1223,8 @@ static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
>  {
>         struct sja1105_private *priv = ds->priv;
>         struct device *dev = ds->dev;
> -       u16 rx_vid, tx_vid;
>         int i;
>
> -       rx_vid = dsa_8021q_rx_vid(ds, port);
> -       tx_vid = dsa_8021q_tx_vid(ds, port);
> -
>         for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
>                 struct sja1105_l2_lookup_entry l2_lookup = {0};
>                 u8 macaddr[ETH_ALEN];
> --
> 2.7.4
>
>

^ permalink raw reply

* [PATCH net] net: phy: rtl8211f: do a double read to get real time link status
From: Yonglong Liu @ 2019-08-07 13:16 UTC (permalink / raw)
  To: davem, andrew, hkallweit1
  Cc: netdev, linux-kernel, linuxarm, salil.mehta, yisen.zhuang,
	shiju.jose

[   27.232781] hns3 0000:bd:00.3 eth7: net open
[   27.237303] 8021q: adding VLAN 0 to HW filter on device eth7
[   27.242972] IPv6: ADDRCONF(NETDEV_CHANGE): eth7: link becomes ready
[   27.244449] hns3 0000:bd:00.3: invalid speed (-1)
[   27.253904] hns3 0000:bd:00.3 eth7: failed to adjust link.
[   27.259379] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change UP -> RUNNING
[   27.924903] hns3 0000:bd:00.3 eth7: link up
[   28.280479] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change RUNNING -> NOLINK
[   29.208452] hns3 0000:bd:00.3 eth7: link down
[   32.376745] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change NOLINK -> RUNNING
[   33.208448] hns3 0000:bd:00.3 eth7: link up
[   35.253821] hns3 0000:bd:00.3 eth7: net stop
[   35.258270] hns3 0000:bd:00.3 eth7: link down

When using rtl8211f in polling mode, may get a invalid speed,
because of reading a fake link up and autoneg complete status
immediately after starting autoneg:

        ifconfig-1176  [007] ....    27.232763: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x00 val:0x1040
  kworker/u257:1-670   [015] ....    27.232805: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x04 val:0x01e1
  kworker/u257:1-670   [015] ....    27.232815: mdio_access: mii-0000:bd:00.3 write phy:0x07 reg:0x04 val:0x05e1
  kworker/u257:1-670   [015] ....    27.232869: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x79ad
  kworker/u257:1-670   [015] ....    27.232904: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x09 val:0x0200
  kworker/u257:1-670   [015] ....    27.232940: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x00 val:0x1040
  kworker/u257:1-670   [015] ....    27.232949: mdio_access: mii-0000:bd:00.3 write phy:0x07 reg:0x00 val:0x1240
  kworker/u257:1-670   [015] ....    27.233003: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x79ad
  kworker/u257:1-670   [015] ....    27.233039: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x0a val:0x3002
  kworker/u257:1-670   [015] ....    27.233074: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x09 val:0x0200
  kworker/u257:1-670   [015] ....    27.233110: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x05 val:0x0000
  kworker/u257:1-670   [000] ....    28.280475: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x7989
  kworker/u257:1-670   [000] ....    29.304471: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x7989

According to the datasheet of rtl8211f, to get the real time
link status, need to read MII_BMSR twice.

This patch add a read_status hook for rtl8211f, and do a fake
phy_read before genphy_read_status(), so that can get real link
status in genphy_read_status().

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
---
 drivers/net/phy/realtek.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index a669945..92e27d5 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -256,6 +256,18 @@ static int rtl8366rb_config_init(struct phy_device *phydev)
 	return ret;
 }
 
+static int rtl8211f_read_status(struct phy_device *phydev)
+{
+	int status;
+
+	/* do a fake read */
+	status = phy_read(phydev, MII_BMSR);
+	if (status < 0)
+		return status;
+
+	return genphy_read_status(phydev);
+}
+
 static struct phy_driver realtek_drvs[] = {
 	{
 		PHY_ID_MATCH_EXACT(0x00008201),
@@ -325,6 +337,7 @@ static struct phy_driver realtek_drvs[] = {
 		.resume		= genphy_resume,
 		.read_page	= rtl821x_read_page,
 		.write_page	= rtl821x_write_page,
+		.read_status	= rtl8211f_read_status,
 	}, {
 		PHY_ID_MATCH_EXACT(0x001cc800),
 		.name		= "Generic Realtek PHY",
-- 
2.8.1


^ permalink raw reply related

* [PATCH net-next] fq_codel: remove set but not used variables 'prev_ecn_mark' and 'prev_drop_count'
From: YueHaibing @ 2019-08-07 13:10 UTC (permalink / raw)
  To: davem, jhs, xiyou.wangcong, jiri, dave.taht
  Cc: linux-kernel, netdev, YueHaibing

Fixes gcc '-Wunused-but-set-variable' warning:

net/sched/sch_fq_codel.c: In function fq_codel_dequeue:
net/sched/sch_fq_codel.c:288:23: warning: variable prev_ecn_mark set but not used [-Wunused-but-set-variable]
net/sched/sch_fq_codel.c:288:6: warning: variable prev_drop_count set but not used [-Wunused-but-set-variable]

They are not used since commit 77ddaff218fc ("fq_codel: Kill
useless per-flow dropped statistic")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 net/sched/sch_fq_codel.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 9edd0f4..c261c0a 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -285,7 +285,6 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
 	struct sk_buff *skb;
 	struct fq_codel_flow *flow;
 	struct list_head *head;
-	u32 prev_drop_count, prev_ecn_mark;
 
 begin:
 	head = &q->new_flows;
@@ -302,9 +301,6 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
 		goto begin;
 	}
 
-	prev_drop_count = q->cstats.drop_count;
-	prev_ecn_mark = q->cstats.ecn_mark;
-
 	skb = codel_dequeue(sch, &sch->qstats.backlog, &q->cparams,
 			    &flow->cvars, &q->cstats, qdisc_pkt_len,
 			    codel_get_enqueue_time, drop_func, dequeue_func);
-- 
2.7.4



^ permalink raw reply related

* [PATCH net-next] net: dsa: sja1105: remove set but not used variables 'tx_vid' and 'rx_vid'
From: YueHaibing @ 2019-08-07 13:08 UTC (permalink / raw)
  To: davem, olteanv, andrew, vivien.didelot, f.fainelli
  Cc: linux-kernel, netdev, YueHaibing

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/dsa/sja1105/sja1105_main.c: In function sja1105_fdb_dump:
drivers/net/dsa/sja1105/sja1105_main.c:1226:14: warning:
 variable tx_vid set but not used [-Wunused-but-set-variable]
drivers/net/dsa/sja1105/sja1105_main.c:1226:6: warning:
 variable rx_vid set but not used [-Wunused-but-set-variable]

They are not used since commit 6d7c7d948a2e ("net: dsa:
sja1105: Fix broken learning with vlan_filtering disabled")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index d073baf..df976b25 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1223,12 +1223,8 @@ static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
 {
 	struct sja1105_private *priv = ds->priv;
 	struct device *dev = ds->dev;
-	u16 rx_vid, tx_vid;
 	int i;
 
-	rx_vid = dsa_8021q_rx_vid(ds, port);
-	tx_vid = dsa_8021q_tx_vid(ds, port);
-
 	for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
 		struct sja1105_l2_lookup_entry l2_lookup = {0};
 		u8 macaddr[ETH_ALEN];
-- 
2.7.4



^ permalink raw reply related

* Re: [PATCH net] netdevsim: Restore per-network namespace accounting for fib entries
From: Jiri Pirko @ 2019-08-07 13:07 UTC (permalink / raw)
  To: David Ahern; +Cc: Jakub Kicinski, David Ahern, davem, netdev
In-Reply-To: <dd568650-d5e7-62ee-4fde-db7b68b8962d@gmail.com>

Wed, Aug 07, 2019 at 02:39:56PM CEST, dsahern@gmail.com wrote:
>On 8/7/19 12:27 AM, Jiri Pirko wrote:
>> Wed, Aug 07, 2019 at 12:32:14AM CEST, jakub.kicinski@netronome.com wrote:
>>> On Tue,  6 Aug 2019 12:15:17 -0700, David Ahern wrote:
>>>> From: David Ahern <dsahern@gmail.com>
>>>>
>>>> Prior to the commit in the fixes tag, the resource controller in netdevsim
>>>> tracked fib entries and rules per network namespace. Restore that behavior.
>>>>
>>>> Fixes: 5fc494225c1e ("netdevsim: create devlink instance per netdevsim instance")
>>>> Signed-off-by: David Ahern <dsahern@gmail.com>
>>>
>>> Thanks.
>>>
>>> Let's see what Jiri says, but to me this patch seems to indeed restore
>>> the original per-namespace accounting when the more natural way forward
>>> may perhaps be to make nsim only count the fib entries where
>> 
>> I think that:
>> 1) netdevsim is a glorified dummy device for testing kernel api, not for
>>    configuring per-namespace resource limitation.
>> 2) If the conclusion os to use devlink instead of cgroups for resourse
>>    limitations, it should be done in a separate code, not in netdevsim.
>> 
>> I would definitelly want to wait what is the result of discussion around 2)
>> first. But one way or another netdevsim code should not do this, I would
>> like to cutout the fib limitations from it instead, just to expose the
>> setup resource limits through debugfs like the rest of the configuration
>> of netdevsim.
>> 
>> 
>
>This is the most incredulous response. You break code you don't
>understand and argue that it should be left broken in older releases and
>ripped out in later ones rather than fixed back to its intent.

Yeah. I believe it was a mistake to add it in the first place. Abuses
netdevsim for something it is not. I'm fine to use devlink the way you
want to after we conclude 2), but outside netdevsim.

Again, netdevsim is there for config api testing purposes. If things
got broken, it is not that bit deal. I broke the way it is
instantiated significantly for example (iplink->sysfs).


>
>Again, the devlink resource controller was added by me specifically to
>test the ability to fail in-kernel fib notifiers. When I add the nexthop
>in-kernel notifier, netdevsim again provides a simple means of testing it.
>
>It satisfies all of the conditions and intents of netdevsim - test
>kernel APIs without hardware.

Well, what you want to do is limit kernel sw resources per-namespace.

^ permalink raw reply

* Re: [PATCH net-next 5/5] r8152: change rx_frag_head_sz and rx_max_agg_num dynamically
From: Maciej Fijalkowski @ 2019-08-07 12:43 UTC (permalink / raw)
  To: Hayes Wang
  Cc: Jakub Kicinski, netdev@vger.kernel.org, nic_swsd,
	linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org
In-Reply-To: <0835B3720019904CB8F7AA43166CEEB2F18D06C5@RTITMBSVM03.realtek.com.tw>

On Wed, 7 Aug 2019 07:12:32 +0000
Hayes Wang <hayeswang@realtek.com> wrote:

> Jakub Kicinski [mailto:jakub.kicinski@netronome.com]
> > Sent: Wednesday, August 07, 2019 6:10 AM  
> [...]
> > Please don't expose those via sysfs. Ethtool's copybreak and descriptor
> > count should be applicable here, I think.  
> 
> Excuse me.
> I find struct ethtool_tunable for ETHTOOL_RX_COPYBREAK.
> How about the descriptor count?

Look how drivers implement ethtool's set_ringparam ops.

Thanks,
Maciej

> 
> 
> Best Regards,
> Hayes
> 
> 


^ permalink raw reply

* Re: [PATCH net] netdevsim: Restore per-network namespace accounting for fib entries
From: David Ahern @ 2019-08-07 12:39 UTC (permalink / raw)
  To: Jiri Pirko, Jakub Kicinski; +Cc: David Ahern, davem, netdev
In-Reply-To: <20190807062712.GE2332@nanopsycho.orion>

On 8/7/19 12:27 AM, Jiri Pirko wrote:
> Wed, Aug 07, 2019 at 12:32:14AM CEST, jakub.kicinski@netronome.com wrote:
>> On Tue,  6 Aug 2019 12:15:17 -0700, David Ahern wrote:
>>> From: David Ahern <dsahern@gmail.com>
>>>
>>> Prior to the commit in the fixes tag, the resource controller in netdevsim
>>> tracked fib entries and rules per network namespace. Restore that behavior.
>>>
>>> Fixes: 5fc494225c1e ("netdevsim: create devlink instance per netdevsim instance")
>>> Signed-off-by: David Ahern <dsahern@gmail.com>
>>
>> Thanks.
>>
>> Let's see what Jiri says, but to me this patch seems to indeed restore
>> the original per-namespace accounting when the more natural way forward
>> may perhaps be to make nsim only count the fib entries where
> 
> I think that:
> 1) netdevsim is a glorified dummy device for testing kernel api, not for
>    configuring per-namespace resource limitation.
> 2) If the conclusion os to use devlink instead of cgroups for resourse
>    limitations, it should be done in a separate code, not in netdevsim.
> 
> I would definitelly want to wait what is the result of discussion around 2)
> first. But one way or another netdevsim code should not do this, I would
> like to cutout the fib limitations from it instead, just to expose the
> setup resource limits through debugfs like the rest of the configuration
> of netdevsim.
> 
> 

This is the most incredulous response. You break code you don't
understand and argue that it should be left broken in older releases and
ripped out in later ones rather than fixed back to its intent.

Again, the devlink resource controller was added by me specifically to
test the ability to fail in-kernel fib notifiers. When I add the nexthop
in-kernel notifier, netdevsim again provides a simple means of testing it.

It satisfies all of the conditions and intents of netdevsim - test
kernel APIs without hardware.

^ permalink raw reply

* [bpf-next PATCH 3/3] samples/bpf: xdp_fwd explain bpf_fib_lookup return codes
From: Jesper Dangaard Brouer @ 2019-08-07 12:36 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov
  Cc: xdp-newbies, a.s.protopopov, dsahern, Jesper Dangaard Brouer
In-Reply-To: <156518133219.5636.728822418668658886.stgit@firesoul>

Make it clear that this XDP program depend on the network
stack to do the ARP resolution.  This is connected with the
BPF_FIB_LKUP_RET_NO_NEIGH return code from bpf_fib_lookup().

Another common mistake (seen via XDP-tutorial) is that users
don't realize that sysctl net.ipv{4,6}.conf.all.forwarding
setting is honored by bpf_fib_lookup.

Reported-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 samples/bpf/xdp_fwd_kern.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/xdp_fwd_kern.c b/samples/bpf/xdp_fwd_kern.c
index 4a5ad381ed2a..df11163454e7 100644
--- a/samples/bpf/xdp_fwd_kern.c
+++ b/samples/bpf/xdp_fwd_kern.c
@@ -103,8 +103,23 @@ static __always_inline int xdp_fwd_flags(struct xdp_md *ctx, u32 flags)
 	fib_params.ifindex = ctx->ingress_ifindex;
 
 	rc = bpf_fib_lookup(ctx, &fib_params, sizeof(fib_params), flags);
-
-	if (rc == 0) {
+	/*
+	 * Some rc (return codes) from bpf_fib_lookup() are important,
+	 * to understand how this XDP-prog interacts with network stack.
+	 *
+	 * BPF_FIB_LKUP_RET_NO_NEIGH:
+	 *  Even if route lookup was a success, then the MAC-addresses are also
+	 *  needed.  This is obtained from arp/neighbour table, but if table is
+	 *  (still) empty then BPF_FIB_LKUP_RET_NO_NEIGH is returned.  To avoid
+	 *  doing ARP lookup directly from XDP, then send packet to normal
+	 *  network stack via XDP_PASS and expect it will do ARP resolution.
+	 *
+	 * BPF_FIB_LKUP_RET_FWD_DISABLED:
+	 *  The bpf_fib_lookup respect sysctl net.ipv{4,6}.conf.all.forwarding
+	 *  setting, and will return BPF_FIB_LKUP_RET_FWD_DISABLED if not
+	 *  enabled this on ingress device.
+	 */
+	if (rc == BPF_FIB_LKUP_RET_SUCCESS) {
 		int *val;
 
 		/* Verify egress index has been configured as TX-port.


^ permalink raw reply related

* [bpf-next PATCH 2/3] samples/bpf: make xdp_fwd more practically usable via devmap lookup
From: Jesper Dangaard Brouer @ 2019-08-07 12:36 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov
  Cc: xdp-newbies, a.s.protopopov, dsahern, Jesper Dangaard Brouer
In-Reply-To: <156518133219.5636.728822418668658886.stgit@firesoul>

This address the TODO in samples/bpf/xdp_fwd_kern.c, which points out
that the chosen egress index should be checked for existence in the
devmap. This can now be done via taking advantage of Toke's work in
commit 0cdbb4b09a06 ("devmap: Allow map lookups from eBPF").

This change makes xdp_fwd more practically usable, as this allows for
a mixed environment, where IP-forwarding fallback to network stack, if
the egress device isn't configured to use XDP.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 samples/bpf/xdp_fwd_kern.c |   20 ++++++++++++++------
 samples/bpf/xdp_fwd_user.c |   36 +++++++++++++++++++++++++-----------
 2 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/samples/bpf/xdp_fwd_kern.c b/samples/bpf/xdp_fwd_kern.c
index e6ffc4ea06f4..4a5ad381ed2a 100644
--- a/samples/bpf/xdp_fwd_kern.c
+++ b/samples/bpf/xdp_fwd_kern.c
@@ -104,13 +104,21 @@ static __always_inline int xdp_fwd_flags(struct xdp_md *ctx, u32 flags)
 
 	rc = bpf_fib_lookup(ctx, &fib_params, sizeof(fib_params), flags);
 
-	/* verify egress index has xdp support
-	 * TO-DO bpf_map_lookup_elem(&tx_port, &key) fails with
-	 *       cannot pass map_type 14 into func bpf_map_lookup_elem#1:
-	 * NOTE: without verification that egress index supports XDP
-	 *       forwarding packets are dropped.
-	 */
 	if (rc == 0) {
+		int *val;
+
+		/* Verify egress index has been configured as TX-port.
+		 * (Note: User can still have inserted an egress ifindex that
+		 * doesn't support XDP xmit, which will result in packet drops).
+		 *
+		 * Note: lookup in devmap supported since 0cdbb4b09a0.
+		 * If not supported will fail with:
+		 *  cannot pass map_type 14 into func bpf_map_lookup_elem#1:
+		 */
+		val = bpf_map_lookup_elem(&tx_port, &fib_params.ifindex);
+		if (!val)
+			return XDP_PASS;
+
 		if (h_proto == htons(ETH_P_IP))
 			ip_decrease_ttl(iph);
 		else if (h_proto == htons(ETH_P_IPV6))
diff --git a/samples/bpf/xdp_fwd_user.c b/samples/bpf/xdp_fwd_user.c
index ba012d9f93dd..20951bc27477 100644
--- a/samples/bpf/xdp_fwd_user.c
+++ b/samples/bpf/xdp_fwd_user.c
@@ -27,14 +27,20 @@
 #include "libbpf.h"
 #include <bpf/bpf.h>
 
-
-static int do_attach(int idx, int fd, const char *name)
+static int do_attach(int idx, int prog_fd, int map_fd, const char *name)
 {
 	int err;
 
-	err = bpf_set_link_xdp_fd(idx, fd, 0);
-	if (err < 0)
+	err = bpf_set_link_xdp_fd(idx, prog_fd, 0);
+	if (err < 0) {
 		printf("ERROR: failed to attach program to %s\n", name);
+		return err;
+	}
+
+	/* Adding ifindex as a possible egress TX port */
+	err = bpf_map_update_elem(map_fd, &idx, &idx, 0);
+	if (err)
+		printf("ERROR: failed using device %s as TX-port\n", name);
 
 	return err;
 }
@@ -47,6 +53,9 @@ static int do_detach(int idx, const char *name)
 	if (err < 0)
 		printf("ERROR: failed to detach program from %s\n", name);
 
+	/* TODO: Remember to cleanup map, when adding use of shared map
+	 *  bpf_map_delete_elem((map_fd, &idx);
+	 */
 	return err;
 }
 
@@ -67,10 +76,10 @@ int main(int argc, char **argv)
 	};
 	const char *prog_name = "xdp_fwd";
 	struct bpf_program *prog;
+	int prog_fd, map_fd = -1;
 	char filename[PATH_MAX];
 	struct bpf_object *obj;
 	int opt, i, idx, err;
-	int prog_fd, map_fd;
 	int attach = 1;
 	int ret = 0;
 
@@ -103,8 +112,17 @@ int main(int argc, char **argv)
 			return 1;
 		}
 
-		if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
+		err = bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd);
+		if (err) {
+			if (err == -22) {
+				printf("Does kernel support devmap lookup?\n");
+				/* If not, the error message will be:
+				 * "cannot pass map_type 14 into func
+				 * bpf_map_lookup_elem#1"
+				 */
+			}
 			return 1;
+		}
 
 		prog = bpf_object__find_program_by_title(obj, prog_name);
 		prog_fd = bpf_program__fd(prog);
@@ -119,10 +137,6 @@ int main(int argc, char **argv)
 			return 1;
 		}
 	}
-	if (attach) {
-		for (i = 1; i < 64; ++i)
-			bpf_map_update_elem(map_fd, &i, &i, 0);
-	}
 
 	for (i = optind; i < argc; ++i) {
 		idx = if_nametoindex(argv[i]);
@@ -138,7 +152,7 @@ int main(int argc, char **argv)
 			if (err)
 				ret = err;
 		} else {
-			err = do_attach(idx, prog_fd, argv[i]);
+			err = do_attach(idx, prog_fd, map_fd, argv[i]);
 			if (err)
 				ret = err;
 		}


^ permalink raw reply related

* [bpf-next PATCH 1/3] samples/bpf: xdp_fwd rename devmap name to be xdp_tx_ports
From: Jesper Dangaard Brouer @ 2019-08-07 12:36 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov
  Cc: xdp-newbies, a.s.protopopov, dsahern, Jesper Dangaard Brouer
In-Reply-To: <156518133219.5636.728822418668658886.stgit@firesoul>

The devmap name 'tx_port' came from a copy-paste from xdp_redirect_map
which only have a single TX port. Change name to xdp_tx_ports
to make it more descriptive.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 samples/bpf/xdp_fwd_kern.c |    5 +++--
 samples/bpf/xdp_fwd_user.c |    2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/samples/bpf/xdp_fwd_kern.c b/samples/bpf/xdp_fwd_kern.c
index a7e94e7ff87d..e6ffc4ea06f4 100644
--- a/samples/bpf/xdp_fwd_kern.c
+++ b/samples/bpf/xdp_fwd_kern.c
@@ -23,7 +23,8 @@
 
 #define IPV6_FLOWINFO_MASK              cpu_to_be32(0x0FFFFFFF)
 
-struct bpf_map_def SEC("maps") tx_port = {
+/* For TX-traffic redirect requires net_device ifindex to be in this devmap */
+struct bpf_map_def SEC("maps") xdp_tx_ports = {
 	.type = BPF_MAP_TYPE_DEVMAP,
 	.key_size = sizeof(int),
 	.value_size = sizeof(int),
@@ -117,7 +118,7 @@ static __always_inline int xdp_fwd_flags(struct xdp_md *ctx, u32 flags)
 
 		memcpy(eth->h_dest, fib_params.dmac, ETH_ALEN);
 		memcpy(eth->h_source, fib_params.smac, ETH_ALEN);
-		return bpf_redirect_map(&tx_port, fib_params.ifindex, 0);
+		return bpf_redirect_map(&xdp_tx_ports, fib_params.ifindex, 0);
 	}
 
 	return XDP_PASS;
diff --git a/samples/bpf/xdp_fwd_user.c b/samples/bpf/xdp_fwd_user.c
index 5b46ee12c696..ba012d9f93dd 100644
--- a/samples/bpf/xdp_fwd_user.c
+++ b/samples/bpf/xdp_fwd_user.c
@@ -113,7 +113,7 @@ int main(int argc, char **argv)
 			return 1;
 		}
 		map_fd = bpf_map__fd(bpf_object__find_map_by_name(obj,
-								  "tx_port"));
+							"xdp_tx_ports"));
 		if (map_fd < 0) {
 			printf("map not found: %s\n", strerror(map_fd));
 			return 1;


^ permalink raw reply related

* [bpf-next PATCH 0/3] bpf: improvements to xdp_fwd sample
From: Jesper Dangaard Brouer @ 2019-08-07 12:36 UTC (permalink / raw)
  To: netdev, Daniel Borkmann, Alexei Starovoitov
  Cc: xdp-newbies, a.s.protopopov, dsahern, Jesper Dangaard Brouer

This patchset is focused on improvements for XDP forwarding sample
named xdp_fwd, which leverage the existing FIB routing tables as
described in LPC2018[1] talk by David Ahern.

The primary motivation is to illustrate how Toke's recent work
improves usability of XDP_REDIRECT via lookups in devmap. The other
patches are to help users understand the sample.

I have more improvements to xdp_fwd, but those might requires changes
to libbpf.  Thus, sending these patches first as they are isolated.

[1] http://vger.kernel.org/lpc-networking2018.html#session-1

---

Jesper Dangaard Brouer (3):
      samples/bpf: xdp_fwd rename devmap name to be xdp_tx_ports
      samples/bpf: make xdp_fwd more practically usable via devmap lookup
      samples/bpf: xdp_fwd explain bpf_fib_lookup return codes


 samples/bpf/xdp_fwd_kern.c |   42 +++++++++++++++++++++++++++++++++---------
 samples/bpf/xdp_fwd_user.c |   38 ++++++++++++++++++++++++++------------
 2 files changed, 59 insertions(+), 21 deletions(-)

--

^ permalink raw reply

* Re: [PATCH] net: fec: Allow the driver to be built for ARM64 SoCs such as i.MX8
From: Fabio Estevam @ 2019-08-07 12:20 UTC (permalink / raw)
  To: Schrempf Frieder
  Cc: David S. Miller, Ioana Radulescu, Claudiu Manoil, Thomas Gleixner,
	Yangbo Lu, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20190807114332.13312-2-frieder.schrempf@kontron.de>

Hi Frieder,

On Wed, Aug 7, 2019 at 9:04 AM Schrempf Frieder
<frieder.schrempf@kontron.de> wrote:
>
> From: Frieder Schrempf <frieder.schrempf@kontron.de>
>
> The FEC ethernet controller is used in some ARM64 SoCs such as i.MX8.
> To make use of it, append ARM64 to the list of dependencies.

ARCH_MXC is also used by i.MX8, so there is no need for such change.

By the way: arch/arm64/configs/defconfig has CONFIG_FEC=y by default.

^ permalink raw reply

* [PATCH net-next] net: openvswitch: Set OvS recirc_id from tc chain index
From: Paul Blakey @ 2019-08-07 12:08 UTC (permalink / raw)
  To: netdev, David S. Miller, Justin Pettit, Pravin B Shelar,
	Simon Horman, Marcelo Ricardo Leitner, Vlad Buslov, Paul Blakey
  Cc: Jiri Pirko, Roi Dayan, Yossi Kuperman, Rony Efraim, Oz Shlomo

Offloaded OvS datapath rules are translated one to one to tc rules,
for example the following simplified OvS rule:

recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)

Will be translated to the following tc rule:

$ tc filter add dev dev1 ingress \
	    prio 1 chain 0 proto ip \
		flower tcp ct_state -trk \
		action ct pipe \
		action goto chain 2

Received packets will first travel though tc, and if they aren't stolen
by it, like in the above rule, they will continue to OvS datapath.
Since we already did some actions (action ct in this case) which might
modify the packets, and updated action stats, we would like to continue
the proccessing with the correct recirc_id in OvS (here recirc_id(2))
where we left off.

To support this, introduce a new skb extension for tc, which
will be used for translating tc chain to ovs recirc_id to
handle these miss cases. Last tc chain index will be set
by tc goto chain action and read by OvS datapath.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 include/linux/skbuff.h    | 13 +++++++++++++
 include/net/sch_generic.h |  5 ++++-
 net/core/skbuff.c         |  6 ++++++
 net/openvswitch/flow.c    |  9 +++++++++
 net/sched/Kconfig         | 13 +++++++++++++
 net/sched/act_api.c       |  1 +
 net/sched/cls_api.c       | 12 ++++++++++++
 7 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3aef8d8..fb2a792 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -279,6 +279,16 @@ struct nf_bridge_info {
 };
 #endif
 
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+/* Chain in tc_skb_ext will be used to share the tc chain with
+ * ovs recirc_id. It will be set to the current chain by tc
+ * and read by ovs to recirc_id.
+ */
+struct tc_skb_ext {
+	__u32 chain;
+};
+#endif
+
 struct sk_buff_head {
 	/* These two members must be first. */
 	struct sk_buff	*next;
@@ -4050,6 +4060,9 @@ enum skb_ext_id {
 #ifdef CONFIG_XFRM
 	SKB_EXT_SEC_PATH,
 #endif
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	TC_SKB_EXT,
+#endif
 	SKB_EXT_NUM, /* must be last */
 };
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 6b6b012..871feea 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -275,7 +275,10 @@ struct tcf_result {
 			unsigned long	class;
 			u32		classid;
 		};
-		const struct tcf_proto *goto_tp;
+		struct {
+			const struct tcf_proto *goto_tp;
+			u32 goto_index;
+		};
 
 		/* used in the skb_tc_reinsert function */
 		struct {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ea8e8d3..2b40b5a 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4087,6 +4087,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 #ifdef CONFIG_XFRM
 	[SKB_EXT_SEC_PATH] = SKB_EXT_CHUNKSIZEOF(struct sec_path),
 #endif
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	[TC_SKB_EXT] = SKB_EXT_CHUNKSIZEOF(struct tc_skb_ext),
+#endif
 };
 
 static __always_inline unsigned int skb_ext_total_length(void)
@@ -4098,6 +4101,9 @@ static __always_inline unsigned int skb_ext_total_length(void)
 #ifdef CONFIG_XFRM
 		skb_ext_type_len[SKB_EXT_SEC_PATH] +
 #endif
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+		skb_ext_type_len[TC_SKB_EXT] +
+#endif
 		0;
 }
 
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index bc89e16..0287ead 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -816,6 +816,9 @@ static int key_extract_mac_proto(struct sk_buff *skb)
 int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 			 struct sk_buff *skb, struct sw_flow_key *key)
 {
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	struct tc_skb_ext *tc_ext;
+#endif
 	int res, err;
 
 	/* Extract metadata from packet. */
@@ -848,7 +851,13 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 	if (res < 0)
 		return res;
 	key->mac_proto = res;
+
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+	tc_ext = skb_ext_find(skb, TC_SKB_EXT);
+	key->recirc_id = tc_ext ? tc_ext->chain : 0;
+#else
 	key->recirc_id = 0;
+#endif
 
 	err = key_extract(skb, key);
 	if (!err)
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index afd2ba1..b3faafe 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -963,6 +963,19 @@ config NET_IFE_SKBTCINDEX
         tristate "Support to encoding decoding skb tcindex on IFE action"
         depends on NET_ACT_IFE
 
+config NET_TC_SKB_EXT
+	bool "TC recirculation support"
+	depends on NET_CLS_ACT
+	default y if NET_CLS_ACT
+	select SKB_EXTENSIONS
+
+	help
+	  Say Y here to allow tc chain misses to continue in OvS datapath in
+	  the correct recirc_id, and hardware chain misses to continue in
+	  the correct chain in tc software datapath.
+
+	  Say N here if you won't be using tc<->ovs offload or tc chains offload.
+
 endif # NET_SCHED
 
 config NET_SCH_FIFO
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 3397122..c393604 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -27,6 +27,7 @@ static void tcf_action_goto_chain_exec(const struct tc_action *a,
 {
 	const struct tcf_chain *chain = rcu_dereference_bh(a->goto_chain);
 
+	res->goto_index = chain->index;
 	res->goto_tp = rcu_dereference_bh(chain->filter_chain);
 }
 
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3565d9a..b0b829a 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1660,6 +1660,18 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 			goto reset;
 		} else if (unlikely(TC_ACT_EXT_CMP(err, TC_ACT_GOTO_CHAIN))) {
 			first_tp = res->goto_tp;
+
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+			{
+				struct tc_skb_ext *ext;
+
+				ext = skb_ext_add(skb, TC_SKB_EXT);
+				if (WARN_ON_ONCE(!ext))
+					return TC_ACT_SHOT;
+
+				ext->chain = res->goto_index;
+			}
+#endif
 			goto reset;
 		}
 #endif
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH V4 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
From: Jason Gunthorpe @ 2019-08-07 12:07 UTC (permalink / raw)
  To: Jason Wang; +Cc: mst, kvm, virtualization, netdev, linux-kernel, linux-mm
In-Reply-To: <20190807070617.23716-8-jasowang@redhat.com>

On Wed, Aug 07, 2019 at 03:06:15AM -0400, Jason Wang wrote:
> We used to use RCU to synchronize MMU notifier with worker. This leads
> calling synchronize_rcu() in invalidate_range_start(). But on a busy
> system, there would be many factors that may slow down the
> synchronize_rcu() which makes it unsuitable to be called in MMU
> notifier.
> 
> So this patch switches use seqlock counter to track whether or not the
> map was used. The counter was increased when vq try to start or finish
> uses the map. This means, when it was even, we're sure there's no
> readers and MMU notifier is synchronized. When it was odd, it means
> there's a reader we need to wait it to be even again then we are
> synchronized. Consider the read critical section is pretty small the
> synchronization should be done very fast.
> 
> Reported-by: Michael S. Tsirkin <mst@redhat.com>
> Fixes: 7f466032dc9e ("vhost: access vq metadata through kernel virtual address")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
>  drivers/vhost/vhost.c | 141 ++++++++++++++++++++++++++----------------
>  drivers/vhost/vhost.h |   7 ++-
>  2 files changed, 90 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index cfc11f9ed9c9..57bfbb60d960 100644
> +++ b/drivers/vhost/vhost.c
> @@ -324,17 +324,16 @@ static void vhost_uninit_vq_maps(struct vhost_virtqueue *vq)
>  
>  	spin_lock(&vq->mmu_lock);
>  	for (i = 0; i < VHOST_NUM_ADDRS; i++) {
> -		map[i] = rcu_dereference_protected(vq->maps[i],
> -				  lockdep_is_held(&vq->mmu_lock));
> +		map[i] = vq->maps[i];
>  		if (map[i]) {
>  			vhost_set_map_dirty(vq, map[i], i);
> -			rcu_assign_pointer(vq->maps[i], NULL);
> +			vq->maps[i] = NULL;
>  		}
>  	}
>  	spin_unlock(&vq->mmu_lock);
>  
> -	/* No need for synchronize_rcu() or kfree_rcu() since we are
> -	 * serialized with memory accessors (e.g vq mutex held).
> +	/* No need for synchronization since we are serialized with
> +	 * memory accessors (e.g vq mutex held).
>  	 */
>  
>  	for (i = 0; i < VHOST_NUM_ADDRS; i++)
> @@ -362,6 +361,40 @@ static bool vhost_map_range_overlap(struct vhost_uaddr *uaddr,
>  	return !(end < uaddr->uaddr || start > uaddr->uaddr - 1 + uaddr->size);
>  }
>  
> +static void inline vhost_vq_access_map_begin(struct vhost_virtqueue *vq)
> +{
> +	write_seqcount_begin(&vq->seq);
> +}
> +
> +static void inline vhost_vq_access_map_end(struct vhost_virtqueue *vq)
> +{
> +	write_seqcount_end(&vq->seq);
> +}

The write side of a seqlock only provides write barriers. Access to

	map = vq->maps[VHOST_ADDR_USED];

Still needs a read side barrier, and then I think this will be no
better than a normal spinlock.

It also doesn't seem like this algorithm even needs a seqlock, as this
is just a one bit flag

atomic_set_bit(using map)
smp_mb__after_atomic()
.. maps [...]
atomic_clear_bit(using map)


map = NULL;
smp_mb__before_atomic();
while (atomic_read_bit(using map))
   relax()

Again, not clear this could be faster than a spinlock when the
barriers are correct...

Jason

^ permalink raw reply

* Re: [PATCH v2 1/2] dt-bindings: net: snps,dwmac: update reg minItems maxItems
From: Maxime Ripard @ 2019-08-07 12:04 UTC (permalink / raw)
  To: Rob Herring
  Cc: Neil Armstrong, Martin Blumenstingl, devicetree, netdev,
	linux-amlogic,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAL_Jsq+6kCO8x53d1670VjgEjfs5opKY+R3OgsAo0WsXqq512Q@mail.gmail.com>

Hi,

On Tue, Aug 06, 2019 at 09:22:12AM -0600, Rob Herring wrote:
> +Maxime
>
> On Tue, Aug 6, 2019 at 6:50 AM Neil Armstrong <narmstrong@baylibre.com> wrote:
> >
> > The Amlogic Meson DWMAC glue bindings needs a second reg cells for the
> > glue registers, thus update the reg minItems/maxItems to allow more
> > than a single reg cell.
> >
> > Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
> > ---
> >  Documentation/devicetree/bindings/net/snps,dwmac.yaml | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
>
> I haven't checked, but the derivative schema could be assuming this
> schema enforced reg is 1 item.

Yeah, we do for
Documentation/devicetree/bindings/net/allwinner,sun7i-a20-gmac.yaml
(but somehow not allwinner,sun8i-a83t-emac.yaml)

Neil, can you add it to sun7i-a20-gmac?

> I don't think that's a major issue
> though.
>
> Acked-by: Rob Herring <robh@kernel.org>

With that fixed,
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>

Maxime

--
Maxime Ripard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* [PATCH] net: fec: Allow the driver to be built for ARM64 SoCs such as i.MX8
From: Schrempf Frieder @ 2019-08-07 11:44 UTC (permalink / raw)
  To: David S. Miller, Ioana Radulescu, Claudiu Manoil, Thomas Gleixner,
	Yangbo Lu, Schrempf Frieder
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20190807114332.13312-1-frieder.schrempf@kontron.de>

From: Frieder Schrempf <frieder.schrempf@kontron.de>

The FEC ethernet controller is used in some ARM64 SoCs such as i.MX8.
To make use of it, append ARM64 to the list of dependencies.

Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
---
 drivers/net/ethernet/freescale/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig
index 6a7e8993119f..f7f4e073d955 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -23,7 +23,7 @@ if NET_VENDOR_FREESCALE
 config FEC
 	tristate "FEC ethernet controller (of ColdFire and some i.MX CPUs)"
 	depends on (M523x || M527x || M5272 || M528x || M520x || M532x || \
-		   ARCH_MXC || SOC_IMX28 || COMPILE_TEST)
+		   ARCH_MXC || ARM64 || SOC_IMX28 || COMPILE_TEST)
 	default ARCH_MXC || SOC_IMX28 if ARM
 	select PHYLIB
 	imply PTP_1588_CLOCK
-- 
2.17.1

^ permalink raw reply related

* KMSAN: uninit-value in smsc75xx_wait_eeprom
From: syzbot @ 2019-08-07 11:48 UTC (permalink / raw)
  To: davem, glider, linux-kernel, linux-usb, netdev, steve.glendinning,
	syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    ae0c578a kmsan: include gfp.h from kmsan.h
git tree:       kmsan
console output: https://syzkaller.appspot.com/x/log.txt?x=10e4f474600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=27abc558ecb16a3b
dashboard link: https://syzkaller.appspot.com/bug?extid=532222e4d7ddadadd1c8
compiler:       clang version 9.0.0 (/home/glider/llvm/clang  
80fee25776c2fb61e74c1ecb1a523375c2500b69)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+532222e4d7ddadadd1c8@syzkaller.appspotmail.com

usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 2-1: config 0 descriptor??
smsc75xx v1.0.0
==================================================================
BUG: KMSAN: uninit-value in smsc75xx_wait_eeprom+0x1fb/0x3d0  
drivers/net/usb/smsc75xx.c:307
CPU: 1 PID: 10983 Comm: kworker/1:5 Not tainted 5.3.0-rc3+ #16
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x191/0x1f0 lib/dump_stack.c:113
  kmsan_report+0x162/0x2d0 mm/kmsan/kmsan_report.c:109
  __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:294
  smsc75xx_wait_eeprom+0x1fb/0x3d0 drivers/net/usb/smsc75xx.c:307
  smsc75xx_read_eeprom+0x3c2/0x920 drivers/net/usb/smsc75xx.c:364
  smsc75xx_init_mac_address drivers/net/usb/smsc75xx.c:771 [inline]
  smsc75xx_bind+0x675/0x12d0 drivers/net/usb/smsc75xx.c:1489
  usbnet_probe+0x10ae/0x3960 drivers/net/usb/usbnet.c:1722
  usb_probe_interface+0xd19/0x1310 drivers/usb/core/driver.c:361
  really_probe+0x1373/0x1dc0 drivers/base/dd.c:552
  driver_probe_device+0x1ba/0x510 drivers/base/dd.c:709
  __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:816
  bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
  __device_attach+0x489/0x750 drivers/base/dd.c:882
  device_initial_probe+0x4a/0x60 drivers/base/dd.c:929
  bus_probe_device+0x131/0x390 drivers/base/bus.c:514
  device_add+0x25b5/0x2df0 drivers/base/core.c:2114
  usb_set_configuration+0x309f/0x3710 drivers/usb/core/message.c:2027
  generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
  usb_probe_device+0x146/0x200 drivers/usb/core/driver.c:266
  really_probe+0x1373/0x1dc0 drivers/base/dd.c:552
  driver_probe_device+0x1ba/0x510 drivers/base/dd.c:709
  __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:816
  bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
  __device_attach+0x489/0x750 drivers/base/dd.c:882
  device_initial_probe+0x4a/0x60 drivers/base/dd.c:929
  bus_probe_device+0x131/0x390 drivers/base/bus.c:514
  device_add+0x25b5/0x2df0 drivers/base/core.c:2114
  usb_new_device+0x23e5/0x2fb0 drivers/usb/core/hub.c:2536
  hub_port_connect drivers/usb/core/hub.c:5098 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5213 [inline]
  port_event drivers/usb/core/hub.c:5359 [inline]
  hub_event+0x581d/0x72f0 drivers/usb/core/hub.c:5441
  process_one_work+0x1572/0x1ef0 kernel/workqueue.c:2269
  worker_thread+0x111b/0x2460 kernel/workqueue.c:2415
  kthread+0x4b5/0x4f0 kernel/kthread.c:256
  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355

Local variable description: ----buf.i.i@smsc75xx_wait_eeprom
Variable was created at:
  __smsc75xx_read_reg drivers/net/usb/smsc75xx.c:83 [inline]
  smsc75xx_read_reg drivers/net/usb/smsc75xx.c:147 [inline]
  smsc75xx_wait_eeprom+0xb6/0x3d0 drivers/net/usb/smsc75xx.c:301
  smsc75xx_read_eeprom+0x3c2/0x920 drivers/net/usb/smsc75xx.c:364
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply

* Re: [PATCH] staging: isdn: hysdn_procconf_init() remove parantheses from return value
From: Dan Carpenter @ 2019-08-07 11:05 UTC (permalink / raw)
  To: Giridhar Prasath R; +Cc: isdn, devel, arnd, gregkh, linux-kernel, netdev
In-Reply-To: <20190807020331.19729-1-cristianoprasath@gmail.com>

This driver is going to be deleted soon so we aren't accepting cleanups.

Thanks!

regards,
dan carpenter


^ permalink raw reply

* Re: [PATCH net-next] net: can: Fix compiling warning
From: Dan Carpenter @ 2019-08-07 10:50 UTC (permalink / raw)
  To: Oliver Hartkopp, Patrick Bellasi, linux-sparse
  Cc: Mao Wenan, davem, netdev, linux-kernel, kernel-janitors,
	Ingo Molnar
In-Reply-To: <6e1c5aa0-8ed3-eec3-a34d-867ea8f54e9d@hartkopp.net>

On Tue, Aug 06, 2019 at 06:41:44PM +0200, Oliver Hartkopp wrote:
> I compiled the code (the original version), but I do not get that "Should it
> be static?" warning:
> 
> user@box:~/net-next$ make C=1
>   CALL    scripts/checksyscalls.sh
>   CALL    scripts/atomic/check-atomics.sh
>   DESCEND  objtool
>   CHK     include/generated/compile.h
>   CHECK   net/can/af_can.c
> ./include/linux/sched.h:609:43: error: bad integer constant expression
> ./include/linux/sched.h:609:73: error: invalid named zero-width bitfield
> `value'
> ./include/linux/sched.h:610:43: error: bad integer constant expression
> ./include/linux/sched.h:610:67: error: invalid named zero-width bitfield
> `bucket_id'
>   CC [M]  net/can/af_can.o

The sched.h errors suppress Sparse warnings so it's broken/useless now.
The code looks like this:

include/linux/sched.h
   613  struct uclamp_se {
   614          unsigned int value              : bits_per(SCHED_CAPACITY_SCALE);
   615          unsigned int bucket_id          : bits_per(UCLAMP_BUCKETS);
   616          unsigned int active             : 1;
   617          unsigned int user_defined       : 1;
   618  };

bits_per() is zero and Sparse doesn't like zero sized bitfields.

regards,
dan carpenter

^ permalink raw reply

* [PATCH net-next] mlxsw: spectrum: Extend to support Spectrum-3 ASIC
From: Ido Schimmel @ 2019-08-07 10:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, petrm, mlxsw, Ido Schimmel

From: Jiri Pirko <jiri@mellanox.com>

Extend existing driver for Spectrum and Spectrum-2 ASICs
to support Spectrum-3 ASIC as well.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/Kconfig   |  6 +-
 drivers/net/ethernet/mellanox/mlxsw/pci.h     |  1 +
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 55 +++++++++++++++++++
 3 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/Kconfig b/drivers/net/ethernet/mellanox/mlxsw/Kconfig
index 06c80343d9ed..f458fd1ce9f8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlxsw/Kconfig
@@ -71,7 +71,7 @@ config MLXSW_SWITCHX2
 	  module will be called mlxsw_switchx2.
 
 config MLXSW_SPECTRUM
-	tristate "Mellanox Technologies Spectrum support"
+	tristate "Mellanox Technologies Spectrum family support"
 	depends on MLXSW_CORE && MLXSW_PCI && NET_SWITCHDEV && VLAN_8021Q
 	depends on PSAMPLE || PSAMPLE=n
 	depends on BRIDGE || BRIDGE=n
@@ -87,8 +87,8 @@ config MLXSW_SPECTRUM
 	select NET_PTP_CLASSIFY if PTP_1588_CLOCK
 	default m
 	---help---
-	  This driver supports Mellanox Technologies Spectrum Ethernet
-	  Switch ASICs.
+	  This driver supports Mellanox Technologies
+	  Spectrum/Spectrum-2/Spectrum-3 Ethernet Switch ASICs.
 
 	  To compile this driver as a module, choose M here: the
 	  module will be called mlxsw_spectrum.
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.h b/drivers/net/ethernet/mellanox/mlxsw/pci.h
index 946339e13eb9..5b1323645a5d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.h
@@ -9,6 +9,7 @@
 #define PCI_DEVICE_ID_MELLANOX_SWITCHX2		0xc738
 #define PCI_DEVICE_ID_MELLANOX_SPECTRUM		0xcb84
 #define PCI_DEVICE_ID_MELLANOX_SPECTRUM2	0xcf6c
+#define PCI_DEVICE_ID_MELLANOX_SPECTRUM3	0xcf70
 #define PCI_DEVICE_ID_MELLANOX_SWITCHIB		0xcb20
 #define PCI_DEVICE_ID_MELLANOX_SWITCHIB2	0xcf08
 
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 5a8e94c0a95a..389861ece418 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -65,6 +65,7 @@ static const struct mlxsw_fw_rev mlxsw_sp1_fw_rev = {
 
 static const char mlxsw_sp1_driver_name[] = "mlxsw_spectrum";
 static const char mlxsw_sp2_driver_name[] = "mlxsw_spectrum2";
+static const char mlxsw_sp3_driver_name[] = "mlxsw_spectrum3";
 static const char mlxsw_sp_driver_version[] = "1.0";
 
 static const unsigned char mlxsw_sp1_mac_mask[ETH_ALEN] = {
@@ -5290,6 +5291,35 @@ static struct mlxsw_driver mlxsw_sp2_driver = {
 	.res_query_enabled		= true,
 };
 
+static struct mlxsw_driver mlxsw_sp3_driver = {
+	.kind				= mlxsw_sp3_driver_name,
+	.priv_size			= sizeof(struct mlxsw_sp),
+	.init				= mlxsw_sp2_init,
+	.fini				= mlxsw_sp_fini,
+	.basic_trap_groups_set		= mlxsw_sp_basic_trap_groups_set,
+	.port_split			= mlxsw_sp_port_split,
+	.port_unsplit			= mlxsw_sp_port_unsplit,
+	.sb_pool_get			= mlxsw_sp_sb_pool_get,
+	.sb_pool_set			= mlxsw_sp_sb_pool_set,
+	.sb_port_pool_get		= mlxsw_sp_sb_port_pool_get,
+	.sb_port_pool_set		= mlxsw_sp_sb_port_pool_set,
+	.sb_tc_pool_bind_get		= mlxsw_sp_sb_tc_pool_bind_get,
+	.sb_tc_pool_bind_set		= mlxsw_sp_sb_tc_pool_bind_set,
+	.sb_occ_snapshot		= mlxsw_sp_sb_occ_snapshot,
+	.sb_occ_max_clear		= mlxsw_sp_sb_occ_max_clear,
+	.sb_occ_port_pool_get		= mlxsw_sp_sb_occ_port_pool_get,
+	.sb_occ_tc_port_bind_get	= mlxsw_sp_sb_occ_tc_port_bind_get,
+	.flash_update			= mlxsw_sp_flash_update,
+	.txhdr_construct		= mlxsw_sp_txhdr_construct,
+	.resources_register		= mlxsw_sp2_resources_register,
+	.params_register		= mlxsw_sp2_params_register,
+	.params_unregister		= mlxsw_sp2_params_unregister,
+	.ptp_transmitted		= mlxsw_sp_ptp_transmitted,
+	.txhdr_len			= MLXSW_TXHDR_LEN,
+	.profile			= &mlxsw_sp2_config_profile,
+	.res_query_enabled		= true,
+};
+
 bool mlxsw_sp_port_dev_check(const struct net_device *dev)
 {
 	return dev->netdev_ops == &mlxsw_sp_port_netdev_ops;
@@ -6324,6 +6354,16 @@ static struct pci_driver mlxsw_sp2_pci_driver = {
 	.id_table = mlxsw_sp2_pci_id_table,
 };
 
+static const struct pci_device_id mlxsw_sp3_pci_id_table[] = {
+	{PCI_VDEVICE(MELLANOX, PCI_DEVICE_ID_MELLANOX_SPECTRUM3), 0},
+	{0, },
+};
+
+static struct pci_driver mlxsw_sp3_pci_driver = {
+	.name = mlxsw_sp3_driver_name,
+	.id_table = mlxsw_sp3_pci_id_table,
+};
+
 static int __init mlxsw_sp_module_init(void)
 {
 	int err;
@@ -6339,6 +6379,10 @@ static int __init mlxsw_sp_module_init(void)
 	if (err)
 		goto err_sp2_core_driver_register;
 
+	err = mlxsw_core_driver_register(&mlxsw_sp3_driver);
+	if (err)
+		goto err_sp3_core_driver_register;
+
 	err = mlxsw_pci_driver_register(&mlxsw_sp1_pci_driver);
 	if (err)
 		goto err_sp1_pci_driver_register;
@@ -6347,11 +6391,19 @@ static int __init mlxsw_sp_module_init(void)
 	if (err)
 		goto err_sp2_pci_driver_register;
 
+	err = mlxsw_pci_driver_register(&mlxsw_sp3_pci_driver);
+	if (err)
+		goto err_sp3_pci_driver_register;
+
 	return 0;
 
+err_sp3_pci_driver_register:
+	mlxsw_pci_driver_unregister(&mlxsw_sp2_pci_driver);
 err_sp2_pci_driver_register:
 	mlxsw_pci_driver_unregister(&mlxsw_sp1_pci_driver);
 err_sp1_pci_driver_register:
+	mlxsw_core_driver_unregister(&mlxsw_sp3_driver);
+err_sp3_core_driver_register:
 	mlxsw_core_driver_unregister(&mlxsw_sp2_driver);
 err_sp2_core_driver_register:
 	mlxsw_core_driver_unregister(&mlxsw_sp1_driver);
@@ -6363,8 +6415,10 @@ static int __init mlxsw_sp_module_init(void)
 
 static void __exit mlxsw_sp_module_exit(void)
 {
+	mlxsw_pci_driver_unregister(&mlxsw_sp3_pci_driver);
 	mlxsw_pci_driver_unregister(&mlxsw_sp2_pci_driver);
 	mlxsw_pci_driver_unregister(&mlxsw_sp1_pci_driver);
+	mlxsw_core_driver_unregister(&mlxsw_sp3_driver);
 	mlxsw_core_driver_unregister(&mlxsw_sp2_driver);
 	mlxsw_core_driver_unregister(&mlxsw_sp1_driver);
 	unregister_inet6addr_validator_notifier(&mlxsw_sp_inet6addr_valid_nb);
@@ -6379,4 +6433,5 @@ MODULE_AUTHOR("Jiri Pirko <jiri@mellanox.com>");
 MODULE_DESCRIPTION("Mellanox Spectrum driver");
 MODULE_DEVICE_TABLE(pci, mlxsw_sp1_pci_id_table);
 MODULE_DEVICE_TABLE(pci, mlxsw_sp2_pci_id_table);
+MODULE_DEVICE_TABLE(pci, mlxsw_sp3_pci_id_table);
 MODULE_FIRMWARE(MLXSW_SP1_FW_FILENAME);
-- 
2.21.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox