Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v3 0/2] sctp: delay calls to sk_data_ready() as much as possible
From: Marcelo Ricardo Leitner @ 2016-04-14 17:00 UTC (permalink / raw)
  To: Neil Horman, David Miller
  Cc: netdev, vyasevich, linux-sctp, David.Laight, jkbs
In-Reply-To: <20160414130324.GA6806@hmsreliant.think-freely.org>

Em 14-04-2016 10:03, Neil Horman escreveu:
> On Wed, Apr 13, 2016 at 11:05:32PM -0400, David Miller wrote:
>> From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>> Date: Fri,  8 Apr 2016 16:41:26 -0300
>>
>>> 1st patch is a preparation for the 2nd. The idea is to not call
>>> ->sk_data_ready() for every data chunk processed while processing
>>> packets but only once before releasing the socket.
>>>
>>> v2: patchset re-checked, small changelog fixes
>>> v3: on patch 2, make use of local vars to make it more readable
>>
>> Applied to net-next, but isn't this reduced overhead coming at the
>> expense of latency?  What if that lower latency is important to the
>> application and/or consumer?
> Thats a fair point, but I'd make the counter argument that, as it currently
> stands, any latency introduced (or removed), is an artifact of our
> implementation rather than a designed feature of it.  That is to say, we make no
> guarantees at the application level regarding how long it takes to signal data
> readines from the time we get data off the wire, so I would rather see our
> throughput raised if we can, as thats been sctp's more pressing achilles heel.
>
>
> Thats not to say I'd like to enable lower latency, but I'd rather have this now,
> and start pondering how to design that in.  Perhaps we can convert the pending
> flag to a counter to count the number of events we enqueue, and call
> sk_data_ready every  time we reach a sysctl defined threshold.

That and also that there is no chance of the application reading the 
first chunks before all current ToDo's are performed by either the bh or 
backlog handlers for that packet. Socket lock won't be cycled in between 
chunks so the application is going to wait all the processing one way or 
another.

Thanks,
Marcelo

^ permalink raw reply

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Timur Tabi @ 2016-04-14 16:47 UTC (permalink / raw)
  To: Rob Herring
  Cc: netdev, linux-kernel, devicetree, linux-arm-msm, sdharia,
	Shanker Donthineni, Greg Kroah-Hartman, vikrams, cov, gavidov,
	andrew, bjorn.andersson, Mark Langsdorf, Jon Masters, Andy Gross,
	David S. Miller
In-Reply-To: <20160414163240.GB15303@rob-hp-laptop>

Rob Herring wrote:

>> @@ -0,0 +1,65 @@
>> +Qualcomm EMAC Gigabit Ethernet Controller
>> +
>> +Required properties:
>> +- compatible : Should be "qcom,emac".
>
> Come on... Can you guess what I'm going to say here.

Ooops, I missed that one.

>
>> +- reg : Offset and length of the register regions for the device
>> +- reg-names : Register region names referenced in 'reg' above.
>> +	Required register resource entries are:
>> +	"base"   : EMAC controller base register block.
>> +	"csr"    : EMAC wrapper register block.
>> +	Optional register resource entries are:
>> +	"ptp"    : EMAC PTP (1588) register block.
>> +		   Required if 'qcom,emac-tstamp-en' is present.
>> +	"sgmii"  : EMAC SGMII PHY register block.
>> +- interrupts : Interrupt numbers used by this controller
>> +- interrupt-names : Interrupt resource names referenced in 'interrupts' above.
>> +	Required interrupt resource entries are:
>> +	"emac_core0"   : EMAC core0 interrupt.
>> +	"sgmii_irq"   : EMAC SGMII interrupt.
>> +- phy-addr            : Specifies phy address on MDIO bus.
>> +			Required if the optional property "qcom,no-external-phy"
>> +			is not specified.
>
> As I mentioned in the last version, you should still describe this with
> a standard MDIO bus binding even if you can't use the generic code.

You mean like this?

	phy0: ethernet-phy@0 {
		compatible = "qcom,fsm9900-emac-phy";
		reg = <4>;
	};

>> +Optional properties:
>> +- qcom,emac-tstamp-en       : Enables the PTP (1588) timestamping feature.
>> +			      Include this only if PTP (1588) timestamping
>> +			      feature is needed. If included, "ptp" register
>> +			      base should be specified.
>> +- mac-address               : The 6-byte MAC address. If present, it is the
>> +			      default MAC address.
>> +- qcom,no-external-phy      : Indicates there is no external PHY connected to
>> +			      EMAC. Include this only if the EMAC is directly
>> +			      connected to the peer end without EPHY.
>> +Example:
>> +	emac0: qcom,emac@feb20000 {
>
> ethernet@
>
>> +		compatible = "qcom,fsm9900-emac";
>
> Ah, I see you fixed it here...

and in the code, I just missed it in the top of the file.  I'll fix it 
everywhere in v5.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation collaborative project.

^ permalink raw reply

* Re: Deleting child qdisc doesn't reset parent to default qdisc?
From: Eric Dumazet @ 2016-04-14 16:40 UTC (permalink / raw)
  To: Phil Sutter; +Cc: Jiri Kosina, Jamal Hadi Salim, netdev, linux-kernel
In-Reply-To: <20160414162229.GF3715@orbyte.nwl.cc>

On Thu, 2016-04-14 at 18:22 +0200, Phil Sutter wrote:

> And those being invisible can be overridden using 'tc qd add', right?
> AFAIR they're not listed because they don't properly register, so the
> system doesn't care to override them. In this case we could change all
> classful qdiscs to restore the default qdisc if a leaf qdisc is being
> deleted instead of noop (which is probably not what the user wants
> anyway).

Even if they properly register, they are not visible.

Take a look at
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=95dc19299f741c986227ec33e23cbf9b3321f812

for some context.

When a default pfifo is created on say a HTB class, you do not see it by
default in a dump.

If you have 100 HTB classes, HTB created 100 pfifo just fine, and it
works, unless an admin tries to delete them maybe ;)

^ permalink raw reply

* [PATCH net-next] tun: don't require serialization lock on tx
From: Paolo Abeni @ 2016-04-14 16:39 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Michael S. Tsirkin, Hannes Frederic Sowa,
	Eric W. Biederman, Greg Kurz, Jason Wang, Eric Dumazet

The current tun_net_xmit() implementation don't need any external
lock since it relies on rcu protection for the tun data structure
and on socket queue lock for skb queuing.

This patch set the NETIF_F_LLTX feature bit in the tun device, so
that on xmit, in absence of qdisc, no serialization lock is acquired
by the caller.

The user space can remove the default tun qdisc with:

tc qdisc replace dev <tun device name> root noqueue

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>

---
RFC -> v1
 - fixed a commit message typo, extended the comment with a 
   configuration hint
---
 drivers/net/tun.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index faf9297..42992dc 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1796,7 +1796,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 		dev->hw_features = NETIF_F_SG | NETIF_F_FRAGLIST |
 				   TUN_USER_FEATURES | NETIF_F_HW_VLAN_CTAG_TX |
 				   NETIF_F_HW_VLAN_STAG_TX;
-		dev->features = dev->hw_features;
+		dev->features = dev->hw_features | NETIF_F_LLTX;
 		dev->vlan_features = dev->features &
 				     ~(NETIF_F_HW_VLAN_CTAG_TX |
 				       NETIF_F_HW_VLAN_STAG_TX);
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Rob Herring @ 2016-04-14 16:32 UTC (permalink / raw)
  To: Timur Tabi
  Cc: netdev, linux-kernel, devicetree, linux-arm-msm, sdharia,
	Shanker Donthineni, Greg Kroah-Hartman, vikrams, cov, gavidov,
	andrew, bjorn.andersson, Mark Langsdorf, Jon Masters, Andy Gross,
	David S. Miller
In-Reply-To: <1460570393-19838-1-git-send-email-timur@codeaurora.org>

On Wed, Apr 13, 2016 at 12:59:52PM -0500, Timur Tabi wrote:
> From: Gilad Avidov <gavidov@codeaurora.org>
> 
> Add supports for ethernet controller HW on Qualcomm Technologies, Inc. SoC.
> This driver supports the following features:
> 1) Checksum offload.
> 2) Runtime power management support.
> 3) Interrupt coalescing support.
> 4) SGMII phy.
> 5) SGMII direct connection without external phy.
> 
> Based on a driver by Niranjana Vishwanathapura
> <nvishwan@codeaurora.org>.
> 
> Signed-off-by: Gilad Avidov <gavidov@codeaurora.org>
> Signed-off-by: Timur Tabi <timur@codeaurora.org>
> ---
> 
> v4:
>  - add missing ipv6 header file
>  - correct compatible string
>  - fix spacing in emac_reg_write arrays
>  - drop unnecessary cell-index property
>  - remove unsupported DT properties from docs
>  - remove GPIO initialization and update docs
> 
> v3:
>  - remove most of the memory barriers by using the non xxx_relaxed() api.
>  - remove RSS and WOL support.
>  - correct comments from physical address to dma address.
>  - rearrange structs to make them packed.
>  - replace polling loops with readl_poll_timeout().
>  - remove unnecessary wrapper functions from phy layer.
>  - add blank line before return statements.
>  - set to null clocks after clk_put().
>  - use module_platform_driver() and dma_set_mask_and_coherent()
>  - replace long hex bitmasks with BIT() macro.
> 
> v2:
>  - replace hw bit fields to macros with bitwise operations.
>  - change all iterators to unsized types (int)
>  - some minor code flow improvements.
>  - change return type to void for functions which return value is never
>    used.
>  - replace instance of xxxxl_relaxed() io followed by mb() with a
>    readl()/writel().
> 
> ---
>  .../devicetree/bindings/net/qcom-emac.txt          |   65 +
>  drivers/net/ethernet/qualcomm/Kconfig              |   11 +
>  drivers/net/ethernet/qualcomm/Makefile             |    2 +
>  drivers/net/ethernet/qualcomm/emac/Makefile        |    7 +
>  drivers/net/ethernet/qualcomm/emac/emac-mac.c      | 1782 ++++++++++++++++++++
>  drivers/net/ethernet/qualcomm/emac/emac-mac.h      |  286 ++++
>  drivers/net/ethernet/qualcomm/emac/emac-phy.c      |  484 ++++++
>  drivers/net/ethernet/qualcomm/emac/emac-phy.h      |   68 +
>  drivers/net/ethernet/qualcomm/emac/emac-sgmii.c    |  683 ++++++++
>  drivers/net/ethernet/qualcomm/emac/emac-sgmii.h    |   30 +
>  drivers/net/ethernet/qualcomm/emac/emac.c          | 1206 +++++++++++++
>  drivers/net/ethernet/qualcomm/emac/emac.h          |  382 +++++
>  12 files changed, 5006 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/qcom-emac.txt
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/Makefile
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac-mac.c
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac-mac.h
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac-phy.c
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac-phy.h
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac-sgmii.h
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac.c
>  create mode 100644 drivers/net/ethernet/qualcomm/emac/emac.h
> 
> diff --git a/Documentation/devicetree/bindings/net/qcom-emac.txt b/Documentation/devicetree/bindings/net/qcom-emac.txt
> new file mode 100644
> index 0000000..df5e7c0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/qcom-emac.txt
> @@ -0,0 +1,65 @@
> +Qualcomm EMAC Gigabit Ethernet Controller
> +
> +Required properties:
> +- compatible : Should be "qcom,emac".

Come on... Can you guess what I'm going to say here.

> +- reg : Offset and length of the register regions for the device
> +- reg-names : Register region names referenced in 'reg' above.
> +	Required register resource entries are:
> +	"base"   : EMAC controller base register block.
> +	"csr"    : EMAC wrapper register block.
> +	Optional register resource entries are:
> +	"ptp"    : EMAC PTP (1588) register block.
> +		   Required if 'qcom,emac-tstamp-en' is present.
> +	"sgmii"  : EMAC SGMII PHY register block.
> +- interrupts : Interrupt numbers used by this controller
> +- interrupt-names : Interrupt resource names referenced in 'interrupts' above.
> +	Required interrupt resource entries are:
> +	"emac_core0"   : EMAC core0 interrupt.
> +	"sgmii_irq"   : EMAC SGMII interrupt.
> +- phy-addr            : Specifies phy address on MDIO bus.
> +			Required if the optional property "qcom,no-external-phy"
> +			is not specified.

As I mentioned in the last version, you should still describe this with 
a standard MDIO bus binding even if you can't use the generic code.

> +
> +Optional properties:
> +- qcom,emac-tstamp-en       : Enables the PTP (1588) timestamping feature.
> +			      Include this only if PTP (1588) timestamping
> +			      feature is needed. If included, "ptp" register
> +			      base should be specified.
> +- mac-address               : The 6-byte MAC address. If present, it is the
> +			      default MAC address.
> +- qcom,no-external-phy      : Indicates there is no external PHY connected to
> +			      EMAC. Include this only if the EMAC is directly
> +			      connected to the peer end without EPHY.
> +Example:
> +	emac0: qcom,emac@feb20000 {

ethernet@

> +		compatible = "qcom,fsm9900-emac";

Ah, I see you fixed it here...

> +		reg-names = "base", "csr", "ptp", "sgmii";
> +		reg =   <0xfeb20000 0x10000>,
> +			<0xfeb36000 0x1000>,
> +			<0xfeb3c000 0x4000>,
> +			<0xfeb38000 0x400>;
> +		#address-cells = <0>;
> +		interrupt-parent = <&emac0>;
> +		#interrupt-cells = <1>;
> +		interrupts = <0 1>;
> +		interrupt-map-mask = <0xffffffff>;
> +		interrupt-map = <0 &intc 0 76 0
> +				 1 &intc 0 80 0>;
> +		interrupt-names = "emac_core0", "sgmii_irq";
> +		qcom,emac-tstamp-en;
> +		phy-addr = <0>;
> +
> +		pinctrl-names = "default";
> +		pinctrl-0 = <&mdio_pins_a>;
> +	};
> +
> +	tlmm: pinctrl@fd510000 {
> +		compatible = "qcom,fsm9900-pinctrl";
> +
> +		mdio_pins_a: mdio {
> +			state {
> +				pins = "gpio123", "gpio124";
> +				function = "mdio";
> +			};
> +		};
> +	};

^ permalink raw reply

* Re: [PATCH 1/2] [v4] net: emac: emac gigabit ethernet controller driver
From: Rob Herring @ 2016-04-14 16:24 UTC (permalink / raw)
  To: Timur Tabi
  Cc: kbuild test robot, kbuild-all, netdev, linux-kernel, devicetree,
	linux-arm-msm, sdharia, Shanker Donthineni, Greg Kroah-Hartman,
	vikrams, cov, gavidov, andrew, bjorn.andersson, Mark Langsdorf,
	Jon Masters, Andy Gross, David S. Miller
In-Reply-To: <570E9E8D.50007@codeaurora.org>

On Wed, Apr 13, 2016 at 02:31:25PM -0500, Timur Tabi wrote:
> kbuild test robot wrote:
> >
> >    drivers/net/ethernet/qualcomm/emac/emac-mac.c: In function 'emac_mac_up':
> >>>>>drivers/net/ethernet/qualcomm/emac/emac-mac.c:1076:9: warning: large integer implicitly truncated to unsigned type [-Woverflow]
> >      writel(~DIS_INT, adpt->base + EMAC_INT_STATUS);
> 
> This doesn't happen on arm64, and I don't know how to fix it.  DIS_INT is
> defined as:

Probably depends on the compiler version. BTW, clang seems to throw 
errors for this type of thing.

> 
> 	#define DIS_INT          BIT(31)
> 
> It seems silly to add a typecast to DIS_INT.

BIT() should use 1U instead of 1.

Rob

^ permalink raw reply

* Re: Deleting child qdisc doesn't reset parent to default qdisc?
From: Phil Sutter @ 2016-04-14 16:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Jiri Kosina, Jamal Hadi Salim, netdev, linux-kernel
In-Reply-To: <1460648680.10638.47.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, Apr 14, 2016 at 08:44:40AM -0700, Eric Dumazet wrote:
> On Thu, 2016-04-14 at 17:34 +0200, Jiri Kosina wrote:
> > On Thu, 14 Apr 2016, Phil Sutter wrote:
> > 
> > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default
> > > one upon deletion instead of noop_qdisc, hence I would describe
> > > the situation using the words 'inconsistent' and 'accident' rather than
> > > 'expected'. :)
> > 
> > Exactly. I'd again like to stress the fact that this configuration works:
> > 
> > 	jikos:~ # tc qdisc show
> > 	qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms 
> > 
> > and this (after performing add/delete operation) doesn't:
> > 
> > 	jikos:~ # tc qdisc show
> > 	qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms 
> > 
> > It's hard to spot a difference (hint: there is none).
> 
> This is because some qdisc are not visible in the dump.

And those being invisible can be overridden using 'tc qd add', right?
AFAIR they're not listed because they don't properly register, so the
system doesn't care to override them. In this case we could change all
classful qdiscs to restore the default qdisc if a leaf qdisc is being
deleted instead of noop (which is probably not what the user wants
anyway).

Cheers, Phil

^ permalink raw reply

* [patch net-next 18/18] mlxsw: spectrum_buffers: Implement occupancy monitoring
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Implement occupancy API introduced in devlink and mlxsw core. This is
done by accessing SBPM register for Port-Pool and SBSR for Port-TC
current and max occupancy values. Max clear is implemented using the
same registers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     |  36 +--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  18 ++
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 255 +++++++++++++++++++++
 3 files changed, 293 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index ecadb15..681afe1 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -2493,22 +2493,26 @@ static struct mlxsw_config_profile mlxsw_sp_config_profile = {
 };
 
 static struct mlxsw_driver mlxsw_sp_driver = {
-	.kind			= MLXSW_DEVICE_KIND_SPECTRUM,
-	.owner			= THIS_MODULE,
-	.priv_size		= sizeof(struct mlxsw_sp),
-	.init			= mlxsw_sp_init,
-	.fini			= mlxsw_sp_fini,
-	.port_split		= mlxsw_sp_port_split,
-	.port_unsplit		= mlxsw_sp_port_unsplit,
-	.sb_pool_get		= mlxsw_sp_sb_pool_get,
-	.sb_pool_set		= mlxsw_sp_sb_pool_set,
-	.sb_port_pool_get	= mlxsw_sp_sb_port_pool_get,
-	.sb_port_pool_set	= mlxsw_sp_sb_port_pool_set,
-	.sb_tc_pool_bind_get	= mlxsw_sp_sb_tc_pool_bind_get,
-	.sb_tc_pool_bind_set	= mlxsw_sp_sb_tc_pool_bind_set,
-	.txhdr_construct	= mlxsw_sp_txhdr_construct,
-	.txhdr_len		= MLXSW_TXHDR_LEN,
-	.profile		= &mlxsw_sp_config_profile,
+	.kind				= MLXSW_DEVICE_KIND_SPECTRUM,
+	.owner				= THIS_MODULE,
+	.priv_size			= sizeof(struct mlxsw_sp),
+	.init				= mlxsw_sp_init,
+	.fini				= mlxsw_sp_fini,
+	.port_split			= mlxsw_sp_port_split,
+	.port_unsplit			= mlxsw_sp_port_unsplit,
+	.sb_pool_get			= mlxsw_sp_sb_pool_get,
+	.sb_pool_set			= mlxsw_sp_sb_pool_set,
+	.sb_port_pool_get		= mlxsw_sp_sb_port_pool_get,
+	.sb_port_pool_set		= mlxsw_sp_sb_port_pool_set,
+	.sb_tc_pool_bind_get		= mlxsw_sp_sb_tc_pool_bind_get,
+	.sb_tc_pool_bind_set		= mlxsw_sp_sb_tc_pool_bind_set,
+	.sb_occ_snapshot		= mlxsw_sp_sb_occ_snapshot,
+	.sb_occ_max_clear		= mlxsw_sp_sb_occ_max_clear,
+	.sb_occ_port_pool_get		= mlxsw_sp_sb_occ_port_pool_get,
+	.sb_occ_tc_port_bind_get	= mlxsw_sp_sb_occ_tc_port_bind_get,
+	.txhdr_construct		= mlxsw_sp_txhdr_construct,
+	.txhdr_len			= MLXSW_TXHDR_LEN,
+	.profile			= &mlxsw_sp_config_profile,
 };
 
 static int
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 6458efa..e2c022d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -123,15 +123,22 @@ struct mlxsw_sp_sb_pr {
 	u32 size;
 };
 
+struct mlxsw_cp_sb_occ {
+	u32 cur;
+	u32 max;
+};
+
 struct mlxsw_sp_sb_cm {
 	u32 min_buff;
 	u32 max_buff;
 	u8 pool;
+	struct mlxsw_cp_sb_occ occ;
 };
 
 struct mlxsw_sp_sb_pm {
 	u32 min_buff;
 	u32 max_buff;
+	struct mlxsw_cp_sb_occ occ;
 };
 
 #define MLXSW_SP_SB_POOL_COUNT	4
@@ -328,6 +335,17 @@ int mlxsw_sp_sb_tc_pool_bind_set(struct mlxsw_core_port *mlxsw_core_port,
 				 unsigned int sb_index, u16 tc_index,
 				 enum devlink_sb_pool_type pool_type,
 				 u16 pool_index, u32 threshold);
+int mlxsw_sp_sb_occ_snapshot(struct mlxsw_core *mlxsw_core,
+			     unsigned int sb_index);
+int mlxsw_sp_sb_occ_max_clear(struct mlxsw_core *mlxsw_core,
+			      unsigned int sb_index);
+int mlxsw_sp_sb_occ_port_pool_get(struct mlxsw_core_port *mlxsw_core_port,
+				  unsigned int sb_index, u16 pool_index,
+				  u32 *p_cur, u32 *p_max);
+int mlxsw_sp_sb_occ_tc_port_bind_get(struct mlxsw_core_port *mlxsw_core_port,
+				     unsigned int sb_index, u16 tc_index,
+				     enum devlink_sb_pool_type pool_type,
+				     u32 *p_cur, u32 *p_max);
 
 int mlxsw_sp_switchdev_init(struct mlxsw_sp *mlxsw_sp);
 void mlxsw_sp_switchdev_fini(struct mlxsw_sp *mlxsw_sp);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index 639ba5a..f2e073a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -36,6 +36,7 @@
 #include <linux/types.h>
 #include <linux/dcbnl.h>
 #include <linux/if_ether.h>
+#include <linux/list.h>
 
 #include "spectrum.h"
 #include "core.h"
@@ -125,6 +126,41 @@ static int mlxsw_sp_sb_pm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 	return 0;
 }
 
+static int mlxsw_sp_sb_pm_occ_clear(struct mlxsw_sp *mlxsw_sp, u8 local_port,
+				    u8 pool, enum mlxsw_reg_sbxx_dir dir,
+				    struct list_head *bulk_list)
+{
+	char sbpm_pl[MLXSW_REG_SBPM_LEN];
+
+	mlxsw_reg_sbpm_pack(sbpm_pl, local_port, pool, dir, true, 0, 0);
+	return mlxsw_reg_trans_query(mlxsw_sp->core, MLXSW_REG(sbpm), sbpm_pl,
+				     bulk_list, NULL, 0);
+}
+
+static void mlxsw_sp_sb_pm_occ_query_cb(struct mlxsw_core *mlxsw_core,
+					char *sbpm_pl, size_t sbpm_pl_len,
+					unsigned long cb_priv)
+{
+	struct mlxsw_sp_sb_pm *pm = (struct mlxsw_sp_sb_pm *) cb_priv;
+
+	mlxsw_reg_sbpm_unpack(sbpm_pl, &pm->occ.cur, &pm->occ.max);
+}
+
+static int mlxsw_sp_sb_pm_occ_query(struct mlxsw_sp *mlxsw_sp, u8 local_port,
+				    u8 pool, enum mlxsw_reg_sbxx_dir dir,
+				    struct list_head *bulk_list)
+{
+	char sbpm_pl[MLXSW_REG_SBPM_LEN];
+	struct mlxsw_sp_sb_pm *pm;
+
+	pm = mlxsw_sp_sb_pm_get(mlxsw_sp, local_port, pool, dir);
+	mlxsw_reg_sbpm_pack(sbpm_pl, local_port, pool, dir, false, 0, 0);
+	return mlxsw_reg_trans_query(mlxsw_sp->core, MLXSW_REG(sbpm), sbpm_pl,
+				     bulk_list,
+				     mlxsw_sp_sb_pm_occ_query_cb,
+				     (unsigned long) pm);
+}
+
 static const u16 mlxsw_sp_pbs[] = {
 	2 * MLXSW_SP_BYTES_TO_CELLS(ETH_FRAME_LEN),
 	0,
@@ -707,3 +743,222 @@ int mlxsw_sp_sb_tc_pool_bind_set(struct mlxsw_core_port *mlxsw_core_port,
 	return mlxsw_sp_sb_cm_write(mlxsw_sp, local_port, pg_buff, dir,
 				    0, max_buff, pool);
 }
+
+#define MASKED_COUNT_MAX \
+	(MLXSW_REG_SBSR_REC_MAX_COUNT / (MLXSW_SP_SB_TC_COUNT * 2))
+
+struct mlxsw_sp_sb_sr_occ_query_cb_ctx {
+	u8 masked_count;
+	u8 local_port_1;
+};
+
+static void mlxsw_sp_sb_sr_occ_query_cb(struct mlxsw_core *mlxsw_core,
+					char *sbsr_pl, size_t sbsr_pl_len,
+					unsigned long cb_priv)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
+	struct mlxsw_sp_sb_sr_occ_query_cb_ctx cb_ctx;
+	u8 masked_count;
+	u8 local_port;
+	int rec_index = 0;
+	struct mlxsw_sp_sb_cm *cm;
+	int i;
+
+	memcpy(&cb_ctx, &cb_priv, sizeof(cb_ctx));
+
+	masked_count = 0;
+	for (local_port = cb_ctx.local_port_1;
+	     local_port < MLXSW_PORT_MAX_PORTS; local_port++) {
+		if (!mlxsw_sp->ports[local_port])
+			continue;
+		for (i = 0; i < MLXSW_SP_SB_TC_COUNT; i++) {
+			cm = mlxsw_sp_sb_cm_get(mlxsw_sp, local_port, i,
+						MLXSW_REG_SBXX_DIR_INGRESS);
+			mlxsw_reg_sbsr_rec_unpack(sbsr_pl, rec_index++,
+						  &cm->occ.cur, &cm->occ.max);
+		}
+		if (++masked_count == cb_ctx.masked_count)
+			break;
+	}
+	masked_count = 0;
+	for (local_port = cb_ctx.local_port_1;
+	     local_port < MLXSW_PORT_MAX_PORTS; local_port++) {
+		if (!mlxsw_sp->ports[local_port])
+			continue;
+		for (i = 0; i < MLXSW_SP_SB_TC_COUNT; i++) {
+			cm = mlxsw_sp_sb_cm_get(mlxsw_sp, local_port, i,
+						MLXSW_REG_SBXX_DIR_EGRESS);
+			mlxsw_reg_sbsr_rec_unpack(sbsr_pl, rec_index++,
+						  &cm->occ.cur, &cm->occ.max);
+		}
+		if (++masked_count == cb_ctx.masked_count)
+			break;
+	}
+}
+
+int mlxsw_sp_sb_occ_snapshot(struct mlxsw_core *mlxsw_core,
+			     unsigned int sb_index)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
+	struct mlxsw_sp_sb_sr_occ_query_cb_ctx cb_ctx;
+	unsigned long cb_priv;
+	LIST_HEAD(bulk_list);
+	char *sbsr_pl;
+	u8 masked_count;
+	u8 local_port_1;
+	u8 local_port = 0;
+	int i;
+	int err;
+	int err2;
+
+	sbsr_pl = kmalloc(MLXSW_REG_SBSR_LEN, GFP_KERNEL);
+	if (!sbsr_pl)
+		return -ENOMEM;
+
+next_batch:
+	local_port++;
+	local_port_1 = local_port;
+	masked_count = 0;
+	mlxsw_reg_sbsr_pack(sbsr_pl, false);
+	for (i = 0; i < MLXSW_SP_SB_TC_COUNT; i++) {
+		mlxsw_reg_sbsr_pg_buff_mask_set(sbsr_pl, i, 1);
+		mlxsw_reg_sbsr_tclass_mask_set(sbsr_pl, i, 1);
+	}
+	for (; local_port < MLXSW_PORT_MAX_PORTS; local_port++) {
+		if (!mlxsw_sp->ports[local_port])
+			continue;
+		mlxsw_reg_sbsr_ingress_port_mask_set(sbsr_pl, local_port, 1);
+		mlxsw_reg_sbsr_egress_port_mask_set(sbsr_pl, local_port, 1);
+		for (i = 0; i < MLXSW_SP_SB_POOL_COUNT; i++) {
+			err = mlxsw_sp_sb_pm_occ_query(mlxsw_sp, local_port, i,
+						       MLXSW_REG_SBXX_DIR_INGRESS,
+						       &bulk_list);
+			if (err)
+				goto out;
+			err = mlxsw_sp_sb_pm_occ_query(mlxsw_sp, local_port, i,
+						       MLXSW_REG_SBXX_DIR_EGRESS,
+						       &bulk_list);
+			if (err)
+				goto out;
+		}
+		if (++masked_count == MASKED_COUNT_MAX)
+			goto do_query;
+	}
+
+do_query:
+	cb_ctx.masked_count = masked_count;
+	cb_ctx.local_port_1 = local_port_1;
+	memcpy(&cb_priv, &cb_ctx, sizeof(cb_ctx));
+	err = mlxsw_reg_trans_query(mlxsw_core, MLXSW_REG(sbsr), sbsr_pl,
+				    &bulk_list, mlxsw_sp_sb_sr_occ_query_cb,
+				    cb_priv);
+	if (err)
+		goto out;
+	if (local_port < MLXSW_PORT_MAX_PORTS)
+		goto next_batch;
+
+out:
+	err2 = mlxsw_reg_trans_bulk_wait(&bulk_list);
+	if (!err)
+		err = err2;
+	kfree(sbsr_pl);
+	return err;
+}
+
+int mlxsw_sp_sb_occ_max_clear(struct mlxsw_core *mlxsw_core,
+			      unsigned int sb_index)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
+	LIST_HEAD(bulk_list);
+	char *sbsr_pl;
+	unsigned int masked_count;
+	u8 local_port = 0;
+	int i;
+	int err;
+	int err2;
+
+	sbsr_pl = kmalloc(MLXSW_REG_SBSR_LEN, GFP_KERNEL);
+	if (!sbsr_pl)
+		return -ENOMEM;
+
+next_batch:
+	local_port++;
+	masked_count = 0;
+	mlxsw_reg_sbsr_pack(sbsr_pl, true);
+	for (i = 0; i < MLXSW_SP_SB_TC_COUNT; i++) {
+		mlxsw_reg_sbsr_pg_buff_mask_set(sbsr_pl, i, 1);
+		mlxsw_reg_sbsr_tclass_mask_set(sbsr_pl, i, 1);
+	}
+	for (; local_port < MLXSW_PORT_MAX_PORTS; local_port++) {
+		if (!mlxsw_sp->ports[local_port])
+			continue;
+		mlxsw_reg_sbsr_ingress_port_mask_set(sbsr_pl, local_port, 1);
+		mlxsw_reg_sbsr_egress_port_mask_set(sbsr_pl, local_port, 1);
+		for (i = 0; i < MLXSW_SP_SB_POOL_COUNT; i++) {
+			err = mlxsw_sp_sb_pm_occ_clear(mlxsw_sp, local_port, i,
+						       MLXSW_REG_SBXX_DIR_INGRESS,
+						       &bulk_list);
+			if (err)
+				goto out;
+			err = mlxsw_sp_sb_pm_occ_clear(mlxsw_sp, local_port, i,
+						       MLXSW_REG_SBXX_DIR_EGRESS,
+						       &bulk_list);
+			if (err)
+				goto out;
+		}
+		if (++masked_count == MASKED_COUNT_MAX)
+			goto do_query;
+	}
+
+do_query:
+	err = mlxsw_reg_trans_query(mlxsw_core, MLXSW_REG(sbsr), sbsr_pl,
+				    &bulk_list, NULL, 0);
+	if (err)
+		goto out;
+	if (local_port < MLXSW_PORT_MAX_PORTS)
+		goto next_batch;
+
+out:
+	err2 = mlxsw_reg_trans_bulk_wait(&bulk_list);
+	if (!err)
+		err = err2;
+	kfree(sbsr_pl);
+	return err;
+}
+
+int mlxsw_sp_sb_occ_port_pool_get(struct mlxsw_core_port *mlxsw_core_port,
+				  unsigned int sb_index, u16 pool_index,
+				  u32 *p_cur, u32 *p_max)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port =
+			mlxsw_core_port_driver_priv(mlxsw_core_port);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	u8 local_port = mlxsw_sp_port->local_port;
+	u8 pool = pool_get(pool_index);
+	enum mlxsw_reg_sbxx_dir dir = dir_get(pool_index);
+	struct mlxsw_sp_sb_pm *pm = mlxsw_sp_sb_pm_get(mlxsw_sp, local_port,
+						       pool, dir);
+
+	*p_cur = MLXSW_SP_CELLS_TO_BYTES(pm->occ.cur);
+	*p_max = MLXSW_SP_CELLS_TO_BYTES(pm->occ.max);
+	return 0;
+}
+
+int mlxsw_sp_sb_occ_tc_port_bind_get(struct mlxsw_core_port *mlxsw_core_port,
+				     unsigned int sb_index, u16 tc_index,
+				     enum devlink_sb_pool_type pool_type,
+				     u32 *p_cur, u32 *p_max)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port =
+			mlxsw_core_port_driver_priv(mlxsw_core_port);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	u8 local_port = mlxsw_sp_port->local_port;
+	u8 pg_buff = tc_index;
+	enum mlxsw_reg_sbxx_dir dir = pool_type;
+	struct mlxsw_sp_sb_cm *cm = mlxsw_sp_sb_cm_get(mlxsw_sp, local_port,
+						       pg_buff, dir);
+
+	*p_cur = MLXSW_SP_CELLS_TO_BYTES(cm->occ.cur);
+	*p_max = MLXSW_SP_CELLS_TO_BYTES(cm->occ.max);
+	return 0;
+}
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 17/18] mlxsw: core: Introduce support for asynchronous EMAD register access
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

So far it was possible to have one EMAD register access at a time,
locked by mutex. This patch extends this interface to allow multiple
EMAD register accesses to be in fly at once. That allows faster
processing on firmware side avoiding unused time in between EMADs.
Measured speedup is ~30% for shared occupancy snapshot operation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c | 494 +++++++++++++++++++----------
 drivers/net/ethernet/mellanox/mlxsw/core.h |  13 +
 2 files changed, 334 insertions(+), 173 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index a14e422..b0a0b01 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -44,7 +44,7 @@
 #include <linux/seq_file.h>
 #include <linux/u64_stats_sync.h>
 #include <linux/netdevice.h>
-#include <linux/wait.h>
+#include <linux/completion.h>
 #include <linux/skbuff.h>
 #include <linux/etherdevice.h>
 #include <linux/types.h>
@@ -96,11 +96,9 @@ struct mlxsw_core {
 	struct list_head rx_listener_list;
 	struct list_head event_listener_list;
 	struct {
-		struct sk_buff *resp_skb;
-		u64 tid;
-		wait_queue_head_t wait;
-		bool trans_active;
-		struct mutex lock; /* One EMAD transaction at a time. */
+		atomic64_t tid;
+		struct list_head trans_list;
+		spinlock_t trans_list_lock; /* protects trans_list writes */
 		bool use_emad;
 	} emad;
 	struct mlxsw_core_pcpu_stats __percpu *pcpu_stats;
@@ -293,7 +291,7 @@ static void mlxsw_emad_pack_reg_tlv(char *reg_tlv,
 static void mlxsw_emad_pack_op_tlv(char *op_tlv,
 				   const struct mlxsw_reg_info *reg,
 				   enum mlxsw_core_reg_access_type type,
-				   struct mlxsw_core *mlxsw_core)
+				   u64 tid)
 {
 	mlxsw_emad_op_tlv_type_set(op_tlv, MLXSW_EMAD_TLV_TYPE_OP);
 	mlxsw_emad_op_tlv_len_set(op_tlv, MLXSW_EMAD_OP_TLV_LEN);
@@ -309,7 +307,7 @@ static void mlxsw_emad_pack_op_tlv(char *op_tlv,
 					     MLXSW_EMAD_OP_TLV_METHOD_WRITE);
 	mlxsw_emad_op_tlv_class_set(op_tlv,
 				    MLXSW_EMAD_OP_TLV_CLASS_REG_ACCESS);
-	mlxsw_emad_op_tlv_tid_set(op_tlv, mlxsw_core->emad.tid);
+	mlxsw_emad_op_tlv_tid_set(op_tlv, tid);
 }
 
 static int mlxsw_emad_construct_eth_hdr(struct sk_buff *skb)
@@ -331,7 +329,7 @@ static void mlxsw_emad_construct(struct sk_buff *skb,
 				 const struct mlxsw_reg_info *reg,
 				 char *payload,
 				 enum mlxsw_core_reg_access_type type,
-				 struct mlxsw_core *mlxsw_core)
+				 u64 tid)
 {
 	char *buf;
 
@@ -342,7 +340,7 @@ static void mlxsw_emad_construct(struct sk_buff *skb,
 	mlxsw_emad_pack_reg_tlv(buf, reg, payload);
 
 	buf = skb_push(skb, MLXSW_EMAD_OP_TLV_LEN * sizeof(u32));
-	mlxsw_emad_pack_op_tlv(buf, reg, type, mlxsw_core);
+	mlxsw_emad_pack_op_tlv(buf, reg, type, tid);
 
 	mlxsw_emad_construct_eth_hdr(skb);
 }
@@ -379,58 +377,16 @@ static bool mlxsw_emad_is_resp(const struct sk_buff *skb)
 	return (mlxsw_emad_op_tlv_r_get(op_tlv) == MLXSW_EMAD_OP_TLV_RESPONSE);
 }
 
-#define MLXSW_EMAD_TIMEOUT_MS 200
-
-static int __mlxsw_emad_transmit(struct mlxsw_core *mlxsw_core,
-				 struct sk_buff *skb,
-				 const struct mlxsw_tx_info *tx_info)
-{
-	int err;
-	int ret;
-
-	mlxsw_core->emad.trans_active = true;
-
-	err = mlxsw_core_skb_transmit(mlxsw_core, skb, tx_info);
-	if (err) {
-		dev_err(mlxsw_core->bus_info->dev, "Failed to transmit EMAD (tid=%llx)\n",
-			mlxsw_core->emad.tid);
-		dev_kfree_skb(skb);
-		goto trans_inactive_out;
-	}
-
-	ret = wait_event_timeout(mlxsw_core->emad.wait,
-				 !(mlxsw_core->emad.trans_active),
-				 msecs_to_jiffies(MLXSW_EMAD_TIMEOUT_MS));
-	if (!ret) {
-		dev_warn(mlxsw_core->bus_info->dev, "EMAD timed-out (tid=%llx)\n",
-			 mlxsw_core->emad.tid);
-		err = -EIO;
-		goto trans_inactive_out;
-	}
-
-	return 0;
-
-trans_inactive_out:
-	mlxsw_core->emad.trans_active = false;
-	return err;
-}
-
-static int mlxsw_emad_process_status(struct mlxsw_core *mlxsw_core,
-				     char *op_tlv)
+static int mlxsw_emad_process_status(char *op_tlv,
+				     enum mlxsw_emad_op_tlv_status *p_status)
 {
-	enum mlxsw_emad_op_tlv_status status;
-	u64 tid;
-
-	status = mlxsw_emad_op_tlv_status_get(op_tlv);
-	tid = mlxsw_emad_op_tlv_tid_get(op_tlv);
+	*p_status = mlxsw_emad_op_tlv_status_get(op_tlv);
 
-	switch (status) {
+	switch (*p_status) {
 	case MLXSW_EMAD_OP_TLV_STATUS_SUCCESS:
 		return 0;
 	case MLXSW_EMAD_OP_TLV_STATUS_BUSY:
 	case MLXSW_EMAD_OP_TLV_STATUS_MESSAGE_RECEIPT_ACK:
-		dev_warn(mlxsw_core->bus_info->dev, "Reg access status again (tid=%llx,status=%x(%s))\n",
-			 tid, status, mlxsw_emad_op_tlv_status_str(status));
 		return -EAGAIN;
 	case MLXSW_EMAD_OP_TLV_STATUS_VERSION_NOT_SUPPORTED:
 	case MLXSW_EMAD_OP_TLV_STATUS_UNKNOWN_TLV:
@@ -441,70 +397,150 @@ static int mlxsw_emad_process_status(struct mlxsw_core *mlxsw_core,
 	case MLXSW_EMAD_OP_TLV_STATUS_RESOURCE_NOT_AVAILABLE:
 	case MLXSW_EMAD_OP_TLV_STATUS_INTERNAL_ERROR:
 	default:
-		dev_err(mlxsw_core->bus_info->dev, "Reg access status failed (tid=%llx,status=%x(%s))\n",
-			tid, status, mlxsw_emad_op_tlv_status_str(status));
 		return -EIO;
 	}
 }
 
-static int mlxsw_emad_process_status_skb(struct mlxsw_core *mlxsw_core,
-					 struct sk_buff *skb)
+static int
+mlxsw_emad_process_status_skb(struct sk_buff *skb,
+			      enum mlxsw_emad_op_tlv_status *p_status)
 {
-	return mlxsw_emad_process_status(mlxsw_core, mlxsw_emad_op_tlv(skb));
+	return mlxsw_emad_process_status(mlxsw_emad_op_tlv(skb), p_status);
+}
+
+struct mlxsw_reg_trans {
+	struct list_head list;
+	struct list_head bulk_list;
+	struct mlxsw_core *core;
+	struct sk_buff *tx_skb;
+	struct mlxsw_tx_info tx_info;
+	struct delayed_work timeout_dw;
+	unsigned int retries;
+	u64 tid;
+	struct completion completion;
+	atomic_t active;
+	mlxsw_reg_trans_cb_t *cb;
+	unsigned long cb_priv;
+	const struct mlxsw_reg_info *reg;
+	enum mlxsw_core_reg_access_type type;
+	int err;
+	enum mlxsw_emad_op_tlv_status emad_status;
+	struct rcu_head rcu;
+};
+
+#define MLXSW_EMAD_TIMEOUT_MS 200
+
+static void mlxsw_emad_trans_timeout_schedule(struct mlxsw_reg_trans *trans)
+{
+	unsigned long timeout = msecs_to_jiffies(MLXSW_EMAD_TIMEOUT_MS);
+
+	mlxsw_core_schedule_dw(&trans->timeout_dw, timeout);
 }
 
 static int mlxsw_emad_transmit(struct mlxsw_core *mlxsw_core,
-			       struct sk_buff *skb,
-			       const struct mlxsw_tx_info *tx_info)
+			       struct mlxsw_reg_trans *trans)
 {
-	struct sk_buff *trans_skb;
-	int n_retry;
+	struct sk_buff *skb;
 	int err;
 
-	n_retry = 0;
-retry:
-	/* We copy the EMAD to a new skb, since we might need
-	 * to retransmit it in case of failure.
-	 */
-	trans_skb = skb_copy(skb, GFP_KERNEL);
-	if (!trans_skb) {
-		err = -ENOMEM;
-		goto out;
+	skb = skb_copy(trans->tx_skb, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	atomic_set(&trans->active, 1);
+	err = mlxsw_core_skb_transmit(mlxsw_core, skb, &trans->tx_info);
+	if (err) {
+		dev_kfree_skb(skb);
+		return err;
 	}
+	mlxsw_emad_trans_timeout_schedule(trans);
+	return 0;
+}
 
-	err = __mlxsw_emad_transmit(mlxsw_core, trans_skb, tx_info);
-	if (!err) {
-		struct sk_buff *resp_skb = mlxsw_core->emad.resp_skb;
+static void mlxsw_emad_trans_finish(struct mlxsw_reg_trans *trans, int err)
+{
+	struct mlxsw_core *mlxsw_core = trans->core;
+
+	dev_kfree_skb(trans->tx_skb);
+	spin_lock_bh(&mlxsw_core->emad.trans_list_lock);
+	list_del_rcu(&trans->list);
+	spin_unlock_bh(&mlxsw_core->emad.trans_list_lock);
+	trans->err = err;
+	complete(&trans->completion);
+}
+
+static void mlxsw_emad_transmit_retry(struct mlxsw_core *mlxsw_core,
+				      struct mlxsw_reg_trans *trans)
+{
+	int err;
 
-		err = mlxsw_emad_process_status_skb(mlxsw_core, resp_skb);
-		if (err)
-			dev_kfree_skb(resp_skb);
-		if (!err || err != -EAGAIN)
-			goto out;
+	if (trans->retries < MLXSW_EMAD_MAX_RETRY) {
+		trans->retries++;
+		err = mlxsw_emad_transmit(trans->core, trans);
+		if (err == 0)
+			return;
+	} else {
+		err = -EIO;
 	}
-	if (n_retry++ < MLXSW_EMAD_MAX_RETRY)
-		goto retry;
+	mlxsw_emad_trans_finish(trans, err);
+}
 
-out:
-	dev_kfree_skb(skb);
-	mlxsw_core->emad.tid++;
-	return err;
+static void mlxsw_emad_trans_timeout_work(struct work_struct *work)
+{
+	struct mlxsw_reg_trans *trans = container_of(work,
+						     struct mlxsw_reg_trans,
+						     timeout_dw.work);
+
+	if (!atomic_dec_and_test(&trans->active))
+		return;
+
+	mlxsw_emad_transmit_retry(trans->core, trans);
 }
 
+static void mlxsw_emad_process_response(struct mlxsw_core *mlxsw_core,
+					struct mlxsw_reg_trans *trans,
+					struct sk_buff *skb)
+{
+	int err;
+
+	if (!atomic_dec_and_test(&trans->active))
+		return;
+
+	err = mlxsw_emad_process_status_skb(skb, &trans->emad_status);
+	if (err == -EAGAIN) {
+		mlxsw_emad_transmit_retry(mlxsw_core, trans);
+	} else {
+		if (err == 0) {
+			char *op_tlv = mlxsw_emad_op_tlv(skb);
+
+			if (trans->cb)
+				trans->cb(mlxsw_core,
+					  mlxsw_emad_reg_payload(op_tlv),
+					  trans->reg->len, trans->cb_priv);
+		}
+		mlxsw_emad_trans_finish(trans, err);
+	}
+}
+
+/* called with rcu read lock held */
 static void mlxsw_emad_rx_listener_func(struct sk_buff *skb, u8 local_port,
 					void *priv)
 {
 	struct mlxsw_core *mlxsw_core = priv;
+	struct mlxsw_reg_trans *trans;
 
-	if (mlxsw_emad_is_resp(skb) &&
-	    mlxsw_core->emad.trans_active &&
-	    mlxsw_emad_get_tid(skb) == mlxsw_core->emad.tid) {
-		mlxsw_core->emad.resp_skb = skb;
-		mlxsw_core->emad.trans_active = false;
-		wake_up(&mlxsw_core->emad.wait);
-	} else {
-		dev_kfree_skb(skb);
+	if (!mlxsw_emad_is_resp(skb))
+		goto free_skb;
+
+	list_for_each_entry_rcu(trans, &mlxsw_core->emad.trans_list, list) {
+		if (mlxsw_emad_get_tid(skb) == trans->tid) {
+			mlxsw_emad_process_response(mlxsw_core, trans, skb);
+			break;
+		}
 	}
+
+free_skb:
+	dev_kfree_skb(skb);
 }
 
 static const struct mlxsw_rx_listener mlxsw_emad_rx_listener = {
@@ -531,18 +567,19 @@ static int mlxsw_emad_traps_set(struct mlxsw_core *mlxsw_core)
 
 static int mlxsw_emad_init(struct mlxsw_core *mlxsw_core)
 {
+	u64 tid;
 	int err;
 
 	/* Set the upper 32 bits of the transaction ID field to a random
 	 * number. This allows us to discard EMADs addressed to other
 	 * devices.
 	 */
-	get_random_bytes(&mlxsw_core->emad.tid, 4);
-	mlxsw_core->emad.tid = mlxsw_core->emad.tid << 32;
+	get_random_bytes(&tid, 4);
+	tid <<= 32;
+	atomic64_set(&mlxsw_core->emad.tid, tid);
 
-	init_waitqueue_head(&mlxsw_core->emad.wait);
-	mlxsw_core->emad.trans_active = false;
-	mutex_init(&mlxsw_core->emad.lock);
+	INIT_LIST_HEAD(&mlxsw_core->emad.trans_list);
+	spin_lock_init(&mlxsw_core->emad.trans_list_lock);
 
 	err = mlxsw_core_rx_listener_register(mlxsw_core,
 					      &mlxsw_emad_rx_listener,
@@ -600,6 +637,59 @@ static struct sk_buff *mlxsw_emad_alloc(const struct mlxsw_core *mlxsw_core,
 	return skb;
 }
 
+static int mlxsw_emad_reg_access(struct mlxsw_core *mlxsw_core,
+				 const struct mlxsw_reg_info *reg,
+				 char *payload,
+				 enum mlxsw_core_reg_access_type type,
+				 struct mlxsw_reg_trans *trans,
+				 struct list_head *bulk_list,
+				 mlxsw_reg_trans_cb_t *cb,
+				 unsigned long cb_priv, u64 tid)
+{
+	struct sk_buff *skb;
+	int err;
+
+	dev_dbg(mlxsw_core->bus_info->dev, "EMAD reg access (tid=%llx,reg_id=%x(%s),type=%s)\n",
+		trans->tid, reg->id, mlxsw_reg_id_str(reg->id),
+		mlxsw_core_reg_access_type_str(type));
+
+	skb = mlxsw_emad_alloc(mlxsw_core, reg->len);
+	if (!skb)
+		return -ENOMEM;
+
+	list_add_tail(&trans->bulk_list, bulk_list);
+	trans->core = mlxsw_core;
+	trans->tx_skb = skb;
+	trans->tx_info.local_port = MLXSW_PORT_CPU_PORT;
+	trans->tx_info.is_emad = true;
+	INIT_DELAYED_WORK(&trans->timeout_dw, mlxsw_emad_trans_timeout_work);
+	trans->tid = tid;
+	init_completion(&trans->completion);
+	trans->cb = cb;
+	trans->cb_priv = cb_priv;
+	trans->reg = reg;
+	trans->type = type;
+
+	mlxsw_emad_construct(skb, reg, payload, type, trans->tid);
+	mlxsw_core->driver->txhdr_construct(skb, &trans->tx_info);
+
+	spin_lock_bh(&mlxsw_core->emad.trans_list_lock);
+	list_add_tail_rcu(&trans->list, &mlxsw_core->emad.trans_list);
+	spin_unlock_bh(&mlxsw_core->emad.trans_list_lock);
+	err = mlxsw_emad_transmit(mlxsw_core, trans);
+	if (err)
+		goto err_out;
+	return 0;
+
+err_out:
+	spin_lock_bh(&mlxsw_core->emad.trans_list_lock);
+	list_del_rcu(&trans->list);
+	spin_unlock_bh(&mlxsw_core->emad.trans_list_lock);
+	list_del(&trans->bulk_list);
+	dev_kfree_skb(trans->tx_skb);
+	return err;
+}
+
 /*****************
  * Core functions
  *****************/
@@ -689,24 +779,6 @@ static const struct file_operations mlxsw_core_rx_stats_dbg_ops = {
 	.llseek = seq_lseek
 };
 
-static void mlxsw_core_buf_dump_dbg(struct mlxsw_core *mlxsw_core,
-				    const char *buf, size_t size)
-{
-	__be32 *m = (__be32 *) buf;
-	int i;
-	int count = size / sizeof(__be32);
-
-	for (i = count - 1; i >= 0; i--)
-		if (m[i])
-			break;
-	i++;
-	count = i ? i : 1;
-	for (i = 0; i < count; i += 4)
-		dev_dbg(mlxsw_core->bus_info->dev, "%04x - %08x %08x %08x %08x\n",
-			i * 4, be32_to_cpu(m[i]), be32_to_cpu(m[i + 1]),
-			be32_to_cpu(m[i + 2]), be32_to_cpu(m[i + 3]));
-}
-
 int mlxsw_core_driver_register(struct mlxsw_driver *mlxsw_driver)
 {
 	spin_lock(&mlxsw_core_driver_list_lock);
@@ -1264,56 +1336,112 @@ void mlxsw_core_event_listener_unregister(struct mlxsw_core *mlxsw_core,
 }
 EXPORT_SYMBOL(mlxsw_core_event_listener_unregister);
 
+static u64 mlxsw_core_tid_get(struct mlxsw_core *mlxsw_core)
+{
+	return atomic64_inc_return(&mlxsw_core->emad.tid);
+}
+
 static int mlxsw_core_reg_access_emad(struct mlxsw_core *mlxsw_core,
 				      const struct mlxsw_reg_info *reg,
 				      char *payload,
-				      enum mlxsw_core_reg_access_type type)
+				      enum mlxsw_core_reg_access_type type,
+				      struct list_head *bulk_list,
+				      mlxsw_reg_trans_cb_t *cb,
+				      unsigned long cb_priv)
 {
+	u64 tid = mlxsw_core_tid_get(mlxsw_core);
+	struct mlxsw_reg_trans *trans;
 	int err;
-	char *op_tlv;
-	struct sk_buff *skb;
-	struct mlxsw_tx_info tx_info = {
-		.local_port = MLXSW_PORT_CPU_PORT,
-		.is_emad = true,
-	};
 
-	skb = mlxsw_emad_alloc(mlxsw_core, reg->len);
-	if (!skb)
+	trans = kzalloc(sizeof(*trans), GFP_KERNEL);
+	if (!trans)
 		return -ENOMEM;
 
-	mlxsw_emad_construct(skb, reg, payload, type, mlxsw_core);
-	mlxsw_core->driver->txhdr_construct(skb, &tx_info);
+	err = mlxsw_emad_reg_access(mlxsw_core, reg, payload, type, trans,
+				    bulk_list, cb, cb_priv, tid);
+	if (err) {
+		kfree(trans);
+		return err;
+	}
+	return 0;
+}
 
-	dev_dbg(mlxsw_core->bus_info->dev, "EMAD send (tid=%llx)\n",
-		mlxsw_core->emad.tid);
-	mlxsw_core_buf_dump_dbg(mlxsw_core, skb->data, skb->len);
+int mlxsw_reg_trans_query(struct mlxsw_core *mlxsw_core,
+			  const struct mlxsw_reg_info *reg, char *payload,
+			  struct list_head *bulk_list,
+			  mlxsw_reg_trans_cb_t *cb, unsigned long cb_priv)
+{
+	return mlxsw_core_reg_access_emad(mlxsw_core, reg, payload,
+					  MLXSW_CORE_REG_ACCESS_TYPE_QUERY,
+					  bulk_list, cb, cb_priv);
+}
+EXPORT_SYMBOL(mlxsw_reg_trans_query);
 
-	err = mlxsw_emad_transmit(mlxsw_core, skb, &tx_info);
-	if (!err) {
-		op_tlv = mlxsw_emad_op_tlv(mlxsw_core->emad.resp_skb);
-		memcpy(payload, mlxsw_emad_reg_payload(op_tlv),
-		       reg->len);
+int mlxsw_reg_trans_write(struct mlxsw_core *mlxsw_core,
+			  const struct mlxsw_reg_info *reg, char *payload,
+			  struct list_head *bulk_list,
+			  mlxsw_reg_trans_cb_t *cb, unsigned long cb_priv)
+{
+	return mlxsw_core_reg_access_emad(mlxsw_core, reg, payload,
+					  MLXSW_CORE_REG_ACCESS_TYPE_WRITE,
+					  bulk_list, cb, cb_priv);
+}
+EXPORT_SYMBOL(mlxsw_reg_trans_write);
 
-		dev_dbg(mlxsw_core->bus_info->dev, "EMAD recv (tid=%llx)\n",
-			mlxsw_core->emad.tid - 1);
-		mlxsw_core_buf_dump_dbg(mlxsw_core,
-					mlxsw_core->emad.resp_skb->data,
-					mlxsw_core->emad.resp_skb->len);
+static int mlxsw_reg_trans_wait(struct mlxsw_reg_trans *trans)
+{
+	struct mlxsw_core *mlxsw_core = trans->core;
+	int err;
 
-		dev_kfree_skb(mlxsw_core->emad.resp_skb);
-	}
+	wait_for_completion(&trans->completion);
+	cancel_delayed_work_sync(&trans->timeout_dw);
+	err = trans->err;
 
+	if (trans->retries)
+		dev_warn(mlxsw_core->bus_info->dev, "EMAD retries (%d/%d) (tid=%llx)\n",
+			 trans->retries, MLXSW_EMAD_MAX_RETRY, trans->tid);
+	if (err)
+		dev_err(mlxsw_core->bus_info->dev, "EMAD reg access failed (tid=%llx,reg_id=%x(%s),type=%s,status=%x(%s))\n",
+			trans->tid, trans->reg->id,
+			mlxsw_reg_id_str(trans->reg->id),
+			mlxsw_core_reg_access_type_str(trans->type),
+			trans->emad_status,
+			mlxsw_emad_op_tlv_status_str(trans->emad_status));
+
+	list_del(&trans->bulk_list);
+	kfree_rcu(trans, rcu);
 	return err;
 }
 
+int mlxsw_reg_trans_bulk_wait(struct list_head *bulk_list)
+{
+	struct mlxsw_reg_trans *trans;
+	struct mlxsw_reg_trans *tmp;
+	int sum_err = 0;
+	int err;
+
+	list_for_each_entry_safe(trans, tmp, bulk_list, bulk_list) {
+		err = mlxsw_reg_trans_wait(trans);
+		if (err && sum_err == 0)
+			sum_err = err; /* first error to be returned */
+	}
+	return sum_err;
+}
+EXPORT_SYMBOL(mlxsw_reg_trans_bulk_wait);
+
 static int mlxsw_core_reg_access_cmd(struct mlxsw_core *mlxsw_core,
 				     const struct mlxsw_reg_info *reg,
 				     char *payload,
 				     enum mlxsw_core_reg_access_type type)
 {
+	enum mlxsw_emad_op_tlv_status status;
 	int err, n_retry;
 	char *in_mbox, *out_mbox, *tmp;
 
+	dev_dbg(mlxsw_core->bus_info->dev, "Reg cmd access (reg_id=%x(%s),type=%s)\n",
+		reg->id, mlxsw_reg_id_str(reg->id),
+		mlxsw_core_reg_access_type_str(type));
+
 	in_mbox = mlxsw_cmd_mbox_alloc();
 	if (!in_mbox)
 		return -ENOMEM;
@@ -1324,7 +1452,8 @@ static int mlxsw_core_reg_access_cmd(struct mlxsw_core *mlxsw_core,
 		goto free_in_mbox;
 	}
 
-	mlxsw_emad_pack_op_tlv(in_mbox, reg, type, mlxsw_core);
+	mlxsw_emad_pack_op_tlv(in_mbox, reg, type,
+			       mlxsw_core_tid_get(mlxsw_core));
 	tmp = in_mbox + MLXSW_EMAD_OP_TLV_LEN * sizeof(u32);
 	mlxsw_emad_pack_reg_tlv(tmp, reg, payload);
 
@@ -1332,60 +1461,61 @@ static int mlxsw_core_reg_access_cmd(struct mlxsw_core *mlxsw_core,
 retry:
 	err = mlxsw_cmd_access_reg(mlxsw_core, in_mbox, out_mbox);
 	if (!err) {
-		err = mlxsw_emad_process_status(mlxsw_core, out_mbox);
-		if (err == -EAGAIN && n_retry++ < MLXSW_EMAD_MAX_RETRY)
-			goto retry;
+		err = mlxsw_emad_process_status(out_mbox, &status);
+		if (err) {
+			if (err == -EAGAIN && n_retry++ < MLXSW_EMAD_MAX_RETRY)
+				goto retry;
+			dev_err(mlxsw_core->bus_info->dev, "Reg cmd access status failed (status=%x(%s))\n",
+				status, mlxsw_emad_op_tlv_status_str(status));
+		}
 	}
 
 	if (!err)
 		memcpy(payload, mlxsw_emad_reg_payload(out_mbox),
 		       reg->len);
 
-	mlxsw_core->emad.tid++;
 	mlxsw_cmd_mbox_free(out_mbox);
 free_in_mbox:
 	mlxsw_cmd_mbox_free(in_mbox);
+	if (err)
+		dev_err(mlxsw_core->bus_info->dev, "Reg cmd access failed (reg_id=%x(%s),type=%s)\n",
+			reg->id, mlxsw_reg_id_str(reg->id),
+			mlxsw_core_reg_access_type_str(type));
 	return err;
 }
 
+static void mlxsw_core_reg_access_cb(struct mlxsw_core *mlxsw_core,
+				     char *payload, size_t payload_len,
+				     unsigned long cb_priv)
+{
+	char *orig_payload = (char *) cb_priv;
+
+	memcpy(orig_payload, payload, payload_len);
+}
+
 static int mlxsw_core_reg_access(struct mlxsw_core *mlxsw_core,
 				 const struct mlxsw_reg_info *reg,
 				 char *payload,
 				 enum mlxsw_core_reg_access_type type)
 {
-	u64 cur_tid;
+	LIST_HEAD(bulk_list);
 	int err;
 
-	if (mutex_lock_interruptible(&mlxsw_core->emad.lock)) {
-		dev_err(mlxsw_core->bus_info->dev, "Reg access interrupted (reg_id=%x(%s),type=%s)\n",
-			reg->id, mlxsw_reg_id_str(reg->id),
-			mlxsw_core_reg_access_type_str(type));
-		return -EINTR;
-	}
-
-	cur_tid = mlxsw_core->emad.tid;
-	dev_dbg(mlxsw_core->bus_info->dev, "Reg access (tid=%llx,reg_id=%x(%s),type=%s)\n",
-		cur_tid, reg->id, mlxsw_reg_id_str(reg->id),
-		mlxsw_core_reg_access_type_str(type));
-
 	/* During initialization EMAD interface is not available to us,
 	 * so we default to command interface. We switch to EMAD interface
 	 * after setting the appropriate traps.
 	 */
 	if (!mlxsw_core->emad.use_emad)
-		err = mlxsw_core_reg_access_cmd(mlxsw_core, reg,
-						payload, type);
-	else
-		err = mlxsw_core_reg_access_emad(mlxsw_core, reg,
+		return mlxsw_core_reg_access_cmd(mlxsw_core, reg,
 						 payload, type);
 
+	err = mlxsw_core_reg_access_emad(mlxsw_core, reg,
+					 payload, type, &bulk_list,
+					 mlxsw_core_reg_access_cb,
+					 (unsigned long) payload);
 	if (err)
-		dev_err(mlxsw_core->bus_info->dev, "Reg access failed (tid=%llx,reg_id=%x(%s),type=%s)\n",
-			cur_tid, reg->id, mlxsw_reg_id_str(reg->id),
-			mlxsw_core_reg_access_type_str(type));
-
-	mutex_unlock(&mlxsw_core->emad.lock);
-	return err;
+		return err;
+	return mlxsw_reg_trans_bulk_wait(&bulk_list);
 }
 
 int mlxsw_reg_query(struct mlxsw_core *mlxsw_core,
@@ -1536,6 +1666,24 @@ void mlxsw_core_port_fini(struct mlxsw_core_port *mlxsw_core_port)
 }
 EXPORT_SYMBOL(mlxsw_core_port_fini);
 
+static void mlxsw_core_buf_dump_dbg(struct mlxsw_core *mlxsw_core,
+				    const char *buf, size_t size)
+{
+	__be32 *m = (__be32 *) buf;
+	int i;
+	int count = size / sizeof(__be32);
+
+	for (i = count - 1; i >= 0; i--)
+		if (m[i])
+			break;
+	i++;
+	count = i ? i : 1;
+	for (i = 0; i < count; i += 4)
+		dev_dbg(mlxsw_core->bus_info->dev, "%04x - %08x %08x %08x %08x\n",
+			i * 4, be32_to_cpu(m[i]), be32_to_cpu(m[i + 1]),
+			be32_to_cpu(m[i + 2]), be32_to_cpu(m[i + 3]));
+}
+
 int mlxsw_cmd_exec(struct mlxsw_core *mlxsw_core, u16 opcode, u8 opcode_mod,
 		   u32 in_mod, bool out_mbox_direct,
 		   char *in_mbox, size_t in_mbox_size,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index b41ebf8..436bc49 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -109,6 +109,19 @@ void mlxsw_core_event_listener_unregister(struct mlxsw_core *mlxsw_core,
 					  const struct mlxsw_event_listener *el,
 					  void *priv);
 
+typedef void mlxsw_reg_trans_cb_t(struct mlxsw_core *mlxsw_core, char *payload,
+				  size_t payload_len, unsigned long cb_priv);
+
+int mlxsw_reg_trans_query(struct mlxsw_core *mlxsw_core,
+			  const struct mlxsw_reg_info *reg, char *payload,
+			  struct list_head *bulk_list,
+			  mlxsw_reg_trans_cb_t *cb, unsigned long cb_priv);
+int mlxsw_reg_trans_write(struct mlxsw_core *mlxsw_core,
+			  const struct mlxsw_reg_info *reg, char *payload,
+			  struct list_head *bulk_list,
+			  mlxsw_reg_trans_cb_t *cb, unsigned long cb_priv);
+int mlxsw_reg_trans_bulk_wait(struct list_head *bulk_list);
+
 int mlxsw_reg_query(struct mlxsw_core *mlxsw_core,
 		    const struct mlxsw_reg_info *reg, char *payload);
 int mlxsw_reg_write(struct mlxsw_core *mlxsw_core,
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 16/18] mlxsw: core: Add mlxsw specific workqueue and use it for FDB notif. processing
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Follow-up patch is going to need to use delayed work as well and
frequently. The FDB notification processing is already using that and
also quite frequently. It makes sense to create separate workqueue just
for mlxsw driver in this case and do not pollute system_wq.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c         | 25 ++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlxsw/core.h         |  3 +++
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   |  4 ++--
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 63a9777..a14e422 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -55,6 +55,7 @@
 #include <linux/mutex.h>
 #include <linux/rcupdate.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 #include <asm/byteorder.h>
 #include <net/devlink.h>
 
@@ -73,6 +74,8 @@ static const char mlxsw_core_driver_name[] = "mlxsw_core";
 
 static struct dentry *mlxsw_core_dbg_root;
 
+static struct workqueue_struct *mlxsw_wq;
+
 struct mlxsw_core_pcpu_stats {
 	u64			trap_rx_packets[MLXSW_TRAP_ID_MAX];
 	u64			trap_rx_bytes[MLXSW_TRAP_ID_MAX];
@@ -1575,17 +1578,35 @@ int mlxsw_cmd_exec(struct mlxsw_core *mlxsw_core, u16 opcode, u8 opcode_mod,
 }
 EXPORT_SYMBOL(mlxsw_cmd_exec);
 
+int mlxsw_core_schedule_dw(struct delayed_work *dwork, unsigned long delay)
+{
+	return queue_delayed_work(mlxsw_wq, dwork, delay);
+}
+EXPORT_SYMBOL(mlxsw_core_schedule_dw);
+
 static int __init mlxsw_core_module_init(void)
 {
-	mlxsw_core_dbg_root = debugfs_create_dir(mlxsw_core_driver_name, NULL);
-	if (!mlxsw_core_dbg_root)
+	int err;
+
+	mlxsw_wq = create_workqueue(mlxsw_core_driver_name);
+	if (!mlxsw_wq)
 		return -ENOMEM;
+	mlxsw_core_dbg_root = debugfs_create_dir(mlxsw_core_driver_name, NULL);
+	if (!mlxsw_core_dbg_root) {
+		err = -ENOMEM;
+		goto err_debugfs_create_dir;
+	}
 	return 0;
+
+err_debugfs_create_dir:
+	destroy_workqueue(mlxsw_wq);
+	return err;
 }
 
 static void __exit mlxsw_core_module_exit(void)
 {
 	debugfs_remove_recursive(mlxsw_core_dbg_root);
+	destroy_workqueue(mlxsw_wq);
 }
 
 module_init(mlxsw_core_module_init);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index 377dacc..b41ebf8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -43,6 +43,7 @@
 #include <linux/gfp.h>
 #include <linux/types.h>
 #include <linux/skbuff.h>
+#include <linux/workqueue.h>
 #include <net/devlink.h>
 
 #include "trap.h"
@@ -151,6 +152,8 @@ int mlxsw_core_port_init(struct mlxsw_core *mlxsw_core,
 			 struct net_device *dev, bool split, u32 split_group);
 void mlxsw_core_port_fini(struct mlxsw_core_port *mlxsw_core_port);
 
+int mlxsw_core_schedule_dw(struct delayed_work *dwork, unsigned long delay);
+
 #define MLXSW_CONFIG_PROFILE_SWID_COUNT 8
 
 struct mlxsw_swid_config {
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index e1c74ef..fb9efb8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -1430,8 +1430,8 @@ static void mlxsw_sp_fdb_notify_rec_process(struct mlxsw_sp *mlxsw_sp,
 
 static void mlxsw_sp_fdb_notify_work_schedule(struct mlxsw_sp *mlxsw_sp)
 {
-	schedule_delayed_work(&mlxsw_sp->fdb_notify.dw,
-			      msecs_to_jiffies(mlxsw_sp->fdb_notify.interval));
+	mlxsw_core_schedule_dw(&mlxsw_sp->fdb_notify.dw,
+			       msecs_to_jiffies(mlxsw_sp->fdb_notify.interval));
 }
 
 static void mlxsw_sp_fdb_notify_work(struct work_struct *work)
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 15/18] mlxsw: reg: Extend SBPM register for occupancy control
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Since it is not possible to get and clear Port-Pool occupancy data using
SBSR register, there's a need to implement that using SBPM.
Extend pack helper and add unpack helper to get occupancy values.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/reg.h          | 31 +++++++++++++++++++++-
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c |  3 ++-
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 656466a..1977e7a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -3636,6 +3636,27 @@ MLXSW_ITEM32(reg, sbpm, pool, 0x00, 8, 4);
  */
 MLXSW_ITEM32(reg, sbpm, dir, 0x00, 0, 2);
 
+/* reg_sbpm_buff_occupancy
+ * Current buffer occupancy in cells.
+ * Access: RO
+ */
+MLXSW_ITEM32(reg, sbpm, buff_occupancy, 0x10, 0, 24);
+
+/* reg_sbpm_clr
+ * Clear Max Buffer Occupancy
+ * When this bit is set, max_buff_occupancy field is cleared (and a
+ * new max value is tracked from the time the clear was performed).
+ * Access: OP
+ */
+MLXSW_ITEM32(reg, sbpm, clr, 0x14, 31, 1);
+
+/* reg_sbpm_max_buff_occupancy
+ * Maximum value of buffer occupancy in cells monitored. Cleared by
+ * writing to the clr field.
+ * Access: RO
+ */
+MLXSW_ITEM32(reg, sbpm, max_buff_occupancy, 0x14, 0, 24);
+
 /* reg_sbpm_min_buff
  * Minimum buffer size for the limiter, in cells.
  * Access: RW
@@ -3656,17 +3677,25 @@ MLXSW_ITEM32(reg, sbpm, min_buff, 0x18, 0, 24);
 MLXSW_ITEM32(reg, sbpm, max_buff, 0x1C, 0, 24);
 
 static inline void mlxsw_reg_sbpm_pack(char *payload, u8 local_port, u8 pool,
-				       enum mlxsw_reg_sbxx_dir dir,
+				       enum mlxsw_reg_sbxx_dir dir, bool clr,
 				       u32 min_buff, u32 max_buff)
 {
 	MLXSW_REG_ZERO(sbpm, payload);
 	mlxsw_reg_sbpm_local_port_set(payload, local_port);
 	mlxsw_reg_sbpm_pool_set(payload, pool);
 	mlxsw_reg_sbpm_dir_set(payload, dir);
+	mlxsw_reg_sbpm_clr_set(payload, clr);
 	mlxsw_reg_sbpm_min_buff_set(payload, min_buff);
 	mlxsw_reg_sbpm_max_buff_set(payload, max_buff);
 }
 
+static inline void mlxsw_reg_sbpm_unpack(char *payload, u32 *p_buff_occupancy,
+					 u32 *p_max_buff_occupancy)
+{
+	*p_buff_occupancy = mlxsw_reg_sbpm_buff_occupancy_get(payload);
+	*p_max_buff_occupancy = mlxsw_reg_sbpm_max_buff_occupancy_get(payload);
+}
+
 /* SBMM - Shared Buffer Multicast Management Register
  * --------------------------------------------------
  * The SBMM register configures and retrieves the shared buffer allocation
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index 6042c17..639ba5a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -113,7 +113,8 @@ static int mlxsw_sp_sb_pm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 	struct mlxsw_sp_sb_pm *pm;
 	int err;
 
-	mlxsw_reg_sbpm_pack(sbpm_pl, local_port, pool, dir, min_buff, max_buff);
+	mlxsw_reg_sbpm_pack(sbpm_pl, local_port, pool, dir, false,
+			    min_buff, max_buff);
 	err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpm), sbpm_pl);
 	if (err)
 		return err;
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 10/18] mlxsw: spectrum_buffers: Get max_buff defaults into limits exposed to user
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Although the device supports max_buff magic values 0 and 0xff, these are
not exposed to the user via devlink.
Therefore, adjust the default values to be within configurable range.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/reg.h          |  4 ++++
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 28 +++++++++++-----------
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 57e4a63..fce5a96 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -3566,6 +3566,10 @@ MLXSW_ITEM32(reg, sbcm, dir, 0x00, 0, 2);
  */
 MLXSW_ITEM32(reg, sbcm, min_buff, 0x18, 0, 24);
 
+/* shared max_buff limits for dynamic threshold for SBCM, SBPM */
+#define MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN 1
+#define MLXSW_REG_SBXX_DYN_MAX_BUFF_MAX 14
+
 /* reg_sbcm_max_buff
  * When the pool associated to the port-pg/tclass is configured to
  * static, Maximum buffer size for the limiter configured in cells.
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index 7ee2315..fd6b080 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -255,13 +255,13 @@ static int mlxsw_sp_sb_prs_init(struct mlxsw_sp *mlxsw_sp)
 
 static const struct mlxsw_sp_sb_cm mlxsw_sp_sb_cms_ingress[] = {
 	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(10000), 8, 0),
-	MLXSW_SP_SB_CM(0, 0, 0),
-	MLXSW_SP_SB_CM(0, 0, 0),
-	MLXSW_SP_SB_CM(0, 0, 0),
-	MLXSW_SP_SB_CM(0, 0, 0),
-	MLXSW_SP_SB_CM(0, 0, 0),
-	MLXSW_SP_SB_CM(0, 0, 0),
-	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN, 0),
+	MLXSW_SP_SB_CM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN, 0),
+	MLXSW_SP_SB_CM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN, 0),
+	MLXSW_SP_SB_CM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN, 0),
+	MLXSW_SP_SB_CM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN, 0),
+	MLXSW_SP_SB_CM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN, 0),
+	MLXSW_SP_SB_CM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN, 0),
 	MLXSW_SP_SB_CM(0, 0, 0), /* dummy, this PG does not exist */
 	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(20000), 1, 3),
 };
@@ -385,19 +385,19 @@ static int mlxsw_sp_cpu_port_sb_cms_init(struct mlxsw_sp *mlxsw_sp)
 	}
 
 static const struct mlxsw_sp_sb_pm mlxsw_sp_sb_pms_ingress[] = {
-	MLXSW_SP_SB_PM(0, 0xff),
-	MLXSW_SP_SB_PM(0, 0),
-	MLXSW_SP_SB_PM(0, 0),
-	MLXSW_SP_SB_PM(0, 0xff),
+	MLXSW_SP_SB_PM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MAX),
+	MLXSW_SP_SB_PM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN),
+	MLXSW_SP_SB_PM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN),
+	MLXSW_SP_SB_PM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MAX),
 };
 
 #define MLXSW_SP_SB_PMS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pms_ingress)
 
 static const struct mlxsw_sp_sb_pm mlxsw_sp_sb_pms_egress[] = {
 	MLXSW_SP_SB_PM(0, 7),
-	MLXSW_SP_SB_PM(0, 0),
-	MLXSW_SP_SB_PM(0, 0),
-	MLXSW_SP_SB_PM(0, 0),
+	MLXSW_SP_SB_PM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN),
+	MLXSW_SP_SB_PM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN),
+	MLXSW_SP_SB_PM(0, MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN),
 };
 
 #define MLXSW_SP_SB_PMS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pms_egress)
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 14/18] mlxsw: reg: Add Shared Buffer Status register definition
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

This register allows to query HW for current and maximal buffer usage.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/reg.h | 100 ++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index fce5a96..656466a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -3722,6 +3722,104 @@ static inline void mlxsw_reg_sbmm_pack(char *payload, u8 prio, u32 min_buff,
 	mlxsw_reg_sbmm_pool_set(payload, pool);
 }
 
+/* SBSR - Shared Buffer Status Register
+ * ------------------------------------
+ * The SBSR register retrieves the shared buffer occupancy according to
+ * Port-Pool. Note that this register enables reading a large amount of data.
+ * It is the user's responsibility to limit the amount of data to ensure the
+ * response can match the maximum transfer unit. In case the response exceeds
+ * the maximum transport unit, it will be truncated with no special notice.
+ */
+#define MLXSW_REG_SBSR_ID 0xB005
+#define MLXSW_REG_SBSR_BASE_LEN 0x5C /* base length, without records */
+#define MLXSW_REG_SBSR_REC_LEN 0x8 /* record length */
+#define MLXSW_REG_SBSR_REC_MAX_COUNT 120
+#define MLXSW_REG_SBSR_LEN (MLXSW_REG_SBSR_BASE_LEN +	\
+			    MLXSW_REG_SBSR_REC_LEN *	\
+			    MLXSW_REG_SBSR_REC_MAX_COUNT)
+
+static const struct mlxsw_reg_info mlxsw_reg_sbsr = {
+	.id = MLXSW_REG_SBSR_ID,
+	.len = MLXSW_REG_SBSR_LEN,
+};
+
+/* reg_sbsr_clr
+ * Clear Max Buffer Occupancy. When this bit is set, the max_buff_occupancy
+ * field is cleared (and a new max value is tracked from the time the clear
+ * was performed).
+ * Access: OP
+ */
+MLXSW_ITEM32(reg, sbsr, clr, 0x00, 31, 1);
+
+/* reg_sbsr_ingress_port_mask
+ * Bit vector for all ingress network ports.
+ * Indicates which of the ports (for which the relevant bit is set)
+ * are affected by the set operation. Configuration of any other port
+ * does not change.
+ * Access: Index
+ */
+MLXSW_ITEM_BIT_ARRAY(reg, sbsr, ingress_port_mask, 0x10, 0x20, 1);
+
+/* reg_sbsr_pg_buff_mask
+ * Bit vector for all switch priority groups.
+ * Indicates which of the priorities (for which the relevant bit is set)
+ * are affected by the set operation. Configuration of any other priority
+ * does not change.
+ * Range is 0..cap_max_pg_buffers - 1
+ * Access: Index
+ */
+MLXSW_ITEM_BIT_ARRAY(reg, sbsr, pg_buff_mask, 0x30, 0x4, 1);
+
+/* reg_sbsr_egress_port_mask
+ * Bit vector for all egress network ports.
+ * Indicates which of the ports (for which the relevant bit is set)
+ * are affected by the set operation. Configuration of any other port
+ * does not change.
+ * Access: Index
+ */
+MLXSW_ITEM_BIT_ARRAY(reg, sbsr, egress_port_mask, 0x34, 0x20, 1);
+
+/* reg_sbsr_tclass_mask
+ * Bit vector for all traffic classes.
+ * Indicates which of the traffic classes (for which the relevant bit is
+ * set) are affected by the set operation. Configuration of any other
+ * traffic class does not change.
+ * Range is 0..cap_max_tclass - 1
+ * Access: Index
+ */
+MLXSW_ITEM_BIT_ARRAY(reg, sbsr, tclass_mask, 0x54, 0x8, 1);
+
+static inline void mlxsw_reg_sbsr_pack(char *payload, bool clr)
+{
+	MLXSW_REG_ZERO(sbsr, payload);
+	mlxsw_reg_sbsr_clr_set(payload, clr);
+}
+
+/* reg_sbsr_rec_buff_occupancy
+ * Current buffer occupancy in cells.
+ * Access: RO
+ */
+MLXSW_ITEM32_INDEXED(reg, sbsr, rec_buff_occupancy, MLXSW_REG_SBSR_BASE_LEN,
+		     0, 24, MLXSW_REG_SBSR_REC_LEN, 0x00, false);
+
+/* reg_sbsr_rec_max_buff_occupancy
+ * Maximum value of buffer occupancy in cells monitored. Cleared by
+ * writing to the clr field.
+ * Access: RO
+ */
+MLXSW_ITEM32_INDEXED(reg, sbsr, rec_max_buff_occupancy, MLXSW_REG_SBSR_BASE_LEN,
+		     0, 24, MLXSW_REG_SBSR_REC_LEN, 0x04, false);
+
+static inline void mlxsw_reg_sbsr_rec_unpack(char *payload, int rec_index,
+					     u32 *p_buff_occupancy,
+					     u32 *p_max_buff_occupancy)
+{
+	*p_buff_occupancy =
+		mlxsw_reg_sbsr_rec_buff_occupancy_get(payload, rec_index);
+	*p_max_buff_occupancy =
+		mlxsw_reg_sbsr_rec_max_buff_occupancy_get(payload, rec_index);
+}
+
 static inline const char *mlxsw_reg_id_str(u16 reg_id)
 {
 	switch (reg_id) {
@@ -3817,6 +3915,8 @@ static inline const char *mlxsw_reg_id_str(u16 reg_id)
 		return "SBPM";
 	case MLXSW_REG_SBMM_ID:
 		return "SBMM";
+	case MLXSW_REG_SBSR_ID:
+		return "SBSR";
 	default:
 		return "*UNKNOWN*";
 	}
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 13/18] mlxsw: core: Add devlink shared buffer occupancy callbacks
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Add middle layer in mlxsw core code to forward shared buffer occupancy
calls into specific ASIC drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c | 74 ++++++++++++++++++++++++++----
 drivers/net/ethernet/mellanox/mlxsw/core.h | 11 +++++
 2 files changed, 77 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 1278260..63a9777 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -911,15 +911,73 @@ mlxsw_devlink_sb_tc_pool_bind_set(struct devlink_port *devlink_port,
 						 pool_index, threshold);
 }
 
+static int mlxsw_devlink_sb_occ_snapshot(struct devlink *devlink,
+					 unsigned int sb_index)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+	if (!mlxsw_driver->sb_occ_snapshot)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_occ_snapshot(mlxsw_core, sb_index);
+}
+
+static int mlxsw_devlink_sb_occ_max_clear(struct devlink *devlink,
+					  unsigned int sb_index)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+	if (!mlxsw_driver->sb_occ_max_clear)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_occ_max_clear(mlxsw_core, sb_index);
+}
+
+static int
+mlxsw_devlink_sb_occ_port_pool_get(struct devlink_port *devlink_port,
+				   unsigned int sb_index, u16 pool_index,
+				   u32 *p_cur, u32 *p_max)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink_port->devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+	struct mlxsw_core_port *mlxsw_core_port = __dl_port(devlink_port);
+
+	if (!mlxsw_driver->sb_occ_port_pool_get)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_occ_port_pool_get(mlxsw_core_port, sb_index,
+						  pool_index, p_cur, p_max);
+}
+
+static int
+mlxsw_devlink_sb_occ_tc_port_bind_get(struct devlink_port *devlink_port,
+				      unsigned int sb_index, u16 tc_index,
+				      enum devlink_sb_pool_type pool_type,
+				      u32 *p_cur, u32 *p_max)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink_port->devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+	struct mlxsw_core_port *mlxsw_core_port = __dl_port(devlink_port);
+
+	if (!mlxsw_driver->sb_occ_tc_port_bind_get)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_occ_tc_port_bind_get(mlxsw_core_port,
+						     sb_index, tc_index,
+						     pool_type, p_cur, p_max);
+}
+
 static const struct devlink_ops mlxsw_devlink_ops = {
-	.port_split		= mlxsw_devlink_port_split,
-	.port_unsplit		= mlxsw_devlink_port_unsplit,
-	.sb_pool_get		= mlxsw_devlink_sb_pool_get,
-	.sb_pool_set		= mlxsw_devlink_sb_pool_set,
-	.sb_port_pool_get	= mlxsw_devlink_sb_port_pool_get,
-	.sb_port_pool_set	= mlxsw_devlink_sb_port_pool_set,
-	.sb_tc_pool_bind_get	= mlxsw_devlink_sb_tc_pool_bind_get,
-	.sb_tc_pool_bind_set	= mlxsw_devlink_sb_tc_pool_bind_set,
+	.port_split			= mlxsw_devlink_port_split,
+	.port_unsplit			= mlxsw_devlink_port_unsplit,
+	.sb_pool_get			= mlxsw_devlink_sb_pool_get,
+	.sb_pool_set			= mlxsw_devlink_sb_pool_set,
+	.sb_port_pool_get		= mlxsw_devlink_sb_port_pool_get,
+	.sb_port_pool_set		= mlxsw_devlink_sb_port_pool_set,
+	.sb_tc_pool_bind_get		= mlxsw_devlink_sb_tc_pool_bind_get,
+	.sb_tc_pool_bind_set		= mlxsw_devlink_sb_tc_pool_bind_set,
+	.sb_occ_snapshot		= mlxsw_devlink_sb_occ_snapshot,
+	.sb_occ_max_clear		= mlxsw_devlink_sb_occ_max_clear,
+	.sb_occ_port_pool_get		= mlxsw_devlink_sb_occ_port_pool_get,
+	.sb_occ_tc_port_bind_get	= mlxsw_devlink_sb_occ_tc_port_bind_get,
 };
 
 int mlxsw_core_bus_device_register(const struct mlxsw_bus_info *mlxsw_bus_info,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index d0c471f..377dacc 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -229,6 +229,17 @@ struct mlxsw_driver {
 				   unsigned int sb_index, u16 tc_index,
 				   enum devlink_sb_pool_type pool_type,
 				   u16 pool_index, u32 threshold);
+	int (*sb_occ_snapshot)(struct mlxsw_core *mlxsw_core,
+			       unsigned int sb_index);
+	int (*sb_occ_max_clear)(struct mlxsw_core *mlxsw_core,
+				unsigned int sb_index);
+	int (*sb_occ_port_pool_get)(struct mlxsw_core_port *mlxsw_core_port,
+				    unsigned int sb_index, u16 pool_index,
+				    u32 *p_cur, u32 *p_max);
+	int (*sb_occ_tc_port_bind_get)(struct mlxsw_core_port *mlxsw_core_port,
+				       unsigned int sb_index, u16 tc_index,
+				       enum devlink_sb_pool_type pool_type,
+				       u32 *p_cur, u32 *p_max);
 	void (*txhdr_construct)(struct sk_buff *skb,
 				const struct mlxsw_tx_info *tx_info);
 	u8 txhdr_len;
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 12/18] mlxsw: spectrum_buffers: Implement shared buffer configuration
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Implement previously introduced mlxsw core shared buffer API.
For Spectrum, that is done utilizing registers SBPR, SBCM and SBPM.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     |   8 +
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  22 +++
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 187 ++++++++++++++++++++-
 3 files changed, 216 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 19b3c14..ecadb15 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -2434,6 +2434,7 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
 
 err_switchdev_init:
 err_lag_init:
+	mlxsw_sp_buffers_fini(mlxsw_sp);
 err_buffers_init:
 err_flood_init:
 	mlxsw_sp_traps_fini(mlxsw_sp);
@@ -2448,6 +2449,7 @@ static void mlxsw_sp_fini(struct mlxsw_core *mlxsw_core)
 {
 	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
 
+	mlxsw_sp_buffers_fini(mlxsw_sp);
 	mlxsw_sp_switchdev_fini(mlxsw_sp);
 	mlxsw_sp_traps_fini(mlxsw_sp);
 	mlxsw_sp_event_unregister(mlxsw_sp, MLXSW_TRAP_ID_PUDE);
@@ -2498,6 +2500,12 @@ static struct mlxsw_driver mlxsw_sp_driver = {
 	.fini			= mlxsw_sp_fini,
 	.port_split		= mlxsw_sp_port_split,
 	.port_unsplit		= mlxsw_sp_port_unsplit,
+	.sb_pool_get		= mlxsw_sp_sb_pool_get,
+	.sb_pool_set		= mlxsw_sp_sb_pool_set,
+	.sb_port_pool_get	= mlxsw_sp_sb_port_pool_get,
+	.sb_port_pool_set	= mlxsw_sp_sb_port_pool_set,
+	.sb_tc_pool_bind_get	= mlxsw_sp_sb_tc_pool_bind_get,
+	.sb_tc_pool_bind_set	= mlxsw_sp_sb_tc_pool_bind_set,
 	.txhdr_construct	= mlxsw_sp_txhdr_construct,
 	.txhdr_len		= MLXSW_TXHDR_LEN,
 	.profile		= &mlxsw_sp_config_profile,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 790c292..6458efa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -65,6 +65,7 @@
 #define MLXSW_SP_BYTES_PER_CELL 96
 
 #define MLXSW_SP_BYTES_TO_CELLS(b) DIV_ROUND_UP(b, MLXSW_SP_BYTES_PER_CELL)
+#define MLXSW_SP_CELLS_TO_BYTES(c) (c * MLXSW_SP_BYTES_PER_CELL)
 
 /* Maximum delay buffer needed in case of PAUSE frames, in cells.
  * Assumes 100m cable and maximum MTU.
@@ -305,7 +306,28 @@ enum mlxsw_sp_flood_table {
 };
 
 int mlxsw_sp_buffers_init(struct mlxsw_sp *mlxsw_sp);
+void mlxsw_sp_buffers_fini(struct mlxsw_sp *mlxsw_sp);
 int mlxsw_sp_port_buffers_init(struct mlxsw_sp_port *mlxsw_sp_port);
+int mlxsw_sp_sb_pool_get(struct mlxsw_core *mlxsw_core,
+			 unsigned int sb_index, u16 pool_index,
+			 struct devlink_sb_pool_info *pool_info);
+int mlxsw_sp_sb_pool_set(struct mlxsw_core *mlxsw_core,
+			 unsigned int sb_index, u16 pool_index, u32 size,
+			 enum devlink_sb_threshold_type threshold_type);
+int mlxsw_sp_sb_port_pool_get(struct mlxsw_core_port *mlxsw_core_port,
+			      unsigned int sb_index, u16 pool_index,
+			      u32 *p_threshold);
+int mlxsw_sp_sb_port_pool_set(struct mlxsw_core_port *mlxsw_core_port,
+			      unsigned int sb_index, u16 pool_index,
+			      u32 threshold);
+int mlxsw_sp_sb_tc_pool_bind_get(struct mlxsw_core_port *mlxsw_core_port,
+				 unsigned int sb_index, u16 tc_index,
+				 enum devlink_sb_pool_type pool_type,
+				 u16 *p_pool_index, u32 *p_threshold);
+int mlxsw_sp_sb_tc_pool_bind_set(struct mlxsw_core_port *mlxsw_core_port,
+				 unsigned int sb_index, u16 tc_index,
+				 enum devlink_sb_pool_type pool_type,
+				 u16 pool_index, u32 threshold);
 
 int mlxsw_sp_switchdev_init(struct mlxsw_sp *mlxsw_sp);
 void mlxsw_sp_switchdev_fini(struct mlxsw_sp *mlxsw_sp);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index fd6b080..6042c17 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -492,6 +492,8 @@ static int mlxsw_sp_sb_mms_init(struct mlxsw_sp *mlxsw_sp)
 	return 0;
 }
 
+#define MLXSW_SP_SB_SIZE (16 * 1024 * 1024)
+
 int mlxsw_sp_buffers_init(struct mlxsw_sp *mlxsw_sp)
 {
 	int err;
@@ -503,8 +505,19 @@ int mlxsw_sp_buffers_init(struct mlxsw_sp *mlxsw_sp)
 	if (err)
 		return err;
 	err = mlxsw_sp_sb_mms_init(mlxsw_sp);
+	if (err)
+		return err;
+	return devlink_sb_register(priv_to_devlink(mlxsw_sp->core), 0,
+				   MLXSW_SP_SB_SIZE,
+				   MLXSW_SP_SB_POOL_COUNT,
+				   MLXSW_SP_SB_POOL_COUNT,
+				   MLXSW_SP_SB_TC_COUNT,
+				   MLXSW_SP_SB_TC_COUNT);
+}
 
-	return err;
+void mlxsw_sp_buffers_fini(struct mlxsw_sp *mlxsw_sp)
+{
+	devlink_sb_unregister(priv_to_devlink(mlxsw_sp->core), 0);
 }
 
 int mlxsw_sp_port_buffers_init(struct mlxsw_sp_port *mlxsw_sp_port)
@@ -521,3 +534,175 @@ int mlxsw_sp_port_buffers_init(struct mlxsw_sp_port *mlxsw_sp_port)
 
 	return err;
 }
+
+static u8 pool_get(u16 pool_index)
+{
+	return pool_index % MLXSW_SP_SB_POOL_COUNT;
+}
+
+static u16 pool_index_get(u8 pool, enum mlxsw_reg_sbxx_dir dir)
+{
+	u16 pool_index;
+
+	pool_index = pool;
+	if (dir == MLXSW_REG_SBXX_DIR_EGRESS)
+		pool_index += MLXSW_SP_SB_POOL_COUNT;
+	return pool_index;
+}
+
+static enum mlxsw_reg_sbxx_dir dir_get(u16 pool_index)
+{
+	return pool_index < MLXSW_SP_SB_POOL_COUNT ?
+	       MLXSW_REG_SBXX_DIR_INGRESS : MLXSW_REG_SBXX_DIR_EGRESS;
+}
+
+int mlxsw_sp_sb_pool_get(struct mlxsw_core *mlxsw_core,
+			 unsigned int sb_index, u16 pool_index,
+			 struct devlink_sb_pool_info *pool_info)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
+	u8 pool = pool_get(pool_index);
+	enum mlxsw_reg_sbxx_dir dir = dir_get(pool_index);
+	struct mlxsw_sp_sb_pr *pr = mlxsw_sp_sb_pr_get(mlxsw_sp, pool, dir);
+
+	pool_info->pool_type = dir;
+	pool_info->size = MLXSW_SP_CELLS_TO_BYTES(pr->size);
+	pool_info->threshold_type = pr->mode;
+	return 0;
+}
+
+int mlxsw_sp_sb_pool_set(struct mlxsw_core *mlxsw_core,
+			 unsigned int sb_index, u16 pool_index, u32 size,
+			 enum devlink_sb_threshold_type threshold_type)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
+	u8 pool = pool_get(pool_index);
+	enum mlxsw_reg_sbxx_dir dir = dir_get(pool_index);
+	enum mlxsw_reg_sbpr_mode mode = threshold_type;
+	u32 pool_size = MLXSW_SP_BYTES_TO_CELLS(size);
+
+	return mlxsw_sp_sb_pr_write(mlxsw_sp, pool, dir, mode, pool_size);
+}
+
+#define MLXSW_SP_SB_THRESHOLD_TO_ALPHA_OFFSET (-2) /* 3->1, 16->14 */
+
+static u32 mlxsw_sp_sb_threshold_out(struct mlxsw_sp *mlxsw_sp, u8 pool,
+				     enum mlxsw_reg_sbxx_dir dir, u32 max_buff)
+{
+	struct mlxsw_sp_sb_pr *pr = mlxsw_sp_sb_pr_get(mlxsw_sp, pool, dir);
+
+	if (pr->mode == MLXSW_REG_SBPR_MODE_DYNAMIC)
+		return max_buff - MLXSW_SP_SB_THRESHOLD_TO_ALPHA_OFFSET;
+	return MLXSW_SP_CELLS_TO_BYTES(max_buff);
+}
+
+static int mlxsw_sp_sb_threshold_in(struct mlxsw_sp *mlxsw_sp, u8 pool,
+				    enum mlxsw_reg_sbxx_dir dir, u32 threshold,
+				    u32 *p_max_buff)
+{
+	struct mlxsw_sp_sb_pr *pr = mlxsw_sp_sb_pr_get(mlxsw_sp, pool, dir);
+
+	if (pr->mode == MLXSW_REG_SBPR_MODE_DYNAMIC) {
+		int val;
+
+		val = threshold + MLXSW_SP_SB_THRESHOLD_TO_ALPHA_OFFSET;
+		if (val < MLXSW_REG_SBXX_DYN_MAX_BUFF_MIN ||
+		    val > MLXSW_REG_SBXX_DYN_MAX_BUFF_MAX)
+			return -EINVAL;
+		*p_max_buff = val;
+	} else {
+		*p_max_buff = MLXSW_SP_BYTES_TO_CELLS(threshold);
+	}
+	return 0;
+}
+
+int mlxsw_sp_sb_port_pool_get(struct mlxsw_core_port *mlxsw_core_port,
+			      unsigned int sb_index, u16 pool_index,
+			      u32 *p_threshold)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port =
+			mlxsw_core_port_driver_priv(mlxsw_core_port);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	u8 local_port = mlxsw_sp_port->local_port;
+	u8 pool = pool_get(pool_index);
+	enum mlxsw_reg_sbxx_dir dir = dir_get(pool_index);
+	struct mlxsw_sp_sb_pm *pm = mlxsw_sp_sb_pm_get(mlxsw_sp, local_port,
+						       pool, dir);
+
+	*p_threshold = mlxsw_sp_sb_threshold_out(mlxsw_sp, pool, dir,
+						 pm->max_buff);
+	return 0;
+}
+
+int mlxsw_sp_sb_port_pool_set(struct mlxsw_core_port *mlxsw_core_port,
+			      unsigned int sb_index, u16 pool_index,
+			      u32 threshold)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port =
+			mlxsw_core_port_driver_priv(mlxsw_core_port);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	u8 local_port = mlxsw_sp_port->local_port;
+	u8 pool = pool_get(pool_index);
+	enum mlxsw_reg_sbxx_dir dir = dir_get(pool_index);
+	u32 max_buff;
+	int err;
+
+	err = mlxsw_sp_sb_threshold_in(mlxsw_sp, pool, dir,
+				       threshold, &max_buff);
+	if (err)
+		return err;
+
+	return mlxsw_sp_sb_pm_write(mlxsw_sp, local_port, pool, dir,
+				    0, max_buff);
+}
+
+int mlxsw_sp_sb_tc_pool_bind_get(struct mlxsw_core_port *mlxsw_core_port,
+				 unsigned int sb_index, u16 tc_index,
+				 enum devlink_sb_pool_type pool_type,
+				 u16 *p_pool_index, u32 *p_threshold)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port =
+			mlxsw_core_port_driver_priv(mlxsw_core_port);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	u8 local_port = mlxsw_sp_port->local_port;
+	u8 pg_buff = tc_index;
+	enum mlxsw_reg_sbxx_dir dir = pool_type;
+	struct mlxsw_sp_sb_cm *cm = mlxsw_sp_sb_cm_get(mlxsw_sp, local_port,
+						       pg_buff, dir);
+
+	*p_threshold = mlxsw_sp_sb_threshold_out(mlxsw_sp, cm->pool, dir,
+						 cm->max_buff);
+	*p_pool_index = pool_index_get(cm->pool, pool_type);
+	return 0;
+}
+
+int mlxsw_sp_sb_tc_pool_bind_set(struct mlxsw_core_port *mlxsw_core_port,
+				 unsigned int sb_index, u16 tc_index,
+				 enum devlink_sb_pool_type pool_type,
+				 u16 pool_index, u32 threshold)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port =
+			mlxsw_core_port_driver_priv(mlxsw_core_port);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	u8 local_port = mlxsw_sp_port->local_port;
+	u8 pg_buff = tc_index;
+	enum mlxsw_reg_sbxx_dir dir = pool_type;
+	u8 pool = pool_index;
+	u32 max_buff;
+	int err;
+
+	err = mlxsw_sp_sb_threshold_in(mlxsw_sp, pool, dir,
+				       threshold, &max_buff);
+	if (err)
+		return err;
+
+	if (pool_type == DEVLINK_SB_POOL_TYPE_EGRESS) {
+		if (pool < MLXSW_SP_SB_POOL_COUNT)
+			return -EINVAL;
+		pool -= MLXSW_SP_SB_POOL_COUNT;
+	} else if (pool >= MLXSW_SP_SB_POOL_COUNT) {
+		return -EINVAL;
+	}
+	return mlxsw_sp_sb_cm_write(mlxsw_sp, local_port, pg_buff, dir,
+				    0, max_buff, pool);
+}
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 11/18] mlxsw: core: Add mlxsw_core_port_driver_priv helper
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Needed in following patch.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index 184e985..d0c471f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -137,6 +137,15 @@ struct mlxsw_core_port {
 	struct devlink_port devlink_port;
 };
 
+static inline void *
+mlxsw_core_port_driver_priv(struct mlxsw_core_port *mlxsw_core_port)
+{
+	/* mlxsw_core_port is ensured to always be the first field in driver
+	 * port structure.
+	 */
+	return mlxsw_core_port;
+}
+
 int mlxsw_core_port_init(struct mlxsw_core *mlxsw_core,
 			 struct mlxsw_core_port *mlxsw_core_port, u8 local_port,
 			 struct net_device *dev, bool split, u32 split_group);
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 09/18] mlxsw: spectrum_buffers: Change initialization of PG 9
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

As explained in commit ff6551ec0c27 ("mlxsw: spectrum: Correctly
configure headroom size") control packets are directed to priority group
buffer 9 (PG9) in the ports' headroom buffers.

Since we don't want to drop control packets in case they can't be
admitted to the switch's shared buffer we bind PG9 to a different
ingress pool from the one used by all other PGs.

Unlike other PGs, we currently don't expose the binding between PG9 to a
pool and leave it fixed.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index dc57d77..7ee2315 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -181,6 +181,7 @@ static int mlxsw_sp_port_headroom_init(struct mlxsw_sp_port *mlxsw_sp_port)
 
 #define MLXSW_SP_SB_PR_INGRESS_SIZE				\
 	(15000000 - (2 * 20000 * MLXSW_PORT_MAX_PORTS))
+#define MLXSW_SP_SB_PR_INGRESS_MNG_SIZE (200 * 1000)
 #define MLXSW_SP_SB_PR_EGRESS_SIZE				\
 	(14000000 - (8 * 1500 * MLXSW_PORT_MAX_PORTS))
 
@@ -195,7 +196,8 @@ static const struct mlxsw_sp_sb_pr mlxsw_sp_sb_prs_ingress[] = {
 		       MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_PR_INGRESS_SIZE)),
 	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
 	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
-	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC,
+		       MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_PR_INGRESS_MNG_SIZE)),
 };
 
 #define MLXSW_SP_SB_PRS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_prs_ingress)
@@ -261,7 +263,7 @@ static const struct mlxsw_sp_sb_cm mlxsw_sp_sb_cms_ingress[] = {
 	MLXSW_SP_SB_CM(0, 0, 0),
 	MLXSW_SP_SB_CM(0, 0, 0),
 	MLXSW_SP_SB_CM(0, 0, 0), /* dummy, this PG does not exist */
-	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(20000), 1, 3),
 };
 
 #define MLXSW_SP_SB_CMS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_cms_ingress)
@@ -386,7 +388,7 @@ static const struct mlxsw_sp_sb_pm mlxsw_sp_sb_pms_ingress[] = {
 	MLXSW_SP_SB_PM(0, 0xff),
 	MLXSW_SP_SB_PM(0, 0),
 	MLXSW_SP_SB_PM(0, 0),
-	MLXSW_SP_SB_PM(0, 0),
+	MLXSW_SP_SB_PM(0, 0xff),
 };
 
 #define MLXSW_SP_SB_PMS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pms_ingress)
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 08/18] mlxsw: spectrum_buffers: Remove eg pool 3 default init and CPU port TC binding to it
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Since there is no congestion control for CPU port traffic, we can change
the CPU port TC binding to pool 0 with min_buff and max_buff zeroed.
Remove initialization for pool egress pool 3 since it is no longer used
by dafault.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index b43f7d36e..dc57d77 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -205,8 +205,7 @@ static const struct mlxsw_sp_sb_pr mlxsw_sp_sb_prs_egress[] = {
 		       MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_PR_EGRESS_SIZE)),
 	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
 	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
-	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC,
-		       MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_PR_EGRESS_SIZE)),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
 };
 
 #define MLXSW_SP_SB_PRS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_prs_egress)
@@ -289,7 +288,7 @@ static const struct mlxsw_sp_sb_cm mlxsw_sp_sb_cms_egress[] = {
 
 #define MLXSW_SP_SB_CMS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_cms_egress)
 
-#define MLXSW_SP_CPU_PORT_SB_CM MLXSW_SP_SB_CM(104, 2, 3)
+#define MLXSW_SP_CPU_PORT_SB_CM MLXSW_SP_SB_CM(0, 0, 0)
 
 static const struct mlxsw_sp_sb_cm mlxsw_sp_cpu_port_sb_cms[] = {
 	MLXSW_SP_CPU_PORT_SB_CM,
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 07/18] mlxsw: spectrum_buffers: Cache shared buffer configuration
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

In order to achieve faster dumping of current setting and also in order
to provide possibility to get pool mode without a need to query hardware,
do cache the configuration in driver.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     | 28 +++++++++
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 73 ++++++++++++++++------
 2 files changed, 82 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 361b0c2..790c292 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -117,6 +117,33 @@ static inline bool mlxsw_sp_fid_is_vfid(u16 fid)
 	return fid >= MLXSW_SP_VFID_BASE;
 }
 
+struct mlxsw_sp_sb_pr {
+	enum mlxsw_reg_sbpr_mode mode;
+	u32 size;
+};
+
+struct mlxsw_sp_sb_cm {
+	u32 min_buff;
+	u32 max_buff;
+	u8 pool;
+};
+
+struct mlxsw_sp_sb_pm {
+	u32 min_buff;
+	u32 max_buff;
+};
+
+#define MLXSW_SP_SB_POOL_COUNT	4
+#define MLXSW_SP_SB_TC_COUNT	8
+
+struct mlxsw_sp_sb {
+	struct mlxsw_sp_sb_pr prs[2][MLXSW_SP_SB_POOL_COUNT];
+	struct {
+		struct mlxsw_sp_sb_cm cms[2][MLXSW_SP_SB_TC_COUNT];
+		struct mlxsw_sp_sb_pm pms[2][MLXSW_SP_SB_POOL_COUNT];
+	} ports[MLXSW_PORT_MAX_PORTS];
+};
+
 struct mlxsw_sp {
 	struct {
 		struct list_head list;
@@ -147,6 +174,7 @@ struct mlxsw_sp {
 	struct mlxsw_sp_upper master_bridge;
 	struct mlxsw_sp_upper lags[MLXSW_SP_LAG_MAX];
 	u8 port_to_module[MLXSW_PORT_MAX_PORTS];
+	struct mlxsw_sp_sb sb;
 };
 
 static inline struct mlxsw_sp_upper *
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index 15bd5aa..b43f7d36e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -42,14 +42,44 @@
 #include "port.h"
 #include "reg.h"
 
+static struct mlxsw_sp_sb_pr *mlxsw_sp_sb_pr_get(struct mlxsw_sp *mlxsw_sp,
+						 u8 pool,
+						 enum mlxsw_reg_sbxx_dir dir)
+{
+	return &mlxsw_sp->sb.prs[dir][pool];
+}
+
+static struct mlxsw_sp_sb_cm *mlxsw_sp_sb_cm_get(struct mlxsw_sp *mlxsw_sp,
+						 u8 local_port, u8 pg_buff,
+						 enum mlxsw_reg_sbxx_dir dir)
+{
+	return &mlxsw_sp->sb.ports[local_port].cms[dir][pg_buff];
+}
+
+static struct mlxsw_sp_sb_pm *mlxsw_sp_sb_pm_get(struct mlxsw_sp *mlxsw_sp,
+						 u8 local_port, u8 pool,
+						 enum mlxsw_reg_sbxx_dir dir)
+{
+	return &mlxsw_sp->sb.ports[local_port].pms[dir][pool];
+}
+
 static int mlxsw_sp_sb_pr_write(struct mlxsw_sp *mlxsw_sp, u8 pool,
 				enum mlxsw_reg_sbxx_dir dir,
 				enum mlxsw_reg_sbpr_mode mode, u32 size)
 {
 	char sbpr_pl[MLXSW_REG_SBPR_LEN];
+	struct mlxsw_sp_sb_pr *pr;
+	int err;
 
 	mlxsw_reg_sbpr_pack(sbpr_pl, pool, dir, mode, size);
-	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpr), sbpr_pl);
+	err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpr), sbpr_pl);
+	if (err)
+		return err;
+
+	pr = mlxsw_sp_sb_pr_get(mlxsw_sp, pool, dir);
+	pr->mode = mode;
+	pr->size = size;
+	return 0;
 }
 
 static int mlxsw_sp_sb_cm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
@@ -57,10 +87,22 @@ static int mlxsw_sp_sb_cm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 				u32 min_buff, u32 max_buff, u8 pool)
 {
 	char sbcm_pl[MLXSW_REG_SBCM_LEN];
+	int err;
 
 	mlxsw_reg_sbcm_pack(sbcm_pl, local_port, pg_buff, dir,
 			    min_buff, max_buff, pool);
-	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbcm), sbcm_pl);
+	err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbcm), sbcm_pl);
+	if (err)
+		return err;
+	if (pg_buff < MLXSW_SP_SB_TC_COUNT) {
+		struct mlxsw_sp_sb_cm *cm;
+
+		cm = mlxsw_sp_sb_cm_get(mlxsw_sp, local_port, pg_buff, dir);
+		cm->min_buff = min_buff;
+		cm->max_buff = max_buff;
+		cm->pool = pool;
+	}
+	return 0;
 }
 
 static int mlxsw_sp_sb_pm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
@@ -68,9 +110,18 @@ static int mlxsw_sp_sb_pm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 				u32 min_buff, u32 max_buff)
 {
 	char sbpm_pl[MLXSW_REG_SBPM_LEN];
+	struct mlxsw_sp_sb_pm *pm;
+	int err;
 
 	mlxsw_reg_sbpm_pack(sbpm_pl, local_port, pool, dir, min_buff, max_buff);
-	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpm), sbpm_pl);
+	err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpm), sbpm_pl);
+	if (err)
+		return err;
+
+	pm = mlxsw_sp_sb_pm_get(mlxsw_sp, local_port, pool, dir);
+	pm->min_buff = min_buff;
+	pm->max_buff = max_buff;
+	return 0;
 }
 
 static const u16 mlxsw_sp_pbs[] = {
@@ -128,11 +179,6 @@ static int mlxsw_sp_port_headroom_init(struct mlxsw_sp_port *mlxsw_sp_port)
 	return mlxsw_sp_port_pb_prio_init(mlxsw_sp_port);
 }
 
-struct mlxsw_sp_sb_pr {
-	enum mlxsw_reg_sbpr_mode mode;
-	u32 size;
-};
-
 #define MLXSW_SP_SB_PR_INGRESS_SIZE				\
 	(15000000 - (2 * 20000 * MLXSW_PORT_MAX_PORTS))
 #define MLXSW_SP_SB_PR_EGRESS_SIZE				\
@@ -199,12 +245,6 @@ static int mlxsw_sp_sb_prs_init(struct mlxsw_sp *mlxsw_sp)
 				      MLXSW_SP_SB_PRS_EGRESS_LEN);
 }
 
-struct mlxsw_sp_sb_cm {
-	u32 min_buff;
-	u32 max_buff;
-	u8 pool;
-};
-
 #define MLXSW_SP_SB_CM(_min_buff, _max_buff, _pool)	\
 	{						\
 		.min_buff = _min_buff,			\
@@ -337,11 +377,6 @@ static int mlxsw_sp_cpu_port_sb_cms_init(struct mlxsw_sp *mlxsw_sp)
 				      MLXSW_SP_CPU_PORT_SB_MCS_LEN);
 }
 
-struct mlxsw_sp_sb_pm {
-	u32 min_buff;
-	u32 max_buff;
-};
-
 #define MLXSW_SP_SB_PM(_min_buff, _max_buff)	\
 	{					\
 		.min_buff = _min_buff,		\
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 06/18] mlxsw: spectrum_buffers: Rename "pool" to "pr" in initialization
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Be consintent with rest of the registers (pm, cm) and use "pr" here.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 70 +++++++++++-----------
 1 file changed, 35 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index c326e58..15bd5aa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -128,75 +128,75 @@ static int mlxsw_sp_port_headroom_init(struct mlxsw_sp_port *mlxsw_sp_port)
 	return mlxsw_sp_port_pb_prio_init(mlxsw_sp_port);
 }
 
-struct mlxsw_sp_sb_pool {
+struct mlxsw_sp_sb_pr {
 	enum mlxsw_reg_sbpr_mode mode;
 	u32 size;
 };
 
-#define MLXSW_SP_SB_POOL_INGRESS_SIZE				\
+#define MLXSW_SP_SB_PR_INGRESS_SIZE				\
 	(15000000 - (2 * 20000 * MLXSW_PORT_MAX_PORTS))
-#define MLXSW_SP_SB_POOL_EGRESS_SIZE				\
+#define MLXSW_SP_SB_PR_EGRESS_SIZE				\
 	(14000000 - (8 * 1500 * MLXSW_PORT_MAX_PORTS))
 
-#define MLXSW_SP_SB_POOL(_mode, _size)	\
+#define MLXSW_SP_SB_PR(_mode, _size)	\
 	{				\
 		.mode = _mode,		\
 		.size = _size,		\
 	}
 
-static const struct mlxsw_sp_sb_pool mlxsw_sp_sb_pools_ingress[] = {
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC,
-			 MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_INGRESS_SIZE)),
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+static const struct mlxsw_sp_sb_pr mlxsw_sp_sb_prs_ingress[] = {
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC,
+		       MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_PR_INGRESS_SIZE)),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
 };
 
-#define MLXSW_SP_SB_POOLS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pools_ingress)
+#define MLXSW_SP_SB_PRS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_prs_ingress)
 
-static const struct mlxsw_sp_sb_pool mlxsw_sp_sb_pools_egress[] = {
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC,
-			 MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_EGRESS_SIZE)),
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
-	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC,
-			 MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_EGRESS_SIZE)),
+static const struct mlxsw_sp_sb_pr mlxsw_sp_sb_prs_egress[] = {
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC,
+		       MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_PR_EGRESS_SIZE)),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_PR(MLXSW_REG_SBPR_MODE_DYNAMIC,
+		       MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_PR_EGRESS_SIZE)),
 };
 
-#define MLXSW_SP_SB_POOLS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pools_egress)
+#define MLXSW_SP_SB_PRS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_prs_egress)
 
-static int __mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp,
-				    enum mlxsw_reg_sbxx_dir dir,
-				    const struct mlxsw_sp_sb_pool *pools,
-				    size_t pools_len)
+static int __mlxsw_sp_sb_prs_init(struct mlxsw_sp *mlxsw_sp,
+				  enum mlxsw_reg_sbxx_dir dir,
+				  const struct mlxsw_sp_sb_pr *prs,
+				  size_t prs_len)
 {
 	int i;
 	int err;
 
-	for (i = 0; i < pools_len; i++) {
-		const struct mlxsw_sp_sb_pool *pool;
+	for (i = 0; i < prs_len; i++) {
+		const struct mlxsw_sp_sb_pr *pr;
 
-		pool = &pools[i];
+		pr = &prs[i];
 		err = mlxsw_sp_sb_pr_write(mlxsw_sp, i, dir,
-					   pool->mode, pool->size);
+					   pr->mode, pr->size);
 		if (err)
 			return err;
 	}
 	return 0;
 }
 
-static int mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp)
+static int mlxsw_sp_sb_prs_init(struct mlxsw_sp *mlxsw_sp)
 {
 	int err;
 
-	err = __mlxsw_sp_sb_pools_init(mlxsw_sp, MLXSW_REG_SBXX_DIR_INGRESS,
-				       mlxsw_sp_sb_pools_ingress,
-				       MLXSW_SP_SB_POOLS_INGRESS_LEN);
+	err = __mlxsw_sp_sb_prs_init(mlxsw_sp, MLXSW_REG_SBXX_DIR_INGRESS,
+				     mlxsw_sp_sb_prs_ingress,
+				     MLXSW_SP_SB_PRS_INGRESS_LEN);
 	if (err)
 		return err;
-	return __mlxsw_sp_sb_pools_init(mlxsw_sp, MLXSW_REG_SBXX_DIR_EGRESS,
-					mlxsw_sp_sb_pools_egress,
-					MLXSW_SP_SB_POOLS_EGRESS_LEN);
+	return __mlxsw_sp_sb_prs_init(mlxsw_sp, MLXSW_REG_SBXX_DIR_EGRESS,
+				      mlxsw_sp_sb_prs_egress,
+				      MLXSW_SP_SB_PRS_EGRESS_LEN);
 }
 
 struct mlxsw_sp_sb_cm {
@@ -460,7 +460,7 @@ int mlxsw_sp_buffers_init(struct mlxsw_sp *mlxsw_sp)
 {
 	int err;
 
-	err = mlxsw_sp_sb_pools_init(mlxsw_sp);
+	err = mlxsw_sp_sb_prs_init(mlxsw_sp);
 	if (err)
 		return err;
 	err = mlxsw_sp_cpu_port_sb_cms_init(mlxsw_sp);
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 05/18] mlxsw: spectrum_buffers: Push out indexes and direction out of SB structs
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Structs are in arrays so use array index as pool/tc/prio index. With
that, there is need to maintain separate arrays for ingress and egress.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 425 +++++++++++----------
 1 file changed, 221 insertions(+), 204 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index ae60838..c326e58 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -73,27 +73,17 @@ static int mlxsw_sp_sb_pm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpm), sbpm_pl);
 }
 
-struct mlxsw_sp_pb {
-	u8 index;
-	u16 size;
-};
-
-#define MLXSW_SP_PB(_index, _size)	\
-	{				\
-		.index = _index,	\
-		.size = _size,		\
-	}
-
-static const struct mlxsw_sp_pb mlxsw_sp_pbs[] = {
-	MLXSW_SP_PB(0, 2 * MLXSW_SP_BYTES_TO_CELLS(ETH_FRAME_LEN)),
-	MLXSW_SP_PB(1, 0),
-	MLXSW_SP_PB(2, 0),
-	MLXSW_SP_PB(3, 0),
-	MLXSW_SP_PB(4, 0),
-	MLXSW_SP_PB(5, 0),
-	MLXSW_SP_PB(6, 0),
-	MLXSW_SP_PB(7, 0),
-	MLXSW_SP_PB(9, 2 * MLXSW_SP_BYTES_TO_CELLS(MLXSW_PORT_MAX_MTU)),
+static const u16 mlxsw_sp_pbs[] = {
+	2 * MLXSW_SP_BYTES_TO_CELLS(ETH_FRAME_LEN),
+	0,
+	0,
+	0,
+	0,
+	0,
+	0,
+	0,
+	0, /* Unused */
+	2 * MLXSW_SP_BYTES_TO_CELLS(MLXSW_PORT_MAX_MTU),
 };
 
 #define MLXSW_SP_PBS_LEN ARRAY_SIZE(mlxsw_sp_pbs)
@@ -106,10 +96,9 @@ static int mlxsw_sp_port_pb_init(struct mlxsw_sp_port *mlxsw_sp_port)
 	mlxsw_reg_pbmc_pack(pbmc_pl, mlxsw_sp_port->local_port,
 			    0xffff, 0xffff / 2);
 	for (i = 0; i < MLXSW_SP_PBS_LEN; i++) {
-		const struct mlxsw_sp_pb *pb;
-
-		pb = &mlxsw_sp_pbs[i];
-		mlxsw_reg_pbmc_lossy_buffer_pack(pbmc_pl, pb->index, pb->size);
+		if (i == 8)
+			continue;
+		mlxsw_reg_pbmc_lossy_buffer_pack(pbmc_pl, i, mlxsw_sp_pbs[i]);
 	}
 	mlxsw_reg_pbmc_lossy_buffer_pack(pbmc_pl,
 					 MLXSW_REG_PBMC_PORT_SHARED_BUF_IDX, 0);
@@ -140,8 +129,6 @@ static int mlxsw_sp_port_headroom_init(struct mlxsw_sp_port *mlxsw_sp_port)
 }
 
 struct mlxsw_sp_sb_pool {
-	u8 pool;
-	enum mlxsw_reg_sbxx_dir dir;
 	enum mlxsw_reg_sbpr_mode mode;
 	u32 size;
 };
@@ -151,45 +138,46 @@ struct mlxsw_sp_sb_pool {
 #define MLXSW_SP_SB_POOL_EGRESS_SIZE				\
 	(14000000 - (8 * 1500 * MLXSW_PORT_MAX_PORTS))
 
-#define MLXSW_SP_SB_POOL(_pool, _dir, _mode, _size)		\
-	{							\
-		.pool = _pool,					\
-		.dir = _dir,					\
-		.mode = _mode,					\
-		.size = _size,					\
+#define MLXSW_SP_SB_POOL(_mode, _size)	\
+	{				\
+		.mode = _mode,		\
+		.size = _size,		\
 	}
 
-#define MLXSW_SP_SB_POOL_INGRESS(_pool, _size)			\
-	MLXSW_SP_SB_POOL(_pool, MLXSW_REG_SBXX_DIR_INGRESS,	\
-			 MLXSW_REG_SBPR_MODE_DYNAMIC, _size)
-
-#define MLXSW_SP_SB_POOL_EGRESS(_pool, _size)			\
-	MLXSW_SP_SB_POOL(_pool, MLXSW_REG_SBXX_DIR_EGRESS,	\
-			 MLXSW_REG_SBPR_MODE_DYNAMIC, _size)
-
-static const struct mlxsw_sp_sb_pool mlxsw_sp_sb_pools[] = {
-	MLXSW_SP_SB_POOL_INGRESS(0, MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_INGRESS_SIZE)),
-	MLXSW_SP_SB_POOL_INGRESS(1, 0),
-	MLXSW_SP_SB_POOL_INGRESS(2, 0),
-	MLXSW_SP_SB_POOL_INGRESS(3, 0),
-	MLXSW_SP_SB_POOL_EGRESS(0, MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_EGRESS_SIZE)),
-	MLXSW_SP_SB_POOL_EGRESS(1, 0),
-	MLXSW_SP_SB_POOL_EGRESS(2, 0),
-	MLXSW_SP_SB_POOL_EGRESS(2, MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_EGRESS_SIZE)),
+static const struct mlxsw_sp_sb_pool mlxsw_sp_sb_pools_ingress[] = {
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC,
+			 MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_INGRESS_SIZE)),
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
 };
 
-#define MLXSW_SP_SB_POOLS_LEN ARRAY_SIZE(mlxsw_sp_sb_pools)
+#define MLXSW_SP_SB_POOLS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pools_ingress)
 
-static int mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp)
+static const struct mlxsw_sp_sb_pool mlxsw_sp_sb_pools_egress[] = {
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC,
+			 MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_EGRESS_SIZE)),
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC, 0),
+	MLXSW_SP_SB_POOL(MLXSW_REG_SBPR_MODE_DYNAMIC,
+			 MLXSW_SP_BYTES_TO_CELLS(MLXSW_SP_SB_POOL_EGRESS_SIZE)),
+};
+
+#define MLXSW_SP_SB_POOLS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pools_egress)
+
+static int __mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp,
+				    enum mlxsw_reg_sbxx_dir dir,
+				    const struct mlxsw_sp_sb_pool *pools,
+				    size_t pools_len)
 {
 	int i;
 	int err;
 
-	for (i = 0; i < MLXSW_SP_SB_POOLS_LEN; i++) {
+	for (i = 0; i < pools_len; i++) {
 		const struct mlxsw_sp_sb_pool *pool;
 
-		pool = &mlxsw_sp_sb_pools[i];
-		err = mlxsw_sp_sb_pr_write(mlxsw_sp, pool->pool, pool->dir,
+		pool = &pools[i];
+		err = mlxsw_sp_sb_pr_write(mlxsw_sp, i, dir,
 					   pool->mode, pool->size);
 		if (err)
 			return err;
@@ -197,109 +185,114 @@ static int mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp)
 	return 0;
 }
 
+static int mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp)
+{
+	int err;
+
+	err = __mlxsw_sp_sb_pools_init(mlxsw_sp, MLXSW_REG_SBXX_DIR_INGRESS,
+				       mlxsw_sp_sb_pools_ingress,
+				       MLXSW_SP_SB_POOLS_INGRESS_LEN);
+	if (err)
+		return err;
+	return __mlxsw_sp_sb_pools_init(mlxsw_sp, MLXSW_REG_SBXX_DIR_EGRESS,
+					mlxsw_sp_sb_pools_egress,
+					MLXSW_SP_SB_POOLS_EGRESS_LEN);
+}
+
 struct mlxsw_sp_sb_cm {
-	union {
-		u8 pg;
-		u8 tc;
-	} u;
-	enum mlxsw_reg_sbxx_dir dir;
 	u32 min_buff;
 	u32 max_buff;
 	u8 pool;
 };
 
-#define MLXSW_SP_SB_CM(_pg_tc, _dir, _min_buff, _max_buff, _pool)	\
-	{								\
-		.u.pg = _pg_tc,						\
-		.dir = _dir,						\
-		.min_buff = _min_buff,					\
-		.max_buff = _max_buff,					\
-		.pool = _pool,						\
+#define MLXSW_SP_SB_CM(_min_buff, _max_buff, _pool)	\
+	{						\
+		.min_buff = _min_buff,			\
+		.max_buff = _max_buff,			\
+		.pool = _pool,				\
 	}
 
-#define MLXSW_SP_SB_CM_INGRESS(_pg, _min_buff, _max_buff)		\
-	MLXSW_SP_SB_CM(_pg, MLXSW_REG_SBXX_DIR_INGRESS,			\
-		       _min_buff, _max_buff, 0)
-
-#define MLXSW_SP_SB_CM_EGRESS(_tc, _min_buff, _max_buff)		\
-	MLXSW_SP_SB_CM(_tc, MLXSW_REG_SBXX_DIR_EGRESS,			\
-		       _min_buff, _max_buff, 0)
-
-#define MLXSW_SP_CPU_PORT_SB_CM_EGRESS(_tc)				\
-	MLXSW_SP_SB_CM(_tc, MLXSW_REG_SBXX_DIR_EGRESS, 104, 2, 3)
-
-static const struct mlxsw_sp_sb_cm mlxsw_sp_sb_cms[] = {
-	MLXSW_SP_SB_CM_INGRESS(0, MLXSW_SP_BYTES_TO_CELLS(10000), 8),
-	MLXSW_SP_SB_CM_INGRESS(1, 0, 0),
-	MLXSW_SP_SB_CM_INGRESS(2, 0, 0),
-	MLXSW_SP_SB_CM_INGRESS(3, 0, 0),
-	MLXSW_SP_SB_CM_INGRESS(4, 0, 0),
-	MLXSW_SP_SB_CM_INGRESS(5, 0, 0),
-	MLXSW_SP_SB_CM_INGRESS(6, 0, 0),
-	MLXSW_SP_SB_CM_INGRESS(7, 0, 0),
-	MLXSW_SP_SB_CM_INGRESS(9, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff),
-	MLXSW_SP_SB_CM_EGRESS(0, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(1, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(2, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(3, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(4, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(5, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(6, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(7, MLXSW_SP_BYTES_TO_CELLS(1500), 9),
-	MLXSW_SP_SB_CM_EGRESS(8, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(9, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(10, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(11, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(12, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(13, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(14, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(15, 0, 0),
-	MLXSW_SP_SB_CM_EGRESS(16, 1, 0xff),
+static const struct mlxsw_sp_sb_cm mlxsw_sp_sb_cms_ingress[] = {
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(10000), 8, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0), /* dummy, this PG does not exist */
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
 };
 
-#define MLXSW_SP_SB_CMS_LEN ARRAY_SIZE(mlxsw_sp_sb_cms)
+#define MLXSW_SP_SB_CMS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_cms_ingress)
+
+static const struct mlxsw_sp_sb_cm mlxsw_sp_sb_cms_egress[] = {
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(MLXSW_SP_BYTES_TO_CELLS(1500), 9, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(0, 0, 0),
+	MLXSW_SP_SB_CM(1, 0xff, 0),
+};
+
+#define MLXSW_SP_SB_CMS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_cms_egress)
+
+#define MLXSW_SP_CPU_PORT_SB_CM MLXSW_SP_SB_CM(104, 2, 3)
 
 static const struct mlxsw_sp_sb_cm mlxsw_sp_cpu_port_sb_cms[] = {
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(0),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(1),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(2),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(3),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(4),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(5),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(6),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(7),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(8),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(9),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(10),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(11),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(12),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(13),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(14),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(15),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(16),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(17),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(18),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(19),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(20),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(21),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(22),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(23),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(24),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(25),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(26),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(27),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(28),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(29),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(30),
-	MLXSW_SP_CPU_PORT_SB_CM_EGRESS(31),
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
+	MLXSW_SP_CPU_PORT_SB_CM,
 };
 
 #define MLXSW_SP_CPU_PORT_SB_MCS_LEN \
 	ARRAY_SIZE(mlxsw_sp_cpu_port_sb_cms)
 
-static int mlxsw_sp_sb_cms_init(struct mlxsw_sp *mlxsw_sp, u8 local_port,
-				const struct mlxsw_sp_sb_cm *cms,
-				size_t cms_len)
+static int __mlxsw_sp_sb_cms_init(struct mlxsw_sp *mlxsw_sp, u8 local_port,
+				  enum mlxsw_reg_sbxx_dir dir,
+				  const struct mlxsw_sp_sb_cm *cms,
+				  size_t cms_len)
 {
 	int i;
 	int err;
@@ -307,10 +300,12 @@ static int mlxsw_sp_sb_cms_init(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 	for (i = 0; i < cms_len; i++) {
 		const struct mlxsw_sp_sb_cm *cm;
 
+		if (i == 8 && dir == MLXSW_REG_SBXX_DIR_INGRESS)
+			continue; /* PG number 8 does not exist, skip it */
 		cm = &cms[i];
-		err = mlxsw_sp_sb_cm_write(mlxsw_sp, local_port, cm->u.pg,
-					   cm->dir, cm->min_buff,
-					   cm->max_buff, cm->pool);
+		err = mlxsw_sp_sb_cm_write(mlxsw_sp, local_port, i, dir,
+					   cm->min_buff, cm->max_buff,
+					   cm->pool);
 		if (err)
 			return err;
 	}
@@ -319,65 +314,71 @@ static int mlxsw_sp_sb_cms_init(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 
 static int mlxsw_sp_port_sb_cms_init(struct mlxsw_sp_port *mlxsw_sp_port)
 {
-	return mlxsw_sp_sb_cms_init(mlxsw_sp_port->mlxsw_sp,
-				    mlxsw_sp_port->local_port, mlxsw_sp_sb_cms,
-				    MLXSW_SP_SB_CMS_LEN);
+	int err;
+
+	err = __mlxsw_sp_sb_cms_init(mlxsw_sp_port->mlxsw_sp,
+				     mlxsw_sp_port->local_port,
+				     MLXSW_REG_SBXX_DIR_INGRESS,
+				     mlxsw_sp_sb_cms_ingress,
+				     MLXSW_SP_SB_CMS_INGRESS_LEN);
+	if (err)
+		return err;
+	return __mlxsw_sp_sb_cms_init(mlxsw_sp_port->mlxsw_sp,
+				      mlxsw_sp_port->local_port,
+				      MLXSW_REG_SBXX_DIR_EGRESS,
+				      mlxsw_sp_sb_cms_egress,
+				      MLXSW_SP_SB_CMS_EGRESS_LEN);
 }
 
 static int mlxsw_sp_cpu_port_sb_cms_init(struct mlxsw_sp *mlxsw_sp)
 {
-	return mlxsw_sp_sb_cms_init(mlxsw_sp, 0, mlxsw_sp_cpu_port_sb_cms,
-				    MLXSW_SP_CPU_PORT_SB_MCS_LEN);
+	return __mlxsw_sp_sb_cms_init(mlxsw_sp, 0, MLXSW_REG_SBXX_DIR_EGRESS,
+				      mlxsw_sp_cpu_port_sb_cms,
+				      MLXSW_SP_CPU_PORT_SB_MCS_LEN);
 }
 
 struct mlxsw_sp_sb_pm {
-	u8 pool;
-	enum mlxsw_reg_sbxx_dir dir;
 	u32 min_buff;
 	u32 max_buff;
 };
 
-#define MLXSW_SP_SB_PM(_pool, _dir, _min_buff, _max_buff)	\
-	{							\
-		.pool = _pool,					\
-		.dir = _dir,					\
-		.min_buff = _min_buff,				\
-		.max_buff = _max_buff,				\
+#define MLXSW_SP_SB_PM(_min_buff, _max_buff)	\
+	{					\
+		.min_buff = _min_buff,		\
+		.max_buff = _max_buff,		\
 	}
 
-#define MLXSW_SP_SB_PM_INGRESS(_pool, _min_buff, _max_buff)	\
-	MLXSW_SP_SB_PM(_pool, MLXSW_REG_SBXX_DIR_INGRESS,	\
-		       _min_buff, _max_buff)
-
-#define MLXSW_SP_SB_PM_EGRESS(_pool, _min_buff, _max_buff)	\
-	MLXSW_SP_SB_PM(_pool, MLXSW_REG_SBXX_DIR_EGRESS,	\
-		       _min_buff, _max_buff)
-
-static const struct mlxsw_sp_sb_pm mlxsw_sp_sb_pms[] = {
-	MLXSW_SP_SB_PM_INGRESS(0, 0, 0xff),
-	MLXSW_SP_SB_PM_INGRESS(1, 0, 0),
-	MLXSW_SP_SB_PM_INGRESS(2, 0, 0),
-	MLXSW_SP_SB_PM_INGRESS(3, 0, 0),
-	MLXSW_SP_SB_PM_EGRESS(0, 0, 7),
-	MLXSW_SP_SB_PM_EGRESS(1, 0, 0),
-	MLXSW_SP_SB_PM_EGRESS(2, 0, 0),
-	MLXSW_SP_SB_PM_EGRESS(3, 0, 0),
+static const struct mlxsw_sp_sb_pm mlxsw_sp_sb_pms_ingress[] = {
+	MLXSW_SP_SB_PM(0, 0xff),
+	MLXSW_SP_SB_PM(0, 0),
+	MLXSW_SP_SB_PM(0, 0),
+	MLXSW_SP_SB_PM(0, 0),
 };
 
-#define MLXSW_SP_SB_PMS_LEN ARRAY_SIZE(mlxsw_sp_sb_pms)
+#define MLXSW_SP_SB_PMS_INGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pms_ingress)
 
-static int mlxsw_sp_port_sb_pms_init(struct mlxsw_sp_port *mlxsw_sp_port)
+static const struct mlxsw_sp_sb_pm mlxsw_sp_sb_pms_egress[] = {
+	MLXSW_SP_SB_PM(0, 7),
+	MLXSW_SP_SB_PM(0, 0),
+	MLXSW_SP_SB_PM(0, 0),
+	MLXSW_SP_SB_PM(0, 0),
+};
+
+#define MLXSW_SP_SB_PMS_EGRESS_LEN ARRAY_SIZE(mlxsw_sp_sb_pms_egress)
+
+static int __mlxsw_sp_port_sb_pms_init(struct mlxsw_sp *mlxsw_sp, u8 local_port,
+				       enum mlxsw_reg_sbxx_dir dir,
+				       const struct mlxsw_sp_sb_pm *pms,
+				       size_t pms_len)
 {
 	int i;
 	int err;
 
-	for (i = 0; i < MLXSW_SP_SB_PMS_LEN; i++) {
+	for (i = 0; i < pms_len; i++) {
 		const struct mlxsw_sp_sb_pm *pm;
 
-		pm = &mlxsw_sp_sb_pms[i];
-		err = mlxsw_sp_sb_pm_write(mlxsw_sp_port->mlxsw_sp,
-					   mlxsw_sp_port->local_port,
-					   pm->pool, pm->dir,
+		pm = &pms[i];
+		err = mlxsw_sp_sb_pm_write(mlxsw_sp, local_port, i, dir,
 					   pm->min_buff, pm->max_buff);
 		if (err)
 			return err;
@@ -385,37 +386,53 @@ static int mlxsw_sp_port_sb_pms_init(struct mlxsw_sp_port *mlxsw_sp_port)
 	return 0;
 }
 
+static int mlxsw_sp_port_sb_pms_init(struct mlxsw_sp_port *mlxsw_sp_port)
+{
+	int err;
+
+	err = __mlxsw_sp_port_sb_pms_init(mlxsw_sp_port->mlxsw_sp,
+					  mlxsw_sp_port->local_port,
+					  MLXSW_REG_SBXX_DIR_INGRESS,
+					  mlxsw_sp_sb_pms_ingress,
+					  MLXSW_SP_SB_PMS_INGRESS_LEN);
+	if (err)
+		return err;
+	return __mlxsw_sp_port_sb_pms_init(mlxsw_sp_port->mlxsw_sp,
+					   mlxsw_sp_port->local_port,
+					   MLXSW_REG_SBXX_DIR_EGRESS,
+					   mlxsw_sp_sb_pms_egress,
+					   MLXSW_SP_SB_PMS_EGRESS_LEN);
+}
+
 struct mlxsw_sp_sb_mm {
-	u8 prio;
 	u32 min_buff;
 	u32 max_buff;
 	u8 pool;
 };
 
-#define MLXSW_SP_SB_MM(_prio, _min_buff, _max_buff, _pool)	\
-	{							\
-		.prio = _prio,					\
-		.min_buff = _min_buff,				\
-		.max_buff = _max_buff,				\
-		.pool = _pool,					\
+#define MLXSW_SP_SB_MM(_min_buff, _max_buff, _pool)	\
+	{						\
+		.min_buff = _min_buff,			\
+		.max_buff = _max_buff,			\
+		.pool = _pool,				\
 	}
 
 static const struct mlxsw_sp_sb_mm mlxsw_sp_sb_mms[] = {
-	MLXSW_SP_SB_MM(0, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(1, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(2, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(3, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(4, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(5, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(6, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(7, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(8, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(9, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(10, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(11, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(12, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(13, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
-	MLXSW_SP_SB_MM(14, MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
+	MLXSW_SP_SB_MM(MLXSW_SP_BYTES_TO_CELLS(20000), 0xff, 0),
 };
 
 #define MLXSW_SP_SB_MMS_LEN ARRAY_SIZE(mlxsw_sp_sb_mms)
@@ -430,7 +447,7 @@ static int mlxsw_sp_sb_mms_init(struct mlxsw_sp *mlxsw_sp)
 		const struct mlxsw_sp_sb_mm *mc;
 
 		mc = &mlxsw_sp_sb_mms[i];
-		mlxsw_reg_sbmm_pack(sbmm_pl, mc->prio, mc->min_buff,
+		mlxsw_reg_sbmm_pack(sbmm_pl, i, mc->min_buff,
 				    mc->max_buff, mc->pool);
 		err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbmm), sbmm_pl);
 		if (err)
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 04/18] mlxsw: spectrum_buffers: Push out shared buffer register writes
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Pushed them into helper functions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 54 ++++++++++++++++------
 1 file changed, 40 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index f58b1d3..ae60838 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -42,6 +42,37 @@
 #include "port.h"
 #include "reg.h"
 
+static int mlxsw_sp_sb_pr_write(struct mlxsw_sp *mlxsw_sp, u8 pool,
+				enum mlxsw_reg_sbxx_dir dir,
+				enum mlxsw_reg_sbpr_mode mode, u32 size)
+{
+	char sbpr_pl[MLXSW_REG_SBPR_LEN];
+
+	mlxsw_reg_sbpr_pack(sbpr_pl, pool, dir, mode, size);
+	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpr), sbpr_pl);
+}
+
+static int mlxsw_sp_sb_cm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
+				u8 pg_buff, enum mlxsw_reg_sbxx_dir dir,
+				u32 min_buff, u32 max_buff, u8 pool)
+{
+	char sbcm_pl[MLXSW_REG_SBCM_LEN];
+
+	mlxsw_reg_sbcm_pack(sbcm_pl, local_port, pg_buff, dir,
+			    min_buff, max_buff, pool);
+	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbcm), sbcm_pl);
+}
+
+static int mlxsw_sp_sb_pm_write(struct mlxsw_sp *mlxsw_sp, u8 local_port,
+				u8 pool, enum mlxsw_reg_sbxx_dir dir,
+				u32 min_buff, u32 max_buff)
+{
+	char sbpm_pl[MLXSW_REG_SBPM_LEN];
+
+	mlxsw_reg_sbpm_pack(sbpm_pl, local_port, pool, dir, min_buff, max_buff);
+	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpm), sbpm_pl);
+}
+
 struct mlxsw_sp_pb {
 	u8 index;
 	u16 size;
@@ -151,7 +182,6 @@ static const struct mlxsw_sp_sb_pool mlxsw_sp_sb_pools[] = {
 
 static int mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp)
 {
-	char sbpr_pl[MLXSW_REG_SBPR_LEN];
 	int i;
 	int err;
 
@@ -159,9 +189,8 @@ static int mlxsw_sp_sb_pools_init(struct mlxsw_sp *mlxsw_sp)
 		const struct mlxsw_sp_sb_pool *pool;
 
 		pool = &mlxsw_sp_sb_pools[i];
-		mlxsw_reg_sbpr_pack(sbpr_pl, pool->pool, pool->dir,
-				    pool->mode, pool->size);
-		err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpr), sbpr_pl);
+		err = mlxsw_sp_sb_pr_write(mlxsw_sp, pool->pool, pool->dir,
+					   pool->mode, pool->size);
 		if (err)
 			return err;
 	}
@@ -272,7 +301,6 @@ static int mlxsw_sp_sb_cms_init(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 				const struct mlxsw_sp_sb_cm *cms,
 				size_t cms_len)
 {
-	char sbcm_pl[MLXSW_REG_SBCM_LEN];
 	int i;
 	int err;
 
@@ -280,9 +308,9 @@ static int mlxsw_sp_sb_cms_init(struct mlxsw_sp *mlxsw_sp, u8 local_port,
 		const struct mlxsw_sp_sb_cm *cm;
 
 		cm = &cms[i];
-		mlxsw_reg_sbcm_pack(sbcm_pl, local_port, cm->u.pg, cm->dir,
-				    cm->min_buff, cm->max_buff, cm->pool);
-		err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbcm), sbcm_pl);
+		err = mlxsw_sp_sb_cm_write(mlxsw_sp, local_port, cm->u.pg,
+					   cm->dir, cm->min_buff,
+					   cm->max_buff, cm->pool);
 		if (err)
 			return err;
 	}
@@ -340,7 +368,6 @@ static const struct mlxsw_sp_sb_pm mlxsw_sp_sb_pms[] = {
 
 static int mlxsw_sp_port_sb_pms_init(struct mlxsw_sp_port *mlxsw_sp_port)
 {
-	char sbpm_pl[MLXSW_REG_SBPM_LEN];
 	int i;
 	int err;
 
@@ -348,11 +375,10 @@ static int mlxsw_sp_port_sb_pms_init(struct mlxsw_sp_port *mlxsw_sp_port)
 		const struct mlxsw_sp_sb_pm *pm;
 
 		pm = &mlxsw_sp_sb_pms[i];
-		mlxsw_reg_sbpm_pack(sbpm_pl, mlxsw_sp_port->local_port,
-				    pm->pool, pm->dir,
-				    pm->min_buff, pm->max_buff);
-		err = mlxsw_reg_write(mlxsw_sp_port->mlxsw_sp->core,
-				      MLXSW_REG(sbpm), sbpm_pl);
+		err = mlxsw_sp_sb_pm_write(mlxsw_sp_port->mlxsw_sp,
+					   mlxsw_sp_port->local_port,
+					   pm->pool, pm->dir,
+					   pm->min_buff, pm->max_buff);
 		if (err)
 			return err;
 	}
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 03/18] mlxsw: core: Add devlink shared buffer callbacks
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Add middle layer in mlxsw core code to forward shared buffer calls
into specific ASIC drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c | 105 ++++++++++++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlxsw/core.h |  20 ++++++
 2 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 3958195..1278260 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -816,9 +816,110 @@ static int mlxsw_devlink_port_unsplit(struct devlink *devlink,
 	return mlxsw_core->driver->port_unsplit(mlxsw_core, port_index);
 }
 
+static int
+mlxsw_devlink_sb_pool_get(struct devlink *devlink,
+			  unsigned int sb_index, u16 pool_index,
+			  struct devlink_sb_pool_info *pool_info)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+	if (!mlxsw_driver->sb_pool_get)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_pool_get(mlxsw_core, sb_index,
+					 pool_index, pool_info);
+}
+
+static int
+mlxsw_devlink_sb_pool_set(struct devlink *devlink,
+			  unsigned int sb_index, u16 pool_index, u32 size,
+			  enum devlink_sb_threshold_type threshold_type)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+	if (!mlxsw_driver->sb_pool_set)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_pool_set(mlxsw_core, sb_index,
+					 pool_index, size, threshold_type);
+}
+
+static void *__dl_port(struct devlink_port *devlink_port)
+{
+	return container_of(devlink_port, struct mlxsw_core_port, devlink_port);
+}
+
+static int mlxsw_devlink_sb_port_pool_get(struct devlink_port *devlink_port,
+					  unsigned int sb_index, u16 pool_index,
+					  u32 *p_threshold)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink_port->devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+	struct mlxsw_core_port *mlxsw_core_port = __dl_port(devlink_port);
+
+	if (!mlxsw_driver->sb_port_pool_get)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_port_pool_get(mlxsw_core_port, sb_index,
+					      pool_index, p_threshold);
+}
+
+static int mlxsw_devlink_sb_port_pool_set(struct devlink_port *devlink_port,
+					  unsigned int sb_index, u16 pool_index,
+					  u32 threshold)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink_port->devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+	struct mlxsw_core_port *mlxsw_core_port = __dl_port(devlink_port);
+
+	if (!mlxsw_driver->sb_port_pool_set)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_port_pool_set(mlxsw_core_port, sb_index,
+					      pool_index, threshold);
+}
+
+static int
+mlxsw_devlink_sb_tc_pool_bind_get(struct devlink_port *devlink_port,
+				  unsigned int sb_index, u16 tc_index,
+				  enum devlink_sb_pool_type pool_type,
+				  u16 *p_pool_index, u32 *p_threshold)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink_port->devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+	struct mlxsw_core_port *mlxsw_core_port = __dl_port(devlink_port);
+
+	if (!mlxsw_driver->sb_tc_pool_bind_get)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_tc_pool_bind_get(mlxsw_core_port, sb_index,
+						 tc_index, pool_type,
+						 p_pool_index, p_threshold);
+}
+
+static int
+mlxsw_devlink_sb_tc_pool_bind_set(struct devlink_port *devlink_port,
+				  unsigned int sb_index, u16 tc_index,
+				  enum devlink_sb_pool_type pool_type,
+				  u16 pool_index, u32 threshold)
+{
+	struct mlxsw_core *mlxsw_core = devlink_priv(devlink_port->devlink);
+	struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+	struct mlxsw_core_port *mlxsw_core_port = __dl_port(devlink_port);
+
+	if (!mlxsw_driver->sb_tc_pool_bind_set)
+		return -EOPNOTSUPP;
+	return mlxsw_driver->sb_tc_pool_bind_set(mlxsw_core_port, sb_index,
+						 tc_index, pool_type,
+						 pool_index, threshold);
+}
+
 static const struct devlink_ops mlxsw_devlink_ops = {
-	.port_split	= mlxsw_devlink_port_split,
-	.port_unsplit	= mlxsw_devlink_port_unsplit,
+	.port_split		= mlxsw_devlink_port_split,
+	.port_unsplit		= mlxsw_devlink_port_unsplit,
+	.sb_pool_get		= mlxsw_devlink_sb_pool_get,
+	.sb_pool_set		= mlxsw_devlink_sb_pool_set,
+	.sb_port_pool_get	= mlxsw_devlink_sb_port_pool_get,
+	.sb_port_pool_set	= mlxsw_devlink_sb_port_pool_set,
+	.sb_tc_pool_bind_get	= mlxsw_devlink_sb_tc_pool_bind_get,
+	.sb_tc_pool_bind_set	= mlxsw_devlink_sb_tc_pool_bind_set,
 };
 
 int mlxsw_core_bus_device_register(const struct mlxsw_bus_info *mlxsw_bus_info,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index f3cebef..184e985 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -200,6 +200,26 @@ struct mlxsw_driver {
 	int (*port_split)(struct mlxsw_core *mlxsw_core, u8 local_port,
 			  unsigned int count);
 	int (*port_unsplit)(struct mlxsw_core *mlxsw_core, u8 local_port);
+	int (*sb_pool_get)(struct mlxsw_core *mlxsw_core,
+			   unsigned int sb_index, u16 pool_index,
+			   struct devlink_sb_pool_info *pool_info);
+	int (*sb_pool_set)(struct mlxsw_core *mlxsw_core,
+			   unsigned int sb_index, u16 pool_index, u32 size,
+			   enum devlink_sb_threshold_type threshold_type);
+	int (*sb_port_pool_get)(struct mlxsw_core_port *mlxsw_core_port,
+				unsigned int sb_index, u16 pool_index,
+				u32 *p_threshold);
+	int (*sb_port_pool_set)(struct mlxsw_core_port *mlxsw_core_port,
+				unsigned int sb_index, u16 pool_index,
+				u32 threshold);
+	int (*sb_tc_pool_bind_get)(struct mlxsw_core_port *mlxsw_core_port,
+				   unsigned int sb_index, u16 tc_index,
+				   enum devlink_sb_pool_type pool_type,
+				   u16 *p_pool_index, u32 *p_threshold);
+	int (*sb_tc_pool_bind_set)(struct mlxsw_core_port *mlxsw_core_port,
+				   unsigned int sb_index, u16 tc_index,
+				   enum devlink_sb_pool_type pool_type,
+				   u16 pool_index, u32 threshold);
 	void (*txhdr_construct)(struct sk_buff *skb,
 				const struct mlxsw_tx_info *tx_info);
 	u8 txhdr_len;
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 02/18] devlink: implement shared buffer occupancy monitoring interface
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

User needs to monitor shared buffer occupancy. For that, he issues a
snapshot command in order to instruct hardware to catch current and
maximal occupancy values, and clear command in order to clear the
historical maximal values.

Also port-pool and tc-pool-bind command response messages are extended to
carry occupancy values.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 include/net/devlink.h        | 12 ++++++
 include/uapi/linux/devlink.h |  6 +++
 net/core/devlink.c           | 98 +++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 110 insertions(+), 6 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index e4c2747..be64218 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -78,6 +78,18 @@ struct devlink_ops {
 				   u16 tc_index,
 				   enum devlink_sb_pool_type pool_type,
 				   u16 pool_index, u32 threshold);
+	int (*sb_occ_snapshot)(struct devlink *devlink,
+			       unsigned int sb_index);
+	int (*sb_occ_max_clear)(struct devlink *devlink,
+				unsigned int sb_index);
+	int (*sb_occ_port_pool_get)(struct devlink_port *devlink_port,
+				    unsigned int sb_index, u16 pool_index,
+				    u32 *p_cur, u32 *p_max);
+	int (*sb_occ_tc_port_bind_get)(struct devlink_port *devlink_port,
+				       unsigned int sb_index,
+				       u16 tc_index,
+				       enum devlink_sb_pool_type pool_type,
+				       u32 *p_cur, u32 *p_max);
 };
 
 static inline void *devlink_priv(struct devlink *devlink)
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 9c1aa57..ba0073b 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -53,6 +53,10 @@ enum devlink_command {
 	DEVLINK_CMD_SB_TC_POOL_BIND_NEW,
 	DEVLINK_CMD_SB_TC_POOL_BIND_DEL,
 
+	/* Shared buffer occupancy monitoring commands */
+	DEVLINK_CMD_SB_OCC_SNAPSHOT,
+	DEVLINK_CMD_SB_OCC_MAX_CLEAR,
+
 	/* add new commands above here */
 
 	__DEVLINK_CMD_MAX,
@@ -119,6 +123,8 @@ enum devlink_attr {
 	DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE,	/* u8 */
 	DEVLINK_ATTR_SB_THRESHOLD,		/* u32 */
 	DEVLINK_ATTR_SB_TC_INDEX,		/* u16 */
+	DEVLINK_ATTR_SB_OCC_CUR,		/* u32 */
+	DEVLINK_ATTR_SB_OCC_MAX,		/* u32 */
 
 	/* add new attributes above here, update the policy in devlink.c */
 
diff --git a/net/core/devlink.c b/net/core/devlink.c
index aa0b9e1..933e8d4 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -280,6 +280,10 @@ devlink_sb_tc_index_get_from_info(struct devlink_sb *devlink_sb,
 #define DEVLINK_NL_FLAG_NEED_DEVLINK	BIT(0)
 #define DEVLINK_NL_FLAG_NEED_PORT	BIT(1)
 #define DEVLINK_NL_FLAG_NEED_SB		BIT(2)
+#define DEVLINK_NL_FLAG_LOCK_PORTS	BIT(3)
+	/* port is not needed but we need to ensure they don't
+	 * change in the middle of command
+	 */
 
 static int devlink_nl_pre_doit(const struct genl_ops *ops,
 			       struct sk_buff *skb, struct genl_info *info)
@@ -306,6 +310,9 @@ static int devlink_nl_pre_doit(const struct genl_ops *ops,
 		}
 		info->user_ptr[0] = devlink_port;
 	}
+	if (ops->internal_flags & DEVLINK_NL_FLAG_LOCK_PORTS) {
+		mutex_lock(&devlink_port_mutex);
+	}
 	if (ops->internal_flags & DEVLINK_NL_FLAG_NEED_SB) {
 		struct devlink_sb *devlink_sb;
 
@@ -324,7 +331,8 @@ static int devlink_nl_pre_doit(const struct genl_ops *ops,
 static void devlink_nl_post_doit(const struct genl_ops *ops,
 				 struct sk_buff *skb, struct genl_info *info)
 {
-	if (ops->internal_flags & DEVLINK_NL_FLAG_NEED_PORT)
+	if (ops->internal_flags & DEVLINK_NL_FLAG_NEED_PORT ||
+	    ops->internal_flags & DEVLINK_NL_FLAG_LOCK_PORTS)
 		mutex_unlock(&devlink_port_mutex);
 	mutex_unlock(&devlink_mutex);
 }
@@ -942,12 +950,13 @@ static int devlink_nl_sb_port_pool_fill(struct sk_buff *msg,
 					enum devlink_command cmd,
 					u32 portid, u32 seq, int flags)
 {
+	const struct devlink_ops *ops = devlink->ops;
 	u32 threshold;
 	void *hdr;
 	int err;
 
-	err = devlink->ops->sb_port_pool_get(devlink_port, devlink_sb->index,
-					     pool_index, &threshold);
+	err = ops->sb_port_pool_get(devlink_port, devlink_sb->index,
+				    pool_index, &threshold);
 	if (err)
 		return err;
 
@@ -966,6 +975,22 @@ static int devlink_nl_sb_port_pool_fill(struct sk_buff *msg,
 	if (nla_put_u32(msg, DEVLINK_ATTR_SB_THRESHOLD, threshold))
 		goto nla_put_failure;
 
+	if (ops->sb_occ_port_pool_get) {
+		u32 cur;
+		u32 max;
+
+		err = ops->sb_occ_port_pool_get(devlink_port, devlink_sb->index,
+						pool_index, &cur, &max);
+		if (err && err != -EOPNOTSUPP)
+			return err;
+		if (!err) {
+			if (nla_put_u32(msg, DEVLINK_ATTR_SB_OCC_CUR, cur))
+				goto nla_put_failure;
+			if (nla_put_u32(msg, DEVLINK_ATTR_SB_OCC_MAX, max))
+				goto nla_put_failure;
+		}
+	}
+
 	genlmsg_end(msg, hdr);
 	return 0;
 
@@ -1114,14 +1139,15 @@ devlink_nl_sb_tc_pool_bind_fill(struct sk_buff *msg, struct devlink *devlink,
 				enum devlink_command cmd,
 				u32 portid, u32 seq, int flags)
 {
+	const struct devlink_ops *ops = devlink->ops;
 	u16 pool_index;
 	u32 threshold;
 	void *hdr;
 	int err;
 
-	err = devlink->ops->sb_tc_pool_bind_get(devlink_port, devlink_sb->index,
-						tc_index, pool_type,
-						&pool_index, &threshold);
+	err = ops->sb_tc_pool_bind_get(devlink_port, devlink_sb->index,
+				       tc_index, pool_type,
+				       &pool_index, &threshold);
 	if (err)
 		return err;
 
@@ -1144,6 +1170,24 @@ devlink_nl_sb_tc_pool_bind_fill(struct sk_buff *msg, struct devlink *devlink,
 	if (nla_put_u32(msg, DEVLINK_ATTR_SB_THRESHOLD, threshold))
 		goto nla_put_failure;
 
+	if (ops->sb_occ_tc_port_bind_get) {
+		u32 cur;
+		u32 max;
+
+		err = ops->sb_occ_tc_port_bind_get(devlink_port,
+						   devlink_sb->index,
+						   tc_index, pool_type,
+						   &cur, &max);
+		if (err && err != -EOPNOTSUPP)
+			return err;
+		if (!err) {
+			if (nla_put_u32(msg, DEVLINK_ATTR_SB_OCC_CUR, cur))
+				goto nla_put_failure;
+			if (nla_put_u32(msg, DEVLINK_ATTR_SB_OCC_MAX, max))
+				goto nla_put_failure;
+		}
+	}
+
 	genlmsg_end(msg, hdr);
 	return 0;
 
@@ -1326,6 +1370,30 @@ static int devlink_nl_cmd_sb_tc_pool_bind_set_doit(struct sk_buff *skb,
 					   pool_index, threshold);
 }
 
+static int devlink_nl_cmd_sb_occ_snapshot_doit(struct sk_buff *skb,
+					       struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	const struct devlink_ops *ops = devlink->ops;
+
+	if (ops && ops->sb_occ_snapshot)
+		return ops->sb_occ_snapshot(devlink, devlink_sb->index);
+	return -EOPNOTSUPP;
+}
+
+static int devlink_nl_cmd_sb_occ_max_clear_doit(struct sk_buff *skb,
+						struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	const struct devlink_ops *ops = devlink->ops;
+
+	if (ops && ops->sb_occ_max_clear)
+		return ops->sb_occ_max_clear(devlink, devlink_sb->index);
+	return -EOPNOTSUPP;
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
@@ -1439,6 +1507,24 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.internal_flags = DEVLINK_NL_FLAG_NEED_PORT |
 				  DEVLINK_NL_FLAG_NEED_SB,
 	},
+	{
+		.cmd = DEVLINK_CMD_SB_OCC_SNAPSHOT,
+		.doit = devlink_nl_cmd_sb_occ_snapshot_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK |
+				  DEVLINK_NL_FLAG_NEED_SB |
+				  DEVLINK_NL_FLAG_LOCK_PORTS,
+	},
+	{
+		.cmd = DEVLINK_CMD_SB_OCC_MAX_CLEAR,
+		.doit = devlink_nl_cmd_sb_occ_max_clear_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK |
+				  DEVLINK_NL_FLAG_NEED_SB |
+				  DEVLINK_NL_FLAG_LOCK_PORTS,
+	},
 };
 
 /**
-- 
2.5.5

^ permalink raw reply related

* [patch net-next 01/18] devlink: add shared buffer configuration
From: Jiri Pirko @ 2016-04-14 16:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, ogerlitz, roopa, nikolay, jhs,
	john.fastabend, rami.rosen, gospo, stephen, sfeldma
In-Reply-To: <1460650770-19382-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Define userspace API and drivers API for configuration of shared
buffers. Four basic objects are defined:
shared buffer - attributes are size, number of pools and TCs
pool - chunk of sharedbuffer definition, it has some size and either
       static or dynamic threshold
port pool threshold - to set per-port threshold for each pool
port tc threshold bind - to bind port and TC to specified pool
                         with threshold.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 include/net/devlink.h        |  47 +++
 include/uapi/linux/devlink.h |  57 +++
 net/core/devlink.c           | 940 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1044 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index c37d257..e4c2747 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -24,6 +24,7 @@ struct devlink_ops;
 struct devlink {
 	struct list_head list;
 	struct list_head port_list;
+	struct list_head sb_list;
 	const struct devlink_ops *ops;
 	struct device *dev;
 	possible_net_t _net;
@@ -42,6 +43,12 @@ struct devlink_port {
 	u32 split_group;
 };
 
+struct devlink_sb_pool_info {
+	enum devlink_sb_pool_type pool_type;
+	u32 size;
+	enum devlink_sb_threshold_type threshold_type;
+};
+
 struct devlink_ops {
 	size_t priv_size;
 	int (*port_type_set)(struct devlink_port *devlink_port,
@@ -49,6 +56,28 @@ struct devlink_ops {
 	int (*port_split)(struct devlink *devlink, unsigned int port_index,
 			  unsigned int count);
 	int (*port_unsplit)(struct devlink *devlink, unsigned int port_index);
+	int (*sb_pool_get)(struct devlink *devlink, unsigned int sb_index,
+			   u16 pool_index,
+			   struct devlink_sb_pool_info *pool_info);
+	int (*sb_pool_set)(struct devlink *devlink, unsigned int sb_index,
+			   u16 pool_index, u32 size,
+			   enum devlink_sb_threshold_type threshold_type);
+	int (*sb_port_pool_get)(struct devlink_port *devlink_port,
+				unsigned int sb_index, u16 pool_index,
+				u32 *p_threshold);
+	int (*sb_port_pool_set)(struct devlink_port *devlink_port,
+				unsigned int sb_index, u16 pool_index,
+				u32 threshold);
+	int (*sb_tc_pool_bind_get)(struct devlink_port *devlink_port,
+				   unsigned int sb_index,
+				   u16 tc_index,
+				   enum devlink_sb_pool_type pool_type,
+				   u16 *p_pool_index, u32 *p_threshold);
+	int (*sb_tc_pool_bind_set)(struct devlink_port *devlink_port,
+				   unsigned int sb_index,
+				   u16 tc_index,
+				   enum devlink_sb_pool_type pool_type,
+				   u16 pool_index, u32 threshold);
 };
 
 static inline void *devlink_priv(struct devlink *devlink)
@@ -82,6 +111,11 @@ void devlink_port_type_ib_set(struct devlink_port *devlink_port,
 void devlink_port_type_clear(struct devlink_port *devlink_port);
 void devlink_port_split_set(struct devlink_port *devlink_port,
 			    u32 split_group);
+int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
+			u32 size, u16 ingress_pools_count,
+			u16 egress_pools_count, u16 ingress_tc_count,
+			u16 egress_tc_count);
+void devlink_sb_unregister(struct devlink *devlink, unsigned int sb_index);
 
 #else
 
@@ -135,6 +169,19 @@ static inline void devlink_port_split_set(struct devlink_port *devlink_port,
 {
 }
 
+static inline int devlink_sb_register(struct devlink *devlink,
+				      unsigned int sb_index, u32 size,
+				      u16 ingress_pools_count,
+				      u16 egress_pools_count, u16 tc_count)
+{
+	return 0;
+}
+
+static inline void devlink_sb_unregister(struct devlink *devlink,
+					 unsigned int sb_index)
+{
+}
+
 #endif
 
 #endif /* _NET_DEVLINK_H_ */
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index c9fee57..9c1aa57 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -33,6 +33,26 @@ enum devlink_command {
 	DEVLINK_CMD_PORT_SPLIT,
 	DEVLINK_CMD_PORT_UNSPLIT,
 
+	DEVLINK_CMD_SB_GET,		/* can dump */
+	DEVLINK_CMD_SB_SET,
+	DEVLINK_CMD_SB_NEW,
+	DEVLINK_CMD_SB_DEL,
+
+	DEVLINK_CMD_SB_POOL_GET,	/* can dump */
+	DEVLINK_CMD_SB_POOL_SET,
+	DEVLINK_CMD_SB_POOL_NEW,
+	DEVLINK_CMD_SB_POOL_DEL,
+
+	DEVLINK_CMD_SB_PORT_POOL_GET,	/* can dump */
+	DEVLINK_CMD_SB_PORT_POOL_SET,
+	DEVLINK_CMD_SB_PORT_POOL_NEW,
+	DEVLINK_CMD_SB_PORT_POOL_DEL,
+
+	DEVLINK_CMD_SB_TC_POOL_BIND_GET,	/* can dump */
+	DEVLINK_CMD_SB_TC_POOL_BIND_SET,
+	DEVLINK_CMD_SB_TC_POOL_BIND_NEW,
+	DEVLINK_CMD_SB_TC_POOL_BIND_DEL,
+
 	/* add new commands above here */
 
 	__DEVLINK_CMD_MAX,
@@ -46,6 +66,31 @@ enum devlink_port_type {
 	DEVLINK_PORT_TYPE_IB,
 };
 
+enum devlink_sb_pool_type {
+	DEVLINK_SB_POOL_TYPE_INGRESS,
+	DEVLINK_SB_POOL_TYPE_EGRESS,
+};
+
+/* static threshold - limiting the maximum number of bytes.
+ * dynamic threshold - limiting the maximum number of bytes
+ *   based on the currently available free space in the shared buffer pool.
+ *   In this mode, the maximum quota is calculated based
+ *   on the following formula:
+ *     max_quota = alpha / (1 + alpha) * Free_Buffer
+ *   While Free_Buffer is the amount of none-occupied buffer associated to
+ *   the relevant pool.
+ *   The value range which can be passed is 0-20 and serves
+ *   for computation of alpha by following formula:
+ *     alpha = 2 ^ (passed_value - 10)
+ */
+
+enum devlink_sb_threshold_type {
+	DEVLINK_SB_THRESHOLD_TYPE_STATIC,
+	DEVLINK_SB_THRESHOLD_TYPE_DYNAMIC,
+};
+
+#define DEVLINK_SB_THRESHOLD_TO_ALPHA_MAX 20
+
 enum devlink_attr {
 	/* don't change the order or add anything between, this is ABI! */
 	DEVLINK_ATTR_UNSPEC,
@@ -62,6 +107,18 @@ enum devlink_attr {
 	DEVLINK_ATTR_PORT_IBDEV_NAME,		/* string */
 	DEVLINK_ATTR_PORT_SPLIT_COUNT,		/* u32 */
 	DEVLINK_ATTR_PORT_SPLIT_GROUP,		/* u32 */
+	DEVLINK_ATTR_SB_INDEX,			/* u32 */
+	DEVLINK_ATTR_SB_SIZE,			/* u32 */
+	DEVLINK_ATTR_SB_INGRESS_POOL_COUNT,	/* u16 */
+	DEVLINK_ATTR_SB_EGRESS_POOL_COUNT,	/* u16 */
+	DEVLINK_ATTR_SB_INGRESS_TC_COUNT,	/* u16 */
+	DEVLINK_ATTR_SB_EGRESS_TC_COUNT,	/* u16 */
+	DEVLINK_ATTR_SB_POOL_INDEX,		/* u16 */
+	DEVLINK_ATTR_SB_POOL_TYPE,		/* u8 */
+	DEVLINK_ATTR_SB_POOL_SIZE,		/* u32 */
+	DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE,	/* u8 */
+	DEVLINK_ATTR_SB_THRESHOLD,		/* u32 */
+	DEVLINK_ATTR_SB_TC_INDEX,		/* u16 */
 
 	/* add new attributes above here, update the policy in devlink.c */
 
diff --git a/net/core/devlink.c b/net/core/devlink.c
index b84cf0d..aa0b9e1 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -119,8 +119,167 @@ static struct devlink_port *devlink_port_get_from_info(struct devlink *devlink,
 	return devlink_port_get_from_attrs(devlink, info->attrs);
 }
 
+struct devlink_sb {
+	struct list_head list;
+	unsigned int index;
+	u32 size;
+	u16 ingress_pools_count;
+	u16 egress_pools_count;
+	u16 ingress_tc_count;
+	u16 egress_tc_count;
+};
+
+static u16 devlink_sb_pool_count(struct devlink_sb *devlink_sb)
+{
+	return devlink_sb->ingress_pools_count + devlink_sb->egress_pools_count;
+}
+
+static struct devlink_sb *devlink_sb_get_by_index(struct devlink *devlink,
+						  unsigned int sb_index)
+{
+	struct devlink_sb *devlink_sb;
+
+	list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
+		if (devlink_sb->index == sb_index)
+			return devlink_sb;
+	}
+	return NULL;
+}
+
+static bool devlink_sb_index_exists(struct devlink *devlink,
+				    unsigned int sb_index)
+{
+	return devlink_sb_get_by_index(devlink, sb_index);
+}
+
+static struct devlink_sb *devlink_sb_get_from_attrs(struct devlink *devlink,
+						    struct nlattr **attrs)
+{
+	if (attrs[DEVLINK_ATTR_SB_INDEX]) {
+		u32 sb_index = nla_get_u32(attrs[DEVLINK_ATTR_SB_INDEX]);
+		struct devlink_sb *devlink_sb;
+
+		devlink_sb = devlink_sb_get_by_index(devlink, sb_index);
+		if (!devlink_sb)
+			return ERR_PTR(-ENODEV);
+		return devlink_sb;
+	}
+	return ERR_PTR(-EINVAL);
+}
+
+static struct devlink_sb *devlink_sb_get_from_info(struct devlink *devlink,
+						   struct genl_info *info)
+{
+	return devlink_sb_get_from_attrs(devlink, info->attrs);
+}
+
+static int devlink_sb_pool_index_get_from_attrs(struct devlink_sb *devlink_sb,
+						struct nlattr **attrs,
+						u16 *p_pool_index)
+{
+	u16 val;
+
+	if (!attrs[DEVLINK_ATTR_SB_POOL_INDEX])
+		return -EINVAL;
+
+	val = nla_get_u16(attrs[DEVLINK_ATTR_SB_POOL_INDEX]);
+	if (val >= devlink_sb_pool_count(devlink_sb))
+		return -EINVAL;
+	*p_pool_index = val;
+	return 0;
+}
+
+static int devlink_sb_pool_index_get_from_info(struct devlink_sb *devlink_sb,
+					       struct genl_info *info,
+					       u16 *p_pool_index)
+{
+	return devlink_sb_pool_index_get_from_attrs(devlink_sb, info->attrs,
+						    p_pool_index);
+}
+
+static int
+devlink_sb_pool_type_get_from_attrs(struct nlattr **attrs,
+				    enum devlink_sb_pool_type *p_pool_type)
+{
+	u8 val;
+
+	if (!attrs[DEVLINK_ATTR_SB_POOL_TYPE])
+		return -EINVAL;
+
+	val = nla_get_u8(attrs[DEVLINK_ATTR_SB_POOL_TYPE]);
+	if (val != DEVLINK_SB_POOL_TYPE_INGRESS &&
+	    val != DEVLINK_SB_POOL_TYPE_EGRESS)
+		return -EINVAL;
+	*p_pool_type = val;
+	return 0;
+}
+
+static int
+devlink_sb_pool_type_get_from_info(struct genl_info *info,
+				   enum devlink_sb_pool_type *p_pool_type)
+{
+	return devlink_sb_pool_type_get_from_attrs(info->attrs, p_pool_type);
+}
+
+static int
+devlink_sb_th_type_get_from_attrs(struct nlattr **attrs,
+				  enum devlink_sb_threshold_type *p_th_type)
+{
+	u8 val;
+
+	if (!attrs[DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE])
+		return -EINVAL;
+
+	val = nla_get_u8(attrs[DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE]);
+	if (val != DEVLINK_SB_THRESHOLD_TYPE_STATIC &&
+	    val != DEVLINK_SB_THRESHOLD_TYPE_DYNAMIC)
+		return -EINVAL;
+	*p_th_type = val;
+	return 0;
+}
+
+static int
+devlink_sb_th_type_get_from_info(struct genl_info *info,
+				 enum devlink_sb_threshold_type *p_th_type)
+{
+	return devlink_sb_th_type_get_from_attrs(info->attrs, p_th_type);
+}
+
+static int
+devlink_sb_tc_index_get_from_attrs(struct devlink_sb *devlink_sb,
+				   struct nlattr **attrs,
+				   enum devlink_sb_pool_type pool_type,
+				   u16 *p_tc_index)
+{
+	u16 val;
+
+	if (!attrs[DEVLINK_ATTR_SB_TC_INDEX])
+		return -EINVAL;
+
+	val = nla_get_u16(attrs[DEVLINK_ATTR_SB_TC_INDEX]);
+	if (pool_type == DEVLINK_SB_POOL_TYPE_INGRESS &&
+	    val >= devlink_sb->ingress_tc_count)
+		return -EINVAL;
+	if (pool_type == DEVLINK_SB_POOL_TYPE_EGRESS &&
+	    val >= devlink_sb->egress_tc_count)
+		return -EINVAL;
+	*p_tc_index = val;
+	return 0;
+}
+
+static int
+devlink_sb_tc_index_get_from_info(struct devlink_sb *devlink_sb,
+				  struct genl_info *info,
+				  enum devlink_sb_pool_type pool_type,
+				  u16 *p_tc_index)
+{
+	return devlink_sb_tc_index_get_from_attrs(devlink_sb, info->attrs,
+						  pool_type, p_tc_index);
+}
+
 #define DEVLINK_NL_FLAG_NEED_DEVLINK	BIT(0)
 #define DEVLINK_NL_FLAG_NEED_PORT	BIT(1)
+#define DEVLINK_NL_FLAG_NEED_SB		BIT(2)
 
 static int devlink_nl_pre_doit(const struct genl_ops *ops,
 			       struct sk_buff *skb, struct genl_info *info)
@@ -147,6 +306,18 @@ static int devlink_nl_pre_doit(const struct genl_ops *ops,
 		}
 		info->user_ptr[0] = devlink_port;
 	}
+	if (ops->internal_flags & DEVLINK_NL_FLAG_NEED_SB) {
+		struct devlink_sb *devlink_sb;
+
+		devlink_sb = devlink_sb_get_from_info(devlink, info);
+		if (IS_ERR(devlink_sb)) {
+			if (ops->internal_flags & DEVLINK_NL_FLAG_NEED_PORT)
+				mutex_unlock(&devlink_port_mutex);
+			mutex_unlock(&devlink_mutex);
+			return PTR_ERR(devlink_sb);
+		}
+		info->user_ptr[1] = devlink_sb;
+	}
 	return 0;
 }
 
@@ -499,12 +670,675 @@ static int devlink_nl_cmd_port_unsplit_doit(struct sk_buff *skb,
 	return devlink_port_unsplit(devlink, port_index);
 }
 
+static int devlink_nl_sb_fill(struct sk_buff *msg, struct devlink *devlink,
+			      struct devlink_sb *devlink_sb,
+			      enum devlink_command cmd, u32 portid,
+			      u32 seq, int flags)
+{
+	void *hdr;
+
+	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	if (devlink_nl_put_handle(msg, devlink))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_INDEX, devlink_sb->index))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_SIZE, devlink_sb->size))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_INGRESS_POOL_COUNT,
+			devlink_sb->ingress_pools_count))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_EGRESS_POOL_COUNT,
+			devlink_sb->egress_pools_count))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_INGRESS_TC_COUNT,
+			devlink_sb->ingress_tc_count))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_EGRESS_TC_COUNT,
+			devlink_sb->egress_tc_count))
+		goto nla_put_failure;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+nla_put_failure:
+	genlmsg_cancel(msg, hdr);
+	return -EMSGSIZE;
+}
+
+static int devlink_nl_cmd_sb_get_doit(struct sk_buff *skb,
+				      struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	struct sk_buff *msg;
+	int err;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	err = devlink_nl_sb_fill(msg, devlink, devlink_sb,
+				 DEVLINK_CMD_SB_NEW,
+				 info->snd_portid, info->snd_seq, 0);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+
+static int devlink_nl_cmd_sb_get_dumpit(struct sk_buff *msg,
+					struct netlink_callback *cb)
+{
+	struct devlink *devlink;
+	struct devlink_sb *devlink_sb;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&devlink_mutex);
+	list_for_each_entry(devlink, &devlink_list, list) {
+		if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+			continue;
+		list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
+			if (idx < start) {
+				idx++;
+				continue;
+			}
+			err = devlink_nl_sb_fill(msg, devlink, devlink_sb,
+						 DEVLINK_CMD_SB_NEW,
+						 NETLINK_CB(cb->skb).portid,
+						 cb->nlh->nlmsg_seq,
+						 NLM_F_MULTI);
+			if (err)
+				goto out;
+			idx++;
+		}
+	}
+out:
+	mutex_unlock(&devlink_mutex);
+
+	cb->args[0] = idx;
+	return msg->len;
+}
+
+static int devlink_nl_sb_pool_fill(struct sk_buff *msg, struct devlink *devlink,
+				   struct devlink_sb *devlink_sb,
+				   u16 pool_index, enum devlink_command cmd,
+				   u32 portid, u32 seq, int flags)
+{
+	struct devlink_sb_pool_info pool_info;
+	void *hdr;
+	int err;
+
+	err = devlink->ops->sb_pool_get(devlink, devlink_sb->index,
+					pool_index, &pool_info);
+	if (err)
+		return err;
+
+	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	if (devlink_nl_put_handle(msg, devlink))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_INDEX, devlink_sb->index))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_POOL_INDEX, pool_index))
+		goto nla_put_failure;
+	if (nla_put_u8(msg, DEVLINK_ATTR_SB_POOL_TYPE, pool_info.pool_type))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_POOL_SIZE, pool_info.size))
+		goto nla_put_failure;
+	if (nla_put_u8(msg, DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE,
+		       pool_info.threshold_type))
+		goto nla_put_failure;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+nla_put_failure:
+	genlmsg_cancel(msg, hdr);
+	return -EMSGSIZE;
+}
+
+static int devlink_nl_cmd_sb_pool_get_doit(struct sk_buff *skb,
+					   struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	struct sk_buff *msg;
+	u16 pool_index;
+	int err;
+
+	err = devlink_sb_pool_index_get_from_info(devlink_sb, info,
+						  &pool_index);
+	if (err)
+		return err;
+
+	if (!devlink->ops || !devlink->ops->sb_pool_get)
+		return -EOPNOTSUPP;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	err = devlink_nl_sb_pool_fill(msg, devlink, devlink_sb, pool_index,
+				      DEVLINK_CMD_SB_POOL_NEW,
+				      info->snd_portid, info->snd_seq, 0);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+
+static int __sb_pool_get_dumpit(struct sk_buff *msg, int start, int *p_idx,
+				struct devlink *devlink,
+				struct devlink_sb *devlink_sb,
+				u32 portid, u32 seq)
+{
+	u16 pool_count = devlink_sb_pool_count(devlink_sb);
+	u16 pool_index;
+	int err;
+
+	for (pool_index = 0; pool_index < pool_count; pool_index++) {
+		if (*p_idx < start) {
+			(*p_idx)++;
+			continue;
+		}
+		err = devlink_nl_sb_pool_fill(msg, devlink,
+					      devlink_sb,
+					      pool_index,
+					      DEVLINK_CMD_SB_POOL_NEW,
+					      portid, seq, NLM_F_MULTI);
+		if (err)
+			return err;
+		(*p_idx)++;
+	}
+	return 0;
+}
+
+static int devlink_nl_cmd_sb_pool_get_dumpit(struct sk_buff *msg,
+					     struct netlink_callback *cb)
+{
+	struct devlink *devlink;
+	struct devlink_sb *devlink_sb;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&devlink_mutex);
+	list_for_each_entry(devlink, &devlink_list, list) {
+		if (!net_eq(devlink_net(devlink), sock_net(msg->sk)) ||
+		    !devlink->ops || !devlink->ops->sb_pool_get)
+			continue;
+		list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
+			err = __sb_pool_get_dumpit(msg, start, &idx, devlink,
+						   devlink_sb,
+						   NETLINK_CB(cb->skb).portid,
+						   cb->nlh->nlmsg_seq);
+			if (err && err != -EOPNOTSUPP)
+				goto out;
+		}
+	}
+out:
+	mutex_unlock(&devlink_mutex);
+
+	cb->args[0] = idx;
+	return msg->len;
+}
+
+static int devlink_sb_pool_set(struct devlink *devlink, unsigned int sb_index,
+			       u16 pool_index, u32 size,
+			       enum devlink_sb_threshold_type threshold_type)
+
+{
+	const struct devlink_ops *ops = devlink->ops;
+
+	if (ops && ops->sb_pool_set)
+		return ops->sb_pool_set(devlink, sb_index, pool_index,
+					size, threshold_type);
+	return -EOPNOTSUPP;
+}
+
+static int devlink_nl_cmd_sb_pool_set_doit(struct sk_buff *skb,
+					   struct genl_info *info)
+{
+	struct devlink *devlink = info->user_ptr[0];
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	enum devlink_sb_threshold_type threshold_type;
+	u16 pool_index;
+	u32 size;
+	int err;
+
+	err = devlink_sb_pool_index_get_from_info(devlink_sb, info,
+						  &pool_index);
+	if (err)
+		return err;
+
+	err = devlink_sb_th_type_get_from_info(info, &threshold_type);
+	if (err)
+		return err;
+
+	if (!info->attrs[DEVLINK_ATTR_SB_POOL_SIZE])
+		return -EINVAL;
+
+	size = nla_get_u32(info->attrs[DEVLINK_ATTR_SB_POOL_SIZE]);
+	return devlink_sb_pool_set(devlink, devlink_sb->index,
+				   pool_index, size, threshold_type);
+}
+
+static int devlink_nl_sb_port_pool_fill(struct sk_buff *msg,
+					struct devlink *devlink,
+					struct devlink_port *devlink_port,
+					struct devlink_sb *devlink_sb,
+					u16 pool_index,
+					enum devlink_command cmd,
+					u32 portid, u32 seq, int flags)
+{
+	u32 threshold;
+	void *hdr;
+	int err;
+
+	err = devlink->ops->sb_port_pool_get(devlink_port, devlink_sb->index,
+					     pool_index, &threshold);
+	if (err)
+		return err;
+
+	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	if (devlink_nl_put_handle(msg, devlink))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_PORT_INDEX, devlink_port->index))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_INDEX, devlink_sb->index))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_POOL_INDEX, pool_index))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_THRESHOLD, threshold))
+		goto nla_put_failure;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+nla_put_failure:
+	genlmsg_cancel(msg, hdr);
+	return -EMSGSIZE;
+}
+
+static int devlink_nl_cmd_sb_port_pool_get_doit(struct sk_buff *skb,
+						struct genl_info *info)
+{
+	struct devlink_port *devlink_port = info->user_ptr[0];
+	struct devlink *devlink = devlink_port->devlink;
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	struct sk_buff *msg;
+	u16 pool_index;
+	int err;
+
+	err = devlink_sb_pool_index_get_from_info(devlink_sb, info,
+						  &pool_index);
+	if (err)
+		return err;
+
+	if (!devlink->ops || !devlink->ops->sb_port_pool_get)
+		return -EOPNOTSUPP;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	err = devlink_nl_sb_port_pool_fill(msg, devlink, devlink_port,
+					   devlink_sb, pool_index,
+					   DEVLINK_CMD_SB_PORT_POOL_NEW,
+					   info->snd_portid, info->snd_seq, 0);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+
+static int __sb_port_pool_get_dumpit(struct sk_buff *msg, int start, int *p_idx,
+				     struct devlink *devlink,
+				     struct devlink_sb *devlink_sb,
+				     u32 portid, u32 seq)
+{
+	struct devlink_port *devlink_port;
+	u16 pool_count = devlink_sb_pool_count(devlink_sb);
+	u16 pool_index;
+	int err;
+
+	list_for_each_entry(devlink_port, &devlink->port_list, list) {
+		for (pool_index = 0; pool_index < pool_count; pool_index++) {
+			if (*p_idx < start) {
+				(*p_idx)++;
+				continue;
+			}
+			err = devlink_nl_sb_port_pool_fill(msg, devlink,
+							   devlink_port,
+							   devlink_sb,
+							   pool_index,
+							   DEVLINK_CMD_SB_PORT_POOL_NEW,
+							   portid, seq,
+							   NLM_F_MULTI);
+			if (err)
+				return err;
+			(*p_idx)++;
+		}
+	}
+	return 0;
+}
+
+static int devlink_nl_cmd_sb_port_pool_get_dumpit(struct sk_buff *msg,
+						  struct netlink_callback *cb)
+{
+	struct devlink *devlink;
+	struct devlink_sb *devlink_sb;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&devlink_mutex);
+	mutex_lock(&devlink_port_mutex);
+	list_for_each_entry(devlink, &devlink_list, list) {
+		if (!net_eq(devlink_net(devlink), sock_net(msg->sk)) ||
+		    !devlink->ops || !devlink->ops->sb_port_pool_get)
+			continue;
+		list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
+			err = __sb_port_pool_get_dumpit(msg, start, &idx,
+							devlink, devlink_sb,
+							NETLINK_CB(cb->skb).portid,
+							cb->nlh->nlmsg_seq);
+			if (err && err != -EOPNOTSUPP)
+				goto out;
+		}
+	}
+out:
+	mutex_unlock(&devlink_port_mutex);
+	mutex_unlock(&devlink_mutex);
+
+	cb->args[0] = idx;
+	return msg->len;
+}
+
+static int devlink_sb_port_pool_set(struct devlink_port *devlink_port,
+				    unsigned int sb_index, u16 pool_index,
+				    u32 threshold)
+
+{
+	const struct devlink_ops *ops = devlink_port->devlink->ops;
+
+	if (ops && ops->sb_port_pool_set)
+		return ops->sb_port_pool_set(devlink_port, sb_index,
+					     pool_index, threshold);
+	return -EOPNOTSUPP;
+}
+
+static int devlink_nl_cmd_sb_port_pool_set_doit(struct sk_buff *skb,
+						struct genl_info *info)
+{
+	struct devlink_port *devlink_port = info->user_ptr[0];
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	u16 pool_index;
+	u32 threshold;
+	int err;
+
+	err = devlink_sb_pool_index_get_from_info(devlink_sb, info,
+						  &pool_index);
+	if (err)
+		return err;
+
+	if (!info->attrs[DEVLINK_ATTR_SB_THRESHOLD])
+		return -EINVAL;
+
+	threshold = nla_get_u32(info->attrs[DEVLINK_ATTR_SB_THRESHOLD]);
+	return devlink_sb_port_pool_set(devlink_port, devlink_sb->index,
+					pool_index, threshold);
+}
+
+static int
+devlink_nl_sb_tc_pool_bind_fill(struct sk_buff *msg, struct devlink *devlink,
+				struct devlink_port *devlink_port,
+				struct devlink_sb *devlink_sb, u16 tc_index,
+				enum devlink_sb_pool_type pool_type,
+				enum devlink_command cmd,
+				u32 portid, u32 seq, int flags)
+{
+	u16 pool_index;
+	u32 threshold;
+	void *hdr;
+	int err;
+
+	err = devlink->ops->sb_tc_pool_bind_get(devlink_port, devlink_sb->index,
+						tc_index, pool_type,
+						&pool_index, &threshold);
+	if (err)
+		return err;
+
+	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	if (devlink_nl_put_handle(msg, devlink))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_PORT_INDEX, devlink_port->index))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_INDEX, devlink_sb->index))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_TC_INDEX, tc_index))
+		goto nla_put_failure;
+	if (nla_put_u8(msg, DEVLINK_ATTR_SB_POOL_TYPE, pool_type))
+		goto nla_put_failure;
+	if (nla_put_u16(msg, DEVLINK_ATTR_SB_POOL_INDEX, pool_index))
+		goto nla_put_failure;
+	if (nla_put_u32(msg, DEVLINK_ATTR_SB_THRESHOLD, threshold))
+		goto nla_put_failure;
+
+	genlmsg_end(msg, hdr);
+	return 0;
+
+nla_put_failure:
+	genlmsg_cancel(msg, hdr);
+	return -EMSGSIZE;
+}
+
+static int devlink_nl_cmd_sb_tc_pool_bind_get_doit(struct sk_buff *skb,
+						   struct genl_info *info)
+{
+	struct devlink_port *devlink_port = info->user_ptr[0];
+	struct devlink *devlink = devlink_port->devlink;
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	struct sk_buff *msg;
+	enum devlink_sb_pool_type pool_type;
+	u16 tc_index;
+	int err;
+
+	err = devlink_sb_pool_type_get_from_info(info, &pool_type);
+	if (err)
+		return err;
+
+	err = devlink_sb_tc_index_get_from_info(devlink_sb, info,
+						pool_type, &tc_index);
+	if (err)
+		return err;
+
+	if (!devlink->ops || !devlink->ops->sb_tc_pool_bind_get)
+		return -EOPNOTSUPP;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	err = devlink_nl_sb_tc_pool_bind_fill(msg, devlink, devlink_port,
+					      devlink_sb, tc_index, pool_type,
+					      DEVLINK_CMD_SB_TC_POOL_BIND_NEW,
+					      info->snd_portid,
+					      info->snd_seq, 0);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+
+static int __sb_tc_pool_bind_get_dumpit(struct sk_buff *msg,
+					int start, int *p_idx,
+					struct devlink *devlink,
+					struct devlink_sb *devlink_sb,
+					u32 portid, u32 seq)
+{
+	struct devlink_port *devlink_port;
+	u16 tc_index;
+	int err;
+
+	list_for_each_entry(devlink_port, &devlink->port_list, list) {
+		for (tc_index = 0;
+		     tc_index < devlink_sb->ingress_tc_count; tc_index++) {
+			if (*p_idx < start) {
+				(*p_idx)++;
+				continue;
+			}
+			err = devlink_nl_sb_tc_pool_bind_fill(msg, devlink,
+							      devlink_port,
+							      devlink_sb,
+							      tc_index,
+							      DEVLINK_SB_POOL_TYPE_INGRESS,
+							      DEVLINK_CMD_SB_TC_POOL_BIND_NEW,
+							      portid, seq,
+							      NLM_F_MULTI);
+			if (err)
+				return err;
+			(*p_idx)++;
+		}
+		for (tc_index = 0;
+		     tc_index < devlink_sb->egress_tc_count; tc_index++) {
+			if (*p_idx < start) {
+				(*p_idx)++;
+				continue;
+			}
+			err = devlink_nl_sb_tc_pool_bind_fill(msg, devlink,
+							      devlink_port,
+							      devlink_sb,
+							      tc_index,
+							      DEVLINK_SB_POOL_TYPE_EGRESS,
+							      DEVLINK_CMD_SB_TC_POOL_BIND_NEW,
+							      portid, seq,
+							      NLM_F_MULTI);
+			if (err)
+				return err;
+			(*p_idx)++;
+		}
+	}
+	return 0;
+}
+
+static int
+devlink_nl_cmd_sb_tc_pool_bind_get_dumpit(struct sk_buff *msg,
+					  struct netlink_callback *cb)
+{
+	struct devlink *devlink;
+	struct devlink_sb *devlink_sb;
+	int start = cb->args[0];
+	int idx = 0;
+	int err;
+
+	mutex_lock(&devlink_mutex);
+	mutex_lock(&devlink_port_mutex);
+	list_for_each_entry(devlink, &devlink_list, list) {
+		if (!net_eq(devlink_net(devlink), sock_net(msg->sk)) ||
+		    !devlink->ops || !devlink->ops->sb_tc_pool_bind_get)
+			continue;
+		list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
+			err = __sb_tc_pool_bind_get_dumpit(msg, start, &idx,
+							   devlink,
+							   devlink_sb,
+							   NETLINK_CB(cb->skb).portid,
+							   cb->nlh->nlmsg_seq);
+			if (err && err != -EOPNOTSUPP)
+				goto out;
+		}
+	}
+out:
+	mutex_unlock(&devlink_port_mutex);
+	mutex_unlock(&devlink_mutex);
+
+	cb->args[0] = idx;
+	return msg->len;
+}
+
+static int devlink_sb_tc_pool_bind_set(struct devlink_port *devlink_port,
+				       unsigned int sb_index, u16 tc_index,
+				       enum devlink_sb_pool_type pool_type,
+				       u16 pool_index, u32 threshold)
+
+{
+	const struct devlink_ops *ops = devlink_port->devlink->ops;
+
+	if (ops && ops->sb_tc_pool_bind_set)
+		return ops->sb_tc_pool_bind_set(devlink_port, sb_index,
+						tc_index, pool_type,
+						pool_index, threshold);
+	return -EOPNOTSUPP;
+}
+
+static int devlink_nl_cmd_sb_tc_pool_bind_set_doit(struct sk_buff *skb,
+						   struct genl_info *info)
+{
+	struct devlink_port *devlink_port = info->user_ptr[0];
+	struct devlink_sb *devlink_sb = info->user_ptr[1];
+	enum devlink_sb_pool_type pool_type;
+	u16 tc_index;
+	u16 pool_index;
+	u32 threshold;
+	int err;
+
+	err = devlink_sb_pool_type_get_from_info(info, &pool_type);
+	if (err)
+		return err;
+
+	err = devlink_sb_tc_index_get_from_info(devlink_sb, info,
+						pool_type, &tc_index);
+	if (err)
+		return err;
+
+	err = devlink_sb_pool_index_get_from_info(devlink_sb, info,
+						  &pool_index);
+	if (err)
+		return err;
+
+	if (!info->attrs[DEVLINK_ATTR_SB_THRESHOLD])
+		return -EINVAL;
+
+	threshold = nla_get_u32(info->attrs[DEVLINK_ATTR_SB_THRESHOLD]);
+	return devlink_sb_tc_pool_bind_set(devlink_port, devlink_sb->index,
+					   tc_index, pool_type,
+					   pool_index, threshold);
+}
+
 static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32 },
 	[DEVLINK_ATTR_PORT_TYPE] = { .type = NLA_U16 },
 	[DEVLINK_ATTR_PORT_SPLIT_COUNT] = { .type = NLA_U32 },
+	[DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32 },
+	[DEVLINK_ATTR_SB_POOL_INDEX] = { .type = NLA_U16 },
+	[DEVLINK_ATTR_SB_POOL_TYPE] = { .type = NLA_U8 },
+	[DEVLINK_ATTR_SB_POOL_SIZE] = { .type = NLA_U32 },
+	[DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE] = { .type = NLA_U8 },
+	[DEVLINK_ATTR_SB_THRESHOLD] = { .type = NLA_U32 },
+	[DEVLINK_ATTR_SB_TC_INDEX] = { .type = NLA_U16 },
 };
 
 static const struct genl_ops devlink_nl_ops[] = {
@@ -545,6 +1379,66 @@ static const struct genl_ops devlink_nl_ops[] = {
 		.flags = GENL_ADMIN_PERM,
 		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK,
 	},
+	{
+		.cmd = DEVLINK_CMD_SB_GET,
+		.doit = devlink_nl_cmd_sb_get_doit,
+		.dumpit = devlink_nl_cmd_sb_get_dumpit,
+		.policy = devlink_nl_policy,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK |
+				  DEVLINK_NL_FLAG_NEED_SB,
+		/* can be retrieved by unprivileged users */
+	},
+	{
+		.cmd = DEVLINK_CMD_SB_POOL_GET,
+		.doit = devlink_nl_cmd_sb_pool_get_doit,
+		.dumpit = devlink_nl_cmd_sb_pool_get_dumpit,
+		.policy = devlink_nl_policy,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK |
+				  DEVLINK_NL_FLAG_NEED_SB,
+		/* can be retrieved by unprivileged users */
+	},
+	{
+		.cmd = DEVLINK_CMD_SB_POOL_SET,
+		.doit = devlink_nl_cmd_sb_pool_set_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK |
+				  DEVLINK_NL_FLAG_NEED_SB,
+	},
+	{
+		.cmd = DEVLINK_CMD_SB_PORT_POOL_GET,
+		.doit = devlink_nl_cmd_sb_port_pool_get_doit,
+		.dumpit = devlink_nl_cmd_sb_port_pool_get_dumpit,
+		.policy = devlink_nl_policy,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_PORT |
+				  DEVLINK_NL_FLAG_NEED_SB,
+		/* can be retrieved by unprivileged users */
+	},
+	{
+		.cmd = DEVLINK_CMD_SB_PORT_POOL_SET,
+		.doit = devlink_nl_cmd_sb_port_pool_set_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_PORT |
+				  DEVLINK_NL_FLAG_NEED_SB,
+	},
+	{
+		.cmd = DEVLINK_CMD_SB_TC_POOL_BIND_GET,
+		.doit = devlink_nl_cmd_sb_tc_pool_bind_get_doit,
+		.dumpit = devlink_nl_cmd_sb_tc_pool_bind_get_dumpit,
+		.policy = devlink_nl_policy,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_PORT |
+				  DEVLINK_NL_FLAG_NEED_SB,
+		/* can be retrieved by unprivileged users */
+	},
+	{
+		.cmd = DEVLINK_CMD_SB_TC_POOL_BIND_SET,
+		.doit = devlink_nl_cmd_sb_tc_pool_bind_set_doit,
+		.policy = devlink_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_PORT |
+				  DEVLINK_NL_FLAG_NEED_SB,
+	},
 };
 
 /**
@@ -566,6 +1460,7 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
 	devlink->ops = ops;
 	devlink_net_set(devlink, &init_net);
 	INIT_LIST_HEAD(&devlink->port_list);
+	INIT_LIST_HEAD(&devlink->sb_list);
 	return devlink;
 }
 EXPORT_SYMBOL_GPL(devlink_alloc);
@@ -721,6 +1616,51 @@ void devlink_port_split_set(struct devlink_port *devlink_port,
 }
 EXPORT_SYMBOL_GPL(devlink_port_split_set);
 
+int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
+			u32 size, u16 ingress_pools_count,
+			u16 egress_pools_count, u16 ingress_tc_count,
+			u16 egress_tc_count)
+{
+	struct devlink_sb *devlink_sb;
+	int err = 0;
+
+	mutex_lock(&devlink_mutex);
+	if (devlink_sb_index_exists(devlink, sb_index)) {
+		err = -EEXIST;
+		goto unlock;
+	}
+
+	devlink_sb = kzalloc(sizeof(*devlink_sb), GFP_KERNEL);
+	if (!devlink_sb) {
+		err = -ENOMEM;
+		goto unlock;
+	}
+	devlink_sb->index = sb_index;
+	devlink_sb->size = size;
+	devlink_sb->ingress_pools_count = ingress_pools_count;
+	devlink_sb->egress_pools_count = egress_pools_count;
+	devlink_sb->ingress_tc_count = ingress_tc_count;
+	devlink_sb->egress_tc_count = egress_tc_count;
+	list_add_tail(&devlink_sb->list, &devlink->sb_list);
+unlock:
+	mutex_unlock(&devlink_mutex);
+	return err;
+}
+EXPORT_SYMBOL_GPL(devlink_sb_register);
+
+void devlink_sb_unregister(struct devlink *devlink, unsigned int sb_index)
+{
+	struct devlink_sb *devlink_sb;
+
+	mutex_lock(&devlink_mutex);
+	devlink_sb = devlink_sb_get_by_index(devlink, sb_index);
+	WARN_ON(!devlink_sb);
+	list_del(&devlink_sb->list);
+	mutex_unlock(&devlink_mutex);
+	kfree(devlink_sb);
+}
+EXPORT_SYMBOL_GPL(devlink_sb_unregister);
+
 static int __init devlink_module_init(void)
 {
 	return genl_register_family_with_ops_groups(&devlink_nl_family,
-- 
2.5.5

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox