* [RFC PATCH v2 net-next 02/15] net: dsa: sja1105: Get rid of global declaration of struct ptp_clock_info
From: Vladimir Oltean @ 2019-08-30 0:46 UTC (permalink / raw)
To: f.fainelli, vivien.didelot, andrew, davem, vinicius.gomes,
vedang.patel, richardcochran
Cc: weifeng.voon, jiri, m-karicheri2, Jose.Abreu, ilias.apalodimas,
jhs, xiyou.wangcong, netdev, Vladimir Oltean
In-Reply-To: <20190830004635.24863-1-olteanv@gmail.com>
We need priv->ptp_caps to hold a structure and not just a pointer,
because we use container_of in the various PTP callbacks.
Therefore, the sja1105_ptp_caps structure declared in the global memory
of the driver serves no further purpose after copying it into
priv->ptp_caps.
So just populate priv->ptp_caps with the needed operations and remove
sja1105_ptp_caps.
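
As a minimal userspace sketch of why the structure has to be embedded rather than pointed to: container_of() works by subtracting the member's offset from the member's address, so the member must physically live inside the private structure. The struct and function names below are hypothetical stand-ins, not the driver's real definitions, and the macro is a simplified version of the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-in for struct ptp_clock_info. */
struct clock_info_sketch {
	const char *name;
	int max_adj;
};

/* Hypothetical stand-in for struct sja1105_private. */
struct priv_sketch {
	int id;
	struct clock_info_sketch ptp_caps;	/* embedded, not a pointer */
};

/* Populate the embedded copy in place with a compound literal. */
static void priv_init(struct priv_sketch *priv, int id)
{
	assert(priv != NULL);
	priv->id = id;
	priv->ptp_caps = (struct clock_info_sketch) {
		.name = "sketch PHC",
		.max_adj = 1000,
	};
}

/* A callback is handed only the embedded member and recovers its owner. */
static int priv_id_from_caps(struct clock_info_sketch *caps)
{
	struct priv_sketch *priv =
		container_of(caps, struct priv_sketch, ptp_caps);

	return priv->id;
}
```

If ptp_caps were a pointer to a shared global instead, the address handed to the callback would lie outside any priv_sketch and the subtraction would yield garbage.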
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
drivers/net/dsa/sja1105/sja1105_ptp.c | 29 +++++++++++++--------------
1 file changed, 14 insertions(+), 15 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index 07374ba6b9be..13f9f5799e46 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -343,29 +343,28 @@ static void sja1105_ptp_overflow_check(struct work_struct *work)
schedule_delayed_work(&priv->refresh_work, SJA1105_REFRESH_INTERVAL);
}
-static const struct ptp_clock_info sja1105_ptp_caps = {
- .owner = THIS_MODULE,
- .name = "SJA1105 PHC",
- .adjfine = sja1105_ptp_adjfine,
- .adjtime = sja1105_ptp_adjtime,
- .gettime64 = sja1105_ptp_gettime,
- .settime64 = sja1105_ptp_settime,
- .max_adj = SJA1105_MAX_ADJ_PPB,
-};
-
int sja1105_ptp_clock_register(struct sja1105_private *priv)
{
struct dsa_switch *ds = priv->ds;
/* Set up the cycle counter */
priv->tstamp_cc = (struct cyclecounter) {
- .read = sja1105_ptptsclk_read,
- .mask = CYCLECOUNTER_MASK(64),
- .shift = SJA1105_CC_SHIFT,
- .mult = SJA1105_CC_MULT,
+ .read = sja1105_ptptsclk_read,
+ .mask = CYCLECOUNTER_MASK(64),
+ .shift = SJA1105_CC_SHIFT,
+ .mult = SJA1105_CC_MULT,
+ };
+ priv->ptp_caps = (struct ptp_clock_info) {
+ .owner = THIS_MODULE,
+ .name = "SJA1105 PHC",
+ .adjfine = sja1105_ptp_adjfine,
+ .adjtime = sja1105_ptp_adjtime,
+ .gettime64 = sja1105_ptp_gettime,
+ .settime64 = sja1105_ptp_settime,
+ .max_adj = SJA1105_MAX_ADJ_PPB,
};
+
mutex_init(&priv->ptp_lock);
- priv->ptp_caps = sja1105_ptp_caps;
priv->clock = ptp_clock_register(&priv->ptp_caps, ds->dev);
if (IS_ERR_OR_NULL(priv->clock))
--
2.17.1
* [RFC PATCH v2 net-next 01/15] net: dsa: sja1105: Change the PTP command access pattern
From: Vladimir Oltean @ 2019-08-30 0:46 UTC (permalink / raw)
To: f.fainelli, vivien.didelot, andrew, davem, vinicius.gomes,
vedang.patel, richardcochran
Cc: weifeng.voon, jiri, m-karicheri2, Jose.Abreu, ilias.apalodimas,
jhs, xiyou.wangcong, netdev, Vladimir Oltean
In-Reply-To: <20190830004635.24863-1-olteanv@gmail.com>
The PTP command register contains enable bits for:
- Putting the 64-bit PTPCLKVAL register in add/subtract or write mode
- Taking timestamps off of the corrected vs free-running clock
- Starting/stopping the TTEthernet scheduling
- Starting/stopping PPS output
- Resetting the switch
When a command needs to be issued (e.g. "change the PTPCLKVAL from write
mode to add/subtract mode"), one cannot simply write to the command
register setting the PTPCLKADD bit to 1, because that would zeroize the
other settings. One also cannot do a read-modify-write (that would be
too easy for this hardware) because not all bits of the command register
are readable over SPI.
So this leaves us with the only option of keeping the value of the PTP
command register in the driver, and operating on that.
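
The approach described above is commonly called a shadow register. A minimal userspace sketch follows, with the caveat that the bit positions and all names except PTPCLKADD are assumptions made for illustration (the real layout is in the hardware manual), and that a plain struct field stands in for the SPI write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define CMD_PTPCLKADD	(UINT32_C(1) << 0)
#define CMD_CORRCLK4TS	(UINT32_C(1) << 1)
#define CMD_STARTSCH	(UINT32_C(1) << 2)

struct shadow_dev {
	uint32_t cmd_shadow;	/* driver-side copy of the command register */
	uint32_t hw_cmd;	/* stands in for the hardware register */
};

/* Stand-in for the SPI write: the whole register is always overwritten. */
static void hw_write_cmd(struct shadow_dev *dev, uint32_t val)
{
	dev->hw_cmd = val;
}

/* Change only the masked bits in the shadow copy, then write the full word. */
static void cmd_update(struct shadow_dev *dev, uint32_t mask, uint32_t val)
{
	assert(dev != NULL);
	dev->cmd_shadow = (dev->cmd_shadow & ~mask) | (val & mask);
	hw_write_cmd(dev, dev->cmd_shadow);
}
```

Because the hardware value is never read back, every writer has to go through cmd_update(), and a real driver would need to serialize those calls.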
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
drivers/net/dsa/sja1105/sja1105.h | 5 +++++
drivers/net/dsa/sja1105/sja1105_ptp.c | 6 +-----
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index 78094db32622..d8a92646e80a 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -50,6 +50,10 @@ struct sja1105_regs {
u64 qlevel[SJA1105_NUM_PORTS];
};
+struct sja1105_ptp_cmd {
+ u64 resptp; /* reset */
+};
+
struct sja1105_info {
u64 device_id;
/* Needed for distinction between P and R, and between Q and S
@@ -89,6 +93,7 @@ struct sja1105_private {
struct spi_device *spidev;
struct dsa_switch *ds;
struct sja1105_port ports[SJA1105_NUM_PORTS];
+ struct sja1105_ptp_cmd ptp_cmd;
struct ptp_clock_info ptp_caps;
struct ptp_clock *clock;
/* The cycle counter translates the PTP timestamps (based on
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index d8e8dd59f3d1..07374ba6b9be 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -54,10 +54,6 @@
#define cc_to_sja1105(d) container_of((d), struct sja1105_private, tstamp_cc)
#define dw_to_sja1105(d) container_of((d), struct sja1105_private, refresh_work)
-struct sja1105_ptp_cmd {
- u64 resptp; /* reset */
-};
-
int sja1105_get_ts_info(struct dsa_switch *ds, int port,
struct ethtool_ts_info *info)
{
@@ -218,8 +214,8 @@ int sja1105_ptpegr_ts_poll(struct sja1105_private *priv, int port, u64 *ts)
int sja1105_ptp_reset(struct sja1105_private *priv)
{
+ struct sja1105_ptp_cmd cmd = priv->ptp_cmd;
struct dsa_switch *ds = priv->ds;
- struct sja1105_ptp_cmd cmd = {0};
int rc;
mutex_lock(&priv->ptp_lock);
--
2.17.1
* [RFC PATCH v2 net-next 00/15] tc-taprio offload for SJA1105 DSA
From: Vladimir Oltean @ 2019-08-30 0:46 UTC (permalink / raw)
To: f.fainelli, vivien.didelot, andrew, davem, vinicius.gomes,
vedang.patel, richardcochran
Cc: weifeng.voon, jiri, m-karicheri2, Jose.Abreu, ilias.apalodimas,
jhs, xiyou.wangcong, netdev, Vladimir Oltean
This is the v2 of the patchset from July:
https://lists.openwall.net/netdev/2019/07/07/81
Changes:
- Adapted the taprio offload patch to work by specifying "flags 2" to
the iproute2-next tc. At the moment I don't clearly understand whether
the full offload and the txtime assist ("flags 1") are mutually
exclusive or not (i.e. whether a "flags 3" mode should be rejected,
which it currently isn't).
- Added reference counting to the taprio offload structure. Maybe the
function names and placement could have been better though. As for the
other complaint (cycle time calculation) it got fixed in the taprio
parser in the meantime.
- Converted sja1105 to use the hardware PTP registers, and save/restore
the PTP time across resets.
- Made the DSA callback for ndo_setup_tc a bit more generic, but I don't
know whether it fulfills expectations. Drivers still can't do blocking
operations in its execution context.
- Added a state machine for starting/stopping the scheduler based on the
last command run on the PTP clock.
For those who want to follow along with the hardware implementation, the
manual is here:
https://www.nxp.com/docs/en/user-guide/UM10944.pdf
Original cover letter:
Using Vinicius Costa Gomes' configuration interface for 802.1Qbv (later
resent by Voon Weifeng for the stmmac driver), I am submitting for
review a draft implementation of this offload for a DSA switch.
I don't want to insist too much on the hardware specifics of SJA1105
which isn't otherwise very compliant to the IEEE spec.
In order to be able to test with Vedang Patel's iproute2 patch for
taprio offload (https://www.spinics.net/lists/netdev/msg573072.html)
I had to actually revert the txtime-assist branch as it had changed the
iproute2 interface.
In terms of impact for DSA drivers, I would like to point out that:
- Maybe somebody should pre-populate qopt->cycle_time in case the user
does not provide one. Otherwise each driver needs to iterate over the
GCL once, just to set the cycle time (right now stmmac does as well).
- Configuring the switch over SPI cannot apparently be done from this
ndo_setup_tc callback because it runs in atomic context. I also have
some downstream patches to offload tc clsact matchall with mirred
action, but in that case it looks like the atomic context restriction
does not apply.
- I had to copy the struct tc_taprio_qopt_offload to driver private
memory because a static config needs to be constructed every time a
change takes place, and there are up to 4 switch ports that may take a
TAS configuration. I have created a private
tc_taprio_qopt_offload_copy() helper for this - I don't know whether
it's of any help in the general case.
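
To illustrate the first point in the list above, here is a rough sketch of the per-driver fallback: if the user gave no cycle time, walk the gate control list once and sum the entry intervals. The entry layout and function name are hypothetical simplifications, not the taprio offload structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified gate control list entry. */
struct gcl_entry_sketch {
	uint8_t gate_mask;	/* traffic classes whose gates are open */
	uint32_t interval_ns;	/* duration of this entry */
};

/*
 * Return the user-supplied cycle time if it is non-zero, otherwise
 * derive it as the sum of all entry intervals.
 */
static uint64_t gcl_cycle_time(const struct gcl_entry_sketch *gcl, size_t n,
			       uint64_t cycle_time)
{
	uint64_t sum = 0;
	size_t i;

	if (cycle_time)
		return cycle_time;

	assert(gcl != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += gcl[i].interval_ns;

	return sum;
}
```

Doing this once in the qdisc before calling into drivers would spare each driver the same loop.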
There is more to be done however. The TAS needs to be integrated with
the PTP driver. This is because with a PTP clock source, the base time
is written dynamically to the PTPSCHTM (PTP schedule time) register and
must be a time in the future. Then the "real" base time of each port's
TAS config can be offset by at most ~50 ms (the DELTA field from the
Schedule Entry Points Table) relative to PTPSCHTM.
Because base times in the past are completely ignored by this hardware,
we need to decide if it's ok behaviorally for a driver to "roll" a past
base time into the immediate future by incrementally adding the cycle
time (so the phase doesn't change). If it is, then decide by how long in
the future it is ok to do so. Or alternatively, is it preferable if the
driver errors out if the user-supplied base time is in the past and the
hardware doesn't like it? But even then, there might be fringe cases
when the base time becomes a past PTP time right as the driver tries to
apply the config.
Also applying a tc-taprio offload to a second SJA1105 switch port will
inevitably need to roll the first port's (now past) base time into an
equivalent future time.
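
The "rolling" of a past base time can be sketched as plain integer arithmetic: add the smallest whole number of cycles that moves the base time strictly past the current time, so the phase of the schedule is preserved. The function name is hypothetical, times are assumed to be nanoseconds in a signed 64-bit value, and overflow is not handled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * If base is already in the future, keep it. Otherwise advance it by
 * the smallest multiple of cycle that lands strictly after now.
 */
static int64_t roll_base_time(int64_t base, int64_t cycle, int64_t now)
{
	int64_t n;

	assert(cycle > 0);
	if (base > now)
		return base;

	/* Whole cycles elapsed since base, plus one to get past now. */
	n = (now - base) / cycle + 1;

	return base + n * cycle;
}
```

For example, with base = 100, cycle = 30 and now = 175, two full cycles have elapsed, so the result is 100 + 3 * 30 = 190. This does not address the race mentioned above, where the rolled time may itself become a past time before the hardware is programmed; a real implementation would likely add some margin to now.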
All of this is going to be complicated even further by the fact that
resetting the switch (to apply the tc-taprio offload) makes it reset its
PTP time.
Vinicius Costa Gomes (1):
taprio: Add support for hardware offloading
Vladimir Oltean (14):
net: dsa: sja1105: Change the PTP command access pattern
net: dsa: sja1105: Get rid of global declaration of struct
ptp_clock_info
net: dsa: sja1105: Switch to hardware operations for PTP
net: dsa: sja1105: Implement the .gettimex64 system call for PTP
net: dsa: sja1105: Restore PTP time after switch reset
net: dsa: sja1105: Disallow management xmit during switch reset
net: dsa: sja1105: Move PTP data to its own private structure
net: dsa: sja1105: Advertise the 8 TX queues
net: dsa: Pass ndo_setup_tc slave callback to drivers
net: dsa: sja1105: Add static config tables for scheduling
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio
offload
net: dsa: sja1105: Make HOSTPRIO a kernel config
net: dsa: sja1105: Make the PTP command read-write
net: dsa: sja1105: Implement state machine for TAS with PTP clock
source
drivers/net/dsa/sja1105/Kconfig | 17 +
drivers/net/dsa/sja1105/Makefile | 4 +
drivers/net/dsa/sja1105/sja1105.h | 36 +-
.../net/dsa/sja1105/sja1105_dynamic_config.c | 8 +
drivers/net/dsa/sja1105/sja1105_main.c | 94 +-
drivers/net/dsa/sja1105/sja1105_ptp.c | 345 ++++---
drivers/net/dsa/sja1105/sja1105_ptp.h | 103 ++-
drivers/net/dsa/sja1105/sja1105_spi.c | 58 +-
.../net/dsa/sja1105/sja1105_static_config.c | 167 ++++
.../net/dsa/sja1105/sja1105_static_config.h | 48 +-
drivers/net/dsa/sja1105/sja1105_tas.c | 851 ++++++++++++++++++
drivers/net/dsa/sja1105/sja1105_tas.h | 74 ++
include/linux/netdevice.h | 1 +
include/net/dsa.h | 3 +
include/net/pkt_sched.h | 33 +
include/uapi/linux/pkt_sched.h | 3 +-
net/dsa/slave.c | 12 +-
net/dsa/tag_sja1105.c | 3 +-
net/sched/sch_taprio.c | 246 ++++-
19 files changed, 1883 insertions(+), 223 deletions(-)
create mode 100644 drivers/net/dsa/sja1105/sja1105_tas.c
create mode 100644 drivers/net/dsa/sja1105/sja1105_tas.h
--
2.17.1
* Re: ANNOUNCE: rpld an another RPL implementation for Linux
From: Reuben Hawkins @ 2019-08-30 0:40 UTC (permalink / raw)
To: Alexander Aring
Cc: open list:NETWORKING [GENERAL], Michael Richardson,
Jamal Hadi Salim, Robert Kaiser, Martin Gergeleit, Kai Beckmann,
koen, linux-wpan - ML, BlueZ development, Stefan Schmidt,
sebastian.meiling, Marcel Holtmann, Werner Almesberger,
Jukka Rissanen
In-Reply-To: <CAB_54W7h9ca0UJAZtk=ApPX-2ZCvzu4774BTFTaB5mtkobWCtw@mail.gmail.com>
There is a COPYRIGHT file in radvd which I just read for the first time today. It’s been there for 19+ years. Sounds very BSD to me. I’ve been maintainer for 9 years. As far as I’m concerned, you can do with the code whatever you like. Good luck. 👍
> On Aug 29, 2019, at 2:57 PM, Alexander Aring <alex.aring@gmail.com> wrote:
>
> Hi,
>
> I had some free time and wanted to know how RPL [0] works, so I wrote an
> implementation. It's _very_ basic as it only gives you a "routable"
> (is that a word?) thing afterwards in a very constrained setup of RPL
> messages.
>
> Took ~1 month to implement it and I reused some great code from radvd
> [1]. I released it under the same license (BSD?). Anyway, I know there
> are a lot of memory leaks and the parameters are just crazy, as in not
> practical in a real environment, BUT it works.
>
> I changed the dependencies a little compared to radvd (because fancy new things):
>
> - lua for config handling
> - libev for event loop handling
> - libmnl for netlink handling
>
> The code is available at:
>
> https://github.com/linux-wpan/rpld
>
> With a recent kernel (I think 4.19 and above) and necessary user space
> dependencies, just build it and run the start script. It will create
> some virtual IEEE 802.15.4 6LoWPAN interfaces and you can run
> traceroute from namespace ns0 (which is the RPL DODAG root) to any
> other node e.g. namespace ns5. With more knowledge of the scripts you
> can change the underlying topology, everybody is welcome to improve
> them.
>
> I will work more on it when I have time... to have at least something
> running means the real fun can begin (but it was already fun before).
>
> The big thing that everybody wants is source routing, which requires
> some control plane for RPL in the kernel to say how and when to put
> source routing headers in IPv6. I think I somehow know what's
> necessary now... but I didn't implement it; this simple
> implementation just fills up routing tables, as RPL supports storing
> (routing table) or non-storing (source routing) modes. People tell me
> to look at frrouting to see how they do source routing; I will if I
> get the chance.
>
> It doesn't run on Bluetooth yet, I know there exists a lack of UAPI to
> figure out the linklayer which is used by 6LoWPAN. I need somehow a
> SLAVE_INFO attribute in netlink to figure this out and tell me some
> 6LoWPAN specific attributes. I am sorry Bluetooth people, but I think
> you are also more interested in source routing because I heard
> somebody saying it's the more common approach outside (but I never saw
> any other RPL implementation than unstrung running).
>
> Also I did something in my masters thesis to make a better parent
> selection, if this implementation becomes stable I can look to get
> this migrated.
>
> Please, radvd maintainer, let me know if everything is okay from your
> side. As I said, I reused some code from radvd. I also operate on
> ICMPv6 sockets. The same goes for Michael Richardson's unstrung [2]. If there
> is anything to talk about or you have complaints, I can change it.
>
> Thanks, I really only wanted to get more knowledge about routing
> protocols and how to implement such. Besides all known issues, I still
> think it's a good starting point.
>
> - Alex
>
> [0] https://tools.ietf.org/html/rfc6550
> [1] https://github.com/reubenhwk/radvd
> [2] https://github.com/AnimaGUS-minerva/unstrung
* Re: [pull request][net-next v2 0/8] Mellanox, mlx5 updates 2019-08-22
From: David Miller @ 2019-08-30 0:25 UTC (permalink / raw)
To: saeedm; +Cc: netdev
In-Reply-To: <20190828185720.2300-1-saeedm@mellanox.com>
From: Saeed Mahameed <saeedm@mellanox.com>
Date: Wed, 28 Aug 2019 18:57:39 +0000
> This series provides some misc updates to mlx5 driver.
> For more information please see tag log below.
>
> Please pull and let me know if there is any problem.
>
> Please note that the series starts with a merge of mlx5-next branch,
> to resolve and avoid dependency with rdma tree.
>
> v2:
> - Change statistics counter name to dev_internal_queue_oob as
> suggested by Jakub.
> - Fixed an issue with IP-in-IP TSO patch, found by regression testing.
Pulled, thanks.
* Re: [PATCH net-next] net: dsa: mv88e6xxx: fix freeing unused SERDES IRQ
From: David Miller @ 2019-08-30 0:24 UTC (permalink / raw)
To: vivien.didelot; +Cc: netdev, marek.behun, f.fainelli, andrew
In-Reply-To: <20190828185511.21956-1-vivien.didelot@gmail.com>
From: Vivien Didelot <vivien.didelot@gmail.com>
Date: Wed, 28 Aug 2019 14:55:11 -0400
> Now that mv88e6xxx does not enable its ports at setup itself and lets
> the DSA core handle this, unused ports are disabled without being
> powered on first. While that is expected, the SERDES powering code
> was assuming that a port was already set up before powering it down,
> resulting in freeing an unused IRQ. The patch fixes this assumption.
>
> Fixes: b759f528ca3d ("net: dsa: mv88e6xxx: enable SERDES after setup")
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Applied, thank you.
* Re: [iproute2, master 1/2] devlink: Print health reporter's dump time-stamp in a helper function
From: Stephen Hemminger @ 2019-08-29 23:25 UTC (permalink / raw)
To: Aya Levin; +Cc: netdev, Jiri Pirko, Moshe Shemesh
In-Reply-To: <1566471942-28529-2-git-send-email-ayal@mellanox.com>
On Thu, 22 Aug 2019 14:05:41 +0300
Aya Levin <ayal@mellanox.com> wrote:
> Add pr_out_dump_reporter prefix to the helper function's name and
> encapsulate the print in it.
>
> Fixes: 2f1242efe9d0 ("devlink: Add devlink health show command")
> Signed-off-by: Aya Levin <ayal@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
Looks fine, but devlink needs to be converted from doing JSON
printing its own way to using the common iproute2 libraries.
* Re: [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: David Miller @ 2019-08-30 0:20 UTC (permalink / raw)
To: vivien.didelot; +Cc: netdev, marek.behun, f.fainelli, andrew
In-Reply-To: <20190828162659.10306-1-vivien.didelot@gmail.com>
From: Vivien Didelot <vivien.didelot@gmail.com>
Date: Wed, 28 Aug 2019 12:26:59 -0400
> This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
> mv88e6xxx: fully support SERDES on Topaz family").
>
> Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
> mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
> code private to the mv88e6341_port_set_cmode implementation, instead
> of adding yet another operation to the switch info structure.
>
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Applied.
* Re: [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: David Miller @ 2019-08-30 0:20 UTC (permalink / raw)
To: vivien.didelot; +Cc: netdev, marek.behun, f.fainelli, andrew
In-Reply-To: <20190828162611.10064-1-vivien.didelot@gmail.com>
From: Vivien Didelot <vivien.didelot@gmail.com>
Date: Wed, 28 Aug 2019 12:26:11 -0400
> This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
> mv88e6xxx: create serdes_get_lane chip operation").
>
> The .serdes_get_lane implementations access the CMODE of a port.
> Even though it is cached at the moment, it is safer to call them
> after the mutex is locked, not before.
>
> At the same time, check for an eventual error and return IRQ_DONE,
> instead of blindly ignoring it.
>
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Applied.
* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Jakub Kicinski @ 2019-08-30 0:15 UTC (permalink / raw)
To: Brian Vazquez
Cc: Yonghong Song, Alexei Starovoitov, bpf, netdev, Daniel Borkmann,
kernel-team
In-Reply-To: <CAMzD94S87BD0HnjjHVmhMPQ3UijS+oNu+H7NtMN8z8EAexgFtg@mail.gmail.com>
On Thu, 29 Aug 2019 16:13:59 -0700, Brian Vazquez wrote:
> > We need a per-map implementation of the exec side, but roughly maps
> > would do:
> >
> > LIST_HEAD(deleted);
> >
> > for entry in map {
> > struct map_op_ctx {
> > .key = entry->key,
> > .value = entry->value,
> > };
> >
> > act = BPF_PROG_RUN(filter, &map_op_ctx);
> > if (act & ~ACT_BITS)
> > return -EINVAL;
> >
> > if (act & DELETE) {
> > map_unlink(entry);
> > list_add(entry, &deleted);
> > }
> > if (act & STOP)
> > break;
> > }
> >
> > synchronize_rcu();
> >
> > for entry in deleted {
> > struct map_op_ctx {
> > .key = entry->key,
> > .value = entry->value,
> > };
> >
> > BPF_PROG_RUN(dumper, &map_op_ctx);
> > map_free(entry);
> > }
> >
> Hi Jakub,
>
> how would that approach support percpu maps?
>
> I'm thinking of a scenario where you want to do some calculations on
> percpu maps and you are interested in the info on all the cpus, not
> just the one that is running the bpf program. Currently on a pcpu map
> the bpf_map_lookup_elem helper only returns the pointer to the data of
> the executing cpu.
Right, we need to have the iteration outside of the bpf program itself,
and pass the element in through the context. That way we can feed each
per cpu entry into the program separately.
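
A rough userspace sketch of that idea: the iteration lives outside the program, and each per-cpu value is handed to the program through its context, one call per (entry, cpu) pair. Everything here is a hypothetical stand-in (a C callback plays the role of the BPF program, a fixed array plays the role of the map) and is not an existing kernel or BPF API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical per-cpu map entry: one value slot per cpu. */
struct pcpu_entry_sketch {
	uint32_t key;
	uint64_t val[NR_CPUS_SKETCH];
};

/* Context passed to the program for a single (entry, cpu) pair. */
struct map_op_ctx_sketch {
	uint32_t key;
	unsigned int cpu;
	uint64_t value;
};

typedef void (*map_prog_t)(const struct map_op_ctx_sketch *ctx, void *data);

/* The walk is done here, not in the program; every cpu's value is fed in. */
static void map_for_each_pcpu(const struct pcpu_entry_sketch *entries,
			      size_t n, map_prog_t prog, void *data)
{
	size_t i;
	unsigned int cpu;

	assert(prog != NULL);
	for (i = 0; i < n; i++) {
		for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
			struct map_op_ctx_sketch ctx = {
				.key = entries[i].key,
				.cpu = cpu,
				.value = entries[i].val[cpu],
			};

			prog(&ctx, data);
		}
	}
}

/* Example program: accumulate the values seen across all cpus. */
static void sum_prog(const struct map_op_ctx_sketch *ctx, void *data)
{
	*(uint64_t *)data += ctx->value;
}
```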
* Re: [PATCH net 1/3] taprio: Fix kernel panic in taprio_destroy
From: David Miller @ 2019-08-30 0:07 UTC (permalink / raw)
To: olteanv
Cc: jhs, xiyou.wangcong, jiri, vinicius.gomes, vedang.patel,
leandro.maciel.dorileo, netdev
In-Reply-To: <20190828144829.32570-2-olteanv@gmail.com>
From: Vladimir Oltean <olteanv@gmail.com>
Date: Wed, 28 Aug 2019 17:48:27 +0300
> taprio_init may fail earlier than this line:
>
> list_add(&q->taprio_list, &taprio_list);
>
> i.e. due to the net device not being multi queue.
>
> Attempting to remove q from the global taprio_list when it is not part
> of it will result in a kernel panic.
>
> Fix it by iterating through the list and removing it only if found.
>
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
I don't like this solution for two reasons: I think it's actually
error prone, and now every taprio_destroy() eats the cost of traversing
the entire list.
The whole reason to use a list head is O(1) removal.
Just init the list head early in the creation then the list_del() just
works.
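
A minimal userspace sketch of why that works, using a hand-rolled circular doubly linked list rather than the kernel's <linux/list.h> (the names are hypothetical): a node that has been initialized points to itself, so unlinking it is a harmless O(1) no-op even if it was never added to any list.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
	struct list_node *next, *prev;
};

/* An initialized node is a one-element ring pointing at itself. */
static void list_init(struct list_node *n)
{
	assert(n != NULL);
	n->next = n;
	n->prev = n;
}

static void list_add_after(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* O(1) unlink; safe on a node that was initialized but never added. */
static void list_del_sketch(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static size_t list_len(const struct list_node *head)
{
	const struct list_node *pos;
	size_t len = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		len++;

	return len;
}
```

With the node initialized at the very start of the init path, the destroy path can unlink unconditionally, with no list traversal.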
* Re: [PATCH net-next 00/12] net: hns3: add some cleanups and optimizations
From: David Miller @ 2019-08-29 23:58 UTC (permalink / raw)
To: tanhuazhong; +Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm
In-Reply-To: <1567002196-63242-1-git-send-email-tanhuazhong@huawei.com>
From: Huazhong Tan <tanhuazhong@huawei.com>
Date: Wed, 28 Aug 2019 22:23:04 +0800
> This patch-set includes cleanups, optimizations and bugfix for
> the HNS3 ethernet controller driver.
...
Series applied, thanks.
* Re: [PATCH net-next v3 3/3] dpaa2-eth: Add pause frame support
From: David Miller @ 2019-08-29 23:54 UTC (permalink / raw)
To: ruxandra.radulescu; +Cc: netdev, andrew, ioana.ciornei
In-Reply-To: <1567001295-31801-3-git-send-email-ruxandra.radulescu@nxp.com>
From: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Date: Wed, 28 Aug 2019 17:08:15 +0300
> Starting with firmware version MC10.18.0, we have support for
> L2 flow control. Asymmetrical configuration (Rx or Tx only) is
> supported, but not pause frame autonegotiation.
>
> Pause frame configuration is done via ethtool. By default, we start
> with flow control enabled on both Rx and Tx. Changes are propagated
> to hardware through firmware commands, using two flags (PAUSE,
> ASYM_PAUSE) to specify Rx and Tx pause configuration, as follows:
>
> PAUSE | ASYM_PAUSE | Rx pause | Tx pause
> ----------------------------------------
> 0 | 0 | disabled | disabled
> 0 | 1 | disabled | enabled
> 1 | 0 | enabled | enabled
> 1 | 1 | enabled | disabled
>
> The hardware can automatically send pause frames when the number
> of buffers in the pool goes below a predefined threshold. Due to
> this, flow control is incompatible with Rx frame queue taildrop
> (both mechanisms target the case when processing of ingress
> frames can't keep up with the Rx rate; for large frames, the number
> of buffers in the pool may never get low enough to trigger pause
> frames as long as taildrop is enabled). So we set pause frame
> generation and Rx FQ taildrop as mutually exclusive.
>
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Applied.
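
The PAUSE/ASYM_PAUSE table quoted above reduces to two boolean expressions: Rx pause follows PAUSE, and Tx pause is enabled when exactly one of the two flags is set. A small sketch of the mapping in both directions follows; the type and function names are hypothetical and not taken from the dpaa2-eth driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pause_cfg_sketch {
	bool rx;
	bool tx;
};

/* Flags to configuration: rx = PAUSE, tx = PAUSE xor ASYM_PAUSE. */
static struct pause_cfg_sketch pause_from_flags(bool pause, bool asym_pause)
{
	struct pause_cfg_sketch cfg = {
		.rx = pause,
		.tx = pause != asym_pause,
	};

	return cfg;
}

/* Configuration to flags: the inverse of the mapping above. */
static void flags_from_pause(bool rx, bool tx, bool *pause, bool *asym_pause)
{
	assert(pause != NULL && asym_pause != NULL);
	*pause = rx;
	*asym_pause = rx != tx;
}
```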
* Re: [PATCH net-next v3 2/3] dpaa2-eth: Use stored link settings
From: David Miller @ 2019-08-29 23:54 UTC (permalink / raw)
To: ruxandra.radulescu; +Cc: netdev, andrew, ioana.ciornei
In-Reply-To: <1567001295-31801-2-git-send-email-ruxandra.radulescu@nxp.com>
From: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Date: Wed, 28 Aug 2019 17:08:14 +0300
> Whenever a link state change occurs, we get notified and save
> the new link settings in the device's private data. In ethtool
> get_link_ksettings, use the stored state instead of interrogating
> the firmware each time.
>
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Applied.
* Re: [PATCH net-next v3 1/3] dpaa2-eth: Remove support for changing link settings
From: David Miller @ 2019-08-29 23:54 UTC (permalink / raw)
To: ruxandra.radulescu; +Cc: netdev, andrew, ioana.ciornei
In-Reply-To: <1567001295-31801-1-git-send-email-ruxandra.radulescu@nxp.com>
From: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Date: Wed, 28 Aug 2019 17:08:13 +0300
> We only support fixed-link for now, so there is no point in
> offering users the option to change link settings via ethtool.
>
> Functionally there is no change, since firmware prevents us from
> changing link parameters anyway.
>
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Applied.
* Re: pull-request: mac80211 2019-08-29
From: David Miller @ 2019-08-29 23:44 UTC (permalink / raw)
To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <20190829150011.10512-1-johannes@sipsolutions.net>
From: Johannes Berg <johannes@sipsolutions.net>
Date: Thu, 29 Aug 2019 17:00:10 +0200
> We have just three more fixes now, and one of those is a driver fix
> because Kalle is on vacation and I'm covering for him in the meantime.
>
> Please pull and let me know if there's any problem.
Ok, pulled, thanks.
* [PATCH mlx5-next 5/5] net/mlx5: Set only stag for match untagged packets
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>
From: Mark Bloch <markb@mellanox.com>
cvlan_tag enabled in match criteria and disabled in
match value means both S & C tags don't exist (untagged of both).
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index cc096f6011d9..9e9b41ab392b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1593,7 +1593,10 @@ static int __parse_cls_flower(struct mlx5e_priv *priv,
*match_level = MLX5_MATCH_L2;
}
} else if (*match_level != MLX5_MATCH_NONE) {
- MLX5_SET(fte_match_set_lyr_2_4, headers_c, svlan_tag, 1);
+ /* cvlan_tag enabled in match criteria and
+ * disabled in match value means both S & C tags
+ * don't exist (untagged of both)
+ */
MLX5_SET(fte_match_set_lyr_2_4, headers_c, cvlan_tag, 1);
*match_level = MLX5_MATCH_L2;
}
--
2.21.0
* [PATCH mlx5-next 4/5] net/mlx5: Add stub for mlx5_eswitch_mode
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Maor Gottlieb,
Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>
From: Maor Gottlieb <maorg@mellanox.com>
Return MLX5_ESWITCH_NONE when CONFIG_MLX5_ESWITCH
is not selected.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
include/linux/mlx5/eswitch.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h
index 46b5ba029802..825920d3ca40 100644
--- a/include/linux/mlx5/eswitch.h
+++ b/include/linux/mlx5/eswitch.h
@@ -61,7 +61,6 @@ void *mlx5_eswitch_get_proto_dev(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *mlx5_eswitch_vport_rep(struct mlx5_eswitch *esw,
u16 vport_num);
void *mlx5_eswitch_uplink_get_proto_dev(struct mlx5_eswitch *esw, u8 rep_type);
-u8 mlx5_eswitch_mode(struct mlx5_eswitch *esw);
struct mlx5_flow_handle *
mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *esw,
u16 vport_num, u32 sqn);
@@ -75,7 +74,14 @@ mlx5_eswitch_get_encap_mode(const struct mlx5_core_dev *dev);
bool mlx5_eswitch_vport_match_metadata_enabled(const struct mlx5_eswitch *esw);
u32 mlx5_eswitch_get_vport_metadata_for_match(const struct mlx5_eswitch *esw,
u16 vport_num);
+u8 mlx5_eswitch_mode(struct mlx5_eswitch *esw);
#else /* CONFIG_MLX5_ESWITCH */
+
+static inline u8 mlx5_eswitch_mode(struct mlx5_eswitch *esw)
+{
+ return MLX5_ESWITCH_NONE;
+}
+
static inline enum devlink_eswitch_encap_mode
mlx5_eswitch_get_encap_mode(const struct mlx5_core_dev *dev)
{
--
2.21.0
* [PATCH mlx5-next 3/5] net/mlx5: Avoid disabling RoCE when uninitialized
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Maor Gottlieb,
Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>
From: Maor Gottlieb <maorg@mellanox.com>
Move the check for whether RoCE steering is initialized into the
disable RoCE function. This ensures that we disable
RoCE only if we succeeded in enabling it before.
Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/rdma.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c
index 18af6981e0be..0fc7de4aa572 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c
@@ -14,9 +14,6 @@ static void mlx5_rdma_disable_roce_steering(struct mlx5_core_dev *dev)
{
struct mlx5_core_roce *roce = &dev->priv.roce;
- if (!roce->ft)
- return;
-
mlx5_del_flow_rules(roce->allow_rule);
mlx5_destroy_flow_group(roce->fg);
mlx5_destroy_flow_table(roce->ft);
@@ -145,6 +142,11 @@ static int mlx5_rdma_add_roce_addr(struct mlx5_core_dev *dev)
void mlx5_rdma_disable_roce(struct mlx5_core_dev *dev)
{
+ struct mlx5_core_roce *roce = &dev->priv.roce;
+
+ if (!roce->ft)
+ return;
+
mlx5_rdma_disable_roce_steering(dev);
mlx5_rdma_del_roce_addr(dev);
mlx5_nic_vport_disable_roce(dev);
--
2.21.0
* [PATCH mlx5-next 2/5] net/mlx5: Add HW bits and definitions required for SW steering
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Alex Vesker,
Yevgeny Klitenik, Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>
From: Alex Vesker <valex@mellanox.com>
Add the required Software Steering hardware definitions and
bits to mlx5_ifc.
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Klitenik <kliten@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
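Illustration only, not part of the commit: a small sketch of how the
layout changes below can be sanity-checked. It assumes the usual
mlx5_ifc convention that every u8 array element stands for one bit, so a
member's byte offset in the C struct equals the field's bit offset. The
struct names misc3_bits_old/misc3_bits_new and the BIT_OFF macro are
hypothetical; the member lists are copied from the
mlx5_ifc_fte_match_set_misc3_bits hunk.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char u8;

/* Layout before the patch. */
struct misc3_bits_old {
	u8 reserved_at_0[0x120];
	u8 geneve_tlv_option_0_data[0x20];
	u8 reserved_at_140[0xc0];
};

/* Layout after the patch: the first reserved area is carved into fields. */
struct misc3_bits_new {
	u8 inner_tcp_seq_num[0x20];
	u8 outer_tcp_seq_num[0x20];
	u8 inner_tcp_ack_num[0x20];
	u8 outer_tcp_ack_num[0x20];
	u8 reserved_at_80[0x8];
	u8 outer_vxlan_gpe_vni[0x18];
	u8 outer_vxlan_gpe_next_protocol[0x8];
	u8 outer_vxlan_gpe_flags[0x8];
	u8 reserved_at_b0[0x10];
	u8 icmp_header_data[0x20];
	u8 icmpv6_header_data[0x20];
	u8 icmp_type[0x8];
	u8 icmp_code[0x8];
	u8 icmpv6_type[0x8];
	u8 icmpv6_code[0x8];
	u8 geneve_tlv_option_0_data[0x20];
	u8 reserved_at_140[0xc0];
};

/* Hypothetical helper: bit offset of a field under the one-u8-per-bit rule. */
#define BIT_OFF(type, fld) offsetof(struct type, fld)

/* The reserved_at_N names must match the actual offsets. */
_Static_assert(BIT_OFF(misc3_bits_new, reserved_at_80) == 0x80, "reserved_at_80");
_Static_assert(BIT_OFF(misc3_bits_new, reserved_at_b0) == 0xb0, "reserved_at_b0");
_Static_assert(BIT_OFF(misc3_bits_new, reserved_at_140) == 0x140, "reserved_at_140");

/* Existing fields must not move and the total size must not change. */
_Static_assert(BIT_OFF(misc3_bits_new, geneve_tlv_option_0_data) ==
	       BIT_OFF(misc3_bits_old, geneve_tlv_option_0_data), "geneve moved");
_Static_assert(sizeof(struct misc3_bits_new) == sizeof(struct misc3_bits_old),
	       "size changed");
```

The same arithmetic applies to the other hunks, for example
0x1200 + 3 * 0x40 + 0x5f40 = 0x7200 for the reserved area replaced in
mlx5_ifc_flow_table_nic_cap_bits.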
include/linux/mlx5/device.h | 7 +
include/linux/mlx5/mlx5_ifc.h | 235 ++++++++++++++++++++++++++++------
2 files changed, 205 insertions(+), 37 deletions(-)
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index ce9839c8bc1a..5767d7fab5f3 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1162,6 +1162,9 @@ enum mlx5_qcam_feature_groups {
#define MLX5_CAP_FLOWTABLE(mdev, cap) \
MLX5_GET(flow_table_nic_cap, mdev->caps.hca_cur[MLX5_CAP_FLOW_TABLE], cap)
+#define MLX5_CAP64_FLOWTABLE(mdev, cap) \
+ MLX5_GET64(flow_table_nic_cap, (mdev)->caps.hca_cur[MLX5_CAP_FLOW_TABLE], cap)
+
#define MLX5_CAP_FLOWTABLE_MAX(mdev, cap) \
MLX5_GET(flow_table_nic_cap, mdev->caps.hca_max[MLX5_CAP_FLOW_TABLE], cap)
@@ -1225,6 +1228,10 @@ enum mlx5_qcam_feature_groups {
MLX5_GET(e_switch_cap, \
mdev->caps.hca_cur[MLX5_CAP_ESWITCH], cap)
+#define MLX5_CAP64_ESW_FLOWTABLE(mdev, cap) \
+ MLX5_GET64(flow_table_eswitch_cap, \
+ (mdev)->caps.hca_cur[MLX5_CAP_ESWITCH_FLOW_TABLE], cap)
+
#define MLX5_CAP_ESW_MAX(mdev, cap) \
MLX5_GET(e_switch_cap, \
mdev->caps.hca_max[MLX5_CAP_ESWITCH], cap)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 4e278114d8b3..76e945dbc7ed 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -282,6 +282,7 @@ enum {
MLX5_CMD_OP_ALLOC_MODIFY_HEADER_CONTEXT = 0x940,
MLX5_CMD_OP_DEALLOC_MODIFY_HEADER_CONTEXT = 0x941,
MLX5_CMD_OP_QUERY_MODIFY_HEADER_CONTEXT = 0x942,
+ MLX5_CMD_OP_SYNC_STEERING = 0xb00,
MLX5_CMD_OP_FPGA_CREATE_QP = 0x960,
MLX5_CMD_OP_FPGA_MODIFY_QP = 0x961,
MLX5_CMD_OP_FPGA_QUERY_QP = 0x962,
@@ -485,7 +486,11 @@ union mlx5_ifc_gre_key_bits {
};
struct mlx5_ifc_fte_match_set_misc_bits {
- u8 reserved_at_0[0x8];
+ u8 gre_c_present[0x1];
+ u8 reserved_auto1[0x1];
+ u8 gre_k_present[0x1];
+ u8 gre_s_present[0x1];
+ u8 source_vhca_port[0x4];
u8 source_sqn[0x18];
u8 source_eswitch_owner_vhca_id[0x10];
@@ -565,12 +570,38 @@ struct mlx5_ifc_fte_match_set_misc2_bits {
u8 metadata_reg_a[0x20];
- u8 reserved_at_1a0[0x60];
+ u8 metadata_reg_b[0x20];
+
+ u8 reserved_at_1c0[0x40];
};
struct mlx5_ifc_fte_match_set_misc3_bits {
- u8 reserved_at_0[0x120];
+ u8 inner_tcp_seq_num[0x20];
+
+ u8 outer_tcp_seq_num[0x20];
+
+ u8 inner_tcp_ack_num[0x20];
+
+ u8 outer_tcp_ack_num[0x20];
+
+ u8 reserved_at_80[0x8];
+ u8 outer_vxlan_gpe_vni[0x18];
+
+ u8 outer_vxlan_gpe_next_protocol[0x8];
+ u8 outer_vxlan_gpe_flags[0x8];
+ u8 reserved_at_b0[0x10];
+
+ u8 icmp_header_data[0x20];
+
+ u8 icmpv6_header_data[0x20];
+
+ u8 icmp_type[0x8];
+ u8 icmp_code[0x8];
+ u8 icmpv6_type[0x8];
+ u8 icmpv6_code[0x8];
+
u8 geneve_tlv_option_0_data[0x20];
+
u8 reserved_at_140[0xc0];
};
@@ -666,7 +697,15 @@ struct mlx5_ifc_flow_table_nic_cap_bits {
struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit_sniffer;
- u8 reserved_at_e00[0x7200];
+ u8 reserved_at_e00[0x1200];
+
+ u8 sw_steering_nic_rx_action_drop_icm_address[0x40];
+
+ u8 sw_steering_nic_tx_action_drop_icm_address[0x40];
+
+ u8 sw_steering_nic_tx_action_allow_icm_address[0x40];
+
+ u8 reserved_at_20c0[0x5f40];
};
enum {
@@ -698,7 +737,17 @@ struct mlx5_ifc_flow_table_eswitch_cap_bits {
struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_esw_acl_egress;
- u8 reserved_at_800[0x7800];
+ u8 reserved_at_800[0x1000];
+
+ u8 sw_steering_fdb_action_drop_icm_address_rx[0x40];
+
+ u8 sw_steering_fdb_action_drop_icm_address_tx[0x40];
+
+ u8 sw_steering_uplink_icm_address_rx[0x40];
+
+ u8 sw_steering_uplink_icm_address_tx[0x40];
+
+ u8 reserved_at_1900[0x6700];
};
enum {
@@ -849,6 +898,25 @@ struct mlx5_ifc_roce_cap_bits {
u8 reserved_at_100[0x700];
};
+struct mlx5_ifc_sync_steering_in_bits {
+ u8 opcode[0x10];
+ u8 uid[0x10];
+
+ u8 reserved_at_20[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_at_40[0xc0];
+};
+
+struct mlx5_ifc_sync_steering_out_bits {
+ u8 status[0x8];
+ u8 reserved_at_8[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_at_40[0x40];
+};
+
struct mlx5_ifc_device_mem_cap_bits {
u8 memic[0x1];
u8 reserved_at_1[0x1f];
@@ -1041,6 +1109,12 @@ enum {
MLX5_CAP_UMR_FENCE_NONE = 0x2,
};
+enum {
+ MLX5_FLEX_PARSER_VXLAN_GPE_ENABLED = 1 << 7,
+ MLX5_FLEX_PARSER_ICMP_V4_ENABLED = 1 << 8,
+ MLX5_FLEX_PARSER_ICMP_V6_ENABLED = 1 << 9,
+};
+
enum {
MLX5_UCTX_CAP_RAW_TX = 1UL << 0,
MLX5_UCTX_CAP_INTERNAL_DEV_RES = 1UL << 1,
@@ -1414,7 +1488,14 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 reserved_at_6c0[0x4];
u8 flex_parser_id_geneve_tlv_option_0[0x4];
- u8 reserved_at_6c8[0x28];
+ u8 flex_parser_id_icmp_dw1[0x4];
+ u8 flex_parser_id_icmp_dw0[0x4];
+ u8 flex_parser_id_icmpv6_dw1[0x4];
+ u8 flex_parser_id_icmpv6_dw0[0x4];
+ u8 flex_parser_id_outer_first_mpls_over_gre[0x4];
+ u8 flex_parser_id_outer_first_mpls_over_udp_label[0x4];
+
+ u8 reserved_at_6e0[0x10];
u8 sf_base_id[0x10];
u8 reserved_at_700[0x80];
@@ -2652,6 +2733,7 @@ union mlx5_ifc_hca_cap_union_bits {
struct mlx5_ifc_debug_cap_bits debug_cap;
struct mlx5_ifc_fpga_cap_bits fpga_cap;
struct mlx5_ifc_tls_cap_bits tls_cap;
+ struct mlx5_ifc_device_mem_cap_bits device_mem_cap;
u8 reserved_at_0[0x8000];
};
@@ -3255,7 +3337,11 @@ struct mlx5_ifc_esw_vport_context_bits {
u8 cvlan_pcp[0x3];
u8 cvlan_id[0xc];
- u8 reserved_at_60[0x7a0];
+ u8 reserved_at_60[0x720];
+
+ u8 sw_steering_vport_icm_address_rx[0x40];
+
+ u8 sw_steering_vport_icm_address_tx[0x40];
};
enum {
@@ -4941,23 +5027,98 @@ struct mlx5_ifc_query_hca_cap_in_bits {
u8 reserved_at_20[0x10];
u8 op_mod[0x10];
- u8 reserved_at_40[0x40];
+ u8 other_function[0x1];
+ u8 reserved_at_41[0xf];
+ u8 function_id[0x10];
+
+ u8 reserved_at_60[0x20];
};
-struct mlx5_ifc_query_flow_table_out_bits {
+struct mlx5_ifc_other_hca_cap_bits {
+ u8 roce[0x1];
+ u8 reserved_0[0x27f];
+};
+
+struct mlx5_ifc_query_other_hca_cap_out_bits {
u8 status[0x8];
- u8 reserved_at_8[0x18];
+ u8 reserved_0[0x18];
u8 syndrome[0x20];
- u8 reserved_at_40[0x80];
+ u8 reserved_1[0x40];
- u8 reserved_at_c0[0x8];
+ struct mlx5_ifc_other_hca_cap_bits other_capability;
+};
+
+struct mlx5_ifc_query_other_hca_cap_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 function_id[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_modify_other_hca_cap_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_other_hca_cap_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 function_id[0x10];
+ u8 field_select[0x20];
+
+ struct mlx5_ifc_other_hca_cap_bits other_capability;
+};
+
+struct mlx5_ifc_flow_table_context_bits {
+ u8 reformat_en[0x1];
+ u8 decap_en[0x1];
+ u8 sw_owner[0x1];
+ u8 termination_table[0x1];
+ u8 table_miss_action[0x4];
u8 level[0x8];
- u8 reserved_at_d0[0x8];
+ u8 reserved_at_10[0x8];
u8 log_size[0x8];
- u8 reserved_at_e0[0x120];
+ u8 reserved_at_20[0x8];
+ u8 table_miss_id[0x18];
+
+ u8 reserved_at_40[0x8];
+ u8 lag_master_next_table_id[0x18];
+
+ u8 reserved_at_60[0x60];
+
+ u8 sw_owner_icm_root_1[0x40];
+
+ u8 sw_owner_icm_root_0[0x40];
+
+};
+
+struct mlx5_ifc_query_flow_table_out_bits {
+ u8 status[0x8];
+ u8 reserved_at_8[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_at_40[0x80];
+
+ struct mlx5_ifc_flow_table_context_bits flow_table_context;
};
struct mlx5_ifc_query_flow_table_in_bits {
@@ -5227,7 +5388,7 @@ struct mlx5_ifc_alloc_packet_reformat_context_out_bits {
u8 reserved_at_60[0x20];
};
-enum {
+enum mlx5_reformat_ctx_type {
MLX5_REFORMAT_TYPE_L2_TO_VXLAN = 0x0,
MLX5_REFORMAT_TYPE_L2_TO_NVGRE = 0x1,
MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL = 0x2,
@@ -5323,7 +5484,16 @@ enum {
MLX5_ACTION_IN_FIELD_OUT_DIPV4 = 0x16,
MLX5_ACTION_IN_FIELD_OUT_FIRST_VID = 0x17,
MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT = 0x47,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_A = 0x49,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_B = 0x50,
MLX5_ACTION_IN_FIELD_METADATA_REG_C_0 = 0x51,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_C_1 = 0x52,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_C_2 = 0x53,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_C_3 = 0x54,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_C_4 = 0x55,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_C_5 = 0x56,
+ MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM = 0x59,
+ MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM = 0x5B,
};
struct mlx5_ifc_alloc_modify_header_context_out_bits {
@@ -7369,35 +7539,26 @@ struct mlx5_ifc_create_mkey_in_bits {
u8 klm_pas_mtt[0][0x20];
};
+enum {
+ MLX5_FLOW_TABLE_TYPE_NIC_RX = 0x0,
+ MLX5_FLOW_TABLE_TYPE_NIC_TX = 0x1,
+ MLX5_FLOW_TABLE_TYPE_ESW_EGRESS_ACL = 0x2,
+ MLX5_FLOW_TABLE_TYPE_ESW_INGRESS_ACL = 0x3,
+ MLX5_FLOW_TABLE_TYPE_FDB = 0X4,
+ MLX5_FLOW_TABLE_TYPE_SNIFFER_RX = 0X5,
+ MLX5_FLOW_TABLE_TYPE_SNIFFER_TX = 0X6,
+};
+
struct mlx5_ifc_create_flow_table_out_bits {
u8 status[0x8];
- u8 reserved_at_8[0x18];
+ u8 icm_address_63_40[0x18];
u8 syndrome[0x20];
- u8 reserved_at_40[0x8];
+ u8 icm_address_39_32[0x8];
u8 table_id[0x18];
- u8 reserved_at_60[0x20];
-};
-
-struct mlx5_ifc_flow_table_context_bits {
- u8 reformat_en[0x1];
- u8 decap_en[0x1];
- u8 reserved_at_2[0x1];
- u8 termination_table[0x1];
- u8 table_miss_action[0x4];
- u8 level[0x8];
- u8 reserved_at_10[0x8];
- u8 log_size[0x8];
-
- u8 reserved_at_20[0x8];
- u8 table_miss_id[0x18];
-
- u8 reserved_at_40[0x8];
- u8 lag_master_next_table_id[0x18];
-
- u8 reserved_at_60[0xe0];
+ u8 icm_address_31_0[0x20];
};
struct mlx5_ifc_create_flow_table_in_bits {
--
2.21.0
^ permalink raw reply related
* [PATCH mlx5-next 1/5] net/mlx5: Move device memory management to mlx5_core
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
Ariel Levkovich, Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>
From: Ariel Levkovich <lariel@mellanox.com>
Move the device memory allocation and deallocation commands for
SW ICM memory to mlx5_core to expose this API to all
mlx5_core users.
This comes as preparation for supporting SW steering in kernel
where it will be required to allocate and register device
memory for direct rule insertion.
In addition, an API to register this device memory for future
remote access operations is introduced using the create_mkey
commands.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
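Illustration only, not part of the commit: a user-space sketch of the
block bookkeeping that mlx5_dm_sw_icm_alloc() and
mlx5_dm_sw_icm_dealloc() perform around the firmware command. The
firmware command, the locking and the kernel bitmap helpers are left
out; struct icm_region, icm_alloc(), icm_free(), the 64-byte block size
and the 64-block region are hypothetical values chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LOG_BLOCK_SIZE 6u			/* hypothetical granularity */
#define BLOCK_SIZE (1u << LOG_BLOCK_SIZE)	/* 64-byte blocks */
#define MAX_BLOCKS 64u				/* hypothetical region size */

struct icm_region {
	uint64_t start_addr;
	uint8_t used[MAX_BLOCKS];	/* one byte per block; the driver uses a bitmap */
};

/* Returns 0 and stores the address, or -1 if the length is invalid or
 * no run of free blocks is large enough. */
int icm_alloc(struct icm_region *r, uint64_t length, uint64_t *addr)
{
	uint64_t num_blocks;
	uint32_t idx, run = 0;

	/* Same validity rule as the patch: non-zero, power of two,
	 * multiple of the block size. */
	if (!length || (length & (length - 1)) || (length & (BLOCK_SIZE - 1)))
		return -1;

	num_blocks = length >> LOG_BLOCK_SIZE;
	if (num_blocks > MAX_BLOCKS)
		return -1;

	/* First-fit search for num_blocks consecutive free blocks. */
	for (idx = 0; idx < MAX_BLOCKS; idx++) {
		run = r->used[idx] ? 0 : run + 1;
		if (run == num_blocks) {
			uint32_t first = idx + 1 - run;

			memset(&r->used[first], 1, run);
			*addr = r->start_addr +
				((uint64_t)first << LOG_BLOCK_SIZE);
			return 0;
		}
	}

	return -1;
}

/* Recover the first block index from the address, then release the run. */
void icm_free(struct icm_region *r, uint64_t length, uint64_t addr)
{
	uint64_t first = (addr - r->start_addr) >> LOG_BLOCK_SIZE;
	uint64_t num_blocks = (length + BLOCK_SIZE - 1) >> LOG_BLOCK_SIZE;

	memset(&r->used[first], 0, num_blocks);
}
```

The returned address is the region start plus the block index shifted by
the log block size, which is why deallocation can recover the index from
the address alone.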
drivers/infiniband/hw/mlx5/cmd.c | 130 ----------
drivers/infiniband/hw/mlx5/cmd.h | 4 -
drivers/infiniband/hw/mlx5/main.c | 102 +++-----
drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 -
.../net/ethernet/mellanox/mlx5/core/Makefile | 2 +-
.../net/ethernet/mellanox/mlx5/core/lib/dm.c | 223 ++++++++++++++++++
.../net/ethernet/mellanox/mlx5/core/main.c | 5 +
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 3 +
include/linux/mlx5/driver.h | 14 ++
9 files changed, 276 insertions(+), 209 deletions(-)
create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c
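Illustration only, not part of the commit: a sketch of the size
adjustment that handle_alloc_dm_sw_icm() applies before calling the
core allocator, namely rounding up to the block size and then to a power
of two. The helper names below are hypothetical user-space equivalents
of the kernel's round_up() and roundup_pow_of_two(); the block size is
passed in rather than read from device capabilities.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of align; align must be a power of two. */
uint64_t round_up_to(uint64_t v, uint64_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Smallest power of two >= v, for 1 <= v <= 2^63. */
uint64_t next_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/* Actual allocation size for a non-zero request: block multiple first,
 * power of two second. */
uint64_t sw_icm_actual_size(uint64_t requested, uint64_t block_size)
{
	return next_pow2(round_up_to(requested, block_size));
}
```

A size produced this way is both a power of two and a multiple of the
block size, so it passes the length check at the top of
mlx5_dm_sw_icm_alloc().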
diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index 6c8645033102..4937947400cd 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -186,136 +186,6 @@ int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length)
return err;
}
-int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
- u16 uid, phys_addr_t *addr, u32 *obj_id)
-{
- struct mlx5_core_dev *dev = dm->dev;
- u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
- u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
- unsigned long *block_map;
- u64 icm_start_addr;
- u32 log_icm_size;
- u32 num_blocks;
- u32 max_blocks;
- u64 block_idx;
- void *sw_icm;
- int ret;
-
- MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
- MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
- MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
- MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
-
- switch (type) {
- case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
- icm_start_addr = MLX5_CAP64_DEV_MEM(dev,
- steering_sw_icm_start_address);
- log_icm_size = MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size);
- block_map = dm->steering_sw_icm_alloc_blocks;
- break;
- case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
- icm_start_addr = MLX5_CAP64_DEV_MEM(dev,
- header_modify_sw_icm_start_address);
- log_icm_size = MLX5_CAP_DEV_MEM(dev,
- log_header_modify_sw_icm_size);
- block_map = dm->header_modify_sw_icm_alloc_blocks;
- break;
- default:
- return -EINVAL;
- }
-
- num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
- MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
- max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
- spin_lock(&dm->lock);
- block_idx = bitmap_find_next_zero_area(block_map,
- max_blocks,
- 0,
- num_blocks, 0);
-
- if (block_idx < max_blocks)
- bitmap_set(block_map,
- block_idx, num_blocks);
-
- spin_unlock(&dm->lock);
-
- if (block_idx >= max_blocks)
- return -ENOMEM;
-
- sw_icm = MLX5_ADDR_OF(create_sw_icm_in, in, sw_icm);
- icm_start_addr += block_idx << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
- MLX5_SET64(sw_icm, sw_icm, sw_icm_start_addr,
- icm_start_addr);
- MLX5_SET(sw_icm, sw_icm, log_sw_icm_size, ilog2(length));
-
- ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
- if (ret) {
- spin_lock(&dm->lock);
- bitmap_clear(block_map,
- block_idx, num_blocks);
- spin_unlock(&dm->lock);
-
- return ret;
- }
-
- *addr = icm_start_addr;
- *obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
-
- return 0;
-}
-
-int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
- u16 uid, phys_addr_t addr, u32 obj_id)
-{
- struct mlx5_core_dev *dev = dm->dev;
- u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
- u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
- unsigned long *block_map;
- u32 num_blocks;
- u64 start_idx;
- int err;
-
- num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
- MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
-
- switch (type) {
- case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
- start_idx =
- (addr - MLX5_CAP64_DEV_MEM(
- dev, steering_sw_icm_start_address)) >>
- MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
- block_map = dm->steering_sw_icm_alloc_blocks;
- break;
- case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
- start_idx =
- (addr -
- MLX5_CAP64_DEV_MEM(
- dev, header_modify_sw_icm_start_address)) >>
- MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
- block_map = dm->header_modify_sw_icm_alloc_blocks;
- break;
- default:
- return -EINVAL;
- }
-
- MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
- MLX5_CMD_OP_DESTROY_GENERAL_OBJECT);
- MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
- MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, obj_id);
- MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
-
- err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
- if (err)
- return err;
-
- spin_lock(&dm->lock);
- bitmap_clear(block_map,
- start_idx, num_blocks);
- spin_unlock(&dm->lock);
-
- return 0;
-}
-
int mlx5_cmd_query_ext_ppcnt_counters(struct mlx5_core_dev *dev, void *out)
{
u32 in[MLX5_ST_SZ_DW(ppcnt_reg)] = {};
diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h
index 0572dcba6eae..169cab4915e3 100644
--- a/drivers/infiniband/hw/mlx5/cmd.h
+++ b/drivers/infiniband/hw/mlx5/cmd.h
@@ -65,8 +65,4 @@ int mlx5_cmd_alloc_q_counter(struct mlx5_core_dev *dev, u16 *counter_id,
u16 uid);
int mlx5_cmd_mad_ifc(struct mlx5_core_dev *dev, const void *inb, void *outb,
u16 opmod, u8 port);
-int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
- u16 uid, phys_addr_t *addr, u32 *obj_id);
-int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
- u16 uid, phys_addr_t addr, u32 obj_id);
#endif /* MLX5_IB_CMD_H */
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index c2a5780cb394..42fdbbea06f0 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2280,6 +2280,7 @@ static inline int check_dm_type_support(struct mlx5_ib_dev *dev,
return -EOPNOTSUPP;
break;
case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+ case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
if (!capable(CAP_SYS_RAWIO) ||
!capable(CAP_NET_RAW))
return -EPERM;
@@ -2344,20 +2345,20 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
struct uverbs_attr_bundle *attrs,
int type)
{
- struct mlx5_dm *dm_db = &to_mdev(ctx->device)->dm;
+ struct mlx5_core_dev *dev = to_mdev(ctx->device)->mdev;
u64 act_size;
int err;
/* Allocation size must a multiple of the basic block size
* and a power of 2.
*/
- act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
+ act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dev));
act_size = roundup_pow_of_two(act_size);
dm->size = act_size;
- err = mlx5_cmd_alloc_sw_icm(dm_db, type, act_size,
- to_mucontext(ctx)->devx_uid, &dm->dev_addr,
- &dm->icm_dm.obj_id);
+ err = mlx5_dm_sw_icm_alloc(dev, type, act_size,
+ to_mucontext(ctx)->devx_uid, &dm->dev_addr,
+ &dm->icm_dm.obj_id);
if (err)
return err;
@@ -2365,9 +2366,9 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
&dm->dev_addr, sizeof(dm->dev_addr));
if (err)
- mlx5_cmd_dealloc_sw_icm(dm_db, type, dm->size,
- to_mucontext(ctx)->devx_uid,
- dm->dev_addr, dm->icm_dm.obj_id);
+ mlx5_dm_sw_icm_dealloc(dev, type, dm->size,
+ to_mucontext(ctx)->devx_uid, dm->dev_addr,
+ dm->icm_dm.obj_id);
return err;
}
@@ -2407,8 +2408,14 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
attrs);
break;
case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+ err = handle_alloc_dm_sw_icm(context, dm,
+ attr, attrs,
+ MLX5_SW_ICM_TYPE_STEERING);
+ break;
case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
- err = handle_alloc_dm_sw_icm(context, dm, attr, attrs, type);
+ err = handle_alloc_dm_sw_icm(context, dm,
+ attr, attrs,
+ MLX5_SW_ICM_TYPE_HEADER_MODIFY);
break;
default:
err = -EOPNOTSUPP;
@@ -2428,6 +2435,7 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
{
struct mlx5_ib_ucontext *ctx = rdma_udata_to_drv_context(
&attrs->driver_udata, struct mlx5_ib_ucontext, ibucontext);
+ struct mlx5_core_dev *dev = to_mdev(ibdm->device)->mdev;
struct mlx5_dm *dm_db = &to_mdev(ibdm->device)->dm;
struct mlx5_ib_dm *dm = to_mdm(ibdm);
u32 page_idx;
@@ -2439,19 +2447,23 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
if (ret)
return ret;
- page_idx = (dm->dev_addr -
- pci_resource_start(dm_db->dev->pdev, 0) -
- MLX5_CAP64_DEV_MEM(dm_db->dev,
- memic_bar_start_addr)) >>
- PAGE_SHIFT;
+ page_idx = (dm->dev_addr - pci_resource_start(dev->pdev, 0) -
+ MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr)) >>
+ PAGE_SHIFT;
bitmap_clear(ctx->dm_pages, page_idx,
DIV_ROUND_UP(dm->size, PAGE_SIZE));
break;
case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+ ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_STEERING,
+ dm->size, ctx->devx_uid, dm->dev_addr,
+ dm->icm_dm.obj_id);
+ if (ret)
+ return ret;
+ break;
case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
- ret = mlx5_cmd_dealloc_sw_icm(dm_db, dm->type, dm->size,
- ctx->devx_uid, dm->dev_addr,
- dm->icm_dm.obj_id);
+ ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_HEADER_MODIFY,
+ dm->size, ctx->devx_uid, dm->dev_addr,
+ dm->icm_dm.obj_id);
if (ret)
return ret;
break;
@@ -6097,8 +6109,6 @@ static struct ib_counters *mlx5_ib_create_counters(struct ib_device *device,
static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev)
{
- struct mlx5_core_dev *mdev = dev->mdev;
-
mlx5_ib_cleanup_multiport_master(dev);
if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
srcu_barrier(&dev->mr_srcu);
@@ -6106,29 +6116,11 @@ static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev)
}
WARN_ON(!bitmap_empty(dev->dm.memic_alloc_pages, MLX5_MAX_MEMIC_PAGES));
-
- WARN_ON(dev->dm.steering_sw_icm_alloc_blocks &&
- !bitmap_empty(
- dev->dm.steering_sw_icm_alloc_blocks,
- BIT(MLX5_CAP_DEV_MEM(mdev, log_steering_sw_icm_size) -
- MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev))));
-
- kfree(dev->dm.steering_sw_icm_alloc_blocks);
-
- WARN_ON(dev->dm.header_modify_sw_icm_alloc_blocks &&
- !bitmap_empty(dev->dm.header_modify_sw_icm_alloc_blocks,
- BIT(MLX5_CAP_DEV_MEM(
- mdev, log_header_modify_sw_icm_size) -
- MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev))));
-
- kfree(dev->dm.header_modify_sw_icm_alloc_blocks);
}
static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
{
struct mlx5_core_dev *mdev = dev->mdev;
- u64 header_modify_icm_blocks = 0;
- u64 steering_icm_blocks = 0;
int err;
int i;
@@ -6173,51 +6165,17 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
INIT_LIST_HEAD(&dev->qp_list);
spin_lock_init(&dev->reset_flow_resource_lock);
- if (MLX5_CAP_GEN_64(mdev, general_obj_types) &
- MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM) {
- if (MLX5_CAP64_DEV_MEM(mdev, steering_sw_icm_start_address)) {
- steering_icm_blocks =
- BIT(MLX5_CAP_DEV_MEM(mdev,
- log_steering_sw_icm_size) -
- MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev));
-
- dev->dm.steering_sw_icm_alloc_blocks =
- kcalloc(BITS_TO_LONGS(steering_icm_blocks),
- sizeof(unsigned long), GFP_KERNEL);
- if (!dev->dm.steering_sw_icm_alloc_blocks)
- goto err_mp;
- }
-
- if (MLX5_CAP64_DEV_MEM(mdev,
- header_modify_sw_icm_start_address)) {
- header_modify_icm_blocks = BIT(
- MLX5_CAP_DEV_MEM(
- mdev, log_header_modify_sw_icm_size) -
- MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev));
-
- dev->dm.header_modify_sw_icm_alloc_blocks =
- kcalloc(BITS_TO_LONGS(header_modify_icm_blocks),
- sizeof(unsigned long), GFP_KERNEL);
- if (!dev->dm.header_modify_sw_icm_alloc_blocks)
- goto err_dm;
- }
- }
-
spin_lock_init(&dev->dm.lock);
dev->dm.dev = mdev;
if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
err = init_srcu_struct(&dev->mr_srcu);
if (err)
- goto err_dm;
+ goto err_mp;
}
return 0;
-err_dm:
- kfree(dev->dm.steering_sw_icm_alloc_blocks);
- kfree(dev->dm.header_modify_sw_icm_alloc_blocks);
-
err_mp:
mlx5_ib_cleanup_multiport_master(dev);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index c482f19958b3..afd69ba33b2b 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -880,8 +880,6 @@ struct mlx5_dm {
*/
spinlock_t lock;
DECLARE_BITMAP(memic_alloc_pages, MLX5_MAX_MEMIC_PAGES);
- unsigned long *steering_sw_icm_alloc_blocks;
- unsigned long *header_modify_sw_icm_alloc_blocks;
};
struct mlx5_read_counters_attr {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 57d2cc666fe3..4eb52e8500c3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -15,7 +15,7 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
health.o mcg.o cq.o alloc.o qp.o port.o mr.o pd.o \
transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \
fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
- lib/devcom.o lib/pci_vsc.o diag/fs_tracepoint.o \
+ lib/devcom.o lib/pci_vsc.o lib/dm.o diag/fs_tracepoint.o \
diag/fw_tracer.o diag/crdump.o devlink.o
#
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c
new file mode 100644
index 000000000000..e065c2f68f5a
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+// Copyright (c) 2019 Mellanox Technologies
+
+#include <linux/mlx5/driver.h>
+#include <linux/mlx5/device.h>
+
+#include "mlx5_core.h"
+#include "lib/mlx5.h"
+
+struct mlx5_dm {
+ /* protect access to icm bitmask */
+ spinlock_t lock;
+ unsigned long *steering_sw_icm_alloc_blocks;
+ unsigned long *header_modify_sw_icm_alloc_blocks;
+};
+
+struct mlx5_dm *mlx5_dm_create(struct mlx5_core_dev *dev)
+{
+ u64 header_modify_icm_blocks = 0;
+ u64 steering_icm_blocks = 0;
+ struct mlx5_dm *dm;
+
+ if (!(MLX5_CAP_GEN_64(dev, general_obj_types) & MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM))
+ return 0;
+
+ dm = kzalloc(sizeof(*dm), GFP_KERNEL);
+ if (!dm)
+ return ERR_PTR(-ENOMEM);
+
+ spin_lock_init(&dm->lock);
+
+ if (MLX5_CAP64_DEV_MEM(dev, steering_sw_icm_start_address)) {
+ steering_icm_blocks =
+ BIT(MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size) -
+ MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
+
+ dm->steering_sw_icm_alloc_blocks =
+ kcalloc(BITS_TO_LONGS(steering_icm_blocks),
+ sizeof(unsigned long), GFP_KERNEL);
+ if (!dm->steering_sw_icm_alloc_blocks)
+ goto err_steering;
+ }
+
+ if (MLX5_CAP64_DEV_MEM(dev, header_modify_sw_icm_start_address)) {
+ header_modify_icm_blocks =
+ BIT(MLX5_CAP_DEV_MEM(dev, log_header_modify_sw_icm_size) -
+ MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
+
+ dm->header_modify_sw_icm_alloc_blocks =
+ kcalloc(BITS_TO_LONGS(header_modify_icm_blocks),
+ sizeof(unsigned long), GFP_KERNEL);
+ if (!dm->header_modify_sw_icm_alloc_blocks)
+ goto err_modify_hdr;
+ }
+
+ return dm;
+
+err_modify_hdr:
+ kfree(dm->steering_sw_icm_alloc_blocks);
+
+err_steering:
+ kfree(dm);
+
+ return ERR_PTR(-ENOMEM);
+}
+
+void mlx5_dm_cleanup(struct mlx5_core_dev *dev)
+{
+ struct mlx5_dm *dm = dev->dm;
+
+ if (!dev->dm)
+ return;
+
+ if (dm->steering_sw_icm_alloc_blocks) {
+ WARN_ON(!bitmap_empty(dm->steering_sw_icm_alloc_blocks,
+ BIT(MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size) -
+ MLX5_LOG_SW_ICM_BLOCK_SIZE(dev))));
+ kfree(dm->steering_sw_icm_alloc_blocks);
+ }
+
+ if (dm->header_modify_sw_icm_alloc_blocks) {
+ WARN_ON(!bitmap_empty(dm->header_modify_sw_icm_alloc_blocks,
+ BIT(MLX5_CAP_DEV_MEM(dev,
+ log_header_modify_sw_icm_size) -
+ MLX5_LOG_SW_ICM_BLOCK_SIZE(dev))));
+ kfree(dm->header_modify_sw_icm_alloc_blocks);
+ }
+
+ kfree(dm);
+}
+
+int mlx5_dm_sw_icm_alloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+ u64 length, u16 uid, phys_addr_t *addr, u32 *obj_id)
+{
+ u32 num_blocks = DIV_ROUND_UP_ULL(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
+ u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
+ u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
+ struct mlx5_dm *dm = dev->dm;
+ unsigned long *block_map;
+ u64 icm_start_addr;
+ u32 log_icm_size;
+ u32 max_blocks;
+ u64 block_idx;
+ void *sw_icm;
+ int ret;
+
+ if (!dev->dm)
+ return -EOPNOTSUPP;
+
+ if (!length || (length & (length - 1)) ||
+ length & (MLX5_SW_ICM_BLOCK_SIZE(dev) - 1))
+ return -EINVAL;
+
+ MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
+ MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
+ MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
+ MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
+
+ switch (type) {
+ case MLX5_SW_ICM_TYPE_STEERING:
+ icm_start_addr = MLX5_CAP64_DEV_MEM(dev, steering_sw_icm_start_address);
+ log_icm_size = MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size);
+ block_map = dm->steering_sw_icm_alloc_blocks;
+ break;
+ case MLX5_SW_ICM_TYPE_HEADER_MODIFY:
+ icm_start_addr = MLX5_CAP64_DEV_MEM(dev, header_modify_sw_icm_start_address);
+ log_icm_size = MLX5_CAP_DEV_MEM(dev,
+ log_header_modify_sw_icm_size);
+ block_map = dm->header_modify_sw_icm_alloc_blocks;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (!block_map)
+ return -EOPNOTSUPP;
+
+ max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
+ spin_lock(&dm->lock);
+ block_idx = bitmap_find_next_zero_area(block_map,
+ max_blocks,
+ 0,
+ num_blocks, 0);
+
+ if (block_idx < max_blocks)
+ bitmap_set(block_map,
+ block_idx, num_blocks);
+
+ spin_unlock(&dm->lock);
+
+ if (block_idx >= max_blocks)
+ return -ENOMEM;
+
+ sw_icm = MLX5_ADDR_OF(create_sw_icm_in, in, sw_icm);
+ icm_start_addr += block_idx << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
+ MLX5_SET64(sw_icm, sw_icm, sw_icm_start_addr,
+ icm_start_addr);
+ MLX5_SET(sw_icm, sw_icm, log_sw_icm_size, ilog2(length));
+
+ ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+ if (ret) {
+ spin_lock(&dm->lock);
+ bitmap_clear(block_map,
+ block_idx, num_blocks);
+ spin_unlock(&dm->lock);
+
+ return ret;
+ }
+
+ *addr = icm_start_addr;
+ *obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_dm_sw_icm_alloc);
+
+int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+ u64 length, u16 uid, phys_addr_t addr, u32 obj_id)
+{
+ u32 num_blocks = DIV_ROUND_UP_ULL(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
+ u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
+ u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
+ struct mlx5_dm *dm = dev->dm;
+ unsigned long *block_map;
+ u64 icm_start_addr;
+ u64 start_idx;
+ int err;
+
+ if (!dev->dm)
+ return -EOPNOTSUPP;
+
+ switch (type) {
+ case MLX5_SW_ICM_TYPE_STEERING:
+ icm_start_addr = MLX5_CAP64_DEV_MEM(dev, steering_sw_icm_start_address);
+ block_map = dm->steering_sw_icm_alloc_blocks;
+ break;
+ case MLX5_SW_ICM_TYPE_HEADER_MODIFY:
+ icm_start_addr = MLX5_CAP64_DEV_MEM(dev, header_modify_sw_icm_start_address);
+ block_map = dm->header_modify_sw_icm_alloc_blocks;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
+ MLX5_CMD_OP_DESTROY_GENERAL_OBJECT);
+ MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
+ MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, obj_id);
+ MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
+
+ err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+ if (err)
+ return err;
+
+ start_idx = (addr - icm_start_addr) >> MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
+ spin_lock(&dm->lock);
+ bitmap_clear(block_map,
+ start_idx, num_blocks);
+ spin_unlock(&dm->lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_dm_sw_icm_dealloc);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 7f70ecb1db6d..c1679d11d71f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -879,6 +879,10 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
goto err_eswitch_cleanup;
}
+ dev->dm = mlx5_dm_create(dev);
+ if (IS_ERR(dev->dm))
+ mlx5_core_warn(dev, "Failed to init device memory%d\n", err);
+
dev->tracer = mlx5_fw_tracer_create(dev);
return 0;
@@ -912,6 +916,7 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
{
mlx5_fw_tracer_destroy(dev->tracer);
+ mlx5_dm_cleanup(dev);
mlx5_fpga_cleanup(dev);
mlx5_eswitch_cleanup(dev->priv.eswitch);
mlx5_sriov_cleanup(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 471bbc48bc1f..bbcf4ee40ad5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -198,6 +198,9 @@ int mlx5_set_mtpps(struct mlx5_core_dev *mdev, u32 *mtpps, u32 mtpps_size);
int mlx5_query_mtppse(struct mlx5_core_dev *mdev, u8 pin, u8 *arm, u8 *mode);
int mlx5_set_mtppse(struct mlx5_core_dev *mdev, u8 pin, u8 arm, u8 mode);
+struct mlx5_dm *mlx5_dm_create(struct mlx5_core_dev *dev);
+void mlx5_dm_cleanup(struct mlx5_core_dev *dev);
+
#define MLX5_PPS_CAP(mdev) (MLX5_CAP_GEN((mdev), pps) && \
MLX5_CAP_GEN((mdev), pps_modify) && \
MLX5_CAP_MCAM_FEATURE((mdev), mtpps_fs) && \
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 0acd28f2e62c..72bc6ce44b55 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -622,6 +622,11 @@ struct mlx5e_resources {
struct mlx5_sq_bfreg bfreg;
};
+enum mlx5_sw_icm_type {
+ MLX5_SW_ICM_TYPE_STEERING,
+ MLX5_SW_ICM_TYPE_HEADER_MODIFY,
+};
+
#define MLX5_MAX_RESERVED_GIDS 8
struct mlx5_rsvd_gids {
@@ -653,10 +658,14 @@ struct mlx5_clock {
struct mlx5_pps pps_info;
};
+struct mlx5_dm;
struct mlx5_fw_tracer;
struct mlx5_vxlan;
struct mlx5_geneve;
+#define MLX5_LOG_SW_ICM_BLOCK_SIZE(dev) (MLX5_CAP_DEV_MEM(dev, log_sw_icm_alloc_granularity))
+#define MLX5_SW_ICM_BLOCK_SIZE(dev) (1 << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev))
+
struct mlx5_core_dev {
struct device *device;
enum mlx5_coredev_type coredev_type;
@@ -690,6 +699,7 @@ struct mlx5_core_dev {
atomic_t num_qps;
u32 issi;
struct mlx5e_resources mlx5e_res;
+ struct mlx5_dm *dm;
struct mlx5_vxlan *vxlan;
struct mlx5_geneve *geneve;
struct {
@@ -1072,6 +1082,10 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
size_t *offsets);
struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev);
void mlx5_put_uars_page(struct mlx5_core_dev *mdev, struct mlx5_uars_page *up);
+int mlx5_dm_sw_icm_alloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+ u64 length, u16 uid, phys_addr_t *addr, u32 *obj_id);
+int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+ u64 length, u16 uid, phys_addr_t addr, u32 obj_id);
#ifdef CONFIG_MLX5_CORE_IPOIB
struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
--
2.21.0
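The two block-size macros added to driver.h above derive the SW ICM block size from a log2 device capability. A minimal userspace sketch of that arithmetic follows. The struct and function names here are hypothetical stand-ins, not driver API, and the sketch uses a 64-bit shift. Whether the driver rounds a length up to whole blocks or rejects unaligned lengths is not visible in these hunks, so both helpers are shown only as illustrations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the log_sw_icm_alloc_granularity capability. */
struct icm_caps {
	uint8_t log_alloc_granularity;
};

/* Block size in bytes, mirroring MLX5_SW_ICM_BLOCK_SIZE(): 1 << log granularity. */
static uint64_t icm_block_size(const struct icm_caps *caps)
{
	assert(caps->log_alloc_granularity < 64);
	return 1ULL << caps->log_alloc_granularity;
}

/* Number of whole blocks needed to cover length bytes (rounds up). */
static uint64_t icm_num_blocks(const struct icm_caps *caps, uint64_t length)
{
	uint64_t block = icm_block_size(caps);

	return (length + block - 1) >> caps->log_alloc_granularity;
}

/* Nonzero when length is an exact multiple of the block size. */
static int icm_length_is_aligned(const struct icm_caps *caps, uint64_t length)
{
	return (length & (icm_block_size(caps) - 1)) == 0;
}
```

With a granularity of 6 the block size is 64 bytes, so a 65-byte request would need two blocks under the round-up assumption.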
* [PATCH mlx5-next 0/5] Mellanox, mlx5 next updates 2019-08-29
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org
Hi All,
This series includes misc updates for the mlx5-next shared branch, required
for the upcoming software steering feature.
1) Alex adds HW bits and definitions required for SW steering
2) Ariel moves device memory management to mlx5_core (from mlx5_ib)
3) Maor, Cleanups and fixups for eswitch mode and RoCE
4) Mark, Set only stag for match untagged packets
If there are no objections, this series will be applied to the mlx5-next branch
and sent later as a pull request to both the rdma-next and net-next branches.
Thanks,
Saeed.
---
Alex Vesker (1):
net/mlx5: Add HW bits and definitions required for SW steering
Ariel Levkovich (1):
net/mlx5: Move device memory management to mlx5_core
Maor Gottlieb (2):
net/mlx5: Avoid disabling RoCE when uninitialized
net/mlx5: Add stub for mlx5_eswitch_mode
Mark Bloch (1):
net/mlx5: Set only stag for match untagged packets
drivers/infiniband/hw/mlx5/cmd.c | 130 ----------
drivers/infiniband/hw/mlx5/cmd.h | 4 -
drivers/infiniband/hw/mlx5/main.c | 102 +++-----
drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 -
.../net/ethernet/mellanox/mlx5/core/Makefile | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 5 +-
.../net/ethernet/mellanox/mlx5/core/lib/dm.c | 223 +++++++++++++++++
.../net/ethernet/mellanox/mlx5/core/main.c | 5 +
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 3 +
.../net/ethernet/mellanox/mlx5/core/rdma.c | 8 +-
include/linux/mlx5/device.h | 7 +
include/linux/mlx5/driver.h | 14 ++
include/linux/mlx5/eswitch.h | 8 +-
include/linux/mlx5/mlx5_ifc.h | 235 +++++++++++++++---
14 files changed, 497 insertions(+), 251 deletions(-)
create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c
--
2.21.0
* Re: [PATCH bpf-next 04/13] bpf: refactor map_get_next_key()
From: Song Liu @ 2019-08-29 23:39 UTC (permalink / raw)
To: Yonghong Song
Cc: bpf, netdev@vger.kernel.org, Alexei Starovoitov, Brian Vazquez,
Daniel Borkmann, Kernel Team
In-Reply-To: <20190829064506.2750717-1-yhs@fb.com>
> On Aug 28, 2019, at 11:45 PM, Yonghong Song <yhs@fb.com> wrote:
>
> Refactor function map_get_next_key() with a new helper
> bpf_map_get_next_key(), which will be used later
> for batched map lookup/lookup_and_delete/delete operations.
>
> Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
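For readers following the refactor, the value of a standalone next-key helper is that a batched operation can call it in a loop. Below is a toy userspace sketch of that pattern. Every name in it is hypothetical, and it is not the kernel helper from this patch. The iteration rules are modeled on array-map style behavior, where a NULL key yields the first key and -ENOENT marks the end.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ENTRIES 4

/* Toy array map; a stand-in for a real BPF map, not kernel code. */
struct toy_map {
	uint32_t values[TOY_MAX_ENTRIES];
};

/*
 * Array-map style iteration: a NULL or out-of-range key yields the first
 * key, and -ENOENT signals that the last key has been passed.
 */
static int toy_map_get_next_key(const struct toy_map *map, const uint32_t *key,
				uint32_t *next_key)
{
	uint32_t index = key ? *key : UINT32_MAX;

	(void)map;
	if (index >= TOY_MAX_ENTRIES) {
		*next_key = 0;
		return 0;
	}
	if (index == TOY_MAX_ENTRIES - 1)
		return -ENOENT;
	*next_key = index + 1;
	return 0;
}

/* Hypothetical batched lookup built on the helper: copies up to count values. */
static size_t toy_map_lookup_batch(const struct toy_map *map, uint32_t *out,
				   size_t count)
{
	uint32_t key, next;
	const uint32_t *cur = NULL;
	size_t n = 0;

	assert(map && out);
	while (n < count && toy_map_get_next_key(map, cur, &next) == 0) {
		out[n++] = map->values[next];
		key = next;
		cur = &key;
	}
	return n;
}
```

The batch loop stops either when the caller's buffer is full or when the helper reports -ENOENT, so a single call can stand in for many individual next-key and lookup round trips.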
* Re: [PATCH bpf-next 03/13] bpf: refactor map_delete_elem()
From: Song Liu @ 2019-08-29 23:39 UTC (permalink / raw)
To: Yonghong Song
Cc: bpf, netdev@vger.kernel.org, Alexei Starovoitov, Brian Vazquez,
Daniel Borkmann, Kernel Team
In-Reply-To: <20190829064505.2750541-1-yhs@fb.com>
> On Aug 28, 2019, at 11:45 PM, Yonghong Song <yhs@fb.com> wrote:
>
> Refactor function map_delete_elem() with a new helper
> bpf_map_delete_elem(), which will be used later
> for batched lookup_and_delete and delete operations.
>
> Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
* Re: [PATCH v6 net-next 16/19] ionic: Add netdev-event handling
From: Jakub Kicinski @ 2019-08-29 23:37 UTC (permalink / raw)
To: Shannon Nelson; +Cc: netdev, davem
In-Reply-To: <20190829182720.68419-17-snelson@pensando.io>
On Thu, 29 Aug 2019 11:27:17 -0700, Shannon Nelson wrote:
> When the netdev gets a new name from userland, pass that name
> down to the NIC for internal tracking.
>
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
There is a precedent in ACPI for telling the FW what OS is running, but
I can't really tell how the interface name is useful to the firmware.