* Re: [RFC rdma 1/3] RDMA/core: Create a common mmap function
From: Jason Gunthorpe @ 2019-07-04 12:35 UTC (permalink / raw)
To: Gal Pressman
Cc: Michal Kalderon, dledford@redhat.com, leon@kernel.org,
sleybo@amazon.com, Ariel Elior, linux-rdma@vger.kernel.org
In-Reply-To: <85247f12-1d78-0e66-fadc-d04862511ca7@amazon.com>
On Wed, Jul 03, 2019 at 11:19:34AM +0300, Gal Pressman wrote:
> On 03/07/2019 1:31, Jason Gunthorpe wrote:
> >> Seems except Mellanox + hns the mmap flags aren't ABI.
> >> Also, current Mellanox code seems like it won't benefit from
> >> mmap cookie helper functions in any case as the mmap function is very specific and the flags used indicate
> >> the address and not just how to map it.
> >
> > IMHO, mlx5 has a goofy implementation here as it codes all of the object
> > type, handle and cachability flags in one thing.
>
> Do we need object type flags as well in the generic mmap code?
At the end of the day the driver needs to know what page to map during
the mmap syscall.
mlx5 does this by encoding the page type in the address, and then many
types have separate lookups based on the offset for the actual page.
IMHO the single lookup and opaque offset is generally better..
Since the mlx5 scheme is ABI it can't be changed unfortunately.
If you want to do user controlled cachability flags, or not, is a fair
question, but they still become ABI..
I'm wondering if it really makes sense to do that during the mmap, or
if the cachability should be set as part of creating the cookie?
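To make the contrast concrete, here is a minimal userspace sketch of the
single-lookup, opaque-offset idea, with the cachability stored when the
cookie is created and not passed at mmap time. Everything in it
(struct mmap_entry, struct cookie_table, cookie_insert(), cookie_lookup(),
the fixed-size table and the 4 KiB page shift) is hypothetical and only for
illustration; it is not the proposed RDMA/core API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12	/* assumption: 4 KiB pages */
#define MAX_ENTRIES 16		/* assumption: tiny fixed-size table */

enum cache_mode { MAP_CACHED, MAP_WRITE_COMBINE, MAP_UNCACHED };

/* One mappable object; filled in when the cookie is created. */
struct mmap_entry {
	uint64_t phys_addr;	/* what to map */
	size_t length;
	enum cache_mode cache;	/* fixed at creation, not at mmap time */
	int in_use;
};

struct cookie_table {
	struct mmap_entry entries[MAX_ENTRIES];
};

/*
 * Store an entry and return the opaque offset that user space would pass
 * to mmap(). Returns UINT64_MAX when the table is full.
 */
static uint64_t cookie_insert(struct cookie_table *t, uint64_t phys_addr,
			      size_t length, enum cache_mode cache)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		if (!t->entries[i].in_use) {
			t->entries[i] = (struct mmap_entry){
				.phys_addr = phys_addr,
				.length = length,
				.cache = cache,
				.in_use = 1,
			};
			return (uint64_t)i << PAGE_SHIFT_SKETCH;
		}
	}
	return UINT64_MAX;
}

/* Single lookup: offset -> entry. No type bits are decoded from the offset. */
static const struct mmap_entry *cookie_lookup(const struct cookie_table *t,
					      uint64_t offset)
{
	uint64_t idx = offset >> PAGE_SHIFT_SKETCH;

	/* Reject unaligned offsets, out-of-range indexes and empty slots. */
	if (offset & ((1ULL << PAGE_SHIFT_SKETCH) - 1))
		return NULL;
	if (idx >= MAX_ENTRIES || !t->entries[idx].in_use)
		return NULL;
	return &t->entries[idx];
}
```

In this sketch the offset carries no meaning beyond "which entry", so the
only thing that would become ABI is that user space passes back the value
it was given.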
> Another issue is that these flags aren't exposed in an ABI file, so
> a userspace library can't really make use of it in current state.
Woops.
Ah, this is all ABI so you need to dig out of this hole ASAP :)
Jason
* RE: [for-next V2 10/10] RDMA/core: Provide RDMA DIM support for ULPs
From: Idan Burstein @ 2019-07-04 12:30 UTC (permalink / raw)
To: Sagi Grimberg, Saeed Mahameed, David S. Miller, Doug Ledford,
Jason Gunthorpe
Cc: Leon Romanovsky, Or Gerlitz, Tal Gilboa, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, Yamin Friedman, Max Gurtovoy
In-Reply-To: <9d26c90c-8e0b-656f-341f-a67251549126@grimberg.me>
The essence of the dynamic in DIM is that it fits itself to the workload running on the cores, so that users do not have to trade bandwidth/cpu% against latency with a module parameter they don't know how to configure. If DIM consistently hurts the latency of latency-critical workloads, we should debug and fix it.
This is where we should go: an end goal of no configuration, with out-of-the-box performance in terms of both bandwidth/cpu% and latency.
We could take several steps in this direction if we are not mature enough today, but let's define them (e.g. tests on different ULPs).
-----Original Message-----
From: linux-rdma-owner@vger.kernel.org <linux-rdma-owner@vger.kernel.org> On Behalf Of Sagi Grimberg
Sent: Tuesday, July 2, 2019 8:37 AM
To: Idan Burstein <idanb@mellanox.com>; Saeed Mahameed <saeedm@mellanox.com>; David S. Miller <davem@davemloft.net>; Doug Ledford <dledford@redhat.com>; Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>; Tal Gilboa <talgi@mellanox.com>; netdev@vger.kernel.org; linux-rdma@vger.kernel.org; Yamin Friedman <yaminf@mellanox.com>; Max Gurtovoy <maxg@mellanox.com>
Subject: Re: [for-next V2 10/10] RDMA/core: Provide RDMA DIM support for ULPs
Hey Idan,
> " Please don't. This is a bad choice to opt it in by default."
>
> I disagree here. I'd prefer Linux to have a good out-of-the-box experience (e.g. reach 100G in 4K NVMeOF on Intel servers) with the default parameters. Especially since Yamin has shown it is beneficial / not hurting in terms of performance for a variety of use cases. The whole concept of DIM is that it adapts to the workload requirements in terms of bandwidth and latency.
Well, it's a Mellanox device driver after all.
But do note that, by far, the vast majority of users are not saturating 100G of 4K I/O. The vast majority of users are primarily sensitive to synchronous QD=1 I/O latency, and their workloads are much more dynamic than the synthetic 100%/50%/0% read mix.
As much as I'm a fan (IIRC I was the one giving a first pass at this), the DIM default opt-in is not only not beneficial, but potentially harmful to the majority of users' out-of-the-box experience.
Given that this is fresh code with almost no exposure, and that it was not tested outside of Yamin running limited performance testing, I think it would be a mistake to add it as a default opt-in; that can come as an incremental stage.
Obviously, I cannot tell Mellanox what it should or shouldn't do in its own device driver, but I just wanted to emphasize that I think this is a mistake.
> Moreover, net-dim is enabled by default, I don't see why RDMA is different.
Very different animals.
* Re: [net-next 1/3] ice: Initialize and register platform device to provide RDMA
From: Greg KH @ 2019-07-04 12:29 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Jeff Kirsher, davem@davemloft.net, dledford@redhat.com,
Tony Nguyen, netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
nhorman@redhat.com, sassmann@redhat.com, poswald@suse.com,
mustafa.ismail@intel.com, shiraz.saleem@intel.com, Dave Ertman,
Andrew Bowers
In-Reply-To: <20190704121632.GB3401@mellanox.com>
On Thu, Jul 04, 2019 at 12:16:41PM +0000, Jason Gunthorpe wrote:
> On Wed, Jul 03, 2019 at 07:12:50PM -0700, Jeff Kirsher wrote:
> > From: Tony Nguyen <anthony.l.nguyen@intel.com>
> >
> > The RDMA block does not advertise on the PCI bus or any other bus.
> > Thus the ice driver needs to provide access to the RDMA hardware block
> > via a virtual bus; utilize the platform bus to provide this access.
> >
> > This patch initializes the driver to support RDMA as well as creates
> > and registers a platform device for the RDMA driver to register to. At
> > this point the driver is fully initialized to register a platform
> > driver, however, can not yet register as the ops have not been
> > implemented.
>
> I think you need Greg's ack on all this driver stuff - particularly
> that a platform_device is OK.
A platform_device is almost NEVER ok.
Don't abuse it, make a real device on a real bus. If you don't have a
real bus and just need to create a device to hang other things off of,
then use the virtual one, that's what it is there for.
thanks,
greg k-h
* Re: [rdma 14/16] RDMA/irdma: Add ABI definitions
From: Jason Gunthorpe @ 2019-07-04 12:19 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jeff Kirsher, dledford@redhat.com, davem@davemloft.net,
Mustafa Ismail, linux-rdma@vger.kernel.org,
netdev@vger.kernel.org, nhorman@redhat.com, sassmann@redhat.com,
poswald@suse.com, david.m.ertman@intel.com, Shiraz Saleem
In-Reply-To: <20190704074021.GH4727@mtr-leonro.mtl.com>
On Thu, Jul 04, 2019 at 10:40:21AM +0300, Leon Romanovsky wrote:
> On Wed, Jul 03, 2019 at 07:12:57PM -0700, Jeff Kirsher wrote:
> > From: Mustafa Ismail <mustafa.ismail@intel.com>
> >
> > Add ABI definitions for irdma.
> >
> > Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
> > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> > include/uapi/rdma/irdma-abi.h | 130 ++++++++++++++++++++++++++++++++++
> > 1 file changed, 130 insertions(+)
> > create mode 100644 include/uapi/rdma/irdma-abi.h
> >
> > diff --git a/include/uapi/rdma/irdma-abi.h b/include/uapi/rdma/irdma-abi.h
> > new file mode 100644
> > index 000000000000..bdfbda4c829e
> > +++ b/include/uapi/rdma/irdma-abi.h
> > @@ -0,0 +1,130 @@
> > +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
> > +/* Copyright (c) 2006 - 2019 Intel Corporation. All rights reserved.
> > + * Copyright (c) 2005 Topspin Communications. All rights reserved.
> > + * Copyright (c) 2005 Cisco Systems. All rights reserved.
> > + * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
> > + */
> > +
> > +#ifndef IRDMA_ABI_H
> > +#define IRDMA_ABI_H
> > +
> > +#include <linux/types.h>
> > +
> > +/* irdma must support legacy GEN_1 i40iw kernel
> > + * and user-space whose last ABI ver is 5
> > + */
> > +#define IRDMA_ABI_VER 6
>
> Can you please elaborate about it more?
> There is no irdma code in RDMA yet, so it makes me wonder why new define
> shouldn't start from 1.
It is because they are ABI compatible with the current user space,
which raises the question why we even have this confusing header
file..
I think this needs to be added after you delete the old driver.
Jason
* Re: [rdma 1/1] RDMA/irdma: Add Kconfig and Makefile
From: Jason Gunthorpe @ 2019-07-04 12:18 UTC (permalink / raw)
To: Jeff Kirsher
Cc: dledford@redhat.com, davem@davemloft.net, Shiraz Saleem,
linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
nhorman@redhat.com, sassmann@redhat.com, poswald@suse.com,
david.m.ertman@intel.com, mustafa.ismail@intel.com
In-Reply-To: <20190704021259.15489-2-jeffrey.t.kirsher@intel.com>
On Wed, Jul 03, 2019 at 07:12:43PM -0700, Jeff Kirsher wrote:
> From: Shiraz Saleem <shiraz.saleem@intel.com>
>
> Add Kconfig and Makefile to build irdma driver and mark i40iw
> deprecated/obsolete, since the irdma driver is replacing it and supports
> x722 devices.
Patch 1/1? Series looks mangled...
> diff --git a/drivers/infiniband/hw/i40iw/Kconfig b/drivers/infiniband/hw/i40iw/Kconfig
> index d867ef1ac72a..7454b84b74be 100644
> +++ b/drivers/infiniband/hw/i40iw/Kconfig
> @@ -1,8 +1,10 @@
> config INFINIBAND_I40IW
> - tristate "Intel(R) Ethernet X722 iWARP Driver"
> + tristate "Intel(R) Ethernet X722 iWARP Driver (DEPRECATED)"
> depends on INET && I40E
> depends on IPV6 || !IPV6
> depends on PCI
> + depends on !(INFINIBAND_IRDMA=y || INFINIBAND_IRDMA=m)
No.. all drivers must be able to build at once. At least add some
COMPILE_TEST in here to enable building.
Jason
* Re: [net-next 1/3] ice: Initialize and register platform device to provide RDMA
From: Jason Gunthorpe @ 2019-07-04 12:16 UTC (permalink / raw)
To: Jeff Kirsher, Greg KH
Cc: davem@davemloft.net, dledford@redhat.com, Tony Nguyen,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
nhorman@redhat.com, sassmann@redhat.com, poswald@suse.com,
mustafa.ismail@intel.com, shiraz.saleem@intel.com, Dave Ertman,
Andrew Bowers
In-Reply-To: <20190704021252.15534-2-jeffrey.t.kirsher@intel.com>
On Wed, Jul 03, 2019 at 07:12:50PM -0700, Jeff Kirsher wrote:
> From: Tony Nguyen <anthony.l.nguyen@intel.com>
>
> The RDMA block does not advertise on the PCI bus or any other bus.
> Thus the ice driver needs to provide access to the RDMA hardware block
> via a virtual bus; utilize the platform bus to provide this access.
>
> This patch initializes the driver to support RDMA as well as creates
> and registers a platform device for the RDMA driver to register to. At
> this point the driver is fully initialized to register a platform
> driver, however, can not yet register as the ops have not been
> implemented.
I think you need Greg's ack on all this driver stuff - particularly
that a platform_device is OK.
Jason
* Re: [net-next 0/3][pull request] Intel Wired LAN ver Updates 2019-07-03
From: Jason Gunthorpe @ 2019-07-04 12:15 UTC (permalink / raw)
To: Jeff Kirsher
Cc: davem@davemloft.net, dledford@redhat.com, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, nhorman@redhat.com,
sassmann@redhat.com, mustafa.ismail@intel.com,
shiraz.saleem@intel.com, david.m.ertman@intel.com
In-Reply-To: <20190704021252.15534-1-jeffrey.t.kirsher@intel.com>
On Wed, Jul 03, 2019 at 07:12:49PM -0700, Jeff Kirsher wrote:
> This series contains updates to i40e and ice drivers only and is required
> for a series of changes being submitted to the RDMA maintainer/tree.
> Vice Versa, the Intel RDMA driver patches could not be applied to
> net-next due to dependencies to the changes currently in the for-next
> branch of the rdma git tree.
RDMA driver code is not to be applied to the net-next tree. We've
learned this causes too many work flow problems.
You must co-ordinate your driver with a shared git tree as Mellanox is
doing, or wait for alternating kernel releases.
I'm not sure what to do with this PR, it is far too late in the cycle
to submit a new driver so most likely net should not go ahead with
this prep work until this new driver model scheme is properly
reviewed.
> The following are changes since commit a51df9f8da43e8bf9e508143630849b7d696e053:
> gve: fix -ENOMEM null check on a page allocation
> and are available in the git repository at:
> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 100GbE
And if you expect anything to be shared with rdma you need to produce
pull requests that are based on some sensible -rc tag.
Jason
* Re: [for-next V2 10/10] RDMA/core: Provide RDMA DIM support for ULPs
From: Leon Romanovsky @ 2019-07-04 7:51 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Idan Burstein, Saeed Mahameed, David S. Miller, Doug Ledford,
Jason Gunthorpe, Or Gerlitz, Tal Gilboa, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, Yamin Friedman, Max Gurtovoy
In-Reply-To: <8d525d64-6da1-48c3-952d-8c6b0d541859@grimberg.me>
On Wed, Jul 03, 2019 at 11:56:04AM -0700, Sagi Grimberg wrote:
>
> > Hi Sagi,
> >
> > I'm not sharing your worries about bad out-of-the-box experience for a
> > number of reasons.
> >
> > First of all, this code is part of upstream kernel and will take time
> > till users actually start to use it as is and not as part of some distro
> > backports or MOFED packages.
>
> True, but I am still saying that this feature is damaging sync IO which
> represents the majority of the users. It might not be an extreme impact
> but it is still a degradation (from a very limited testing I did this
> morning I'm seeing a consistent 5%-10% latency increase for low QD
> workloads which is consistent with what Yamin reported AFAIR).
>
> But having said that, the call is for you guys to make as this is a
> Mellanox device. I absolutely think that this is useful (as I said
> before), I just don't think it's necessarily a good idea to opt it in by
> default given that only a limited set of users would take full advantage
> of it while the rest would see a negative impact (even if it's 10%).
>
> I don't have a hard objection here, just wanted to give you my
> opinion on this because mlx5 is an important driver for rdma
> users.
Your opinion is very valuable to us, and we started an internal thread to
challenge this "enable by default". It just takes time, and I prefer to
enable this code to get test coverage as wide as possible.
>
> > Second, Yamin did extensive testing and worked very closely with Or G.
> > and I have very high confidence in the results of their team work.
>
> Has anyone tested other RDMA ulps? NFS/RDMA or SRP/iSER?
>
> Would be interesting to understand how other subsystems with different
> characteristics behave with this.
Me too, and I'll revert this default if needed.
Thanks
* Re: [rdma 14/16] RDMA/irdma: Add ABI definitions
From: Leon Romanovsky @ 2019-07-04 7:40 UTC (permalink / raw)
To: Jeff Kirsher
Cc: dledford, jgg, davem, Mustafa Ismail, linux-rdma, netdev, nhorman,
sassmann, poswald, david.m.ertman, Shiraz Saleem
In-Reply-To: <20190704021259.15489-16-jeffrey.t.kirsher@intel.com>
On Wed, Jul 03, 2019 at 07:12:57PM -0700, Jeff Kirsher wrote:
> From: Mustafa Ismail <mustafa.ismail@intel.com>
>
> Add ABI definitions for irdma.
>
> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> ---
> include/uapi/rdma/irdma-abi.h | 130 ++++++++++++++++++++++++++++++++++
> 1 file changed, 130 insertions(+)
> create mode 100644 include/uapi/rdma/irdma-abi.h
>
> diff --git a/include/uapi/rdma/irdma-abi.h b/include/uapi/rdma/irdma-abi.h
> new file mode 100644
> index 000000000000..bdfbda4c829e
> --- /dev/null
> +++ b/include/uapi/rdma/irdma-abi.h
> @@ -0,0 +1,130 @@
> +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
> +/* Copyright (c) 2006 - 2019 Intel Corporation. All rights reserved.
> + * Copyright (c) 2005 Topspin Communications. All rights reserved.
> + * Copyright (c) 2005 Cisco Systems. All rights reserved.
> + * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
> + */
> +
> +#ifndef IRDMA_ABI_H
> +#define IRDMA_ABI_H
> +
> +#include <linux/types.h>
> +
> +/* irdma must support legacy GEN_1 i40iw kernel
> + * and user-space whose last ABI ver is 5
> + */
> +#define IRDMA_ABI_VER 6
Can you please elaborate about it more?
There is no irdma code in RDMA yet, so it makes me wonder why new define
shouldn't start from 1.
Thanks
* [PATCH for-next] RDMA/hns: Bugfix for hns Makefile
From: Lijun Ou @ 2019-07-04 6:22 UTC (permalink / raw)
To: dledford, jgg; +Cc: leon, linux-rdma, linuxarm
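There is a bug in the hns Makefile that leads to a build error when
allmodconfig is used to build the hns driver. The common objects in
$(hns-roce-objs) were added to obj-$(CONFIG_INFINIBAND_HNS) directly, so
each one was built as a separate module instead of being linked into
hns-roce-hw-v1.
The build log is as follows: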
After merging the rdma tree, today's linux-next build (x86_64
allmodconfig) failed like this:
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_ah.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_alloc.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_cmd.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_cq.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_db.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_hem.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_mr.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_pd.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_qp.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_restrack.o
see include/linux/module.h for more information
WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_srq.o
see include/linux/module.h for more information
ERROR: "hns_roce_bitmap_cleanup" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_bitmap_init" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_free_cmd_mailbox" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_alloc_cmd_mailbox" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_table_get" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_bitmap_alloc" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
ERROR: "hns_roce_table_find" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined!
Fixes: e9816ddf2a33 ("RDMA/hns: Cleanup unnecessary exported symbols")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
---
drivers/infiniband/hw/hns/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/hns/Makefile b/drivers/infiniband/hw/hns/Makefile
index b956cf4..b06125f 100644
--- a/drivers/infiniband/hw/hns/Makefile
+++ b/drivers/infiniband/hw/hns/Makefile
@@ -9,8 +9,8 @@ hns-roce-objs := hns_roce_main.o hns_roce_cmd.o hns_roce_pd.o \
hns_roce_cq.o hns_roce_alloc.o hns_roce_db.o hns_roce_srq.o hns_roce_restrack.o
ifdef CONFIG_INFINIBAND_HNS_HIP06
-hns-roce-hw-v1-objs := hns_roce_hw_v1.o
-obj-$(CONFIG_INFINIBAND_HNS) += hns-roce-hw-v1.o $(hns-roce-objs)
+hns-roce-hw-v1-objs := hns_roce_hw_v1.o $(hns-roce-objs)
+obj-$(CONFIG_INFINIBAND_HNS) += hns-roce-hw-v1.o
endif
ifdef CONFIG_INFINIBAND_HNS_HIP08
--
1.9.1
* [PATCH mlx5-next 5/5] net/mlx5: Properly name the generic WQE control field
From: Saeed Mahameed @ 2019-07-03 7:39 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Tariq Toukan
In-Reply-To: <20190703073909.14965-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
A generic WQE control field is used for different purposes
in different cases.
Use union to allow using the proper name in each case.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
include/linux/mlx5/qp.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index d1f353c64797..127d224443e3 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -202,7 +202,12 @@ struct mlx5_wqe_ctrl_seg {
u8 signature;
u8 rsvd[2];
u8 fm_ce_se;
- __be32 imm;
+ union {
+ __be32 general_id;
+ __be32 imm;
+ __be32 umr_mkey;
+ __be32 tisn;
+ };
};
#define MLX5_WQE_CTRL_DS_MASK 0x3f
--
2.21.0
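
As an illustration of the change above, the following userspace sketch
mirrors the union. The struct name ctrl_seg_sketch and the helper
ctrl_set_tisn() are hypothetical, uint32_t stands in for __be32, and the
fields that precede signature in the real struct are omitted; only the four
member names come from the patch. It shows that all four names alias the
same 32 bits, so the size and layout of the segment do not change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tail of struct mlx5_wqe_ctrl_seg. */
struct ctrl_seg_sketch {
	uint8_t signature;
	uint8_t rsvd[2];
	uint8_t fm_ce_se;
	union {			/* anonymous union: members are accessed directly */
		uint32_t general_id;
		uint32_t imm;
		uint32_t umr_mkey;
		uint32_t tisn;
	};
};

/* Hypothetical helper: a caller stores a TIS number under its proper name. */
static void ctrl_set_tisn(struct ctrl_seg_sketch *ctrl, uint32_t tisn)
{
	ctrl->tisn = tisn;
}
```

Existing users of imm keep compiling unchanged, while new users can pick
the name that matches what they put in the field.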
* [PATCH mlx5-next 4/5] net/mlx5: Introduce TLS TX offload hardware bits and structures
From: Saeed Mahameed @ 2019-07-03 7:39 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
Eran Ben Elisha, Tariq Toukan
In-Reply-To: <20190703073909.14965-1-saeedm@mellanox.com>
From: Eran Ben Elisha <eranbe@mellanox.com>
Add TLS offload related IFC structs, layouts and enumerations.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
include/linux/mlx5/device.h | 14 +++++
include/linux/mlx5/mlx5_ifc.h | 104 ++++++++++++++++++++++++++++++++--
2 files changed, 114 insertions(+), 4 deletions(-)
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 5e760067ac41..5f7d1671ad5a 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -437,6 +437,7 @@ enum {
MLX5_OPCODE_SET_PSV = 0x20,
MLX5_OPCODE_GET_PSV = 0x21,
MLX5_OPCODE_CHECK_PSV = 0x22,
+ MLX5_OPCODE_DUMP = 0x23,
MLX5_OPCODE_RGET_PSV = 0x26,
MLX5_OPCODE_RCHECK_PSV = 0x27,
@@ -444,6 +445,14 @@ enum {
};
+enum {
+ MLX5_OPC_MOD_TLS_TIS_STATIC_PARAMS = 0x20,
+};
+
+enum {
+ MLX5_OPC_MOD_TLS_TIS_PROGRESS_PARAMS = 0x20,
+};
+
enum {
MLX5_SET_PORT_RESET_QKEY = 0,
MLX5_SET_PORT_GUID0 = 16,
@@ -1077,6 +1086,8 @@ enum mlx5_cap_type {
MLX5_CAP_DEBUG,
MLX5_CAP_RESERVED_14,
MLX5_CAP_DEV_MEM,
+ MLX5_CAP_RESERVED_16,
+ MLX5_CAP_TLS,
/* NUM OF CAP Types */
MLX5_CAP_NUM
};
@@ -1255,6 +1266,9 @@ enum mlx5_qcam_feature_groups {
#define MLX5_CAP64_DEV_MEM(mdev, cap)\
MLX5_GET64(device_mem_cap, mdev->caps.hca_cur[MLX5_CAP_DEV_MEM], cap)
+#define MLX5_CAP_TLS(mdev, cap) \
+ MLX5_GET(tls_cap, (mdev)->caps.hca_cur[MLX5_CAP_TLS], cap)
+
enum {
MLX5_CMD_STAT_OK = 0x0,
MLX5_CMD_STAT_INT_ERR = 0x1,
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 031db53e94ce..1f77ae1ed250 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -953,6 +953,16 @@ struct mlx5_ifc_vector_calc_cap_bits {
u8 reserved_at_c0[0x720];
};
+struct mlx5_ifc_tls_cap_bits {
+ u8 tls_1_2_aes_gcm_128[0x1];
+ u8 tls_1_3_aes_gcm_128[0x1];
+ u8 tls_1_2_aes_gcm_256[0x1];
+ u8 tls_1_3_aes_gcm_256[0x1];
+ u8 reserved_at_4[0x1c];
+
+ u8 reserved_at_20[0x7e0];
+};
+
enum {
MLX5_WQ_TYPE_LINKED_LIST = 0x0,
MLX5_WQ_TYPE_CYCLIC = 0x1,
@@ -1282,7 +1292,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 reserved_at_440[0x20];
- u8 reserved_at_460[0x3];
+ u8 tls[0x1];
+ u8 reserved_at_461[0x2];
u8 log_max_uctx[0x5];
u8 reserved_at_468[0x3];
u8 log_max_umem[0x5];
@@ -1307,7 +1318,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 max_geneve_tlv_option_data_len[0x5];
u8 reserved_at_570[0x10];
- u8 reserved_at_580[0x3c];
+ u8 reserved_at_580[0x33];
+ u8 log_max_dek[0x5];
+ u8 reserved_at_5b8[0x4];
u8 mini_cqe_resp_stride_index[0x1];
u8 cqe_128_always[0x1];
u8 cqe_compression_128[0x1];
@@ -2586,6 +2599,7 @@ union mlx5_ifc_hca_cap_union_bits {
struct mlx5_ifc_qos_cap_bits qos_cap;
struct mlx5_ifc_debug_cap_bits debug_cap;
struct mlx5_ifc_fpga_cap_bits fpga_cap;
+ struct mlx5_ifc_tls_cap_bits tls_cap;
u8 reserved_at_0[0x8000];
};
@@ -2725,7 +2739,8 @@ struct mlx5_ifc_traffic_counter_bits {
struct mlx5_ifc_tisc_bits {
u8 strict_lag_tx_port_affinity[0x1];
- u8 reserved_at_1[0x3];
+ u8 tls_en[0x1];
+ u8 reserved_at_2[0x2];
u8 lag_tx_port_affinity[0x04];
u8 reserved_at_8[0x4];
@@ -2739,7 +2754,11 @@ struct mlx5_ifc_tisc_bits {
u8 reserved_at_140[0x8];
u8 underlay_qpn[0x18];
- u8 reserved_at_160[0x3a0];
+
+ u8 reserved_at_160[0x8];
+ u8 pd[0x18];
+
+ u8 reserved_at_180[0x380];
};
enum {
@@ -9937,4 +9956,81 @@ struct mlx5_ifc_alloc_sf_in_bits {
u8 reserved_at_60[0x20];
};
+enum {
+ MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = BIT(0xc),
+};
+
+enum {
+ MLX5_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = 0xc,
+};
+
+struct mlx5_ifc_encryption_key_obj_bits {
+ u8 modify_field_select[0x40];
+
+ u8 reserved_at_40[0x14];
+ u8 key_size[0x4];
+ u8 reserved_at_58[0x4];
+ u8 key_type[0x4];
+
+ u8 reserved_at_60[0x8];
+ u8 pd[0x18];
+
+ u8 reserved_at_80[0x180];
+ u8 key[8][0x20];
+
+ u8 reserved_at_300[0x500];
+};
+
+struct mlx5_ifc_create_encryption_key_in_bits {
+ struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr;
+ struct mlx5_ifc_encryption_key_obj_bits encryption_key_object;
+};
+
+enum {
+ MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_KEY_SIZE_128 = 0x0,
+ MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_KEY_SIZE_256 = 0x1,
+};
+
+enum {
+ MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_TYPE_DEK = 0x1,
+};
+
+struct mlx5_ifc_tls_static_params_bits {
+ u8 const_2[0x2];
+ u8 tls_version[0x4];
+ u8 const_1[0x2];
+ u8 reserved_at_8[0x14];
+ u8 encryption_standard[0x4];
+
+ u8 reserved_at_20[0x20];
+
+ u8 initial_record_number[0x40];
+
+ u8 resync_tcp_sn[0x20];
+
+ u8 gcm_iv[0x20];
+
+ u8 implicit_iv[0x40];
+
+ u8 reserved_at_100[0x8];
+ u8 dek_index[0x18];
+
+ u8 reserved_at_120[0xe0];
+};
+
+struct mlx5_ifc_tls_progress_params_bits {
+ u8 valid[0x1];
+ u8 reserved_at_1[0x7];
+ u8 pd[0x18];
+
+ u8 next_record_tcp_sn[0x20];
+
+ u8 hw_resync_tcp_sn[0x20];
+
+ u8 record_tracker_state[0x2];
+ u8 auth_state[0x2];
+ u8 reserved_at_64[0x4];
+ u8 hw_offset_record_number[0x18];
+};
+
#endif /* MLX5_IFC_H */
--
2.21.0
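
The layouts in mlx5_ifc.h follow a convention that is easy to get wrong
when a reserved range is split: each array length is a field width in bits,
and each reserved_at_<N> name records the hexadecimal bit offset at which
that field starts. A minimal userspace sketch of checking that convention
is shown below. struct field_desc and check_layout() are hypothetical and
not kernel code; the field names and widths are copied from
struct mlx5_ifc_tls_progress_params_bits in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one field: its name and its width in bits. */
struct field_desc {
	const char *name;
	unsigned int width_bits;
};

/* Widths copied from struct mlx5_ifc_tls_progress_params_bits. */
static const struct field_desc tls_progress_params[] = {
	{ "valid", 0x1 },
	{ "reserved_at_1", 0x7 },
	{ "pd", 0x18 },
	{ "next_record_tcp_sn", 0x20 },
	{ "hw_resync_tcp_sn", 0x20 },
	{ "record_tracker_state", 0x2 },
	{ "auth_state", 0x2 },
	{ "reserved_at_64", 0x4 },
	{ "hw_offset_record_number", 0x18 },
};

/*
 * Walk the fields while accumulating the bit offset. Returns the total
 * width in bits, or -1 if a reserved_at_<hex> name disagrees with the
 * offset at which that field actually starts.
 */
static long check_layout(const struct field_desc *f, size_t n)
{
	static const char prefix[] = "reserved_at_";
	const size_t prefix_len = sizeof(prefix) - 1;
	unsigned long offset = 0;

	for (size_t i = 0; i < n; i++) {
		if (strncmp(f[i].name, prefix, prefix_len) == 0) {
			unsigned long named =
				strtoul(f[i].name + prefix_len, NULL, 16);

			if (named != offset)
				return -1;
		}
		offset += f[i].width_bits;
	}
	return (long)offset;
}
```

Running check_layout() over the table above yields the total size of the
layout in bits; a mismatch between a reserved field's name and its real
position is reported as -1.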
* [PATCH mlx5-next 3/5] net/mlx5: Refactor mlx5_esw_query_functions for modularity
From: Saeed Mahameed @ 2019-07-03 7:39 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Parav Pandit
In-Reply-To: <20190703073909.14965-1-saeedm@mellanox.com>
From: Parav Pandit <parav@mellanox.com>
The size of the functions change event output data changes when functions
other than VFs are enabled in HCA CAP.
With the current API, multiple callers need to align on and calculate the
accurate size of the output data depending on the number of non-VF
functions enabled in the device.
Instead of duplicating such math in multiple places, refactor
mlx5_esw_query_functions() to return raw output allocated by itself.
The caller must free the allocated memory using kvfree(), as described in
the function comment section.
This hides the calculation within mlx5_esw_query_functions() and provides
a simpler API.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
.../net/ethernet/mellanox/mlx5/core/eswitch.c | 38 +++++++++++++++----
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 7 ++--
.../mellanox/mlx5/core/eswitch_offloads.c | 8 ++--
.../net/ethernet/mellanox/mlx5/core/sriov.c | 15 +++++---
4 files changed, 46 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 9137a8390216..62954265b57c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1715,14 +1715,34 @@ static int eswitch_vport_event(struct notifier_block *nb,
return NOTIFY_OK;
}
-int mlx5_esw_query_functions(struct mlx5_core_dev *dev, u32 *out, int outlen)
+/**
+ * mlx5_esw_query_functions - Returns raw output about functions state
+ * @dev: Pointer to device to query
+ *
+ * mlx5_esw_query_functions() allocates and returns functions changed
+ * raw output memory pointer from device on success. Otherwise returns ERR_PTR.
+ * Caller must free the memory using kvfree() when valid pointer is returned.
+ */
+const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev)
{
+ int outlen = MLX5_ST_SZ_BYTES(query_esw_functions_out);
u32 in[MLX5_ST_SZ_DW(query_esw_functions_in)] = {};
+ u32 *out;
+ int err;
+
+ out = kvzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return ERR_PTR(-ENOMEM);
MLX5_SET(query_esw_functions_in, in, opcode,
MLX5_CMD_OP_QUERY_ESW_FUNCTIONS);
- return mlx5_cmd_exec(dev, in, sizeof(in), out, outlen);
+ err = mlx5_cmd_exec(dev, in, sizeof(in), out, outlen);
+ if (!err)
+ return out;
+
+ kvfree(out);
+ return ERR_PTR(err);
}
static void mlx5_eswitch_event_handlers_register(struct mlx5_eswitch *esw)
@@ -2527,8 +2547,7 @@ bool mlx5_esw_multipath_prereq(struct mlx5_core_dev *dev0,
void mlx5_eswitch_update_num_of_vfs(struct mlx5_eswitch *esw, const int num_vfs)
{
- u32 out[MLX5_ST_SZ_DW(query_esw_functions_out)] = {};
- int err;
+ const u32 *out;
WARN_ON_ONCE(esw->mode != MLX5_ESWITCH_NONE);
@@ -2537,8 +2556,11 @@ void mlx5_eswitch_update_num_of_vfs(struct mlx5_eswitch *esw, const int num_vfs)
return;
}
- err = mlx5_esw_query_functions(esw->dev, out, sizeof(out));
- if (!err)
- esw->esw_funcs.num_vfs = MLX5_GET(query_esw_functions_out, out,
- host_params_context.host_num_of_vfs);
+ out = mlx5_esw_query_functions(esw->dev);
+ if (IS_ERR(out))
+ return;
+
+ esw->esw_funcs.num_vfs = MLX5_GET(query_esw_functions_out, out,
+ host_params_context.host_num_of_vfs);
+ kvfree(out);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index f59183440d7f..d2d33a9893bb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -403,7 +403,7 @@ bool mlx5_esw_lag_prereq(struct mlx5_core_dev *dev0,
bool mlx5_esw_multipath_prereq(struct mlx5_core_dev *dev0,
struct mlx5_core_dev *dev1);
-int mlx5_esw_query_functions(struct mlx5_core_dev *dev, u32 *out, int outlen);
+const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev);
#define MLX5_DEBUG_ESWITCH_MASK BIT(3)
@@ -560,10 +560,9 @@ static inline int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int mode) { ret
static inline void mlx5_eswitch_disable(struct mlx5_eswitch *esw) {}
static inline bool mlx5_esw_lag_prereq(struct mlx5_core_dev *dev0, struct mlx5_core_dev *dev1) { return true; }
static inline bool mlx5_eswitch_is_funcs_handler(struct mlx5_core_dev *dev) { return false; }
-static inline int
-mlx5_esw_query_functions(struct mlx5_core_dev *dev, u32 *out, int outlen)
+static inline const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev)
{
- return -EOPNOTSUPP;
+ return ERR_PTR(-EOPNOTSUPP);
}
static inline void mlx5_eswitch_update_num_of_vfs(struct mlx5_eswitch *esw, const int num_vfs) {}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 42c0db585561..74ab7bd264ed 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2075,19 +2075,19 @@ esw_vfs_changed_event_handler(struct mlx5_eswitch *esw, const u32 *out)
static void esw_functions_changed_event_handler(struct work_struct *work)
{
- u32 out[MLX5_ST_SZ_DW(query_esw_functions_out)] = {};
struct mlx5_host_work *host_work;
struct mlx5_eswitch *esw;
- int err;
+ const u32 *out;
host_work = container_of(work, struct mlx5_host_work, work);
esw = host_work->esw;
- err = mlx5_esw_query_functions(esw->dev, out, sizeof(out));
- if (err)
+ out = mlx5_esw_query_functions(esw->dev);
+ if (IS_ERR(out))
goto out;
esw_vfs_changed_event_handler(esw, out);
+ kvfree(out);
out:
kfree(host_work);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index 547d0be9025e..61fcfd8b39b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -197,22 +197,25 @@ void mlx5_sriov_detach(struct mlx5_core_dev *dev)
static u16 mlx5_get_max_vfs(struct mlx5_core_dev *dev)
{
- u32 out[MLX5_ST_SZ_DW(query_esw_functions_out)] = {};
u16 host_total_vfs;
- int err;
+ const u32 *out;
if (mlx5_core_is_ecpf_esw_manager(dev)) {
- err = mlx5_esw_query_functions(dev, out, sizeof(out));
- host_total_vfs = MLX5_GET(query_esw_functions_out, out,
- host_params_context.host_total_vfs);
+ out = mlx5_esw_query_functions(dev);
/* Old FW doesn't support getting total_vfs from esw func
* but supports getting it from pci_sriov.
*/
- if (!err && host_total_vfs)
+ if (IS_ERR(out))
+ goto done;
+ host_total_vfs = MLX5_GET(query_esw_functions_out, out,
+ host_params_context.host_total_vfs);
+ kvfree(out);
+ if (host_total_vfs)
return host_total_vfs;
}
+done:
return pci_sriov_get_totalvfs(dev->pdev);
}
--
2.21.0
^ permalink raw reply related
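For readers unfamiliar with the calling convention this patch switches to, the following is a minimal userspace sketch of the "return a buffer or an encoded errno" pattern. It is not the kernel code: err_ptr(), is_err() and ptr_err() are simplified stand-ins for the helpers in <linux/err.h>, and query_functions() and get_num_vfs() are hypothetical names that only mirror the shape of the change (the callee allocates, the caller checks and frees).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR().
 * Simplified copies for illustration, not the kernel definitions. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical query function: allocates the output buffer itself and
 * returns either the buffer or an encoded negative errno, never NULL. */
static const uint32_t *query_functions(int supported, size_t ndwords)
{
	uint32_t *out;

	if (!supported)
		return err_ptr(-EOPNOTSUPP);

	out = calloc(ndwords, sizeof(*out));
	if (!out)
		return err_ptr(-ENOMEM);

	out[0] = 4;	/* assumption: dword 0 stands in for a VF count */
	return out;
}

/* Caller side: test with is_err(), read the buffer, then free it. */
static int get_num_vfs(int supported)
{
	const uint32_t *out = query_functions(supported, 8);
	int num_vfs;

	if (is_err(out))
		return (int)ptr_err(out);

	num_vfs = (int)out[0];
	free((void *)out);
	return num_vfs;
}
```

The point of the pattern is that the caller no longer sizes or owns a stack buffer; in exchange, every successful call must be paired with a free, as the kvfree() calls in the patch do.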
* [PATCH mlx5-next 2/5] net/mlx5: E-Switch prepare functions change handler to be modular
From: Saeed Mahameed @ 2019-07-03 7:39 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Parav Pandit
In-Reply-To: <20190703073909.14965-1-saeedm@mellanox.com>
From: Parav Pandit <parav@mellanox.com>
The E-Switch function change handler will service multiple types of events,
covering updates to both VF and non-VF functions.
Hence, introduce and use the helper function
esw_vfs_changed_event_handler() to handle a change in the number of VFs and
improve code readability.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
.../mellanox/mlx5/core/eswitch_offloads.c | 44 ++++++++++++-------
1 file changed, 27 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 5c8fb2597bfa..42c0db585561 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2046,38 +2046,48 @@ static void esw_offloads_steering_cleanup(struct mlx5_eswitch *esw)
esw_destroy_offloads_acl_tables(esw);
}
-static void esw_functions_changed_event_handler(struct work_struct *work)
+static void
+esw_vfs_changed_event_handler(struct mlx5_eswitch *esw, const u32 *out)
{
- u32 out[MLX5_ST_SZ_DW(query_esw_functions_out)] = {};
- struct mlx5_host_work *host_work;
- struct mlx5_eswitch *esw;
bool host_pf_disabled;
- u16 num_vfs = 0;
- int err;
-
- host_work = container_of(work, struct mlx5_host_work, work);
- esw = host_work->esw;
+ u16 new_num_vfs;
- err = mlx5_esw_query_functions(esw->dev, out, sizeof(out));
- num_vfs = MLX5_GET(query_esw_functions_out, out,
- host_params_context.host_num_of_vfs);
+ new_num_vfs = MLX5_GET(query_esw_functions_out, out,
+ host_params_context.host_num_of_vfs);
host_pf_disabled = MLX5_GET(query_esw_functions_out, out,
host_params_context.host_pf_disabled);
- if (err || host_pf_disabled || num_vfs == esw->esw_funcs.num_vfs)
- goto out;
+
+ if (new_num_vfs == esw->esw_funcs.num_vfs || host_pf_disabled)
+ return;
/* Number of VFs can only change from "0 to x" or "x to 0". */
if (esw->esw_funcs.num_vfs > 0) {
esw_offloads_unload_vf_reps(esw, esw->esw_funcs.num_vfs);
} else {
- err = esw_offloads_load_vf_reps(esw, num_vfs);
+ int err;
+ err = esw_offloads_load_vf_reps(esw, new_num_vfs);
if (err)
- goto out;
+ return;
}
+ esw->esw_funcs.num_vfs = new_num_vfs;
+}
+
+static void esw_functions_changed_event_handler(struct work_struct *work)
+{
+ u32 out[MLX5_ST_SZ_DW(query_esw_functions_out)] = {};
+ struct mlx5_host_work *host_work;
+ struct mlx5_eswitch *esw;
+ int err;
+
+ host_work = container_of(work, struct mlx5_host_work, work);
+ esw = host_work->esw;
- esw->esw_funcs.num_vfs = num_vfs;
+ err = mlx5_esw_query_functions(esw->dev, out, sizeof(out));
+ if (err)
+ goto out;
+ esw_vfs_changed_event_handler(esw, out);
out:
kfree(host_work);
}
--
2.21.0
^ permalink raw reply related
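The early-return logic of the new helper can be modeled outside the kernel. The sketch below is a simplified userspace illustration, not the driver code: struct funcs_state and vfs_changed() are hypothetical names, the counters stand in for loading and unloading VF representors, and the load-failure path of the real helper is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the E-Switch function state. */
struct funcs_state {
	uint16_t num_vfs;	/* last VF count that was acted on */
	int loads;		/* times the "load reps" path ran */
	int unloads;		/* times the "unload reps" path ran */
};

/* Returns true if the event changed the state, false if it was ignored.
 * The VF count is assumed to move only "0 to x" or "x to 0". */
static bool vfs_changed(struct funcs_state *st, uint16_t new_num_vfs,
			bool host_pf_disabled)
{
	if (new_num_vfs == st->num_vfs || host_pf_disabled)
		return false;

	if (st->num_vfs > 0)
		st->unloads++;	/* x -> 0: tear down existing reps */
	else
		st->loads++;	/* 0 -> x: create reps (cannot fail here) */

	st->num_vfs = new_num_vfs;
	return true;
}
```

Keeping the VF handling in its own function leaves the work handler free to dispatch other event types later, which is the stated purpose of the patch.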
* [PATCH mlx5-next 1/5] net/mlx5: Introduce and use mlx5_eswitch_get_total_vports()
From: Saeed Mahameed @ 2019-07-03 7:39 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Parav Pandit
In-Reply-To: <20190703073909.14965-1-saeedm@mellanox.com>
From: Parav Pandit <parav@mellanox.com>
Instead of MLX5_TOTAL_VPORTS, use mlx5_eswitch_get_total_vports().
In a subsequent patch, mlx5_eswitch_get_total_vports() accounts for SF
vports as well.
Expanding the MLX5_TOTAL_VPORTS macro would require exposing SF internals to
the more generic vport.h header file. Such exposure is not desired.
Hence mlx5_eswitch_get_total_vports() is introduced.
Given that the mlx5_eswitch_get_total_vports() API wants to work on a const
mlx5_core_dev *, change its helper functions to accept a const *dev as well.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/infiniband/hw/mlx5/ib_rep.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/eswitch.c | 4 ++-
.../mellanox/mlx5/core/eswitch_offloads.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/fs_core.c | 26 +++++++++++--------
.../net/ethernet/mellanox/mlx5/core/vport.c | 15 +++++++++++
include/linux/mlx5/driver.h | 9 ++++---
include/linux/mlx5/eswitch.h | 3 +++
include/linux/mlx5/vport.h | 3 ---
8 files changed, 43 insertions(+), 21 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c
index 3065c5d0ee96..f2cb789d2331 100644
--- a/drivers/infiniband/hw/mlx5/ib_rep.c
+++ b/drivers/infiniband/hw/mlx5/ib_rep.c
@@ -29,7 +29,7 @@ mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
static int
mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
{
- int num_ports = MLX5_TOTAL_VPORTS(dev);
+ int num_ports = mlx5_eswitch_get_total_vports(dev);
const struct mlx5_ib_profile *profile;
struct mlx5_ib_dev *ibdev;
int vport_index;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 89f52370e770..9137a8390216 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1868,14 +1868,16 @@ void mlx5_eswitch_disable(struct mlx5_eswitch *esw)
int mlx5_eswitch_init(struct mlx5_core_dev *dev)
{
- int total_vports = MLX5_TOTAL_VPORTS(dev);
struct mlx5_eswitch *esw;
struct mlx5_vport *vport;
+ int total_vports;
int err, i;
if (!MLX5_VPORT_MANAGER(dev))
return 0;
+ total_vports = mlx5_eswitch_get_total_vports(dev);
+
esw_info(dev,
"Total vports %d, per vport: max uc(%d) max mc(%d)\n",
total_vports,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 50e5841c1698..5c8fb2597bfa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1394,7 +1394,7 @@ void esw_offloads_cleanup_reps(struct mlx5_eswitch *esw)
int esw_offloads_init_reps(struct mlx5_eswitch *esw)
{
- int total_vports = MLX5_TOTAL_VPORTS(esw->dev);
+ int total_vports = esw->total_vports;
struct mlx5_core_dev *dev = esw->dev;
struct mlx5_eswitch_rep *rep;
u8 hw_id[ETH_ALEN], rep_type;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 9f5544ac6b8a..8162252585ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2090,7 +2090,7 @@ struct mlx5_flow_namespace *mlx5_get_flow_vport_acl_namespace(struct mlx5_core_d
{
struct mlx5_flow_steering *steering = dev->priv.steering;
- if (!steering || vport >= MLX5_TOTAL_VPORTS(dev))
+ if (!steering || vport >= mlx5_eswitch_get_total_vports(dev))
return NULL;
switch (type) {
@@ -2421,7 +2421,7 @@ static void cleanup_egress_acls_root_ns(struct mlx5_core_dev *dev)
if (!steering->esw_egress_root_ns)
return;
- for (i = 0; i < MLX5_TOTAL_VPORTS(dev); i++)
+ for (i = 0; i < mlx5_eswitch_get_total_vports(dev); i++)
cleanup_root_ns(steering->esw_egress_root_ns[i]);
kfree(steering->esw_egress_root_ns);
@@ -2435,7 +2435,7 @@ static void cleanup_ingress_acls_root_ns(struct mlx5_core_dev *dev)
if (!steering->esw_ingress_root_ns)
return;
- for (i = 0; i < MLX5_TOTAL_VPORTS(dev); i++)
+ for (i = 0; i < mlx5_eswitch_get_total_vports(dev); i++)
cleanup_root_ns(steering->esw_ingress_root_ns[i]);
kfree(steering->esw_ingress_root_ns);
@@ -2614,16 +2614,18 @@ static int init_ingress_acl_root_ns(struct mlx5_flow_steering *steering, int vpo
static int init_egress_acls_root_ns(struct mlx5_core_dev *dev)
{
struct mlx5_flow_steering *steering = dev->priv.steering;
+ int total_vports = mlx5_eswitch_get_total_vports(dev);
int err;
int i;
- steering->esw_egress_root_ns = kcalloc(MLX5_TOTAL_VPORTS(dev),
- sizeof(*steering->esw_egress_root_ns),
- GFP_KERNEL);
+ steering->esw_egress_root_ns =
+ kcalloc(total_vports,
+ sizeof(*steering->esw_egress_root_ns),
+ GFP_KERNEL);
if (!steering->esw_egress_root_ns)
return -ENOMEM;
- for (i = 0; i < MLX5_TOTAL_VPORTS(dev); i++) {
+ for (i = 0; i < total_vports; i++) {
err = init_egress_acl_root_ns(steering, i);
if (err)
goto cleanup_root_ns;
@@ -2641,16 +2643,18 @@ static int init_egress_acls_root_ns(struct mlx5_core_dev *dev)
static int init_ingress_acls_root_ns(struct mlx5_core_dev *dev)
{
struct mlx5_flow_steering *steering = dev->priv.steering;
+ int total_vports = mlx5_eswitch_get_total_vports(dev);
int err;
int i;
- steering->esw_ingress_root_ns = kcalloc(MLX5_TOTAL_VPORTS(dev),
- sizeof(*steering->esw_ingress_root_ns),
- GFP_KERNEL);
+ steering->esw_ingress_root_ns =
+ kcalloc(total_vports,
+ sizeof(*steering->esw_ingress_root_ns),
+ GFP_KERNEL);
if (!steering->esw_ingress_root_ns)
return -ENOMEM;
- for (i = 0; i < MLX5_TOTAL_VPORTS(dev); i++) {
+ for (i = 0; i < total_vports; i++) {
err = init_ingress_acl_root_ns(steering, i);
if (err)
goto cleanup_root_ns;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
index 670fa493c5f5..c912d82ca64b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -34,6 +34,7 @@
#include <linux/etherdevice.h>
#include <linux/mlx5/driver.h>
#include <linux/mlx5/vport.h>
+#include <linux/mlx5/eswitch.h>
#include "mlx5_core.h"
/* Mutex to hold while enabling or disabling RoCE */
@@ -1165,3 +1166,17 @@ u64 mlx5_query_nic_system_image_guid(struct mlx5_core_dev *mdev)
return tmp;
}
EXPORT_SYMBOL_GPL(mlx5_query_nic_system_image_guid);
+
+/**
+ * mlx5_eswitch_get_total_vports - Get total vports of the eswitch
+ *
+ * @dev: Pointer to core device
+ *
+ * mlx5_eswitch_get_total_vports returns total number of vports for
+ * the eswitch.
+ */
+u16 mlx5_eswitch_get_total_vports(const struct mlx5_core_dev *dev)
+{
+ return MLX5_SPECIAL_VPORTS(dev) + mlx5_core_max_vfs(dev);
+}
+EXPORT_SYMBOL(mlx5_eswitch_get_total_vports);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 7658a4908431..2c3e8d86e12d 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1083,7 +1083,7 @@ enum {
MLX5_PCI_DEV_IS_VF = 1 << 0,
};
-static inline bool mlx5_core_is_pf(struct mlx5_core_dev *dev)
+static inline bool mlx5_core_is_pf(const struct mlx5_core_dev *dev)
{
return dev->coredev_type == MLX5_COREDEV_PF;
}
@@ -1093,17 +1093,18 @@ static inline bool mlx5_core_is_ecpf(struct mlx5_core_dev *dev)
return dev->caps.embedded_cpu;
}
-static inline bool mlx5_core_is_ecpf_esw_manager(struct mlx5_core_dev *dev)
+static inline bool
+mlx5_core_is_ecpf_esw_manager(const struct mlx5_core_dev *dev)
{
return dev->caps.embedded_cpu && MLX5_CAP_GEN(dev, eswitch_manager);
}
-static inline bool mlx5_ecpf_vport_exists(struct mlx5_core_dev *dev)
+static inline bool mlx5_ecpf_vport_exists(const struct mlx5_core_dev *dev)
{
return mlx5_core_is_pf(dev) && MLX5_CAP_ESW(dev, ecpf_vport_exists);
}
-static inline u16 mlx5_core_max_vfs(struct mlx5_core_dev *dev)
+static inline u16 mlx5_core_max_vfs(const struct mlx5_core_dev *dev)
{
return dev->priv.sriov.max_vfs;
}
diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h
index d4731199edb4..61db37aa9642 100644
--- a/include/linux/mlx5/eswitch.h
+++ b/include/linux/mlx5/eswitch.h
@@ -66,6 +66,8 @@ struct mlx5_flow_handle *
mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *esw,
int vport, u32 sqn);
+u16 mlx5_eswitch_get_total_vports(const struct mlx5_core_dev *dev);
+
#ifdef CONFIG_MLX5_ESWITCH
enum devlink_eswitch_encap_mode
mlx5_eswitch_get_encap_mode(const struct mlx5_core_dev *dev);
@@ -93,4 +95,5 @@ mlx5_eswitch_get_vport_metadata_for_match(const struct mlx5_eswitch *esw,
return 0;
};
#endif /* CONFIG_MLX5_ESWITCH */
+
#endif
diff --git a/include/linux/mlx5/vport.h b/include/linux/mlx5/vport.h
index 6cbf29229749..16060fb9b5e5 100644
--- a/include/linux/mlx5/vport.h
+++ b/include/linux/mlx5/vport.h
@@ -44,9 +44,6 @@
MLX5_VPORT_UPLINK_PLACEHOLDER + \
MLX5_VPORT_ECPF_PLACEHOLDER(mdev))
-#define MLX5_TOTAL_VPORTS(mdev) (MLX5_SPECIAL_VPORTS(mdev) + \
- mlx5_core_max_vfs(mdev))
-
#define MLX5_VPORT_MANAGER(mdev) \
(MLX5_CAP_GEN(mdev, vport_group_manager) && \
(MLX5_CAP_GEN(mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) && \
--
2.21.0
^ permalink raw reply related
* [PATCH mlx5-next 0/5] Mellanox, mlx5 low level updates 2019-07-02
From: Saeed Mahameed @ 2019-07-03 7:39 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org
Hi All,
This series includes some low level updates to mlx5 driver, required for
shared mlx5-next branch.
Tariq extends the WQE control fields names.
Eran adds the required HW definitions and structures for upcoming TLS
support.
Parav improves and refactors the E-Switch "function changed" handler.
In case of no objections these patches will be applied to mlx5-next and
will be sent later as pull request to both rdma-next and net-next trees.
Thanks,
Saeed.
---
Eran Ben Elisha (1):
net/mlx5: Introduce TLS TX offload hardware bits and structures
Parav Pandit (3):
net/mlx5: Introduce and use mlx5_eswitch_get_total_vports()
net/mlx5: E-Switch prepare functions change handler to be modular
net/mlx5: Refactor mlx5_esw_query_functions for modularity
Tariq Toukan (1):
net/mlx5: Properly name the generic WQE control field
drivers/infiniband/hw/mlx5/ib_rep.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/eswitch.c | 42 +++++--
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 7 +-
.../mellanox/mlx5/core/eswitch_offloads.c | 46 +++++---
.../net/ethernet/mellanox/mlx5/core/fs_core.c | 26 +++--
.../net/ethernet/mellanox/mlx5/core/sriov.c | 15 ++-
.../net/ethernet/mellanox/mlx5/core/vport.c | 15 +++
include/linux/mlx5/device.h | 14 +++
include/linux/mlx5/driver.h | 9 +-
include/linux/mlx5/eswitch.h | 3 +
include/linux/mlx5/mlx5_ifc.h | 104 +++++++++++++++++-
include/linux/mlx5/qp.h | 7 +-
include/linux/mlx5/vport.h | 3 -
13 files changed, 232 insertions(+), 61 deletions(-)
--
2.21.0
^ permalink raw reply
* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
From: Kuehling, Felix @ 2019-07-03 2:27 UTC (permalink / raw)
To: Jason Gunthorpe, Christoph Hellwig, Deucher, Alexander,
David Airlie
Cc: Andrea Arcangeli, Ralph Campbell,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, John Hubbard,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Jerome Glisse,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
In-Reply-To: <20190702225911.GA11833-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
On 2019-07-02 6:59 p.m., Jason Gunthorpe wrote:
> On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
>> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
>>>> I'm sending this out now since we are updating many of the HMM APIs
>>>> and I think it will be useful.
>>> This makes so much sense, I'd like to apply this in hmm.git, is there
>>> any objection?
>> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
>> a better idea to wait a bit and apply it first thing for next merge
>> window?
> My thinking is that AMD GPU already has a monster conflict from this:
>
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift);
>
> So, depending on how that is resolved we might want to do both API
> changes at once.
I just sent out a fix for the hmm_mirror API change.
>
> Or we may have to revert the above change at this late date.
>
> Waiting for AMDGPU team to discuss what process they want to use.
Yeah, I'm wondering what the process is myself. With HMM and driver
development happening on different branches, these kinds of API changes
are painful. There seems to be a built-in assumption in the current
process that code flows mostly in one direction: amd-staging-drm-next ->
drm-next -> linux-next -> linux. That assumption is broken with HMM code
evolving rapidly in both amdgpu and mm.
If we want to continue developing HMM driver changes in
amd-staging-drm-next, we'll need to synchronize with hmm.git more
frequently, both ways. I believe part of the problem is that there is a
fairly long lead time for getting changes from amd-staging-drm-next
into linux-next, as they are held for one release cycle in drm-next.
Pushing HMM-related changes through drm-fixes may offer a kind of
shortcut. The latest fixup from Philip and me is just bypassing drm-next
completely and going straight into linux-next, though.
Regards,
Felix
>
> Jason
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply
* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
From: Christoph Hellwig @ 2019-07-03 0:03 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Jerome Glisse,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
Christoph Hellwig
In-Reply-To: <20190702225911.GA11833-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
On Tue, Jul 02, 2019 at 10:59:16PM +0000, Jason Gunthorpe wrote:
> > As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> > a better idea to wait a bit and apply it first thing for next merge
> > window?
>
> My thinking is that AMD GPU already has a monster conflict from this:
>
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift);
Well, that seems like a relatively easy to fix conflict, at least as
long as you have the mirror easily available. The notifier change
on the other hand basically requires rewriting about two dozen lines
of code entirely.
^ permalink raw reply
* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
From: Jason Gunthorpe @ 2019-07-02 22:59 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Andrea Arcangeli, Ralph Campbell,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Jerome Glisse,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
In-Reply-To: <20190702224912.GA24043-jcswGhMUV9g@public.gmane.org>
On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> > > I'm sending this out now since we are updating many of the HMM APIs
> > > and I think it will be useful.
> >
> > This makes so much sense, I'd like to apply this in hmm.git, is there
> > any objection?
>
> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> a better idea to wait a bit and apply it first thing for next merge
> window?
My thinking is that AMD GPU already has a monster conflict from this:
int hmm_range_register(struct hmm_range *range,
- struct mm_struct *mm,
+ struct hmm_mirror *mirror,
unsigned long start,
unsigned long end,
unsigned page_shift);
So, depending on how that is resolved we might want to do both API
changes at once.
Or we may have to revert the above change at this late date.
Waiting for AMDGPU team to discuss what process they want to use.
Jason
^ permalink raw reply
* Re: [RFC PATCH 00/28] Removing struct page from P2PDMA
From: Logan Gunthorpe @ 2019-07-02 22:52 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Christoph Hellwig, linux-kernel, linux-block, linux-nvme,
linux-pci, linux-rdma, Jens Axboe, Bjorn Helgaas, Dan Williams,
Sagi Grimberg, Keith Busch, Stephen Bates
In-Reply-To: <20190702224530.GD11860@ziepe.ca>
On 2019-07-02 4:45 p.m., Jason Gunthorpe wrote:
> On Fri, Jun 28, 2019 at 01:35:42PM -0600, Logan Gunthorpe wrote:
>
>>> However, I'd feel more comfortable about that assumption if we had
>>> code to support the IOMMU case, and know for sure it doesn't require
>>> more info :(
>>
>> The example I posted *does* support the IOMMU case. That was case (b1)
>> in the description. The idea is that pci_p2pdma_dist() returns a
>> distance with a high bit set (PCI_P2PDMA_THRU_HOST_BRIDGE) when an IOMMU
>> mapping is required and the appropriate flag tells it to call
>> dma_map_resource(). This way, it supports both same-segment and
>> different-segments without needing any look ups in the map step.
>
> I mean we actually have some iommu drivers that can set up P2P in real
> HW. I'm worried that real IOMMUs will need to have the BDF of the
> completer to route completions back to the requester - which we can't
> trivially get through this scheme.
I've never seen such an IOMMU but I guess, in theory, it could exist.
The IOMMUs that set up P2P-like transactions in real hardware make use of
dma_map_resource(). There aren't a lot of users of this function (it was
actually broken with the Intel IOMMU until I fixed it recently, and
I'd expect there are other broken implementations); but, to my
knowledge, none of them have needed the BDF of the provider to date.
> However, maybe that is just a future problem, and certainly we can see
> that with an interval tree or otherwise such an IOMMU could get the
> information it needs.
Yup, the rule of thumb is to design for the needs we have today, not
imagined future problems.
Logan
^ permalink raw reply
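The scheme discussed in this message, a distance value whose high bit says whether the transfer crosses the host bridge, can be sketched as below. This is an illustrative userspace model only: the bit position, the mask, the enum and all function names are assumptions, since the thread names only pci_p2pdma_dist(), PCI_P2PDMA_THRU_HOST_BRIDGE and dma_map_resource(). Using the bus address for the same-segment case is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: bit 31 flags "goes through the host bridge" and the
 * remaining bits hold the hop count. The real bit position is not given
 * in the thread. */
#define P2PDMA_THRU_HOST_BRIDGE	(UINT32_C(1) << 31)
#define P2PDMA_DIST_MASK	(P2PDMA_THRU_HOST_BRIDGE - 1)

enum map_method {
	MAP_BUS_ADDR,	/* same segment: assumed to use the bus address */
	MAP_RESOURCE,	/* via host bridge: would call dma_map_resource() */
};

/* Pack the hop count and the host-bridge flag into one value. */
static uint32_t encode_dist(uint32_t hops, bool thru_host_bridge)
{
	uint32_t dist = hops & P2PDMA_DIST_MASK;

	if (thru_host_bridge)
		dist |= P2PDMA_THRU_HOST_BRIDGE;
	return dist;
}

/* Recover the plain hop count, ignoring the flag. */
static uint32_t dist_hops(uint32_t dist)
{
	return dist & P2PDMA_DIST_MASK;
}

/* The map step needs no lookup: the flag alone selects the method. */
static enum map_method pick_map_method(uint32_t dist)
{
	return (dist & P2PDMA_THRU_HOST_BRIDGE) ? MAP_RESOURCE : MAP_BUS_ADDR;
}
```

Note that nothing in this encoding carries the BDF of the provider, which is exactly the information the concern quoted above says a future IOMMU might need.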
* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
From: Christoph Hellwig @ 2019-07-02 22:49 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Jerome Glisse,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
Christoph Hellwig
In-Reply-To: <20190702195317.GT31718-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> > I'm sending this out now since we are updating many of the HMM APIs
> > and I think it will be useful.
>
> This makes so much sense, I'd like to apply this in hmm.git, is there
> any objection?
As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
a better idea to wait a bit and apply it first thing for next merge
window?
^ permalink raw reply
* Re: [RFC PATCH 00/28] Removing struct page from P2PDMA
From: Jason Gunthorpe @ 2019-07-02 22:45 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: Christoph Hellwig, linux-kernel, linux-block, linux-nvme,
linux-pci, linux-rdma, Jens Axboe, Bjorn Helgaas, Dan Williams,
Sagi Grimberg, Keith Busch, Stephen Bates
In-Reply-To: <cb680437-9615-da42-ebc5-4751e024a45f@deltatee.com>
On Fri, Jun 28, 2019 at 01:35:42PM -0600, Logan Gunthorpe wrote:
> > However, I'd feel more comfortable about that assumption if we had
> > code to support the IOMMU case, and know for sure it doesn't require
> > more info :(
>
> The example I posted *does* support the IOMMU case. That was case (b1)
> in the description. The idea is that pci_p2pdma_dist() returns a
> distance with a high bit set (PCI_P2PDMA_THRU_HOST_BRIDGE) when an IOMMU
> mapping is required and the appropriate flag tells it to call
> dma_map_resource(). This way, it supports both same-segment and
> different-segments without needing any look ups in the map step.
I mean we actually have some iommu drivers that can set up P2P in real
HW. I'm worried that real IOMMUs will need to have the BDF of the
completer to route completions back to the requester - which we can't
trivially get through this scheme.
However, maybe that is just a future problem, and certainly we can see
that with an interval tree or otherwise such an IOMMU could get the
information it needs.
Jason
^ permalink raw reply
* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
From: Ralph Campbell @ 2019-07-02 20:11 UTC (permalink / raw)
To: Jason Gunthorpe, Christoph Hellwig
Cc: Andrea Arcangeli,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Jerome Glisse,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
In-Reply-To: <20190702195317.GT31718-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
On 7/2/19 12:53 PM, Jason Gunthorpe wrote:
> On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
>> HMM defines its own struct hmm_update which is passed to the
>> sync_cpu_device_pagetables() callback function. This is
>> sufficient when the only action is to invalidate. However,
>> a device may want to know the reason for the invalidation and
>> be able to see the new permissions on a range, update device access
>> rights or range statistics. Since sync_cpu_device_pagetables()
>> can be called from try_to_unmap(), the mmap_sem may not be held
>> and find_vma() is not safe to be called.
>> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
>> to allow the full invalidation information to be used.
>>
>> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
>> ---
>>
>> I'm sending this out now since we are updating many of the HMM APIs
>> and I think it will be useful.
>
> This makes so much sense, I'd like to apply this in hmm.git, is there
> any objection?
>
> Jason
>
Not from me. :-)
Thanks!
^ permalink raw reply
* Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables
From: Jason Gunthorpe @ 2019-07-02 19:53 UTC (permalink / raw)
To: Ralph Campbell, Christoph Hellwig
Cc: Andrea Arcangeli,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Jerome Glisse,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
In-Reply-To: <20190608001452.7922-1-rcampbell-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
>
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
>
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.
This makes so much sense, I'd like to apply this in hmm.git, is there
any objection?
Jason
^ permalink raw reply
* Re: [for-next V2 06/10] linux/dim: Move implementation to .c files
From: Geert Uytterhoeven @ 2019-07-02 16:15 UTC (permalink / raw)
To: Saeed Mahameed, Tal Gilboa
Cc: David S. Miller, Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
Or Gerlitz, Sagi Grimberg, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, linux-kernel
In-Reply-To: <20190625205701.17849-7-saeedm@mellanox.com>
Hi Saeed, Tal,
On Tue, 25 Jun 2019, Saeed Mahameed wrote:
> From: Tal Gilboa <talgi@mellanox.com>
>
> Moved all logic from dim.h and net_dim.h to dim.c and net_dim.c.
> This is both more structurally appealing and allows exposing only
> externally used functions.
>
> Signed-off-by: Tal Gilboa <talgi@mellanox.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This is now commit 4f75da3666c0c572 ("linux/dim: Move implementation to
.c files") in net-next.
> --- a/drivers/net/ethernet/broadcom/Kconfig
> +++ b/drivers/net/ethernet/broadcom/Kconfig
> @@ -8,6 +8,7 @@ config NET_VENDOR_BROADCOM
> default y
> depends on (SSB_POSSIBLE && HAS_DMA) || PCI || BCM63XX || \
> SIBYTE_SB1xxx_SOC
> + select DIMLIB
Merely enabling a NET_VENDOR_* symbol should not enable inclusion of
any additional code, cfr. the help text for the NET_VENDOR_BROADCOM
option.
Hence please move the select to the config symbol(s) for the driver(s)
that need it.
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -562,6 +562,14 @@ config SIGNATURE
> Digital signature verification. Currently only RSA is supported.
> Implementation is done using GnuPG MPI library
>
> +config DIMLIB
> + bool "DIM library"
> + default y
Please drop this line, as optional library code should never be included
by default.
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply