netdev.vger.kernel.org archive mirror
* [RFC] RESEND - rdmatool - tool for RDMA users
@ 2017-01-18 15:19 Ariel Almog
       [not found] ` <CABvr3-GZQs51Sn3XagTsepsy3CHvx6P=GVJzefajbNt9jxz9Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Ariel Almog @ 2017-01-18 15:19 UTC (permalink / raw)
  To: leon, linux-rdma, netdev

Sorry for resending.

General
*******

As of today, there is no single, simple tool that allows monitoring
and configuration of the RDMA stack.

For the netdev stack there are a few standard tools, such as ethtool and
iproute2. There is a need for a matching tool that allows control and
query of the RDMA subsystem.

rdmatool will provide a standard, provider-agnostic user interface.
An RDMA user can use this interface to:
* Query RDMA device capabilities
* Query RDMA device status and currently open resources
* Fetch RDMA statistics
* Configure an RDMA device

The rdmatool will have the ability to control the RDMA stack, which
includes the ib_device and RDMA protocol params.

The similarity to ethtool is worth highlighting: ethtool manages
net_device, while rdmatool will manage the RDMA stack.

As a proposal, it is appealing to give rdmatool a design similar to
ethtool's. Like ethtool, it should contain a user space part, a kernel
handler and vendor-specific handler(s).

Another tool which allows similar functionality is iproute2. iproute2
allows user space tools to configure and query the transport, network and
link layers. The advantages of using iproute2 are the reuse of an existing
tool and a familiar interface, in addition to the wide distribution of
this tool.

As a starting point for the discussion, we would like to propose two
implementation options for the rdmatool:
(1) Build a tool using the ABI interface. This will be an RDMA tool
    which will be distributed as part of the RDMA package
(2) Enhance iproute2 to include rdmatool functionality

Our opinion is that the new ABI interface provides vast functionality and
it would be a waste to use another interface for the same functionality.
Each of the proposed implementations above has its advantages, and we
would like to hear your opinion regarding this direction.

I’m posting this RFC in both netdev and linux-rdma communities in
order to get feedback on this topic.

General Description
*******************
The rdmatool will be composed of the following components:
* user space’s rdmatool - will be referred to from now on as rdmatool-u.
* kernel’s rdmatool - will be referred to from now on as rdmatool-k. This
  part is in the kernel, running under ibcore context.
* ib_device’s rdma_ops - will be referred to from now on as rdmatool-p. This
  part is written by the provider. In case the provider doesn't provide
  rdma_ops code, generic code will be used to provide the matching
  interface.

+------------------+
| rdmatool-u       |
+------------------+
         |
 U       |
------- ABI -------
 K       |
         |
+------------------+
| rdmatool-k       |
+------------------+
         |
    rdmatool_ops
         |
+---------------------------+
| generic rdma_ops          |
+---------------------------+
  +---------------------------+
  | ib_device’s rdma_ops      |--+
  +---------------------------+  |--+
     +---------------------------+  |
        +---------------------------+
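
The fallback from provider rdma_ops to generic rdma_ops can be modelled
roughly as follows. This is only a user space sketch of the dispatch idea,
not kernel code; the class names, the get_stats method and the device names
are all hypothetical and are not defined anywhere in this proposal.

```python
class GenericRdmaOps:
    """Hypothetical generic handlers used when a provider supplies none."""

    def get_stats(self, device):
        return {"source": "generic", "device": device}


class ProviderRdmaOps(GenericRdmaOps):
    """Hypothetical provider handlers; overrides only what it implements."""

    def get_stats(self, device):
        return {"source": "provider", "device": device}


def resolve_ops(provider_ops_by_device, device):
    """Return the provider's ops for a device, or generic ops as a fallback."""
    return provider_ops_by_device.get(device) or GenericRdmaOps()
```

With this shape, rdmatool-k never needs to know whether a given device
registered its own handlers; it always calls through the resolved ops.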

The protocol on top of the ABI interface handles the exchange of
information in both directions (rdmatool-u to rdmatool-k and vice versa).

The transaction protocol shall contain the following information:
* Command – the operation requested by rdmatool-u
* ib_device – the device to handle; a dedicated value will indicate a
  stack-layer transaction
* Ancillary information according to the command
Later on, once the direction and design concerns are settled, we will start
defining the actual structure to be used.
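
To make the three fields above concrete, here is one possible encoding.
The actual structure is explicitly not defined yet, so the field widths,
the byte order and the STACK_LAYER sentinel below are assumptions made
purely for illustration.

```python
import struct

# Hypothetical wire layout, for illustration only (the RFC defines none):
# command (u16), reserved (u16), device index (u32), payload length (u32)
HEADER = struct.Struct("<HHII")

# Assumed "dedicated value" meaning "stack layer, no specific ib_device".
STACK_LAYER = 0xFFFFFFFF


def pack_request(command, device, payload=b""):
    """Serialize a command, its target device and the ancillary bytes."""
    return HEADER.pack(command, 0, device, len(payload)) + payload


def unpack_request(buf):
    """Parse a buffer produced by pack_request back into its three fields."""
    command, _reserved, device, length = HEADER.unpack_from(buf)
    payload = buf[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return command, device, payload
```

Carrying an explicit payload length keeps the ancillary part opaque to the
transport, so each command can define its own contents.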

Flexibility
***********
Some of the commands allow passing both standard and
vendor-specific information.
In order to allow easy enhancement of the tool, Matan's ABI [1]
mechanism can be used to pass the vendor-specific information.

As examples:
(1) Statistics ops will return the standard counters. In addition, they can
    report extra counters. The rdmatool-u will present the additional
    statistics as is.
(2) For configuration, where rdmatool-u finds undefined information at the
    end of the command line, it will pass it as is to rdmatool-k, which
    will pass it to the ops handler, where it can be handled or ignored.
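
Example (2) could look roughly like this on the rdmatool-u side. It is a
sketch under assumptions: the option names "mtu" and "state" and the
"key value" command-line shape are hypothetical and not part of the
proposal.

```python
# Hypothetical set of options that rdmatool-u understands itself.
KNOWN_OPTIONS = {"mtu", "state"}


def split_args(tokens):
    """Split 'key value' pairs into standard options and an opaque tail.

    Parsing stops at the first key that is not a known option; everything
    from there on is returned untouched so it can be passed down as is.
    """
    standard = {}
    i = 0
    while i + 1 < len(tokens) and tokens[i] in KNOWN_OPTIONS:
        standard[tokens[i]] = tokens[i + 1]
        i += 2
    return standard, tokens[i:]
```

For instance, split_args(["mtu", "4096", "vendor_knob", "7"]) keeps "mtu"
as a standard option and leaves ["vendor_knob", "7"] for the ops handler.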

Persistence
***********
Another target that we would like to achieve is easy configuration.
rdmatool-u will provide an export mechanism to store and restore
configuration. The configuration file will be human readable and will be
parsed by the various tools (systemd, udev, rdmatool-u, etc.).
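
A store/restore round trip might be sketched as below. The file format is
not specified in this proposal; JSON is used here only as one example of a
human-readable format, and the setting names in the usage note are made up.

```python
import json


def export_config(settings):
    """Render settings as stable, human-readable text."""
    return json.dumps(settings, indent=2, sort_keys=True) + "\n"


def restore_config(text):
    """Parse text produced by export_config back into a settings dict."""
    settings = json.loads(text)
    if not isinstance(settings, dict):
        raise ValueError("configuration must be a mapping of settings")
    return settings
```

Sorting the keys keeps the exported file stable across runs, which makes
it easier to diff, for example export_config({"dev0": {"mtu": 4096}}).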

Functionality
*************
The rdmatool will provide a platform which can grow as needed. The
initial functionality might include:
* help – man page
* version – version number
* statistics – RDMA statistics – such as port and QP statistics
  To allow easy reading of statistics, we propose a filter functionality,
  allowing reading of statistics families, such as link layer, error
  counters, etc.
* protocol – RoCE/iWARP/InfiniBand related configuration, such as RDMA
  congestion configuration and statistics
* query – RDMA objects (qp, cq, srq, ..) information such as owner,
  status, type
* debug – an interface to allow reads and writes from user space to the
  provider RDMA driver, exposing debug information
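
The statistics filter could work along these lines. The grouping of
counters into "link" and "error" families below is an assumption for
illustration; the proposal does not say which counter belongs to which
family.

```python
# Hypothetical grouping of counters into families; example names only.
FAMILIES = {
    "link": {"link_downed", "link_error_recovery"},
    "error": {"symbol_error", "port_rcv_errors"},
}


def filter_stats(counters, family):
    """Return only the counters that belong to the requested family."""
    wanted = FAMILIES[family]
    return {name: value for name, value in counters.items() if name in wanted}
```

Counters that belong to no family (including any vendor-specific extras)
are simply left out of a filtered view and shown only in the full listing.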

We would like to get the community's feedback on which solution should be
used: a standalone tool in the RDMA package (ABI), or adding the
functionality to iproute2 as an extension?

Reference
[1] [RFC ABI V6 00/14] SG-based RDMA ABI Proposal -
        https://www.spinics.net/lists/linux-rdma/msg43960.html

Thanks for feedback.
Ariel and Leon


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
       [not found] ` <CABvr3-GZQs51Sn3XagTsepsy3CHvx6P=GVJzefajbNt9jxz9Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-01-18 16:48   ` Or Gerlitz
  2017-01-18 17:33     ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Or Gerlitz @ 2017-01-18 16:48 UTC (permalink / raw)
  To: Ariel Almog
  Cc: Leon Romanovsky,
	linux-rdma@vger.kernel.org,
	Linux Netdev List

On Wed, Jan 18, 2017 at 5:19 PM, Ariel Almog
<arielalmogworkemails@gmail.com> wrote:
> General
> *******
> As of today, there is no single, simple, tool that allows monitoring
> and configuration of RDMA stack.

Before the tool, which kernel UAPI did you plan to use?


> rdmatool will provide standard, provider agnostic, user interface.
> RDMA user can use this interface to
> * Query RDMA device capabilities
> * Query RDMA device status and current open resources
> * Fetching RDMA statistics
> * Configure RDMA device
>
> The rdmatool will have the ability to control RDMA stack which
> includes the ib_device and RDMA protocol params.
>
> It is a good point to highlight the similarity to ethtool. It manages
> net_device, while rdmatool will manage RDMA stack.
>
> As a proposal, it is appealing to have a similar design to ethtool for
> rdmatool too. As ethtool, it should contain user space part, kernel
> handler and vendor specific handler(s).
>
> Another tool which allows similar functionality is iproute2. Iproute2
> allows user space tools to configure and query transport, network and
> link layer. The advantages of using iproute2 is the reuse of existing
> tool and familiar interface in addition to the wide spreading of this tool.
>
> As start point of the discussion, we would like to propose two
> implementation options for the rdmatool:
> (1) Build a tool using ABI interface. This will be a RDMA tool
>     which will be distributed as part of RDMA package
> (2) Enhancing iproute2 to include rdmatool functionality
>
> Our opinion is that the new ABI interface provides vast functionality and
> it will be a waste to use other interface for the same functionality.
> Each of the proposed implementation above have their advantages, and we
> would like to hear your opinion regarding this direction.

> I’m posting this RFC in both netdev and linux-rdma communities in
> order to get feedback on this topic.

What's wrong with the RDMA netlink infrastructure?! It has been there for
years and is used for various cases by MLNX; did you look at the code?

cea05ea IB/core: Add flow control to the portmapper netlink calls
ae43f82 IB/core: Add IP to GID netlink offload
2ca546b IB/sa: Route SA pathrecord query through netlink
753f618 RDMA/cma: Add support for netlink statistics export
b2cbae2 RDMA: Add netlink infrastructure
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-18 16:48   ` Or Gerlitz
@ 2017-01-18 17:33     ` Leon Romanovsky
  2017-01-18 17:50       ` Or Gerlitz
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2017-01-18 17:33 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Ariel Almog, linux-rdma@vger.kernel.org, Linux Netdev List


On Wed, Jan 18, 2017 at 06:48:21PM +0200, Or Gerlitz wrote:
> On Wed, Jan 18, 2017 at 5:19 PM, Ariel Almog
> <arielalmogworkemails@gmail.com> wrote:
> > General
> > *******
> > As of today, there is no single, simple, tool that allows monitoring
> > and configuration of RDMA stack.
>
> Before tool, what kernel UAPI you thought to use?

I'm aware of the following options:
1) netlink
2) RDMA ABI https://www.spinics.net/lists/linux-rdma/msg43960.html
3) ioctl
4) write/read

Items 1 and 2 are the preferred options, and one of the main goals
of this RFC is to choose between them.

For example, the RDMA ABI has native support for querying and discovering
device capabilities via its merge tree feature.
https://github.com/matanb10/linux/commit/61aaa4cae1281f6bb32e261f0b8aad489db764ac

>
>
> > rdmatool will provide standard, provider agnostic, user interface.
> > RDMA user can use this interface to
> > * Query RDMA device capabilities
> > * Query RDMA device status and current open resources
> > * Fetching RDMA statistics
> > * Configure RDMA device
> >
> > The rdmatool will have the ability to control RDMA stack which
> > includes the ib_device and RDMA protocol params.
> >
> > It is a good point to highlight the similarity to ethtool. It manages
> > net_device, while rdmatool will manage RDMA stack.
> >
> > As a proposal, it is appealing to have a similar design to ethtool for
> > rdmatool too. As ethtool, it should contain user space part, kernel
> > handler and vendor specific handler(s).
> >
> > Another tool which allows similar functionality is iproute2. Iproute2
> > allows user space tools to configure and query transport, network and
> > link layer. The advantages of using iproute2 is the reuse of existing
> > tool and familiar interface in addition to the wide spreading of this tool.
> >
> > As start point of the discussion, we would like to propose two
> > implementation options for the rdmatool:
> > (1) Build a tool using ABI interface. This will be a RDMA tool
> >     which will be distributed as part of RDMA package
> > (2) Enhancing iproute2 to include rdmatool functionality
> >
> > Our opinion is that the new ABI interface provides vast functionality and
> > it will be a waste to use other interface for the same functionality.
> > Each of the proposed implementation above have their advantages, and we
> > would like to hear your opinion regarding this direction.
>
> > I’m posting this RFC in both netdev and linux-rdma communities in
> > order to get feedback on this topic.
>
> What's wrong with the RDMA netlink infrastructure?! it's there for
> years and used
> for various cases by MLNX, did you look on the code?

Nothing is wrong with it; it is one of the valuable options and will be
used if the community decides to put this tool under the iproute2/ethtool
umbrella.

>
> cea05ea IB/core: Add flow control to the portmapper netlink calls
> ae43f82 IB/core: Add IP to GID netlink offload
> 2ca546b IB/sa: Route SA pathrecord query through netlink
> 753f618 RDMA/cma: Add support for netlink statistics export
> b2cbae2 RDMA: Add netlink infrastructure


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-18 17:33     ` Leon Romanovsky
@ 2017-01-18 17:50       ` Or Gerlitz
  2017-01-18 18:28         ` Leon Romanovsky
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Or Gerlitz @ 2017-01-18 17:50 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Ariel Almog, linux-rdma@vger.kernel.org, Linux Netdev List

On Wed, Jan 18, 2017 at 7:33 PM, Leon Romanovsky wrote:
> On Wed, Jan 18, 2017 at 06:48:21PM +0200, Or Gerlitz wrote:
>> On Wed, Jan 18, 2017 at 5:19 PM, Ariel Almog
>> <arielalmogworkemails@gmail.com> wrote:

>>> As of today, there is no single, simple, tool that allows monitoring
>>> and configuration of RDMA stack.

>> Before tool, what kernel UAPI you thought to use?

> I'm aware of the following options:
> 1) netlink
> 2) RDMA ABI https://www.spinics.net/lists/linux-rdma/msg43960.html
> 3) ioctl
> 4) write/read
>
> The items 1 and 2 are preferred options and one of the main goals
> for this RFC is to choose between them.
>
> For example, RDMA ABI has native support of querying and discovering
> device capabilities via merge tree feature.

To be clear: when you wrote ABI in your initial email, I think it was
unclear to the netdev crowd that you were talking about the new UAPI that
is now in the works for the IB subsystem, so with my netdev community
member hat on, I got confused... anyway


>> > It is a good point to highlight the similarity to ethtool. It manages
>> > net_device, while rdmatool will manage RDMA stack.

ethtool is likely to be ported to use netlink somewhere in the 21st century BTW


>> What's wrong with the RDMA netlink infrastructure?! it's there for
>> years and used
>> for various cases by MLNX, did you look on the code?

> Nothing wrong, it is one of the valuable options and will be used if
> community decides to put this tool under iproute2/ethtool umbrella.

Yeah, using netlink sounds good to me. Where you package/maintain the
tool is of second order; first decide which UAPI you want to use. So far
netlink has been good enough for what a bunch of use cases needed.

Or.

>> cea05ea IB/core: Add flow control to the portmapper netlink calls
>> ae43f82 IB/core: Add IP to GID netlink offload
>> 2ca546b IB/sa: Route SA pathrecord query through netlink
>> 753f618 RDMA/cma: Add support for netlink statistics export
>> b2cbae2 RDMA: Add netlink infrastructure


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-18 17:50       ` Or Gerlitz
@ 2017-01-18 18:28         ` Leon Romanovsky
  2017-01-18 18:31         ` Jason Gunthorpe
  2017-01-19  6:04         ` Leon Romanovsky
  2 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2017-01-18 18:28 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Ariel Almog, linux-rdma@vger.kernel.org, Linux Netdev List


On Wed, Jan 18, 2017 at 07:50:26PM +0200, Or Gerlitz wrote:
> On Wed, Jan 18, 2017 at 7:33 PM, Leon Romanovsky wrote:
> > On Wed, Jan 18, 2017 at 06:48:21PM +0200, Or Gerlitz wrote:
> >> On Wed, Jan 18, 2017 at 5:19 PM, Ariel Almog
> >> <arielalmogworkemails@gmail.com> wrote:
>
> >>> As of today, there is no single, simple, tool that allows monitoring
> >>> and configuration of RDMA stack.
>
> >> Before tool, what kernel UAPI you thought to use?
>
> > I'm aware of the following options:
> > 1) netlink
> > 2) RDMA ABI https://www.spinics.net/lists/linux-rdma/msg43960.html
> > 3) ioctl
> > 4) write/read
> >
> > The items 1 and 2 are preferred options and one of the main goals
> > for this RFC is to choose between them.
> >
> > For example, RDMA ABI has native support of querying and discovering
> > device capabilities via merge tree feature.
>
> To make it clear, when you wrote ABI in your initial email, I tend to
> think it was sort of unclear to the netdev crowd that you are talking
> on new UAPI which is now under the works for the IB subsystem, so with
> my netdev community member hat, I got confused... anyway

You are right, I missed it in my review for Ariel, who wrote and sent
this RFC.

Thanks


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-18 17:50       ` Or Gerlitz
  2017-01-18 18:28         ` Leon Romanovsky
@ 2017-01-18 18:31         ` Jason Gunthorpe
  2017-01-18 21:45           ` Bart Van Assche
  2017-01-19  6:04         ` Leon Romanovsky
  2 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2017-01-18 18:31 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Leon Romanovsky, Ariel Almog, linux-rdma@vger.kernel.org,
	Linux Netdev List

On Wed, Jan 18, 2017 at 07:50:26PM +0200, Or Gerlitz wrote:
> On Wed, Jan 18, 2017 at 7:33 PM, Leon Romanovsky wrote:
> > On Wed, Jan 18, 2017 at 06:48:21PM +0200, Or Gerlitz wrote:
> >> On Wed, Jan 18, 2017 at 5:19 PM, Ariel Almog
> >> <arielalmogworkemails@gmail.com> wrote:
> 
> >>> As of today, there is no single, simple, tool that allows monitoring
> >>> and configuration of RDMA stack.
> 
> >> Before tool, what kernel UAPI you thought to use?
> 
> > I'm aware of the following options:
> > 1) netlink
> > 2) RDMA ABI https://www.spinics.net/lists/linux-rdma/msg43960.html
> > 3) ioctl
> > 4) write/read
> >
> > The items 1 and 2 are preferred options and one of the main goals
> > for this RFC is to choose between them.
> >
> > For example, RDMA ABI has native support of querying and discovering
> > device capabilities via merge tree feature.
> 
> To make it clear, when you wrote ABI in your initial email, I tend to
> think it was sort of unclear to the netdev crowd that you are talking
> on new UAPI which is now under the works for the IB subsystem, so with
> my netdev community member hat, I got confused... anyway

I think it depends on what this tool is supposed to cover, but based
on the description, I would start with netlink-only.

The only place verbs covers a similar ground is in 'device
capabilities' - for some of that you might want to open a new-uAPI
verbs fd, but even the capability data from that would not be
totally offensive to be accessed over netlink.

IMHO netlink should cover almost everything found in sysfs today.

I'm also deeply skeptical about driver-specific stuff at this layer,
that sounds like a way to make a big mess.

Jason


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-18 18:31         ` Jason Gunthorpe
@ 2017-01-18 21:45           ` Bart Van Assche
       [not found]             ` <5f90fd26-e7bf-bb2a-01f2-6b166f2265e9-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2017-01-18 21:45 UTC (permalink / raw)
  To: Jason Gunthorpe, Or Gerlitz
  Cc: Leon Romanovsky, Ariel Almog, linux-rdma@vger.kernel.org,
	Linux Netdev List

On 01/18/2017 10:31 AM, Jason Gunthorpe wrote:
> I think it depends on what this tool is supposed to cover, but based
> on the description, I would start with netlink-only.
> 
> The only place verbs covers a similar ground is in 'device
> capabilities' - for some of that you might want to open a new-uAPI
> verbs fd, but even the capability data from that would not be
> totally offensive to be accessed over netlink.
> 
> IMHO netlink should cover almost everything found in sysfs today.

We would need a very strong argument to introduce a netlink API that
duplicates existing sysfs API functionality. Since the sysfs API is
extensible, why not extend that API further? E.g. the SCST sysfs API
shows that more is possible with sysfs than what most kernel drivers
realize.

Bart.


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-18 17:50       ` Or Gerlitz
  2017-01-18 18:28         ` Leon Romanovsky
  2017-01-18 18:31         ` Jason Gunthorpe
@ 2017-01-19  6:04         ` Leon Romanovsky
  2 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2017-01-19  6:04 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Ariel Almog, linux-rdma@vger.kernel.org, Linux Netdev List


On Wed, Jan 18, 2017 at 07:50:26PM +0200, Or Gerlitz wrote:
> On Wed, Jan 18, 2017 at 7:33 PM, Leon Romanovsky wrote:
> > On Wed, Jan 18, 2017 at 06:48:21PM +0200, Or Gerlitz wrote:
> >> On Wed, Jan 18, 2017 at 5:19 PM, Ariel Almog
> >> <arielalmogworkemails@gmail.com> wrote:
>
> >> What's wrong with the RDMA netlink infrastructure?! it's there for
> >> years and used
> >> for various cases by MLNX, did you look on the code?
>
> > Nothing wrong, it is one of the valuable options and will be used if
> > community decides to put this tool under iproute2/ethtool umbrella.
>
> yeah, using netlink sounds good to me, and where you package/maintain
> the tool is of 2nd order, 1st decide what UAPI you wonna use. So far
> netlink was good for what bunch of use-cases needed.

Or,
This tool is for users, and they don't care how it is implemented
underneath. We want to give our users an experience similar to the one
they already have with other tools, and packaging/maintainability matters
more to them than the kernel UAPI.

Ariel came up with a small set of initial functionality, but that doesn't
mean we should limit ourselves to it.

>
> Or.
>
> >> cea05ea IB/core: Add flow control to the portmapper netlink calls
> >> ae43f82 IB/core: Add IP to GID netlink offload
> >> 2ca546b IB/sa: Route SA pathrecord query through netlink
> >> 753f618 RDMA/cma: Add support for netlink statistics export
> >> b2cbae2 RDMA: Add netlink infrastructure

* Re: [RFC] RESEND - rdmatool - tool for RDMA users
       [not found]             ` <5f90fd26-e7bf-bb2a-01f2-6b166f2265e9-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2017-01-19  6:33               ` Leon Romanovsky
       [not found]                 ` <20170119063326.GJ32481-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2017-01-19  6:33 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jason Gunthorpe, Or Gerlitz, Ariel Almog,
	linux-rdma@vger.kernel.org,
	Linux Netdev List


On Wed, Jan 18, 2017 at 01:45:14PM -0800, Bart Van Assche wrote:
> On 01/18/2017 10:31 AM, Jason Gunthorpe wrote:
> > I think it depends on what this tool is supposed to cover, but based
> > on the description, I would start with netlink-only.
> >
> > The only place verbs covers a similar ground is in 'device
> > capabilities' - for some of that you might want to open a new-uAPI
> > verbs fd, but even the capability data from that would not be
> > totally offensive to be accessed over netlink.
> >
> > IMHO netlink should cover almost everything found in sysfs today.
>
> We would need a very strong argument to introduce a netlink API that
> duplicates existing sysfs API functionality. Since the sysfs API is
> extensible, why not extend that API further? E.g. the SCST sysfs API
> shows that more is possible with sysfs than what most kernel drivers
> realize.

We didn't look deeply at sysfs, mainly because it is unpopular
in the netdev community. Maybe we were misled and that is simply not true.

>
> Bart.

* Re: [RFC] RESEND - rdmatool - tool for RDMA users
       [not found]                 ` <20170119063326.GJ32481-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-01-19 18:03                   ` Jason Gunthorpe
  2017-01-19 19:12                     ` Leon Romanovsky
                                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2017-01-19 18:03 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Bart Van Assche, Or Gerlitz, Ariel Almog,
	linux-rdma@vger.kernel.org,
	Linux Netdev List

On Thu, Jan 19, 2017 at 08:33:26AM +0200, Leon Romanovsky wrote:
> On Wed, Jan 18, 2017 at 01:45:14PM -0800, Bart Van Assche wrote:
> > On 01/18/2017 10:31 AM, Jason Gunthorpe wrote:
> > > I think it depends on what this tool is supposed to cover, but based
> > > on the description, I would start with netlink-only.
> > >
> > > The only place verbs covers a similar ground is in 'device
> > > capabilities' - for some of that you might want to open a new-uAPI
> > > verbs fd, but even the capability data from that would not be
> > > totally offensive to be accessed over netlink.
> > >
> > > IMHO netlink should cover almost everything found in sysfs today.
> >
> > We would need a very strong argument to introduce a netlink API that
> > duplicates existing sysfs API functionality. Since the sysfs API is
> > extensible, why not extend that API further? E.g. the SCST sysfs API
> > shows that more is possible with sysfs than what most kernel drivers
> > realize.
> 
> We didn't look deeply on sysfs mainly because it is unpopular
> in netdev community. Maybe we were misled and it is simply not true.

sysfs is unpopular because the 'one value per file' dogma is largely
unsuitable for complex multi-value atomic changes which are common in
netdev. You can force it to work, but it is pretty horrible..

It is also very expensive if you want to shuttle a lot of data, eg I
could not see doing something like 'netstat' for IB through sysfs
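
To illustrate the cost mentioned above, here is a rough sketch of reading
a set of counters the sysfs way. It assumes a directory laid out like
/sys/class/infiniband/<device>/ports/<port>/counters, with one integer per
file; the helper name is hypothetical.

```python
import os


def read_counters(counters_dir):
    """Read every counter in a directory, one open/read/close per value."""
    values = {}
    for name in sorted(os.listdir(counters_dir)):
        path = os.path.join(counters_dir, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Skip entries that cannot be read or are not plain integers.
            continue
    return values
```

Each value costs its own set of system calls, and the values are not read
atomically as a group, which is the limitation described above.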

Maybe you should start by showing some examples of the command output you
wish to have in a rdmatool ..

Jason

* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-19 18:03                   ` Jason Gunthorpe
@ 2017-01-19 19:12                     ` Leon Romanovsky
  2017-01-19 22:06                     ` Bart Van Assche
       [not found]                     ` <20170119180308.GD8109-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2017-01-19 19:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Bart Van Assche, Or Gerlitz, Ariel Almog,
	linux-rdma@vger.kernel.org, Linux Netdev List


On Thu, Jan 19, 2017 at 11:03:08AM -0700, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2017 at 08:33:26AM +0200, Leon Romanovsky wrote:
> > On Wed, Jan 18, 2017 at 01:45:14PM -0800, Bart Van Assche wrote:
> > > On 01/18/2017 10:31 AM, Jason Gunthorpe wrote:
> > > > I think it depends on what this tool is supposed to cover, but based
> > > > on the description, I would start with netlink-only.
> > > >
> > > > The only place verbs covers a similar ground is in 'device
> > > > capabilities' - for some of that you might want to open a new-uAPI
> > > > verbs fd, but even the capability data from that would not be
> > > > totally offensive to be accessed over netlink.
> > > >
> > > > IMHO netlink should cover almost everything found in sysfs today.
> > >
> > > We would need a very strong argument to introduce a netlink API that
> > > duplicates existing sysfs API functionality. Since the sysfs API is
> > > extensible, why not extend that API further? E.g. the SCST sysfs API
> > > shows that more is possible with sysfs than what most kernel drivers
> > > realize.
> >
> > We didn't look deeply on sysfs mainly because it is unpopular
> > in netdev community. Maybe we were misled and it is simply not true.
>
> sysfs is unpopular because the 'one value per file' dogma is largely
> unsuitable for complex multi-value atomic changes which are common in
> netdev. You can force it to work, but it is pretty horrible..
>
> It is also very expensive if you want to shuttle a lot of data, eg I
> could not see doing something like 'netstat' for IB through sysfs
>
> Maybe you should start by showing some examples of command out you
> wish to have in a rdmatool ..

Ariel wrote a very simplified version in his proposal:

Functionality
*************
The rdmatool will provide a platform which can grow as needed. The
initial functionality might include:
* help – man page
* version – version number
* statistics – RDMA statistics – such as port and QP statistics
  To allow easy reading of statistics, we propose a filter functionality,
  allowing reading of statistics families, such as link layer, error
  counters, etc.
* protocol – RoCE/iWARP/InfiniBand related configuration, such as RDMA
  congestion configuration and statistics
* query – RDMA objects (qp, cq, srq, ..) information such as owner,
  status, type
* debug – an interface to allow reads and writes from user space to the
  provider RDMA driver, exposing debug information.

The commands would be similar to those of the "ip" program.

>
> Jason


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-19 18:03                   ` Jason Gunthorpe
  2017-01-19 19:12                     ` Leon Romanovsky
@ 2017-01-19 22:06                     ` Bart Van Assche
  2017-01-19 22:16                       ` Jason Gunthorpe
       [not found]                     ` <20170119180308.GD8109-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2 siblings, 1 reply; 14+ messages in thread
From: Bart Van Assche @ 2017-01-19 22:06 UTC (permalink / raw)
  To: jgunthorpe@obsidianresearch.com, leon@kernel.org
  Cc: netdev@vger.kernel.org, arielalmogworkemails@gmail.com,
	linux-rdma@vger.kernel.org, gerlitz.or@gmail.com

On Thu, 2017-01-19 at 11:03 -0700, Jason Gunthorpe wrote:
> sysfs is unpopular because the 'one value per file' dogma is largely
> unsuitable for complex multi-value atomic changes which are common in
> netdev. You can force it to work, but it is pretty horrible..
> 
> It is also very expensive if you want to shuttle a lot of data, eg I
> could not see doing something like 'netstat' for IB through sysfs

Since the RDMA sysfs ABI defines a user space ABI, and since user space
ABIs must be backwards compatible, removing the existing sysfs ABI is
not an option. We will need to evaluate on a case-by-case basis whether
new functionality should use sysfs or whether another mechanism should
be used.

Bart.


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
  2017-01-19 22:06                     ` Bart Van Assche
@ 2017-01-19 22:16                       ` Jason Gunthorpe
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2017-01-19 22:16 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: leon@kernel.org, netdev@vger.kernel.org,
	arielalmogworkemails@gmail.com, linux-rdma@vger.kernel.org,
	gerlitz.or@gmail.com

On Thu, Jan 19, 2017 at 10:06:12PM +0000, Bart Van Assche wrote:
> On Thu, 2017-01-19 at 11:03 -0700, Jason Gunthorpe wrote:
> > sysfs is unpopular because the 'one value per file' dogma is largely
> > unsuitable for complex multi-value atomic changes which are common in
> > netdev. You can force it to work, but it is pretty horrible..
> > 
> > It is also very expensive if you want to shuttle a lot of data, eg I
> > could not see doing something like 'netstat' for IB through sysfs
> 
> Since the RDMA sysfs ABI defines a user space ABI and since user space
> ABIs must be backwards compatible, removing the existing sysfs ABI is
> not an option. We will need to evaluate on a case-by-case basis whether
> new functionality should use sysfs or whether another mechanism should
> be used.

Not talking about getting rid of it.

But if it makes sense to use netlink for the new stuff we should make
netlink self-consistent so a netlink user does not have to fall back
to sysfs for certain things.

Jason


* Re: [RFC] RESEND - rdmatool - tool for RDMA users
       [not found]                     ` <20170119180308.GD8109-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-01-31 11:38                       ` Ariel Almog
  0 siblings, 0 replies; 14+ messages in thread
From: Ariel Almog @ 2017-01-31 11:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Bart Van Assche, Or Gerlitz,
	linux-rdma@vger.kernel.org,
	Linux Netdev List

> On Thu, Jan 19, 2017 at 08:33:26AM +0200, Leon Romanovsky wrote:
> > On Wed, Jan 18, 2017 at 01:45:14PM -0800, Bart Van Assche wrote:
> > > On 01/18/2017 10:31 AM, Jason Gunthorpe wrote:
> > > > I think it depends on what this tool is supposed to cover, but based
> > > > on the description, I would start with netlink-only.
> > > >
> > > > The only place verbs covers a similar ground is in 'device
> > > > capabilities' - for some of that you might want to open a new-uAPI
> > > > verbs fd, but even the capability data from that would not be
> > > > totally offensive to be accessed over netlink.
> > > >
> > > > IMHO netlink should cover almost everything found in sysfs today.
> > >
> > > We would need a very strong argument to introduce a netlink API that
> > > duplicates existing sysfs API functionality. Since the sysfs API is
> > > extensible, why not extend that API further? E.g. the SCST sysfs API
> > > shows that more is possible with sysfs than what most kernel drivers
> > > realize.
> >
> > We didn't look deeply on sysfs mainly because it is unpopular
> > in netdev community. Maybe we were misled and it is simply not true.
>
> sysfs is unpopular because the 'one value per file' dogma is largely
> unsuitable for complex multi-value atomic changes which are common in
> netdev. You can force it to work, but it is pretty horrible..
>
> It is also very expensive if you want to shuttle a lot of data, eg I
> could not see doing something like 'netstat' for IB through sysfs
>
> Maybe you should start by showing some examples of command output you
> wish to have in a rdmatool ..
>

sysfs follows the 'one value per file' dogma, as you wrote, and hence the
driver would be complicated to maintain.
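
To illustrate the cost, here is a rough sketch (in Python, for brevity) of
what a user space tool has to do to collect a group of values through a
sysfs-style directory. The function name is only illustrative, and it makes
no assumption about which attributes a given driver exposes.

```python
from pathlib import Path


def read_sysfs_attrs(directory):
    """Read every one-value-per-file attribute in a sysfs-style directory."""
    values = {}
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue
        try:
            # One open/read/close per attribute. Values are sampled at
            # slightly different times, so this is not an atomic snapshot.
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some attributes may be write-only or fail on read; skip them.
            continue
    return values
```

On a host with RDMA hardware this could be pointed at a directory such as
/sys/class/infiniband/<device>/ports/<port>/counters, which costs one
system call sequence per counter.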

netlink is a simple socket interface. It provides duplex communication,
allows easy expansion and is used by many tools (iproute2, 802.11 wireless
drivers, NFC, OVS and more).
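
As a rough sketch of how little framing netlink needs, the snippet below
packs and unpacks the standard 16-byte netlink message header (length, type,
flags, sequence number, port id). The message type and payload used with it
are placeholders chosen for illustration, not actual RDMA netlink commands.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
# (host byte order, no implicit padding).
NLMSG_HDR = struct.Struct("=IHHII")
NLM_F_REQUEST = 0x01  # standard netlink flag: message is a request


def build_nlmsg(msg_type, payload=b"", seq=1, pid=0, flags=NLM_F_REQUEST):
    """Frame a payload with a netlink header, padded to a 4-byte boundary."""
    length = NLMSG_HDR.size + len(payload)  # length excludes the padding
    padding = b"\0" * (-length % 4)
    return NLMSG_HDR.pack(length, msg_type, flags, seq, pid) + payload + padding


def parse_nlmsg(data):
    """Split one framed message back into its header fields and payload."""
    length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(data)
    return {
        "len": length,
        "type": msg_type,
        "flags": flags,
        "seq": seq,
        "pid": pid,
        "payload": data[NLMSG_HDR.size:length],
    }
```

A request and its reply share this framing, which is what makes it easy to
carry several attributes in a single message and to add new ones later.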

iproute2 [1] provides a collection of utilities for controlling TCP/IP
networking and traffic control. It includes many tools, such as ip,
devlink and more. Using the iproute2 framework will allow rdmatool to be
distributed easily.

Regarding the command samples, I was thinking of using the following
methodology (similar to devlink [2])

SYNOPSIS
rdmatool [ OPTIONS ] OBJECT { COMMAND | help }
OBJECT := { dev | protocol }
OPTIONS := { -V[ersion] }

OPTIONS
-V, -Version
Print the version of the rdmatool utility and exit.

OBJECT
dev - rdma device.
protocol - protocol level.

COMMAND (examples, will be enhanced)

rdmatool dev help
provide help on the command

rdmatool dev show [DEV]
display rdma device attributes
[DEV] - specifies the rdma device. if this argument is omitted all
devices are listed.

rdmatool dev interfaces [DEV]
display rdma device interfaces

rdmatool dev connections [DEV]
display rdma device connections

rdmatool dev resources [PROCESS]
display the device resources, e.g. qp, cq, etc.
[PROCESS] - restricts the output to the resources of a specific process

rdmatool dev stat [DEV]
display rdma device statistics

rdmatool protocol show [PROTOCOL] [ENTITY]
display protocol attributes
[PROTOCOL] - specifies the protocol, e.g. roce, iwarp, etc.
[ENTITY] - specifies the entity, e.g. qos, etc.

rdmatool dev debug [DEV] [DEBUG_INFO]
set/query debug on DEV
[DEBUG_INFO] - debug configuration, optional, might be vendor specific
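
To check that the grammar above is unambiguous, here is a small sketch of
it as a command-line parser, written in Python with argparse purely for
illustration. The real tool would presumably be written in C along the lines
of devlink. The version string and the destination names are assumptions,
and the "help" command is left to argparse's built-in -h.

```python
import argparse


def build_parser():
    """Sketch of: rdmatool [ OPTIONS ] OBJECT { COMMAND | help }"""
    parser = argparse.ArgumentParser(prog="rdmatool")
    parser.add_argument("-V", "-Version", action="version",
                        version="rdmatool (sketch)")
    objects = parser.add_subparsers(dest="object", required=True)

    # OBJECT dev: commands taking an optional [DEV]
    dev = objects.add_parser("dev", help="rdma device")
    dev_cmds = dev.add_subparsers(dest="command", required=True)
    for name in ("show", "interfaces", "connections", "stat"):
        cmd = dev_cmds.add_parser(name)
        cmd.add_argument("dev", nargs="?", metavar="DEV")

    resources = dev_cmds.add_parser("resources")
    resources.add_argument("process", nargs="?", metavar="PROCESS")

    debug = dev_cmds.add_parser("debug")
    debug.add_argument("dev", nargs="?", metavar="DEV")
    debug.add_argument("debug_info", nargs="?", metavar="DEBUG_INFO")

    # OBJECT protocol: show [PROTOCOL] [ENTITY]
    proto = objects.add_parser("protocol", help="protocol level")
    proto_cmds = proto.add_subparsers(dest="command", required=True)
    show = proto_cmds.add_parser("show")
    show.add_argument("protocol", nargs="?", metavar="PROTOCOL")
    show.add_argument("entity", nargs="?", metavar="ENTITY")
    return parser
```

With this layout, omitting [DEV] leaves the argument unset, which the tool
would treat as "list all devices".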

References:
[1] iproute2
https://wiki.linuxfoundation.org/networking/iproute2
[2] devlink
http://man7.org/linux/man-pages/man8/devlink.8.html

