SRIOV switchdev mode BoF minutes

* SRIOV switchdev mode BoF minutes
@ 2017-11-12 19:49 Or Gerlitz
  2017-11-12 20:38 ` Alexander Duyck
  2018-04-12 17:05 ` Samudrala, Sridhar
  0 siblings, 2 replies; 36+ messages in thread
From: Or Gerlitz @ 2017-11-12 19:49 UTC
  To: David Miller
  Cc: Anjali Singhai Jain, Andy Gospodarek, Michael Chan, Simon Horman,
	Jakub Kicinski, John Fastabend, Saeed Mahameed, Jiri Pirko,
	Rony Efraim, Linux Netdev List

Hi Dave and all,

During and after the BoF on SRIOV switchdev mode, we reached a
consensus among the developers from four different HW vendors (CC
audience) that the correct thing to do would be to disallow any new
extensions to the legacy mode.

The idea is to focus on the new mode and not add new UAPIs and
kernel code for what turned out to be a wrong design, one that does not
allow for properly offloading a kernel switching SW model to e-switch HW.

We also had a good session the day after regarding alignment for the
representation model of the uplink (physical port) and PF/s.

The VF representor netdevs exist for all drivers that support the new
mode, but the representation for the uplink and PF wasn't the same for
all. The decision was to represent the uplink and PF vports in the
same manner as is done for VFs, using rep netdevs. This alignment would
provide a stricter and clearer view of the kernel model for e-switch
to users and upper layer control plane SW.

Or.

* Re: SRIOV switchdev mode BoF minutes
  2017-11-12 19:49 SRIOV switchdev mode BoF minutes Or Gerlitz
@ 2017-11-12 20:38 ` Alexander Duyck
  2017-11-13  6:16   ` Or Gerlitz
  2018-04-12 17:05 ` Samudrala, Sridhar
  1 sibling, 1 reply; 36+ messages in thread
From: Alexander Duyck @ 2017-11-12 20:38 UTC
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Sun, Nov 12, 2017 at 11:49 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> Hi Dave and all,
>
> During and after the BoF on SRIOV switchdev mode, we came into a
> consensus among the developers from four different HW vendors (CC
> audience) that a correct thing to do would be to disallow any new
> extensions to the legacy mode.
>
> The idea is to put focus on the new mode and not add new UAPIs and
> kernel code which was turned to be a wrong design which does not allow
> for properly offloading a kernel switching SW model to e-switch HW.

I would have to disagree with this. For devices such as the 82599 that
don't have a true switch, this may limit future functionality since
we can't move them over to switchdev mode. For example, one thing I may
need to add at some point in the future is the ability to disable
multicast and broadcast receive on a per-VF basis.

You may not recall, but we tried to transition the i40e driver over to
SwitchDev. The parts supported by i40e have a much more robust L2
forwarding framework than the 82599, and the result was that we were told
that while we might look at doing port representors some other way,
there was no way we could use switchdev since the hardware couldn't
support the requirements of switchdev in terms of default routes and
forwarding behavior. I am planning to resolve the port representor
issue by coming up with something like a "source mode"
macvlan based port representor. I figure that is probably the closest
match for what the Intel hardware does, since the VFs are really
nothing more than physical macvlan interfaces in and of themselves, as
the hardware doesn't have a full switch.

> We also had a good session the day after regarding alignment for the
> representation model of the uplink (physical port) and PF/s.
>
> The VF representor netdevs  exist for all drivers that support the new
> mode but the representation for the uplink and PF wasn't the same for
> all. The decision was to represent the uplink and PFs vports in the
> same manner done for VFs, using rep netdevs. This alignment would
> provide a more strict and clear view of the kernel model for e-switch
> to users and upper layer control plane SW.
>
> Or.

This part sounds fine.

- Alex

* Re: SRIOV switchdev mode BoF minutes
  2017-11-12 20:38 ` Alexander Duyck
@ 2017-11-13  6:16   ` Or Gerlitz
  2017-11-13 17:10     ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2017-11-13  6:16 UTC
  To: Alexander Duyck
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Sun, Nov 12, 2017 at 10:38 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Sun, Nov 12, 2017 at 11:49 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> Hi Dave and all,
>>
>> During and after the BoF on SRIOV switchdev mode, we came into a
>> consensus among the developers from four different HW vendors (CC
>> audience) that a correct thing to do would be to disallow any new
>> extensions to the legacy mode.
>>
>> The idea is to put focus on the new mode and not add new UAPIs and
>> kernel code which was turned to be a wrong design which does not allow
>> for properly offloading a kernel switching SW model to e-switch HW.

> You may not recall but we tried to transition the i40e driver over to
> SwitchDev, the parts supported by i40e have a much more robust l2
> forwarding framework than the 82599, and the result was we were told
> that while we might look at doing port representors some other way,
> there was no way we could use switchdev since the hardware couldn't
> support the requirements of switchdev in terms of default routes and
> forwarding behavior. I am planning to resolve the port representor
> issue by looking at coming up with something like a "source mode"
> macvlan based port representor. I figure that is probably the closest
> match for what the Intel hardware does since really the VFs are
> nothing more than a physical macvlan interface in and of themselves as
> the hardware doesn't have a full switch.

Hi Alex,

What we call the slow path requirements are the following:

1. xmit on a VF rep always turns into a receive on the VF, regardless of
the offloaded SW steering rules ("send-to-vport")

2. xmit on a VF which doesn't match any offloaded SW steering rule must
be received into the host OS from the VF rep

Requirements 1 and 2 above must also hold for the uplink and the PF reps.
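
A minimal sketch of the two slow path rules, in Python, assuming a toy
e-switch where the representor of a vport is named `<vport>_rep`. The
`ESwitch` class, its method names, and the layout of the rule table are
hypothetical and only illustrate the behavior described above; they are
not taken from any driver.

```python
from dataclasses import dataclass, field


@dataclass
class ESwitch:
    """Toy model of an e-switch with representor netdevs (hypothetical)."""

    # Offloaded steering rules: (source vport, destination MAC) -> target vport.
    rules: dict = field(default_factory=dict)

    def xmit_on_rep(self, vport: str, frame: dict) -> str:
        # Requirement 1 ("send-to-vport"): a transmit on the representor is
        # always received by the vport it represents; the rules are not consulted.
        return vport

    def xmit_on_vport(self, vport: str, frame: dict) -> str:
        # Fast path: a matching offloaded rule forwards the frame in hardware.
        key = (vport, frame["dst"])
        if key in self.rules:
            return self.rules[key]
        # Requirement 2: on a miss, the host receives the frame on the representor.
        return f"{vport}_rep"


esw = ESwitch(rules={("vf0", "aa:bb:cc:00:00:01"): "vf1"})
print(esw.xmit_on_rep("vf0", {"dst": "ff:ff:ff:ff:ff:ff"}))
print(esw.xmit_on_vport("vf0", {"dst": "aa:bb:cc:00:00:01"}))
print(esw.xmit_on_vport("vf0", {"dst": "aa:bb:cc:00:00:02"}))
```

The same two methods apply unchanged when `vport` is `"uplink"` or
`"pf0"`, which is the point of requiring reps for the uplink and the PF.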

When the i40e limitation was described at netdev, it seemed you had a problem
with VF xmit that should turn into a recv on the VF rep but also
goes to the wire.

It smells as if a FW patch could solve that, couldn't it?

> I would have to disagree with this. For devices such as 82599 that
> doesn't have a true switch this may limit future functionality since
> we can't move it over to switchdev mode. For example one thing I may
> need to add is the ability to disable multicast and broadcast receive
> on a per-VF basis at some point in the future.

We are in the same boat with ConnectX3/mlx4, so lucky for us that misery loves
company (my google search also yielded "many narrow-half consolation", is that
completely unrelated?). The legacy mode for ixgbe/mlx4 has been there for ~8-10
years, and since then both companies have had 2-3 newer HW generations. I don't
see why you can't go to your customers and tell them that newish functionality
needs newer HW; it will also help sell more of the new stuff. If you keep
extending the legacy mode, more ppl/drivers will do that as well, and it will
not let us go in the right direction.

Or.

* Re: SRIOV switchdev mode BoF minutes
  2017-11-13  6:16   ` Or Gerlitz
@ 2017-11-13 17:10     ` Alexander Duyck
  2017-11-14 16:44       ` Or Gerlitz
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Duyck @ 2017-11-13 17:10 UTC
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Sun, Nov 12, 2017 at 10:16 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Sun, Nov 12, 2017 at 10:38 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Sun, Nov 12, 2017 at 11:49 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> Hi Dave and all,
>>>
>>> During and after the BoF on SRIOV switchdev mode, we came into a
>>> consensus among the developers from four different HW vendors (CC
>>> audience) that a correct thing to do would be to disallow any new
>>> extensions to the legacy mode.
>>>
>>> The idea is to put focus on the new mode and not add new UAPIs and
>>> kernel code which was turned to be a wrong design which does not allow
>>> for properly offloading a kernel switching SW model to e-switch HW.
>
>> You may not recall but we tried to transition the i40e driver over to
>> SwitchDev, the parts supported by i40e have a much more robust l2
>> forwarding framework than the 82599, and the result was we were told
>> that while we might look at doing port representors some other way,
>> there was no way we could use switchdev since the hardware couldn't
>> support the requirements of switchdev in terms of default routes and
>> forwarding behavior. I am planning to resolve the port representor
>> issue by looking at coming up with something like a "source mode"
>> macvlan based port representor. I figure that is probably the closest
>> match for what the Intel hardware does since really the VFs are
>> nothing more than a physical macvlan interface in and of themselves as
>> the hardware doesn't have a full switch.
>
> Hi Alex,
>
> The what we call slow path requirements are the following:
>
> 1. xmit on VF rep always turns to a receive on the VF, regardless of
> the offloaded SW steering rules ("send-to-vport")
>
> 2. xmit on VF which doesn't meet any offloaded SW steering rules must
> be received into the host OS from the VF rep
>
> 1,2 above must hold also for the uplink and the PF reps

I am well aware of the requirements. We discussed these with Jiri at
the previous netdev.

> When the i40e limitation was described to @ netdev, it seems you have a problem
> with VF xmit that should be turned to be a recv on the VF rep but also
> goes to the wire.
>
> It smells as if a FW patch can solve that, isn't that?

That is a huge maybe. We looked into it last time and while we can
meet requirements 1 and 2 we do so with a heavy performance penalty
due to the fact that we don't support anywhere near the same number of
flows as a true switch. Also while that might work for i40e we still
have a much larger install base of ixgbe ports that we have to
support.

>> I would have to disagree with this. For devices such as 82599 that
>> doesn't have a true switch this may limit future functionality since
>> we can't move it over to switchdev mode. For example one thing I may
>> need to add is the ability to disable multicast and broadcast receive
>> on a per-VF basis at some point in the future.
>
> We are on the same boat with ConnectX3/mlx4, so us lucky that misery loves
> company (my google search also yielded "many narrow-half consolation" is that
> completely unrelated?) - the legacy mode for ixgbe/mlx4 is there for ~8-10 years
> - and since then both companies had 2-3 newer HW generations. I don't see why
> you can't come to your customers and tell that newish functionality needs newer
> HW - it will also help sell more from the new stuff..  If you keep
> extending the legacy mode, more ppl/drivers will do that as well and
> it will not let us go in the right direction.
>
> Or.

Well, I don't know about you guys, but we are still selling parts
supported by ixgbe and were still adding new hardware as recently
as just a couple of years ago. I'm not saying SwitchDev doesn't need to
be supported; if anything, I am saying we need to leave the legacy
support extendable so that we can set up a glide path between the two.
If I can get the source mode macvlan port representor working the way I
hope, we can start getting our customers used to a SwitchDev
type environment without having to use full SwitchDev. That would help
to make them more amenable to moving over to devices that support it
in the future.

In addition this all works on the basis of all future SR-IOV devices
being based on a VEB. Do we know if there are any existing or future
devices that work in a VEPA type mode? The issue with ixgbe and i40e
is that they were designed to be a hybrid between the two but in my
opinion they lean much more toward the VEPA configuration with just a
little bit of loopback support to make the VEB setup work. As such we
end up with issues such as all broadcasts/multicasts always being
transmitted out the uplink port.

If anything I think what we should define as a requirement would be
that we cannot add any future legacy items without adding support for
the same via the SwitchDev port representor.

- Alex

* Re: SRIOV switchdev mode BoF minutes
  2017-11-13 17:10     ` Alexander Duyck
@ 2017-11-14 16:44       ` Or Gerlitz
  2017-11-14 20:00         ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2017-11-14 16:44 UTC
  To: Alexander Duyck
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Mon, Nov 13, 2017 at 7:10 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Sun, Nov 12, 2017 at 10:16 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Sun, Nov 12, 2017 at 10:38 PM, Alexander Duyck

>> The what we call slow path requirements are the following:
>>
>> 1. xmit on VF rep always turns to a receive on the VF, regardless of
>> the offloaded SW steering rules ("send-to-vport")
>>
>> 2. xmit on VF which doesn't meet any offloaded SW steering rules must
>> be received into the host OS from the VF rep

>> 1,2 above must hold also for the uplink and the PF reps

> I am well aware of the requirements. We discussed these with Jiri at
> the previous netdev.

>> When the i40e limitation was described to @ netdev, it seems you have a problem
>> with VF xmit that should be turned to be a recv on the VF rep but also
>> goes to the wire.

>> It smells as if a FW patch can solve that, isn't that?

> That is a huge maybe. We looked into it last time and while we can
> meet requirements 1 and 2 we do so with a heavy performance penalty
> due to the fact that we don't support anywhere near the same number of
> flows as a true switch. Also while that might work for i40e

To recap on i40e: you can support the slow path requirements, but you have an
issue with the fast path (== offloaded flows)? What is the issue there?

> we still have a much larger install base of ixgbe ports that we have to support.

OK, but support is one thing, and continuing to enhance a ten-year-old wrong
SW model is another.

>>> I would have to disagree with this. For devices such as 82599 that
>>> doesn't have a true switch this may limit future functionality since
>>> we can't move it over to switchdev mode. For example one thing I may
>>> need to add is the ability to disable multicast and broadcast receive
>>> on a per-VF basis at some point in the future.

>> We are on the same boat with ConnectX3/mlx4, so us lucky that misery loves
>> company (my google search also yielded "many narrow-half consolation" is that
>> completely unrelated?) - the legacy mode for ixgbe/mlx4 is there for ~8-10 years
>> - and since then both companies had 2-3 newer HW generations. I don't see why
>> you can't come to your customers and tell that newish functionality needs newer
>> HW - it will also help sell more from the new stuff..  If you keep
>> extending the legacy mode, more ppl/drivers will do that as well and
>> it will not let us go in the right direction.

> Well I don't know about you guys, but we still are selling parts
> supported by ixgbe

Same here, we are selling lots of CX3 and have to support that, but I don't
see why someone would want new features there.

> still been adding new hardware as recently as just a couple years ago.

Wait, that's a different story.

You are saying that your older HW doesn't support e-switch, that
you want to keep making new parts based on that older HW, and that you want
the kernel to keep enhancing a wrong SW model b/c you are making new parts
from old HW. I don't see why we as a community need to go there.

Let's focus on this point for a moment before discussing the other points
you raised.

Or.

* Re: SRIOV switchdev mode BoF minutes
  2017-11-14 16:44       ` Or Gerlitz
@ 2017-11-14 20:00         ` Alexander Duyck
  2017-11-14 21:50           ` Or Gerlitz
  2017-11-14 23:32           ` Jakub Kicinski
  0 siblings, 2 replies; 36+ messages in thread
From: Alexander Duyck @ 2017-11-14 20:00 UTC
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Nov 14, 2017 at 8:44 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Mon, Nov 13, 2017 at 7:10 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Sun, Nov 12, 2017 at 10:16 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> On Sun, Nov 12, 2017 at 10:38 PM, Alexander Duyck
>
>>> The what we call slow path requirements are the following:
>>>
>>> 1. xmit on VF rep always turns to a receive on the VF, regardless of
>>> the offloaded SW steering rules ("send-to-vport")
>>>
>>> 2. xmit on VF which doesn't meet any offloaded SW steering rules must
>>> be received into the host OS from the VF rep
>
>>> 1,2 above must hold also for the uplink and the PF reps
>
>> I am well aware of the requirements. We discussed these with Jiri at
>> the previous netdev.
>
>>> When the i40e limitation was described to @ netdev, it seems you have a problem
>>> with VF xmit that should be turned to be a recv on the VF rep but also
>>> goes to the wire.
>
>>> It smells as if a FW patch can solve that, isn't that?
>
>> That is a huge maybe. We looked into it last time and while we can
>> meet requirements 1 and 2 we do so with a heavy performance penalty
>> due to the fact that we don't support anywhere near the same number of
>> flows as a true switch. Also while that might work for i40e
>
> to recap on i40e, you can support the slow path requirements, but  you have an
> issue with the fast path (== offloaded flows)? what is the issue there?

We basically need to do some feasibility research to see if we can
actually meet all the requirements for switchdev on i40e. We have been
getting mixed messages where we are given a great many "yes, but" type
answers. For i40e we are looking into it, but I don't have high
confidence in our ability to actually support it in hardware/firmware.
If it were as easy as you have been led to believe, we would have done
it months ago when we were researching the requirements to support
switchdev. In addition, i40e isn't really my concern. I am much more
concerned about ixgbe, as it has a much larger install base and many
more customers that are still buying it today.

>> we still have a much larger install base of ixgbe ports that we have to support.
>
> ok, but support is one thing and keep enhancing a ten years old wrong
> SW model is 2nd thing

The model might be 10 years old, but as I said we are still shipping
new silicon that was released just over a year ago that is supported
by the ixgbe driver.

Also I don't know if the term "enhancing" is the right word for what I
am thinking. I'm not talking about adding new drivers that only
support legacy mode.  We are looking at probably having to refactor
the whole concept of "trusted" VF in order to break it out into
smaller buckets. In addition I plan to come up with a source mode
macvlan based "port representor" for legacy SR-IOV and hope to be able
to use that to start working on a better path for SR-IOV live
migration.

Fundamentally the problem I have with us saying we cannot extend
legacy mode SR-IOV is that 82599 is a very large piece of the existing
install base for 10Gbit in general. We have it shipping on brand new
platforms as the silicon that is installed on the motherboard. With
that being the case people are going to want to get the most value
they can out of the silicon that they purchased since in many cases it
is just a standard part of the platform.

>>>> I would have to disagree with this. For devices such as 82599 that
>>>> doesn't have a true switch this may limit future functionality since
>>>> we can't move it over to switchdev mode. For example one thing I may
>>>> need to add is the ability to disable multicast and broadcast receive
>>>> on a per-VF basis at some point in the future.
>
>>> We are on the same boat with ConnectX3/mlx4, so us lucky that misery loves
>>> company (my google search also yielded "many narrow-half consolation" is that
>>> completely unrelated?) - the legacy mode for ixgbe/mlx4 is there for ~8-10 years
>>> - and since then both companies had 2-3 newer HW generations. I don't see why
>>> you can't come to your customers and tell that newish functionality needs newer
>>> HW - it will also help sell more from the new stuff..  If you keep
>>> extending the legacy mode, more ppl/drivers will do that as well and
>>> it will not let us go in the right direction.
>
>> Well I don't know about you guys, but we still are selling parts
>> supported by ixgbe
>
> Same here, we are selling lots of CX3 and have to support that, but I didn't
> see why someone will want new features there.

I think the difference is that we get pressed on as part of the
platform instead of being a single component. If a customer wants some
specific feature enabled on 82599 as a part of the platform we tend to
need to go along with it in order to avoid being a roadblock in a sale
of other components.

>> still been adding new hardware as recently as just a couple years ago.
>
> wait, that's different story.
>
> You are saying that your older HW doesn't support e-switch
> and you want to keep doing new parts of that older HW and you want the
> kernel to keep enhance a wrong SW model b/c you are doing new parts
> from old HW, I don't see why we as a community need to go there.

I'm not saying we have new parts. I'm saying we have existing parts
that will likely need some work done. SwitchDev was only introduced
about 2 years ago. We have parts that were released around or before
then with functionality that didn't anticipate this. We still haven't
finished implementing all the features that are available on
the parts; that is what I am arguing. New features usually go in for
several years after a part is released, typically somewhere in the 3 to
5 year range.

> Lets focus on this point for a moment before discussing the other points
> you raised.
>
> Or.

When SR-IOV was introduced there were two available modes, Virtual
Ethernet Port Aggregation, aka VEPA, and Virtual Ethernet Bridging,
aka VEB. The fact is that SwitchDev is designed specifically for
SR-IOV networking with VEB. You argue that the
legacy model is bad, but I would argue that is because the legacy
model was really designed to work with both VEPA and VEB,
whereas SwitchDev only focuses on VEB. If you take a look at the ixgbe
or i40e drivers you will see that we support configuring both of those
modes via ndo_bridge_setlink, since we have customer install bases that
actually prefer VEPA over VEB because they would rather have their traffic
centrally managed than have the local host manage the
traffic. We cannot just arbitrarily tell our customers they are doing
SR-IOV using the "wrong model".
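
To make the difference between the two modes concrete, here is a
minimal sketch in Python of where a frame sent by a local VF goes first.
The function name, the port names, and the `"uplink"` label are
hypothetical; the only assumption about the hardware is the usual one,
that a VEB bridges local VF-to-VF traffic inside the NIC while VEPA
sends every frame to the adjacent switch, which may hairpin it back.

```python
def first_hop(mode: str, dst: str, local_ports: set) -> str:
    """Return where the NIC sends a frame from a local VF first."""
    if mode not in ("veb", "vepa"):
        raise ValueError(f"unknown hwmode: {mode}")
    if mode == "veb" and dst in local_ports:
        # VEB: traffic between local ports is bridged inside the NIC.
        return dst
    # VEPA: everything goes out to the adjacent switch, even local traffic.
    # VEB: destinations that are not local also leave through the uplink.
    return "uplink"


ports = {"vf0", "vf1"}
print(first_hop("veb", "vf1", ports))
print(first_hop("vepa", "vf1", ports))
```

In the VEPA case the forwarding decision for VF-to-VF traffic is made
outside the host, which is why there is no e-switch for SwitchDev to
model.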

I would rather not have SwitchDev become the next SystemD. The type of
argument you are making is basically dictating to us and our customers
how things are supposed to work based on your view of things. We have
different hardware, different customers, and not all of our needs are
necessarily met by SwitchDev. I would agree that SwitchDev is the
go-to solution for VEB configuration, and we do plan to have future
hardware support it. In addition, I would argue that for the sake of
consistency we should make sure that any feature that gets added to
the legacy model has to be supported by the SwitchDev model as well before
it can be accepted. If anything, my hope is to evolve the legacy
model to have much of the same look and feel as SwitchDev, but that
will take time and require changes to the legacy model.

I don't plan to have a ton of new features added to legacy SR-IOV. As
I stated earlier, my main concern is the "trusted" VF mode, which has
become a security issue because everything is getting dumped into it, so
we need to break it up to get finer granularity. For example, I am
looking at adding a promisc/allmulti/multicast/broadcast control per
VF to set the upper limit of what a VF can request to receive, instead
of just turning on "trusted" to allow a VF to turn on promiscuous mode. My
only other concern is live migration. I don't know if that will
require changes to the legacy SR-IOV mode or not, but it would be
better to not have that door closed as an option than to have to work
around it entirely.
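
As a rough illustration of the finer granularity I have in mind, here
is a minimal sketch in Python. The `RxMode` flags and the `grant`
helper are hypothetical names, not an existing kernel or iproute2
interface; the sketch only shows the idea of a per-VF upper limit that
caps whatever receive modes the VF asks for.

```python
from enum import Flag, auto


class RxMode(Flag):
    """Receive modes a VF may request (hypothetical names)."""
    NONE = 0
    BROADCAST = auto()
    MULTICAST = auto()
    ALLMULTI = auto()
    PROMISC = auto()


def grant(requested: RxMode, allowed: RxMode) -> RxMode:
    """Return the subset of the requested modes that the host permits."""
    return requested & allowed


# The host caps this VF at broadcast + multicast; the VF asks for more.
cap = RxMode.BROADCAST | RxMode.MULTICAST
print(grant(RxMode.MULTICAST | RxMode.PROMISC, cap))
```

With a single "trusted" bit the cap is either everything or nothing;
with a mask like this the administrator can allow multicast without
also allowing promiscuous mode.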

So, to summarize:
1. VEPA is still a thing, that implies no e-switch. Switchdev does not
address that model.
2. I agree that SwitchDev is the way forward for VEB.
3. I agree we should focus on interface consistency so any new feature
added to legacy mode has to also be enabled in SwitchDev.

I hope this makes my point a bit clearer. I don't fundamentally
disagree with the need to focus on having a consistent UAPI going
forward. The only spot where we have issues is that I don't see
SwitchDev as the only solution as we still have customers that aren't
necessarily making use of an eswitch and telling them they are "doing
it wrong" isn't really a viable solution. If nothing else I think we
can look at re-evaluating this at the next netdev/netconf, and for now
I would agree legacy SR-IOV changes should be under greater scrutiny.

- Alex

* Re: SRIOV switchdev mode BoF minutes
  2017-11-14 20:00         ` Alexander Duyck
@ 2017-11-14 21:50           ` Or Gerlitz
  2017-11-14 23:05             ` Alexander Duyck
  2017-11-14 23:32           ` Jakub Kicinski
  1 sibling, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2017-11-14 21:50 UTC
  To: Alexander Duyck
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Nov 14, 2017 at 10:00 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Tue, Nov 14, 2017 at 8:44 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Mon, Nov 13, 2017 at 7:10 PM, Alexander Duyck
>> <alexander.duyck@gmail.com> wrote:
>>> On Sun, Nov 12, 2017 at 10:16 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>> On Sun, Nov 12, 2017 at 10:38 PM, Alexander Duyck
>>
>>>> The what we call slow path requirements are the following:
>>>>
>>>> 1. xmit on VF rep always turns to a receive on the VF, regardless of
>>>> the offloaded SW steering rules ("send-to-vport")
>>>>
>>>> 2. xmit on VF which doesn't meet any offloaded SW steering rules must
>>>> be received into the host OS from the VF rep
>>
>>>> 1,2 above must hold also for the uplink and the PF reps
>>
>>> I am well aware of the requirements. We discussed these with Jiri at
>>> the previous netdev.
>>
>>>> When the i40e limitation was described to @ netdev, it seems you have a problem
>>>> with VF xmit that should be turned to be a recv on the VF rep but also
>>>> goes to the wire.
>>
>>>> It smells as if a FW patch can solve that, isn't that?
>>
>>> That is a huge maybe. We looked into it last time and while we can
>>> meet requirements 1 and 2 we do so with a heavy performance penalty
>>> due to the fact that we don't support anywhere near the same number of
>>> flows as a true switch. Also while that might work for i40e
>>
>> to recap on i40e, you can support the slow path requirements, but  you have an
>> issue with the fast path (== offloaded flows)? what is the issue there?
>
> We basically need to do some feasibility research to see if we can
> actually meet all the requirements for switchdev on i40e. We have been
> getting mixed messages where we are given a great many "yes, but" type
> answers. For i40e we are looking into it but I don't have high
> confidence in our ability to actually support it in hardware/firmware.
> If it were as easy as you have been led to believe, we would have done
> it months ago when we were researching the requirements to support switchdev

Wait, Sridhar made seven rounds of his submission (this is the v7
pointer [1]) and you still don't know if what you were attempting to
push upstream can work? Something is weird here, can you clarify? Jeff?

Sridhar, maybe you can explain what wrong assumptions, if any, you had in
your code, and what you think the gap is to address them and come up with
a proper impl for i40e?

[1] https://marc.info/?l=linux-netdev&m=149083338400922&w=2


> In addition i40e isn't really my concern. I am much more
> concerned about ixgbe as it has a much larger install base and many
> more customers that are still buying it today.
>
>>> we still have a much larger install base of ixgbe ports that we have to support.
>>
>> ok, but support is one thing and keep enhancing a ten years old wrong
>> SW model is 2nd thing
>
> The model might be 10 years old, but as I said we are still shipping
> new silicon that was released just over a year ago that is supported
> by the ixgbe driver.
>
> Also I don't know if the term "enhancing" is the right word for what I
> am thinking. I'm not talking about adding new drivers that only
> support legacy mode.  We are looking at probably having to refactor
> the whole concept of "trusted" VF in order to break it out into
> smaller buckets. In addition I plan to come up with a source mode
> macvlan based "port representor" for legacy SR-IOV and hope to be able
> to use that to start working on a better path for SR-IOV live
> migration.
>
> Fundamentally the problem I have with us saying we cannot extend
> legacy mode SR-IOV is that 82599 is a very large piece of the existing
> install base for 10Gbit in general. We have it shipping on brand new
> platforms as the silicon that is installed on the motherboard. With
> that being the case people are going to want to get the most value
> they can out of the silicon that they purchased since in many cases it
> is just a standard part of the platform.

Getting the most value still doesn't mean you should approach the community
and ask to keep enhancing a wrong SW model for a switch.

For example, suppose a single new module param bit for IXGBE would get
you to sell another 100K or 1M or 10M pieces per year, but we as a
community have decided that module params are not the way to go. Will you
come and ask to add the module param so that you get more biz?


> I'm not saying we have new parts. I'm saying we have existing parts
> that will likely need some work done. SwitchDev was only introduced
> about 2 years ago. We have parts that were released around or before
> then with functionality that didn't anticipate this. We still haven't
> finished fully implementing all the features that were available on
> the parts, that is what I am arguing. Usually new features go in for
> several years after a part is released, usually something on the 3 to
> 5 year range.

> When SR-IOV was introduced there were two available modes, Virtual
> Ethernet Port Aggregation, aka VEPA, and Virtual Ethernet Bridging,
> aka VEB. The fact is SwitchDev is designed specifically for networking
> SR-IOV with Virtual Ethernet Bridging, aka VEB. You argue that the
> legacy model is bad, but I would argue that is because the legacy
> model was really designed to work for both VEPA and VEB,
> whereas SwitchDev only focuses on VEB. If you take a look in the ixgbe
> or i40e drivers you will see that we support configuring both of those
> modes via ndo_bridge_setlink since we have customer install bases that
> actually prefer VEPA over VEB as they prefer to have their traffic
> centrally managed instead of having the local host managing the
> traffic. We cannot just arbitrarily tell our customers they are doing
> SR-IOV using the "wrong model".
>
> I would rather not have SwitchDev become the next SystemD. The type
> argument you are making is basically dictating to us and our customers
> how things are supposed to work based on your view of things. We have
> different hardware, different customers, and all of our needs aren't
> necessarily met by SwitchDev. I would agree that SwitchDev is the
> go-to solution for VEB configuration, and we do plan to have future
> hardware support it. In addition I would argue that for the sake of
> consistency we should make sure that any feature that gets added to
> the legacy has to be supported by the SwitchDev model as well before
> it could be supported. If anything my hope is to evolve the legacy
> model to have much of the same look and feel as SwitchDev, but that
> will take time and require changes to the legacy model.
>
> I don't plan to have a ton of new features added to legacy SR-IOV, as
> I stated earlier my main concern is the "trusted" VF mode as that has
> become a security issue as everything is getting dumped into that so
> we need to break it up to get finer granularity. For example I am
> looking at adding a promisc/allmulti/multicast/broadcast control per
> VF to set the upper limit of what a VF can request to receive instead
> of just turning on "trusted" to allow a VF to turn on promiscuous. My
> only other concern is live migration. I don't know if that will
> require changes to the legacy SR-IOV mode or not, but it would be
> better to not have that door closed as an option than to have to work
> around it entirely.
>
> So, to summarize:
> 1. VEPA is still a thing, that implies no e-switch. Switchdev does not
> address that model.
> 2. I agree that SwitchDev is the way forward for VEB.
> 3. I agree we should focus on interface consistency so any new feature
> added to legacy mode has to also be enabled in SwitchDev.
>
> I hope this makes my point a bit clearer. I don't fundamentally
> disagree with the need to focus on having a consistent UAPI going
> forward. The only spot where we have issues is that I don't see
> SwitchDev as the only solution as we still have customers that aren't
> necessarily making use of an eswitch and telling them they are "doing
> it wrong" isn't really a viable solution. If nothing else I think we
> can look at re-evaluating this at the next netdev/netconf, and for now
> I would agree legacy SR-IOV changes should be under greater scrutiny.

Alex,

Lots of data and argumentation. It's too bad that none of it was
said/presented at the last netdev/netconf, nor at the previous
conferences (Feb 2016 / Oct 2016) when SRIOV switchdev was on the
stage, nor in the submissions that followed. These don't seem to be
new data points, at least to you. As you said, the switchdev mode for
SRIOV has been around for two years (merged in 4.8 but presented way
back). You waited two years to provide this input and we will have to
wait another 6 months for you to conduct a session on that.

Can you point out public use-cases / white-papers / design documents /
blueprints / etc. that employ the VEPA approach? Because really no
other person/vendor brought it up... we were all dealing with the
sriov e-switch as a HW switch which should be programmed by the host
stack according to well known industry models that apply on physical
switches, e.g.

1. L2 FDB (Linux Bridge)
2. L3 FIB (Linux Routers)
3. ACLs (Linux TC)

[3] is what is implemented by the upstream sriov switchdev drivers;
[1] and [2] we discussed at netdev. Maybe you want to play with [1]
for i40e? I had a slide on that in the BoF.

Or.


* Re: SRIOV switchdev mode BoF minutes
  2017-11-14 21:50           ` Or Gerlitz
@ 2017-11-14 23:05             ` Alexander Duyck
  2017-11-14 23:36               ` Jakub Kicinski
  2017-11-16 17:41               ` Or Gerlitz
  0 siblings, 2 replies; 36+ messages in thread
From: Alexander Duyck @ 2017-11-14 23:05 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Nov 14, 2017 at 1:50 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Tue, Nov 14, 2017 at 10:00 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Tue, Nov 14, 2017 at 8:44 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> On Mon, Nov 13, 2017 at 7:10 PM, Alexander Duyck
>>> <alexander.duyck@gmail.com> wrote:
>>>> On Sun, Nov 12, 2017 at 10:16 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>>> On Sun, Nov 12, 2017 at 10:38 PM, Alexander Duyck
>>>
>>>>> The what we call slow path requirements are the following:
>>>>>
>>>>> 1. xmit on VF rep always turns to a receive on the VF, regardless of
>>>>> the offloaded SW steering rules ("send-to-vport")
>>>>>
>>>>> 2. xmit on VF which doesn't meet any offloaded SW steering rules must
>>>>> be received into the host OS from the VF rep
>>>
>>>>> 1,2 above must hold also for the uplink and the PF reps
>>>
>>>> I am well aware of the requirements. We discussed these with Jiri at
>>>> the previous netdev.
>>>
>>>>> When the i40e limitation was described to @ netdev, it seems you have a problem
>>>>> with VF xmit that should be turned to be a recv on the VF rep but also
>>>>> goes to the wire.
>>>
>>>>> It smells as if a FW patch can solve that, isn't that?
>>>
>>>> That is a huge maybe. We looked into it last time and while we can
>>>> meet requirements 1 and 2 we do so with a heavy performance penalty
>>>> due to the fact that we don't support anywhere near the same number of
>>>> flows as a true switch. Also while that might work for i40e
>>>
>>> to recap on i40e, you can support the slow path requirements, but  you have an
>>> issue with the fast path (== offloaded flows)? what is the issue there?
>>
>> We basically need to do some feasibility research to see if we can
>> actually meet all the requirements for switchdev on i40e. We have been
>> getting mixed messages where we are given a great many "yes, but" type
>> answers. For i40e we are looking into it but I don't have high
>> confidence in our ability to actually support it in hardware/firmware.
>> If it were as easy as you have been led to believe, we would have done
>> it months ago when we were researching the requirements to support switchdev
>
> wait, Sridhar made seven rounds of his submission (this is the v7
> pointer [1]) and you
> still don't know if what you were attempting to push upstream can
> work, something is
> weird here, can you clarify? Jeff?

Not weird so much as stubborn. The patches were being pushed based on
the assumption that the community would accept a NIC generating port
representors that didn't necessarily pass traffic, and then even when
we had them passing traffic the PF still wasn't configured to handle
being the default destination for traffic without any rules
associated; instead, VFs would send directly to the outside world.

> Sridhar, maybe you can explain if/what wrong assumptions you had in your code
> and what you think is the gap to address them and come up with proper
> impl for i40e?
>
> [1] https://marc.info/?l=linux-netdev&m=149083338400922&w=2

For starters the firmware change you are talking about didn't exist
during this time frame. We can ignore those patches as they assumed
that port representors didn't necessarily have to pass traffic.

>> In addition i40e isn't really my concern. I am much more
>> concerned about ixgbe as it has a much larger install base and many
>> more customers that are still buying it today.
>>
>>>> we still have a much larger install base of ixgbe ports that we have to support.
>>>
>>> ok, but support is one thing and keep enhancing a ten years old wrong
>>> SW model is 2nd thing
>>
>> The model might be 10 years old, but as I said we are still shipping
>> new silicon that was released just over a year ago that is supported
>> by the ixgbe driver.
>>
>> Also I don't know if the term "enhancing" is the right word for what I
>> am thinking. I'm not talking about adding new drivers that only
>> support legacy mode.  We are looking at probably having to refactor
>> the whole concept of "trusted" VF in order to break it out into
>> smaller buckets. In addition I plan to come up with a source mode
>> macvlan based "port representor" for legacy SR-IOV and hope to be able
>> to use that to start working on a better path for SR-IOV live
>> migration.
>>
>> Fundamentally the problem I have with us saying we cannot extend
>> legacy mode SR-IOV is that 82599 is a very large piece of the existing
>> install base for 10Gbit in general. We have it shipping on brand new
>> platforms as the silicon that is installed on the motherboard. With
>> that being the case people are going to want to get the most value
>> they can out of the silicon that they purchased since in many cases it
>> is just a standard part of the platform.
>
> Getting the most value still doesn't mean you should approach the community
> and ask to keep enhancing a wrong SW model for a switch.
>
> For example, suppose a single new bit module param to IXGBE will get
> you to sell another
> 100K or 1M or 10M pieces per year but we as community decided that
> module params are
> not the way to go - will you come and ask to add the module param for
> you to get more biz?

The problem is that is how things have been done in the past. I don't
want us going down that road. That is half of my frustration with how
things have been done. Even worse is how debugfs has been misused.
I'm trying to keep us from committing to an agreement that we won't
abide by.

>> I'm not saying we have new parts. I'm saying we have existing parts
>> that will likely need some work done. SwitchDev was only introduced
>> about 2 years ago. We have parts that were released around or before
>> then with functionality that didn't anticipate this. We still haven't
>> finished fully implementing all the features that were available on
>> the parts, that is what I am arguing. Usually new features go in for
>> several years after a part is released, usually something on the 3 to
>> 5 year range.
>
>> When SR-IOV was introduced there were two available modes, Virtual
>> Ethernet Port Aggregation, aka VEPA, and Virtual Ethernet Bridging,
>> aka VEB. The fact is SwitchDev is designed specifically for networking
>> SR-IOV with Virtual Ethernet Bridging, aka VEB. You argue that the
>> legacy model is bad, but I would argue that is because the legacy
>> model was really designed to work for both VEPA and VEB,
>> whereas SwitchDev only focuses on VEB. If you take a look in the ixgbe
>> or i40e drivers you will see that we support configuring both of those
>> modes via ndo_bridge_setlink since we have customer install bases that
>> actually prefer VEPA over VEB as they prefer to have their traffic
>> centrally managed instead of having the local host managing the
>> traffic. We cannot just arbitrarily tell our customers they are doing
>> SR-IOV using the "wrong model".
>>
>> I would rather not have SwitchDev become the next SystemD. The type
>> argument you are making is basically dictating to us and our customers
>> how things are supposed to work based on your view of things. We have
>> different hardware, different customers, and all of our needs aren't
>> necessarily met by SwitchDev. I would agree that SwitchDev is the
>> go-to solution for VEB configuration, and we do plan to have future
>> hardware support it. In addition I would argue that for the sake of
>> consistency we should make sure that any feature that gets added to
>> the legacy has to be supported by the SwitchDev model as well before
>> it could be supported. If anything my hope is to evolve the legacy
>> model to have much of the same look and feel as SwitchDev, but that
>> will take time and require changes to the legacy model.
>>
>> I don't plan to have a ton of new features added to legacy SR-IOV, as
>> I stated earlier my main concern is the "trusted" VF mode as that has
>> become a security issue as everything is getting dumped into that so
>> we need to break it up to get finer granularity. For example I am
>> looking at adding a promisc/allmulti/multicast/broadcast control per
>> VF to set the upper limit of what a VF can request to receive instead
>> of just turning on "trusted" to allow a VF to turn on promiscuous. My
>> only other concern is live migration. I don't know if that will
>> require changes to the legacy SR-IOV mode or not, but it would be
>> better to not have that door closed as an option than to have to work
>> around it entirely.
>>
>> So, to summarize:
>> 1. VEPA is still a thing, that implies no e-switch. Switchdev does not
>> address that model.
>> 2. I agree that SwitchDev is the way forward for VEB.
>> 3. I agree we should focus on interface consistency so any new feature
>> added to legacy mode has to also be enabled in SwitchDev.
>>
>> I hope this makes my point a bit clearer. I don't fundamentally
>> disagree with the need to focus on having a consistent UAPI going
>> forward. The only spot where we have issues is that I don't see
>> SwitchDev as the only solution as we still have customers that aren't
>> necessarily making use of an eswitch and telling them they are "doing
>> it wrong" isn't really a viable solution. If nothing else I think we
>> can look at re-evaluating this at the next netdev/netconf, and for now
>> I would agree legacy SR-IOV changes should be under greater scrutiny.
>
> Alex,
>
> Lots of data and argumentation, it's too bad that none of it was
> said/presented @ the last
> netdev/netconf nor in the previous conferences (Feb 2016 / Oct 2016)
> when SRIOV switchdev
> was on the stage nor in the submissions that followed, doesn't seem as
> new data points, at
> least to you. As you said, the switchdev mode for SRIOV is around for
> two years (merged in 4.8
> but was presented way back). You waited two years to provide this
> input and we will have to wait
> another 6 months for you to conduct a session on that.

This is the first time you have essentially said SwitchDev is the
only way things are going to be done going forward. In addition I
don't recall you ever using wording that basically calls the legacy
model bad for SR-IOV. That is why I have been okay with it up until
now.

> Can you point out public use-cases / white-papers / design documents /
> blue prints / etc
> that employ the VEPA approach? b/c really no other person/vendor
> brought it up... we

Cisco and HP were the two vendors that were pushing it hard for a
while there. It isn't anywhere near as popular as VEB is, but from the
looks of it Cisco is still pushing a variant of it in the form of
VN-Tag. If nothing else you can go look at the 802.1Qbg IEEE spec as
it is called out there as well.

> were all dealing with the sriov e-switch as a HW switch which should
> be programmed
> by the host stack according to well known industry models that apply
> on physical switches, e.g
>
> 1. L2 FDB (Linux Bridge)
> 2. L3 FIB (Linux Routers)
> 3. ACLS (Linux TC)
>
> [3] is what implemented by the upstream sriov switchdev drivers, [1] and [2] we
> discussed on netdev, maybe you want to play with [1] for i40e? I had a slide on
> that in the BoF
>
> Or.

So for i40e we will probably explore option 1, and possibly option 3
though as I said we still have to figure out what we can get the
firmware to actually do for us. That ends up being the ultimate
limitation.

- Alex


* Re: SRIOV switchdev mode BoF minutes
  2017-11-14 23:05             ` Alexander Duyck
@ 2017-11-14 23:36               ` Jakub Kicinski
  2017-11-15  3:04                 ` Alexander Duyck
  2017-11-16 17:41               ` Or Gerlitz
  1 sibling, 1 reply; 36+ messages in thread
From: Jakub Kicinski @ 2017-11-14 23:36 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Or Gerlitz, David Miller, Anjali Singhai Jain, Andy Gospodarek,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, 14 Nov 2017 15:05:08 -0800, Alexander Duyck wrote:
> >> We basically need to do some feasibility research to see if we can
> >> actually meet all the requirements for switchdev on i40e. We have been
> >> getting mixed messages where we are given a great many "yes, but" type
> >> answers. For i40e we are looking into it but I don't have high
> >> confidence in our ability to actually support it in hardware/firmware.
> >> If it were as easy as you have been led to believe, we would have done
> >> it months ago when we were researching the requirements to support switchdev  
> >
> > wait, Sridhar made seven rounds of his submission (this is the v7
> > pointer [1]) and you
> > still don't know if what you were attempting to push upstream can
> > work, something is
> > weird here, can you clarify? Jeff?  
> 
> Not weird so much as stubborn. The patches were being pushed based on
> the assumption that the community would accept a NIC generating port
> representors that didn't necessarily pass traffic, and then even when
> we had them passing traffic the PF still wasn't configured to handle
> being the default destination for traffic without any rules
> associated, instead VFs would directly send to the outside world.

Perhaps the way forward is to lift the requirement on passing traffic,
as long as the limitation is clearly expressed to the users.


* Re: SRIOV switchdev mode BoF minutes
  2017-11-14 23:36               ` Jakub Kicinski
@ 2017-11-15  3:04                 ` Alexander Duyck
  2017-11-15  4:02                   ` Jakub Kicinski
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Duyck @ 2017-11-15  3:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Or Gerlitz, David Miller, Anjali Singhai Jain, Andy Gospodarek,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Nov 14, 2017 at 3:36 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Tue, 14 Nov 2017 15:05:08 -0800, Alexander Duyck wrote:
>> >> We basically need to do some feasibility research to see if we can
>> >> actually meet all the requirements for switchdev on i40e. We have been
>> >> getting mixed messages where we are given a great many "yes, but" type
>> >> answers. For i40e we are looking into it but I don't have high
>> >> confidence in our ability to actually support it in hardware/firmware.
>> >> If it were as easy as you have been led to believe, we would have done
>> >> it months ago when we were researching the requirements to support switchdev
>> >
>> > wait, Sridhar made seven rounds of his submission (this is the v7
>> > pointer [1]) and you
>> > still don't know if what you were attempting to push upstream can
>> > work, something is
>> > weird here, can you clarify? Jeff?
>>
>> Not weird so much as stubborn. The patches were being pushed based on
>> the assumption that the community would accept a NIC generating port
>> representors that didn't necessarily pass traffic, and then even when
>> we had them passing traffic the PF still wasn't configured to handle
>> being the default destination for traffic without any rules
>> associated, instead VFs would directly send to the outside world.
>
> Perhaps the way forward is to lift the requirement on passing traffic,
> as long as the limitation is clearly expressed to the users.

No, I am not arguing for that because then SwitchDev will fall into
disarray. If we want to have a strict definition for what is SwitchDev
and what isn't I am okay with that. It gives us a definition of what
our hardware needs to do in order to support it and without that we
are going to get hardware that just bends the rules to claim support
for it.

All I am asking for is for us to not close the door to the possibility
of adding features to legacy SR-IOV. I am hoping to use a source
macvlan based approach to make it so that we can support "port
representors" for devices that can't support full SwitchDev. The idea
would be to use them to get as close to SwitchDev level support on
legacy devices as possible without using full SwitchDev. That should
solve a good part of the issue, but I am pretty certain I need to be
able to extend legacy SR-IOV in order to support it. I had talked with
Jiri at netdev 2.1 about it back when we had submitted the v7 patches,
and the decision was to look at doing "port representors" but not
associate them with SwitchDev. I was out on sabbatical for most of the
summer and I am just now starting on the macvlan work I had planned. I
hope to have it done before the next netdev and then we can discuss it
there if it needs more discussion than what we can have on the mailing
list.

I'm fine with us placing any legacy SR-IOV changes under more
scrutiny. I am just not open to saying we will not extend or update
any features for legacy SR-IOV. The fact is we are still selling a ton
of ixgbe based parts, so I can't say with any certainty that there
won't be a request for some new SR-IOV feature in the future.

- Alex


* Re: SRIOV switchdev mode BoF minutes
  2017-11-15  3:04                 ` Alexander Duyck
@ 2017-11-15  4:02                   ` Jakub Kicinski
  2017-11-15 18:25                     ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Jakub Kicinski @ 2017-11-15  4:02 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Or Gerlitz, David Miller, Anjali Singhai Jain, Andy Gospodarek,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, 14 Nov 2017 19:04:36 -0800, Alexander Duyck wrote:
> On Tue, Nov 14, 2017 at 3:36 PM, Jakub Kicinski
> <jakub.kicinski@netronome.com> wrote:
> > On Tue, 14 Nov 2017 15:05:08 -0800, Alexander Duyck wrote:  
> >> >> We basically need to do some feasibility research to see if we can
> >> >> actually meet all the requirements for switchdev on i40e. We have been
> >> >> getting mixed messages where we are given a great many "yes, but" type
> >> >> answers. For i40e we are looking into it but I don't have high
> >> >> confidence in our ability to actually support it in hardware/firmware.
> >> >> If it were as easy as you have been led to believe, we would have done
> >> >> it months ago when we were researching the requirements to support switchdev  
> >> >
> >> > wait, Sridhar made seven rounds of his submission (this is the v7
> >> > pointer [1]) and you
> >> > still don't know if what you were attempting to push upstream can
> >> > work, something is
> >> > weird here, can you clarify? Jeff?  
> >>
> >> Not weird so much as stubborn. The patches were being pushed based on
> >> the assumption that the community would accept a NIC generating port
> >> representors that didn't necessarily pass traffic, and then even when
> >> we had them passing traffic the PF still wasn't configured to handle
> >> being the default destination for traffic without any rules
> >> associated, instead VFs would directly send to the outside world.  
> >
> > Perhaps the way forward is to lift the requirement on passing traffic,
> > as long as the limitation is clearly expressed to the users.  
> 
> No, I am not arguing for that because then SwitchDev will fall into
> disarray. If we want to have a strict definition for what is SwitchDev
> and what isn't I am okay with that. It gives us a definition of what
> our hardware needs to do in order to support it and without that we
> are going to get hardware that just bends the rules to claim support
> for it.

Let me make sure we understand each other.  The switchdev SR-IOV mode is
what happens when the user requests DEVLINK_ESWITCH_MODE_SWITCHDEV.  Are you
saying you are opposed to adding DEVLINK_ESWITCH_MODE_VEPA?

> All I am asking for is for us to not close the door to the possibility
> of adding features to legacy SR-IOV. I am hoping to use a source
> macvlan based approach to make it so that we can support "port
> representors" for devices that can't support full SwitchDev. The idea
> would be to use them to get as close to SwitchDev level support on
> legacy devices as possible without using full SwitchDev. That should
> solve a good part of the issue, but I am pretty certain I need to be
> able to extend legacy SR-IOV in order to support it. I had talked with
> Jiri at netdev 2.1 about it back when we had submitted the v7 patches,
> and the decision was to look at doing "port representors" but don't
> associate them with SwitchDev. I was out on Sabbatical for most of the
> summer and I am just now starting on the macvlan work I had planned. I
> hope to have it done before the next netdev and then we can discuss it
> there if it needs more discussion than what we can have on the mailing
> list.

I don't know what you mean by the macvlan based approach.  Could you
perhaps describe it in more detail?  Will it allow users to configure
forwarding and queueing with existing, standard tools and APIs?

> I'm fine with us placing any legacy SR-IOV changes under more
> scrutiny. I am just not open to saying we will not extend or update
> any features for legacy SR-IOV. The fact is we are still selling a ton
> of ixgbe based parts, so I can't say with any certainty that there
> won't be a request for some new SR-IOV feature in the future.
> 
> - Alex


* Re: SRIOV switchdev mode BoF minutes
  2017-11-15  4:02                   ` Jakub Kicinski
@ 2017-11-15 18:25                     ` Alexander Duyck
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander Duyck @ 2017-11-15 18:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Or Gerlitz, David Miller, Anjali Singhai Jain, Andy Gospodarek,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Nov 14, 2017 at 8:02 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Tue, 14 Nov 2017 19:04:36 -0800, Alexander Duyck wrote:
>> On Tue, Nov 14, 2017 at 3:36 PM, Jakub Kicinski
>> <jakub.kicinski@netronome.com> wrote:
>> > On Tue, 14 Nov 2017 15:05:08 -0800, Alexander Duyck wrote:
>> >> >> We basically need to do some feasibility research to see if we can
>> >> >> actually meet all the requirements for switchdev on i40e. We have been
>> >> >> getting mixed messages where we are given a great many "yes, but" type
>> >> >> answers. For i40e we are looking into it but I don't have high
>> >> >> confidence in our ability to actually support it in hardware/firmware.
>> >> >> If it were as easy as you have been led to believe, we would have done
>> >> >> it months ago when we were researching the requirements to support switchdev
>> >> >
>> >> > wait, Sridhar made seven rounds of his submission (this is the v7
>> >> > pointer [1]) and you
>> >> > still don't know if what you were attempting to push upstream can
>> >> > work, something is
>> >> > weird here, can you clarify? Jeff?
>> >>
>> >> Not weird so much as stubborn. The patches were being pushed based on
>> >> the assumption that the community would accept a NIC generating port
>> >> representors that didn't necessarily pass traffic, and then even when
>> >> we had them passing traffic the PF still wasn't configured to handle
>> >> being the default destination for traffic without any rules
>> >> associated, instead VFs would directly send to the outside world.
>> >
>> > Perhaps the way forward is to lift the requirement on passing traffic,
>> > as long as the limitation is clearly expressed to the users.
>>
>> No, I am not arguing for that because then SwitchDev will fall into
>> disarray. If we want to have a strict definition for what is SwitchDev
>> and what isn't I am okay with that. It gives us a definition of what
>> our hardware needs to do in order to support it and without that we
>> are going to get hardware that just bends the rules to claim support
>> for it.
>
> Let me make sure we understand each other.  The switchdev SR-IOV mode is
> what happens when user requests DEVLINK_ESWITCH_MODE_SWITCHDEV.  Are you
> saying you are opposed to adding DEVLINK_ESWITCH_MODE_VEPA?

I wouldn't say I am opposed to that idea. We just need to clearly
define what MODE_VEPA is. I would say that even in MODE_VEPA we would
be passing traffic. The limitation though is that we wouldn't have the
same mechanisms in place to route the traffic.

The big issue with VEPA is that the traffic is routed to an external
entity before it makes a hairpin turn and comes back. As such we don't
have the actual origin of the packet to work with other than MAC and
VLAN. As far as directing a packet to a specific port the only way we
really have of doing that is to direct it to the MAC/VLAN pair for the
VF. This is one of the reasons why I am thinking source mode macvlan
is the solution to go with for something like this. Basically the
source mode macvlan can get pretty close to identifying the origin of
any packet that came from the VF assuming it is programmed with all
the MAC entries belonging to the VF. The only case where this doesn't
work is the "trusted" legacy mode VF that is running in promiscuous
with anti-spoof disabled.
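
To make the VEB/VEPA difference concrete, here is a minimal sketch in
Python. It is only a toy model, not driver code: the names Frame,
first_hop and deliver_from_uplink are made up for illustration, and
the local_macs table is assumed to hold the MAC/VLAN pairs programmed
for each VF.

```python
from dataclasses import dataclass

UPLINK = "uplink"  # illustrative name for the physical port


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    vlan: int


def first_hop(mode, frame, local_macs):
    """Where the NIC sends a frame that a VF just transmitted.

    local_macs maps (mac, vlan) -> VF name for functions on this NIC.
    """
    if mode == "veb":
        # VEB: the embedded bridge switches locally when it can.
        return local_macs.get((frame.dst_mac, frame.vlan), UPLINK)
    if mode == "vepa":
        # VEPA: everything goes to the adjacent external switch first.
        return UPLINK
    raise ValueError(f"unknown mode: {mode}")


def deliver_from_uplink(frame, local_macs):
    """Steer a frame arriving from the wire, hairpinned ones included.

    Only destination MAC and VLAN are available to pick the VF; the
    originating VF is not known here.
    """
    return local_macs.get((frame.dst_mac, frame.vlan))
```

With two VFs on VLAN 10, a vf0 to vf1 frame stays local under "veb"
but leaves through the uplink under "vepa", and only finds vf1 again
via the destination MAC/VLAN lookup after the external switch reflects
it back.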

>> All I am asking for is for us to not close the door to the possibility
>> of adding features to legacy SR-IOV. I am hoping to use a source
>> macvlan based approach to make it so that we can support "port
>> representors" for devices that can't support full SwitchDev. The idea
>> would be to use them to get as close to SwitchDev level support on
>> legacy devices as possible without using full SwitchDev. That should
>> solve a good part of the issue, but I am pretty certain I need to be
>> able to extend legacy SR-IOV in order to support it. I had talked with
>> Jiri at netdev 2.1 about it back when we had submitted the v7 patches,
>> and the decision was to look at doing "port representors" but don't
>> associate them with SwitchDev. I was out on Sabbatical for most of the
>> summer and I am just now starting on the macvlan work I had planned. I
>> hope to have it done before the next netdev and then we can discuss it
>> there if it needs more discussion than what we can have on the mailing
>> list.
>
> I don't know what you mean with the macvlan based approach.  Could you
> perhaps describe it in more detail?  Will it allow users to configure
> forwarding and queueing with existing, standard tools and APIs?

So there are a few issues with our devices doing SwitchDev mode that I
am trying to address.

One of the issues is that we have no direct way to figure out where
the packets are coming from as I described above. So instead of us
implementing multiple approaches for the same thing my thought was to
look at using source mode macvlan which does filtering on the source
MAC address instead of the destination. It shouldn't take much to
extend it so that a PF could notify a source mode macvlan interface of
all the unicast addresses a VF can use as a source address for
transmitting. With that we would at least be able to tell where the
traffic came from.
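
To make the idea concrete, here is a minimal Python sketch of the lookup
a source mode macvlan "port representor" would rely on. It is a toy
model, not kernel code: the SourceMacTable class and its method names
are hypothetical, and it assumes the PF can report every unicast address
a VF is allowed to transmit from.

```python
from typing import Dict, Optional


class SourceMacTable:
    """Toy model: map source MAC addresses to the VF allowed to use them."""

    def __init__(self) -> None:
        self._owner: Dict[str, int] = {}

    def pf_notify_add(self, vf_id: int, mac: str) -> None:
        # Hypothetical PF notification: VF `vf_id` may transmit from `mac`.
        self._owner[mac.lower()] = vf_id

    def pf_notify_del(self, mac: str) -> None:
        # Hypothetical PF notification: `mac` is no longer a valid source.
        self._owner.pop(mac.lower(), None)

    def classify(self, src_mac: str) -> Optional[int]:
        # Return the originating VF, or None when the source is unknown
        # (for example a trusted VF with anti-spoof disabled).
        return self._owner.get(src_mac.lower())


table = SourceMacTable()
table.pf_notify_add(0, "02:00:00:00:00:01")
table.pf_notify_add(1, "02:00:00:00:00:02")
```

A received frame would be handed to the representor of whichever VF
classify() returns; a None result is the case that cannot be attributed.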

Another issue is directing transmit packets to the VF for any specific
interface. My thought for our source mode based "port representor"
macvlan is to limit the transmits so that we can only transmit
unicast packets that are guaranteed to be delivered to the proper
destination. Basically we would have to tag all broadcast and
multicast packets as being already forwarded and they would have to be
dropped on the "port representor" interfaces. Ideally there would be
some sort of uplink representor that would then be able to handle the
broadcast/multicast packets for the device since we end up replicating
the packets across all ports on the same VLAN currently.
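
As a rough illustration of that transmit restriction, the Python sketch
below checks the group bit of the destination MAC. The function names
are hypothetical and this only models the policy described above; it is
not how any driver implements it.

```python
def is_unicast(dst_mac: str) -> bool:
    # The least significant bit of the first octet is the group bit;
    # it is set for both multicast and broadcast destinations.
    first_octet = int(dst_mac.split(":")[0], 16)
    return (first_octet & 0x01) == 0


def port_rep_should_transmit(dst_mac: str) -> bool:
    # Policy sketched in the text: a port representor only transmits
    # unicast frames; broadcast/multicast frames are dropped here and
    # left to some uplink representor to handle.
    return is_unicast(dst_mac)
```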

The last issue is that by default all transmits that don't have a
matching filter in hardware are transmitted out the uplink port. That
was part of the issue that we don't think can be solved for ixgbe, and
even with a firmware change I am not certain how i40e will work
for this. With macvlan being used as the model we basically skirt the
whole issue since that is kind of the standard behavior for macvlan
anyway.

In theory this all should work together to allow forwarding with the
existing tools. It would basically just mean we need to use FDB
programming on the port representor to control what MAC addresses are
handled for each interface. In addition we could probably handle the
ndo_setup_tc call in the port representors with some limited subset of
fields supported by flower to use that to route traffic.

It will be much easier to show all this once I have code. It will
probably take me a month or so to dig out the technical debt that is
currently present for macvlan offload, and the fact that i40e
currently doesn't support it. Once I get those two items addressed my
plan is to then start tackling the source mode macvlan based port
representors. I hope to have an RFC ready early next year.

Thanks.

- Alex

* Re: SRIOV switchdev mode BoF minutes
  2017-11-14 23:05             ` Alexander Duyck
  2017-11-14 23:36               ` Jakub Kicinski
@ 2017-11-16 17:41               ` Or Gerlitz
  2017-11-16 18:20                 ` Alexander Duyck
  1 sibling, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2017-11-16 17:41 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Wed, Nov 15, 2017 at 1:05 AM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Tue, Nov 14, 2017 at 1:50 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:

>> all dealing with the sriov e-switch as a HW switch which should
>> be programmed
>> by the host stack according to well known industry models that apply
>> on physical switches, e.g
>>
>> 1. L2 FDB (Linux Bridge)
>> 2. L3 FIB (Linux Routers)
>> 3. ACLS (Linux TC)
>>
>> [3] is what implemented by the upstream sriov switchdev drivers, [1] and [2] we
>> discussed on netdev, maybe you want to play with [1] for i40e? I had a slide on
>> that in the BoF

> So for i40e we will probably explore option 1, and possibly option 3
> though as I said we still have to figure out what we can get the
> firmware to actually do for us. That ends up being the ultimate
> limitation.

I think that, Intel/Linux/sriov wise, it would be good if you now put
the focus on that small corner of the universe and show support for the
new community-led mode by having one of your current drivers support it.

FDB support would be great, and it would help transition existing legacy
mode users to switchdev mode, because FDB entries are essentially what
each driver now configures into its HW from within. If we manage to get
a bridge offloaded, all that is left is a systemd script that creates
the VFs, puts the driver into switchdev mode, and creates a bridge with
the reps, and that is it!!

I presented a slide in our BoF on what it takes to support FDB; here
it is:

1. create a Linux bridge (e.g. 1q), assign the VF and uplink rep
netdevices to the bridge
2. support the switchdev FDB notifications in the HW driver

learning: respond to SWITCHDEV_FDB_ADD_TO_DEVICE events

aging: respond to SWITCHDEV_FDB_DEL_TO_DEVICE events (del FDB from HW)
enhance the driver/bridge API to allow drivers to provide last-use
indications on FDB entries

STP:

fwd      - offload FDBs as explained above
learning - make sure HW flow miss (slow path) goes to CPU
discard  - add drop HW rule

flooding:

use SW based flooding
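
The slide can be read as a small state machine. Here is a toy Python
model of it, for illustration only: ToyEswitch and its methods are
hypothetical, the event names are carried as plain strings, and a real
driver would register a switchdev notifier in the kernel instead.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

FDB_ADD = "SWITCHDEV_FDB_ADD_TO_DEVICE"
FDB_DEL = "SWITCHDEV_FDB_DEL_TO_DEVICE"


@dataclass
class ToyEswitch:
    # (mac, vlan id) -> name of the rep netdev the entry points at
    hw_fdb: Dict[Tuple[str, int], str] = field(default_factory=dict)

    def handle_event(self, event: str, mac: str, vid: int, port: str) -> None:
        key = (mac.lower(), vid)
        if event == FDB_ADD:
            # learning: the bridge learned an address, offload it
            self.hw_fdb[key] = port
        elif event == FDB_DEL:
            # aging: the bridge aged the entry out, remove it from HW
            self.hw_fdb.pop(key, None)

    def forward(self, dst_mac: str, vid: int) -> str:
        # A HW miss goes to the CPU (slow path), where the software
        # bridge does the flooding.
        return self.hw_fdb.get((dst_mac.lower(), vid), "cpu")


esw = ToyEswitch()
esw.handle_event(FDB_ADD, "02:00:00:00:00:01", 10, "vf0_rep")
```

After the add event, esw.forward("02:00:00:00:00:01", 10) resolves to
"vf0_rep"; any unknown destination resolves to "cpu".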

* Re: SRIOV switchdev mode BoF minutes
  2017-11-16 17:41               ` Or Gerlitz
@ 2017-11-16 18:20                 ` Alexander Duyck
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander Duyck @ 2017-11-16 18:20 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Thu, Nov 16, 2017 at 9:41 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Wed, Nov 15, 2017 at 1:05 AM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Tue, Nov 14, 2017 at 1:50 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>
>>> all dealing with the sriov e-switch as a HW switch which should
>>> be programmed
>>> by the host stack according to well known industry models that apply
>>> on physical switches, e.g
>>>
>>> 1. L2 FDB (Linux Bridge)
>>> 2. L3 FIB (Linux Routers)
>>> 3. ACLS (Linux TC)
>>>
>>> [3] is what implemented by the upstream sriov switchdev drivers, [1] and [2] we
>>> discussed on netdev, maybe you want to play with [1] for i40e? I had a slide on
>>> that in the BoF
>
>> So for i40e we will probably explore option 1, and possibly option 3
>> though as I said we still have to figure out what we can get the
>> firmware to actually do for us. That ends up being the ultimate
>> limitation.
>
> I think Intel/Linux/sriov wise, it would be good if you put now the
> focus on that small
> corner of the universe and show support for the new community lead
> mode by having
> one of your current drivers support that.

I am trying to focus on this area. The problem is you keep assuming
what we can and can't do in our hardware. I am not certain we can
handle the "learning" aspect of things. The biggest issue is that our
hardware was designed to be a VEPA with a filter based hairpin. It
really wasn't designed to be a switch. My concern is you may have been
misinformed about what our hardware can and cannot do. In addition,
changing our firmware for the parts supported by i40e isn't that easy.
There is also no guarantee that we can do what is being asked
per PCIe function; it might have a global impact on the entire device.
If that were the case then it isn't an option since we can't have one
function breaking another. There are a lot of what-if scenarios that
we have to sort out, if we can even get the firmware update for this
since it was mostly locked down and in maintenance mode.

> FDB support would be great and it will help transition existing legacy
> mode users to the switchdev
> mode, b/c essentially FDBs is what each driver now configures their HW
> from within, where's if
> we manage to get a bridge to be offloaded, all what left is systemd
> script that creates the VF,
> puts the driver into switchdev mode, creates a bridge with the reps,
> and that is it!!
>
> I have presented a slide in our BoF re what does it take to support
> FDB, here it is:
>
> 1. create linux bridge (e.g.1q), assign VF and uplink rep netdevices
> to the bridge
> 2. support the switchdev FDB notifications in the HW driver

This is essentially what I hope to support with source macvlan based
port representors.

> learning: respond to SWITCHDEV_FDB_ADD_TO_DEVICE events

This requires that we see the traffic. We have to figure out if we can
actually make the CPU the default target and can then get the traffic
out of the uplink interface without horribly breaking things. It will
take time to see if we can even do it.

The problem is the CPU/PF is only the default target for traffic
coming from the uplink on our devices. Anything the VF sends will
default to the uplink unless there is a filter for it to route it
otherwise.
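
Put as a small Python sketch, the default forwarding described here
looks roughly like this. The function and the port labels ("uplink",
"pf", "vf0") are made up for the example and only model the behavior
stated above.

```python
from typing import Dict


def default_target(ingress: str, dst_mac: str, filters: Dict[str, str]) -> str:
    # An explicit hardware filter always wins.
    hit = filters.get(dst_mac.lower())
    if hit is not None:
        return hit
    # No filter: traffic from the uplink defaults to the CPU/PF,
    # while anything a VF sends defaults to the uplink.
    return "pf" if ingress == "uplink" else "uplink"


filters = {"02:00:00:00:00:02": "vf1"}
```

The learning problem is the second default: a VF transmit with no
matching filter never reaches the CPU, so the bridge never sees it.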

> aging: respond to SWITCHDEV_FDB_DEL_TO_DEVICE events (del FDB from HW)
> enhance the driver/bridge API to allows drivers provide last-use
> indications on FDB entries
>
> STP:
>
> fwd      - offload FDBs as explained above
> learning - make sure HW flow miss (slow path) goes to CPU
> discard  - add drop HW rule
>
> flooding:
>
> use SW based flooding

This is much easier said than done when you are working with a device
that was architected years before switchdev was a thing. I'll see what
we can do, but I cannot make any promises.

- Alex

* Re: SRIOV switchdev mode BoF minutes
  2017-11-14 20:00         ` Alexander Duyck
  2017-11-14 21:50           ` Or Gerlitz
@ 2017-11-14 23:32           ` Jakub Kicinski
  1 sibling, 0 replies; 36+ messages in thread
From: Jakub Kicinski @ 2017-11-14 23:32 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Or Gerlitz, David Miller, Anjali Singhai Jain, Andy Gospodarek,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, 14 Nov 2017 12:00:32 -0800, Alexander Duyck wrote:
> On Tue, Nov 14, 2017 at 8:44 AM, Or Gerlitz wrote:
> > On Mon, Nov 13, 2017 at 7:10 PM, Alexander Duyck wrote:  
> >> On Sun, Nov 12, 2017 at 10:16 PM, Or Gerlitz wrote:  
> > Lets focus on this point for a moment before discussing the other points
> > you raised.
>
> When SR-IOV was introduced there were two available modes, Virtual
> Ethernet Port Aggregation, aka VEPA, and Virtual Ethernet Bridging,
> aka VEB. The fact is SwitchDev is designed specifically for networking
> SR-IOV with Virtual Ethernet Bridging, aka VEB. You argue that the
> legacy model is bad, but I would argue that is because the legacy
> model was really designed to work more for both VEPA than with VEB,
> whereas SwitchDev only focuses on VEB. If you take a look in the ixgbe
> or i40e drivers you will see that we support configuring both of those
> modes via ndo_bridge_setlink since we have customer install bases that
> actually prefer VEPA over VEB as they prefer to have their traffic
> centrally managed instead of having the local host managing the
> traffic. We cannot just arbitrarily tell our customers they are doing
> SR-IOV using the "wrong model".

Maybe that's an obvious statement, but perhaps the real problem we are
grappling with here is that VEPA doesn't really exist as a forwarding
model outside of SR-IOV NICs.  So we have no software construct that
cleanly maps onto it for offload.

> I would rather not have SwitchDev become the next SystemD. The type
> argument you are making is basically dictating to us and our customers
> how things are supposed to work based on your view things. We have
> different hardware, different customers, and all of our needs aren't
> necessarily met by SwitchDev. I would agree that SwitchDev is the
> go-to solution for VEB configuration, and we do plan to have future
> hardware support it. In addition I would argue that for the sake of
> consistency we should make sure that any feature that gets added to
> the legacy has to be supported by the SwitchDev model as well before
> it could be supported. If anything my hope is to evolve the legacy
> model to have much of the same look and feel as SwitchDev, but that
> will take time and require changes to the legacy model.

To me the whole point of switchdev is to reuse existing ABIs, TC, FDB,
bridging etc.  We are arguing to stop adding special SR-IOV features,
if the general direction of things is to just reflect configuration
done with SW ABIs to the hardware.  I think saying we need feature
parity between the models is missing this crucial point.

Also, the more ways there are to configure a single thing, the more
confusing it will be to users.

* Re: SRIOV switchdev mode BoF minutes
  2017-11-12 19:49 SRIOV switchdev mode BoF minutes Or Gerlitz
  2017-11-12 20:38 ` Alexander Duyck
@ 2018-04-12 17:05 ` Samudrala, Sridhar
  2018-04-12 20:20   ` Or Gerlitz
  1 sibling, 1 reply; 36+ messages in thread
From: Samudrala, Sridhar @ 2018-04-12 17:05 UTC (permalink / raw)
  To: Or Gerlitz, David Miller
  Cc: Anjali Singhai Jain, Andy Gospodarek, Michael Chan, Simon Horman,
	Jakub Kicinski, John Fastabend, Saeed Mahameed, Jiri Pirko,
	Rony Efraim, Linux Netdev List

On 11/12/2017 11:49 AM, Or Gerlitz wrote:
> Hi Dave and all,
>
> During and after the BoF on SRIOV switchdev mode, we came into a
> consensus among the developers from four different HW vendors (CC
> audience) that a correct thing to do would be to disallow any new
> extensions to the legacy mode.
>
> The idea is to put focus on the new mode and not add new UAPIs and
> kernel code which was turned to be a wrong design which does not allow
> for properly offloading a kernel switching SW model to e-switch HW.
>
> We also had a good session the day after regarding alignment for the
> representation model of the uplink (physical port) and PF/s.
>
> The VF representor netdevs  exist for all drivers that support the new
> mode but the representation for the uplink and PF wasn't the same for
> all. The decision was to represent the uplink and PFs vports in the
> same manner done for VFs, using rep netdevs. This alignment would
> provide a more strict and clear view of the kernel model for e-switch
> to users and upper layer control plane SW.
>
I don't see any changes in the Mellanox/other drivers to move to this new model to enable
the uplink and PF port representors, any updates?

It would be really nice to highlight the pros and cons of the old versus the
new model.

We are looking into adding switchdev support for our new 100Gb ice driver and could
use some feedback on the direction we should be taking.

Thanks
Sridhar

* Re: SRIOV switchdev mode BoF minutes
  2018-04-12 17:05 ` Samudrala, Sridhar
@ 2018-04-12 20:20   ` Or Gerlitz
  2018-04-12 20:33     ` Samudrala, Sridhar
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2018-04-12 20:20 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Thu, Apr 12, 2018 at 8:05 PM, Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:
> On 11/12/2017 11:49 AM, Or Gerlitz wrote:
>>
>> Hi Dave and all,
>>
>> During and after the BoF on SRIOV switchdev mode, we came into a
>> consensus among the developers from four different HW vendors (CC
>> audience) that a correct thing to do would be to disallow any new
>> extensions to the legacy mode.
>>
>> The idea is to put focus on the new mode and not add new UAPIs and
>> kernel code which was turned to be a wrong design which does not allow
>> for properly offloading a kernel switching SW model to e-switch HW.
>>
>> We also had a good session the day after regarding alignment for the
>> representation model of the uplink (physical port) and PF/s.
>>
>> The VF representor netdevs  exist for all drivers that support the new
>> mode but the representation for the uplink and PF wasn't the same for
>> all. The decision was to represent the uplink and PFs vports in the
>> same manner done for VFs, using rep netdevs. This alignment would
>> provide a more strict and clear view of the kernel model for e-switch
>> to users and upper layer control plane SW.
>>
> I don't see any changes in the Mellanox/other drivers to move to this new
> model to enable the uplink and PF port representors, any updates?

Yeah, I have worked on that but didn't get to finalize the upstreaming
so far.  I have resumed the work and plan for an uplink rep in mlx5 to
replace the PF being the uplink rep, for 4.18.

> It would be really nice to highlight the pros and cons of the old versus the
> new model.
>
> We are looking into adding switchdev support for our new 100Gb ice driver
> and could use some feedback on the direction we should be taking.

good news.

The uplink rep is clear cut: it needs to be a rep device representing
the uplink, just like a vf rep represents the vport toward the vf.
Please just do it correctly from the beginning.

I can spare

* Re: SRIOV switchdev mode BoF minutes
  2018-04-12 20:20   ` Or Gerlitz
@ 2018-04-12 20:33     ` Samudrala, Sridhar
  2018-04-13  8:56       ` Or Gerlitz
  0 siblings, 1 reply; 36+ messages in thread
From: Samudrala, Sridhar @ 2018-04-12 20:33 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On 4/12/2018 1:20 PM, Or Gerlitz wrote:
> On Thu, Apr 12, 2018 at 8:05 PM, Samudrala, Sridhar
> <sridhar.samudrala@intel.com> wrote:
>> On 11/12/2017 11:49 AM, Or Gerlitz wrote:
>>> Hi Dave and all,
>>>
>>> During and after the BoF on SRIOV switchdev mode, we came into a
>>> consensus among the developers from four different HW vendors (CC
>>> audience) that a correct thing to do would be to disallow any new
>>> extensions to the legacy mode.
>>>
>>> The idea is to put focus on the new mode and not add new UAPIs and
>>> kernel code which was turned to be a wrong design which does not allow
>>> for properly offloading a kernel switching SW model to e-switch HW.
>>>
>>> We also had a good session the day after regarding alignment for the
>>> representation model of the uplink (physical port) and PF/s.
>>>
>>> The VF representor netdevs  exist for all drivers that support the new
>>> mode but the representation for the uplink and PF wasn't the same for
>>> all. The decision was to represent the uplink and PFs vports in the
>>> same manner done for VFs, using rep netdevs. This alignment would
>>> provide a more strict and clear view of the kernel model for e-switch
>>> to users and upper layer control plane SW.
>>>
>> I don't see any changes in the Mellanox/other drivers to move to this new
>> model to enable the uplink and PF port representors, any updates?
> Yeah, I am worked on that but didn't get to finalize the upstreaming
> so far.  I have resumed
> the work and plan uplink rep in mlx5 to replace the PF being uplink rep for 4.18
>
>> It would be really nice to highlight the pros and cons of the old versus the
>> new model.
>>
>> We are looking into adding switchdev support for our new 100Gb ice driver
>> and could use some feedback on the direction we should be taking.
> good news.
>
> The uplink rep is clear cut that needs to be a rep device representing
> the uplink just like vf
> rep represents the vport toward the vf - please just do it correct
> from the begining
>
Having an uplink rep will definitely help implement the slow path with flat/vlan network
scenarios by not having to add PF to the bridge.

But how do they help with a vxlan overlay scenario? In case of overlays, the slow path
has to go via vxlan -> ip stack -> pf?

What about pf-rep?

* Re: SRIOV switchdev mode BoF minutes
  2018-04-12 20:33     ` Samudrala, Sridhar
@ 2018-04-13  8:56       ` Or Gerlitz
  2018-04-13  8:57         ` Or Gerlitz
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2018-04-13  8:56 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Thu, Apr 12, 2018 at 11:33 PM, Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:
> On 4/12/2018 1:20 PM, Or Gerlitz wrote:
>>
>> On Thu, Apr 12, 2018 at 8:05 PM, Samudrala, Sridhar
>> <sridhar.samudrala@intel.com> wrote:
>>>
>>> On 11/12/2017 11:49 AM, Or Gerlitz wrote:
>>>>
>>>> Hi Dave and all,
>>>>
>>>> During and after the BoF on SRIOV switchdev mode, we came into a
>>>> consensus among the developers from four different HW vendors (CC
>>>> audience) that a correct thing to do would be to disallow any new
>>>> extensions to the legacy mode.
>>>>
>>>> The idea is to put focus on the new mode and not add new UAPIs and
>>>> kernel code which was turned to be a wrong design which does not allow
>>>> for properly offloading a kernel switching SW model to e-switch HW.
>>>>
>>>> We also had a good session the day after regarding alignment for the
>>>> representation model of the uplink (physical port) and PF/s.
>>>>
>>>> The VF representor netdevs  exist for all drivers that support the new
>>>> mode but the representation for the uplink and PF wasn't the same for
>>>> all. The decision was to represent the uplink and PFs vports in the
>>>> same manner done for VFs, using rep netdevs. This alignment would
>>>> provide a more strict and clear view of the kernel model for e-switch
>>>> to users and upper layer control plane SW.
>>>>
>>> I don't see any changes in the Mellanox/other drivers to move to this new
>>> model to enable the uplink and PF port representors, any updates?
>>
>> Yeah, I am worked on that but didn't get to finalize the upstreaming
>> so far.  I have resumed
>> the work and plan uplink rep in mlx5 to replace the PF being uplink rep
>> for 4.18
>>
>>> It would be really nice to highlight the pros and cons of the old versus
>>> the
>>> new model.
>>>
>>> We are looking into adding switchdev support for our new 100Gb ice driver
>>> and could use some feedback on the direction we should be taking.
>>
>> good news.
>>
>> The uplink rep is clear cut that needs to be a rep device representing
>> the uplink just like vf
>> rep represents the vport toward the vf - please just do it correct
>> from the begining
>>
> Having an uplink rep will definitely help implement the slow path with
> flat/vlan network
> scenarios by not having to add PF to the bridge.
>
> But how do they help with a vxlan overlay scenario? In case of overlays, the
> slow path has to go via vxlan -> ip stack -> pf?

In an overlay network scheme, the uplink has the VTEP ip and is not
connected to the bridge, e.g. if you use ovs, you have vf reps and vxlan
ports connected to ovs, and the ip stack routes through the uplink rep.

>
> What about pf-rep?
>

* Re: SRIOV switchdev mode BoF minutes
  2018-04-13  8:56       ` Or Gerlitz
@ 2018-04-13  8:57         ` Or Gerlitz
  2018-04-13 16:49           ` Samudrala, Sridhar
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2018-04-13  8:57 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Fri, Apr 13, 2018 at 11:56 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Thu, Apr 12, 2018 at 11:33 PM, Samudrala, Sridhar
> <sridhar.samudrala@intel.com> wrote:
>> On 4/12/2018 1:20 PM, Or Gerlitz wrote:
>>>
>>> On Thu, Apr 12, 2018 at 8:05 PM, Samudrala, Sridhar
>>> <sridhar.samudrala@intel.com> wrote:
>>>>
>>>> On 11/12/2017 11:49 AM, Or Gerlitz wrote:
>>>>>
>>>>> Hi Dave and all,
>>>>>
>>>>> During and after the BoF on SRIOV switchdev mode, we came into a
>>>>> consensus among the developers from four different HW vendors (CC
>>>>> audience) that a correct thing to do would be to disallow any new
>>>>> extensions to the legacy mode.
>>>>>
>>>>> The idea is to put focus on the new mode and not add new UAPIs and
>>>>> kernel code which was turned to be a wrong design which does not allow
>>>>> for properly offloading a kernel switching SW model to e-switch HW.
>>>>>
>>>>> We also had a good session the day after regarding alignment for the
>>>>> representation model of the uplink (physical port) and PF/s.
>>>>>
>>>>> The VF representor netdevs  exist for all drivers that support the new
>>>>> mode but the representation for the uplink and PF wasn't the same for
>>>>> all. The decision was to represent the uplink and PFs vports in the
>>>>> same manner done for VFs, using rep netdevs. This alignment would
>>>>> provide a more strict and clear view of the kernel model for e-switch
>>>>> to users and upper layer control plane SW.
>>>>>
>>>> I don't see any changes in the Mellanox/other drivers to move to this new
>>>> model to enable the uplink and PF port representors, any updates?
>>>
>>> Yeah, I am worked on that but didn't get to finalize the upstreaming
>>> so far.  I have resumed
>>> the work and plan uplink rep in mlx5 to replace the PF being uplink rep
>>> for 4.18
>>>
>>>> It would be really nice to highlight the pros and cons of the old versus
>>>> the
>>>> new model.
>>>>
>>>> We are looking into adding switchdev support for our new 100Gb ice driver
>>>> and could use some feedback on the direction we should be taking.
>>>
>>> good news.
>>>
>>> The uplink rep is clear cut that needs to be a rep device representing
>>> the uplink just like vf
>>> rep represents the vport toward the vf - please just do it correct
>>> from the begining
>>>
>> Having an uplink rep will definitely help implement the slow path with
>> flat/vlan network
>> scenarios by not having to add PF to the bridge.
>>
>> But how do they help with a vxlan overlay scenario? In case of overlays, the
>> slow path has to go via vxlan -> ip stack -> pf?
>
> in  overlay networks scheme, the uplink has the VTEP ip and is not connected

the uplink rep has the vtep ip

> to the bridge, e.g you use ovs you have vf reps and vxlan ports connected to ovs
> and the ip stack routes through the uplink rep
>
>>
>> What about pf-rep?
>>

* Re: SRIOV switchdev mode BoF minutes
  2018-04-13  8:57         ` Or Gerlitz
@ 2018-04-13 16:49           ` Samudrala, Sridhar
  2018-04-13 20:16             ` Or Gerlitz
  0 siblings, 1 reply; 36+ messages in thread
From: Samudrala, Sridhar @ 2018-04-13 16:49 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On 4/13/2018 1:57 AM, Or Gerlitz wrote:
> On Fri, Apr 13, 2018 at 11:56 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Thu, Apr 12, 2018 at 11:33 PM, Samudrala, Sridhar
>> <sridhar.samudrala@intel.com> wrote:
>>> On 4/12/2018 1:20 PM, Or Gerlitz wrote:
>>>> On Thu, Apr 12, 2018 at 8:05 PM, Samudrala, Sridhar
>>>> <sridhar.samudrala@intel.com> wrote:
>>>>> On 11/12/2017 11:49 AM, Or Gerlitz wrote:
>>>>>> Hi Dave and all,
>>>>>>
>>>>>> During and after the BoF on SRIOV switchdev mode, we came into a
>>>>>> consensus among the developers from four different HW vendors (CC
>>>>>> audience) that a correct thing to do would be to disallow any new
>>>>>> extensions to the legacy mode.
>>>>>>
>>>>>> The idea is to put focus on the new mode and not add new UAPIs and
>>>>>> kernel code which was turned to be a wrong design which does not allow
>>>>>> for properly offloading a kernel switching SW model to e-switch HW.
>>>>>>
>>>>>> We also had a good session the day after regarding alignment for the
>>>>>> representation model of the uplink (physical port) and PF/s.
>>>>>>
>>>>>> The VF representor netdevs  exist for all drivers that support the new
>>>>>> mode but the representation for the uplink and PF wasn't the same for
>>>>>> all. The decision was to represent the uplink and PFs vports in the
>>>>>> same manner done for VFs, using rep netdevs. This alignment would
>>>>>> provide a more strict and clear view of the kernel model for e-switch
>>>>>> to users and upper layer control plane SW.
>>>>>>
>>>>> I don't see any changes in the Mellanox/other drivers to move to this new
>>>>> model to enable the uplink and PF port representors, any updates?
>>>> Yeah, I am worked on that but didn't get to finalize the upstreaming
>>>> so far.  I have resumed
>>>> the work and plan uplink rep in mlx5 to replace the PF being uplink rep
>>>> for 4.18
>>>>
>>>>> It would be really nice to highlight the pros and cons of the old versus
>>>>> the
>>>>> new model.
>>>>>
>>>>> We are looking into adding switchdev support for our new 100Gb ice driver
>>>>> and could use some feedback on the direction we should be taking.
>>>> good news.
>>>>
>>>> The uplink rep is clear cut that needs to be a rep device representing
>>>> the uplink just like vf
>>>> rep represents the vport toward the vf - please just do it correct
>>>> from the begining
>>>>
>>> Having an uplink rep will definitely help implement the slow path with
>>> flat/vlan network
>>> scenarios by not having to add PF to the bridge.
>>>
>>> But how do they help with a vxlan overlay scenario? In case of overlays, the
>>> slow path has to go via vxlan -> ip stack -> pf?
>> in  overlay networks scheme, the uplink has the VTEP ip and is not connected
> the uplink rep has the vtep ip
>
>> to the bridge, e.g you use ovs you have vf reps and vxlan ports connected to ovs
>> and the ip stack routes through the uplink rep

This changes the legacy mode behavior of configuring the vtep ip on the pf netdev.
How is host to host traffic expected to work when the vtep ip is moved to the uplink rep?


>>
>>> What about pf-rep?

Are you planning to create a pf-rep too? Is the pf also treated similarly to a vf in switchdev mode?
Does all pf traffic go to the pf-rep, and pf-rep traffic go to the pf, by default without any rules
programmed?

* Re: SRIOV switchdev mode BoF minutes
  2018-04-13 16:49           ` Samudrala, Sridhar
@ 2018-04-13 20:16             ` Or Gerlitz
  2018-04-13 23:03               ` Samudrala, Sridhar
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2018-04-13 20:16 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Fri, Apr 13, 2018 at 7:49 PM, Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:
> On 4/13/2018 1:57 AM, Or Gerlitz wrote:


>>> in  overlay networks scheme, the uplink rep has the VTEP ip and is not connected
>>> to the bridge, e.g you use ovs you have vf reps and vxlan ports connected
>>> to ovs and the ip stack routes through the uplink rep

> This changes the legacy mode behavior of configuring  vtep ip on the pf
> netdev. How does host to host traffic expected to work when vtep ip is moved to uplink rep?

What do you mean by host to host traffic? Is that two VFs on the same host?
Control plane SW (such as OVS) doesn't apply encapsulation within the same host.

>>>> What about pf-rep?

> Are you planning to create a pf-rep too? Is pf also treated similar to vf in
> switchdev mode?
> All pf traffic goes to pf-rep and pf-rep traffic goes to pf by default
> without any rules programmed?

@ the sriov switchdev ARCH level, pf/pf-rep would work indeed as you described.

We will have a pf rep for smartnic schemes, where the pf on the host
is not the manager of the eswitch but rather the smartnic driver instance is.

on non smart env, there are some challenges to address for the pf
nic to be fully functional for the slow path (what you described), we
will get there down the road if there is a real need.

* Re: SRIOV switchdev mode BoF minutes
  2018-04-13 20:16             ` Or Gerlitz
@ 2018-04-13 23:03               ` Samudrala, Sridhar
  2018-04-15  6:01                 ` Or Gerlitz
  0 siblings, 1 reply; 36+ messages in thread
From: Samudrala, Sridhar @ 2018-04-13 23:03 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On 4/13/2018 1:16 PM, Or Gerlitz wrote:
> On Fri, Apr 13, 2018 at 7:49 PM, Samudrala, Sridhar
> <sridhar.samudrala@intel.com> wrote:
>> On 4/13/2018 1:57 AM, Or Gerlitz wrote:
>
>>>> in  overlay networks scheme, the uplink rep has the VTEP ip and is not connected
>>>> to the bridge, e.g you use ovs you have vf reps and vxlan ports connected
>>>> to ovs and the ip stack routes through the uplink rep
>> This changes the legacy mode behavior of configuring  vtep ip on the pf
>> netdev. How does host to host traffic expected to work when vtep ip is moved to uplink rep?
> What do you mean host to host traffic, is that two VFs on the same host?
> control plane SWs (such as OVS) don't apply encapsulation within the same host

I meant between PFs on 2 compute nodes.


>
>>>>> What about pf-rep?
>> Are you planning to create a pf-rep too? Is pf also treated similar to vf in
>> switchdev mode?
>> All pf traffic goes to pf-rep and pf-rep traffic goes to pf by default
>> without any rules programmed?
> @ the sriov switchdev ARCH level, pf/pf-rep would work indeed as you described.
>
> We will have pf rep for smartnic schemes where the pf on the host
> is not the manager of the eswitch but rather the smartnic driver instance.
>
> on non smart env, there are some challenges to address for the pf
> nic to be fully functional for the slow path (what you described), we
> will get there down the road if there is a real need.

So in a non-smartnic env, are you planning to expose only the uplink rep and vf reps as netdevs?
By smartnic env, I guess you are referring to the OVS control plane also running on the NIC.

I will look forward to your patches.

Thanks
Sridhar


* Re: SRIOV switchdev mode BoF minutes
  2018-04-13 23:03               ` Samudrala, Sridhar
@ 2018-04-15  6:01                 ` Or Gerlitz
  2018-04-16 12:39                   ` Andy Gospodarek
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2018-04-15  6:01 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: David Miller, Anjali Singhai Jain, Andy Gospodarek, Michael Chan,
	Simon Horman, Jakub Kicinski, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:

> I meant between PFs on 2 compute nodes.

If the PF serves as uplink rep, it functions as a switch port -- applications
don't run on switch ports. One way to get apps to run on the host in switchdev
mode is to probe one of the VFs there.

[...]

> By smartnic env, i guess you are referring to OVS control plane also running
> on the NIC.

correct

> I will look forward to your patches.

FWIW, note that my patches don't bring any news for you. I am aligning
mlx5 with what was agreed on at netdev; e.g. nfp already does it (uplink
rep and such).


* Re: SRIOV switchdev mode BoF minutes
  2018-04-15  6:01                 ` Or Gerlitz
@ 2018-04-16 12:39                   ` Andy Gospodarek
  2018-04-17  2:08                     ` Samudrala, Sridhar
  0 siblings, 1 reply; 36+ messages in thread
From: Andy Gospodarek @ 2018-04-16 12:39 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Samudrala, Sridhar, David Miller, Anjali Singhai Jain,
	Michael Chan, Simon Horman, Jakub Kicinski, John Fastabend,
	Saeed Mahameed, Jiri Pirko, Rony Efraim, Linux Netdev List

On Sun, Apr 15, 2018 at 09:01:16AM +0300, Or Gerlitz wrote:
> On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
> <sridhar.samudrala@intel.com> wrote:
> 
> > I meant between PFs on 2 compute nodes.
> 
> If the PF serves as uplink rep, it functions as  a switch port -- applications
> don't run on switch ports. One way to get apps to run on the host in switchdev
> mode is probe one of the VFs there.
> 
> 
> [...]
> 
> > By smartnic env, i guess you are referring to OVS control plane also running
> > on the NIC.
> 
> correct
> 

Not just OvS: other applications running on the SmartNIC that use tc for
programming hardware can also benefit from a design like this.

> > I will look forward to your patches.
> 
> FWIW, note that my patches don't bring any newz for you.. I am aligning
> mlx5 with what was agreed on netdev, e.g nfp does it (uplink rep and
> such) already.

Probably not major news from us either since this was discussed at the last
NetConf, but we are planning to have this option for SmartNICs or PCI-multihost
NICs, too.


* Re: SRIOV switchdev mode BoF minutes
  2018-04-16 12:39                   ` Andy Gospodarek
@ 2018-04-17  2:08                     ` Samudrala, Sridhar
  2018-04-17 13:30                       ` Andy Gospodarek
  0 siblings, 1 reply; 36+ messages in thread
From: Samudrala, Sridhar @ 2018-04-17  2:08 UTC (permalink / raw)
  To: Andy Gospodarek, Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Michael Chan, Simon Horman,
	Jakub Kicinski, John Fastabend, Saeed Mahameed, Jiri Pirko,
	Rony Efraim, Linux Netdev List


On 4/16/2018 5:39 AM, Andy Gospodarek wrote:
> On Sun, Apr 15, 2018 at 09:01:16AM +0300, Or Gerlitz wrote:
>> On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
>> <sridhar.samudrala@intel.com> wrote:
>>
>>> I meant between PFs on 2 compute nodes.
>> If the PF serves as uplink rep, it functions as  a switch port -- applications
>> don't run on switch ports. One way to get apps to run on the host in switchdev
>> mode is probe one of the VFs there.
>>
>>
>>
So once a pci device is configured in 'switchdev' mode, only port representor netdevs are
seen on the host, no more PF netdev.

Are you going to expose another way to change sriov_numvfs when the device is in
'switchdev' mode, or do we need to switch to 'legacy' mode to increase/decrease the number of
VFs?

Even in switchdev mode, I guess it will be possible for host apps to use the IP configured
on the uplink rep to talk externally.

In case of multiple uplinks, are you exposing one uplink-rep netdev per uplink?


* Re: SRIOV switchdev mode BoF minutes
  2018-04-17  2:08                     ` Samudrala, Sridhar
@ 2018-04-17 13:30                       ` Andy Gospodarek
  2018-04-17 13:58                         ` Or Gerlitz
  0 siblings, 1 reply; 36+ messages in thread
From: Andy Gospodarek @ 2018-04-17 13:30 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Andy Gospodarek, Or Gerlitz, David Miller, Anjali Singhai Jain,
	Michael Chan, Simon Horman, Jakub Kicinski, John Fastabend,
	Saeed Mahameed, Jiri Pirko, Rony Efraim, Linux Netdev List

On Mon, Apr 16, 2018 at 07:08:39PM -0700, Samudrala, Sridhar wrote:
> 
> On 4/16/2018 5:39 AM, Andy Gospodarek wrote:
> > On Sun, Apr 15, 2018 at 09:01:16AM +0300, Or Gerlitz wrote:
> > > On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
> > > <sridhar.samudrala@intel.com> wrote:
> > > 
> > > > I meant between PFs on 2 compute nodes.
> > > If the PF serves as uplink rep, it functions as  a switch port -- applications
> > > don't run on switch ports. One way to get apps to run on the host in switchdev
> > > mode is probe one of the VFs there.
> > > 
> > > 
> > > 
> So once a pci device is configured in 'switchdev' mode,  only port representor netdevs are
> seen on the host, no more PF netdev.

That is not the functionality I would propose.  The PF netdev will still be there.

> Are you going to expose another way to change sriov_num_vfs when the device is in
> 'switchdev' mode OR do we need to switch to 'legacy' mode to increase/decrease the number of
> VFs?

Since the PF netdev will not disappear, the standard ways to configure the
number of VFs, etc. are still available.
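
For reference, the standard way to change the number of VFs is the PCI sysfs attribute `sriov_numvfs`, and the e-switch mode is set through devlink. The sketch below only builds the sysfs path and the devlink command line and does not touch the system; the helper names and the example PCI address are illustrative assumptions.

```python
from pathlib import PurePosixPath


def sriov_numvfs_path(pci_addr, sysfs_root="/sys"):
    """Path of the sysfs attribute that holds the number of enabled VFs."""
    return PurePosixPath(sysfs_root, "bus", "pci", "devices", pci_addr, "sriov_numvfs")


def switchdev_cmd(pci_addr):
    """argv for putting the e-switch of a PCI device into switchdev mode."""
    return ["devlink", "dev", "eswitch", "set", f"pci/{pci_addr}", "mode", "switchdev"]


if __name__ == "__main__":
    addr = "0000:03:00.0"  # example address, an assumption for illustration
    print(sriov_numvfs_path(addr))
    print(" ".join(switchdev_cmd(addr)))
```

Writing a VF count to that path and running the command both require root on a real system; whether the VF count can still be changed after the mode switch is driver-dependent.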

> Even in switchdev mode, i guess it will be possible for host apps to use the IP configured
> on the uplink rep to talk externally.
> 
> In case of multiple uplinks, are you exposing one uplink-rep netdev per uplink?


* Re: SRIOV switchdev mode BoF minutes
  2018-04-17 13:30                       ` Andy Gospodarek
@ 2018-04-17 13:58                         ` Or Gerlitz
  2018-04-17 14:47                           ` Andy Gospodarek
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2018-04-17 13:58 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Samudrala, Sridhar, David Miller, Anjali Singhai Jain,
	Michael Chan, Simon Horman, Jakub Kicinski, John Fastabend,
	Saeed Mahameed, Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Apr 17, 2018 at 4:30 PM, Andy Gospodarek
<andrew.gospodarek@broadcom.com> wrote:
> On Mon, Apr 16, 2018 at 07:08:39PM -0700, Samudrala, Sridhar wrote:
>>
>> On 4/16/2018 5:39 AM, Andy Gospodarek wrote:
>> > On Sun, Apr 15, 2018 at 09:01:16AM +0300, Or Gerlitz wrote:
>> > > On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
>> > > <sridhar.samudrala@intel.com> wrote:
>> > >
>> > > > I meant between PFs on 2 compute nodes.
>> > > If the PF serves as uplink rep, it functions as  a switch port -- applications
>> > > don't run on switch ports. One way to get apps to run on the host in switchdev
>> > > mode is probe one of the VFs there.
>> > >
>> > >
>> > >
>> So once a pci device is configured in 'switchdev' mode,  only port representor netdevs are
>> seen on the host, no more PF netdev.
>
> That is not the functionality I would propose.  The PF netdev will still be there.

Andy,

Basically LGTM. So even in smartnic configs, the PF at the host is still
privileged to create/destroy VFs or provision MACs for them, even if it is
not the e-switch manager anymore?

Actually, AFAIK this can also work differently, e.g. a smartnic FW
"pushes" the VFs into the host without them being under a host admin directive.


* Re: SRIOV switchdev mode BoF minutes
  2018-04-17 13:58                         ` Or Gerlitz
@ 2018-04-17 14:47                           ` Andy Gospodarek
  2018-04-17 16:46                             ` Samudrala, Sridhar
  2018-04-17 23:19                             ` Jakub Kicinski
  0 siblings, 2 replies; 36+ messages in thread
From: Andy Gospodarek @ 2018-04-17 14:47 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Andy Gospodarek, Samudrala, Sridhar, David Miller,
	Anjali Singhai Jain, Michael Chan, Simon Horman, Jakub Kicinski,
	John Fastabend, Saeed Mahameed, Jiri Pirko, Rony Efraim,
	Linux Netdev List

On Tue, Apr 17, 2018 at 04:58:05PM +0300, Or Gerlitz wrote:
> On Tue, Apr 17, 2018 at 4:30 PM, Andy Gospodarek
> <andrew.gospodarek@broadcom.com> wrote:
> > On Mon, Apr 16, 2018 at 07:08:39PM -0700, Samudrala, Sridhar wrote:
> >>
> >> On 4/16/2018 5:39 AM, Andy Gospodarek wrote:
> >> > On Sun, Apr 15, 2018 at 09:01:16AM +0300, Or Gerlitz wrote:
> >> > > On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
> >> > > <sridhar.samudrala@intel.com> wrote:
> >> > >
> >> > > > I meant between PFs on 2 compute nodes.
> >> > > If the PF serves as uplink rep, it functions as  a switch port -- applications
> >> > > don't run on switch ports. One way to get apps to run on the host in switchdev
> >> > > mode is probe one of the VFs there.
> >> > >
> >> > >
> >> > >
> >> So once a pci device is configured in 'switchdev' mode,  only port representor netdevs are
> >> seen on the host, no more PF netdev.
> >
> > That is not the functionality I would propose.  The PF netdev will still be there.
> 
> Andy,
> 
> Basically LGTM, so even in smartnic configs, the PF @ the host is
> still privileged to
> create/destroy VFs or provision MACs for them even if it is not the
> e-switch manager
> anymore?

Yes, in a SmartNIC world one config we aim to have is that a host can create
and destroy VFs as needed.  One of the challenges is how the VF reps are
managed by applications in the SmartNIC when the host could make them
disappear.  

> Actually AFAIK this  can also work somehow otherwise, e.g a smartnic FW
> "pushes" the VFs into the host w.o them being under a host admin directive.

The model to 'push' VFs to a host is also another option, but I do not
like it as much.  My general preference is to allow the host to use a
SmartNIC as if it was any other standard NIC (we have been using the
term 'Performance NIC' to describe what we would call a standard NIC, but
the name is not terribly important).

There is also a school of thought that the VF reps could be
pre-allocated on the SmartNIC so that any application processing that
traffic would sit idle when no traffic arrives on the rep, but could
process frames that do arrive when the VFs were created on the host.
This implementation will depend on how resources are allocated on a
given bit of hardware, but can really work well.


* Re: SRIOV switchdev mode BoF minutes
  2018-04-17 14:47                           ` Andy Gospodarek
@ 2018-04-17 16:46                             ` Samudrala, Sridhar
  2018-04-17 16:53                               ` Andy Gospodarek
  2018-04-17 23:19                             ` Jakub Kicinski
  1 sibling, 1 reply; 36+ messages in thread
From: Samudrala, Sridhar @ 2018-04-17 16:46 UTC (permalink / raw)
  To: Andy Gospodarek, Or Gerlitz
  Cc: David Miller, Anjali Singhai Jain, Michael Chan, Simon Horman,
	Jakub Kicinski, John Fastabend, Saeed Mahameed, Jiri Pirko,
	Rony Efraim, Linux Netdev List

On 4/17/2018 7:47 AM, Andy Gospodarek wrote:
> On Tue, Apr 17, 2018 at 04:58:05PM +0300, Or Gerlitz wrote:
>> On Tue, Apr 17, 2018 at 4:30 PM, Andy Gospodarek
>> <andrew.gospodarek@broadcom.com> wrote:
>>> On Mon, Apr 16, 2018 at 07:08:39PM -0700, Samudrala, Sridhar wrote:
>>>> On 4/16/2018 5:39 AM, Andy Gospodarek wrote:
>>>>> On Sun, Apr 15, 2018 at 09:01:16AM +0300, Or Gerlitz wrote:
>>>>>> On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
>>>>>> <sridhar.samudrala@intel.com> wrote:
>>>>>>
>>>>>>> I meant between PFs on 2 compute nodes.
>>>>>> If the PF serves as uplink rep, it functions as  a switch port -- applications
>>>>>> don't run on switch ports. One way to get apps to run on the host in switchdev
>>>>>> mode is probe one of the VFs there.
>>>>>>
>>>>>>
>>>>>>
>>>> So once a pci device is configured in 'switchdev' mode,  only port representor netdevs are
>>>> seen on the host, no more PF netdev.
>>> That is not the functionality I would propose.  The PF netdev will still be there.
>> Andy,
>>
>> Basically LGTM, so even in smartnic configs, the PF @ the host is
>> still privileged to
>> create/destroy VFs or provision MACs for them even if it is not the
>> e-switch manager
>> anymore?
> Yes, in a SmartNIC world one config we aim to have is that a host can create
> and destroy VFs as needed.  One of the challenges is how the VF reps are
> managed by applications in the SmartNIC when the host could make them
> disappear.

OK. So are we saying that in 'switchdev' mode with 2 VFs and 1 uplink, the host will
see the PF netdev, 2 vf-rep netdevs corresponding to the 2 VFs, and 1 uplink-rep netdev?

Is the PF netdev used only for control/configuration of the VFs? If it is used as a datapath,
I think we need a pf-rep netdev too.


* Re: SRIOV switchdev mode BoF minutes
  2018-04-17 16:46                             ` Samudrala, Sridhar
@ 2018-04-17 16:53                               ` Andy Gospodarek
  0 siblings, 0 replies; 36+ messages in thread
From: Andy Gospodarek @ 2018-04-17 16:53 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Andy Gospodarek, Or Gerlitz, David Miller, Anjali Singhai Jain,
	Michael Chan, Simon Horman, Jakub Kicinski, John Fastabend,
	Saeed Mahameed, Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Apr 17, 2018 at 09:46:38AM -0700, Samudrala, Sridhar wrote:
> On 4/17/2018 7:47 AM, Andy Gospodarek wrote:
> > On Tue, Apr 17, 2018 at 04:58:05PM +0300, Or Gerlitz wrote:
> > > On Tue, Apr 17, 2018 at 4:30 PM, Andy Gospodarek
> > > <andrew.gospodarek@broadcom.com> wrote:
> > > > On Mon, Apr 16, 2018 at 07:08:39PM -0700, Samudrala, Sridhar wrote:
> > > > > On 4/16/2018 5:39 AM, Andy Gospodarek wrote:
> > > > > > On Sun, Apr 15, 2018 at 09:01:16AM +0300, Or Gerlitz wrote:
> > > > > > > On Sat, Apr 14, 2018 at 2:03 AM, Samudrala, Sridhar
> > > > > > > <sridhar.samudrala@intel.com> wrote:
> > > > > > > 
> > > > > > > > I meant between PFs on 2 compute nodes.
> > > > > > > If the PF serves as uplink rep, it functions as  a switch port -- applications
> > > > > > > don't run on switch ports. One way to get apps to run on the host in switchdev
> > > > > > > mode is probe one of the VFs there.
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > So once a pci device is configured in 'switchdev' mode,  only port representor netdevs are
> > > > > seen on the host, no more PF netdev.
> > > > That is not the functionality I would propose.  The PF netdev will still be there.
> > > Andy,
> > > 
> > > Basically LGTM, so even in smartnic configs, the PF @ the host is
> > > still privileged to
> > > create/destroy VFs or provision MACs for them even if it is not the
> > > e-switch manager
> > > anymore?
> > Yes, in a SmartNIC world one config we aim to have is that a host can create
> > and destroy VFs as needed.  One of the challenges is how the VF reps are
> > managed by applications in the SmartNIC when the host could make them
> > disappear.
> 
> OK. So are we saying that in 'switchdev' mode with 2 VFs and 1 uplink, the host will
> see PF netdev, 2 vf-rep netdev's corresponding to 2 VFs and 1 uplink-rep netdev.
> 
> Is PF netdev used only for the control/configure of the VFs? If it used as a datapath,
> i think we need a pf-rep netdev too.
> 

Yes, that is correct.  PF reps could be used for datapath configuration to
redirect traffic to a PF.
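
A minimal sketch of the netdev inventory confirmed above, assuming one uplink rep per uplink and made-up netdev names. Real representor names depend on the driver and on udev naming rules, and `host_netdevs` is a hypothetical helper.

```python
# Hypothetical naming scheme; only the counts reflect the discussion.

def host_netdevs(num_vfs, num_uplinks=1, with_pf_rep=False):
    """List the netdevs a host would see in switchdev mode under this model."""
    names = ["pf"]  # the PF netdev stays in place
    names += [f"uplink{i}-rep" for i in range(num_uplinks)]
    names += [f"vf{i}-rep" for i in range(num_vfs)]
    if with_pf_rep:
        # Only needed if the PF is also used as a datapath endpoint.
        names.append("pf-rep")
    return names


print(host_netdevs(2))  # ['pf', 'uplink0-rep', 'vf0-rep', 'vf1-rep']
```

Passing `with_pf_rep=True` adds the extra representor that would be needed to redirect traffic to the PF.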


* Re: SRIOV switchdev mode BoF minutes
  2018-04-17 14:47                           ` Andy Gospodarek
  2018-04-17 16:46                             ` Samudrala, Sridhar
@ 2018-04-17 23:19                             ` Jakub Kicinski
  2018-04-18 15:15                               ` Andy Gospodarek
  1 sibling, 1 reply; 36+ messages in thread
From: Jakub Kicinski @ 2018-04-17 23:19 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Or Gerlitz, Samudrala, Sridhar, David Miller, Anjali Singhai Jain,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, 17 Apr 2018 10:47:00 -0400, Andy Gospodarek wrote:
> There is also a school of thought that the VF reps could be
> pre-allocated on the SmartNIC so that any application processing that
> traffic would sit idle when no traffic arrives on the rep, but could
> process frames that do arrive when the VFs were created on the host.
> This implementation will depend on how resources are allocated on a
> given bit of hardware, but can really work well.

+1 if there are no FW resource allocation issues IMHO it's okay to
just show all reprs for "remote PCIes (PFs and VFs)" on the SmartNIC/
controller.  The reprs should just show link down as if the PCIe cable
was unplugged until the host actually enables them.

A similar issue exists on multi-host for PFs, right?  If one of the
hosts is down do we still show their PF repr?  IMHO yes.

That makes the thing look more like a switch with cables being plugged
in and out.
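
A minimal sketch of that "switch with cables" view, assuming all representors are pre-allocated on the SmartNIC and only their carrier state changes when the host enables or disables a function. The class `RemoteReprs` and the port names are hypothetical.

```python
# Toy model: representors always exist; only their link state follows
# what the remote host does. Names are assumptions for illustration.

class RemoteReprs:
    def __init__(self, num_pfs, vfs_per_pf):
        self.carrier = {}
        for pf in range(num_pfs):
            self.carrier[f"pf{pf}"] = False          # link down until enabled
            for vf in range(vfs_per_pf):
                self.carrier[f"pf{pf}vf{vf}"] = False

    def _set(self, port, up):
        if port not in self.carrier:
            raise KeyError(port)                     # no representor for this port
        self.carrier[port] = up

    def host_enable(self, port):
        """The host brought the function up: like plugging a cable in."""
        self._set(port, True)

    def host_disable(self, port):
        """The host removed the function: the representor stays, link goes down."""
        self._set(port, False)

    def link_up(self):
        return sorted(name for name, up in self.carrier.items() if up)


reprs = RemoteReprs(num_pfs=1, vfs_per_pf=2)
reprs.host_enable("pf0vf1")
print(reprs.link_up())       # ['pf0vf1']
print(len(reprs.carrier))    # 3
```

The number of representors never changes in this model, so an application on the SmartNIC can keep its handles open across VF creation and destruction on the host.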


* Re: SRIOV switchdev mode BoF minutes
  2018-04-17 23:19                             ` Jakub Kicinski
@ 2018-04-18 15:15                               ` Andy Gospodarek
  2018-04-18 16:26                                 ` Jakub Kicinski
  2018-04-18 17:07                                 ` Parikh, Neerav
  0 siblings, 2 replies; 36+ messages in thread
From: Andy Gospodarek @ 2018-04-18 15:15 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Andy Gospodarek, Or Gerlitz, Samudrala, Sridhar, David Miller,
	Anjali Singhai Jain, Michael Chan, Simon Horman, John Fastabend,
	Saeed Mahameed, Jiri Pirko, Rony Efraim, Linux Netdev List

On Tue, Apr 17, 2018 at 04:19:15PM -0700, Jakub Kicinski wrote:
> On Tue, 17 Apr 2018 10:47:00 -0400, Andy Gospodarek wrote:
> > There is also a school of thought that the VF reps could be
> > pre-allocated on the SmartNIC so that any application processing that
> > traffic would sit idle when no traffic arrives on the rep, but could
> > process frames that do arrive when the VFs were created on the host.
> > This implementation will depend on how resources are allocated on a
> > given bit of hardware, but can really work well.
> 
> +1 if there is no FW resource allocation issues IMHO it's okay to
> just show all reprs for "remote PCIes (PFs and VFs)" on the SmartNIC/
> controller.  The reprs should just show link down as if PCIe cable
> was unplugged until host actually enables them.

Yes we are on the same page on this.

> A similar issue exists on multi-host for PFs, right?  If one of the
> hosts is down do we still show their PF repr?  IMHO yes.

I would agree with that as well.  With today's model the VF reps are
created once a PF is put into switchdev mode, but I'm still working out
how we want to consider whether or not a PF rep for the other domains is
created locally or not and also how one can determine which domain is in
control.

Permanent config options (like NVRAM settings) could easily handle which
domain is in control, but that still does not mean that PF reps must be
created automatically, does it?

> That makes the thing looks more like a switch with cables being plugged
> in and out.

Yes, that's exactly how I view it as well.


* Re: SRIOV switchdev mode BoF minutes
  2018-04-18 15:15                               ` Andy Gospodarek
@ 2018-04-18 16:26                                 ` Jakub Kicinski
  2018-04-18 17:25                                   ` Andy Gospodarek
  2018-04-18 17:07                                 ` Parikh, Neerav
  1 sibling, 1 reply; 36+ messages in thread
From: Jakub Kicinski @ 2018-04-18 16:26 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Or Gerlitz, Samudrala, Sridhar, David Miller, Anjali Singhai Jain,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List

On Wed, 18 Apr 2018 11:15:29 -0400, Andy Gospodarek wrote:
> > A similar issue exists on multi-host for PFs, right?  If one of the
> > hosts is down do we still show their PF repr?  IMHO yes.  
> 
> I would agree with that as well.  With today's model the VF reps are
> created once a PF is put into switchdev mode, but I'm still working out
> how we want to consider whether or not a PF rep for the other domains is
> created locally or not and also how one can determine which domain is in
> control.
> 
> Permanent config options (like NVRAM settings) could easily handle which
> domain is in control, but that still does not mean that PF reps must be
> created automatically, does it?

The control domain is tricky.  I'm not sure I understand how you could
not have a PF rep for remote domains, though.  How do you configure
switching to the PF netdev if there is no rep?


* Re: SRIOV switchdev mode BoF minutes
  2018-04-18 16:26                                 ` Jakub Kicinski
@ 2018-04-18 17:25                                   ` Andy Gospodarek
  0 siblings, 0 replies; 36+ messages in thread
From: Andy Gospodarek @ 2018-04-18 17:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Andy Gospodarek, Or Gerlitz, Samudrala, Sridhar, David Miller,
	Anjali Singhai Jain, Michael Chan, Simon Horman, John Fastabend,
	Saeed Mahameed, Jiri Pirko, Rony Efraim, Linux Netdev List

On Wed, Apr 18, 2018 at 09:26:34AM -0700, Jakub Kicinski wrote:
> On Wed, 18 Apr 2018 11:15:29 -0400, Andy Gospodarek wrote:
> > > A similar issue exists on multi-host for PFs, right?  If one of the
> > > hosts is down do we still show their PF repr?  IMHO yes.  
> > 
> > I would agree with that as well.  With today's model the VF reps are
> > created once a PF is put into switchdev mode, but I'm still working out
> > how we want to consider whether or not a PF rep for the other domains is
> > created locally or not and also how one can determine which domain is in
> > control.
> > 
> > Permanent config options (like NVRAM settings) could easily handle which
> > domain is in control, but that still does not mean that PF reps must be
> > created automatically, does it?
> 
> The control domain is tricky.  I'm not sure I understand how you could
> not have a PF rep for remote domains, though.  How do you configure
> switching to the PF netdev if there is no rep?

Yes, for complete control of all traffic using standard Linux APIs a PF
rep is a requirement.


* RE: SRIOV switchdev mode BoF minutes
  2018-04-18 15:15                               ` Andy Gospodarek
  2018-04-18 16:26                                 ` Jakub Kicinski
@ 2018-04-18 17:07                                 ` Parikh, Neerav
  1 sibling, 0 replies; 36+ messages in thread
From: Parikh, Neerav @ 2018-04-18 17:07 UTC (permalink / raw)
  To: Andy Gospodarek, Jakub Kicinski
  Cc: Or Gerlitz, Samudrala, Sridhar, David Miller, Singhai, Anjali,
	Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
	Jiri Pirko, Rony Efraim, Linux Netdev List



> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Andy Gospodarek
> Sent: Wednesday, April 18, 2018 8:15 AM
> To: Jakub Kicinski <jakub.kicinski@netronome.com>
> Cc: Andy Gospodarek <andrew.gospodarek@broadcom.com>; Or Gerlitz
> <gerlitz.or@gmail.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>;
> David Miller <davem@davemloft.net>; Singhai, Anjali
> <anjali.singhai@intel.com>; Michael Chan <michael.chan@broadcom.com>;
> Simon Horman <simon.horman@netronome.com>; John Fastabend
> <john.fastabend@gmail.com>; Saeed Mahameed <saeedm@mellanox.com>;
> Jiri Pirko <jiri@mellanox.com>; Rony Efraim <ronye@mellanox.com>; Linux
> Netdev List <netdev@vger.kernel.org>
> Subject: Re: SRIOV switchdev mode BoF minutes
> 
> On Tue, Apr 17, 2018 at 04:19:15PM -0700, Jakub Kicinski wrote:
> > On Tue, 17 Apr 2018 10:47:00 -0400, Andy Gospodarek wrote:
> > > There is also a school of thought that the VF reps could be
> > > pre-allocated on the SmartNIC so that any application processing that
> > > traffic would sit idle when no traffic arrives on the rep, but could
> > > process frames that do arrive when the VFs were created on the host.
> > > This implementation will depend on how resources are allocated on a
> > > given bit of hardware, but can really work well.
> >
> > +1 if there is no FW resource allocation issues IMHO it's okay to
> > just show all reprs for "remote PCIes (PFs and VFs)" on the SmartNIC/
> > controller.  The reprs should just show link down as if PCIe cable
> > was unplugged until host actually enables them.
> 
> Yes we are on the same page on this.
> 
> > A similar issue exists on multi-host for PFs, right?  If one of the
> > hosts is down do we still show their PF repr?  IMHO yes.
> 
> I would agree with that as well.  With today's model the VF reps are
> created once a PF is put into switchdev mode, but I'm still working out
> how we want to consider whether or not a PF rep for the other domains is
> created locally or not and also how one can determine which domain is in
> control.
> 
> Permanent config options (like NVRAM settings) could easily handle which
> domain is in control, but that still does not mean that PF reps must be
> created automatically, does it?
> 
> > That makes the thing looks more like a switch with cables being plugged
> > in and out.
> 
> Yes, that's exactly how I view it as well.

If we need to behave like a switch or emulate that mode, has any thought
been given to the usability model?

So, whichever domain is in control, the implication above is that the
max number of vports supported (VFs and PFs) will need to be represented
regardless of whether they're "Enabled" or not in a given "Switch".
By "Enabled" I mean that SRIOV VFs may not have been enabled, but the
representors still exist.
For example, if there are 256 VFs supported on a given PF, then when someone
switches into switchdev mode there will be ~257 representor netdevs
added into the system. And if you have multi-port, multi-function devices,
the number of representor netdevs will increase accordingly.
While representor netdev naming may help a bit here, a user will still
need to determine and differentiate between the sprawl of representor
netdevs and data netdevs to identify which ones can be added into an
OVS bridge (or vSwitch).
And switching to "switchdev mode" becomes a prerequisite for
any vSwitch bridge that uses these representor netdevs.
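
The arithmetic behind the ~257 figure can be sketched as follows, assuming one extra representor per PF (for the uplink or the PF itself) on top of one per supported VF. `representor_count` and its parameters are hypothetical names used only for this estimate.

```python
def representor_count(num_pfs, max_vfs_per_pf, extra_per_pf=1):
    """Representor netdevs created if every supported vport is represented."""
    return num_pfs * (max_vfs_per_pf + extra_per_pf)


print(representor_count(1, 256))  # 257
print(representor_count(2, 256))  # 514
```

The count scales linearly with the number of ports or functions, which is the sprawl referred to above.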

In a SmartNIC, where users are not managing the devices, this may
be deployed based on the NIC FW/SW capabilities.
But I'm not sure how the same model, applied on a standard host running
Linux, will work across devices.


