Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next v3 2/4] net: mdio: add PTP offset compensation to mdiobus_write_sts
From: Hubert Feurstein @ 2019-08-20 16:56 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Andrew Lunn, netdev, lkml, Richard Cochran, Florian Fainelli,
	Heiner Kallweit, Vladimir Oltean, David S. Miller
In-Reply-To: <20190820154005.GM891@localhost>

Am Di., 20. Aug. 2019 um 17:40 Uhr schrieb Miroslav Lichvar
<mlichvar@redhat.com>:
>
> On Tue, Aug 20, 2019 at 05:23:06PM +0200, Andrew Lunn wrote:
> > > - take a second "post" system timestamp after the completion
> >
> > For this hardware, completion is an interrupt, which has a lot of
> > jitter on it. But this hardware is odd, in that it uses an
> > interrupt. Every other MDIO bus controller uses polled IO, with an
> > mdelay(10) or similar between each poll. So the jitter is going to be
> > much larger.
>
> I think a large jitter is ok in this case. We just need to timestamp
> something that we know for sure happened after the PHC timestamp. It
> should have no impact on the offset and its stability, just the
> reported delay. A test with phc2sys should be able to confirm that.
> phc2sys selects the measurement with the shortest delay, which has
> least uncertainty. I'd say that applies to both interrupt and polling.
>
> If it is difficult to specify the minimum interrupt delay, I'd still
> prefer an overly pessimistic interval assuming a zero delay.
>
Currently I do not see the benefit from this. The original intention was to
compensate for the remaining offset as good as possible. The current code
of phc2sys uses the delay only for the filtering of the measurement record
with the shortest delay and for reporting and statistics. Why not simple shift
the timestamps with the offset to the point where we expect the PHC timestamp
to be captured, and we have a very good result compared to where we came
from.

Hubert

^ permalink raw reply

* Re: [PATCH 2/6] dt-bindings: net: sun8i-a83t-emac: Add phy-io-supply property
From: Rob Herring @ 2019-08-20 16:57 UTC (permalink / raw)
  To: Rob Herring, David S. Miller, Mark Rutland, Maxime Ripard,
	Chen-Yu Tsai, Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu,
	Maxime Coquelin, netdev, devicetree,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	linux-kernel@vger.kernel.org, linux-stm32
In-Reply-To: <20190820163433.sr4lvjxmmhjtbtcb@core.my.home>

On Tue, Aug 20, 2019 at 11:34 AM Ondřej Jirman <megous@megous.com> wrote:
>
> On Tue, Aug 20, 2019 at 11:20:22AM -0500, Rob Herring wrote:
> > On Tue, Aug 20, 2019 at 9:53 AM <megous@megous.com> wrote:
> > >
> > > From: Ondrej Jirman <megous@megous.com>
> > >
> > > Some PHYs require separate power supply for I/O pins in some modes
> > > of operation. Add phy-io-supply property, to allow enabling this
> > > power supply.
> >
> > Perhaps since this is new, such phys should have *-supply in their nodes.
>
> Yes, I just don't understand, since external ethernet phys are so common,
> and they require power, how there's no fairly generic mechanism for this
> already in the PHY subsystem, or somewhere?

Because generic mechanisms for this don't work. For example, what
happens when the 2 supplies need to be turned on in a certain order
and with certain timings? And then add in reset or control lines into
the mix... You can see in the bindings we already have some of that.

> It looks like other ethernet mac drivers also implement supplies on phys
> on the EMAC nodes. Just grep phy-supply through dt-bindings/net.
>
> Historical reasons, or am I missing something? It almost seems like I must
> be missing something, since putting these properties to phy nodes
> seems so obvious.

Things get added one by one and one new property isn't that
controversial. We've generally learned the lesson and avoid this
pattern now, but ethernet phys are one of the older bindings.

Rob

^ permalink raw reply

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
From: Andrew Lunn @ 2019-08-20 16:57 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, mlichvar, Richard Cochran,
	Florian Fainelli, linux-spi, netdev
In-Reply-To: <CA+h21hr4UcoJK7upNJjG0ibtX7CkF=akxVdrb--1AJn6-z=sUQ@mail.gmail.com>

> - Ethernet has support for hardware timestamping

True, but not all drivers support it.

Marvell switches are often combined with Marvell MACs. None of the
Marvell MAC drivers, mv643xx, mvneta, mvpp2 or octeontx2 support
hardware timestamping. My guess is, the hardware probably supports it,
but nobody has taken the time to writing the driver code. FEC does
have PTP support, but what does it look like in general? Are Marvell
drives the exception, or the norm?

What we have been talking about in this thread, adding timestamp calls
at various places, is simple, compared to adding driver code to a
number of MAC drivers. We can get a lot of 'bang for our buck' with
time stamps, it is easy to copy to other drivers, you don't need a
good knowledge of the hardware, datasheet, etc.

So if we can get this accepted, we should try to do it. You can always
come back later and add your full hardware solution.

   Andrew

^ permalink raw reply

* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson @ 2019-08-20 17:19 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	cjia@nvidia.com, netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB48668B6221E477A873688CDBD1AB0@AM0PR05MB4866.eurprd05.prod.outlook.com>

On Tue, 20 Aug 2019 08:58:02 +0000
Parav Pandit <parav@mellanox.com> wrote:

> + Dave.
> 
> Hi Jiri, Dave, Alex, Kirti, Cornelia,
> 
> Please provide your feedback on it, how shall we proceed?
> 
> Short summary of requirements.
> For a given mdev (mediated device [1]), there is one representor
> netdevice and devlink port in switchdev mode (similar to SR-IOV VF),
> And there is one netdevice for the actual mdev when mdev is probed.
> 
> (a) representor netdev and devlink port should be able derive
> phys_port_name(). So that representor netdev name can be built
> deterministically across reboots.
> 
> (b) for mdev's netdevice, mdev's device should have an attribute.
> This attribute can be used by udev rules/systemd or something else to
> rename netdev name deterministically.
> 
> (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ
> in drivers, uapi, netlink, boot config area and more. Changing
> IFNAMSIZ for a mdev bus doesn't really look reasonable option to me.

How many characters do we really have to work with?  Your examples
below prepend various characters, ex. option-1 results in ens2f0_m10 or
enm10.  Do the extra 8 or 3 characters in these count against IFNAMSIZ?

> Hence, I would like to discuss below options.
> 
> Option-1: mdev index
> Introduce an optional mdev index/handle as u32 during mdev create
> time. User passes mdev index/handle as input.
> 
> phys_port_name=mIndex=m%u
> mdev_index will be available in sysfs as mdev attribute for udev to
> name the mdev's netdev.
> 
> example mdev create command:
> UUID=$(uuidgen)
> echo $UUID index=10
> > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create

Nit, IIRC previous discussions of additional parameters used comma
separators, ex. echo $UUID,index=10 >...

> > example netdevs:
> repnetdev=ens2f0_m10	/*ens2f0 is parent PF's netdevice */

Is the parent really relevant in the name?  Tools like mdevctl are
meant to provide persistence, creating the same mdev devices on the
same parent, but that's simply the easiest policy decision.  We can
also imagine that multiple parent devices might support a specified
mdev type and policies factoring in proximity, load-balancing, power
consumption, etc might be weighed such that we really don't want to
promote userspace creating dependencies on the parent association.

> mdev_netdev=enm10
> 
> Pros:
> 1. mdevctl and any other existing tools are unaffected.
> 2. netdev stack, ovs and other switching platforms are unaffected.
> 3. achieves unique phys_port_name for representor netdev
> 4. achieves unique mdev eth netdev name for the mdev using
> udev/systemd extension. 5. Aligns well with mdev and netdev subsystem
> and similar to existing sriov bdf's.

A user provided index seems strange to me.  It's not really an index,
just a user specified instance number.  Presumably you have the user
providing this because if it really were an index, then the value
depends on the creation order and persistence is lost.  Now the user
needs to both avoid uuid collision as well as "index" number
collision.  The uuid namespace is large enough to mostly ignore this,
but this is not.  This seems like a burden.

> Option-2: shorter mdev name
> Extend mdev to have shorter mdev device name in addition to UUID.
> such as 'foo', 'bar'.
> Mdev will continue to have UUID.
> phys_port_name=mdev_name
> 
> Pros:
> 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> It is common practice to upgrade iproute2 package along with the
> kernel. Similar practice to be done with mdevctl.
> 2. Newer users of mdevctl who wants to work with non_UUID names, will
> use newer mdevctl/tools. Cons:
> 1. Dual naming scheme of mdev might affect some of the existing tools.
> It's unclear how/if it actually affects.
> mdevctl [2] is very recently developed and can be enhanced for dual
> naming scheme.

I think we've already nak'ed this one, the device namespace becomes
meaningless if the name becomes just a string where a uuid might be an
example string.  mdevs are named by uuid.

> Option-3: mdev uuid alias
> Instead of shorter mdev name or mdev index, have alpha-numeric name
> alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> example mdev create command:
> UUID=$(uuidgen)
> echo $UUID alias=foo
> > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > example netdevs:
> examle netdevs:
> repnetdev = ens2f0_mfoo
> mdev_netdev=enmfoo
> 
> Pros:
> 1. All same as option-1.
> 2. Doesn't affect existing mdev naming scheme.
> Cons:
> 1. Index scheme of option-1 is better which can number large number
> of mdevs with fewer characters, simplifying the management tool.

No better than option-1, simply a larger secondary namespace, but still
requires the user to come up with two independent names for the device.

> Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16 to
> 64 bytes phys_port_name=mdev_UUID_string mdev_netdev_name=enmUUID
> 
> Pros:
> 1. Doesn't require mdev extension
> Cons:
> 1. netdev stack, driver, uapi, user space, boot config wide changes
> 2. Possible user space extensions who assumed name size being 16
> characters 3. Single device type demands namesize change for all
> netdev types

What about an alias based on the uuid?  For example, we use 160-bit
sha1s daily with git (uuids are only 128-bit), but we generally don't
reference git commits with the full 20 character string.  Generally 12
characters is recommended to avoid ambiguity.  Could mdev automatically
create an abbreviated sha1 alias for the device?  If so, how many
characters should we use and what do we do on collision?  The colliding
device could add enough alias characters to disambiguate (we likely
couldn't re-alias the existing device to disambiguate, but I'm not sure
it matters, userspace has sysfs to associate aliases).  Ex.

UUID=$(uuidgen)
ALIAS=$(echo $UUID | sha1sum | colrm 13)

Since there seems to be some prefix overhead, as I ask about above in
how many characters we actually have to work with in IFNAMESZ, maybe we
start with 8 characters (matching your "index" namespace) and expand as
necessary for disambiguation.  If we can eliminate overhead in
IFNAMESZ, let's start with 12.  Thanks,

Alex

^ permalink raw reply

* Re: [PATCH bpf-next V9 1/3] bpf: new helper to obtain namespace data from current task
From: Yonghong Song @ 2019-08-20 17:29 UTC (permalink / raw)
  To: Carlos Antonio Neira Bustos
  Cc: netdev@vger.kernel.org, ebiederm@xmission.com, brouer@redhat.com,
	bpf@vger.kernel.org
In-Reply-To: <20190820151040.GA53610@localhost>



On 8/20/19 8:10 AM, Carlos Antonio Neira Bustos wrote:
> Hi Yonghong,
> 
> Thanks for taking the time to review this.
> 
> 
>>> + *
>>> + *		**-EINVAL** if *size_of_pidns* is not valid or unable to get ns, pid
>>> + *		or tgid of the current task.
>>> + *
>>> + *		**-ECHILD** if /proc/self/ns/pid does not exists.
>>> + *
>>> + *		**-ENOTDIR** if /proc/self/ns does not exists.
>>
>> Let us remove ECHILD and ENOTDIR and replace it with ENOENT as I
>> described below.
>>
>> Please *do verify* what happens when namespaces or pid_ns are not
>> configured.
>>
> 
> 
> I have tested kernel configurations without namespace support and with
> namespace support but without pid namespaces, the helper returns -EINVAL
> on both cases, now it should return -ENOENT.

Indeed. -ENOENT is better.

> 
> 
>>> +struct bpf_pidns_info {
>>> +	__u32 dev;
>>
>> Please add a comment for dev for how device major and minor number are
>> derived. User space gets device major and minor number, they need to
>> compare to the corresponding major/minor numbers returned by this helper.
>>
>>> +	__u32 nsid;
>>> +	__u32 tgid;
>>> +	__u32 pid;
>>> +};
>>
> 
> What do you think of this comment ?
> 
> struct bpf_pidns_info {
> 	__u32 dev;	/* major/minor numbers from /proc/self/ns/pid.
> 			 * User space gets device major and minor numbers from
> 			 * the same device that need to be compared against the
> 			 * major/minor numbers returned by this helper.
> 			 */
> 	__u32 nsid;
> 	__u32 tgid;
> 	__u32 pid;
> };
> 

To be more specific, I like a comment similar to below in uapi bpf.h

struct bpf_cgroup_dev_ctx {
         /* access_type encoded as (BPF_DEVCG_ACC_* << 16) | 
BPF_DEVCG_DEV_* */
         __u32 access_type;
         __u32 major;
         __u32 minor;
};

Some like:
	/* dev encoded as (major << 8 | (minor & 0xff)) */

>>
>> Please put an empty line. As a general rule for readability,
>> put an empty line if control flow is interrupted, e.g., by
>> "return", "break" or "continue". At least this is what
>> I saw most in bpf mailing list.
>>
> I'll fix it in version 10.
> 
>>> +	len = strlen(pidns_path) + 1;
>>> +	memcpy((char *)tmp->name, pidns_path, len);
>>> +	tmp->uptr = NULL;
>>> +	tmp->aname = NULL;
>>> +	tmp->refcnt = 1;
>>> +	ret = filename_lookup(AT_FDCWD, tmp, 0, &kp, NULL);
>> Adding below to free kmem cache memory
>> 	kmem_cache_free(names_cachep, fname);
>>
> 
> I think we don't need to call kmem_cache_free as filename_lookup
> calls putname that calls kmem_cache_free.

Oh, right. Thanks for checking this.

> 
> 
> Thanks a lot for your help.
> 
> Bests
> 
>> In the above, we checked task_active_pid_ns().
>> If not returning NULL, we have a valid pid ns. So the above
>> filename_lookup should not go wrong. We can still keep
>> the error checking though.
>>
>>> +	if (ret) {
>>> +		memset((void *)pidns_info, 0, (size_t) size);
>>> +		return ret;
>>
>>
> 
> I think we could get rid of this.
> 
> 

^ permalink raw reply

* pull-request: can-next 2019-08-20
From: Marc Kleine-Budde @ 2019-08-20 17:32 UTC (permalink / raw)
  To: netdev; +Cc: davem, kernel, linux-can

[-- Attachment #1.1: Type: text/plain, Size: 3083 bytes --]

Hello David,

this is a pull request for net-next/master consisting of 18 patches.

The first patch is by Geert Uytterhoeven, it removes the unused platform
data support from the rcar_can driver.

A patch by Nishka Dasgupta marks the structure peak_pciec_i2c_bit_ops in
the peak_pci driver as constant.

A patch by me removes the custom DMA support from the hi311x driver.

The next 4 patches target the tcan4x5x driver and are also by me, they
first clean up the driver a bit, and then add missing error handling and
fix a bug in the length calculation in the regmap callbacks.

The next 2 patches are by me for the m_can_platform driver, they also
remove unneeded casts and add missing error handling.

The remaining 9 patches all target the mcp251x driver. The first 5 are
clean up patches by me, the next relaxes the timing in the
mcp251x_hw_reset() function. Alexander Shiyan's patch improves the name
which is used while registering the interrupt handler. Phil Elwell's
patch improves the mcp251x_open() function to use the DT-supplied
interrupt flags instead of hard coding them. The final patch is again by
me, it removes the custom DMA support from the hi311x driver.

Marc

---

The following changes since commit 932630fa902878f4c8c50d0b1260eeb9de16b0a4:

  Merge tag 'wireless-drivers-next-for-davem-2019-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next (2019-08-19 18:32:30 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git tags/linux-can-next-for-5.4-20190820

for you to fetch changes up to df58525df395561551440392d92daa6416b4f8f3:

  can: mcp251x: remove custom DMA mapped buffer (2019-08-20 13:41:26 +0200)

----------------------------------------------------------------
linux-can-next-for-5.4-20190820

----------------------------------------------------------------
Alexander Shiyan (1):
      can: mcp251x: Use dev_name() during request_threaded_irq()

Geert Uytterhoeven (1):
      can: rcar_can: Remove unused platform data support

Marc Kleine-Budde (14):
      can: hi311x: remove custom DMA mapped buffer
      can: tcan4x5x: remove unused struct tcan4x5x_priv::tcan4x5x_lock
      can: tcan4x5x: remove not needed casts to struct tcan4x5x_priv *
      can: tcan4x5x: tcan4x5x_can_probe(): add missing error handling if mcan_class is NULL
      can: tcan4x5x: fix data length in regmap write path
      can: m_can_platform: remove not needed casts to struct m_can_plat_priv *
      can: m_can_platform: m_can_plat_probe(): add missing error handling if mcan_class is NULL
      can: mcp251x: convert block comments to network style comments
      can: mcp251x: remove unnecessary blank lines

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |-
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
From: Pablo Neira Ayuso @ 2019-08-20 17:33 UTC (permalink / raw)
  To: Edward Cree; +Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu
In-Reply-To: <c8a00a98-74eb-9f8d-660f-c2ea159dec91@solarflare.com>

On Tue, Aug 20, 2019 at 05:00:26PM +0100, Edward Cree wrote:
> On 20/08/2019 15:44, Pablo Neira Ayuso wrote:
> > It looks to me this limitation is coming from tc pedit.
> >
> > Four actions to mangle an IPv6 address consume more memory when making
> > the translation, and if you expect a lot of rules.
>
> Your change means that now every pedit uses four hw entries, even if it
>  was only meant to be a 32-bit mangle.

It makes no sense to me that matching an IPv6 address takes _one_
action, while mangling an IPv6 address takes _four_ actions.

A consistent model for drivers is good to have.

I can update tc pedit to generate one single action for offset
consecutive packet editions, if that is the concern, I'll send a v2.

^ permalink raw reply

* Re: [PATCH net-next 6/6] net: dsa: tag_8021q: Restore bridge pvid when enabling vlan_filtering
From: Florian Fainelli @ 2019-08-20 17:36 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Vivien Didelot, Andrew Lunn, Ido Schimmel, Roopa Prabhu, nikolay,
	David S. Miller, netdev
In-Reply-To: <CA+h21hody3hu_6UE9gU4dQ5+BP5HnUi=uw0PDdvtgPjTrbpKbw@mail.gmail.com>

On 8/20/19 3:28 AM, Vladimir Oltean wrote:
> Hi Florian,
> 
> On Tue, 20 Aug 2019 at 06:33, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>>
>>
>> On 8/19/2019 5:00 PM, Vladimir Oltean wrote:
>>> The bridge core assumes that enabling/disabling vlan_filtering will
>>> translate into the simple toggling of a flag for switchdev drivers.
>>>
>>> That is clearly not the case for sja1105, which alters the VLAN table
>>> and the pvids in order to obtain port separation in standalone mode.
>>>
>>> So, since the bridge will not call any vlan operation through switchdev
>>> after enabling vlan_filtering, we need to ensure we're in a functional
>>> state ourselves.
>>>
>>> Hence read the pvid that the bridge is aware of, and program that into
>>> our ports.
>>
>> That is arguably applicable with DSA at large and not just specifically
>> for tag_8021q.c no? Is there a reason why you are not seeking to solve
>> this on a more global scale?
>>
> 
> Perhaps because I don't have a good feel for what are other DSA
> drivers' struggles with restoring the pvid, even after re-reading the
> "What to do when a bridge port gets its pvid deleted?" thread.
> I understand b53 has a similar need, and for that purpose maybe you
> can EXPORT_SYMBOL_GPL(dsa_port_restore_pvid) and use it, but
> otherwise, what is the more global scale?

I was just referring to all DSA drivers, not just tag_8021q.c which is,
so far a very narrow user base. I don't have a strong feeling against
calling a helper function versus the core DSA layer doing this for you,
but clearly this is something that should plague all drivers, not just
sja1105, even if the effects of not restoring the PVID there may be more
"catastrophic".
-- 
Florian

^ permalink raw reply

* Re: [PATCH v2 net-next] netdevsim: Fix build error without CONFIG_INET
From: Jakub Kicinski @ 2019-08-20 17:48 UTC (permalink / raw)
  To: YueHaibing; +Cc: davem, idosch, jiri, mcroce, linux-kernel, netdev
In-Reply-To: <20190820141446.71604-1-yuehaibing@huawei.com>

On Tue, 20 Aug 2019 22:14:46 +0800, YueHaibing wrote:
> If CONFIG_INET is not set, building fails:
> 
> drivers/net/netdevsim/dev.o: In function `nsim_dev_trap_report_work':
> dev.c:(.text+0x67b): undefined reference to `ip_send_check'
> 
> Use ip_fast_csum instead of ip_send_check to avoid
> dependencies on CONFIG_INET.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Fixes: da58f90f11f5 ("netdevsim: Add devlink-trap support")
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Thank you!

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

^ permalink raw reply

* Re: [PATCH net-next 3/6] net: dsa: Delete the VID from the upstream port as well
From: Vivien Didelot @ 2019-08-20 17:52 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, Andrew Lunn, Ido Schimmel, Roopa Prabhu,
	nikolay, David S. Miller, netdev
In-Reply-To: <CA+h21hpdDuoR5nj98EC+-W4HoBs35e_rURS1LD1jJWF5pkty9w@mail.gmail.com>

Hi Vladimir,

On Tue, 20 Aug 2019 12:54:44 +0300, Vladimir Oltean <olteanv@gmail.com> wrote:
> I can agree that this isn't one of my brightest moments. But at least
> we get to see Cunningham's law in action :)
> When dsa_8021q is cleaning up the switch's VLAN table for the bridge
> to use it, it is good to really clean it up, i.e. not leave any VLAN
> installed on the upstream ports.
> But I think this is just an academical concern at this point. In
> vlan_filtering mode, the CPU port will accept VLAN frames with the
> dsa_8021q ID's, but they will eventually get dropped due to no
> destination. The real breaker is the pvid change. If something like
> patch 4/6 gets accepted I will drop this one.

I wish Ward had mentioned to submit such academical concern as RFC :)

Please submit smaller series, targeting a single functional problem each,
with clear and detailed messages.


Thanks,

	Vivien

^ permalink raw reply

* Re: [PATCH rdma-next 0/3] RDMA RX RoCE Steering Support
From: Doug Ledford @ 2019-08-20 17:54 UTC (permalink / raw)
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Mark Bloch, Mark Zhang,
	Saeed Mahameed, linux-netdev
In-Reply-To: <20190819113626.20284-1-leon@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 677 bytes --]

On Mon, 2019-08-19 at 14:36 +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Hi,
> 
> This series from Mark extends mlx5 with RDMA_RX RoCE flow steering
> support
> for DEVX and QP objects.
> 
> Thanks
> 
> Mark Zhang (3):
>   net/mlx5: Add per-namespace flow table default miss action support
>   net/mlx5: Create bypass and loopback flow steering namespaces for
> RDMA
>     RX
>   RDMA/mlx5: RDMA_RX flow type support for user applications

I have no objection to this series.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Cornelia Huck @ 2019-08-20 17:55 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Parav Pandit, Jiri Pirko, David S . Miller, Kirti Wankhede,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	cjia@nvidia.com, netdev@vger.kernel.org
In-Reply-To: <20190820111904.75515f58@x1.home>

On Tue, 20 Aug 2019 11:19:04 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> What about an alias based on the uuid?  For example, we use 160-bit
> sha1s daily with git (uuids are only 128-bit), but we generally don't
> reference git commits with the full 20 character string.  Generally 12
> characters is recommended to avoid ambiguity.  Could mdev automatically
> create an abbreviated sha1 alias for the device?  If so, how many
> characters should we use and what do we do on collision?  The colliding
> device could add enough alias characters to disambiguate (we likely
> couldn't re-alias the existing device to disambiguate, but I'm not sure
> it matters, userspace has sysfs to associate aliases).  Ex.
> 
> UUID=$(uuidgen)
> ALIAS=$(echo $UUID | sha1sum | colrm 13)
> 
> Since there seems to be some prefix overhead, as I ask about above in
> how many characters we actually have to work with in IFNAMESZ, maybe we
> start with 8 characters (matching your "index" namespace) and expand as
> necessary for disambiguation.  If we can eliminate overhead in
> IFNAMESZ, let's start with 12.  Thanks,
> 
> Alex

I really like that idea, and it seems the best option proposed yet, as
we don't need to create a secondary identifier.

^ permalink raw reply

* Re: [PATCH v2 net-next] netdevsim: Fix build error without CONFIG_INET
From: Ido Schimmel @ 2019-08-20 18:10 UTC (permalink / raw)
  To: YueHaibing
  Cc: davem, idosch, jiri, mcroce, jakub.kicinski, linux-kernel, netdev
In-Reply-To: <20190820141446.71604-1-yuehaibing@huawei.com>

On Tue, Aug 20, 2019 at 10:14:46PM +0800, YueHaibing wrote:
> If CONFIG_INET is not set, building fails:
> 
> drivers/net/netdevsim/dev.o: In function `nsim_dev_trap_report_work':
> dev.c:(.text+0x67b): undefined reference to `ip_send_check'
> 
> Use ip_fast_csum instead of ip_send_check to avoid
> dependencies on CONFIG_INET.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Fixes: da58f90f11f5 ("netdevsim: Add devlink-trap support")
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>

Thanks for fixing this in my stead!

^ permalink raw reply

* Re: linux-next: Tree for Aug 19 (drivers/net/netdevsim/dev.o)
From: Ido Schimmel @ 2019-08-20 18:13 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, netdev@vger.kernel.org
In-Reply-To: <92ef45a5-c933-0493-b2ff-50352fa8bf3f@infradead.org>

On Mon, Aug 19, 2019 at 09:16:13PM -0700, Randy Dunlap wrote:
> On 8/19/19 2:18 AM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Changes since 20190816:
> > 
> 
> on x86_64:
> # CONFIG_INET is not set
> 
> ld: drivers/net/netdevsim/dev.o: in function `nsim_dev_trap_report_work':
> dev.c:(.text+0x52f): undefined reference to `ip_send_check'
> 
> 
> Full randconfig file is attached.

Randy,

YueHaibing sent a v2 which is available here [1]. Successfully tested
it with your config.

[1] https://patchwork.ozlabs.org/patch/1150183/

^ permalink raw reply

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
From: Edward Cree @ 2019-08-20 18:15 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu
In-Reply-To: <20190820173344.3nrzfjboyztz3lji@salvia>

On 20/08/2019 18:33, Pablo Neira Ayuso wrote:
> I can update tc pedit to generate one single action for offset
> consecutive packet editions, if that is the concern, I'll send a v2.
IMHO the fix belongs in TC userland (i.e. iproute2), to turn a single action on the commandline for an ipv6 addr into four pedit actions before the kernel ever sees it.
Similarly if nftables wants to use this it should generate four separate pedit actions, probably in the kernel netfilter code as (I assume) your uAPI talks in terms of named fields rather than the u32ish offsets and masks of tc pedit.
The TC (well, flow_offload now I suppose) API should be kept narrow, not widened for things that can already be expressed adequately.  Your array of words inside a pedit action looks like a kind of loop unrolling but for data structures, which doesn't look sensible to me.

-Ed

^ permalink raw reply

* Re: [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages
From: Mark Brown @ 2019-08-20 18:21 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	linux-spi, netdev
In-Reply-To: <20190818182600.3047-2-olteanv@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 345 bytes --]

On Sun, Aug 18, 2019 at 09:25:56PM +0300, Vladimir Oltean wrote:

>  	/* Extract head of queue */
> -	ctlr->cur_msg =
> -		list_first_entry(&ctlr->queue, struct spi_message, queue);
> +	mesg = list_first_entry(&ctlr->queue, struct spi_message, queue);
> +	ctlr->cur_msg = mesg;

Why mesg when the existing code uses msg as an abbreviation here?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* VRF notes when using ipv6 and flushing tables.
From: Ben Greear @ 2019-08-20 18:27 UTC (permalink / raw)
  To: netdev

I recently spend a few days debugging what in the end was user error on my part.

Here are my notes in hope they help someone else.

First, 'ip -6 route show vrf vrfX' will not show some of the
routes (like local routes) that will show up with
'ip -6 route show table X', where X == vrfX's table-id

If you run 'ip -6 route flush table X', then you will loose all of the auto
generated routes, including anycast, ff00::/8, and local routes.

ff00::/8 is needed for neigh discovery to work (probably among other things)

local route is needed or packets won't actually be accepted up the stack
(I think that is the symptom at least)

Not sure exactly what anycast does, but I'm guessing it is required for
something useful.

You must manually re-add those to the table unless you for certain know that
you do not need them for whatever reason.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
From: Pablo Neira Ayuso @ 2019-08-20 18:35 UTC (permalink / raw)
  To: Edward Cree; +Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu
In-Reply-To: <f4cf8a97-3322-d982-6068-d4c0ce997b1c@solarflare.com>

On Tue, Aug 20, 2019 at 07:15:10PM +0100, Edward Cree wrote:
> On 20/08/2019 18:33, Pablo Neira Ayuso wrote:
> > I can update tc pedit to generate one single action for offset
> > consecutive packet editions, if that is the concern, I'll send a v2.
> IMHO the fix belongs in TC userland (i.e. iproute2), to turn a
> single action on the commandline for an ipv6 addr into four pedit
> actions before the kernel ever sees it.
> Similarly if nftables wants to use this it should generate four
> separate pedit actions, probably in the kernel netfilter code as (I
> assume) your uAPI talks in terms of named fields rather than the
> u32ish offsets and masks of tc pedit.

The driver flow_offload API does not necessarily need to map 1:1 to
the netlink control plane / UAPI. The driver flow_offload API is
detached from UAPI and it is internal to drivers.

> The TC (well, flow_offload now I suppose) API should be kept narrow,
> not widened for things that can already be expressed adequately. 
> Your array of words inside a pedit action looks like a kind of loop
> unrolling but for data structures, which doesn't look sensible to
> me.

With one action that says "mangle an IPv6 at offset ip6 daddr field"
the driver has more global view on what is going on, rather than
having four actions to mangle four 32-bit words at some offset.

If this patch adds some loops here is because I did not want to make
too smart changes on the drivers.

The only reason I can find why mangling is restricted to 32-bits word
is tc pedit. The existing flow_offload API was modeled after tc
actions, which was exposing tc pedit implementation details to
hardware.

Please, allow for incremental updates on the flow_offload API to get
it better now. Later we'll have way more drivers it will become harder
to update this.

^ permalink raw reply

* [PATCH v3] ravb: implement MTU change while device is up
From: Ulrich Hecht @ 2019-08-20 19:01 UTC (permalink / raw)
  To: linux-renesas-soc
  Cc: netdev, davem, sergei.shtylyov, niklas.soderlund, wsa, horms,
	magnus.damm, geert, Ulrich Hecht

Uses the same method as various other drivers: shut the device down,
change the MTU, then bring it back up again.

Tested on Renesas D3 Draak and M3-W Salvator-X boards.

Signed-off-by: Ulrich Hecht <uli+renesas@fpond.eu>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
---

Hi!

This revision reverts the MTU change if re-opening the device fails.

CU
Uli


 drivers/net/ethernet/renesas/ravb_main.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index ef8f089..402bcec 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1810,12 +1810,24 @@ static int ravb_do_ioctl(struct net_device *ndev, struct ifreq *req, int cmd)
 
 static int ravb_change_mtu(struct net_device *ndev, int new_mtu)
 {
+	unsigned int old_mtu = ndev->mtu;
+
 	if (netif_running(ndev))
-		return -EBUSY;
+		ravb_close(ndev);
 
 	ndev->mtu = new_mtu;
 	netdev_update_features(ndev);
 
+	if (netif_running(ndev)) {
+		int err = ravb_open(ndev);
+
+		if (err) {
+			ndev->mtu = old_mtu;
+			netdev_update_features(ndev);
+			return err;
+		}
+	}
+
 	return 0;
 }
 
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH net-next v2] r8152: divide the tx and rx bottom functions
From: David Miller @ 2019-08-20 19:19 UTC (permalink / raw)
  To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel
In-Reply-To: <1394712342-15778-303-Taiwan-albertk@realtek.com>

From: Hayes Wang <hayeswang@realtek.com>
Date: Mon, 19 Aug 2019 14:40:36 +0800

> Move the tx bottom function from NAPI to a new tasklet. Then, for
> multi-cores, the bottom functions of tx and rx may be run at same
> time with different cores. This is used to improve performance.
> 
> On x86, Tx/Rx 943/943 Mbits/sec -> 945/944.
> For arm platform, Tx/Rx: 917/917 Mbits/sec -> 933/933.
> 
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>
> ---
> v2: add the performance number in the commit message.

Applied.

^ permalink raw reply

* Re: [PATCHv2 0/2] fix dev null pointer dereference when send pkg larger than mtu in collect_md mode
From: David Miller @ 2019-08-20 19:20 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, sbrivio, wenxu, ast, eric.dumazet
In-Reply-To: <20190819075327.32412-1-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Mon, 19 Aug 2019 15:53:25 +0800

> Subject: [PATCHv2 0/2] fix dev null pointer dereference when send pkg larger than mtu in collect_md mode

Please don't use the word "package" or the shorthand "pkg" when referring
to network packets.  Always use the full word "packets".

Please fix this up for your entire submission.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next 0/2] netfilter: payload mangling offload support
From: Jakub Kicinski @ 2019-08-20 19:22 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, jiri, vladbu
In-Reply-To: <20190820104807.13843-1-pablo@netfilter.org>

On Tue, 20 Aug 2019 12:48:05 +0200, Pablo Neira Ayuso wrote:
> Hi,
> 
> This patchset adds payload mangling offload support for Netfilter:
> 
> 1) Adapt existing drivers to allow for mangling up to four 32-bit words
>    with one single flow_rule action. Hence, once single action can be
>    used to mangle an IPv6 address.
> 
> 2) Add support for netfilter packet mangling.

Why pick 128b as a unit, because that's nftables' word size? :/

Reality is unless core coalesces _all_ consecutive rewrites drivers 
will have to do their own coalescing, anyway.

We suffered through enough haphazard "updates", I don't like this either.

^ permalink raw reply

* Re: [PATCH v2 net-next 0/1] Add BASE-T1 PHY support
From: David Miller @ 2019-08-20 19:22 UTC (permalink / raw)
  To: christian.herber; +Cc: andrew, f.fainelli, hkallweit1, netdev, linux-kernel
In-Reply-To: <20190819151940.27756-1-christian.herber@nxp.com>

From: Christian Herber <christian.herber@nxp.com>
Date: Mon, 19 Aug 2019 15:19:52 +0000

> v1 patchset can be found here: https://lkml.org/lkml/2019/8/15/626

Please expand and clarify your commit messages as requested by Heiner
in his feedback to v1.

^ permalink raw reply

* Re: [net PATCH] net/smc: make sure EPOLLOUT is raised
From: David Miller @ 2019-08-20 19:25 UTC (permalink / raw)
  To: jbaron; +Cc: netdev, edumazet, ubraun, kgraul
In-Reply-To: <1566239761-30252-1-git-send-email-jbaron@akamai.com>

From: Jason Baron <jbaron@akamai.com>
Date: Mon, 19 Aug 2019 14:36:01 -0400

> Currently, we are only explicitly setting SOCK_NOSPACE on a write timeout
> for non-blocking sockets. Epoll() edge-trigger mode relies on SOCK_NOSPACE
> being set when -EAGAIN is returned to ensure that EPOLLOUT is raised.
> Expand the setting of SOCK_NOSPACE to non-blocking sockets as well that can
> use SO_SNDTIMEO to adjust their write timeout. This mirrors the behavior
> that Eric Dumazet introduced for tcp sockets.
> 
> Signed-off-by: Jason Baron <jbaron@akamai.com>

Applied and queued up for -stable, thanks Jason.

^ permalink raw reply

* Re: [PATCH net-next,v2 2/6] PCI: hv: Add a Hyper-V PCI interface driver for software backchannel interface
From: David Miller @ 2019-08-20 19:29 UTC (permalink / raw)
  To: haiyangz
  Cc: sashal, saeedm, leon, eranbe, lorenzo.pieralisi, bhelgaas,
	linux-pci, linux-hyperv, netdev, kys, sthemmin, linux-kernel
In-Reply-To: <1566242976-108801-3-git-send-email-haiyangz@microsoft.com>

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Mon, 19 Aug 2019 19:30:47 +0000

> +static void __exit exit_hv_pci_intf(void)
> +{
> +	pr_info("unloaded\n");
> +}
> +
> +static int __init init_hv_pci_intf(void)
> +{
> +	pr_info("loaded\n");
> +

Clogging up the logs with useless messages like this is inappropriate.
Please remove these pr_info() calls.

Also, all of these symbols should probably be GPL exported.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox