Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v2 net] gve: Copy and paste bug in gve_get_stats()
From: Catherine Sullivan @ 2019-08-21 17:20 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Sagi Shahar, Jon Olson, David S. Miller, Willem de Bruijn,
	Luigi Rizzo, Chuhong Yuan, netdev, kernel-janitors
In-Reply-To: <20190820090739.GB1845@kadam>

On Tue, Aug 20, 2019 at 2:11 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> There is a copy and paste error so we have "rx" where "tx" was intended
> in the priv->tx[] array.
>
> Fixes: f5cedc84a30d ("gve: Add transmit and receive support")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> v2: fix a typo in the subject: buy -> bug (Thanks Walter Harms)
>
>  drivers/net/ethernet/google/gve/gve_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
> index 497298752381..aca95f64bde8 100644
> --- a/drivers/net/ethernet/google/gve/gve_main.c
> +++ b/drivers/net/ethernet/google/gve/gve_main.c
> @@ -50,7 +50,7 @@ static void gve_get_stats(struct net_device *dev, struct rtnl_link_stats64 *s)
>                                   u64_stats_fetch_begin(&priv->tx[ring].statss);
>                                 s->tx_packets += priv->tx[ring].pkt_done;
>                                 s->tx_bytes += priv->tx[ring].bytes_done;
> -                       } while (u64_stats_fetch_retry(&priv->rx[ring].statss,
> +                       } while (u64_stats_fetch_retry(&priv->tx[ring].statss,
>                                                        start));
>                 }
>         }
> --
> 2.20.1

Thanks!

Reviewed-by: Catherine Sullivan <csully@google.com>

^ permalink raw reply

* Re: [PATCH bpf-next v2 2/4] selftests/bpf: test_progs: remove global fail/success counts
From: Stanislav Fomichev @ 2019-08-21 17:11 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Stanislav Fomichev, netdev, bpf, davem, ast, Andrii Nakryiko
In-Reply-To: <5248b967-2887-2205-3e59-fc067e2ada33@iogearbox.net>

On 08/21, Daniel Borkmann wrote:
> On 8/19/19 9:17 PM, Stanislav Fomichev wrote:
> > Now that we have a global per-test/per-environment state, there
> > is no longer need to have global fail/success counters (and there
> > is no need to save/get the diff before/after the test).
> 
> Thanks for the improvements, just a small comment below, otherwise LGTM.
> 
> > Introduce QCHECK macro (suggested by Andrii) and covert existing tests
> > to it. QCHECK uses new test__fail() to record the failure.
> > 
> > Cc: Andrii Nakryiko <andriin@fb.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> [...]
> > @@ -96,17 +93,25 @@ extern struct ipv6_packet pkt_v6;
> >   #define _CHECK(condition, tag, duration, format...) ({			\
> >   	int __ret = !!(condition);					\
> >   	if (__ret) {							\
> > -		error_cnt++;						\
> > +		test__fail();						\
> >   		printf("%s:FAIL:%s ", __func__, tag);			\
> >   		printf(format);						\
> >   	} else {							\
> > -		pass_cnt++;						\
> >   		printf("%s:PASS:%s %d nsec\n",				\
> >   		       __func__, tag, duration);			\
> >   	}								\
> >   	__ret;								\
> >   })
> > +#define QCHECK(condition) ({						\
> > +	int __ret = !!(condition);					\
> > +	if (__ret) {							\
> > +		test__fail();						\
> > +		printf("%s:FAIL:%d ", __func__, __LINE__);		\
> > +	}								\
> > +	__ret;								\
> > +})
> 
> I know it's just a tiny nit but the name QCHECK() really doesn't tell me anything
> if I don't see its definition. Even just a CHECK_FAIL() might be 'better' and
> more aligned with the CHECK() and CHECK_ATTR() we have, at least I don't think
> many would automatically derive 'quiet' from the Q prefix [0].
CHECK_FAIL sounds good, will respin! Thanks!

>   [0] https://lore.kernel.org/bpf/CAEf4BzbUGiUZBWkTWe2=LfhkXYhQGndN9gR6VTZwfV3eytstUw@mail.gmail.com/
> 
> >   #define CHECK(condition, tag, format...) \
> >   	_CHECK(condition, tag, duration, format)
> >   #define CHECK_ATTR(condition, tag, format...) \
> > 
> 

^ permalink raw reply

* Re: [EXT] Re: [PATCH net-next 0/1] Add BASE-T1 PHY support
From: Heiner Kallweit @ 2019-08-21 17:09 UTC (permalink / raw)
  To: Christian Herber, davem@davemloft.net, Andrew Lunn,
	Florian Fainelli
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <AM6PR0402MB3798FCBF1EE592687B13A3C386AB0@AM6PR0402MB3798.eurprd04.prod.outlook.com>

On 20.08.2019 15:36, Christian Herber wrote:
> On 19.08.2019 21:07, Heiner Kallweit wrote:
>> Caution: EXT Email
>>
>> On 19.08.2019 08:32, Christian Herber wrote:
>>> On 16.08.2019 22:59, Heiner Kallweit wrote:
>>>> On 15.08.2019 17:32, Christian Herber wrote:
>>>>> This patch adds basic support for BASE-T1 PHYs in the framework.
>>>>> BASE-T1 PHYs main area of application are automotive and industrial.
>>>>> BASE-T1 is standardized in IEEE 802.3, namely
>>>>> - IEEE 802.3bw: 100BASE-T1
>>>>> - IEEE 802.3bp 1000BASE-T1
>>>>> - IEEE 802.3cg: 10BASE-T1L and 10BASE-T1S
>>>>>
>>>>> There are no products which contain BASE-T1 and consumer type PHYs like
>>>>> 1000BASE-T. However, devices exist which combine 100BASE-T1 and 1000BASE-T1
>>>>> PHYs with auto-negotiation.
>>>>
>>>> Is this meant in a way that *currently* there are no PHY's combining Base-T1
>>>> with normal Base-T modes? Or are there reasons why this isn't possible in
>>>> general? I'm asking because we have PHY's combining copper and fiber, and e.g.
>>>> the mentioned Aquantia PHY that combines NBase-T with 1000Base-T2.
>>>>
>>>>>
>>>>> The intention of this patch is to make use of the existing Clause 45 functions.
>>>>> BASE-T1 adds some additional registers e.g. for aneg control, which follow a
>>>>> similiar register layout as the existing devices. The bits which are used in
>>>>> BASE-T1 specific registers are the same as in basic registers, thus the
>>>>> existing functions can be resued, with get_aneg_ctrl() selecting the correct
>>>>> register address.
>>>>>
>>>> If Base-T1 can't be combined with other modes then at a first glance I see no
>>>> benefit in defining new registers e.g. for aneg control, and the standard ones
>>>> are unused. Why not using the standard registers? Can you shed some light on that?
>>>>
>>>> Are the new registers internally shadowed to the standard location?
>>>> That's something I've seen on other PHY's: one register appears in different
>>>> places in different devices.
>>>>
>>>>> The current version of ethtool has been prepared for 100/1000BASE-T1 and works
>>>>> with this patch. 10BASE-T1 needs to be added to ethtool.
>>>>>
>>>>> Christian Herber (1):
>>>>>     Added BASE-T1 PHY support to PHY Subsystem
>>>>>
>>>>>    drivers/net/phy/phy-c45.c    | 113 +++++++++++++++++++++++++++++++----
>>>>>    drivers/net/phy/phy-core.c   |   4 +-
>>>>>    include/uapi/linux/ethtool.h |   2 +
>>>>>    include/uapi/linux/mdio.h    |  21 +++++++
>>>>>    4 files changed, 129 insertions(+), 11 deletions(-)
>>>>>
>>>>
>>>> Heiner
>>>>
>>>
>>> Hi Heiner,
>>>
>>> I do not think the Aquantia part you are describing is publicly
>>> documented, so i cannot comment on that part.
>> Right, datasheet isn't publicly available. All I wanted to say with
>> mentioning this PHY: It's not a rare exception that a PHY combines
>> standard BaseT modes with "non-consumer" modes for special purposes.
>> One practical use case of this proprietary 1000Base-T2 mode is
>> re-using existing 2-pair cabling in aircrafts.
>>
>>> There are multiple reasons why e.g. xBASE-T1 plus 1000BASE-T is
>>> unlikely. First, the is no use-case known to me, where this would be
>>> required. Second, there is no way that you can do an auto-negotiation
>>> between the two, as these both have their own auto-neg defined (Clause
>>> 28/73 vs. Clause 98). Thirdly, if you would ever have a product with
>>> both, I believe it would just include two full PHYs and a way to select
>>> which flavor you want. Of course, this is the theory until proven
>>> otherwise, but to me it is sufficient to use a single driver.
>>>
>> I'm with you if you say it's unlikely. However your statement in the
>> commit message leaves the impression that there can't be such a device.
>> And that's a difference.
>>
>> Regarding "including two full PHYs":
>> This case we have already, there are PHYs combining different IP blocks,
>> each one supporting a specific mode (e.g. copper and fiber). There you
>> also have the case of different autoneg methods, clause 28 vs. clause 37.
>>
>>> The registers are different in the fields they include. It is just that
>>> the flags which are used by the Linux driver, like restarting auto-neg,
>>> are at the same position.
>>>
>> Good to know. Your commit description doesn't mention any specific PHY.
>> I suppose you have PHYs you'd like to operate with the genphy_c45 driver.
>> Could you give an example? And ideally, is a public datasheet available?
>>
>>> Christian
>>>
>>>
>> Heiner
>>
> 
> There are no public BASE-T1 devices on the market right now that use 
> Clause 45 standard registers. The first such products were developed 
> before the IEEE standard (BroadR-Reach) and used Clause 22 access (see 
> e.g. the support in the Kernel for TJA110x).
> 
> The most convenient way to test with a BASE-T1 device would be to use an 
> SFP (e.g. 
> https://technica-engineering.de/produkt/1000base-t1-sfp-module/). 
> Alternative source could be Goepel.
> 
> There are also a number of media-converters around where one could break 
> out the MDIO and connect to a processor. Of course, in all cases it 
> should be made sure that this is a Clause-45 device.
> 
> As all relevant parts are NDA-restricted, this is pretty much all the 
> information I can share.
> 
If no such device is on the market yet, then I'd suggest:
- wait for such a device to see whether genphy_c45 driver is really
  sufficient or whether other chip features require a dedicated driver
  anyway. In the latter case it may be better to add dedicated T1
  functions to phylib.
- add the missing 10BASE-T1L and 10BASE-T1S support meanwhile

The current patch set IMO is a little bit hacky. I'm not 100% happy
with the implicit assumption that there can't be devices supporting
T1 and classic BaseT modes or fiber modes.

Andrew: Do you have an opinion on that?

> Christian
> 
> 
Heiner

^ permalink raw reply

* Re: [PATCH net-next] net: systemport: use devm_platform_ioremap_resource() to simplify code
From: Florian Fainelli @ 2019-08-21 16:59 UTC (permalink / raw)
  To: YueHaibing, davem, opendmb, bcm-kernel-feedback-list; +Cc: linux-kernel, netdev
In-Reply-To: <20190821134613.23276-1-yuehaibing@huawei.com>

On 8/21/19 6:46 AM, YueHaibing wrote:
> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next] net: bcmgenet: use devm_platform_ioremap_resource() to simplify code
From: Florian Fainelli @ 2019-08-21 16:59 UTC (permalink / raw)
  To: YueHaibing, davem, opendmb, bcm-kernel-feedback-list; +Cc: linux-kernel, netdev
In-Reply-To: <20190821134131.57780-1-yuehaibing@huawei.com>

On 8/21/19 6:41 AM, YueHaibing wrote:
> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply

* Re: [PATCH net] ixgbe: fix double clean of tx descriptors with xdp
From: Alexander Duyck @ 2019-08-21 16:57 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: Björn Töpel, Netdev, LKML, bpf, David S. Miller,
	Magnus Karlsson, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jeff Kirsher, intel-wired-lan, Eelco Chaudron,
	William Tu
In-Reply-To: <f7d0f7a5-e664-8b72-99c7-63275aff4c18@samsung.com>

On Wed, Aug 21, 2019 at 9:22 AM Ilya Maximets <i.maximets@samsung.com> wrote:
>
> On 21.08.2019 4:17, Alexander Duyck wrote:
> > On Tue, Aug 20, 2019 at 8:58 AM Ilya Maximets <i.maximets@samsung.com> wrote:
> >>
> >> On 20.08.2019 18:35, Alexander Duyck wrote:
> >>> On Tue, Aug 20, 2019 at 8:18 AM Ilya Maximets <i.maximets@samsung.com> wrote:
> >>>>
> >>>> Tx code doesn't clear the descriptor status after cleaning.
> >>>> So, if the budget is larger than number of used elems in a ring, some
> >>>> descriptors will be accounted twice and xsk_umem_complete_tx will move
> >>>> prod_tail far beyond the prod_head breaking the comletion queue ring.
> >>>>
> >>>> Fix that by limiting the number of descriptors to clean by the number
> >>>> of used descriptors in the tx ring.
> >>>>
> >>>> Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support")
> >>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> >>>
> >>> I'm not sure this is the best way to go. My preference would be to
> >>> have something in the ring that would prevent us from racing which I
> >>> don't think this really addresses. I am pretty sure this code is safe
> >>> on x86 but I would be worried about weak ordered systems such as
> >>> PowerPC.
> >>>
> >>> It might make sense to look at adding the eop_desc logic like we have
> >>> in the regular path with a proper barrier before we write it and after
> >>> we read it. So for example we could hold of on writing the bytecount
> >>> value until the end of an iteration and call smp_wmb before we write
> >>> it. Then on the cleanup we could read it and if it is non-zero we take
> >>> an smp_rmb before proceeding further to process the Tx descriptor and
> >>> clearing the value. Otherwise this code is going to just keep popping
> >>> up with issues.
> >>
> >> But, unlike regular case, xdp zero-copy xmit and clean for particular
> >> tx ring always happens in the same NAPI context and even on the same
> >> CPU core.
> >>
> >> I saw the 'eop_desc' manipulations in regular case and yes, we could
> >> use 'next_to_watch' field just as a flag of descriptor existence,
> >> but it seems unnecessarily complicated. Am I missing something?
> >>
> >
> > So is it always in the same NAPI context?. I forgot, I was thinking
> > that somehow the socket could possibly make use of XDP for transmit.
>
> AF_XDP socket only triggers tx interrupt on ndo_xsk_async_xmit() which
> is used in zero-copy mode. Real xmit happens inside
> ixgbe_poll()
>  -> ixgbe_clean_xdp_tx_irq()
>     -> ixgbe_xmit_zc()
>
> This should be not possible to bound another XDP socket to the same netdev
> queue.
>
> It also possible to xmit frames in xdp_ring while performing XDP_TX/REDIRECT
> actions. REDIRECT could happen from different netdev with different NAPI
> context, but this operation is bound to specific CPU core and each core has
> its own xdp_ring.
>
> However, I'm not an expert here.
> Björn, maybe you could comment on this?
>
> >
> > As far as the logic to use I would be good with just using a value you
> > are already setting such as the bytecount value. All that would need
> > to happen is to guarantee that the value is cleared in the Tx path. So
> > if you clear the bytecount in ixgbe_clean_xdp_tx_irq you could
> > theoretically just use that as well to flag that a descriptor has been
> > populated and is ready to be cleaned. Assuming the logic about this
> > all being in the same NAPI context anyway you wouldn't need to mess
> > with the barrier stuff I mentioned before.
>
> Checking the number of used descs, i.e. next_to_use - next_to_clean,
> makes iteration in this function logically equal to the iteration inside
> 'ixgbe_xsk_clean_tx_ring()'. Do you think we need to change the later
> function too to follow same 'bytecount' approach? I don't like having
> two different ways to determine number of used descriptors in the same file.
>
> Best regards, Ilya Maximets.

As far as ixgbe_clean_xdp_tx_irq() vs ixgbe_xsk_clean_tx_ring(), I
would say that if you got rid of budget and framed things more like
how ixgbe_xsk_clean_tx_ring was framed with the ntc != ntu being
obvious I would prefer to see us go that route.

Really there is no need for budget in ixgbe_clean_xdp_tx_irq() if you
are going to be working with a static ntu value since you will only
ever process one iteration through the ring anyway. It might make more
sense if you just went through and got rid of budget and i, and
instead used ntc and ntu like what was done in
ixgbe_xsk_clean_tx_ring().

Thanks.

- Alex

^ permalink raw reply

* Re: [PATCH v2 net] Add genphy_c45_config_aneg() function to phy-c45.c
From: Heiner Kallweit @ 2019-08-21 16:55 UTC (permalink / raw)
  To: Andrew Lunn, Marco Hartmann
  Cc: f.fainelli@gmail.com, davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Christian Herber
In-Reply-To: <20190821153740.GB22091@lunn.ch>

On 21.08.2019 17:37, Andrew Lunn wrote:
> On Wed, Aug 21, 2019 at 11:00:46AM +0000, Marco Hartmann wrote:
>> Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
>> genphy_config_aneg") introduced a check that aborts phy_config_aneg()
>> if the phy is a C45 phy.
>> This causes phy_state_machine() to call phy_error() so that the phy
>> ends up in PHY_HALTED state.
>>
>> Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
>> (analogous to the C22 case) so that the state machine can run
>> correctly.
>>
>> genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
>> in drivers/net/phy/marvell10g.c, excluding vendor specific
>> configurations for 1000BaseT.
> 
>> +/**
>> + * genphy_c45_config_aneg - restart auto-negotiation or forced setup
>> + * @phydev: target phy_device struct
>> + *
>> + * Description: If auto-negotiation is enabled, we configure the
>> + *   advertising, and then restart auto-negotiation.  If it is not
>> + *   enabled, then we force a configuration.
>> + */
>> +int genphy_c45_config_aneg(struct phy_device *phydev)
>> +{
>> +	bool changed = false;
>> +	int ret;
>> +
>> +	if (phydev->autoneg == AUTONEG_DISABLE)
>> +		return genphy_c45_pma_setup_forced(phydev);
>> +
>> +	ret = genphy_c45_an_config_aneg(phydev);
>> +	if (ret < 0)
>> +		return ret;
>> +	if (ret > 0)
>> +		changed = true;
>> +
>> +	return genphy_c45_check_and_restart_aneg(phydev, changed);
>> +}
>> +EXPORT_SYMBOL_GPL(genphy_c45_config_aneg);
> 
> The vendor parts for 1000BaseT makes this interesting. Do we expect to
> see an C45 PHYs which don't support 1000BaseT? I think that
> unlikely. So all C45 PHYs are going to implement their own config_aneg
> callback so they can set their vendor registers for 1000BaseT.
> 
>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
>> index f3adea9ef400..74c4e15ebe52 100644
>> --- a/drivers/net/phy/phy.c
>> +++ b/drivers/net/phy/phy.c
>> @@ -507,7 +507,7 @@ static int phy_config_aneg(struct phy_device *phydev)
>>  	 * allowed to call genphy_config_aneg()
>>  	 */
>>  	if (phydev->is_c45 && !(phydev->c45_ids.devices_in_package & BIT(0)))
>> -		return -EOPNOTSUPP;
>> +		return genphy_c45_config_aneg(phydev);
>>  
>>  	return genphy_config_aneg(phydev);
> 
> So here we should be calling the driver config_aneg function. It can
> then call genphy_c45_config_aneg(phydev) to do the generic parts.
> 
> Heiner, what do you think?
> 
As you say, C45 doesn't cover 1000BaseT, therefore for this mode a
vendor-specific part is needed in the PHY driver always.
That's the reason why genphy_c45_an_config_aneg is meant to be used
within a PHY drivers config_aneg callback implementation, and why
we don't have a generic C45 config_aneg function yet.

Use case for the new function could be cases where 1000BaseT support
isn't needed, and it could serve as a fallback if there's no
dedicated PHY driver yet e.g. for a new chip.

> 	Andrew
> 
Heiner

^ permalink raw reply

* Re: [PATCH net] ixgbe: fix double clean of tx descriptors with xdp
From: Ilya Maximets @ 2019-08-21 16:21 UTC (permalink / raw)
  To: Alexander Duyck, Björn Töpel
  Cc: Netdev, LKML, bpf, David S. Miller, Magnus Karlsson,
	Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jeff Kirsher,
	intel-wired-lan, Eelco Chaudron, William Tu
In-Reply-To: <CAKgT0Uc27+ucd=a_sgTmv5g7_+ZTg1zK4isYJ0H7YWQj3d=Ejg@mail.gmail.com>

On 21.08.2019 4:17, Alexander Duyck wrote:
> On Tue, Aug 20, 2019 at 8:58 AM Ilya Maximets <i.maximets@samsung.com> wrote:
>>
>> On 20.08.2019 18:35, Alexander Duyck wrote:
>>> On Tue, Aug 20, 2019 at 8:18 AM Ilya Maximets <i.maximets@samsung.com> wrote:
>>>>
>>>> Tx code doesn't clear the descriptor status after cleaning.
>>>> So, if the budget is larger than number of used elems in a ring, some
>>>> descriptors will be accounted twice and xsk_umem_complete_tx will move
>>>> prod_tail far beyond the prod_head breaking the comletion queue ring.
>>>>
>>>> Fix that by limiting the number of descriptors to clean by the number
>>>> of used descriptors in the tx ring.
>>>>
>>>> Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support")
>>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>>
>>> I'm not sure this is the best way to go. My preference would be to
>>> have something in the ring that would prevent us from racing which I
>>> don't think this really addresses. I am pretty sure this code is safe
>>> on x86 but I would be worried about weak ordered systems such as
>>> PowerPC.
>>>
>>> It might make sense to look at adding the eop_desc logic like we have
>>> in the regular path with a proper barrier before we write it and after
>>> we read it. So for example we could hold of on writing the bytecount
>>> value until the end of an iteration and call smp_wmb before we write
>>> it. Then on the cleanup we could read it and if it is non-zero we take
>>> an smp_rmb before proceeding further to process the Tx descriptor and
>>> clearing the value. Otherwise this code is going to just keep popping
>>> up with issues.
>>
>> But, unlike regular case, xdp zero-copy xmit and clean for particular
>> tx ring always happens in the same NAPI context and even on the same
>> CPU core.
>>
>> I saw the 'eop_desc' manipulations in regular case and yes, we could
>> use 'next_to_watch' field just as a flag of descriptor existence,
>> but it seems unnecessarily complicated. Am I missing something?
>>
> 
> So is it always in the same NAPI context?. I forgot, I was thinking
> that somehow the socket could possibly make use of XDP for transmit.

AF_XDP socket only triggers tx interrupt on ndo_xsk_async_xmit() which
is used in zero-copy mode. Real xmit happens inside
ixgbe_poll()
 -> ixgbe_clean_xdp_tx_irq()
    -> ixgbe_xmit_zc()

This should be not possible to bound another XDP socket to the same netdev
queue.

It also possible to xmit frames in xdp_ring while performing XDP_TX/REDIRECT
actions. REDIRECT could happen from different netdev with different NAPI
context, but this operation is bound to specific CPU core and each core has
its own xdp_ring.

However, I'm not an expert here.
Björn, maybe you could comment on this?

> 
> As far as the logic to use I would be good with just using a value you
> are already setting such as the bytecount value. All that would need
> to happen is to guarantee that the value is cleared in the Tx path. So
> if you clear the bytecount in ixgbe_clean_xdp_tx_irq you could
> theoretically just use that as well to flag that a descriptor has been
> populated and is ready to be cleaned. Assuming the logic about this
> all being in the same NAPI context anyway you wouldn't need to mess
> with the barrier stuff I mentioned before.

Checking the number of used descs, i.e. next_to_use - next_to_clean,
makes iteration in this function logically equal to the iteration inside
'ixgbe_xsk_clean_tx_ring()'. Do you think we need to change the later
function too to follow same 'bytecount' approach? I don't like having
two different ways to determine number of used descriptors in the same file.

Best regards, Ilya Maximets.

^ permalink raw reply

* Re: Re: [PATCH v2 net-next 0/1] Add BASE-T1 PHY support
From: Christian Herber @ 2019-08-21 16:16 UTC (permalink / raw)
  To: David Miller, hkallweit1@gmail.com
  Cc: andrew@lunn.ch, f.fainelli@gmail.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20190820.122234.1290995026664280862.davem@davemloft.net>

On 20.08.2019 21:22, David Miller wrote:
> 
> From: Christian Herber <christian.herber@nxp.com>
> Date: Mon, 19 Aug 2019 15:19:52 +0000
> 
>> v1 patchset can be found here: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flkml.org%2Flkml%2F2019%2F8%2F15%2F626&amp;data=02%7C01%7Cchristian.herber%40nxp.com%7Ccbb5f329425240eda10a08d725a3c305%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C1%7C637019257604516613&amp;sdata=IdBZbqGgA0upPZZrBQSxPL%2Fh7Tn4BtYA4%2FfS6dZngWU%3D&amp;reserved=0
> 
> Please expand and clarify your commit messages as requested by Heiner
> in his feedback to v1.
> 

Hi David, Heiner,

could you please be specific what to add? The discussion was on various 
topics. Agree that it would probably help to add some more clarity, but 
it would be good if you can specify your expectation in this.

Christian

^ permalink raw reply

* Re: [PATCH 1/2] ipv4/icmp: fix rt dst dev null pointer dereference
From: Julian Anastasov @ 2019-08-21 16:11 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Stefano Brivio, wenxu, Alexei Starovoitov,
	David S . Miller, Eric Dumazet
In-Reply-To: <20190819075327.32412-2-liuhangbin@gmail.com>


	Hello,

On Mon, 19 Aug 2019, Hangbin Liu wrote:

> In __icmp_send() there is a possibility that the rt->dst.dev is NULL,
> e,g, with tunnel collect_md mode, which will cause kernel crash.
> Here is what the code path looks like, for GRE:
> 
> - ip6gre_tunnel_xmit
>   - ip6gre_xmit_ipv4
>     - __gre6_xmit
>       - ip6_tnl_xmit
>         - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
>     - icmp_send
>       - net = dev_net(rt->dst.dev); <-- here
> 
> The reason is __metadata_dst_init() init dst->dev to NULL by default.
> We could not fix it in __metadata_dst_init() as there is no dev supplied.
> On the other hand, the reason we need rt->dst.dev is to get the net.
> So we can just get it from skb->dev, just like commit 8d9336704521
> ("ipv6: make icmp6_send() robust against null skb->dev") did.
> 
> Fixes: c8b34e680a09 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit")
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
>  net/ipv4/icmp.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index 1510e951f451..5f00c9d18b02 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -582,7 +582,10 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
>  
>  	if (!rt)
>  		goto out;
> -	net = dev_net(rt->dst.dev);
> +
> +	if (!skb_in->dev)
> +		goto out;

	This looks wrong to me. IIRC, we should be able to send
ICMP errors from the OUTPUT hook where skb->dev is NULL. It is
true even for IPv6: net/ipv6/netfilter/ip6t_REJECT.c works for
NF_INET_LOCAL_OUT. nf_send_unreach6() and other IPv6 places have 
workarounds to avoid skb->dev being NULL but IPv4 and IPv6 are
different: IPv4 never required skb->dev to be non-NULL, so better
do not change that. Just check dst.dev to avoid crash.

> +	net = dev_net(skb_in->dev);
>  
>  	/*
>  	 *	Find the original header. It is expected to be valid, of course.
> -- 
> 2.19.2

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: Help needed - Kernel lockup while running ipsec
From: Florian Westphal @ 2019-08-21 16:11 UTC (permalink / raw)
  To: Vakul Garg; +Cc: Florian Westphal, netdev@vger.kernel.org
In-Reply-To: <DB7PR04MB46204E4A3EBD5DD665F492D38BAA0@DB7PR04MB4620.eurprd04.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 759 bytes --]

Vakul Garg <vakul.garg@nxp.com> wrote:
> > Policy refcount is decreasing properly on 4.19.
> > Same should be on the latest kernel too.
> 
> On kernel-4.14, I find dst_release() is getting called through xfrm_output_one().
> However since dst->__refcnt gets decremented to '1', 
> the call_rcu(&dst->rcu_head, dst_destroy_rcu) is not invoked. 
> 
> On kernel-4.19, dst->__refcnt gets decremented to '0', hence things fall in place and 
> dst_destroy_rcu() eventually executes.
> 
> Any further help/pointers for kernel-4.14 would be deeply appreciated.

Can you try getting rid of the pcpu dst cache?

I had a look at 4.14-stable and it at least lacks 2950278d2d04ff531.

I've attached an (untested) revert of the pcpu cache (its gone in 4.19
and onwards).



[-- Attachment #2: 0001-xfrm-policy-remove-pcpu-policy-cache.patch --]
[-- Type: text/x-diff, Size: 8993 bytes --]

From 058cb6719223d10dc57743dbf5c20424f118e7e7 Mon Sep 17 00:00:00 2001
From: Florian Westphal <fw@strlen.de>
Date: Mon, 25 Jun 2018 17:26:02 +0200
Subject: [PATCH 4.4.14.y] xfrm: policy: remove pcpu policy cache

commit e4db5b61c572475bbbcf63e3c8a2606bfccf2c9d upstream.

Kristian Evensen says:
  In a project I am involved in, we are running ipsec (Strongswan) on
  different mt7621-based routers. Each router is configured as an
  initiator and has around ~30 tunnels to different responders (running
  on misc. devices). Before the flow cache was removed (kernel 4.9), we
  got a combined throughput of around 70Mbit/s for all tunnels on one
  router. However, we recently switched to kernel 4.14 (4.14.48), and
  the total throughput is somewhere around 57Mbit/s (best-case). I.e., a
  drop of around 20%. Reverting the flow cache removal restores, as
  expected, performance levels to that of kernel 4.9.

When pcpu xdst exists, it has to be validated first before it can be
used.

A negative hit thus increases cost vs. no-cache.

As number of tunnels increases, hit rate decreases so this pcpu caching
isn't a viable strategy.

Furthermore, the xdst cache also needs to run with BH off, so when
removing this the bh disable/enable pairs can be removed too.

Kristian tested a 4.14.y backport of this change and reported
increased performance:

  In our tests, the throughput reduction has been reduced from around -20%
  to -5%. We also see that the overall throughput is independent of the
  number of tunnels, while before the throughput was reduced as the number
  of tunnels increased.

Reported-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/xfrm.h     |   1 -
 net/xfrm/xfrm_device.c |  10 ---
 net/xfrm/xfrm_policy.c | 138 +----------------------------------------
 net/xfrm/xfrm_state.c  |   5 +-
 4 files changed, 3 insertions(+), 151 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index db99efb2d1d0..bdf185ae93db 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -323,7 +323,6 @@ int xfrm_policy_register_afinfo(const struct xfrm_policy_afinfo *afinfo, int fam
 void xfrm_policy_unregister_afinfo(const struct xfrm_policy_afinfo *afinfo);
 void km_policy_notify(struct xfrm_policy *xp, int dir,
 		      const struct km_event *c);
-void xfrm_policy_cache_flush(void);
 void km_state_notify(struct xfrm_state *x, const struct km_event *c);
 
 struct xfrm_tmpl;
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 30e5746085b8..4e458fd9236a 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -153,12 +153,6 @@ static int xfrm_dev_register(struct net_device *dev)
 	return NOTIFY_DONE;
 }
 
-static int xfrm_dev_unregister(struct net_device *dev)
-{
-	xfrm_policy_cache_flush();
-	return NOTIFY_DONE;
-}
-
 static int xfrm_dev_feat_change(struct net_device *dev)
 {
 	if ((dev->features & NETIF_F_HW_ESP) && !dev->xfrmdev_ops)
@@ -178,7 +172,6 @@ static int xfrm_dev_down(struct net_device *dev)
 	if (dev->features & NETIF_F_HW_ESP)
 		xfrm_dev_state_flush(dev_net(dev), dev, true);
 
-	xfrm_policy_cache_flush();
 	return NOTIFY_DONE;
 }
 
@@ -190,9 +183,6 @@ static int xfrm_dev_event(struct notifier_block *this, unsigned long event, void
 	case NETDEV_REGISTER:
 		return xfrm_dev_register(dev);
 
-	case NETDEV_UNREGISTER:
-		return xfrm_dev_unregister(dev);
-
 	case NETDEV_FEAT_CHANGE:
 		return xfrm_dev_feat_change(dev);
 
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 70ec57b887f6..b5006a091fd6 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -45,8 +45,6 @@ struct xfrm_flo {
 	u8 flags;
 };
 
-static DEFINE_PER_CPU(struct xfrm_dst *, xfrm_last_dst);
-static struct work_struct *xfrm_pcpu_work __read_mostly;
 static DEFINE_SPINLOCK(xfrm_policy_afinfo_lock);
 static struct xfrm_policy_afinfo const __rcu *xfrm_policy_afinfo[AF_INET6 + 1]
 						__read_mostly;
@@ -1715,108 +1713,6 @@ static int xfrm_expand_policies(const struct flowi *fl, u16 family,
 
 }
 
-static void xfrm_last_dst_update(struct xfrm_dst *xdst, struct xfrm_dst *old)
-{
-	this_cpu_write(xfrm_last_dst, xdst);
-	if (old)
-		dst_release(&old->u.dst);
-}
-
-static void __xfrm_pcpu_work_fn(void)
-{
-	struct xfrm_dst *old;
-
-	old = this_cpu_read(xfrm_last_dst);
-	if (old && !xfrm_bundle_ok(old))
-		xfrm_last_dst_update(NULL, old);
-}
-
-static void xfrm_pcpu_work_fn(struct work_struct *work)
-{
-	local_bh_disable();
-	rcu_read_lock();
-	__xfrm_pcpu_work_fn();
-	rcu_read_unlock();
-	local_bh_enable();
-}
-
-void xfrm_policy_cache_flush(void)
-{
-	struct xfrm_dst *old;
-	bool found = 0;
-	int cpu;
-
-	might_sleep();
-
-	local_bh_disable();
-	rcu_read_lock();
-	for_each_possible_cpu(cpu) {
-		old = per_cpu(xfrm_last_dst, cpu);
-		if (old && !xfrm_bundle_ok(old)) {
-			if (smp_processor_id() == cpu) {
-				__xfrm_pcpu_work_fn();
-				continue;
-			}
-			found = true;
-			break;
-		}
-	}
-
-	rcu_read_unlock();
-	local_bh_enable();
-
-	if (!found)
-		return;
-
-	get_online_cpus();
-
-	for_each_possible_cpu(cpu) {
-		bool bundle_release;
-
-		rcu_read_lock();
-		old = per_cpu(xfrm_last_dst, cpu);
-		bundle_release = old && !xfrm_bundle_ok(old);
-		rcu_read_unlock();
-
-		if (!bundle_release)
-			continue;
-
-		if (cpu_online(cpu)) {
-			schedule_work_on(cpu, &xfrm_pcpu_work[cpu]);
-			continue;
-		}
-
-		rcu_read_lock();
-		old = per_cpu(xfrm_last_dst, cpu);
-		if (old && !xfrm_bundle_ok(old)) {
-			per_cpu(xfrm_last_dst, cpu) = NULL;
-			dst_release(&old->u.dst);
-		}
-		rcu_read_unlock();
-	}
-
-	put_online_cpus();
-}
-
-static bool xfrm_xdst_can_reuse(struct xfrm_dst *xdst,
-				struct xfrm_state * const xfrm[],
-				int num)
-{
-	const struct dst_entry *dst = &xdst->u.dst;
-	int i;
-
-	if (xdst->num_xfrms != num)
-		return false;
-
-	for (i = 0; i < num; i++) {
-		if (!dst || dst->xfrm != xfrm[i])
-			return false;
-		dst = dst->child;
-	}
-
-	return xfrm_bundle_ok(xdst);
-}
-
 static struct xfrm_dst *
 xfrm_resolve_and_create_bundle(struct xfrm_policy **pols, int num_pols,
 			       const struct flowi *fl, u16 family,
@@ -1824,7 +1720,7 @@ xfrm_resolve_and_create_bundle(struct xfrm_policy **pols, int num_pols,
 {
 	struct net *net = xp_net(pols[0]);
 	struct xfrm_state *xfrm[XFRM_MAX_DEPTH];
-	struct xfrm_dst *xdst, *old;
+	struct xfrm_dst *xdst;
 	struct dst_entry *dst;
 	int err;
 
@@ -1839,21 +1735,6 @@ xfrm_resolve_and_create_bundle(struct xfrm_policy **pols, int num_pols,
 		return ERR_PTR(err);
 	}
 
-	xdst = this_cpu_read(xfrm_last_dst);
-	if (xdst &&
-	    xdst->u.dst.dev == dst_orig->dev &&
-	    xdst->num_pols == num_pols &&
-	    memcmp(xdst->pols, pols,
-		   sizeof(struct xfrm_policy *) * num_pols) == 0 &&
-	    xfrm_xdst_can_reuse(xdst, xfrm, err)) {
-		dst_hold(&xdst->u.dst);
-		while (err > 0)
-			xfrm_state_put(xfrm[--err]);
-		return xdst;
-	}
-
-	old = xdst;
-
 	dst = xfrm_bundle_create(pols[0], xfrm, err, fl, dst_orig);
 	if (IS_ERR(dst)) {
 		XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTBUNDLEGENERROR);
@@ -1866,9 +1747,6 @@ xfrm_resolve_and_create_bundle(struct xfrm_policy **pols, int num_pols,
 	memcpy(xdst->pols, pols, sizeof(struct xfrm_policy *) * num_pols);
 	xdst->policy_genid = atomic_read(&pols[0]->genid);
 
-	atomic_set(&xdst->u.dst.__refcnt, 2);
-	xfrm_last_dst_update(xdst, old);
-
 	return xdst;
 }
 
@@ -2069,11 +1947,8 @@ xfrm_bundle_lookup(struct net *net, const struct flowi *fl, u16 family, u8 dir,
 	if (num_xfrms <= 0)
 		goto make_dummy_bundle;
 
-	local_bh_disable();
 	xdst = xfrm_resolve_and_create_bundle(pols, num_pols, fl, family,
 					      xflo->dst_orig);
-	local_bh_enable();
-
 	if (IS_ERR(xdst)) {
 		err = PTR_ERR(xdst);
 		if (err != -EAGAIN)
@@ -2160,11 +2035,9 @@ struct dst_entry *xfrm_lookup(struct net *net, struct dst_entry *dst_orig,
 				goto no_transform;
 			}
 
-			local_bh_disable();
 			xdst = xfrm_resolve_and_create_bundle(
 					pols, num_pols, fl,
 					family, dst_orig);
-			local_bh_enable();
 
 			if (IS_ERR(xdst)) {
 				xfrm_pols_put(pols, num_pols);
@@ -2992,15 +2865,6 @@ static struct pernet_operations __net_initdata xfrm_net_ops = {
 
 void __init xfrm_init(void)
 {
-	int i;
-
-	xfrm_pcpu_work = kmalloc_array(NR_CPUS, sizeof(*xfrm_pcpu_work),
-				       GFP_KERNEL);
-	BUG_ON(!xfrm_pcpu_work);
-
-	for (i = 0; i < NR_CPUS; i++)
-		INIT_WORK(&xfrm_pcpu_work[i], xfrm_pcpu_work_fn);
-
 	register_pernet_subsys(&xfrm_net_ops);
 	seqcount_init(&xfrm_policy_hash_generation);
 	xfrm_input_init();
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 0cd2bdf3b217..7c093de68780 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -735,10 +735,9 @@ int xfrm_state_flush(struct net *net, u8 proto, bool task_valid)
 	}
 out:
 	spin_unlock_bh(&net->xfrm.xfrm_state_lock);
-	if (cnt) {
+	if (cnt)
 		err = 0;
-		xfrm_policy_cache_flush();
-	}
+
 	return err;
 }
 EXPORT_SYMBOL(xfrm_state_flush);
-- 
2.21.0


^ permalink raw reply related

* Re: [GIT PULL] Keys: Set 4 - Key ACLs for 5.3
From: Mimi Zohar @ 2019-08-21 15:43 UTC (permalink / raw)
  To: David Howells
  Cc: Linus Torvalds, James Morris, keyrings, Netdev, linux-nfs, CIFS,
	linux-afs, linux-fsdevel, linux-integrity, LSM List,
	Linux List Kernel Mailing
In-Reply-To: <23498.1565962602@warthog.procyon.org.uk>

On Fri, 2019-08-16 at 14:36 +0100, David Howells wrote:
> Mimi Zohar <zohar@linux.ibm.com> wrote:
> 
> > Sorry for the delay.  An exception is needed for loading builtin keys
> > "KEY_ALLOC_BUILT_IN" onto a keyring that is not writable by userspace.
> >  The following works, but probably is not how David would handle the
> > exception.
> 
> I think the attached is the right way to fix it.
> 
> load_system_certificate_list(), for example, when it creates keys does this:
> 
> 	key = key_create_or_update(make_key_ref(builtin_trusted_keys, 1),
> 
> marking the keyring as "possessed" in make_key_ref().  This allows the
> possessor permits to be used - and that's the *only* way to use them for
> internal keyrings like this because you can't link to them and you can't join
> them.

In addition, as long as additional keys still can't be added or
existing keys updated by userspace on the .builtin_trusted_keys, then
this is fine.

> 
> David
> ---
> diff --git a/certs/system_keyring.c b/certs/system_keyring.c
> index 57be78b5fdfc..1f8f26f7bb05 100644
> --- a/certs/system_keyring.c
> +++ b/certs/system_keyring.c
> @@ -99,7 +99,7 @@ static __init int system_trusted_keyring_init(void)
>  	builtin_trusted_keys =
>  		keyring_alloc(".builtin_trusted_keys",
>  			      KUIDT_INIT(0), KGIDT_INIT(0), current_cred(),
> -			      &internal_key_acl, KEY_ALLOC_NOT_IN_QUOTA,
> +			      &internal_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA,
>  			      NULL, NULL);
>  	if (IS_ERR(builtin_trusted_keys))
>  		panic("Can't allocate builtin trusted keyring\n");
> diff --git a/security/keys/permission.c b/security/keys/permission.c
> index fc84d9ef6239..86efd3eaf083 100644
> --- a/security/keys/permission.c
> +++ b/security/keys/permission.c
> @@ -47,7 +47,7 @@ struct key_acl internal_keyring_acl = {
>  	.usage	= REFCOUNT_INIT(1),
>  	.nr_ace	= 2,
>  	.aces = {
> -		KEY_POSSESSOR_ACE(KEY_ACE_SEARCH),
> +		KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE),
>  		KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_SEARCH),
>  	}
>  };

Thanks, David.  The builtin keys are now being loaded.

Mimi


^ permalink raw reply

* Re: [PATCH v2 net] Add genphy_c45_config_aneg() function to phy-c45.c
From: Andrew Lunn @ 2019-08-21 15:37 UTC (permalink / raw)
  To: Marco Hartmann
  Cc: f.fainelli@gmail.com, hkallweit1@gmail.com, davem@davemloft.net,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christian Herber
In-Reply-To: <1566385208-23523-1-git-send-email-marco.hartmann@nxp.com>

On Wed, Aug 21, 2019 at 11:00:46AM +0000, Marco Hartmann wrote:
> Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
> genphy_config_aneg") introduced a check that aborts phy_config_aneg()
> if the phy is a C45 phy.
> This causes phy_state_machine() to call phy_error() so that the phy
> ends up in PHY_HALTED state.
> 
> Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
> (analogous to the C22 case) so that the state machine can run
> correctly.
> 
> genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
> in drivers/net/phy/marvell10g.c, excluding vendor specific
> configurations for 1000BaseT.

> +/**
> + * genphy_c45_config_aneg - restart auto-negotiation or forced setup
> + * @phydev: target phy_device struct
> + *
> + * Description: If auto-negotiation is enabled, we configure the
> + *   advertising, and then restart auto-negotiation.  If it is not
> + *   enabled, then we force a configuration.
> + */
> +int genphy_c45_config_aneg(struct phy_device *phydev)
> +{
> +	bool changed = false;
> +	int ret;
> +
> +	if (phydev->autoneg == AUTONEG_DISABLE)
> +		return genphy_c45_pma_setup_forced(phydev);
> +
> +	ret = genphy_c45_an_config_aneg(phydev);
> +	if (ret < 0)
> +		return ret;
> +	if (ret > 0)
> +		changed = true;
> +
> +	return genphy_c45_check_and_restart_aneg(phydev, changed);
> +}
> +EXPORT_SYMBOL_GPL(genphy_c45_config_aneg);

The vendor parts for 1000BaseT makes this interesting. Do we expect to
see an C45 PHYs which don't support 1000BaseT? I think that
unlikely. So all C45 PHYs are going to implement their own config_aneg
callback so they can set their vendor registers for 1000BaseT.

> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index f3adea9ef400..74c4e15ebe52 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -507,7 +507,7 @@ static int phy_config_aneg(struct phy_device *phydev)
>  	 * allowed to call genphy_config_aneg()
>  	 */
>  	if (phydev->is_c45 && !(phydev->c45_ids.devices_in_package & BIT(0)))
> -		return -EOPNOTSUPP;
> +		return genphy_c45_config_aneg(phydev);
>  
>  	return genphy_config_aneg(phydev);

So here we should be calling the driver config_aneg function. It can
then call genphy_c45_config_aneg(phydev) to do the generic parts.

Heiner, what do you think?

	Andrew

^ permalink raw reply

* Re: [PATCH] selftests: bpf: install files test_xdp_vlan.sh
From: Daniel Borkmann @ 2019-08-21 15:11 UTC (permalink / raw)
  To: Anders Roxell, shuah, ast, davem, jakub.kicinski, hawk,
	john.fastabend
  Cc: linux-kselftest, netdev, bpf, linux-kernel
In-Reply-To: <20190820134121.25728-1-anders.roxell@linaro.org>

On 8/20/19 3:41 PM, Anders Roxell wrote:
> When ./test_xdp_vlan_mode_generic.sh runs it complains that it can't
> find file test_xdp_vlan.sh.
> 
>   # selftests: bpf: test_xdp_vlan_mode_generic.sh
>   # ./test_xdp_vlan_mode_generic.sh: line 9: ./test_xdp_vlan.sh: No such
>   file or directory
> 
> Rework so that test_xdp_vlan.sh gets installed, added to the variable
> TEST_PROGS_EXTENDED.
> 
> Fixes: d35661fcf95d ("selftests/bpf: add wrapper scripts for test_xdp_vlan.sh")
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>

Applied, thanks!

^ permalink raw reply

* Re: [PATCH] selftests: bpf: add config fragment BPF_JIT
From: Daniel Borkmann @ 2019-08-21 15:11 UTC (permalink / raw)
  To: Anders Roxell, shuah, ast; +Cc: linux-kselftest, netdev, bpf, linux-kernel
In-Reply-To: <20190820134134.25818-1-anders.roxell@linaro.org>

On 8/20/19 3:41 PM, Anders Roxell wrote:
> When running test_kmod.sh the following shows up
> 
>   # sysctl cannot stat /proc/sys/net/core/bpf_jit_enable No such file or directory
>   cannot: stat_/proc/sys/net/core/bpf_jit_enable #
>   # sysctl cannot stat /proc/sys/net/core/bpf_jit_harden No such file or directory
>   cannot: stat_/proc/sys/net/core/bpf_jit_harden #
> 
> Rework to enable CONFIG_BPF_JIT to solve "No such file or directory"
> 
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>

Applied, thanks!

^ permalink raw reply

* Re: [PATCH net-next 1/2] net: flow_offload: mangle 128-bit packet field with one action
From: Edward Cree @ 2019-08-21 15:05 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, jakub.kicinski, jiri, vladbu
In-Reply-To: <20190820183533.ykh7mnurpmegxb27@salvia>

On 20/08/2019 19:35, Pablo Neira Ayuso wrote:
> With one action that says "mangle an IPv6 at offset ip6 daddr field"
> the driver has more global view on what is going on, rather than
> having four actions to mangle four 32-bit words at some offset.
But the action doesn't say that, it still says "mangle four 32-bit
 words", it's just that they're now contiguous.  The driver doesn't
 know whether that's an IPv6 address or just a bunch of fields that
 happened to be next to one another.
(Besides, the driver can't rely on that 'global view', because if
 the actions did come from the TC uAPI, they're still going to be
 single u32 mangles.)

> If this patch adds some loops here is because I did not want to make
> too smart changes on the drivers.
The thing is, the drivers are already looping over TC actions, so they
 already naturally support multiple pedits.  You don't gain any
 expressiveness by combining them into batches of four, meanwhile you
 make the API less orthogonal and more laborious to implement.

> Please, allow for incremental updates on the flow_offload API to get
> it better now. Later we'll have way more drivers it will become harder
> to update this.
I'm not opposed to making the API better.  I just don't believe that
 this patch series achieves that.

^ permalink raw reply

* Re: [PATCH net] net: cpsw: fix NULL pointer exception in the probe error path
From: Grygorii Strashko @ 2019-08-21 15:02 UTC (permalink / raw)
  To: Antoine Tenart, davem; +Cc: netdev, linux-omap, linux-kernel, maxime.chevallier
In-Reply-To: <20190821144123.22248-1-antoine.tenart@bootlin.com>



On 21/08/2019 17:41, Antoine Tenart wrote:
> In certain cases when the probe function fails the error path calls
> cpsw_remove_dt() before calling platform_set_drvdata(). This is an
> issue as cpsw_remove_dt() uses platform_get_drvdata() to retrieve the
> cpsw_common data and leds to a NULL pointer exception. This patches
> fixes it by calling platform_set_drvdata() earlier in the probe.
> 
> Fixes: 83a8471ba255 ("net: ethernet: ti: cpsw: refactor probe to group common hw initialization")
> Reported-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
> ---
>   drivers/net/ethernet/ti/cpsw.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 

Thank you.
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>

-- 
Best regards,
grygorii

^ permalink raw reply

* Re: [PATCH bpf-next v4] libbpf: add xsk_ring_prod__nb_free() function
From: Magnus Karlsson @ 2019-08-21 14:53 UTC (permalink / raw)
  To: Eelco Chaudron
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko
In-Reply-To: <CAJ8uoz2v_48F6BuMkG7RPUymjQ2XL4hdPbeZu2R6SoarHSP47A@mail.gmail.com>

On Wed, Aug 21, 2019 at 4:14 PM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
>
> On Wed, Aug 21, 2019 at 3:46 PM Eelco Chaudron <echaudro@redhat.com> wrote:
> >
> >
> >
> > On 21 Aug 2019, at 15:11, Magnus Karlsson wrote:
> >
> > > On Wed, Aug 14, 2019 at 3:51 PM Eelco Chaudron <echaudro@redhat.com>
> > > wrote:
> > >>
> > >> When an AF_XDP application received X packets, it does not mean X
> > >> frames can be stuffed into the producer ring. To make it easier for
> > >> AF_XDP applications this API allows them to check how many frames can
> > >> be added into the ring.
> > >>
> > >> The patch below looks like a name change only, but the xsk_prod__
> > >> prefix denotes that this API is exposed to be used by applications.
> > >>
> > >> Besides, if you set the nb value to the size of the ring, you will
> > >> get the exact amount of slots available, at the cost of performance
> > >> (you touch shared state for sure). nb is there to limit the
> > >> touching of the shared state.
> > >>
> > >> Also the example xdpsock application has been modified to use this
> > >> new API, so it's also able to process flows at a 1pps rate on veth
> > >> interfaces.

1 pps! That is not that impressive ;-).

> > > My apologies for the late reply and thank you for working on this. So
> > > what kind of performance difference do you see with your modified
> > > xdpsock application on a regular NIC for txpush and l2fwd? If there is
> > > basically no difference or it is faster, we can go ahead and accept
> > > this. But if the difference is large, we might consider to have two
> > > versions of txpush and l2fwd as the regular NICs do not need this. Or
> > > we optimize your code so that it becomes as fast as the previous
> > > version.
> >
> > For both operation modes, I ran 5 test with and without the changes
> > applied using an iexgb connecting to a XENA tester. The throughput
> > numbers were within the standard deviation, so no noticeable performance
> > gain or drop.
>
> Sounds good, but let me take your patches for a run on something
> faster, just to make sure we are CPU bound. Will get back.

I ran some experiments and with two cores (app on one, softirq on
another) there is no impact since the application core has cycles to
spare. But if you run it on a single core the drop is 1- 2% for l2fwd.
I think this is ok since your version is a better example and more
correct. Just note that your patch did not apply cleanly to bpf-next,
so please rebase it, resubmit and I will ack it.

Thanks: Magnus

> /Magnus
>
> > Let me know if this is enough, if not I can rebuild the setup and do
> > some more tests.
> >
> > > /Magnus
> > >
> > >> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> > >> ---
> > >>
> > >> v3 -> v4
> > >>   - Cleanedup commit message
> > >>   - Updated AF_XDP sample application to use this new API
> > >>
> > >> v2 -> v3
> > >>   - Removed cache by pass option
> > >>
> > >> v1 -> v2
> > >>   - Renamed xsk_ring_prod__free() to xsk_ring_prod__nb_free()
> > >>   - Add caching so it will only touch global state when needed
> > >>
> > >>  samples/bpf/xdpsock_user.c | 109
> > >> ++++++++++++++++++++++++++++---------
> > >>  tools/lib/bpf/xsk.h        |   4 +-
> > >>  2 files changed, 86 insertions(+), 27 deletions(-)
> > >>
> > >> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> > >> index 93eaaf7239b2..87115e233b54 100644
> > >> --- a/samples/bpf/xdpsock_user.c
> > >> +++ b/samples/bpf/xdpsock_user.c
> > >> @@ -461,9 +461,13 @@ static void kick_tx(struct xsk_socket_info *xsk)
> > >>
> > >>  static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk)
> > >>  {
> > >> -       u32 idx_cq = 0, idx_fq = 0;
> > >> -       unsigned int rcvd;
> > >> +       static u64 free_frames[NUM_FRAMES];
> > >> +       static size_t nr_free_frames;
> > >> +
> > >> +       u32 idx_cq = 0, idx_fq = 0, free_slots;
> > >> +       unsigned int rcvd, i;
> > >>         size_t ndescs;
> > >> +       int ret;
> > >>
> > >>         if (!xsk->outstanding_tx)
> > >>                 return;
> > >> @@ -474,27 +478,52 @@ static inline void complete_tx_l2fwd(struct
> > >> xsk_socket_info *xsk)
> > >>
> > >>         /* re-add completed Tx buffers */
> > >>         rcvd = xsk_ring_cons__peek(&xsk->umem->cq, ndescs, &idx_cq);
> > >> -       if (rcvd > 0) {
> > >> -               unsigned int i;
> > >> -               int ret;
> > >> +       if (!rcvd)
> > >> +               return;
> > >>
> > >> -               ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
> > >> &idx_fq);
> > >> -               while (ret != rcvd) {
> > >> -                       if (ret < 0)
> > >> -                               exit_with_error(-ret);
> > >> -                       ret = xsk_ring_prod__reserve(&xsk->umem->fq,
> > >> rcvd,
> > >> -                                                    &idx_fq);
> > >> -               }
> > >> -               for (i = 0; i < rcvd; i++)
> > >> +       /* When xsk_ring_cons__peek() for example returns that 5
> > >> packets
> > >> +        * have been received, it does not automatically mean that
> > >> +        * xsk_ring_prod__reserve() will have 5 slots available. You
> > >> will
> > >> +        * see this, for example, when using a veth interface due to
> > >> the
> > >> +        * RX_BATCH_SIZE used by the generic driver.
> > >> +        *
> > >> +        * In this example we store unused buffers and try to
> > >> re-stock
> > >> +        * them the next iteration.
> > >> +        */
> > >> +
> > >> +       free_slots = xsk_prod__nb_free(&xsk->umem->fq, rcvd +
> > >> nr_free_frames);
> > >> +       if (free_slots > rcvd + nr_free_frames)
> > >> +               free_slots = rcvd + nr_free_frames;
> > >> +
> > >> +       ret = xsk_ring_prod__reserve(&xsk->umem->fq, free_slots,
> > >> &idx_fq);
> > >> +       while (ret != free_slots) {
> > >> +               if (ret < 0)
> > >> +                       exit_with_error(-ret);
> > >> +               ret = xsk_ring_prod__reserve(&xsk->umem->fq,
> > >> free_slots,
> > >> +                                            &idx_fq);
> > >> +       }
> > >> +       for (i = 0; i < rcvd; i++) {
> > >> +               u64 addr = *xsk_ring_cons__comp_addr(&xsk->umem->cq,
> > >> idx_cq++);
> > >> +
> > >> +               if (i < free_slots)
> > >>                         *xsk_ring_prod__fill_addr(&xsk->umem->fq,
> > >> idx_fq++) =
> > >> -
> > >> *xsk_ring_cons__comp_addr(&xsk->umem->cq,
> > >> -                                                         idx_cq++);
> > >> +                               addr;
> > >> +               else
> > >> +                       free_frames[nr_free_frames++] = addr;
> > >> +       }
> > >>
> > >> -               xsk_ring_prod__submit(&xsk->umem->fq, rcvd);
> > >> -               xsk_ring_cons__release(&xsk->umem->cq, rcvd);
> > >> -               xsk->outstanding_tx -= rcvd;
> > >> -               xsk->tx_npkts += rcvd;
> > >> +       if (free_slots > rcvd) {
> > >> +               for (i = 0; i < (free_slots - rcvd); i++) {
> > >> +                       u64 addr = free_frames[--nr_free_frames];
> > >> +                       *xsk_ring_prod__fill_addr(&xsk->umem->fq,
> > >> idx_fq++) =
> > >> +                               addr;
> > >> +               }
> > >>         }
> > >> +
> > >> +       xsk_ring_prod__submit(&xsk->umem->fq, free_slots);
> > >> +       xsk_ring_cons__release(&xsk->umem->cq, rcvd);
> > >> +       xsk->outstanding_tx -= rcvd;
> > >> +       xsk->tx_npkts += rcvd;
> > >>  }
> > >>
> > >>  static inline void complete_tx_only(struct xsk_socket_info *xsk)
> > >> @@ -517,19 +546,37 @@ static inline void complete_tx_only(struct
> > >> xsk_socket_info *xsk)
> > >>
> > >>  static void rx_drop(struct xsk_socket_info *xsk)
> > >>  {
> > >> +       static u64 free_frames[NUM_FRAMES];
> > >> +       static size_t nr_free_frames;
> > >> +
> > >>         unsigned int rcvd, i;
> > >> -       u32 idx_rx = 0, idx_fq = 0;
> > >> +       u32 idx_rx = 0, idx_fq = 0, free_slots;
> > >>         int ret;
> > >>
> > >>         rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
> > >>         if (!rcvd)
> > >>                 return;
> > >>
> > >> -       ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
> > >> -       while (ret != rcvd) {
> > >> +       /* When xsk_ring_cons__peek() for example returns that 5
> > >> packets
> > >> +        * have been received, it does not automatically mean that
> > >> +        * xsk_ring_prod__reserve() will have 5 slots available. You
> > >> will
> > >> +        * see this, for example, when using a veth interface due to
> > >> the
> > >> +        * RX_BATCH_SIZE used by the generic driver.
> > >> +        *
> > >> +        * In this example we store unused buffers and try to
> > >> re-stock
> > >> +        * them the next iteration.
> > >> +        */
> > >> +
> > >> +       free_slots = xsk_prod__nb_free(&xsk->umem->fq, rcvd +
> > >> nr_free_frames);
> > >> +       if (free_slots > rcvd + nr_free_frames)
> > >> +               free_slots = rcvd + nr_free_frames;
> > >> +
> > >> +       ret = xsk_ring_prod__reserve(&xsk->umem->fq, free_slots,
> > >> &idx_fq);
> > >> +       while (ret != free_slots) {
> > >>                 if (ret < 0)
> > >>                         exit_with_error(-ret);
> > >> -               ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
> > >> &idx_fq);
> > >> +               ret = xsk_ring_prod__reserve(&xsk->umem->fq,
> > >> free_slots,
> > >> +                                            &idx_fq);
> > >>         }
> > >>
> > >>         for (i = 0; i < rcvd; i++) {
> > >> @@ -538,10 +585,22 @@ static void rx_drop(struct xsk_socket_info
> > >> *xsk)
> > >>                 char *pkt = xsk_umem__get_data(xsk->umem->buffer,
> > >> addr);
> > >>
> > >>                 hex_dump(pkt, len, addr);
> > >> -               *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx_fq++) =
> > >> addr;
> > >> +               if (i < free_slots)
> > >> +                       *xsk_ring_prod__fill_addr(&xsk->umem->fq,
> > >> idx_fq++) =
> > >> +                               addr;
> > >> +               else
> > >> +                       free_frames[nr_free_frames++] = addr;
> > >> +       }
> > >> +
> > >> +       if (free_slots > rcvd) {
> > >> +               for (i = 0; i < (free_slots - rcvd); i++) {
> > >> +                       u64 addr = free_frames[--nr_free_frames];
> > >> +                       *xsk_ring_prod__fill_addr(&xsk->umem->fq,
> > >> idx_fq++) =
> > >> +                               addr;
> > >> +               }
> > >>         }
> > >>
> > >> -       xsk_ring_prod__submit(&xsk->umem->fq, rcvd);
> > >> +       xsk_ring_prod__submit(&xsk->umem->fq, free_slots);
> > >>         xsk_ring_cons__release(&xsk->rx, rcvd);
> > >>         xsk->rx_npkts += rcvd;
> > >>  }
> > >> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
> > >> index 833a6e60d065..cae506ab3f3c 100644
> > >> --- a/tools/lib/bpf/xsk.h
> > >> +++ b/tools/lib/bpf/xsk.h
> > >> @@ -76,7 +76,7 @@ xsk_ring_cons__rx_desc(const struct xsk_ring_cons
> > >> *rx, __u32 idx)
> > >>         return &descs[idx & rx->mask];
> > >>  }
> > >>
> > >> -static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32
> > >> nb)
> > >> +static inline __u32 xsk_prod__nb_free(struct xsk_ring_prod *r, __u32
> > >> nb)
> > >>  {
> > >>         __u32 free_entries = r->cached_cons - r->cached_prod;
> > >>
> > >> @@ -110,7 +110,7 @@ static inline __u32 xsk_cons_nb_avail(struct
> > >> xsk_ring_cons *r, __u32 nb)
> > >>  static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod
> > >> *prod,
> > >>                                             size_t nb, __u32 *idx)
> > >>  {
> > >> -       if (xsk_prod_nb_free(prod, nb) < nb)
> > >> +       if (xsk_prod__nb_free(prod, nb) < nb)
> > >>                 return 0;
> > >>
> > >>         *idx = prod->cached_prod;
> > >> --
> > >> 2.18.1
> > >>

^ permalink raw reply

* Re: [PATCH 1/2] rtnetlink: gate MAC address with an LSM hook
From: Jeffrey Vander Stoep @ 2019-08-21 14:52 UTC (permalink / raw)
  To: Casey Schaufler; +Cc: netdev, LSM List, selinux
In-Reply-To: <e8f9e1ae-f9e4-987f-eb76-ebde8af4f4db@schaufler-ca.com>

On Wed, Aug 21, 2019 at 4:34 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 8/21/2019 6:45 AM, Jeff Vander Stoep wrote:
> > MAC addresses are often considered sensitive because they are
> > usually unique and can be used to identify/track a device or
> > user [1].
> >
> > The MAC address is accessible via the RTM_NEWLINK message type of a
> > netlink route socket[2]. Ideally we could grant/deny access to the
> > MAC address on a case-by-case basis without blocking the entire
> > RTM_NEWLINK message type which contains a lot of other useful
> > information. This can be achieved using a new LSM hook on the netlink
> > message receive path. Using this new hook, individual LSMs can select
> > which processes are allowed access to the real MAC, otherwise a
> > default value of zeros is returned. Offloading access control
> > decisions like this to an LSM is convenient because it preserves the
> > status quo for most Linux users while giving the various LSMs
> > flexibility to make finer grained decisions on access to sensitive
> > data based on policy.
>
> Is the MAC address the only bit of skb data that you might
> want to control with MAC? ( Sorry, couldn't help it ;) )
> Just musing, but might it make more sense to leave the core
> code unmodified and clear the MAC address in the skb inside
> the LSM? If you did it that way you could address any other
> data you want to control using the same hook. I would hate
> to see separate LSM hooks for each of several bits of data.
> On the other hand, I wouldn't want you to violate any layering
> policies in the networking code. That would be wrong.

I considered that approach, but having the LSM modifying the skb
like that without the networking code's knowledge did seem like a layering
violation, and fragile. It's also different than how LSM hooks typically
operate - generally they return decisions and the calling code is
responsible for taking appropriate action.

I'm currently only interested in the MAC, but this approach can be extended
to other fields. The selinux patch just splits up the read permission into two
levels, privileged and unprivileged which is consistent with how netlink
audit sockets are handled.



>
> >
> > [1] https://adamdrake.com/mac-addresses-udids-and-privacy.html
> > [2] Other access vectors like ioctl(SIOCGIFHWADDR) are already covered
> > by existing LSM hooks.
> >
> > Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
> > ---
> >  include/linux/lsm_hooks.h |  8 ++++++++
> >  include/linux/security.h  |  6 ++++++
> >  net/core/rtnetlink.c      | 12 ++++++++++--
> >  security/security.c       |  5 +++++
> >  4 files changed, 29 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> > index df1318d85f7d..dfcb2e11ff43 100644
> > --- a/include/linux/lsm_hooks.h
> > +++ b/include/linux/lsm_hooks.h
> > @@ -728,6 +728,12 @@
> >   *
> >   * Security hooks for Netlink messaging.
> >   *
> > + * @netlink_receive
> > + *   Check permissions on a netlink message field before populating it.
> > + *   @sk associated sock of task receiving the message.
> > + *   @skb contains the sk_buff structure for the netlink message.
> > + *   Return 0 if the data should be included in the message.
> > + *
> >   * @netlink_send:
> >   *   Save security information for a netlink message so that permission
> >   *   checking can be performed when the message is processed.  The security
> > @@ -1673,6 +1679,7 @@ union security_list_options {
> >       int (*sem_semop)(struct kern_ipc_perm *perm, struct sembuf *sops,
> >                               unsigned nsops, int alter);
> >
> > +     int (*netlink_receive)(struct sock *sk, struct sk_buff *skb);
> >       int (*netlink_send)(struct sock *sk, struct sk_buff *skb);
> >
> >       void (*d_instantiate)(struct dentry *dentry, struct inode *inode);
> > @@ -1952,6 +1959,7 @@ struct security_hook_heads {
> >       struct hlist_head sem_associate;
> >       struct hlist_head sem_semctl;
> >       struct hlist_head sem_semop;
> > +     struct hlist_head netlink_receive;
> >       struct hlist_head netlink_send;
> >       struct hlist_head d_instantiate;
> >       struct hlist_head getprocattr;
> > diff --git a/include/linux/security.h b/include/linux/security.h
> > index 5f7441abbf42..46b5af6de59e 100644
> > --- a/include/linux/security.h
> > +++ b/include/linux/security.h
> > @@ -382,6 +382,7 @@ int security_getprocattr(struct task_struct *p, const char *lsm, char *name,
> >                        char **value);
> >  int security_setprocattr(const char *lsm, const char *name, void *value,
> >                        size_t size);
> > +int security_netlink_receive(struct sock *sk, struct sk_buff *skb);
> >  int security_netlink_send(struct sock *sk, struct sk_buff *skb);
> >  int security_ismaclabel(const char *name);
> >  int security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen);
> > @@ -1162,6 +1163,11 @@ static inline int security_setprocattr(const char *lsm, char *name,
> >       return -EINVAL;
> >  }
> >
> > +static inline int security_netlink_receive(struct sock *sk, struct sk_buff *skb)
> > +{
> > +     return 0;
> > +}
> > +
> >  static inline int security_netlink_send(struct sock *sk, struct sk_buff *skb)
> >  {
> >       return 0;
> > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> > index 1ee6460f8275..7d69fcb8d22e 100644
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -1650,8 +1650,16 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
> >               goto nla_put_failure;
> >
> >       if (dev->addr_len) {
> > -             if (nla_put(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr) ||
> > -                 nla_put(skb, IFLA_BROADCAST, dev->addr_len, dev->broadcast))
> > +             if (skb->sk && security_netlink_receive(skb->sk, skb)) {
> > +                     if (!nla_reserve(skb, IFLA_ADDRESS, dev->addr_len))
> > +                             goto nla_put_failure;
> > +
> > +             } else {
> > +                     if (nla_put(skb, IFLA_ADDRESS, dev->addr_len,
> > +                                 dev->dev_addr))
> > +                             goto nla_put_failure;
> > +             }
> > +             if (nla_put(skb, IFLA_BROADCAST, dev->addr_len, dev->broadcast))
> >                       goto nla_put_failure;
> >       }
> >
> > diff --git a/security/security.c b/security/security.c
> > index 250ee2d76406..35c5929921b2 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -1861,6 +1861,11 @@ int security_setprocattr(const char *lsm, const char *name, void *value,
> >       return -EINVAL;
> >  }
> >
> > +int security_netlink_receive(struct sock *sk, struct sk_buff *skb)
> > +{
> > +     return call_int_hook(netlink_receive, 0, sk, skb);
> > +}
> > +
> >  int security_netlink_send(struct sock *sk, struct sk_buff *skb)
> >  {
> >       return call_int_hook(netlink_send, 0, sk, skb);
>

^ permalink raw reply

* [PATCH net-next v2 1/3] net: dsa: mt7530: Convert to PHYLINK API
From: René van Dorst @ 2019-08-21 14:45 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, John Crispin,
	linux-mips, Frank Wunderlich, René van Dorst
In-Reply-To: <20190821144547.15113-1-opensource@vdorst.com>

Convert mt7530 to PHYLINK API

Signed-off-by: René van Dorst <opensource@vdorst.com>

v1->v2:
* Refactor "unsupported" phy_interface part in
  mt7530_phylink_mac_validate() suggested by Russell King
* Report and return when phylink tries to use autoneg_inband in
  mt7530_phylink_mac_config() suggested by Russell King
* Refactor port 6 setup in mt7530_phylink_mac_config()
rfc->v1:
* Renamed P5_MODE_* to P5_INTF_SEL_*. fits the function more
* Convert if-statement for speed bits to a switch suggested by
  Daniel Santos
* Refactor flow_control pause bits and don't use state->link in
  mt7530_phylink_mac_config() suggested by Russell King
* Move MAC tx/rx en/disable to mt7530_phylink_mac_link_up/down()
  suggested by Russell King
* Always support PHY_INTERFACE_MODE_NA in mt7530_phylink_validate()
  suggested by Russell King
* Added phylink_set_port_modes() in mt7530_phylink_validate() suggested
  by Russell King
* Remove dev_err on the end of mt7530_phylink_mac_config() suggested by
  Russell King
---
 drivers/net/dsa/mt7530.c | 266 +++++++++++++++++++++++++++++----------
 drivers/net/dsa/mt7530.h |  32 +++--
 2 files changed, 211 insertions(+), 87 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index c48e29486b10..ecc13b57e619 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -13,7 +13,7 @@
 #include <linux/of_mdio.h>
 #include <linux/of_net.h>
 #include <linux/of_platform.h>
-#include <linux/phy.h>
+#include <linux/phylink.h>
 #include <linux/regmap.h>
 #include <linux/regulator/consumer.h>
 #include <linux/reset.h>
@@ -633,63 +633,6 @@ mt7530_get_sset_count(struct dsa_switch *ds, int port, int sset)
 	return ARRAY_SIZE(mt7530_mib);
 }
 
-static void mt7530_adjust_link(struct dsa_switch *ds, int port,
-			       struct phy_device *phydev)
-{
-	struct mt7530_priv *priv = ds->priv;
-
-	if (phy_is_pseudo_fixed_link(phydev)) {
-		dev_dbg(priv->dev, "phy-mode for master device = %x\n",
-			phydev->interface);
-
-		/* Setup TX circuit incluing relevant PAD and driving */
-		mt7530_pad_clk_setup(ds, phydev->interface);
-
-		if (priv->id == ID_MT7530) {
-			/* Setup RX circuit, relevant PAD and driving on the
-			 * host which must be placed after the setup on the
-			 * device side is all finished.
-			 */
-			mt7623_pad_clk_setup(ds);
-		}
-	} else {
-		u16 lcl_adv = 0, rmt_adv = 0;
-		u8 flowctrl;
-		u32 mcr = PMCR_USERP_LINK | PMCR_FORCE_MODE;
-
-		switch (phydev->speed) {
-		case SPEED_1000:
-			mcr |= PMCR_FORCE_SPEED_1000;
-			break;
-		case SPEED_100:
-			mcr |= PMCR_FORCE_SPEED_100;
-			break;
-		}
-
-		if (phydev->link)
-			mcr |= PMCR_FORCE_LNK;
-
-		if (phydev->duplex) {
-			mcr |= PMCR_FORCE_FDX;
-
-			if (phydev->pause)
-				rmt_adv = LPA_PAUSE_CAP;
-			if (phydev->asym_pause)
-				rmt_adv |= LPA_PAUSE_ASYM;
-
-			lcl_adv = linkmode_adv_to_lcl_adv_t(
-				phydev->advertising);
-			flowctrl = mii_resolve_flowctrl_fdx(lcl_adv, rmt_adv);
-
-			if (flowctrl & FLOW_CTRL_TX)
-				mcr |= PMCR_TX_FC_EN;
-			if (flowctrl & FLOW_CTRL_RX)
-				mcr |= PMCR_RX_FC_EN;
-		}
-		mt7530_write(priv, MT7530_PMCR_P(port), mcr);
-	}
-}
-
 static int
 mt7530_cpu_port_enable(struct mt7530_priv *priv,
 		       int port)
@@ -698,9 +641,6 @@ mt7530_cpu_port_enable(struct mt7530_priv *priv,
 	mt7530_write(priv, MT7530_PVC_P(port),
 		     PORT_SPEC_TAG);
 
-	/* Setup the MAC by default for the cpu port */
-	mt7530_write(priv, MT7530_PMCR_P(port), PMCR_CPUP_LINK);
-
 	/* Disable auto learning on the cpu port */
 	mt7530_set(priv, MT7530_PSC_P(port), SA_DIS);
 
@@ -731,9 +671,6 @@ mt7530_port_enable(struct dsa_switch *ds, int port,
 
 	mutex_lock(&priv->reg_mutex);
 
-	/* Setup the MAC for the user port */
-	mt7530_write(priv, MT7530_PMCR_P(port), PMCR_USERP_LINK);
-
 	/* Allow the user port gets connected to the cpu port and also
 	 * restore the port matrix if the port is the member of a certain
 	 * bridge.
@@ -742,7 +679,7 @@ mt7530_port_enable(struct dsa_switch *ds, int port,
 	priv->ports[port].enable = true;
 	mt7530_rmw(priv, MT7530_PCR_P(port), PCR_MATRIX_MASK,
 		   priv->ports[port].pm);
-	mt7530_port_set_status(priv, port, 1);
+	mt7530_port_set_status(priv, port, 0);
 
 	mutex_unlock(&priv->reg_mutex);
 
@@ -1232,10 +1169,10 @@ static int
 mt7530_setup(struct dsa_switch *ds)
 {
 	struct mt7530_priv *priv = ds->priv;
-	int ret, i;
-	u32 id, val;
-	struct device_node *dn;
 	struct mt7530_dummy_poll p;
+	struct device_node *dn;
+	u32 id, val;
+	int ret, i;
 
 	/* The parent node of master netdev which holds the common system
 	 * controller also is the container for two GMACs nodes representing
@@ -1305,6 +1242,8 @@ mt7530_setup(struct dsa_switch *ds)
 	val |= MHWTRAP_MANUAL;
 	mt7530_write(priv, MT7530_MHWTRAP, val);
 
+	priv->p6_interface = PHY_INTERFACE_MODE_NA;
+
 	/* Enable and reset MIB counters */
 	mt7530_mib_reset(ds);
 
@@ -1329,6 +1268,191 @@ mt7530_setup(struct dsa_switch *ds)
 	return 0;
 }
 
+static void mt7530_phylink_mac_config(struct dsa_switch *ds, int port,
+				      unsigned int mode,
+				      const struct phylink_link_state *state)
+{
+	struct mt7530_priv *priv = ds->priv;
+	u32 mcr_cur, mcr_new;
+
+	switch (port) {
+	case 0: /* Internal phy */
+	case 1:
+	case 2:
+	case 3:
+	case 4:
+		if (state->interface != PHY_INTERFACE_MODE_GMII)
+			return;
+		break;
+	/* case 5: Port 5 is not supported! */
+	case 6: /* 1st cpu port */
+		if (priv->p6_interface == state->interface)
+			break;
+
+		if (state->interface != PHY_INTERFACE_MODE_RGMII &&
+		    state->interface != PHY_INTERFACE_MODE_TRGMII)
+			return;
+
+		/* Setup TX circuit incluing relevant PAD and driving */
+		mt7530_pad_clk_setup(ds, state->interface);
+
+		if (priv->id == ID_MT7530) {
+			/* Setup RX circuit, relevant PAD and driving on the
+			 * host which must be placed after the setup on the
+			 * device side is all finished.
+			 */
+			mt7623_pad_clk_setup(ds);
+		}
+
+		priv->p6_interface = state->interface;
+		break;
+	default:
+		dev_err(ds->dev, "%s: unsupported port: %i\n", __func__, port);
+		return;
+	}
+
+	if (phylink_autoneg_inband(mode)) {
+		dev_err(ds->dev, "%s: in-band negotiation unsupported\n",
+			__func__);
+		return;
+	}
+
+	mcr_cur = mt7530_read(priv, MT7530_PMCR_P(port));
+	mcr_new = mcr_cur;
+	mcr_new &= ~(PMCR_FORCE_SPEED_1000 | PMCR_FORCE_SPEED_100 |
+		     PMCR_FORCE_FDX | PMCR_TX_FC_EN | PMCR_RX_FC_EN);
+	mcr_new |= PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | PMCR_BACKOFF_EN |
+		   PMCR_BACKPR_EN | PMCR_FORCE_MODE | PMCR_FORCE_LNK;
+
+	switch (state->speed) {
+	case SPEED_1000:
+		mcr_new |= PMCR_FORCE_SPEED_1000;
+		break;
+	case SPEED_100:
+		mcr_new |= PMCR_FORCE_SPEED_100;
+		break;
+	}
+	if (state->duplex == DUPLEX_FULL) {
+		mcr_new |= PMCR_FORCE_FDX;
+		if (state->pause & MLO_PAUSE_TX)
+			mcr_new |= PMCR_TX_FC_EN;
+		if (state->pause & MLO_PAUSE_RX)
+			mcr_new |= PMCR_RX_FC_EN;
+	}
+
+	if (mcr_new != mcr_cur)
+		mt7530_write(priv, MT7530_PMCR_P(port), mcr_new);
+}
+
+static void mt7530_phylink_mac_link_down(struct dsa_switch *ds, int port,
+					 unsigned int mode,
+					 phy_interface_t interface)
+{
+	struct mt7530_priv *priv = ds->priv;
+
+	mt7530_port_set_status(priv, port, 0);
+}
+
+static void mt7530_phylink_mac_link_up(struct dsa_switch *ds, int port,
+				       unsigned int mode,
+				       phy_interface_t interface,
+				       struct phy_device *phydev)
+{
+	struct mt7530_priv *priv = ds->priv;
+
+	mt7530_port_set_status(priv, port, 1);
+}
+
+static void mt7530_phylink_validate(struct dsa_switch *ds, int port,
+				    unsigned long *supported,
+				    struct phylink_link_state *state)
+{
+	__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
+
+	switch (port) {
+	case 0: /* Internal phy */
+	case 1:
+	case 2:
+	case 3:
+	case 4:
+		if (state->interface != PHY_INTERFACE_MODE_NA &&
+		    state->interface != PHY_INTERFACE_MODE_GMII)
+			goto unsupported;
+		break;
+	/* case 5: Port 5 not supported! */
+	case 6: /* 1st cpu port */
+		if (state->interface != PHY_INTERFACE_MODE_NA &&
+		    state->interface != PHY_INTERFACE_MODE_RGMII &&
+		    state->interface != PHY_INTERFACE_MODE_TRGMII)
+			goto unsupported;
+		break;
+	default:
+		dev_err(ds->dev, "%s: unsupported port: %i\n", __func__, port);
+unsupported:
+		linkmode_zero(supported);
+		return;
+	}
+
+	phylink_set_port_modes(mask);
+	phylink_set(mask, Autoneg);
+
+	if (state->interface != PHY_INTERFACE_MODE_TRGMII) {
+		phylink_set(mask, 10baseT_Half);
+		phylink_set(mask, 10baseT_Full);
+		phylink_set(mask, 100baseT_Half);
+		phylink_set(mask, 100baseT_Full);
+		phylink_set(mask, 1000baseT_Half);
+	}
+
+	phylink_set(mask, 1000baseT_Full);
+
+	phylink_set(mask, Pause);
+	phylink_set(mask, Asym_Pause);
+
+	linkmode_and(supported, supported, mask);
+	linkmode_and(state->advertising, state->advertising, mask);
+}
+
+static int
+mt7530_phylink_mac_link_state(struct dsa_switch *ds, int port,
+			      struct phylink_link_state *state)
+{
+	struct mt7530_priv *priv = ds->priv;
+	u32 pmsr;
+
+	if (port < 0 || port >= MT7530_NUM_PORTS)
+		return -EINVAL;
+
+	pmsr = mt7530_read(priv, MT7530_PMSR_P(port));
+
+	state->link = (pmsr & PMSR_LINK);
+	state->an_complete = state->link;
+	state->duplex = !!(pmsr & PMSR_DPX);
+
+	switch (pmsr & PMSR_SPEED_MASK) {
+	case PMSR_SPEED_10:
+		state->speed = SPEED_10;
+		break;
+	case PMSR_SPEED_100:
+		state->speed = SPEED_100;
+		break;
+	case PMSR_SPEED_1000:
+		state->speed = SPEED_1000;
+		break;
+	default:
+		state->speed = SPEED_UNKNOWN;
+		break;
+	}
+
+	state->pause &= ~(MLO_PAUSE_RX | MLO_PAUSE_TX);
+	if (pmsr & PMSR_RX_FC)
+		state->pause |= MLO_PAUSE_RX;
+	if (pmsr & PMSR_TX_FC)
+		state->pause |= MLO_PAUSE_TX;
+
+	return 1;
+}
+
 static const struct dsa_switch_ops mt7530_switch_ops = {
 	.get_tag_protocol	= mtk_get_tag_protocol,
 	.setup			= mt7530_setup,
@@ -1337,7 +1461,6 @@ static const struct dsa_switch_ops mt7530_switch_ops = {
 	.phy_write		= mt7530_phy_write,
 	.get_ethtool_stats	= mt7530_get_ethtool_stats,
 	.get_sset_count		= mt7530_get_sset_count,
-	.adjust_link		= mt7530_adjust_link,
 	.port_enable		= mt7530_port_enable,
 	.port_disable		= mt7530_port_disable,
 	.port_stp_state_set	= mt7530_stp_state_set,
@@ -1350,6 +1473,11 @@ static const struct dsa_switch_ops mt7530_switch_ops = {
 	.port_vlan_prepare	= mt7530_port_vlan_prepare,
 	.port_vlan_add		= mt7530_port_vlan_add,
 	.port_vlan_del		= mt7530_port_vlan_del,
+	.phylink_validate	= mt7530_phylink_validate,
+	.phylink_mac_link_state = mt7530_phylink_mac_link_state,
+	.phylink_mac_config	= mt7530_phylink_mac_config,
+	.phylink_mac_link_down	= mt7530_phylink_mac_link_down,
+	.phylink_mac_link_up	= mt7530_phylink_mac_link_up,
 };
 
 static const struct of_device_id mt7530_of_match[] = {
diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
index bfac90f48102..107dd04acede 100644
--- a/drivers/net/dsa/mt7530.h
+++ b/drivers/net/dsa/mt7530.h
@@ -198,26 +198,20 @@ enum mt7530_vlan_port_attr {
 #define  PMCR_FORCE_SPEED_100		BIT(2)
 #define  PMCR_FORCE_FDX			BIT(1)
 #define  PMCR_FORCE_LNK			BIT(0)
-#define  PMCR_COMMON_LINK		(PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | \
-					 PMCR_BACKOFF_EN | PMCR_BACKPR_EN | \
-					 PMCR_TX_EN | PMCR_RX_EN | \
-					 PMCR_TX_FC_EN | PMCR_RX_FC_EN)
-#define  PMCR_CPUP_LINK			(PMCR_COMMON_LINK | PMCR_FORCE_MODE | \
-					 PMCR_FORCE_SPEED_1000 | \
-					 PMCR_FORCE_FDX | \
-					 PMCR_FORCE_LNK)
-#define  PMCR_USERP_LINK		PMCR_COMMON_LINK
-#define  PMCR_FIXED_LINK		(PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | \
-					 PMCR_FORCE_MODE | PMCR_TX_EN | \
-					 PMCR_RX_EN | PMCR_BACKPR_EN | \
-					 PMCR_BACKOFF_EN | \
-					 PMCR_FORCE_SPEED_1000 | \
-					 PMCR_FORCE_FDX | \
-					 PMCR_FORCE_LNK)
-#define PMCR_FIXED_LINK_FC		(PMCR_FIXED_LINK | \
-					 PMCR_TX_FC_EN | PMCR_RX_FC_EN)
+#define  PMCR_SPEED_MASK		(PMCR_FORCE_SPEED_100 | \
+					 PMCR_FORCE_SPEED_1000)
 
 #define MT7530_PMSR_P(x)		(0x3008 + (x) * 0x100)
+#define  PMSR_EEE1G			BIT(7)
+#define  PMSR_EEE100M			BIT(6)
+#define  PMSR_RX_FC			BIT(5)
+#define  PMSR_TX_FC			BIT(4)
+#define  PMSR_SPEED_1000		BIT(3)
+#define  PMSR_SPEED_100			BIT(2)
+#define  PMSR_SPEED_10			0x00
+#define  PMSR_SPEED_MASK		(PMSR_SPEED_100 | PMSR_SPEED_1000)
+#define  PMSR_DPX			BIT(1)
+#define  PMSR_LINK			BIT(0)
 
 /* Register for MIB */
 #define MT7530_PORT_MIB_COUNTER(x)	(0x4000 + (x) * 0x100)
@@ -423,6 +417,7 @@ struct mt7530_port {
  * @ports:		Holding the state among ports
  * @reg_mutex:		The lock for protecting among process accessing
  *			registers
+ * @p6_interface	Holding the current port 6 interface
  */
 struct mt7530_priv {
 	struct device		*dev;
@@ -435,6 +430,7 @@ struct mt7530_priv {
 	struct gpio_desc	*reset;
 	unsigned int		id;
 	bool			mcm;
+	phy_interface_t		p6_interface;
 
 	struct mt7530_port	ports[MT7530_NUM_PORTS];
 	/* protect among processes for registers access*/
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v2 2/3] dt-bindings: net: dsa: mt7530: Add support for port 5
From: René van Dorst @ 2019-08-21 14:45 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, John Crispin,
	linux-mips, Frank Wunderlich, René van Dorst, devicetree,
	Rob Herring
In-Reply-To: <20190821144547.15113-1-opensource@vdorst.com>

MT7530 port 5 has many modes/configurations.
Update the documentation how to use port 5.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Cc: devicetree@vger.kernel.org
Cc: Rob Herring <robh@kernel.org>
v1->v2:
* Adding extra note about RGMII2 and gpio use.
rfc->v1:
* No change
---
 .../devicetree/bindings/net/dsa/mt7530.txt    | 218 ++++++++++++++++++
 1 file changed, 218 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/dsa/mt7530.txt b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
index 47aa205ee0bd..43993aae3f9c 100644
--- a/Documentation/devicetree/bindings/net/dsa/mt7530.txt
+++ b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
@@ -35,6 +35,42 @@ Required properties for the child nodes within ports container:
 - phy-mode: String, must be either "trgmii" or "rgmii" for port labeled
 	 "cpu".
 
+Port 5 of the switch is muxed between:
+1. GMAC5: GMAC5 can interface with another external MAC or PHY.
+2. PHY of port 0 or port 4: PHY interfaces with an external MAC like 2nd GMAC
+   of the SOC. Used in many setups where port 0/4 becomes the WAN port.
+   Note: On a MT7621 SOC with integrated switch: 2nd GMAC can only connected to
+	 GMAC5 when the gpios for RGMII2 (GPIO 22-33) are not used and not
+	 connected to external component!
+
+Port 5 modes/configurations:
+1. Port 5 is disabled and isolated: An external phy can interface to the 2nd
+   GMAC of the SOC.
+   In the case of a build-in MT7530 switch, port 5 shares the RGMII bus with 2nd
+   GMAC and an optional external phy. Mind the GPIO/pinctl settings of the SOC!
+2. Port 5 is muxed to PHY of port 0/4: Port 0/4 interfaces with 2nd GMAC.
+   It is a simple MAC to PHY interface, port 5 needs to be setup for xMII mode
+   and RGMII delay.
+3. Port 5 is muxed to GMAC5 and can interface to an external phy.
+   Port 5 becomes an extra switch port.
+   Only works on platform where external phy TX<->RX lines are swapped.
+   Like in the Ubiquiti ER-X-SFP.
+4. Port 5 is muxed to GMAC5 and interfaces with the 2nd GAMC as 2nd CPU port.
+   Currently a 2nd CPU port is not supported by DSA code.
+
+Depending on how the external PHY is wired:
+1. normal: The PHY can only connect to 2nd GMAC but not to the switch
+2. swapped: RGMII TX, RX are swapped; external phy interface with the switch as
+   a ethernet port. But can't interface to the 2nd GMAC.
+
+Based on the DT the port 5 mode is configured.
+
+Driver tries to lookup the phy-handle of the 2nd GMAC of the master device.
+When phy-handle matches PHY of port 0 or 4 then port 5 set-up as mode 2.
+phy-mode must be set, see also example 2 below!
+ * mt7621: phy-mode = "rgmii-txid";
+ * mt7623: phy-mode = "rgmii";
+
 See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
 required, optional properties and how the integrated switch subnodes must
 be specified.
@@ -94,3 +130,185 @@ Example:
 			};
 		};
 	};
+
+Example 2: MT7621: Port 4 is WAN port: 2nd GMAC -> Port 5 -> PHY port 4.
+
+&eth {
+	status = "okay";
+
+	gmac0: mac@0 {
+		compatible = "mediatek,eth-mac";
+		reg = <0>;
+		phy-mode = "rgmii";
+
+		fixed-link {
+			speed = <1000>;
+			full-duplex;
+			pause;
+		};
+	};
+
+	gmac1: mac@1 {
+		compatible = "mediatek,eth-mac";
+		reg = <1>;
+		phy-mode = "rgmii-txid";
+		phy-handle = <&phy4>;
+	};
+
+	mdio: mdio-bus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		/* Internal phy */
+		phy4: ethernet-phy@4 {
+			reg = <4>;
+		};
+
+		mt7530: switch@1f {
+			compatible = "mediatek,mt7621";
+			#address-cells = <1>;
+			#size-cells = <0>;
+			reg = <0x1f>;
+			pinctrl-names = "default";
+			mediatek,mcm;
+
+			resets = <&rstctrl 2>;
+			reset-names = "mcm";
+
+			ports {
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				port@0 {
+					reg = <0>;
+					label = "lan0";
+				};
+
+				port@1 {
+					reg = <1>;
+					label = "lan1";
+				};
+
+				port@2 {
+					reg = <2>;
+					label = "lan2";
+				};
+
+				port@3 {
+					reg = <3>;
+					label = "lan3";
+				};
+
+/* Commented out. Port 4 is handled by 2nd GMAC.
+				port@4 {
+					reg = <4>;
+					label = "lan4";
+				};
+*/
+
+				cpu_port0: port@6 {
+					reg = <6>;
+					label = "cpu";
+					ethernet = <&gmac0>;
+					phy-mode = "rgmii";
+
+					fixed-link {
+						speed = <1000>;
+						full-duplex;
+						pause;
+					};
+				};
+			};
+		};
+	};
+};
+
+Example 3: MT7621: Port 5 is connected to external PHY: Port 5 -> external PHY.
+
+&eth {
+	status = "okay";
+
+	gmac0: mac@0 {
+		compatible = "mediatek,eth-mac";
+		reg = <0>;
+		phy-mode = "rgmii";
+
+		fixed-link {
+			speed = <1000>;
+			full-duplex;
+			pause;
+		};
+	};
+
+	mdio: mdio-bus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		/* External phy */
+		ephy5: ethernet-phy@7 {
+			reg = <7>;
+		};
+
+		mt7530: switch@1f {
+			compatible = "mediatek,mt7621";
+			#address-cells = <1>;
+			#size-cells = <0>;
+			reg = <0x1f>;
+			pinctrl-names = "default";
+			mediatek,mcm;
+
+			resets = <&rstctrl 2>;
+			reset-names = "mcm";
+
+			ports {
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				port@0 {
+					reg = <0>;
+					label = "lan0";
+				};
+
+				port@1 {
+					reg = <1>;
+					label = "lan1";
+				};
+
+				port@2 {
+					reg = <2>;
+					label = "lan2";
+				};
+
+				port@3 {
+					reg = <3>;
+					label = "lan3";
+				};
+
+				port@4 {
+					reg = <4>;
+					label = "lan4";
+				};
+
+				port@5 {
+					reg = <5>;
+					label = "lan5";
+					phy-mode = "rgmii";
+					phy-handle = <&ephy5>;
+				};
+
+				cpu_port0: port@6 {
+					reg = <6>;
+					label = "cpu";
+					ethernet = <&gmac0>;
+					phy-mode = "rgmii";
+
+					fixed-link {
+						speed = <1000>;
+						full-duplex;
+						pause;
+					};
+				};
+			};
+		};
+	};
+};
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v2 3/3] net: dsa: mt7530: Add support for port 5
From: René van Dorst @ 2019-08-21 14:45 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, John Crispin,
	linux-mips, Frank Wunderlich, René van Dorst
In-Reply-To: <20190821144547.15113-1-opensource@vdorst.com>

Adding support for port 5.

Port 5 can muxed/interface to:
- internal 5th GMAC of the switch; can be used as 2nd CPU port or as
  extra port with an external phy for a 6th ethernet port.
- internal PHY of port 0 or 4; Used in most applications so that port 0
  or 4 is the WAN port and interfaces with the 2nd GMAC of the SOC.

Signed-off-by: René van Dorst <opensource@vdorst.com>

v1->v2:
* Also report 1000base-x support for port 5 suggested by Russell King
* Reorder variable declaraiant in reverse christmas tree suggested by
  Daved Miller
* Refactor phy-handle lookup for 2nd GMAC.
* Use of_mdio_parse_addr() instead of do it manualy suggested by
  Florian Fainelli
* Refactor port 5 setup in mt7530_phylink_mac_config()
rfc->v1:
* Removed unnecessary info print suggested by Andrew Lunn
* Added support for MII mode for port 5
---
 drivers/net/dsa/mt7530.c | 145 +++++++++++++++++++++++++++++++++++++--
 drivers/net/dsa/mt7530.h |  29 ++++++++
 2 files changed, 168 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index ecc13b57e619..43407a9b80ac 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -633,6 +633,77 @@ mt7530_get_sset_count(struct dsa_switch *ds, int port, int sset)
 	return ARRAY_SIZE(mt7530_mib);
 }
 
+static void mt7530_setup_port5(struct dsa_switch *ds, phy_interface_t interface)
+{
+	struct mt7530_priv *priv = ds->priv;
+	u8 tx_delay = 0;
+	int val;
+
+	mutex_lock(&priv->reg_mutex);
+
+	val = mt7530_read(priv, MT7530_MHWTRAP);
+
+	val |= MHWTRAP_MANUAL | MHWTRAP_P5_MAC_SEL | MHWTRAP_P5_DIS;
+	val &= ~MHWTRAP_P5_RGMII_MODE & ~MHWTRAP_PHY0_SEL;
+
+	switch (priv->p5_intf_sel) {
+	case P5_INTF_SEL_PHY_P0:
+		/* MT7530_P5_MODE_GPHY_P0: 2nd GMAC -> P5 -> P0 */
+		val |= MHWTRAP_PHY0_SEL;
+		/* fall through */
+	case P5_INTF_SEL_PHY_P4:
+		/* MT7530_P5_MODE_GPHY_P4: 2nd GMAC -> P5 -> P4 */
+		val &= ~MHWTRAP_P5_MAC_SEL & ~MHWTRAP_P5_DIS;
+
+		/* Setup the MAC by default for the cpu port */
+		mt7530_write(priv, MT7530_PMCR_P(5), 0x56300);
+		break;
+	case P5_INTF_SEL_GMAC5:
+		/* MT7530_P5_MODE_GMAC: P5 -> External phy or 2nd GMAC */
+		val &= ~MHWTRAP_P5_DIS;
+		break;
+	case P5_DISABLED:
+		interface = PHY_INTERFACE_MODE_NA;
+		break;
+	default:
+		dev_err(ds->dev, "Unsupported p5_intf_sel %d\n",
+			priv->p5_intf_sel);
+		goto unlock_exit;
+	}
+
+	/* Setup RGMII settings */
+	if (phy_interface_mode_is_rgmii(interface)) {
+		val |= MHWTRAP_P5_RGMII_MODE;
+
+		/* P5 RGMII RX Clock Control: delay setting for 1000M */
+		mt7530_write(priv, MT7530_P5RGMIIRXCR, CSR_RGMII_EDGE_ALIGN);
+
+		/* Don't set delay in DSA mode */
+		if (!dsa_is_dsa_port(priv->ds, 5) &&
+		    (interface == PHY_INTERFACE_MODE_RGMII_TXID ||
+		     interface == PHY_INTERFACE_MODE_RGMII_ID))
+			tx_delay = 4; /* n * 0.5 ns */
+
+		/* P5 RGMII TX Clock Control: delay x */
+		mt7530_write(priv, MT7530_P5RGMIITXCR,
+			     CSR_RGMII_TXC_CFG(0x10 + tx_delay));
+
+		/* reduce P5 RGMII Tx driving, 8mA */
+		mt7530_write(priv, MT7530_IO_DRV_CR,
+			     P5_IO_CLK_DRV(1) | P5_IO_DATA_DRV(1));
+	}
+
+	mt7530_write(priv, MT7530_MHWTRAP, val);
+
+	dev_info(ds->dev, "Setup P5, HWTRAP=0x%x, intf_sel=%s, phy-mode=%s\n",
+		 val, p5_intf_modes(priv->p5_intf_sel), phy_modes(interface));
+
+	priv->p5_interface = interface;
+
+unlock_exit:
+	mutex_unlock(&priv->reg_mutex);
+}
+
 static int
 mt7530_cpu_port_enable(struct mt7530_priv *priv,
 		       int port)
@@ -1169,7 +1240,10 @@ static int
 mt7530_setup(struct dsa_switch *ds)
 {
 	struct mt7530_priv *priv = ds->priv;
+	struct device_node *phy_node;
+	struct device_node *mac_np;
 	struct mt7530_dummy_poll p;
+	phy_interface_t interface;
 	struct device_node *dn;
 	u32 id, val;
 	int ret, i;
@@ -1260,6 +1334,40 @@ mt7530_setup(struct dsa_switch *ds)
 			mt7530_port_disable(ds, i);
 	}
 
+	/* Setup port 5 */
+	priv->p5_intf_sel = P5_DISABLED;
+	interface = PHY_INTERFACE_MODE_NA;
+
+	if (!dsa_is_unused_port(ds, 5)) {
+		priv->p5_intf_sel = P5_INTF_SEL_GMAC5;
+		interface = of_get_phy_mode(ds->ports[5].dn);
+	} else {
+		/* Scan the ethernet nodes. look for GMAC1, lookup used phy */
+		for_each_child_of_node(dn, mac_np) {
+			if (!of_device_is_compatible(mac_np,
+						     "mediatek,eth-mac"))
+				continue;
+
+			ret = of_property_read_u32(mac_np, "reg", &id);
+			if (ret < 0 || id != 1)
+				continue;
+
+			phy_node = of_parse_phandle(mac_np, "phy-handle", 0);
+			if (phy_node->parent == priv->dev->of_node->parent) {
+				interface = of_get_phy_mode(mac_np);
+				id = of_mdio_parse_addr(ds->dev, phy_node);
+				if (id == 0)
+					priv->p5_intf_sel = P5_INTF_SEL_PHY_P0;
+				if (id == 4)
+					priv->p5_intf_sel = P5_INTF_SEL_PHY_P4;
+			}
+			of_node_put(phy_node);
+			break;
+		}
+	}
+
+	mt7530_setup_port5(ds, interface);
+
 	/* Flush the FDB table */
 	ret = mt7530_fdb_cmd(priv, MT7530_FDB_FLUSH, NULL);
 	if (ret < 0)
@@ -1284,7 +1392,16 @@ static void mt7530_phylink_mac_config(struct dsa_switch *ds, int port,
 		if (state->interface != PHY_INTERFACE_MODE_GMII)
 			return;
 		break;
-	/* case 5: Port 5 is not supported! */
+	case 5: /* 2nd cpu port with phy of port 0 or 4 / external phy */
+		if (priv->p5_interface == state->interface)
+			break;
+		if (!phy_interface_mode_is_rgmii(state->interface) &&
+		    state->interface != PHY_INTERFACE_MODE_MII &&
+		    state->interface != PHY_INTERFACE_MODE_GMII)
+			return;
+
+		mt7530_setup_port5(ds, state->interface);
+		break;
 	case 6: /* 1st cpu port */
 		if (priv->p6_interface == state->interface)
 			break;
@@ -1324,6 +1441,10 @@ static void mt7530_phylink_mac_config(struct dsa_switch *ds, int port,
 	mcr_new |= PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | PMCR_BACKOFF_EN |
 		   PMCR_BACKPR_EN | PMCR_FORCE_MODE | PMCR_FORCE_LNK;
 
+	/* Are we connected to external phy */
+	if (port == 5 && dsa_is_user_port(ds, 5))
+		mcr_new |= PMCR_EXT_PHY;
+
 	switch (state->speed) {
 	case SPEED_1000:
 		mcr_new |= PMCR_FORCE_SPEED_1000;
@@ -1379,7 +1500,13 @@ static void mt7530_phylink_validate(struct dsa_switch *ds, int port,
 		    state->interface != PHY_INTERFACE_MODE_GMII)
 			goto unsupported;
 		break;
-	/* case 5: Port 5 not supported! */
+	case 5: /* 2nd cpu port with phy of port 0 or 4 / external phy */
+		if (state->interface != PHY_INTERFACE_MODE_NA &&
+		    !phy_interface_mode_is_rgmii(state->interface) &&
+		    state->interface != PHY_INTERFACE_MODE_MII &&
+		    state->interface != PHY_INTERFACE_MODE_GMII)
+			goto unsupported;
+		break;
 	case 6: /* 1st cpu port */
 		if (state->interface != PHY_INTERFACE_MODE_NA &&
 		    state->interface != PHY_INTERFACE_MODE_RGMII &&
@@ -1396,15 +1523,21 @@ static void mt7530_phylink_validate(struct dsa_switch *ds, int port,
 	phylink_set_port_modes(mask);
 	phylink_set(mask, Autoneg);
 
-	if (state->interface != PHY_INTERFACE_MODE_TRGMII) {
+	if (state->interface == PHY_INTERFACE_MODE_TRGMII) {
+		phylink_set(mask, 1000baseT_Full);
+	} else {
 		phylink_set(mask, 10baseT_Half);
 		phylink_set(mask, 10baseT_Full);
 		phylink_set(mask, 100baseT_Half);
 		phylink_set(mask, 100baseT_Full);
-		phylink_set(mask, 1000baseT_Half);
-	}
 
-	phylink_set(mask, 1000baseT_Full);
+		if (state->interface != PHY_INTERFACE_MODE_MII) {
+			phylink_set(mask, 1000baseT_Half);
+			phylink_set(mask, 1000baseT_Full);
+			if (port == 5)
+				phylink_set(mask, 1000baseX_Full);
+		}
+	}
 
 	phylink_set(mask, Pause);
 	phylink_set(mask, Asym_Pause);
diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
index 107dd04acede..ccb9da8cad0d 100644
--- a/drivers/net/dsa/mt7530.h
+++ b/drivers/net/dsa/mt7530.h
@@ -186,6 +186,7 @@ enum mt7530_vlan_port_attr {
 /* Register for port MAC control register */
 #define MT7530_PMCR_P(x)		(0x3000 + ((x) * 0x100))
 #define  PMCR_IFG_XMIT(x)		(((x) & 0x3) << 18)
+#define  PMCR_EXT_PHY			BIT(17)
 #define  PMCR_MAC_MODE			BIT(16)
 #define  PMCR_FORCE_MODE		BIT(15)
 #define  PMCR_TX_EN			BIT(14)
@@ -245,6 +246,7 @@ enum mt7530_vlan_port_attr {
 
 /* Register for hw trap modification */
 #define MT7530_MHWTRAP			0x7804
+#define  MHWTRAP_PHY0_SEL		BIT(20)
 #define  MHWTRAP_MANUAL			BIT(16)
 #define  MHWTRAP_P5_MAC_SEL		BIT(13)
 #define  MHWTRAP_P6_DIS			BIT(8)
@@ -402,6 +404,30 @@ struct mt7530_port {
 	u16 pvid;
 };
 
+/* Port 5 interface select definitions */
+enum p5_interface_select {
+	P5_DISABLED = 0,
+	P5_INTF_SEL_PHY_P0,
+	P5_INTF_SEL_PHY_P4,
+	P5_INTF_SEL_GMAC5,
+};
+
+static const char *p5_intf_modes(unsigned int p5_interface)
+{
+	switch (p5_interface) {
+	case P5_DISABLED:
+		return "DISABLED";
+	case P5_INTF_SEL_PHY_P0:
+		return "PHY P0";
+	case P5_INTF_SEL_PHY_P4:
+		return "PHY P4";
+	case P5_INTF_SEL_GMAC5:
+		return "GMAC5";
+	default:
+		return "unknown";
+	}
+}
+
 /* struct mt7530_priv -	This is the main data structure for holding the state
  *			of the driver
  * @dev:		The device pointer
@@ -418,6 +444,7 @@ struct mt7530_port {
  * @reg_mutex:		The lock for protecting among process accessing
  *			registers
  * @p6_interface	Holding the current port 6 interface
+ * @p5_intf_sel:	Holding the current port 5 interface select
  */
 struct mt7530_priv {
 	struct device		*dev;
@@ -431,6 +458,8 @@ struct mt7530_priv {
 	unsigned int		id;
 	bool			mcm;
 	phy_interface_t		p6_interface;
+	phy_interface_t		p5_interface;
+	unsigned int		p5_intf_sel;
 
 	struct mt7530_port	ports[MT7530_NUM_PORTS];
 	/* protect among processes for registers access*/
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v2 0/3] net: dsa: mt7530: Convert to PHYLINK and add support for port 5
From: René van Dorst @ 2019-08-21 14:45 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, John Crispin,
	linux-mips, Frank Wunderlich, René van Dorst

1. net: dsa: mt7530: Convert to PHYLINK API
   This patch converts mt7530 to PHYLINK API.
2. dt-bindings: net: dsa: mt7530: Add support for port 5
3. net: dsa: mt7530: Add support for port 5
   These 2 patches adding support for port 5 of the switch.

v1->v2:
 * Mostly phylink improvements after review.
rfc -> v1:
 * Mostly phylink improvements after review.
 * Drop phy isolation patches. Adds no value for now.
René van Dorst (3):
  net: dsa: mt7530: Convert to PHYLINK API
  dt-bindings: net: dsa: mt7530: Add support for port 5
  net: dsa: mt7530: Add support for port 5

 .../devicetree/bindings/net/dsa/mt7530.txt    | 218 ++++++++++
 drivers/net/dsa/mt7530.c                      | 371 +++++++++++++++---
 drivers/net/dsa/mt7530.h                      |  61 ++-
 3 files changed, 577 insertions(+), 73 deletions(-)

-- 
2.20.1


^ permalink raw reply

* [PATCH net] net: cpsw: fix NULL pointer exception in the probe error path
From: Antoine Tenart @ 2019-08-21 14:41 UTC (permalink / raw)
  To: davem, grygorii.strashko
  Cc: Antoine Tenart, netdev, linux-omap, linux-kernel,
	maxime.chevallier

In certain cases when the probe function fails the error path calls
cpsw_remove_dt() before calling platform_set_drvdata(). This is an
issue as cpsw_remove_dt() uses platform_get_drvdata() to retrieve the
cpsw_common data and leds to a NULL pointer exception. This patches
fixes it by calling platform_set_drvdata() earlier in the probe.

Fixes: 83a8471ba255 ("net: ethernet: ti: cpsw: refactor probe to group common hw initialization")
Reported-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
---
 drivers/net/ethernet/ti/cpsw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 32a89744972d..a46b8b2e44e1 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -2775,6 +2775,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	if (!cpsw)
 		return -ENOMEM;
 
+	platform_set_drvdata(pdev, cpsw);
 	cpsw->dev = dev;
 
 	mode = devm_gpiod_get_array_optional(dev, "mode", GPIOD_OUT_LOW);
@@ -2879,7 +2880,6 @@ static int cpsw_probe(struct platform_device *pdev)
 		goto clean_cpts;
 	}
 
-	platform_set_drvdata(pdev, cpsw);
 	priv = netdev_priv(ndev);
 	priv->cpsw = cpsw;
 	priv->ndev = ndev;
-- 
2.21.0


^ permalink raw reply related

* [PATCH net-next v2 3/3] dt-bindings: net: ethernet: Update mt7622 docs and dts to reflect the new phylink API
From: René van Dorst @ 2019-08-21 14:43 UTC (permalink / raw)
  To: John Crispin, Sean Wang, Nelson Chang, David S . Miller,
	Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, linux-mips,
	Frank Wunderlich, Stefan Roese, René van Dorst, devicetree,
	Rob Herring
In-Reply-To: <20190821144336.9259-1-opensource@vdorst.com>

This patch the removes the recently added mediatek,physpeed property.
Use the fixed-link property speed = <2500> to set the phy in 2.5Gbit.
See mt7622-bananapi-bpi-r64.dts for a working example.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Cc: devicetree@vger.kernel.org
Cc: Rob Herring <robh@kernel.org>
--
v1->v2:
* SGMII port only support BASE-X at 2.5Gbit.
---
 .../arm/mediatek/mediatek,sgmiisys.txt        |  2 --
 .../dts/mediatek/mt7622-bananapi-bpi-r64.dts  | 28 +++++++++++++------
 arch/arm64/boot/dts/mediatek/mt7622.dtsi      |  1 -
 3 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,sgmiisys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,sgmiisys.txt
index f5518f26a914..30cb645c0e54 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,sgmiisys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,sgmiisys.txt
@@ -9,8 +9,6 @@ Required Properties:
 	- "mediatek,mt7622-sgmiisys", "syscon"
 	- "mediatek,mt7629-sgmiisys", "syscon"
 - #clock-cells: Must be 1
-- mediatek,physpeed: Should be one of "auto", "1000" or "2500" to match up
-		     the capability of the target PHY.
 
 The SGMIISYS controller uses the common clk binding from
 Documentation/devicetree/bindings/clock/clock-bindings.txt
diff --git a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
index 710c5c3d87d3..83e10591e0e5 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
+++ b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
@@ -115,24 +115,34 @@
 };
 
 &eth {
-	pinctrl-names = "default";
-	pinctrl-0 = <&eth_pins>;
 	status = "okay";
+	gmac0: mac@0 {
+		compatible = "mediatek,eth-mac";
+		reg = <0>;
+		phy-mode = "2500base-x";
+
+		fixed-link {
+			speed = <2500>;
+			full-duplex;
+			pause;
+		};
+	};
 
 	gmac1: mac@1 {
 		compatible = "mediatek,eth-mac";
 		reg = <1>;
-		phy-handle = <&phy5>;
+		phy-mode = "rgmii";
+
+		fixed-link {
+			speed = <1000>;
+			full-duplex;
+			pause;
+		};
 	};
 
-	mdio-bus {
+	mdio: mdio-bus {
 		#address-cells = <1>;
 		#size-cells = <0>;
-
-		phy5: ethernet-phy@5 {
-			reg = <5>;
-			phy-mode = "sgmii";
-		};
 	};
 };
 
diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
index d1e13d340e26..dac51e98204c 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt7622.dtsi
@@ -931,6 +931,5 @@
 			     "syscon";
 		reg = <0 0x1b128000 0 0x3000>;
 		#clock-cells = <1>;
-		mediatek,physpeed = "2500";
 	};
 };
-- 
2.20.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox