Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH v12 03/10] netdev: cavium: octeon: Add Octeon III BGX Ethernet Nexus
From: Jiri Pirko @ 2018-06-29  6:13 UTC (permalink / raw)
  To: David Miller; +Cc: cmunoz, andrew, steven.hill, netdev, cchavva
In-Reply-To: <20180629.111905.863023692523959153.davem@davemloft.net>

Fri, Jun 29, 2018 at 04:19:05AM CEST, davem@davemloft.net wrote:
>From: Carlos Munoz <cmunoz@cavium.com>
>Date: Thu, 28 Jun 2018 14:20:05 -0700
>
>> 
>> 
>> On 06/28/2018 01:41 AM, Andrew Lunn wrote:
>>> External Email
>>>
>>>> +static char *mix_port;
>>>> +module_param(mix_port, charp, 0444);
>>>> +MODULE_PARM_DESC(mix_port, "Specifies which ports connect to MIX interfaces.");
>>>> +
>>>> +static char *pki_port;
>>>> +module_param(pki_port, charp, 0444);
>>>> +MODULE_PARM_DESC(pki_port, "Specifies which ports connect to the PKI.");
>>> Module parameters are generally not liked. Can you do without them?
>> 
>> These parameters change the kernel port assignment required by user
>> space applications. We rather keep them as they simplify the
>> process.
>
>This is actually a terrible user experience.
>
>Please provide a way to do this by performing operations on a device object
>after the driver loads.
>
>Use something like devlink or similar if you have to.

Devlink params should be used for this. They are not upstream yet. We
will push it most likely early next week.

^ permalink raw reply

* Re: [PATCH] r8169: remove TBI 1000BaseX support
From: Heiner Kallweit @ 2018-06-29  6:08 UTC (permalink / raw)
  To: David Miller, Realtek linux nic maintainers; +Cc: netdev@vger.kernel.org
In-Reply-To: <e68152e2-4187-ad39-39e4-dadad5a1f6cd@gmail.com>

On 29.06.2018 08:07, Heiner Kallweit wrote:
> The very first version of RTL8169 from 2002 (and only this one) has
> support for a TBI 1000BaseX fiber interface. The TBI support in the
> driver makes switching to phylib tricky, so best would be to get
> rid of it. I found no report from anybody using a device with RTL8169
> and fiber interface, also the vendor driver doesn't support this mode
> (any longer).
> So remove TBI support and bail out with a message if a card with
> activated TBI is detected. If there really should be any user of it
> out there, we could add a stripped-down version of the driver
> supporting chip version 01 and TBI only (and maybe move it to
> staging).
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Sorry, missed flagging it as net-next.

> ---
>  drivers/net/ethernet/realtek/r8169.c | 156 ++++-----------------------
>  1 file changed, 20 insertions(+), 136 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 21ffaf10..72a7778b 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -384,12 +384,6 @@ enum rtl_registers {
>  	FuncForceEvent	= 0xfc,
>  };
>  
> -enum rtl8110_registers {
> -	TBICSR			= 0x64,
> -	TBI_ANAR		= 0x68,
> -	TBI_LPAR		= 0x6a,
> -};
> -
>  enum rtl8168_8101_registers {
>  	CSIDR			= 0x64,
>  	CSIAR			= 0x68,
> @@ -556,14 +550,6 @@ enum rtl_register_content {
>  	PMEStatus	= (1 << 0),	/* PME status can be reset by PCI RST# */
>  	ASPM_en		= (1 << 0),	/* ASPM enable */
>  
> -	/* TBICSR p.28 */
> -	TBIReset	= 0x80000000,
> -	TBILoopback	= 0x40000000,
> -	TBINwEnable	= 0x20000000,
> -	TBINwRestart	= 0x10000000,
> -	TBILinkOk	= 0x02000000,
> -	TBINwComplete	= 0x01000000,
> -
>  	/* CPlusCmd p.31 */
>  	EnableBist	= (1 << 15),	// 8168 8101
>  	Mac_dbgo_oe	= (1 << 14),	// 8168 8101
> @@ -761,14 +747,7 @@ struct rtl8169_private {
>  		void (*disable)(struct rtl8169_private *);
>  	} jumbo_ops;
>  
> -	int (*set_speed)(struct net_device *, u8 aneg, u16 sp, u8 dpx, u32 adv);
> -	int (*get_link_ksettings)(struct net_device *,
> -				  struct ethtool_link_ksettings *);
> -	void (*phy_reset_enable)(struct rtl8169_private *tp);
>  	void (*hw_start)(struct rtl8169_private *tp);
> -	unsigned int (*phy_reset_pending)(struct rtl8169_private *tp);
> -	unsigned int (*link_ok)(struct rtl8169_private *tp);
> -	int (*do_ioctl)(struct rtl8169_private *tp, struct mii_ioctl_data *data, int cmd);
>  	bool (*tso_csum)(struct rtl8169_private *, struct sk_buff *, u32 *);
>  
>  	struct {
> @@ -1463,31 +1442,16 @@ static void rtl8169_irq_mask_and_ack(struct rtl8169_private *tp)
>  	RTL_R8(tp, ChipCmd);
>  }
>  
> -static unsigned int rtl8169_tbi_reset_pending(struct rtl8169_private *tp)
> -{
> -	return RTL_R32(tp, TBICSR) & TBIReset;
> -}
> -
>  static unsigned int rtl8169_xmii_reset_pending(struct rtl8169_private *tp)
>  {
>  	return rtl_readphy(tp, MII_BMCR) & BMCR_RESET;
>  }
>  
> -static unsigned int rtl8169_tbi_link_ok(struct rtl8169_private *tp)
> -{
> -	return RTL_R32(tp, TBICSR) & TBILinkOk;
> -}
> -
>  static unsigned int rtl8169_xmii_link_ok(struct rtl8169_private *tp)
>  {
>  	return RTL_R8(tp, PHYstatus) & LinkStatus;
>  }
>  
> -static void rtl8169_tbi_reset_enable(struct rtl8169_private *tp)
> -{
> -	RTL_W32(tp, TBICSR, RTL_R32(tp, TBICSR) | TBIReset);
> -}
> -
>  static void rtl8169_xmii_reset_enable(struct rtl8169_private *tp)
>  {
>  	unsigned int val;
> @@ -1557,7 +1521,7 @@ static void rtl8169_check_link_status(struct net_device *dev,
>  {
>  	struct device *d = tp_to_dev(tp);
>  
> -	if (tp->link_ok(tp)) {
> +	if (rtl8169_xmii_link_ok(tp)) {
>  		rtl_link_chg_patch(tp);
>  		/* This is to cancel a scheduled suspend if there's one. */
>  		pm_request_resume(d);
> @@ -1744,28 +1708,6 @@ static int rtl8169_get_regs_len(struct net_device *dev)
>  	return R8169_REGS_SIZE;
>  }
>  
> -static int rtl8169_set_speed_tbi(struct net_device *dev,
> -				 u8 autoneg, u16 speed, u8 duplex, u32 ignored)
> -{
> -	struct rtl8169_private *tp = netdev_priv(dev);
> -	int ret = 0;
> -	u32 reg;
> -
> -	reg = RTL_R32(tp, TBICSR);
> -	if ((autoneg == AUTONEG_DISABLE) && (speed == SPEED_1000) &&
> -	    (duplex == DUPLEX_FULL)) {
> -		RTL_W32(tp, TBICSR, reg & ~(TBINwEnable | TBINwRestart));
> -	} else if (autoneg == AUTONEG_ENABLE)
> -		RTL_W32(tp, TBICSR, reg | TBINwEnable | TBINwRestart);
> -	else {
> -		netif_warn(tp, link, dev,
> -			   "incorrect speed setting refused in TBI mode\n");
> -		ret = -EOPNOTSUPP;
> -	}
> -
> -	return ret;
> -}
> -
>  static int rtl8169_set_speed_xmii(struct net_device *dev,
>  				  u8 autoneg, u16 speed, u8 duplex, u32 adv)
>  {
> @@ -1849,7 +1791,7 @@ static int rtl8169_set_speed(struct net_device *dev,
>  	struct rtl8169_private *tp = netdev_priv(dev);
>  	int ret;
>  
> -	ret = tp->set_speed(dev, autoneg, speed, duplex, advertising);
> +	ret = rtl8169_set_speed_xmii(dev, autoneg, speed, duplex, advertising);
>  	if (ret < 0)
>  		goto out;
>  
> @@ -1925,53 +1867,14 @@ static void rtl8169_rx_vlan_tag(struct RxDesc *desc, struct sk_buff *skb)
>  		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), swab16(opts2 & 0xffff));
>  }
>  
> -static int rtl8169_get_link_ksettings_tbi(struct net_device *dev,
> -					  struct ethtool_link_ksettings *cmd)
> -{
> -	struct rtl8169_private *tp = netdev_priv(dev);
> -	u32 status;
> -	u32 supported, advertising;
> -
> -	supported =
> -		SUPPORTED_1000baseT_Full | SUPPORTED_Autoneg | SUPPORTED_FIBRE;
> -	cmd->base.port = PORT_FIBRE;
> -
> -	status = RTL_R32(tp, TBICSR);
> -	advertising = (status & TBINwEnable) ?  ADVERTISED_Autoneg : 0;
> -	cmd->base.autoneg = !!(status & TBINwEnable);
> -
> -	cmd->base.speed = SPEED_1000;
> -	cmd->base.duplex = DUPLEX_FULL; /* Always set */
> -
> -	ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
> -						supported);
> -	ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
> -						advertising);
> -
> -	return 0;
> -}
> -
> -static int rtl8169_get_link_ksettings_xmii(struct net_device *dev,
> -					   struct ethtool_link_ksettings *cmd)
> -{
> -	struct rtl8169_private *tp = netdev_priv(dev);
> -
> -	mii_ethtool_get_link_ksettings(&tp->mii, cmd);
> -
> -	return 0;
> -}
> -
>  static int rtl8169_get_link_ksettings(struct net_device *dev,
>  				      struct ethtool_link_ksettings *cmd)
>  {
>  	struct rtl8169_private *tp = netdev_priv(dev);
> -	int rc;
>  
> -	rtl_lock_work(tp);
> -	rc = tp->get_link_ksettings(dev, cmd);
> -	rtl_unlock_work(tp);
> +	mii_ethtool_get_link_ksettings(&tp->mii, cmd);
>  
> -	return rc;
> +	return 0;
>  }
>  
>  static int rtl8169_set_link_ksettings(struct net_device *dev,
> @@ -4395,7 +4298,7 @@ static void rtl_phy_work(struct rtl8169_private *tp)
>  	struct timer_list *timer = &tp->timer;
>  	unsigned long timeout = RTL8169_PHY_TIMEOUT;
>  
> -	if (tp->phy_reset_pending(tp)) {
> +	if (rtl8169_xmii_reset_pending(tp)) {
>  		/*
>  		 * A busy loop could burn quite a few cycles on nowadays CPU.
>  		 * Let's delay the execution of the timer for a few ticks.
> @@ -4404,12 +4307,12 @@ static void rtl_phy_work(struct rtl8169_private *tp)
>  		goto out_mod_timer;
>  	}
>  
> -	if (tp->link_ok(tp))
> +	if (rtl8169_xmii_link_ok(tp))
>  		return;
>  
>  	netif_dbg(tp, link, tp->dev, "PHY reset until link up\n");
>  
> -	tp->phy_reset_enable(tp);
> +	rtl8169_xmii_reset_enable(tp);
>  
>  out_mod_timer:
>  	mod_timer(timer, jiffies + timeout);
> @@ -4430,20 +4333,20 @@ static void rtl8169_phy_timer(struct timer_list *t)
>  
>  DECLARE_RTL_COND(rtl_phy_reset_cond)
>  {
> -	return tp->phy_reset_pending(tp);
> +	return rtl8169_xmii_reset_pending(tp);
>  }
>  
>  static void rtl8169_phy_reset(struct net_device *dev,
>  			      struct rtl8169_private *tp)
>  {
> -	tp->phy_reset_enable(tp);
> +	rtl8169_xmii_reset_enable(tp);
>  	rtl_msleep_loop_wait_low(tp, &rtl_phy_reset_cond, 1, 100);
>  }
>  
>  static bool rtl_tbi_enabled(struct rtl8169_private *tp)
>  {
>  	return (tp->mac_version == RTL_GIGA_MAC_VER_01) &&
> -	    (RTL_R8(tp, PHYstatus) & TBI_Enable);
> +	       (RTL_R8(tp, PHYstatus) & TBI_Enable);
>  }
>  
>  static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
> @@ -4478,9 +4381,6 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
>  			  (tp->mii.supports_gmii ?
>  			   ADVERTISED_1000baseT_Half |
>  			   ADVERTISED_1000baseT_Full : 0));
> -
> -	if (rtl_tbi_enabled(tp))
> -		netif_info(tp, link, dev, "TBI auto-negotiating\n");
>  }
>  
>  static void rtl_rar_set(struct rtl8169_private *tp, u8 *addr)
> @@ -4523,14 +4423,6 @@ static int rtl_set_mac_address(struct net_device *dev, void *p)
>  	return 0;
>  }
>  
> -static int rtl8169_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
> -{
> -	struct rtl8169_private *tp = netdev_priv(dev);
> -	struct mii_ioctl_data *data = if_mii(ifr);
> -
> -	return netif_running(dev) ? tp->do_ioctl(tp, data, cmd) : -ENODEV;
> -}
> -
>  static int rtl_xmii_ioctl(struct rtl8169_private *tp,
>  			  struct mii_ioctl_data *data, int cmd)
>  {
> @@ -4550,9 +4442,12 @@ static int rtl_xmii_ioctl(struct rtl8169_private *tp,
>  	return -EOPNOTSUPP;
>  }
>  
> -static int rtl_tbi_ioctl(struct rtl8169_private *tp, struct mii_ioctl_data *data, int cmd)
> +static int rtl8169_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
>  {
> -	return -EOPNOTSUPP;
> +	struct rtl8169_private *tp = netdev_priv(dev);
> +	struct mii_ioctl_data *data = if_mii(ifr);
> +
> +	return netif_running(dev) ? rtl_xmii_ioctl(tp, data, cmd) : -ENODEV;
>  }
>  
>  static void rtl_init_mdio_ops(struct rtl8169_private *tp)
> @@ -7676,6 +7571,11 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	/* Identify chip attached to board */
>  	rtl8169_get_mac_version(tp, cfg->default_ver);
>  
> +	if (rtl_tbi_enabled(tp)) {
> +		dev_err(&pdev->dev, "TBI fiber mode not supported\n");
> +		return -ENODEV;
> +	}
> +
>  	tp->cp_cmd = RTL_R16(tp, CPlusCmd);
>  
>  	if ((sizeof(dma_addr_t) > 4) &&
> @@ -7724,22 +7624,6 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	/* override BIOS settings, use userspace tools to enable WOL */
>  	__rtl8169_set_wol(tp, 0);
>  
> -	if (rtl_tbi_enabled(tp)) {
> -		tp->set_speed = rtl8169_set_speed_tbi;
> -		tp->get_link_ksettings = rtl8169_get_link_ksettings_tbi;
> -		tp->phy_reset_enable = rtl8169_tbi_reset_enable;
> -		tp->phy_reset_pending = rtl8169_tbi_reset_pending;
> -		tp->link_ok = rtl8169_tbi_link_ok;
> -		tp->do_ioctl = rtl_tbi_ioctl;
> -	} else {
> -		tp->set_speed = rtl8169_set_speed_xmii;
> -		tp->get_link_ksettings = rtl8169_get_link_ksettings_xmii;
> -		tp->phy_reset_enable = rtl8169_xmii_reset_enable;
> -		tp->phy_reset_pending = rtl8169_xmii_reset_pending;
> -		tp->link_ok = rtl8169_xmii_link_ok;
> -		tp->do_ioctl = rtl_xmii_ioctl;
> -	}
> -
>  	mutex_init(&tp->wk.mutex);
>  	u64_stats_init(&tp->rx_stats.syncp);
>  	u64_stats_init(&tp->tx_stats.syncp);
> 

^ permalink raw reply

* [PATCH] r8169: remove TBI 1000BaseX support
From: Heiner Kallweit @ 2018-06-29  6:07 UTC (permalink / raw)
  To: David Miller, Realtek linux nic maintainers; +Cc: netdev@vger.kernel.org

The very first version of RTL8169 from 2002 (and only this one) has
support for a TBI 1000BaseX fiber interface. The TBI support in the
driver makes switching to phylib tricky, so best would be to get
rid of it. I found no report from anybody using a device with RTL8169
and fiber interface, also the vendor driver doesn't support this mode
(any longer).
So remove TBI support and bail out with a message if a card with
activated TBI is detected. If there really should be any user of it
out there, we could add a stripped-down version of the driver
supporting chip version 01 and TBI only (and maybe move it to
staging).

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 156 ++++-----------------------
 1 file changed, 20 insertions(+), 136 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 21ffaf10..72a7778b 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -384,12 +384,6 @@ enum rtl_registers {
 	FuncForceEvent	= 0xfc,
 };
 
-enum rtl8110_registers {
-	TBICSR			= 0x64,
-	TBI_ANAR		= 0x68,
-	TBI_LPAR		= 0x6a,
-};
-
 enum rtl8168_8101_registers {
 	CSIDR			= 0x64,
 	CSIAR			= 0x68,
@@ -556,14 +550,6 @@ enum rtl_register_content {
 	PMEStatus	= (1 << 0),	/* PME status can be reset by PCI RST# */
 	ASPM_en		= (1 << 0),	/* ASPM enable */
 
-	/* TBICSR p.28 */
-	TBIReset	= 0x80000000,
-	TBILoopback	= 0x40000000,
-	TBINwEnable	= 0x20000000,
-	TBINwRestart	= 0x10000000,
-	TBILinkOk	= 0x02000000,
-	TBINwComplete	= 0x01000000,
-
 	/* CPlusCmd p.31 */
 	EnableBist	= (1 << 15),	// 8168 8101
 	Mac_dbgo_oe	= (1 << 14),	// 8168 8101
@@ -761,14 +747,7 @@ struct rtl8169_private {
 		void (*disable)(struct rtl8169_private *);
 	} jumbo_ops;
 
-	int (*set_speed)(struct net_device *, u8 aneg, u16 sp, u8 dpx, u32 adv);
-	int (*get_link_ksettings)(struct net_device *,
-				  struct ethtool_link_ksettings *);
-	void (*phy_reset_enable)(struct rtl8169_private *tp);
 	void (*hw_start)(struct rtl8169_private *tp);
-	unsigned int (*phy_reset_pending)(struct rtl8169_private *tp);
-	unsigned int (*link_ok)(struct rtl8169_private *tp);
-	int (*do_ioctl)(struct rtl8169_private *tp, struct mii_ioctl_data *data, int cmd);
 	bool (*tso_csum)(struct rtl8169_private *, struct sk_buff *, u32 *);
 
 	struct {
@@ -1463,31 +1442,16 @@ static void rtl8169_irq_mask_and_ack(struct rtl8169_private *tp)
 	RTL_R8(tp, ChipCmd);
 }
 
-static unsigned int rtl8169_tbi_reset_pending(struct rtl8169_private *tp)
-{
-	return RTL_R32(tp, TBICSR) & TBIReset;
-}
-
 static unsigned int rtl8169_xmii_reset_pending(struct rtl8169_private *tp)
 {
 	return rtl_readphy(tp, MII_BMCR) & BMCR_RESET;
 }
 
-static unsigned int rtl8169_tbi_link_ok(struct rtl8169_private *tp)
-{
-	return RTL_R32(tp, TBICSR) & TBILinkOk;
-}
-
 static unsigned int rtl8169_xmii_link_ok(struct rtl8169_private *tp)
 {
 	return RTL_R8(tp, PHYstatus) & LinkStatus;
 }
 
-static void rtl8169_tbi_reset_enable(struct rtl8169_private *tp)
-{
-	RTL_W32(tp, TBICSR, RTL_R32(tp, TBICSR) | TBIReset);
-}
-
 static void rtl8169_xmii_reset_enable(struct rtl8169_private *tp)
 {
 	unsigned int val;
@@ -1557,7 +1521,7 @@ static void rtl8169_check_link_status(struct net_device *dev,
 {
 	struct device *d = tp_to_dev(tp);
 
-	if (tp->link_ok(tp)) {
+	if (rtl8169_xmii_link_ok(tp)) {
 		rtl_link_chg_patch(tp);
 		/* This is to cancel a scheduled suspend if there's one. */
 		pm_request_resume(d);
@@ -1744,28 +1708,6 @@ static int rtl8169_get_regs_len(struct net_device *dev)
 	return R8169_REGS_SIZE;
 }
 
-static int rtl8169_set_speed_tbi(struct net_device *dev,
-				 u8 autoneg, u16 speed, u8 duplex, u32 ignored)
-{
-	struct rtl8169_private *tp = netdev_priv(dev);
-	int ret = 0;
-	u32 reg;
-
-	reg = RTL_R32(tp, TBICSR);
-	if ((autoneg == AUTONEG_DISABLE) && (speed == SPEED_1000) &&
-	    (duplex == DUPLEX_FULL)) {
-		RTL_W32(tp, TBICSR, reg & ~(TBINwEnable | TBINwRestart));
-	} else if (autoneg == AUTONEG_ENABLE)
-		RTL_W32(tp, TBICSR, reg | TBINwEnable | TBINwRestart);
-	else {
-		netif_warn(tp, link, dev,
-			   "incorrect speed setting refused in TBI mode\n");
-		ret = -EOPNOTSUPP;
-	}
-
-	return ret;
-}
-
 static int rtl8169_set_speed_xmii(struct net_device *dev,
 				  u8 autoneg, u16 speed, u8 duplex, u32 adv)
 {
@@ -1849,7 +1791,7 @@ static int rtl8169_set_speed(struct net_device *dev,
 	struct rtl8169_private *tp = netdev_priv(dev);
 	int ret;
 
-	ret = tp->set_speed(dev, autoneg, speed, duplex, advertising);
+	ret = rtl8169_set_speed_xmii(dev, autoneg, speed, duplex, advertising);
 	if (ret < 0)
 		goto out;
 
@@ -1925,53 +1867,14 @@ static void rtl8169_rx_vlan_tag(struct RxDesc *desc, struct sk_buff *skb)
 		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), swab16(opts2 & 0xffff));
 }
 
-static int rtl8169_get_link_ksettings_tbi(struct net_device *dev,
-					  struct ethtool_link_ksettings *cmd)
-{
-	struct rtl8169_private *tp = netdev_priv(dev);
-	u32 status;
-	u32 supported, advertising;
-
-	supported =
-		SUPPORTED_1000baseT_Full | SUPPORTED_Autoneg | SUPPORTED_FIBRE;
-	cmd->base.port = PORT_FIBRE;
-
-	status = RTL_R32(tp, TBICSR);
-	advertising = (status & TBINwEnable) ?  ADVERTISED_Autoneg : 0;
-	cmd->base.autoneg = !!(status & TBINwEnable);
-
-	cmd->base.speed = SPEED_1000;
-	cmd->base.duplex = DUPLEX_FULL; /* Always set */
-
-	ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
-						supported);
-	ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
-						advertising);
-
-	return 0;
-}
-
-static int rtl8169_get_link_ksettings_xmii(struct net_device *dev,
-					   struct ethtool_link_ksettings *cmd)
-{
-	struct rtl8169_private *tp = netdev_priv(dev);
-
-	mii_ethtool_get_link_ksettings(&tp->mii, cmd);
-
-	return 0;
-}
-
 static int rtl8169_get_link_ksettings(struct net_device *dev,
 				      struct ethtool_link_ksettings *cmd)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
-	int rc;
 
-	rtl_lock_work(tp);
-	rc = tp->get_link_ksettings(dev, cmd);
-	rtl_unlock_work(tp);
+	mii_ethtool_get_link_ksettings(&tp->mii, cmd);
 
-	return rc;
+	return 0;
 }
 
 static int rtl8169_set_link_ksettings(struct net_device *dev,
@@ -4395,7 +4298,7 @@ static void rtl_phy_work(struct rtl8169_private *tp)
 	struct timer_list *timer = &tp->timer;
 	unsigned long timeout = RTL8169_PHY_TIMEOUT;
 
-	if (tp->phy_reset_pending(tp)) {
+	if (rtl8169_xmii_reset_pending(tp)) {
 		/*
 		 * A busy loop could burn quite a few cycles on nowadays CPU.
 		 * Let's delay the execution of the timer for a few ticks.
@@ -4404,12 +4307,12 @@ static void rtl_phy_work(struct rtl8169_private *tp)
 		goto out_mod_timer;
 	}
 
-	if (tp->link_ok(tp))
+	if (rtl8169_xmii_link_ok(tp))
 		return;
 
 	netif_dbg(tp, link, tp->dev, "PHY reset until link up\n");
 
-	tp->phy_reset_enable(tp);
+	rtl8169_xmii_reset_enable(tp);
 
 out_mod_timer:
 	mod_timer(timer, jiffies + timeout);
@@ -4430,20 +4333,20 @@ static void rtl8169_phy_timer(struct timer_list *t)
 
 DECLARE_RTL_COND(rtl_phy_reset_cond)
 {
-	return tp->phy_reset_pending(tp);
+	return rtl8169_xmii_reset_pending(tp);
 }
 
 static void rtl8169_phy_reset(struct net_device *dev,
 			      struct rtl8169_private *tp)
 {
-	tp->phy_reset_enable(tp);
+	rtl8169_xmii_reset_enable(tp);
 	rtl_msleep_loop_wait_low(tp, &rtl_phy_reset_cond, 1, 100);
 }
 
 static bool rtl_tbi_enabled(struct rtl8169_private *tp)
 {
 	return (tp->mac_version == RTL_GIGA_MAC_VER_01) &&
-	    (RTL_R8(tp, PHYstatus) & TBI_Enable);
+	       (RTL_R8(tp, PHYstatus) & TBI_Enable);
 }
 
 static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
@@ -4478,9 +4381,6 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
 			  (tp->mii.supports_gmii ?
 			   ADVERTISED_1000baseT_Half |
 			   ADVERTISED_1000baseT_Full : 0));
-
-	if (rtl_tbi_enabled(tp))
-		netif_info(tp, link, dev, "TBI auto-negotiating\n");
 }
 
 static void rtl_rar_set(struct rtl8169_private *tp, u8 *addr)
@@ -4523,14 +4423,6 @@ static int rtl_set_mac_address(struct net_device *dev, void *p)
 	return 0;
 }
 
-static int rtl8169_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
-{
-	struct rtl8169_private *tp = netdev_priv(dev);
-	struct mii_ioctl_data *data = if_mii(ifr);
-
-	return netif_running(dev) ? tp->do_ioctl(tp, data, cmd) : -ENODEV;
-}
-
 static int rtl_xmii_ioctl(struct rtl8169_private *tp,
 			  struct mii_ioctl_data *data, int cmd)
 {
@@ -4550,9 +4442,12 @@ static int rtl_xmii_ioctl(struct rtl8169_private *tp,
 	return -EOPNOTSUPP;
 }
 
-static int rtl_tbi_ioctl(struct rtl8169_private *tp, struct mii_ioctl_data *data, int cmd)
+static int rtl8169_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
-	return -EOPNOTSUPP;
+	struct rtl8169_private *tp = netdev_priv(dev);
+	struct mii_ioctl_data *data = if_mii(ifr);
+
+	return netif_running(dev) ? rtl_xmii_ioctl(tp, data, cmd) : -ENODEV;
 }
 
 static void rtl_init_mdio_ops(struct rtl8169_private *tp)
@@ -7676,6 +7571,11 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* Identify chip attached to board */
 	rtl8169_get_mac_version(tp, cfg->default_ver);
 
+	if (rtl_tbi_enabled(tp)) {
+		dev_err(&pdev->dev, "TBI fiber mode not supported\n");
+		return -ENODEV;
+	}
+
 	tp->cp_cmd = RTL_R16(tp, CPlusCmd);
 
 	if ((sizeof(dma_addr_t) > 4) &&
@@ -7724,22 +7624,6 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* override BIOS settings, use userspace tools to enable WOL */
 	__rtl8169_set_wol(tp, 0);
 
-	if (rtl_tbi_enabled(tp)) {
-		tp->set_speed = rtl8169_set_speed_tbi;
-		tp->get_link_ksettings = rtl8169_get_link_ksettings_tbi;
-		tp->phy_reset_enable = rtl8169_tbi_reset_enable;
-		tp->phy_reset_pending = rtl8169_tbi_reset_pending;
-		tp->link_ok = rtl8169_tbi_link_ok;
-		tp->do_ioctl = rtl_tbi_ioctl;
-	} else {
-		tp->set_speed = rtl8169_set_speed_xmii;
-		tp->get_link_ksettings = rtl8169_get_link_ksettings_xmii;
-		tp->phy_reset_enable = rtl8169_xmii_reset_enable;
-		tp->phy_reset_pending = rtl8169_xmii_reset_pending;
-		tp->link_ok = rtl8169_xmii_link_ok;
-		tp->do_ioctl = rtl_xmii_ioctl;
-	}
-
 	mutex_init(&tp->wk.mutex);
 	u64_stats_init(&tp->rx_stats.syncp);
 	u64_stats_init(&tp->tx_stats.syncp);
-- 
2.18.0

^ permalink raw reply related

* Re: [PATCH net] net: fib_rules: add protocol check in rule_find
From: Roopa Prabhu @ 2018-06-29  4:59 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <1530149236-5144-1-git-send-email-roopa@cumulusnetworks.com>

On Wed, Jun 27, 2018 at 6:27 PM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> After commit f9d4b0c1e969 ("fib_rules: move common handling of newrule
> delrule msgs into fib_nl2rule"), rule_find is strict about checking
> for an existing rule. rule_find must check against all
> user given attributes, else it may match against a subset
> of attributes and return an existing rule.
>
> In the below case, without support for protocol match, rule_find
> will match only against 'table main' and return an existing rule.
>
> $ip -4 rule add table main protocol boot
> RTNETLINK answers: File exists
>
> This patch adds protocol support to rule_find, forcing it to
> check protocol match if given by the user.
>
> Fixes: f9d4b0c1e969 ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule")
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
> I spent some time looking at all match keys today and protocol
> was the only missing one (protocol is not in a released kernel yet).
> The only way this could be avoided is to move back to the old loose
> rule_find. I am worried about this new strict checking surprising users,
> but going back to the previous loose checking does not seem right either.
> If there is a reason to believe that users did rely on the previous
> behaviour, I will be happy to revert. Here is another example of old and
> new behaviour.
>
> old rule_find behaviour:
> $ip -4 rule add table main protocol boot
> $ip -4 rule add table main protocol boot
> $ip -4 rule add table main protocol boot
> $ip rule show
> 0:      from all lookup local
> 32763:  from all lookup main  proto boot
> 32764:  from all lookup main  proto boot
> 32765:  from all lookup main  proto boot
> 32766:  from all lookup main
> 32767:  from all lookup default
>
> new rule_find behaviour (after this patch):
> $ip -4 rule add table main protocol boot
> $ip -4 rule add table main protocol boot
> RTNETLINK answers: File exists
>

I found the case where the new rule_find breaks for add.
$ip -4 rule add table main tos 10 fwmark 1
$ip -4 rule add table main tos 10
RTNETLINK answers: File exists

The key masks in the new and old rule need to be compared .
And it cannot be easily compared today without an elaborate if-else block.
Its best to introduce key masks for easier and accurate rule comparison.
But this is best done in net-next. I will submit an incremental patch
tomorrow to
restore previous rule_exists for the add case and the rest should be good.

The current patch in context is needed regardless.

Thanks (and sorry about the iterations).

^ permalink raw reply

* Re: [PATCH bpf-net 08/14] bpf: introduce the bpf_get_local_storage() helper function
From: kbuild test robot @ 2018-06-29  4:37 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: kbuild-all, netdev, kernel-team, tj, Roman Gushchin,
	Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <20180628163458.27193-9-guro@fb.com>

[-- Attachment #1: Type: text/plain, Size: 1699 bytes --]

Hi Roman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]
[also build test ERROR on v4.18-rc2]
[cannot apply to next-20180628]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Roman-Gushchin/bpf-cgroup-local-storage/20180629-035104
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: um-x86_64_defconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=um SUBARCH=x86_64

All errors (new ones prefixed by >>):

   net/core/filter.o: In function `cg_skb_func_proto':
   filter.c:(.text+0x58a7): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o: In function `sock_filter_func_proto':
   filter.c:(.text+0x5b3d): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o: In function `sock_ops_func_proto':
   filter.c:(.text+0x5b9d): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o: In function `sk_skb_func_proto':
   filter.c:(.text+0x5c60): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o: In function `sk_msg_func_proto':
   filter.c:(.text+0x5cc3): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o:filter.c:(.text+0x5ee6): more undefined references to `bpf_get_local_storage_proto' follow
>> collect2: error: ld returned 1 exit status

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 7562 bytes --]

^ permalink raw reply

* Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction
From: Lawrence Brakmo @ 2018-06-29  4:32 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Yuchung Cheng, Matt Mathis, Netdev, Kernel Team, Blake Matheny,
	Alexei Starovoitov, Eric Dumazet, Wei Wang, Steve Ibanez,
	Yousuk Seung
In-Reply-To: <CADVnQy=MsiEBCr+Mnp97mp0MxDqrA+_KiZEQehgcDfe9L-hghQ@mail.gmail.com>


On 6/28/18, 1:48 PM, "netdev-owner@vger.kernel.org on behalf of Neal Cardwell" <netdev-owner@vger.kernel.org on behalf of ncardwell@google.com> wrote:

    On Thu, Jun 28, 2018 at 4:20 PM Lawrence Brakmo <brakmo@fb.com> wrote:
    >
    > I just looked at 4.18 traces and the behavior is as follows:
    >
    >    Host A sends the last packets of the request
    >
    >    Host B receives them, and the last packet is marked with congestion (CE)
    >
    >    Host B sends ACKs for packets not marked with congestion
    >
    >    Host B sends data packet with reply and ACK for packet marked with congestion (TCP flag ECE)
    >
    >    Host A receives ACKs with no ECE flag
    >
    >    Host A receives data packet with ACK for the last packet of request and has TCP ECE bit set
    >
    >    Host A sends 1st data packet of the next request with TCP flag CWR
    >
    >    Host B receives the packet (as seen in tcpdump at B), no CE flag
    >
    >    Host B sends a dup ACK that also has the TCP ECE flag
    >
    >    Host A RTO timer fires!
    >
    >    Host A to send the next packet
    >
    >    Host A receives an ACK for everything it has sent (i.e. Host B did receive 1st packet of request)
    >
    >    Host A send more packets…
    
    Thanks, Larry! This is very interesting. I don't know the cause, but
    this reminds me of an issue  Steve Ibanez raised on the netdev list
    last December, where he was seeing cases with DCTCP where a CWR packet
    would be received and buffered by Host B but not ACKed by Host B. This
    was the thread "Re: Linux ECN Handling", starting around December 5. I
    have cc-ed Steve.
    
    I wonder if this may somehow be related to the DCTCP logic to rewind
    tp->rcv_nxt and call tcp_send_ack(), and then restore tp->rcv_nxt, if
    DCTCP notices that the incoming CE bits have been changed while the
    receiver thinks it is holding on to a delayed ACK (in
    dctcp_ce_state_0_to_1() and dctcp_ce_state_1_to_0()). I wonder if the
    "synthetic" call to tcp_send_ack() somehow has side effects in the
    delayed ACK state machine that can cause the connection to forget that
    it still needs to fire a delayed ACK, even though it just sent an ACK
    just now.
    
    neal

Here is a packetdrill script that reproduces the problem:

// Repro bug that does not ack data, not even with delayed-ack

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0

0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 5>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4

0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect0] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect0] P. 1:2(1) ack 1001

0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
0.200 write(4, ..., 1) = 1
0.200 > [ect0] P. 2:3(1) ack 2001

0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
0.200 > [ect0] . 3:3(0) ack 4001

0.210 < [ce] P. 4001:4501(500) ack 3 win 257

+0.001 read(4, ..., 4500) = 4500
+0 write(4, ..., 1) = 1
+0 > [ect0] PE. 3:4(1) ack 4501

+0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
+0  > [ect0] E. 4:4(0) ack 4501   // dup ack sent

+0.311 < [ect0] . 5501:6501(1000) ack 4 win 257  // Long RTO
+0 > [ect0] . 4:4(0) ack 6501     // now acks everything

+0.500 < F. 9501:9501(0) ack 4 win 257

^ permalink raw reply

* Re: [PATCH bpf-next 10/14] bpftool: add support for CGROUP_STORAGE maps
From: Jakub Kicinski @ 2018-06-29  3:47 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: netdev, linux-kernel, kernel-team, tj, Alexei Starovoitov,
	Daniel Borkmann
In-Reply-To: <20180628164719.28215-11-guro@fb.com>

On Thu, 28 Jun 2018 09:47:15 -0700, Roman Gushchin wrote:
> Add BPF_MAP_TYPE_CGROUP_STORAGE maps to the list
> of maps types which bpftool recognizes.
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
> Acked-by: Martin KaFai Lau <kafai@fb.com>

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

^ permalink raw reply

* Re: [PATCH v12 03/10] netdev: cavium: octeon: Add Octeon III BGX Ethernet Nexus
From: Chavva, Chandrakala @ 2018-06-29  3:30 UTC (permalink / raw)
  To: David Miller, Munoz, Carlos
  Cc: andrew@lunn.ch, Hill, Steven, netdev@vger.kernel.org
In-Reply-To: <20180629.111905.863023692523959153.davem@davemloft.net>

David,

How can we support NFS boot if pass the parameters via devlink. Basically this determines what phy to use from device tree.

Chandra

________________________________________
From: David Miller <davem@davemloft.net>
Sent: Thursday, June 28, 2018 7:19:05 PM
To: Munoz, Carlos
Cc: andrew@lunn.ch; Hill, Steven; netdev@vger.kernel.org; Chavva, Chandrakala
Subject: Re: [PATCH v12 03/10] netdev: cavium: octeon: Add Octeon III BGX Ethernet Nexus

External Email

From: Carlos Munoz <cmunoz@cavium.com>
Date: Thu, 28 Jun 2018 14:20:05 -0700

>
>
> On 06/28/2018 01:41 AM, Andrew Lunn wrote:
>> External Email
>>
>>> +static char *mix_port;
>>> +module_param(mix_port, charp, 0444);
>>> +MODULE_PARM_DESC(mix_port, "Specifies which ports connect to MIX interfaces.");
>>> +
>>> +static char *pki_port;
>>> +module_param(pki_port, charp, 0444);
>>> +MODULE_PARM_DESC(pki_port, "Specifies which ports connect to the PKI.");
>> Module parameters are generally not liked. Can you do without them?
>
> These parameters change the kernel port assignment required by user
> space applications. We rather keep them as they simplify the
> process.

This is actually a terrible user experience.

Please provide a way to do this by performing operations on a device object
after the driver loads.

Use something like devlink or similar if you have to.

^ permalink raw reply

* [PATCH] atmel: using strlcpy() to avoid possible buffer overflows
From: YueHaibing @ 2018-06-29  2:51 UTC (permalink / raw)
  To: simon, kvalo; +Cc: linux-kernel, netdev, linux-wireless, davem, YueHaibing

'firmware' is a module param which may been longer than firmware_id,
so using strlcpy() to guard against overflows

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/wireless/atmel/atmel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/atmel/atmel.c b/drivers/net/wireless/atmel/atmel.c
index b01dc34..604d5ac 100644
--- a/drivers/net/wireless/atmel/atmel.c
+++ b/drivers/net/wireless/atmel/atmel.c
@@ -1519,7 +1519,7 @@ struct net_device *init_atmel_card(unsigned short irq, unsigned long port,
 	priv->firmware_id[0] = '\0';
 	priv->firmware_type = fw_type;
 	if (firmware) /* module parameter */
-		strcpy(priv->firmware_id, firmware);
+		strlcpy(priv->firmware_id, firmware, sizeof(priv->firmware_id));
 	priv->bus_type = card_present ? BUS_TYPE_PCCARD : BUS_TYPE_PCI;
 	priv->station_state = STATION_STATE_DOWN;
 	priv->do_rx_crc = 0;
-- 
2.7.0

^ permalink raw reply related

* Re: [PATCH net-next 0/4] ila: Cleanup
From: David Miller @ 2018-06-29  2:33 UTC (permalink / raw)
  To: tom; +Cc: netdev, tom
In-Reply-To: <1530135542-10372-1-git-send-email-tom@quantonium.net>

From: Tom Herbert <tom@herbertland.com>
Date: Wed, 27 Jun 2018 14:38:58 -0700

> Perform some cleanup in ILA code. This includes:
> 
> - Fix rhashtable walk for cases where nl dumps are done with muliple
>   function calls. Add a skip index to skip over entries in
>   a node that have been previously visitied. Call rhashtable_walk_peek
>   to avoid dropping items between calls to ila_nl_dump.
> - Call alloc_bucket_spinlocks to create bucket locks.
> - Split out module initialization and netlink definitions into
>   separate files.
> - Add ILA_CMD_FLUSH netlink command to clear the ILA translation table.

Series applied.

^ permalink raw reply

* Re: [PATCH v2 net-next 0/2] net: preserve sock reference when scrubbing the skb.
From: David Miller @ 2018-06-29  2:22 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: fbl, netdev, eric.dumazet, pabeni, fw, netfilter-devel
In-Reply-To: <CAM_iQpVh8=sCiuSVkr4NRHWtxXGah95AOg91cy8AJfAXtU7Hkw@mail.gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 28 Jun 2018 14:53:09 -0700

> I will send a revert with quote of the above.

And it will go to /dev/null as far as I am concerned.  I read it the
first time, so posting it again will not change my opinion of what you
have to say.

Cong, you really need to calm down and understand that people perhaps
simply fundamentally disagree with you.

And I am one of those people.

^ permalink raw reply

* Re: [PATCH v2 net-next 0/2] net: preserve sock reference when scrubbing the skb.
From: David Miller @ 2018-06-29  2:20 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: fbl, netdev, eric.dumazet, pabeni, fw, netfilter-devel
In-Reply-To: <CAM_iQpW-Ggs5iHjshWiSbNArANsBK8JVY-Qt0Lw+m=sk4onzpw@mail.gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 28 Jun 2018 14:41:16 -0700

> Pretty sure you didn't even read the rest of my reply,
> I can't help you if you just stop at where you quoted.

I read your entire email, please do not accuse me of things
you are not sure of.

^ permalink raw reply

* Re: [PATCH v12 03/10] netdev: cavium: octeon: Add Octeon III BGX Ethernet Nexus
From: David Miller @ 2018-06-29  2:19 UTC (permalink / raw)
  To: cmunoz; +Cc: andrew, steven.hill, netdev, cchavva
In-Reply-To: <f09305d1-8477-ed79-0289-05a629cef994@cavium.com>

From: Carlos Munoz <cmunoz@cavium.com>
Date: Thu, 28 Jun 2018 14:20:05 -0700

> 
> 
> On 06/28/2018 01:41 AM, Andrew Lunn wrote:
>> External Email
>>
>>> +static char *mix_port;
>>> +module_param(mix_port, charp, 0444);
>>> +MODULE_PARM_DESC(mix_port, "Specifies which ports connect to MIX interfaces.");
>>> +
>>> +static char *pki_port;
>>> +module_param(pki_port, charp, 0444);
>>> +MODULE_PARM_DESC(pki_port, "Specifies which ports connect to the PKI.");
>> Module parameters are generally not liked. Can you do without them?
> 
> These parameters change the kernel port assignment required by user
> space applications. We rather keep them as they simplify the
> process.

This is actually a terrible user experience.

Please provide a way to do this by performing operations on a device object
after the driver loads.

Use something like devlink or similar if you have to.

^ permalink raw reply

* Re: [PATCH bpf-next 03/14] bpf: pass a pointer to a cgroup storage using pcpu variable
From: kbuild test robot @ 2018-06-29  2:08 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: kbuild-all, netdev, linux-kernel, kernel-team, tj, Roman Gushchin,
	Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <20180628164719.28215-4-guro@fb.com>

[-- Attachment #1: Type: text/plain, Size: 5290 bytes --]

Hi Roman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Roman-Gushchin/bpf-cgroup-local-storage/20180629-031527
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=parisc 

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/compiler_types.h:58:0,
                    from <command-line>:0:
   include/linux/bpf-cgroup.h: In function 'bpf_cgroup_storage_set':
>> include/asm-generic/percpu.h:31:40: error: implicit declaration of function 'raw_smp_processor_id' [-Werror=implicit-function-declaration]
    #define __my_cpu_offset per_cpu_offset(raw_smp_processor_id())
                                           ^
   include/linux/compiler-gcc.h:54:26: note: in definition of macro 'RELOC_HIDE'
     (typeof(ptr)) (__ptr + (off));     \
                             ^~~
   include/asm-generic/percpu.h:44:31: note: in expansion of macro 'SHIFT_PERCPU_PTR'
    #define arch_raw_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
                                  ^~~~~~~~~~~~~~~~
   include/asm-generic/percpu.h:31:25: note: in expansion of macro 'per_cpu_offset'
    #define __my_cpu_offset per_cpu_offset(raw_smp_processor_id())
                            ^~~~~~~~~~~~~~
>> include/asm-generic/percpu.h:44:53: note: in expansion of macro '__my_cpu_offset'
    #define arch_raw_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
                                                        ^~~~~~~~~~~~~~~
>> include/linux/percpu-defs.h:244:2: note: in expansion of macro 'arch_raw_cpu_ptr'
     arch_raw_cpu_ptr(ptr);      \
     ^~~~~~~~~~~~~~~~
>> include/asm-generic/percpu.h:76:3: note: in expansion of macro 'raw_cpu_ptr'
     *raw_cpu_ptr(&(pcp)) op val;     \
      ^~~~~~~~~~~
>> include/asm-generic/percpu.h:152:2: note: in expansion of macro 'raw_cpu_generic_to_op'
     raw_cpu_generic_to_op(pcp, val, op);    \
     ^~~~~~~~~~~~~~~~~~~~~
>> include/asm-generic/percpu.h:337:36: note: in expansion of macro 'this_cpu_generic_to_op'
    #define this_cpu_write_1(pcp, val) this_cpu_generic_to_op(pcp, val, =)
                                       ^~~~~~~~~~~~~~~~~~~~~~
>> include/linux/percpu-defs.h:379:11: note: in expansion of macro 'this_cpu_write_1'
      case 1: stem##1(variable, __VA_ARGS__);break;  \
              ^~~~
>> include/linux/percpu-defs.h:510:34: note: in expansion of macro '__pcpu_size_call'
    #define this_cpu_write(pcp, val) __pcpu_size_call(this_cpu_write_, pcp, val)
                                     ^~~~~~~~~~~~~~~~
>> include/linux/bpf-cgroup.h:109:2: note: in expansion of macro 'this_cpu_write'
     this_cpu_write(bpf_cgroup_storage, &buf->data[0]);
     ^~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/this_cpu_write +109 include/linux/bpf-cgroup.h

    66	
    67	int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
    68				enum bpf_attach_type type, u32 flags);
    69	int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
    70				enum bpf_attach_type type, u32 flags);
    71	int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
    72			       union bpf_attr __user *uattr);
    73	
    74	/* Wrapper for __cgroup_bpf_*() protected by cgroup_mutex */
    75	int cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
    76			      enum bpf_attach_type type, u32 flags);
    77	int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
    78			      enum bpf_attach_type type, u32 flags);
    79	int cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
    80			     union bpf_attr __user *uattr);
    81	
    82	int __cgroup_bpf_run_filter_skb(struct sock *sk,
    83					struct sk_buff *skb,
    84					enum bpf_attach_type type);
    85	
    86	int __cgroup_bpf_run_filter_sk(struct sock *sk,
    87				       enum bpf_attach_type type);
    88	
    89	int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
    90					      struct sockaddr *uaddr,
    91					      enum bpf_attach_type type,
    92					      void *t_ctx);
    93	
    94	int __cgroup_bpf_run_filter_sock_ops(struct sock *sk,
    95					     struct bpf_sock_ops_kern *sock_ops,
    96					     enum bpf_attach_type type);
    97	
    98	int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
    99					      short access, enum bpf_attach_type type);
   100	
   101	static inline void bpf_cgroup_storage_set(struct bpf_cgroup_storage *storage)
   102	{
   103		struct bpf_storage_buffer *buf;
   104	
   105		if (!storage)
   106			return;
   107	
   108		buf = rcu_dereference(storage->buf);
 > 109		this_cpu_write(bpf_cgroup_storage, &buf->data[0]);
   110	}
   111	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 53550 bytes --]

^ permalink raw reply

* Re: [PATCH net-next 00/11] net: hns3: a few code improvements
From: David Miller @ 2018-06-29  2:07 UTC (permalink / raw)
  To: lipeng321; +Cc: netdev, linux-kernel, linuxarm, yisen.zhuang, salil.mehta
In-Reply-To: <1530159149-122284-1-git-send-email-lipeng321@huawei.com>

From: Peng Li <lipeng321@huawei.com>
Date: Thu, 28 Jun 2018 12:12:18 +0800

> This patchset fixes a few code stylistic issues from
> concentrated review, no functional changes introduced.

Series applied to net-next, thank you.

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Linus Torvalds @ 2018-06-29  0:13 UTC (permalink / raw)
  To: Al Viro; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <20180628233720.GN30522@ZenIV.linux.org.uk>

On Thu, Jun 28, 2018 at 4:37 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> You underestimate the nastiness of that thing (and for the record, I'm
> a lot *less* fond of AIO than you are, what with having had to read that
> nest of horrors lately).  It does not "copy the data to userland"; what it
> does instead is copying into an array of pages it keeps, right from
> IO completion callback.  I

Ugh.

Oh well. I'd be perfectly happy to have somebody re-write and
re-architect the aio code entirely.  Much rather than that the
->poll() code. Because I know which one I think is well-desiged with a
nice usable interface, and which one is a pile of shit.

In the meantime, if AIO wants to do poll() in some irq callback, may I
suggest just using workqueues.

            Linus

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Al Viro @ 2018-06-28 23:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <CA+55aFz1PQ86=6xEFM9ajZRoKF7NyAQmf8Gd8qnOOguFieWWbA@mail.gmail.com>

On Thu, Jun 28, 2018 at 03:55:35PM -0700, Linus Torvalds wrote:
> > You are misreading that mess.  What he's trying to do (other than surviving
> > the awful clusterfuck around cancels) is to handle the decision what to
> > report to userland right in the wakeup callback.  *That* is what really
> > drives the "make the second-pass ->poll() or something similar to it
> > non-blocking" (in addition to the fact that it is such in considerable
> > majority of instances).
> 
> That's just crazy BS.
> 
> Just call poll() again when you copy the data to userland (which by
> definition can block, again).
> 
> Stop the idiotic "let's break poll for stupid AIO reasons, because the
> AIO people are morons".

You underestimate the nastiness of that thing (and for the record, I'm
a lot *less* fond of AIO than you are, what with having had to read that
nest of horrors lately).  It does not "copy the data to userland"; what it
does instead is copying into an array of pages it keeps, right from
IO completion callback.  In read/write case.  This
        ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
        event = ev_page + pos % AIO_EVENTS_PER_PAGE;

        event->obj = (u64)(unsigned long)iocb->ki_user_iocb;
        event->data = iocb->ki_user_data;
        event->res = res;
        event->res2 = res2;

        kunmap_atomic(ev_page);
        flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
is what does the copying.  And that might be done from IRQ context.
Yes, really.

They do have a slightly saner syscall that does copying from the damn
ring buffer, but its use is optional - userland can (and does) direct
read access to mmapped buffer.

Single-consumer ABIs suck and AIO is one such...

It could do schedule_work() and do blocking stuff from that - does so, in
case if it can't grab ->ctx_lock.  Earlier iteration used to try doing
everything straight from wakeup callback, and *that* was racy as hell;
I'd rather have Christoph explain which races he'd been refering to,
but there had been a whole lot of that.  Solution I suggested in the
last round of that was to offload __aio_poll_complete() via schedule_work()
both for cancel and poll wakeup cases.  Doing the common case right
from poll wakeup callback was argued to avoid noticable overhead in
common situation - that's what "aio: try to complete poll iocbs without
context switch" is about.  I'm more than slightly unhappy about the
lack of performance regression testing in non-AIO case...

At that point I would really like to see replies from Christoph - he's
on CET usually, no idea what his effective timezone is...

^ permalink raw reply

* Diagnosing network module for missing link establishment (cxgb3, Chelsio T320)
From: U.Mutlu @ 2018-06-28 23:17 UTC (permalink / raw)
  To: netdev

Hi,

I've got a pair of used old dual port 10GbE NICs (Chelsio T320
10GBASE-R RNIC (rev 3) PCI Express x4 MSI-X) with 2 modular
transceivers on board the 2 NICs (ie. these can be taken off
of the card for replacement etc.).

The problem is that the cards don't establish a link;
the green LEDs go off after a few seconds after loading
the driver named cxgb3, and there is no indication in the syslog
about any error.

The behavior is the same whether the transceivers are present
on the card or not; the drivers always load successfully.

In the kernel sources the driver is located under
drivers/net/ethernet/chelsio/cxgb3
(I haven't recompiled it; just using the stock kernel from Debian repo).

Below, the interfaces are eth4 and eth5.

There was just once a link, but it never happens again, and
such log entries about link are not happening since then:

Jun 27 13:06:09 c6-local vmunix: [  504.108102] cxgb3 0000:01:00.0 eth4: link 
up, 10Gbps, full-duplex
Jun 27 13:17:30 c6-local vmunix: [ 1185.450256] cxgb3 0000:01:00.0 eth4: link down

So, how can I diagnose and pinpoint what the reason is for not establishing a 
link?


The following is from a later reboot:

# dmesg | grep -i "Chelsio\|cxgb3\|eth[0-9]"
[    0.979800] cxgb3: Chelsio T3 Network Driver - version 1.1.5-ko
[    0.984162] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at 
0xffffc90000002000, 74:d4:35:92:72:1b, XID 0c900800 IRQ 43
[    0.984163] r8169 0000:02:00.0 eth0: jumbo features [frames: 9200 bytes, tx 
checksumming: ko]
[    1.319780] cxgb3 0000:01:00.0: irq 55 for MSI/MSI-X
[    1.319785] cxgb3 0000:01:00.0: irq 56 for MSI/MSI-X
[    1.319788] cxgb3 0000:01:00.0: irq 57 for MSI/MSI-X
[    1.319791] cxgb3 0000:01:00.0: irq 58 for MSI/MSI-X
[    1.319794] cxgb3 0000:01:00.0: irq 59 for MSI/MSI-X
[    1.319797] cxgb3 0000:01:00.0: irq 60 for MSI/MSI-X
[    1.319799] cxgb3 0000:01:00.0: irq 61 for MSI/MSI-X
[    1.319803] cxgb3 0000:01:00.0: irq 62 for MSI/MSI-X
[    1.319805] cxgb3 0000:01:00.0: irq 63 for MSI/MSI-X
[    1.319828] cxgb3 0000:01:00.0: Port 0 using 4 queue sets.
[    1.319868] cxgb3 0000:01:00.0: Port 1 using 4 queue sets.
[    1.319909] cxgb3 0000:01:00.0 eth1: Chelsio T320 10GBASE-R RNIC (rev 3) 
PCI Express x8 MSI-X
[    1.319949] cxgb3: eth1: 128MB CM, 256MB PMTX, 256MB PMRX, S/N: PT37080022
[    1.319986] cxgb3 0000:01:00.0 eth2: Chelsio T320 10GBASE-R RNIC (rev 3) 
PCI Express x8 MSI-X
[    5.217529] systemd-udevd[384]: renamed network interface eth0 to eth3
[    5.342653] systemd-udevd[387]: renamed network interface eth2 to eth4
[    5.358663] systemd-udevd[383]: renamed network interface eth1 to eth5
[   11.563555] r8169 0000:02:00.0 eth3: link down
[   11.563668] r8169 0000:02:00.0 eth3: link down
[   13.917486] r8169 0000:02:00.0 eth3: link up


# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode 
DEFAULT group default
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP 
mode DEFAULT group default qlen 1000
     link/ether 74:d4:35:92:72:1b brd ff:ff:ff:ff:ff:ff
3: eth5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode 
DEFAULT group default qlen 1000
     link/ether 00:07:43:05:8b:16 brd ff:ff:ff:ff:ff:ff
4: eth4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode 
DEFAULT group default qlen 1000
     link/ether 00:07:43:05:8b:17 brd ff:ff:ff:ff:ff:ff


# ethtool eth4
Settings for eth4:
     Supported ports: [ AUI FIBRE ]
     Supported link modes:   10000baseT/Full
     Supported pause frame use: No
     Supports auto-negotiation: No
     Advertised link modes:  Not reported
     Advertised pause frame use: No
     Advertised auto-negotiation: No
     Speed: Unknown!
     Duplex: Unknown! (255)
     Port: FIBRE
     PHYAD: 1
     Transceiver: external
     Auto-negotiation: off
     Supports Wake-on: d
     Wake-on: d
     Current message level: 0x000000ff (255)
                    drv probe link timer ifdown ifup rx_err tx_err
     Link detected: no


# ifconfig
...
eth4      Link encap:Ethernet  HWaddr 00:07:43:05:8b:17
           inet addr:192.168.50.4  Bcast:192.168.50.255  Mask:255.255.255.0
           UP BROADCAST MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
           Interrupt:18 Memory:fe811000-fe811fff

eth5      Link encap:Ethernet  HWaddr 00:07:43:05:8b:16
           inet addr:192.168.60.5  Bcast:192.168.60.255  Mask:255.255.255.0
           UP BROADCAST MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
           Interrupt:18 Memory:fe811000-fe811fff

(I also tried same network 192.168.50 for both, no difference in outcome)


Btw, can a link be established between the 2 transceiver ports on the
same card? I think this should be possible, right?


Kernel: 3.16.0-4-amd64 (Debian v8 stock kernel)

-- 
Thx

^ permalink raw reply

* Re: [PATCH net-next] net: preserve sock reference when scrubbing the skb.
From: Cong Wang @ 2018-06-28 23:18 UTC (permalink / raw)
  To: Flavio Leitner
  Cc: Linux Kernel Network Developers, Eric Dumazet, Paolo Abeni,
	David Miller, Florian Westphal, NetFilter
In-Reply-To: <20180625155610.30802-1-fbl@redhat.com>

On Mon, Jun 25, 2018 at 8:59 AM Flavio Leitner <fbl@redhat.com> wrote:
> XPS breaks because the queue mapping stored in the socket is not
> available, so another random queue might be selected when the stack
> needs to transmit something like a TCP ACK, or TCP Retransmissions.
> That causes packet re-ordering and/or performance issues.

Now let me look at the XPS part, a key question first:

By queue mapping stored in socket, you mean sk_tx_queue_get(),
which is only called in __netdev_pick_tx(), and of course even before
hitting qdisc layer.

However, veth device orphans the skb inside its veth_xmit(),
(dev_forward_skb()), which is after going through qdisc layer.

So, how could the skb_orphan() called _after_ XPS break XPS?

We are talking about a simple netns-to-netns case, so XPS won't
be hit again once leaves it.

Another _dumb_ question:

veth is virtual device, it has literally no queues, I know technically
there is a queue for installing qdisc.

So, why does even queue mapping matters here???

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Linus Torvalds @ 2018-06-28 22:55 UTC (permalink / raw)
  To: Al Viro; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <20180628224930.GM30522@ZenIV.linux.org.uk>

On Thu, Jun 28, 2018 at 3:49 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> aio_poll() is not a problem.  It's wakeup callback that is one.

No, it's not a problem either.

The only thing the wakup callback wants to do is to remove the wait queue entry.

And *that* doesn't need to sleep, and it has absolutely nothing to do
with whether "->poll()" itself sleeps or not.

All that needs to do is "poll_freewait()". Done. In practice, it's
probably only going to be a single free_poll_entry(), but who cares?
The AIO code should handle the "maybe ->poll() added multiple
wait-queues" just fine.

> You are misreading that mess.  What he's trying to do (other than surviving
> the awful clusterfuck around cancels) is to handle the decision what to
> report to userland right in the wakeup callback.  *That* is what really
> drives the "make the second-pass ->poll() or something similar to it
> non-blocking" (in addition to the fact that it is such in considerable
> majority of instances).

That's just crazy BS.

Just call poll() again when you copy the data to userland (which by
definition can block, again).

Stop the idiotic "let's break poll for stupid AIO reasons, because the
AIO people are morons".

Just make the AIO code do the sane thing.

                  Linus

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Al Viro @ 2018-06-28 22:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <CA+55aFzjUeacfpB8F2_J3rmOv3+hXF2zNNWP4UzdXxWTgD=nxQ@mail.gmail.com>

On Thu, Jun 28, 2018 at 03:35:03PM -0700, Linus Torvalds wrote:

> Yes, the AIO poll implementation did it under the spinlock.
> 
> But there's no good *reason* for that.  The "aio_poll()" function
> itself is called in perfectly fine blocking context.

aio_poll() is not a problem.  It's wakeup callback that is one.

> As far as I can tell, Christoph could have just done the first pass
> '->poll()' *without* taking a spinlock, and that adds the table entry
> to the table. Then, *under the spinlock*, you associate the table the
> the kioctx. And then *after* the spinlock, you can call "->poll()"
> again (now with a NULL table pointer), to verify that the state is
> still not triggered. That's the whole point of the two-phgase poll
> thing - the first phase adds the entry to the wait queues, and the
> second phase checks for the race of "did it the event happen in the
> meantime".

You are misreading that mess.  What he's trying to do (other than surviving
the awful clusterfuck around cancels) is to handle the decision what to
report to userland right in the wakeup callback.  *That* is what really
drives the "make the second-pass ->poll() or something similar to it
non-blocking" (in addition to the fact that it is such in considerable
majority of instances).

^ permalink raw reply

* Re: [PATCH RFC 1/2] hwmon: Add support for power min, lcrit, min_alarm and lcrit_alarm
From: Guenter Roeck @ 2018-06-28 22:42 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev, Florian Fainelli, Russell King, vadimp, linux-hwmon
In-Reply-To: <1530218475-4369-2-git-send-email-andrew@lunn.ch>

On Thu, Jun 28, 2018 at 10:41:14PM +0200, Andrew Lunn wrote:
> Some sensors support reporting minimal and lower critical power, as
> well as alarms when these thresholds are reached. Add support for
> these attributes to the hwmon core.
> 
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>

I am inclined to accept this patch immediately. I'll do that
in the next couple of days unless someone gives me a good reason
not to.

Guenter

> ---
>  drivers/hwmon/hwmon.c | 4 ++++
>  include/linux/hwmon.h | 8 ++++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c
> index e88c01961948..33d51281272b 100644
> --- a/drivers/hwmon/hwmon.c
> +++ b/drivers/hwmon/hwmon.c
> @@ -394,12 +394,16 @@ static const char * const hwmon_power_attr_templates[] = {
>  	[hwmon_power_cap_hyst] = "power%d_cap_hyst",
>  	[hwmon_power_cap_max] = "power%d_cap_max",
>  	[hwmon_power_cap_min] = "power%d_cap_min",
> +	[hwmon_power_min] = "power%d_min",
>  	[hwmon_power_max] = "power%d_max",
> +	[hwmon_power_lcrit] = "power%d_lcrit",
>  	[hwmon_power_crit] = "power%d_crit",
>  	[hwmon_power_label] = "power%d_label",
>  	[hwmon_power_alarm] = "power%d_alarm",
>  	[hwmon_power_cap_alarm] = "power%d_cap_alarm",
> +	[hwmon_power_min_alarm] = "power%d_min_alarm",
>  	[hwmon_power_max_alarm] = "power%d_max_alarm",
> +	[hwmon_power_lcrit_alarm] = "power%d_lcrit_alarm",
>  	[hwmon_power_crit_alarm] = "power%d_crit_alarm",
>  };
>  
> diff --git a/include/linux/hwmon.h b/include/linux/hwmon.h
> index 1b74ad11a5a4..b217101ca76e 100644
> --- a/include/linux/hwmon.h
> +++ b/include/linux/hwmon.h
> @@ -188,12 +188,16 @@ enum hwmon_power_attributes {
>  	hwmon_power_cap_hyst,
>  	hwmon_power_cap_max,
>  	hwmon_power_cap_min,
> +	hwmon_power_min,
>  	hwmon_power_max,
>  	hwmon_power_crit,
> +	hwmon_power_lcrit,
>  	hwmon_power_label,
>  	hwmon_power_alarm,
>  	hwmon_power_cap_alarm,
> +	hwmon_power_min_alarm,
>  	hwmon_power_max_alarm,
> +	hwmon_power_lcrit_alarm,
>  	hwmon_power_crit_alarm,
>  };
>  
> @@ -214,12 +218,16 @@ enum hwmon_power_attributes {
>  #define HWMON_P_CAP_HYST		BIT(hwmon_power_cap_hyst)
>  #define HWMON_P_CAP_MAX			BIT(hwmon_power_cap_max)
>  #define HWMON_P_CAP_MIN			BIT(hwmon_power_cap_min)
> +#define HWMON_P_MIN			BIT(hwmon_power_min)
>  #define HWMON_P_MAX			BIT(hwmon_power_max)
> +#define HWMON_P_LCRIT			BIT(hwmon_power_lcrit)
>  #define HWMON_P_CRIT			BIT(hwmon_power_crit)
>  #define HWMON_P_LABEL			BIT(hwmon_power_label)
>  #define HWMON_P_ALARM			BIT(hwmon_power_alarm)
>  #define HWMON_P_CAP_ALARM		BIT(hwmon_power_cap_alarm)
> +#define HWMON_P_MIN_ALARM		BIT(hwmon_power_max_alarm)
>  #define HWMON_P_MAX_ALARM		BIT(hwmon_power_max_alarm)
> +#define HWMON_P_LCRIT_ALARM		BIT(hwmon_power_lcrit_alarm)
>  #define HWMON_P_CRIT_ALARM		BIT(hwmon_power_crit_alarm)
>  
>  enum hwmon_energy_attributes {
> -- 
> 2.18.0.rc2
> 

^ permalink raw reply

* Re: [PATCH RFC 2/2] net: phy: sfp: Add HWMON support for module sensors
From: Guenter Roeck @ 2018-06-28 22:41 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev, Florian Fainelli, Russell King, vadimp, linux-hwmon
In-Reply-To: <1530218475-4369-3-git-send-email-andrew@lunn.ch>

On Thu, Jun 28, 2018 at 10:41:15PM +0200, Andrew Lunn wrote:
> SFP modules can contain a number of sensors. The EEPROM also contains
> recommended alarm and critical values for each sensor, and indications
> of if these have been exceeded. Export this information via
> HWMON. Currently temperature, VCC, bias current, transmit power, and
> possibly receiver power is supported.
> 
> The sensors in the modules can either return calibrate or uncalibrated
> values. Uncalibrated values need to be manipulated, using coefficients
> provided in the SFP EEPROM. Uncalibrated receive power values require
> floating point maths in order to calibrate them. Performing this in
> the kernel is hard. So if the SFP module indicates it uses
> uncalibrated values, RX power is not made available.
> 
> With this hwmon device, it is possible to view the sensor values using
> lm-sensors programs:
> 
> in0:          +3.29 V  (crit min =  +2.90 V, min =  +3.00 V)
>                        (max =  +3.60 V, crit max =  +3.70 V)
> temp1:        +33.0°C  (low  =  -5.0°C, high = +80.0°C)
>                        (crit low = -10.0°C, crit = +85.0°C)
> power1:      1000.00 nW (max = 794.00 uW, min =  50.00 uW)  ALARM (LCRIT)
>                        (lcrit =  40.00 uW, crit = 1000.00 uW)
> curr1:        +0.00 A  (crit min =  +0.00 A, min =  +0.00 A)  ALARM (LCRIT, MIN)
>                        (max =  +0.01 A, crit max =  +0.01 A)
> 
> The scaling sensors performs on the bias current is not particularly
> good. The raw values are more useful:
> 
> curr1:
>   curr1_input: 0.000
>   curr1_min: 0.002
>   curr1_max: 0.010
>   curr1_lcrit: 0.000
>   curr1_crit: 0.011
>   curr1_min_alarm: 1.000
>   curr1_max_alarm: 0.000
>   curr1_lcrit_alarm: 1.000
>   curr1_crit_alarm: 0.000
> 
> In order to keep the I2C overhead to a minimum, the constant values,
> such as limits and calibration coefficients are read once at module
> insertion time. Thus only reading *_input and *_alarm properties
> requires i2c read operations.
> 
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>

Overall looks pretty good. A couple of comments below.

Guenter

> ---
>  drivers/net/phy/sfp.c | 732 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/sfp.h   |  72 ++++-
>  2 files changed, 803 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> index c4c92db86dfa..30428e94b694 100644
> --- a/drivers/net/phy/sfp.c
> +++ b/drivers/net/phy/sfp.c
> @@ -1,5 +1,7 @@
> +#include <linux/ctype.h>
>  #include <linux/delay.h>
>  #include <linux/gpio/consumer.h>
> +#include <linux/hwmon.h>
>  #include <linux/i2c.h>
>  #include <linux/interrupt.h>
>  #include <linux/jiffies.h>
> @@ -131,6 +133,12 @@ struct sfp {
>  	unsigned int sm_retries;
>  
>  	struct sfp_eeprom_id id;
> +#ifdef CONFIG_HWMON

Should probably be "#if IS_ENABLED(CONFIG_HWMON)".
You might also want to add "imply HWMON" to "config SFP".

> +	struct sfp_diag diag;
> +	struct device *hwmon_dev;
> +	char *hwmon_name;
> +#endif
> +
>  };
>  
>  static bool sff_module_supported(const struct sfp_eeprom_id *id)
> @@ -316,6 +324,724 @@ static unsigned int sfp_check(void *buf, size_t len)
>  	return check;
>  }
>  
> +/* hwmon */
> +#ifdef CONFIG_HWMON

#if IS_ENABLED(CONFIG_HWMON)

> +static umode_t sfp_hwmon_is_visible(const void *data,
> +				    enum hwmon_sensor_types type,
> +				    u32 attr, int channel)
> +{
> +	const struct sfp *sfp = data;
> +
> +	switch (type) {
> +	case hwmon_temp:
> +		switch (attr) {
> +		case hwmon_temp_input:
> +		case hwmon_temp_min_alarm:
> +		case hwmon_temp_max_alarm:
> +		case hwmon_temp_lcrit_alarm:
> +		case hwmon_temp_crit_alarm:
> +		case hwmon_temp_min:
> +		case hwmon_temp_max:
> +		case hwmon_temp_lcrit:
> +		case hwmon_temp_crit:
> +			return 0444;
> +		default:
> +			return 0;
> +		}
> +	case hwmon_in:
> +		switch (attr) {
> +		case hwmon_in_input:
> +		case hwmon_in_min_alarm:
> +		case hwmon_in_max_alarm:
> +		case hwmon_in_lcrit_alarm:
> +		case hwmon_in_crit_alarm:
> +		case hwmon_in_min:
> +		case hwmon_in_max:
> +		case hwmon_in_lcrit:
> +		case hwmon_in_crit:
> +			return 0444;
> +		default:
> +			return 0;
> +		}
> +	case hwmon_curr:
> +		switch (attr) {
> +		case hwmon_curr_input:
> +		case hwmon_curr_min_alarm:
> +		case hwmon_curr_max_alarm:
> +		case hwmon_curr_lcrit_alarm:
> +		case hwmon_curr_crit_alarm:
> +		case hwmon_curr_min:
> +		case hwmon_curr_max:
> +		case hwmon_curr_lcrit:
> +		case hwmon_curr_crit:
> +			return 0444;
> +		default:
> +			return 0;
> +		}
> +	case hwmon_power:
> +		/* External calibration of receive power requires
> +		 * floating point arithmetic. Doing that in the kernel
> +		 * is not easy, so just skip it. If the module does
> +		 * not require external calibration, we can however
> +		 * show receiver power, since FP is then not needed.
> +		 */
> +		if (sfp->id.ext.diagmon & SFP_DIAGMON_EXT_CAL &&
> +		    channel == 1)
> +			return 0;

It would be nice if it was possible to convert the floting point to
a fixed point calculation. Would that be possible ?

> +		switch (attr) {
> +		case hwmon_power_input:
> +		case hwmon_power_min_alarm:
> +		case hwmon_power_max_alarm:
> +		case hwmon_power_lcrit_alarm:
> +		case hwmon_power_crit_alarm:
> +		case hwmon_power_min:
> +		case hwmon_power_max:
> +		case hwmon_power_lcrit:
> +		case hwmon_power_crit:
> +			return 0444;
> +		default:
> +			return 0;
> +		}
> +	default:
> +		return 0;
> +	}
> +}
> +
> +/* Sensors values are stored as two bytes, MSB second */
> +static int sfp_hwmon_read_sensor(struct sfp *sfp, int reg, long *value)
> +{
> +	u8 val[2];
> +	int err;
> +
> +	err = sfp_read(sfp, true, reg, val, 2);
> +	if (err < 0)
> +		return err;
> +
> +	*value = val[0] << 8 | val[1];
> +

Any chance to use something like __be16 and be16_to_cpu() ?
You do that elsewhere - why not here ?

> +	return 0;
> +}
> +
> +static void sfp_hwmon_to_rx_power(long *value)
> +{
> +	*value /= 100;

DIV_ROUND_CLOSEST() ? Same for the other divide operations.

> +}
> +
> +static void sfp_hwmon_calibrate(struct sfp *sfp, unsigned int slope, int offset,
> +				long *value)
> +{
> +	if (sfp->id.ext.diagmon & SFP_DIAGMON_EXT_CAL)
> +		*value = (*value * slope) / 256 + offset;
> +}
> +
> +static void sfp_hwmon_calibrate_temp(struct sfp *sfp, long *value)
> +{
> +	sfp_hwmon_calibrate(sfp, be16_to_cpu(sfp->diag.cal_t_slope),
> +			    be16_to_cpu(sfp->diag.cal_t_offset), value);
> +
> +	if (*value >=  0x8000)

etra space (doesn't checkpatch complain about that that ?)

> +		*value -= 0x10000;
> +
> +	*value = *value * 1000 / 256;
> +}
> +
> +static void sfp_hwmon_calibrate_vcc(struct sfp *sfp, long *value)
> +{
> +	sfp_hwmon_calibrate(sfp, be16_to_cpu(sfp->diag.cal_v_slope),
> +			    be16_to_cpu(sfp->diag.cal_v_offset), value);
> +
> +	*value /= 10;
> +}
> +
> +static void sfp_hwmon_calibrate_bias(struct sfp *sfp, long *value)
> +{
> +	sfp_hwmon_calibrate(sfp, be16_to_cpu(sfp->diag.cal_txi_slope),
> +			    be16_to_cpu(sfp->diag.cal_txi_offset), value);
> +
> +	*value /= 500;
> +}
> +
> +static void sfp_hwmon_calibrate_tx_power(struct sfp *sfp, long *value)
> +{
> +	sfp_hwmon_calibrate(sfp, be16_to_cpu(sfp->diag.cal_txpwr_slope),
> +			    be16_to_cpu(sfp->diag.cal_txpwr_offset), value);
> +
> +	*value /= 10;
> +}
> +
> +static int sfp_hwmon_read_temp(struct sfp *sfp, int reg, long *value)
> +{
> +	int err;
> +
> +	err = sfp_hwmon_read_sensor(sfp, reg, value);
> +	if (err < 0)
> +		return err;
> +
> +	sfp_hwmon_calibrate_temp(sfp, value);
> +
> +	return 0;
> +}
> +
> +static int sfp_hwmon_read_vcc(struct sfp *sfp, int reg, long *value)
> +{
> +	int err;
> +
> +	err = sfp_hwmon_read_sensor(sfp, reg, value);
> +	if (err < 0)
> +		return err;
> +
> +	sfp_hwmon_calibrate_vcc(sfp, value);
> +
> +	return 0;
> +}
> +
> +static int sfp_hwmon_read_bias(struct sfp *sfp, int reg, long *value)
> +{
> +	int err;
> +
> +	err = sfp_hwmon_read_sensor(sfp, reg, value);
> +	if (err < 0)
> +		return err;
> +
> +	sfp_hwmon_calibrate_bias(sfp, value);
> +
> +	return 0;
> +}
> +
> +static int sfp_hwmon_read_tx_power(struct sfp *sfp, int reg, long *value)
> +{
> +	int err;
> +
> +	err = sfp_hwmon_read_sensor(sfp, reg, value);
> +	if (err < 0)
> +		return err;
> +
> +	sfp_hwmon_calibrate_tx_power(sfp, value);
> +
> +	return 0;
> +}
> +
> +static int sfp_hwmon_read_rx_power(struct sfp *sfp, int reg, long *value)
> +{
> +	int err;
> +
> +	err = sfp_hwmon_read_sensor(sfp, reg, value);
> +	if (err < 0)
> +		return err;
> +
> +	sfp_hwmon_to_rx_power(value);
> +
> +	return 0;
> +}
> +
> +static int sfp_hwmon_temp(struct sfp *sfp, u32 attr, long *value)
> +{
> +	u8 status;
> +	int err;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		return sfp_hwmon_read_temp(sfp, SFP_TEMP, value);
> +
> +	case hwmon_temp_lcrit:
> +		*value = be16_to_cpu(sfp->diag.temp_low_alarm);
> +		sfp_hwmon_calibrate_temp(sfp, value);
> +		return 0;
> +
> +	case hwmon_temp_min:
> +		*value = be16_to_cpu(sfp->diag.temp_low_warn);
> +		sfp_hwmon_calibrate_temp(sfp, value);
> +		return 0;
> +	case hwmon_temp_max:
> +		*value = be16_to_cpu(sfp->diag.temp_high_warn);
> +		sfp_hwmon_calibrate_temp(sfp, value);
> +		return 0;
> +
> +	case hwmon_temp_crit:
> +		*value = be16_to_cpu(sfp->diag.temp_high_alarm);
> +		sfp_hwmon_calibrate_temp(sfp, value);
> +		return 0;
> +
> +	case hwmon_temp_lcrit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_TEMP_LOW);
> +		return 0;
> +
> +	case hwmon_temp_min_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_TEMP_LOW);
> +		return 0;
> +
> +	case hwmon_temp_max_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_TEMP_HIGH);
> +		return 0;
> +
> +	case hwmon_temp_crit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_TEMP_HIGH);
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static int sfp_hwmon_vcc(struct sfp *sfp, u32 attr, long *value)
> +{
> +	u8 status;
> +	int err;
> +
> +	switch (attr) {
> +	case hwmon_in_input:
> +		return sfp_hwmon_read_vcc(sfp, SFP_VCC, value);
> +
> +	case hwmon_in_lcrit:
> +		*value = be16_to_cpu(sfp->diag.volt_low_alarm);
> +		sfp_hwmon_calibrate_vcc(sfp, value);
> +		return 0;
> +
> +	case hwmon_in_min:
> +		*value = be16_to_cpu(sfp->diag.volt_low_warn);
> +		sfp_hwmon_calibrate_vcc(sfp, value);
> +		return 0;
> +
> +	case hwmon_in_max:
> +		*value = be16_to_cpu(sfp->diag.volt_high_warn);
> +		sfp_hwmon_calibrate_vcc(sfp, value);
> +		return 0;
> +
> +	case hwmon_in_crit:
> +		*value = be16_to_cpu(sfp->diag.volt_high_alarm);
> +		sfp_hwmon_calibrate_vcc(sfp, value);
> +		return 0;
> +
> +	case hwmon_in_lcrit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_VCC_LOW);
> +		return 0;
> +
> +	case hwmon_in_min_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_VCC_LOW);
> +		return 0;
> +
> +	case hwmon_in_max_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_VCC_HIGH);
> +		return 0;
> +
> +	case hwmon_in_crit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_VCC_HIGH);
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static int sfp_hwmon_bias(struct sfp *sfp, u32 attr, long *value)
> +{
> +	u8 status;
> +	int err;
> +
> +	switch (attr) {
> +	case hwmon_curr_input:
> +		return sfp_hwmon_read_bias(sfp, SFP_TX_BIAS, value);
> +
> +	case hwmon_curr_lcrit:
> +		*value = be16_to_cpu(sfp->diag.bias_low_alarm);
> +		sfp_hwmon_calibrate_bias(sfp, value);
> +		return 0;
> +
> +	case hwmon_curr_min:
> +		*value = be16_to_cpu(sfp->diag.bias_low_warn);
> +		sfp_hwmon_calibrate_bias(sfp, value);
> +		return 0;
> +
> +	case hwmon_curr_max:
> +		*value = be16_to_cpu(sfp->diag.bias_high_warn);
> +		sfp_hwmon_calibrate_bias(sfp, value);
> +		return 0;
> +
> +	case hwmon_curr_crit:
> +		*value = be16_to_cpu(sfp->diag.bias_high_alarm);
> +		sfp_hwmon_calibrate_bias(sfp, value);
> +		return 0;
> +
> +	case hwmon_curr_lcrit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_TX_BIAS_LOW);
> +		return 0;
> +
> +	case hwmon_curr_min_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_TX_BIAS_LOW);
> +		return 0;
> +
> +	case hwmon_curr_max_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_TX_BIAS_HIGH);
> +		return 0;
> +
> +	case hwmon_curr_crit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_TX_BIAS_HIGH);
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static int sfp_hwmon_tx_power(struct sfp *sfp, u32 attr, long *value)
> +{
> +	u8 status;
> +	int err;
> +
> +	switch (attr) {
> +	case hwmon_power_input:
> +		return sfp_hwmon_read_tx_power(sfp, SFP_TX_POWER, value);
> +
> +	case hwmon_power_lcrit:
> +		*value = be16_to_cpu(sfp->diag.txpwr_low_alarm);
> +		sfp_hwmon_calibrate_tx_power(sfp, value);
> +		return 0;
> +
> +	case hwmon_power_min:
> +		*value = be16_to_cpu(sfp->diag.txpwr_low_warn);
> +		sfp_hwmon_calibrate_tx_power(sfp, value);
> +		return 0;
> +
> +	case hwmon_power_max:
> +		*value = be16_to_cpu(sfp->diag.txpwr_high_warn);
> +		sfp_hwmon_calibrate_tx_power(sfp, value);
> +		return 0;
> +
> +	case hwmon_power_crit:
> +		*value = be16_to_cpu(sfp->diag.txpwr_high_alarm);
> +		sfp_hwmon_calibrate_tx_power(sfp, value);
> +		return 0;
> +
> +	case hwmon_power_lcrit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_TXPWR_LOW);
> +		return 0;
> +
> +	case hwmon_power_min_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_TXPWR_LOW);
> +		return 0;
> +
> +	case hwmon_power_max_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN0_TXPWR_HIGH);
> +		return 0;
> +
> +	case hwmon_power_crit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM0, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM0_TXPWR_HIGH);
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static int sfp_hwmon_rx_power(struct sfp *sfp, u32 attr, long *value)
> +{
> +	u8 status;
> +	int err;
> +
> +	switch (attr) {
> +	case hwmon_power_input:
> +		return sfp_hwmon_read_rx_power(sfp, SFP_RX_POWER, value);
> +
> +	case hwmon_power_lcrit:
> +		*value = be16_to_cpu(sfp->diag.rxpwr_low_alarm);
> +		sfp_hwmon_to_rx_power(value);
> +		return 0;
> +
> +	case hwmon_power_min:
> +		*value = be16_to_cpu(sfp->diag.rxpwr_low_warn);
> +		sfp_hwmon_to_rx_power(value);
> +		return 0;
> +
> +	case hwmon_power_max:
> +		*value = be16_to_cpu(sfp->diag.rxpwr_high_warn);
> +		sfp_hwmon_to_rx_power(value);
> +		return 0;
> +
> +	case hwmon_power_crit:
> +		*value = be16_to_cpu(sfp->diag.rxpwr_high_alarm);
> +		sfp_hwmon_to_rx_power(value);
> +		return 0;
> +
> +	case hwmon_power_lcrit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM1, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM1_RXPWR_LOW);
> +		return 0;
> +
> +	case hwmon_power_min_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN1, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN1_RXPWR_LOW);
> +		return 0;
> +
> +	case hwmon_power_max_alarm:
> +		err = sfp_read(sfp, true, SFP_WARN1, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_WARN1_RXPWR_HIGH);
> +		return 0;
> +
> +	case hwmon_power_crit_alarm:
> +		err = sfp_read(sfp, true, SFP_ALARM1, &status, sizeof(status));
> +		if (err < 0)
> +			return err;
> +
> +		*value = !!(status & SFP_ALARM1_RXPWR_HIGH);
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static int sfp_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> +			  u32 attr, int channel, long *value)
> +{
> +	struct sfp *sfp = dev_get_drvdata(dev);
> +
> +	switch (type) {
> +	case hwmon_temp:
> +		return sfp_hwmon_temp(sfp, attr, value);
> +	case hwmon_in:
> +		return sfp_hwmon_vcc(sfp, attr, value);
> +	case hwmon_curr:
> +		return sfp_hwmon_bias(sfp, attr, value);
> +	case hwmon_power:
> +		switch (channel) {
> +		case 0:
> +			return sfp_hwmon_tx_power(sfp, attr, value);
> +		case 1:
> +			return sfp_hwmon_rx_power(sfp, attr, value);
> +		default:
> +			return -EOPNOTSUPP;
> +		}
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static const struct hwmon_ops sfp_hwmon_ops = {
> +	.is_visible = sfp_hwmon_is_visible,
> +	.read = sfp_hwmon_read,
> +};
> +
> +static u32 sfp_hwmon_chip_config[] = {
> +	HWMON_C_REGISTER_TZ,
> +	0,
> +};
> +
> +static const struct hwmon_channel_info sfp_hwmon_chip = {
> +	.type = hwmon_chip,
> +	.config = sfp_hwmon_chip_config,
> +};
> +
> +static u32 sfp_hwmon_temp_config[] = {
> +	HWMON_T_INPUT |
> +	HWMON_T_MAX | HWMON_T_MIN |
> +	HWMON_T_MAX_ALARM | HWMON_T_MIN_ALARM |
> +	HWMON_T_CRIT | HWMON_T_LCRIT |
> +	HWMON_T_CRIT_ALARM | HWMON_T_LCRIT_ALARM,
> +	0,
> +};
> +
> +static const struct hwmon_channel_info sfp_hwmon_temp_channel_info = {
> +	.type = hwmon_temp,
> +	.config = sfp_hwmon_temp_config,
> +};
> +
> +static u32 sfp_hwmon_vcc_config[] = {
> +	HWMON_I_INPUT |
> +	HWMON_I_MAX | HWMON_I_MIN |
> +	HWMON_I_MAX_ALARM | HWMON_I_MIN_ALARM |
> +	HWMON_I_CRIT | HWMON_I_LCRIT |
> +	HWMON_I_CRIT_ALARM | HWMON_I_LCRIT_ALARM,
> +	0,
> +};
> +
> +static const struct hwmon_channel_info sfp_hwmon_vcc_channel_info = {
> +	.type = hwmon_in,
> +	.config = sfp_hwmon_vcc_config,
> +};
> +
> +static u32 sfp_hwmon_bias_config[] = {
> +	HWMON_C_INPUT |
> +	HWMON_C_MAX | HWMON_C_MIN |
> +	HWMON_C_MAX_ALARM | HWMON_C_MIN_ALARM |
> +	HWMON_C_CRIT | HWMON_C_LCRIT |
> +	HWMON_C_CRIT_ALARM | HWMON_C_LCRIT_ALARM,
> +	0,
> +};
> +
> +static const struct hwmon_channel_info sfp_hwmon_bias_channel_info = {
> +	.type = hwmon_curr,
> +	.config = sfp_hwmon_bias_config,
> +};
> +
> +static u32 sfp_hwmon_power_config[] = {
> +	/* Transmit power */
> +	HWMON_P_INPUT |
> +	HWMON_P_MAX | HWMON_P_MIN |
> +	HWMON_P_MAX_ALARM | HWMON_P_MIN_ALARM |
> +	HWMON_P_CRIT | HWMON_P_LCRIT |
> +	HWMON_P_CRIT_ALARM | HWMON_P_LCRIT_ALARM,
> +	/* Receive power */
> +	HWMON_P_INPUT |
> +	HWMON_P_MAX | HWMON_P_MIN |
> +	HWMON_P_MAX_ALARM | HWMON_P_MIN_ALARM |
> +	HWMON_P_CRIT | HWMON_P_LCRIT |
> +	HWMON_P_CRIT_ALARM | HWMON_P_LCRIT_ALARM,
> +	0,
> +};
> +
> +static const struct hwmon_channel_info sfp_hwmon_power_channel_info = {
> +	.type = hwmon_power,
> +	.config = sfp_hwmon_power_config,
> +};
> +
> +static const struct hwmon_channel_info *sfp_hwmon_info[] = {
> +	&sfp_hwmon_chip,
> +	&sfp_hwmon_vcc_channel_info,
> +	&sfp_hwmon_temp_channel_info,
> +	&sfp_hwmon_bias_channel_info,
> +	&sfp_hwmon_power_channel_info,
> +	NULL,
> +};
> +
> +static const struct hwmon_chip_info sfp_hwmon_chip_info = {
> +	.ops = &sfp_hwmon_ops,
> +	.info = sfp_hwmon_info,
> +};
> +
> +static int sfp_hwmon_insert(struct sfp *sfp)
> +{
> +	int err, i, j;
> +
> +	if (sfp->id.ext.sff8472_compliance == SFP_SFF8472_COMPLIANCE_NONE)
> +		return 0;
> +
> +	if (!(sfp->id.ext.diagmon & SFP_DIAGMON_DDM))
> +		return 0;
> +
> +	if (sfp->id.ext.diagmon & SFP_DIAGMON_ADDRMODE)
> +		/* This driver in general does not support address
> +		 * change.
> +		 */
> +		return 0;
> +
> +	err = sfp_read(sfp, true, 0, &sfp->diag, sizeof(sfp->diag));
> +	if (err < 0)
> +		return err;
> +
> +	sfp->hwmon_name = devm_kstrdup(sfp->dev, dev_name(sfp->dev),
> +				       GFP_KERNEL);
> +	if (!sfp->hwmon_name)
> +		return -ENODEV;
> +
> +	for (i = j = 0; sfp->hwmon_name[i]; i++) {
> +		if (isalnum(sfp->hwmon_name[i])) {
> +			if (i != j)
> +				sfp->hwmon_name[j] = sfp->hwmon_name[i];
> +			j++;
> +		}
> +	}

It might be better and simpler to replace invalid characters with '_'
instead of dropping them. Also note that '_' is a valid character.
Strictly speaking only "-* \t\n" are invalid.

> +	sfp->hwmon_name[j] = '\0';
> +
Is it possible that j == 0 ?

> +	sfp->hwmon_dev = devm_hwmon_device_register_with_info(sfp->dev,
> +				sfp->hwmon_name, sfp, &sfp_hwmon_chip_info,
> +				NULL);
> +
> +	return PTR_ERR_OR_ZERO(sfp->hwmon_dev);
> +}
> +
> +static void sfp_hwmon_remove(struct sfp *sfp)
> +{
> +	devm_hwmon_device_unregister(sfp->hwmon_dev);

If registartion and removal are not tied to a device, it doesn't make sense
to use devm_ functions. Either use hwmon_device_register_with_info()
and hwmon_device_unregister(), or drop the remove function.

> +}
> +#else
> +static int sfp_hwmon_insert(struct sfp *sfp)
> +{
> +	return 0;
> +}
> +
> +static void sfp_hwmon_remove(struct sfp *sfp)
> +{
> +}
> +#endif
> +
>  /* Helpers */
>  static void sfp_module_tx_disable(struct sfp *sfp)
>  {
> @@ -636,6 +1362,10 @@ static int sfp_sm_mod_probe(struct sfp *sfp)
>  		dev_warn(sfp->dev,
>  			 "module address swap to access page 0xA2 is not supported.\n");
>  
> +	ret = sfp_hwmon_insert(sfp);
> +	if (ret < 0)
> +		return ret;
> +
>  	ret = sfp_module_insert(sfp->sfp_bus, &sfp->id);
>  	if (ret < 0)
>  		return ret;
> @@ -647,6 +1377,8 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
>  {
>  	sfp_module_remove(sfp->sfp_bus);
>  
> +	sfp_hwmon_remove(sfp);
> +
>  	if (sfp->mod_phy)
>  		sfp_sm_phy_detach(sfp);
>  
> diff --git a/include/linux/sfp.h b/include/linux/sfp.h
> index ebce9e24906a..d37518e89db2 100644
> --- a/include/linux/sfp.h
> +++ b/include/linux/sfp.h
> @@ -231,6 +231,50 @@ struct sfp_eeprom_id {
>  	struct sfp_eeprom_ext ext;
>  } __packed;
>  
> +struct sfp_diag {
> +	__be16 temp_high_alarm;
> +	__be16 temp_low_alarm;
> +	__be16 temp_high_warn;
> +	__be16 temp_low_warn;
> +	__be16 volt_high_alarm;
> +	__be16 volt_low_alarm;
> +	__be16 volt_high_warn;
> +	__be16 volt_low_warn;
> +	__be16 bias_high_alarm;
> +	__be16 bias_low_alarm;
> +	__be16 bias_high_warn;
> +	__be16 bias_low_warn;
> +	__be16 txpwr_high_alarm;
> +	__be16 txpwr_low_alarm;
> +	__be16 txpwr_high_warn;
> +	__be16 txpwr_low_warn;
> +	__be16 rxpwr_high_alarm;
> +	__be16 rxpwr_low_alarm;
> +	__be16 rxpwr_high_warn;
> +	__be16 rxpwr_low_warn;
> +	__be16 laser_temp_high_alarm;
> +	__be16 laser_temp_low_alarm;
> +	__be16 laser_temp_high_warn;
> +	__be16 laser_temp_low_warn;
> +	__be16 tec_cur_high_alarm;
> +	__be16 tec_cur_low_alarm;
> +	__be16 tec_cur_high_warn;
> +	__be16 tec_cur_low_warn;
> +	__be32 cal_rxpwr4;
> +	__be32 cal_rxpwr3;
> +	__be32 cal_rxpwr2;
> +	__be32 cal_rxpwr1;
> +	__be32 cal_rxpwr0;
> +	__be16 cal_txi_slope;
> +	__be16 cal_txi_offset;
> +	__be16 cal_txpwr_slope;
> +	__be16 cal_txpwr_offset;
> +	__be16 cal_t_slope;
> +	__be16 cal_t_offset;
> +	__be16 cal_v_slope;
> +	__be16 cal_v_offset;
> +} __packed;
> +
>  /* SFP EEPROM registers */
>  enum {
>  	SFP_PHYS_ID			= 0x00,
> @@ -384,7 +428,33 @@ enum {
>  	SFP_TEC_CUR			= 0x6c,
>  
>  	SFP_STATUS			= 0x6e,
> -	SFP_ALARM			= 0x70,
> +	SFP_ALARM0			= 0x70,
> +	SFP_ALARM0_TEMP_HIGH		= BIT(7),
> +	SFP_ALARM0_TEMP_LOW		= BIT(6),
> +	SFP_ALARM0_VCC_HIGH		= BIT(5),
> +	SFP_ALARM0_VCC_LOW		= BIT(4),
> +	SFP_ALARM0_TX_BIAS_HIGH		= BIT(3),
> +	SFP_ALARM0_TX_BIAS_LOW		= BIT(2),
> +	SFP_ALARM0_TXPWR_HIGH		= BIT(1),
> +	SFP_ALARM0_TXPWR_LOW		= BIT(0),
> +
> +	SFP_ALARM1			= 0x71,
> +	SFP_ALARM1_RXPWR_HIGH		= BIT(7),
> +	SFP_ALARM1_RXPWR_LOW		= BIT(6),
> +
> +	SFP_WARN0			= 0x74,
> +	SFP_WARN0_TEMP_HIGH		= BIT(7),
> +	SFP_WARN0_TEMP_LOW		= BIT(6),
> +	SFP_WARN0_VCC_HIGH		= BIT(5),
> +	SFP_WARN0_VCC_LOW		= BIT(4),
> +	SFP_WARN0_TX_BIAS_HIGH		= BIT(3),
> +	SFP_WARN0_TX_BIAS_LOW		= BIT(2),
> +	SFP_WARN0_TXPWR_HIGH		= BIT(1),
> +	SFP_WARN0_TXPWR_LOW		= BIT(0),
> +
> +	SFP_WARN1			= 0x75,
> +	SFP_WARN1_RXPWR_HIGH		= BIT(7),
> +	SFP_WARN1_RXPWR_LOW		= BIT(6),
>  
>  	SFP_EXT_STATUS			= 0x76,
>  	SFP_VSL				= 0x78,
> -- 
> 2.18.0.rc2
> 

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Linus Torvalds @ 2018-06-28 22:35 UTC (permalink / raw)
  To: Al Viro; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <20180628222016.GL30522@ZenIV.linux.org.uk>

On Thu, Jun 28, 2018 at 3:20 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> The rules for drivers change only in one respect - if your ->poll() is going to
> need to block, check poll_requested_events(pt) & EPOLL_ATOMIC and return EPOLLNVAL
> in such case.

OI still don't even understand why you care.

Yes, the AIO poll implementation did it under the spinlock.

But there's no good *reason* for that.  The "aio_poll()" function
itself is called in perfectly fine blocking context.

The only reason it does it under the spinlock is that apparently
Christoph didn't understand how poll() worked.

As far as I can tell, Christoph could have just done the first pass
'->poll()' *without* taking a spinlock, and that adds the table entry
to the table. Then, *under the spinlock*, you associate the table the
the kioctx. And then *after* the spinlock, you can call "->poll()"
again (now with a NULL table pointer), to verify that the state is
still not triggered. That's the whole point of the two-phgase poll
thing - the first phase adds the entry to the wait queues, and the
second phase checks for the race of "did it the event happen in the
meantime".

There is absolutely no excuse for calling '->poll()' itself under the
spinlock. I don't see any reason for it. The whole "AIO needs this to
avoid races" was always complete and utter bullshit, as far as I can
tell.

So stop it with this crazy and pointless "poll() might block".

IT DAMN WELL SHOULD BE ABLE TO BLOCK, AND NOBODY SANE WILL EVER CARE!

If somebody cares, they are doing things wrong. So fix the AIO code,
don't look at the poll() code, for chrissake!

               Linus

^ permalink raw reply

* Re: [patch net-next v2 0/9] net: sched: introduce chain templates support with offloading to mlxsw
From: Cong Wang @ 2018-06-28 22:25 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	Jakub Kicinski, Simon Horman, john.hurley, David Ahern, mlxsw,
	sridhar.samudrala
In-Reply-To: <20180628130907.951-1-jiri@resnulli.us>

On Thu, Jun 28, 2018 at 6:10 AM Jiri Pirko <jiri@resnulli.us> wrote:
> Add a template of type flower allowing to insert rules matching on last
> 2 bytes of destination mac address:
> # tc chaintemplate add dev dummy0 ingress proto ip flower dst_mac 00:00:00:00:00:00/00:00:00:00:FF:FF
>
> The template is now showed in the list:
> # tc chaintemplate show dev dummy0 ingress
> chaintemplate flower chain 0
>   dst_mac 00:00:00:00:00:00/00:00:00:00:ff:ff
>   eth_type ipv4
>
> Add another template, this time for chain number 22:
> # tc chaintemplate add dev dummy0 ingress proto ip chain 22 flower dst_ip 0.0.0.0/16
> # tc chaintemplate show dev dummy0 ingress
> chaintemplate flower chain 0
>   dst_mac 00:00:00:00:00:00/00:00:00:00:ff:ff
>   eth_type ipv4
> chaintemplate flower chain 22
>   eth_type ipv4
>   dst_ip 0.0.0.0/16

So, if I want to check the template of a chain, I have to use
'tc chaintemplate... chain X'.

If I want to check the filters in a chain, I have to use
'tc filter show .... chain X'.

If you introduce 'tc chain', it would just need one command:
`tc chain show ... X` which could list its template first and
followed by filters in this chain, something like:

# tc chain show dev eth0 chain X
template: # could be none
....
filter1
...
filter2
...

Isn't it more elegant?

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox