netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Lunn <andrew@lunn.ch>
To: Igor Russkikh <Igor.Russkikh@aquantia.com>
Cc: "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Subject: Re: [PATCH net 2/2] net: usb: aqc111: Support for thermal throttling feature
Date: Wed, 12 Dec 2018 17:08:11 +0100	[thread overview]
Message-ID: <20181212160811.GF1549@lunn.ch> (raw)
In-Reply-To: <658f5fab466face2aed2d53dabfe3af16595019a.1544622414.git.igor.russkikh@aquantia.com>

On Wed, Dec 12, 2018 at 01:50:10PM +0000, Igor Russkikh wrote:
> From: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
> 
> Lab testing shows that chip may get overheated when
> network link is connected on high speed (2.5G/5G).
> 
> Default hardware design uses only passive heatsink without a cooler,
> and that makes things worse.
> 
> To prevent possible chip damage here we add throttling mechanism.
> 
> There is a worker that monitors the PHY's temperature via reading
> PHY registers. When PHY's temperature reaches the critical threshold
> (it is 106 degrees of Celsius) it changes the link speed to the lowest
> regardless user/default link speed settings. It should reduce the PHY's
> temperature to normal numbers.
> 
> When the PHY's temparature is back to normal (low threshold, it is
> 85 degrees) it restores user/default link speed settings.

Hi Igor

Please could you also export the temperature using HWMON. The Marvell
PHY drivers are good examples.

I also have a patch for driver/net/phy/aquantia.c which adds HWMON
support to that PHY.

> 
> Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
> ---
>  drivers/net/usb/aqc111.c | 78 ++++++++++++++++++++++++++++++++++++++++
>  drivers/net/usb/aqc111.h |  8 +++++
>  2 files changed, 86 insertions(+)
> 
> diff --git a/drivers/net/usb/aqc111.c b/drivers/net/usb/aqc111.c
> index 8867f6a3eaa7..fa6b9dfc2a6f 100644
> --- a/drivers/net/usb/aqc111.c
> +++ b/drivers/net/usb/aqc111.c
> @@ -17,6 +17,7 @@
>  #include <linux/usb/cdc.h>
>  #include <linux/usb/usbnet.h>
>  #include <linux/linkmode.h>
> +#include <linux/workqueue.h>
>  
>  #include "aqc111.h"
>  
> @@ -334,6 +335,9 @@ static void aqc111_set_phy_speed(struct usbnet *dev, u8 autoneg, u16 speed)
>  	aqc111_data->phy_cfg |= (3 << AQ_DSH_RETRIES_SHIFT) &
>  				AQ_DSH_RETRIES_MASK;
>  
> +	if (aqc111_data->thermal_throttling)
> +		speed = SPEED_100;
> +

That is a big jump. Do you need to cool is down quickly, or would
1Gbps be sufficient? I think as a user, i would prefer it ping-pongs
between 5G and 1G, not 5G and 100M.

>  	if (autoneg == AUTONEG_ENABLE) {
>  		switch (speed) {
>  		case SPEED_5000:

It looks like this only works when auto-neg is enabled. If i've fixed
configured it i don't think this works?

> @@ -714,6 +718,8 @@ static int aqc111_bind(struct usbnet *dev, struct usb_interface *intf)
>  	/* store aqc111_data pointer in device data field */
>  	dev->driver_priv = aqc111_data;
>  
> +	aqc111_data->dev = dev;
> +
>  	/* Init the MAC address */
>  	ret = aqc111_read_perm_mac(dev);
>  	if (ret)
> @@ -991,6 +997,71 @@ static int aqc111_link_reset(struct usbnet *dev)
>  	return 0;
>  }
>  
> +int aqc111_get_temperature(struct usbnet *dev, u32 *temperature)
> +{
> +	u16 reg16 = 0;
> +
> +	if (aqc111_mdio_read(dev, AQ_GLB_THERMAL_STATUS2, AQ_PHY_GLOBAL_ADDR,
> +			     &reg16) < 0)
> +		goto err;
> +
> +	if (!(reg16 & AQ_THERM_ST_READY))
> +		goto err;
> +
> +	if (aqc111_mdio_read(dev, AQ_GLB_THERMAL_STATUS1, AQ_PHY_GLOBAL_ADDR,
> +			     &reg16) < 0)
> +		goto err;
> +
> +	/*convert from 1/256 to 1/100 degrees of Celsius*/
> +	*temperature = (u32)reg16 * 100 / 256;
> +	return 0;
> +
> +err:
> +	*temperature = 0;
> +	return -1;
> +}
> +
> +void aqc111_thermal_work_cb(struct work_struct *w)
> +{
> +	struct delayed_work *dw = to_delayed_work(w);
> +	struct aqc111_data *aqc111_data = container_of(dw, struct aqc111_data,
> +						       therm_work);
> +	unsigned long timeout = msecs_to_jiffies(AQ_THERMAL_TIMER_MS);
> +	struct usbnet *dev = aqc111_data->dev;
> +	u32 temperature = 0;
> +	u8 reset_speed = 0;
> +
> +	if (!aqc111_data->link)
> +		/* poll not so frequently */
> +		timeout *= 2;
> +
> +	if (aqc111_get_temperature(dev, &temperature) != 0)
> +		goto end;
> +
> +	if (aqc111_data->thermal_throttling &&
> +	    temperature <= AQ_NORMAL_TEMP_THRESHOLD) {
> +		netdev_info(dev->net, "The temperature is back to normal(%u)",
> +			    AQ_NORMAL_TEMP_THRESHOLD / 100);

How often do you see these messages? I'm wondering if they need to be
rate limited?

> +		aqc111_data->thermal_throttling = 0;
> +		reset_speed = 1;
> +	}
> +
> +	if (!aqc111_data->thermal_throttling &&
> +	    temperature >= AQ_CRITICAL_TEMP_THRESHOLD) {

Should there be some hysteresis in here? In fact, if temperature is
AQ_CRITICAL_TEMP_THRESHOLD it is both back to normal, and throttled at
the same time!

> +		netdev_warn(dev->net, "Critical temperature(%u) is reached",
> +			    AQ_CRITICAL_TEMP_THRESHOLD / 100);
> +		aqc111_data->thermal_throttling = 1;
> +		reset_speed = 1;

update_speed might be a better name, since you are not always
resetting it.

	  Andrew

  reply	other threads:[~2018-12-12 16:08 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-12 13:50 [PATCH net 0/2] aqc111: Thermal throttling feature Igor Russkikh
2018-12-12 13:50 ` [PATCH net 1/2] net: usb: aqc111: Add read_mdio operation Igor Russkikh
2018-12-12 16:11   ` Andrew Lunn
2018-12-12 19:38     ` Igor Russkikh
2018-12-12 13:50 ` [PATCH net 2/2] net: usb: aqc111: Support for thermal throttling feature Igor Russkikh
2018-12-12 16:08   ` Andrew Lunn [this message]
2018-12-12 19:34     ` Igor Russkikh
2018-12-12 20:28       ` Andrew Lunn
2018-12-12 13:54 ` [PATCH net 0/2] aqc111: Thermal " Igor Russkikh
2018-12-12 18:15   ` Florian Fainelli
2018-12-12 20:08     ` Igor Russkikh
2018-12-12 20:28       ` Florian Fainelli
2018-12-12 20:38       ` Andrew Lunn
2018-12-13  0:43         ` David Miller
2018-12-14 11:43           ` Igor Russkikh
2018-12-13  0:18 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181212160811.GF1549@lunn.ch \
    --to=andrew@lunn.ch \
    --cc=Dmitry.Bezrukov@aquantia.com \
    --cc=Igor.Russkikh@aquantia.com \
    --cc=davem@davemloft.net \
    --cc=linux-usb@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).