netdev.vger.kernel.org archive mirror
From: Heiner Kallweit <hkallweit1@gmail.com>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Guenter Roeck <linux@roeck-us.net>,
	Russell King - ARM Linux <linux@armlinux.org.uk>,
	Paolo Abeni <pabeni@redhat.com>, Jakub Kicinski <kuba@kernel.org>,
	David Miller <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Simon Horman <horms@kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-hwmon@vger.kernel.org" <linux-hwmon@vger.kernel.org>,
	Jean Delvare <jdelvare@suse.com>
Subject: Re: [PATCH net-next 3/3] net: phy: realtek: add hwmon support for temp sensor on RTL822x
Date: Sat, 11 Jan 2025 18:32:35 +0100
Message-ID: <e8bd6c18-1c71-49b0-a513-e38bacac90e7@gmail.com>
In-Reply-To: <0adfb0e4-72b2-48c1-bf65-da75213a5f18@lunn.ch>

On 11.01.2025 18:00, Andrew Lunn wrote:
>> According to Guenter's feedback the alarm attribute must not be written
>> and is expected to be self-clearing on read.
>> If we cleared the alarm in the chip on alarm attribute read, then
>> we could have the following ugly scenario:
>>
>> 1. Temperature threshold is exceeded and chip reduces speed to 1Gbps
>> 2. Temperature is falling below alarm threshold
>> 3. User uses "sensors" to check the current temperature
>> 4. The implicit alarm attribute read causes the chip to clear the
>>    alarm and re-enable 2.5Gbps speed, resulting in the temperature
>>    alarm threshold being exceeded very soon again.
>>
>> What isn't nice here is that it's not transparent to the user that
>> a read-only command from his perspective causes the protective measure
>> of the chip to be cancelled.
>>
>> There's no existing hwmon attribute meant to be used by the user
>> to clear a hw alarm once he has taken measures to protect the chip
>> from overheating.
> 
> It is generally not the kernel's job to implement policy. User space
> should be doing that.
> 
> I see two different possible policies, and there are maybe others:
> 
> 1) The user is happy with one second outages every so often as the
> chip cycles between too hot and down shifting, and cool enough to
> upshift back to the higher speeds.
> 
> 2) The user prefers to have reliable, slower connectivity and needs to
> explicitly do something like down/up the interface to get it back to
> the higher speed.
> 
2) seems to be exactly how I do it currently.

> I personally would say, from a user support view, 2) is better. A
> one-time 1-second break in connectivity and a kernel message is going
> to cause fewer issues.
> 
> Maybe the solution is that the hwmon alarm attribute is not directly
> the hardware bit, but a software interpretation of the system state.
> When the alarm fires, copy it into a software alarm state, but leave
> the hardware alarm alone. A hwmon read clears the software state, but
> leaves the hardware alone. A down/up of the interface will then clear
> both the software and hardware alarm state.
> 
Not clearing the alarm on read is better from a user perspective IMO
(at least for this specific PHY).
As long as the alarm is active, the chip forces a downshift, so clearing
it on read would silently cancel that protection.
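
The split between a software and a hardware alarm state that Andrew
describes above can be sketched as a small state model. This is only a
user-space illustration under stated assumptions, not code from the
patch: struct temp_alarm and the alarm_*() functions are hypothetical
names, and in a real driver the hardware state would live in a PHY
register rather than a bool.

```c
#include <stdbool.h>

/* Hypothetical model of the proposal; not taken from the realtek driver. */
struct temp_alarm {
	bool hw_alarm;	/* latched in the chip; downshift is forced while set */
	bool sw_alarm;	/* what the hwmon alarm attribute would report */
};

/* Chip signals over-temperature: latch both states. */
void alarm_fire(struct temp_alarm *a)
{
	a->hw_alarm = true;
	a->sw_alarm = true;
}

/* hwmon alarm read: report and clear the software state only. */
bool alarm_read(struct temp_alarm *a)
{
	bool was_set = a->sw_alarm;

	a->sw_alarm = false;
	return was_set;
}

/* Interface down/up: clear both states, allowing the higher speed again. */
void alarm_link_cycle(struct temp_alarm *a)
{
	a->hw_alarm = false;
	a->sw_alarm = false;
}
```

After alarm_fire(), two consecutive alarm_read() calls return true and
then false, while hw_alarm stays set the whole time, so in this model
the downshift remains in force until alarm_link_cycle() is called.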

> Anybody wanting policy 1) would then need a daemon polling the state
> and taking action. 2) would be the default.
> 
> How easy is it for you to get into the alarm state? Did you need an
> environmental chamber/oven, or is it happening for you with just lots of
> continuous traffic at typical room temperature? Are we talking about
> cheap USB dongles in a sealed plastic case with poor thermal design
> that are going to be doing this all the time?
> 
I have an M.2 card with RTL8126 (w/o heat sink) and an external RJ45 port.
This card sits in a slot underneath the mainboard of a mini PC. At 2.5Gbps
it makes a big difference whether EEE is active. With EEE it reaches 54°C,
w/o EEE the temperature quickly goes over 70°C. For testing I added a PHY
write to the code which sets the over-temp threshold to 60°C. Then I can
easily trigger overheating by disabling EEE.

On my system the over-temp threshold set by the BIOS (?) is 120°C.
Even w/o heat sink I can hardly imagine that this threshold is ever
reached.
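
To relate these numbers to what user space sees: hwmon sysfs
temperature attributes are reported in millidegrees Celsius. A minimal
sketch of the comparison follows. over_temp() is a hypothetical helper,
and a strict "greater than" comparison is an assumption, since the
exact trigger condition in the chip is not stated here.

```c
#include <stdbool.h>

/* hwmon sysfs temperature values are in millidegrees Celsius. */
#define MDEG_PER_DEG 1000L

/*
 * Hypothetical helper: true if a hwmon-style reading (millidegrees C)
 * is above a threshold given in whole degrees C. Strict ">" is assumed.
 */
bool over_temp(long temp_mdeg, long thresh_deg)
{
	return temp_mdeg > thresh_deg * MDEG_PER_DEG;
}
```

With the 60°C test threshold, the 54°C reading with EEE (54000) stays
below it, while an illustrative reading of 71000 without EEE exceeds it.
Against the 120°C default the same reading is far from the threshold.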

> 	Andrew

Heiner
