From: Heiner Kallweit <hkallweit1@gmail.com>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Guenter Roeck <linux@roeck-us.net>,
Russell King - ARM Linux <linux@armlinux.org.uk>,
Paolo Abeni <pabeni@redhat.com>, Jakub Kicinski <kuba@kernel.org>,
David Miller <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Simon Horman <horms@kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-hwmon@vger.kernel.org" <linux-hwmon@vger.kernel.org>,
Jean Delvare <jdelvare@suse.com>
Subject: Re: [PATCH net-next 3/3] net: phy: realtek: add hwmon support for temp sensor on RTL822x
Date: Sat, 11 Jan 2025 18:32:35 +0100
Message-ID: <e8bd6c18-1c71-49b0-a513-e38bacac90e7@gmail.com>
In-Reply-To: <0adfb0e4-72b2-48c1-bf65-da75213a5f18@lunn.ch>

On 11.01.2025 18:00, Andrew Lunn wrote:
>> According to Guenter's feedback the alarm attribute must not be written
>> and is expected to be self-clearing on read.
>> If we would clear the alarm in the chip on alarm attribute read, then
>> we can have the following ugly scenario:
>>
>> 1. Temperature threshold is exceeded and chip reduces speed to 1Gbps
>> 2. Temperature is falling below alarm threshold
>> 3. User uses "sensors" to check the current temperature
>> 4. The implicit alarm attribute read causes the chip to clear the
>> alarm and re-enable 2.5Gbps speed, resulting in the temperature
>> alarm threshold being exceeded very soon again.
>>
>> What isn't nice here is that it's not transparent to the user that
>> a read-only command from his perspective causes the protective measure
>> of the chip to be cancelled.
>>
>> There's no existing hwmon attribute meant to be used by the user
>> to clear a hw alarm once he took measures to protect the chip
>> from overheating.
>
> It is generally not the kernel's job to implement policy. User space
> should be doing that.
>
> I see two different possible policies, and there are maybe others:
>
> 1) The user is happy with one second outages every so often as the
> chip cycles between too hot and down shifting, and cool enough to
> upshift back to the higher speeds.
>
> 2) The user prefers to have reliable, slower connectivity and needs to
> explicitly do something like down/up the interface to get it back to
> the higher speed.
>

This seems to be exactly how I do it currently.

> I personally would say, from a user support view, 2) is better. A one
> time 1 second break in connectivity and a kernel message is going to
> cause less issues.
>
> Maybe the solution is that the hwmon alarm attribute is not directly
> the hardware bit, but a software interpretation of the system state.
> When the alarm fires, copy it into a software alarm state, but leave
> the hardware alarm alone. A hwmon read clears the software state, but
> leaves the hardware alone. A down/up of the interface will then clear
> both the software and hardware alarm state.
>

Not clearing the alarm on read is better from a user perspective IMO
(at least for this specific PHY).
As long as the alarm is active, the chip forces a downshift.

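To make the software-alarm idea quoted above concrete, here is a minimal
userspace sketch of the state handling, assuming the semantics as described
in the quote: the alarm is copied into a software state when it fires, a
hwmon read clears only the software copy, and an interface down/up clears
both. All names (temp_alarm_state, alarm_fire, alarm_read, link_cycle,
is_downshifted) are hypothetical and not taken from the realtek driver or
the patch; no real PHY registers are modelled.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the realtek driver. */
struct temp_alarm_state {
	bool hw_alarm;	/* models the chip's latched over-temp alarm */
	bool sw_alarm;	/* software copy reported via the hwmon alarm attribute */
};

/* Over-temp event: the hardware latches the alarm, software copies it. */
static void alarm_fire(struct temp_alarm_state *st)
{
	st->hw_alarm = true;
	st->sw_alarm = true;
}

/* hwmon alarm read: report and clear only the software copy. */
static bool alarm_read(struct temp_alarm_state *st)
{
	bool val = st->sw_alarm;

	st->sw_alarm = false;
	return val;
}

/* Interface down/up: the only path that clears the hardware alarm. */
static void link_cycle(struct temp_alarm_state *st)
{
	st->hw_alarm = false;
	st->sw_alarm = false;
}

/* The chip forces a downshift for as long as the hardware alarm is active. */
static bool is_downshifted(const struct temp_alarm_state *st)
{
	return st->hw_alarm;
}
```

In this model, running "sensors" after an over-temp event reads the alarm
once as set and afterwards as clear, while the downshift stays in place
until the interface is cycled.
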
> Anybody wanting policy 1) would then need a daemon polling the state
> and taking action. 2) would be the default.
>
> How easy is it for you to get into the alarm state? Did you need an
> environment chamber/oven, or is it happening for you with just lots of
> continuous traffic at typical room temperature? Are we talking about
> cheap USB dangles in a sealed plastic case with poor thermal design
> are going to be doing this all the time?
>

I have an M.2 card with RTL8126 (w/o heat sink) and an external RJ45 port.
This card sits in a slot underneath the mainboard of a mini PC. At 2.5Gbps
it makes a big difference whether EEE is active. With EEE it reaches 54°C,
w/o EEE the temperature quickly goes over 70°C. For tests I add a PHY write
to the code which sets the over-temp threshold to 60°C. Then I can
easily trigger overheating by disabling EEE.
On my system the over-temp threshold set by the BIOS (?) is 120°C.
Even w/o heat sink I can hardly imagine that this threshold is ever
reached.

> Andrew

Heiner