From: "Alvin Šipraga" <ALSI@bang-olufsen.dk>
To: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Cc: "Vladimir Oltean" <olteanv@gmail.com>,
"Alvin Šipraga" <alvin@pqrs.dk>,
"Linus Walleij" <linus.walleij@linaro.org>,
"Andrew Lunn" <andrew@lunn.ch>,
"Vivien Didelot" <vivien.didelot@gmail.com>,
"Florian Fainelli" <f.fainelli@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
"Jakub Kicinski" <kuba@kernel.org>,
"Arınç ÜNAL" <arinc.unal@arinc9.com>,
"Michael Rasmussen" <MIR@bang-olufsen.dk>,
"open list:NETWORKING DRIVERS" <netdev@vger.kernel.org>,
"open list" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next 2/2] net: dsa: realtek: rtl8365mb: serialize indirect PHY register access
Date: Thu, 17 Feb 2022 08:16:44 +0000 [thread overview]
Message-ID: <87v8xdrjpw.fsf@bang-olufsen.dk> (raw)
In-Reply-To: <CAJq09z6XBQUTBZoQ81Vy3nUc_5QGTF0GH8V-S3bXOw=JpYODvA@mail.gmail.com> (Luiz Angelo Daros de Luca's message of "Thu, 17 Feb 2022 00:01:45 -0300")
Luiz Angelo Daros de Luca <luizluca@gmail.com> writes:
> Hi Vladimir,
>
>> This implementation where the indirect PHY access blocks out every other
>> register read and write is only justified if you can prove that you can
>> stuff just about any unrelated register read or write before
>> RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG, and this, in and of itself,
>> will poison what gets read back from RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG.
>
> I was the first one trying to fix this issue reported by Arinç with
> SMP devices. At first I thought it was caused by two parallel indirect
> access reads polling the interface (it was not using interrupts). With
> no lock, they will eventually collide and one reads the result of the
> other one. However, a simple lock over the indirect access didn't
> solve the issue. Alvin tested it much further to isolate that indirect
> register access is messed up by any other register read. The fails
> while polling the interface status or the other test Alvin created
> only manifests in a device with multiple cores and mine is single
> core. I do get something similar in a single core device by reading an
> unused register address but it is hard to blame Realtek when we are
> doing something we were not supposed to do. Anyway, that indicates
> that "reading a register" is not an atomic operation inside the switch
> asic.
I never observed any issue which suggests that switch register reads are
not atomic... I mean, they are (and always have been) protected by the
default regmap lock. So what makes you say this?
I have only seen issues related to PHY register access, please enlighten
us if there are other issues.
>
>> rtl8365mb_mib_counter_read() doesn't seem like a particularly good
>> example to prove this, since it appears to be an indirect access
>> procedure as well. Single register reads or writes would be ideal, like
>> RTL8365MB_CPU_CTRL_REG, artificially inserted into strategic places.
>> Ideally you wouldn't even have a DSA or MDIO or PHY driver running.
>> Just a simple kernel module with access to the regmap, and try to read
>> something known, like the PHY ID of one of the internal PHYs, via an
>> open-coded function. Then add extra regmap accesses and see what
>> corrupts the indirect PHY access procedure.
>
> The MIB might be just another example where the issue happens. It was
> first noticed with a SMP device without interruptions configured. I
> believe it will always fail with that configuration.
As I stated in the last thread, I tested MIB access and the problem did
not manifest itself there.
>
>> Are Realtek aware of this and do they confirm the issue? Sounds like
>> erratum material to me, and a pretty severe one, at that. Alternatively,
>> we may simply not be understanding the hardware architecture, like for
>> example the fact that MIB indirect access and PHY indirect access may
>> share some common bus and must be sequential w.r.t. each other.
>
> The realtek "API/driver" does exactly how the driver was doing. They
> do have a lock/unlock placeholder, but only in the equivalent
> regmap_{read,write} functions. Indirect access does not use locks at
> all (in fact, there is no other mention of "lock" elsewhere), even
> being obvious that it is not thread-safe. It was just with a DSA
> driver that we started to exercise register access for real, specially
> without interruptions. And even in that case, we could only notice
> this issue in multicore devices. I believe that, if they know about
> this issue, they might not be worried because it has never affected a
> real device. It would be very interesting to hear from Realtek but I
> do not have the contacts.
This is not true, at least with the sources I am reading. As I said in
my reply to Vladimir, the Realtek code takes a lock around each
top-level API call. Example:
rtk_api_ret_t rtk_port_phyStatus_get(...)
{
rtk_api_ret_t retVal;
if (NULL == RT_MAPPER->port_phyStatus_get)
return RT_ERR_DRIVER_NOT_FOUND;
RTK_API_LOCK();
retVal = RT_MAPPER->port_phyStatus_get(port, pLinkStatus, pSpeed, pDuplex);
RTK_API_UNLOCK();
return retVal;
}
Deep down in this port_phyStatus_get() callback, the indirect PHY
register access takes place.
Kind regards,
Alvin
next prev parent reply other threads:[~2022-02-17 8:16 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-16 16:04 [PATCH net-next 0/2] net: dsa: realtek: fix PHY register read corruption Alvin Šipraga
2022-02-16 16:04 ` [PATCH net-next 1/2] net: dsa: realtek: allow subdrivers to externally lock regmap Alvin Šipraga
2022-02-16 16:05 ` [PATCH net-next 2/2] net: dsa: realtek: rtl8365mb: serialize indirect PHY register access Alvin Šipraga
2022-02-16 23:39 ` Vladimir Oltean
2022-02-17 3:01 ` Luiz Angelo Daros de Luca
2022-02-17 8:16 ` Alvin Šipraga [this message]
2022-02-22 0:18 ` Luiz Angelo Daros de Luca
2022-02-17 7:41 ` Alvin Šipraga
2022-02-17 11:17 ` Vladimir Oltean
2022-02-17 12:51 ` Alvin Šipraga
2022-02-21 14:50 ` Alvin Šipraga
2022-02-21 17:15 ` Vladimir Oltean
2022-02-21 18:10 ` Alvin Šipraga
2022-02-16 17:57 ` [PATCH net-next 0/2] net: dsa: realtek: fix PHY register read corruption Luiz Angelo Daros de Luca
2022-02-16 18:23 ` Alvin Šipraga
2022-02-16 19:11 ` Andrew Lunn
2022-02-16 19:26 ` Alvin Šipraga
2022-02-17 12:12 ` Andrew Lunn
2022-02-17 13:09 ` Alvin Šipraga
2022-02-17 13:32 ` Andrew Lunn
2022-02-17 4:28 ` Luiz Angelo Daros de Luca
2022-02-17 7:53 ` Alvin Šipraga
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87v8xdrjpw.fsf@bang-olufsen.dk \
--to=alsi@bang-olufsen.dk \
--cc=MIR@bang-olufsen.dk \
--cc=alvin@pqrs.dk \
--cc=andrew@lunn.ch \
--cc=arinc.unal@arinc9.com \
--cc=davem@davemloft.net \
--cc=f.fainelli@gmail.com \
--cc=kuba@kernel.org \
--cc=linus.walleij@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luizluca@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=olteanv@gmail.com \
--cc=vivien.didelot@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.