public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
From: Cristian Marussi <cristian.marussi@arm.com>
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, sudeep.holla@arm.com,
	james.quinlan@broadcom.com, Jonathan.Cameron@huawei.com,
	etienne.carriere@linaro.org, vincent.guittot@linaro.org,
	souvik.chakravarty@arm.com
Subject: Re: [PATCH 11/22] firmware: arm_scmi: Add SCMIv3.1 extended names protocols support
Date: Wed, 15 Jun 2022 18:32:31 +0100	[thread overview]
Message-ID: <YqoXgsbg6t46rZeI@e120937-lin> (raw)
In-Reply-To: <f194f9b6-578d-ae08-16c3-6d464da10452@gmail.com>

On Wed, Jun 15, 2022 at 10:19:53AM -0700, Florian Fainelli wrote:
> On 6/15/22 09:29, Cristian Marussi wrote:
> > On Wed, Jun 15, 2022 at 09:10:03AM -0700, Florian Fainelli wrote:
> > > On 6/15/22 02:40, Cristian Marussi wrote:
> > > > On Wed, Jun 15, 2022 at 09:18:03AM +0100, Cristian Marussi wrote:
> > > > > On Wed, Jun 15, 2022 at 05:45:11AM +0200, Florian Fainelli wrote:
> > > > > > 
> > > > > > 
> > > > > > On 3/30/2022 5:05 PM, Cristian Marussi wrote:
> > > > > > > Using the common protocol helper implementation add support for all new
> > > > > > > SCMIv3.1 extended names commands related to all protocols with the
> > > > > > > exception of SENSOR_AXIS_GET_NAME.
> > > > > > > 
> > > > > > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
> > > > > > 
> > > > > > This causes the following splat on a platform where regulators fail to
> > > > > > initialize:
> > > > > > 
> > > > > 
> > > > > Hi Florian,
> > > > > 
> > > > > thanks for the report.
> > > > > 
> > > > > It seems a memory error while allocating so it was not meant to be
> > > > > solved by the fixes, anyway, I've never seen this splat in my testing
> > > > > and at first sight I cannot see anything wrong in the devm_k* calls
> > > > > inside scmi_voltage_protocol_init...is there any particular config in
> > > > > your setup ?
> > > > > 
> > > > > Moreover, the WARNING line 5402 seems to match v5.19-rc1 and it has
> > > > > slightly changed with -rc-1, so I'll try rebasing on that at first and
> > > > > see if I can reproduce the issue locally.
> > > > > 
> > > > 
> > > > I just re-tested the series rebased on v519-rc1 plus fixes and I cannot
> > > > reproduce in my setup with a few (~9) good and bad voltage domains.
> > > > 
> > > > How many voltage domains are advertised by the platform in your setup ?
> > > 
> > > There are 11 voltage regulators on this platform, and of course, now that I
> > > am trying to reproduce the splat I reported I just cannot anymore... I will
> > > let you know if there is anything that needs to be done. Thanks for being
> > > responsive as usual!
> > 
> > ... you're welcome...
> > 
> > I'm trying to figure out where an abnormal mem request could happen...
> 
> I think the problem is/was with the number of voltage domains being reported
> which was way too big and passed directly, without any clamping to
> devm_kcalloc() resulting the splat indicating that the allocation was beyond
> the MAX_ORDER. The specification allows for up to 2^16 domains which would
> still be way too much to allocate using kmalloc() so we could/should
> consider vmalloc() here eventually?
> 

Yes I was suspicious about the same and I was starting to think about vmalloc
in general across all domain enumerations even if this is may not be the case
for your splat...

> In all likelihood though we probably won't find a system with 65k voltage
> domains.
> 
> > 
> > can you try adding this (for brutal debugging) when you try ?
> > (...just to rule out funny fw replies.... :D)
> 
> Sure, nothing weird coming out and it succeeded in enumerating all of the
> regulators, I smell a transient issue with our firmware implementation,
> maybe...
> 
> [    0.560544] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.560617] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.560673] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.560730] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.560881] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.560940] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.560996] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.561054] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.561110] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.561168] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.561225] arm-scmi brcm_scmi@0: num_returned:1  num_remaining:0
> [    0.561652] scmi-regulator scmi_dev.2: Regulator stb_vreg_2 registered
> for domain [2]
> [    0.561858] scmi-regulator scmi_dev.2: Regulator stb_vreg_3 registered
> for domain [3]
> [    0.562030] scmi-regulator scmi_dev.2: Regulator stb_vreg_4 registered
> for domain [4]
> [    0.562190] scmi-regulator scmi_dev.2: Regulator stb_vreg_5 registered
> for domain [5]
> [    0.564427] scmi-regulator scmi_dev.2: Regulator stb_vreg_6 registered
> for domain [6]
> [    0.564638] scmi-regulator scmi_dev.2: Regulator stb_vreg_7 registered
> for domain [7]
> [    0.564817] scmi-regulator scmi_dev.2: Regulator stb_vreg_8 registered
> for domain [8]
> [    0.565030] scmi-regulator scmi_dev.2: Regulator stb_vreg_9 registered
> for domain [9]
> [    0.565191] scmi-regulator scmi_dev.2: Regulator stb_vreg_10 registered
> for domain [10]
> 

Ok, this was another place where a reply carrying a consistent number of
non-segmented entries could trigger an jumbo-request to kmalloc...

..maybe this is where a transient fw issue can reply with a dramatic
numbers of possible (non-segmented) levels (num_returned+num_remaining)

(I filter out replies carrying descriptors for segmented voltage-levels
that does not come in triplets (returned:3  remaining:0) since they are out
of spec...but I just hit today on another fw sending such out of spec
reply for clocks and I will relax such req probably not to break too
many out-of-spec blobs out in the wild :P)

So let me know if got new splats...

Thanks,
Cristian


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-06-15 17:33 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-30 15:05 [PATCH 00/22] SCMIv3.1 Miscellaneous changes Cristian Marussi
2022-03-30 15:05 ` [PATCH 01/22] firmware: arm_scmi: Fix sorting of retrieved clock rates Cristian Marussi
2022-03-30 15:05 ` [PATCH 02/22] firmware: arm_scmi: Make protocols init fail on basic errors Cristian Marussi
2022-04-26 15:35   ` Sudeep Holla
2022-04-26 16:25     ` Cristian Marussi
2022-04-28 10:25       ` Sudeep Holla
2022-04-28 12:07         ` Cristian Marussi
2022-03-30 15:05 ` [PATCH 03/22] firmware: arm_scmi: Fix Base list protocols enumeration Cristian Marussi
2022-03-30 15:05 ` [PATCH 04/22] firmware: arm_scmi: Validate BASE_DISCOVER_LIST_PROTOCOLS reply Cristian Marussi
2022-04-28 10:07   ` Sudeep Holla
2022-04-28 13:45     ` Cristian Marussi
2022-04-28 13:55       ` Sudeep Holla
2022-04-28 14:03         ` Cristian Marussi
2022-03-30 15:05 ` [PATCH 05/22] firmware: arm_scmi: Dynamically allocate protocols array Cristian Marussi
2022-04-28 10:27   ` Sudeep Holla
2022-03-30 15:05 ` [PATCH 06/22] firmware: arm_scmi: Make name_get operations return a const Cristian Marussi
2022-03-30 15:05 ` [PATCH 07/22] firmware: arm_scmi: Check CLOCK_RATE_SET_COMPLETE async reply Cristian Marussi
2022-03-30 15:05 ` [PATCH 08/22] firmware: arm_scmi: Remove unneeded NULL termination of clk name Cristian Marussi
2022-03-30 15:05 ` [PATCH 09/22] firmware: arm_scmi: Split protocol specific definitions in a dedicated header Cristian Marussi
2022-03-30 15:05 ` [PATCH 10/22] firmware: arm_scmi: Introduce a common SCMIv3.1 .extended_name_get helper Cristian Marussi
2022-03-30 15:05 ` [PATCH 11/22] firmware: arm_scmi: Add SCMIv3.1 extended names protocols support Cristian Marussi
2022-06-15  3:45   ` Florian Fainelli
2022-06-15  8:17     ` Cristian Marussi
2022-06-15  9:40       ` Cristian Marussi
2022-06-15 16:10         ` Florian Fainelli
2022-06-15 16:29           ` Cristian Marussi
2022-06-15 17:19             ` Florian Fainelli
2022-06-15 17:32               ` Cristian Marussi [this message]
2022-06-15 22:58                 ` Florian Fainelli
2022-03-30 15:05 ` [PATCH 12/22] firmware: arm_scmi: Parse clock_enable_latency conditionally Cristian Marussi
2022-03-30 15:05 ` [PATCH 13/22] firmware: arm_scmi: Add iterators for multi-part commands Cristian Marussi
2022-03-30 15:05 ` [PATCH 14/22] firmware: arm_scmi: Use common iterators in Sensor protocol Cristian Marussi
2022-03-30 15:05 ` [PATCH 15/22] firmware: arm_scmi: Add SCMIv3.1 SENSOR_AXIS_NAME_GET support Cristian Marussi
2022-06-02 14:25   ` Peter Hilber
2022-06-06  8:18     ` Cristian Marussi
2022-06-08  8:40       ` Peter Hilber
2022-06-08  8:49         ` Cristian Marussi
2022-03-30 15:05 ` [PATCH 16/22] firmware: arm_scmi: Use common iterators in Clock protocol Cristian Marussi
2022-03-30 15:05 ` [PATCH 17/22] firmware: arm_scmi: Use common iterators in Voltage protocol Cristian Marussi
2022-03-30 15:05 ` [PATCH 18/22] firmware: arm_scmi: Use common iterators in Perf protocol Cristian Marussi
2022-03-30 15:05 ` [PATCH 19/22] firmware: arm_scmi: Add SCMIv3.1 Clock notifications Cristian Marussi
2022-03-30 15:05 ` [PATCH 20/22] firmware: arm_scmi: Add SCMIv3.1 VOLTAGE_LEVEL_SET_COMPLETE Cristian Marussi
2022-03-30 15:05 ` [PATCH 21/22] firmware: arm_scmi: Add SCMI v3.1 Perf power-cost in microwatts Cristian Marussi
2022-03-30 16:46   ` Lukasz Luba
2022-03-30 15:05 ` [PATCH 22/22] firmware: arm_scmi: Add SCMIv3.1 PERFORMANCE_LIMITS_SET checks Cristian Marussi
2022-04-28 13:13   ` Sudeep Holla
2022-04-28 13:49     ` Cristian Marussi
2022-04-28 13:52       ` Sudeep Holla
2022-04-28 13:46 ` [PATCH 00/22] SCMIv3.1 Miscellaneous changes Sudeep Holla
2022-05-03  8:03 ` Sudeep Holla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YqoXgsbg6t46rZeI@e120937-lin \
    --to=cristian.marussi@arm.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=etienne.carriere@linaro.org \
    --cc=f.fainelli@gmail.com \
    --cc=james.quinlan@broadcom.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=souvik.chakravarty@arm.com \
    --cc=sudeep.holla@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox