From: Catalin Marinas <catalin.marinas@arm.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: "Russell King (Oracle)" <linux@armlinux.org.uk>,
	Sudeep Holla <sudeep.holla@arm.com>,
	"Christoph Lameter (Ampere)" <cl@linux.com>,
	Mark Rutland <mark.rutland@arm.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <vireshk@kernel.org>, Will Deacon <will@kernel.org>,
	Jonathan.Cameron@huawei.com, Matteo.Carlini@arm.com,
	Valentin.Schneider@arm.com, akpm@linux-foundation.org,
	anshuman.khandual@arm.com, Eric Mackay <eric.mackay@oracle.com>,
	dave.kleikamp@oracle.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	robin.murphy@arm.com, vanshikonda@os.amperecomputing.com,
	yang@os.amperecomputing.com, Nishanth Menon <nm@ti.com>,
	Stephen Boyd <sboyd@kernel.org>
Subject: Re: [PATCH v3] ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
Date: Thu, 14 Mar 2024 13:57:54 +0000
Message-ID: <ZfMCYl7GffVcLEUN@arm.com>
In-Reply-To: <a210104f-a3af-4554-b734-097cfa77a470@samsung.com>

On Thu, Mar 14, 2024 at 01:28:40PM +0100, Marek Szyprowski wrote:
> On 14.03.2024 09:39, Catalin Marinas wrote:
> > On Wed, Mar 13, 2024 at 05:13:33PM +0000, Russell King wrote:
> >> So, I wonder whether what you're seeing is a latent bug which is
> >> being tickled by the presence of the CPU masks being off-stack
> >> changing the kernel timing.
> >>
> >> I would suggest the printk debug approach may help here to see when
> >> the OPPs are begun to be parsed, when they're created etc and their
> >> timing relationship to being used. Given the suspicion, it's possible
> >> that the mere addition of printk() may "fix" the problem, which again
> >> would be another semi-useful data point.
> > It might be an init order problem. Passing "initcall_debug" on the
> > cmdline might help a bit.
> >
> > It would also be useful in dev_pm_opp_set_config(), in the WARN_ON
> > block, to print opp_table->opp_list.next to get an idea whether it looks
> > like a valid pointer or memory corruption.
> 
> I've finally found some time to do the step-by-step printk-based 
> debugging of this issue and finally found what's broken!
> 
> Here is the fix:
> 
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index 8bd6e5e8f121..2d83bbc65dd0 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -208,7 +208,7 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu)
>          if (!priv)
>                  return -ENOMEM;
> 
> -       if (!alloc_cpumask_var(&priv->cpus, GFP_KERNEL))
> +       if (!zalloc_cpumask_var(&priv->cpus, GFP_KERNEL))
>                  return -ENOMEM;
> 
>          cpumask_set_cpu(cpu, priv->cpus);
> 
> 
> It is really surprising that this didn't blow up for anyone else so 
> far... This means that the $subject patch is fine.
> 
> I will send a proper patch fixing this issue in a few minutes.

Nice. Many thanks for tracking this down. I'll revert the revert of the
CPUMASK_OFFSTACK patch in the second part of the merge window (I've
already sent the pull request).

-- 
Catalin
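
For readers following the thread, a minimal userspace sketch of why the
one-line fix above matters. With CONFIG_CPUMASK_OFFSTACK=y,
alloc_cpumask_var() hands back memory that has not been cleared, so
cpumask_set_cpu() only adds one bit on top of whatever was already
there. Without OFFSTACK the mask is embedded in priv, which appears to
be zero-allocated in that driver, and that would explain why the bug
stayed hidden. Everything named sim_* below is hypothetical and is not
kernel API; the 0xa5 fill is an assumption standing in for stale
allocator contents.

```c
/* Userspace sketch only: all sim_* names are hypothetical stand-ins. */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SIM_NR_CPUS       512
#define SIM_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SIM_MASK_LONGS \
	((SIM_NR_CPUS + SIM_BITS_PER_LONG - 1) / SIM_BITS_PER_LONG)

/* Off-stack style mask: a pointer to separately allocated storage. */
typedef unsigned long *sim_cpumask_var_t;

/*
 * Models an allocation that does not clear the mask. The 0xa5 pattern
 * is an assumption that makes the "stale data" deterministic here.
 */
static bool sim_alloc_cpumask_var(sim_cpumask_var_t *mask)
{
	*mask = malloc(SIM_MASK_LONGS * sizeof(unsigned long));
	if (!*mask)
		return false;
	memset(*mask, 0xa5, SIM_MASK_LONGS * sizeof(unsigned long));
	return true;
}

/* Models the zeroing variant: every bit starts out clear. */
static bool sim_zalloc_cpumask_var(sim_cpumask_var_t *mask)
{
	*mask = calloc(SIM_MASK_LONGS, sizeof(unsigned long));
	return *mask != NULL;
}

/* Sets one bit; it never clears bits that were already set. */
static void sim_cpumask_set_cpu(unsigned int cpu, sim_cpumask_var_t mask)
{
	assert(cpu < SIM_NR_CPUS);
	mask[cpu / SIM_BITS_PER_LONG] |= 1UL << (cpu % SIM_BITS_PER_LONG);
}

/* Counts how many CPUs the mask claims to contain. */
static unsigned int sim_cpumask_weight(const unsigned long *mask)
{
	unsigned int cpu, weight = 0;

	for (cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (mask[cpu / SIM_BITS_PER_LONG] &
		    (1UL << (cpu % SIM_BITS_PER_LONG)))
			weight++;
	return weight;
}

/*
 * Mirrors the allocate-then-set sequence from the diff and returns the
 * number of CPUs found in the mask afterwards (0 on allocation failure).
 */
static unsigned int sim_early_init_weight(bool zeroed, unsigned int cpu)
{
	sim_cpumask_var_t cpus;
	unsigned int weight;
	bool ok = zeroed ? sim_zalloc_cpumask_var(&cpus)
			 : sim_alloc_cpumask_var(&cpus);

	if (!ok)
		return 0;
	sim_cpumask_set_cpu(cpu, cpus);
	weight = sim_cpumask_weight(cpus);
	free(cpus);
	return weight;
}
```

Calling sim_early_init_weight(true, 0) yields a mask holding exactly one
CPU, while sim_early_init_weight(false, 0) yields a mask that also
claims many CPUs that were never set. In the real driver, any code that
later walks priv->cpus would then act on CPUs that do not belong to the
policy.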
