From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Martin Peres <martin.peres@free.fr>
Cc: airlied@linux.ie, bskeggs@redhat.com, marcin.slusarz@gmail.com,
dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: nouveau shuts the machine down with v3.9-rc1 (temperature (72 C) hit the 'shutdown' threshold).
Date: Mon, 4 Mar 2013 16:41:10 -0500
Message-ID: <20130304214110.GA17402@phenom.dumpdata.com>
In-Reply-To: <5134F44C.7040700@free.fr>
On Mon, Mar 04, 2013 at 08:21:48PM +0100, Martin Peres wrote:
> Hi Konrad,
>
> On 04/03/2013 19:40, Konrad Rzeszutek Wilk wrote:
> > After git merge ab7826595e9ec51a51f622c5fc91e2f59440481a
> > (Merge tag 'mfd-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6)
> > the nouveau driver ends up shutting off the machine when booting.
> >
> >
> > I haven't done a git bisection yet and was wondering if there are some
> > juicy commits I ought to look at?
>
> Sure, no need to bisect, it is a new (apparently-broken-for-you) feature.
>
> The code is in /drivers/gpu/drm/nouveau/core/subdev/therm/
>
>
> >
> > Here is the serial console:
>
>
> > [ 6.940628] nouveau [ PTHERM][0000:00:0d.0] Thermal management: disabled
> > [ 6.957474] nouveau [ PTHERM][0000:00:0d.0] programmed thresholds [ 90(2), 95(3), 145(2), 135(5) ]
> > [ 6.966594] nouveau 6.975100] nouveau [ PTHERM][0000:00:0d.0] Thermal management: automatic
> > [ 6.982059] nouveau [ PTHERM][0000:00:0d.0] temperature (88 C) hit the 'downclock' threshold
> > [ 6.990680] nouveau [ PTHERM][0000:00:0d.0] temperature (88 C) hit the 'critical' threshold
> > [ 6.999194] nouveau [ PTHERM][0000:00:0d.0] temperature (90 C) hit the 'shutdown' threshold
>
> See, this is strange. If I believe the "programmed thresholds" line,
> the fanboost threshold is at 90°C, downclock is at 95°C, critical
> temperature is at 145°C and shutdown is at 135°C.
> So, from the BIOS side, things seem to be in fairly good shape
> (critical should be lower than shutdown, but that's OK).
>
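[Editorial sketch, not part of the original thread.] The ordering remark above (critical should be lower than shutdown) can be expressed as a small consistency check. The struct and function names below are hypothetical and do not come from the nouveau driver; only the four temperatures are taken from the "programmed thresholds" line.

```c
#include <stdbool.h>

/* Hypothetical container for the four trip points from the log line. */
struct trip_points {
	int fanboost_c;
	int downclock_c;
	int critical_c;
	int shutdown_c;
};

/*
 * Returns true when each trip point is no higher than the next,
 * more severe one: fanboost <= downclock <= critical <= shutdown.
 */
static bool trip_points_ordered(const struct trip_points *t)
{
	return t->fanboost_c <= t->downclock_c &&
	       t->downclock_c <= t->critical_c &&
	       t->critical_c <= t->shutdown_c;
}
```

With the values from the log (90, 95, 145, 135) the check fails because critical (145) is above shutdown (135), which matches the observation in the quoted paragraph.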
> My theory is that your temperature sensor's output is very variable,
> which would set off the shutdown alarm. So, either the sensor needs more
> settling time or the output is genuinely very variable.
You should see it when I boot it under Xen:
[ 8.427789] nouveau [ PTHERM][0000:00:0d.0] programmed thresholds [ 90(2), 95(3), 145(2), 135(5) ]
[ 8.427855] nouveau [ PTHERM][0000:00:0d.0] temperature (222 C) hit the 'fanboost' threshold
[ 8.427919] nouveau [ PTHERM][0000:00:0d.0] Thermal management: automatic
[ 8.427973] nouveau [ PTHERM][0000:00:0d.0] temperature (222 C) hit the 'downclock' threshold
[ 8.428036] nouveau [ PTHERM][0000:00:0d.0] temperature (222 C) hit the 'critical' threshold
[ 8.428099] nouveau [ PTHERM][0000:00:0d.0] temperature (222 C) hit the 'shutdown' threshold
>
> In the first case, we could fix that by increasing the settling time
> (at the expense of a longer boot period). We could also force a 10s
> wait at boot time before reading the temperature.
> If this is the latter case, we only have the solution to average the
> temperature on several samples. I would need statistics on the
> variability in order to calculate a proper low-pass filter that
> wouldn't be too slow or too RAM/wakeup-intensive.
>
> I really hope the problem is the settling time!
>
>
> Here is what you can do to test the theory:
>
> Change the mdelay at line 41 of
> /drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c (http://cgit.freedesktop.org/nouveau/linux-2.6/tree/drivers/gpu/drm/nouveau/core/subdev/therm/nv40.c#n41)
> from 10 to 1000.
> Please also add an mdelay of 1000 between lines 44 and 45.
Let me do that tomorrow and report my findings.
>
> If it works with this patch, then try decreasing the delay to 20ms.
>
> In any case, I'll send some thermal patches tonight to be more
> resistant to long settling times.
Pls CC me in case you would like me also to test them with the
mdelay patch.
>
> Thanks for reporting!
Of course.
>
> Martin (mupuf)
>
>
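[Editorial sketch, not part of the original thread.] Martin mentions averaging the temperature over several samples with a low-pass filter. One common way to do that cheaply is an integer exponential moving average; the version below is only an illustration under assumed parameters (a new-sample weight of 1/8, 8 fractional bits) and is not the filter that was proposed or merged for nouveau. All names are hypothetical.

```c
#include <stdint.h>

#define TEMP_FILTER_SCALE 256  /* fixed-point scale: 1/256 degree C units */
#define TEMP_FILTER_DIV   8    /* each new sample contributes 1/8 (assumed) */

/* Hypothetical filter state; needs only one accumulator per sensor. */
struct temp_filter {
	int32_t acc;    /* filtered temperature, scaled by TEMP_FILTER_SCALE */
	int     primed; /* 0 until the first sample has been seen */
};

/*
 * Feed one raw reading (degrees C) and return the filtered value.
 * The first reading initialises the filter; later readings move the
 * accumulator 1/TEMP_FILTER_DIV of the way toward the new sample.
 */
static int temp_filter_update(struct temp_filter *f, int sample_c)
{
	int32_t scaled = (int32_t)sample_c * TEMP_FILTER_SCALE;

	if (!f->primed) {
		f->acc = scaled;
		f->primed = 1;
	} else {
		f->acc += (scaled - f->acc) / TEMP_FILTER_DIV;
	}
	return f->acc / TEMP_FILTER_SCALE;
}
```

With these assumed constants, a single 222 C spike after a 60 C reading yields a filtered value of 80 C, which stays below the 90 C fanboost threshold from the log. The trade-off is the one raised in the quoted text: a smaller new-sample weight rejects spikes better but reacts more slowly to a real temperature rise. A filter of this kind would not help if the very first reading is already bogus, since the first sample seeds the accumulator.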