From: Jon Hunter <jonathanh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
To: Ben Skeggs <bskeggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Guillaume Tucker
<guillaume.tucker-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>,
Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>,
"linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org"
<linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>,
Mark Brown <broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Subject: Re: next/master boot: 273 boots: 63 failed, 209 passed with 1 untried/unknown (next-20171106)
Date: Fri, 10 Nov 2017 11:26:28 +0000
Message-ID: <5321abfb-845b-354e-f3d9-7773cfe175f4@nvidia.com>
In-Reply-To: <1040af29-4d15-4e8a-29ab-40952523535c-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
On 10/11/17 09:18, Jon Hunter wrote:
...
> Thanks Ben. However, looking at next-20171109 this one is already in.
> So maybe the bisect is still not getting me to the current issue. When
> booting next-20171109 the last thing I see is ...
>
> [ 2.228178] nouveau 57000000.gpu: NVIDIA GK20A (0ea000a1)
> [ 2.233634] nouveau 57000000.gpu: imem: using IOMMU
> [ 2.238572] nouveau 57000000.gpu: Direct firmware load for nvidia/gk20a/fecs_inst.bin failed with error -2
> [ 2.248295] nouveau 57000000.gpu: Direct firmware load for nouveau/nvea_fuc409c failed with error -2
> [ 2.257479] nouveau 57000000.gpu: Direct firmware load for nouveau/fuc409c failed with error -2
> [ 2.266189] nouveau 57000000.gpu: gr: failed to load fuc409c
>
> So no crash. I did see the crash after the bisect, but not in top of
> tree. It appears to hang after the nouveau probe fails. Any thoughts
> on how to debug further?
So this is probably wrong, but here is a clue about what is happening.
It appears that the error code is not being propagated from
gk20a_gr_new(). gk20a_gr_new() is returning -ENODEV due to the firmware
loading failure...
	if (gf100_gr_ctor_fw(gr, "fecs_inst", &gr->fuc409c) ||
	    gf100_gr_ctor_fw(gr, "fecs_data", &gr->fuc409d) ||
	    gf100_gr_ctor_fw(gr, "gpccs_inst", &gr->fuc41ac) ||
	    gf100_gr_ctor_fw(gr, "gpccs_data", &gr->fuc41ad))
		return -ENODEV;
... but this is ignored by nvkm_device_ctor() (probably for good
reason). If I make the following change the hang no longer occurs
(although I realise this is probably wrong as it has been there for
years!) ...
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index e14643615698..a611615d3ce7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -2869,7 +2869,7 @@ struct nvkm_engine *
 			subdev = nvkm_device_subdev(device, (s)); \
 			nvkm_subdev_del(&subdev); \
 			device->m = NULL; \
-			if (ret != -ENODEV) { \
+			if (ret == -ENODEV) { \
 				nvdev_error(device, "%s ctor failed, %d\n", \
 					    nvkm_subdev_name[s], ret); \
 				goto done; \
So is gk20a_gr_new() returning the wrong error code when the
firmware load fails?
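To make the behaviour concrete, here is a minimal, self-contained sketch of
the pattern as I understand it. The names fake_gr_new() and
fake_device_ctor() are hypothetical stand-ins, not the real nouveau
functions, and the sketch only assumes what is described above: the
constructor returns -ENODEV when firmware is missing, and the caller treats
-ENODEV as "not present" rather than as a fatal error.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a subdev constructor such as gk20a_gr_new():
 * it reports a missing firmware image as -ENODEV.
 */
static int fake_gr_new(int firmware_present)
{
	return firmware_present ? 0 : -ENODEV;
}

/*
 * Hypothetical stand-in for the check in nvkm_device_ctor(): any error
 * other than -ENODEV aborts construction, while -ENODEV is silently
 * dropped and the device is brought up without that subdev.
 */
static int fake_device_ctor(int firmware_present, int *gr_created)
{
	int ret = fake_gr_new(firmware_present);

	*gr_created = (ret == 0);
	if (ret && ret != -ENODEV) {
		fprintf(stderr, "gr ctor failed, %d\n", ret);
		return ret;
	}

	return 0; /* -ENODEV from the constructor is not propagated */
}
```

In this sketch, calling fake_device_ctor(0, &created) returns 0 with
created set to 0, so the caller sees success even though the gr
constructor failed. That matches the "no error, but no gr" situation
described above.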
I have not gone back to see what has changed in this regard, but I
can, probably next week.
Cheers
Jon
--
nvpublic