All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Trippelsdorf <markus@trippelsdorf.de>
To: Alex Deucher <alexdeucher@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	dri-devel@lists.freedesktop.org
Subject: Re: Commit f5d9b7f0f9 (fix r600_enable_sclk_control()) causes kexec issues
Date: Tue, 30 Jul 2013 20:46:42 +0200	[thread overview]
Message-ID: <20130730184642.GA352@x4> (raw)
In-Reply-To: <CADnq5_Nu+xXJ8bssEYbi9HHWAyYcPmZVEUn8Bb47Q3GCjecAYw@mail.gmail.com>

On 2013.07.30 at 10:53 -0400, Alex Deucher wrote:
> On Tue, Jul 30, 2013 at 7:27 AM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> > On 2013.07.29 at 15:53 -0400, Alex Deucher wrote:
> >> On Mon, Jul 29, 2013 at 2:10 PM, Eric W. Biederman
> >> <ebiederm@xmission.com> wrote:
> >> > Alex Deucher <alexdeucher@gmail.com> writes:
> >> >
> >> >> On Mon, Jul 29, 2013 at 11:50 AM, Eric W. Biederman
> >> >> <ebiederm@xmission.com> wrote:
> >> >>>
> >> >>>
> >> >>> Alex Deucher <alexdeucher@gmail.com> wrote:
> >> >>>>On Mon, Jul 29, 2013 at 10:09 AM, Markus Trippelsdorf
> >> >>>><markus@trippelsdorf.de> wrote:
> >> >>>>> On 2013.07.29 at 09:58 -0400, Alex Deucher wrote:
> >> >>>>>> On Mon, Jul 29, 2013 at 3:51 AM, Markus Trippelsdorf
> >> >>>>>> <markus@trippelsdorf.de> wrote:
> >> >>>>>> > On my test machine Xorg doesn't start anymore when I kexec into a
> >> >>>>>> > 3.11.0-rc3 kernel.
> >> >>>>>>
> >> >>>>>> With kexec, dpm doesn't get torn down properly which can result in a
> >> >>>>>> bad hardware state when the driver loads again.  Does the attached
> >> >>>>>> patch help?  It attempts to disable dpm at startup in case it wasn't
> >> >>>>>> torn down properly previously.
> >> >>>>>
> >> >>>>> dpm initialization now works, but unfortunately GPU acceleration
> >> >>>>still gets
> >> >>>>> disabled:
> >> >>>>
> >> >>>>Stupid kexec complicates things.  We need to make sure dpm is torn
> >> >>>>down before we init the rest of the GPU, but dpm needs get initialized
> >> >>>>later in the init process since it depends on certain other state from
> >> >>>>the driver.  I need to think about this for a bit.  I'm not sure of a
> >> >>>>good way to handle this.
> >> >>>
> >> >>> You might just want to implement a shutdown method that cleans things up properly.   At least as a first pass until you worry about things like kexec on panic.
> >> >>>
> >> >>> Or can you not shutdown the graphics stack on reboot because of the need to display the kernels shutdown progress?
> >> >>
> >> >> Does kexec actually call this shutdown method?  The driver implements
> >> >> appropriate clean-up measures if it's shutdown properly.
> >> >
> >> > Absoltuely.  All parts of the reboot path call ->shutdown.  Including
> >> > kexec.
> >> >
> >> > You don't get a device remove/hotunplug but unless this is a kexec on
> >> > panic ->shutdown is most definitely called.  Now I am talking about the
> >> > device layer/pci layer shutdown method I don't know how gpu drivers are
> >> > wired up.  GPU land was a little strange last I looked.  Hopefully it
> >> > isn't so strange that there is a method named shutdown that is not wired
> >> > up.
> >>
> >> It doesn't look like the drm infrastructure has a shutdown callback.
> >> The drm drivers register a drm_driver callback struct that includes an
> >> unload callback which takes care of all the device teardown (if you
> >> unload the module for example).  I don't know that it actually gets
> >> called at kexec time however.  I don't know enough about how kexec
> >> works.
> >
> > BTW there is r100_restore_sanity() in drm/radeon/r100.c that explicitly
> > handles the kexec case during init. So maybe an r600_restore_sanity()
> > function is needed?
> >
> > (One of the advantages of using kexec (besides the much quicker boot
> > time) is that the monitor is not switched off and then on during boot.
> > I guess that advantage would be lost if the unload callback would be
> > called.)
> 
> r100_restore_sanity() is basically a set of hacks (that gets called at
> driver startup) to work around the fact that with kexec the drm driver
> is not torn down correctly.  So we could add a bunch more asic
> specific tear down sequences to deal with dpm (and all the other
> engines on the GPU that may potentially cause problems if they are not
> torn down properly), but that will just turn into a mess.  All of
> these hacks also add latency to the driver load.  I think the best
> solution would probably be to figure how to hook up the drm unload
> callback to the shutdown method that kexec uses.

FYI the following (ugly) hack works for me. 
(If I don't comment out radeon_fbdev_fini(rdev) kexec will hang.)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 75349cd..13e2988 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3947,20 +3947,6 @@ void r100_fini(struct radeon_device *rdev)
  */
 void r100_restore_sanity(struct radeon_device *rdev)
 {
-	u32 tmp;
-
-	tmp = RREG32(RADEON_CP_CSQ_CNTL);
-	if (tmp) {
-		WREG32(RADEON_CP_CSQ_CNTL, 0);
-	}
-	tmp = RREG32(RADEON_CP_RB_CNTL);
-	if (tmp) {
-		WREG32(RADEON_CP_RB_CNTL, 0);
-	}
-	tmp = RREG32(RADEON_SCRATCH_UMSK);
-	if (tmp) {
-		WREG32(RADEON_SCRATCH_UMSK, 0);
-	}
 }
 
 int r100_init(struct radeon_device *rdev)
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index c2b67b4..79b38e2 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -1405,7 +1405,7 @@ int radeon_modeset_init(struct radeon_device *rdev)
 
 void radeon_modeset_fini(struct radeon_device *rdev)
 {
-	radeon_fbdev_fini(rdev);
+//	radeon_fbdev_fini(rdev);
 	kfree(rdev->mode_info.bios_hardcoded_edid);
 	radeon_pm_fini(rdev);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 29876b1..e705e8c 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -388,6 +388,15 @@ static const struct file_operations radeon_driver_kms_fops = {
 #endif
 };
 
+
+static void
+radeon_kexec_shutdown(struct pci_dev *pdev)
+{
+	struct drm_device *dev = pci_get_drvdata(pdev);
+
+	radeon_driver_unload_kms(dev);
+}
+
 static struct drm_driver kms_driver = {
 	.driver_features =
 	    DRIVER_USE_AGP | DRIVER_USE_MTRR | DRIVER_PCI_DMA | DRIVER_SG |
@@ -463,6 +472,7 @@ static struct pci_driver radeon_kms_pci_driver = {
 	.remove = radeon_pci_remove,
 	.suspend = radeon_pci_suspend,
 	.resume = radeon_pci_resume,
+	.shutdown = radeon_kexec_shutdown,
 };
 
 static int __init radeon_init(void)

-- 
Markus

  reply	other threads:[~2013-07-30 18:46 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-29  7:51 Commit f5d9b7f0f9 (fix r600_enable_sclk_control()) causes kexec issues Markus Trippelsdorf
2013-07-29 13:58 ` Alex Deucher
2013-07-29 14:09   ` Markus Trippelsdorf
2013-07-29 14:18     ` Alex Deucher
2013-07-29 15:50       ` Eric W. Biederman
2013-07-29 18:02         ` Alex Deucher
2013-07-29 18:10           ` Eric W. Biederman
2013-07-29 19:53             ` Alex Deucher
2013-07-29 19:58               ` Markus Trippelsdorf
2013-07-30 11:27               ` Markus Trippelsdorf
2013-07-30 11:57                 ` Markus Trippelsdorf
2013-07-30 14:53                 ` Alex Deucher
2013-07-30 18:46                   ` Markus Trippelsdorf [this message]
2013-07-30 20:56                     ` Markus Trippelsdorf
2013-07-31  9:19                       ` Joshua C.
2013-07-29 16:14       ` Joshua C.
2013-07-29 16:35         ` Markus Trippelsdorf
2013-07-29 17:58         ` Alex Deucher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130730184642.GA352@x4 \
    --to=markus@trippelsdorf.de \
    --cc=alexander.deucher@amd.com \
    --cc=alexdeucher@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=ebiederm@xmission.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.