* [PATCH] drm/nouveau: fix early vram corruption originating from vgacon
From: Marcin Slusarz @ 2012-09-12 22:52 UTC
  To: Ben Skeggs; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

There's a short window between module load and fbcon initialization when
it's possible for vgacon to write to VGA RAM. Nouveau uses this memory
for different purposes, so if we are unlucky, it causes mysterious memory
corruption.

For me, booting with nv_printk debug levels set to 5 was enough to trigger it.
It manifested as a long stream of:
"trapped write at ... on channel 0x0001fea0 BAR/PFIFO_WRITE/IN reason:
DMAOBJ_LIMIT / PT_NOT_PRESENT / PAGE_SYSTEM_ONLY / PAGE_NOT_PRESENT"
which eventually led to a complete hang.

Disabling access to VGA memory (through the 0x54 PCI config space register)
is enough to fix it, but it breaks copying screen data between the old and
new console (because the old data is inaccessible). However, blanking the
console (with entering_gfx==1) is enough to move the vgacon screen buffer
from VRAM to RAM and let the handover work correctly.

Signed-off-by: Marcin Slusarz <marcin.slusarz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 6826525..1641bd9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -25,6 +25,7 @@
 #include <linux/console.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/vt_kern.h>
 
 #include <core/device.h>
 #include <core/client.h>
@@ -51,6 +52,8 @@
 
 #include "nouveau_ttm.h"
 
+#define NV_PCI_VGAMEM_ENABLE 0x54
+
 MODULE_PARM_DESC(config, "option string to pass to driver core");
 static char *nouveau_config;
 module_param_named(config, nouveau_config, charp, 0400);
@@ -247,9 +250,20 @@ nouveau_drm_load(struct drm_device *dev, unsigned long flags)
 	struct nouveau_drm *drm;
 	int ret;
 
+	/* Blank initial console to prevent VRAM corruption while we initialize
+	 * the HW. For vgacon it will move console memory from VGA VRAM to RAM.
+	 */
+	console_lock();
+	do_blank_screen(1);
+	console_unlock();
+
+	/* Completely disable access to VGA IO/memory, just to be sure no one
+	 * will change it. */
+	pci_write_config_byte(pdev, NV_PCI_VGAMEM_ENABLE, 0);
+
 	ret = nouveau_cli_create(pdev, "DRM", sizeof(*drm), (void**)&drm);
 	if (ret)
-		return ret;
+		goto fail_cli;
 
 	dev->dev_private = drm;
 	drm->dev = dev;
@@ -336,6 +350,11 @@ nouveau_drm_load(struct drm_device *dev, unsigned long flags)
 
 	nouveau_accel_init(drm);
 	nouveau_fbcon_init(dev);
+
+	console_lock();
+	do_unblank_screen(1);
+	console_unlock();
+
 	return 0;
 
 fail_dispinit:
@@ -351,12 +370,20 @@ fail_ttm:
 	nouveau_vga_fini(drm);
 fail_device:
 	nouveau_cli_destroy(&drm->client);
+fail_cli:
+	pci_write_config_byte(pdev, NV_PCI_VGAMEM_ENABLE, 1);
+
+	console_lock();
+	do_unblank_screen(1);
+	console_unlock();
+
 	return ret;
 }
 
 static int
 nouveau_drm_unload(struct drm_device *dev)
 {
+	struct pci_dev *pdev = dev->pdev;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 
 	nouveau_fbcon_fini(dev);
@@ -375,6 +402,8 @@ nouveau_drm_unload(struct drm_device *dev)
 	nouveau_vga_fini(drm);
 
 	nouveau_cli_destroy(&drm->client);
+
+	pci_write_config_byte(pdev, NV_PCI_VGAMEM_ENABLE, 1);
 	return 0;
 }
 
-- 
1.7.12

* Re: [PATCH] drm/nouveau: fix early vram corruption originating from vgacon
From: Marcin Slusarz @ 2012-10-04 11:35 UTC
  To: Ben Skeggs; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

What's up with this patch?

* Re: [PATCH] drm/nouveau: fix early vram corruption originating from vgacon
From: Marcin Kościelnicki @ 2012-10-04 11:44 UTC
  To: Marcin Slusarz; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 04.10.2012 13:35, Marcin Slusarz wrote:
> On Thu, Sep 13, 2012 at 12:52:30AM +0200, Marcin Slusarz wrote:
>> @@ -351,12 +370,20 @@ fail_ttm:
>>   	nouveau_vga_fini(drm);
>>   fail_device:
>>   	nouveau_cli_destroy(&drm->client);
>> +fail_cli:
>> +	pci_write_config_byte(pdev, NV_PCI_VGAMEM_ENABLE, 1);

This (and corresponding unload line) is a bug: you should NOT blindly
set this register to 1 on unload, use its previous value instead.
Otherwise you can get into sticky situations involving two GPUs
responding to VGA address space. Better yet, think up a proper solution
involving vga arb.

>> +
>> +	console_lock();
>> +	do_unblank_screen(1);
>> +	console_unlock();
>> +
>>   	return ret;
>>   }
>>
>>   static int
>>   nouveau_drm_unload(struct drm_device *dev)
>>   {
>> +	struct pci_dev *pdev = dev->pdev;
>>   	struct nouveau_drm *drm = nouveau_drm(dev);
>>
>>   	nouveau_fbcon_fini(dev);
>> @@ -375,6 +402,8 @@ nouveau_drm_unload(struct drm_device *dev)
>>   	nouveau_vga_fini(drm);
>>
>>   	nouveau_cli_destroy(&drm->client);
>> +
>> +	pci_write_config_byte(pdev, NV_PCI_VGAMEM_ENABLE, 1);
>>   	return 0;
>>   }
>>
>> --
>
> What's up with this patch?

* Re: [PATCH] drm/nouveau: fix early vram corruption originating from vgacon
From: Marcin Slusarz @ 2012-10-05 20:18 UTC
  To: Marcin Kościelnicki; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Thu, Oct 04, 2012 at 01:44:13PM +0200, Marcin Kościelnicki wrote:
> On 04.10.2012 13:35, Marcin Slusarz wrote:
> > On Thu, Sep 13, 2012 at 12:52:30AM +0200, Marcin Slusarz wrote:
> This (and corresponding unload line) is a bug: you should NOT blindly 
> set this register to 1 on unload, use its previous value instead. 
> Otherwise you can get into sticky situations involving two GPUs 
> responding to VGA address space. Better yet, think up a proper solution 
> involving vga arb.

Thanks to your comment I noticed two more things:
- I probably should have used the dword API variant.
- There is already code in nouveau which bashes this register (nouveau_vga.c /
nouveau_vga_set_decode) and it's wired to vgaarb. It seems it's set to 0 only
when there is more than one card. I'm not sure how to handle it yet...

Marcin
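
The save-and-restore idea from the review (use the register's previous value
instead of writing 1 unconditionally) could look roughly like the sketch
below. It is a userspace model, not kernel code: struct fake_pci_dev,
cfg_read_byte(), cfg_write_byte(), vgamem_disable() and vgamem_restore() are
all hypothetical names, and the config space is just an array. Only the 0x54
offset and the NV_PCI_VGAMEM_ENABLE name come from the patch. A real driver
would presumably go through the kernel's PCI config accessors, and this
sketch does not attempt to address the vgaarb interaction mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define NV_PCI_VGAMEM_ENABLE 0x54	/* offset taken from the patch */

/* Hypothetical stand-in for a PCI device: 256 bytes of config space. */
struct fake_pci_dev {
	uint8_t cfg[256];
	uint8_t saved_vgamem;	/* value found before decoding was disabled */
};

/* Hypothetical accessors modelling byte-wide config space access. */
uint8_t cfg_read_byte(const struct fake_pci_dev *pdev, unsigned int off)
{
	assert(off < sizeof(pdev->cfg));
	return pdev->cfg[off];
}

void cfg_write_byte(struct fake_pci_dev *pdev, unsigned int off, uint8_t val)
{
	assert(off < sizeof(pdev->cfg));
	pdev->cfg[off] = val;
}

/* Load path: remember the current setting, then disable VGA memory access. */
void vgamem_disable(struct fake_pci_dev *pdev)
{
	pdev->saved_vgamem = cfg_read_byte(pdev, NV_PCI_VGAMEM_ENABLE);
	cfg_write_byte(pdev, NV_PCI_VGAMEM_ENABLE, 0);
}

/* Unload and error paths: put back whatever was there, not a hardcoded 1. */
void vgamem_restore(struct fake_pci_dev *pdev)
{
	cfg_write_byte(pdev, NV_PCI_VGAMEM_ENABLE, pdev->saved_vgamem);
}
```

With this shape, a card that had VGA decoding switched off before load (for
example a secondary GPU) still has it off after unload, so two cards never
end up responding to the VGA address space because of the driver.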
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau
