linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 18:22 ` [PATCH] drm/i915: Record error batch buffers using iomem Chris Wilson
@ 2010-05-11 15:37   ` Andrew Morton
  2010-05-11 18:49     ` Chris Wilson
  2010-05-11 19:22   ` Jaswinder Singh Rajput
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2010-05-11 15:37 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, Jaswinder Singh Rajput, dri-devel, Dave Airlie,
	linux-kernel

On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote:

> +	reloc_offset = src_priv->gtt_offset;
>  	for (page = 0; page < page_count; page++) {
> -		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +		void __iomem *s;
> +		void *d;
> +
> +		d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>  		if (d == NULL)
>  			goto unwind;
> -		s = kmap_atomic(src_priv->pages[page], KM_USER0);
> -		memcpy(d, s, PAGE_SIZE);
> -		kunmap_atomic(s, KM_USER0);
> +
> +		s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> +					     reloc_offset);
> +		memcpy_fromio(d, s, PAGE_SIZE);
> +		io_mapping_unmap_atomic(s);

As mentioned in the other email, this will still corrupt the KM_USER0
slot, and will generate a debug_kmap_atomic() warning.



* [PATCH] drm/i915: Record error batch buffers using iomem
       [not found] <0100511104818.8382a7de.akpm@linux-foundation.org>
@ 2010-05-11 18:22 ` Chris Wilson
  2010-05-11 15:37   ` Andrew Morton
  2010-05-11 19:22   ` Jaswinder Singh Rajput
  0 siblings, 2 replies; 12+ messages in thread
From: Chris Wilson @ 2010-05-11 18:22 UTC (permalink / raw)
  To: intel-gfx
  Cc: Andrew Morton, Jaswinder Singh Rajput, dri-devel, Dave Airlie,
	linux-kernel, Chris Wilson

Directly read the GTT mapping for the contents of the batch buffers
rather than relying on possibly stale CPU caches. Also for completeness
scan the flushing/inactive lists for the current buffers - we are
collecting error state after all.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_irq.c |   64 ++++++++++++++++++++++++++++++++++----
 1 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 87113da..14301a4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -441,9 +441,11 @@ static struct drm_i915_error_object *
 i915_error_object_create(struct drm_device *dev,
 			 struct drm_gem_object *src)
 {
+	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_error_object *dst;
 	struct drm_i915_gem_object *src_priv;
 	int page, page_count;
+	u32 reloc_offset;
 
 	if (src == NULL)
 		return NULL;
@@ -458,14 +460,23 @@ i915_error_object_create(struct drm_device *dev,
 	if (dst == NULL)
 		return NULL;
 
+	reloc_offset = src_priv->gtt_offset;
 	for (page = 0; page < page_count; page++) {
-		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		void __iomem *s;
+		void *d;
+
+		d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
-		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+
+		s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
+					     reloc_offset);
+		memcpy_fromio(d, s, PAGE_SIZE);
+		io_mapping_unmap_atomic(s);
+
 		dst->pages[page] = d;
+
+		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = page_count;
 	dst->gtt_offset = src_priv->gtt_offset;
@@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device *dev)
 
 		if (batchbuffer[1] == NULL &&
 		    error->acthd >= obj_priv->gtt_offset &&
-		    error->acthd < obj_priv->gtt_offset + obj->size &&
-		    batchbuffer[0] != obj)
+		    error->acthd < obj_priv->gtt_offset + obj->size)
 			batchbuffer[1] = obj;
 
 		count++;
 	}
+	/* Scan the other lists for completeness for those bizarre errors. */
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}
 
 	/* We need to copy these to an anonymous buffer as the simplest
 	 * method to avoid being overwritten by userpace.
 	 */
 	error->batchbuffer[0] = i915_error_object_create(dev, batchbuffer[0]);
-	error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	if (batchbuffer[1] != batchbuffer[0])
+		error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	else
+		error->batchbuffer[1] = NULL;
 
 	/* Record the ringbuffer */
 	error->ringbuffer = i915_error_object_create(dev, dev_priv->ring.ring_obj);
-- 
1.7.1
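
The buffer selection in the second hunk can be modelled outside the kernel. The following is a minimal user-space sketch, not driver code: `struct toy_obj`, `toy_find()` and `toy_pick_batches()` are hypothetical names, plain arrays stand in for the active/flushing/inactive lists, and the range test is written as a subtraction so that it cannot wrap, which is a small departure from the comparison used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GEM object bound into the GTT. */
struct toy_obj {
	uint32_t gtt_offset;	/* start of the object in the GTT */
	uint32_t size;		/* size in bytes */
};

/* Return the first object whose GTT range contains addr, or NULL. */
static const struct toy_obj *
toy_find(const struct toy_obj *list, size_t n, uint32_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (addr >= list[i].gtt_offset &&
		    addr - list[i].gtt_offset < list[i].size)
			return &list[i];
	}
	return NULL;
}

/*
 * Walk the lists in order, filling only the slots that are still empty,
 * then drop slot 1 if it names the same object as slot 0 so that the
 * same buffer is not recorded twice.
 */
static void
toy_pick_batches(const struct toy_obj *const lists[], const size_t counts[],
		 size_t nlists, uint32_t bbaddr, uint32_t acthd,
		 const struct toy_obj *out[2])
{
	size_t l;

	assert(out != NULL);
	out[0] = out[1] = NULL;
	for (l = 0; l < nlists && !(out[0] && out[1]); l++) {
		if (out[0] == NULL)
			out[0] = toy_find(lists[l], counts[l], bbaddr);
		if (out[1] == NULL)
			out[1] = toy_find(lists[l], counts[l], acthd);
	}
	if (out[1] == out[0])
		out[1] = NULL;
}
```

Moving the duplicate check to the end, as the patch does, keeps each per-list scan identical and leaves a single place that decides whether the second buffer is worth copying.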



* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 15:37   ` Andrew Morton
@ 2010-05-11 18:49     ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2010-05-11 18:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: intel-gfx, Jaswinder Singh Rajput, dri-devel, Dave Airlie,
	linux-kernel

On Tue, 11 May 2010 11:37:22 -0400, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
> > +	reloc_offset = src_priv->gtt_offset;
> >  	for (page = 0; page < page_count; page++) {
> > -		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +		void __iomem *s;
> > +		void *d;
> > +
> > +		d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> >  		if (d == NULL)
> >  			goto unwind;
> > -		s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > -		memcpy(d, s, PAGE_SIZE);
> > -		kunmap_atomic(s, KM_USER0);
> > +
> > +		s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> > +					     reloc_offset);
> > +		memcpy_fromio(d, s, PAGE_SIZE);
> > +		io_mapping_unmap_atomic(s);
> 
> As mentioned in the other email, this will still corrupt the KM_USER0
> slot, and will generate a debug_kmap_atomic() warning.

How, as kmap_atomic(KM_USER0) is no longer used?
-- 
Chris Wilson, Intel Open Source Technology Centre
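
The call trace posted later in this thread shows i915_error_object_create() reaching debug_kmap_atomic() through iomap_atomic_prot_pfn() and kmap_atomic_prot_pfn(), which suggests that the atomic I/O mapping still consumes a kmap slot even though the driver no longer names one. Why reusing a slot from interrupt context is a problem can be pictured with a toy user-space model. Everything below is hypothetical (`toy_slot`, `toy_cpu`, `toy_map()`, `toy_interrupted_read()`) and only imitates the idea that each slot holds one mapping at a time; it is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy slot names; the real kernel defined many more KM_* types. */
enum toy_slot { TOY_USER0, TOY_IRQ0, TOY_NR_SLOTS };

/* One mapping per slot: a slot remembers only the last page mapped. */
struct toy_cpu {
	const char *mapped[TOY_NR_SLOTS];
};

static const char *
toy_map(struct toy_cpu *cpu, enum toy_slot slot, const char *page)
{
	assert(slot < TOY_NR_SLOTS);
	cpu->mapped[slot] = page;	/* replaces any mapping already there */
	return cpu->mapped[slot];
}

/*
 * Process context maps "user page" into TOY_USER0 and is then interrupted
 * by a handler that maps "gtt page" into irq_slot. Returns what process
 * context sees through its own slot once the handler has run.
 */
static const char *
toy_interrupted_read(enum toy_slot irq_slot)
{
	struct toy_cpu cpu = { { NULL, NULL } };

	toy_map(&cpu, TOY_USER0, "user page");
	toy_map(&cpu, irq_slot, "gtt page");	/* the "interrupt" */
	return cpu.mapped[TOY_USER0];
}
```

In this model `toy_interrupted_read(TOY_USER0)` hands the interrupted code the handler's page, while `toy_interrupted_read(TOY_IRQ0)` leaves its mapping intact.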


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 18:22 ` [PATCH] drm/i915: Record error batch buffers using iomem Chris Wilson
  2010-05-11 15:37   ` Andrew Morton
@ 2010-05-11 19:22   ` Jaswinder Singh Rajput
  2010-05-11 19:38     ` Jaswinder Singh Rajput
  1 sibling, 1 reply; 12+ messages in thread
From: Jaswinder Singh Rajput @ 2010-05-11 19:22 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, Andrew Morton, dri-devel, Dave Airlie, linux-kernel

Hello Chris and Andrew,

On Tue, May 11, 2010 at 11:52 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Directly read the GTT mapping for the contents of the batch buffers
> rather than relying on possibly stale CPU caches. Also for completeness
> scan the flushing/inactive lists for the current buffers - we are
> collecting error state after all.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Yes, I have tested this patch.

I booted 3 times, and this patch fixes the DRM as well as softirq
warnings and I am getting Xwindows with this patch.

I am still doing more testing.

Thanks,
--
Jaswinder Singh.


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 19:22   ` Jaswinder Singh Rajput
@ 2010-05-11 19:38     ` Jaswinder Singh Rajput
  2010-05-11 19:53       ` Chris Wilson
  0 siblings, 1 reply; 12+ messages in thread
From: Jaswinder Singh Rajput @ 2010-05-11 19:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

Hello Chris and Andrew,

I did further testing and noticed that this patch fixes the boot
errors and warnings and I get the XWindows.

But XWindows freezes after some time.

Thanks,
--
Jaswinder Singh.

On Wed, May 12, 2010 at 12:52 AM, Jaswinder Singh Rajput
<jaswinderlinux@gmail.com> wrote:
> Hello Chris and Andrew,
>
> On Tue, May 11, 2010 at 11:52 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> Directly read the GTT mapping for the contents of the batch buffers
>> rather than relying on possibly stale CPU caches. Also for completeness
>> scan the flushing/inactive lists for the current buffers - we are
>> collecting error state after all.
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Yes, I have tested this patch.
>
> I booted 3 times, and this patch fixes the DRM as well as softirq
> warnings and I am getting Xwindows with this patch.
>
> I am still doing more testing.
>
> Thanks,
> --
> Jaswinder Singh.


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 19:38     ` Jaswinder Singh Rajput
@ 2010-05-11 19:53       ` Chris Wilson
  2010-05-11 20:05         ` Jaswinder Singh Rajput
  2010-05-12 13:15         ` Jaswinder Singh Rajput
  0 siblings, 2 replies; 12+ messages in thread
From: Chris Wilson @ 2010-05-11 19:53 UTC (permalink / raw)
  To: Jaswinder Singh Rajput
  Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput <jaswinderlinux@gmail.com> wrote:
> Hello Chris and Andrew,
> 
> I did further testing and noticed that this patch fixes the boot
> errors and warnings and I get the XWindows.
> 
> But XWindows freezes after some time.

The BUG you were hitting before is on the error collection path which
presumably is still being triggered during boot by a GPU error.
Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
recorded anything? And if not, wait until it freezes and then please file
a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
and dmesg.

-- 
Chris Wilson, Intel Open Source Technology Centre
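
For attaching the error state to a bug report, the debugfs file has to be saved to a regular file first. A minimal sketch of that step in C, assuming debugfs is mounted at the path given above; `copy_file()` and the output file name are hypothetical, and any ordinary copy tool does the same job.

```c
#include <assert.h>
#include <stdio.h>

/* Copy src to dst; return the number of bytes copied, or -1 on error. */
static long copy_file(const char *src, const char *dst)
{
	char buf[4096];
	size_t n;
	long total = 0;
	FILE *in, *out;

	assert(src != NULL && dst != NULL);
	in = fopen(src, "rb");
	if (in == NULL)
		return -1;
	out = fopen(dst, "wb");
	if (out == NULL) {
		fclose(in);
		return -1;
	}
	/* Read until EOF rather than trusting the reported file size. */
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			total = -1;
			break;
		}
		total += (long)n;
	}
	if (ferror(in))
		total = -1;
	fclose(in);
	if (fclose(out) != 0)
		total = -1;
	return total;
}
```

A call such as `copy_file("/sys/kernel/debug/dri/0/i915_error_state", "i915_error_state.txt")` would typically need root, since debugfs is usually not readable by ordinary users.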


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 19:53       ` Chris Wilson
@ 2010-05-11 20:05         ` Jaswinder Singh Rajput
  2010-05-11 21:45           ` Jaswinder Singh Rajput
  2010-05-12 13:15         ` Jaswinder Singh Rajput
  1 sibling, 1 reply; 12+ messages in thread
From: Jaswinder Singh Rajput @ 2010-05-11 20:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

Hello Chris,

On Wed, May 12, 2010 at 1:23 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput <jaswinderlinux@gmail.com> wrote:
>> Hello Chris and Andrew,
>>
>> I did further testing and noticed that this patch fixes the boot
>> errors and warnings and I get the XWindows.
>>
>> But XWindows freezes after some time.
>
> The BUG you were hitting before is on the error collection path which
> presumably is still being triggered during boot by a GPU error.

No, I am not getting any bug with your patch.

dmesg with your patch :
http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris.txt

> Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
> recorded anything?

No.

> And if not, wait until it freezes and then please file
> a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
> and dmesg.
>

Ok.

Thanks,
--
Jaswinder Singh.


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 20:05         ` Jaswinder Singh Rajput
@ 2010-05-11 21:45           ` Jaswinder Singh Rajput
  0 siblings, 0 replies; 12+ messages in thread
From: Jaswinder Singh Rajput @ 2010-05-11 21:45 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

Hello Chris,

On Wed, May 12, 2010 at 1:35 AM, Jaswinder Singh Rajput
<jaswinderlinux@gmail.com> wrote:
> Hello Chris,
>
> On Wed, May 12, 2010 at 1:23 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput <jaswinderlinux@gmail.com> wrote:
>>> Hello Chris and Andrew,
>>>
>>> I did further testing and noticed that this patch fixes the boot
>>> errors and warnings and I get the XWindows.
>>>
>>> But XWindows freezes after some time.
>>
>> The BUG you were hitting before is on the error collection path which
>> presumably is still being triggered during boot by a GPU error.
>
> No, I am not getting any bug with your patch.
>
> dmesg with your patch :
> http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris.txt
>

I did more testing, and the test passes 80% of the time. I get the bugs on a cold boot:

[   40.090295] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[   40.090318] ------------[ cut here ]------------
[   40.090338] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   40.090345] Hardware name: Aspire one
[   40.090351] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   40.090378] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #8
[   40.090385] Call Trace:
[   40.090402]  [<c1030ecb>] warn_slowpath_common+0x65/0x7c
[   40.090415]  [<c108ce5d>] ? debug_kmap_atomic+0xa9/0x11e
[   40.090428]  [<c1030eef>] warn_slowpath_null+0xd/0x10
[   40.090440]  [<c108ce5d>] debug_kmap_atomic+0xa9/0x11e
[   40.090454]  [<c1020611>] kmap_atomic_prot_pfn+0x1d/0x5e
[   40.090465]  [<c1020675>] iomap_atomic_prot_pfn+0x23/0x26
[   40.090479]  [<c11f7d8a>] i915_error_object_create+0x110/0x17c
[   40.090492]  [<c11f8298>] i915_handle_error+0x4a2/0x9ba
[   40.090506]  [<c11f884f>] i915_hangcheck_elapsed+0x9f/0xdf
[   40.090518]  [<c103ab6e>] run_timer_softirq+0x1c9/0x269
[   40.090531]  [<c11f87b0>] ? i915_hangcheck_elapsed+0x0/0xdf
[   40.090543]  [<c1035b7b>] __do_softirq+0xc6/0x186
[   40.090553]  [<c1035c61>] do_softirq+0x26/0x2b
[   40.090564]  [<c1035dd2>] irq_exit+0x29/0x66
[   40.090576]  [<c101681f>] smp_apic_timer_interrupt+0x6e/0x7c
[   40.090591]  [<c141f996>] apic_timer_interrupt+0x2a/0x30
[   40.090605]  [<c104007b>] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   40.090618]  [<c11bed9d>] ? acpi_idle_enter_simple+0x13b/0x168
[   40.090633]  [<c12dd435>] cpuidle_idle_call+0x6b/0xda
[   40.090645]  [<c1001a3c>] cpu_idle+0x44/0x74
[   40.090657]  [<c141a1b1>] start_secondary+0x1b2/0x1b7
[   40.090666] ---[ end trace 5e47c395a6f397dc ]---
[   40.090862] ------------[ cut here ]------------

dmesg with this patch after a cold boot:
http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris-cold.txt

Thanks,
--
Jaswinder Singh.


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-11 19:53       ` Chris Wilson
  2010-05-11 20:05         ` Jaswinder Singh Rajput
@ 2010-05-12 13:15         ` Jaswinder Singh Rajput
  2010-05-12 13:50           ` Chris Wilson
  1 sibling, 1 reply; 12+ messages in thread
From: Jaswinder Singh Rajput @ 2010-05-12 13:15 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

Hello Chris,

On Wed, May 12, 2010 at 1:23 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput <jaswinderlinux@gmail.com> wrote:
>> Hello Chris and Andrew,
>>
>> I did further testing and noticed that this patch fixes the boot
>> errors and warnings and I get the XWindows.
>>
>> But XWindows freezes after some time.
>
> The BUG you were hitting before is on the error collection path which
> presumably is still being triggered during boot by a GPU error.
> Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
> recorded anything? And if not, wait until it freezes and then please file
> a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
> and dmesg.
>

With this patch after XWindows freezes, I get :

[   70.433064] wlan0: no IPv6 routers present
[  227.490064] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[  227.490098] ------------[ cut here ]------------
[  227.490124] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[  227.490135] Hardware name: Aspire one
[  227.490143] Modules linked in: nf_conntrack_ftp ath9k ath9k_common ath9k_hw battery [last unloaded: scsi_wait_scan]
[  227.490183] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #8
[  227.490193] Call Trace:
[  227.490214]  [<c1030ecb>] warn_slowpath_common+0x65/0x7c

freeze dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634_chris_hang.txt

freeze Xorg.log : http://userweb.kernel.org/~jaswinder/acer_netbook/Xorg_log.txt

So this patch only delays the BUG and warning messages; I can work in
XWindows for only a few minutes with it applied.
Andrew's patch is in Linus's git tree. Can you please rebase your patch
on top of Andrew's patch so that I can do further testing?

Thanks,
--
Jaswinder Singh.


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-12 13:15         ` Jaswinder Singh Rajput
@ 2010-05-12 13:50           ` Chris Wilson
  2010-05-12 14:31             ` Jaswinder Singh Rajput
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2010-05-12 13:50 UTC (permalink / raw)
  To: Jaswinder Singh Rajput
  Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

On Wed, 12 May 2010 18:45:55 +0530, Jaswinder Singh Rajput <jaswinderlinux@gmail.com> wrote:
> Hello Chris,
> 
> With this patch after XWindows freezes, I get :
[snip]
> freeze dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634_chris_hang.txt
> 
> freeze Xorg.log : http://userweb.kernel.org/~jaswinder/acer_netbook/Xorg_log.txt

Jaswinder can you also upload the /sys/kernel/debug/dri/0/i915_error_state
following a freeze as well, please. If your /sys/kernel/debug is empty,
you will need to "mount -tdebugfs debug /sys/kernel/debug".
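
The same check-then-mount step can be done from a program. This is a hedged sketch rather than a recommended tool: `dir_is_empty()` and `mount_debugfs_if_empty()` are hypothetical helpers, the second one relies on the Linux mount(2) system call with the same source name and filesystem type as the command above, and it needs root to succeed.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>
#ifdef __linux__
#include <sys/mount.h>
#endif

/*
 * Return 1 if path is a directory with no entries besides "." and "..",
 * 0 if it has entries, and -1 if it cannot be opened.
 */
static int dir_is_empty(const char *path)
{
	struct dirent *de;
	int empty = 1;
	DIR *d;

	assert(path != NULL);
	d = opendir(path);
	if (d == NULL)
		return -1;
	while ((de = readdir(d)) != NULL) {
		if (strcmp(de->d_name, ".") != 0 &&
		    strcmp(de->d_name, "..") != 0) {
			empty = 0;
			break;
		}
	}
	closedir(d);
	return empty;
}

#ifdef __linux__
/*
 * Mount debugfs on path only if path is an empty directory.
 * Returns 0 if the directory is already populated or the mount
 * succeeded, and -1 otherwise.
 */
static int mount_debugfs_if_empty(const char *path)
{
	int empty = dir_is_empty(path);

	if (empty < 0)
		return -1;
	if (empty == 0)
		return 0;
	return mount("debug", path, "debugfs", 0, NULL);
}
#endif
```

Calling `mount_debugfs_if_empty("/sys/kernel/debug")` leaves an already mounted debugfs untouched, because a populated directory short-circuits before mount(2) is reached.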

> So this patch only delays the BUG and warning messages; I can work in
> XWindows for only a few minutes with it applied.
> Andrew's patch is in Linus's git tree. Can you please rebase your patch
> on top of Andrew's patch so that I can do further testing?

What is in the tree is adequate for the time being. It will capture the
batch buffer into the error state. My follow-on patch only increases the
level of paranoia. Thanks for testing.

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-12 13:50           ` Chris Wilson
@ 2010-05-12 14:31             ` Jaswinder Singh Rajput
  2010-05-13 21:01               ` Jaswinder Singh Rajput
  0 siblings, 1 reply; 12+ messages in thread
From: Jaswinder Singh Rajput @ 2010-05-12 14:31 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

Hello Chris,

On Wed, May 12, 2010 at 7:20 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, 12 May 2010 18:45:55 +0530, Jaswinder Singh Rajput <jaswinderlinux@gmail.com> wrote:
>> Hello Chris,
>>
>> With this patch after XWindows freezes, I get :
> [snip]
>> freeze dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634_chris_hang.txt
>>
>> freeze Xorg.log : http://userweb.kernel.org/~jaswinder/acer_netbook/Xorg_log.txt
>
> Jaswinder can you also upload the /sys/kernel/debug/dri/0/i915_error_state
> following a freeze as well, please. If your /sys/kernel/debug is empty,
> you will need to "mount -tdebugfs debug /sys/kernel/debug".
>

i915_error_state :
http://userweb.kernel.org/~jaswinder/acer_netbook/i915_error_state.txt

Thanks,
--
Jaswinder Singh.


* Re: [PATCH] drm/i915: Record error batch buffers using iomem
  2010-05-12 14:31             ` Jaswinder Singh Rajput
@ 2010-05-13 21:01               ` Jaswinder Singh Rajput
  0 siblings, 0 replies; 12+ messages in thread
From: Jaswinder Singh Rajput @ 2010-05-13 21:01 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Andrew Morton, dri-devel, Dave Airlie, linux-kernel

Hello Chris,

On Wed, May 12, 2010 at 8:01 PM, Jaswinder Singh Rajput
<jaswinderlinux@gmail.com> wrote:
> Hello Chris,
>
> On Wed, May 12, 2010 at 7:20 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> On Wed, 12 May 2010 18:45:55 +0530, Jaswinder Singh Rajput <jaswinderlinux@gmail.com> wrote:
>>> Hello Chris,
>>>
>>> With this patch after XWindows freezes, I get :
>> [snip]
>>> freeze dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634_chris_hang.txt
>>>
>>> freeze Xorg.log : http://userweb.kernel.org/~jaswinder/acer_netbook/Xorg_log.txt
>>
>> Jaswinder can you also upload the /sys/kernel/debug/dri/0/i915_error_state
>> following a freeze as well, please. If your /sys/kernel/debug is empty,
>> you will need to "mount -tdebugfs debug /sys/kernel/debug".
>>
>
> i915_error_state :
> http://userweb.kernel.org/~jaswinder/acer_netbook/i915_error_state.txt
>

If you need more information, please let me know.

I am waiting for your feedback.

Thanks,
--
Jaswinder Singh.


Thread overview: 12+ messages
     [not found] <0100511104818.8382a7de.akpm@linux-foundation.org>
2010-05-11 18:22 ` [PATCH] drm/i915: Record error batch buffers using iomem Chris Wilson
2010-05-11 15:37   ` Andrew Morton
2010-05-11 18:49     ` Chris Wilson
2010-05-11 19:22   ` Jaswinder Singh Rajput
2010-05-11 19:38     ` Jaswinder Singh Rajput
2010-05-11 19:53       ` Chris Wilson
2010-05-11 20:05         ` Jaswinder Singh Rajput
2010-05-11 21:45           ` Jaswinder Singh Rajput
2010-05-12 13:15         ` Jaswinder Singh Rajput
2010-05-12 13:50           ` Chris Wilson
2010-05-12 14:31             ` Jaswinder Singh Rajput
2010-05-13 21:01               ` Jaswinder Singh Rajput
