public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [2.6.33-rc6-git regression] idr fix breaks Xorg
@ 2010-02-04  1:28 Andy Isaacson
  2010-02-04  3:11 ` Tejun Heo
  2010-02-04  7:56 ` Andy Isaacson
  0 siblings, 2 replies; 9+ messages in thread
From: Andy Isaacson @ 2010-02-04  1:28 UTC (permalink / raw)
  To: linux-kernel, dri-devel, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 1593 bytes --]

On my Dell Latitude e4300 commit 859ddf0974 ("idr: fix a critical
misallocation bug") causes Xorg to segfault with the following
backtrace:

...
(II) intel(0): Initializing HW Cursor
(II) intel(0): No memory allocations

Backtrace:
0: /usr/bin/X11/X(xorg_backtrace+0x26) [0x4f00c6]
1: /usr/bin/X11/X(xf86SigHandler+0x41) [0x4852c1]
2: /lib/libc.so.6 [0x7fbdad071530]
3: /usr/lib/xorg/modules/drivers//intel_drv.so [0x7fbdab8a1cbd]
4: /usr/lib/xorg/modules/drivers//intel_drv.so(gen4_render_state_init+0x416) [0x
7fbdab8a2a36]
5: /usr/lib/xorg/modules/drivers//intel_drv.so [0x7fbdab87d3e8]
6: /usr/lib/xorg/modules/drivers//intel_drv.so [0x7fbdab87e6fb]
7: /usr/bin/X11/X(AddScreen+0x1d4) [0x4337c4]
8: /usr/bin/X11/X(InitOutput+0x76f) [0x46f27f]
9: /usr/bin/X11/X(main+0x1fe) [0x433ece]
10: /lib/libc.so.6(__libc_start_main+0xfd) [0x7fbdad05cabd]
11: /usr/bin/X11/X [0x433509]
Saw signal 11.  Server aborting.

Reverting commit 859ddf0974 causes X to start working again -- tested on
top of c80d292f.

I'm running Karmic 64-bit with

xserver-xorg-core                    2:1.6.4-2ubuntu4.1
xserver-xorg-video-intel             2:2.9.0-1ubuntu2.1
libdrm-intel1                        2.4.14-1ubuntu1
libdrm2                              2.4.14-1ubuntu1

More information (dmesg, lspci, full dpkg, etc) is available at
http://web.hexapodia.org/~adi/bugs/201002-xorg-segv/

I've attached my kconfig as that seems like the most likely difference
between my config and others who are (presumably?) not seeing this
crash.  (Gzipped, alas, because AFAIK 70KB will exceed vger's size
liimit.)

-andy

[-- Attachment #2: kconfig.gz --]
[-- Type: application/x-gunzip, Size: 16711 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  1:28 [2.6.33-rc6-git regression] idr fix breaks Xorg Andy Isaacson
@ 2010-02-04  3:11 ` Tejun Heo
  2010-02-04  7:08   ` Andy Isaacson
  2010-02-04  7:56 ` Andy Isaacson
  1 sibling, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2010-02-04  3:11 UTC (permalink / raw)
  To: Andy Isaacson; +Cc: linux-kernel, dri-devel

On 02/04/2010 10:28 AM, Andy Isaacson wrote:
> On my Dell Latitude e4300 commit 859ddf0974 ("idr: fix a critical
> misallocation bug") causes Xorg to segfault with the following
> backtrace:

Does the following patch make any difference?

diff --git a/lib/idr.c b/lib/idr.c
index ba7d37c..a96c604 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -140,7 +140,8 @@ static int sub_alloc(struct idr *idp, int *starting_id, struct idr_layer **pa)
 	id = *starting_id;
  restart:
 	p = idp->top;
-	l = p->layer;
+	l = idp->layers;
+	pa[l--] = NULL;
 	while (1) {
 		/*
 		 * We run around this while until we reach the leaf node...


-- 
tejun

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  3:11 ` Tejun Heo
@ 2010-02-04  7:08   ` Andy Isaacson
  0 siblings, 0 replies; 9+ messages in thread
From: Andy Isaacson @ 2010-02-04  7:08 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, dri-devel

On Thu, Feb 04, 2010 at 12:11:41PM +0900, Tejun Heo wrote:
> On 02/04/2010 10:28 AM, Andy Isaacson wrote:
> > On my Dell Latitude e4300 commit 859ddf0974 ("idr: fix a critical
> > misallocation bug") causes Xorg to segfault with the following
> > backtrace:
> 
> Does the following patch make any difference?
> 
> diff --git a/lib/idr.c b/lib/idr.c
> index ba7d37c..a96c604 100644
> --- a/lib/idr.c
> +++ b/lib/idr.c
> @@ -140,7 +140,8 @@ static int sub_alloc(struct idr *idp, int *starting_id, struct idr_layer **pa)
>  	id = *starting_id;
>   restart:
>  	p = idp->top;
> -	l = p->layer;
> +	l = idp->layers;
> +	pa[l--] = NULL;
>  	while (1) {
>  		/*
>  		 * We run around this while until we reach the leaf node...

Alas, no -- with that patch applied Xorg still segfaults in the same
place.

-andy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  1:28 [2.6.33-rc6-git regression] idr fix breaks Xorg Andy Isaacson
  2010-02-04  3:11 ` Tejun Heo
@ 2010-02-04  7:56 ` Andy Isaacson
  2010-02-04  8:16   ` Tejun Heo
  1 sibling, 1 reply; 9+ messages in thread
From: Andy Isaacson @ 2010-02-04  7:56 UTC (permalink / raw)
  To: linux-kernel, dri-devel, Tejun Heo

On Wed, Feb 03, 2010 at 05:28:37PM -0800, Andy Isaacson wrote:
> On my Dell Latitude e4300 commit 859ddf0974 ("idr: fix a critical
> misallocation bug") causes Xorg to segfault with the following
> backtrace:
> 
> ...
> (II) intel(0): Initializing HW Cursor
> (II) intel(0): No memory allocations
> 
> Backtrace:
> 0: /usr/bin/X11/X(xorg_backtrace+0x26) [0x4f00c6]
> 1: /usr/bin/X11/X(xf86SigHandler+0x41) [0x4852c1]
> 2: /lib/libc.so.6 [0x7fbdad071530]
> 3: /usr/lib/xorg/modules/drivers//intel_drv.so [0x7fbdab8a1cbd]
> 4: /usr/lib/xorg/modules/drivers//intel_drv.so(gen4_render_state_init+0x416) [0x
> 7fbdab8a2a36]
> 5: /usr/lib/xorg/modules/drivers//intel_drv.so [0x7fbdab87d3e8]
> 6: /usr/lib/xorg/modules/drivers//intel_drv.so [0x7fbdab87e6fb]
> 7: /usr/bin/X11/X(AddScreen+0x1d4) [0x4337c4]
> 8: /usr/bin/X11/X(InitOutput+0x76f) [0x46f27f]
> 9: /usr/bin/X11/X(main+0x1fe) [0x433ece]
> 10: /lib/libc.so.6(__libc_start_main+0xfd) [0x7fbdad05cabd]
> 11: /usr/bin/X11/X [0x433509]
> Saw signal 11.  Server aborting.

strace perhaps shows some more illuminating results; I suspect the
SEGV is due to a failed ioctl.  (This is a different run than the above,
obviously.)

1265267921.566623 ioctl(8, 0xc020645e, 0x7fffe2196980) = 0
1265267921.566944 ioctl(8, 0x400c645f, 0x7fffe21969a0) = 0
1265267921.567272 brk(0x20e7000)        = 0x20e7000
1265267921.567602 ioctl(8, 0x40046460, 0x7fffe21969a0) = 0
1265267921.567922 ioctl(8, 0xc010645b, 0x7fffe2196990) = 0
1265267921.568269 ioctl(8, 0xc020645e, 0x7fffe2196980) = -1 EBADF (Bad file descriptor)
1265267921.568649 write(2, "../../../libdrm/intel/intel_bufmgr_gem.c:637: Error mapping buffer 1073741824 (gen4 WM state): Bad file descriptor .\n", 117) = 117
1265267921.569039 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

The full strace is at
http://web.hexapodia.org/~adi/bugs/201002-xorg-segv/Xorg.strace

-andy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  7:56 ` Andy Isaacson
@ 2010-02-04  8:16   ` Tejun Heo
  2010-02-04  8:40     ` Dave Airlie
  2010-02-04  9:39     ` Chris Wilson
  0 siblings, 2 replies; 9+ messages in thread
From: Tejun Heo @ 2010-02-04  8:16 UTC (permalink / raw)
  To: Andy Isaacson; +Cc: linux-kernel, dri-devel

Hello,

On 02/04/2010 04:56 PM, Andy Isaacson wrote:
> 1265267921.568269 ioctl(8, 0xc020645e, 0x7fffe2196980) = -1 EBADF (Bad file descriptor)

Hmm... -EBADF?  I suppose it doesn't mean that the fd is invalid in
this case but that the mapped object can't be found for some reason?
Can anyone more familiar with the subsystem explain what's going on?

> 1265267921.568649 write(2, "../../../libdrm/intel/intel_bufmgr_gem.c:637: Error mapping buffer 1073741824 (gen4 WM state): Bad file descriptor .\n", 117) = 117
> 1265267921.569039 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

I'll forward the fore mentioned fix as it at least fixes one reported
failure.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  8:16   ` Tejun Heo
@ 2010-02-04  8:40     ` Dave Airlie
  2010-02-04  8:45       ` Tejun Heo
  2010-02-04  9:39     ` Chris Wilson
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Airlie @ 2010-02-04  8:40 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Andy Isaacson, linux-kernel, dri-devel

On Thu, Feb 4, 2010 at 6:16 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On 02/04/2010 04:56 PM, Andy Isaacson wrote:
>> 1265267921.568269 ioctl(8, 0xc020645e, 0x7fffe2196980) = -1 EBADF (Bad file descriptor)
>
> Hmm... -EBADF?  I suppose it doesn't mean that the fd is invalid in
> this case but that the mapped object can't be found for some reason?
> Can anyone more familiar with the subsystem explain what's going on?
>
>> 1265267921.568649 write(2, "../../../libdrm/intel/intel_bufmgr_gem.c:637: Error mapping buffer 1073741824 (gen4 WM state): Bad file descriptor .\n", 117) = 117
>> 1265267921.569039 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>
> I'll forward the fore mentioned fix as it at least fixes one reported
> failure.

Hmm at this late stage, maybe revert first? since the old idr code works fine
with the subsystems in question.

The drm idr code usage isn't anything crazy, the EBADF is the return code
from the mmap ioctl when it calls the idr lookup function for a handle.

The lookup function is just an idr_find inside a spinlock.

Dave.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  8:40     ` Dave Airlie
@ 2010-02-04  8:45       ` Tejun Heo
  0 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2010-02-04  8:45 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Andy Isaacson, linux-kernel, dri-devel

Hello,

On 02/04/2010 05:40 PM, Dave Airlie wrote:
> Hmm at this late stage, maybe revert first? since the old idr code works fine
> with the subsystems in question.
> 
> The drm idr code usage isn't anything crazy, the EBADF is the return code
> from the mmap ioctl when it calls the idr lookup function for a handle.
> 
> The lookup function is just an idr_find inside a spinlock.

Yeap, you're right.  I'll send a patch to revert it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  8:16   ` Tejun Heo
  2010-02-04  8:40     ` Dave Airlie
@ 2010-02-04  9:39     ` Chris Wilson
  2010-02-11  8:50       ` Tejun Heo
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2010-02-04  9:39 UTC (permalink / raw)
  To: Tejun Heo, Andy Isaacson; +Cc: dri-devel, linux-kernel

On Thu, 04 Feb 2010 17:16:56 +0900, Tejun Heo <tj@kernel.org> wrote:
> Hello,
> 
> On 02/04/2010 04:56 PM, Andy Isaacson wrote:
> > 1265267921.568269 ioctl(8, 0xc020645e, 0x7fffe2196980) = -1 EBADF (Bad file descriptor)
> 
> Hmm... -EBADF?  I suppose it doesn't mean that the fd is invalid in
> this case but that the mapped object can't be found for some reason?
> Can anyone more familiar with the subsystem explain what's going on?

Correct, in this case we pass in an 'fd' via the ioctl that we wish to
mmap. idr is then used to translate that handle back to one of our buffer
objects, which is then prepared for mapping.

In this context, it is the immediate lookup of the handle following
allocation that is failing. The critical bit of code lives in
drivers/gpu/drm/drm_gem.c:

int
drm_gem_handle_create(struct drm_file *file_priv,
                       struct drm_gem_object *obj,
                       u32 *handlep)
{
        int     ret;

        /*
         * Get the user-visible handle using idr.
         */
again:
        /* ensure there is space available to allocate a handle */
        if (idr_pre_get(&file_priv->object_idr, GFP_KERNEL) == 0)
                return -ENOMEM;

        /* do the allocation under our spinlock */
        spin_lock(&file_priv->table_lock);
        ret = idr_get_new_above(&file_priv->object_idr, obj, 1, (int
*)handlep);
        spin_unlock(&file_priv->table_lock);
        if (ret == -EAGAIN)
                goto again;

        if (ret != 0)
                return ret;

        drm_gem_object_handle_reference(obj);
        return 0;
}

struct drm_gem_object *
drm_gem_object_lookup(struct drm_device *dev, struct drm_file *filp,
                      u32 handle)
{
        struct drm_gem_object *obj;

        spin_lock(&filp->table_lock);

        /* Check if we currently have a reference on the object */
        obj = idr_find(&filp->object_idr, handle);
        if (obj == NULL) {
                spin_unlock(&filp->table_lock);
                return NULL;
        }

        drm_gem_object_reference(obj);

        spin_unlock(&filp->table_lock);

        return obj;
}

Can you see any problems here?

> 
> > 1265267921.568649 write(2, "../../../libdrm/intel/intel_bufmgr_gem.c:637: Error mapping buffer 1073741824 (gen4 WM state): Bad file descriptor .\n", 117) = 117
> > 1265267921.569039 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

This SEGV is just lazy error handling in the userspace driver; the
impossible just happened.
-ickle

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [2.6.33-rc6-git regression] idr fix breaks Xorg
  2010-02-04  9:39     ` Chris Wilson
@ 2010-02-11  8:50       ` Tejun Heo
  0 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2010-02-11  8:50 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Andy Isaacson, dri-devel, linux-kernel

Hello,

Can you please test the following patch on top of the current
mainline?  Thanks.

diff --git a/lib/idr.c b/lib/idr.c
index 1cac726..0dc7822 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -156,10 +156,12 @@ static int sub_alloc(struct idr *idp, int *starting_id, struct idr_layer **pa)
 			id = (id | ((1 << (IDR_BITS * l)) - 1)) + 1;
 
 			/* if already at the top layer, we need to grow */
-			if (!(p = pa[l])) {
+			if (id >= 1 << (idp->layers * IDR_BITS)) {
 				*starting_id = id;
 				return IDR_NEED_TO_GROW;
 			}
+			p = pa[l];
+			BUG_ON(!p);
 
 			/* If we need to go up one layer, continue the
 			 * loop; otherwise, restart from the top.

-- 
tejun

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-02-11  8:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-02-04  1:28 [2.6.33-rc6-git regression] idr fix breaks Xorg Andy Isaacson
2010-02-04  3:11 ` Tejun Heo
2010-02-04  7:08   ` Andy Isaacson
2010-02-04  7:56 ` Andy Isaacson
2010-02-04  8:16   ` Tejun Heo
2010-02-04  8:40     ` Dave Airlie
2010-02-04  8:45       ` Tejun Heo
2010-02-04  9:39     ` Chris Wilson
2010-02-11  8:50       ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox