* 2.4.17 agpgart process hang on crash
@ 2002-02-02 21:55 Lars Christensen
  2002-02-02 22:17 ` Alan Cox
  2002-02-02 22:32 ` Andrew Morton
  0 siblings, 2 replies; 7+ messages in thread
From: Lars Christensen @ 2002-02-02 21:55 UTC (permalink / raw)
  To: linux-kernel, linux-bugs


Hi. I have experienced a problem with the combination of kernel 2.4.17,
the kernel agpgart module and the NVIDIA-supplied drivers. I don't know
which is the cause of the problem.

Symptoms: Whenever an OpenGL application crashes (segfault etc.), the
process hangs and can't be killed. It responds to no signals (not even
SIGKILL). ps -ef also hangs, apparently when it reaches the crashed
process (some other processes are listed first).

Hardware: AMD Athlon 1.333 GHz, ASUS M266 motherboard (AMD761 AGP
chipset), NVIDIA GeForce2 MX400 graphics card.

The mem=nopentium option has no effect on the problem, but the problem
doesn't occur if I use the NVIDIA AGP drivers or the kernel 2.4.16 agp
drivers. I am not able to test the 2.4.17 agpgart with 3D hardware other
than NVIDIA.


-- 
Lars Christensen, larsch@cs.auc.dk



* Re: 2.4.17 agpgart process hang on crash
  2002-02-02 21:55 2.4.17 agpgart process hang on crash Lars Christensen
@ 2002-02-02 22:17 ` Alan Cox
  2002-02-02 22:29   ` Lars Christensen
  2002-02-02 22:32 ` Andrew Morton
  1 sibling, 1 reply; 7+ messages in thread
From: Alan Cox @ 2002-02-02 22:17 UTC (permalink / raw)
  To: Lars Christensen; +Cc: linux-kernel, linux-bugs

> Hi. I have experienced a problem with the combination of kernel 2.4.17,
> the kernel agpgart module and the NVIDIA-supplied drivers. I don't know
> which is the cause of the problem.
> 

Please report problems seen with the NVIDIA drivers loaded to NVIDIA. They
have the kernel source; we do not have their source code. Only they can
help you.

Alan


* Re: 2.4.17 agpgart process hang on crash
  2002-02-02 22:17 ` Alan Cox
@ 2002-02-02 22:29   ` Lars Christensen
  0 siblings, 0 replies; 7+ messages in thread
From: Lars Christensen @ 2002-02-02 22:29 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

On Sat, 2 Feb 2002, Alan Cox wrote:

> > Hi. I have experienced a problem with the combination of kernel 2.4.17,
> > the kernel agpgart module and the NVIDIA-supplied drivers. I don't know
> > which is the cause of the problem.
> >
>
> Please report problems seen with the NVIDIA drivers loaded to NVIDIA. They
> have the kernel source; we do not have their source code. Only they can
> help you.

I am sorry, my initial testing wasn't thorough enough. Now, booting to
single-user mode without any NVIDIA drivers loaded, I can reproduce the bug:

modprobe agpgart      # loads fine, AMD 761 chipset found
ulimit -c unlimited   # the hang only occurs if core files are written
./testgart &
pkill -ABRT testgart  # before testgart ends

Both the testgart and the pkill processes hang. Nothing will kill them.
"pkill pkill" hangs too :)

The testgart program is the one by Jeff Hartmann.

So the NVIDIA drivers don't seem to be the cause. Note that with
ulimit -c 0, testgart terminates, printing "Aborted".
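
Since the hang depends on whether a core file may be written, the same
limit can be toggled from inside a test program instead of from the shell.
The following is a minimal sketch, assuming a POSIX system; the helper
names set_core_limit() and core_dumps_enabled() are hypothetical and are
not part of testgart.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: set the soft core-file size limit, roughly what
 * `ulimit -c <n>` does in the shell. The request is clamped to the hard
 * limit. Returns 0 on success, -1 on failure.
 */
int set_core_limit(rlim_t soft)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY &&
        (soft == RLIM_INFINITY || soft > rl.rlim_max))
        soft = rl.rlim_max;
    rl.rlim_cur = soft;
    return setrlimit(RLIMIT_CORE, &rl);
}

/*
 * Hypothetical helper: returns 1 if a core file may be written,
 * 0 if core dumps are disabled, -1 on error.
 */
int core_dumps_enabled(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0;
}
```

Calling set_core_limit(0) before abort() would correspond to the
"ulimit -c 0" case above, and set_core_limit(RLIM_INFINITY) to the
"ulimit -c unlimited" case.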

-- 
Lars Christensen, larsch@cs.auc.dk



* Re: 2.4.17 agpgart process hang on crash
  2002-02-02 21:55 2.4.17 agpgart process hang on crash Lars Christensen
  2002-02-02 22:17 ` Alan Cox
@ 2002-02-02 22:32 ` Andrew Morton
  2002-02-02 23:13   ` Lars Christensen
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2002-02-02 22:32 UTC (permalink / raw)
  To: Lars Christensen; +Cc: linux-kernel, linux-bugs

Lars Christensen wrote:
> 
> Hi. I have experienced a problem with the combination of kernel 2.4.17,
> the kernel agpgart module and the NVIDIA-supplied drivers. I don't know
> which is the cause of the problem.
> 
> Symptoms: Whenever an OpenGL application crashes (segfault etc.), the
> process hangs and can't be killed. It responds to no signals (not even
> SIGKILL). ps -ef also hangs, apparently when it reaches the crashed
> process (some other processes are listed first).
> 
> Hardware: AMD Athlon 1.333 GHz, ASUS M266 motherboard (AMD761 AGP
> chipset), NVIDIA GeForce2 MX400 graphics card.
> 
> The mem=nopentium option has no effect on the problem, but the problem
> doesn't occur if I use the NVIDIA AGP drivers or the kernel 2.4.16 agp
> drivers. I am not able to test the 2.4.17 agpgart with 3D hardware other
> than NVIDIA.
> 

This is possibly because the crashing application tries to dump
core, and the kernel gets a fault accessing the video card's
mapping, and deadlocks over the recursive attempt to take mmap_sem.

Please apply this patch:

	http://www.zip.com.au/~akpm/linux/2.4/2.4.18-pre7/fbmem-mmap.patch

and send a report back.

-
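
The suspected failure mode above (a core dump that faults on the mapping
and then tries to take the same lock again) can be illustrated in user
space. This is only a toy model under stated assumptions: struct toy_sem
stands in for a non-recursive lock such as mmap_sem, every name here is
hypothetical, and where a real lock would block forever the model returns
-EDEADLK so that the example terminates.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive lock (illustrative only). */
struct toy_sem {
    bool held;
};

/* Returns 0 on success, -EDEADLK where a real lock would block forever. */
static int toy_down(struct toy_sem *sem)
{
    if (sem->held)
        return -EDEADLK;
    sem->held = true;
    return 0;
}

static void toy_up(struct toy_sem *sem)
{
    sem->held = false;
}

/* Hypothetical fault handler: needs the lock to resolve the fault. */
static int toy_handle_fault(struct toy_sem *sem)
{
    int rc = toy_down(sem);

    if (rc == 0)
        toy_up(sem);
    return rc;
}

/*
 * Hypothetical core dumper: holds the lock while reading the mapping.
 * If reading the mapping faults, the fault handler re-enters the lock
 * on the same call chain.
 */
int toy_dump_core(struct toy_sem *sem, bool mapping_faults)
{
    int rc = toy_down(sem);

    if (rc != 0)
        return rc;
    if (mapping_faults)
        rc = toy_handle_fault(sem);
    toy_up(sem);
    return rc;
}
```

With mapping_faults set to false the dump completes; with it set to true
the nested acquisition is reported as -EDEADLK, which in the real kernel
would show up as an unkillable process.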


* Re: 2.4.17 agpgart process hang on crash
  2002-02-02 22:32 ` Andrew Morton
@ 2002-02-02 23:13   ` Lars Christensen
  2002-02-02 23:32     ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Lars Christensen @ 2002-02-02 23:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton

On Sat, 2 Feb 2002, Andrew Morton wrote:

> Lars Christensen wrote:
> >
> > Hi. I have experienced a problem with the combination of kernel 2.4.17,
> > the kernel agpgart module and the NVIDIA-supplied drivers. I don't know
> > which is the cause of the problem.
> >
> > Symptoms: Whenever an OpenGL application crashes (segfault etc.), the
> > process hangs and can't be killed. It responds to no signals (not even
> > SIGKILL). ps -ef also hangs, apparently when it reaches the crashed
> > process (some other processes are listed first).
> >
> > Hardware: AMD Athlon 1.333 GHz, ASUS M266 motherboard (AMD761 AGP
> > chipset), NVIDIA GeForce2 MX400 graphics card.
> >
> > The mem=nopentium option has no effect on the problem, but the problem
> > doesn't occur if I use the NVIDIA AGP drivers or the kernel 2.4.16 agp
> > drivers. I am not able to test the 2.4.17 agpgart with 3D hardware other
> > than NVIDIA.
> >
>
> This is possibly because the crashing application tries to dump
> core, and the kernel gets a fault accessing the video card's
> mapping, and deadlocks over the recursive attempt to take mmap_sem.
>
> Please apply this patch:
>
> 	http://www.zip.com.au/~akpm/linux/2.4/2.4.18-pre7/fbmem-mmap.patch
>
> and send a report back.

No luck. Still hangs (e.g. with ./testgart & pkill -ABRT testgart), with
and without that patch and with and without 2.4.18-pre7. Does seem to
happen when dumping core--it doesn't happen with core dumping disabled.

-- 
Lars Christensen, larsch@cs.auc.dk



* Re: 2.4.17 agpgart process hang on crash
  2002-02-02 23:13   ` Lars Christensen
@ 2002-02-02 23:32     ` Andrew Morton
  2002-02-03  0:13       ` Lars Christensen
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2002-02-02 23:32 UTC (permalink / raw)
  To: Lars Christensen; +Cc: linux-kernel

Lars Christensen wrote:
> 
> No luck. Still hangs (e.g. with ./testgart & pkill -ABRT testgart), with
> and without that patch and with and without 2.4.18-pre7. Does seem to
> happen when dumping core--it doesn't happen with core dumping disabled.
> 

This one, please:

--- linux-2.4.18-pre7/drivers/char/agp/agpgart_fe.c	Sun Aug 12 10:38:48 2001
+++ linux-akpm/drivers/char/agp/agpgart_fe.c	Sat Feb  2 15:29:49 2002
@@ -605,19 +605,18 @@ static int agp_mmap(struct file *file, s
 	agp_client *client;
 	agp_file_private *priv = (agp_file_private *) file->private_data;
 	agp_kern_info kerninfo;
+	int ret = -EPERM;
 
 	lock_kernel();
 	AGP_LOCK();
 
 	if (agp_fe.backend_acquired != TRUE) {
-		AGP_UNLOCK();
-		unlock_kernel();
-		return -EPERM;
+		ret = -EPERM;
+		goto out;
 	}
 	if (!(test_bit(AGP_FF_IS_VALID, &priv->access_flags))) {
-		AGP_UNLOCK();
-		unlock_kernel();
-		return -EPERM;
+		ret = -EPERM;
+		goto out;
 	}
 	agp_copy_info(&kerninfo);
 	size = vma->vm_end - vma->vm_start;
@@ -627,52 +626,46 @@ static int agp_mmap(struct file *file, s
 
 	if (test_bit(AGP_FF_IS_CLIENT, &priv->access_flags)) {
 		if ((size + offset) > current_size) {
-			AGP_UNLOCK();
-			unlock_kernel();
-			return -EINVAL;
+			ret = -EINVAL;
+			goto out;
 		}
 		client = agp_find_client_by_pid(current->pid);
 
 		if (client == NULL) {
-			AGP_UNLOCK();
-			unlock_kernel();
-			return -EPERM;
+			ret = -EPERM;
+			goto out;
 		}
 		if (!agp_find_seg_in_client(client, offset,
 					    size, vma->vm_page_prot)) {
-			AGP_UNLOCK();
-			unlock_kernel();
-			return -EINVAL;
+			ret = -EINVAL;
+			goto out;
 		}
 		if (remap_page_range(vma->vm_start,
 				     (kerninfo.aper_base + offset),
 				     size, vma->vm_page_prot)) {
-			AGP_UNLOCK();
-			unlock_kernel();
-			return -EAGAIN;
-		}
-		AGP_UNLOCK();
-		unlock_kernel();
-		return 0;
+			ret = -EAGAIN;
+			goto out;
+		}
+		ret = 0;
+		goto out;
 	}
 	if (test_bit(AGP_FF_IS_CONTROLLER, &priv->access_flags)) {
 		if (size != current_size) {
-			AGP_UNLOCK();
-			unlock_kernel();
-			return -EINVAL;
+			ret = -EINVAL;
+			goto out;
 		}
 		if (remap_page_range(vma->vm_start, kerninfo.aper_base,
 				     size, vma->vm_page_prot)) {
-			AGP_UNLOCK();
-			unlock_kernel();
-			return -EAGAIN;
-		}
-		AGP_UNLOCK();
-		unlock_kernel();
-		return 0;
+			ret = -EAGAIN;
+			goto out;
+		}
+		ret = 0;
 	}
+out:
 	AGP_UNLOCK();
 	unlock_kernel();
+	if (ret == 0)
+		vma->vm_flags |= VM_IO;
 	return -EPERM;
 }
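
The patch above does two things: it funnels every exit of agp_mmap()
through a single "out" label so the unlock calls appear once, and it marks
the mapping as I/O memory on success, presumably so that the core dumper
skips it. A minimal user-space sketch of that shape follows. All names
(toy_mmap, struct toy_vma, TOY_VM_IO, toy_lock) are hypothetical stand-ins
and this is not the kernel code. Note that the sketch returns the computed
status rather than a constant.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_VM_IO 0x1u  /* hypothetical flag, stands in for VM_IO */

struct toy_vma {
    size_t size;
    unsigned int flags;
};

static int toy_lock_depth;  /* tracks lock/unlock balance */

static void toy_lock(void)   { toy_lock_depth++; }
static void toy_unlock(void) { toy_lock_depth--; }

/*
 * Single-exit pattern: every error path sets ret and jumps to "out",
 * so the lock is released in exactly one place.
 */
int toy_mmap(struct toy_vma *vma, int acquired, size_t aperture_size)
{
    int ret = -EPERM;

    toy_lock();
    if (!acquired)
        goto out;               /* ret is already -EPERM */
    if (vma->size != aperture_size) {
        ret = -EINVAL;
        goto out;
    }
    ret = 0;
out:
    toy_unlock();
    if (ret == 0)
        vma->flags |= TOY_VM_IO; /* flag the mapping only on success */
    return ret;                  /* the computed status, not a constant */
}
```

The design choice is that a new error path only needs to set ret and
jump to the label, which makes it harder to forget an unlock.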


* Re: 2.4.17 agpgart process hang on crash
  2002-02-02 23:32     ` Andrew Morton
@ 2002-02-03  0:13       ` Lars Christensen
  0 siblings, 0 replies; 7+ messages in thread
From: Lars Christensen @ 2002-02-03  0:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton

On Sat, 2 Feb 2002, Andrew Morton wrote:

> Lars Christensen wrote:
> >
> > No luck. Still hangs (e.g. with ./testgart & pkill -ABRT testgart), with
> > and without that patch and with and without 2.4.18-pre7. Does seem to
> > happen when dumping core--it doesn't happen with core dumping disabled.
> >
>
> This one, please:
>
> --- linux-2.4.18-pre7/drivers/char/agp/agpgart_fe.c	Sun Aug 12 10:38:48 2001
> +++ linux-akpm/drivers/char/agp/agpgart_fe.c	Sat Feb  2 15:29:49 2002
> @@ -605,19 +605,18 @@ static int agp_mmap(struct file *file, s

<snip>

> +	if (ret == 0)
> +		vma->vm_flags |= VM_IO;
>  	return -EPERM;
>  }

> Sorry - make the last statement `return ret;'

Better. The process dumps core now, but ps -ef still hangs after printing
a few processes. Also, with an app running with a window open in X, the
window stays, so apparently the process isn't gone.

-- 
Lars Christensen, larsch@cs.auc.dk


