public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] lib/drmtest: don't use asprintf on signal paths
@ 2014-02-04 19:15 Imre Deak
  2014-02-04 21:29 ` Chris Wilson
  0 siblings, 1 reply; 5+ messages in thread
From: Imre Deak @ 2014-02-04 19:15 UTC (permalink / raw)
  To: intel-gfx

It's not signal safe and I got kms_flip in hung state with the backtrace
below, while the parent process waiting for the signal helper to exit.
It was quite easy to reproduce the bug by running

kms_flip --run-subtest=flip-vs-dpms-off-vs-modeset

With the change I couldn't reproduce it.

0  0x00007f9a1362018b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
1  0x00007f9a1359df81 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
2  0x00007f9a1359b6cf in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3  0x00007f9a13628eb6 in __vasprintf_chk () from /lib/x86_64-linux-gnu/libc.so.6
4  0x00007f9a13628e72 in __asprintf_chk () from /lib/x86_64-linux-gnu/libc.so.6
5  0x000000000040a4a2 in asprintf (__fmt=0x417441 "/dev/dri/card%u", __ptr=0x7fff1a972c08)
    at /usr/include/x86_64-linux-gnu/bits/stdio2.h:178
6  drm_get_card () at drmtest.c:190
7  0x000000000040a54a in __drm_open_any () at drmtest.c:229
8  0x000000000040a846 in quiescent_gpu_at_exit (sig=<optimized out>) at drmtest.c:281
9  0x0000000000408759 in call_exit_handlers (sig=3) at drmtest.c:1519
10 fatal_sig_handler (sig=3) at drmtest.c:1543
11 <signal handler called>
12 0x00007f9a13596770 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
13 0x00007f9a135d8f3f in fork () from /lib/x86_64-linux-gnu/libc.so.6
14 0x000000000040b3af in __igt_fork_helper (proc=0x61d8cc <signal_helper>) at drmtest.c:1199
15 0x000000000040b4ce in igt_fork_signal_helper () at drmtest.c:751
16 0x0000000000404167 in main (argc=<optimized out>, argv=<optimized out>) at kms_flip.c:1533

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 lib/drmtest.c | 31 +++++++++----------------------
 1 file changed, 9 insertions(+), 22 deletions(-)

diff --git a/lib/drmtest.c b/lib/drmtest.c
index f7262d7..77de80b 100644
--- a/lib/drmtest.c
+++ b/lib/drmtest.c
@@ -181,17 +181,14 @@ void gem_quiescent_gpu(int fd)
  */
 int drm_get_card(void)
 {
-	char *name;
 	int i, fd;
 
 	for (i = 0; i < 16; i++) {
-		int ret;
+		char name[128];
 
-		ret = asprintf(&name, "/dev/dri/card%u", i);
-		igt_assert(ret != -1);
+		snprintf(name, sizeof(name), "/dev/dri/card%u", i);
 
 		fd = open(name, O_RDWR);
-		free(name);
 
 		if (fd == -1)
 			continue;
@@ -223,15 +220,12 @@ static void oom_adjust_for_doom(void)
 /** Open the first DRM device we can find, searching up to 16 device nodes */
 static int __drm_open_any(void)
 {
-	char *name;
-	int ret, fd;
+	char name[128];
+	int fd;
 
-	ret = asprintf(&name, "/dev/dri/card%d", drm_get_card());
-	if (ret == -1)
-		return -1;
+	snprintf(name, sizeof(name), "/dev/dri/card%d", drm_get_card());
 
 	fd = open(name, O_RDWR);
-	free(name);
 
 	if (!is_intel(fd)) {
 		close(fd);
@@ -245,17 +239,14 @@ static int __drm_open_any(void)
 
 static int __drm_open_any_render(void)
 {
-	char *name;
 	int i, fd;
 
 	for (i = 128; i < (128 + 16); i++) {
-		int ret;
+		char name[128];
 
-		ret = asprintf(&name, "/dev/dri/renderD%u", i);
-		igt_assert(ret != -1);
+		snprintf(name, sizeof(name), "/dev/dri/renderD%u", i);
 
 		fd = open(name, O_RDWR);
-		free(name);
 
 		if (fd == -1)
 			continue;
@@ -1041,14 +1032,10 @@ void __igt_skip_check(const char *file, const int line,
 	int err = errno;
 
 	if (f) {
-		static char *buf;
-
-		/* igt_skip never returns, so try to not leak too badly. */
-		if (buf)
-			free(buf);
+		static char buf[4096];
 
 		va_start(args, f);
-		vasprintf(&buf, f, args);
+		vsnprintf(buf, sizeof(buf), f, args);
 		va_end(args);
 
 		igt_skip("Test requirement not met in function %s, file %s:%i:\n"
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] lib/drmtest: don't use asprintf on signal paths
  2014-02-04 19:15 [PATCH] lib/drmtest: don't use asprintf on signal paths Imre Deak
@ 2014-02-04 21:29 ` Chris Wilson
  2014-02-04 22:04   ` Imre Deak
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Wilson @ 2014-02-04 21:29 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Tue, Feb 04, 2014 at 09:15:14PM +0200, Imre Deak wrote:
> It's not signal safe and I got kms_flip in hung state with the backtrace
> below, while the parent process waiting for the signal helper to exit.
> It was quite easy to reproduce the bug by running
> 
> kms_flip --run-subtest=flip-vs-dpms-off-vs-modeset

snprintf is not signalsafe either (man 7 signal). X goes as far as
implementing its own limited pnprintf() instead.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] lib/drmtest: don't use asprintf on signal paths
  2014-02-04 21:29 ` Chris Wilson
@ 2014-02-04 22:04   ` Imre Deak
  2014-02-04 23:19     ` Daniel Vetter
  0 siblings, 1 reply; 5+ messages in thread
From: Imre Deak @ 2014-02-04 22:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Tue, 2014-02-04 at 21:29 +0000, Chris Wilson wrote:
> On Tue, Feb 04, 2014 at 09:15:14PM +0200, Imre Deak wrote:
> > It's not signal safe and I got kms_flip in hung state with the backtrace
> > below, while the parent process waiting for the signal helper to exit.
> > It was quite easy to reproduce the bug by running
> > 
> > kms_flip --run-subtest=flip-vs-dpms-off-vs-modeset
> 
> snprintf is not signalsafe either (man 7 signal). X goes as far as
> implementing its own limited pnprintf() instead.

Thanks. I got only as far as to realize that asprintf is not signal-safe
b/c of malloc and didn't remember any place with an official list of
allowed functions..

I also missed at least igt_skip() calling vprintf(), so this needs some
more thought.

--Imre

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] lib/drmtest: don't use asprintf on signal paths
  2014-02-04 22:04   ` Imre Deak
@ 2014-02-04 23:19     ` Daniel Vetter
  2014-02-05  9:29       ` Imre Deak
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2014-02-04 23:19 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Wed, Feb 05, 2014 at 12:04:46AM +0200, Imre Deak wrote:
> On Tue, 2014-02-04 at 21:29 +0000, Chris Wilson wrote:
> > On Tue, Feb 04, 2014 at 09:15:14PM +0200, Imre Deak wrote:
> > > It's not signal safe and I got kms_flip in hung state with the backtrace
> > > below, while the parent process waiting for the signal helper to exit.
> > > It was quite easy to reproduce the bug by running
> > > 
> > > kms_flip --run-subtest=flip-vs-dpms-off-vs-modeset
> > 
> > snprintf is not signalsafe either (man 7 signal). X goes as far as
> > implementing its own limited pnprintf() instead.
> 
> Thanks. I got only as far as to realize that asprintf is not signal-safe
> b/c of malloc and didn't remember any place with an official list of
> allowed functions..
> 
> I also missed at least igt_skip() calling vprintf(), so this needs some
> more thought.

Hm, what calls igt_skip from exit handlers? That would be a fairly big bug
in the framework ... If it happens atm I guess we should splatter a few
more asserts all over the place.

For the actual issue at hand I dunno what to do. Trying to keep all exit
handlers signal-save seems like a lost cause since we'll always miss some
of them because normally they run as atexit callbacks. And libc isn't
friendly and tells us when we do something stupid. Overall this feels a
bit like a general reliability nightmare :(

Good ideas welcome, since I lack them.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] lib/drmtest: don't use asprintf on signal paths
  2014-02-04 23:19     ` Daniel Vetter
@ 2014-02-05  9:29       ` Imre Deak
  0 siblings, 0 replies; 5+ messages in thread
From: Imre Deak @ 2014-02-05  9:29 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2892 bytes --]

On Wed, 2014-02-05 at 00:19 +0100, Daniel Vetter wrote:
> On Wed, Feb 05, 2014 at 12:04:46AM +0200, Imre Deak wrote:
> > On Tue, 2014-02-04 at 21:29 +0000, Chris Wilson wrote:
> > > On Tue, Feb 04, 2014 at 09:15:14PM +0200, Imre Deak wrote:
> > > > It's not signal safe and I got kms_flip in hung state with the backtrace
> > > > below, while the parent process waiting for the signal helper to exit.
> > > > It was quite easy to reproduce the bug by running
> > > > 
> > > > kms_flip --run-subtest=flip-vs-dpms-off-vs-modeset
> > > 
> > > snprintf is not signalsafe either (man 7 signal). X goes as far as
> > > implementing its own limited pnprintf() instead.
> > 
> > Thanks. I got only as far as to realize that asprintf is not signal-safe
> > b/c of malloc and didn't remember any place with an official list of
> > allowed functions..
> > 
> > I also missed at least igt_skip() calling vprintf(), so this needs some
> > more thought.
> 
> Hm, what calls igt_skip from exit handlers? That would be a fairly big bug
> in the framework ... If it happens atm I guess we should splatter a few
> more asserts all over the place.

At least quiescent_gpu_at_exit()->__drm_open_any()->drm_get_card().

> For the actual issue at hand I dunno what to do. Trying to keep all exit
> handlers signal-save seems like a lost cause since we'll always miss some
> of them because normally they run as atexit callbacks. And libc isn't
> friendly and tells us when we do something stupid. Overall this feels a
> bit like a general reliability nightmare :(

Yea, doing anything substantial in signal handlers doesn't seem like a
good idea. I haven't found much info about what functions you can call
from exit handlers, but I guess we should treat that case similarly to
signal handlers.

I think the only clean, safe and maintainable way is to run the actual
test in a forked process and do any cleanup in case of error or normal
exit in the parent wrapper process. We wouldn't install any cleanup
signal handlers in the child (children), just let it die and have the
wrapper process cleanup afterwards. If a signal handler for cleanup
would be unavoidable it would be really simple and just communicate
anything necessary to the wrapper so that it can do the actual cleanup.

Forking helpers would happen with the control of the wrapper, using some
IPC like a pipe/socket, where the wrapper would receive the pid of any
newly forked helper and it could make sure none of those are left behind
at exit.

I would also use a different method for the (normal) termination of
forked helpers. Atm, we do kill(SIGQUIT), but using some IPC (pipe)
would be more robust imo.

But, for the short term I would suggest auditing our current atexit
handlers, and making them signal-safe. There are about 10 of those, so
it's not (yet) unsurmountable.

--Imre


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-02-05  9:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-04 19:15 [PATCH] lib/drmtest: don't use asprintf on signal paths Imre Deak
2014-02-04 21:29 ` Chris Wilson
2014-02-04 22:04   ` Imre Deak
2014-02-04 23:19     ` Daniel Vetter
2014-02-05  9:29       ` Imre Deak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox