* Segfault on OS X
@ 2013-01-08 19:37 Niraj Tolia
  2013-01-08 20:01 ` Jens Axboe
From: Niraj Tolia @ 2013-01-08 19:37 UTC
  To: fio

I am running fio (HEAD:a28b019) on OS X (10.8.2) and just ran into a
segfault after more than an hour of running the benchmark. Will dig
into this more but wanted to check if someone else had run into this.
I did manage to get a core though. There were three threads running
with two sitting in __semwait_signal () (via usleep) and the third
was:

[Switching to thread 3 (core thread 2)]
0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
510            if (break_on_this_error(td, io_u->ddir, &ret))
(gdb) where
#0  0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
#1  0x00007fff885d1742 in _pthread_start ()
#2  0x00007fff885be181 in thread_start ()

It seems like io_u is null here.

Cheers,
Niraj


* Re: Segfault on OS X
  2013-01-08 19:37 Segfault on OS X Niraj Tolia
@ 2013-01-08 20:01 ` Jens Axboe
  2013-01-09  5:32   ` Niraj Tolia
From: Jens Axboe @ 2013-01-08 20:01 UTC
  To: Niraj Tolia; +Cc: fio

On Tue, Jan 08 2013, Niraj Tolia wrote:
> I am running fio (HEAD:a28b019) on OS X (10.8.2) and just ran into a
> segfault after more than an hour of running the benchmark. Will dig
> into this more but wanted to check if someone else had run into this.
> I did manage to get a core though. There were three threads running
> with two sitting in __semwait_signal () (via usleep) and the third
> was:
> 
> [Switching to thread 3 (core thread 2)]
> 0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
> 510            if (break_on_this_error(td, io_u->ddir, &ret))
> (gdb) where
> #0  0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
> #1  0x00007fff885d1742 in _pthread_start ()
> #2  0x00007fff885be181 in thread_start ()
> 
> It seems like io_u is null here.

My first thought was "impossible", but looking at the code, we do
clear io_u on requeue events, so the io_u->ddir dereference below the
main switch is a bug. The patch below should fix it; I've committed it.

diff --git a/backend.c b/backend.c
index 225d8a3..099bd9b 100644
--- a/backend.c
+++ b/backend.c
@@ -422,6 +422,7 @@ static void do_verify(struct thread_data *td)
 
 	io_u = NULL;
 	while (!td->terminate) {
+		enum fio_ddir ddir;
 		int ret2, full;
 
 		update_tv_cache(td);
@@ -456,6 +457,8 @@ static void do_verify(struct thread_data *td)
 		else
 			io_u->end_io = verify_io_u;
 
+		ddir = io_u->ddir;
+
 		ret = td_io_queue(td, io_u);
 		switch (ret) {
 		case FIO_Q_COMPLETED:
@@ -507,7 +510,7 @@ sync_done:
 			break;
 		}
 
-		if (break_on_this_error(td, io_u->ddir, &ret))
+		if (break_on_this_error(td, ddir, &ret))
 			break;
 
 		/*

-- 
Jens Axboe
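
The pattern behind the patch above can be shown in isolation: copy the
field you still need out of the structure before calling code that may
clear the pointer, then use only the copy afterwards. The sketch below
is a minimal illustration, not fio code. Every sketch_* name is
hypothetical, and clearing the pointer through a pointer-to-pointer is
an assumption made only to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not fio's real definitions. */
enum sketch_ddir { SKETCH_READ = 0, SKETCH_WRITE = 1 };

struct sketch_io_u {
	enum sketch_ddir ddir;
};

enum sketch_q_status { SKETCH_Q_COMPLETED, SKETCH_Q_REQUEUE };

/*
 * Hypothetical queue step. On a requeue it clears the caller's pointer,
 * mimicking the "we do clear io_u on requeue events" behaviour
 * described in the thread.
 */
static enum sketch_q_status sketch_queue(struct sketch_io_u **io_u,
					 int requeue)
{
	if (requeue) {
		*io_u = NULL;
		return SKETCH_Q_REQUEUE;
	}
	return SKETCH_Q_COMPLETED;
}

/*
 * Submit one unit and return its data direction. The direction is
 * cached before queueing, so it stays available even when the queue
 * step has cleared the pointer.
 */
static enum sketch_ddir sketch_submit(struct sketch_io_u *io_u, int requeue)
{
	enum sketch_ddir ddir;

	assert(io_u != NULL);
	ddir = io_u->ddir;	/* cache while the pointer is still valid */

	(void)sketch_queue(&io_u, requeue);

	/* io_u may be NULL here; only the cached copy is safe to use. */
	return ddir;
}
```

Calling sketch_submit() with requeue set to 1 exercises the path where
the pointer is cleared. Reading io_u->ddir after sketch_queue() on that
path would be the same kind of NULL dereference as the one in the
backtrace.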



* Re: Segfault on OS X
  2013-01-08 20:01 ` Jens Axboe
@ 2013-01-09  5:32   ` Niraj Tolia
From: Niraj Tolia @ 2013-01-09  5:32 UTC
  To: Jens Axboe; +Cc: fio

On Tue, Jan 8, 2013 at 12:01 PM, Jens Axboe <axboe@kernel.dk> wrote:
> On Tue, Jan 08 2013, Niraj Tolia wrote:
>> I am running fio (HEAD:a28b019) on OS X (10.8.2) and just ran into a
>> segfault after more than an hour of running the benchmark. Will dig
>> into this more but wanted to check if someone else had run into this.
>> I did manage to get a core though. There were three threads running
>> with two sitting in __semwait_signal () (via usleep) and the third
>> was:
>>
>> [Switching to thread 3 (core thread 2)]
>> 0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
>> 510            if (break_on_this_error(td, io_u->ddir, &ret))
>> (gdb) where
>> #0  0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
>> #1  0x00007fff885d1742 in _pthread_start ()
>> #2  0x00007fff885be181 in thread_start ()
>>
>> It seems like io_u is null here.
>
> My first thought was "impossible", but looking at the code, we do
> clear io_u on requeue events. So that dereference below the
> main switch is a bug. The below should fix it, I've committed it.
>

Thanks for the really quick turn-around. I picked up the patch and ran
it for a while without any failures. Will definitely report back if
something else comes up.

Cheers,
Niraj


