* Segfault on OS X
From: Niraj Tolia @ 2013-01-08 19:37 UTC
To: fio
I am running fio (HEAD:a28b019) on OS X (10.8.2) and just ran into a
segfault after more than an hour of running the benchmark. Will dig
into this more but wanted to check if someone else had run into this.
I did manage to get a core though. There were three threads running
with two sitting in __semwait_signal () (via usleep) and the third
was:
[Switching to thread 3 (core thread 2)]
0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
510 if (break_on_this_error(td, io_u->ddir, &ret))
(gdb) where
#0 0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
#1 0x00007fff885d1742 in _pthread_start ()
#2 0x00007fff885be181 in thread_start ()
It seems like io_u is null here.
Cheers,
Niraj
* Re: Segfault on OS X
From: Jens Axboe @ 2013-01-08 20:01 UTC
To: Niraj Tolia; +Cc: fio
On Tue, Jan 08 2013, Niraj Tolia wrote:
> I am running fio (HEAD:a28b019) on OS X (10.8.2) and just ran into a
> segfault after more than an hour of running the benchmark. Will dig
> into this more but wanted to check if someone else had run into this.
> I did manage to get a core though. There were three threads running
> with two sitting in __semwait_signal () (via usleep) and the third
> was:
>
> [Switching to thread 3 (core thread 2)]
> 0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
> 510 if (break_on_this_error(td, io_u->ddir, &ret))
> (gdb) where
> #0 0x000000010fcf7910 in thread_main (data=0x1105fe000) at backend.c:510
> #1 0x00007fff885d1742 in _pthread_start ()
> #2 0x00007fff885be181 in thread_start ()
>
> It seems like io_u is null here.
My first thought was "impossible", but looking at the code, we do
clear io_u on requeue events. So that dereference below the
main switch is a bug. The below should fix it, I've committed it.
diff --git a/backend.c b/backend.c
index 225d8a3..099bd9b 100644
--- a/backend.c
+++ b/backend.c
@@ -422,6 +422,7 @@ static void do_verify(struct thread_data *td)
 
 	io_u = NULL;
 	while (!td->terminate) {
+		enum fio_ddir ddir;
 		int ret2, full;
 
 		update_tv_cache(td);
@@ -456,6 +457,8 @@ static void do_verify(struct thread_data *td)
 		else
 			io_u->end_io = verify_io_u;
 
+		ddir = io_u->ddir;
+
 		ret = td_io_queue(td, io_u);
 		switch (ret) {
 		case FIO_Q_COMPLETED:
@@ -507,7 +510,7 @@ sync_done:
 			break;
 		}
 
-		if (break_on_this_error(td, io_u->ddir, &ret))
+		if (break_on_this_error(td, ddir, &ret))
 			break;
 
 		/*
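
For readers of the archive, here is a minimal, self-contained C sketch of the pattern the patch applies: copy the field you need while the pointer is known to be valid, then use the copy after any path that may clear the pointer. Every name in it (struct io_unit, handle_one, the Q_* results) is a hypothetical stand-in chosen for illustration, not fio's real definitions, and the requeue handling is reduced to a single assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT fio's real types. */
enum ddir { DDIR_READ = 0, DDIR_WRITE = 1 };
enum queue_result { Q_COMPLETED, Q_REQUEUE };

struct io_unit {
	enum ddir ddir;
};

/*
 * Handle one unit. 'ret' simulates what a queue call returned.
 * Returns the direction to use for error handling and reports through
 * 'cleared' whether the local pointer was set to NULL along the way.
 */
static enum ddir handle_one(struct io_unit *io_u, enum queue_result ret,
			    int *cleared)
{
	enum ddir ddir;

	/* Copy the field while io_u is still known to be valid. */
	assert(io_u != NULL);
	ddir = io_u->ddir;

	switch (ret) {
	case Q_REQUEUE:
		/* The unit goes back on a queue; the local pointer is dropped. */
		io_u = NULL;
		break;
	case Q_COMPLETED:
		break;
	}

	*cleared = (io_u == NULL);

	/* Reading io_u->ddir here would crash on the requeue path. */
	return ddir;
}
```

Calling handle_one() with Q_REQUEUE still returns the direction that was submitted, even though the local pointer is NULL by the time it is needed. That is the property the cached ddir gives the error check after the switch.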
--
Jens Axboe
* Re: Segfault on OS X
From: Niraj Tolia @ 2013-01-09 5:32 UTC
To: Jens Axboe; +Cc: fio
On Tue, Jan 8, 2013 at 12:01 PM, Jens Axboe <axboe@kernel.dk> wrote:
> My first thought was "impossible", but looking at the code, we do
> clear io_u on requeue events. So that dereference below the
> main switch is a bug. The below should fix it, I've committed it.
>
Thanks for the really quick turn-around. I picked up the patch and ran
it for a while without any failures. Will definitely report back if
something else comes up.
Cheers,
Niraj