* [PATCH 1/2] fio: Simplify forking of processes
@ 2016-05-24 15:03 Jan Kara
From: Jan Kara @ 2016-05-24 15:03 UTC
To: fio; +Cc: Jan Kara
There is no reason to re-attach to shared memory segments after fork(2).
The shmat(2) man page says explicitly:
After a fork(2), the child inherits the attached shared memory segments.
So get rid of some unnecessary code.
Signed-off-by: Jan Kara <jack@suse.cz>
---
backend.c | 46 +++++++---------------------------------------
1 file changed, 7 insertions(+), 39 deletions(-)
diff --git a/backend.c b/backend.c
index 6d503606b10e..95385bc0b8b3 100644
--- a/backend.c
+++ b/backend.c
@@ -1791,39 +1791,6 @@ err:
return (void *) (uintptr_t) td->error;
}
-
-/*
- * We cannot pass the td data into a forked process, so attach the td and
- * pass it to the thread worker.
- */
-static int fork_main(struct sk_out *sk_out, int shmid, int offset)
-{
- struct fork_data *fd;
- void *data, *ret;
-
-#if !defined(__hpux) && !defined(CONFIG_NO_SHM)
- data = shmat(shmid, NULL, 0);
- if (data == (void *) -1) {
- int __err = errno;
-
- perror("shmat");
- return __err;
- }
-#else
- /*
- * HP-UX inherits shm mappings?
- */
- data = threads;
-#endif
-
- fd = calloc(1, sizeof(*fd));
- fd->td = data + offset * sizeof(struct thread_data);
- fd->sk_out = sk_out;
- ret = thread_main(fd);
- shmdt(data);
- return (int) (uintptr_t) ret;
-}
-
static void dump_td_info(struct thread_data *td)
{
log_err("fio: job '%s' (state=%d) hasn't exited in %lu seconds, it "
@@ -2161,6 +2128,7 @@ reap:
struct thread_data *map[REAL_MAX_JOBS];
struct timeval this_start;
int this_jobs = 0, left;
+ struct fork_data *fd;
/*
* create threads (TD_NOT_CREATED -> TD_CREATED)
@@ -2210,14 +2178,13 @@ reap:
map[this_jobs++] = td;
nr_started++;
+ fd = calloc(1, sizeof(*fd));
+ fd->td = td;
+ fd->sk_out = sk_out;
+
if (td->o.use_thread) {
- struct fork_data *fd;
int ret;
- fd = calloc(1, sizeof(*fd));
- fd->td = td;
- fd->sk_out = sk_out;
-
dprint(FD_PROCESS, "will pthread_create\n");
ret = pthread_create(&td->thread, NULL,
thread_main, fd);
@@ -2237,8 +2204,9 @@ reap:
dprint(FD_PROCESS, "will fork\n");
pid = fork();
if (!pid) {
- int ret = fork_main(sk_out, shm_id, i);
+ int ret;
+ ret = (int)(uintptr_t)thread_main(fd);
_exit(ret);
} else if (i == fio_debug_jobno)
*fio_debug_jobp = pid;
--
2.6.6
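The shmat(2) behaviour that patch 1/2 relies on can be checked in isolation. The following is a minimal sketch, not fio code: the function name shm_inherit_demo and the value 42 are made up for illustration. It attaches a System V segment once, forks, lets the child write through the inherited mapping without calling shmat() again, and has the parent read the value back.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of fio). Returns 0 if a value written
 * by the forked child is visible to the parent through the attachment made
 * before fork(), 1 if it is not, and -1 on a setup error.
 */
int shm_inherit_demo(void)
{
	volatile int *shared;
	int shmid, status, ok;
	void *addr;
	pid_t pid;

	shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
	if (shmid < 0) {
		perror("shmget");
		return -1;
	}

	addr = shmat(shmid, NULL, 0);
	if (addr == (void *) -1) {
		perror("shmat");
		shmctl(shmid, IPC_RMID, NULL);
		return -1;
	}
	/* Mark the segment for removal; it stays alive until the last detach. */
	shmctl(shmid, IPC_RMID, NULL);

	shared = addr;
	*shared = 0;

	pid = fork();
	if (pid < 0) {
		perror("fork");
		shmdt(addr);
		return -1;
	}
	if (pid == 0) {
		/* Child: no shmat() here, the attachment was inherited. */
		*shared = 42;
		_exit(0);
	}

	if (waitpid(pid, &status, 0) < 0) {
		perror("waitpid");
		shmdt(addr);
		return -1;
	}
	ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 && *shared == 42;
	shmdt(addr);
	return ok ? 0 : 1;
}
```

Calling shm_inherit_demo() from a small main() and checking for a return value of 0 is enough to confirm that the child's write landed in the same segment the parent attached before fork().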
* [PATCH 2/2] Fix occasional hangs on mutexes
@ 2016-05-24 15:03 ` Jan Kara
From: Jan Kara @ 2016-05-24 15:03 UTC
To: fio; +Cc: Jan Kara
When running xfstest generic/299 with fio on my test machine, using a
ramdisk as the backing store, I noticed that fio often hangs waiting
for td->io_u_lock. After some debugging I found that the reason is
that mutexes are created as process-private by default, but this
mutex is actually manipulated from several processes. The hang is not
immediately obvious: the mutex is located in shared memory, so as long
as the locking is resolved in userspace, everything works as
expected. Only once kernel futexes come into play is the waiting
process not properly woken up when the futex is released.
Fix the problem by marking all mutexes and condition variables that
are located in shared memory as process-shared.
Signed-off-by: Jan Kara <jack@suse.cz>
---
backend.c | 29 +++++++++++++++++++++++++++--
helper_thread.c | 28 ++++++++++++++++++++++++++--
iolog.c | 7 ++++++-
workqueue.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++-----
4 files changed, 106 insertions(+), 10 deletions(-)
diff --git a/backend.c b/backend.c
index 95385bc0b8b3..4cdeb930a3e8 100644
--- a/backend.c
+++ b/backend.c
@@ -1426,6 +1426,7 @@ static void *thread_main(void *data)
struct thread_options *o = &td->o;
struct sk_out *sk_out = fd->sk_out;
pthread_condattr_t attr;
+ pthread_mutexattr_t mattr;
int clear_state;
int ret;
@@ -1450,10 +1451,34 @@ static void *thread_main(void *data)
INIT_FLIST_HEAD(&td->verify_list);
INIT_FLIST_HEAD(&td->trim_list);
INIT_FLIST_HEAD(&td->next_rand_list);
- pthread_mutex_init(&td->io_u_lock, NULL);
td->io_hist_tree = RB_ROOT;
- pthread_condattr_init(&attr);
+ ret = pthread_mutexattr_init(&mattr);
+ if (ret) {
+ td_verror(td, ret, "pthread_mutexattr_init");
+ goto err;
+ }
+#ifdef FIO_HAVE_PSHARED_MUTEX
+ ret = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
+ if (ret) {
+ td_verror(td, ret, "pthread_mutexattr_setpshared");
+ goto err;
+ }
+#endif
+ pthread_mutex_init(&td->io_u_lock, &mattr);
+
+ ret = pthread_condattr_init(&attr);
+ if (ret) {
+ td_verror(td, ret, "pthread_condattr_init");
+ goto err;
+ }
+#ifdef FIO_HAVE_PSHARED_MUTEX
+ ret = pthread_condattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
+ if (ret) {
+ td_verror(td, ret, "pthread_condattr_setpshared");
+ goto err;
+ }
+#endif
pthread_cond_init(&td->verify_cond, &attr);
pthread_cond_init(&td->free_cond, &attr);
diff --git a/helper_thread.c b/helper_thread.c
index 1befabfca7a0..c14296fb51a2 100644
--- a/helper_thread.c
+++ b/helper_thread.c
@@ -142,14 +142,38 @@ int helper_thread_create(struct fio_mutex *startup_mutex, struct sk_out *sk_out)
{
struct helper_data *hd;
int ret;
+ pthread_condattr_t cattr;
+ pthread_mutexattr_t mattr;
hd = smalloc(sizeof(*hd));
setup_disk_util();
hd->sk_out = sk_out;
- pthread_cond_init(&hd->cond, NULL);
- pthread_mutex_init(&hd->lock, NULL);
+ ret = pthread_mutexattr_init(&mattr);
+ if (ret) {
+ log_err("pthread_mutexattr_init: %s\n", strerror(ret));
+ return 1;
+ }
+ ret = pthread_condattr_init(&cattr);
+ if (ret) {
+ log_err("pthread_condattr_init: %s\n", strerror(ret));
+ return 1;
+ }
+#ifdef FIO_HAVE_PSHARED_MUTEX
+ ret = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
+ if (ret) {
+ log_err("pthread_mutexattr_setpshared: %s\n", strerror(ret));
+ return 1;
+ }
+ ret = pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+ if (ret) {
+ log_err("pthread_condattr_setpshared: %s\n", strerror(ret));
+ return 1;
+ }
+#endif
+ pthread_cond_init(&hd->cond, &cattr);
+ pthread_mutex_init(&hd->lock, &mattr);
hd->startup_mutex = startup_mutex;
ret = pthread_create(&hd->thread, NULL, helper_thread_main, hd);
diff --git a/iolog.c b/iolog.c
index d9a17a5bcc44..e2f9776e3b5c 100644
--- a/iolog.c
+++ b/iolog.c
@@ -576,6 +576,7 @@ void setup_log(struct io_log **log, struct log_params *p,
const char *filename)
{
struct io_log *l;
+ pthread_mutexattr_t mattr;
l = scalloc(1, sizeof(*l));
INIT_FLIST_HEAD(&l->io_logs);
@@ -604,7 +605,11 @@ void setup_log(struct io_log **log, struct log_params *p,
if (l->log_gz && !p->td)
l->log_gz = 0;
else if (l->log_gz || l->log_gz_store) {
- pthread_mutex_init(&l->chunk_lock, NULL);
+ pthread_mutexattr_init(&mattr);
+#ifdef FIO_HAVE_PSHARED_MUTEX
+ pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
+#endif
+ pthread_mutex_init(&l->chunk_lock, &mattr);
p->td->flags |= TD_F_COMPRESS_LOG;
}
diff --git a/workqueue.c b/workqueue.c
index 4f9c414ac119..13edafae38ca 100644
--- a/workqueue.c
+++ b/workqueue.c
@@ -276,10 +276,26 @@ static int start_worker(struct workqueue *wq, unsigned int index,
{
struct submit_worker *sw = &wq->workers[index];
int ret;
+ pthread_condattr_t cattr;
+ pthread_mutexattr_t mattr;
INIT_FLIST_HEAD(&sw->work_list);
- pthread_cond_init(&sw->cond, NULL);
- pthread_mutex_init(&sw->lock, NULL);
+ ret = pthread_condattr_init(&cattr);
+ if (ret)
+ return ret;
+ ret = pthread_mutexattr_init(&mattr);
+ if (ret)
+ return ret;
+#ifdef FIO_HAVE_PSHARED_MUTEX
+ ret = pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+ if (ret)
+ return ret;
+ ret = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
+ if (ret)
+ return ret;
+#endif
+ pthread_cond_init(&sw->cond, &cattr);
+ pthread_mutex_init(&sw->lock, &mattr);
sw->wq = wq;
sw->index = index;
sw->sk_out = sk_out;
@@ -308,15 +324,41 @@ int workqueue_init(struct thread_data *td, struct workqueue *wq,
{
unsigned int running;
int i, error;
+ int ret;
+ pthread_condattr_t cattr;
+ pthread_mutexattr_t mattr;
wq->max_workers = max_workers;
wq->td = td;
wq->ops = *ops;
wq->work_seq = 0;
wq->next_free_worker = 0;
- pthread_cond_init(&wq->flush_cond, NULL);
- pthread_mutex_init(&wq->flush_lock, NULL);
- pthread_mutex_init(&wq->stat_lock, NULL);
+
+ ret = pthread_condattr_init(&cattr);
+ if (ret) {
+ td_verror(td, ret, "pthread_condattr_init");
+ goto err;
+ }
+ ret = pthread_mutexattr_init(&mattr);
+ if (ret) {
+ td_verror(td, ret, "pthread_mutexattr_init");
+ goto err;
+ }
+#ifdef FIO_HAVE_PSHARED_MUTEX
+ ret = pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
+ if (ret) {
+ td_verror(td, ret, "pthread_condattr_setpshared");
+ goto err;
+ }
+ ret = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
+ if (ret) {
+ td_verror(td, ret, "pthread_mutexattr_setpshared");
+ goto err;
+ }
+#endif
+ pthread_cond_init(&wq->flush_cond, &cattr);
+ pthread_mutex_init(&wq->flush_lock, &mattr);
+ pthread_mutex_init(&wq->stat_lock, &mattr);
wq->workers = smalloc(wq->max_workers * sizeof(struct submit_worker));
--
2.6.6
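The setup that patch 2/2 applies can be illustrated outside fio. The sketch below is an assumption-laden toy, not fio code: struct shared_counter, bump() and pshared_mutex_demo() are hypothetical names, and an anonymous MAP_SHARED mapping stands in for fio's shared memory. It places a mutex in memory shared across fork(), marks it PTHREAD_PROCESS_SHARED, and has parent and child contend on it.

```c
#define _DEFAULT_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical structure living in memory shared by parent and child. */
struct shared_counter {
	pthread_mutex_t lock;
	long value;
};

/* Increment the counter n times, taking the shared lock each time. */
static void bump(struct shared_counter *sc, long n)
{
	for (long i = 0; i < n; i++) {
		pthread_mutex_lock(&sc->lock);
		sc->value++;
		pthread_mutex_unlock(&sc->lock);
	}
}

/* Returns the final counter value, or -1 if setup fails. */
long pshared_mutex_demo(long iterations)
{
	pthread_mutexattr_t mattr;
	struct shared_counter *sc;
	long result = -1;
	pid_t pid;
	int ret;

	sc = mmap(NULL, sizeof(*sc), PROT_READ | PROT_WRITE,
		  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (sc == MAP_FAILED)
		return -1;

	if (pthread_mutexattr_init(&mattr))
		goto unmap;
	/* Without this call the mutex defaults to PTHREAD_PROCESS_PRIVATE. */
	ret = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
	if (!ret)
		ret = pthread_mutex_init(&sc->lock, &mattr);
	pthread_mutexattr_destroy(&mattr);
	if (ret)
		goto unmap;

	sc->value = 0;
	pid = fork();
	if (pid == 0) {
		bump(sc, iterations);
		_exit(0);
	}
	if (pid > 0) {
		bump(sc, iterations);
		if (waitpid(pid, NULL, 0) == pid)
			result = sc->value;
	}
	pthread_mutex_destroy(&sc->lock);
unmap:
	munmap(sc, sizeof(*sc));
	return result;
}
```

With the attribute set, both processes finish and the counter equals twice the iteration count. Dropping the pthread_mutexattr_setpshared() call leaves the behaviour undefined by POSIX, which matches the intermittent hang described in the commit message; the sketch does not attempt to reproduce that hang deterministically.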
* Re: [PATCH 2/2] Fix occasional hangs on mutexes
@ 2016-05-25 19:27 ` Jens Axboe
From: Jens Axboe @ 2016-05-25 19:27 UTC
To: Jan Kara, fio
On 05/24/2016 09:03 AM, Jan Kara wrote:
> When running xfstest generic/299 with fio on my test machine, using a
> ramdisk as the backing store, I noticed that fio often hangs waiting
> for td->io_u_lock. After some debugging I found that the reason is
> that mutexes are created as process-private by default, but this
> mutex is actually manipulated from several processes. The hang is not
> immediately obvious: the mutex is located in shared memory, so as long
> as the locking is resolved in userspace, everything works as
> expected. Only once kernel futexes come into play is the waiting
> process not properly woken up when the futex is released.
>
> Fix the problem by marking all mutexes and condition variables that
> are located in shared memory as process-shared.
Thanks Jan, applied both 1 and 2.
Would be nice to factor out the cv/mutex init code, so we don't have to
essentially copy/paste it wherever we want to have process shared
mutexes or cond vars.
--
Jens Axboe
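The factoring suggested above could take roughly the following shape. This is only a sketch of the idea, not the code that was committed to fio: the names init_pshared_mutex and init_pshared_cond are hypothetical, and the FIO_HAVE_PSHARED_MUTEX guard used in the patch is left out so the example stands alone.

```c
#include <pthread.h>

/*
 * Hypothetical helper: initialize a mutex that may live in shared memory
 * and be used from several processes. Returns 0 or a pthread error code.
 */
int init_pshared_mutex(pthread_mutex_t *mutex)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;

	/* fio would wrap this call in #ifdef FIO_HAVE_PSHARED_MUTEX. */
	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (!ret)
		ret = pthread_mutex_init(mutex, &attr);

	/* The attribute object is no longer needed once init has run. */
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Hypothetical helper: same pattern for a condition variable. */
int init_pshared_cond(pthread_cond_t *cond)
{
	pthread_condattr_t attr;
	int ret;

	ret = pthread_condattr_init(&attr);
	if (ret)
		return ret;

	ret = pthread_condattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (!ret)
		ret = pthread_cond_init(cond, &attr);

	pthread_condattr_destroy(&attr);
	return ret;
}
```

Each call site in the patch would then shrink to one call plus one error check, and the attribute objects get destroyed in a single place, which the open-coded versions in the patch do not do.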
* Re: [PATCH 2/2] Fix occasional hangs on mutexes
@ 2016-05-25 19:56 ` Jens Axboe
From: Jens Axboe @ 2016-05-25 19:56 UTC
To: Jan Kara, fio
On 05/25/2016 01:27 PM, Jens Axboe wrote:
> On 05/24/2016 09:03 AM, Jan Kara wrote:
>> When running xfstest generic/299 with fio on my test machine, using a
>> ramdisk as the backing store, I noticed that fio often hangs waiting
>> for td->io_u_lock. After some debugging I found that the reason is
>> that mutexes are created as process-private by default, but this
>> mutex is actually manipulated from several processes. The hang is not
>> immediately obvious: the mutex is located in shared memory, so as long
>> as the locking is resolved in userspace, everything works as
>> expected. Only once kernel futexes come into play is the waiting
>> process not properly woken up when the futex is released.
>>
>> Fix the problem by marking all mutexes and condition variables that
>> are located in shared memory as process-shared.
>
> Thanks Jan, applied both 1 and 2.
>
> Would be nice to factor out the cv/mutex init code, so we don't have to
> essentially copy/paste it wherever we want to have process shared
> mutexes or cond vars.
I did that:
http://git.kernel.dk/cgit/fio/commit/?id=34febb23fa9c7b9b0d54c324effff1a808a8fe6e
--
Jens Axboe