Re: [LTP] [PATCH v2 1/4] read_all: Add worker timeout and rewrite scheduling

From: Richard Palethorpe <rpalethorpe@suse.de>
To: Li Wang <liwang@redhat.com>
Cc: LTP List <ltp@lists.linux.it>
Subject: Re: [LTP] [PATCH v2 1/4] read_all: Add worker timeout and rewrite scheduling
Date: Thu, 04 Aug 2022 10:47:43 +0100
Message-ID: <874jystixh.fsf@suse.de>
In-Reply-To: <CAEemH2ck+baL4PA5F5OoL1VRCS+W=CJHpXJbx9-jUNVa9ofR0Q@mail.gmail.com>

Hello Li,

Li Wang <liwang@redhat.com> writes:

> Hi Richard, All,
>
> On Mon, Jul 25, 2022 at 6:06 PM Richard Palethorpe <rpalethorpe@suse.com> wrote:
>
>  Kill and restart workers that take too long to read a file. The
>  default being 10% of max_runtime. A custom time can be set with the
>  new -t option.
>
>  This is to prevent a worker from blocking forever in a read. Currently
>  when this happens the whole test times out and any remaining files in
>  the worker's queue are not tested.
>
>  As a side effect we can now also set the timeout very low to cause
>  partial reads. However setting it to less than the time it takes to
>  start (fork) a new worker is treated as an error. Forking takes much
>  longer than most reads.
>
>  A number of other possible issues were fixed as well as a side effect
>  of changing the scheduling:
>
>  + The worker queues are now filled in a
>    "round robin" way. Before this only happened when -r was large
>    enough.
>  + When testing is finished the main process waits on the child procs before
>    destroying the semaphores and worker queues.
>  + max_runtime is set to 100 secs. This is so that the worker timeout
>    is a round number.
>  + Files which don't implement O_NONBLOCK and may block, no longer need
>    to be avoided. Even if they refuse to die immediately;
>    although this may result in TBROK due to zombie processes.
>
>  Note that with a worker timeout of 1s, 2 to 3 files will usually timeout on
>  my workstation. With 2s, none will. In any case testing completes in
>  under 3s for proc, sys or dev.
>
>  This is much faster than many other machines. It's quite likely the
>  worker timeout and max_runtime need to be increased on small and very
>  large machines. This can be done manually by setting LTP_RUNTIME_MUL.
>
> Yes, apart from a bit of difficulty (at least for me) to comprehend the detailed
> behavior of this scheduler :). 
>
> Thanks for your improvements! Just one tiny query below.
>
> Reviewed-by: Li Wang <liwang@redhat.com>

Thanks!
>
>   
>  --- a/testcases/kernel/fs/read_all/read_all.c
>  +++ b/testcases/kernel/fs/read_all/read_all.c
>  ...
>
>  +#include <signal.h>
>   #include <sys/types.h>
>   #include <sys/stat.h>
>  +#include <sys/wait.h>
>   #include <lapi/fnmatch.h>
>   #include <stdlib.h>
>   #include <stdio.h>
>  @@ -43,7 +45,10 @@
>   #include <pwd.h>
>   #include <grp.h>
>
>  +#include "tst_atomic.h"
>  +#include "tst_safe_clocks.h"
>   #include "tst_test.h"
>  +#include "tst_timer.h"
>
>   #define QUEUE_SIZE 16384
>   #define BUFFER_SIZE 1024
>  @@ -55,11 +60,15 @@ struct queue {
>          int front;
>          int back;
>          char data[QUEUE_SIZE];
>
> I doubt whether we need to maintain a queue to store the paths.
> During the test it seems only one path is being pushed in the q->data[],
> so the rest of the space is wasted, right?
>
> By shrinking the QUEUE_SIZE to equal BUFFER_SIZE, the test
> still works normally.

The rest of the space should be in use unless the code is buggy. It's a
circular buffer holding multiple variable-length items, each terminated
by \0.
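
To make the layout concrete, here is a minimal sketch of a circular byte
buffer holding several \0-delimited items. It is an illustration under
assumptions, not the code in read_all.c: RING_SIZE and the ring_* names
are hypothetical, the struct only mirrors the front/back/data fields
shown in the diff, and the semaphore used to wake the worker is left out.

```c
#include <stddef.h>
#include <string.h>

#define RING_SIZE 32 /* hypothetical; kept small so wrap-around is easy to see */

struct ring {
	int front; /* index of the next byte to read */
	int back;  /* index of the next byte to write */
	char data[RING_SIZE];
};

/* Bytes currently stored, including the \0 terminators. */
static int ring_used(const struct ring *q)
{
	return (q->back - q->front + RING_SIZE) % RING_SIZE;
}

/* Append one \0-terminated item; return 0 if it does not fit. */
static int ring_push(struct ring *q, const char *item)
{
	size_t len = strlen(item) + 1;
	size_t i;

	/* One byte is kept free so that front == back always means empty. */
	if (len > (size_t)(RING_SIZE - 1 - ring_used(q)))
		return 0;

	for (i = 0; i < len; i++) {
		q->data[q->back] = item[i];
		q->back = (q->back + 1) % RING_SIZE;
	}

	return 1;
}

/*
 * Copy the oldest item into buf. Return 0 if the queue is empty or
 * buf is too small; in both cases the queue is left unchanged.
 */
static int ring_pop(struct ring *q, char *buf, size_t buf_len)
{
	size_t i = 0;
	int pos = q->front;
	char c;

	if (q->front == q->back)
		return 0;

	do {
		if (i >= buf_len)
			return 0;
		c = q->data[pos];
		buf[i++] = c;
		pos = (pos + 1) % RING_SIZE;
	} while (c != '\0');

	q->front = pos;
	return 1;
}
```

With a layout like this, short paths pack tightly, so a queue of
QUEUE_SIZE bytes can hold many more than one path at a time.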

However it's a very good point you make. It's true that having a
multi-item queue on each worker is not necessary.

At most we need two buffers (or a flip buffer) on each worker, one
containing the current path being worked on and one with the next path
to be worked on.
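
A rough sketch of what such a flip buffer could look like follows. It is
only an illustration under assumptions: PATH_LEN and the flip_* names are
hypothetical, and the shared memory and the atomics or semaphores needed
between the main process and a worker are omitted.

```c
#include <stddef.h>
#include <string.h>

#define PATH_LEN 1024 /* hypothetical size of one path slot */

struct flip_buf {
	int cur;       /* slot holding the path the worker is reading */
	int next_full; /* non-zero when the other slot holds a pending path */
	char path[2][PATH_LEN];
};

/*
 * Main process side: stage the next path in the spare slot.
 * Return 0 if a path is already pending or the path is too long.
 */
static int flip_stage(struct flip_buf *fb, const char *path)
{
	size_t len = strlen(path) + 1;

	if (fb->next_full || len > PATH_LEN)
		return 0;

	memcpy(fb->path[!fb->cur], path, len);
	fb->next_full = 1;
	return 1;
}

/*
 * Worker side: flip to the staged path and return it.
 * Return NULL if nothing is pending.
 */
static const char *flip_take(struct flip_buf *fb)
{
	if (!fb->next_full)
		return NULL;

	fb->cur = !fb->cur;
	fb->next_full = 0;
	return fb->path[fb->cur];
}
```

The main process can stage one path ahead while the worker is still
reading the current one; a second flip_stage() call fails until the
worker flips, which is where a worker could end up waiting on the main
process.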

This would be much simpler than the circular buffer queue I originally
implemented. It may mean the workers are more likely to starve while
waiting for the main process. However, the main process usually adds
work far faster than the workers can consume it.

So we should probably swap the queue for a flip buffer. I'll look into
that and merge this patchset as it is for now.

-- 
Thank you,
Richard.

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

