All of lore.kernel.org
 help / color / mirror / Atom feed
From: Richard Palethorpe <rpalethorpe@suse.de>
To: Jan Stancek <jstancek@redhat.com>
Cc: LTP List <ltp@lists.linux.it>
Subject: Re: [LTP] [PATCH 1/2] read_all: Add worker timeout
Date: Mon, 18 Jul 2022 14:01:49 +0100	[thread overview]
Message-ID: <87cze2ikud.fsf@suse.de> (raw)
In-Reply-To: <87h73eisbk.fsf@suse.de>

Hello,

Richard Palethorpe <rpalethorpe@suse.de> writes:

> Hello,
>
> Jan Stancek <jstancek@redhat.com> writes:
>
>> On Tue, Jul 12, 2022 at 2:46 PM Richard Palethorpe via ltp
>> <ltp@lists.linux.it> wrote:
>>>
>>> Kill and restart workers that take too long to read a file. The
>>> default being one second. A custom time can be set with the new -t
>>> option.
>>>
>>> This is to prevent a worker from blocking forever in a read. Currently
>>> when this happens the whole test times out and any remaining files in
>>> the worker's queue are not tested.
>>>
>>> As a side effect we can now also set the timeout very low to cause
>>> partial reads.
>>>
>>> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
>>> Cc: Joerg Vehlow <lkml@jv-coder.de>
>>> Cc: Li Wang <liwang@redhat.com>
>>> ---
>>>  testcases/kernel/fs/read_all/read_all.c | 83 ++++++++++++++++++++++++-
>>>  1 file changed, 82 insertions(+), 1 deletion(-)
>>
>>>
>>> +static void restart_worker(struct worker *const worker)
>>> +{
>>> +       int wstatus, ret, i, q_len;
>>> +       struct timespec now;
>>> +
>>> +       kill(worker->pid, SIGKILL);
>>> +       ret = waitpid(worker->pid, &wstatus, 0);
>>
>> Is there a chance we could get stuck in uninterruptible read? I think I saw some
>> in past, but those may be blacklisted already, so this may only be something
>> to watch for if we still get test timeouts in future.
>>
>
> I was hoping that kill is special somehow, but I suppose that I should
> check exactly what happens. If the process is stuck inside the kernel
> then we don't want to wait too long for it. We just need to know that
> the kill signal was delivered and that the process will not return to
> userland. If we have a large number of zombies then it could exhaust the
> PIDs or some other resource, but most reads are done very quickly and
> don't need interrupting.

AFAICT fatal signals are special which means a reschedule event is sent
immediately on kill. However I guess the IPI doesn't get delivered to
another CPU synchronously, not to mention that preemption might be
disabled, so we don't know what state everything is in by the time
kill() returns.

I guess also some long running kernel code will continue to run after it
has received kill and been rescheduled. So wait could block for a longer
period. In this case though we don't need to worry about the process we
are trying to kill returning to userland.

Regardless of whether all that is correct we can mark the worker as
failed, call kill then call waitpid with WNOHANG. Then either restart it
or return to the main loop. If we didn't restart it, the next time we
encounter that worker we can call waitpid again. After some number of
failures we can abandon waiting on it and create a new process.

-- 
Thank you,
Richard.

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

      reply	other threads:[~2022-07-18 13:58 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-12 12:46 [LTP] [PATCH 1/2] read_all: Add worker timeout Richard Palethorpe via ltp
2022-07-12 12:46 ` [LTP] [PATCH 2/2] read_all: Fix type warnings Richard Palethorpe via ltp
2022-07-14  4:46   ` Li Wang
2022-07-14 17:53   ` Petr Vorel
2022-07-14  4:45 ` [LTP] [PATCH 1/2] read_all: Add worker timeout Li Wang
2022-07-18 10:09   ` Richard Palethorpe
2022-07-19  8:58     ` Li Wang
2022-07-14  6:58 ` Jan Stancek
2022-07-14 17:56   ` Petr Vorel
2022-07-18 10:37     ` Richard Palethorpe
2022-07-18 10:57   ` Richard Palethorpe
2022-07-18 13:01     ` Richard Palethorpe [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87cze2ikud.fsf@suse.de \
    --to=rpalethorpe@suse.de \
    --cc=jstancek@redhat.com \
    --cc=ltp@lists.linux.it \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.