From mboxrd@z Thu Jan  1 00:00:00 1970
From: Cyril Hrubis <chrubis@suse.cz>
Date: Tue, 3 Dec 2019 16:12:38 +0100
Subject: [LTP] [PATCH] memcg_lib/memcg_process: Better synchronization
 of signal USR1
In-Reply-To: <42d40727-f631-39ff-fdc0-576e13336a4d@jv-coder.de>
References: <20191106073621.58738-1-lkml@jv-coder.de>
 <365bdf26-4e52-2159-17cd-52f2fb22e7fd@jv-coder.de>
 <20191125132957.GC8703@rei.lan>
 <2e5756af-d7ef-7919-da6b-46e7fbf3cb66@jv-coder.de>
 <20191125153245.GA15129@rei.lan>
 <5f914dce-92b7-9070-6230-d76b73d7da34@jv-coder.de>
 <20191126121038.GC16922@rei.lan>
 <42d40727-f631-39ff-fdc0-576e13336a4d@jv-coder.de>
Message-ID: <20191203151238.GI2844@rei>
List-Id: <ltp.lists.linux.it>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ltp@lists.linux.it

Hi!
> > I have written a blog post that partly applies to this case, see:
> >
> > https://people.kernel.org/metan/why-sleep-is-almost-never-acceptable-in-tests
> I know where you are coming from and it is basically the same as my own 
> opinion.
> The difference is: When I look at ltp I see a runtime of more than 6 
> hours, looking at the
> controller test alone it is more than 4 hours. This puts 30 seconds into 
> a very different
> perspective than looking at only syscall tests. (In the testrun I looked 
> at it is around 13 minutes).
> That is why I don't care about 30 seconds in this case.

The controllers testrun takes 25 minutes on our servers and will probably
drop to 15 minutes in two or three years with the next upgrade. The
main point is that hardware keeps getting faster, but a sleep in a test
does not scale with it and ends up being a problem sooner or later.
It also greatly depends on which HW you are running the tests on.

> > So the problem is that sometimes the program has not finished handling
> > the first signal and we are sending another, right?
> >
> > I guess that the proper solution would be avoiding the signals in the
> > first place. I guess that we can establish two-way communication with
> > fifos, which would also mean that we would get notified as fast as the
> > child dies as well.
> Correct. Using fifos is probably a viable solution, but it would require 
> library work,
> because otherwise the overhead is way too big.
> Another thing I can think of is extending tst_checkpoint wait to also 
> watch a process
> and stop waiting, if that process dies. This would be the simplest way 
> to get good
> synchronization and get rid of the sleep.

I'm not sure we can implement this without introducing another race
condition. The only race-free way to wake a futex from sleep before it
times out is to send a signal, in which case the wait should fail with
EINTR. But that would mean that the process that is waking up the
futex has to be a child of the waiting process, unless we reparent that
process, but all that would be too tricky I guess.
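
To illustrate the EINTR case, here is a minimal sketch, not LTP library
code. It assumes Linux and a SIGCHLD handler installed without
SA_RESTART. The names futex_word, sigchld_handler and
wait_until_child_exits are made up for the example, and the short delay
in the child is only there so that the parent is most likely already
sleeping in the futex when the child exits.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical futex word; it stays 0 and nobody ever calls FUTEX_WAKE. */
static uint32_t futex_word;

static void sigchld_handler(int sig)
{
	(void)sig; /* Nothing to do, the delivery itself interrupts the wait. */
}

/*
 * Returns the errno the futex wait failed with (EINTR is expected),
 * 0 if the wait returned successfully, or -1 if the setup failed.
 */
static int wait_until_child_exits(void)
{
	struct sigaction sa;
	struct timespec timeout = { .tv_sec = 5, .tv_nsec = 0 };
	pid_t pid;
	long ret;
	int err;

	/* No SA_RESTART, so the interrupted wait is not restarted. */
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigchld_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL))
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;

	if (!pid) {
		/* Demo only: give the parent time to enter the futex wait. */
		usleep(100000);
		_exit(0);
	}

	/* Sleeps until woken, interrupted by a signal, or the 5s timeout. */
	ret = syscall(SYS_futex, &futex_word, FUTEX_WAIT, 0, &timeout, NULL, 0);
	err = ret ? errno : 0;

	waitpid(pid, NULL, 0);

	return err;
}
```

The wait is cut short only because the exiting process is a child of
the waiter, so the kernel sends SIGCHLD to it. For an unrelated process
there is no such notification.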

If we decide to wake the futex regularly to check if the process is alive
we can miss the wake. The library tries hard and loops over the
wake syscall for a while, but this could still fail on very slow
devices under load. If the timing is unfortunate we may miss more
than one wake, which would lead to a timeout. Timing problems like
that can easily arise on VMs with a single CPU on an overbooked host.

-- 
Cyril Hrubis
chrubis@suse.cz