All of lore.kernel.org
 help / color / mirror / Atom feed
* [LTP] [PATCH v1] connectors/cn_pec: improve reliability
@ 2021-08-05 14:08 Bogdan Lezhepekov
  2021-08-06  4:35 ` Joerg Vehlow
  2021-08-06  7:48 ` Petr Vorel
  0 siblings, 2 replies; 6+ messages in thread
From: Bogdan Lezhepekov @ 2021-08-05 14:08 UTC (permalink / raw)
  To: ltp

Sometimes pec_listener has not enough time to handle all
the generated events.

This patch only returns back the delay that was removed
in a recent patch.

Signed-off-by: Bogdan Lezhepekov <bogdan.lezhepekov@suse.com>
---
 testcases/kernel/connectors/pec/cn_pec.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/testcases/kernel/connectors/pec/cn_pec.sh b/testcases/kernel/connectors/pec/cn_pec.sh
index 9b85a5c81..24b1db761 100755
--- a/testcases/kernel/connectors/pec/cn_pec.sh
+++ b/testcases/kernel/connectors/pec/cn_pec.sh
@@ -77,6 +77,8 @@ test()
 	tst_sleep 100ms
 
 	ROD event_generator -n $num_events -e $event \>gen_$event.log 2\>gen_$event.err
+    # sleep until pec_listener has seen and handled all of the generated events
+    tst_sleep 100ms
 
 	kill -s INT $pid 2> /dev/null
 	wait $pid
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [LTP] [PATCH v1] connectors/cn_pec: improve reliability
  2021-08-05 14:08 [LTP] [PATCH v1] connectors/cn_pec: improve reliability Bogdan Lezhepekov
@ 2021-08-06  4:35 ` Joerg Vehlow
  2021-08-06  7:48 ` Petr Vorel
  1 sibling, 0 replies; 6+ messages in thread
From: Joerg Vehlow @ 2021-08-06  4:35 UTC (permalink / raw)
  To: ltp

Hi,

On 8/5/2021 4:08 PM, Bogdan Lezhepekov via ltp wrote:
> Sometimes pec_listener has not enough time to handle all
> the generated events.
>
> This patch only returns back the delay that was removed
> in a recent patch.
I think I thought about keeping or removing this delay and finally 
decided to remove it for some reason, that I do not remember right now.
Can you tell me the environment where this failed?

Joerg

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [LTP] [PATCH v1] connectors/cn_pec: improve reliability
  2021-08-05 14:08 [LTP] [PATCH v1] connectors/cn_pec: improve reliability Bogdan Lezhepekov
  2021-08-06  4:35 ` Joerg Vehlow
@ 2021-08-06  7:48 ` Petr Vorel
  2021-08-06  8:14   ` Joerg Vehlow
  1 sibling, 1 reply; 6+ messages in thread
From: Petr Vorel @ 2021-08-06  7:48 UTC (permalink / raw)
  To: ltp

Hi Bogdan, Joerg,
> Sometimes pec_listener has not enough time to handle all
> the generated events.

> This patch only returns back the delay that was removed
> in a recent patch.

> Signed-off-by: Bogdan Lezhepekov <bogdan.lezhepekov@suse.com>
> ---
>  testcases/kernel/connectors/pec/cn_pec.sh | 2 ++
>  1 file changed, 2 insertions(+)

> diff --git a/testcases/kernel/connectors/pec/cn_pec.sh b/testcases/kernel/connectors/pec/cn_pec.sh
> index 9b85a5c81..24b1db761 100755
> --- a/testcases/kernel/connectors/pec/cn_pec.sh
> +++ b/testcases/kernel/connectors/pec/cn_pec.sh
> @@ -77,6 +77,8 @@ test()
>  	tst_sleep 100ms

>  	ROD event_generator -n $num_events -e $event \>gen_$event.log 2\>gen_$event.err
> +    # sleep until pec_listener has seen and handled all of the generated events
> +    tst_sleep 100ms
nit: mixing tabs and spaces (script uses tabs), can be fixed during merge.

If I understand the code correctly, pec_listener keeps receiving msg from PEC
until got signaled by kill. We generally try to avoid sleep in tests [1].

@Joerg: I'm not sure if it's reasonable to add another signal handler in which
pec_listener would print number of already handled requests (with write() as
printf() is not reentrant). Then we could loop for certain time before sending INT.

Kind regards,
Petr

[1] https://people.kernel.org/metan/why-sleep-is-almost-never-acceptable-in-tests

>  	kill -s INT $pid 2> /dev/null
>  	wait $pid

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [LTP] [PATCH v1] connectors/cn_pec: improve reliability
  2021-08-06  7:48 ` Petr Vorel
@ 2021-08-06  8:14   ` Joerg Vehlow
  2021-08-06  9:06     ` Bogdan Lezhepekov
  0 siblings, 1 reply; 6+ messages in thread
From: Joerg Vehlow @ 2021-08-06  8:14 UTC (permalink / raw)
  To: ltp

Hi Petr,

On 8/6/2021 9:48 AM, Petr Vorel wrote:
>
> If I understand the code correctly, pec_listener keeps receiving msg from PEC
> until got signaled by kill. We generally try to avoid sleep in tests [1].
>
> @Joerg: I'm not sure if it's reasonable to add another signal handler in which
> pec_listener would print number of already handled requests (with write() as
> printf() is not reentrant). Then we could loop for certain time before sending INT.
>
The problem is, that it is hard to know, if all expected events were 
received. pec_listener receives events from all processes.
The only way to reliably handle the issue without sleeping is tracking 
exit-events in pec_listener and to tell it to die, once it received an 
exit event from event_generator.
Only then we can be sure, that all events generated by the generator 
(and hopefully all it's children) were received.

We could also get rid of the other sleep (after starting the listener 
using some kind of synchronization (I guess right after 
PROC_CN_MCAST_LISTEN should work). I think checkpoints are not part of 
the new lib...

Joerg

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [LTP] [PATCH v1] connectors/cn_pec: improve reliability
  2021-08-06  8:14   ` Joerg Vehlow
@ 2021-08-06  9:06     ` Bogdan Lezhepekov
  2021-08-06  9:21       ` Joerg Vehlow
  0 siblings, 1 reply; 6+ messages in thread
From: Bogdan Lezhepekov @ 2021-08-06  9:06 UTC (permalink / raw)
  To: ltp

Hi Joerg, Petr!

Thanks for reviewing and commenting! I don't have a strong opinion on how the test flow should look like, but I can say this sleep definitely makes the test much more reliable.

The results, we get on known to both of you SUSE Carwos, are pretty consistent. My feeling is that 8/10 runs end up with some sporadic failures. The fact that extra sleep solves the problem obviously indicates a lack of synchronization.

@Joerg: if you need more info about the environment, please reach out to me.

-Bogdan
________________________________
From: Joerg Vehlow <lkml@jv-coder.de>
Sent: Friday, August 6, 2021 11:14
To: Petr Vorel <pvorel@suse.cz>; Bogdan Lezhepekov <bogdan.lezhepekov@suse.com>
Cc: ltp@lists.linux.it <ltp@lists.linux.it>
Subject: Re: [LTP] [PATCH v1] connectors/cn_pec: improve reliability

Hi Petr,

On 8/6/2021 9:48 AM, Petr Vorel wrote:
>
> If I understand the code correctly, pec_listener keeps receiving msg from PEC
> until got signaled by kill. We generally try to avoid sleep in tests [1].
>
> @Joerg: I'm not sure if it's reasonable to add another signal handler in which
> pec_listener would print number of already handled requests (with write() as
> printf() is not reentrant). Then we could loop for certain time before sending INT.
>
The problem is, that it is hard to know, if all expected events were
received. pec_listener receives events from all processes.
The only way to reliably handle the issue without sleeping is tracking
exit-events in pec_listener and to tell it to die, once it received an
exit event from event_generator.
Only then we can be sure, that all events generated by the generator
(and hopefully all it's children) were received.

We could also get rid of the other sleep (after starting the listener
using some kind of synchronization (I guess right after
PROC_CN_MCAST_LISTEN should work). I think checkpoints are not part of
the new lib...

Joerg

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20210806/8c6a05fb/attachment-0001.htm>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [LTP] [PATCH v1] connectors/cn_pec: improve reliability
  2021-08-06  9:06     ` Bogdan Lezhepekov
@ 2021-08-06  9:21       ` Joerg Vehlow
  0 siblings, 0 replies; 6+ messages in thread
From: Joerg Vehlow @ 2021-08-06  9:21 UTC (permalink / raw)
  To: ltp

Hi Bogdan,

On 8/6/2021 11:06 AM, Bogdan Lezhepekov wrote:
> Hi Joerg, Petr!
>
> Thanks for reviewing and commenting! I don't have a strong opinion on 
> how the test flow should look like, but I can say this sleep 
> definitely makes the test much more reliable.
>
> The results, we get on known to both of you SUSE Carwos, are pretty 
> consistent. My feeling is that 8/10 runs end up with some sporadic 
> failures. The fact that extra sleep solves the problem obviously 
> indicates a lack of synchronization.
>
> @Joerg: if you need more info about the environment, please reach out 
> to me.
Strange, my test environment was probably something derived from carwos 
(not exactly sure anymore).
I will try to reproduce and implement a correct synchronization as 
described in my earlier mail next week.
But maybe you can tell me more details about the actual used environment 
(If it is related to carwos customer project, you can send it to my 
corporate email address: joerg.vehlow@aox-tech.de).

Joerg

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-08-06  9:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-08-05 14:08 [LTP] [PATCH v1] connectors/cn_pec: improve reliability Bogdan Lezhepekov
2021-08-06  4:35 ` Joerg Vehlow
2021-08-06  7:48 ` Petr Vorel
2021-08-06  8:14   ` Joerg Vehlow
2021-08-06  9:06     ` Bogdan Lezhepekov
2021-08-06  9:21       ` Joerg Vehlow

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.