From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 02C8EC433ED for ; Thu, 1 Apr 2021 01:21:27 +0000 (UTC) Received: from lists.lttng.org (lists.lttng.org [167.114.26.123]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1110760FED for ; Thu, 1 Apr 2021 01:21:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1110760FED Authentication-Results: mail.kernel.org; dmarc=pass (p=none dis=none) header.from=lists.lttng.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lttng-dev-bounces@lists.lttng.org Received: from lists-lttng01.efficios.com (localhost [IPv6:::1]) by lists.lttng.org (Postfix) with ESMTP id 4F9lkr16GQzSKq; Wed, 31 Mar 2021 21:21:24 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=lists.lttng.org; s=default; t=1617240085; bh=cq9dXTC6ExBZlPDRLGMMaN6K0+TCTl49yq7mVlrmTX8=; h=References:In-Reply-To:Date:To:Cc:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=AtLowNVgves51e1aHs/LtT6YTn3pGGAfWya3gqa45rJMBTbxkxmC5TmoKC/beKwLU KZ/07BXdFonW3PYyWmR96u2wNNpg4jX26eYbR3epI3A/pTna1c53KlJFbQA4iPSlCO MgYq7gu8OdLPaSsnwPv5LJi2Y8OLyGRQdWLzpU3oFsyXQcpIBwaWHxIapQJfIsxWYQ +6uPFOGoOtwZyxKM5rfxHO+Y8H5nundFpRqPl6kO58CBYJSqA7lwolz9S0RzxkqqAz Hs/bbV6btPg494nGQSx75Pxes1tPH6P/Pwsm8qcShE1LOreXIpgD9golFCpn2Howcq kv9Rp4YYv4rDg== Received: from mail-io1-xd30.google.com (mail-io1-xd30.google.com [IPv6:2607:f8b0:4864:20::d30]) by lists.lttng.org (Postfix) with ESMTPS id 4F9lkp5GT7zSb1 for ; Wed, 31 Mar 2021 21:21:22 -0400 (EDT) Received: by mail-io1-xd30.google.com with SMTP id k8so500416iop.12 for ; Wed, 31 Mar 2021 18:21:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=qD8n59Ivhe7yx3kGtem+1KPfwhxZo1v6HswKc3URs8A=; b=GMtcsqJ80msYZ+ApxN0Yl0Y4huOLwcw7RkCqscEQsVwZPyOQP5KD9MdQaZN0oDfUYy bO7rq6/qUHxR5PCmJMGqfnmI1ExKvsswKwGuGOTNRzRdoYMr9sfwWDvanyPvf+mwE6mv bg16TvU3L1PLltuADJ03jVPL3Jh9SDHcKhxyYzoH1DOoLmVk+WGzN4BX1LMpxtY47FMm Ik0k56svecqLZd4Ln2kV3jywTgRdefInAXRAwcGYzLWMxSUOqcU+ClH/ahJ7f/BnDghO jhUo5SZYMuB9t8VacOSgfo6No9sgO8ZdmyKvIiCjgMP+a52/nxKW+O+zVRPYPWJozviC Lw0A== X-Gm-Message-State: AOAM530lzsSs5ytwYOhPZPjaIKtmNcmCWZJRrzdADrJznDUVy/KFxdfs OLumv+2MWDD+6iWtACfrrjASikCQMXfLqtQW1og= X-Google-Smtp-Source: ABdhPJwjMHj1GEOJNnKM5EF6i4UM4D0dwrpt3ZD0wGf0rXa82gUWdrRr2dOfUsHeWYF6AqSEZHtFc3IPyvKjcTX3mKU= X-Received: by 2002:a05:6602:2e82:: with SMTP id m2mr4480200iow.70.1617240081897; Wed, 31 Mar 2021 18:21:21 -0700 (PDT) MIME-Version: 1.0 References: <20210331185624.406960-1-wallinux@gmail.com> <592114658.48857.1617218709305.JavaMail.zimbra@efficios.com> <20210331213313.GD28307@joraj-alpa> In-Reply-To: <20210331213313.GD28307@joraj-alpa> Date: Thu, 1 Apr 2021 03:21:10 +0200 Message-ID: To: Jonathan Rajotte-Julien Cc: lttng-dev Subject: Re: [lttng-dev] [PATCH lttng-tools] Fix: test code assumes that child process is schedule to run before parent X-BeenThere: lttng-dev@lists.lttng.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: LTTng development list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Anders Wallin via lttng-dev Reply-To: Anders Wallin Content-Type: multipart/mixed; boundary="===============4113572608032867662==" Errors-To: lttng-dev-bounces@lists.lttng.org Sender: "lttng-dev" --===============4113572608032867662== Content-Type: multipart/alternative; boundary="000000000000991d3605bedf0c27" --000000000000991d3605bedf0c27 Content-Type: text/plain; charset="UTF-8" Hi Jonathan :) It's very unlikely that the race could occur, BUT it can happen. OK run 1. test_event_tracker starts gen-ust-events 2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH 3. gen-ust-event create first event and create AFTER_FIRST_PATH 4. gen-ust-event continue and create 99 events 5. gen-ust-event wait for BEFORE_LAST_PATH 6. test_event_track start collecting events (lttng start .....) 7. test_event_tracker calls "lttng track ... -pid "Faulty PID" 8. test_event_tracker touches BEFORE_LAST_PATH 9. gen-ust-events creates the last event 10. test_event finishes and since it tracks the faulty pid the result will be 0 events == OK Faulty run 1. test_event_tracker starts gen-ust-events 2. test_event_tracker waits for gen-ust-events to create AFTER_FIRST_PATH 3. gen-ust-event create first event and create AFTER_FIRST_PATH 4. gen-ust-event gets rescheduled before it has created 99 events, e.g after 9 events 5. test_event_track start collecting events (lttng start .....) 6. gen-ust-event is rescheduled and starts generating the remaining events. 90 events 7. lttng collects these 90 events since we have not setup "tracking" of PID yet 8. test_event_tracker calls "lttng track ... -pid "Faulty PID" 9. test_event_tracker touches BEFORE_LAST_PATH 10. gen-ust-events creates the last event 11. test_event finishes and since it tracks the faulty pid the result should be 0 events, but we got 90 == FAULTY We can solve this by; A: using NR_ITER=2 or B: by adding a flag to gen-ust-events to wait before sending the first event 1. test_event_tracker starts gen-ust-events 2. test_event_tracker waits for gen-ust-events waits for BEFORE_FIRST_PATH 3. test_event_track start collecting events (lttng start .....) 4. test_event_tracker calls "lttng track ... -pid "Faulty PID" 5. test_event_track creates BEFORE_FIRST_PATH 6. gen_ust_event creates 100 events 7. test_event_tracker waits for gen_event_ust to end 8. test_event finishes and since it tracked the faulty pid the result will be 0 events == OK This is in principle how gen-kernel-test-events works (but with different arguments) I would suggest to use B since that will be bulletproof Anders Wallin On Wed, Mar 31, 2021 at 11:33 PM Jonathan Rajotte-Julien < jonathan.rajotte-julien@efficios.com> wrote: > Hi, > > On Wed, Mar 31, 2021 at 11:09:42PM +0200, Anders Wallin wrote: > > Hi Julian, > > You can use Jonathan. ;) > > > > > Neither mine "sleep 0.1" or your version with "while [! -f ............" > > are race condition free. > > I might be missing something here but as far as I understand the race you > are > trying to mitigate is that the test script execute/continue before the > `backgrounded` > command (background test app) had time to execute, right? > > If so at least waiting for the app to create a file is necessary. Now > gen_kernel_test_events does not have this functionality. The > PATH_WAIT_FILE is > used to control when the testapp can continue. Hence the script still > cannot > know if the app have been scheduled. > > Now based on the test case you might need more synchronization for the test > cases. > > Note that in the ust cases, the trace_ust_app uses `touch > "$BEFORE_LAST_PATH"` > that effectively unblock the app and allows it to perform the last > tracepoint > hit and the we `wait` on the background process. > > Note: some tests case are a bit clever and uses "trace_"$domain"_app" > instead of > calling trace_ust_app directly. > > For these tests case it seems that we are only expecting at least a single > event > matching the event name under test. Here the last tracepoint hit should > satisfy > this criteria. > > Am I missing a race? > > Cheers > > > > I suggest that we add an option to gen-ust-events to wait before the > first > > event is generated. > > gen_kernel_test_events already have this functionality to wait before the > > first event. > > > > Something like this > > static struct option long_options[] = > > { > > /* These options set a flag. */ > > {"iter", required_argument, 0, 'i'}, > > {"wait", required_argument, 0, 'w'}, > > {"sync-after-first-event", required_argument, 0, 'a'}, > > {"sync-before-last-event", required_argument, 0, 'b'}, > > {"sync-before-last-event-touch", required_argument, 0, 'c'}, > > {"sync-before-exit", required_argument, 0, 'd'}, > > {"sync-before-exit-touch", required_argument, 0, 'e'}, > > *+ {"sync-before-first-event", required_argument, 0, 'f'},* > > > > {0, 0, 0, 0} > > }; > > .... > > > > I will create one or more patches for this tomorrow > > > > Anders Wallin > > > > > > On Wed, Mar 31, 2021 at 9:25 PM Jonathan Rajotte-Julien < > > jonathan.rajotte-julien@efficios.com> wrote: > > > > > > # > > > > # SPDX-License-Identifier: GPL-2.0-only > > > > > > > > -TEST_DESC="LTTng - Event traker test" > > > > +TEST_DESC="LTTng - Event tracker test" > > > > > > > > CURDIR=$(dirname "$0")/ > > > > TESTDIR="$CURDIR/../../.." > > > > @@ -42,6 +42,8 @@ function prepare_ust_app > > > > > > > > $TESTAPP_BIN -i $NR_ITER -w $NR_USEC_WAIT -a > "$AFTER_FIRST_PATH" -b > > > > "$BEFORE_LAST_PATH" & > > > > CHILD_PID=$! > > > > + # voluntary context switch to start $TESTAPP_BIN > > > > + sleep 0.1 > > > > > > A wait on the $AFTER_FIRST_PATH file would be probably more > deterministic > > > than a sleep here. > > > > > > while [ ! -f "${AFTER_FIRST_PATH}" ]; do > > > sleep 0.1 > > > done > > > > > > I would also expect something similar for the `prepare_kernel_app` > > > function considering the same race is most probably present and simply > not > > > triggered by a chance of luck. > > > Seems like gen-kernel-test-events does not expose the same sync > > > capabilities here, please use gen-ust-events as an example of how it is > > > done. > > > > > > Let us know how your testing goes. > > > > > > Thanks > > > > > -- > Jonathan Rajotte-Julien > EfficiOS > --000000000000991d3605bedf0c27 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Jonathan :)

It's very unlik= ely that the race could occur, BUT it can happen.

<= /div>
OK run
1. test_e= vent_tracker starts gen-ust-events
2. test_event_tracker waits fo= r gen-ust-events to create=C2=A0AFTER_FIRST_PATH
3. gen-ust-event= create first event and create AFTER_FIRST_PATH
4. gen-ust-event = continue and create 99 events
5. gen-ust-event wait for=C2=A0=C2= =A0BEFORE_LAST_PATH
6. test_event_track start collecting events (= lttng start .....)
7. test_event_tracker calls "lttng track = ... -pid "Faulty PID"
8. test_event_tracker touches= =C2=A0BEFORE_LAST_PATH
9. gen-ust-events creates the last event
10. test_event finishes and since it tracks the faulty pid the res= ult will be 0 events =3D=3D OK

Fa= ulty run
1. test_event_tracker starts gen-ust-events<= br>
2. test_event_tracker waits for gen-ust-events to create=C2= =A0AFTER_FIRST_PATH
3. gen-ust-event create first event and creat= e AFTER_FIRST_PATH
4. gen-ust-event gets rescheduled before it ha= s created 99 events, e.g after 9 events
5. test_event_track start= collecting events (lttng start .....)
6. gen-ust-event is resche= duled and starts=C2=A0generating the remaining events. 90 events
7. lttn= g collects these 90 events since we have not setup "tracking" of = PID yet
8. test_event_tracker calls "lttng track ... -pid &q= uot;Faulty PID"
9. test_event_tracker touches=C2=A0BEFORE_LA= ST_PATH
10. gen-ust-events creates the last event
11.= =C2=A0test_event finishes and since it tracks the faulty pid the result sho= uld=C2=A0 be 0 events, but we got 90 =3D=3D FAULTY

We can solve this by;
A: using NR_ITER=3D2
or
B: by adding a flag to gen-ust-events to wait before sending the first ev= ent
1. test_event_tracker starts gen-ust-eve= nts
2. test_event_tracker waits for gen-ust-events waits for BEFO= RE_FIRST_PATH
3. test_event_track start collecting events (lttng = start .....)
4. test_event_tracker calls "lttng track ..= . -pid "Faulty PID"
5. test_event_track creates BEF= ORE_FIRST_PATH
6. gen_ust_event creates 100 events
7. test_event_tracker waits for gen_event_ust to end
8. te= st_event finishes and since it tracked the faulty pid the result will be 0 = events =3D=3D OK

This= is in principle how gen-kernel-test-events works (but with different argum= ents)
I would suggest to use B since that will be bulletproof

Anders Wallin

On Wed, = Mar 31, 2021 at 11:33 PM Jonathan Rajotte-Julien <jonathan.rajotte-julien@efficios.com&= gt; wrote:
Hi,
On Wed, Mar 31, 2021 at 11:09:42PM +0200, Anders Wallin wrote:
> Hi Julian,

You can use Jonathan. ;)

>
> Neither mine "sleep 0.1" or your version with "while [!= -f ............"
> are race condition free.

I might be missing something here but as far as I understand the race you a= re
trying to mitigate is that the test script execute/continue before the `bac= kgrounded`
command (background test app) had time to execute, right?

If so at least waiting for the app to create a file is necessary. Now
gen_kernel_test_events does not have this functionality. The PATH_WAIT_FILE= is
used to control when the testapp can continue. Hence the script still canno= t
know if the app have been scheduled.

Now based on the test case you might need more synchronization for the test=
cases.

Note that in the ust cases, the trace_ust_app uses `touch "$BEFORE_LAS= T_PATH"`
that effectively unblock the app and allows it to perform the last tracepoi= nt
hit and the we `wait` on the background process.

Note: some tests case are a bit clever and uses "trace_"$domain&q= uot;_app" instead of
calling trace_ust_app directly.

For these tests case it seems that we are only expecting at least a single = event
matching the event name under test. Here the last tracepoint hit should sat= isfy
this criteria.

Am I missing a race?

Cheers


> I suggest that we add an option to gen-ust-events to wait before the f= irst
> event is generated.
> gen_kernel_test_events already have this functionality to wait before = the
> first event.
>
> Something like this
> static struct option long_options[] =3D
> {
> /* These options set a flag. */
> {"iter", required_argument, 0, 'i'},
> {"wait", required_argument, 0, 'w'},
> {"sync-after-first-event", required_argument, 0, 'a'= },
> {"sync-before-last-event", required_argument, 0, 'b'= },
> {"sync-before-last-event-touch", required_argument, 0, '= c'},
> {"sync-before-exit", required_argument, 0, 'd'},
> {"sync-before-exit-touch", required_argument, 0, 'e'= },
> *+ {"sync-before-first-event", required_argument, 0, 'f&= #39;},*
>
> {0, 0, 0, 0}
> };
> ....
>
> I will create one or more patches for this tomorrow
>
> Anders Wallin
>
>
> On Wed, Mar 31, 2021 at 9:25 PM Jonathan Rajotte-Julien <
> jonathan.rajotte-julien@efficios.com> wrote:
>
> > > #
> > > # SPDX-License-Identifier: GPL-2.0-only
> > >
> > > -TEST_DESC=3D"LTTng - Event traker test"
> > > +TEST_DESC=3D"LTTng - Event tracker test"
> > >
> > > CURDIR=3D$(dirname "$0")/
> > > TESTDIR=3D"$CURDIR/../../.."
> > > @@ -42,6 +42,8 @@ function prepare_ust_app
> > >
> > >=C2=A0 =C2=A0 =C2=A0 =C2=A0$TESTAPP_BIN -i $NR_ITER -w $NR_US= EC_WAIT -a "$AFTER_FIRST_PATH" -b
> > >=C2=A0 =C2=A0 =C2=A0 =C2=A0"$BEFORE_LAST_PATH" &= ;
> > >=C2=A0 =C2=A0 =C2=A0 =C2=A0CHILD_PID=3D$!
> > > +=C2=A0 =C2=A0 =C2=A0# voluntary context switch to start $TE= STAPP_BIN
> > > +=C2=A0 =C2=A0 =C2=A0sleep 0.1
> >
> > A wait on the $AFTER_FIRST_PATH file would be probably more deter= ministic
> > than a sleep here.
> >
> >=C2=A0 =C2=A0while [ ! -f "${AFTER_FIRST_PATH}" ]; do > >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0sleep 0.1
> >=C2=A0 =C2=A0done
> >
> > I would also expect something similar for the `prepare_kernel_app= `
> > function considering the same race is most probably present and s= imply not
> > triggered by a chance of luck.
> > Seems like gen-kernel-test-events does not expose the same sync > > capabilities here, please use gen-ust-events as an example of how= it is
> > done.
> >
> > Let us know how your testing goes.
> >
> > Thanks
> >

--
Jonathan Rajotte-Julien
EfficiOS
--000000000000991d3605bedf0c27-- --===============4113572608032867662== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev --===============4113572608032867662==--