From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Yang
Date: Tue, 8 Jan 2019 17:08:11 +0800
Subject: [LTP] [PATCH v3 2/3] lib/tst_test.c: Update result counters when calling tst_brk()
In-Reply-To: <20190107150619.GC15221@rei.lan>
References: <20181211151733.GC1180@rei> <1544690160-13900-1-git-send-email-yangx.jy@cn.fujitsu.com> <1544690160-13900-2-git-send-email-yangx.jy@cn.fujitsu.com> <20190107150619.GC15221@rei.lan>
Message-ID: <5C34687B.9020902@cn.fujitsu.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ltp@lists.linux.it

On 2019/01/07 23:06, Cyril Hrubis wrote:
> Hi!
>> 1) Catch and report the TFAIL exit status of child process.
> Looking at the codebase we do have a few usages of tst_brk(TFAIL, "...")
> to exit the child process, which sort of works but it's incorrect. The
> tst_brk() always meant "unrecoverable failure has happened, exit the
> current process as fast as possible". Looking over our codebase most of
> the tst_brk(TFAIL, "...") should not actually cause the main test
> process to exit, these were only meant to exit the child and report the
> result in one call. It will for instance break the test with -i option
> on the first failure, which is incorrect.
Hi Cyril,

Thanks for the detailed explanation, I got it.

> So if we ever want to have a function to exit child process with a result we
> should implement tst_ret() that would be equivalent to tst_res() followed by
> exit(0).
>
> It could be even implemented as:
>
> #define tst_ret(ttype, fmt, ...) \
> 	do { \
> 		tst_res_(__FILE__, __LINE__, (ttype), (fmt), ##__VA_ARGS__); \
> 		exit(0); \
> 	} while (0)
>
> This function has one big advantage, it increments the results counters
> before the child process exits.
>
> Actually one of the big points of the new test library was that the
> results counters are atomically increased, because passing the results
> in exit values is a nightmare that cannot be done correctly.
Agreed.
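
To illustrate why counting the result before the child exits helps, here
is a small standalone sketch. It does not use the LTP library: `struct
results`, `results_map()`, `report_and_exit()` and
`run_failing_children()` are made-up names for this illustration, and the
counters live in an anonymous shared mapping, which is only an assumption
about how such storage could look. The child bumps a shared counter
atomically and exits with status 0, so the parent never has to decode a
result from the exit value.

```c
#define _GNU_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stdatomic.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result types and counters, not the LTP definitions. */
enum demo_ttype { DEMO_PASS, DEMO_FAIL };

struct results {
	atomic_int passed;
	atomic_int failed;
};

/* Map counters that stay shared between parent and children after fork(). */
static struct results *results_map(void)
{
	struct results *r = mmap(NULL, sizeof(*r), PROT_READ | PROT_WRITE,
				 MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (r == MAP_FAILED)
		return NULL;

	atomic_init(&r->passed, 0);
	atomic_init(&r->failed, 0);
	return r;
}

/*
 * Record the result first, then leave the process with status 0.
 * _exit() is used here so the child does not flush stdio buffers it
 * inherited from the parent.
 */
static void report_and_exit(struct results *r, enum demo_ttype t,
			    const char *msg)
{
	atomic_fetch_add(t == DEMO_PASS ? &r->passed : &r->failed, 1);
	fprintf(stderr, "%s: %s\n", t == DEMO_PASS ? "PASS" : "FAIL", msg);
	_exit(0);
}

/*
 * Fork nfail children that each report one failure, wait for all of
 * them, and return the failure count the parent sees (-1 on error).
 */
static int run_failing_children(int nfail)
{
	struct results *r = results_map();
	int i, failed, err = 0;

	if (!r)
		return -1;

	for (i = 0; i < nfail; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			err = 1;
			break;
		}
		if (pid == 0)
			report_and_exit(r, DEMO_FAIL, "child");
	}

	/* Reap every child before reading the counters. */
	while (wait(NULL) > 0)
		;

	failed = atomic_load(&r->failed);
	munmap(r, sizeof(*r));
	return err ? -1 : failed;
}
```

Calling `run_failing_children(3)` from a `main()` should give the parent a
failure count of 3 even though every child exited with status 0.
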
All tst_brk(TFAIL, ...) calls can be converted to tst_ret(TFAIL, ...) or
tst_brk(TBROK, ...) in this way. Then we can add TFAIL to the tst_brk()
compile-time check, as Jan suggested, so that only TCONF and TBROK can be
passed to tst_brk().

>> 2) Only update result counters in library process and main test
>> process because the exit status of child can be reported by
>> main test process.
> Actually after I spent some time on it I think that the best solution is
> to update the results in the piece of shared memory as fast as possible,
> anything else is prone to various races and corner cases.
...
>> 3) Print TCONF message and increase skipped when calling tst_brk(TCONF).
>> Print TBROK message and increase broken when calling tst_brk(TBROK).
>> Print TFAIL message and increase failed when calling tst_brk(TFAIL).
>> 4) Remove duplicate update_results() in run_tcases_per_fs().
> I've been thinking about this and the problem is more complex, and I'm
> not even sure that it's possible to write the library so that the
> counters are consistent at the time we exit the test if something
> unexpected happened and we called tst_brk().
>
> Consider for instance this example:
>
> #include "tst_test.h"
>
> static void do_test(void)
> {
> 	if (!SAFE_FORK())
> 		tst_brk(TBROK, "child");
> 	tst_brk(TBROK, "parent");
> }
>
> static struct tst_test test = {
> 	.test_all = do_test,
> 	.forks_child = 1,
> };
>
> When tst_brk() is called both in parent and child the counter would be
> incremented only once because the child is not waited for by the main
> test.
>
> We can close this special case by changing the main test pid to wait for the
> children before it calls exit() in the tst_brk() but that may cause the
> main process to get stuck indefinitely if the child processes get stuck,
> so we would have to be careful.
>
> Also from the very definition of the TBROK return status the test
> results would be incomplete at best, since TBROK really means
> "unrecoverable error happened during the test" which would mostly mean
> that something as low level as filesystem got corrupted and there is no
> point in presenting the results in that case, so I guess that the best
> we could do in the case of TBROK is to print big message that says
> "things went horribly wrong!" or something similar.
Sorry, my patch is too rough because some situations were not taken into
account.
For tst_brk(TCONF), do you mean replacing the current solution, which uses
wait() in check_child_status(), with the shared memory you suggested?
For tst_brk(TBROK), do you mean just printing a big message instead of
updating the test results?

> All in all I would like to avoid applying patches to the test library
> before we finalize the release, since there is not much time for
> testing now.
Agreed, let's drop these patches for the upcoming release.
We still need to do further investigation and testing.

Best Regards,
Xiao Yang