* [PATCH 1/2] report: safely update the result.xml file
@ 2023-07-06 20:42 Theodore Ts'o
2023-07-06 20:42 ` [PATCH 2/2] report: remove xmlns specifier Theodore Ts'o
2023-07-07 15:01 ` [PATCH 1/2] report: safely update the result.xml file Darrick J. Wong
0 siblings, 2 replies; 8+ messages in thread
From: Theodore Ts'o @ 2023-07-06 20:42 UTC (permalink / raw)
To: fstests; +Cc: Theodore Ts'o
After every single test, we rewrite result.xml from scratch. This
ensures that the XML file is always in a valid, parseable state, even
if the check script is killed or the machine crashes in the middle of
a test.
If the test is being run in a Cloud VM as a "spot" (Amazon, Azure, or
GCE) or "preemptible" (Oracle) instance, it is possible that the VM
can be halted whenever the Cloud provider needs the capacity for
customers who are willing to pay full price. ("Spot" instances can be
60% to 90% cheaper --- allowing the frugal kernel developer to get up
to 10 times more testing for the same amount of money. :-)
Since a "spot" VM can get terminated at any time, it is possible for
the VM to be terminated immediately after a test has completed and
while the result.xml file is in the middle of being written out. In
that case, the result.xml file could partially written, resulting in
an invalid result.xml file and lost information about the tests run
before the VM was terminated.
To address this race, write the new result.xml file as result.xml.new,
and only rename it to result.xml after the XML file is fully written
out.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
common/report | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/common/report b/common/report
index 9bfa09ecc..3ad14f94e 100644
--- a/common/report
+++ b/common/report
@@ -109,13 +109,15 @@ _xunit_make_section_report()
local notrun_count="$4"
local sect_time="$5"
local timestamp
+ local tmp_fn="$REPORT_DIR/result.xml.new"
+ local out_fn="$REPORT_DIR/result.xml"
if [ $sect_name == '-no-sections-' ]; then
sect_name='global'
fi
local report=$tmp.report.xunit.$sect_name.xml
# Header
- echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > $REPORT_DIR/result.xml
+ echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > "$tmp_fn"
if [ -n "$test_start_time" ]; then
timestamp="$(date -Iseconds --date="$test_start_time")"
else
@@ -123,7 +125,7 @@ _xunit_make_section_report()
fi
local fstests_ns="https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git"
- cat >> $REPORT_DIR/result.xml << ENDL
+ cat >> "$tmp_fn" << ENDL
<testsuite
xmlns="$fstests_ns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
@@ -142,19 +144,20 @@ ENDL
__generate_report_vars
# Properties
- echo -e "\t<properties>" >> $REPORT_DIR/result.xml
+ echo -e "\t<properties>" >> "$tmp_fn"
(for key in "${!REPORT_VARS[@]}"; do
_xunit_add_property "$key" "${REPORT_VARS["$key"]}"
done;
for p in "${REPORT_ENV_LIST[@]}"; do
_xunit_add_property "$p" "${!p}"
- done) | sort >> $REPORT_DIR/result.xml
- echo -e "\t</properties>" >> $REPORT_DIR/result.xml
+ done) | sort >> "$tmp_fn"
+ echo -e "\t</properties>" >> "$tmp_fn"
if [ -f $report ]; then
- cat $report >> $REPORT_DIR/result.xml
+ cat $report >> "$tmp_fn"
fi
- echo "</testsuite>" >> $REPORT_DIR/result.xml
- echo "Xunit report: $REPORT_DIR/result.xml"
+ echo "</testsuite>" >> "$tmp_fn"
+ mv "$tmp_fn" "$out_fn"
+ echo "Xunit report: $out_fn"
}
_xunit_make_testcase_report()
--
2.31.0
^ permalink raw reply related [flat|nested] 8+ messages in thread* [PATCH 2/2] report: remove xmlns specifier 2023-07-06 20:42 [PATCH 1/2] report: safely update the result.xml file Theodore Ts'o @ 2023-07-06 20:42 ` Theodore Ts'o 2023-07-07 14:55 ` Darrick J. Wong 2023-07-07 15:01 ` [PATCH 1/2] report: safely update the result.xml file Darrick J. Wong 1 sibling, 1 reply; 8+ messages in thread From: Theodore Ts'o @ 2023-07-06 20:42 UTC (permalink / raw) To: fstests; +Cc: Theodore Ts'o, Darrick J. Wong By specifying "xmlns=https://git.kernel.org/.../xfstests-dev.git", this causes XML complaint parsers, such as used by the python junitparser library, to put all of the XML elements into a namespace, which then causes junitparser to toss its cookies. This can be worked-around in a test runner script via: sed -i.orig -e 's/xmlns=\".*\"//' "$RESULT_BASE/result.xml" but it's better not to include the xmlns line at all in the first place, since this may cause other users of fstests who are using the Python junitparser library a lot of headaches. Cc: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> --- common/report | 1 - 1 file changed, 1 deletion(-) diff --git a/common/report b/common/report index 3ad14f94e..081df988f 100644 --- a/common/report +++ b/common/report @@ -127,7 +127,6 @@ _xunit_make_section_report() local fstests_ns="https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git" cat >> "$tmp_fn" << ENDL <testsuite - xmlns="$fstests_ns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="$fstests_ns $fstests_ns/tree/doc/xunit.xsd" -- 2.31.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] report: remove xmlns specifier 2023-07-06 20:42 ` [PATCH 2/2] report: remove xmlns specifier Theodore Ts'o @ 2023-07-07 14:55 ` Darrick J. Wong 0 siblings, 0 replies; 8+ messages in thread From: Darrick J. Wong @ 2023-07-07 14:55 UTC (permalink / raw) To: Theodore Ts'o; +Cc: fstests On Thu, Jul 06, 2023 at 04:42:32PM -0400, Theodore Ts'o wrote: > By specifying "xmlns=https://git.kernel.org/.../xfstests-dev.git", > this causes XML complaint parsers, such as used by the python > junitparser library, to put all of the XML elements into a namespace, > which then causes junitparser to toss its cookies. > > This can be worked-around in a test runner script via: > > sed -i.orig -e 's/xmlns=\".*\"//' "$RESULT_BASE/result.xml" > > but it's better not to include the xmlns line at all in the first > place, since this may cause other users of fstests who are using > the Python junitparser library a lot of headaches. > > Cc: "Darrick J. Wong" <djwong@kernel.org> > Signed-off-by: Theodore Ts'o <tytso@mit.edu> Yeah, I guess. Note to self: xmlns needs to be in the file format from the start, or adding it will break parsers. :( Sooo much engineering and yet it's easy to stop up the drain. Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > --- > common/report | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/common/report b/common/report > index 3ad14f94e..081df988f 100644 > --- a/common/report > +++ b/common/report > @@ -127,7 +127,6 @@ _xunit_make_section_report() > local fstests_ns="https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git" > cat >> "$tmp_fn" << ENDL > <testsuite > - xmlns="$fstests_ns" > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" > xsi:schemaLocation="$fstests_ns $fstests_ns/tree/doc/xunit.xsd" > > -- > 2.31.0 > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] report: safely update the result.xml file 2023-07-06 20:42 [PATCH 1/2] report: safely update the result.xml file Theodore Ts'o 2023-07-06 20:42 ` [PATCH 2/2] report: remove xmlns specifier Theodore Ts'o @ 2023-07-07 15:01 ` Darrick J. Wong 2023-07-07 16:17 ` Zorro Lang 1 sibling, 1 reply; 8+ messages in thread From: Darrick J. Wong @ 2023-07-07 15:01 UTC (permalink / raw) To: Theodore Ts'o; +Cc: fstests On Thu, Jul 06, 2023 at 04:42:31PM -0400, Theodore Ts'o wrote: > After every single test, we rewrite result.xml from scratch. This > ensures that the XML file is always in a valid, parseable state, even > if the check script is killed or the machine crashes in the middle of > a test. > > If the test is being run in a Cloud VM as a "spot" (Amazon, Azure, or > GCE) or "preemptible" (Oracle) instance, it is possible that the VM > can be halted whenever the Cloud provider needs the capacity for > customers who are willing to pay full price. ("Spot" instances can be > 60% to 90% cheaper --- allowing the frugal kernel developer to get up > to 10 times more testing for the same amount of money. :-) > > Since a "spot" VM can get terminated at any time, it is possible for > the VM to be terminated immediately after a test has completed and > while the result.xml file is in the middle of being written out. In > that case, the result.xml file could partially written, resulting in > an invalid result.xml file and lost information about the tests run > before the VM was terminated. > > To address this race, write the new result.xml file as result.xml.new, > and only rename it to result.xml after the XML file is fully written > out. > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > --- > common/report | 19 +++++++++++-------- > 1 file changed, 11 insertions(+), 8 deletions(-) > > diff --git a/common/report b/common/report > index 9bfa09ecc..3ad14f94e 100644 > --- a/common/report > +++ b/common/report > @@ -109,13 +109,15 @@ _xunit_make_section_report() > local notrun_count="$4" > local sect_time="$5" > local timestamp > + local tmp_fn="$REPORT_DIR/result.xml.new" > + local out_fn="$REPORT_DIR/result.xml" > > if [ $sect_name == '-no-sections-' ]; then > sect_name='global' > fi > local report=$tmp.report.xunit.$sect_name.xml > # Header > - echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > $REPORT_DIR/result.xml > + echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > "$tmp_fn" Nit: You might want to rm -f $tmp_fn first to reduce the nastiness if someone plants a named pipe at that path. > if [ -n "$test_start_time" ]; then > timestamp="$(date -Iseconds --date="$test_start_time")" > else > @@ -123,7 +125,7 @@ _xunit_make_section_report() > fi > > local fstests_ns="https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git" > - cat >> $REPORT_DIR/result.xml << ENDL > + cat >> "$tmp_fn" << ENDL > <testsuite > xmlns="$fstests_ns" > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" > @@ -142,19 +144,20 @@ ENDL > __generate_report_vars > > # Properties > - echo -e "\t<properties>" >> $REPORT_DIR/result.xml > + echo -e "\t<properties>" >> "$tmp_fn" > (for key in "${!REPORT_VARS[@]}"; do > _xunit_add_property "$key" "${REPORT_VARS["$key"]}" > done; > for p in "${REPORT_ENV_LIST[@]}"; do > _xunit_add_property "$p" "${!p}" > - done) | sort >> $REPORT_DIR/result.xml > - echo -e "\t</properties>" >> $REPORT_DIR/result.xml > + done) | sort >> "$tmp_fn" > + echo -e "\t</properties>" >> "$tmp_fn" > if [ -f $report ]; then > - cat $report >> $REPORT_DIR/result.xml > + cat $report >> "$tmp_fn" > fi > - echo "</testsuite>" >> $REPORT_DIR/result.xml > - echo "Xunit report: $REPORT_DIR/result.xml" > + echo "</testsuite>" >> "$tmp_fn" > + mv "$tmp_fn" "$out_fn" Second nit: Make sure we actually wrote tmp_fn before blowing away the old report. sync "$tmp_fn" && mv "$tmp_fn" "$out_fn" With that fixed, Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > + echo "Xunit report: $out_fn" > } > > _xunit_make_testcase_report() > -- > 2.31.0 > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] report: safely update the result.xml file 2023-07-07 15:01 ` [PATCH 1/2] report: safely update the result.xml file Darrick J. Wong @ 2023-07-07 16:17 ` Zorro Lang 2023-07-07 19:13 ` Theodore Ts'o 0 siblings, 1 reply; 8+ messages in thread From: Zorro Lang @ 2023-07-07 16:17 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Theodore Ts'o, fstests On Fri, Jul 07, 2023 at 08:01:30AM -0700, Darrick J. Wong wrote: > On Thu, Jul 06, 2023 at 04:42:31PM -0400, Theodore Ts'o wrote: > > After every single test, we rewrite result.xml from scratch. This > > ensures that the XML file is always in a valid, parseable state, even > > if the check script is killed or the machine crashes in the middle of > > a test. > > > > If the test is being run in a Cloud VM as a "spot" (Amazon, Azure, or > > GCE) or "preemptible" (Oracle) instance, it is possible that the VM > > can be halted whenever the Cloud provider needs the capacity for > > customers who are willing to pay full price. ("Spot" instances can be > > 60% to 90% cheaper --- allowing the frugal kernel developer to get up > > to 10 times more testing for the same amount of money. :-) > > > > Since a "spot" VM can get terminated at any time, it is possible for > > the VM to be terminated immediately after a test has completed and > > while the result.xml file is in the middle of being written out. In > > that case, the result.xml file could partially written, resulting in > > an invalid result.xml file and lost information about the tests run > > before the VM was terminated. > > > > To address this race, write the new result.xml file as result.xml.new, > > and only rename it to result.xml after the XML file is fully written > > out. > > > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > > --- > > common/report | 19 +++++++++++-------- > > 1 file changed, 11 insertions(+), 8 deletions(-) > > > > diff --git a/common/report b/common/report > > index 9bfa09ecc..3ad14f94e 100644 > > --- a/common/report > > +++ b/common/report > > @@ -109,13 +109,15 @@ _xunit_make_section_report() > > local notrun_count="$4" > > local sect_time="$5" > > local timestamp > > + local tmp_fn="$REPORT_DIR/result.xml.new" > > + local out_fn="$REPORT_DIR/result.xml" > > > > if [ $sect_name == '-no-sections-' ]; then > > sect_name='global' > > fi > > local report=$tmp.report.xunit.$sect_name.xml > > # Header > > - echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > $REPORT_DIR/result.xml > > + echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > "$tmp_fn" > > Nit: You might want to rm -f $tmp_fn first to reduce the nastiness if > someone plants a named pipe at that path. > > > if [ -n "$test_start_time" ]; then > > timestamp="$(date -Iseconds --date="$test_start_time")" > > else > > @@ -123,7 +125,7 @@ _xunit_make_section_report() > > fi > > > > local fstests_ns="https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git" > > - cat >> $REPORT_DIR/result.xml << ENDL > > + cat >> "$tmp_fn" << ENDL > > <testsuite > > xmlns="$fstests_ns" > > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" > > @@ -142,19 +144,20 @@ ENDL > > __generate_report_vars > > > > # Properties > > - echo -e "\t<properties>" >> $REPORT_DIR/result.xml > > + echo -e "\t<properties>" >> "$tmp_fn" > > (for key in "${!REPORT_VARS[@]}"; do > > _xunit_add_property "$key" "${REPORT_VARS["$key"]}" > > done; > > for p in "${REPORT_ENV_LIST[@]}"; do > > _xunit_add_property "$p" "${!p}" > > - done) | sort >> $REPORT_DIR/result.xml > > - echo -e "\t</properties>" >> $REPORT_DIR/result.xml > > + done) | sort >> "$tmp_fn" > > + echo -e "\t</properties>" >> "$tmp_fn" > > if [ -f $report ]; then > > - cat $report >> $REPORT_DIR/result.xml > > + cat $report >> "$tmp_fn" > > fi > > - echo "</testsuite>" >> $REPORT_DIR/result.xml > > - echo "Xunit report: $REPORT_DIR/result.xml" > > + echo "</testsuite>" >> "$tmp_fn" > > + mv "$tmp_fn" "$out_fn" > > Second nit: Make sure we actually wrote tmp_fn before blowing away the > old report. > > sync "$tmp_fn" && mv "$tmp_fn" "$out_fn" I can help to add this when I merge it, if Ted doesn't want to change more than that. Just curious, will renameat2 ignore data still in cached? I never did things likes "sync && mv" before, is there any known issue or it's as expected? Thanks, Zorro > > With that fixed, > Reviewed-by: Darrick J. Wong <djwong@kernel.org> > > --D > > > + echo "Xunit report: $out_fn" > > } > > > > _xunit_make_testcase_report() > > -- > > 2.31.0 > > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] report: safely update the result.xml file 2023-07-07 16:17 ` Zorro Lang @ 2023-07-07 19:13 ` Theodore Ts'o 2023-07-08 3:02 ` Zorro Lang 0 siblings, 1 reply; 8+ messages in thread From: Theodore Ts'o @ 2023-07-07 19:13 UTC (permalink / raw) To: Zorro Lang; +Cc: Darrick J. Wong, fstests On Sat, Jul 08, 2023 at 12:17:47AM +0800, Zorro Lang wrote: > > sync "$tmp_fn" && mv "$tmp_fn" "$out_fn" > > I can help to add this when I merge it, if Ted doesn't want to change > more than that. I'll be sending out v2 shortly. > Just curious, will renameat2 ignore data still in cached? I never did things > likes "sync && mv" before, is there any known issue or it's as expected? POSIX does not require rename() or renameat2() to do anything with any cached data. All it does is make changes to the directory entries involved. And that's what the VFS Layer does. Some file systems implementation *may* do something special on a rename(2), but it's not required by any specification. For example, in the wake of the O_PONIES controversy, ext4 will initiate (but not wait for) the file writeback of file being renamed *if* it is being renamed on top of another file, causing the destination inode to be unlinked and deleted if it doesn't have any other hard links to keep its reflink about 0. This means that in the case of renaming result.xml.new on top of result.xml, if result.xml exists, this will trigger an immediate writeback of result.xml.new, instead of waiting for the 30 second writeback timer --- if this is happening on ext4. We added this to ext4 because when people would exit the old Tuxracer game, this would rewrite the top ten file in an unsafe manner, and then close the OpenGL handle --- and on kernels with buggy proprietary graphics drivers, it would cause a kernel crash, causing the top-ten score file to be lost, and causing many users to complain to the file system developers (but curiously enough, not to the application/game developers). So we added something which reduces the chances of this being an issue, but it's not something anyone should count upon. As far as sync && mv, that's the shell script equivalent of what careful userspace programs should always do, which is to write the file to foo.new, then fsync it, and then rename foo.new to foo, optionally renaming foo to foo.old or foo~ beforehand. Text editors have always gotten this right, because programmers get salty when their source files get trashed. For whatever reason, desktop application and games tend not be as careful.... Cheers, - Ted ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] report: safely update the result.xml file 2023-07-07 19:13 ` Theodore Ts'o @ 2023-07-08 3:02 ` Zorro Lang 2023-07-08 4:31 ` Theodore Ts'o 0 siblings, 1 reply; 8+ messages in thread From: Zorro Lang @ 2023-07-08 3:02 UTC (permalink / raw) To: Theodore Ts'o; +Cc: Darrick J. Wong, fstests On Fri, Jul 07, 2023 at 03:13:31PM -0400, Theodore Ts'o wrote: > On Sat, Jul 08, 2023 at 12:17:47AM +0800, Zorro Lang wrote: > > > sync "$tmp_fn" && mv "$tmp_fn" "$out_fn" > > > > I can help to add this when I merge it, if Ted doesn't want to change > > more than that. > > I'll be sending out v2 shortly. > > > Just curious, will renameat2 ignore data still in cached? I never did things > > likes "sync && mv" before, is there any known issue or it's as expected? > > POSIX does not require rename() or renameat2() to do anything with any > cached data. All it does is make changes to the directory entries > involved. And that's what the VFS Layer does. > > Some file systems implementation *may* do something special on a > rename(2), but it's not required by any specification. For example, > in the wake of the O_PONIES controversy, ext4 will initiate (but not > wait for) the file writeback of file being renamed *if* it is being > renamed on top of another file, causing the destination inode to be > unlinked and deleted if it doesn't have any other hard links to keep > its reflink about 0. > > This means that in the case of renaming result.xml.new on top of > result.xml, if result.xml exists, this will trigger an immediate > writeback of result.xml.new, instead of waiting for the 30 second > writeback timer --- if this is happening on ext4. We added this to > ext4 because when people would exit the old Tuxracer game, this would > rewrite the top ten file in an unsafe manner, and then close the > OpenGL handle --- and on kernels with buggy proprietary graphics > drivers, it would cause a kernel crash, causing the top-ten score file > to be lost, and causing many users to complain to the file system > developers (but curiously enough, not to the application/game > developers). So we added something which reduces the chances of this > being an issue, but it's not something anyone should count upon. > > As far as sync && mv, that's the shell script equivalent of what > careful userspace programs should always do, which is to write the > file to foo.new, then fsync it, and then rename foo.new to foo, > optionally renaming foo to foo.old or foo~ beforehand. Text editors > have always gotten this right, because programmers get salty when > their source files get trashed. For whatever reason, desktop > application and games tend not be as careful.... Thanks Ted, good to know that! Maybe we can have a "_mv" in common/rc does "sync $1 && mv $1 $2". Thanks, Zorro > > Cheers, > > - Ted > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] report: safely update the result.xml file 2023-07-08 3:02 ` Zorro Lang @ 2023-07-08 4:31 ` Theodore Ts'o 0 siblings, 0 replies; 8+ messages in thread From: Theodore Ts'o @ 2023-07-08 4:31 UTC (permalink / raw) To: Zorro Lang; +Cc: Darrick J. Wong, fstests On Sat, Jul 08, 2023 at 11:02:05AM +0800, Zorro Lang wrote: > > Thanks Ted, good to know that! Maybe we can have a "_mv" in common/rc does > "sync $1 && mv $1 $2". Perhaps, although I can't think of any other place in xfstests were we are actively (and repeatedly) renaming one file on top of another one, or rewriting a file in place. There are certainly places where we use "sed -i" to rewrite a file in place, and it does do write the file to a new file, and then renames it to the original file's name (yay!), although it doesn't do an fsync before the rename (boo!). That's arguably a bug report we should send to the maintainer of the sed package. - Ted ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-07-08 4:31 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-07-06 20:42 [PATCH 1/2] report: safely update the result.xml file Theodore Ts'o 2023-07-06 20:42 ` [PATCH 2/2] report: remove xmlns specifier Theodore Ts'o 2023-07-07 14:55 ` Darrick J. Wong 2023-07-07 15:01 ` [PATCH 1/2] report: safely update the result.xml file Darrick J. Wong 2023-07-07 16:17 ` Zorro Lang 2023-07-07 19:13 ` Theodore Ts'o 2023-07-08 3:02 ` Zorro Lang 2023-07-08 4:31 ` Theodore Ts'o
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox