public inbox for fio@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] fio: Avoid errno and errno string mismatch
@ 2026-02-06 16:40 Niklas Cassel
  2026-02-06 16:40 ` [PATCH 1/3] fio: Fix error string not matching errno Niklas Cassel
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Niklas Cassel @ 2026-02-06 16:40 UTC (permalink / raw)
  To: Jens Axboe, Vincent Fu
  Cc: fio, Damien Le Moal, Jorgen S Hansen, Niklas Cassel

Hello all,

This series fixes a problem where errno and error string representing that
errno could be out of sync.

Another possible solution is to remove td->verror everywhere and make sure
that the function wants the error string calls strerror() with the errno
when they actually need it. However, that would be a larger change.

Another optimization that could also be done is to look at the places where
we call update_error_count() and td_clear_error(), if we don't set
td->error for a non-fatal error, there is no reason to unconditionally call
td_clear_error() for a non-fatal error. (update_error_count() still has to
be called, as it sets td->first_error.) However, I do not really understand
why update_error_count() + td_clear_error() is called by both:
io_u.c:io_completed() and backend.c:break_on_this_error(), so I avoided to
do this optimization.


Niklas Cassel (3):
  fio: Fix error string not matching errno
  io_u: Fix inconsistent handling of non-fatal errors with option
    error_dump
  stat: Remove duplicate space in __show_run_stats()

 fio.h  |  7 +++----
 io_u.c | 10 ++++++++--
 stat.c |  2 +-
 3 files changed, 12 insertions(+), 7 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] fio: Fix error string not matching errno
  2026-02-06 16:40 [PATCH 0/3] fio: Avoid errno and errno string mismatch Niklas Cassel
@ 2026-02-06 16:40 ` Niklas Cassel
  2026-02-13  3:12   ` Damien Le Moal
  2026-02-06 16:40 ` [PATCH 2/3] io_u: Fix inconsistent handling of non-fatal errors with option error_dump Niklas Cassel
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Niklas Cassel @ 2026-02-06 16:40 UTC (permalink / raw)
  To: Jens Axboe, Vincent Fu
  Cc: fio, Damien Le Moal, Jorgen S Hansen, Niklas Cassel

When using --error_dump=1 together with --continue_on_error=io and
--ignore_error=62, we can get an inconsistent error in the summary line
 produced by __show_run_stats():

Before patch:
test: (groupid=0, jobs=1): err= 5 (file:io_u.c:2012, func=io_u error, error=Timer expired): pid=30925: Thu Feb  5 09:15:15 2026
...
  IO depths    : 1=0.8%, 2=1.6%, 4=3.3%, 8=6.6%, 16=13.1%, 32=74.6%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=98.9%, 8=0.0%, 16=0.0%, 32=1.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,122,0,0 short=0,0,0,0 dropped=0,0,0,0
     errors    : total=4, first_error=62/<Timer expired>
     latency   : target=0, window=0, percentile=100.00%, depth=32

The string for errno "err= 5" is incorrectly printed as "Timer expired".
(The correct string for "err= 5" is "Input/output error".)

There is thus a mismatch between the errno and the verbose string for the
errno.

This problem is this code in __td_verror():

td->error = ____e;
if (!td->first_error)
	nowarn_snprintf(td->verror, ..);

I.e. td->error (errno) is updated unconditionally, while td->verror (the
verbose error string), is updated only if td->first_error is not set.

Thus, if you get a non-fatal error, it will set td->error and td->error.
Later, for a non-fatal error, io_completed() will call update_error_count()
which sets td->first_error, followed by td_clear_error() which will clear
td->error.

Thus a second error (fatal or non-fatal) will set td->error, but since
td->first_error is now set, it will not update td->verror.

Since td->verror contains the string representation of the errno stored in
td->error, these two struct members should obviously always be updated at
the same time.

If you look at the example print above, you can see that there is another
print: first_error=62/<Timer expired>

Which prints td->first_error, and does not even use td->verror. Instead
show_thread_status_normal() calls strerror(td->first_error) to get the
error string for td->first_error.

There is thus absolutely no reason to not set td->verror every time
td->error is set.

Remove the useless guard such that the error string will always correspond
to errno.

Fixes: f2bba1820a56 ("Add a 'continue_on_error' option to fio")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
---
 fio.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fio.h b/fio.h
index e0b7c13c..65c68d4b 100644
--- a/fio.h
+++ b/fio.h
@@ -542,10 +542,9 @@ enum {
 		if ((td)->error)					\
 			break;						\
 		(td)->error = ____e;					\
-		if (!(td)->first_error)					\
-			nowarn_snprintf(td->verror, sizeof(td->verror),	\
-					"file:%s:%d, func=%s, error=%s", \
-					__FILE__, __LINE__, (func), (msg)); \
+		nowarn_snprintf(td->verror, sizeof(td->verror),		\
+				"file:%s:%d, func=%s, error=%s",	\
+				__FILE__, __LINE__, (func), (msg));	\
 	} while (0)
 
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] io_u: Fix inconsistent handling of non-fatal errors with option error_dump
  2026-02-06 16:40 [PATCH 0/3] fio: Avoid errno and errno string mismatch Niklas Cassel
  2026-02-06 16:40 ` [PATCH 1/3] fio: Fix error string not matching errno Niklas Cassel
@ 2026-02-06 16:40 ` Niklas Cassel
  2026-02-13  3:17   ` Damien Le Moal
  2026-02-06 16:40 ` [PATCH 3/3] stat: Remove duplicate space in __show_run_stats() Niklas Cassel
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Niklas Cassel @ 2026-02-06 16:40 UTC (permalink / raw)
  To: Jens Axboe, Vincent Fu
  Cc: fio, Damien Le Moal, Jorgen S Hansen, Niklas Cassel

Commit 8b28bd413759 ("backend: Add configurable non fatal error list")
added an early return in io_u_log_error() for non-fatal errors.

This early return is performed if the error is a non-fatal error, and if
error_dump is not set.

Looking at the help text for the error_dump option:
"If set dump every error even if it is non fatal, true by default.
If disabled only fatal error will be dumped."

So this commit made sure that, if error dump is NOT set:
For a non-fatal error, io_u_log_error() will return early and will thus:
1) NOT print an error to the log
2) NOT call td_verror()

However, if error dump is set, io_u_log_error() will not do an early
return, instead it will log the non-fatal error and then call td_verror().

It is clear that the intention is for a non-fatal error to not set
td->error. (If error_dump is used it should _log_ the non-fatal error.)

Thus, fix the code such that if error_dump is set, for a non-fatal error,
we:
1) print an error to the log
2) NOT call td_verror()

This will make the behavior in io_u_log_error() consistent.
If error_dump is set, we will log the non-fatal error, but regardless of
error_dump being set or not, we do NOT call td_verror() (which would set
td->error) for a non-fatal error, since that was obviously the intention
of commit 8b28bd413759 ("backend: Add configurable non fatal error list").

Fixes: 8b28bd413759 ("backend: Add configurable non fatal error list")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
---
 io_u.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/io_u.c b/io_u.c
index ac9a1c7d..653a700c 100644
--- a/io_u.c
+++ b/io_u.c
@@ -1985,8 +1985,14 @@ err_put:
 static void __io_u_log_error(struct thread_data *td, struct io_u *io_u)
 {
 	enum error_type_bit eb = td_error_type(io_u->ddir, io_u->error);
+	bool non_fatal_error = td_non_fatal_error(td, eb, io_u->error);
 
-	if (td_non_fatal_error(td, eb, io_u->error) && !td->o.error_dump)
+	/*
+	 * Non-fatal errors (errors that should be ignored), are normally not
+	 * dumped to the log, unless td->o.error_dump. Regardless, non-fatal
+	 * errors should never call td_verror() to set td->error.
+	 */
+	if (non_fatal_error && !td->o.error_dump)
 		return;
 
 	log_err("fio: io_u error%s%s: %s: %s offset=%llu, buflen=%llu\n",
@@ -2008,7 +2014,7 @@ static void __io_u_log_error(struct thread_data *td, struct io_u *io_u)
 		}
 	}
 
-	if (!td->error)
+	if (!td->error && !non_fatal_error)
 		td_verror(td, io_u->error, "io_u error");
 }
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] stat: Remove duplicate space in __show_run_stats()
  2026-02-06 16:40 [PATCH 0/3] fio: Avoid errno and errno string mismatch Niklas Cassel
  2026-02-06 16:40 ` [PATCH 1/3] fio: Fix error string not matching errno Niklas Cassel
  2026-02-06 16:40 ` [PATCH 2/3] io_u: Fix inconsistent handling of non-fatal errors with option error_dump Niklas Cassel
@ 2026-02-06 16:40 ` Niklas Cassel
  2026-02-13  3:17   ` Damien Le Moal
  2026-02-06 17:57 ` [PATCH 0/3] fio: Avoid errno and errno string mismatch fiotestbot
  2026-02-14  2:39 ` Vincent Fu
  4 siblings, 1 reply; 9+ messages in thread
From: Niklas Cassel @ 2026-02-06 16:40 UTC (permalink / raw)
  To: Jens Axboe, Vincent Fu
  Cc: fio, Damien Le Moal, Jorgen S Hansen, Niklas Cassel

Remove duplicate space in __show_run_stats().

Signed-off-by: Niklas Cassel <cassel@kernel.org>
---
 stat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/stat.c b/stat.c
index b999eb4b..620b4626 100644
--- a/stat.c
+++ b/stat.c
@@ -2604,7 +2604,7 @@ void __show_run_stats(void)
 				ts->error = td->first_error;
 				snprintf(ts->verror, sizeof(ts->verror), "%s",
 					 td->verror);
-			} else  if (td->error) {
+			} else if (td->error) {
 				ts->error = td->error;
 				snprintf(ts->verror, sizeof(ts->verror), "%s",
 					 td->verror);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] fio: Avoid errno and errno string mismatch
  2026-02-06 16:40 [PATCH 0/3] fio: Avoid errno and errno string mismatch Niklas Cassel
                   ` (2 preceding siblings ...)
  2026-02-06 16:40 ` [PATCH 3/3] stat: Remove duplicate space in __show_run_stats() Niklas Cassel
@ 2026-02-06 17:57 ` fiotestbot
  2026-02-14  2:39 ` Vincent Fu
  4 siblings, 0 replies; 9+ messages in thread
From: fiotestbot @ 2026-02-06 17:57 UTC (permalink / raw)
  To: fio

[-- Attachment #1: Type: text/plain, Size: 144 bytes --]


The result of fio's continuous integration tests was: success

For more details see https://github.com/fiotestbot/fio/actions/runs/21758930735

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] fio: Fix error string not matching errno
  2026-02-06 16:40 ` [PATCH 1/3] fio: Fix error string not matching errno Niklas Cassel
@ 2026-02-13  3:12   ` Damien Le Moal
  0 siblings, 0 replies; 9+ messages in thread
From: Damien Le Moal @ 2026-02-13  3:12 UTC (permalink / raw)
  To: Niklas Cassel, Jens Axboe, Vincent Fu; +Cc: fio, Jorgen S Hansen

On 2/7/26 01:40, Niklas Cassel wrote:
> When using --error_dump=1 together with --continue_on_error=io and
> --ignore_error=62, we can get an inconsistent error in the summary line
>  produced by __show_run_stats():
> 
> Before patch:
> test: (groupid=0, jobs=1): err= 5 (file:io_u.c:2012, func=io_u error, error=Timer expired): pid=30925: Thu Feb  5 09:15:15 2026
> ...
>   IO depths    : 1=0.8%, 2=1.6%, 4=3.3%, 8=6.6%, 16=13.1%, 32=74.6%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=98.9%, 8=0.0%, 16=0.0%, 32=1.1%, 64=0.0%, >=64=0.0%
>      issued rwts: total=0,122,0,0 short=0,0,0,0 dropped=0,0,0,0
>      errors    : total=4, first_error=62/<Timer expired>
>      latency   : target=0, window=0, percentile=100.00%, depth=32
> 
> The string for errno "err= 5" is incorrectly printed as "Timer expired".
> (The correct string for "err= 5" is "Input/output error".)
> 
> There is thus a mismatch between the errno and the verbose string for the
> errno.
> 
> This problem is this code in __td_verror():
> 
> td->error = ____e;
> if (!td->first_error)
> 	nowarn_snprintf(td->verror, ..);
> 
> I.e. td->error (errno) is updated unconditionally, while td->verror (the
> verbose error string), is updated only if td->first_error is not set.
> 
> Thus, if you get a non-fatal error, it will set td->error and td->error.
> Later, for a non-fatal error, io_completed() will call update_error_count()
> which sets td->first_error, followed by td_clear_error() which will clear
> td->error.
> 
> Thus a second error (fatal or non-fatal) will set td->error, but since
> td->first_error is now set, it will not update td->verror.
> 
> Since td->verror contains the string representation of the errno stored in
> td->error, these two struct members should obviously always be updated at
> the same time.
> 
> If you look at the example print above, you can see that there is another
> print: first_error=62/<Timer expired>
> 
> Which prints td->first_error, and does not even use td->verror. Instead
> show_thread_status_normal() calls strerror(td->first_error) to get the
> error string for td->first_error.
> 
> There is thus absolutely no reason to not set td->verror every time
> td->error is set.
> 
> Remove the useless guard such that the error string will always correspond
> to errno.
> 
> Fixes: f2bba1820a56 ("Add a 'continue_on_error' option to fio")
> Signed-off-by: Niklas Cassel <cassel@kernel.org>

Looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] io_u: Fix inconsistent handling of non-fatal errors with option error_dump
  2026-02-06 16:40 ` [PATCH 2/3] io_u: Fix inconsistent handling of non-fatal errors with option error_dump Niklas Cassel
@ 2026-02-13  3:17   ` Damien Le Moal
  0 siblings, 0 replies; 9+ messages in thread
From: Damien Le Moal @ 2026-02-13  3:17 UTC (permalink / raw)
  To: Niklas Cassel, Jens Axboe, Vincent Fu; +Cc: fio, Jorgen S Hansen

On 2/7/26 01:40, Niklas Cassel wrote:
> Commit 8b28bd413759 ("backend: Add configurable non fatal error list")
> added an early return in io_u_log_error() for non-fatal errors.
> 
> This early return is performed if the error is a non-fatal error, and if
> error_dump is not set.
> 
> Looking at the help text for the error_dump option:
> "If set dump every error even if it is non fatal, true by default.
> If disabled only fatal error will be dumped."
> 
> So this commit made sure that, if error dump is NOT set:
> For a non-fatal error, io_u_log_error() will return early and will thus:
> 1) NOT print an error to the log
> 2) NOT call td_verror()
> 
> However, if error dump is set, io_u_log_error() will not do an early
> return, instead it will log the non-fatal error and then call td_verror().
> 
> It is clear that the intention is for a non-fatal error to not set
> td->error. (If error_dump is used it should _log_ the non-fatal error.)
> 
> Thus, fix the code such that if error_dump is set, for a non-fatal error,
> we:
> 1) print an error to the log
> 2) NOT call td_verror()
> 
> This will make the behavior in io_u_log_error() consistent.
> If error_dump is set, we will log the non-fatal error, but regardless of
> error_dump being set or not, we do NOT call td_verror() (which would set
> td->error) for a non-fatal error, since that was obviously the intention
> of commit 8b28bd413759 ("backend: Add configurable non fatal error list").
> 
> Fixes: 8b28bd413759 ("backend: Add configurable non fatal error list")
> Signed-off-by: Niklas Cassel <cassel@kernel.org>

Looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] stat: Remove duplicate space in __show_run_stats()
  2026-02-06 16:40 ` [PATCH 3/3] stat: Remove duplicate space in __show_run_stats() Niklas Cassel
@ 2026-02-13  3:17   ` Damien Le Moal
  0 siblings, 0 replies; 9+ messages in thread
From: Damien Le Moal @ 2026-02-13  3:17 UTC (permalink / raw)
  To: Niklas Cassel, Jens Axboe, Vincent Fu; +Cc: fio, Jorgen S Hansen

On 2/7/26 01:40, Niklas Cassel wrote:
> Remove duplicate space in __show_run_stats().
> 
> Signed-off-by: Niklas Cassel <cassel@kernel.org>

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] fio: Avoid errno and errno string mismatch
  2026-02-06 16:40 [PATCH 0/3] fio: Avoid errno and errno string mismatch Niklas Cassel
                   ` (3 preceding siblings ...)
  2026-02-06 17:57 ` [PATCH 0/3] fio: Avoid errno and errno string mismatch fiotestbot
@ 2026-02-14  2:39 ` Vincent Fu
  4 siblings, 0 replies; 9+ messages in thread
From: Vincent Fu @ 2026-02-14  2:39 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Jens Axboe, Vincent Fu, fio, Damien Le Moal, Jorgen S Hansen

On Fri, Feb 6, 2026 at 11:46 AM Niklas Cassel <cassel@kernel.org> wrote:
>
> Hello all,
>
> This series fixes a problem where errno and error string representing that
> errno could be out of sync.
>
> Another possible solution is to remove td->verror everywhere and make sure
> that the function wants the error string calls strerror() with the errno
> when they actually need it. However, that would be a larger change.
>
> Another optimization that could also be done is to look at the places where
> we call update_error_count() and td_clear_error(), if we don't set
> td->error for a non-fatal error, there is no reason to unconditionally call
> td_clear_error() for a non-fatal error. (update_error_count() still has to
> be called, as it sets td->first_error.) However, I do not really understand
> why update_error_count() + td_clear_error() is called by both:
> io_u.c:io_completed() and backend.c:break_on_this_error(), so I avoided to
> do this optimization.
>
>
> Niklas Cassel (3):
>   fio: Fix error string not matching errno
>   io_u: Fix inconsistent handling of non-fatal errors with option
>     error_dump
>   stat: Remove duplicate space in __show_run_stats()
>
>  fio.h  |  7 +++----
>  io_u.c | 10 ++++++++--
>  stat.c |  2 +-
>  3 files changed, 12 insertions(+), 7 deletions(-)
>
> --
> 2.53.0

Applied. Thanks.

Vincent

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-02-14  2:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-06 16:40 [PATCH 0/3] fio: Avoid errno and errno string mismatch Niklas Cassel
2026-02-06 16:40 ` [PATCH 1/3] fio: Fix error string not matching errno Niklas Cassel
2026-02-13  3:12   ` Damien Le Moal
2026-02-06 16:40 ` [PATCH 2/3] io_u: Fix inconsistent handling of non-fatal errors with option error_dump Niklas Cassel
2026-02-13  3:17   ` Damien Le Moal
2026-02-06 16:40 ` [PATCH 3/3] stat: Remove duplicate space in __show_run_stats() Niklas Cassel
2026-02-13  3:17   ` Damien Le Moal
2026-02-06 17:57 ` [PATCH 0/3] fio: Avoid errno and errno string mismatch fiotestbot
2026-02-14  2:39 ` Vincent Fu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox