* [PATCH v2] block: I/O error occurs during SATA disk stress test
@ 2022-08-25 7:09 Gu Mi
2022-08-25 17:18 ` Bart Van Assche
2022-08-26 21:15 ` Bart Van Assche
0 siblings, 2 replies; 11+ messages in thread
From: Gu Mi @ 2022-08-25 7:09 UTC (permalink / raw)
To: axboe, damien.lemoal, bvanassche; +Cc: linux-block, Gu Mi
The problem occurs in two async processes, One is when a new IO
calls the blk_mq_start_request() interface to start sending,The other
is that the block layer timer process calls the blk_mq_req_expired
interface to check whether there is an IO timeout.
When an instruction out of sequence occurs between blk_add_timer
and WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in the interface
blk_mq_start_request,at this time, the block timer is checking the
new IO timeout, Since the req status has been set to MQ_RQ_IN_FLIGHT
and req->deadline is 0 at this time, the new IO will be misjudged as
a timeout.
Our repair plan is for the deadline to be 0, and we do not think
that a timeout occurs. At the same time, because the jiffies of the
32-bit system will be reversed shortly after the system is turned on,
we will add 1 jiffies to the deadline at this time.
Signed-off-by: Gu Mi <gumi@linux.alibaba.com>
---
v1->v2:
time_after_eq() can handle the overflow, so remove the change on 32-bit
in blk_add_timer().
block/blk-mq.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4b90d2d..6defaa1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1451,6 +1451,8 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
return false;
deadline = READ_ONCE(rq->deadline);
+ if (unlikely(deadline == 0))
+ return false;
if (time_after_eq(jiffies, deadline))
return true;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
2022-08-25 7:09 Gu Mi
@ 2022-08-25 17:18 ` Bart Van Assche
2022-08-26 21:15 ` Bart Van Assche
1 sibling, 0 replies; 11+ messages in thread
From: Bart Van Assche @ 2022-08-25 17:18 UTC (permalink / raw)
To: Gu Mi, axboe, damien.lemoal; +Cc: linux-block
On 8/25/22 00:09, Gu Mi wrote:
> The problem occurs in two async processes, One is when a new IO
> calls the blk_mq_start_request() interface to start sending,The other
> is that the block layer timer process calls the blk_mq_req_expired
> interface to check whether there is an IO timeout.
>
> When an instruction out of sequence occurs between blk_add_timer
> and WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in the interface
> blk_mq_start_request,at this time, the block timer is checking the
> new IO timeout, Since the req status has been set to MQ_RQ_IN_FLIGHT
> and req->deadline is 0 at this time, the new IO will be misjudged as
> a timeout.
>
> Our repair plan is for the deadline to be 0, and we do not think
> that a timeout occurs. At the same time, because the jiffies of the
> 32-bit system will be reversed shortly after the system is turned on,
> we will add 1 jiffies to the deadline at this time.
>
> Signed-off-by: Gu Mi <gumi@linux.alibaba.com>
> ---
> v1->v2:
>
> time_after_eq() can handle the overflow, so remove the change on 32-bit
> in blk_add_timer().
>
> block/blk-mq.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4b90d2d..6defaa1 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1451,6 +1451,8 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
> return false;
>
> deadline = READ_ONCE(rq->deadline);
> + if (unlikely(deadline == 0))
> + return false;
> if (time_after_eq(jiffies, deadline))
> return true;
>
rq->deadline == 0 can be a valid deadline value so the above patch
doesn't look right to me.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
@ 2022-08-26 3:25 gumi
2022-08-26 13:36 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: gumi @ 2022-08-26 3:25 UTC (permalink / raw)
To: 'Bart Van Assche', axboe, damien.lemoal; +Cc: linux-block
On 8/25/22 00:09, Gu Mi wrote:
> The problem occurs in two async processes, One is when a new IO calls
> the blk_mq_start_request() interface to start sending,The other is
> that the block layer timer process calls the blk_mq_req_expired
> interface to check whether there is an IO timeout.
>
> When an instruction out of sequence occurs between blk_add_timer and
> WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in the interface
> blk_mq_start_request,at this time, the block timer is checking the new
> IO timeout, Since the req status has been set to MQ_RQ_IN_FLIGHT and
> req->deadline is 0 at this time, the new IO will be misjudged as a
> timeout.
>
> Our repair plan is for the deadline to be 0, and we do not think that
> a timeout occurs. At the same time, because the jiffies of the 32-bit
> system will be reversed shortly after the system is turned on, we will
> add 1 jiffies to the deadline at this time.
>
> Signed-off-by: Gu Mi <gumi@linux.alibaba.com>
> ---
> v1->v2:
>
> time_after_eq() can handle the overflow, so remove the change on
> 32-bit in blk_add_timer().
>
> block/blk-mq.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c index 4b90d2d..6defaa1
> 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1451,6 +1451,8 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
> return false;
>
> deadline = READ_ONCE(rq->deadline);
> + if (unlikely(deadline == 0))
> + return false;
> if (time_after_eq(jiffies, deadline))
> return true;
>
rq->deadline == 0 can be a valid deadline value so the above patch
doesn't look right to me.
Thanks,
Bart.
---
in patch 1, I added another modification in blk_add_timer(). As follows,
> +#ifndef CONFIG_64BIT
> +/* In case INITIAL_JIFFIES wraps on 32-bit */
> + expiry |= 1UL;
The purpose of this modification is to ensure that rq->deadline is 0 means that it is initialized to 0 in blk_mq_req_expired().
In this case, make sure rq->deadline is an invalid value in blk_mq_req_expired().
Please review my workaround again.
Thanks,
Gu Mi.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
2022-08-26 3:25 gumi
@ 2022-08-26 13:36 ` Jens Axboe
2022-08-26 16:06 ` Bart Van Assche
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2022-08-26 13:36 UTC (permalink / raw)
To: gumi, 'Bart Van Assche', damien.lemoal; +Cc: linux-block
On 8/25/22 9:25 PM, gumi@linux.alibaba.com wrote:
> On 8/25/22 00:09, Gu Mi wrote:
>> The problem occurs in two async processes, One is when a new IO calls
>> the blk_mq_start_request() interface to start sending,The other is
>> that the block layer timer process calls the blk_mq_req_expired
>> interface to check whether there is an IO timeout.
>>
>> When an instruction out of sequence occurs between blk_add_timer and
>> WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in the interface
>> blk_mq_start_request,at this time, the block timer is checking the new
>> IO timeout, Since the req status has been set to MQ_RQ_IN_FLIGHT and
>> req->deadline is 0 at this time, the new IO will be misjudged as a
>> timeout.
>>
>> Our repair plan is for the deadline to be 0, and we do not think that
>> a timeout occurs. At the same time, because the jiffies of the 32-bit
>> system will be reversed shortly after the system is turned on, we will
>> add 1 jiffies to the deadline at this time.
>>
>> Signed-off-by: Gu Mi <gumi@linux.alibaba.com>
>> ---
>> v1->v2:
>>
>> time_after_eq() can handle the overflow, so remove the change on
>> 32-bit in blk_add_timer().
>>
>> block/blk-mq.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c index 4b90d2d..6defaa1
>> 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -1451,6 +1451,8 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
>> return false;
>>
>> deadline = READ_ONCE(rq->deadline);
>> + if (unlikely(deadline == 0))
>> + return false;
>> if (time_after_eq(jiffies, deadline))
>> return true;
>>
>
> rq->deadline == 0 can be a valid deadline value so the above patch
> doesn't look right to me.
Gu, you need to fix your quoting of emails, these are impossible to
read.
That aside, I think there's a misunderstanding here. v1 has some
parts and v2 has others. Please post a v3 that has the hunk
that guarantees that deadline always has the lowest bit set if
assigned, and the !deadline check as well.
--
Jens Axboe
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
2022-08-26 13:36 ` Jens Axboe
@ 2022-08-26 16:06 ` Bart Van Assche
2022-08-26 19:40 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2022-08-26 16:06 UTC (permalink / raw)
To: Jens Axboe, gumi, damien.lemoal; +Cc: linux-block
On 8/26/22 06:36, Jens Axboe wrote:
> That aside, I think there's a misunderstanding here. v1 has some
> parts and v2 has others. Please post a v3 that has the hunk
> that guarantees that deadline always has the lowest bit set if
> assigned, and the !deadline check as well.
Hi Jens,
Would it be considered acceptable to store the request state (rq->state)
in the lowest two bits of rq->deadline? This would reduce the deadline
resolution a little bit but I think that's acceptable. Except for
blk_abort_request(), all changes of rq->state and rq->deadline are
already serialized. So with this approach only blk_abort_request() would
have to use an atomic-compare-exchange loop.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
2022-08-26 16:06 ` Bart Van Assche
@ 2022-08-26 19:40 ` Jens Axboe
0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2022-08-26 19:40 UTC (permalink / raw)
To: Bart Van Assche, gumi, damien.lemoal; +Cc: linux-block
On 8/26/22 10:06 AM, Bart Van Assche wrote:
> On 8/26/22 06:36, Jens Axboe wrote:
>> That aside, I think there's a misunderstanding here. v1 has some
>> parts and v2 has others. Please post a v3 that has the hunk
>> that guarantees that deadline always has the lowest bit set if
>> assigned, and the !deadline check as well.
>
> Hi Jens,
>
> Would it be considered acceptable to store the request state
> (rq->state) in the lowest two bits of rq->deadline? This would reduce
> the deadline resolution a little bit but I think that's acceptable.
> Except for blk_abort_request(), all changes of rq->state and
> rq->deadline are already serialized. So with this approach only
> blk_abort_request() would have to use an atomic-compare-exchange loop.
Sure, I think that would be fine, as long as:
1) We keep the expensive bits in the actual timeout/abort path and not
part of regular issue.
2) We rename the field while doing so.
Might even be worthwhile to look into NOT having both timeout and
deadline in struct request, it's a bit annoying to waste double the
space on something that should just be one field.
--
Jens Axboe
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
2022-08-25 7:09 Gu Mi
2022-08-25 17:18 ` Bart Van Assche
@ 2022-08-26 21:15 ` Bart Van Assche
1 sibling, 0 replies; 11+ messages in thread
From: Bart Van Assche @ 2022-08-26 21:15 UTC (permalink / raw)
To: Gu Mi, axboe, damien.lemoal; +Cc: linux-block
On 8/25/22 00:09, Gu Mi wrote:
> The problem occurs in two async processes, One is when a new IO
> calls the blk_mq_start_request() interface to start sending,The other
> is that the block layer timer process calls the blk_mq_req_expired
> interface to check whether there is an IO timeout.
>
> When an instruction out of sequence occurs between blk_add_timer
> and WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in the interface
> blk_mq_start_request,at this time, the block timer is checking the
> new IO timeout, Since the req status has been set to MQ_RQ_IN_FLIGHT
> and req->deadline is 0 at this time, the new IO will be misjudged as
> a timeout.
>
> Our repair plan is for the deadline to be 0, and we do not think
> that a timeout occurs. At the same time, because the jiffies of the
> 32-bit system will be reversed shortly after the system is turned on,
> we will add 1 jiffies to the deadline at this time.
Hi Gu,
With which kernel version has this race been observed? Since commit
2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in
blk_mq_tagset_busy_iter") the request reference count is increased
before the timeout handler (blk_mq_check_expired()) is called. Do you
agree that since then it is no longer possible that
blk_mq_start_request() is called while blk_mq_check_expired() is in
progress?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
@ 2022-08-29 3:25 gumi
2022-08-29 3:30 ` Bart Van Assche
0 siblings, 1 reply; 11+ messages in thread
From: gumi @ 2022-08-29 3:25 UTC (permalink / raw)
To: 'Bart Van Assche', axboe, damien.lemoal; +Cc: linux-block
Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
On 8/25/22 00:09, Gu Mi wrote:
> The problem occurs in two async processes, One is when a new IO calls
> the blk_mq_start_request() interface to start sending,The other is
> that the block layer timer process calls the blk_mq_req_expired
> interface to check whether there is an IO timeout.
>
> When an instruction out of sequence occurs between blk_add_timer and
> WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in the interface
> blk_mq_start_request,at this time, the block timer is checking the new
> IO timeout, Since the req status has been set to MQ_RQ_IN_FLIGHT and
> req->deadline is 0 at this time, the new IO will be misjudged as a
> timeout.
>
> Our repair plan is for the deadline to be 0, and we do not think that
> a timeout occurs. At the same time, because the jiffies of the 32-bit
> system will be reversed shortly after the system is turned on, we will
> add 1 jiffies to the deadline at this time.
Hi Gu,
With which kernel version has this race been observed? Since commit
2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in
blk_mq_tagset_busy_iter") the request reference count is increased before the timeout handler (blk_mq_check_expired()) is called. Do you agree that since then it is no longer possible that
blk_mq_start_request() is called while blk_mq_check_expired() is in progress?
Thanks,
Bart.
---
Hi Bart,
This problem occurs on kernel version 5.10, and i read this commit you mentioned. The problem I observed is not a problem of req re-used fixed by commit 2e315dc07df0, but
a different problem. The specific scene is this: A new IO has called blk_mq_start_request() to start sending, and an instruction out of sequence occurs between blk_add_timer() and
WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in blk_mq_start_request(), so the req->state is set to MQ_RQ_IN_FLIGHT, but req->deadline still 0, and at this very moment, timeout handler(blk_mq_check_expired())
check if this new IO times out, this condition(if (time_after_eq(jiffies, deadline)) in blk_mq_req_expired() called by blk_mq_check_expired()) will is true. The end result is that this new IO is considered to have timed out.
I looked at the latest kernel code and the problem persists, do you agree with my analysis process?
Thanks,
Gu Mi.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
2022-08-29 3:25 [PATCH v2] block: I/O error occurs during SATA disk stress test gumi
@ 2022-08-29 3:30 ` Bart Van Assche
0 siblings, 0 replies; 11+ messages in thread
From: Bart Van Assche @ 2022-08-29 3:30 UTC (permalink / raw)
To: gumi, axboe, damien.lemoal; +Cc: linux-block
On 8/28/22 20:25, gumi@linux.alibaba.com wrote:
> This problem occurs on kernel version 5.10, and i read this commit
> you mentioned. The problem I observed is not a problem of req re-used
> fixed by commit 2e315dc07df0, but a different problem. The specific
> scene is this: A new IO has called blk_mq_start_request() to start
> sending, and an instruction out of sequence occurs between
> blk_add_timer() and WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in
> blk_mq_start_request(), so the req->state is set to MQ_RQ_IN_FLIGHT,
> but req->deadline still 0, and at this very moment, timeout
> handler(blk_mq_check_expired()) check if this new IO times out, this
> condition(if (time_after_eq(jiffies, deadline)) in
> blk_mq_req_expired() called by blk_mq_check_expired()) will is true.
> The end result is that this new IO is considered to have timed out. I
> looked at the latest kernel code and the problem persists, do you
> agree with my analysis process?
It seems unlikely to me that the above analysis is correct. If this
problem would occur with recent kernel versions, I think that it would
already have been reported by other Linux users.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
@ 2022-08-29 4:07 gumi
0 siblings, 0 replies; 11+ messages in thread
From: gumi @ 2022-08-29 4:07 UTC (permalink / raw)
To: 'Bart Van Assche', axboe, damien.lemoal; +Cc: linux-block
On 8/28/22 20:25, gumi@linux.alibaba.com wrote:
> This problem occurs on kernel version 5.10, and i read this commit you
> mentioned. The problem I observed is not a problem of req re-used
> fixed by commit 2e315dc07df0, but a different problem. The specific
> scene is this: A new IO has called blk_mq_start_request() to start
> sending, and an instruction out of sequence occurs between
> blk_add_timer() and WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in
> blk_mq_start_request(), so the req->state is set to MQ_RQ_IN_FLIGHT,
> but req->deadline still 0, and at this very moment, timeout
> handler(blk_mq_check_expired()) check if this new IO times out, this
> condition(if (time_after_eq(jiffies, deadline)) in
> blk_mq_req_expired() called by blk_mq_check_expired()) will is true.
> The end result is that this new IO is considered to have timed out. I
> looked at the latest kernel code and the problem persists, do you
> agree with my analysis process?
It seems unlikely to me that the above analysis is correct. If this problem would occur with recent kernel versions, I think that it would already have been reported by other Linux users.
Thanks,
Bart.
---
Hi Bart,
First of all, this problem is not easy to be tested. We have tested it hundreds of times, and the recurrence is low probability. Basically, it needs to run continuously for 15-20 days under the high pressure test.
After analyzing the reason, we modified it according to the method of V1, and continued to test for more than 30 days, and this problem did not appear. Finally I decided to submit this patch to upstream.
According to our code analysis, this problem scenario is one that may occur under polar probability, please re-examine my analysis process.
Thanks,
Gu Mi
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] block: I/O error occurs during SATA disk stress test
@ 2022-08-29 4:11 gumi
0 siblings, 0 replies; 11+ messages in thread
From: gumi @ 2022-08-29 4:11 UTC (permalink / raw)
To: 'Jens Axboe', 'Bart Van Assche', damien.lemoal
Cc: linux-block
On 8/25/22 9:25 PM, gumi@linux.alibaba.com wrote:
> On 8/25/22 00:09, Gu Mi wrote:
>> The problem occurs in two async processes, One is when a new IO calls
>> the blk_mq_start_request() interface to start sending,The other is
>> that the block layer timer process calls the blk_mq_req_expired
>> interface to check whether there is an IO timeout.
>>
>> When an instruction out of sequence occurs between blk_add_timer and
>> WRITE_ONCE(rq->state,MQ_RQ_IN_FLIGHT) in the interface
>> blk_mq_start_request,at this time, the block timer is checking the
>> new IO timeout, Since the req status has been set to MQ_RQ_IN_FLIGHT
>> and
>> req->deadline is 0 at this time, the new IO will be misjudged as a
>> timeout.
>>
>> Our repair plan is for the deadline to be 0, and we do not think that
>> a timeout occurs. At the same time, because the jiffies of the 32-bit
>> system will be reversed shortly after the system is turned on, we
>> will add 1 jiffies to the deadline at this time.
>>
>> Signed-off-by: Gu Mi <gumi@linux.alibaba.com>
>> ---
>> v1->v2:
>>
>> time_after_eq() can handle the overflow, so remove the change on
>> 32-bit in blk_add_timer().
>>
>> block/blk-mq.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c index 4b90d2d..6defaa1
>> 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -1451,6 +1451,8 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
>> return false;
>>
>> deadline = READ_ONCE(rq->deadline);
>> + if (unlikely(deadline == 0))
>> + return false;
>> if (time_after_eq(jiffies, deadline))
>> return true;
>>
>
> rq->deadline == 0 can be a valid deadline value so the above patch
> doesn't look right to me.
Gu, you need to fix your quoting of emails, these are impossible to read.
That aside, I think there's a misunderstanding here. v1 has some parts and v2 has others. Please post a v3 that has the hunk that guarantees that deadline always has the lowest bit set if assigned, and the !deadline check as well.
--
Jens Axboe
---
Hi Jens
Thanks for your reminder, I I will post a v3 patch later, my v3 patch and V1 patch are the same, please review again
Thanks,
Gu Mi
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-08-29 4:11 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-29 3:25 [PATCH v2] block: I/O error occurs during SATA disk stress test gumi
2022-08-29 3:30 ` Bart Van Assche
-- strict thread matches above, loose matches on Subject: below --
2022-08-29 4:11 gumi
2022-08-29 4:07 gumi
2022-08-26 3:25 gumi
2022-08-26 13:36 ` Jens Axboe
2022-08-26 16:06 ` Bart Van Assche
2022-08-26 19:40 ` Jens Axboe
2022-08-25 7:09 Gu Mi
2022-08-25 17:18 ` Bart Van Assche
2022-08-26 21:15 ` Bart Van Assche
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).