* [OOPS] elevator private data for REQ_FLUSH
@ 2011-03-25 15:15 Sergey Senozhatsky
2011-03-25 15:22 ` Markus Trippelsdorf
0 siblings, 1 reply; 38+ messages in thread
From: Sergey Senozhatsky @ 2011-03-25 15:15 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel, Mike Snitzer
Hello,
Commit
9d5a4e946ce5352f19400b6370f4cd8e72806278
block: skip elevator data initialization for flush requests
Skip elevator initialization for flush requests by passing priv=0 to
blk_alloc_request() in get_request(). As such elv_set_request() is
never called for flush requests.
introduced priv flag, to skip elevator_private data init for FLUSH requests.
This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request,
which requires elevator_private to be set:
1 [ 78.982169] Call Trace:
2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d
3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122
4 [ 78.982189] [<ffffffff81218c4a>] elv_insert+0x212/0x265
5 [ 78.982192] [<ffffffff81218ced>] __elv_add_request+0x50/0x52
6 [ 78.982195] [<ffffffff8121b44a>] flush_plug_list+0xce/0x12f
7 [ 78.982199] [<ffffffff8121b4c0>] __blk_flush_plug+0x15/0x21
8 [ 78.982205] [<ffffffff8146b321>] schedule+0x43e/0xbea
9 [ 78.982211] [<ffffffff8106f8bd>] ? __lock_acquire+0x149d/0x1576
10 [ 78.982215] [<ffffffff8121ba9e>] ? drive_stat_acct+0x1b6/0x1c3
11 [ 78.982218] [<ffffffff8121b92c>] ? drive_stat_acct+0x44/0x1c3
12 [ 78.982223] [<ffffffff8121f641>] ? __make_request+0x268/0x2bf
13 [ 78.982226] [<ffffffff8146bf66>] schedule_timeout+0x35/0x3b8
14 [ 78.982230] [<ffffffff810705ed>] ? mark_held_locks+0x4b/0x6d
15 [ 78.982234] [<ffffffff8146e768>] ? _raw_spin_unlock_irq+0x28/0x56
16 [ 78.982239] [<ffffffff81033bc1>] ? get_parent_ip+0xe/0x3e
17 [ 78.982244] [<ffffffff81471822>] ? sub_preempt_count+0x90/0xa3
18 [ 78.982247] [<ffffffff8146acbb>] wait_for_common+0xc3/0x141
19 [ 78.982251] [<ffffffff8103823a>] ? default_wake_function+0x0/0xf
20 [ 78.982254] [<ffffffff8146ad51>] wait_for_completion+0x18/0x1a
21 [ 78.982258] [<ffffffff8122087b>] blkdev_issue_flush+0xcb/0x11a
22 [ 78.982264] [<ffffffff811a9d65>] ext4_sync_file+0x2b3/0x302
23 [ 78.982268] [<ffffffff81129e25>] vfs_fsync_range+0x55/0x75
24 [ 78.982271] [<ffffffff81129e84>] generic_write_sync+0x3f/0x41
25 [ 78.982278] [<ffffffff810c6c6c>] generic_file_aio_write+0x8c/0xb9
26 [ 78.982281] [<ffffffff811a99a9>] ext4_file_write+0x1dc/0x237
27 [ 78.982285] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122
28 [ 78.982288] [<ffffffff811a97cd>] ? ext4_file_write+0x0/0x237
29 [ 78.982292] [<ffffffff81104859>] do_sync_readv_writev+0xb4/0xf9
30 [ 78.982298] [<ffffffff8120af11>] ? security_file_permission+0x1e/0x84
31 [ 78.982302] [<ffffffff8110418e>] ? rw_verify_area+0xab/0xc8
32 [ 78.982305] [<ffffffff81104ac6>] do_readv_writev+0xb8/0x17d
33 [ 78.982309] [<ffffffff81105b5e>] ? fget_light+0x166/0x30b
34 [ 78.982312] [<ffffffff81104bcb>] vfs_writev+0x40/0x42
35 [ 78.982315] [<ffffffff81104e1c>] sys_pwritev+0x55/0x99
36 [ 78.982320] [<ffffffff81474a52>] system_call_fastpath+0x16/0x1b
37
Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests
(like below)?
---
block/elevator.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/block/elevator.c b/block/elevator.c
index c387d31..b17e577 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where)
q->end_sector = rq_end_sector(rq);
q->boundary_rq = rq;
}
+ } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) {
+ where = ELEVATOR_INSERT_FLUSH;
} else if (!(rq->cmd_flags & REQ_ELVPRIV) &&
where == ELEVATOR_INSERT_SORT)
where = ELEVATOR_INSERT_BACK;
^ permalink raw reply related [flat|nested] 38+ messages in thread* Re: [OOPS] elevator private data for REQ_FLUSH 2011-03-25 15:15 [OOPS] elevator private data for REQ_FLUSH Sergey Senozhatsky @ 2011-03-25 15:22 ` Markus Trippelsdorf 2011-03-25 15:40 ` Mike Snitzer 2011-03-25 15:57 ` [OOPS] elevator private data for REQ_FLUSH Jens Axboe 0 siblings, 2 replies; 38+ messages in thread From: Markus Trippelsdorf @ 2011-03-25 15:22 UTC (permalink / raw) To: Sergey Senozhatsky; +Cc: Jens Axboe, linux-kernel, Mike Snitzer, Chris Mason On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: > Hello, > > Commit > 9d5a4e946ce5352f19400b6370f4cd8e72806278 > block: skip elevator data initialization for flush requests > > Skip elevator initialization for flush requests by passing priv=0 to > blk_alloc_request() in get_request(). As such elv_set_request() is > never called for flush requests. > > introduced priv flag, to skip elevator_private data init for FLUSH requests. > This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, > which requires elevator_private to be set: > > 1 [ 78.982169] Call Trace: > 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d > 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 > Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests > (like below)? > > --- > > block/elevator.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/block/elevator.c b/block/elevator.c > index c387d31..b17e577 100644 > --- a/block/elevator.c > +++ b/block/elevator.c > @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) > q->end_sector = rq_end_sector(rq); > q->boundary_rq = rq; > } > + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { > + where = ELEVATOR_INSERT_FLUSH; > } else if (!(rq->cmd_flags & REQ_ELVPRIV) && > where == ELEVATOR_INSERT_SORT) > where = ELEVATOR_INSERT_BACK; Thanks. That solves all (corruption-) problems that I reported earlier in an other thread. -- Markus ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: elevator private data for REQ_FLUSH 2011-03-25 15:22 ` Markus Trippelsdorf @ 2011-03-25 15:40 ` Mike Snitzer 2011-03-25 15:50 ` Jens Axboe 2011-03-25 15:57 ` [OOPS] elevator private data for REQ_FLUSH Jens Axboe 1 sibling, 1 reply; 38+ messages in thread From: Mike Snitzer @ 2011-03-25 15:40 UTC (permalink / raw) To: Markus Trippelsdorf Cc: Sergey Senozhatsky, Jens Axboe, linux-kernel, Chris Mason, tj, Vivek Goyal, Jeff Moyer On Fri, Mar 25 2011 at 11:22am -0400, Markus Trippelsdorf <markus@trippelsdorf.de> wrote: > On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: > > Hello, > > > > Commit > > 9d5a4e946ce5352f19400b6370f4cd8e72806278 > > block: skip elevator data initialization for flush requests > > > > Skip elevator initialization for flush requests by passing priv=0 to > > blk_alloc_request() in get_request(). As such elv_set_request() is > > never called for flush requests. > > > > introduced priv flag, to skip elevator_private data init for FLUSH requests. > > This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, > > which requires elevator_private to be set: > > > > 1 [ 78.982169] Call Trace: > > 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d > > 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 > > > Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests > > (like below)? > > > > --- > > > > block/elevator.c | 2 ++ > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > diff --git a/block/elevator.c b/block/elevator.c > > index c387d31..b17e577 100644 > > --- a/block/elevator.c > > +++ b/block/elevator.c > > @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) > > q->end_sector = rq_end_sector(rq); > > q->boundary_rq = rq; > > } > > + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { > > + where = ELEVATOR_INSERT_FLUSH; > > } else if (!(rq->cmd_flags & REQ_ELVPRIV) && > > where == ELEVATOR_INSERT_SORT) > > where = ELEVATOR_INSERT_BACK; > > Thanks. That solves all (corruption-) problems that I reported earlier in an other > thread. So the flush-merge changes introduced ELEVATOR_INSERT_FLUSH (via commit ae1b1539). And the flush bio will now get ELEVATOR_INSERT_FLUSH in __make_request(). So it is interesting that the flush is getting inserted in the elevator at all. AFAIK that shouldn't be (and historically hasn't been) the case. Combination of onstack plug changes? 1 [ 78.982169] Call Trace: 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 4 [ 78.982189] [<ffffffff81218c4a>] elv_insert+0x212/0x265 5 [ 78.982192] [<ffffffff81218ced>] __elv_add_request+0x50/0x52 6 [ 78.982195] [<ffffffff8121b44a>] flush_plug_list+0xce/0x12f 7 [ 78.982199] [<ffffffff8121b4c0>] __blk_flush_plug+0x15/0x21 8 [ 78.982205] [<ffffffff8146b321>] schedule+0x43e/0xbea Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: elevator private data for REQ_FLUSH 2011-03-25 15:40 ` Mike Snitzer @ 2011-03-25 15:50 ` Jens Axboe 2011-03-25 18:54 ` Mike Snitzer 0 siblings, 1 reply; 38+ messages in thread From: Jens Axboe @ 2011-03-25 15:50 UTC (permalink / raw) To: Mike Snitzer Cc: Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, tj, Vivek Goyal, Jeff Moyer On 2011-03-25 16:40, Mike Snitzer wrote: > On Fri, Mar 25 2011 at 11:22am -0400, > Markus Trippelsdorf <markus@trippelsdorf.de> wrote: > >> On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: >>> Hello, >>> >>> Commit >>> 9d5a4e946ce5352f19400b6370f4cd8e72806278 >>> block: skip elevator data initialization for flush requests >>> >>> Skip elevator initialization for flush requests by passing priv=0 to >>> blk_alloc_request() in get_request(). As such elv_set_request() is >>> never called for flush requests. >>> >>> introduced priv flag, to skip elevator_private data init for FLUSH requests. >>> This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, >>> which requires elevator_private to be set: >>> >>> 1 [ 78.982169] Call Trace: >>> 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d >>> 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 >> >>> Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests >>> (like below)? >>> >>> --- >>> >>> block/elevator.c | 2 ++ >>> 1 files changed, 2 insertions(+), 0 deletions(-) >>> >>> diff --git a/block/elevator.c b/block/elevator.c >>> index c387d31..b17e577 100644 >>> --- a/block/elevator.c >>> +++ b/block/elevator.c >>> @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) >>> q->end_sector = rq_end_sector(rq); >>> q->boundary_rq = rq; >>> } >>> + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { >>> + where = ELEVATOR_INSERT_FLUSH; >>> } else if (!(rq->cmd_flags & REQ_ELVPRIV) && >>> where == ELEVATOR_INSERT_SORT) >>> where = ELEVATOR_INSERT_BACK; >> >> Thanks. That solves all (corruption-) problems that I reported earlier in an other >> thread. > > So the flush-merge changes introduced ELEVATOR_INSERT_FLUSH (via commit > ae1b1539). And the flush bio will now get ELEVATOR_INSERT_FLUSH in > __make_request(). > > So it is interesting that the flush is getting inserted in the elevator > at all. AFAIK that shouldn't be (and historically hasn't been) the > case. > > Combination of onstack plug changes? It is, it forces a sort insert. I'll fix this up, I'm relieved we have a good handle on this issue now. -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: elevator private data for REQ_FLUSH 2011-03-25 15:50 ` Jens Axboe @ 2011-03-25 18:54 ` Mike Snitzer 2011-03-25 19:50 ` Jens Axboe 0 siblings, 1 reply; 38+ messages in thread From: Mike Snitzer @ 2011-03-25 18:54 UTC (permalink / raw) To: Jens Axboe Cc: Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, tj, Vivek Goyal, Jeff Moyer On Fri, Mar 25 2011 at 11:50am -0400, Jens Axboe <axboe@kernel.dk> wrote: > On 2011-03-25 16:40, Mike Snitzer wrote: > > On Fri, Mar 25 2011 at 11:22am -0400, > > Markus Trippelsdorf <markus@trippelsdorf.de> wrote: > > > >> On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: > >>> Hello, > >>> > >>> Commit > >>> 9d5a4e946ce5352f19400b6370f4cd8e72806278 > >>> block: skip elevator data initialization for flush requests > >>> > >>> Skip elevator initialization for flush requests by passing priv=0 to > >>> blk_alloc_request() in get_request(). As such elv_set_request() is > >>> never called for flush requests. > >>> > >>> introduced priv flag, to skip elevator_private data init for FLUSH requests. > >>> This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, > >>> which requires elevator_private to be set: > >>> > >>> 1 [ 78.982169] Call Trace: > >>> 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d > >>> 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 > >> > >>> Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests > >>> (like below)? > >>> > >>> --- > >>> > >>> block/elevator.c | 2 ++ > >>> 1 files changed, 2 insertions(+), 0 deletions(-) > >>> > >>> diff --git a/block/elevator.c b/block/elevator.c > >>> index c387d31..b17e577 100644 > >>> --- a/block/elevator.c > >>> +++ b/block/elevator.c > >>> @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) > >>> q->end_sector = rq_end_sector(rq); > >>> q->boundary_rq = rq; > >>> } > >>> + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { > >>> + where = ELEVATOR_INSERT_FLUSH; > >>> } else if (!(rq->cmd_flags & REQ_ELVPRIV) && > >>> where == ELEVATOR_INSERT_SORT) > >>> where = ELEVATOR_INSERT_BACK; > >> > >> Thanks. That solves all (corruption-) problems that I reported earlier in an other > >> thread. > > > > So the flush-merge changes introduced ELEVATOR_INSERT_FLUSH (via commit > > ae1b1539). And the flush bio will now get ELEVATOR_INSERT_FLUSH in > > __make_request(). > > > > So it is interesting that the flush is getting inserted in the elevator > > at all. AFAIK that shouldn't be (and historically hasn't been) the > > case. > > > > Combination of onstack plug changes? > > It is, it forces a sort insert. I'll fix this up, I'm relieved we have a > good handle on this issue now. Should we also add a safety net to avoid the potential for future silent corruption, etc? E.g.: --- block/elevator.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/block/elevator.c b/block/elevator.c index c387d31..86d258e 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -657,6 +657,9 @@ void elv_quiesce_end(struct request_queue *q) void elv_insert(struct request_queue *q, struct request *rq, int where) { + BUG_ON(rq->cmd_flags & (REQ_FLUSH | REQ_FUA) && + where != ELEVATOR_INSERT_FLUSH); + trace_block_rq_insert(q, rq); rq->q = q; ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: elevator private data for REQ_FLUSH 2011-03-25 18:54 ` Mike Snitzer @ 2011-03-25 19:50 ` Jens Axboe 2011-03-26 4:21 ` Mike Snitzer 0 siblings, 1 reply; 38+ messages in thread From: Jens Axboe @ 2011-03-25 19:50 UTC (permalink / raw) To: Mike Snitzer Cc: Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, tj, Vivek Goyal, Jeff Moyer On 2011-03-25 19:54, Mike Snitzer wrote: > On Fri, Mar 25 2011 at 11:50am -0400, > Jens Axboe <axboe@kernel.dk> wrote: > >> On 2011-03-25 16:40, Mike Snitzer wrote: >>> On Fri, Mar 25 2011 at 11:22am -0400, >>> Markus Trippelsdorf <markus@trippelsdorf.de> wrote: >>> >>>> On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: >>>>> Hello, >>>>> >>>>> Commit >>>>> 9d5a4e946ce5352f19400b6370f4cd8e72806278 >>>>> block: skip elevator data initialization for flush requests >>>>> >>>>> Skip elevator initialization for flush requests by passing priv=0 to >>>>> blk_alloc_request() in get_request(). As such elv_set_request() is >>>>> never called for flush requests. >>>>> >>>>> introduced priv flag, to skip elevator_private data init for FLUSH requests. >>>>> This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, >>>>> which requires elevator_private to be set: >>>>> >>>>> 1 [ 78.982169] Call Trace: >>>>> 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d >>>>> 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 >>>> >>>>> Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests >>>>> (like below)? >>>>> >>>>> --- >>>>> >>>>> block/elevator.c | 2 ++ >>>>> 1 files changed, 2 insertions(+), 0 deletions(-) >>>>> >>>>> diff --git a/block/elevator.c b/block/elevator.c >>>>> index c387d31..b17e577 100644 >>>>> --- a/block/elevator.c >>>>> +++ b/block/elevator.c >>>>> @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) >>>>> q->end_sector = rq_end_sector(rq); >>>>> q->boundary_rq = rq; >>>>> } >>>>> + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { >>>>> + where = ELEVATOR_INSERT_FLUSH; >>>>> } else if (!(rq->cmd_flags & REQ_ELVPRIV) && >>>>> where == ELEVATOR_INSERT_SORT) >>>>> where = ELEVATOR_INSERT_BACK; >>>> >>>> Thanks. That solves all (corruption-) problems that I reported earlier in an other >>>> thread. >>> >>> So the flush-merge changes introduced ELEVATOR_INSERT_FLUSH (via commit >>> ae1b1539). And the flush bio will now get ELEVATOR_INSERT_FLUSH in >>> __make_request(). >>> >>> So it is interesting that the flush is getting inserted in the elevator >>> at all. AFAIK that shouldn't be (and historically hasn't been) the >>> case. >>> >>> Combination of onstack plug changes? >> >> It is, it forces a sort insert. I'll fix this up, I'm relieved we have a >> good handle on this issue now. > > Should we also add a safety net to avoid the potential for future silent > corruption, etc? E.g.: Yes, I was thinking about something like that. I consider the patch merged an immediate stop gap, we need to improve this situation. It's not exactly pretty to have this sort of condition in both __make_request() and flush_plug_list(). Clearly it should be handled further down. -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: elevator private data for REQ_FLUSH 2011-03-25 19:50 ` Jens Axboe @ 2011-03-26 4:21 ` Mike Snitzer 2011-03-28 8:23 ` Tejun Heo 0 siblings, 1 reply; 38+ messages in thread From: Mike Snitzer @ 2011-03-26 4:21 UTC (permalink / raw) To: Jens Axboe Cc: Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, tj, Vivek Goyal, Jeff Moyer On Fri, Mar 25 2011 at 3:50pm -0400, Jens Axboe <axboe@kernel.dk> wrote: > On 2011-03-25 19:54, Mike Snitzer wrote: > > On Fri, Mar 25 2011 at 11:50am -0400, > > Jens Axboe <axboe@kernel.dk> wrote: > > > >> On 2011-03-25 16:40, Mike Snitzer wrote: > >>> On Fri, Mar 25 2011 at 11:22am -0400, > >>> Markus Trippelsdorf <markus@trippelsdorf.de> wrote: > >>> > >>>> On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: > >>>>> Hello, > >>>>> > >>>>> Commit > >>>>> 9d5a4e946ce5352f19400b6370f4cd8e72806278 > >>>>> block: skip elevator data initialization for flush requests > >>>>> > >>>>> Skip elevator initialization for flush requests by passing priv=0 to > >>>>> blk_alloc_request() in get_request(). As such elv_set_request() is > >>>>> never called for flush requests. > >>>>> > >>>>> introduced priv flag, to skip elevator_private data init for FLUSH requests. > >>>>> This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, > >>>>> which requires elevator_private to be set: > >>>>> > >>>>> 1 [ 78.982169] Call Trace: > >>>>> 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d > >>>>> 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 > >>>> > >>>>> Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests > >>>>> (like below)? > >>>>> > >>>>> --- > >>>>> > >>>>> block/elevator.c | 2 ++ > >>>>> 1 files changed, 2 insertions(+), 0 deletions(-) > >>>>> > >>>>> diff --git a/block/elevator.c b/block/elevator.c > >>>>> index c387d31..b17e577 100644 > >>>>> --- a/block/elevator.c > >>>>> +++ b/block/elevator.c > >>>>> @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) > >>>>> q->end_sector = rq_end_sector(rq); > >>>>> q->boundary_rq = rq; > >>>>> } > >>>>> + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { > >>>>> + where = ELEVATOR_INSERT_FLUSH; > >>>>> } else if (!(rq->cmd_flags & REQ_ELVPRIV) && > >>>>> where == ELEVATOR_INSERT_SORT) > >>>>> where = ELEVATOR_INSERT_BACK; > >>>> > >>>> Thanks. That solves all (corruption-) problems that I reported earlier in an other > >>>> thread. > >>> > >>> So the flush-merge changes introduced ELEVATOR_INSERT_FLUSH (via commit > >>> ae1b1539). And the flush bio will now get ELEVATOR_INSERT_FLUSH in > >>> __make_request(). > >>> > >>> So it is interesting that the flush is getting inserted in the elevator > >>> at all. AFAIK that shouldn't be (and historically hasn't been) the > >>> case. > >>> > >>> Combination of onstack plug changes? > >> > >> It is, it forces a sort insert. I'll fix this up, I'm relieved we have a > >> good handle on this issue now. > > > > Should we also add a safety net to avoid the potential for future silent > > corruption, etc? E.g.: > > Yes, I was thinking about something like that. I consider the patch > merged an immediate stop gap, we need to improve this situation. It's > not exactly pretty to have this sort of condition in both > __make_request() and flush_plug_list(). Clearly it should be handled > further down. OK, and btw my patch was too restrictive. blk_kick_flush() elv_insert()s a flush request with ELEVATOR_INSERT_REQUEUE. Should blk_kick_flush() process the flush request without calling elv_insert() -- like is done with open coded list_add() in blk_insert_flush()? Or should blk_insert_flush() use elv_insert() with ELEVATOR_INSERT_REQUEUE too? Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: elevator private data for REQ_FLUSH 2011-03-26 4:21 ` Mike Snitzer @ 2011-03-28 8:23 ` Tejun Heo 2011-03-28 22:15 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Mike Snitzer 0 siblings, 1 reply; 38+ messages in thread From: Tejun Heo @ 2011-03-28 8:23 UTC (permalink / raw) To: Mike Snitzer Cc: Jens Axboe, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal, Jeff Moyer Hey, On Sat, Mar 26, 2011 at 12:21:56AM -0400, Mike Snitzer wrote: > > Yes, I was thinking about something like that. I consider the patch > > merged an immediate stop gap, we need to improve this situation. It's > > not exactly pretty to have this sort of condition in both > > __make_request() and flush_plug_list(). Clearly it should be handled > > further down. > > OK, and btw my patch was too restrictive. blk_kick_flush() > elv_insert()s a flush request with ELEVATOR_INSERT_REQUEUE. > > Should blk_kick_flush() process the flush request without calling > elv_insert() -- like is done with open coded list_add() in > blk_insert_flush()? > > Or should blk_insert_flush() use elv_insert() with > ELEVATOR_INSERT_REQUEUE too? Hmmm... I would prefer the latter. Given that INSERT_REQUEUE and FRONT are no longer different, it would probably be better to use FRONT tho. The only reason REQUEUE is used there is to avoid kicking the queue from elv_insert(), which is gone now. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) 2011-03-28 8:23 ` Tejun Heo @ 2011-03-28 22:15 ` Mike Snitzer 2011-03-29 11:56 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe 2011-03-29 14:13 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Jeff Moyer 0 siblings, 2 replies; 38+ messages in thread From: Mike Snitzer @ 2011-03-28 22:15 UTC (permalink / raw) To: Tejun Heo, Jens Axboe Cc: Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal, Jeff Moyer On Mon, Mar 28 2011 at 4:23am -0400, Tejun Heo <tj@kernel.org> wrote: > On Sat, Mar 26, 2011 at 12:21:56AM -0400, Mike Snitzer wrote: > > Should blk_kick_flush() process the flush request without calling > > elv_insert() -- like is done with open coded list_add() in > > blk_insert_flush()? > > > > Or should blk_insert_flush() use elv_insert() with > > ELEVATOR_INSERT_REQUEUE too? > > Hmmm... I would prefer the latter. Given that INSERT_REQUEUE and > FRONT are no longer different, it would probably be better to use > FRONT tho. The only reason REQUEUE is used there is to avoid kicking > the queue from elv_insert(), which is gone now. OK, I came up with the following patch. Jens, this is just a natural cleanup given the code that resulted from the flush-merge and onstack plugging changes coming together. From: Mike Snitzer <snitzer@redhat.com> Subject: block: eliminate ELEVATOR_INSERT_REQUEUE elv_insert() no longer has a need to differentiate between ELEVATOR_INSERT_REQUEUE and ELEVATOR_INSERT_FRONT. The onstack plugging changes eliminated the need to avoid unplugging the queue (via ELEVATOR_INSERT_REQUEUE). Also, in blk_insert_flush(), use elv_insert() with ELEVATOR_INSERT_FRONT rather than open-coding the equivalent. Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- block/blk-flush.c | 4 ++-- block/elevator.c | 3 +-- include/linux/elevator.h | 5 ++--- 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/block/blk-flush.c b/block/blk-flush.c index 93d5fd8..2c43044 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -261,7 +261,7 @@ static bool blk_kick_flush(struct request_queue *q) q->flush_rq.end_io = flush_end_io; q->flush_pending_idx ^= 1; - elv_insert(q, &q->flush_rq, ELEVATOR_INSERT_REQUEUE); + elv_insert(q, &q->flush_rq, ELEVATOR_INSERT_FRONT); return true; } @@ -312,7 +312,7 @@ void blk_insert_flush(struct request *rq) */ if ((policy & REQ_FSEQ_DATA) && !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) { - list_add(&rq->queuelist, &q->queue_head); + elv_insert(q, rq, ELEVATOR_INSERT_FRONT); return; } diff --git a/block/elevator.c b/block/elevator.c index c387d31..df7c4ba 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -610,7 +610,7 @@ void elv_requeue_request(struct request_queue *q, struct request *rq) rq->cmd_flags &= ~REQ_STARTED; - elv_insert(q, rq, ELEVATOR_INSERT_REQUEUE); + elv_insert(q, rq, ELEVATOR_INSERT_FRONT); } void elv_drain_elevator(struct request_queue *q) @@ -662,7 +662,6 @@ void elv_insert(struct request_queue *q, struct request *rq, int where) rq->q = q; switch (where) { - case ELEVATOR_INSERT_REQUEUE: case ELEVATOR_INSERT_FRONT: rq->cmd_flags |= REQ_SOFTBARRIER; list_add(&rq->queuelist, &q->queue_head); diff --git a/include/linux/elevator.h b/include/linux/elevator.h index d93efcc445..0b4a354 100644 --- a/include/linux/elevator.h +++ b/include/linux/elevator.h @@ -164,9 +164,8 @@ extern struct request *elv_rb_find(struct rb_root *, sector_t); #define ELEVATOR_INSERT_FRONT 1 #define ELEVATOR_INSERT_BACK 2 #define ELEVATOR_INSERT_SORT 3 -#define ELEVATOR_INSERT_REQUEUE 4 -#define ELEVATOR_INSERT_FLUSH 5 -#define ELEVATOR_INSERT_SORT_MERGE 6 +#define ELEVATOR_INSERT_FLUSH 4 +#define ELEVATOR_INSERT_SORT_MERGE 5 /* * return values from elevator_may_queue_fn ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-28 22:15 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Mike Snitzer @ 2011-03-29 11:56 ` Jens Axboe 2011-03-29 14:18 ` Mike Snitzer 2011-03-29 18:25 ` [PATCH] " Christoph Hellwig 2011-03-29 14:13 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Jeff Moyer 1 sibling, 2 replies; 38+ messages in thread From: Jens Axboe @ 2011-03-29 11:56 UTC (permalink / raw) To: Mike Snitzer Cc: Tejun Heo, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal, Jeff Moyer On 2011-03-29 00:15, Mike Snitzer wrote: > On Mon, Mar 28 2011 at 4:23am -0400, > Tejun Heo <tj@kernel.org> wrote: > >> On Sat, Mar 26, 2011 at 12:21:56AM -0400, Mike Snitzer wrote: >>> Should blk_kick_flush() process the flush request without calling >>> elv_insert() -- like is done with open coded list_add() in >>> blk_insert_flush()? >>> >>> Or should blk_insert_flush() use elv_insert() with >>> ELEVATOR_INSERT_REQUEUE too? >> >> Hmmm... I would prefer the latter. Given that INSERT_REQUEUE and >> FRONT are no longer different, it would probably be better to use >> FRONT tho. The only reason REQUEUE is used there is to avoid kicking >> the queue from elv_insert(), which is gone now. > > OK, I came up with the following patch. > > Jens, this is just a natural cleanup given the code that resulted from > the flush-merge and onstack plugging changes coming together. That looks nice and clean. What kind of testing has been done? -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-29 11:56 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe @ 2011-03-29 14:18 ` Mike Snitzer 2011-03-29 18:25 ` [PATCH] " Christoph Hellwig 1 sibling, 0 replies; 38+ messages in thread From: Mike Snitzer @ 2011-03-29 14:18 UTC (permalink / raw) To: Jens Axboe Cc: Tejun Heo, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal, Jeff Moyer On Tue, Mar 29 2011 at 7:56am -0400, Jens Axboe <axboe@kernel.dk> wrote: > On 2011-03-29 00:15, Mike Snitzer wrote: > > On Mon, Mar 28 2011 at 4:23am -0400, > > Tejun Heo <tj@kernel.org> wrote: > > > >> On Sat, Mar 26, 2011 at 12:21:56AM -0400, Mike Snitzer wrote: > >>> Should blk_kick_flush() process the flush request without calling > >>> elv_insert() -- like is done with open coded list_add() in > >>> blk_insert_flush()? > >>> > >>> Or should blk_insert_flush() use elv_insert() with > >>> ELEVATOR_INSERT_REQUEUE too? > >> > >> Hmmm... I would prefer the latter. Given that INSERT_REQUEUE and > >> FRONT are no longer different, it would probably be better to use > >> FRONT tho. The only reason REQUEUE is used there is to avoid kicking > >> the queue from elv_insert(), which is gone now. > > > > OK, I came up with the following patch. > > > > Jens, this is just a natural cleanup given the code that resulted from > > the flush-merge and onstack plugging changes coming together. > > That looks nice and clean. What kind of testing has been done? I successfully tested it with that fsync-heavy ffsb workload (xfs on mpath device). Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-29 11:56 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe 2011-03-29 14:18 ` Mike Snitzer @ 2011-03-29 18:25 ` Christoph Hellwig 2011-03-30 7:42 ` Tejun Heo 2011-03-30 7:53 ` Jens Axboe 1 sibling, 2 replies; 38+ messages in thread From: Christoph Hellwig @ 2011-03-29 18:25 UTC (permalink / raw) To: Jens Axboe Cc: Mike Snitzer, Tejun Heo, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal, Jeff Moyer Btw, is there any need to actually keep the elv_insert interface? Itw has two callers in current mainline, two of them hardcode ELEVATOR_INSERT_REQUEUE as the reason and could opencode it, and the other cases could be merged into __elv_add_request. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-29 18:25 ` [PATCH] " Christoph Hellwig @ 2011-03-30 7:42 ` Tejun Heo 2011-03-30 7:53 ` Jens Axboe 1 sibling, 0 replies; 38+ messages in thread From: Tejun Heo @ 2011-03-30 7:42 UTC (permalink / raw) To: Christoph Hellwig Cc: Jens Axboe, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal, Jeff Moyer On Tue, Mar 29, 2011 at 02:25:55PM -0400, Christoph Hellwig wrote: > Btw, is there any need to actually keep the elv_insert interface? > > Itw has two callers in current mainline, two of them hardcode > ELEVATOR_INSERT_REQUEUE as the reason and could opencode it, and > the other cases could be merged into __elv_add_request. Agreed, at this point, the function seems a bit pointless. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-29 18:25 ` [PATCH] " Christoph Hellwig 2011-03-30 7:42 ` Tejun Heo @ 2011-03-30 7:53 ` Jens Axboe 1 sibling, 0 replies; 38+ messages in thread From: Jens Axboe @ 2011-03-30 7:53 UTC (permalink / raw) To: Christoph Hellwig Cc: Mike Snitzer, Tejun Heo, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal, Jeff Moyer On 2011-03-29 20:25, Christoph Hellwig wrote: > Btw, is there any need to actually keep the elv_insert interface? > > Itw has two callers in current mainline, two of them hardcode > ELEVATOR_INSERT_REQUEUE as the reason and could opencode it, and > the other cases could be merged into __elv_add_request. Agree, I merged the two and killed elv_insert(). -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) 2011-03-28 22:15 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Mike Snitzer 2011-03-29 11:56 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe @ 2011-03-29 14:13 ` Jeff Moyer 2011-03-29 17:54 ` Vivek Goyal 1 sibling, 1 reply; 38+ messages in thread From: Jeff Moyer @ 2011-03-29 14:13 UTC (permalink / raw) To: Mike Snitzer Cc: Tejun Heo, Jens Axboe, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, Vivek Goyal Mike Snitzer <snitzer@redhat.com> writes: > OK, I came up with the following patch. > > Jens, this is just a natural cleanup given the code that resulted from > the flush-merge and onstack plugging changes coming together. > > > From: Mike Snitzer <snitzer@redhat.com> > Subject: block: eliminate ELEVATOR_INSERT_REQUEUE > > elv_insert() no longer has a need to differentiate between > ELEVATOR_INSERT_REQUEUE and ELEVATOR_INSERT_FRONT. The onstack plugging > changes eliminated the need to avoid unplugging the queue (via > ELEVATOR_INSERT_REQUEUE). > > Also, in blk_insert_flush(), use elv_insert() with ELEVATOR_INSERT_FRONT > rather than open-coding the equivalent. What you change by doing the call to elv_insert is that now the request will have REQ_SOFTBARRIER set. I don't think that affects anything, though (I checked). The rest looks pretty straight-forward. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) 2011-03-29 14:13 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Jeff Moyer @ 2011-03-29 17:54 ` Vivek Goyal 2011-03-30 7:41 ` Tejun Heo 0 siblings, 1 reply; 38+ messages in thread From: Vivek Goyal @ 2011-03-29 17:54 UTC (permalink / raw) To: Jeff Moyer Cc: Mike Snitzer, Tejun Heo, Jens Axboe, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason On Tue, Mar 29, 2011 at 10:13:05AM -0400, Jeff Moyer wrote: > Mike Snitzer <snitzer@redhat.com> writes: > > > OK, I came up with the following patch. > > > > Jens, this is just a natural cleanup given the code that resulted from > > the flush-merge and onstack plugging changes coming together. > > > > > > From: Mike Snitzer <snitzer@redhat.com> > > Subject: block: eliminate ELEVATOR_INSERT_REQUEUE > > > > elv_insert() no longer has a need to differentiate between > > ELEVATOR_INSERT_REQUEUE and ELEVATOR_INSERT_FRONT. The onstack plugging > > changes eliminated the need to avoid unplugging the queue (via > > ELEVATOR_INSERT_REQUEUE). > > > > Also, in blk_insert_flush(), use elv_insert() with ELEVATOR_INSERT_FRONT > > rather than open-coding the equivalent. > > What you change by doing the call to elv_insert is that now the request > will have REQ_SOFTBARRIER set. I don't think that affects anything, > though (I checked). The rest looks pretty straight-forward. What is the significance of REQ_SOFTBARRIER? Why it should be set for all INSERT_FRONT and requeue requests. With flush logic we got rid of all hardbarriers and hence there is no notion of ordering as such. One has to wait for request to finish to enforce ordering. Should we get rid of softbarriers too? Thanks Vivek ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) 2011-03-29 17:54 ` Vivek Goyal @ 2011-03-30 7:41 ` Tejun Heo 2011-03-30 7:57 ` Tejun Heo 0 siblings, 1 reply; 38+ messages in thread From: Tejun Heo @ 2011-03-30 7:41 UTC (permalink / raw) To: Vivek Goyal Cc: Jeff Moyer, Mike Snitzer, Jens Axboe, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason On Tue, Mar 29, 2011 at 01:54:58PM -0400, Vivek Goyal wrote: > What is the significance of REQ_SOFTBARRIER? Why it should be set for all > INSERT_FRONT and requeue requests. > > With flush logic we got rid of all hardbarriers and hence there is no > notion of ordering as such. One has to wait for request to finish to > enforce ordering. > > Should we get rid of softbarriers too? REQ_SOFTBARRIER makes sure that other requests don't pass the barrier ones when dispatched from the elevator to the dispatch queue. When the request is being inserted explicitly at front, dispatching other requests in front of it is a bit, umm, weird. IIRC, at least in the requeue path, some drivers depend on front queueing for forward progress guarantee. Forward progress for prepping is guaranteed by mempool (or something like that) and when the request is retried, it should stay at the front of the queue; otherwise, prepping can stall with prepped requests stuck behind unprepped ones. Although flush requests don't need REQ_SOFTBARRIER anymore, I don't think removing it would bring much benefit either, so no reason to change it just for that reason. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) 2011-03-30 7:41 ` Tejun Heo @ 2011-03-30 7:57 ` Tejun Heo 2011-03-30 7:59 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe 0 siblings, 1 reply; 38+ messages in thread From: Tejun Heo @ 2011-03-30 7:57 UTC (permalink / raw) To: Vivek Goyal Cc: Jeff Moyer, Mike Snitzer, Jens Axboe, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason On Wed, Mar 30, 2011 at 09:41:23AM +0200, Tejun Heo wrote: > IIRC, at least in the requeue path, some drivers depend on front > queueing for forward progress guarantee. Forward progress for > prepping is guaranteed by mempool (or something like that) and when > the request is retried, it should stay at the front of the queue; > otherwise, prepping can stall with prepped requests stuck behind > unprepped ones. After writing the above, I think the current flush implementation actually is violating the above. I think the problem is over-use of front insertion. Flush can just append to the dispatch queue. Front insertion should only be used by requeueing, really. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 7:57 ` Tejun Heo @ 2011-03-30 7:59 ` Jens Axboe 2011-03-30 8:02 ` Tejun Heo 0 siblings, 1 reply; 38+ messages in thread From: Jens Axboe @ 2011-03-30 7:59 UTC (permalink / raw) To: Tejun Heo Cc: Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason On 2011-03-30 09:57, Tejun Heo wrote: > On Wed, Mar 30, 2011 at 09:41:23AM +0200, Tejun Heo wrote: >> IIRC, at least in the requeue path, some drivers depend on front >> queueing for forward progress guarantee. Forward progress for >> prepping is guaranteed by mempool (or something like that) and when >> the request is retried, it should stay at the front of the queue; >> otherwise, prepping can stall with prepped requests stuck behind >> unprepped ones. > > After writing the above, I think the current flush implementation > actually is violating the above. I think the problem is over-use of > front insertion. Flush can just append to the dispatch queue. Front > insertion should only be used by requeueing, really. Pure front insert should be used for requeue and internal commands (like spin up this drive, or get error information). Flush should append to the dispatch list. -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 7:59 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe @ 2011-03-30 8:02 ` Tejun Heo 2011-03-30 10:16 ` Jens Axboe ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Tejun Heo @ 2011-03-30 8:02 UTC (permalink / raw) To: Jens Axboe Cc: Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason Hello, Jens. On Wed, Mar 30, 2011 at 09:59:09AM +0200, Jens Axboe wrote: > Pure front insert should be used for requeue and internal commands (like > spin up this drive, or get error information). Flush should append to > the dispatch list. Yeah, right. The reason I used REQUEUE/FRONT was because BACK insertion involves draining the elevator and then appending the request at the end of the dispatch queue, which is unnecessary and inefficient. So, front insertion was a quick work around that. If we're removing elv_insert(), we can just append directly to the dispatch queue from flush code. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 8:02 ` Tejun Heo @ 2011-03-30 10:16 ` Jens Axboe 2011-03-30 11:20 ` Tejun Heo 2011-03-30 11:21 ` Sergey Senozhatsky 2011-03-30 13:46 ` Vivek Goyal 2011-03-30 14:01 ` Mike Snitzer 2 siblings, 2 replies; 38+ messages in thread From: Jens Axboe @ 2011-03-30 10:16 UTC (permalink / raw) To: Tejun Heo Cc: Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason On 2011-03-30 10:02, Tejun Heo wrote: > Hello, Jens. > > On Wed, Mar 30, 2011 at 09:59:09AM +0200, Jens Axboe wrote: >> Pure front insert should be used for requeue and internal commands (like >> spin up this drive, or get error information). Flush should append to >> the dispatch list. > > Yeah, right. The reason I used REQUEUE/FRONT was because BACK > insertion involves draining the elevator and then appending the > request at the end of the dispatch queue, which is unnecessary and > inefficient. So, front insertion was a quick work around that. If > we're removing elv_insert(), we can just append directly to the > dispatch queue from flush code. How does this look? It's really two patches, but rolled up into one for easier posting here. diff --git a/block/blk-flush.c b/block/blk-flush.c index 93d5fd8..6a16fea 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -261,7 +261,7 @@ static bool blk_kick_flush(struct request_queue *q) q->flush_rq.end_io = flush_end_io; q->flush_pending_idx ^= 1; - elv_insert(q, &q->flush_rq, ELEVATOR_INSERT_REQUEUE); + __elv_add_request(q, &q->flush_rq, ELEVATOR_INSERT_FLUSH); return true; } @@ -281,7 +281,7 @@ static void flush_data_end_io(struct request *rq, int error) * blk_insert_flush - insert a new FLUSH/FUA request * @rq: request to insert * - * To be called from elv_insert() for %ELEVATOR_INSERT_FLUSH insertions. + * To be called from __elv_add_request() for %ELEVATOR_INSERT_FLUSH insertions. * @rq is being submitted. Analyze what needs to be done and put it on the * right queue. * @@ -312,7 +312,7 @@ void blk_insert_flush(struct request *rq) */ if ((policy & REQ_FSEQ_DATA) && !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) { - list_add(&rq->queuelist, &q->queue_head); + list_add_tail(&rq->queuelist, &q->queue_head); return; } diff --git a/block/elevator.c b/block/elevator.c index c387d31..0cdb4e7 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -610,7 +610,7 @@ void elv_requeue_request(struct request_queue *q, struct request *rq) rq->cmd_flags &= ~REQ_STARTED; - elv_insert(q, rq, ELEVATOR_INSERT_REQUEUE); + __elv_add_request(q, rq, ELEVATOR_INSERT_REQUEUE); } void elv_drain_elevator(struct request_queue *q) @@ -655,12 +655,25 @@ void elv_quiesce_end(struct request_queue *q) queue_flag_clear(QUEUE_FLAG_ELVSWITCH, q); } -void elv_insert(struct request_queue *q, struct request *rq, int where) +void __elv_add_request(struct request_queue *q, struct request *rq, int where) { trace_block_rq_insert(q, rq); rq->q = q; + BUG_ON(rq->cmd_flags & REQ_ON_PLUG); + + if (rq->cmd_flags & REQ_SOFTBARRIER) { + /* barriers are scheduling boundary, update end_sector */ + if (rq->cmd_type == REQ_TYPE_FS || + (rq->cmd_flags & REQ_DISCARD)) { + q->end_sector = rq_end_sector(rq); + q->boundary_rq = rq; + } + } else if (!(rq->cmd_flags & REQ_ELVPRIV) && + where == ELEVATOR_INSERT_SORT) + where = ELEVATOR_INSERT_BACK; + switch (where) { case ELEVATOR_INSERT_REQUEUE: case ELEVATOR_INSERT_FRONT: @@ -722,24 +735,6 @@ void elv_insert(struct request_queue *q, struct request *rq, int where) BUG(); } } - -void __elv_add_request(struct request_queue *q, struct request *rq, int where) -{ - BUG_ON(rq->cmd_flags & REQ_ON_PLUG); - - if (rq->cmd_flags & REQ_SOFTBARRIER) { - /* barriers are scheduling boundary, update end_sector */ - if (rq->cmd_type == REQ_TYPE_FS || - (rq->cmd_flags & REQ_DISCARD)) { - q->end_sector = rq_end_sector(rq); - q->boundary_rq = rq; - } - } else if (!(rq->cmd_flags & REQ_ELVPRIV) && - where == ELEVATOR_INSERT_SORT) - where = ELEVATOR_INSERT_BACK; - - elv_insert(q, rq, where); -} EXPORT_SYMBOL(__elv_add_request); void elv_add_request(struct request_queue *q, struct request *rq, int where) diff --git a/include/linux/elevator.h b/include/linux/elevator.h index d93efcc445..21a8ebf 100644 --- a/include/linux/elevator.h +++ b/include/linux/elevator.h @@ -101,7 +101,6 @@ extern void elv_dispatch_sort(struct request_queue *, struct request *); extern void elv_dispatch_add_tail(struct request_queue *, struct request *); extern void elv_add_request(struct request_queue *, struct request *, int); extern void __elv_add_request(struct request_queue *, struct request *, int); -extern void elv_insert(struct request_queue *, struct request *, int); extern int elv_merge(struct request_queue *, struct request **, struct bio *); extern int elv_try_merge(struct request *, struct bio *); extern void elv_merge_requests(struct request_queue *, struct request *, -- Jens Axboe ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 10:16 ` Jens Axboe @ 2011-03-30 11:20 ` Tejun Heo 2011-03-30 11:23 ` Jens Axboe 2011-03-30 11:21 ` Sergey Senozhatsky 1 sibling, 1 reply; 38+ messages in thread From: Tejun Heo @ 2011-03-30 11:20 UTC (permalink / raw) To: Jens Axboe Cc: Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason Hello, Jens. On Wed, Mar 30, 2011 at 12:16:13PM +0200, Jens Axboe wrote: > @@ -261,7 +261,7 @@ static bool blk_kick_flush(struct request_queue *q) > q->flush_rq.end_io = flush_end_io; > > q->flush_pending_idx ^= 1; > - elv_insert(q, &q->flush_rq, ELEVATOR_INSERT_REQUEUE); > + __elv_add_request(q, &q->flush_rq, ELEVATOR_INSERT_FLUSH); Shouldn't this be list_add_tail(&rq->queuelist, &q->queue_head)? __elv_add_request(FLUSH) calls back into blk_insert_flush() which decomposes the flush request from FS into flush sequence. blk_kick_flush() is used to insert the decomposed sequence requests into the dispatch queue. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 11:20 ` Tejun Heo @ 2011-03-30 11:23 ` Jens Axboe 0 siblings, 0 replies; 38+ messages in thread From: Jens Axboe @ 2011-03-30 11:23 UTC (permalink / raw) To: Tejun Heo Cc: Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason On 2011-03-30 13:20, Tejun Heo wrote: > Hello, Jens. > > On Wed, Mar 30, 2011 at 12:16:13PM +0200, Jens Axboe wrote: >> @@ -261,7 +261,7 @@ static bool blk_kick_flush(struct request_queue *q) >> q->flush_rq.end_io = flush_end_io; >> >> q->flush_pending_idx ^= 1; >> - elv_insert(q, &q->flush_rq, ELEVATOR_INSERT_REQUEUE); >> + __elv_add_request(q, &q->flush_rq, ELEVATOR_INSERT_FLUSH); > > Shouldn't this be list_add_tail(&rq->queuelist, &q->queue_head)? > __elv_add_request(FLUSH) calls back into blk_insert_flush() which > decomposes the flush request from FS into flush sequence. > blk_kick_flush() is used to insert the decomposed sequence requests > into the dispatch queue. Mid-air collision, I just sent that addon out to Sergey who tested. Yes of course it should be a regular list_add, the above is a braino. -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 10:16 ` Jens Axboe 2011-03-30 11:20 ` Tejun Heo @ 2011-03-30 11:21 ` Sergey Senozhatsky 2011-03-30 11:22 ` Jens Axboe 1 sibling, 1 reply; 38+ messages in thread From: Sergey Senozhatsky @ 2011-03-30 11:21 UTC (permalink / raw) To: Jens Axboe Cc: Tejun Heo, Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, linux-kernel, Chris Mason [-- Attachment #1: Type: text/plain, Size: 6376 bytes --] Hello, On (03/30/11 12:16), Jens Axboe wrote: > How does this look? It's really two patches, but rolled up into one for > easier posting here. > Nope, doesn't work for me. fsck.ext4 crashed the system. __elv_add_request blk_flush_complete_seq blk_insert_flush __elv_add_request __make_request generic_make_request ?bio_alloc_bioset blkdev_issue_flush blkdev_fsync vfs_fsync [..] RIP is on blk_insert_flush + 0x54, which is one of the BUG_ONs Dump of assembler code for function blk_insert_flush: 0x00000000000004a4 <+0>: push %rbp 0x00000000000004a5 <+1>: xor %esi,%esi 0x00000000000004a7 <+3>: mov %rdi,%r8 0x00000000000004aa <+6>: mov 0x38(%rdi),%rax 0x00000000000004ae <+10>: mov %rsp,%rbp 0x00000000000004b1 <+13>: mov 0x6d4(%rax),%edx 0x00000000000004b7 <+19>: test $0x800000,%edx 0x00000000000004bd <+25>: je 0x4ee <blk_insert_flush+74> 0x00000000000004bf <+27>: mov 0x40(%rdi),%ecx 0x00000000000004c2 <+30>: xor %esi,%esi 0x00000000000004c4 <+32>: mov 0x54(%rdi),%r9d 0x00000000000004c8 <+36>: test $0x800000,%ecx 0x00000000000004ce <+42>: setne %sil 0x00000000000004d2 <+46>: mov %esi,%edi 0x00000000000004d4 <+48>: or $0x2,%edi 0x00000000000004d7 <+51>: shr $0x9,%r9d 0x00000000000004db <+55>: cmovne %edi,%esi 0x00000000000004de <+58>: test $0x10,%dh 0x00000000000004e1 <+61>: jne 0x4ee <blk_insert_flush+74> 0x00000000000004e3 <+63>: mov %esi,%edi 0x00000000000004e5 <+65>: or $0x4,%edi 0x00000000000004e8 <+68>: and $0x10,%ch 0x00000000000004eb <+71>: cmovne %edi,%esi 0x00000000000004ee <+74>: cmpq $0x0,0x148(%r8) 0x00000000000004f6 <+82>: je 0x4fa <blk_insert_flush+86> 0x00000000000004f8 <+84>: ud2a ^^^^^^^^^^^^ 0x00000000000004fa <+86>: mov 0x60(%r8),%rcx 0x00000000000004fe <+90>: test %rcx,%rcx 0x0000000000000501 <+93>: je 0x509 <blk_insert_flush+101> 0x0000000000000503 <+95>: cmp 0x68(%r8),%rcx 0x0000000000000507 <+99>: je 0x50b <blk_insert_flush+103> 0x0000000000000509 <+101>: ud2a Sergey > diff --git a/block/blk-flush.c b/block/blk-flush.c > index 93d5fd8..6a16fea 100644 > --- a/block/blk-flush.c > +++ b/block/blk-flush.c > @@ -261,7 +261,7 @@ static bool blk_kick_flush(struct request_queue *q) > q->flush_rq.end_io = flush_end_io; > > q->flush_pending_idx ^= 1; > - elv_insert(q, &q->flush_rq, ELEVATOR_INSERT_REQUEUE); > + __elv_add_request(q, &q->flush_rq, ELEVATOR_INSERT_FLUSH); > return true; > } > > @@ -281,7 +281,7 @@ static void flush_data_end_io(struct request *rq, int error) > * blk_insert_flush - insert a new FLUSH/FUA request > * @rq: request to insert > * > - * To be called from elv_insert() for %ELEVATOR_INSERT_FLUSH insertions. > + * To be called from __elv_add_request() for %ELEVATOR_INSERT_FLUSH insertions. > * @rq is being submitted. Analyze what needs to be done and put it on the > * right queue. > * > @@ -312,7 +312,7 @@ void blk_insert_flush(struct request *rq) > */ > if ((policy & REQ_FSEQ_DATA) && > !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) { > - list_add(&rq->queuelist, &q->queue_head); > + list_add_tail(&rq->queuelist, &q->queue_head); > return; > } > > diff --git a/block/elevator.c b/block/elevator.c > index c387d31..0cdb4e7 100644 > --- a/block/elevator.c > +++ b/block/elevator.c > @@ -610,7 +610,7 @@ void elv_requeue_request(struct request_queue *q, struct request *rq) > > rq->cmd_flags &= ~REQ_STARTED; > > - elv_insert(q, rq, ELEVATOR_INSERT_REQUEUE); > + __elv_add_request(q, rq, ELEVATOR_INSERT_REQUEUE); > } > > void elv_drain_elevator(struct request_queue *q) > @@ -655,12 +655,25 @@ void elv_quiesce_end(struct request_queue *q) > queue_flag_clear(QUEUE_FLAG_ELVSWITCH, q); > } > > -void elv_insert(struct request_queue *q, struct request *rq, int where) > +void __elv_add_request(struct request_queue *q, struct request *rq, int where) > { > trace_block_rq_insert(q, rq); > > rq->q = q; > > + BUG_ON(rq->cmd_flags & REQ_ON_PLUG); > + > + if (rq->cmd_flags & REQ_SOFTBARRIER) { > + /* barriers are scheduling boundary, update end_sector */ > + if (rq->cmd_type == REQ_TYPE_FS || > + (rq->cmd_flags & REQ_DISCARD)) { > + q->end_sector = rq_end_sector(rq); > + q->boundary_rq = rq; > + } > + } else if (!(rq->cmd_flags & REQ_ELVPRIV) && > + where == ELEVATOR_INSERT_SORT) > + where = ELEVATOR_INSERT_BACK; > + > switch (where) { > case ELEVATOR_INSERT_REQUEUE: > case ELEVATOR_INSERT_FRONT: > @@ -722,24 +735,6 @@ void elv_insert(struct request_queue *q, struct request *rq, int where) > BUG(); > } > } > - > -void __elv_add_request(struct request_queue *q, struct request *rq, int where) > -{ > - BUG_ON(rq->cmd_flags & REQ_ON_PLUG); > - > - if (rq->cmd_flags & REQ_SOFTBARRIER) { > - /* barriers are scheduling boundary, update end_sector */ > - if (rq->cmd_type == REQ_TYPE_FS || > - (rq->cmd_flags & REQ_DISCARD)) { > - q->end_sector = rq_end_sector(rq); > - q->boundary_rq = rq; > - } > - } else if (!(rq->cmd_flags & REQ_ELVPRIV) && > - where == ELEVATOR_INSERT_SORT) > - where = ELEVATOR_INSERT_BACK; > - > - elv_insert(q, rq, where); > -} > EXPORT_SYMBOL(__elv_add_request); > > void elv_add_request(struct request_queue *q, struct request *rq, int where) > diff --git a/include/linux/elevator.h b/include/linux/elevator.h > index d93efcc445..21a8ebf 100644 > --- a/include/linux/elevator.h > +++ b/include/linux/elevator.h > @@ -101,7 +101,6 @@ extern void elv_dispatch_sort(struct request_queue *, struct request *); > extern void elv_dispatch_add_tail(struct request_queue *, struct request *); > extern void elv_add_request(struct request_queue *, struct request *, int); > extern void __elv_add_request(struct request_queue *, struct request *, int); > -extern void elv_insert(struct request_queue *, struct request *, int); > extern int elv_merge(struct request_queue *, struct request **, struct bio *); > extern int elv_try_merge(struct request *, struct bio *); > extern void elv_merge_requests(struct request_queue *, struct request *, > > -- > Jens Axboe > [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 11:21 ` Sergey Senozhatsky @ 2011-03-30 11:22 ` Jens Axboe 2011-03-30 11:49 ` Sergey Senozhatsky 0 siblings, 1 reply; 38+ messages in thread From: Jens Axboe @ 2011-03-30 11:22 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Tejun Heo, Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, linux-kernel, Chris Mason On 2011-03-30 13:21, Sergey Senozhatsky wrote: > Hello, > > On (03/30/11 12:16), Jens Axboe wrote: >> How does this look? It's really two patches, but rolled up into one for >> easier posting here. >> > > Nope, doesn't work for me. fsck.ext4 crashed the system. > > __elv_add_request > blk_flush_complete_seq > blk_insert_flush > __elv_add_request > __make_request > generic_make_request > ?bio_alloc_bioset > blkdev_issue_flush > blkdev_fsync > vfs_fsync > [..] Ah, this addon should behave better. diff --git a/block/blk-flush.c b/block/blk-flush.c index 6a16fea..eba4a27 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -261,7 +261,7 @@ static bool blk_kick_flush(struct request_queue *q) q->flush_rq.end_io = flush_end_io; q->flush_pending_idx ^= 1; - __elv_add_request(q, &q->flush_rq, ELEVATOR_INSERT_FLUSH); + list_add_tail(&q->flush_rq.queuelist, &q->queue_head); return true; } -- Jens Axboe ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 11:22 ` Jens Axboe @ 2011-03-30 11:49 ` Sergey Senozhatsky 0 siblings, 0 replies; 38+ messages in thread From: Sergey Senozhatsky @ 2011-03-30 11:49 UTC (permalink / raw) To: Jens Axboe Cc: Tejun Heo, Vivek Goyal, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, linux-kernel, Chris Mason [-- Attachment #1: Type: text/plain, Size: 910 bytes --] On (03/30/11 13:22), Jens Axboe wrote: > > Nope, doesn't work for me. fsck.ext4 crashed the system. > > > > __elv_add_request > > blk_flush_complete_seq > > blk_insert_flush > > __elv_add_request > > __make_request > > generic_make_request > > ?bio_alloc_bioset > > blkdev_issue_flush > > blkdev_fsync > > vfs_fsync > > [..] > > Ah, this addon should behave better. > Seems to work fine for me. Sergey > diff --git a/block/blk-flush.c b/block/blk-flush.c > index 6a16fea..eba4a27 100644 > --- a/block/blk-flush.c > +++ b/block/blk-flush.c > @@ -261,7 +261,7 @@ static bool blk_kick_flush(struct request_queue *q) > q->flush_rq.end_io = flush_end_io; > > q->flush_pending_idx ^= 1; > - __elv_add_request(q, &q->flush_rq, ELEVATOR_INSERT_FLUSH); > + list_add_tail(&q->flush_rq.queuelist, &q->queue_head); > return true; > } > > > -- > Jens Axboe > [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 8:02 ` Tejun Heo 2011-03-30 10:16 ` Jens Axboe @ 2011-03-30 13:46 ` Vivek Goyal 2011-03-30 13:49 ` Tejun Heo 2011-03-30 14:01 ` Mike Snitzer 2 siblings, 1 reply; 38+ messages in thread From: Vivek Goyal @ 2011-03-30 13:46 UTC (permalink / raw) To: Tejun Heo Cc: Jens Axboe, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason On Wed, Mar 30, 2011 at 10:02:03AM +0200, Tejun Heo wrote: > Hello, Jens. > > On Wed, Mar 30, 2011 at 09:59:09AM +0200, Jens Axboe wrote: > > Pure front insert should be used for requeue and internal commands (like > > spin up this drive, or get error information). Flush should append to > > the dispatch list. > > Yeah, right. The reason I used REQUEUE/FRONT was because BACK > insertion involves draining the elevator and then appending the > request at the end of the dispatch queue, which is unnecessary and > inefficient. So, front insertion was a quick work around that. If > we're removing elv_insert(), we can just append directly to the > dispatch queue from flush code. Hi Tejun, With ordering semantics gone, do we still need to drain the elevator before queuing flush at the end of request queue. Thanks Vivek ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 13:46 ` Vivek Goyal @ 2011-03-30 13:49 ` Tejun Heo 0 siblings, 0 replies; 38+ messages in thread From: Tejun Heo @ 2011-03-30 13:49 UTC (permalink / raw) To: Vivek Goyal Cc: Jens Axboe, Jeff Moyer, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason Hello, On Wed, Mar 30, 2011 at 09:46:58AM -0400, Vivek Goyal wrote: > With ordering semantics gone, do we still need to drain the elevator > before queuing flush at the end of request queue. No, not at all, which was why I was cheating with front insertion. Anyways, with Jens' patch posted in this thread, this is no longer a problem. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 8:02 ` Tejun Heo 2011-03-30 10:16 ` Jens Axboe 2011-03-30 13:46 ` Vivek Goyal @ 2011-03-30 14:01 ` Mike Snitzer 2011-03-30 14:27 ` Tejun Heo 2 siblings, 1 reply; 38+ messages in thread From: Mike Snitzer @ 2011-03-30 14:01 UTC (permalink / raw) To: Tejun Heo Cc: Jens Axboe, Vivek Goyal, Jeff Moyer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch On Wed, Mar 30 2011 at 4:02am -0400, Tejun Heo <tj@kernel.org> wrote: > Hello, Jens. > > On Wed, Mar 30, 2011 at 09:59:09AM +0200, Jens Axboe wrote: > > Pure front insert should be used for requeue and internal commands (like > > spin up this drive, or get error information). Flush should append to > > the dispatch list. > > Yeah, right. The reason I used REQUEUE/FRONT was because BACK > insertion involves draining the elevator and then appending the > request at the end of the dispatch queue, which is unnecessary and > inefficient. So, front insertion was a quick work around that. If > we're removing elv_insert(), we can just append directly to the > dispatch queue from flush code. I'm trying to follow along but unrolling what was said above is challenging considering we're not getting rid of elv_insert()'s functionality; it was just folded into __elv_add_request() -- offering no functional change AFAIK. So placing special meaning on getting rid of elv_insert() is confusing me. Why can we all of a sudden append the flush to the dispatch queue _but_ not have any concern about queue draining? Seems that avoiding use of BACK, by using list_add_tail, is enabling that. Couldn't we have always done that? The folding of elv_insert() into __elv_add_request() seems irrelevant. Can we take a step back and be more explicit about the implications of having inserted the flush with REQUEUE/FRONT? Seems to me that having _not_ inserted the flush at the end of the dispatch queue is cause for potential corruption (preceding data hasn't been issued to the device yet). And just to be clear: none of this is a concern for stable right? It is just the flush-merge code introduced for 2.6.39 that needs fixing? Please advise, thanks! Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 14:01 ` Mike Snitzer @ 2011-03-30 14:27 ` Tejun Heo 2011-03-30 15:22 ` Vivek Goyal 0 siblings, 1 reply; 38+ messages in thread From: Tejun Heo @ 2011-03-30 14:27 UTC (permalink / raw) To: Mike Snitzer Cc: Jens Axboe, Vivek Goyal, Jeff Moyer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch On Wed, Mar 30, 2011 at 10:01:24AM -0400, Mike Snitzer wrote: > Why can we all of a sudden append the flush to the dispatch queue _but_ > not have any concern about queue draining? Seems that avoiding use of > BACK, by using list_add_tail, is enabling that. Couldn't we have always > done that? Yes, we could have. I was just being lazy and looking for an easy solution. > The folding of elv_insert() into __elv_add_request() seems > irrelevant. Strictly, it is but they kinda go well together. > Can we take a step back and be more explicit about the implications of > having inserted the flush with REQUEUE/FRONT? Seems to me that having > _not_ inserted the flush at the end of the dispatch queue is cause for > potential corruption (preceding data hasn't been issued to the device > yet). No, the data ordering is enforced by the filesystem in the new implementation meaning that by the time FLUSH is issued by the filesystem, it should have made sure that all requests which must be written before the FLUSH already had completed. > And just to be clear: none of this is a concern for stable right? It is > just the flush-merge code introduced for 2.6.39 that needs fixing? I think 2.6.38 needs a -stable fix. It has the previous version of the new flush implementation and is using front insertion. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 14:27 ` Tejun Heo @ 2011-03-30 15:22 ` Vivek Goyal 2011-03-30 15:30 ` Tejun Heo 0 siblings, 1 reply; 38+ messages in thread From: Vivek Goyal @ 2011-03-30 15:22 UTC (permalink / raw) To: Tejun Heo Cc: Mike Snitzer, Jens Axboe, Jeff Moyer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch On Wed, Mar 30, 2011 at 04:27:51PM +0200, Tejun Heo wrote: [..] > > And just to be clear: none of this is a concern for stable right? It is > > just the flush-merge code introduced for 2.6.39 that needs fixing? > > I think 2.6.38 needs a -stable fix. It has the previous version of > the new flush implementation and is using front insertion. Tejun, So wee need this as stable fix because FLUSH request can get ahead of REQUEUED requests and it can break some drivers? Thanks Vivek ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 15:22 ` Vivek Goyal @ 2011-03-30 15:30 ` Tejun Heo 2011-03-30 17:13 ` Jens Axboe 0 siblings, 1 reply; 38+ messages in thread From: Tejun Heo @ 2011-03-30 15:30 UTC (permalink / raw) To: Vivek Goyal Cc: Mike Snitzer, Jens Axboe, Jeff Moyer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch On Wed, Mar 30, 2011 at 11:22:48AM -0400, Vivek Goyal wrote: > So wee need this as stable fix because FLUSH request can get ahead of > REQUEUED requests and it can break some drivers? Yes, I think so. All we need is just replacing elv_insert() calls in blk-flush.c with list_add_tail(). Something like the following. I'll test it and send a proper patch later. Thanks. diff --git a/block/blk-flush.c b/block/blk-flush.c index b27d020..51a45a5 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -132,7 +132,7 @@ static struct request *queue_next_fseq(struct request_queue *q) BUG(); } - elv_insert(q, rq, ELEVATOR_INSERT_REQUEUE); + list_add_tail(&rq->queuelist, &q->queue_head); return rq; } -- tejun ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 15:30 ` Tejun Heo @ 2011-03-30 17:13 ` Jens Axboe 2011-03-30 17:32 ` Tejun Heo 2011-03-30 17:56 ` Jeff Moyer 0 siblings, 2 replies; 38+ messages in thread From: Jens Axboe @ 2011-03-30 17:13 UTC (permalink / raw) To: Tejun Heo Cc: Vivek Goyal, Mike Snitzer, Jeff Moyer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch On 2011-03-30 17:30, Tejun Heo wrote: > On Wed, Mar 30, 2011 at 11:22:48AM -0400, Vivek Goyal wrote: >> So wee need this as stable fix because FLUSH request can get ahead of >> REQUEUED requests and it can break some drivers? > > Yes, I think so. All we need is just replacing elv_insert() calls in > blk-flush.c with list_add_tail(). Something like the following. I'll > test it and send a proper patch later. Thanks. I'd suggest I just mark the queued patch for stable. -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 17:13 ` Jens Axboe @ 2011-03-30 17:32 ` Tejun Heo 2011-03-30 17:56 ` Jeff Moyer 1 sibling, 0 replies; 38+ messages in thread From: Tejun Heo @ 2011-03-30 17:32 UTC (permalink / raw) To: Jens Axboe Cc: Vivek Goyal, Mike Snitzer, Jeff Moyer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch On Wed, Mar 30, 2011 at 07:13:20PM +0200, Jens Axboe wrote: > On 2011-03-30 17:30, Tejun Heo wrote: > > On Wed, Mar 30, 2011 at 11:22:48AM -0400, Vivek Goyal wrote: > >> So wee need this as stable fix because FLUSH request can get ahead of > >> REQUEUED requests and it can break some drivers? > > > > Yes, I think so. All we need is just replacing elv_insert() calls in > > blk-flush.c with list_add_tail(). Something like the following. I'll > > test it and send a proper patch later. Thanks. > > I'd suggest I just mark the queued patch for stable. Ah, right. It might not apply without conflict but they're basically the same changes. Thanks. -- tejun ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 17:13 ` Jens Axboe 2011-03-30 17:32 ` Tejun Heo @ 2011-03-30 17:56 ` Jeff Moyer 2011-03-30 18:12 ` Jens Axboe 1 sibling, 1 reply; 38+ messages in thread From: Jeff Moyer @ 2011-03-30 17:56 UTC (permalink / raw) To: Jens Axboe Cc: Tejun Heo, Vivek Goyal, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch Jens Axboe <axboe@kernel.dk> writes: > On 2011-03-30 17:30, Tejun Heo wrote: >> On Wed, Mar 30, 2011 at 11:22:48AM -0400, Vivek Goyal wrote: >>> So wee need this as stable fix because FLUSH request can get ahead of >>> REQUEUED requests and it can break some drivers? >> >> Yes, I think so. All we need is just replacing elv_insert() calls in >> blk-flush.c with list_add_tail(). Something like the following. I'll >> test it and send a proper patch later. Thanks. > > I'd suggest I just mark the queued patch for stable. Could you document the nuance in that patch, please? Thanks, Jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: block: eliminate ELEVATOR_INSERT_REQUEUE 2011-03-30 17:56 ` Jeff Moyer @ 2011-03-30 18:12 ` Jens Axboe 0 siblings, 0 replies; 38+ messages in thread From: Jens Axboe @ 2011-03-30 18:12 UTC (permalink / raw) To: Jeff Moyer Cc: Tejun Heo, Vivek Goyal, Mike Snitzer, Markus Trippelsdorf, Sergey Senozhatsky, linux-kernel, Chris Mason, hch On 2011-03-30 19:56, Jeff Moyer wrote: > Jens Axboe <axboe@kernel.dk> writes: > >> On 2011-03-30 17:30, Tejun Heo wrote: >>> On Wed, Mar 30, 2011 at 11:22:48AM -0400, Vivek Goyal wrote: >>>> So wee need this as stable fix because FLUSH request can get ahead of >>>> REQUEUED requests and it can break some drivers? >>> >>> Yes, I think so. All we need is just replacing elv_insert() calls in >>> blk-flush.c with list_add_tail(). Something like the following. I'll >>> test it and send a proper patch later. Thanks. >> >> I'd suggest I just mark the queued patch for stable. > > Could you document the nuance in that patch, please? Yeah I will. I rebase for-linus, it's not a "stable" branch as such. Today I mainly collected some items as not to drop them, I'll expand on the explanations in there. -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [OOPS] elevator private data for REQ_FLUSH 2011-03-25 15:22 ` Markus Trippelsdorf 2011-03-25 15:40 ` Mike Snitzer @ 2011-03-25 15:57 ` Jens Axboe 2011-03-25 16:03 ` Markus Trippelsdorf 1 sibling, 1 reply; 38+ messages in thread From: Jens Axboe @ 2011-03-25 15:57 UTC (permalink / raw) To: Markus Trippelsdorf Cc: Sergey Senozhatsky, linux-kernel, Mike Snitzer, Chris Mason On 2011-03-25 16:22, Markus Trippelsdorf wrote: > On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: >> Hello, >> >> Commit >> 9d5a4e946ce5352f19400b6370f4cd8e72806278 >> block: skip elevator data initialization for flush requests >> >> Skip elevator initialization for flush requests by passing priv=0 to >> blk_alloc_request() in get_request(). As such elv_set_request() is >> never called for flush requests. >> >> introduced priv flag, to skip elevator_private data init for FLUSH requests. >> This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, >> which requires elevator_private to be set: >> >> 1 [ 78.982169] Call Trace: >> 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d >> 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 > >> Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests >> (like below)? >> >> --- >> >> block/elevator.c | 2 ++ >> 1 files changed, 2 insertions(+), 0 deletions(-) >> >> diff --git a/block/elevator.c b/block/elevator.c >> index c387d31..b17e577 100644 >> --- a/block/elevator.c >> +++ b/block/elevator.c >> @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) >> q->end_sector = rq_end_sector(rq); >> q->boundary_rq = rq; >> } >> + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { >> + where = ELEVATOR_INSERT_FLUSH; >> } else if (!(rq->cmd_flags & REQ_ELVPRIV) && >> where == ELEVATOR_INSERT_SORT) >> where = ELEVATOR_INSERT_BACK; > > Thanks. That solves all (corruption-) problems that I reported earlier > in an other thread. That's great. I'm surprised that this would cause silent corruption for you, should have been accompanied by an oops. Or was that with noop only? -- Jens Axboe ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [OOPS] elevator private data for REQ_FLUSH 2011-03-25 15:57 ` [OOPS] elevator private data for REQ_FLUSH Jens Axboe @ 2011-03-25 16:03 ` Markus Trippelsdorf 0 siblings, 0 replies; 38+ messages in thread From: Markus Trippelsdorf @ 2011-03-25 16:03 UTC (permalink / raw) To: Jens Axboe; +Cc: Sergey Senozhatsky, linux-kernel, Mike Snitzer, Chris Mason On 2011.03.25 at 16:57 +0100, Jens Axboe wrote: > On 2011-03-25 16:22, Markus Trippelsdorf wrote: > > On 2011.03.25 at 17:15 +0200, Sergey Senozhatsky wrote: > >> Hello, > >> > >> Commit > >> 9d5a4e946ce5352f19400b6370f4cd8e72806278 > >> block: skip elevator data initialization for flush requests > >> > >> Skip elevator initialization for flush requests by passing priv=0 to > >> blk_alloc_request() in get_request(). As such elv_set_request() is > >> never called for flush requests. > >> > >> introduced priv flag, to skip elevator_private data init for FLUSH requests. > >> This, I guess, lead to NULL pointer deref on my machine in cfq_insert_request, > >> which requires elevator_private to be set: > >> > >> 1 [ 78.982169] Call Trace: > >> 2 [ 78.982178] [<ffffffff8122d1fe>] cfq_insert_request+0x4e/0x47d > >> 3 [ 78.982184] [<ffffffff8123e139>] ? do_raw_spin_lock+0x6b/0x122 > > > >> Should we in that case use ELEVATOR_INSERT_FLUSH for REQ_FLUSH | REQ_FUA requests > >> (like below)? > >> > >> --- > >> > >> block/elevator.c | 2 ++ > >> 1 files changed, 2 insertions(+), 0 deletions(-) > >> > >> diff --git a/block/elevator.c b/block/elevator.c > >> index c387d31..b17e577 100644 > >> --- a/block/elevator.c > >> +++ b/block/elevator.c > >> @@ -734,6 +734,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where) > >> q->end_sector = rq_end_sector(rq); > >> q->boundary_rq = rq; > >> } > >> + } else if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) { > >> + where = ELEVATOR_INSERT_FLUSH; > >> } else if (!(rq->cmd_flags & REQ_ELVPRIV) && > >> where == ELEVATOR_INSERT_SORT) > >> where = ELEVATOR_INSERT_BACK; > > > > Thanks. That solves all (corruption-) problems that I reported earlier > > in an other thread. > > That's great. I'm surprised that this would cause silent corruption for > you, should have been accompanied by an oops. Or was that with noop > only? The corruption happend during my bisection with cfq and also with noop on the latest git kernel. -- Markus ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2011-03-30 18:12 UTC | newest] Thread overview: 38+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-03-25 15:15 [OOPS] elevator private data for REQ_FLUSH Sergey Senozhatsky 2011-03-25 15:22 ` Markus Trippelsdorf 2011-03-25 15:40 ` Mike Snitzer 2011-03-25 15:50 ` Jens Axboe 2011-03-25 18:54 ` Mike Snitzer 2011-03-25 19:50 ` Jens Axboe 2011-03-26 4:21 ` Mike Snitzer 2011-03-28 8:23 ` Tejun Heo 2011-03-28 22:15 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Mike Snitzer 2011-03-29 11:56 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe 2011-03-29 14:18 ` Mike Snitzer 2011-03-29 18:25 ` [PATCH] " Christoph Hellwig 2011-03-30 7:42 ` Tejun Heo 2011-03-30 7:53 ` Jens Axboe 2011-03-29 14:13 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE (was: Re: elevator private data for REQ_FLUSH) Jeff Moyer 2011-03-29 17:54 ` Vivek Goyal 2011-03-30 7:41 ` Tejun Heo 2011-03-30 7:57 ` Tejun Heo 2011-03-30 7:59 ` [PATCH] block: eliminate ELEVATOR_INSERT_REQUEUE Jens Axboe 2011-03-30 8:02 ` Tejun Heo 2011-03-30 10:16 ` Jens Axboe 2011-03-30 11:20 ` Tejun Heo 2011-03-30 11:23 ` Jens Axboe 2011-03-30 11:21 ` Sergey Senozhatsky 2011-03-30 11:22 ` Jens Axboe 2011-03-30 11:49 ` Sergey Senozhatsky 2011-03-30 13:46 ` Vivek Goyal 2011-03-30 13:49 ` Tejun Heo 2011-03-30 14:01 ` Mike Snitzer 2011-03-30 14:27 ` Tejun Heo 2011-03-30 15:22 ` Vivek Goyal 2011-03-30 15:30 ` Tejun Heo 2011-03-30 17:13 ` Jens Axboe 2011-03-30 17:32 ` Tejun Heo 2011-03-30 17:56 ` Jeff Moyer 2011-03-30 18:12 ` Jens Axboe 2011-03-25 15:57 ` [OOPS] elevator private data for REQ_FLUSH Jens Axboe 2011-03-25 16:03 ` Markus Trippelsdorf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox