* [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs
@ 2015-06-08 8:59 Tejun Heo
2015-06-08 8:59 ` [PATCH 6/8] cfq-iosched: move cfq_group determination from cfq_find_alloc_queue() to cfq_get_queue() Tejun Heo
` (3 more replies)
0 siblings, 4 replies; 31+ messages in thread
From: Tejun Heo @ 2015-06-08 8:59 UTC (permalink / raw)
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w
Hello,
cfq has always charged all async IOs to the root cgroup. It didn't
have much choice as writeback didn't know about cgroups and there was
no way to tell who to blame for a given writeback IO. writeback
finally grew support for cgroups and now tags each writeback IO with
the appropriate cgroup to charge it against.
This patchset updates cfq so that it follows the blkcg each bio is
tagged with.  Async cfq_queues are now per-cfq_group, which is
per-cgroup, instead of per-request_queue cfq_data.  This makes all
IOs follow the weight-based IO resource distribution implemented by
cfq.
This patchset contains the following 8 patches.
0001-cfq-iosched-simplify-control-flow-in-cfq_get_queue.patch
0002-cfq-iosched-fix-async-oom-queue-handling.patch
0003-cfq-iosched-fix-oom-cfq_queue-ref-leak-in-cfq_set_re.patch
0004-cfq-iosched-minor-cleanups.patch
0005-cfq-iosched-remove-gfp_mask-from-cfq_find_alloc_queu.patch
0006-cfq-iosched-move-cfq_group-determination-from-cfq_fi.patch
0007-cfq-iosched-fold-cfq_find_alloc_queue-into-cfq_get_q.patch
0008-cfq-iosched-charge-async-IOs-to-the-appropriate-blkc.patch
0001 is a prep patch and 0002-0003 are fix patches on top of it.  The
bugs fixed by these patches are very unlikely to cause problems in
actual usage, so the patches aren't tagged w/ -stable.
0004-0007 are prep patches.
0008 makes cfq cgroup-writeback ready.
This patchset is on top of block/for-4.2/writeback 5857cd637bc0 ("bdi:
fix wrong error return value in cgwb_create()").
Thanks, diffstat follows.
block/cfq-iosched.c | 220 ++++++++++++++++++++--------------------------------
1 file changed, 85 insertions(+), 135 deletions(-)
--
tejun
^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 6/8] cfq-iosched: move cfq_group determination from cfq_find_alloc_queue() to cfq_get_queue()
  2015-06-08  8:59 [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs Tejun Heo
@ 2015-06-08  8:59 ` Tejun Heo
  [not found]   ` <1433753973-23684-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2015-06-08  8:59 ` [PATCH 8/8] cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root Tejun Heo
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-08  8:59 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, Tejun Heo

This is necessary for making async cfq_queues per-cfq_group instead
of per-cfq_data.  While this change makes cfq_get_queue() perform RCU
locking and look up the cfq_group even when it reuses an async queue,
the extra overhead is extremely unlikely to be noticeable given that
the lookup already sits behind the cic->cfqq[] cache and the overall
cost of cfq operation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
---
 block/cfq-iosched.c | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index b8e83cd..a775128 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3573,21 +3573,10 @@ static inline void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio) {
 #endif  /* CONFIG_CFQ_GROUP_IOSCHED */
 
 static struct cfq_queue *
-cfq_find_alloc_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
-		     struct bio *bio)
+cfq_find_alloc_queue(struct cfq_data *cfqd, struct cfq_group *cfqg, bool is_sync,
+		     struct cfq_io_cq *cic, struct bio *bio)
 {
-	struct blkcg *blkcg;
 	struct cfq_queue *cfqq;
-	struct cfq_group *cfqg;
-
-	rcu_read_lock();
-
-	blkcg = bio_blkcg(bio);
-	cfqg = cfq_lookup_create_cfqg(cfqd, blkcg);
-	if (!cfqg) {
-		cfqq = &cfqd->oom_cfqq;
-		goto out;
-	}
 
 	cfqq = cic_to_cfqq(cic, is_sync);
 
@@ -3607,8 +3596,6 @@ cfq_find_alloc_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 		} else
 			cfqq = &cfqd->oom_cfqq;
 	}
-out:
-	rcu_read_unlock();
 
 	return cfqq;
 }
@@ -3638,6 +3625,14 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 	int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
 	struct cfq_queue **async_cfqq;
 	struct cfq_queue *cfqq;
+	struct cfq_group *cfqg;
+
+	rcu_read_lock();
+	cfqg = cfq_lookup_create_cfqg(cfqd, bio_blkcg(bio));
+	if (!cfqg) {
+		cfqq = &cfqd->oom_cfqq;
+		goto out;
+	}
 
 	if (!is_sync) {
 		if (!ioprio_valid(cic->ioprio)) {
@@ -3651,7 +3646,7 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 		goto out;
 	}
 
-	cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio);
+	cfqq = cfq_find_alloc_queue(cfqd, cfqg, is_sync, cic, bio);
 
 	/*
 	 * pin the queue now that it's allocated, scheduler exit will prune it
@@ -3662,6 +3657,7 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 	}
 out:
 	cfqq->ref++;
+	rcu_read_unlock();
 	return cfqq;
 }
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread
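To see what this patch buys, here is a hypothetical userspace model (stand-in types and names, not kernel code) of the async-path control flow cfq_get_queue() ends up with: the group lookup and the RCU critical section live in the caller, and every exit path — lookup failure, cache hit, and fresh allocation — funnels through the `out` label where the caller's reference is taken and the lock dropped exactly once. A depth counter stands in for rcu_read_lock()/unlock() so the balance is checkable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Depth counter standing in for rcu_read_lock()/rcu_read_unlock(). */
static int rcu_depth;
static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }

struct queue { int ref; };

static struct queue oom_cfqq;        /* models cfqd->oom_cfqq */
static struct queue *async_cache;    /* models *async_cfqq */

/* Models cfq_lookup_create_cfqg(), which may fail under memory pressure. */
static bool lookup_group(bool fail) { return !fail; }

static struct queue *get_async_queue(bool group_lookup_fails)
{
	static struct queue new_queue;   /* models cfq_find_alloc_queue() */
	struct queue *cfqq;

	model_rcu_read_lock();
	if (!lookup_group(group_lookup_fails)) {
		cfqq = &oom_cfqq;        /* fall back to the oom queue */
		goto out;
	}
	if (async_cache) {               /* cached async queue: reuse it */
		cfqq = async_cache;
		goto out;
	}
	cfqq = &new_queue;
	cfqq->ref++;                     /* pin for the async cache */
	async_cache = cfqq;
out:
	cfqq->ref++;                     /* caller's reference */
	model_rcu_read_unlock();
	return cfqq;
}
```

This also illustrates the trade-off the commit message mentions: the group lookup now happens even on a cache hit, but hit and miss alike leave with the lock dropped and a reference held.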
[parent not found: <1433753973-23684-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 6/8] cfq-iosched: move cfq_group determination from cfq_find_alloc_queue() to cfq_get_queue()
  [not found]   ` <1433753973-23684-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-09 14:32     ` Jeff Moyer
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff Moyer @ 2015-06-09 14:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> This is necessary for making async cfq_queues per-cfq_group instead
> of per-cfq_data.  While this change makes cfq_get_queue() perform RCU
> locking and look up the cfq_group even when it reuses an async queue,
> the extra overhead is extremely unlikely to be noticeable given that
> the lookup already sits behind the cic->cfqq[] cache and the overall
> cost of cfq operation.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread
* [PATCH 8/8] cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root
  2015-06-08  8:59 [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs Tejun Heo
  2015-06-08  8:59 ` [PATCH 6/8] cfq-iosched: move cfq_group determination from cfq_find_alloc_queue() to cfq_get_queue() Tejun Heo
@ 2015-06-08  8:59 ` Tejun Heo
  [not found]   ` <1433753973-23684-9-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2015-06-08 19:49 ` [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs Jeff Moyer
  [not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  3 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-08  8:59 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, cgroups, vgoyal, avanzini.arianna, Tejun Heo

Up until now, all async IOs were queued to async queues which are
shared across the whole request_queue, which means that blkcg resource
control is completely void on async IOs including all writeback IOs.
It was done this way because writeback didn't support cgroups and
there was no way of telling which writeback IO belonged to which
cgroup; however, writeback recently became cgroup aware and writeback
bio's are sent down properly tagged with the blkcg's to charge them
against.

This patch makes async cfq_queues per-cfq_group instead of
per-cfq_data so that each async IO is charged to the blkcg that it was
tagged for instead of unconditionally attributing it to root.

* cfq_data->async_cfqq and ->async_idle_cfqq are moved to cfq_group
  and alloc / destroy paths are updated accordingly.

* cfq_link_cfqq_cfqg() no longer overrides @cfqg to root for async
  queues.

* check_blkcg_changed() now also invalidates async queues as they no
  longer stay the same across cgroups.

After this patch, cfq's proportional IO control through blkio.weight
works correctly when cgroup writeback is in use.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
---
 block/cfq-iosched.c | 85 ++++++++++++++++++++++++++---------------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 393befb..fded8d7 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -291,6 +291,11 @@ struct cfq_group {
 	struct cfq_ttime ttime;
 	struct cfqg_stats stats;	/* stats for this cfqg */
 	struct cfqg_stats dead_stats;	/* stats pushed from dead children */
+
+	/* async queue for each priority case */
+	struct cfq_queue *async_cfqq[2][IOPRIO_BE_NR];
+	struct cfq_queue *async_idle_cfqq;
+
 };
 
 struct cfq_io_cq {
@@ -356,12 +361,6 @@ struct cfq_data {
 	struct cfq_queue *active_queue;
 	struct cfq_io_cq *active_cic;
 
-	/*
-	 * async queue for each priority case
-	 */
-	struct cfq_queue *async_cfqq[2][IOPRIO_BE_NR];
-	struct cfq_queue *async_idle_cfqq;
-
 	sector_t last_position;
 
 	/*
@@ -387,6 +386,7 @@ struct cfq_data {
 };
 
 static struct cfq_group *cfq_get_next_cfqg(struct cfq_data *cfqd);
+static void cfq_put_queue(struct cfq_queue *cfqq);
 
 static struct cfq_rb_root *st_for(struct cfq_group *cfqg,
				  enum wl_class_t class,
@@ -1556,13 +1556,26 @@ static void cfq_pd_init(struct blkcg_gq *blkg)
 
 static void cfq_pd_offline(struct blkcg_gq *blkg)
 {
+	struct cfq_group *cfqg = blkg_to_cfqg(blkg);
+	int i;
+
+	for (i = 0; i < IOPRIO_BE_NR; i++) {
+		if (cfqg->async_cfqq[0][i])
+			cfq_put_queue(cfqg->async_cfqq[0][i]);
+		if (cfqg->async_cfqq[1][i])
+			cfq_put_queue(cfqg->async_cfqq[1][i]);
+	}
+
+	if (cfqg->async_idle_cfqq)
+		cfq_put_queue(cfqg->async_idle_cfqq);
+
 	/*
 	 * @blkg is going offline and will be ignored by
 	 * blkg_[rw]stat_recursive_sum().  Transfer stats to the parent so
 	 * that they don't get lost.  If IOs complete after this point, the
 	 * stats for them will be lost.  Oh well...
 	 */
-	cfqg_stats_xfer_dead(blkg_to_cfqg(blkg));
+	cfqg_stats_xfer_dead(cfqg);
 }
 
 /* offset delta from cfqg->stats to cfqg->dead_stats */
@@ -1625,10 +1638,6 @@ static struct cfq_group *cfq_lookup_create_cfqg(struct cfq_data *cfqd,
 
 static void cfq_link_cfqq_cfqg(struct cfq_queue *cfqq, struct cfq_group *cfqg)
 {
-	/* Currently, all async queues are mapped to root group */
-	if (!cfq_cfqq_sync(cfqq))
-		cfqg = cfqq->cfqd->root_group;
-
 	cfqq->cfqg = cfqg;
 	/* cfqq reference on cfqg */
 	cfqg_get(cfqg);
@@ -3541,7 +3550,7 @@ static void cfq_init_cfqq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 static void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio)
 {
 	struct cfq_data *cfqd = cic_to_cfqd(cic);
-	struct cfq_queue *sync_cfqq;
+	struct cfq_queue *cfqq;
 	uint64_t serial_nr;
 
 	rcu_read_lock();
@@ -3555,15 +3564,22 @@ static void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio)
 	if (unlikely(!cfqd) || likely(cic->blkcg_serial_nr == serial_nr))
 		return;
 
-	sync_cfqq = cic_to_cfqq(cic, 1);
-	if (sync_cfqq) {
-		/*
-		 * Drop reference to sync queue. A new sync queue will be
-		 * assigned in new group upon arrival of a fresh request.
-		 */
-		cfq_log_cfqq(cfqd, sync_cfqq, "changed cgroup");
-		cic_set_cfqq(cic, NULL, 1);
-		cfq_put_queue(sync_cfqq);
+	/*
+	 * Drop reference to queues.  New queues will be assigned in new
+	 * group upon arrival of fresh requests.
+	 */
+	cfqq = cic_to_cfqq(cic, false);
+	if (cfqq) {
+		cfq_log_cfqq(cfqd, cfqq, "changed cgroup");
+		cic_set_cfqq(cic, NULL, false);
+		cfq_put_queue(cfqq);
+	}
+
+	cfqq = cic_to_cfqq(cic, true);
+	if (cfqq) {
+		cfq_log_cfqq(cfqd, cfqq, "changed cgroup");
+		cic_set_cfqq(cic, NULL, true);
+		cfq_put_queue(cfqq);
 	}
 
 	cic->blkcg_serial_nr = serial_nr;
@@ -3573,18 +3589,18 @@ static inline void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio) {
 #endif  /* CONFIG_CFQ_GROUP_IOSCHED */
 
 static struct cfq_queue **
-cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
+cfq_async_queue_prio(struct cfq_group *cfqg, int ioprio_class, int ioprio)
 {
 	switch (ioprio_class) {
 	case IOPRIO_CLASS_RT:
-		return &cfqd->async_cfqq[0][ioprio];
+		return &cfqg->async_cfqq[0][ioprio];
 	case IOPRIO_CLASS_NONE:
 		ioprio = IOPRIO_NORM;
 		/* fall through */
 	case IOPRIO_CLASS_BE:
-		return &cfqd->async_cfqq[1][ioprio];
+		return &cfqg->async_cfqq[1][ioprio];
 	case IOPRIO_CLASS_IDLE:
-		return &cfqd->async_idle_cfqq;
+		return &cfqg->async_idle_cfqq;
 	default:
 		BUG();
 	}
@@ -3613,7 +3629,7 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 		ioprio = task_nice_ioprio(tsk);
 		ioprio_class = task_nice_ioclass(tsk);
 	}
-	async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
+	async_cfqq = cfq_async_queue_prio(cfqg, ioprio_class, ioprio);
 	cfqq = *async_cfqq;
 	if (cfqq)
 		goto out;
@@ -4287,21 +4303,6 @@ static void cfq_shutdown_timer_wq(struct cfq_data *cfqd)
 	cancel_work_sync(&cfqd->unplug_work);
 }
 
-static void cfq_put_async_queues(struct cfq_data *cfqd)
-{
-	int i;
-
-	for (i = 0; i < IOPRIO_BE_NR; i++) {
-		if (cfqd->async_cfqq[0][i])
-			cfq_put_queue(cfqd->async_cfqq[0][i]);
-		if (cfqd->async_cfqq[1][i])
-			cfq_put_queue(cfqd->async_cfqq[1][i]);
-	}
-
-	if (cfqd->async_idle_cfqq)
-		cfq_put_queue(cfqd->async_idle_cfqq);
-}
-
 static void cfq_exit_queue(struct elevator_queue *e)
 {
 	struct cfq_data *cfqd = e->elevator_data;
@@ -4314,8 +4315,6 @@ static void cfq_exit_queue(struct elevator_queue *e)
 	if (cfqd->active_queue)
		__cfq_slice_expired(cfqd, cfqd->active_queue, 0);
 
-	cfq_put_async_queues(cfqd);
-
 	spin_unlock_irq(q->queue_lock);
 
 	cfq_shutdown_timer_wq(cfqd);
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread
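The slot selection in cfq_async_queue_prio() is unchanged by this patch; only the table it indexes moves from the per-device cfq_data into the per-blkcg cfq_group. As a hypothetical userspace restatement (stand-in types, constants mirroring the kernel's ioprio definitions at the time), the per-group table and its indexing look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Kernel ioprio class values as of this era: NONE=0, RT=1, BE=2, IDLE=3. */
enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE };
#define IOPRIO_BE_NR 8   /* eight priority levels within RT and BE */
#define IOPRIO_NORM  4   /* default level used for CLASS_NONE */

struct queue;            /* opaque stand-in for struct cfq_queue */

/* Per-blkcg group now owns the async queue table (was in cfq_data). */
struct group {
	struct queue *async_cfqq[2][IOPRIO_BE_NR];  /* [0]=RT, [1]=BE */
	struct queue *async_idle_cfqq;
};

static struct queue **async_queue_prio(struct group *g, int ioprio_class,
				       int ioprio)
{
	switch (ioprio_class) {
	case IOPRIO_CLASS_RT:
		return &g->async_cfqq[0][ioprio];
	case IOPRIO_CLASS_NONE:
		ioprio = IOPRIO_NORM;   /* NONE is treated as BE at NORM */
		/* fall through */
	case IOPRIO_CLASS_BE:
		return &g->async_cfqq[1][ioprio];
	case IOPRIO_CLASS_IDLE:
		return &g->async_idle_cfqq;
	default:
		return NULL;            /* the kernel BUG()s here */
	}
}
```

Because the returned slot is a pointer into the group, each blkcg caches and pins its own set of async queues, which is exactly what lets cfq_pd_offline() drop them per group.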
[parent not found: <1433753973-23684-9-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 8/8] cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root
  [not found]   ` <1433753973-23684-9-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-08 22:29     ` Vivek Goyal
  [not found]       ` <20150608222904.GB20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2015-06-09 15:03     ` Jeff Moyer
  1 sibling, 1 reply; 31+ messages in thread
From: Vivek Goyal @ 2015-06-08 22:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w,
	device-mapper development

On Mon, Jun 08, 2015 at 05:59:33PM +0900, Tejun Heo wrote:
> Up until now, all async IOs were queued to async queues which are
> shared across the whole request_queue, which means that blkcg resource
> control is completely void on async IOs including all writeback IOs.
> It was done this way because writeback didn't support cgroups and
> there was no way of telling which writeback IO belonged to which
> cgroup; however, writeback recently became cgroup aware and writeback
> bio's are sent down properly tagged with the blkcg's to charge them
> against.
>
> This patch makes async cfq_queues per-cfq_group instead of
> per-cfq_data so that each async IO is charged to the blkcg that it was
> tagged for instead of unconditionally attributing it to root.
>
[...]
>
> @@ -3555,15 +3564,22 @@ static void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio)
>  	if (unlikely(!cfqd) || likely(cic->blkcg_serial_nr == serial_nr))
>  		return;
>
> -	sync_cfqq = cic_to_cfqq(cic, 1);
> -	if (sync_cfqq) {
> -		/*
> -		 * Drop reference to sync queue. A new sync queue will be
> -		 * assigned in new group upon arrival of a fresh request.
> -		 */
> -		cfq_log_cfqq(cfqd, sync_cfqq, "changed cgroup");
> -		cic_set_cfqq(cic, NULL, 1);
> -		cfq_put_queue(sync_cfqq);
> +	/*
> +	 * Drop reference to queues.  New queues will be assigned in new
> +	 * group upon arrival of fresh requests.
> +	 */
> +	cfqq = cic_to_cfqq(cic, false);
> +	if (cfqq) {
> +		cfq_log_cfqq(cfqd, cfqq, "changed cgroup");
> +		cic_set_cfqq(cic, NULL, false);
> +		cfq_put_queue(cfqq);
> +	}
> +
> +	cfqq = cic_to_cfqq(cic, true);
> +	if (cfqq) {
> +		cfq_log_cfqq(cfqd, cfqq, "changed cgroup");
> +		cic_set_cfqq(cic, NULL, true);
> +		cfq_put_queue(cfqq);
>  	}

Hi Tejun,

I am getting confused between cgroup and iocontext interaction, hence
some basic questions.

So a bio can carry both iocontext and cgroup information.  If iocontext
or cgroup information is present, it is used during rq and cfqq
allocation; otherwise the submitter's iocontext and cgroup are used.

bio_associate_current() will associate the iocontext as well as the
cgroup of the submitter to a bio.  Now the bio can be offloaded to
helper threads for submission and still be accounted to the right
cgroup and the right iocontext.  As of now only the blk throttling
layer makes use of it.

But the above is not true for buffered writes; we will not associate an
io context.  Instead only cgroup information will be sent down and the
io context of the submitter will be used.  So any thread which is
forced to submit a buffered write for some other cgroup will have its
sync queue also reset (because CFQ will think that the cgroup of the
submitter has changed).  Not sure how often it will happen, but if it
happens frequently, this might show up in profiles.

I had mentioned this in the past and IIUC you said we will have to
carry writeback information all the way into lower layers.  Maybe
that's an optimization for later.  So nothing new here, just trying to
understand the current situation.

Also I am wondering if this cgroup and io context information is
carried through the dm layer or not.  I guess it might be a separate
discussion.  It has come up for discussion internally in the past, so
this might be a good time to get the attention of dm developers on
these upcoming changes.  (CC dm-devel)

Thanks
Vivek

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <20150608222904.GB20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 8/8] cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root
  [not found]       ` <20150608222904.GB20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-06-09  3:11         ` Tejun Heo
  0 siblings, 0 replies; 31+ messages in thread
From: Tejun Heo @ 2015-06-09  3:11 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w,
	device-mapper development

Hey, Vivek.

On Mon, Jun 08, 2015 at 06:29:04PM -0400, Vivek Goyal wrote:
...
> But the above is not true for buffered writes; we will not associate an
> io context.  Instead only cgroup information will be sent down and the
> io context of the submitter will be used.
>
> So any thread which is forced to submit a buffered write for some other
> cgroup will have its sync queue also reset (because CFQ will think

Yeah, it'll usually be the writeback work items.

> that the cgroup of the submitter has changed).  Not sure how often it
> will happen, but if it happens frequently, this might show up in
> profiles.  I had

They're unlikely to have sync queues associated with them to begin
with, and even when the context gets reset, each iteration should be
big enough to drown this sort of overhead.  Writeback operates in
fairly sizable chunks.

> mentioned this in the past and IIUC you said we will have to carry
> writeback information all the way into lower layers.  Maybe that's
> an optimization for later.  So nothing new here, just trying to
> understand the current situation.

I'm actually kinda doubtful about caching async queues on cic at all.
We already perform most of the operations necessary to look up the
async queue for check_blkcg_changed().  We might as well just look up
and pin each time.

> Also I am wondering if this cgroup and io context information is
> carried through the dm layer or not.  I guess it might be a separate
> discussion.  It has come up for discussion internally in the past, so
> this might be a good time to get the attention of dm developers on
> these upcoming changes.  (CC dm-devel)

Hmmm... it depends on what dm does with the bio.  It can surely
propagate the cgroup information through sub bios.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread
* Re: [PATCH 8/8] cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root
  [not found]   ` <1433753973-23684-9-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2015-06-08 22:29     ` Vivek Goyal
@ 2015-06-09 15:03     ` Jeff Moyer
  1 sibling, 0 replies; 31+ messages in thread
From: Jeff Moyer @ 2015-06-09 15:03 UTC (permalink / raw)
  To: Tejun Heo
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> Up until now, all async IOs were queued to async queues which are
> shared across the whole request_queue, which means that blkcg resource
> control is completely void on async IOs including all writeback IOs.
> It was done this way because writeback didn't support cgroups and
> there was no way of telling which writeback IO belonged to which
> cgroup; however, writeback recently became cgroup aware and writeback
> bio's are sent down properly tagged with the blkcg's to charge them
> against.
>
> This patch makes async cfq_queues per-cfq_group instead of
> per-cfq_data so that each async IO is charged to the blkcg that it was
> tagged for instead of unconditionally attributing it to root.
>
> * cfq_data->async_cfqq and ->async_idle_cfqq are moved to cfq_group
>   and alloc / destroy paths are updated accordingly.
>
> * cfq_link_cfqq_cfqg() no longer overrides @cfqg to root for async
>   queues.
>
> * check_blkcg_changed() now also invalidates async queues as they no
>   longer stay the same across cgroups.
>
> After this patch, cfq's proportional IO control through blkio.weight
> works correctly when cgroup writeback is in use.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread
* Re: [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs
  2015-06-08  8:59 [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs Tejun Heo
  2015-06-08  8:59 ` [PATCH 6/8] cfq-iosched: move cfq_group determination from cfq_find_alloc_queue() to cfq_get_queue() Tejun Heo
  2015-06-08  8:59 ` [PATCH 8/8] cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root Tejun Heo
@ 2015-06-08 19:49 ` Jeff Moyer
  [not found]   ` <x491thmhtej.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
  [not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  3 siblings, 1 reply; 31+ messages in thread
From: Jeff Moyer @ 2015-06-08 19:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, linux-kernel, cgroups, vgoyal, avanzini.arianna

Tejun Heo <tj@kernel.org> writes:

> Hello,
>
> cfq has always charged all async IOs to the root cgroup.  It didn't
> have much choice as writeback didn't know about cgroups and there was
> no way to tell who to blame for a given writeback IO.  writeback
> finally grew support for cgroups and now tags each writeback IO with
> the appropriate cgroup to charge it against.
>
> This patchset updates cfq so that it follows the blkcg each bio is
> tagged with.  Async cfq_queues are now per-cfq_group, which is
> per-cgroup, instead of per-request_queue cfq_data.  This makes all
> IOs follow the weight-based IO resource distribution implemented by
> cfq.
>
> This patchset contains the following 8 patches.
>
> 0001-cfq-iosched-simplify-control-flow-in-cfq_get_queue.patch
> 0002-cfq-iosched-fix-async-oom-queue-handling.patch
> 0003-cfq-iosched-fix-oom-cfq_queue-ref-leak-in-cfq_set_re.patch
> 0004-cfq-iosched-minor-cleanups.patch
> 0005-cfq-iosched-remove-gfp_mask-from-cfq_find_alloc_queu.patch
> 0006-cfq-iosched-move-cfq_group-determination-from-cfq_fi.patch
> 0007-cfq-iosched-fold-cfq_find_alloc_queue-into-cfq_get_q.patch
> 0008-cfq-iosched-charge-async-IOs-to-the-appropriate-blkc.patch

Hi, Tejun,

Assuming you're ok with dropping patch 5, I'll review patches 6-8 once
they've been reworked to account for that.  I took a look at them, and
they look OK to me.  But, if they are going to change, I'd rather wait
to ack the final versions.

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <x491thmhtej.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs
  [not found]   ` <x491thmhtej.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
@ 2015-06-09  3:03     ` Tejun Heo
  [not found]       ` <20150609030327.GL21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-09  3:03 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Hey, Jeff.

On Mon, Jun 08, 2015 at 03:49:56PM -0400, Jeff Moyer wrote:
> Assuming you're ok with dropping patch 5, I'll review patches 6-8 once

I'll update it to use GFP_NOWAIT.  Can't drop it easily as that would
make group lookup more complicated w/ pinning and all.

> they've been reworked to account for that.  I took a look at them, and
> they look OK to me.  But, if they are going to change, I'd rather wait
> to ack the final versions.

Hmm... the changes will be a simple s/GFP_ATOMIC/GFP_NOWAIT.  Shouldn't
affect reviewing, I think.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <20150609030327.GL21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>]
* Re: [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs
  [not found]       ` <20150609030327.GL21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-06-09 15:05         ` Jeff Moyer
  [not found]           ` <x49lhfssz05.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff Moyer @ 2015-06-09 15:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> Hey, Jeff.
>
> On Mon, Jun 08, 2015 at 03:49:56PM -0400, Jeff Moyer wrote:
>> Assuming you're ok with dropping patch 5, I'll review patches 6-8 once
>
> I'll update it to use GFP_NOWAIT.  Can't drop it easily as that would
> make group lookup more complicated w/ pinning and all.
>
>> they've been reworked to account for that.  I took a look at them, and
>> they look OK to me.  But, if they are going to change, I'd rather wait
>> to ack the final versions.
>
> Hmm... the changes will be a simple s/GFP_ATOMIC/GFP_NOWAIT.  Shouldn't
> affect reviewing, I think.

Yeah, I sent this email before realizing that GFP_NOWAIT would work out.
I've reviewed the rest of the series.  Looks good to me!  You did a nice
job splitting things up into easily reviewable pieces, so thanks for
that.

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <x49lhfssz05.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs
  [not found]           ` <x49lhfssz05.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
@ 2015-06-10  2:49             ` Tejun Heo
  0 siblings, 0 replies; 31+ messages in thread
From: Tejun Heo @ 2015-06-10  2:49 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Hello,

On Tue, Jun 09, 2015 at 11:05:46AM -0400, Jeff Moyer wrote:
> Yeah, I sent this email before realizing that GFP_NOWAIT would work out.
> I've reviewed the rest of the series.  Looks good to me!  You did a nice
> job splitting things up into easily reviewable pieces, so thanks for
> that.

Thanks a lot for reviewing the patches. :)

I think it's already too late for the v4.2 merge window.  I'll re-post
the patches w/ your reviewed-by's added once v4.2-rc1 drops.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* [PATCH 1/8] cfq-iosched: simplify control flow in cfq_get_queue()
[not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-08 8:59 ` Tejun Heo
[not found] ` <1433753973-23684-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 2/8] cfq-iosched: fix async oom queue handling Tejun Heo
` (5 subsequent siblings)
6 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-08 8:59 UTC (permalink / raw)
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w,
Tejun Heo

cfq_get_queue()'s control flow looks like the following.

	async_cfqq = NULL;
	cfqq = NULL;

	if (!is_sync) {
		...
		async_cfqq = ...;
		cfqq = *async_cfqq;
	}

	if (!cfqq)
		cfqq = ...;

	if (!is_sync && !(*async_cfqq))
		...;

The only thing the local variable init, the second if, and the
async_cfqq test in the third if achieve is skipping cfqq creation and
installation if *async_cfqq was already non-NULL.  This is needlessly
complicated with different tests examining the same condition.
Simplify it to the following.

	if (!is_sync) {
		...
		async_cfqq = ...;
		cfqq = *async_cfqq;
		if (cfqq)
			goto out;
	}

	cfqq = ...;

	if (!is_sync)
		...;
out:

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 block/cfq-iosched.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index bc8f429..2814bb7 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3663,8 +3663,8 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 {
 	int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
 	int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
-	struct cfq_queue **async_cfqq = NULL;
-	struct cfq_queue *cfqq = NULL;
+	struct cfq_queue **async_cfqq;
+	struct cfq_queue *cfqq;
 
 	if (!is_sync) {
 		if (!ioprio_valid(cic->ioprio)) {
@@ -3674,19 +3674,20 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 		}
 		async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
 		cfqq = *async_cfqq;
+		if (cfqq)
+			goto out;
 	}
 
-	if (!cfqq)
-		cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
+	cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
 
 	/*
 	 * pin the queue now that it's allocated, scheduler exit will prune it
 	 */
-	if (!is_sync && !(*async_cfqq)) {
+	if (!is_sync) {
 		cfqq->ref++;
 		*async_cfqq = cfqq;
 	}
-
+out:
 	cfqq->ref++;
 	return cfqq;
 }
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread
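[Editor's illustration] The lookup/install/ref logic the patch simplifies can be modeled outside the kernel. The sketch below uses invented names (toy_queue, toy_get_queue, async_slot) and stands in for the real cfq structures; only the early-exit-via-goto shape and the reference counts are modeled:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for cfq_queue and the shared async slot. */
struct toy_queue { int ref; };

static struct toy_queue *async_slot;	/* models the cached async cfqq */

/* Simplified control flow from the patch: bail out early via "goto out"
 * when a cached async queue exists, instead of re-testing the same
 * condition three separate times. */
static struct toy_queue *toy_get_queue(int is_sync)
{
	struct toy_queue **slot = NULL;
	struct toy_queue *q;

	if (!is_sync) {
		slot = &async_slot;
		q = *slot;
		if (q)
			goto out;	/* cached: skip creation entirely */
	}

	q = calloc(1, sizeof(*q));	/* create a new queue */

	if (!is_sync) {
		q->ref++;		/* ref held by the cache slot */
		*slot = q;
	}
out:
	q->ref++;			/* ref returned to the caller */
	return q;
}
```

With this shape, two async lookups share one queue (one cache ref plus one ref per caller), while sync lookups always allocate fresh, which mirrors the behavior described in the commit message.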
[parent not found: <1433753973-23684-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 1/8] cfq-iosched: simplify control flow in cfq_get_queue()
[not found] ` <1433753973-23684-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-08 18:36 ` Jeff Moyer
0 siblings, 0 replies; 31+ messages in thread
From: Jeff Moyer @ 2015-06-08 18:36 UTC (permalink / raw)
To: Tejun Heo
Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> cfq_get_queue()'s control flow looks like the following.
>
>	async_cfqq = NULL;
>	cfqq = NULL;
>
>	if (!is_sync) {
>		...
>		async_cfqq = ...;
>		cfqq = *async_cfqq;
>	}
>
>	if (!cfqq)
>		cfqq = ...;
>
>	if (!is_sync && !(*async_cfqq))
>		...;
>
> The only thing the local variable init, the second if, and the
> async_cfqq test in the third if achieves is to skip cfqq creation and
> installation if *async_cfqq was already non-NULL.  This is needlessly
> complicated with different tests examining the same condition.
> Simplify it to the following.
>
>	if (!is_sync) {
>		...
>		async_cfqq = ...;
>		cfqq = *async_cfqq;
>		if (cfqq)
>			goto out;
>	}
>
>	cfqq = ...;
>
>	if (!is_sync)
>		...;
> out:
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Acked-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread
* [PATCH 2/8] cfq-iosched: fix async oom queue handling
[not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 1/8] cfq-iosched: simplify control flow in cfq_get_queue() Tejun Heo
@ 2015-06-08 8:59 ` Tejun Heo
[not found] ` <1433753973-23684-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 3/8] cfq-iosched: fix oom cfq_queue ref leak in cfq_set_request() Tejun Heo
` (4 subsequent siblings)
6 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-08 8:59 UTC (permalink / raw)
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w,
Tejun Heo

Async cfqq's (cfq_queue's) are shared across cfq_data.  When
cfq_get_queue() obtains a new queue from cfq_find_alloc_queue(), it
stashes the pointer in cfq_data and reuses it from then on; however,
the function doesn't consider that cfq_find_alloc_queue() may return
the oom_cfqq under memory pressure and installs the returned queue
unconditionally.

If the oom_cfqq is installed as an async cfqq, cfq_set_request() will
continue calling cfq_get_queue() hoping to replace it with a proper
queue; however, cfq_get_queue() will keep returning the cached queue
for the slot - the oom_cfqq.

Fix it by skipping caching if the queue is the oom one.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 block/cfq-iosched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 2814bb7..c7b33aa 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3683,7 +3683,7 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 	/*
 	 * pin the queue now that it's allocated, scheduler exit will prune it
 	 */
-	if (!is_sync) {
+	if (!is_sync && cfqq != &cfqd->oom_cfqq) {
 		cfqq->ref++;
 		*async_cfqq = cfqq;
 	}
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread
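[Editor's illustration] The bug and fix above can be demonstrated with a toy model. All names here (toy_find_alloc, toy_get_async, oom_q, alloc_should_fail) are invented stand-ins; the point is only that a static fallback object must never be installed in the cache, so the next request retries a real allocation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_queue { int ref; };

static struct toy_queue oom_q;		/* models cfqd->oom_cfqq */
static struct toy_queue *async_slot;	/* models the shared async slot */
static int alloc_should_fail;		/* test knob simulating memory pressure */

/* Like cfq_find_alloc_queue(): always returns a valid queue, falling
 * back to the static oom queue when allocation fails. */
static struct toy_queue *toy_find_alloc(void)
{
	struct toy_queue *q = NULL;

	if (!alloc_should_fail)
		q = calloc(1, sizeof(*q));
	return q ? q : &oom_q;
}

static struct toy_queue *toy_get_async(void)
{
	struct toy_queue *q = async_slot;

	if (q)
		return q;		/* cached queue wins */

	q = toy_find_alloc();

	/* The fix: never cache the oom fallback, so the next call
	 * retries a real allocation once memory pressure subsides. */
	if (q != &oom_q)
		async_slot = q;
	return q;
}
```

Without the `q != &oom_q` test, the first allocation failure would pin the oom queue into the slot permanently, which is exactly the behavior the patch removes.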
[parent not found: <1433753973-23684-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 2/8] cfq-iosched: fix async oom queue handling
[not found] ` <1433753973-23684-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-08 18:42 ` Jeff Moyer
0 siblings, 0 replies; 31+ messages in thread
From: Jeff Moyer @ 2015-06-08 18:42 UTC (permalink / raw)
To: Tejun Heo
Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> Async cfqq's (cfq_queue's) are shared across cfq_data.  When
> cfq_get_queue() obtains a new queue from cfq_find_alloc_queue(), it
> stashes the pointer in cfq_data and reuses it from then on; however,
> the function doesn't consider that cfq_find_alloc_queue() may return
> the oom_cfqq under memory pressure and installs the returned queue
> unconditionally.
>
> If the oom_cfqq is installed as an async cfqq, cfq_set_request() will
> continue calling cfq_get_queue() hoping to replace it with a proper
> queue; however, cfq_get_queue() will keep returning the cached queue
> for the slot - the oom_cfqq.
>
> Fix it by skipping caching if the queue is the oom one.

Good catch.

Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread
* [PATCH 3/8] cfq-iosched: fix oom cfq_queue ref leak in cfq_set_request()
[not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 1/8] cfq-iosched: simplify control flow in cfq_get_queue() Tejun Heo
2015-06-08 8:59 ` [PATCH 2/8] cfq-iosched: fix async oom queue handling Tejun Heo
@ 2015-06-08 8:59 ` Tejun Heo
[not found] ` <1433753973-23684-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 4/8] cfq-iosched: minor cleanups Tejun Heo
` (3 subsequent siblings)
6 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-08 8:59 UTC (permalink / raw)
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w,
Tejun Heo

If the cfq_queue cached in cfq_io_cq is the oom one, cfq_set_request()
replaces it by invoking cfq_get_queue() again without putting the oom
queue, leaking the reference it was holding.  While oom queues are not
released through reference counting, they're still reference counted
and this can theoretically lead to the reference count overflowing and
incorrectly invoking the usual release path on it.

Fix it by making cfq_set_request() put the ref it was holding.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 block/cfq-iosched.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index c7b33aa..e0a34ba 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -4231,6 +4231,8 @@ cfq_set_request(struct request_queue *q, struct request *rq, struct bio *bio,
 new_queue:
 	cfqq = cic_to_cfqq(cic, is_sync);
 	if (!cfqq || cfqq == &cfqd->oom_cfqq) {
+		if (cfqq)
+			cfq_put_queue(cfqq);
 		cfqq = cfq_get_queue(cfqd, is_sync, cic, bio, gfp_mask);
 		cic_set_cfqq(cic, cfqq, is_sync);
 	} else {
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread
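[Editor's illustration] The leak pattern is generic: replacing a cached, refcounted object without dropping the reference the cache held. A minimal sketch with invented names (replace_if_oom, oom_q, real_q) modeling the fixed path:

```c
#include <assert.h>

struct toy_queue { int ref; };

static struct toy_queue oom_q  = { .ref = 1 };	/* static fallback queue */
static struct toy_queue real_q = { .ref = 0 };

/* Stand-in for cfq_get_queue(): returns a proper queue with a ref. */
static struct toy_queue *toy_get_queue(void)
{
	real_q.ref++;
	return &real_q;
}

/* Models the fixed cfq_set_request() path: before replacing a cached
 * oom queue with a freshly looked-up one, drop the reference the cache
 * was holding so the oom queue's count cannot creep upward forever. */
static struct toy_queue *replace_if_oom(struct toy_queue *cached)
{
	if (!cached || cached == &oom_q) {
		if (cached)
			cached->ref--;	/* the put that was missing */
		cached = toy_get_queue();
	}
	return cached;
}
```

Without the decrement, every request issued while the oom queue was cached would add a reference that is never returned, which is the theoretical overflow the commit message describes.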
[parent not found: <1433753973-23684-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 3/8] cfq-iosched: fix oom cfq_queue ref leak in cfq_set_request()
[not found] ` <1433753973-23684-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-08 18:51 ` Jeff Moyer
0 siblings, 0 replies; 31+ messages in thread
From: Jeff Moyer @ 2015-06-08 18:51 UTC (permalink / raw)
To: Tejun Heo
Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> If the cfq_queue cached in cfq_io_cq is the oom one, cfq_set_request()
> replaces it by invoking cfq_get_queue() again without putting the oom
> queue leaking the reference it was holding.  While oom queues are not
> released through reference counting, they're still reference counted
> and this can theoretically lead to the reference count overflowing and
> incorrectly invoke the usual release path on it.
>
> Fix it by making cfq_set_request() put the ref it was holding.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread
* [PATCH 4/8] cfq-iosched: minor cleanups
[not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (2 preceding siblings ...)
2015-06-08 8:59 ` [PATCH 3/8] cfq-iosched: fix oom cfq_queue ref leak in cfq_set_request() Tejun Heo
@ 2015-06-08 8:59 ` Tejun Heo
[not found] ` <1433753973-23684-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue() Tejun Heo
` (2 subsequent siblings)
6 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-08 8:59 UTC (permalink / raw)
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w,
Tejun Heo

* Some were accessing cic->cfqq[] directly.  Always use cic_to_cfqq()
  and cic_set_cfqq().

* check_ioprio_changed() doesn't need to verify cfq_get_queue()'s
  return for NULL.  It's always non-NULL.  Simplify accordingly.

This patch doesn't cause any functional changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 block/cfq-iosched.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index e0a34ba..90d5a87 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3438,14 +3438,14 @@ static void cfq_exit_icq(struct io_cq *icq)
 	struct cfq_io_cq *cic = icq_to_cic(icq);
 	struct cfq_data *cfqd = cic_to_cfqd(cic);
 
-	if (cic->cfqq[BLK_RW_ASYNC]) {
-		cfq_exit_cfqq(cfqd, cic->cfqq[BLK_RW_ASYNC]);
-		cic->cfqq[BLK_RW_ASYNC] = NULL;
+	if (cic_to_cfqq(cic, false)) {
+		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, false));
+		cic_set_cfqq(cic, NULL, false);
 	}
 
-	if (cic->cfqq[BLK_RW_SYNC]) {
-		cfq_exit_cfqq(cfqd, cic->cfqq[BLK_RW_SYNC]);
-		cic->cfqq[BLK_RW_SYNC] = NULL;
+	if (cic_to_cfqq(cic, true)) {
+		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, true));
+		cic_set_cfqq(cic, NULL, true);
 	}
 }
 
@@ -3504,18 +3504,14 @@ static void check_ioprio_changed(struct cfq_io_cq *cic, struct bio *bio)
 	if (unlikely(!cfqd) || likely(cic->ioprio == ioprio))
 		return;
 
-	cfqq = cic->cfqq[BLK_RW_ASYNC];
+	cfqq = cic_to_cfqq(cic, false);
 	if (cfqq) {
-		struct cfq_queue *new_cfqq;
-		new_cfqq = cfq_get_queue(cfqd, BLK_RW_ASYNC, cic, bio,
-					 GFP_ATOMIC);
-		if (new_cfqq) {
-			cic->cfqq[BLK_RW_ASYNC] = new_cfqq;
-			cfq_put_queue(cfqq);
-		}
+		cfq_put_queue(cfqq);
+		cfqq = cfq_get_queue(cfqd, BLK_RW_ASYNC, cic, bio, GFP_ATOMIC);
+		cic_set_cfqq(cic, cfqq, false);
 	}
 
-	cfqq = cic->cfqq[BLK_RW_SYNC];
+	cfqq = cic_to_cfqq(cic, true);
 	if (cfqq)
 		cfq_mark_cfqq_prio_changed(cfqq);
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread
[parent not found: <1433753973-23684-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 4/8] cfq-iosched: minor cleanups
[not found] ` <1433753973-23684-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-08 18:59 ` Jeff Moyer
0 siblings, 0 replies; 31+ messages in thread
From: Jeff Moyer @ 2015-06-08 18:59 UTC (permalink / raw)
To: Tejun Heo
Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> * Some were accessing cic->cfqq[] directly.  Always use cic_to_cfqq()
>   and cic_set_cfqq().
>
> * check_ioprio_changed() doesn't need to verify cfq_get_queue()'s
>   return for NULL.  It's always non-NULL.  Simplify accordingly.
>
> This patch doesn't cause any functional changes.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Looks good.  I don't much like the "bool is_sync" function parameters.
Calls to cic_to_cfqq, cic_set_cfqq, etc. would be much easier to read
if they took BLK_RW_SYNC and BLK_RW_ASYNC.  Certainly not a problem
with this patch, though, just a general observation.

Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread
* [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue()
[not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (3 preceding siblings ...)
2015-06-08 8:59 ` [PATCH 4/8] cfq-iosched: minor cleanups Tejun Heo
@ 2015-06-08 8:59 ` Tejun Heo
[not found] ` <1433753973-23684-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 7/8] cfq-iosched: fold cfq_find_alloc_queue() into cfq_get_queue() Tejun Heo
2015-06-09 4:21 ` [PATCH 4.5/8] blkcg, cfq-iosched: use GFP_NOWAIT instead of GFP_ATOMIC for non-critical allocations Tejun Heo
6 siblings, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-08 8:59 UTC (permalink / raw)
To: axboe-tSWWG44O7X1aa/9Udqfwiw
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w,
Tejun Heo

Even when allocations fail, cfq_find_alloc_queue() always returns a
valid cfq_queue by falling back to the oom cfq_queue.  As such, there
isn't much point in taking @gfp_mask and trying "harder" if __GFP_WAIT
is set.  GFP_ATOMIC allocations don't fail often and even when they do
the degraded behavior is acceptable and temporary.

After all, the only reason get_request(), which ultimately determines
the gfp_mask, cares about __GFP_WAIT is to guarantee request
allocation, assuming IO forward progress, for callers which are
willing to wait.  There's no reason for cfq_find_alloc_queue() to
behave differently on __GFP_WAIT when it already has a fallback
mechanism.

Remove @gfp_mask from cfq_find_alloc_queue() and propagate the changes
to its callers.  This simplifies the function quite a bit and will
help making async queues per-cfq_group.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 block/cfq-iosched.c | 45 ++++++++++-----------------------------------
 1 file changed, 10 insertions(+), 35 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 90d5a87..b8e83cd 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -858,8 +858,7 @@ static inline int cfqg_busy_async_queues(struct cfq_data *cfqd,
 static void cfq_dispatch_insert(struct request_queue *, struct request *);
 static struct cfq_queue *cfq_get_queue(struct cfq_data *cfqd, bool is_sync,
-				       struct cfq_io_cq *cic, struct bio *bio,
-				       gfp_t gfp_mask);
+				       struct cfq_io_cq *cic, struct bio *bio);
 
 static inline struct cfq_io_cq *icq_to_cic(struct io_cq *icq)
 {
@@ -3507,7 +3506,7 @@ static void check_ioprio_changed(struct cfq_io_cq *cic, struct bio *bio)
 	cfqq = cic_to_cfqq(cic, false);
 	if (cfqq) {
 		cfq_put_queue(cfqq);
-		cfqq = cfq_get_queue(cfqd, BLK_RW_ASYNC, cic, bio, GFP_ATOMIC);
+		cfqq = cfq_get_queue(cfqd, BLK_RW_ASYNC, cic, bio);
 		cic_set_cfqq(cic, cfqq, false);
 	}
 
@@ -3575,13 +3574,12 @@ static inline void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio) { }
 
 static struct cfq_queue *
 cfq_find_alloc_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
-		     struct bio *bio, gfp_t gfp_mask)
+		     struct bio *bio)
 {
 	struct blkcg *blkcg;
-	struct cfq_queue *cfqq, *new_cfqq = NULL;
+	struct cfq_queue *cfqq;
 	struct cfq_group *cfqg;
 
-retry:
 	rcu_read_lock();
 
 	blkcg = bio_blkcg(bio);
@@ -3598,27 +3596,9 @@ cfq_find_alloc_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 	 * originally, since it should just be a temporary situation.
 	 */
 	if (!cfqq || cfqq == &cfqd->oom_cfqq) {
-		cfqq = NULL;
-		if (new_cfqq) {
-			cfqq = new_cfqq;
-			new_cfqq = NULL;
-		} else if (gfp_mask & __GFP_WAIT) {
-			rcu_read_unlock();
-			spin_unlock_irq(cfqd->queue->queue_lock);
-			new_cfqq = kmem_cache_alloc_node(cfq_pool,
-							 gfp_mask | __GFP_ZERO,
-							 cfqd->queue->node);
-			spin_lock_irq(cfqd->queue->queue_lock);
-			if (new_cfqq)
-				goto retry;
-			else
-				return &cfqd->oom_cfqq;
-		} else {
-			cfqq = kmem_cache_alloc_node(cfq_pool,
-						     gfp_mask | __GFP_ZERO,
-						     cfqd->queue->node);
-		}
-
+		cfqq = kmem_cache_alloc_node(cfq_pool,
+					     GFP_ATOMIC | __GFP_ZERO,
+					     cfqd->queue->node);
 		if (cfqq) {
 			cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
 			cfq_init_prio_data(cfqq, cic);
@@ -3628,9 +3608,6 @@ cfq_find_alloc_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 		cfqq = &cfqd->oom_cfqq;
 	}
 out:
-	if (new_cfqq)
-		kmem_cache_free(cfq_pool, new_cfqq);
-
 	rcu_read_unlock();
 	return cfqq;
 }
@@ -3655,7 +3632,7 @@ cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
 
 static struct cfq_queue *
 cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
-	      struct bio *bio, gfp_t gfp_mask)
+	      struct bio *bio)
 {
 	int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
 	int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
@@ -3674,7 +3651,7 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
 		goto out;
 	}
 
-	cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
+	cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio);
 
 	/*
 	 * pin the queue now that it's allocated, scheduler exit will prune it
@@ -4218,8 +4195,6 @@ cfq_set_request(struct request_queue *q, struct request *rq, struct bio *bio,
 	const bool is_sync = rq_is_sync(rq);
 	struct cfq_queue *cfqq;
 
-	might_sleep_if(gfp_mask & __GFP_WAIT);
-
 	spin_lock_irq(q->queue_lock);
 
 	check_ioprio_changed(cic, bio);
@@ -4229,7 +4204,7 @@ cfq_set_request(struct request_queue *q, struct request *rq, struct bio *bio,
 	if (!cfqq || cfqq == &cfqd->oom_cfqq) {
 		if (cfqq)
 			cfq_put_queue(cfqq);
-		cfqq = cfq_get_queue(cfqd, is_sync, cic, bio, gfp_mask);
+		cfqq = cfq_get_queue(cfqd, is_sync, cic, bio);
 		cic_set_cfqq(cic, cfqq, is_sync);
 	} else {
-- 
2.4.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread
[parent not found: <1433753973-23684-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue()
[not found] ` <1433753973-23684-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2015-06-08 19:24 ` Jeff Moyer
2015-06-08 20:27 ` Jeff Moyer
0 siblings, 1 reply; 31+ messages in thread
From: Jeff Moyer @ 2015-06-08 19:24 UTC (permalink / raw)
To: Tejun Heo
Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> Even when allocations fail, cfq_find_alloc_queue() always returns a
> valid cfq_queue by falling back to the oom cfq_queue.  As such, there
> isn't much point in taking @gfp_mask and trying "harder" if __GFP_WAIT
> is set.  GFP_ATOMIC allocations don't fail often and even when they do
> the degraded behavior is acceptable and temporary.
>
> After all, the only reason get_request(), which ultimately determines
> the gfp_mask, cares about __GFP_WAIT is to guarantee request
> allocation, assuming IO forward progress, for callers which are
> willing to wait.  There's no reason for cfq_find_alloc_queue() to
> behave differently on __GFP_WAIT when it already has a fallback
> mechanism.
>
> Remove @gfp_mask from cfq_find_alloc_queue() and propagate the changes
> to its callers.  This simplifies the function quite a bit and will
> help making async queues per-cfq_group.

Sorry, I disagree with this patch.  You've changed it so that all cfqq
allocations are GFP_ATOMIC, and most, if not all of them simply don't
need to be.

I'll take it one step further and suggest that we should fix
cfq_find_alloc_queue to pass the gfp_mask to check_ioprio_changed.
That also shouldn't be using GFP_ATOMIC unconditionally.

NAK

-Jeff

^ permalink raw reply	[flat|nested] 31+ messages in thread
* Re: [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue()
2015-06-08 19:24 ` Jeff Moyer
@ 2015-06-08 20:27 ` Jeff Moyer
[not found] ` <x49sia2gd41.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
0 siblings, 1 reply; 31+ messages in thread
From: Jeff Moyer @ 2015-06-08 20:27 UTC (permalink / raw)
To: Tejun Heo; +Cc: axboe, linux-kernel, cgroups, vgoyal, avanzini.arianna

Jeff Moyer <jmoyer@redhat.com> writes:

> Tejun Heo <tj@kernel.org> writes:
>
>> Even when allocations fail, cfq_find_alloc_queue() always returns a
>> valid cfq_queue by falling back to the oom cfq_queue.  As such, there
>> isn't much point in taking @gfp_mask and trying "harder" if __GFP_WAIT
>> is set.  GFP_ATOMIC allocations don't fail often and even when they do
>> the degraded behavior is acceptable and temporary.
>>
>> After all, the only reason get_request(), which ultimately determines
>> the gfp_mask, cares about __GFP_WAIT is to guarantee request
>> allocation, assuming IO forward progress, for callers which are
>> willing to wait.  There's no reason for cfq_find_alloc_queue() to
>> behave differently on __GFP_WAIT when it already has a fallback
>> mechanism.
>>
>> Remove @gfp_mask from cfq_find_alloc_queue() and propagate the changes
>> to its callers.  This simplifies the function quite a bit and will
>> help making async queues per-cfq_group.
>
> Sorry, I disagree with this patch.  You've changed it so that all cfqq
> allocations are GFP_ATOMIC, and most, if not all of them simply don't
> need to be.

It occurs to me that replacing GFP_ATOMIC with GFP_NOWAIT in your patch
would address my concerns, and patches 6-8 would apply almost as-is.
What do you think about that?

-Jeff

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <x49sia2gd41.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue()
[not found] ` <x49sia2gd41.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
@ 2015-06-08 21:19 ` Vivek Goyal
[not found] ` <20150608211930.GA20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-06-09 3:00 ` Tejun Heo
1 sibling, 1 reply; 31+ messages in thread
From: Vivek Goyal @ 2015-06-08 21:19 UTC (permalink / raw)
To: Jeff Moyer
Cc: Tejun Heo, axboe-tSWWG44O7X1aa/9Udqfwiw,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

On Mon, Jun 08, 2015 at 04:27:10PM -0400, Jeff Moyer wrote:
> Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>
> > Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:
> >
> >> Even when allocations fail, cfq_find_alloc_queue() always returns a
> >> valid cfq_queue by falling back to the oom cfq_queue.  As such, there
> >> isn't much point in taking @gfp_mask and trying "harder" if __GFP_WAIT
> >> is set.  GFP_ATOMIC allocations don't fail often and even when they do
> >> the degraded behavior is acceptable and temporary.
> >>
> >> After all, the only reason get_request(), which ultimately determines
> >> the gfp_mask, cares about __GFP_WAIT is to guarantee request
> >> allocation, assuming IO forward progress, for callers which are
> >> willing to wait.  There's no reason for cfq_find_alloc_queue() to
> >> behave differently on __GFP_WAIT when it already has a fallback
> >> mechanism.
> >>
> >> Remove @gfp_mask from cfq_find_alloc_queue() and propagate the changes
> >> to its callers.  This simplifies the function quite a bit and will
> >> help making async queues per-cfq_group.
> >
> > Sorry, I disagree with this patch.  You've changed it so that all cfqq
> > allocations are GFP_ATOMIC, and most, if not all of them simply don't
> > need to be.
>
> It occurs to me that replacing GFP_ATOMIC with GFP_NOWAIT in your patch
> would address my concerns, and patches 6-8 would apply almost as-is.
> What do you think about that?

Whatever we end up using, maybe it is a good idea to use the same policy
for block group allocation too.  Right now we use GFP_ATOMIC for blkcg
allocation.

So this will be the equivalent of that: when memory is low, we don't
provide service differentiation.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <20150608211930.GA20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue()
[not found] ` <20150608211930.GA20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-06-09 3:01 ` Tejun Heo
0 siblings, 0 replies; 31+ messages in thread
From: Tejun Heo @ 2015-06-09 3:01 UTC (permalink / raw)
To: Vivek Goyal
Cc: Jeff Moyer, axboe-tSWWG44O7X1aa/9Udqfwiw,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

On Mon, Jun 08, 2015 at 05:19:30PM -0400, Vivek Goyal wrote:
> Whatever we end up using, may be it is a good idea to use same policy
> for block group allocation too.  Right now we use GFP_ATOMIC for blkcg
> allocation.
>
> So this will be equivalent of that when memory is low, we don't provide
> service differentiation.

Yeap, I'll add a patch to convert all optional allocations to
GFP_NOWAIT.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread
* Re: [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue()
[not found] ` <x49sia2gd41.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2015-06-08 21:19 ` Vivek Goyal
@ 2015-06-09 3:00 ` Tejun Heo
[not found] ` <20150609030054.GJ21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
1 sibling, 1 reply; 31+ messages in thread
From: Tejun Heo @ 2015-06-09 3:00 UTC (permalink / raw)
To: Jeff Moyer
Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Hey, Jeff.

On Mon, Jun 08, 2015 at 04:27:10PM -0400, Jeff Moyer wrote:
> >> Remove @gfp_mask from cfq_find_alloc_queue() and propagate the changes
> >> to its callers.  This simplifies the function quite a bit and will
> >> help making async queues per-cfq_group.
> >
> > Sorry, I disagree with this patch.  You've changed it so that all cfqq
> > allocations are GFP_ATOMIC, and most, if not all of them simply don't
> > need to be.
>
> It occurs to me that replacing GFP_ATOMIC with GFP_NOWAIT in your patch
> would address my concerns, and patches 6-8 would apply almost as-is.
> What do you think about that?

Oh yeah, it's okay to fail these allocations under memory pressure.
GFP_NOWAIT is the better pick here.  It's GFP_ATOMIC mostly due to
historic copy&paste anyway.  I'll change them to GFP_NOWAIT.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 31+ messages in thread
[parent not found: <20150609030054.GJ21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>]
* Re: [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue()
[not found] ` <20150609030054.GJ21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-06-09 14:29 ` Jeff Moyer
0 siblings, 0 replies; 31+ messages in thread
From: Jeff Moyer @ 2015-06-09 14:29 UTC (permalink / raw)
To: Tejun Heo
Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w

Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:

> Hey, Jeff.
>
> On Mon, Jun 08, 2015 at 04:27:10PM -0400, Jeff Moyer wrote:
>> >> Remove @gfp_mask from cfq_find_alloc_queue() and propagate the changes
>> >> to its callers.  This simplifies the function quite a bit and will
>> >> help making async queues per-cfq_group.
>> >
>> > Sorry, I disagree with this patch.  You've changed it so that all cfqq
>> > allocations are GFP_ATOMIC, and most, if not all of them simply don't
>> > need to be.
>>
>> It occurs to me that replacing GFP_ATOMIC with GFP_NOWAIT in your patch
>> would address my concerns, and patches 6-8 would apply almost as-is.
>> What do you think about that?
>
> Oh yeah, it's okay to fail these allocations under memory pressure.
> GFP_NOWAIT is the better pick here.  It's GFP_ATOMIC mostly due to
> historic copy&paste anyway.  I'll change them to GFP_NOWAIT.

OK.  Assuming that's the only change, you can go ahead and add my

Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

to the next iteration.

Thanks!
Jeff

^ permalink raw reply	[flat|nested] 31+ messages in thread
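[Editor's illustration] The GFP_ATOMIC vs GFP_NOWAIT distinction settled in the subthread above: both are non-blocking, but GFP_ATOMIC may additionally dip into the page allocator's emergency reserves, which these optional, fallback-backed allocations don't deserve. A toy userspace model of that distinction (alloc_nowait/alloc_atomic and the pool counters are invented names, not kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy memory pool: a normal region plus a small emergency reserve. */
static int free_pages = 0;		/* normal pool, in pages */
static int reserve_pages = 2;		/* emergency reserve */

/* GFP_NOWAIT-like: never blocks, never touches the reserve. */
static bool alloc_nowait(void)
{
	if (free_pages > 0) {
		free_pages--;
		return true;
	}
	return false;			/* caller must tolerate failure */
}

/* GFP_ATOMIC-like: never blocks either, but may consume the reserve,
 * which is meant for allocations that truly cannot fail gracefully. */
static bool alloc_atomic(void)
{
	if (alloc_nowait())
		return true;
	if (reserve_pages > 0) {
		reserve_pages--;
		return true;
	}
	return false;
}
```

In this model, switching an optional allocation from alloc_atomic() to alloc_nowait() means it simply fails under pressure instead of draining the reserve, which is the trade-off the thread converges on for cfqq and blkcg allocations.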
* [PATCH 7/8] cfq-iosched: fold cfq_find_alloc_queue() into cfq_get_queue() [not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> ` (4 preceding siblings ...) 2015-06-08 8:59 ` [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue() Tejun Heo @ 2015-06-08 8:59 ` Tejun Heo [not found] ` <1433753973-23684-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> 2015-06-09 4:21 ` [PATCH 4.5/8] blkcg, cfq-iosched: use GFP_NOWAIT instead of GFP_ATOMIC for non-critical allocations Tejun Heo 6 siblings, 1 reply; 31+ messages in thread From: Tejun Heo @ 2015-06-08 8:59 UTC (permalink / raw) To: axboe-tSWWG44O7X1aa/9Udqfwiw Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w, Tejun Heo cfq_find_alloc_queue() checks whether a queue actually needs to be allocated, which is unnecessary as its sole caller, cfq_get_queue(), only calls it if so. Also, the oom queue fallback logic is scattered between cfq_get_queue() and cfq_find_alloc_queue(). There really isn't much going on in the latter and things can be made simpler by folding it into cfq_get_queue(). This patch collapses cfq_find_alloc_queue() into cfq_get_queue(). The change is fairly straight-forward with one exception - async_cfqq is now initialized to NULL and the "!is_sync" test in the last if conditional is replaced with "async_cfqq" test. This is because gcc (5.1.1) gets confused for some reason and warns that async_cfqq may be used uninitialized otherwise. Oh well, the code isn't necessarily worse this way. This patch doesn't cause any functional difference. 
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Cc: Arianna Avanzini <avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> --- block/cfq-iosched.c | 49 +++++++++++++++---------------------------------- 1 file changed, 15 insertions(+), 34 deletions(-) diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index a775128..393befb 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -3572,33 +3572,6 @@ static void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio) static inline void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio) { } #endif /* CONFIG_CFQ_GROUP_IOSCHED */ -static struct cfq_queue * -cfq_find_alloc_queue(struct cfq_data *cfqd, struct cfq_group *cfqg, bool is_sync, - struct cfq_io_cq *cic, struct bio *bio) -{ - struct cfq_queue *cfqq; - - cfqq = cic_to_cfqq(cic, is_sync); - - /* - * Always try a new alloc if we fell back to the OOM cfqq - * originally, since it should just be a temporary situation. 
- */ - if (!cfqq || cfqq == &cfqd->oom_cfqq) { - cfqq = kmem_cache_alloc_node(cfq_pool, - GFP_ATOMIC | __GFP_ZERO, - cfqd->queue->node); - if (cfqq) { - cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync); - cfq_init_prio_data(cfqq, cic); - cfq_link_cfqq_cfqg(cfqq, cfqg); - cfq_log_cfqq(cfqd, cfqq, "alloced"); - } else - cfqq = &cfqd->oom_cfqq; - } - return cfqq; -} - static struct cfq_queue ** cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio) { @@ -3623,7 +3596,7 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic, { int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio); int ioprio = IOPRIO_PRIO_DATA(cic->ioprio); - struct cfq_queue **async_cfqq; + struct cfq_queue **async_cfqq = NULL; struct cfq_queue *cfqq; struct cfq_group *cfqg; @@ -3646,12 +3619,20 @@ cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic, goto out; } - cfqq = cfq_find_alloc_queue(cfqd, cfqg, is_sync, cic, bio); + cfqq = kmem_cache_alloc_node(cfq_pool, GFP_ATOMIC | __GFP_ZERO, + cfqd->queue->node); + if (!cfqq) { + cfqq = &cfqd->oom_cfqq; + goto out; + } + + cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync); + cfq_init_prio_data(cfqq, cic); + cfq_link_cfqq_cfqg(cfqq, cfqg); + cfq_log_cfqq(cfqd, cfqq, "alloced"); - /* - * pin the queue now that it's allocated, scheduler exit will prune it - */ - if (!is_sync && cfqq != &cfqd->oom_cfqq) { + if (async_cfqq) { + /* a new async queue is created, pin and remember */ cfqq->ref++; *async_cfqq = cfqq; } @@ -4401,7 +4382,7 @@ static int cfq_init_queue(struct request_queue *q, struct elevator_type *e) cfqd->prio_trees[i] = RB_ROOT; /* - * Our fallback cfqq if cfq_find_alloc_queue() runs into OOM issues. + * Our fallback cfqq if cfq_get_queue() runs into OOM issues. * Grab a permanent reference to it, so that the normal code flow * will not attempt to free it. oom_cfqq is linked to root_group * but shouldn't hold a reference as it'll never be unlinked. 
Lose -- 2.4.2 ^ permalink raw reply related [flat|nested] 31+ messages in thread
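[Editorial note: the control flow that results from the fold above can be distilled into a small userspace sketch. The types, array bounds and `fail_alloc` knob are illustrative stand-ins rather than the kernel's, but the shape — NULL-initialized async slot, reuse of a shared async queue, inline OOM fallback, and the pin-and-remember step — follows the diff:]

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-ins, not the kernel types. */
struct queue {
	int ref;
	int is_sync;
};

struct queue oom_queue;			/* the permanent fallback */
struct queue *async_slots[3][8];	/* per (ioprio_class, ioprio) slot table */

struct queue *get_queue(int is_sync, int cls, int prio, int fail_alloc)
{
	struct queue **async_slot = NULL;	/* NULL-initialized, as in the patch */
	struct queue *q;

	if (!is_sync) {
		/* async queues are shared: reuse one if it already exists */
		async_slot = &async_slots[cls][prio];
		q = *async_slot;
		if (q)
			goto out;
	}

	q = fail_alloc ? NULL : calloc(1, sizeof(*q));
	if (!q) {
		q = &oom_queue;		/* OOM fallback, now inline in the caller */
		goto out;
	}
	q->is_sync = is_sync;

	if (async_slot) {
		/* a new async queue was created: pin it and remember it */
		q->ref++;
		*async_slot = q;
	}
out:
	q->ref++;			/* reference handed to the caller */
	return q;
}
```

The extra reference taken when `async_slot` is set mirrors the "pin the queue now that it's allocated, scheduler exit will prune it" comment the diff removes; testing `async_slot` instead of `!is_sync` is exactly the change the changelog makes to quiet gcc.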
[parent not found: <1433753973-23684-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH 7/8] cfq-iosched: fold cfq_find_alloc_queue() into cfq_get_queue() [not found] ` <1433753973-23684-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> @ 2015-06-09 14:40 ` Jeff Moyer [not found] ` <x49twuhrlml.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org> 0 siblings, 1 reply; 31+ messages in thread From: Jeff Moyer @ 2015-06-09 14:40 UTC (permalink / raw) To: Tejun Heo Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes: > cfq_find_alloc_queue() checks whether a queue actually needs to be > allocated, which is unnecessary as its sole caller, cfq_get_queue(), > only calls it if so. Also, the oom queue fallback logic is scattered > between cfq_get_queue() and cfq_find_alloc_queue(). There really > isn't much going on in the latter and things can be made simpler by > folding it into cfq_get_queue(). > > This patch collapses cfq_find_alloc_queue() into cfq_get_queue(). The > change is fairly straight-forward with one exception - async_cfqq is > now initialized to NULL and the "!is_sync" test in the last if > conditional is replaced with "async_cfqq" test. This is because gcc > (5.1.1) gets confused for some reason and warns that async_cfqq may be > used uninitialized otherwise. Oh well, the code isn't necessarily > worse this way. > > This patch doesn't cause any functional difference. 
The resulting code (introduced by the last patch, I know) is not ideal: rcu_read_lock(); cfqg = cfq_lookup_create_cfqg(cfqd, bio_blkcg(bio)); if (!cfqg) { cfqq = &cfqd->oom_cfqq; goto out; } if (!is_sync) { if (!ioprio_valid(cic->ioprio)) { struct task_struct *tsk = current; ioprio = task_nice_ioprio(tsk); ioprio_class = task_nice_ioclass(tsk); } async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio); cfqq = *async_cfqq; if (cfqq) goto out; } As you mentioned, we don't need to lookup the cfqg for the async queue. What's more is we could fallback to the oom_cfqq even if we had an existing async cfqq. I'm guessing you structured the code this way to make the error path cleaner. I don't think it's a big deal, as it should be a rare occurrence, so... Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ^ permalink raw reply [flat|nested] 31+ messages in thread
[parent not found: <x49twuhrlml.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>]
* Re: [PATCH 7/8] cfq-iosched: fold cfq_find_alloc_queue() into cfq_get_queue() [not found] ` <x49twuhrlml.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org> @ 2015-06-10 2:47 ` Tejun Heo 0 siblings, 0 replies; 31+ messages in thread From: Tejun Heo @ 2015-06-10 2:47 UTC (permalink / raw) To: Jeff Moyer Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w Hey, Jeff. On Tue, Jun 09, 2015 at 10:40:02AM -0400, Jeff Moyer wrote: > The resulting code (introduced by the last patch, I know) is not ideal: > > rcu_read_lock(); > cfqg = cfq_lookup_create_cfqg(cfqd, bio_blkcg(bio)); > if (!cfqg) { > cfqq = &cfqd->oom_cfqq; > goto out; > } > > if (!is_sync) { > if (!ioprio_valid(cic->ioprio)) { > struct task_struct *tsk = current; > ioprio = task_nice_ioprio(tsk); > ioprio_class = task_nice_ioclass(tsk); > } > async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, > ioprio); > cfqq = *async_cfqq; > if (cfqq) > goto out; > } > > As you mentioned, we don't need to lookup the cfqg for the async queue. > What's more is we could fallback to the oom_cfqq even if we had an > existing async cfqq. I'm guessing you structured the code this way to > make the error path cleaner. I don't think it's a big deal, as it > should be a rare occurrence, so... In this patch, the lookup seems unnecessary for the async case but the change is required for the following changes because async queues are moved from cfq_data to cfq_group, so we can't determine async queues w/o looking up cfqg at all and we'll have to fall back to oom_cfqq if cfqg lookup fails (cuz there's no system-wide async queues). Thanks. -- tejun ^ permalink raw reply [flat|nested] 31+ messages in thread
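[Editorial note: a minimal sketch of the structure Tejun describes — the async queue array living on the per-cgroup group rather than the per-device cfq_data, so a failed group lookup leaves only the oom fallback. All names and shapes here are hypothetical, chosen only to show why the lookup must precede the async-queue check:]

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shapes: in the real patchset the async queue array moves
 * from cfq_data (one per request_queue) into cfq_group (one per cgroup). */
struct queue { int id; };
struct group { struct queue *async[3][8]; };

struct queue oom_queue = { .id = -1 };
static int next_id;

/* Find or create the shared async queue for (class, prio) *within a group*.
 * With no group there is no system-wide slot left to consult, so the only
 * remaining option is the oom queue. */
struct queue *group_async_queue(struct group *cfqg, int cls, int prio)
{
	struct queue *q;

	if (!cfqg)
		return &oom_queue;	/* cfqg lookup failed */
	q = cfqg->async[cls][prio];
	if (!q) {
		q = calloc(1, sizeof(*q));
		if (!q)
			return &oom_queue;
		q->id = next_id++;
		cfqg->async[cls][prio] = q;	/* shared by later lookups */
	}
	return q;
}
```

Two tasks in the same cgroup share one async queue per priority, while tasks in different cgroups get distinct ones — which is what lets async IO follow cfq's per-cgroup weight distribution.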
* [PATCH 4.5/8] blkcg, cfq-iosched: use GFP_NOWAIT instead of GFP_ATOMIC for non-critical allocations [not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> ` (5 preceding siblings ...) 2015-06-08 8:59 ` [PATCH 7/8] cfq-iosched: fold cfq_find_alloc_queue() into cfq_get_queue() Tejun Heo @ 2015-06-09 4:21 ` Tejun Heo [not found] ` <20150609042131.GN21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org> 6 siblings, 1 reply; 31+ messages in thread From: Tejun Heo @ 2015-06-09 4:21 UTC (permalink / raw) To: axboe-tSWWG44O7X1aa/9Udqfwiw Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w From b848495d2c987e960d1f7794966d82c05fcefc6d Mon Sep 17 00:00:00 2001 From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Date: Tue, 9 Jun 2015 13:19:40 +0900 blkcg performs several allocations to track IOs per cgroup and enforce resource control. Most of these allocations are performed lazily on demand in the IO path and thus can't involve reclaim path. Currently, these allocations use GFP_ATOMIC; however, blkcg can gracefully deal with occasional failures of these allocations by punting IOs to the root cgroup and there's no reason to reach into the emergency reserve. This patch replaces GFP_ATOMIC with GFP_NOWAIT for the following allocations. * bdi_writeback_congested and blkcg_gq allocations in blkg_create(). * radix tree node allocations for blkcg->blkg_tree. * cfq_queue allocation on ioprio changes. Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> Suggested-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Suggested-by: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- Hello, git branch updated accordingly. A couple of later patches need to be updated but the changes are trivial. I'll repost the patchset once review is done. Thanks. 
block/blk-cgroup.c | 8 ++++---- block/cfq-iosched.c | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 31610ae..1fddbbd 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -175,7 +175,7 @@ EXPORT_SYMBOL_GPL(blkg_lookup); /* * If @new_blkg is %NULL, this function tries to allocate a new one as - * necessary using %GFP_ATOMIC. @new_blkg is always consumed on return. + * necessary using %GFP_NOWAIT. @new_blkg is always consumed on return. */ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, struct request_queue *q, @@ -195,7 +195,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, } wb_congested = wb_congested_get_create(&q->backing_dev_info, - blkcg->css.id, GFP_ATOMIC); + blkcg->css.id, GFP_NOWAIT); if (!wb_congested) { ret = -ENOMEM; goto err_put_css; @@ -203,7 +203,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, /* allocate */ if (!new_blkg) { - new_blkg = blkg_alloc(blkcg, q, GFP_ATOMIC); + new_blkg = blkg_alloc(blkcg, q, GFP_NOWAIT); if (unlikely(!new_blkg)) { ret = -ENOMEM; goto err_put_congested; @@ -841,7 +841,7 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css) blkcg->cfq_leaf_weight = CFQ_WEIGHT_DEFAULT; done: spin_lock_init(&blkcg->lock); - INIT_RADIX_TREE(&blkcg->blkg_tree, GFP_ATOMIC); + INIT_RADIX_TREE(&blkcg->blkg_tree, GFP_NOWAIT); INIT_HLIST_HEAD(&blkcg->blkg_list); #ifdef CONFIG_CGROUP_WRITEBACK INIT_LIST_HEAD(&blkcg->cgwb_list); diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 90d5a87..97863ce 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -3507,7 +3507,7 @@ static void check_ioprio_changed(struct cfq_io_cq *cic, struct bio *bio) cfqq = cic_to_cfqq(cic, false); if (cfqq) { cfq_put_queue(cfqq); - cfqq = cfq_get_queue(cfqd, BLK_RW_ASYNC, cic, bio, GFP_ATOMIC); + cfqq = cfq_get_queue(cfqd, BLK_RW_ASYNC, cic, bio, GFP_NOWAIT); cic_set_cfqq(cic, cfqq, false); } -- 2.4.2 ^ permalink raw reply related [flat|nested] 31+ 
messages in thread
[parent not found: <20150609042131.GN21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>]
* Re: [PATCH 4.5/8] blkcg, cfq-iosched: use GFP_NOWAIT instead of GFP_ATOMIC for non-critical allocations [not found] ` <20150609042131.GN21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org> @ 2015-06-09 14:27 ` Jeff Moyer 0 siblings, 0 replies; 31+ messages in thread From: Jeff Moyer @ 2015-06-09 14:27 UTC (permalink / raw) To: Tejun Heo Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA, avanzini.arianna-Re5JQEeQqe8AvxtiuMwx3w Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes: > From b848495d2c987e960d1f7794966d82c05fcefc6d Mon Sep 17 00:00:00 2001 > From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Date: Tue, 9 Jun 2015 13:19:40 +0900 > > blkcg performs several allocations to track IOs per cgroup and enforce > resource control. Most of these allocations are performed lazily on > demand in the IO path and thus can't involve reclaim path. Currently, > these allocations use GFP_ATOMIC; however, blkcg can gracefully deal > with occassional failures of these allocations by punting IOs to the > root cgroup and there's no reason to reach into the emergency reserve. > > This patch replaces GFP_ATOMIC with GFP_NOWAIT for the following > allocations. > > * bdi_writeback_congested and blkcg_gq allocations in blkg_create(). > > * radix tree node allocations for blkcg->blkg_tree. > > * cfq_queue allocation on ioprio changes. > > Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Suggested-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > Suggested-by: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Thanks, Tejun! Reviewed-by: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, last message ~2015-06-10 2:49 UTC
Thread overview: 31+ messages
2015-06-08 8:59 [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs Tejun Heo
2015-06-08 8:59 ` [PATCH 6/8] cfq-iosched: move cfq_group determination from cfq_find_alloc_queue() to cfq_get_queue() Tejun Heo
[not found] ` <1433753973-23684-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-09 14:32 ` Jeff Moyer
2015-06-08 8:59 ` [PATCH 8/8] cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root Tejun Heo
[not found] ` <1433753973-23684-9-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 22:29 ` Vivek Goyal
[not found] ` <20150608222904.GB20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-06-09 3:11 ` Tejun Heo
2015-06-09 15:03 ` Jeff Moyer
2015-06-08 19:49 ` [PATCHSET block/for-4.2/writeback] block, cgroup: make cfq charge async IOs to the appropriate blkcgs Jeff Moyer
[not found] ` <x491thmhtej.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2015-06-09 3:03 ` Tejun Heo
[not found] ` <20150609030327.GL21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2015-06-09 15:05 ` Jeff Moyer
[not found] ` <x49lhfssz05.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2015-06-10 2:49 ` Tejun Heo
[not found] ` <1433753973-23684-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 8:59 ` [PATCH 1/8] cfq-iosched: simplify control flow in cfq_get_queue() Tejun Heo
[not found] ` <1433753973-23684-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 18:36 ` Jeff Moyer
2015-06-08 8:59 ` [PATCH 2/8] cfq-iosched: fix async oom queue handling Tejun Heo
[not found] ` <1433753973-23684-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 18:42 ` Jeff Moyer
2015-06-08 8:59 ` [PATCH 3/8] cfq-iosched: fix oom cfq_queue ref leak in cfq_set_request() Tejun Heo
[not found] ` <1433753973-23684-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 18:51 ` Jeff Moyer
2015-06-08 8:59 ` [PATCH 4/8] cfq-iosched: minor cleanups Tejun Heo
[not found] ` <1433753973-23684-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 18:59 ` Jeff Moyer
2015-06-08 8:59 ` [PATCH 5/8] cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue() Tejun Heo
[not found] ` <1433753973-23684-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-08 19:24 ` Jeff Moyer
2015-06-08 20:27 ` Jeff Moyer
[not found] ` <x49sia2gd41.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2015-06-08 21:19 ` Vivek Goyal
[not found] ` <20150608211930.GA20918-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-06-09 3:01 ` Tejun Heo
2015-06-09 3:00 ` Tejun Heo
[not found] ` <20150609030054.GJ21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2015-06-09 14:29 ` Jeff Moyer
2015-06-08 8:59 ` [PATCH 7/8] cfq-iosched: fold cfq_find_alloc_queue() into cfq_get_queue() Tejun Heo
[not found] ` <1433753973-23684-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-09 14:40 ` Jeff Moyer
[not found] ` <x49twuhrlml.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2015-06-10 2:47 ` Tejun Heo
2015-06-09 4:21 ` [PATCH 4.5/8] blkcg, cfq-iosched: use GFP_NOWAIT instead of GFP_ATOMIC for non-critical allocations Tejun Heo
[not found] ` <20150609042131.GN21465-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2015-06-09 14:27 ` Jeff Moyer