* [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
@ 2016-10-18 7:12 zhouxianrong
2016-10-18 9:34 ` Hillf Danton
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: zhouxianrong @ 2016-10-18 7:12 UTC (permalink / raw)
To: linux-mm
Cc: linux-kernel, akpm, viro, mingo, peterz, hannes, mgorman, vbabka,
mhocko, vdavydov.dev, minchan, riel, zhouxianrong, zhouxiyu,
zhangshiming5, won.ho.park, tuxiaobing
From: z00281421 <z00281421@notesmail.huawei.com>
The bdi flusher may enter the page alloc slow path due to writepage and kmalloc.
In that case the flusher, as a direct reclaimer, should not be throttled here
because it then cannot reclaim clean file pages or anonymous pages
for a while; furthermore, the writeback rate of dirty pages would
slow down, and other direct reclaimers and kswapd would be affected.
The bdi flusher should be io-scheduled by get_request rather than here.
Signed-off-by: z00281421 <z00281421@notesmail.huawei.com>
---
fs/fs-writeback.c | 4 ++--
include/linux/sched.h | 1 +
mm/vmscan.c | 15 +++++++++++----
3 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 05713a5..f6bf067 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1908,7 +1908,7 @@ void wb_workfn(struct work_struct *work)
 	long pages_written;

 	set_worker_desc("flush-%s", dev_name(wb->bdi->dev));
-	current->flags |= PF_SWAPWRITE;
+	current->flags |= (PF_SWAPWRITE | PF_BDI_FLUSHER | PF_LESS_THROTTLE);

 	if (likely(!current_is_workqueue_rescuer() ||
 		   !test_bit(WB_registered, &wb->state))) {
@@ -1938,7 +1938,7 @@ void wb_workfn(struct work_struct *work)
 	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
 		wb_wakeup_delayed(wb);

-	current->flags &= ~PF_SWAPWRITE;
+	current->flags &= ~(PF_SWAPWRITE | PF_BDI_FLUSHER | PF_LESS_THROTTLE);
 }

 /*
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 62c68e5..4bb70f2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2232,6 +2232,7 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
 #define PF_KTHREAD	0x00200000	/* I am a kernel thread */
 #define PF_RANDOMIZE	0x00400000	/* randomize virtual address space */
 #define PF_SWAPWRITE	0x00800000	/* Allowed to write to swap */
+#define PF_BDI_FLUSHER	0x01000000	/* I am bdi flusher */
 #define PF_NO_SETAFFINITY 0x04000000	/* Userland is not allowed to meddle with cpus_allowed */
 #define PF_MCE_EARLY	0x08000000	/* Early kill for mce process policy */
 #define PF_MUTEX_TESTER	0x20000000	/* Thread belongs to the rt mutex tester */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0fe8b71..492e9e7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1643,12 +1643,19 @@ putback_inactive_pages(struct lruvec *lruvec, struct list_head *page_list)
  * If a kernel thread (such as nfsd for loop-back mounts) services
  * a backing device by writing to the page cache it sets PF_LESS_THROTTLE.
  * In that case we should only throttle if the backing device it is
- * writing to is congested.  In other cases it is safe to throttle.
+ * writing to is congested. Another case is the bdi flusher, which must
+ * not be throttled here even when its own bdi is congested.
+ * In other cases it is safe to throttle.
  */
-static int current_may_throttle(void)
+static bool current_may_throttle(void)
 {
-	return !(current->flags & PF_LESS_THROTTLE) ||
-		current->backing_dev_info == NULL ||
+	if (!(current->flags & PF_LESS_THROTTLE))
+		return true;
+
+	if (current->flags & PF_BDI_FLUSHER)
+		return false;
+
+	return current->backing_dev_info == NULL ||
 		bdi_write_congested(current->backing_dev_info);
 }
--
1.7.9.5
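
For readers following the logic change in current_may_throttle(), here is a
minimal user-space sketch of the decision before and after this patch. It is
not kernel code: struct task_model and both function names are hypothetical,
the two booleans stand in for current->backing_dev_info != NULL and
bdi_write_congested(), and the PF_LESS_THROTTLE value is only illustrative.

```c
#include <stdbool.h>

/* Illustrative flag bits; PF_BDI_FLUSHER uses the value proposed in the patch. */
#define PF_LESS_THROTTLE 0x00100000u
#define PF_BDI_FLUSHER   0x01000000u

/* Hypothetical stand-in for the bits of task_struct that matter here. */
struct task_model {
	unsigned int flags;
	bool has_bdi;        /* models current->backing_dev_info != NULL */
	bool bdi_congested;  /* models bdi_write_congested(...) */
};

/* Logic before the patch: a single boolean expression. */
static bool may_throttle_old(const struct task_model *t)
{
	return !(t->flags & PF_LESS_THROTTLE) ||
	       !t->has_bdi ||
	       t->bdi_congested;
}

/* Logic after the patch: a flusher is never throttled here. */
static bool may_throttle_new(const struct task_model *t)
{
	if (!(t->flags & PF_LESS_THROTTLE))
		return true;

	if (t->flags & PF_BDI_FLUSHER)
		return false;

	return !t->has_bdi || t->bdi_congested;
}
```

Under this model, a task that sets PF_LESS_THROTTLE but leaves has_bdi false
is still throttled by the old expression, which is presumably why the patch
adds a separate PF_BDI_FLUSHER bit instead of relying on PF_LESS_THROTTLE alone.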
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
2016-10-18 7:12 [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path zhouxianrong
@ 2016-10-18 9:34 ` Hillf Danton
2016-10-18 9:59 ` Mel Gorman
2016-10-20 12:38 ` zhouxianrong
2 siblings, 0 replies; 8+ messages in thread
From: Hillf Danton @ 2016-10-18 9:34 UTC (permalink / raw)
To: zhouxianrong, linux-mm
Cc: linux-kernel, akpm, viro, mingo, peterz, hannes, mgorman, vbabka,
mhocko, vdavydov.dev, minchan, riel, zhouxiyu, zhangshiming5,
won.ho.park, tuxiaobing

> @@ -1908,7 +1908,7 @@ void wb_workfn(struct work_struct *work)
>  	long pages_written;
>
>  	set_worker_desc("flush-%s", dev_name(wb->bdi->dev));
> -	current->flags |= PF_SWAPWRITE;

If flags already carries PF_LESS_THROTTLE before it is modified here, then
you have to restore it.

> +	current->flags |= (PF_SWAPWRITE | PF_BDI_FLUSHER | PF_LESS_THROTTLE);
>
>  	if (likely(!current_is_workqueue_rescuer() ||
>  		   !test_bit(WB_registered, &wb->state))) {
> @@ -1938,7 +1938,7 @@ void wb_workfn(struct work_struct *work)
>  	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
>  		wb_wakeup_delayed(wb);
>
> -	current->flags &= ~PF_SWAPWRITE;
> +	current->flags &= ~(PF_SWAPWRITE | PF_BDI_FLUSHER | PF_LESS_THROTTLE);
>  }
>

thanks
Hillf
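
The save-and-restore pattern Hillf asks for can be sketched in user-space C as
below. This is only an illustration: flags_set_saved() and flags_restore() are
hypothetical names, the flag values are illustrative, and the shape is similar
to the kernel's tsk_restore_flags() helper of that era rather than a copy of it.

```c
/* Illustrative flag bits, matching the values used in the patch. */
#define PF_LESS_THROTTLE 0x00100000u
#define PF_SWAPWRITE     0x00800000u
#define PF_BDI_FLUSHER   0x01000000u

/*
 * Set every bit in mask and return the previous state of exactly
 * those bits, so the caller can put them back later.
 */
static unsigned int flags_set_saved(unsigned int *flags, unsigned int mask)
{
	unsigned int old = *flags & mask;

	*flags |= mask;
	return old;
}

/*
 * Clear the bits in mask, then re-apply whichever of them were
 * already set before flags_set_saved() ran.
 */
static void flags_restore(unsigned int *flags, unsigned int old,
			  unsigned int mask)
{
	*flags &= ~mask;
	*flags |= old & mask;
}
```

With this shape, a worker that entered with PF_LESS_THROTTLE already set
leaves with it still set, whereas the unconditional `&= ~(...)` in the patch
would clear it.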
* Re: [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
2016-10-18 7:12 [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path zhouxianrong
2016-10-18 9:34 ` Hillf Danton
@ 2016-10-18 9:59 ` Mel Gorman
2016-10-18 11:08 ` zhouxianrong
2016-10-20 12:38 ` zhouxianrong
2 siblings, 1 reply; 8+ messages in thread
From: Mel Gorman @ 2016-10-18 9:59 UTC (permalink / raw)
To: zhouxianrong
Cc: linux-mm, linux-kernel, akpm, viro, mingo, peterz, hannes, vbabka,
mhocko, vdavydov.dev, minchan, riel, zhouxiyu, zhangshiming5,
won.ho.park, tuxiaobing

On Tue, Oct 18, 2016 at 03:12:45PM +0800, zhouxianrong@huawei.com wrote:
> From: z00281421 <z00281421@notesmail.huawei.com>
>
> The bdi flusher may enter the page alloc slow path due to writepage and kmalloc.
> In that case the flusher, as a direct reclaimer, should not be throttled here
> because it then cannot reclaim clean file pages or anonymous pages
> for a while; furthermore, the writeback rate of dirty pages would
> slow down, and other direct reclaimers and kswapd would be affected.
> The bdi flusher should be io-scheduled by get_request rather than here.
>
> Signed-off-by: z00281421 <z00281421@notesmail.huawei.com>

What does this patch do that PF_LESS_THROTTLE is not doing already if
there is an underlying BDI?

There have been a few patches like this recently that look like they might
do something useful but are subtle. They really should be accompanied by
a test case and data showing they either fix a functional issue (machine
livelocking due to writeback not making progress) or a performance issue.

--
Mel Gorman
SUSE Labs
* Re: [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
2016-10-18 9:59 ` Mel Gorman
@ 2016-10-18 11:08 ` zhouxianrong
2016-10-18 11:42 ` Michal Hocko
0 siblings, 1 reply; 8+ messages in thread
From: zhouxianrong @ 2016-10-18 11:08 UTC (permalink / raw)
To: Mel Gorman
Cc: linux-mm, linux-kernel, akpm, viro, mingo, peterz, hannes, vbabka,
mhocko, vdavydov.dev, minchan, riel, zhouxiyu, zhangshiming5,
won.ho.park, tuxiaobing

Call trace:
[<ffffffc0000863dc>] __switch_to+0x80/0x98
[<ffffffc001160c58>] __schedule+0x314/0x854
[<ffffffc0011611e0>] schedule+0x48/0xa4
[<ffffffc0011648c4>] schedule_timeout+0x158/0x2c8
[<ffffffc0011608b4>] io_schedule_timeout+0xbc/0x14c
[<ffffffc0001aec84>] wait_iff_congested+0x1d4/0x1ec
[<ffffffc0001a36b0>] shrink_inactive_list+0x530/0x760
[<ffffffc0001a3e14>] shrink_lruvec+0x534/0x76c
[<ffffffc0001a40d4>] shrink_zone+0x88/0x1b8
[<ffffffc0001a4444>] do_try_to_free_pages+0x240/0x478
[<ffffffc0001a4788>] try_to_free_pages+0x10c/0x284
[<ffffffc0001968a4>] __alloc_pages_nodemask+0x540/0x918
[<ffffffc0001dd0e8>] new_slab+0x334/0x4a0
[<ffffffc0001df37c>] __slab_alloc.isra.75.constprop.77+0x6bc/0x780
[<ffffffc0001df584>] kmem_cache_alloc+0x144/0x23c
[<ffffffc00018f040>] mempool_alloc_slab+0x2c/0x38
[<ffffffc00018f1f4>] mempool_alloc+0x7c/0x188
[<ffffffc0003f462c>] bio_alloc_bioset+0x1cc/0x254
[<ffffffc00022a430>] _submit_bh+0x74/0x1c8
[<ffffffc00022c9d0>] __block_write_full_page.constprop.33+0x1a0/0x40c
[<ffffffc00022cd1c>] block_write_full_page+0xe0/0x134
[<ffffffc00022da64>] blkdev_writepage+0x30/0x3c
[<ffffffc000197d08>] __writepage+0x34/0x74
[<ffffffc000198880>] write_cache_pages+0x1e8/0x450
[<ffffffc000198b3c>] generic_writepages+0x54/0x8c
[<ffffffc00019a990>] do_writepages+0x40/0x6c
[<ffffffc00021e604>] __writeback_single_inode+0x60/0x51c
[<ffffffc00021eeec>] writeback_sb_inodes+0x2d4/0x46c
[<ffffffc00021f128>] __writeback_inodes_wb+0xa4/0xe8
[<ffffffc00021f480>] wb_writeback+0x314/0x3fc
[<ffffffc000220224>] bdi_writeback_workfn+0x130/0x4e0
[<ffffffc0000be4d4>] process_one_work+0x18c/0x51c
[<ffffffc0000bedd8>] worker_thread+0x15c/0x51c
[<ffffffc0000c5718>] kthread+0x10c/0x120

The above call trace occurred while writing to an sdcard under heavy, sustained pressure.
The patch addresses a performance issue. I hope the flusher is not throttled right here, so that
it can reclaim the following clean file pages or anonymous pages on the LRU list
and then return to writing the remaining dirty pages of the inode. That would speed up
writeback of dirty pages, so other direct reclaimers can reclaim more clean pages.
When memory is low because of a large page cache, bdi writeback speed plays a key role.

On 2016/10/18 17:59, Mel Gorman wrote:
> On Tue, Oct 18, 2016 at 03:12:45PM +0800, zhouxianrong@huawei.com wrote:
>> From: z00281421 <z00281421@notesmail.huawei.com>
>>
>> The bdi flusher may enter the page alloc slow path due to writepage and kmalloc.
>> In that case the flusher, as a direct reclaimer, should not be throttled here
>> because it then cannot reclaim clean file pages or anonymous pages
>> for a while; furthermore, the writeback rate of dirty pages would
>> slow down, and other direct reclaimers and kswapd would be affected.
>> The bdi flusher should be io-scheduled by get_request rather than here.
>>
>> Signed-off-by: z00281421 <z00281421@notesmail.huawei.com>
>
> What does this patch do that PF_LESS_THROTTLE is not doing already if
> there is an underlying BDI?
>
> There have been a few patches like this recently that look like they might
> do something useful but are subtle. They really should be accompanied by
> a test case and data showing they either fix a functional issue (machine
> livelocking due to writeback not making progress) or a performance issue.
>
* Re: [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
2016-10-18 11:08 ` zhouxianrong
@ 2016-10-18 11:42 ` Michal Hocko
0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2016-10-18 11:42 UTC (permalink / raw)
To: zhouxianrong
Cc: Mel Gorman, linux-mm, linux-kernel, akpm, viro, mingo, peterz,
hannes, vbabka, vdavydov.dev, minchan, riel, zhouxiyu, zhangshiming5,
won.ho.park, tuxiaobing

On Tue 18-10-16 19:08:05, zhouxianrong wrote:
> Call trace:
> [<ffffffc0000863dc>] __switch_to+0x80/0x98
> [<ffffffc001160c58>] __schedule+0x314/0x854
> [<ffffffc0011611e0>] schedule+0x48/0xa4
> [<ffffffc0011648c4>] schedule_timeout+0x158/0x2c8
> [<ffffffc0011608b4>] io_schedule_timeout+0xbc/0x14c
> [<ffffffc0001aec84>] wait_iff_congested+0x1d4/0x1ec
> [<ffffffc0001a36b0>] shrink_inactive_list+0x530/0x760
> [<ffffffc0001a3e14>] shrink_lruvec+0x534/0x76c
> [<ffffffc0001a40d4>] shrink_zone+0x88/0x1b8
> [<ffffffc0001a4444>] do_try_to_free_pages+0x240/0x478
> [<ffffffc0001a4788>] try_to_free_pages+0x10c/0x284
> [<ffffffc0001968a4>] __alloc_pages_nodemask+0x540/0x918
> [<ffffffc0001dd0e8>] new_slab+0x334/0x4a0
> [<ffffffc0001df37c>] __slab_alloc.isra.75.constprop.77+0x6bc/0x780
> [<ffffffc0001df584>] kmem_cache_alloc+0x144/0x23c
> [<ffffffc00018f040>] mempool_alloc_slab+0x2c/0x38
> [<ffffffc00018f1f4>] mempool_alloc+0x7c/0x188
> [<ffffffc0003f462c>] bio_alloc_bioset+0x1cc/0x254
> [<ffffffc00022a430>] _submit_bh+0x74/0x1c8
> [<ffffffc00022c9d0>] __block_write_full_page.constprop.33+0x1a0/0x40c
> [<ffffffc00022cd1c>] block_write_full_page+0xe0/0x134
> [<ffffffc00022da64>] blkdev_writepage+0x30/0x3c
> [<ffffffc000197d08>] __writepage+0x34/0x74
> [<ffffffc000198880>] write_cache_pages+0x1e8/0x450
> [<ffffffc000198b3c>] generic_writepages+0x54/0x8c
> [<ffffffc00019a990>] do_writepages+0x40/0x6c
> [<ffffffc00021e604>] __writeback_single_inode+0x60/0x51c
> [<ffffffc00021eeec>] writeback_sb_inodes+0x2d4/0x46c
> [<ffffffc00021f128>] __writeback_inodes_wb+0xa4/0xe8
> [<ffffffc00021f480>] wb_writeback+0x314/0x3fc
> [<ffffffc000220224>] bdi_writeback_workfn+0x130/0x4e0
> [<ffffffc0000be4d4>] process_one_work+0x18c/0x51c
> [<ffffffc0000bedd8>] worker_thread+0x15c/0x51c
> [<ffffffc0000c5718>] kthread+0x10c/0x120
>
> The above call trace occurred while writing to an sdcard under heavy, sustained pressure.
> The patch addresses a performance issue. I hope the flusher is not throttled right here, so that
> it can reclaim the following clean file pages or anonymous pages on the LRU list
> and then return to writing the remaining dirty pages of the inode. That would speed up
> writeback of dirty pages, so other direct reclaimers can reclaim more clean pages.
> When memory is low because of a large page cache, bdi writeback speed plays a key role.

If we got here then we are hitting dirty/writeback pages on the
tail of the LRU list and the bdi is congested. So most probably there are
no clean pages and the storage doesn't keep up with that IO. Why do
you think that not throttling would help here? Do you really see that
further reclaim makes forward progress, or does it just waste
more CPU without doing any useful work?

In other words, much more information please!
--
Michal Hocko
SUSE Labs
* [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
2016-10-18 7:12 [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path zhouxianrong
2016-10-18 9:34 ` Hillf Danton
2016-10-18 9:59 ` Mel Gorman
@ 2016-10-20 12:38 ` zhouxianrong
2016-10-20 13:05 ` Mika Penttilä
2016-10-20 13:28 ` Michal Hocko
2 siblings, 2 replies; 8+ messages in thread
From: zhouxianrong @ 2016-10-20 12:38 UTC (permalink / raw)
To: linux-mm
Cc: linux-kernel, akpm, viro, mingo, peterz, hannes, mgorman, vbabka,
mhocko, vdavydov.dev, minchan, riel, zhouxianrong, zhouxiyu,
zhangshiming5, won.ho.park, tuxiaobing
From: z00281421 <z00281421@notesmail.huawei.com>

The bdi flusher should be throttled based only on its
own bdi, decoupled from others.

Split PGDAT_WRITEBACK into PGDAT_ANON_WRITEBACK and
PGDAT_FILE_WRITEBACK so that scanning the anon LRU does not throttle
the flusher; throttling on file writeback is then OK.

I think the above may not be right.

Signed-off-by: z00281421 <z00281421@notesmail.huawei.com>
---
fs/fs-writeback.c | 8 ++++++--
include/linux/mmzone.h | 7 +++++--
mm/vmscan.c | 20 ++++++++++++--------
3 files changed, 23 insertions(+), 12 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 05713a5..ddcc70f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1905,10 +1905,13 @@ void wb_workfn(struct work_struct *work)
 {
 	struct bdi_writeback *wb = container_of(to_delayed_work(work),
 						struct bdi_writeback, dwork);
+	struct backing_dev_info *bdi = container_of(to_delayed_work(work),
+						struct backing_dev_info, wb.dwork);
 	long pages_written;

 	set_worker_desc("flush-%s", dev_name(wb->bdi->dev));
-	current->flags |= PF_SWAPWRITE;
+	current->flags |= (PF_SWAPWRITE | PF_LESS_THROTTLE);
+	current->backing_dev_info = bdi;

 	if (likely(!current_is_workqueue_rescuer() ||
 		   !test_bit(WB_registered, &wb->state))) {
@@ -1938,7 +1941,8 @@ void wb_workfn(struct work_struct *work)
 	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
 		wb_wakeup_delayed(wb);

-	current->flags &= ~PF_SWAPWRITE;
+	current->backing_dev_info = NULL;
+	current->flags &= ~(PF_SWAPWRITE | PF_LESS_THROTTLE);
 }

 /*
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7f2ae99..fa602e9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -528,8 +528,11 @@ enum pgdat_flags {
 					 * many dirty file pages at the tail
 					 * of the LRU.
 					 */
-	PGDAT_WRITEBACK,		/* reclaim scanning has recently found
-					 * many pages under writeback
+	PGDAT_ANON_WRITEBACK,		/* reclaim scanning has recently found
+					 * many anonymous pages under writeback
+					 */
+	PGDAT_FILE_WRITEBACK,		/* reclaim scanning has recently found
+					 * many file pages under writeback
 					 */
 	PGDAT_RECLAIM_LOCKED,		/* prevents concurrent reclaim */
 };
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0fe8b71..3f08ba3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -917,6 +917,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_writeback = 0;
 	unsigned long nr_immediate = 0;
+	int file;

 	cond_resched();

@@ -954,6 +955,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
 			(PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));

+		file = page_is_file_cache(page);
+
 		/*
 		 * The number of dirty pages determines if a zone is marked
 		 * reclaim_congested which affects wait_iff_congested. kswapd
@@ -1016,7 +1019,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			/* Case 1 above */
 			if (current_is_kswapd() &&
 			    PageReclaim(page) &&
-			    test_bit(PGDAT_WRITEBACK, &pgdat->flags)) {
+			    test_bit(PGDAT_ANON_WRITEBACK + file, &pgdat->flags)) {
 				nr_immediate++;
 				goto keep_locked;

@@ -1643,13 +1646,14 @@ putback_inactive_pages(struct lruvec *lruvec, struct list_head *page_list)
  * If a kernel thread (such as nfsd for loop-back mounts) services
  * a backing device by writing to the page cache it sets PF_LESS_THROTTLE.
  * In that case we should only throttle if the backing device it is
- * writing to is congested.  In other cases it is safe to throttle.
+ * writing to is congested. The bdi flusher should be throttled based only
+ * on its own bdi, decoupled from others. In other cases it is safe to throttle.
  */
-static int current_may_throttle(void)
+static int current_may_throttle(int file)
 {
 	return !(current->flags & PF_LESS_THROTTLE) ||
-		current->backing_dev_info == NULL ||
-		bdi_write_congested(current->backing_dev_info);
+		(file && (current->backing_dev_info == NULL ||
+		bdi_write_congested(current->backing_dev_info)));
 }

 static bool inactive_reclaimable_pages(struct lruvec *lruvec,
@@ -1774,7 +1778,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	 * are encountered in the nr_immediate check below.
 	 */
 	if (nr_writeback && nr_writeback == nr_taken)
-		set_bit(PGDAT_WRITEBACK, &pgdat->flags);
+		set_bit(PGDAT_ANON_WRITEBACK + file, &pgdat->flags);

 	/*
 	 * Legacy memcg will stall in page writeback so avoid forcibly
@@ -1803,7 +1807,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * that pages are cycling through the LRU faster than
 		 * they are written so also forcibly stall.
 		 */
-		if (nr_immediate && current_may_throttle())
+		if (nr_immediate && current_may_throttle(file))
 			congestion_wait(BLK_RW_ASYNC, HZ/10);
 	}

@@ -1813,7 +1817,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	 * unqueued dirty pages or cycling through the LRU too quickly.
 	 */
 	if (!sc->hibernation_mode && !current_is_kswapd() &&
-	    current_may_throttle())
+	    current_may_throttle(file))
 		wait_iff_congested(pgdat, BLK_RW_ASYNC, HZ/10);

 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
--
1.7.9.5
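
Two ideas in this second patch can be sketched in plain user-space C: picking
the per-type writeback flag by adding `file` to the anon enumerator, and the
revised throttle test that only consults the bdi for file pages. This is a
model under stated assumptions, not kernel code: the enum below is a
hypothetical copy of the patched enum pgdat_flags, writeback_flag() and
may_throttle_v2() are made-up names, and the PF_LESS_THROTTLE value is
illustrative.

```c
#include <stdbool.h>

#define PF_LESS_THROTTLE 0x00100000u

/* Hypothetical copy of the patched enum; order matters for the +file trick. */
enum pgdat_flags_model {
	PGDAT_CONGESTED,
	PGDAT_DIRTY,
	PGDAT_ANON_WRITEBACK,
	PGDAT_FILE_WRITEBACK,	/* must directly follow PGDAT_ANON_WRITEBACK */
	PGDAT_RECLAIM_LOCKED,
};

/* file is 0 for anon pages and non-zero for file pages. */
static int writeback_flag(int file)
{
	return PGDAT_ANON_WRITEBACK + !!file;
}

/*
 * Model of current_may_throttle(file) from the patch: a task with
 * PF_LESS_THROTTLE is only throttled on file pages, and then only
 * if it has no bdi or its bdi is congested.
 */
static bool may_throttle_v2(unsigned int flags, int file,
			    bool has_bdi, bool bdi_congested)
{
	return !(flags & PF_LESS_THROTTLE) ||
	       (file && (!has_bdi || bdi_congested));
}
```

The index arithmetic silently depends on the two enumerators staying adjacent
and in that order, which is the kind of subtlety a reviewer may want called
out in a comment.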
* Re: [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
2016-10-20 12:38 ` zhouxianrong
@ 2016-10-20 13:05 ` Mika Penttilä
2016-10-20 13:28 ` Michal Hocko
1 sibling, 0 replies; 8+ messages in thread
From: Mika Penttilä @ 2016-10-20 13:05 UTC (permalink / raw)
To: zhouxianrong, linux-mm
Cc: linux-kernel, akpm, viro, mingo, peterz, hannes, mgorman, vbabka,
mhocko, vdavydov.dev, minchan, riel, zhouxiyu, zhangshiming5,
won.ho.park, tuxiaobing

On 20.10.2016 15:38, zhouxianrong@huawei.com wrote:
> From: z00281421 <z00281421@notesmail.huawei.com>
>
> The bdi flusher should be throttled based only on its
> own bdi, decoupled from others.
>
> Split PGDAT_WRITEBACK into PGDAT_ANON_WRITEBACK and
> PGDAT_FILE_WRITEBACK so that scanning the anon LRU does not throttle
> the flusher; throttling on file writeback is then OK.
>
> I think the above may not be right.
>
> Signed-off-by: z00281421 <z00281421@notesmail.huawei.com>
> ---
> fs/fs-writeback.c | 8 ++++++--
> include/linux/mmzone.h | 7 +++++--
> mm/vmscan.c | 20 ++++++++++++--------
> 3 files changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 05713a5..ddcc70f 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1905,10 +1905,13 @@ void wb_workfn(struct work_struct *work)
>  {
>  	struct bdi_writeback *wb = container_of(to_delayed_work(work),
>  						struct bdi_writeback, dwork);
> +	struct backing_dev_info *bdi = container_of(to_delayed_work(work),
> +						struct backing_dev_info, wb.dwork);
>  	long pages_written;
>
>  	set_worker_desc("flush-%s", dev_name(wb->bdi->dev));
> -	current->flags |= PF_SWAPWRITE;
> +	current->flags |= (PF_SWAPWRITE | PF_LESS_THROTTLE);
> +	current->backing_dev_info = bdi;
>
>  	if (likely(!current_is_workqueue_rescuer() ||
>  		   !test_bit(WB_registered, &wb->state))) {
> @@ -1938,7 +1941,8 @@ void wb_workfn(struct work_struct *work)
>  	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
>  		wb_wakeup_delayed(wb);
>
> -	current->flags &= ~PF_SWAPWRITE;
> +	current->backing_dev_info = NULL;
> +	current->flags &= ~(PF_SWAPWRITE | PF_LESS_THROTTLE);
>  }
>
>  /*
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 7f2ae99..fa602e9 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -528,8 +528,11 @@ enum pgdat_flags {
>  					 * many dirty file pages at the tail
>  					 * of the LRU.
>  					 */
> -	PGDAT_WRITEBACK,		/* reclaim scanning has recently found
> -					 * many pages under writeback
> +	PGDAT_ANON_WRITEBACK,		/* reclaim scanning has recently found
> +					 * many anonymous pages under writeback
> +					 */
> +	PGDAT_FILE_WRITEBACK,		/* reclaim scanning has recently found
> +					 * many file pages under writeback
>  					 */
>  	PGDAT_RECLAIM_LOCKED,		/* prevents concurrent reclaim */

Nobody seems to be clearing those bits (same as with PGDAT_WRITEBACK)?

--Mika
* Re: [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path
2016-10-20 12:38 ` zhouxianrong
2016-10-20 13:05 ` Mika Penttilä
@ 2016-10-20 13:28 ` Michal Hocko
1 sibling, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2016-10-20 13:28 UTC (permalink / raw)
To: zhouxianrong
Cc: linux-mm, linux-kernel, akpm, viro, mingo, peterz, hannes, mgorman,
vbabka, vdavydov.dev, minchan, riel, zhouxiyu, zhangshiming5,
won.ho.park, tuxiaobing

On Thu 20-10-16 20:38:05, zhouxianrong@huawei.com wrote:
> From: z00281421 <z00281421@notesmail.huawei.com>
>
> The bdi flusher should be throttled based only on its
> own bdi, decoupled from others.
>
> Split PGDAT_WRITEBACK into PGDAT_ANON_WRITEBACK and
> PGDAT_FILE_WRITEBACK so that scanning the anon LRU does not throttle
> the flusher; throttling on file writeback is then OK.

Could you please answer the questions from
http://lkml.kernel.org/r/20161018114207.GD12092@dhcp22.suse.cz before
coming up with new and even more complex patches?

I would really like to understand the issue you are seeing before
jumping into patches...

Thanks!
--
Michal Hocko
SUSE Labs
end of thread, other threads:[~2016-10-20 13:28 UTC | newest]

Thread overview: 8+ messages
2016-10-18 7:12 [PATCH] bdi flusher should not be throttled here when it fall into buddy slow path zhouxianrong
2016-10-18 9:34 ` Hillf Danton
2016-10-18 9:59 ` Mel Gorman
2016-10-18 11:08   ` zhouxianrong
2016-10-18 11:42     ` Michal Hocko
2016-10-20 12:38 ` zhouxianrong
2016-10-20 13:05   ` Mika Penttilä
2016-10-20 13:28   ` Michal Hocko