* f2fs freezes the machine @ 2015-02-20 20:10 Sebastian Andrzej Siewior 2015-02-23 23:33 ` Jaegeuk Kim 0 siblings, 1 reply; 8+ messages in thread From: Sebastian Andrzej Siewior @ 2015-02-20 20:10 UTC (permalink / raw) To: Jaegeuk Kim, Changman Lee; +Cc: linux-f2fs-devel Hi, I've been playing with f2fs on eMMC with v3.19. I simply cp/rm files on the partition for a while. The partition has a size of 510MiB and runs out of space now and then. I managed to run into this: |[ 428.035231] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59] |[ 428.042896] Modules linked in: |[ 428.045972] CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 |[ 428.062166] Workqueue: writeback bdi_writeback_workfn (flush-179:0) |[ 428.068460] task: df230000 ti: df23e000 task.ti: df23e000 |[ 428.073866] PC is at __submit_merged_bio+0x6c/0x110 |[ 428.078746] LR is at f2fs_submit_merged_bio+0x74/0x80 |[ 428.126618] CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 |[ 428.142802] Workqueue: writeback bdi_writeback_workfn (flush-179:0) … |[ 428.256579] [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c) |[ 428.264063] Exception stack(0xdf23fb48 to 0xdf23fb90) |[ 428.269119] fb40: deef3484 ffff0001 ffff0001 00000027 deef3484 00000000 |[ 428.277299] fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90 |[ 428.285477] fb80: c02191f0 c0218fa0 60000013 ffffffff |[ 428.290539] [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110) |[ 428.298552] [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80) |[ 428.307604] [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78) |[ 428.316831] [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10) |[ 428.325709] [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc) |[ 428.333637] [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68) |[ 428.341736] [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110) |[ 428.350704] [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) |[ 428.359152] [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) |[ 428.367946] [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) |[ 428.377346] [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) |[ 428.386312] [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) |[ 428.394843] [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) |[ 428.403293] [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) |[ 428.412087] [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) |[ 428.420272] [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) |[ 428.427590] [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) The kernel config is UP no preemption/server. Once in this state, it does not recover. I have more backtraces and they all look more or less the same. I pulled in the f2fs changes from this merge window and was able to reproduce the same thing. It can take hours to get it reproduced. The backtrace up to sync_dirty_dir_inodes() is same once this "stall" occures. I meassured the time spent within this function and added printk() once 5 seconds were reached and I saw output. At the point of the stall, the task state of this work queue: |kworker/u2:1 R running 0 53 2 0x00000002 |Workqueue: writeback bdi_writeback_workfn (flush-179:0) |[<c0012e84>] (__irq_svc) from [<c0219dd8>] (f2fs_submit_merged_bio+0x4c/0x84) |[<c0219dd8>] (f2fs_submit_merged_bio) from [<c021632c>] (sync_dirty_dir_inodes+0x70/0x78) |[<c021632c>] (sync_dirty_dir_inodes) from [<c0216448>] (write_checkpoint+0x114/0xbf0) |[<c0216448>] (write_checkpoint) from [<c0212314>] (f2fs_sync_fs+0x7c/0xcc) |[<c0212314>] (f2fs_sync_fs) from [<c0221f14>] (f2fs_balance_fs_bg+0x4c/0x68) |[<c0221f14>] (f2fs_balance_fs_bg) from [<c021ed28>] (f2fs_write_node_pages+0x40/0x110) |[<c021ed28>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) |[<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) |[<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) |[<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) |[<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) |[<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) |[<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) |[<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) |[<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) |[<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) The 0x0…2 mean TIF_NEED_RESCHED is set but since preemption is off this schedule() seem not to happen. I have no idea what sync_dirty_dir_inodes() is doing :) Can you image a case where it could busy loop? I could come up with this: | void sync_dirty_dir_inodes(struct f2fs_sb_info *sbi) | { | retry: … | head = &sbi->dir_inode_list; … | entry = list_entry(head->next, struct inode_entry, list); | inode = igrab(entry->inode); | spin_unlock(&sbi->dir_inode_lock); … | } else { | /* | * We should submit bio, since it exists several | * wribacking dentry pages in the freeing inode. | */ | f2fs_submit_merged_bio(sbi, DATA, WRITE); -> |void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, | enum page_type type, int rw) | { | io = is_read_io(rw) ? &sbi->read_io : &sbi->write_io[btype]; | | down_write(&io->io_rwsem); … | | __submit_merged_bio(io); -> | static void __submit_merged_bio(struct f2fs_bio_info *io) | { | if (!io->bio) exit here since it is set to NULL at the end of this function. | return; … | submit_bio(fio->rw, io->bio); | io->bio = NULL; | } <- | up_write(&io->io_rwsem); | } | <- | } | goto retry; | } I don't know what submit_bio() does (in terms of terminating the loop here) but I haven't seen it in my backtraces (and assume it does not get invoked). Ah. The other process that (I think) could run: | rm R running 0 1989 1774 0x00000000 | [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c) | [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54) | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8) | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30) | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64) | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268) | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140) | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134) | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180) | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c) | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168) | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c) | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c) I wasn't able reproduce the issue after enable preemption (low latency desktop). I saw however the the code looping in sync_dirty_dir_inodes() for 5 seconds but it did not lockup the machine. The box continued to work and the file system access was working. Sebastian ------------------------------------------------------------------------------ Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server from Actuate! Instantly Supercharge Your Business Reports and Dashboards with Interactivity, Sharing, Native Excel Exports, App Integration & more Get technology previously reserved for billion-dollar corporations, FREE http://pubads.g.doubleclick.net/gampad/clk?id=190641631&iu=/4140/ostg.clktrk _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: f2fs freezes the machine 2015-02-20 20:10 f2fs freezes the machine Sebastian Andrzej Siewior @ 2015-02-23 23:33 ` Jaegeuk Kim 2015-02-24 1:20 ` Nicholas Krause 2015-02-27 12:13 ` [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() Sebastian Andrzej Siewior 0 siblings, 2 replies; 8+ messages in thread From: Jaegeuk Kim @ 2015-02-23 23:33 UTC (permalink / raw) To: Sebastian Andrzej Siewior; +Cc: linux-f2fs-devel Hi Sebastian, Thank you for the report. On Fri, Feb 20, 2015 at 09:10:35PM +0100, Sebastian Andrzej Siewior wrote: > Hi, > > I've been playing with f2fs on eMMC with v3.19. I simply cp/rm files on > the partition for a while. The partition has a size of 510MiB and runs > out of space now and then. > I managed to run into this: > > |[ 428.035231] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59] > |[ 428.042896] Modules linked in: > |[ 428.045972] CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 > |[ 428.062166] Workqueue: writeback bdi_writeback_workfn (flush-179:0) > |[ 428.068460] task: df230000 ti: df23e000 task.ti: df23e000 > |[ 428.073866] PC is at __submit_merged_bio+0x6c/0x110 > |[ 428.078746] LR is at f2fs_submit_merged_bio+0x74/0x80 > |[ 428.126618] CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 > |[ 428.142802] Workqueue: writeback bdi_writeback_workfn (flush-179:0) > … > |[ 428.256579] [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c) > |[ 428.264063] Exception stack(0xdf23fb48 to 0xdf23fb90) > |[ 428.269119] fb40: deef3484 ffff0001 ffff0001 00000027 deef3484 00000000 > |[ 428.277299] fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90 > |[ 428.285477] fb80: c02191f0 c0218fa0 60000013 ffffffff > |[ 428.290539] [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110) > |[ 428.298552] [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80) > |[ 428.307604] [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78) > |[ 428.316831] [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10) > |[ 428.325709] [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc) > |[ 428.333637] [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68) > |[ 428.341736] [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110) > |[ 428.350704] [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) > |[ 428.359152] [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) > |[ 428.367946] [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) > |[ 428.377346] [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) > |[ 428.386312] [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) > |[ 428.394843] [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) > |[ 428.403293] [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) > |[ 428.412087] [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) > |[ 428.420272] [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) > |[ 428.427590] [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) > > The kernel config is UP no preemption/server. Once in this state, it > does not recover. I have more backtraces and they all look more or less > the same. I pulled in the f2fs changes from this merge window and was able > to reproduce the same thing. It can take hours to get it reproduced. > > The backtrace up to sync_dirty_dir_inodes() is same once this "stall" > occures. I meassured the time spent within this function and added > printk() once 5 seconds were reached and I saw output. > At the point of the stall, the task state of this work queue: > > |kworker/u2:1 R running 0 53 2 0x00000002 > |Workqueue: writeback bdi_writeback_workfn (flush-179:0) > |[<c0012e84>] (__irq_svc) from [<c0219dd8>] (f2fs_submit_merged_bio+0x4c/0x84) > |[<c0219dd8>] (f2fs_submit_merged_bio) from [<c021632c>] (sync_dirty_dir_inodes+0x70/0x78) > |[<c021632c>] (sync_dirty_dir_inodes) from [<c0216448>] (write_checkpoint+0x114/0xbf0) > |[<c0216448>] (write_checkpoint) from [<c0212314>] (f2fs_sync_fs+0x7c/0xcc) > |[<c0212314>] (f2fs_sync_fs) from [<c0221f14>] (f2fs_balance_fs_bg+0x4c/0x68) > |[<c0221f14>] (f2fs_balance_fs_bg) from [<c021ed28>] (f2fs_write_node_pages+0x40/0x110) > |[<c021ed28>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) > |[<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) > |[<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) > |[<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) > |[<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) > |[<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) > |[<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) > |[<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) > |[<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) > |[<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) > > The 0x0…2 mean TIF_NEED_RESCHED is set but since preemption is off this > schedule() seem not to happen. > > I have no idea what sync_dirty_dir_inodes() is doing :) Can you image a > case where it could busy loop? I could come up with this: > > | void sync_dirty_dir_inodes(struct f2fs_sb_info *sbi) > | { > | retry: > … > | head = &sbi->dir_inode_list; > … > | entry = list_entry(head->next, struct inode_entry, list); > | inode = igrab(entry->inode); > | spin_unlock(&sbi->dir_inode_lock); > … > | } else { > | /* > | * We should submit bio, since it exists several > | * wribacking dentry pages in the freeing inode. > | */ > | f2fs_submit_merged_bio(sbi, DATA, WRITE); > -> > |void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, > | enum page_type type, int rw) > | { > | io = is_read_io(rw) ? &sbi->read_io : &sbi->write_io[btype]; > | > | down_write(&io->io_rwsem); > … > | > | __submit_merged_bio(io); > -> > | static void __submit_merged_bio(struct f2fs_bio_info *io) > | { > | if (!io->bio) > exit here since it is set to NULL at the end of this function. > | return; > … > | submit_bio(fio->rw, io->bio); > | io->bio = NULL; > | } > <- > | up_write(&io->io_rwsem); > | } > | > <- > | } > | goto retry; > | } > > I don't know what submit_bio() does (in terms of terminating the loop > here) but I haven't seen it in my backtraces (and assume it does not get > invoked). > Ah. The other process that (I think) could run: > | rm R running 0 1989 1774 0x00000000 > | [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c) > | [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54) > | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8) > | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30) > | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64) > | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268) > | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140) > | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134) > | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180) > | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c) > | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168) > | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c) > | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c) This inode is the directory (c.f., do_rmdir) causing a infinite loop on sync_dirty_dir_inodes. The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the inode is under eviction, it submits bios and do it again until eviction is finished. It needs to check why truncate_inode_pages_range cannot be completed. How about adding cond_resched() right after f2fs_submit_merged_bio in sync_dirty_dir_inodes? Could you test this? Thanks, > > I wasn't able reproduce the issue after enable preemption (low latency > desktop). I saw however the the code looping in sync_dirty_dir_inodes() > for 5 seconds but it did not lockup the machine. The box continued to > work and the file system access was working. > > Sebastian ------------------------------------------------------------------------------ Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server from Actuate! Instantly Supercharge Your Business Reports and Dashboards with Interactivity, Sharing, Native Excel Exports, App Integration & more Get technology previously reserved for billion-dollar corporations, FREE http://pubads.g.doubleclick.net/gampad/clk?id=190641631&iu=/4140/ostg.clktrk _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: f2fs freezes the machine 2015-02-23 23:33 ` Jaegeuk Kim @ 2015-02-24 1:20 ` Nicholas Krause 2015-02-27 12:13 ` [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() Sebastian Andrzej Siewior 1 sibling, 0 replies; 8+ messages in thread From: Nicholas Krause @ 2015-02-24 1:20 UTC (permalink / raw) To: Jaegeuk Kim, Sebastian Andrzej Siewior; +Cc: linux-f2fs-devel Greetings All, I was wondering if there is anyone working on tracing this bug. If no one else is I am will look into this later when I am at my actual work system. Nick On February 23, 2015 6:33:53 PM EST, Jaegeuk Kim <jaegeuk@kernel.org> wrote: >Hi Sebastian, > >Thank you for the report. > >On Fri, Feb 20, 2015 at 09:10:35PM +0100, Sebastian Andrzej Siewior >wrote: >> Hi, >> >> I've been playing with f2fs on eMMC with v3.19. I simply cp/rm files >on >> the partition for a while. The partition has a size of 510MiB and >runs >> out of space now and then. >> I managed to run into this: >> >> |[ 428.035231] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! >[kworker/u2:2:59] >> |[ 428.042896] Modules linked in: >> |[ 428.045972] CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W > 3.19.0-00010-g10c11c51ffed #153 >> |[ 428.062166] Workqueue: writeback bdi_writeback_workfn >(flush-179:0) >> |[ 428.068460] task: df230000 ti: df23e000 task.ti: df23e000 >> |[ 428.073866] PC is at __submit_merged_bio+0x6c/0x110 >> |[ 428.078746] LR is at f2fs_submit_merged_bio+0x74/0x80 >> |[ 428.126618] CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W > 3.19.0-00010-g10c11c51ffed #153 >> |[ 428.142802] Workqueue: writeback bdi_writeback_workfn >(flush-179:0) >> … >> |[ 428.256579] [<c00085c4>] (gic_handle_irq) from [<c0012e84>] >(__irq_svc+0x44/0x5c) >> |[ 428.264063] Exception stack(0xdf23fb48 to 0xdf23fb90) >> |[ 428.269119] fb40: deef3484 ffff0001 ffff0001 >00000027 deef3484 00000000 >> |[ 428.277299] fb60: deef3440 00000000 de426000 deef34ec deefc440 >df23fbb4 df23fbb8 df23fb90 >> |[ 428.285477] fb80: c02191f0 c0218fa0 60000013 ffffffff >> |[ 428.290539] [<c0012e84>] (__irq_svc) from [<c0218fa0>] >(__submit_merged_bio+0x6c/0x110) >> |[ 428.298552] [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] >(f2fs_submit_merged_bio+0x74/0x80) >> |[ 428.307604] [<c02191f0>] (f2fs_submit_merged_bio) from >[<c021624c>] (sync_dirty_dir_inodes+0x70/0x78) >> |[ 428.316831] [<c021624c>] (sync_dirty_dir_inodes) from >[<c0216358>] (write_checkpoint+0x104/0xc10) >> |[ 428.325709] [<c0216358>] (write_checkpoint) from [<c021231c>] >(f2fs_sync_fs+0x80/0xbc) >> |[ 428.333637] [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] >(f2fs_balance_fs_bg+0x4c/0x68) >> |[ 428.341736] [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] >(f2fs_write_node_pages+0x40/0x110) >> |[ 428.350704] [<c021e9b8>] (f2fs_write_node_pages) from >[<c00de620>] (do_writepages+0x34/0x48) >> |[ 428.359152] [<c00de620>] (do_writepages) from [<c0145714>] >(__writeback_single_inode+0x50/0x228) >> |[ 428.367946] [<c0145714>] (__writeback_single_inode) from >[<c0146184>] (writeback_sb_inodes+0x1a8/0x378) >> |[ 428.377346] [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] >(__writeback_inodes_wb+0x90/0xc8) >> |[ 428.386312] [<c01463e4>] (__writeback_inodes_wb) from >[<c01465f8>] (wb_writeback+0x1dc/0x28c) >> |[ 428.394843] [<c01465f8>] (wb_writeback) from [<c0146dd8>] >(bdi_writeback_workfn+0x2ac/0x460) >> |[ 428.403293] [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] >(process_one_work+0x11c/0x3a4) >> |[ 428.412087] [<c003c3fc>] (process_one_work) from [<c003c844>] >(worker_thread+0x17c/0x490) >> |[ 428.420272] [<c003c844>] (worker_thread) from [<c0041398>] >(kthread+0xec/0x100) >> |[ 428.427590] [<c0041398>] (kthread) from [<c000ed10>] >(ret_from_fork+0x14/0x24) >> >> The kernel config is UP no preemption/server. Once in this state, it >> does not recover. I have more backtraces and they all look more or >less >> the same. I pulled in the f2fs changes from this merge window and was >able >> to reproduce the same thing. It can take hours to get it reproduced. >> >> The backtrace up to sync_dirty_dir_inodes() is same once this "stall" >> occures. I meassured the time spent within this function and added >> printk() once 5 seconds were reached and I saw output. >> At the point of the stall, the task state of this work queue: >> >> |kworker/u2:1 R running 0 53 2 0x00000002 >> |Workqueue: writeback bdi_writeback_workfn (flush-179:0) >> |[<c0012e84>] (__irq_svc) from [<c0219dd8>] >(f2fs_submit_merged_bio+0x4c/0x84) >> |[<c0219dd8>] (f2fs_submit_merged_bio) from [<c021632c>] >(sync_dirty_dir_inodes+0x70/0x78) >> |[<c021632c>] (sync_dirty_dir_inodes) from [<c0216448>] >(write_checkpoint+0x114/0xbf0) >> |[<c0216448>] (write_checkpoint) from [<c0212314>] >(f2fs_sync_fs+0x7c/0xcc) >> |[<c0212314>] (f2fs_sync_fs) from [<c0221f14>] >(f2fs_balance_fs_bg+0x4c/0x68) >> |[<c0221f14>] (f2fs_balance_fs_bg) from [<c021ed28>] >(f2fs_write_node_pages+0x40/0x110) >> |[<c021ed28>] (f2fs_write_node_pages) from [<c00de620>] >(do_writepages+0x34/0x48) >> |[<c00de620>] (do_writepages) from [<c0145714>] >(__writeback_single_inode+0x50/0x228) >> |[<c0145714>] (__writeback_single_inode) from [<c0146184>] >(writeback_sb_inodes+0x1a8/0x378) >> |[<c0146184>] (writeback_sb_inodes) from [<c01463e4>] >(__writeback_inodes_wb+0x90/0xc8) >> |[<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] >(wb_writeback+0x1dc/0x28c) >> |[<c01465f8>] (wb_writeback) from [<c0146dd8>] >(bdi_writeback_workfn+0x2ac/0x460) >> |[<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] >(process_one_work+0x11c/0x3a4) >> |[<c003c3fc>] (process_one_work) from [<c003c844>] >(worker_thread+0x17c/0x490) >> |[<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) >> |[<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) >> >> The 0x0…2 mean TIF_NEED_RESCHED is set but since preemption is off >this >> schedule() seem not to happen. >> >> I have no idea what sync_dirty_dir_inodes() is doing :) Can you image >a >> case where it could busy loop? I could come up with this: >> >> | void sync_dirty_dir_inodes(struct f2fs_sb_info *sbi) >> | { >> | retry: >> … >> | head = &sbi->dir_inode_list; >> … >> | entry = list_entry(head->next, struct inode_entry, list); >> | inode = igrab(entry->inode); >> | spin_unlock(&sbi->dir_inode_lock); >> … >> | } else { >> | /* >> | * We should submit bio, since it exists several >> | * wribacking dentry pages in the freeing inode. >> | */ >> | f2fs_submit_merged_bio(sbi, DATA, WRITE); >> -> >> |void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, >> | enum page_type type, int rw) >> | { >> | io = is_read_io(rw) ? &sbi->read_io : >&sbi->write_io[btype]; >> | >> | down_write(&io->io_rwsem); >> … >> | >> | __submit_merged_bio(io); >> -> >> | static void __submit_merged_bio(struct f2fs_bio_info *io) >> | { >> | if (!io->bio) >> exit here since it is set to NULL at the end of this function. >> | return; >> … >> | submit_bio(fio->rw, io->bio); >> | io->bio = NULL; >> | } >> <- >> | up_write(&io->io_rwsem); >> | } >> | >> <- >> | } >> | goto retry; >> | } >> >> I don't know what submit_bio() does (in terms of terminating the loop >> here) but I haven't seen it in my backtraces (and assume it does not >get >> invoked). >> Ah. The other process that (I think) could run: >> | rm R running 0 1989 1774 0x00000000 >> | [<c047c55c>] (__schedule) from [<c00486dc>] >(__cond_resched+0x30/0x4c) >> | [<c00486dc>] (__cond_resched) from [<c047c8c8>] >(_cond_resched+0x4c/0x54) >> | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] >(truncate_inode_pages_range+0x1f0/0x5e8) >> | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] >(truncate_inode_pages+0x28/0x30) >> | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] >(truncate_inode_pages_final+0x60/0x64) >> | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] >(f2fs_evict_inode+0x4c/0x268) >> | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] >(evict+0x94/0x140) >> | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134) >> | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180) >> | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c) >> | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168) >> | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c) >> | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] >(ret_fast_syscall+0x0/0x4c) > >This inode is the directory (c.f., do_rmdir) causing a infinite loop on >sync_dirty_dir_inodes. >The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the >inode is >under eviction, it submits bios and do it again until eviction is >finished. > >It needs to check why truncate_inode_pages_range cannot be completed. > >How about adding cond_resched() right after f2fs_submit_merged_bio in >sync_dirty_dir_inodes? > >Could you test this? > >Thanks, > >> >> I wasn't able reproduce the issue after enable preemption (low >latency >> desktop). I saw however the the code looping in >sync_dirty_dir_inodes() >> for 5 seconds but it did not lockup the machine. The box continued to >> work and the file system access was working. >> >> Sebastian > >------------------------------------------------------------------------------ >Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server >from Actuate! Instantly Supercharge Your Business Reports and >Dashboards >with Interactivity, Sharing, Native Excel Exports, App Integration & >more >Get technology previously reserved for billion-dollar corporations, >FREE >http://pubads.g.doubleclick.net/gampad/clk?id=190641631&iu=/4140/ostg.clktrk >_______________________________________________ >Linux-f2fs-devel mailing list >Linux-f2fs-devel@lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel -- Sent from my Android device with K-9 Mail. Please excuse my brevity. ------------------------------------------------------------------------------ Dive into the World of Parallel Programming The Go Parallel Website, sponsored by Intel and developed in partnership with Slashdot Media, is your hub for all things parallel software development, from weekly thought leadership blogs to news, videos, case studies, tutorials and more. Take a look and join the conversation now. http://goparallel.sourceforge.net/ _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() 2015-02-23 23:33 ` Jaegeuk Kim 2015-02-24 1:20 ` Nicholas Krause @ 2015-02-27 12:13 ` Sebastian Andrzej Siewior 2015-03-03 1:13 ` Changman Lee 1 sibling, 1 reply; 8+ messages in thread From: Sebastian Andrzej Siewior @ 2015-02-27 12:13 UTC (permalink / raw) To: Jaegeuk Kim; +Cc: linux-f2fs-devel In a preempt-off enviroment a alot of FS activity (write/delete) I run into a CPU stall: | NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59] | Modules linked in: | CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 | Workqueue: writeback bdi_writeback_workfn (flush-179:0) | task: df230000 ti: df23e000 task.ti: df23e000 | PC is at __submit_merged_bio+0x6c/0x110 | LR is at f2fs_submit_merged_bio+0x74/0x80 … | [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c) | Exception stack(0xdf23fb48 to 0xdf23fb90) | fb40: deef3484 ffff0001 ffff0001 00000027 deef3484 00000000 | fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90 | fb80: c02191f0 c0218fa0 60000013 ffffffff | [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110) | [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80) | [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78) | [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10) | [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc) | [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68) | [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110) | [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) | [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) | [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) | [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) | [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) | [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) | [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) | [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) | [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) | [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) As it turns out, the code loops in sync_dirty_dir_inodes() and waits for others to make progress but since it never leaves the CPU there is no progress made. At the time of this stall, there is also a rm process blocked: | rm R running 0 1989 1774 0x00000000 | [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c) | [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54) | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8) | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30) | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64) | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268) | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140) | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134) | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180) | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c) | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168) | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c) | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c) As explained by Jaegeuk Kim: |This inode is the directory (c.f., do_rmdir) causing a infinite loop on |sync_dirty_dir_inodes. |The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the |inode is under eviction, it submits bios and do it again until eviction |is finished. This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO is submitted so other thread can make progress. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- Hi Jaegeuk, >How about adding cond_resched() right after f2fs_submit_merged_bio in >sync_dirty_dir_inodes? > >Could you test this? So I added it as you suggsted. I've seen that the two function looped for >5sec but the system did not freeze like before that patch. So it seems to work, thanks. fs/f2fs/checkpoint.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 7f794b72b3b7..a2ad3df39f24 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -796,6 +796,7 @@ void sync_dirty_dir_inodes(struct f2fs_sb_info *sbi) * wribacking dentry pages in the freeing inode. */ f2fs_submit_merged_bio(sbi, DATA, WRITE); + cond_resched(); } goto retry; } -- 2.1.4 ------------------------------------------------------------------------------ Dive into the World of Parallel Programming The Go Parallel Website, sponsored by Intel and developed in partnership with Slashdot Media, is your hub for all things parallel software development, from weekly thought leadership blogs to news, videos, case studies, tutorials and more. Take a look and join the conversation now. http://goparallel.sourceforge.net/ _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() 2015-02-27 12:13 ` [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() Sebastian Andrzej Siewior @ 2015-03-03 1:13 ` Changman Lee 2015-03-03 1:32 ` nick ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Changman Lee @ 2015-03-03 1:13 UTC (permalink / raw) To: Sebastian Andrzej Siewior; +Cc: Jaegeuk Kim, linux-f2fs-devel On Fri, Feb 27, 2015 at 01:13:14PM +0100, Sebastian Andrzej Siewior wrote: > In a preempt-off enviroment a alot of FS activity (write/delete) I run > into a CPU stall: > > | NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59] > | Modules linked in: > | CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 > | Workqueue: writeback bdi_writeback_workfn (flush-179:0) > | task: df230000 ti: df23e000 task.ti: df23e000 > | PC is at __submit_merged_bio+0x6c/0x110 > | LR is at f2fs_submit_merged_bio+0x74/0x80 > … > | [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c) > | Exception stack(0xdf23fb48 to 0xdf23fb90) > | fb40: deef3484 ffff0001 ffff0001 00000027 deef3484 00000000 > | fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90 > | fb80: c02191f0 c0218fa0 60000013 ffffffff > | [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110) > | [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80) > | [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78) > | [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10) > | [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc) > | [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68) > | [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110) > | [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) > | [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) > | [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) > | [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) > | [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) > | [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) > | [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) > | [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) > | [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) > | [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) > > As it turns out, the code loops in sync_dirty_dir_inodes() and waits for > others to make progress but since it never leaves the CPU there is no > progress made. At the time of this stall, there is also a rm process > blocked: > | rm R running 0 1989 1774 0x00000000 > | [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c) > | [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54) > | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8) > | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30) > | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64) > | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268) > | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140) > | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134) > | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180) > | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c) > | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168) > | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c) > | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c) > > As explained by Jaegeuk Kim: > |This inode is the directory (c.f., do_rmdir) causing a infinite loop on > |sync_dirty_dir_inodes. > |The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the > |inode is under eviction, it submits bios and do it again until eviction > |is finished. > > This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO > is submitted so other thread can make progress. > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > Hi Jaegeuk, > > >How about adding cond_resched() right after f2fs_submit_merged_bio in > >sync_dirty_dir_inodes? > > > >Could you test this? > > So I added it as you suggsted. I've seen that the two function looped > for >5sec but the system did not freeze like before that patch. So it > seems to work, thanks. Hi Sebastian, After this patch, your test is all done without any CPU stall, Right? IMHO, context should be switched without cond_resched() after consumed own time quota. So, it just reduces system latency due to yielding. I thought another way to discard pages of inode to be evicted in merged bio instead of submit. If so, evict() doesn't need to wait for writeback. Just my curiousity out of this problem. Thanks, > > fs/f2fs/checkpoint.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c > index 7f794b72b3b7..a2ad3df39f24 100644 > --- a/fs/f2fs/checkpoint.c > +++ b/fs/f2fs/checkpoint.c > @@ -796,6 +796,7 @@ void sync_dirty_dir_inodes(struct f2fs_sb_info *sbi) > * wribacking dentry pages in the freeing inode. > */ > f2fs_submit_merged_bio(sbi, DATA, WRITE); > + cond_resched(); > } > goto retry; > } > -- > 2.1.4 ------------------------------------------------------------------------------ Dive into the World of Parallel Programming The Go Parallel Website, sponsored by Intel and developed in partnership with Slashdot Media, is your hub for all things parallel software development, from weekly thought leadership blogs to news, videos, case studies, tutorials and more. Take a look and join the conversation now. http://goparallel.sourceforge.net/ _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() 2015-03-03 1:13 ` Changman Lee @ 2015-03-03 1:32 ` nick 2015-03-03 8:22 ` Sebastian Andrzej Siewior 2015-06-12 13:28 ` Sebastian Andrzej Siewior 2 siblings, 0 replies; 8+ messages in thread From: nick @ 2015-03-03 1:32 UTC (permalink / raw) To: Changman Lee, Sebastian Andrzej Siewior; +Cc: Jaegeuk Kim, linux-f2fs-devel On 2015-03-02 08:13 PM, Changman Lee wrote: > On Fri, Feb 27, 2015 at 01:13:14PM +0100, Sebastian Andrzej Siewior wrote: >> In a preempt-off enviroment a alot of FS activity (write/delete) I run >> into a CPU stall: >> >> | NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59] >> | Modules linked in: >> | CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 >> | Workqueue: writeback bdi_writeback_workfn (flush-179:0) >> | task: df230000 ti: df23e000 task.ti: df23e000 >> | PC is at __submit_merged_bio+0x6c/0x110 >> | LR is at f2fs_submit_merged_bio+0x74/0x80 >> … >> | [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c) >> | Exception stack(0xdf23fb48 to 0xdf23fb90) >> | fb40: deef3484 ffff0001 ffff0001 00000027 deef3484 00000000 >> | fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90 >> | fb80: c02191f0 c0218fa0 60000013 ffffffff >> | [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110) >> | [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80) >> | [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78) >> | [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10) >> | [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc) >> | [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68) >> | [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110) >> | [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) >> | [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) >> | [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) >> | [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) >> | [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) >> | [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) >> | [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) >> | [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) >> | [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) >> | [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) >> >> As it turns out, the code loops in sync_dirty_dir_inodes() and waits for >> others to make progress but since it never leaves the CPU there is no >> progress made. At the time of this stall, there is also a rm process >> blocked: >> | rm R running 0 1989 1774 0x00000000 >> | [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c) >> | [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54) >> | [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8) >> | [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30) >> | [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64) >> | [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268) >> | [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140) >> | [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134) >> | [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180) >> | [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c) >> | [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168) >> | [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c) >> | [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c) >> >> As explained by Jaegeuk Kim: >> |This inode is the directory (c.f., do_rmdir) causing a infinite loop on >> |sync_dirty_dir_inodes. >> |The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the >> |inode is under eviction, it submits bios and do it again until eviction >> |is finished. >> >> This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO >> is submitted so other thread can make progress. >> >> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> >> --- >> Hi Jaegeuk, >> >>> How about adding cond_resched() right after f2fs_submit_merged_bio in >>> sync_dirty_dir_inodes? >>> >>> Could you test this? >> >> So I added it as you suggsted. I've seen that the two function looped >> for >5sec but the system did not freeze like before that patch. So it >> seems to work, thanks. > > Hi Sebastian, > > After this patch, your test is all done without any CPU stall, Right? > IMHO, context should be switched without cond_resched() after consumed > own time quota. So, it just reduces system latency due to yielding. > I thought another way to discard pages of inode to be evicted in merged bio > instead of submit. If so, evict() doesn't need to wait for writeback. > Just my curiousity out of this problem. > > Thanks, > >> >> fs/f2fs/checkpoint.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c >> index 7f794b72b3b7..a2ad3df39f24 100644 >> --- a/fs/f2fs/checkpoint.c >> +++ b/fs/f2fs/checkpoint.c >> @@ -796,6 +796,7 @@ void sync_dirty_dir_inodes(struct f2fs_sb_info *sbi) >> * wribacking dentry pages in the freeing inode. >> */ >> f2fs_submit_merged_bio(sbi, DATA, WRITE); >> + cond_resched(); >> } >> goto retry; >> } >> -- >> 2.1.4 > > > ------------------------------------------------------------------------------ > Dive into the World of Parallel Programming The Go Parallel Website, sponsored > by Intel and developed in partnership with Slashdot Media, is your hub for all > things parallel software development, from weekly thought leadership blogs to > news, videos, case studies, tutorials and more. Take a look and join the > conversation now. http://goparallel.sourceforge.net/ > _______________________________________________ > Linux-f2fs-devel mailing list > Linux-f2fs-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel > Changman, That seems a good idea until we think about file system caching and if that would cause a performance hit. I don't f2fs that while but this seems the simplest and cleanest way to do this without removing the caching of the inodes and related flash blocks. Regards, Nick P.S. Good caught goes out to Sebastian on this patch :) ------------------------------------------------------------------------------ Dive into the World of Parallel Programming The Go Parallel Website, sponsored by Intel and developed in partnership with Slashdot Media, is your hub for all things parallel software development, from weekly thought leadership blogs to news, videos, case studies, tutorials and more. Take a look and join the conversation now. http://goparallel.sourceforge.net/ _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() 2015-03-03 1:13 ` Changman Lee 2015-03-03 1:32 ` nick @ 2015-03-03 8:22 ` Sebastian Andrzej Siewior 2015-06-12 13:28 ` Sebastian Andrzej Siewior 2 siblings, 0 replies; 8+ messages in thread From: Sebastian Andrzej Siewior @ 2015-03-03 8:22 UTC (permalink / raw) To: Changman Lee; +Cc: Jaegeuk Kim, linux-f2fs-devel On 03/03/2015 02:13 AM, Changman Lee wrote: > Hi Sebastian, Hi Jaegeuk, > After this patch, your test is all done without any CPU stall, Right? correct. > IMHO, context should be switched without cond_resched() after consumed > own time quota. So, it just reduces system latency due to yielding. Correct but without preemption enabled there is no time quota. I still have the >5sec printk but since the task schedules away I don't have a "problem" anymore. > I thought another way to discard pages of inode to be evicted in merged bio > instead of submit. If so, evict() doesn't need to wait for writeback. You are lucky that the workqueue code is aware of "that". That means it will schedule another worker if the current task schedules away because it waits for something. Otherwise you would stall other worker. If you could solve it in another way, that would be better than just polling for it. > Just my curiousity out of this problem. > > Thanks, Sebastian ------------------------------------------------------------------------------ Dive into the World of Parallel Programming The Go Parallel Website, sponsored by Intel and developed in partnership with Slashdot Media, is your hub for all things parallel software development, from weekly thought leadership blogs to news, videos, case studies, tutorials and more. Take a look and join the conversation now. http://goparallel.sourceforge.net/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() 2015-03-03 1:13 ` Changman Lee 2015-03-03 1:32 ` nick 2015-03-03 8:22 ` Sebastian Andrzej Siewior @ 2015-06-12 13:28 ` Sebastian Andrzej Siewior 2 siblings, 0 replies; 8+ messages in thread From: Sebastian Andrzej Siewior @ 2015-06-12 13:28 UTC (permalink / raw) To: Changman Lee; +Cc: Jaegeuk Kim, linux-f2fs-devel On 03/03/2015 02:13 AM, Changman Lee wrote: > > Hi Sebastian, Hi Changman, > After this patch, your test is all done without any CPU stall, Right? > IMHO, context should be switched without cond_resched() after consumed > own time quota. So, it just reduces system latency due to yielding. > I thought another way to discard pages of inode to be evicted in merged bio > instead of submit. If so, evict() doesn't need to wait for writeback. > Just my curiousity out of this problem. Where do we stand on this? > > Thanks, > Sebastian ------------------------------------------------------------------------------ ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-06-12 13:28 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-02-20 20:10 f2fs freezes the machine Sebastian Andrzej Siewior 2015-02-23 23:33 ` Jaegeuk Kim 2015-02-24 1:20 ` Nicholas Krause 2015-02-27 12:13 ` [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes() Sebastian Andrzej Siewior 2015-03-03 1:13 ` Changman Lee 2015-03-03 1:32 ` nick 2015-03-03 8:22 ` Sebastian Andrzej Siewior 2015-06-12 13:28 ` Sebastian Andrzej Siewior
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).