From mboxrd@z Thu Jan 1 00:00:00 1970 From: Josef Bacik Subject: Re: [PATCH 6/8] bdi: add a new writeback list for sync Date: Tue, 16 Jun 2015 08:42:27 -0700 Message-ID: <558043E3.9070006@fb.com> References: <1434051673-13838-1-git-send-email-jbacik@fb.com> <1434051673-13838-7-git-send-email-jbacik@fb.com> <20150615141221.GH4368@quack.suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit Cc: , , , , , Dave Chinner To: Jan Kara Return-path: Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:50344 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754959AbbFPPnA (ORCPT ); Tue, 16 Jun 2015 11:43:00 -0400 In-Reply-To: <20150615141221.GH4368@quack.suse.cz> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On 06/15/2015 07:12 AM, Jan Kara wrote: > On Thu 11-06-15 15:41:11, Josef Bacik wrote: >> From: Dave Chinner >> >> wait_sb_inodes() current does a walk of all inodes in the filesystem >> to find dirty one to wait on during sync. This is highly >> inefficient and wastes a lot of CPU when there are lots of clean >> cached inodes that we don't need to wait on. >> >> To avoid this "all inode" walk, we need to track inodes that are >> currently under writeback that we need to wait for. We do this by >> adding inodes to a writeback list on the bdi when the mapping is >> first tagged as having pages under writeback. wait_sb_inodes() can >> then walk this list of "inodes under IO" and wait specifically just >> for the inodes that the current sync(2) needs to wait for. >> >> To avoid needing all the realted locking to be safe against >> interrupts, Jan Kara suggested that we be lazy about removal from >> the writeback list. That is, we don't remove inodes from the >> writeback list on IO completion, but do it directly during a >> wait_sb_inodes() walk. >> >> This means that the a rare sync(2) call will have some work to do >> skipping clean inodes However, in the current problem case of >> concurrent sync workloads, concurrent wait_sb_inodes() calls only >> walk the very recently dispatched inodes and hence should have very >> little work to do. >> >> This also means that we have to remove the inodes from the writeback >> list during eviction. Do this in inode_wait_for_writeback() once >> all writeback on the inode is complete. >> >> Signed-off-by: Dave Chinner >> Signed-off-by: Josef Bacik > > I don't think you noticed comments I did when you last posted this > series: You're right I missed them completely, sorry. > >> /* >> - * Wait for writeback on an inode to complete. Caller must have inode pinned. >> + * Wait for writeback on an inode to complete during eviction. Caller must have >> + * inode pinned. >> */ >> void inode_wait_for_writeback(struct inode *inode) >> { >> + BUG_ON(!(inode->i_state & I_FREEING)); >> + >> spin_lock(&inode->i_lock); >> __inode_wait_for_writeback(inode); >> spin_unlock(&inode->i_lock); >> + >> + /* >> + * bd_inode's will have already had the bdev free'd so don't bother >> + * doing the bdi_clear_inode_writeback. >> + */ >> + if (!sb_is_blkdev_sb(inode->i_sb)) >> + bdi_clear_inode_writeback(inode_to_bdi(inode), inode); >> } > > Why do we bother with not putting bdev inode back? > My memory is rusty on this, but if the inode is the inode for a bdev we will have already free'd the bdev at this point and we get a null pointer deref, so this just skips that bit. I'll fix up the indenting. Thanks, Josef