Subject: Re: [PATCH v6 1/2] sb: add a new writeback list for sync
To: Brian Foster, Jan Kara
CC: Dave Chinner, linux-fsdevel@vger.kernel.org
From: Josef Bacik
Message-ID: <56A11E83.1010000@fb.com>
Date: Thu, 21 Jan 2016 13:08:03 -0500
In-Reply-To: <20160121171306.GC19272@bfoster.bfoster>
References: <1453226353-61481-1-git-send-email-bfoster@redhat.com>
 <1453226353-61481-2-git-send-email-bfoster@redhat.com>
 <20160120132626.GE10810@quack.suse.cz>
 <20160120201159.GW6033@dastard>
 <20160121152256.GB19272@bfoster.bfoster>
 <20160121163411.GP10810@quack.suse.cz>
 <20160121171306.GC19272@bfoster.bfoster>

On 01/21/2016 12:13 PM, Brian Foster wrote:
> On Thu, Jan 21, 2016 at 05:34:11PM +0100, Jan Kara wrote:
>> On Thu 21-01-16 10:22:57, Brian Foster wrote:
>>> On Thu, Jan 21, 2016 at 07:11:59AM +1100, Dave Chinner wrote:
>>>> On Wed, Jan 20, 2016 at 02:26:26PM +0100, Jan Kara wrote:
>>>>> On Tue 19-01-16 12:59:12, Brian Foster wrote:
>>>>>> From: Dave Chinner
>>>>>>
> ...
>>>>>
>>>
>>> Hi Jan, Dave,
>>>
> ...
>>>>> a) How much sync(2) speed has improved if there's not much to wait for.
>>>>
>>>> Depends on the size of the inode cache when sync is run. If it's
>>>> empty it's not noticeable. When you have tens of millions of cached,
>>>> clean inodes the inode list traversal can take tens of seconds.
>>>> This is the sort of problem Josef reported that FB were having...
>>>>
>>>
>>> FWIW, Ceph has indicated this is a pain point for them as well. The
>>> results at [0] below show the difference in sync time with a largely
>>> populated inode cache before and after this patch.
>>>
>>>>> b) See whether a parallel heavy stat(2) load which is rotating lots of inodes
>>>>> in the inode cache sees some improvement when it doesn't have to contend with
>>>>> sync(2) on s_inode_list_lock. I believe Dave Chinner had some loads where
>>>>> the contention on s_inode_list_lock due to sync and rotation of inodes was
>>>>> pretty heavy.
>>>>
>>>> Just my usual fsmark workloads - they have parallel find and
>>>> parallel ls -lR traversals over the created fileset. Even just
>>>> running sync during creation (because there are millions of cached
>>>> inodes, and ~250,000 inodes being instantiated and reclaimed every
>>>> second) causes lock contention problems....
>>>>
>>>
>>> I ran a similar parallel (16x) fs_mark workload using '-S 4', which
>>> incorporates a sync() per pass. Without this patch, it demonstrates a
>>> slow degradation as the inode cache grows. Results at [1].
>>
>> Thanks for the results. I think it would be good if you incorporated them
>> in the changelog since other people will likely be asking similar
>> questions when they see the inode cache growing. Other than that feel free
>> to add:
>>
>> Reviewed-by: Jan Kara
>>
>
> No problem, thanks! Sure, I don't want to dump the raw stuff into the
> commit log description to avoid making it too long, but I can reference
> the core sync time impact. I've appended the following for now:
>
> "With this change, filesystem sync times are significantly reduced for
> filesystems with largely populated inode caches and otherwise no other
> work to do.
> For example, on a 16-CPU, 2GHz x86-64 server with a 10TB XFS filesystem
> and a ~10m entry inode cache, sync times are reduced from ~7.3s to less
> than 0.1s when the filesystem is fully clean."
>
> I'll repost in a day or so if I don't receive any other feedback.
>

Sorry I dropped the ball on this, guys; thanks for picking it up, Brian!
I think that changelog is acceptable.

Thanks,

Josef
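
For anyone skimming the thread who hasn't read the patch itself: the numbers
above come from sync no longer having to walk every cached inode just to find
the handful that are actually under writeback. Below is a minimal userspace
sketch of that idea - the struct and function names are made up for
illustration (they are not the kernel's), and locking and the real writeback
machinery are omitted:

#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-ins for kernel structures (illustrative only). */
struct inode {
	int		ino;
	int		under_writeback;
	struct inode	*next_cached;	/* link on sb->all_cached */
	struct inode	*next_wb;	/* link on sb->writeback_list */
};

struct superblock {
	struct inode	*all_cached;	/* every cached inode, clean or dirty */
	struct inode	*writeback_list;/* only inodes with writeback in flight */
};

/*
 * Old scheme: sync walks every cached inode just to find the few under
 * writeback - O(cached inodes), however clean the cache is.
 */
static void wait_sb_inodes_old(struct superblock *sb)
{
	struct inode *i;
	long walked = 0, waited = 0;

	for (i = sb->all_cached; i; i = i->next_cached) {
		walked++;
		if (i->under_writeback)
			waited++;	/* the wait would happen here */
	}
	printf("old: walked %ld inodes, waited on %ld\n", walked, waited);
}

/*
 * New scheme: sync walks only the dedicated writeback list -
 * O(inodes under writeback), independent of cache size.
 */
static void wait_sb_inodes_new(struct superblock *sb)
{
	struct inode *i;
	long waited = 0;

	for (i = sb->writeback_list; i; i = i->next_wb)
		waited++;		/* the wait would happen here */
	printf("new: waited on %ld inodes\n", waited);
}

int main(void)
{
	struct superblock sb = { NULL, NULL };
	int n;

	/* Populate a large, mostly clean inode cache. */
	for (n = 0; n < 1000000; n++) {
		struct inode *i = calloc(1, sizeof(*i));

		i->ino = n;
		i->next_cached = sb.all_cached;
		sb.all_cached = i;
		if (n % 100000 == 0) {	/* a handful are under writeback */
			i->under_writeback = 1;
			i->next_wb = sb.writeback_list;
			sb.writeback_list = i;
		}
	}

	wait_sb_inodes_old(&sb);
	wait_sb_inodes_new(&sb);
	return 0;
}

The actual patch of course has to keep that second list consistent, under the
appropriate locks, as inodes start and finish writeback, but the complexity
difference sketched above is what drives the sync-time numbers quoted in the
changelog.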