Subject: Re: [PATCH v6 1/2] sb: add a new writeback list for sync
To: Brian Foster, Jan Kara
CC: Dave Chinner, linux-fsdevel@vger.kernel.org
From: Josef Bacik
Message-ID: <56A11E83.1010000@fb.com>
Date: Thu, 21 Jan 2016 13:08:03 -0500
In-Reply-To: <20160121171306.GC19272@bfoster.bfoster>
References: <1453226353-61481-1-git-send-email-bfoster@redhat.com>
 <1453226353-61481-2-git-send-email-bfoster@redhat.com>
 <20160120132626.GE10810@quack.suse.cz>
 <20160120201159.GW6033@dastard>
 <20160121152256.GB19272@bfoster.bfoster>
 <20160121163411.GP10810@quack.suse.cz>
 <20160121171306.GC19272@bfoster.bfoster>

On 01/21/2016 12:13 PM, Brian Foster wrote:
> On Thu, Jan 21, 2016 at 05:34:11PM +0100, Jan Kara wrote:
>> On Thu 21-01-16 10:22:57, Brian Foster wrote:
>>> On Thu, Jan 21, 2016 at 07:11:59AM +1100, Dave Chinner wrote:
>>>> On Wed, Jan 20, 2016 at 02:26:26PM +0100, Jan Kara wrote:
>>>>> On Tue 19-01-16 12:59:12, Brian Foster wrote:
>>>>>> From: Dave Chinner
>>>>>>
> ...
>>>>>
>>>
>>> Hi Jan, Dave,
>>>
> ...
>>>>> a) How much sync(2) speed has improved if there's not much to wait for.
>>>>
>>>> Depends on the size of the inode cache when sync is run. If it's
>>>> empty it's not noticeable. When you have tens of millions of cached,
>>>> clean inodes the inode list traversal can take tens of seconds.
>>>> This is the sort of problem Josef reported that FB were having...
>>>>
>>>
>>> FWIW, Ceph has indicated this is a pain point for them as well. The
>>> results at [0] below show the difference in sync time with a largely
>>> populated inode cache before and after this patch.
>>>
>>>>> b) See whether a parallel heavy stat(2) load which is rotating lots of inodes
>>>>> in the inode cache sees some improvement when it doesn't have to contend with
>>>>> sync(2) on s_inode_list_lock. I believe Dave Chinner had some loads where
>>>>> the contention on s_inode_list_lock due to sync and rotation of inodes was
>>>>> pretty heavy.
>>>>
>>>> Just my usual fsmark workloads - they have parallel find and
>>>> parallel ls -lR traversals over the created fileset. Even just
>>>> running sync during creation (because there are millions of cached
>>>> inodes, and ~250,000 inodes being instantiated and reclaimed every
>>>> second) causes lock contention problems....
>>>>
>>>
>>> I ran a similar parallel (16x) fs_mark workload using '-S 4', which
>>> incorporates a sync() per pass. Without this patch, it demonstrates a
>>> slow degradation as the inode cache grows. Results at [1].
>>
>> Thanks for the results. I think it would be good if you incorporated them
>> in the changelog since other people will likely be asking similar
>> questions when they see the inode cache growing. Other than that feel free
>> to add:
>>
>> Reviewed-by: Jan Kara
>>
>
> No problem, thanks! Sure, I don't want to dump the raw stuff into the
> commit log description to avoid making it too long, but I can reference
> the core sync time impact. I've appended the following for now:
>
> "With this change, filesystem sync times are significantly reduced for
> filesystems with largely populated inode caches and otherwise no other
> work to do.
> For example, on a 16-CPU, 2GHz x86-64 server with a 10TB XFS filesystem
> and a ~10m entry inode cache, sync times are reduced from ~7.3s to less
> than 0.1s when the filesystem is fully clean."
>
> I'll repost in a day or so if I don't receive any other feedback.
>

Sorry I dropped the ball on this, guys; thanks for picking it up, Brian!
I think that changelog is acceptable.

Thanks,

Josef
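
For anyone skimming the thread who hasn't read the patch itself: the numbers
above come from sync no longer having to walk every cached inode just to find
the handful that are actually under writeback. Below is a minimal userspace
sketch of that idea - the struct and function names are made up for
illustration (they are not the kernel's), and locking and the real writeback
machinery are omitted:

#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-ins for kernel structures (illustrative only). */
struct inode {
	int		ino;
	int		under_writeback;
	struct inode	*next_cached;	/* link on sb->all_cached */
	struct inode	*next_wb;	/* link on sb->writeback_list */
};

struct superblock {
	struct inode	*all_cached;	/* every cached inode, clean or dirty */
	struct inode	*writeback_list;/* only inodes with writeback in flight */
};

/*
 * Old scheme: sync walks every cached inode just to find the few under
 * writeback - O(cached inodes), however clean the cache is.
 */
static void wait_sb_inodes_old(struct superblock *sb)
{
	struct inode *i;
	long walked = 0, waited = 0;

	for (i = sb->all_cached; i; i = i->next_cached) {
		walked++;
		if (i->under_writeback)
			waited++;	/* the wait would happen here */
	}
	printf("old: walked %ld inodes, waited on %ld\n", walked, waited);
}

/*
 * New scheme: sync walks only the dedicated writeback list -
 * O(inodes under writeback), independent of cache size.
 */
static void wait_sb_inodes_new(struct superblock *sb)
{
	struct inode *i;
	long waited = 0;

	for (i = sb->writeback_list; i; i = i->next_wb)
		waited++;		/* the wait would happen here */
	printf("new: waited on %ld inodes\n", waited);
}

int main(void)
{
	struct superblock sb = { NULL, NULL };
	int n;

	/* Populate a large, mostly clean inode cache. */
	for (n = 0; n < 1000000; n++) {
		struct inode *i = calloc(1, sizeof(*i));

		i->ino = n;
		i->next_cached = sb.all_cached;
		sb.all_cached = i;
		if (n % 100000 == 0) {	/* a handful are under writeback */
			i->under_writeback = 1;
			i->next_wb = sb.writeback_list;
			sb.writeback_list = i;
		}
	}

	wait_sb_inodes_old(&sb);
	wait_sb_inodes_new(&sb);
	return 0;
}

The actual patch of course has to keep that second list consistent, under the
appropriate locks, as inodes start and finish writeback, but the complexity
difference sketched above is what drives the sync-time numbers quoted in the
changelog.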