From: Mike Waychison
Subject: Re: [PATCH v1 0/8] Deferred dput() and iput() -- reducing lock contention
Date: Tue, 20 Jan 2009 12:00:29 -0800
Message-ID: <49762D5D.6050705@google.com>
In-Reply-To: <49718304.90404@cosmosbay.com>
To: Eric Dumazet
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Andrew Morton

Eric Dumazet wrote:
> Mike Waychison wrote:
>> We've noticed that at times it can become very easy for a system to
>> begin to livelock on dcache_lock/inode_lock (specifically in
>> atomic_dec_and_lock()) when a lot of dentries are getting finalized at
>> the same time (massive deletes and large fdtable destructions are two
>> paths I've seen cause problems).
>>
>> This patchset is an attempt to reduce the locking overheads associated
>> with final dput() and final iput().  This is done by batching dentries
>> and inodes into per-process queues and processing them in 'parallel'
>> to consolidate some of the locking.
>>
>> Besides various workload testing, I threw together a load (at the end
>> of this email) that causes massive fdtables (50K sockets by default)
>> to get destroyed on each cpu in the system.  It also populates the
>> dcache for procfs on those tasks for good measure.
>> Comparing lock_stat results (hardware is a Sun x4600 M2 populated
>> with 8 4-core 2.3GHz packages (32 CPUs) + 128GiB RAM):
>
> Hello Mike
>
> Seems quite a large/intrusive infrastructure for a well known problem.
> I even wasted some time on it.
> But it seems nobody cared too much or people were too busy.
>
> https://kerneltrap.org/mailarchive/linux-netdev/2008/12/11/4397594
>
> (patch 6 should be discarded as followups show it was wrong
> [PATCH 6/7] fs: struct file move from call_rcu() to SLAB_DESTROY_BY_RCU)
>
> sockets / pipes don't need dcache_lock or inode_lock at all, I am
> sure Google machines also use sockets :)

Yup :)  I'll try to take a look at your patches this week.  At a
minimum, the removal of the locks seems highly desirable.

> Your test/bench program is quite biased (populating dcache for procfs,
> using 50k filedesc on 32 cpu, not very realistic IMHO).

Yup, extremely biased.  It was meant to hurt the dput/iput path
specifically, and I used it as a way to compare apples to apples
with/without the changes.  It is still representative of a real-world
workload we see, though: when our frontend servers are restarted, they
have many TCP sockets, easily more than 50K each.

> I had a workload with processes using 1,000,000 file descriptors
> (mainly sockets) and got some latency problems when they had to exit().
> This problem was addressed by one cond_resched() added in close_files()
> (commit 944be0b224724fcbf63c3a3fe3a5478c325a6547).

Yup.  We pulled that change into our tree a while back for the same
reason.  It doesn't help the lock contention issue, though.