From: Mike Waychison <mikew@google.com>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v1 0/8] Deferred dput() and iput() -- reducing lock contention
Date: Tue, 20 Jan 2009 12:00:29 -0800
Message-ID: <49762D5D.6050705@google.com>
In-Reply-To: <49718304.90404@cosmosbay.com>
Eric Dumazet wrote:
> Mike Waychison wrote:
>> We've noticed that at times it can become very easy to have a system begin to
>> livelock on dcache_lock/inode_lock (specifically in atomic_dec_and_lock()) when
>> a lot of dentries are getting finalized at the same time (massive delete and
>> large fdtable destructions are two paths I've seen cause problems).
>>
>> This patchset is an attempt to try and reduce the locking overheads associated
>> with final dput() and final iput(). This is done by batching dentries and
>> inodes into per-process queues and processing them in 'parallel' to consolidate
>> some of the locking.
>>
>> Besides various workload testing, I threw together a load (at the end of this
>> email) that causes massive fdtables (50K sockets by default) to get destroyed
>> on each cpu in the system. It also populates the dcache for procfs on those
>> tasks for good measure. Comparing lock_stat results (hardware is a Sun x4600
>> M2 populated with 8 4-core 2.3GHz packages (32 CPUs) + 128GiB RAM):
>>
>
> Hello Mike
>
> Seems quite a large/intrusive infrastructure for a well known problem.
> I even wasted some time on it.
> But it seems nobody cared too much or people were too busy.
>
> https://kerneltrap.org/mailarchive/linux-netdev/2008/12/11/4397594
>
> (patch 6 should be discarded, as follow-ups show it was wrong:
> [PATCH 6/7] fs: struct file move from call_rcu() to SLAB_DESTROY_BY_RCU)
>
>
> sockets / pipes don't need dcache_lock or inode_lock at all, I am
> sure Google machines also use sockets :)
Yup :) I'll try to take a look at your patches this week. At a
minimum, the removal of the locks seems highly desirable.
>
> Your test/bench program is quite biased (populating dcache for procfs, using
> 50k filedesc on 32 cpu, not very realistic IMHO).
Yup, extremely biased. It was meant to hurt the dput/iput path
specifically and I used it as a way to compare apples to apples
with/without the changes. It is still representative of a real-world
workload we see though (our frontend servers when they are restarted
have many tcp sockets, easily more than 50K each).
>
> I had a workload with processes using 1,000,000 file descriptors
> (mainly sockets) and got some latency problems when they had to exit().
> This problem was addressed by one cond_resched() added in close_files()
> (commit 944be0b224724fcbf63c3a3fe3a5478c325a6547).
>
Yup. We pulled that change into our tree a while back for the same
reason. It doesn't help the lock contention issue though.
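
For readers following the thread, the batching idea from the quoted cover letter (queue final puts per process, then take the contended lock once per batch) can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patchset: struct obj, struct put_batch, deferred_put(), batch_flush(), BATCH_MAX and global_lock are all hypothetical names, a pthread mutex stands in for dcache_lock/inode_lock, and "finalizing" an object is reduced to setting a flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MAX 64                    /* hypothetical batch size */

struct obj {                            /* stand-in for a dentry or inode */
	atomic_int refcount;
	bool finalized;                 /* written only under global_lock */
};

struct put_batch {                      /* one per thread, filled without locking */
	struct obj *items[BATCH_MAX];
	size_t n;
};

/* Stand-in for the contended global lock (dcache_lock/inode_lock). */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long lock_acquisitions; /* protected by global_lock */

/* Finalize every queued object under a single lock acquisition. */
static void batch_flush(struct put_batch *b)
{
	if (b->n == 0)
		return;
	pthread_mutex_lock(&global_lock);
	lock_acquisitions++;
	for (size_t i = 0; i < b->n; i++)
		b->items[i]->finalized = true;
	pthread_mutex_unlock(&global_lock);
	b->n = 0;
}

/*
 * Drop one reference. If it was the last one, queue the object on the
 * caller's private batch instead of taking the global lock right away.
 */
static void deferred_put(struct put_batch *b, struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) != 1)
		return;                 /* other references remain */
	assert(b->n < BATCH_MAX);
	b->items[b->n++] = o;
	if (b->n == BATCH_MAX)
		batch_flush(b);         /* batch full: one lock round trip */
}
```

In this sketch, dropping the last reference on 130 objects from one thread takes the lock three times (two full batches plus a final batch_flush() for the remaining two) rather than 130 times. The caller has to flush its batch before it exits, which is the same kind of draining the later patches in the series add for the sync and drop_caches paths.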