From: Alex Zhuravlev
Date: Wed, 08 Oct 2008 15:48:50 +0400
Subject: [Lustre-devel] COS performance issues
In-Reply-To: <200810081544.08292.alexander.zarochentsev@sun.com>
References: <200810081544.08292.alexander.zarochentsev@sun.com>
Message-ID: <48EC9E22.2090403@sun.com>
To: lustre-devel@lists.lustre.org

Have you tried profiling with a single CPU? That would probably give you an
idea of how much the "per-cpu" approach can help.

thanks, Alex

Alexander Zarochentsev wrote:
> I have a patch that avoids the use of obd_uncommitted_replies_lock
> in ptlrpc_server_handle_reply, but it has minimal effect:
> ptlrpc_server_handle_reply is still the most CPU-consuming function
> because of contention on svc->srv_lock.
>
> I think the problem is that COS defers the processing of replies until
> transaction commit time. When a commit happens, the MDS has to process
> thousands of replies (about 14k replies per commit in test 3.a) in a
> short period of time. I suspect the MDT service threads are all woken up
> and spin trying to take the service srv_lock. Processing of new requests
> may also suffer from this.
>
> I ran the tests with CONFIG_DEBUG_SPINLOCK_SLEEP debugging compiled into
> the kernel; it found no sleep-under-spinlock bugs.
>
> Further optimization may include:
> 1. per-reply spin locks;
> 2. per-CPU structures and threads to process the reply queues.
>
> Any comments?
>
> Thanks.
>
> PS. The test results are much better when the MDS server is the sata20
> machine with 4 cores (the MDS from Washie1 has 2 cores); COS=0 and COS=1
> differ by only 3%:
>
> COS=1
> Rate: 3101.77 creates/sec (total: 2 threads 930530 creates 300 secs)
> Rate: 3096.94 creates/sec (total: 2 threads 929083 creates 300 secs)
>
> COS=0
> Rate: 3184.01 creates/sec (total: 2 threads 958388 creates 301 secs)
> Rate: 3152.89 creates/sec (total: 2 threads 945868 creates 300 secs)