From mboxrd@z Thu Jan  1 00:00:00 1970
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Wed, 20 Dec 2006 15:01:45 +0000
Subject: [Cluster-devel] DLM thoughts - multi threaded recvd
Message-ID: <45895059.9030506@redhat.com>
List-Id: <cluster-devel.redhat.com>
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

One of the things that Andrew Morton commented on when taking the DLM was that there is a single dlm_recvd process (and a single
dlm_sendd too) processing incoming requests, and it could become a bottleneck on large systems.

So, I've been thinking about how to make this scale a little better and have come up with several questions.

1. How do we decide how many threads to start?

My first thought was to start one per CPU. But how do we cope with CPU hotplug events (if we do at all)? This
is also slightly wasteful in a two-node SMP cluster, where you could have 2 machines with 4 cores each, both running 4 dlm_recvd
threads when there is only really work for 1 per machine. We can't split up messages from one machine over threads because the
packets may be fragmented. *

Is there a reasonable API in the kernel for getting the (current) number of CPUs in a system?

2. Do we need an additional sysfs parameter to the DLM that tells it how many threads to start which defaults
to the number of CPUs in the system?
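
To make points 1 and 2 a bit more concrete, here is a rough userspace sketch of the policy I have in mind. It is not DLM
code: pick_recv_threads() and thread_for_node() are made-up names, and the modulo mapping is just one simple way to keep
all traffic from a node on one thread. In the kernel the CPU count could come from num_online_cpus().

```c
#include <assert.h>

/*
 * Hypothetical helper: decide how many receive threads to start.
 *   ncpus     - CPUs currently online (num_online_cpus() in the kernel)
 *   npeers    - remote nodes we receive from, 0 if not yet known
 *   requested - administrator override (e.g. a sysfs value), 0 = default
 */
int pick_recv_threads(int ncpus, int npeers, int requested)
{
    int n;

    assert(ncpus > 0);

    /* Default to one thread per CPU unless the admin asked otherwise. */
    n = requested > 0 ? requested : ncpus;

    /* A thread with no peer to serve would just sit idle. */
    if (npeers > 0 && n > npeers)
        n = npeers;

    return n < 1 ? 1 : n;
}

/*
 * Hypothetical helper: pin each remote node to exactly one thread, so
 * that fragments of a message are always reassembled by the same thread.
 */
int thread_for_node(int nodeid, int nthreads)
{
    assert(nodeid >= 0 && nthreads > 0);
    return nodeid % nthreads;
}
```

With this, the two-node, 4-core example above ends up with a single receive thread per machine, while a larger cluster
gets one per CPU unless the override says otherwise.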


3. Is it worth multi-threading dlm_sendd too?

I'm not sure it is. dlm_sendd's job is very simple: to put stuff on the TCP (or SCTP) send queue. If that queue is full then the
request is simply requeued inside the DLM. It's not like dlm_recvd, which does actual locking operations.
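
As a toy illustration of why the send side is cheap, the loop amounts to something like the following. This is a
userspace model only, not the real dlm_sendd: struct sender, queue_msg(), flush_pending() and both capacities are
invented for the example, and the socket send queue is modelled as a simple counter.

```c
#include <assert.h>

#define SOCKQ_CAP   4   /* assumed capacity of the socket send queue */
#define PENDING_CAP 16  /* assumed capacity of the DLM-internal queue */

struct sender {
    int pending[PENDING_CAP]; /* ring buffer of message ids awaiting send */
    int head;                 /* index of the oldest pending message */
    int count;                /* number of pending messages */
    int sock_used;            /* slots currently taken in the socket queue */
};

/* Add a message to the internal queue; returns -1 if that is full too. */
int queue_msg(struct sender *s, int msg)
{
    if (s->count == PENDING_CAP)
        return -1;
    s->pending[(s->head + s->count) % PENDING_CAP] = msg;
    s->count++;
    return 0;
}

/*
 * Move pending messages to the socket queue until it fills up.
 * Whatever does not fit stays on the internal queue, in order,
 * for the next pass. Returns the number of messages handed over.
 */
int flush_pending(struct sender *s)
{
    int sent = 0;

    while (s->count > 0 && s->sock_used < SOCKQ_CAP) {
        s->head = (s->head + 1) % PENDING_CAP;
        s->count--;
        s->sock_used++;
        sent++;
    }
    return sent;
}
```

There is no lock processing anywhere in that path, so a second sender thread would mostly be contending for the same queue.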


* This complicates the SCTP case rather a lot because we don't know which remote machine the request came from until we read it,
since they all come down the same socket. But as SCTP is not currently reliable enough for us, I'll ignore that for the moment.

patrick