From mboxrd@z Thu Jan 1 00:00:00 1970
From: Patrick Caulfield
Date: Wed, 20 Dec 2006 15:01:45 +0000
Subject: [Cluster-devel] DLM thoughts - multi threaded recvd
Message-ID: <45895059.9030506@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

One of the things that Andrew Morton commented on when taking the DLM was
that there is a single dlm_recvd process (and sendd too) processing
incoming requests, and it could become a bottleneck on large systems.

So, I've been thinking about how to make this scale a little better and
have come up with several things.

1. How do we decide how many threads to start? My first thought was to
start one per CPU. But how do we cope with CPU hotplug events (if we do at
all)? This is also slightly wasteful in a two-node SMP cluster, where you
could have 2 machines with 4 cores each, each running 4 dlm_recvd threads,
with really only enough work for 1 thread per machine. We can't split up
messages from one machine over threads because the packets may be
fragmented. *

Is there a reasonable API in the kernel for getting the (current) number
of CPUs in a system?

2. Do we need an additional sysfs parameter to the DLM that tells it how
many threads to start, defaulting to the number of CPUs in the system?

3. Is it worth multi-threading dlm_sendd too? I'm not sure it is.
dlm_sendd's job is very simple: to put stuff on the TCP (or SCTP) send
queue. If that queue is full then the request is simply requeued inside
the DLM. It's not like dlm_recvd, which does actual locking operations.

* This complicates the SCTP case rather a lot, because we don't know which
remote machine the request came from until we read it; they all come down
the same socket. But as SCTP is not currently reliable enough for us I'll
ignore that for the moment.

patrick
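
To make points 1 and 2 a bit more concrete, here is a rough userspace sketch, not DLM code. All of the names (online_cpus, recv_thread_count, thread_for_node) are made up for illustration. It assumes an optional override (standing in for the sysfs parameter) that defaults to the online CPU count, caps the result at the number of remote nodes, and sends every message from a given node to the same thread so that fragmented messages are never split across threads. In the kernel itself, num_online_cpus() reports the CPUs currently online; the sketch uses sysconf() as a userspace stand-in.

```c
#include <assert.h>
#include <unistd.h>

/* Userspace stand-in for asking how many CPUs are currently online.
 * Falls back to 1 if the value cannot be determined. */
static unsigned int online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n < 1 ? 1u : (unsigned int)n;
}

/* Hypothetical policy: "configured" plays the role of the proposed
 * tunable, with 0 meaning "not set, use the CPU count". The result is
 * capped at the number of remote nodes, because a thread with no node
 * assigned to it would have nothing to do, and is never less than 1. */
static unsigned int recv_thread_count(unsigned int configured,
                                      unsigned int cpus,
                                      unsigned int remote_nodes)
{
    unsigned int n = configured ? configured : cpus;

    if (n > remote_nodes)
        n = remote_nodes;
    if (n == 0)
        n = 1;
    return n;
}

/* Hypothetical mapping: all traffic from one node goes to one thread,
 * so a message arriving in fragments is reassembled by a single thread. */
static unsigned int thread_for_node(unsigned int nodeid,
                                    unsigned int nthreads)
{
    assert(nthreads > 0);
    return nodeid % nthreads;
}
```

With this policy, recv_thread_count(0, online_cpus(), 1) on a 4-core box in a two-node cluster gives a single thread, which avoids the wasted threads described in point 1. It does nothing about CPU hotplug; the count is only sampled once at startup.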