* [Cluster-devel] DLM thoughts - multi threaded recvd
@ 2006-12-20 15:01 Patrick Caulfield
2006-12-20 15:41 ` Steven Whitehouse
2006-12-20 15:53 ` David Teigland
0 siblings, 2 replies; 4+ messages in thread
From: Patrick Caulfield @ 2006-12-20 15:01 UTC
To: cluster-devel.redhat.com
One of the things that Andrew Morton commented on when taking the DLM was that there is a single dlm_recvd process (and dlm_sendd
too) processing incoming requests, which could become a bottleneck on large systems.
So, I've been thinking about how to make this scale a little better and have come up with a few questions.
1. How do we decide how many threads to start?
My first thought was to start one per CPU. But how do we cope with CPU hotplug events (if we do at all)? This
is also slightly wasteful in a two-node SMP cluster, where you could have 2 machines with 4 cores each, each running 4 dlm_recvd
threads with only really enough work for 1 per machine. We can't split messages from one machine across threads because the packets may
be fragmented. *
Is there a reasonable API in the kernel for getting the (current) number of CPUs in a system?
2. Do we need an additional sysfs parameter for the DLM that tells it how many threads to start, defaulting
to the number of CPUs in the system?
3. Is it worth multi-threading dlm_sendd too?
I'm not sure it is. dlm_sendd's job is very simple: to put stuff on the TCP (or SCTP) send queue. If that queue is full then the
request is simply requeued inside the DLM. It's not like dlm_recvd, which does actual locking operations.
* This complicates the SCTP case rather a lot because we don't know which remote machine a request came from until we read it,
since they all come down the same socket. But as SCTP is not currently reliable enough for us I'll ignore that for the moment.
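To illustrate the fragmentation constraint in point 1, here is a minimal user-space sketch in C. It assumes each peer is identified by an integer node id; the function names recv_thread_count() and recv_thread_for_node() are hypothetical and not taken from the DLM source. The idea is only that every message from one node lands on the same thread, and that starting more threads than there are peers gains nothing.

```c
#include <assert.h>

/* Hypothetical helper: how many receive threads are worth starting.
 * More threads than peer nodes cannot help, because each peer's byte
 * stream has to stay on a single thread. */
static unsigned int recv_thread_count(unsigned int online_cpus,
                                      unsigned int peer_nodes)
{
    unsigned int n = online_cpus < peer_nodes ? online_cpus : peer_nodes;

    return n ? n : 1;   /* always keep at least one thread */
}

/* Hypothetical helper: pick the thread that owns a given peer.
 * Every message (and every fragment) from one node id maps to the
 * same thread, so reassembly state is never shared between threads. */
static unsigned int recv_thread_for_node(unsigned int nodeid,
                                         unsigned int n_threads)
{
    assert(n_threads > 0);
    return nodeid % n_threads;
}
```

In the two-node, four-core case described above, recv_thread_count(4, 1) yields a single thread per machine, which matches the amount of work actually available.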
patrick
* [Cluster-devel] DLM thoughts - multi threaded recvd
From: Steven Whitehouse @ 2006-12-20 15:41 UTC
To: cluster-devel.redhat.com
Hi,
On Wed, 2006-12-20 at 15:01 +0000, Patrick Caulfield wrote:
> One of the things that Andrew Morton commented on when taking the DLM was that there is a single dlm_recvd process (and sendd
> too) processing incoming requests and it could become a bottleneck on large systems.
>
> So, I've been thinking how to make this scale a little better and have come up with several things.
>
> 1. How do we decide how many threads to start?
>
> My first thought was to start one per CPU. But how do we cope with CPU hotplug events (if we do at all)? This
> is also slightly wasteful in a two-node SMP cluster, where you could have 2 machines with 4 cores each, each running 4 dlm_recvd
> threads with only really enough work for 1 per machine. We can't split messages from one machine across threads because the packets may
> be fragmented. *
>
> Is there a reasonable API in the kernel for getting the (current) number of CPUs in a system ?
>
Not that I know of, although there are two different "numbers of CPUs" I
think, one being the current number and one being the max number. I had
the same problem when I was looking into hashing the rwlocks for the
glock hash table and settled for using the max number and hoping for the
best, though I think you need to be more accurate than I did.
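As a rough user-space illustration of the two "numbers of CPUs", the sketch below uses sysconf() with _SC_NPROCESSORS_ONLN (CPUs online now) and _SC_NPROCESSORS_CONF (CPUs configured). These are widely available extensions rather than strict POSIX, and they are only loosely analogous to the in-kernel num_online_cpus() and num_possible_cpus(). The lock_table_size() helper is hypothetical: it shows the "size by the max number" approach by rounding the CPU count up to a power of two, and is not the actual glock hash code.

```c
#include <assert.h>
#include <unistd.h>

/* CPUs online right now; this can change across hotplug events. */
static long cpus_online(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* CPUs configured in the system; an upper bound on the online count. */
static long cpus_configured(void)
{
    return sysconf(_SC_NPROCESSORS_CONF);
}

/* Hypothetical helper: size a table of hashed locks from a CPU count,
 * rounded up to a power of two so a bucket can be picked with a mask. */
static unsigned long lock_table_size(long cpus)
{
    unsigned long size = 1;

    if (cpus < 1)
        cpus = 1;       /* sysconf() may return -1 on error */
    while (size < (unsigned long)cpus)
        size <<= 1;
    assert((size & (size - 1)) == 0);   /* sanity: power of two */
    return size;
}
```

Sizing from cpus_configured() avoids having to resize on hotplug, at the cost of some slack when not every CPU is online.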
As an alternative suggestion - is it possible to do this without any
threads at all? In that case the receive processing would run in softirq
context, and on the same CPU that did the tcp receive processing. That
would potentially save two context switches per message delivered.
I'm not so sure that it's worth having the extra threads unless you are
able to bind each thread to a CPU and ensure that it only processes
packets delivered on that CPU.
Are you just talking about reading here? I assume that the accept part of
it isn't going to be a problem here so that could potentially stay as it
is?
There is an example of something similar to what I'm suggesting in
net/sunrpc/xprtsock.c:xs_tcp_data_recv() and xs_tcp_data_ready().
> 2. Do we need an additional sysfs parameter to the DLM that tells it how many threads to start which defaults
> to the number of CPUs in the system?
>
>
> 3. Is it worth multi-threading dlm_sendd too?
>
> I'm not sure it is. dlm_sendd's job is very simple...to put stuff on the TCP (or SCTP) send queue. If that queue is full then the
> request is simply requeued inside the DLM. It's not like dlm_recvd which does actual locking operations.
>
It's single threaded anyway as soon as it hits the TCP send queue. I
don't know if that's true of SCTP as well.
Steve.
* [Cluster-devel] DLM thoughts - multi threaded recvd
From: Patrick Caulfield @ 2006-12-21 11:57 UTC
To: cluster-devel.redhat.com
Steven Whitehouse wrote:
> Hi,
>
> On Wed, 2006-12-20 at 15:01 +0000, Patrick Caulfield wrote:
>> One of the things that Andrew Morton commented on when taking the DLM was that there is a single dlm_recvd process (and sendd
>> too) processing incoming requests and it could become a bottleneck on large systems.
>>
>> So, I've been thinking how to make this scale a little better and have come up with several things.
>>
>> 1. How do we decide how many threads to start?
>>
>> My first thought was to start one per CPU. But how do we cope with CPU hotplug events (if we do at all)? This
>> is also slightly wasteful in a two-node SMP cluster, where you could have 2 machines with 4 cores each, each running 4 dlm_recvd
>> threads with only really enough work for 1 per machine. We can't split messages from one machine across threads because the packets may
>> be fragmented. *
>>
>> Is there a reasonable API in the kernel for getting the (current) number of CPUs in a system ?
>>
> Not that I know of, although there are two different "numbers of CPUs" I
> think, one being the current number and one being the max number. I had
> the same problem when I was looking into hashing the rwlocks for the
> glock hash table and settled for using the max number and hoping for the
> best, though I think you need to be more accurate than I did.
>
> As an alternative suggestion - is it possible to do this without any
> threads at all? In that case the receive processing would run in softirq
> context, and on the same CPU that did the tcp receive processing. That
> would potentially save two context switches per message delivered.
I'm not sure softirq context would be appropriate, as there is a good chance that the DLM functions will need to sleep, which is
not allowed there. A workqueue might be an idea though.
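The split being suggested can be sketched in plain user-space C: the "data ready" path only queues a message and never sleeps, while a worker running in process context drains the queue and is free to block. This is a single-threaded illustration under stated assumptions, not kernel code; struct rx_queue, rx_defer(), rx_drain() and the example handler are all hypothetical names, and a real version would need locking or the kernel's own workqueue API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RX_QUEUE_LEN 8          /* hypothetical capacity */

struct rx_queue {
    int items[RX_QUEUE_LEN];    /* stand-ins for received messages */
    size_t head, tail, count;
};

/* "Data ready" path: must not sleep, so it only queues the message.
 * Returns false if the queue is full and the caller must retry later. */
static bool rx_defer(struct rx_queue *q, int msg)
{
    if (q->count == RX_QUEUE_LEN)
        return false;
    q->items[q->tail] = msg;
    q->tail = (q->tail + 1) % RX_QUEUE_LEN;
    q->count++;
    return true;
}

/* Worker path: runs in process context, so handle() may block.
 * Returns the number of messages processed. */
static size_t rx_drain(struct rx_queue *q, void (*handle)(int msg))
{
    size_t done = 0;

    assert(handle != NULL);
    while (q->count > 0) {
        int msg = q->items[q->head];

        q->head = (q->head + 1) % RX_QUEUE_LEN;
        q->count--;
        handle(msg);
        done++;
    }
    return done;
}

/* Example handler: sums message values so the effect is observable. */
static int rx_handled_sum;

static void rx_sum_handler(int msg)
{
    rx_handled_sum += msg;
}
```

Messages are handled in arrival order, which keeps the per-connection ordering that fragmented packets rely on.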
> I'm not so sure that it's worth having the extra threads unless you are
> able to bind each thread to a CPU and ensure that it only processes
> packets delivered on that CPU.
>
> Are you just talking about reading here? I assume that the accept part of
> it isn't going to be a problem here so that could potentially stay as it
> is?
>
> There is an example of something similar to what I'm suggesting in
> net/sunrpc/xprtsock.c:xs_tcp_data_recv() and xs_tcp_data_ready().
>
>> 2. Do we need an additional sysfs parameter to the DLM that tells it how many threads to start which defaults
>> to the number of CPUs in the system?
>>
>>
>> 3. Is it worth multi-threading dlm_sendd too?
>>
>> I'm not sure it is. dlm_sendd's job is very simple...to put stuff on the TCP (or SCTP) send queue. If that queue is full then the
>> request is simply requeued inside the DLM. It's not like dlm_recvd which does actual locking operations.
>>
> It's single threaded anyway as soon as it hits the TCP send queue. I
> don't know if that's true of SCTP as well.
Ok, I might as well leave that then, thanks.
--
patrick
* [Cluster-devel] DLM thoughts - multi threaded recvd
From: David Teigland @ 2006-12-20 15:53 UTC
To: cluster-devel.redhat.com
On Wed, Dec 20, 2006 at 03:01:45PM +0000, Patrick Caulfield wrote:
> One of the things that Andrew Morton commented on when taking the DLM
> was that there is a single dlm_recvd process (and sendd too)
> processing incoming requests and it could become a bottleneck on large
> systems.
Yes, dlm_recvd does quite a bit of work per message; I could see a backlog
as a real possibility. Whether that would have very dire effects I'm
not sure.
>
> So, I've been thinking how to make this scale a little better and have
> come up with several things.
>
> 1. How do we decide how many threads to start?
I think we just pick a default number that will work well for most use
cases, probably something like 2, and then allow it to be configured. Or,
create one or two threads per lockspace.
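A minimal sketch of both options, assuming a configured value of 0 means "not set". The default of 2 comes from the suggestion above; the upper bound of 64 and all names here are hypothetical, chosen only for illustration.

```c
#include <assert.h>

#define RECV_THREADS_DEFAULT 2   /* default suggested above */
#define RECV_THREADS_MAX     64  /* hypothetical sanity limit */

/* Option 1: a fixed default that a configuration value can override.
 * configured == 0 is treated as "not set". */
static unsigned int recv_threads_effective(unsigned int configured)
{
    unsigned int n = configured ? configured : RECV_THREADS_DEFAULT;

    if (n > RECV_THREADS_MAX)
        n = RECV_THREADS_MAX;
    assert(n >= 1);
    return n;
}

/* Option 2: a fixed number of threads per lockspace, so the total
 * grows with the number of lockspaces rather than with CPUs. */
static unsigned int recv_threads_for_lockspaces(unsigned int n_lockspaces,
                                                unsigned int per_lockspace)
{
    return n_lockspaces * per_lockspace;
}
```

With option 2, three lockspaces at two threads each would give six receive threads regardless of how many CPUs the machine has.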