From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff Garzik
Subject: CLD ping/response algorithm
Date: Fri, 31 Jul 2009 17:55:20 -0400
Message-ID: <4A736848.20201@garzik.org>
References: <20090731104031.GA21249@havoc.gtf.org> <4A733A26.2080801@garzik.org> <4A7356FC.8020108@garzik.org>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: hail-devel-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"; format="flowed"
To: Sage Weil
Cc: hail-devel@vger.kernel.org, Pete Zaitcev

Sage Weil wrote:
> One question about the choice of UDP. I'm not sure how closely you're
> following the chubby design.. but if it's a similar liveness/notification
> model, the server is delaying keepalive rpc responses and piggybacking
> notification of updates. If the replies are lossy, are you just planning
> on a conservative client timeout/retry, and keeping the normal keepalive
> round trip a healthy factor shorter than the client lease length?
> Shorter timeouts mean higher server load (more frequent keepalives)... and
> longer timeouts mean frequent stalls on writes when doing the cache
> invalidation if there is any packet loss on the network...

In general, I'm not trying to closely follow the Chubby paper. I was
mainly inspired by its general design -- that of a filesystem. But
several implementation choices described by Google make a lot of
technical sense for CLD, so you see a lot of commonality.

At present, the algorithm is as follows:

* for all client events, send immediately
* if we haven't heard from the client in LEASE TIME / 2, ping them

However, once we have sent _any_ packet to the client, be it a response,
asynchronous event or ping, we wait for a client acknowledgement.

If we do not receive an ack after CLD_RETRY_START (2) seconds, we
- resend packet
- set next-retry timer *= 2

So, retry #2 from server->client comes 4 seconds after retry #1.
Retry #3 comes 8 seconds after retry #2.
And so on, until the client lease expires.

That is definitely different from Chubby. Is it wise? Unknown. This
was a design chosen for simplicity, and may need to be revisited once
caching (and strict cache coherence) is implemented in the client and
server.

Comments and criticism (and patches!) welcome. The CLD network protocol
is not yet "solid" in my opinion. Even setting aside the lack of
caching, master fail-over, large messages and sequence ids also need
additional attention.

	Jeff
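
P.S. For illustration, here is a minimal Python sketch of the ping rule
and the retry schedule described above. Only CLD_RETRY_START (2 seconds),
the doubling of the retry timer, and the LEASE TIME / 2 ping threshold
come from the description; the function names, the arguments, and the
assumption that the remaining lease is measured from the initial send are
hypothetical and are not taken from the CLD source.

```python
CLD_RETRY_START = 2  # seconds before the first resend (value from the text)


def should_ping(now, last_heard, lease_time):
    """Hypothetical helper: ping once the client has been silent
    for at least half of the lease time."""
    return (now - last_heard) >= lease_time / 2


def retry_offsets(lease_remaining, retry_start=CLD_RETRY_START):
    """Hypothetical helper: return the times, in seconds after the
    initial send, at which an unacknowledged packet would be resent.

    Assumption: lease_remaining is counted from the initial send, and
    retries stop once the next one would fall at or after lease expiry.
    """
    offsets = []
    delay = retry_start      # wait before retry #1
    elapsed = delay
    while elapsed < lease_remaining:
        offsets.append(elapsed)
        delay *= 2           # next-retry timer *= 2
        elapsed += delay
    return offsets
```

With a hypothetical 30 seconds of lease remaining, retry_offsets(30)
yields resends at 2, 6 and 14 seconds after the initial send, i.e. gaps
of 4 and 8 seconds between successive retries, matching the numbers
above. A fourth retry would land at exactly 30 seconds and is dropped
under the stated assumption.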