From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christophe Gouault
Subject: Re: [PATCH ipsec-next 2/2] xfrm: configure policy hash table thresholds by /proc
Date: Mon, 19 May 2014 09:41:05 +0200
Message-ID: <5379B591.6020001@6wind.com>
References: <1399902325-1788-1-git-send-email-christophe.gouault@6wind.com>
 <1399902325-1788-3-git-send-email-christophe.gouault@6wind.com>
 <20140515083447.GC32371@secunet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "David S. Miller" , netdev@vger.kernel.org
To: Steffen Klassert
Return-path:
Received: from mail-wi0-f170.google.com ([209.85.212.170]:44927 "EHLO
 mail-wi0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750852AbaESHlR (ORCPT );
 Mon, 19 May 2014 03:41:17 -0400
Received: by mail-wi0-f170.google.com with SMTP id bs8so4737968wib.3
 for ; Mon, 19 May 2014 00:41:16 -0700 (PDT)
In-Reply-To: <20140515083447.GC32371@secunet.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On 05/15/2014 10:34 AM, Steffen Klassert wrote:
> On Mon, May 12, 2014 at 03:45:25PM +0200, Christophe Gouault wrote:
>> Enable to specify local and remote prefix length thresholds
>> for the policy hash table via /proc entries. Example:
>>
>> echo 0 24 > /proc/sys/net/ipv4/xfrm4_policy_hash_tresh
>> echo 0 56 > /proc/sys/net/ipv6/xfrm6_policy_hash_tresh
>
> I would not like to have this configurable from userspace.
> Fist of all, a good threshold depends on the IPsec configuration
> and can change during runtime. So it is not obvious for a user
> which values are good for his configuration. Most users will
> just leave the default, so they will not benefit from your
> changes.

Hi Steffen,

As with several other /proc entries, the default values are suitable
for simple use cases and users can leave them unchanged. Users usually
only start tuning them when they have a specific use case (typically
scalability needs).
Moreover, I am concerned that any heuristic for automatic changes would
be a performance killer when the system is flapping. See below.

> Second, on the long run we have to remove the IPsec flowcache
> as this has the same limitation as our routing cache had.
> To do this, we need to replace the hashlist based policy and
> state lookups by a well performing lookup algorithm and I
> would like to do that without any user visible changes.

Efficient lookup is a field we have studied for a long time at my
company. There are many theses on multi-field classification, but none
of them covers all use cases. All suffer from limitations (build time,
memory consumption, number of fields, unpredictable time and memory
usage...) and each is suited to a specific use case. The best approach
seems to be to offer several methods and let the user select and tune
them according to the use case.

The main advantage of the hash table with configurable thresholds is
that it covers a wide variety of use cases simply by adjusting the
thresholds. And we have the benefit of "keep it simple".

> Can't we tune the hash threshold internally? We could maintain
> a per hashlist policy counter. If we have 'many' policies and
> most of these policies are in the same hashlist we could change
> the hash threshold. We could check this when we add policies
> and update the hash threshold if needed.

I think that finding a generic algorithm to determine a good trade-off
for the local and remote thresholds is quite tough.

I'm afraid tracking the number of entries in each hlist is not enough.
It would help to trigger a change, but not to choose the new values.

The thresholds determine both which SPs will actually be hashed (vs.
those that will just be queued in the inexact list) and the number of
bits that will be included in the hash key (and hence the entropy of
the key). Moreover, it is a pair of thresholds, which makes the choice
even harder.
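
To make the effect of a threshold pair concrete, here is a minimal
Python sketch. It is an illustration only, not the kernel code: the
function names, the IPv4-only handling, the toy hash mix and the bucket
count are all assumptions made for the example.

```python
import ipaddress

ADDR_BITS = 32  # assumption: IPv4 only, to keep the sketch short


def hash_key(local, remote, lthresh, rthresh):
    """Return the (local, remote) hash key for a selector, or None.

    None means one prefix is shorter than its threshold, so the policy
    would go to the inexact list instead of the hash table.
    """
    lnet = ipaddress.ip_network(local)
    rnet = ipaddress.ip_network(remote)
    if lnet.prefixlen < lthresh or rnet.prefixlen < rthresh:
        return None
    # Only the leading `threshold` bits of each address enter the key.
    lkey = int(lnet.network_address) >> (ADDR_BITS - lthresh)
    rkey = int(rnet.network_address) >> (ADDR_BITS - rthresh)
    return (lkey, rkey)


def bucket(key, nbuckets=8):
    """Toy hash mix (hypothetical, not the kernel's hash function)."""
    lkey, rkey = key
    return (lkey * 31 + rkey) % nbuckets


if __name__ == "__main__":
    # Hypothetical SPD content: two subnet policies and one host policy.
    policies = [("0.0.0.0/0", "192.168.1.0/24"),
                ("0.0.0.0/0", "192.168.2.0/24"),
                ("10.0.0.1/32", "10.0.0.2/32")]
    for lthresh, rthresh in [(32, 32), (0, 24)]:
        for local, remote in policies:
            key = hash_key(local, remote, lthresh, rthresh)
            where = "inexact list" if key is None else f"bucket {bucket(key)}"
            print(f"thresholds {lthresh}/{rthresh}: "
                  f"{local} -> {remote}: {where}")
```

In this sketch, a 32/32 pair hashes only the host-to-host policy, while
a 0/24 pair also hashes the two /24 policies but drops every local bit
from the key. Widening coverage and keeping entropy pull in opposite
directions, and the two thresholds interact.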

A user who knows what his SPD contains would probably prefer to be able
to tune the hash thresholds instead of relying on an uncontrolled,
automatic algorithm. Exporting a userland API (here via /proc) lets a
user or a daemon choose a strategy based on information the kernel does
not necessarily have, and makes it possible to implement various
(possibly complex) policies.

> Everything else looks pretty good, thanks!
>

You're welcome :)