* IPsec policy database customization proposal
@ 2014-06-30 12:50 Christophe Gouault
From: Christophe Gouault @ 2014-06-30 12:50 UTC
To: Steffen Klassert, Herbert Xu, David S. Miller; +Cc: netdev@vger.kernel.org
Hi IPsec and network maintainers,
After proposing a patchset to netdev (xfrm: scalability enhancements
for policy database) and discussing with Steffen Klassert, we agree
that the SPD lookup algorithm needs performance and scalability
improvements: SPs with non-prefixed selectors are optimized through a
hash table, but the other SPs (the majority) are stored in a sorted
linked list, which does not scale. Additionally, a flowcache is used,
and is known not to scale.
The bottleneck is the SPD lookup by selector (both configuration and the lookup itself).
Unfortunately, there is no all-in-one multi-field classifier that
would behave well in all situations. However, various classifiers
exist that are fitted to this or that use case. Therefore, I suggest
the following approach: adding hooks in the IPsec SPD, so that one can
dynamically register a custom SPD implementation ("SPD driver") fitted
to one's use case, typically by loading a kernel module.
This obviously needs discussion before starting any development, so
here is a more detailed proposal:
- Define the minimum handlers to manipulate the SPD lookup by selector (alloc,
insert, delete, flush, lookup_bysel, lookup_byflow, destroy...).
- export a register/unregister function, so that an SPD implementation may
register/unregister its handlers (a sketch of such an interface follows
this list).
- Separate the SPD common code from the SPD lookup by selector code. Keep the
policy_all and policy_byidx tables in the common code, extract the current
policy_inexact + policy_bydst implementation as an SPD driver. It is the
default implementation when no SPD driver is registered.
- *struct xfrm_policy* must offer a private area for SPD driver data (a void *,
an opaque placeholder of fixed size, or an opaque placeholder of size
specific to the driver implementation).
- since we keep the current implementation as the default, the policy_inexact +
policy_bydst database heads (currently stored in netns->xfrm) and the
xfrm_policy link fields (bydst and flo) may remain at their current location.
- SPD drivers needing some configuration may export their specific
configuration API (/proc, netlink...)
- as a first step, we only support one registered handler at a time.
- as a first step, an SPD driver can only be loaded or unloaded if the SPD is
empty (return EBUSY otherwise).
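To make this concrete, below is a minimal sketch of what such a driver
interface could look like. All names here (xfrm_spd_ops,
xfrm_spd_register, spd_priv) are hypothetical illustrations of the
handlers listed above, not existing kernel symbols:

/*
 * Hypothetical sketch only: none of these symbols exist in the
 * kernel; they just illustrate the handler set and registration
 * calls described in this proposal.
 */
#include <linux/module.h>
#include <net/flow.h>
#include <net/xfrm.h>

struct xfrm_spd_ops {
	/* lifecycle of the per-netns selector-indexed structure */
	void *(*alloc)(struct net *net);
	void (*destroy)(struct net *net, void *spd);

	/* SP insertion/removal in that structure */
	int (*insert)(void *spd, struct xfrm_policy *pol);
	int (*delete)(void *spd, struct xfrm_policy *pol);
	void (*flush)(void *spd);

	/* the two lookup paths that must stay fast */
	struct xfrm_policy *(*lookup_bysel)(void *spd,
					    const struct xfrm_selector *sel);
	struct xfrm_policy *(*lookup_byflow)(void *spd,
					     const struct flowi *fl);

	struct module *owner;
};

/*
 * One registered driver at a time as a first step; returns -EBUSY
 * if the SPD is not empty or another driver is already registered.
 */
int xfrm_spd_register(const struct xfrm_spd_ops *ops);
int xfrm_spd_unregister(const struct xfrm_spd_ops *ops);

The handlers operate on an opaque per-netns pointer, so each driver can
choose its own data structure; the private area in struct xfrm_policy
(e.g. a void *spd_priv member) would be owned by the registered driver.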
Remarks:
- this architecture is open to later evolutions such as supporting the
registration of several handlers, dynamically listing/selecting/switching
drivers via netlink messages (to support dynamic change of SPD implementation
according to SPD content).
- loading/unloading or changing SPD drivers with a non-empty SPD implies
rebuilding the SPD from the SP list. This may lock the SPD for a rather
long time.
I would like your opinions/questions/advice.
Best Regards,
Christophe
* Re: IPsec policy database customization proposal
@ 2014-07-08 11:35 ` Steffen Klassert
From: Steffen Klassert @ 2014-07-08 11:35 UTC
To: Christophe Gouault; +Cc: Herbert Xu, David S. Miller, netdev@vger.kernel.org
On Mon, Jun 30, 2014 at 02:50:18PM +0200, Christophe Gouault wrote:
> Hi IPsec and network maintainers,
>
> After proposing a patchset to netdev (xfrm: scalability enhancements
> for policy database) and discussing with Steffen Klassert, we agree
> that the SPD lookup algorithm needs performance and scalability
> improvements: SPs with non-prefixed selectors are optimized through a
> hash table, but the other SPs (the majority) are stored in a sorted
> linked list, which does not scale. Additionally, a flowcache is used,
> and is known not to scale.
I wouldn't say that the flowcache does not scale; it scales quite well
in some situations, as it returns a precalculated xfrm bundle (policy
and states) based on a hash. The problem with the flowcache is that it
gets its performance by learning from the network traffic that arrives,
and therefore it might be partly controllable by remote entities.
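To illustrate the point, here is a toy userspace sketch (not the
kernel's flowcache code): precalculated bundles sit in a table keyed by
a hash of the flow tuple, so the first packet of a flow pays for the
full resolution and later packets do not, and the set of cached flows
is dictated by whatever traffic arrives:

/*
 * Toy userspace sketch (not the kernel implementation) of a
 * traffic-driven flow cache: precalculated bundles are stored under
 * a hash of the flow tuple, so entries are created by whatever
 * traffic arrives, which is also why a remote sender can influence
 * its contents.
 */
#include <stdint.h>

#define FLOWCACHE_SIZE 4096

struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

struct flow_entry {
	struct flow_key key;
	void *bundle;		/* precalculated policy + states */
	int valid;
};

static struct flow_entry cache[FLOWCACHE_SIZE];

static unsigned int flow_hash(const struct flow_key *k)
{
	/* simplistic mixing, for illustration only */
	return (k->saddr ^ k->daddr ^ ((uint32_t)k->sport << 16) ^
		k->dport ^ k->proto) % FLOWCACHE_SIZE;
}

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport &&
	       a->proto == b->proto;
}

void *flow_lookup(const struct flow_key *k,
		  void *(*resolve)(const struct flow_key *))
{
	struct flow_entry *e = &cache[flow_hash(k)];

	if (e->valid && key_equal(&e->key, k))
		return e->bundle;	/* fast path: no SPD/SAD walk */

	e->key = *k;			/* slow path: learn from traffic */
	e->bundle = resolve(k);
	e->valid = 1;
	return e->bundle;
}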
>
> The bottleneck is the SPD lookup by selector (both configuration and the lookup itself).
>
> Unfortunately, there is no all-in-one multi-field classifier that
> would behave well in all situations. However, various classifiers
> exist that are fitted to this or that use case. Therefore, I suggest
> the following approach: adding hooks in the IPsec SPD, so that one can
> dynamically register a custom SPD implementation ("SPD driver") fitted
> to one's use case, typically by loading a kernel module.
Can you name some multi-field classifiers with their use cases?
While I think adding such an API is a step in the right direction,
I would like to see that we have known, well-scaling algorithms
that can replace the current method in some situations. Otherwise
we just add complexity without any benefit.
>
> This obviously needs discussion before starting any development, so
> here is a more detailed proposal:
>
> - Define the minimum handlers to manipulate the SPD lookup by selector (alloc,
> insert, delete, flush, lookup_bysel, lookup_byflow, destroy...).
> - export a register/unregister function, so that an SPD implementation may
> register/unregister its handlers.
> - Separate the SPD common code from the SPD lookup by selector code. Keep the
> policy_all and policy_byidx tables in the common code, extract the current
> policy_inexact + policy_bydst implementation as an SPD driver. It is the
> default implementation when no SPD driver is registered.
> - *struct xfrm_policy* must offer a private area for SPD driver data (a void *,
> an opaque placeholder of fixed size, or an opaque placeholder of size
> specific to the driver implementation).
Please keep in mind that we need to look up policies and states, so both
lookups need to be reasonably fast for a well-scaling IPsec lookup method.
> - since we keep the current implementation as the default, the policy_inexact +
> policy_bydst database heads (currently stored in netns->xfrm) and the
> xfrm_policy link fields (bydst and flo) may remain at their current location.
> - SPD drivers needing some configuration may export their specific
> configuration API (/proc, netlink...)
No /proc files please, netlink should be ok for that.
> - as a first step, we only support one registered handler at a time.
> - as a first step, an SPD driver can only be loaded or unloaded if the SPD is
> empty (return EBUSY otherwise).
>
> Remarks:
>
> - this architecture is open to later evolutions such as supporting the
> registration of several handlers, dynamically listing/selecting/switching
> drivers via netlink messages (to support dynamic change of SPD implementation
> according to SPD content).
> - loading/unloading or changing SPD drivers with a non-empty SPD implies
> rebuilding the SPD from the SP list. This may lock the SPD for a rather
> long time.
>
> I would like your opinions/questions/advice.
>
It would be good to hear further opinions on this topic...
* Re: IPsec policy database customization proposal
@ 2014-07-16 7:35 ` Christophe Gouault
From: Christophe Gouault @ 2014-07-16 7:35 UTC
To: Steffen Klassert; +Cc: Herbert Xu, David S. Miller, netdev@vger.kernel.org
Hi Steffen,
Thanks for your answer.

2014-07-08 13:35 GMT+02:00 Steffen Klassert <steffen.klassert@secunet.com>:
> On Mon, Jun 30, 2014 at 02:50:18PM +0200, Christophe Gouault wrote:
>> Hi IPsec and network maintainers,
>>
>> After proposing a patchset to netdev (xfrm: scalability enhancements
>> for policy database) and discussing with Steffen Klassert, we agree
>> that the SPD lookup algorithm needs performance and scalability
>> improvements: SPs with non-prefixed selectors are optimized through a
>> hash table, but the other SPs (the majority) are stored in a sorted
>> linked list, which does not scale. Additionally, a flowcache is used,
>> and is known not to scale.
>
> I wouldn't say that the flowcache does not scale; it scales quite well
> in some situations, as it returns a precalculated xfrm bundle (policy
> and states) based on a hash. The problem with the flowcache is that it
> gets its performance by learning from the network traffic that arrives,
> and therefore it might be partly controllable by remote entities.
>
>>
>> The bottleneck is the SPD lookup by selector (both configuration and the lookup itself).
>>
>> Unfortunately, there is no all-in-one multi-field classifier that
>> would behave well in all situations. However, various classifiers
>> exist that are fitted to this or that use case. Therefore, I suggest
>> the following approach: adding hooks in the IPsec SPD, so that one can
>> dynamically register a custom SPD implementation ("SPD driver") fitted
>> to one's use case, typically by loading a kernel module.
>
> Can you name some multi-field classifiers with their use cases?
> While I think adding such an API is a step in the right direction,
> I would like to see that we have known, well-scaling algorithms
> that can replace the current method in some situations. Otherwise
> we just add complexity without any benefit.
There are several multi-field classification algorithms, but few seem
adapted to SPD/SAD lookup:
- linear search
- hierarchical tries
- set-pruning tries
- grid-of-tries
- bit-vector linear search
- cross-producting
- recursive flow classification (RFC)
- decision-tree approach (HiCuts)
- ...
They all suffer from at least one of these issues:
- update time grows too fast with the number of rules
- memory consumption grows too fast with the number of rules
- memory consumption or update time is unpredictable
- no incremental updates
- the algorithm is too complex to tune
- some are limited to 2 dimensions (e.g. source and destination addresses)
Several of them may be very efficient with a limited number of rules,
but none really scales.
In brief, I agree that the complexity added by SPD replacement hooks
is probably not worth it, considering the few candidate replacement
algorithms.
In my humble opinion, the patchset I initially proposed (xfrm:
scalability enhancements for policy database) is a good trade-off: it
is just an extension of the current algorithm that relaxes the
conditions on hashable SPs. It is scalable, it addresses a large
variety of use cases, and it defaults to the current algorithm. And it
drastically improves the update and lookup performance of average or
big SPDs, as long as you set good prefix thresholds.
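To sketch the idea (simplified, hypothetical code; the actual patchset
differs in the details): with configured thresholds, any SP whose
selector prefixes are at least as long as the thresholds is hashed on
its addresses masked down to the threshold lengths, and a flow lookup
masks the packet addresses the same way before probing the table:

/*
 * Sketch of prefix-threshold hashing (hypothetical, simplified;
 * IPv4 addresses in host byte order). With thresholds (sbits, dbits),
 * any SP whose selector satisfies src_plen >= sbits and
 * dst_plen >= dbits is hashed on its addresses masked to
 * sbits/dbits; a flow lookup masks the packet addresses the same
 * way and probes a single bucket.
 */
#include <stdint.h>

static uint32_t mask_v4(uint32_t addr, unsigned int plen)
{
	return plen ? addr & (~0U << (32 - plen)) : 0;
}

static unsigned int spd_hash(uint32_t saddr, uint32_t daddr,
			     unsigned int sbits, unsigned int dbits,
			     unsigned int hmask)
{
	uint32_t s = mask_v4(saddr, sbits);
	uint32_t d = mask_v4(daddr, dbits);

	/* toy mixing function, for illustration only */
	return (s ^ (d << 1) ^ (d >> 7)) & hmask;
}

With thresholds (24, 24), for example, a large population of
/24-to-/24 tunnel policies becomes hashable instead of being walked in
the inexact list.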
As far as I understand, the only things that concerned you about the
patchset were:
- using /proc to configure the algorithm; you prefer netlink.
- adding a configuration API that could potentially be deprecated later.
My feeling is that the choice of a brand new SPD algorithm will not
happen any time soon.
- calculation of thresholds is not automatic. As I already suggested,
it may be configured by a daemon if an automatic system is needed.
What if I rework the patchset and replace the configuration via /proc
with configuration via netlink (a rough userland sketch follows this
list):
- supporting message XFRM_MSG_NEWSPDINFO from userland, with
attribute XFRMA_SPD_HTHRESH
- adding XFRMA_SPD_HTHRESH attribute to XFRM_MSG_NEWSPDINFO messages
from the kernel in reply to XFRM_MSG_GETSPDINFO?
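Here is a rough sketch of the userland side of this. Note that
XFRMA_SPD_HTHRESH is only the attribute name proposed above (defined
locally below), and both the two-byte threshold payload and the leading
u32 (mirroring XFRM_MSG_GETSPDINFO) are assumptions about a
not-yet-written interface:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/xfrm.h>

#ifndef XFRMA_SPD_HTHRESH
#define XFRMA_SPD_HTHRESH 1	/* placeholder for the proposed attribute */
#endif

struct spd_hthresh {		/* assumed payload: hash thresholds */
	unsigned char sbits;	/* source prefix length threshold */
	unsigned char dbits;	/* destination prefix length threshold */
};

int main(void)
{
	struct {
		struct nlmsghdr nlh;
		unsigned int flags;	/* assumed u32 body, as in GETSPDINFO */
		struct nlattr attr;
		struct spd_hthresh th;
	} req;
	struct sockaddr_nl dst = { .nl_family = AF_NETLINK }; /* portid 0: kernel */
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_XFRM);

	if (fd < 0) {
		perror("socket");
		return 1;
	}

	memset(&req, 0, sizeof(req));
	req.nlh.nlmsg_len = sizeof(req);
	req.nlh.nlmsg_type = XFRM_MSG_NEWSPDINFO;
	req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	req.attr.nla_len = NLA_HDRLEN + sizeof(req.th);
	req.attr.nla_type = XFRMA_SPD_HTHRESH;
	req.th.sbits = 24;	/* hash SPs with src prefix >= /24 ... */
	req.th.dbits = 24;	/* ... and dst prefix >= /24 */

	if (sendto(fd, &req, sizeof(req), 0,
		   (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		perror("sendto");
		close(fd);
		return 1;
	}
	close(fd);
	return 0;
}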
Best Regards,
Christophe
>> This obviously needs discussion before starting any development, so
>> here is a more detailed proposal:
>>
>> - Define the minimum handlers to manipulate the SPD lookup by selector (alloc,
>> insert, delete, flush, lookup_bysel, lookup_byflow, destroy...).
>> - export a register/unregister function, so that an SPD implementation may
>> register/unregister its handlers.
>> - Separate the SPD common code from the SPD lookup by selector code. Keep the
>> policy_all and policy_byidx tables in the common code, extract the current
>> policy_inexact + policy_bydst implementation as an SPD driver. It is the
>> default implementation when no SPD driver is registered.
>> - *struct xfrm_policy* must offer a private area for SPD driver data (a void *,
>> an opaque placeholder of fixed size, or an opaque placeholder of size
>> specific to the driver implementation).
>
> Please keep in mind that we need to look up policies and states, so both
> lookups need to be reasonably fast for a well-scaling IPsec lookup method.
>
>> - since we keep the current implementation as the default, the policy_inexact +
>> policy_bydst database heads (currently stored in netns->xfrm) and the
>> xfrm_policy link fields (bydst and flo) may remain at their current location.
>> - SPD drivers needing some configuration may export their specific
>> configuration API (/proc, netlink...)
>
> No /proc files please, netlink should be ok for that.
>
>> - as a first step, we only support one registered handler at a time.
>> - as a first step, an SPD driver can only be loaded or unloaded if the SPD is
>> empty (return EBUSY otherwise).
>>
>> Remarks:
>>
>> - this architecture is open to later evolutions such as supporting the
>> registration of several handlers, dynamically listing/selecting/switching
>> drivers via netlink messages (to support dynamic change of SPD implementation
>> according to SPD content).
>> - loading/unloading or changing SPD drivers with a non-empty SPD implies
>> rebuilding the SPD from the SP list. This may lock the SPD for a rather
>> long time.
>>
>> I would like your opinions/questions/advice.
>>
>
> It would be good to hear further opinions on this topic...
* Re: IPsec policy database customization proposal
@ 2014-07-21 11:01 ` Steffen Klassert
From: Steffen Klassert @ 2014-07-21 11:01 UTC
To: Christophe Gouault; +Cc: Herbert Xu, David S. Miller, netdev@vger.kernel.org
On Wed, Jul 16, 2014 at 09:35:41AM +0200, Christophe Gouault wrote:
>
> There are several multi-field classification algorithms, but few seem
> adapted to SPD/SAD lookup:
>
> - linear search
> - hierarchical tries
> - set-pruning tries
> - grid-of-tries
> - bit-vector linear search
> - cross-producting
We used a cross-producting algorithm for a while to do
policy lookups. It scaled for policy lookups because
the policy database was rather static. It does not
scale at all for state lookups because states change
quite often and the cross-product table must be
reconstructed whenever the database changes.
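For readers unfamiliar with the technique, here is a toy two-field
sketch (illustrative code, not the implementation referred to above):
each field is first classified independently into an equivalence class,
and a precomputed table over the cross product of the classes yields
the matching rule, which is why a single rule update can force a full
table rebuild:

/*
 * Toy two-field cross-producting classifier (illustration only).
 * A lookup is two independent 1-D classifications plus one table
 * access; the price is that a rule update may change the
 * equivalence classes, so the whole cross-product table must be
 * recomputed whenever the rule set changes.
 */
#define NSRC 8	/* source-field equivalence classes */
#define NDST 8	/* destination-field equivalence classes */

static int crossprod[NSRC][NDST];	/* best-matching rule id per class pair */

/* Stub 1-D classifiers; real ones would be prefix or range lookups. */
static int classify_src(unsigned int saddr) { return saddr % NSRC; }
static int classify_dst(unsigned int daddr) { return daddr % NDST; }

int xprod_lookup(unsigned int saddr, unsigned int daddr)
{
	return crossprod[classify_src(saddr)][classify_dst(daddr)];
}

/*
 * After any rule insertion or deletion, all NSRC * NDST entries must
 * be recomputed: acceptable for a mostly static SPD, prohibitive for
 * a frequently changing SAD.
 */
void xprod_rebuild(void)
{
	int s, d;

	for (s = 0; s < NSRC; s++)
		for (d = 0; d < NDST; d++)
			crossprod[s][d] = 0; /* placeholder: evaluate rule set for (s, d) */
}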
>
> As far as I understand, the only things that concerned you about the
> patchset were:
>
> - using /proc to configure the algorithm; you prefer netlink.
> - adding a configuration API that could potentially be deprecated later.
> My feeling is that the choice of a brand new SPD algorithm will not
> happen any time soon.
> - calculation of thresholds is not automatic. As I already suggested,
> it may be configured by a daemon if an automatic system is needed.
Why do you think a daemon can do a better job in automatic
configuration? We maintain the policy hashlists inside the
kernel, so we should know best about their sizes.
But I agree that not everybody wants to have such an automatic
configuration, so we need some knobs to tune it from userspace.
>
> What if I rework the patchset and replace the configuration via /proc
> with configuration via netlink:
>
> - supporting message XFRM_MSG_NEWSPDINFO from userland, with
> attribute XFRMA_SPD_HTHRESH
> - adding XFRMA_SPD_HTHRESH attribute to XFRM_MSG_NEWSPDINFO messages
> from the kernel in reply to XFRM_MSG_GETSPDINFO?
>
OK.