* [patch/RFC]: Asynchronous IPsec processing.
@ 2005-04-29 10:41 Evgeniy Polyakov
2005-04-30 13:36 ` Evgeniy Polyakov
2005-05-03 9:53 ` Herbert Xu
0 siblings, 2 replies; 12+ messages in thread
From: Evgeniy Polyakov @ 2005-04-29 10:41 UTC
To: netdev; +Cc: Herbert Xu, Patrick McHardy, David S. Miller, Jamal Hadi Salim
Hello.
I've created POC code to perform asynchronous IPsec [ESP]
processing. Please comment on any bugs in the following patch.
It is of course very dirty, but it is only a beginning;
I just want to know whether the approach is right.
The patch was tested with several ssh sessions and some
traffic, such as find / and tcpdump, over them.
Thank you.
diff -ru ../linux-2.6-orig/net/ipv4/esp4.c ./net/ipv4/esp4.c
--- ../linux-2.6-orig/net/ipv4/esp4.c 2005-04-25 15:41:39.000000000 +0400
+++ ./net/ipv4/esp4.c 2005-04-29 14:34:10.000000000 +0400
@@ -7,6 +7,7 @@
#include <linux/crypto.h>
#include <linux/pfkeyv2.h>
#include <linux/random.h>
+#include <linux/timer.h>
#include <net/icmp.h>
#include <net/udp.h>
@@ -17,6 +18,95 @@
__u8 proto;
};
+static int esp_output(struct xfrm_state *x, struct sk_buff *skb);
+
+struct esp_async {
+ struct timer_list tm;
+ struct sk_buff *skb;
+ struct xfrm_state *x;
+ struct dst_entry *dst;
+};
+
+static void esp4_callback(unsigned long data)
+{
+ struct esp_async *ea = (struct esp_async *)data;
+ struct sk_buff *skb = ea->skb;
+ struct dst_entry *dst = ea->dst;
+ struct xfrm_state *x = ea->x;
+ int err;
+
+ printk("%s: skb=%p, skb->users=%d.\n", __func__, skb, atomic_read(&skb->users));
+ printk("%s: dst=%p, skb->dst=%p.\n", __func__, dst, skb->dst);
+ printk("%s: xfrm=%p, skb->dst->xfrm=%p.\n", __func__, x, (skb->dst)?skb->dst->xfrm:NULL);
+
+ spin_lock_bh(&x->lock);
+ err = esp_output(x, skb);
+ spin_unlock_bh(&x->lock);
+
+ printk("%s: Data has been processed: err=%d.\n", __func__, err);
+
+ if (err)
+ goto err_out;
+
+ skb->dst = dst_pop(dst);
+ printk("%s: pop has been finished: skb->dst=%p, dst=%p, skb->users=%d.\n",
+ __func__, skb->dst, dst, atomic_read(&skb->users));
+ if (!skb->dst)
+ goto err_out;
+
+ dst_output(skb);
+
+out:
+ kfree(ea);
+ return;
+
+err_out:
+ kfree_skb(skb);
+ goto out;
+}
+
+static int esp_output_async(struct xfrm_state *x, struct sk_buff *skb)
+{
+ struct esp_async *ea;
+ struct dst_entry *child;
+
+ printk("%s: enter. Child list: ", __func__);
+ for (child = skb->dst; child; child = child->child)
+ printk("%p [%s] [%d] -> ", child, child->dev->name, atomic_read(&child->__refcnt));
+ printk("\n");
+
+ ea = kmalloc(sizeof(*ea), GFP_ATOMIC);
+ if (!ea)
+ return -ENOMEM;
+
+ memset(ea, 0, sizeof(*ea));
+
+ skb = skb_clone(skb, GFP_ATOMIC);
+ if (!skb)
+ return -ENOMEM;
+ dst_hold(skb->dst);
+
+ ea->skb = skb;
+ ea->x = x;
+ ea->dst = skb->dst;
+
+ printk("%s: x=%p, skb=%p, skb->dst=%p, skb->dst->xfrm=%p.\n",
+ __func__, x, skb, skb->dst, skb->dst->xfrm);
+
+ init_timer(&ea->tm);
+ ea->tm.function = &esp4_callback;
+ ea->tm.data = (unsigned long)ea;
+ ea->tm.expires = jiffies;
+
+ add_timer(&ea->tm);
+
+ printk("%s: timer added: skb->users=%d.\n", __func__, atomic_read(&skb->users));
+
+ return 0;
+
+}
+
+
static int esp_output(struct xfrm_state *x, struct sk_buff *skb)
{
int err;
@@ -465,7 +555,7 @@
.get_max_size = esp4_get_max_size,
.input = esp_input,
.post_input = esp_post_input,
- .output = esp_output
+ .output = esp_output_async
};
static struct net_protocol esp4_protocol = {
diff -ru ../linux-2.6-orig/net/ipv4/xfrm4_output.c ./net/ipv4/xfrm4_output.c
--- ../linux-2.6-orig/net/ipv4/xfrm4_output.c 2005-04-25 15:41:40.000000000 +0400
+++ ./net/ipv4/xfrm4_output.c 2005-04-29 12:13:41.000000000 +0400
@@ -124,12 +124,6 @@
x->curlft.packets++;
spin_unlock_bh(&x->lock);
-
- if (!(skb->dst = dst_pop(dst))) {
- err = -EHOSTUNREACH;
- goto error_nolock;
- }
- err = NET_XMIT_BYPASS;
out_exit:
return err;
--
Evgeniy Polyakov
* Re: [patch/RFC]: Asynchronous IPsec processing.
2005-04-29 10:41 [patch/RFC]: Asynchronous IPsec processing Evgeniy Polyakov
@ 2005-04-30 13:36 ` Evgeniy Polyakov
2005-05-03 9:53 ` Herbert Xu
1 sibling, 0 replies; 12+ messages in thread
From: Evgeniy Polyakov @ 2005-04-30 13:36 UTC
To: Evgeniy Polyakov
Cc: netdev, Herbert Xu, Patrick McHardy, David S. Miller,
Jamal Hadi Salim
On Fri, 29 Apr 2005 14:41:03 +0400
Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> Hello.
>
> I've created POC code to perform asynchronous IPsec [ESP]
> processing. Please comment on any bugs in the following patch.
> It is of course very dirty, but it is only a beginning;
> I just want to know whether the approach is right.
> The patch was tested with several ssh sessions and some
> traffic, such as find / and tcpdump, over them.
Some of the ideas behind it follow.
Because struct dst_entry is stackable, we can return NET_XMIT_SUCCESS (0)
not only from the last dst->output(), which is ip_output_xmit() and which
queues the skb and returns 0, but also from xfrm's output method, which
calls a transformation function such as esp_output() or ah_output().
We return before the transformation takes place and defer its processing.
Later we call dst_output() again, emulating its original call from the
network stack, with skb->dst set to the next struct dst_entry, as is done
in synchronous IPsec; most likely that will be ip_queue_xmit(), the last
transport output method.
So the original dst_output() finishes partway through, and the skb stays
in flight for some time until some callback calls dst_output() again with
the appropriate skb->dst.
This logic is implemented with cloned skbs and the dst entry being held,
but I have some suspicions about the xfrm state in that processing.
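The control flow described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not kernel code:
struct pkt, struct hop, hop_output(), xfrm_output_deferred() and
run_deferred() are hypothetical stand-ins for sk_buff, dst_entry,
dst_output(), the xfrm output method and the timer callback, and the
one-slot "deferred" pointer stands in for the timer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: pkt ~ sk_buff, hop ~ dst_entry. */
struct pkt;

struct hop {
	const char *name;
	int (*output)(struct pkt *p);
	struct hop *child;          /* next entry in the stack */
};

struct pkt {
	struct hop *dst;            /* entry whose output method runs next */
	char trace[64];             /* records which steps ran, in order */
};

static struct pkt *deferred;   /* one-slot queue standing in for the timer */

static void trace(struct pkt *p, const char *s)
{
	strncat(p->trace, s, sizeof(p->trace) - strlen(p->trace) - 1);
}

/* Same role as dst_output(): run the output method of the current entry. */
static int hop_output(struct pkt *p)
{
	assert(p->dst != NULL);
	return p->dst->output(p);
}

/* Last entry in the stack: actually "sends" the packet. */
static int xmit_output(struct pkt *p)
{
	trace(p, "xmit;");
	return 0;
}

/* Transform entry: park the packet untransformed and report success. */
static int xfrm_output_deferred(struct pkt *p)
{
	trace(p, "defer;");
	deferred = p;
	return 0;
}

/* Callback: transform, move to the child entry, re-enter the output path. */
static int run_deferred(void)
{
	struct pkt *p = deferred;

	if (!p)
		return -1;
	deferred = NULL;
	trace(p, "transform;");
	p->dst = p->dst->child;     /* the role of dst_pop(), minus refcounting */
	if (!p->dst)
		return -1;
	return hop_output(p);
}
```

The first hop_output() call returns 0 after only parking the packet;
the transformation and the final transmit happen when run_deferred()
is invoked later. Reference counting and locking are deliberately left
out of this sketch.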
The patch was also tested on SMP with the netperf TCP stream test. It ran
for quite a long time without crashing, but then some bug occurred.
netconsole did not show it, but the real console showed something from
the receive path (net_receive_*), so it probably was not caused by this
change.
The patch is against 2.6.12-rc2.
It was created to allow crypto processing to be deferred asynchronously,
and thus to make it possible to use the hardware crypto acceleration that
is supported by the existing asynchronous crypto layers for the Linux kernel.
Comments?
Evgeniy Polyakov
Only failure makes us experts. -- Theo de Raadt
* Re: [patch/RFC]: Asynchronous IPsec processing.
2005-04-29 10:41 [patch/RFC]: Asynchronous IPsec processing Evgeniy Polyakov
2005-04-30 13:36 ` Evgeniy Polyakov
@ 2005-05-03 9:53 ` Herbert Xu
2005-05-03 10:18 ` Evgeniy Polyakov
1 sibling, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2005-05-03 9:53 UTC
To: Evgeniy Polyakov
Cc: netdev, Patrick McHardy, David S. Miller, Jamal Hadi Salim
On Fri, Apr 29, 2005 at 02:41:03PM +0400, Evgeniy Polyakov wrote:
>
> I've created POC code to perform asynchronous IPsec [ESP]
> processing. Please comment on any bugs in the following patch.
> It is of course very dirty, but it is only a beginning;
> I just want to know whether the approach is right.
> The patch was tested with several ssh sessions and some
> traffic, such as find / and tcpdump, over them.
IMHO we should ensure that the async code path does not adversely
impact synchronous crypto performance. Most users will be using
synchronous crypto primitives. Synchronous crypto is also the best
way to utilise VIA Padlock which is arguably the best hardware crypto
solution that's available today.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [patch/RFC]: Asynchronous IPsec processing.
2005-05-03 10:18 ` Evgeniy Polyakov
@ 2005-05-03 10:14 ` Herbert Xu
2005-05-03 10:31 ` Evgeniy Polyakov
0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2005-05-03 10:14 UTC
To: Evgeniy Polyakov
Cc: netdev, Patrick McHardy, David S. Miller, Jamal Hadi Salim
On Tue, May 03, 2005 at 02:18:22PM +0400, Evgeniy Polyakov wrote:
>
> It can be a compile-time option: people who want asynchronous crypto
> processing and have the appropriate hardware will benefit from it even
> if their general-purpose CPU is a VIA with PadLock ACE.
Well, if there are no better options then we'll have to do that.
However, I believe that with the right crypto API we should be
able to have async crypto support without sacrificing synchronous
performance.
> It looks like several CPUs can not be used for synchronous crypto
> processing in current IPsec implementation. Using asynchronous
That's just an implementation quirk. I will be addressing that
soon as part of the xfrm locking clean-up.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [patch/RFC]: Asynchronous IPsec processing.
2005-05-03 9:53 ` Herbert Xu
@ 2005-05-03 10:18 ` Evgeniy Polyakov
2005-05-03 10:14 ` Herbert Xu
0 siblings, 1 reply; 12+ messages in thread
From: Evgeniy Polyakov @ 2005-05-03 10:18 UTC
To: Herbert Xu; +Cc: netdev, Patrick McHardy, David S. Miller, Jamal Hadi Salim
On Tue, 2005-05-03 at 19:53 +1000, Herbert Xu wrote:
> On Fri, Apr 29, 2005 at 02:41:03PM +0400, Evgeniy Polyakov wrote:
> >
> > I've created POC code to perform asynchronous IPsec [ESP]
> > processing. Please comment on any bugs in the following patch.
> > It is of course very dirty, but it is only a beginning;
> > I just want to know whether the approach is right.
> > The patch was tested with several ssh sessions and some
> > traffic, such as find / and tcpdump, over them.
>
> IMHO we should ensure that the async code path does not adversely
> impact synchronous crypto performance. Most users will be using
> synchronous crypto primitives. Synchronous crypto is also the best
> way to utilise VIA Padlock which is arguably the best hardware crypto
> solution that's available today.
It can be a compile-time option: people who want asynchronous crypto
processing and have the appropriate hardware will benefit from it even
if their general-purpose CPU is a VIA with PadLock ACE.
It looks like several CPUs cannot be used for synchronous crypto
processing in the current IPsec implementation. In asynchronous
mode there might be a significant performance win.
> Cheers,
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
* Re: [patch/RFC]: Asynchronous IPsec processing.
2005-05-03 10:31 ` Evgeniy Polyakov
@ 2005-05-03 10:29 ` Herbert Xu
2005-05-03 10:55 ` Evgeniy Polyakov
0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2005-05-03 10:29 UTC
To: Evgeniy Polyakov
Cc: netdev, Patrick McHardy, David S. Miller, Jamal Hadi Salim
On Tue, May 03, 2005 at 02:31:35PM +0400, Evgeniy Polyakov wrote:
>
> Asynchronous processing will not hurt synchronous paths in any way.
It will if you force everybody to go through the asynchronous path
because you're jacking up the latency.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [patch/RFC]: Asynchronous IPsec processing.
2005-05-03 10:14 ` Herbert Xu
@ 2005-05-03 10:31 ` Evgeniy Polyakov
2005-05-03 10:29 ` Herbert Xu
0 siblings, 1 reply; 12+ messages in thread
From: Evgeniy Polyakov @ 2005-05-03 10:31 UTC
To: Herbert Xu; +Cc: netdev, Patrick McHardy, David S. Miller, Jamal Hadi Salim
On Tue, 2005-05-03 at 20:14 +1000, Herbert Xu wrote:
> On Tue, May 03, 2005 at 02:18:22PM +0400, Evgeniy Polyakov wrote:
> >
> > It can be a compile-time option: people who want asynchronous crypto
> > processing and have the appropriate hardware will benefit from it even
> > if their general-purpose CPU is a VIA with PadLock ACE.
>
> Well if there were no better options then we'll have to do that.
>
> However, I believe that with the right crypto API we should be
> able to have async crypto support without sacrificing synchronous
> performance.
Asynchronous processing will not hurt synchronous paths in any way.
In some places, such as block device encryption, we can use an async API
easily, but in others, such as IPsec, there is no way to split packet
processing and thus no way to use an async API at all.
However carefully an asynchronous API is designed, the current IPsec code
simply cannot use it.
> > It looks like several CPUs can not be used for synchronous crypto
> > processing in current IPsec implementation. Using asynchronous
>
> That's just an implementation quirk. I will be addressing that
> soon as part of the xfrm locking clean-up.
That is not enough, as far as I can see, since only one tfm is used
for one transformer state.
Locking changes will allow parallel processing of AH and ESP, for
example, but not of two packets from the same flow.
> Cheers,
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
* Re: [patch/RFC]: Asynchronous IPsec processing.
2005-05-03 10:29 ` Herbert Xu
@ 2005-05-03 10:55 ` Evgeniy Polyakov
2005-05-03 13:38 ` [patch/RFC]: Asynchronous IPsec processing benchmark Evgeniy Polyakov
0 siblings, 1 reply; 12+ messages in thread
From: Evgeniy Polyakov @ 2005-05-03 10:55 UTC
To: Herbert Xu; +Cc: netdev, Patrick McHardy, David S. Miller, Jamal Hadi Salim
On Tue, 2005-05-03 at 20:29 +1000, Herbert Xu wrote:
> On Tue, May 03, 2005 at 02:31:35PM +0400, Evgeniy Polyakov wrote:
> >
> > Asynchronous processing will not hurt synchronous paths in any way.
>
> It will if you force everybody to go through the asynchronous path
> because you're jacking up the latency.
But if it is not selected, IPsec users will not be affected.
Asynchronous crypto processing of course has its own costs, and
although they have been shown [1] to be insignificant, they are still
there.
Current IPsec processing [even if it is UP only] has a very strong model
that always gets the maximum, but only from synchronous crypto.
If people select asynchronous IPsec processing, they will use
_asynchronous_ IPsec processing, and no _synchronous_ paths will be
affected.
Asynchronous IPsec processing is only useful with asynchronous
crypto layers, so there is no need to turn it on if none of them can be
used with the hardware.
Btw, the current crypto scheme is SMP-unfriendly by design: there is
only a low-level TFM entity, which
1. must be recreated for each of several CPUs
2. requires the caller to know how many CPUs there are, which TFM to use
   and so on.
Asynchronous crypto layers allow this to be hidden behind a proper API.
I doubt the existing IPsec scheme will get any benefit from several
CPUs without either some rewrite of the crypto processing
(using either per-CPU xfrm states or several tfms per transformer)
or some asynchronous crypto processing scheme...
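As a rough illustration of the "several tfms per transformer" option
mentioned above, here is a minimal userspace C sketch. All names
(tfm_sketch, state_sketch, state_tfm(), state_encrypt(), NR_CPUS_SKETCH)
are hypothetical and do not correspond to the kernel crypto API; the
point is only that each CPU touches its own private cipher context, so
two CPUs never share mutable state within one transformer state.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4        /* hypothetical fixed CPU count */

/* Hypothetical stand-in for a crypto tfm: private, mutable cipher state. */
struct tfm_sketch {
	unsigned long blocks_done;
};

/* One transformer state owning one tfm per CPU instead of a shared one. */
struct state_sketch {
	struct tfm_sketch tfm[NR_CPUS_SKETCH];
};

/* Pick the tfm for the calling CPU; the caller passes its CPU number. */
static struct tfm_sketch *state_tfm(struct state_sketch *st, unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	return &st->tfm[cpu];
}

/* "Encrypt" nblocks on the given CPU; only that CPU's tfm is modified. */
static void state_encrypt(struct state_sketch *st, unsigned int cpu,
			  unsigned long nblocks)
{
	state_tfm(st, cpu)->blocks_done += nblocks;
}
```

Note that this sketch still makes the caller pass a CPU number, which
is exactly the knowledge the message above argues a proper API should
hide.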
[1] http://www.openbsd.org/papers/ocf.pdf
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
* [patch/RFC]: Asynchronous IPsec processing benchmark.
2005-05-03 10:55 ` Evgeniy Polyakov
@ 2005-05-03 13:38 ` Evgeniy Polyakov
2005-05-04 10:40 ` jamal
0 siblings, 1 reply; 12+ messages in thread
From: Evgeniy Polyakov @ 2005-05-03 13:38 UTC
To: Herbert Xu; +Cc: netdev, Patrick McHardy, David S. Miller, Jamal Hadi Salim
Here are some numbers:
./netperf -l 60 -H gw -t TCP_STREAM -i 10,2 -I 99,5 -- -m 4096 -s 57344 -S 57344
TCP STREAM TEST to gw : +/-2.5% @ 99% conf.
async-ipsec, 10^6bits/sec: 35.42
sync-ipsec, 10^6bits/sec: 37.11
So even with the existing timer-based deferral it is not noticeably
slower [about 4%].
And I think the benefits it provides are definitely worth that price and
a compile-time option.
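The slowdown figure can be checked directly from the two throughput
numbers. Taking the sync result as the baseline,
(37.11 - 35.42) / 37.11 comes to roughly 4.6%, which is of the same
order as the reported +/-2.5% confidence interval. The helper name
slowdown_percent() below is purely illustrative.

```c
#include <assert.h>

/* Relative slowdown of `slow` versus the baseline `base`, in percent. */
static double slowdown_percent(double base, double slow)
{
	assert(base > 0.0);
	return (base - slow) / base * 100.0;
}
```

Calling slowdown_percent(37.11, 35.42) with the numbers from this
message gives a value between 4.5 and 4.6.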
--
Evgeniy Polyakov
Crash is better than data corruption -- Arthur Grabowski
* Re: [patch/RFC]: Asynchronous IPsec processing benchmark.
2005-05-03 13:38 ` [patch/RFC]: Asynchronous IPsec processing benchmark Evgeniy Polyakov
@ 2005-05-04 10:40 ` jamal
2005-05-04 16:11 ` Evgeniy Polyakov
0 siblings, 1 reply; 12+ messages in thread
From: jamal @ 2005-05-04 10:40 UTC
To: johnpol; +Cc: Herbert Xu, netdev, Patrick McHardy, David S. Miller
On Tue, 2005-03-05 at 17:38 +0400, Evgeniy Polyakov wrote:
> Here are some numbers:
>
> ./netperf -l 60 -H gw -t TCP_STREAM -i 10,2 -I 99,5 -- -m 4096 -s 57344
> -S 57344
>
> TCP STREAM TEST to gw : +/-2.5% @ 99% conf.
>
> async-ipsec, 10^6bits/sec: 35.42
> sync-ipsec, 10^6bits/sec: 37.11
>
> So even with the existing timer-based deferral it is not noticeably
> slower [about 4%].
>
By "sync" I hope you mean the original code without your change?
The one thing I see in your POC code that may affect the numbers a bit is
the allocation of struct esp_async every time in the path. Perhaps precreate
a pool of those and then just grab from and return to the pool.
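The pool idea can be sketched as a simple free list in userspace C.
This is only an illustration under stated assumptions: async_ctx,
ctx_pool, pool_init(), pool_get() and pool_put() are hypothetical
names, POOL_SIZE is arbitrary, and there is no locking, which a real
kernel version running in timer or softirq context would need (or it
would need per-CPU pools).

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8             /* hypothetical pool depth */

/* Hypothetical stand-in for struct esp_async; only the link matters here. */
struct async_ctx {
	struct async_ctx *next_free;
	void *payload;
};

struct ctx_pool {
	struct async_ctx slots[POOL_SIZE];
	struct async_ctx *free_list;
};

/* Thread every slot onto the free list once, at setup time. */
static void pool_init(struct ctx_pool *pool)
{
	size_t i;

	pool->free_list = NULL;
	for (i = 0; i < POOL_SIZE; i++) {
		pool->slots[i].payload = NULL;
		pool->slots[i].next_free = pool->free_list;
		pool->free_list = &pool->slots[i];
	}
}

/* Grab a context; returns NULL when the pool is exhausted. */
static struct async_ctx *pool_get(struct ctx_pool *pool)
{
	struct async_ctx *ctx = pool->free_list;

	if (ctx)
		pool->free_list = ctx->next_free;
	return ctx;
}

/* Return a context that was obtained from this pool. */
static void pool_put(struct ctx_pool *pool, struct async_ctx *ctx)
{
	assert(ctx >= pool->slots && ctx < pool->slots + POOL_SIZE);
	ctx->payload = NULL;
	ctx->next_free = pool->free_list;
	pool->free_list = ctx;
}
```

With this shape, the per-packet path does no allocation at all; an
exhausted pool shows up as a NULL return that the caller has to handle,
much like a failed GFP_ATOMIC allocation.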
BTW, you may need to increment the ref counter of x before the callback
and decrement it when done in the callback.
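The hold-before-defer, put-in-callback pattern can be sketched in
userspace C as follows. This is a sketch under stated assumptions, not
the kernel implementation: struct state, state_hold(), state_put(),
work_queue() and work_callback() are hypothetical names, and the
counter is a plain int where kernel code would use an atomic counter
(the kernel's own helpers for xfrm states are xfrm_state_hold() and
xfrm_state_put()).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for xfrm_state: only the refcount is modelled. */
struct state {
	int refcnt;
	int *freed;                 /* set to 1 when the last reference drops */
};

static struct state *state_alloc(int *freed)
{
	struct state *st = malloc(sizeof(*st));

	if (!st)
		return NULL;
	st->refcnt = 1;             /* reference owned by the creator */
	st->freed = freed;
	*freed = 0;
	return st;
}

static void state_hold(struct state *st)
{
	st->refcnt++;
}

static void state_put(struct state *st)
{
	assert(st->refcnt > 0);
	if (--st->refcnt == 0) {
		*st->freed = 1;
		free(st);
	}
}

/* Deferred work pins the state until its callback has run. */
struct deferred_work {
	struct state *st;
};

static void work_queue(struct deferred_work *w, struct state *st)
{
	state_hold(st);             /* taken before the work is deferred */
	w->st = st;
}

static void work_callback(struct deferred_work *w)
{
	/* ... the state would be used here ... */
	state_put(w->st);           /* dropped once the callback is done */
	w->st = NULL;
}
```

Even if the creator drops its own reference while the work is pending,
the state stays alive until work_callback() has finished with it.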
> And I think the benefits it provides are definitely worth that price and
> a compile-time option.
I think Herbert's concerns about latency should go away if you really
have some proper crypto hardware.
cheers,
jamal
* Re: [patch/RFC]: Asynchronous IPsec processing benchmark.
2005-05-04 10:40 ` jamal
@ 2005-05-04 16:11 ` Evgeniy Polyakov
2005-05-05 13:04 ` jamal
0 siblings, 1 reply; 12+ messages in thread
From: Evgeniy Polyakov @ 2005-05-04 16:11 UTC
To: hadi; +Cc: Herbert Xu, netdev, Patrick McHardy, David S. Miller
On Wed, 04 May 2005 06:40:14 -0400
jamal <hadi@cyberus.ca> wrote:
> On Tue, 2005-03-05 at 17:38 +0400, Evgeniy Polyakov wrote:
> > Here are some numbers:
> >
> > ./netperf -l 60 -H gw -t TCP_STREAM -i 10,2 -I 99,5 -- -m 4096 -s 57344
> > -S 57344
> >
> > TCP STREAM TEST to gw : +/-2.5% @ 99% conf.
> >
> > async-ipsec, 10^6bits/sec: 35.42
> > sync-ipsec, 10^6bits/sec: 37.11
> >
> > So even with the existing timer-based deferral it is not noticeably
> > slower [about 4%].
> >
>
> By "sync" I hope you mean the original code without your change?
Yes, it is a vanilla 2.6.12-rc2 kernel with native IPsec.
> The one thing I see in your POC code that may affect the numbers a bit is
> the allocation of struct esp_async every time in the path. Perhaps precreate
> a pool of those and then just grab from and return to the pool.
That could apply to the skb too, since both skb and kmalloc
atomic allocations end up in kmem_cache_alloc().
> BTW, you may need to increment the ref counter of x before the callback
> and decrement it when done in the callback.
It looks like holding the dst entry is enough, since a direct
dst->output() call, i.e. xfrm_state->output(), is itself not
protected by a reference counter, but the dst entry is held
during that call.
> > And I think the benefits it provides are definitely worth that price and
> > a compile-time option.
>
> I think Herbert's concerns about latency should go away if you really
> have some proper crypto hardware.
Unfortunately I do not currently have a hardware crypto accelerator set up
[the board freezes before the monitor and keyboard even blink with my HIFN
card; it looks like a bus arbitration problem], so I cannot provide real
numbers with it. But I will set up acrypto with several software crypto
providers and this patch on SMP (1+1HT) [damn, I burned out the second
HT Xeon], and will rerun the test soon.
> cheers,
> jamal
>
Evgeniy Polyakov
Only failure makes us experts. -- Theo de Raadt
* Re: [patch/RFC]: Asynchronous IPsec processing benchmark.
2005-05-04 16:11 ` Evgeniy Polyakov
@ 2005-05-05 13:04 ` jamal
0 siblings, 0 replies; 12+ messages in thread
From: jamal @ 2005-05-05 13:04 UTC
To: johnpol; +Cc: Herbert Xu, netdev, Patrick McHardy, David S. Miller
On Wed, 2005-04-05 at 20:11 +0400, Evgeniy Polyakov wrote:
> On Wed, 04 May 2005 06:40:14 -0400
> jamal <hadi@cyberus.ca> wrote:
>
> > On Tue, 2005-03-05 at 17:38 +0400, Evgeniy Polyakov wrote:
> > > Here are some numbers:
> > >
> > > ./netperf -l 60 -H gw -t TCP_STREAM -i 10,2 -I 99,5 -- -m 4096 -s 57344
> > > -S 57344
> > >
> > > TCP STREAM TEST to gw : +/-2.5% @ 99% conf.
> > >
> > > async-ipsec, 10^6bits/sec: 35.42
> > > sync-ipsec, 10^6bits/sec: 37.11
> > >
> > > So even with the existing timer-based deferral it is not noticeably
> > > slower [about 4%].
> > >
> >
> > By "sync" I hope you mean the original code without your change?
>
> Yes, it is a vanilla 2.6.12-rc2 kernel with native IPsec.
>
Very nice numbers then, considering the async path was delaying
things by at least a jiffy.
> > The one thing I see in your POC code that may affect the numbers a bit is
> > the allocation of struct esp_async every time in the path. Perhaps precreate
> > a pool of those and then just grab from and return to the pool.
>
> That could apply to the skb too, since both skb and kmalloc
> atomic allocations end up in kmem_cache_alloc().
The skb is a little tricky, in particular if you allocate on one CPU and
free on another. If this could happen with the esp_async struct as well
then it is not worth it.
>
> > BTW, you may need to increment the ref counter of x before the callback
> > and decrement it when done in the callback.
>
> It looks like holding the dst entry is enough, since a direct
> dst->output() call, i.e. xfrm_state->output(), is itself not
> protected by a reference counter, but the dst entry is held
> during that call.
>
makes sense.
cheers,
jamal
Thread overview: 12+ messages
2005-04-29 10:41 [patch/RFC]: Asynchronous IPsec processing Evgeniy Polyakov
2005-04-30 13:36 ` Evgeniy Polyakov
2005-05-03 9:53 ` Herbert Xu
2005-05-03 10:18 ` Evgeniy Polyakov
2005-05-03 10:14 ` Herbert Xu
2005-05-03 10:31 ` Evgeniy Polyakov
2005-05-03 10:29 ` Herbert Xu
2005-05-03 10:55 ` Evgeniy Polyakov
2005-05-03 13:38 ` [patch/RFC]: Asynchronous IPsec processing benchmark Evgeniy Polyakov
2005-05-04 10:40 ` jamal
2005-05-04 16:11 ` Evgeniy Polyakov
2005-05-05 13:04 ` jamal