From: Christian Lamparter <chunkeey@googlemail.com>
To: Ben Greear <greearb@candelatech.com>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: Re: Looking for non-NIC hardware-offload for wpa2 decrypt.
Date: Wed, 30 Jul 2014 00:29:56 +0200
Message-ID: <3302077.5sUEMiqNRr@debian64>
In-Reply-To: <53D6B78E.1070705@candelatech.com>
On Monday, July 28, 2014 01:50:22 PM Ben Greear wrote:
> On 03/31/2014 11:09 AM, Christian Lamparter wrote:
> > Hello,
> >
> > On Sunday, March 30, 2014 09:40:24 PM Ben Greear wrote:
> >> Due to hardware/firmware limitations, it does not appear possible to
> >> have a wifi NIC do hardware decrypt when using multiple stations on a single
> >> NIC (and have both stations connected to the same AP).
> >>
> >> This just happens to be one of my favourite things to do, and it kills
> >> performance compared to normal 'Open' throughput.
> >>
> >> I am curious if anyone knows of any way to accelerate rx-decrypt, perhaps by
> >> using a specialized hardware board or maybe a feature of certain CPUs?
> >
> > You could check if your CPU (bios and kernel) have support for AES-NI [0].
> > AFAICT mac80211 utilizes the cryptoapi. Therefore anything that supports
> > the proper crypto bindings can be used to accelerate the encryption and
> > decryption process to some degree. And it just happens that thanks to
> > AES-NI parts of math can be efficiently calculated by the CPU.
>
> I recently took a look at this again, and the Intel E5 I'm using
> does use the aesni instructions/driver as far as I can tell.
Which E5 exactly? There are many different E5 models.
> Throughput is still around 500Mbps where open is around 800Mbps.
I can't test ath10k or your multiple-stations-on-a-single-NIC setup. But
can you run a test for a "simple" single-station, single-AP wpa2 setup?
I want to know how close to the 800Mbps it actually gets.
> perf top shows this:
>
> Samples: 37K of event 'cycles', Event count (approx.): 19360716192
> 12.01% [kernel] [k] math_state_restore
> 11.64% [kernel] [k] _aesni_enc1
> 8.25% [kernel] [k] __save_init_fpu
> 2.44% [kernel] [k] crypto_xor
> 1.87% [kernel] [k] irq_fpu_usable
> 1.30% [kernel] [k] aes_encrypt
> 0.76% [kernel] [k] __kernel_fpu_end
> ....
Yes, aesni is doing some of the heavy lifting! But in your original post,
you said you were interested in accelerating rx-decrypt... Now it's about
encryption offload?! [please make up your mind :-D]
That being said, 12.01% (math_state_restore -
called by kernel_fpu_end) and 8.25% (__save_init_fpu - called
by kernel_fpu_begin) of the cycles are wasted on fpu save and
restore overhead. [You noticed that before, didn't you? ;-) ]
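To put a rough number on that overhead, the figures from the perf output
above can simply be summed. A minimal sketch in plain userspace C: the
struct and function names are made up for illustration, and counting
irq_fpu_usable and __kernel_fpu_end as fpu state handling is an assumption
made here, not something the perf output itself says.

```c
#include <stddef.h>

/* One line of the quoted perf top output. */
struct perf_entry {
	const char *symbol;
	double percent;
	int fpu_bookkeeping;	/* 1 = assumed to be fpu state handling */
};

/* Percentages copied from the perf top sample quoted above. */
static const struct perf_entry perf_samples[] = {
	{ "math_state_restore", 12.01, 1 },
	{ "_aesni_enc1",        11.64, 0 },
	{ "__save_init_fpu",     8.25, 1 },
	{ "crypto_xor",          2.44, 0 },
	{ "irq_fpu_usable",      1.87, 1 },
	{ "aes_encrypt",         1.30, 0 },
	{ "__kernel_fpu_end",    0.76, 1 },
};

/* Sum the cycle share of all entries whose flag equals 'fpu'. */
double perf_share(int fpu)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < sizeof(perf_samples) / sizeof(perf_samples[0]); i++)
		if (perf_samples[i].fpu_bookkeeping == fpu)
			sum += perf_samples[i].percent;
	return sum;
}
```

With these figures perf_share(1) comes to roughly 22.9% and perf_share(0)
to roughly 15.4%, so under that grouping the state handling costs more than
the other listed crypto symbols combined.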
I think part of the poor performance is due to the design of
aes_encrypt in arch/x86/crypto/aesni-intel_glue.c:
> static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
> {
> 	struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
> 	[...]
> 		kernel_fpu_begin();
> 		aesni_enc(ctx, dst, src);
> 		kernel_fpu_end();
> 	[...]
> }
Ideally you would want something like:
> kernel_fpu_begin();
> aesni_enc(ctx, dst_frame1, src_frame1);
> aesni_enc(ctx, dst_frame2, src_frame2);
> ...
> aesni_enc(ctx, dst_frameN, src_frameN);
> kernel_fpu_end();
But getting there might not be easy and may involve more than a bit
of "real programming".
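The difference between the two shapes can be modelled in userspace by
counting how often the fpu state would be saved and restored. This is only
a sketch under stated assumptions: struct fpu_model and the model_*
functions are hypothetical stand-ins that count calls, they are not the
kernel's kernel_fpu_begin(), kernel_fpu_end() or aesni_enc().

```c
#include <stddef.h>

/* Counters standing in for the real work; nothing is encrypted here. */
struct fpu_model {
	unsigned long saves;	/* model of kernel_fpu_begin() calls */
	unsigned long restores;	/* model of kernel_fpu_end() calls */
	unsigned long blocks;	/* model of aesni_enc() calls */
};

static void model_fpu_begin(struct fpu_model *m) { m->saves++; }
static void model_fpu_end(struct fpu_model *m) { m->restores++; }
static void model_aesni_enc(struct fpu_model *m) { m->blocks++; }

/* Current shape: one begin/end pair around every single AES block. */
void encrypt_per_block(struct fpu_model *m, size_t nblocks)
{
	size_t i;

	for (i = 0; i < nblocks; i++) {
		model_fpu_begin(m);
		model_aesni_enc(m);
		model_fpu_end(m);
	}
}

/* Desired shape: one begin/end pair around the whole batch. */
void encrypt_batched(struct fpu_model *m, size_t nblocks)
{
	size_t i;

	model_fpu_begin(m);
	for (i = 0; i < nblocks; i++)
		model_aesni_enc(m);
	model_fpu_end(m);
}
```

For N blocks the first variant counts N saves and N restores, the second
counts one of each for the same number of encrypted blocks. The trade-off
is that the batched section keeps preemption disabled for longer.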
In theory, you could test whether this approach has any potential
by "enhancing" the tx-path in the following way:
1. Add the kernel_fpu_begin and kernel_fpu_end calls to
ieee80211_crypto_ccmp_encrypt in net/mac80211/wpa.c.
>+	kernel_fpu_begin();
> 	skb_queue_walk(&tx->skbs, skb) {
> 		if (ccmp_encrypt_skb(tx, skb) < 0)
> 			return TX_DROP;
> 	}
>+	kernel_fpu_end();
>
> 	return TX_CONTINUE;
2. ieee80211_aes_ccm_encrypt in net/mac80211/aes_ccm.c
has to call __aes_encrypt instead of aes_encrypt in crypto_aead_encrypt.
[I can't think of a sane way to make this work. Of course, it's possible to
make a copy of ccm(aes) crypto_alg* and overwrite aes_encrypt with
__aes_encrypt. But that's not very nice... (It should work though) ]
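One detail to watch in step 1: as sketched, the early "return TX_DROP"
inside the loop would skip the kernel_fpu_end() call. A userspace model of
a loop that closes the section on both paths could look like this. All
sim_* names are hypothetical stand-ins for illustration, and the fail_at
parameter merely simulates a frame whose encryption fails.

```c
#include <stddef.h>

/* Local model of the two tx results used in the snippet above. */
enum sim_tx_result { SIM_TX_CONTINUE, SIM_TX_DROP };

/* Nesting depth of the modelled fpu section; 0 means closed. */
static int sim_fpu_depth;

static void sim_fpu_begin(void) { sim_fpu_depth++; }
static void sim_fpu_end(void) { sim_fpu_depth--; }

/* Returns non-zero if a begin is still waiting for its end. */
int sim_fpu_section_open(void)
{
	return sim_fpu_depth != 0;
}

/*
 * Walk nframes frames inside a single fpu section. The frame with
 * index fail_at simulates an encryption failure; pass an index
 * >= nframes for a run without failures.
 */
enum sim_tx_result sim_encrypt_queue(size_t nframes, size_t fail_at)
{
	enum sim_tx_result res = SIM_TX_CONTINUE;
	size_t i;

	sim_fpu_begin();
	for (i = 0; i < nframes; i++) {
		if (i == fail_at) {
			res = SIM_TX_DROP;
			break;	/* leave the loop, do not return yet */
		}
	}
	sim_fpu_end();	/* reached on the success and the drop path */

	return res;
}
```

Breaking out of the loop instead of returning keeps the begin/end pair
balanced regardless of which frame fails.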
> Any other magic add-in cards that would somehow just make this all faster w/out
> having to do any real programming work? :)
I doubt there is a magic add-in card for such a use-case. I think most of
them target applications/libraries directly and not the kernel crypto
interface mac80211 is using.
[It would be really nice to know what E5 you actually have]
Regards
Christian