* [RFC] Replace scatterlist with crypto_frag
@ 2005-06-03 23:46 Herbert Xu
2005-06-04 0:02 ` Jeff Garzik
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Herbert Xu @ 2005-06-03 23:46 UTC (permalink / raw)
To: David S. Miller, James Morris, Linux Crypto Mailing List, netdev
Hi:
I was looking at how we can move the IPsec input/output processing out
of the critical section protected by the spin locks on the xfrm_state.
This is useful because it would allow concurrent processing of IPsec
packets for the same SA. It is also necessary if we're ever going to
add support for asynchronous crypto to IPsec.
The first requirement for this is that we need to stop using data that
is shared across a single SA in the IPsec input/output routines. The
biggest hurdle there as it stands is sgbuf in esp_data. This was
introduced to reduce stack usage in esp_input/esp_output as sgbuf
would consume up to 64 bytes of space.
In order to move it back onto the stack (so we can run these things
in parallel), I'm thinking of reducing the size of the scatterlist
structure itself.
The Crypto API doesn't need all the data contained in a scatterlist
structure. For instance, it has no need for anything to do with DMA.
When we implement hardware crypto (which might do DMA), they're going
to have their own lists of descriptors so they can't use the scatterlist
as is anyway.
The skb_frag_t structure on the other hand is much more suited for
our purpose. It is only half the size of scatterlist on i386.
So what do you think about introducing a new crypto_frag structure
which looks like this:
struct crypto_frag {
struct page *page;
u16 offset;
u16 length;
};
We could then move sgbuf back into esp_input/esp_output at the cost
of 32 bytes of stack. Is this stack cost acceptable?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-03 23:46 [RFC] Replace scatterlist with crypto_frag Herbert Xu @ 2005-06-04 0:02 ` Jeff Garzik 2005-06-04 0:42 ` Herbert Xu 2005-06-04 9:55 ` Evgeniy Polyakov 2005-06-04 11:23 ` Christoph Hellwig 2 siblings, 1 reply; 30+ messages in thread From: Jeff Garzik @ 2005-06-04 0:02 UTC (permalink / raw) To: Herbert Xu Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev Herbert Xu wrote: > The Crypto API doesn't need all the data contained in a scatterlist > structure. For instance, it has no need for anything to do with DMA. > When we implement hardware crypto (which might do DMA), they're going > to have their own lists of descriptors so they can't use the scatterlist > as is anyway. I'm not sure I agree with this. A standard feature of struct scatterlist is having the DMA mappings right next to the kernel virtual address/length info. Drivers use the arch-specific DMA-mapped part of struct scatterlist to fill the hardware-specific descriptions with addresses and other info. Since you -will- have to DMA map buffers before passing them to hardware, it seems like struct scatterlist is much more appropriate than crypto_frag when dealing with hardware. For pure software implementations, I don't see why you can't just ignore the extra fields that each arch puts into struct scatterlist. Jeff ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 0:02 ` Jeff Garzik @ 2005-06-04 0:42 ` Herbert Xu 2005-06-04 4:39 ` James Morris 2005-06-04 10:00 ` Evgeniy Polyakov 0 siblings, 2 replies; 30+ messages in thread From: Herbert Xu @ 2005-06-04 0:42 UTC (permalink / raw) To: Jeff Garzik Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev Hi Jeff: On Fri, Jun 03, 2005 at 08:02:52PM -0400, Jeff Garzik wrote: > > A standard feature of struct scatterlist is having the DMA mappings > right next to the kernel virtual address/length info. Drivers use the > arch-specific DMA-mapped part of struct scatterlist to fill the > hardware-specific descriptions with addresses and other info. Agreed. > Since you -will- have to DMA map buffers before passing them to > hardware, it seems like struct scatterlist is much more appropriate than > crypto_frag when dealing with hardware. > > For pure software implementations, I don't see why you can't just ignore > the extra fields that each arch puts into struct scatterlist. It depends on who is going to do the mapping. When we implement hardware crypto, the DMA mapping will be done either by the crypto layer or under it by the driver itself. So the crypto layer is certainly going to need the scatterlist structure. However, the users of the crypto layer (such as IPsec/dmcrypt) don't have to know about DMA at all. Therefore the data structure between the users and the crypto layer itself doesn't have to carry DMA information. Compare this with the block layer. Between the users of the block layer and the block layer itself you have the bio_vec structure which carries no DMA information. The scatterlist structure only comes into play after DMA mapping has been carried out under the block layer. So this is really a sort of bio_vec for crypto structures. The objective here is to make the structure as compact as possible to allow users to allocate it on the stack most of the time. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 0:42 ` Herbert Xu @ 2005-06-04 4:39 ` James Morris 2005-06-04 4:51 ` Herbert Xu 2005-06-04 10:00 ` Evgeniy Polyakov 1 sibling, 1 reply; 30+ messages in thread From: James Morris @ 2005-06-04 4:39 UTC (permalink / raw) To: Herbert Xu Cc: Jeff Garzik, David S. Miller, Linux Crypto Mailing List, netdev On Sat, 4 Jun 2005, Herbert Xu wrote: > So this is really a sort of bio_vec for crypto structures. The objective > here is to make the structure as compact as possible to allow users to > allocate it on the stack most of the time. Seems like a good idea to me. - James -- James Morris <jmorris@redhat.com> ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 4:39 ` James Morris @ 2005-06-04 4:51 ` Herbert Xu 2005-06-04 5:23 ` James Morris 0 siblings, 1 reply; 30+ messages in thread From: Herbert Xu @ 2005-06-04 4:51 UTC (permalink / raw) To: James Morris Cc: Jeff Garzik, David S. Miller, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 12:39:41AM -0400, James Morris wrote: > On Sat, 4 Jun 2005, Herbert Xu wrote: > > > So this is really a sort of bio_vec for crypto structures. The objective > > here is to make the structure as compact as possible to allow users to > > allocate it on the stack most of the time. > > Seems like a good idea to me. Thanks James. What do you think about eating up 32 bytes on the stack in esp_input/esp_output? In fact, how did we come up with the number of four frags? Why wouldn't say two frags do for most users or perhaps even one? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 4:51 ` Herbert Xu @ 2005-06-04 5:23 ` James Morris 2005-06-04 5:33 ` Herbert Xu 0 siblings, 1 reply; 30+ messages in thread From: James Morris @ 2005-06-04 5:23 UTC (permalink / raw) To: Herbert Xu Cc: Jeff Garzik, David S. Miller, Linux Crypto Mailing List, netdev On Sat, 4 Jun 2005, Herbert Xu wrote: > Thanks James. What do you think about eating up 32 bytes on the > stack in esp_input/esp_output? Sounds like a low price to pay, given the general overhead of ipsec. > In fact, how did we come up with the number of four frags? Why wouldn't > say two frags do for most users or perhaps even one? I don't know where that came from. - James -- James Morris <jmorris@redhat.com> ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 5:23 ` James Morris @ 2005-06-04 5:33 ` Herbert Xu 0 siblings, 0 replies; 30+ messages in thread From: Herbert Xu @ 2005-06-04 5:33 UTC (permalink / raw) To: James Morris Cc: Jeff Garzik, David S. Miller, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 01:23:55AM -0400, James Morris wrote: > On Sat, 4 Jun 2005, Herbert Xu wrote: > > > Thanks James. What do you think about eating up 32 bytes on the > > stack in esp_input/esp_output? > > Sounds like a low price to pay, given the general overhead of ipsec. I agree with you on the stack usage. BTW, we can now pump 5Gb/s through the Crypto API using a 1Ghz VIA CPU with the Padlock so encryption is no longer necessarily the slowest piece along the pipeline :) Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 0:42 ` Herbert Xu 2005-06-04 4:39 ` James Morris @ 2005-06-04 10:00 ` Evgeniy Polyakov 1 sibling, 0 replies; 30+ messages in thread From: Evgeniy Polyakov @ 2005-06-04 10:00 UTC (permalink / raw) To: Herbert Xu Cc: Jeff Garzik, David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, 4 Jun 2005 10:42:01 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > Hi Jeff: > > On Fri, Jun 03, 2005 at 08:02:52PM -0400, Jeff Garzik wrote: > > > > A standard feature of struct scatterlist is having the DMA mappings > > right next to the kernel virtual address/length info. Drivers use the > > arch-specific DMA-mapped part of struct scatterlist to fill the > > hardware-specific descriptions with addresses and other info. > > Agreed. > > > Since you -will- have to DMA map buffers before passing them to > > hardware, it seems like struct scatterlist is much more appropriate than > > crypto_frag when dealing with hardware. > > > > For pure software implementations, I don't see why you can't just ignore > > the extra fields that each arch puts into struct scatterlist. > > It depends on who is going to do the mapping. When we implement hardware > crypto, the DMA mapping will be done either by the crypto layer or under > it by the driver itself. So the crypto layer is certainly going to need > the scatterlist structure. > > However, the users of the crypto layer (such as IPsec/dmcrypt) don't have > to know about DMA at all. Therefore the data structure between the users > and the crypto layer itself doesn't have to carry DMA information. > > Compare this with the block layer. Between the users of the block layer > and the block layer itself you have the bio_vec structure which carries > no DMA information. The scatterlist structure only comes into play after > DMA mapping has been carried out under the block layer. > > So this is really a sort of bio_vec for crypto structures. The objective > here is to make the structure as compact as possible to allow users to > allocate it on the stack most of the time. As far as I remember, IPsec has scterlists specially to _not_ remap from any inner strucutre to scaterlist later. Block layer was not designed in a such way because there is no easy mapping in block cache into scaterlist and bio_vec has much bigger usage than SA, so removing dma address is suitable there. > Cheers, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt > - > To unsubscribe from this list: send the line "unsubscribe linux-crypto" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Evgeniy Polyakov Only failure makes us experts. -- Theo de Raadt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-03 23:46 [RFC] Replace scatterlist with crypto_frag Herbert Xu 2005-06-04 0:02 ` Jeff Garzik @ 2005-06-04 9:55 ` Evgeniy Polyakov 2005-06-04 9:58 ` Herbert Xu 2005-06-04 11:23 ` Christoph Hellwig 2 siblings, 1 reply; 30+ messages in thread From: Evgeniy Polyakov @ 2005-06-04 9:55 UTC (permalink / raw) To: Herbert Xu Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, 4 Jun 2005 09:46:23 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > Hi: > > I was looking at how we can move the IPsec input/output processing out > of the critical section protected by the spin locks on the xfrm_state. > This is useful because it would allow concurrent processing of IPsec > packets for the same SA. It is also necessary if we're ever going to > add support for asynchronous crypto to IPsec. Asynchronous schemas already works without any changes to scaterlist processing code. And you can not easily move away of SA lock due to synchronous problems with the same tfm. Existing asynchronous schemas do not use any shared object in SA, only skb. > The first requirement for this is that we need to stop using data that > is shared across a single SA in the IPsec input/output routines. The > biggest hurdle there as it stands is sgbuf in esp_data. This was > introduced to reduce stack usage in esp_input/esp_output as sgbuf > would consume up to 64 bytes of space. No need to have it at all, I think. > Cheers, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt > - > To unsubscribe from this list: send the line "unsubscribe linux-crypto" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Evgeniy Polyakov Only failure makes us experts. -- Theo de Raadt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 9:55 ` Evgeniy Polyakov @ 2005-06-04 9:58 ` Herbert Xu 2005-06-04 10:17 ` Evgeniy Polyakov 0 siblings, 1 reply; 30+ messages in thread From: Herbert Xu @ 2005-06-04 9:58 UTC (permalink / raw) To: Evgeniy Polyakov Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 01:55:35PM +0400, Evgeniy Polyakov wrote: > > processing code. And you can not easily move away of SA lock due to > synchronous problems with the same tfm. This is not true. The tfm context contains no shared state apart from the IV. As the IV can be specified through the *_iv functions, this allows crypto API users to process the same cipher tfm on two CPUs in parallel. If you don't believe me just wait for my upcoming patches to IPsec. -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 9:58 ` Herbert Xu @ 2005-06-04 10:17 ` Evgeniy Polyakov 2005-06-04 10:22 ` Herbert Xu 0 siblings, 1 reply; 30+ messages in thread From: Evgeniy Polyakov @ 2005-06-04 10:17 UTC (permalink / raw) To: Herbert Xu Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, 4 Jun 2005 19:58:54 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Sat, Jun 04, 2005 at 01:55:35PM +0400, Evgeniy Polyakov wrote: > > > > processing code. And you can not easily move away of SA lock due to > > synchronous problems with the same tfm. > > This is not true. The tfm context contains no shared state apart > from the IV. As the IV can be specified through the *_iv functions, > this allows crypto API users to process the same cipher tfm on two > CPUs in parallel. > > If you don't believe me just wait for my upcoming patches to IPsec. Sure I believe you, in tfm there are no shared objects except data. But can we catch the situation when we encrypting the same skb? As far as I can see skb_cow_data() must take care of it. You are right, encrypting is safe. Here is part of esp_output() I use for acrypto. Static scaterlists are not used and new are dinamically allocated. @@ -95,7 +239,90 @@ esph->spi = x->id.spi; esph->seq_no = htonl(++x->replay.oseq); + +#ifdef CONFIG_ACRYPTO + do { + struct crypto_session_initializer ci; + struct crypto_data data; + struct scatterlist *sg; + struct crypto_session *s; + u8 *key, *iv; + + nfrags++; /* key */ + + if (esp->conf.ivlen) + nfrags++; + memset(&ci, 0, sizeof(ci)); + memset(&data, 0, sizeof(data)); + + ci.operation = CRYPTO_OP_ENCRYPT; + ci.mode = crypto_tfm_get_mode(tfm); + ci.type = crypto_tfm_get_type(tfm); + ci.priority = 0; + ci.callback = &esp4_async_callback; + + if (ci.mode == 0xffff || ci.type == 0xffff) + goto sync_crypto; + + sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC); + if (!sg) + goto error; + skb_to_sgvec(skb, sg, esph->enc_data+esp->conf.ivlen-skb->data, clen); + data.sg_src = data.sg_dst = sg; + + key = kmalloc(crypto_tfm_alg_ivsize(tfm) + esp->conf.key_len, GFP_ATOMIC); + if (!key) + goto err_out_free_sg; + + iv = key + esp->conf.key_len; + + if (esp->conf.ivlen) { + data.sg_key = &sg[nfrags - 2]; + data.sg_iv = &sg[nfrags - 1]; + data.sg_key_num = data.sg_iv_num = 1; + } else { + data.sg_key = &sg[nfrags - 1]; + data.sg_iv = NULL; + data.sg_key_num = 1; + data.sg_iv_num = 0; + } + + data.sg_src_num = data.sg_dst_num = nfrags - data.sg_key_num - data.sg_iv_num; + + memcpy(key, esp->conf.key, esp->conf.key_len); + data.sg_key[0].offset = offset_in_page(key); + data.sg_key[0].length = esp->conf.key_len; + data.sg_key[0].page = virt_to_page(key); + + if (esp->conf.ivlen) { + memcpy(iv, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + data.sg_iv[0].offset = offset_in_page(iv); + data.sg_iv[0].length = crypto_tfm_alg_ivsize(tfm); + data.sg_iv[0].page = virt_to_page(iv); + } + + data.priv = esp_output_async_prepare(x, skb); + if (!data.priv) + goto err_out_free_key; + + s = crypto_session_alloc(&ci, &data); + if (!s) + goto err_out_free_ea; + + return 0; + +err_out_free_ea: + kfree(data.priv); +err_out_free_key: + kfree(key); +err_out_free_sg: + kfree(sg); + goto sync_crypto; + } while (0); + +sync_crypto: +#endif if (esp->conf.ivlen) crypto_cipher_set_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt Evgeniy Polyakov Only failure makes us experts. -- Theo de Raadt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 10:17 ` Evgeniy Polyakov @ 2005-06-04 10:22 ` Herbert Xu 2005-06-04 10:29 ` Evgeniy Polyakov 0 siblings, 1 reply; 30+ messages in thread From: Herbert Xu @ 2005-06-04 10:22 UTC (permalink / raw) To: Evgeniy Polyakov Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 02:17:31PM +0400, Evgeniy Polyakov wrote: > > Static scaterlists are not used and new are dinamically allocated. That's precisely why we're having this discussion. We can now encrypt/decrypt a 1500 byte packet in 2us so the last thing we want is to impose additional latencies on the common case unless it's absolutely required. If we can shrink the structure used between IPsec and the crypto layer then we can allocate the sgbuf off the stack for 99% of the users. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 10:22 ` Herbert Xu @ 2005-06-04 10:29 ` Evgeniy Polyakov 2005-06-04 10:32 ` Herbert Xu 0 siblings, 1 reply; 30+ messages in thread From: Evgeniy Polyakov @ 2005-06-04 10:29 UTC (permalink / raw) To: Herbert Xu Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, 4 Jun 2005 20:22:04 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Sat, Jun 04, 2005 at 02:17:31PM +0400, Evgeniy Polyakov wrote: > > > > Static scaterlists are not used and new are dinamically allocated. > > That's precisely why we're having this discussion. We can now > encrypt/decrypt a 1500 byte packet in 2us so the last thing we > want is to impose additional latencies on the common case unless > it's absolutely required. > > If we can shrink the structure used between IPsec and the crypto > layer then we can allocate the sgbuf off the stack for 99% of > the users. I do see that 4 sg are enough for 99% of the users, I event think 2 is enough - it will be 8kb, almost the maximum seen 9kb jumbo frame. But without sg we sill save 4*sizeof(dma addr) - is it really a price? For hardware we will need to remap it later... > Cheers, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt Evgeniy Polyakov Only failure makes us experts. -- Theo de Raadt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 10:29 ` Evgeniy Polyakov @ 2005-06-04 10:32 ` Herbert Xu 2005-06-04 10:40 ` Evgeniy Polyakov 2005-06-06 22:36 ` David S. Miller 0 siblings, 2 replies; 30+ messages in thread From: Herbert Xu @ 2005-06-04 10:32 UTC (permalink / raw) To: Evgeniy Polyakov Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 02:29:39PM +0400, Evgeniy Polyakov wrote: > > But without sg we sill save 4*sizeof(dma addr) - is it really a price? We're also reducing the offset/length to 16 bits from 32 bits so we're shaving off half the size. > For hardware we will need to remap it later... Well we can't modify the supplied scatterlist structure in the crypto API anyway since we don't have exclusive ownership of it. Since we can't expect the user of the crypto API to do the mapping this space is basically wasted. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 10:32 ` Herbert Xu @ 2005-06-04 10:40 ` Evgeniy Polyakov 2005-06-06 22:36 ` David S. Miller 1 sibling, 0 replies; 30+ messages in thread From: Evgeniy Polyakov @ 2005-06-04 10:40 UTC (permalink / raw) To: Herbert Xu Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, 4 Jun 2005 20:32:49 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Sat, Jun 04, 2005 at 02:29:39PM +0400, Evgeniy Polyakov wrote: > > > > But without sg we sill save 4*sizeof(dma addr) - is it really a price? > > We're also reducing the offset/length to 16 bits from 32 bits so we're > shaving off half the size. > > > For hardware we will need to remap it later... > > Well we can't modify the supplied scatterlist structure in the > crypto API anyway since we don't have exclusive ownership of it. > Since we can't expect the user of the crypto API to do the mapping > this space is basically wasted. So why not remove it completely? Sycnhronous hardware (like VIA/freescale processors) do not use at all any scaterlists, so it is not needed there. CryptoAPI does not use half of the scaterlist structure. CryptoAPI design can not be used with "interruptible" hardware like HIFN, so for asynchronous hardware we need some kind of remapping anyway, so why just not to move to the new fragments Herbert introduced all over the place in CryptoAPI? But pleaso do not remove skb_to_sgvec() :) > Cheers, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt Evgeniy Polyakov Only failure makes us experts. -- Theo de Raadt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 10:32 ` Herbert Xu 2005-06-04 10:40 ` Evgeniy Polyakov @ 2005-06-06 22:36 ` David S. Miller 1 sibling, 0 replies; 30+ messages in thread From: David S. Miller @ 2005-06-06 22:36 UTC (permalink / raw) To: herbert; +Cc: johnpol, jmorris, linux-crypto, netdev From: Herbert Xu <herbert@gondor.apana.org.au> Date: Sat, 4 Jun 2005 20:32:49 +1000 > On Sat, Jun 04, 2005 at 02:29:39PM +0400, Evgeniy Polyakov wrote: > > > > But without sg we sill save 4*sizeof(dma addr) - is it really a price? > > We're also reducing the offset/length to 16 bits from 32 bits so we're > shaving off half the size. Note, it is still going to be a 16 byte structure on 64-bit machines. This is mainly due to the 8-byte alignment needed by the page pointer. I'm not objecting to your ideas, just mentioning this fact... I am also puzzled as to where the "4" came from :-) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-03 23:46 [RFC] Replace scatterlist with crypto_frag Herbert Xu 2005-06-04 0:02 ` Jeff Garzik 2005-06-04 9:55 ` Evgeniy Polyakov @ 2005-06-04 11:23 ` Christoph Hellwig 2005-06-04 11:26 ` Herbert Xu 2 siblings, 1 reply; 30+ messages in thread From: Christoph Hellwig @ 2005-06-04 11:23 UTC (permalink / raw) To: Herbert Xu Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 09:46:23AM +1000, Herbert Xu wrote: > struct crypto_frag { > struct page *page; > u16 offset; > u16 length; > }; we have this structure as skb_frag_struct and bio_vec already, care to use the same structure with a generic name for all of them? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 11:23 ` Christoph Hellwig @ 2005-06-04 11:26 ` Herbert Xu 2005-06-04 11:58 ` Christoph Hellwig 0 siblings, 1 reply; 30+ messages in thread From: Herbert Xu @ 2005-06-04 11:26 UTC (permalink / raw) To: Christoph Hellwig Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 12:23:14PM +0100, Christoph Hellwig wrote: > On Sat, Jun 04, 2005 at 09:46:23AM +1000, Herbert Xu wrote: > > struct crypto_frag { > > struct page *page; > > u16 offset; > > u16 length; > > }; > > we have this structure as skb_frag_struct and bio_vec already, care > to use the same structure with a generic name for all of them? I certainly would have no problems merging with skb_frag_struct. However, merging with bio_vec would mean that either bio_vec would have to drop down to 16-bit counters, or crypto_frag would have to move up to 32-bit counters. The latter is problematic because I'm trying to shrink the size enough so that we can squeeze four of these things onto the stack. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 11:26 ` Herbert Xu @ 2005-06-04 11:58 ` Christoph Hellwig 2005-06-06 11:59 ` Herbert Xu 0 siblings, 1 reply; 30+ messages in thread From: Christoph Hellwig @ 2005-06-04 11:58 UTC (permalink / raw) To: Herbert Xu Cc: Christoph Hellwig, David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 09:26:06PM +1000, Herbert Xu wrote: > On Sat, Jun 04, 2005 at 12:23:14PM +0100, Christoph Hellwig wrote: > > On Sat, Jun 04, 2005 at 09:46:23AM +1000, Herbert Xu wrote: > > > struct crypto_frag { > > > struct page *page; > > > u16 offset; > > > u16 length; > > > }; > > > > we have this structure as skb_frag_struct and bio_vec already, care > > to use the same structure with a generic name for all of them? > > I certainly would have no problems merging with skb_frag_struct. > However, merging with bio_vec would mean that either bio_vec would > have to drop down to 16-bit counters, or crypto_frag would have to > move up to 32-bit counters. the usage of 16bit counters in bio_vec doesn't make sense, and if did all others would have to move to 32bit aswell (in case we started supporting page sizes that aren't addressable by 16bits) ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-04 11:58 ` Christoph Hellwig @ 2005-06-06 11:59 ` Herbert Xu 2005-06-06 12:09 ` Christoph Hellwig 0 siblings, 1 reply; 30+ messages in thread From: Herbert Xu @ 2005-06-06 11:59 UTC (permalink / raw) To: Christoph Hellwig Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Sat, Jun 04, 2005 at 12:58:53PM +0100, Christoph Hellwig wrote: > > the usage of 16bit counters in bio_vec doesn't make sense, and if did > all others would have to move to 32bit aswell (in case we started > supporting page sizes that aren't addressable by 16bits) You know what? The more I think about this the more I think that your idea is brilliant. The reason is that the two main users of crypto API happen to be in possession of bio_vec and skb_frag_t respectively. Had we merged the three structures, they would not have to copy the structures as they do now or even worse, process the buffers one-by-one as dmcrypt is doing. Back to the topic of 16-bit vs. 32-bit counters. Could we do something like this? #if (PAGE_SHIFT > 16) || (BITS_PER_LONG > 32) typedef unsigned int page_offset_t #else typedef unsigned short page_offset_t #endif And then define struct foovec { struct page *page; page_offset_t offset; page_offset_t length; }; Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 11:59 ` Herbert Xu @ 2005-06-06 12:09 ` Christoph Hellwig 2005-06-06 12:40 ` Herbert Xu 0 siblings, 1 reply; 30+ messages in thread From: Christoph Hellwig @ 2005-06-06 12:09 UTC (permalink / raw) To: Herbert Xu Cc: Christoph Hellwig, David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Mon, Jun 06, 2005 at 09:59:39PM +1000, Herbert Xu wrote: > On Sat, Jun 04, 2005 at 12:58:53PM +0100, Christoph Hellwig wrote: > > > > the usage of 16bit counters in bio_vec doesn't make sense, and if did > > all others would have to move to 32bit aswell (in case we started > > supporting page sizes that aren't addressable by 16bits) > > You know what? The more I think about this the more I think that your > idea is brilliant. The reason is that the two main users of crypto API > happen to be in possession of bio_vec and skb_frag_t respectively. > > Had we merged the three structures, they would not have to copy the > structures as they do now or even worse, process the buffers one-by-one > as dmcrypt is doing. > > Back to the topic of 16-bit vs. 32-bit counters. Could we do something > like this? > > #if (PAGE_SHIFT > 16) || (BITS_PER_LONG > 32) what is the BITS_PER_LONG check for? > typedef unsigned int page_offset_t > #else > typedef unsigned short page_offset_t > #endif the name is a) a little long and b) easy to confuse with pgoff_t as used in the pagecache. I'm not sure what a better name would be. We probably shouldn't care about this as the networking code didn't handle larger offsets either. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 12:09 ` Christoph Hellwig @ 2005-06-06 12:40 ` Herbert Xu 2005-06-06 13:30 ` Christoph Hellwig 2005-06-06 22:46 ` David S. Miller 0 siblings, 2 replies; 30+ messages in thread From: Herbert Xu @ 2005-06-06 12:40 UTC (permalink / raw) To: Christoph Hellwig Cc: David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Mon, Jun 06, 2005 at 01:09:14PM +0100, Christoph Hellwig wrote: > > > #if (PAGE_SHIFT > 16) || (BITS_PER_LONG > 32) > > what is the BITS_PER_LONG check for? These structures are normally used in arrays. On a 64-bit machine the alignment requirement means that the 16-bit version will be padded to have the same length as the 32-bit version. Since 32-bit access is usually faster we might as well get it for free. > > typedef unsigned int page_offset_t > > the name is a) a little long and b) easy to confuse with pgoff_t as used in > the pagecache. I'm not sure what a better name would be. Alternatively we can put the ifdef around (or inside) the struct definition. > We probably shouldn't care about this as the networking code didn't handle > larger offsets either. I'm not sure what you mean here. However, for skb_frag_t at least going to the 32-bit version on i386 means at least 72 bytes extra for every skb->data allocation. Dave, what are your views on making skb_frag_t bigger? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 12:40 ` Herbert Xu @ 2005-06-06 13:30 ` Christoph Hellwig 2005-06-06 22:46 ` David S. Miller 1 sibling, 0 replies; 30+ messages in thread From: Christoph Hellwig @ 2005-06-06 13:30 UTC (permalink / raw) To: Herbert Xu Cc: Christoph Hellwig, David S. Miller, James Morris, Linux Crypto Mailing List, netdev On Mon, Jun 06, 2005 at 10:40:43PM +1000, Herbert Xu wrote: > On Mon, Jun 06, 2005 at 01:09:14PM +0100, Christoph Hellwig wrote: > > > > > #if (PAGE_SHIFT > 16) || (BITS_PER_LONG > 32) > > > > what is the BITS_PER_LONG check for? > > These structures are normally used in arrays. On a 64-bit machine > the alignment requirement means that the 16-bit version will be > padded to have the same length as the 32-bit version. Since 32-bit > access is usually faster we might as well get it for free. At this point it might be easiest to just say the architecture must declare the type in asm/types.h ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 12:40 ` Herbert Xu 2005-06-06 13:30 ` Christoph Hellwig @ 2005-06-06 22:46 ` David S. Miller 2005-06-06 23:04 ` Herbert Xu 2005-06-06 23:05 ` Jeff Garzik 1 sibling, 2 replies; 30+ messages in thread From: David S. Miller @ 2005-06-06 22:46 UTC (permalink / raw) To: herbert; +Cc: hch, jmorris, linux-crypto, netdev From: Herbert Xu <herbert@gondor.apana.org.au> Date: Mon, 6 Jun 2005 22:40:43 +1000 > However, for skb_frag_t at least going to the 32-bit version on i386 > means at least 72 bytes extra for every skb->data allocation. > > Dave, what are your views on making skb_frag_t bigger? Good question. There is an ancillary issue that I'd like to address at some point, and what you do here is tied into that. Currently, NETIF_F_SG drivers do one DMA mapping call for each fragment of the packet. That totally stinks performance wise, and the PPC64 and SPARC64 folks feel this the most. So I wanted to create a set of interfaces ala: int dma_map_skb(struct sk_buff *skb, ...); void dma_unmap_skb(struct sk_buff *skb, ...); void dma_sync_skb_for_cpu(struct sk_buff *skb, ...); void dma_sync_skb_for_device(struct sk_buff *skb, ...); The question is where to put the DMA mapping cookies :-) On i386 and alike, using something like the DECLARE_PCI_UNMAP_*() macros would allow us to NOP out the DMA addresses entirely. Since they are computable from the page struct and offset. Note that the above interface, on IOMMU platforms, would allow DMA coalescing to be performed. This would hit heavily with TSO, for example. Most packets would go out with a maximum of 2 DMA descriptors, 1 for the mapping of skb->data and 1 for all of the paged SKB data afterwards combined. Note that, due to this coalescing, the "size" member must be larger than a __u16. So I guess I'm taking you a step backwards, I want to make skb_frag_struct a little bigger :-) Ie. put the DMA mapping cookies into the skb_frag_struct, then a set of accessor macros like we have for scatterlist. Well, in fact, it would become a scatterlist and therefore the only thing special about dma_map_skb() is that is maps a linear buffer via skb->data then the scatterlist in skb_shared_info(skb). ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 22:46 ` David S. Miller @ 2005-06-06 23:04 ` Herbert Xu 2005-06-06 23:09 ` David S. Miller 2005-06-06 23:05 ` Jeff Garzik 1 sibling, 1 reply; 30+ messages in thread From: Herbert Xu @ 2005-06-06 23:04 UTC (permalink / raw) To: David S. Miller; +Cc: hch, jmorris, linux-crypto, netdev On Mon, Jun 06, 2005 at 03:46:53PM -0700, David S. Miller wrote: > > So I guess I'm taking you a step backwards, I want to make > skb_frag_struct a little bigger :-) Ie. put the DMA mapping > cookies into the skb_frag_struct, then a set of accessor > macros like we have for scatterlist. Well, in fact, it would > become a scatterlist and therefore the only thing special > about dma_map_skb() is that is maps a linear buffer via > skb->data then the scatterlist in skb_shared_info(skb). Bigger is better actually :) I'm now thinking of using the memory occupied by the frags array for IPsec crypto operations. So if it's bigger then it simply means that we can store more fragments. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 23:04 ` Herbert Xu @ 2005-06-06 23:09 ` David S. Miller 2005-06-06 23:14 ` Herbert Xu 0 siblings, 1 reply; 30+ messages in thread From: David S. Miller @ 2005-06-06 23:09 UTC (permalink / raw) To: herbert; +Cc: hch, jmorris, linux-crypto, netdev From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 7 Jun 2005 09:04:37 +1000 > Bigger is better actually :) I'm now thinking of using the > memory occupied by the frags array for IPsec crypto operations. > So if it's bigger then it simply means that we can store more > fragments. So you want to use this area as a sort-of temporary scratch pad for something other than scatterlist information? That's interesting if so... ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 23:09 ` David S. Miller @ 2005-06-06 23:14 ` Herbert Xu 2005-06-06 23:18 ` David S. Miller 0 siblings, 1 reply; 30+ messages in thread From: Herbert Xu @ 2005-06-06 23:14 UTC (permalink / raw) To: David S. Miller; +Cc: hch, jmorris, linux-crypto, netdev On Mon, Jun 06, 2005 at 04:09:47PM -0700, David S. Miller wrote: > > So you want to use this area as a sort-of temporary > scratch pad for something other than scatterlist > information? That's interesting if so... There are two possibilities: 1) We use it directly as a scratch buffer for now since the frags are always linearised currently. 2) We have a list of lists of which this is simply a member. The meta-list itself can then be stored on the stack since each member is only 4 bytes. We can go with 1) for now. When it becomes possible for us to not flatten the frags, we can switch to 2). Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 23:14 ` Herbert Xu @ 2005-06-06 23:18 ` David S. Miller 2005-06-06 23:20 ` Herbert Xu 0 siblings, 1 reply; 30+ messages in thread From: David S. Miller @ 2005-06-06 23:18 UTC (permalink / raw) To: herbert; +Cc: hch, jmorris, linux-crypto, netdev From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 7 Jun 2005 09:14:05 +1000 > 1) We use it directly as a scratch buffer for now since the > frags are always linearised currently. And since skb_shinfo(skb)->nr_frags will be zero, nobody will mistakedly look at the contents and interpret it as some valid frags. Right? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 23:18 ` David S. Miller @ 2005-06-06 23:20 ` Herbert Xu 0 siblings, 0 replies; 30+ messages in thread From: Herbert Xu @ 2005-06-06 23:20 UTC (permalink / raw) To: David S. Miller; +Cc: hch, jmorris, linux-crypto, netdev On Mon, Jun 06, 2005 at 04:18:14PM -0700, David S. Miller wrote: > From: Herbert Xu <herbert@gondor.apana.org.au> > Date: Tue, 7 Jun 2005 09:14:05 +1000 > > > 1) We use it directly as a scratch buffer for now since the > > frags are always linearised currently. > > And since skb_shinfo(skb)->nr_frags will be zero, nobody > will mistakedly look at the contents and interpret it as > some valid frags. Right? Yep. -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC] Replace scatterlist with crypto_frag 2005-06-06 22:46 ` David S. Miller 2005-06-06 23:04 ` Herbert Xu @ 2005-06-06 23:05 ` Jeff Garzik 1 sibling, 0 replies; 30+ messages in thread From: Jeff Garzik @ 2005-06-06 23:05 UTC (permalink / raw) To: David S. Miller; +Cc: herbert, hch, jmorris, linux-crypto, netdev David S. Miller wrote: > The question is where to put the DMA mapping cookies :-) Indeed. Jeff ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2005-06-06 23:20 UTC | newest] Thread overview: 30+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-06-03 23:46 [RFC] Replace scatterlist with crypto_frag Herbert Xu 2005-06-04 0:02 ` Jeff Garzik 2005-06-04 0:42 ` Herbert Xu 2005-06-04 4:39 ` James Morris 2005-06-04 4:51 ` Herbert Xu 2005-06-04 5:23 ` James Morris 2005-06-04 5:33 ` Herbert Xu 2005-06-04 10:00 ` Evgeniy Polyakov 2005-06-04 9:55 ` Evgeniy Polyakov 2005-06-04 9:58 ` Herbert Xu 2005-06-04 10:17 ` Evgeniy Polyakov 2005-06-04 10:22 ` Herbert Xu 2005-06-04 10:29 ` Evgeniy Polyakov 2005-06-04 10:32 ` Herbert Xu 2005-06-04 10:40 ` Evgeniy Polyakov 2005-06-06 22:36 ` David S. Miller 2005-06-04 11:23 ` Christoph Hellwig 2005-06-04 11:26 ` Herbert Xu 2005-06-04 11:58 ` Christoph Hellwig 2005-06-06 11:59 ` Herbert Xu 2005-06-06 12:09 ` Christoph Hellwig 2005-06-06 12:40 ` Herbert Xu 2005-06-06 13:30 ` Christoph Hellwig 2005-06-06 22:46 ` David S. Miller 2005-06-06 23:04 ` Herbert Xu 2005-06-06 23:09 ` David S. Miller 2005-06-06 23:14 ` Herbert Xu 2005-06-06 23:18 ` David S. Miller 2005-06-06 23:20 ` Herbert Xu 2005-06-06 23:05 ` Jeff Garzik
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).