From: Dan Williams <dan.j.williams@intel.com>
To: Vishnu Suresh <Vishnu@freescale.com>
Cc: herbert@gondor.apana.org.au, linux-kernel@vger.kernel.org,
	linux-raid@vger.kernel.org, linuxppc-dev@ozlabs.org,
	linux-crypto@vger.kernel.org,
	Dipen Dudhat <Dipen.Dudhat@freescale.com>,
	Maneesh Gupta <Maneesh.Gupta@freescale.com>
Subject: Re: [PATCH v0 2/2] Crypto: Talitos: Support for Async_tx XOR offload
Date: Thu, 29 Oct 2009 15:46:37 -0700
Message-ID: <e9c3a7c20910291546r2603e191g87a23023c32f1b1b@mail.gmail.com>
In-Reply-To: <1255588866-4133-2-git-send-email-Vishnu@freescale.com>
On Wed, Oct 14, 2009 at 11:41 PM, Vishnu Suresh <Vishnu@freescale.com> wrote:
[..]
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index b08403d..343e578 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -192,6 +192,8 @@ config CRYPTO_DEV_TALITOS
> 	select CRYPTO_ALGAPI
> 	select CRYPTO_AUTHENC
> 	select HW_RANDOM
> +	select DMA_ENGINE
> +	select ASYNC_XOR
You need only select DMA_ENGINE.  ASYNC_XOR is selected by its users.
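Roughly this is all the entry should need (md/raid and the other async_tx users already select ASYNC_XOR for themselves):

```
config CRYPTO_DEV_TALITOS
	tristate "Talitos Freescale Security Engine (SEC)"
	select CRYPTO_ALGAPI
	select CRYPTO_AUTHENC
	select HW_RANDOM
	select DMA_ENGINE
	depends on FSL_SOC
```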
> 	depends on FSL_SOC
> 	help
> 	  Say 'Y' here to use the Freescale Security Engine (SEC)
> diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
> index c47ffe8..84819d4 100644
> --- a/drivers/crypto/talitos.c
> +++ b/drivers/crypto/talitos.c
[..]
> +static void talitos_process_pending(struct talitos_xor_chan *xor_chan)
> +{
> +	struct talitos_xor_desc *desc, *_desc;
> +	unsigned long flags;
> +	int status;
> +
> +	spin_lock_irqsave(&xor_chan->desc_lock, flags);
> +
> +	list_for_each_entry_safe(desc, _desc, &xor_chan->pending_q, node) {
> +		status = talitos_submit(xor_chan->dev, &desc->hwdesc,
> +					talitos_release_xor, desc);
> +		if (status != -EINPROGRESS)
> +			break;
> +
> +		list_del(&desc->node);
> +		list_add_tail(&desc->node, &xor_chan->in_progress_q);
> +	}
> +
> +	spin_unlock_irqrestore(&xor_chan->desc_lock, flags);
> +}
The driver uses spin_lock_bh everywhere else which is either a bug, or
this code is being overly protective.  In any event lockdep will
rightly complain about this.  The API and its users do not submit
operations in hard-irq context.
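In other words something along these lines, consistent with the locking in the rest of the driver (a sketch only, keeping your existing names):

```c
static void talitos_process_pending(struct talitos_xor_chan *xor_chan)
{
	struct talitos_xor_desc *desc, *_desc;
	int status;

	/* bh-level exclusion is sufficient; nothing submits from hard-irq */
	spin_lock_bh(&xor_chan->desc_lock);

	list_for_each_entry_safe(desc, _desc, &xor_chan->pending_q, node) {
		status = talitos_submit(xor_chan->dev, &desc->hwdesc,
					talitos_release_xor, desc);
		if (status != -EINPROGRESS)
			break;

		list_del(&desc->node);
		list_add_tail(&desc->node, &xor_chan->in_progress_q);
	}

	spin_unlock_bh(&xor_chan->desc_lock);
}
```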
> +
> +static void talitos_release_xor(struct device *dev, struct talitos_desc *hwdesc,
> +				void *context, int error)
> +{
> +	struct talitos_xor_desc *desc = context;
> +	struct talitos_xor_chan *xor_chan;
> +	dma_async_tx_callback callback;
> +	void *callback_param;
> +
> +	if (unlikely(error)) {
> +		dev_err(dev, "xor operation: talitos error %d\n", error);
> +		BUG();
> +	}
> +
> +	xor_chan = container_of(desc->async_tx.chan, struct talitos_xor_chan,
> +				common);
> +	spin_lock_bh(&xor_chan->desc_lock);
> +	if (xor_chan->completed_cookie < desc->async_tx.cookie)
> +		xor_chan->completed_cookie = desc->async_tx.cookie;
> +
> +	callback = desc->async_tx.callback;
> +	callback_param = desc->async_tx.callback_param;
> +
> +	if (callback) {
> +		spin_unlock_bh(&xor_chan->desc_lock);
> +		callback(callback_param);
> +		spin_lock_bh(&xor_chan->desc_lock);
> +	}
You do not need to unlock to call the callback.  Users of the API
assume that they are called in bh context and that they are not
allowed to submit new operations directly from the callback (this
simplifies the descriptor cleanup routines in other drivers).
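So this can collapse to something like:

```c
	/* callbacks run in bh context and are forbidden from submitting
	 * new operations directly, so the lock can stay held
	 */
	if (callback)
		callback(callback_param);
```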
> +
> +	list_del(&desc->node);
> +	list_add_tail(&desc->node, &xor_chan->free_desc);
> +	spin_unlock_bh(&xor_chan->desc_lock);
> +	if (!list_empty(&xor_chan->pending_q))
> +		talitos_process_pending(xor_chan);
> +}
It appears this routine is missing a call to dma_run_dependencies()?
This is needed if another channel is handling memory copy offload.  I
assume this is the case due to your other patch to fsldma.  See
iop_adma_run_tx_complete_actions().  If this xor resource can also
handle copy operations that would be more efficient as it eliminates
channel switching.  See the ioatdma driver and its use of
ASYNC_TX_DISABLE_CHANNEL_SWITCH.
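Concretely, the completion path would gain something like this before recycling the descriptor (sketch; the placement relative to your free-list handling is up to you):

```c
	/* kick any operations that were queued as dependencies of this
	 * transaction, possibly on another channel (e.g. a memcpy on
	 * fsldma); without this, dependent chains stall forever
	 */
	dma_run_dependencies(&desc->async_tx);
```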
> +static struct dma_async_tx_descriptor *talitos_prep_dma_xor(
> +			struct dma_chan *chan, dma_addr_t dest, dma_addr_t *src,
> +			unsigned int src_cnt, size_t len, unsigned long flags)
> +{
> +	struct talitos_xor_chan *xor_chan;
> +	struct talitos_xor_desc *new;
> +	struct talitos_desc *desc;
> +	int i, j;
> +
> +	BUG_ON(unlikely(len > TALITOS_MAX_DATA_LEN));
> +
> +	xor_chan = container_of(chan, struct talitos_xor_chan, common);
> +
> +	spin_lock_bh(&xor_chan->desc_lock);
> +	if (!list_empty(&xor_chan->free_desc)) {
> +		new = container_of(xor_chan->free_desc.next,
> +				   struct talitos_xor_desc, node);
> +		list_del(&new->node);
> +	} else {
> +		new = talitos_xor_alloc_descriptor(xor_chan, GFP_KERNEL);
> +	}
> +	spin_unlock_bh(&xor_chan->desc_lock);
> +
> +	if (!new) {
> +		dev_err(xor_chan->common.device->dev,
> +			"No free memory for XOR DMA descriptor\n");
> +		return NULL;
> +	}
> +	dma_async_tx_descriptor_init(&new->async_tx, &xor_chan->common);
> +
> +	INIT_LIST_HEAD(&new->node);
> +	INIT_LIST_HEAD(&new->tx_list);
You can save some overhead in the fast path by moving this
initialization to talitos_xor_alloc_descriptor.
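Assuming the allocator looks roughly like the usual kzalloc pattern, i.e. something like:

```c
static struct talitos_xor_desc *talitos_xor_alloc_descriptor(
		struct talitos_xor_chan *xor_chan, gfp_t flags)
{
	struct talitos_xor_desc *desc;

	desc = kzalloc(sizeof(*desc), flags);
	if (desc) {
		/* one-time init; descriptors recycled from free_desc
		 * keep their list heads, so prep need not redo this
		 */
		INIT_LIST_HEAD(&desc->node);
		INIT_LIST_HEAD(&desc->tx_list);
	}
	return desc;
}
```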
> +
> +	desc = &new->hwdesc;
> +	/* Set destination: Last pointer pair */
> +	to_talitos_ptr(&desc->ptr[6], dest);
> +	desc->ptr[6].len = cpu_to_be16(len);
> +	desc->ptr[6].j_extent = 0;
> +
> +	/* Set Sources: End loading from second-last pointer pair */
> +	for (i = 5, j = 0; (j < src_cnt) && (i > 0); i--, j++) {
> +		to_talitos_ptr(&desc->ptr[i], src[j]);
> +		desc->ptr[i].len = cpu_to_be16(len);
> +		desc->ptr[i].j_extent = 0;
> +	}
> +
> +	/*
> +	 * documentation states first 0 ptr/len combo marks end of sources
> +	 * yet device produces scatter boundary error unless all subsequent
> +	 * sources are zeroed out
> +	 */
> +	for (; i >= 0; i--) {
> +		to_talitos_ptr(&desc->ptr[i], 0);
> +		desc->ptr[i].len = 0;
> +		desc->ptr[i].j_extent = 0;
> +	}
> +
> +	desc->hdr = DESC_HDR_SEL0_AESU | DESC_HDR_MODE0_AESU_XOR
> +		  | DESC_HDR_TYPE_RAID_XOR;
> +
> +	new->async_tx.parent = NULL;
> +	new->async_tx.next = NULL;
These fields are managed by the async_tx channel switch code.  No need
to manage it here.
> +	new->async_tx.cookie = 0;
This is set below to -EBUSY, it's redundant to touch it here.
> +	async_tx_ack(&new->async_tx);
This makes it impossible to attach a dependency to this operation.
Not sure this is what you want.
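Taken together, the descriptor setup at the end of prep can probably shrink to something like (a sketch; the flags handling is an assumption on my part, check what your submit path expects):

```c
	dma_async_tx_descriptor_init(&new->async_tx, &xor_chan->common);
	/* honor DMA_PREP_INTERRUPT / DMA_CTRL_ACK from the caller; do not
	 * ack unconditionally, or dependencies can never be attached.
	 * parent/next/cookie are owned by the async_tx core and tx_submit,
	 * so leave them alone here.
	 */
	new->async_tx.flags = flags;
```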
--
Dan