* Re: [PATCH -V5 11/13] arch/powerpc: properly isolate kernel and user proto-VSID
From: Paul Mackerras @ 2012-08-01 4:31 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linuxppc-dev
In-Reply-To: <1343647339-25576-12-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
On Mon, Jul 30, 2012 at 04:52:17PM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> The proto-VSID space is divided into two class
> User: 0 to 2^(CONTEXT_BITS + USER_ESID_BITS) -1
> kernel: 2^(CONTEXT_BITS + USER_ESID_BITS) to 2^(VSID_BITS) - 1
>
> With KERNEL_START at 0xc000000000000000, the proto vsid for
> the kernel ends up with 0xc00000000 (36 bits). With 64TB
> patchset we need to have kernel proto-VSID in the
> [2^37 to 2^38 - 1] range due to the increased USER_ESID_BITS.
This needs to be rolled in with the previous patch, otherwise you'll
break bisection.
> diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
> index db2cb3f..405d380 100644
> --- a/arch/powerpc/mm/slb_low.S
> +++ b/arch/powerpc/mm/slb_low.S
> @@ -57,8 +57,16 @@ _GLOBAL(slb_allocate_realmode)
> _GLOBAL(slb_miss_kernel_load_linear)
> li r11,0
> BEGIN_FTR_SECTION
> + li r9,0x1
> + rldimi r10,r9,(CONTEXT_BITS + USER_ESID_BITS),0
> b slb_finish_load
> END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
> + li r9,0x1
> + /*
> + * shift 12 bits less here, slb_finish_load_1T will do
> + * the necessary shits
> + */
> + rldimi r10,r9,(CONTEXT_BITS + USER_ESID_BITS),0
> b slb_finish_load_1T
Since you're actually doing exactly the same instructions in the 256M
and 1T segment cases, why not do the li; rldimi before the
BEGIN_FTR_SECTION?
> @@ -86,8 +94,16 @@ _GLOBAL(slb_miss_kernel_load_vmemmap)
> li r11,0
> 6:
> BEGIN_FTR_SECTION
> + li r9,0x1
> + rldimi r10,r9,(CONTEXT_BITS + USER_ESID_BITS),0
> b slb_finish_load
> END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
> + li r9,0x1
> + /*
> + * shift 12 bits less here, slb_finish_load_1T will do
> + * the necessary shits
> + */
> + rldimi r10,r9,(CONTEXT_BITS + USER_ESID_BITS),0
> b slb_finish_load_1T
And similarly here.
Paul.
^ permalink raw reply
* RE: [PATCH 7/7] carma: remove unnecessary DMA_INTERRUPT capability
From: Liu Qiang-B32616 @ 2012-08-01 3:56 UTC (permalink / raw)
To: Ira W. Snyder, linux-crypto@vger.kernel.org; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1343778352-5549-8-git-send-email-iws@ovro.caltech.edu>
> -----Original Message-----
> From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> Sent: Wednesday, August 01, 2012 7:46 AM
> To: linux-crypto@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org; Liu Qiang-B32616; Ira W. Snyder
> Subject: [PATCH 7/7] carma: remove unnecessary DMA_INTERRUPT capability
>=20
> From: "Ira W. Snyder" <iws@ovro.caltech.edu>
>=20
> These drivers set the DMA_INTERRUPT capability bit when requesting a DMA
> controller channel. This was historical, and is no longer needed.
>=20
> Recent changes to the drivers/dma/fsldma.c driver have removed support
> for this flag. This makes the carma drivers unable to find a DMA channel
> with the required capabilities.
>=20
> Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
> ---
> drivers/misc/carma/carma-fpga-program.c | 1 -
> drivers/misc/carma/carma-fpga.c | 3 +--
> 2 files changed, 1 insertions(+), 3 deletions(-)
>=20
> diff --git a/drivers/misc/carma/carma-fpga-program.c
> b/drivers/misc/carma/carma-fpga-program.c
> index a2d25e4..eaddfe9 100644
> --- a/drivers/misc/carma/carma-fpga-program.c
> +++ b/drivers/misc/carma/carma-fpga-program.c
> @@ -978,7 +978,6 @@ static int fpga_of_probe(struct platform_device *op)
> dev_set_drvdata(priv->dev, priv);
> dma_cap_zero(mask);
> dma_cap_set(DMA_MEMCPY, mask);
> - dma_cap_set(DMA_INTERRUPT, mask);
> dma_cap_set(DMA_SLAVE, mask);
> dma_cap_set(DMA_SG, mask);
>=20
> diff --git a/drivers/misc/carma/carma-fpga.c b/drivers/misc/carma/carma-
> fpga.c index 8c279da..861b298 100644
> --- a/drivers/misc/carma/carma-fpga.c
> +++ b/drivers/misc/carma/carma-fpga.c
> @@ -666,7 +666,7 @@ static int data_submit_dma(struct fpga_device *priv,
> struct data_buf *buf)
> src =3D SYS_FPGA_BLOCK;
> tx =3D chan->device->device_prep_dma_memcpy(chan, dst, src,
> REG_BLOCK_SIZE,
> - DMA_PREP_INTERRUPT);
> + 0);
> if (!tx) {
> dev_err(priv->dev, "unable to prep SYS-FPGA DMA\n");
> return -ENOMEM;
> @@ -1333,7 +1333,6 @@ static int data_of_probe(struct platform_device *op=
)
>=20
> dma_cap_zero(mask);
> dma_cap_set(DMA_MEMCPY, mask);
> - dma_cap_set(DMA_INTERRUPT, mask);
> dma_cap_set(DMA_SLAVE, mask);
> dma_cap_set(DMA_SG, mask);
Hi Ira, attached link is the comments of Dan Williams to the original patch=
of fsl-dma.
http://lkml.indiana.edu/hypermail/linux/kernel/0910.1/03836.html
> --
> 1.7.8.6
>=20
^ permalink raw reply
* RE: [PATCH 0/7] fsl-dma: fixes for Freescale DMA driver
From: Liu Qiang-B32616 @ 2012-08-01 3:47 UTC (permalink / raw)
To: Ira W. Snyder, linux-crypto@vger.kernel.org; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
Hi Ira,
My comments inline.
> -----Original Message-----
> From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> Sent: Wednesday, August 01, 2012 7:46 AM
> To: linux-crypto@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org; Liu Qiang-B32616; Ira W. Snyder
> Subject: [PATCH 0/7] fsl-dma: fixes for Freescale DMA driver
>=20
> From: "Ira W. Snyder" <iws@ovro.caltech.edu>
>=20
> Hello everyone,
>=20
> This is my alternative (simpler) attempt at solving the problems reported
> by Qiang Liu with the async_tx API and MD RAID hardware offload support
> when using the Freescale DMA driver.
>=20
> The bug is caused by this driver freeing descriptors before they have
> been ACKed by software using the async_tx API.
>=20
> I don't like Qiang Liu's code to check where the hardware is in the
> processing of the descriptor chain, and try to free a partial list of
> descriptors. This was a source of bugs in this driver before I fixed them
> several years ago.
It's a bug which you think the whole list is completed when an interrupt is=
raised, there is a potential risk when an interrupt is raised by "Programm=
ed Error". The "ld_running" is a s/w concept, we should not depend on it to=
judge the status of descriptors list.
I know you don't like this process, but it's a safe and common process. You=
can refer to other dma drivers, like ioap-adma, mv-xor and ibm-ppc440x-adm=
a.
Said far point, usb also take this method to judge which descriptor is comp=
leted, I don't know which device can use a s/w list to free all descriptors=
, you can refer to the implement of dl_reverse_done_list().
If you find any problem in my patch, please point out, or you can give a li=
nk about the bug you mentioned many years ago.
Thanks.
>=20
> Instead, the DMA controller raises an interrupt every time it has
> completed a descriptor chain. This means it is ready for new descriptors:
> no need to try and figure out where it is in the middle of a descriptor
> chain.
Attached again,
The interrupt is only report the state of hardware, we cannot assume all de=
scriptors are finished when an interrupt is raised.
>=20
> Qiang Liu: I do not have a hardware setup capable of using MD RAID.
> Please test these patches to see if they fix the bug you reported. You
> may use these patches as-is, or build upon them.
I hope we can discuss it based on my patch. If you think my patch will invo=
lve some issue, please point out. I'm willing to fix it.
Thanks.
>=20
> I have tested this using the drivers/dma/dmatest.c driver, as well as the
> CARMA drivers. There are no regressions that I can find.
>=20
> [ 355.069679] dma0chan3-copy0: terminating after 100000 tests, 0
> failures (status 0) [ 355.192278] dma0chan2-copy0: terminating after
> 100000 tests, 0 failures (status 0)
>=20
> Ira W. Snyder (5):
> fsl-dma: minimize locking overhead
> fsl-dma: add fsl_dma_free_descriptor() to reduce code duplication
> fsl-dma: move functions to avoid forward declarations
> fsl-dma: fix support for async_tx API
> carma: remove unnecessary DMA_INTERRUPT capability
>=20
> Qiang Liu (2):
> fsl-dma: remove attribute DMA_INTERRUPT of dmaengine
> fsl-dma: fix a warning of unitialized cookie
>=20
> drivers/dma/fsldma.c | 318 +++++++++++++++----------
> ------
> drivers/dma/fsldma.h | 1 +
> drivers/misc/carma/carma-fpga-program.c | 1 -
> drivers/misc/carma/carma-fpga.c | 3 +-
> 4 files changed, 159 insertions(+), 164 deletions(-)
>=20
> --
> 1.7.8.6
>=20
^ permalink raw reply
* RE: [PATCH v4 3/7] fsl-dma: change release process of dma descriptor for supporting async_tx
From: Liu Qiang-B32616 @ 2012-08-01 3:29 UTC (permalink / raw)
To: Ira W. Snyder
Cc: Li Yang-R58472, Vinod Koul, Phillips Kim-R1AAHA,
dan.j.williams@gmail.com, linux-crypto@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, davem@davemloft.net,
herbert@gondor.apana.org.au
In-Reply-To: <20120731221331.GH12430@ovro.caltech.edu>
> -----Original Message-----
> From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> Sent: Wednesday, August 01, 2012 6:14 AM
> To: Liu Qiang-B32616
> Cc: linux-crypto@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; Phillips
> Kim-R1AAHA; davem@davemloft.net; dan.j.williams@gmail.com; Vinod Koul; Li
> Yang-R58472; herbert@gondor.apana.org.au
> Subject: Re: [PATCH v4 3/7] fsl-dma: change release process of dma
> descriptor for supporting async_tx
>=20
> On Tue, Jul 31, 2012 at 04:09:28AM +0000, Liu Qiang-B32616 wrote:
> > > -----Original Message-----
> > > From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> > > Sent: Tuesday, July 31, 2012 5:10 AM
> > > To: Liu Qiang-B32616
> > > Cc: linux-crypto@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> Phillips
> > > Kim-R1AAHA; herbert@gondor.hengli.com.au; davem@davemloft.net; Dan
> > > Williams; Vinod Koul; Li Yang-R58472
> > > Subject: Re: [PATCH v4 3/7] fsl-dma: change release process of dma
> > > descriptor for supporting async_tx
> > >
> > > On Fri, Jul 27, 2012 at 05:16:09PM +0800, qiang.liu@freescale.com
> wrote:
> > > > From: Qiang Liu <qiang.liu@freescale.com>
> > > >
> > > > Fix the potential risk when enable config NET_DMA and ASYNC_TX.
> > > > Async_tx is lack of support in current release process of dma
> > > descriptor,
> > > > all descriptors will be released whatever is acked or no-acked by
> > > async_tx,
> > > > so there is a potential race condition when dma engine is uesd by
> > > others
> > > > clients (e.g. when enable NET_DMA to offload TCP).
> > > >
> > > > In our case, a race condition which is raised when use both of
> talitos
> > > > and dmaengine to offload xor is because napi scheduler will sync
> all
> > > > pending requests in dma channels, it affects the process of raid
> > > operations
> > > > due to ack_tx is not checked in fsl dma. The no-acked descriptor is
> > > freed
> > > > which is submitted just now, as a dependent tx, this freed
> descriptor
> > > trigger
> > > > BUG_ON(async_tx_test_ack(depend_tx)) in async_tx_submit().
> > > >
> > >
> > > I'm preparing an alternative version of this patch that I think is
> > > easier to understand (it is much shorter). I'll post it up here as
> soon
> > > as I finish testing.
> > Can you give a simple description/idea about your patch? My patch is
> for fix the
> > problems when I build a raid environment with talitos offload xor.
> > I think the new interface is clear enough and similar with the
> implement of other dma devices.
> >
> > And do you have any comments about this patch?
> >
>=20
> My patch will fix the same problem, in a simpler way. It will not
> involve checking if the hardware is finished with a descriptor on
> ld_running.
>=20
> > >
> > > It would be nice to know how to easily reproduce this bug, without
> > > needing to set up a RAID system. I don't have access to any such
> > > hardware. A driver similar to drivers/dma/dmatest.c (using the
> async_tx
> > > API instead) would be wonderful.
> > You can refer to raid5.c if you do not want to use hardware. Or you can
> use
> > you ram (or other storage devices) to build a raid env to test.
> > Thanks.
> >
> > >
> > > Thanks,
> > > Ira
> > >
> > > > TASK =3D ee1a94a0[1390] 'md0_raid5' THREAD: ecf40000 CPU: 0
> > > > GPR00: 00000001 ecf41ca0 ee44/921a94a0 0000003f 00000001 c00593e4
> > > 00000000 00000001
> > > > GPR08: 00000000 a7a7a7a7 00000001 045/920000002 42028042 100a38d4
> > > ed576d98 00000000
> > > > GPR16: ed5a11b0 00000000 2b162000 00000200 046/920000000 2d555000
> > > ed3015e8 c15a7aa0
> > > > GPR24: 00000000 c155fc40 00000000 ecb63220 ecf41d28 e47/92f640bb0
> > > ef640c30 ecf41ca0
> > > > NIP [c02b048c] async_tx_submit+0x6c/0x2b4
> > > > LR [c02b068c] async_tx_submit+0x26c/0x2b4
> > > > Call Trace:
> > > > [ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b448/92 (unreliable)
> > > > [ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c
> > > > [ecf41d20] [c0421064] async_copy_data+0xa0/0x17c
> > > > [ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10
> > > > [ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8
> > > > [ecf41e90] [c0429080] raid5d+0x3d4/0x5b4
> > > > [ecf41f40] [c04329b8] md_thread+0x138/0x16c
> > > > [ecf41f90] [c008277c] kthread+0x8c/0x90
> > > > [ecf41ff0] [c0011630] kernel_thread+0x4c/0x68
> > > >
> > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > Cc: Vinod Koul <vinod.koul@intel.com>
> > > > Cc: Li Yang <leoli@freescale.com>
> > > > Cc: Ira W. Snyder <iws@ovro.caltech.edu>
> > > > Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
> > > > ---
> > > > drivers/dma/fsldma.c | 242 +++++++++++++++++++++++++++++++++++---
> ----
> > > --------
> > > > drivers/dma/fsldma.h | 1 +
> > > > 2 files changed, 172 insertions(+), 71 deletions(-)
> > > >
> > > > diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
> > > > index 4f2f212..87f52c0 100644
> > > > --- a/drivers/dma/fsldma.c
> > > > +++ b/drivers/dma/fsldma.c
> > > > @@ -400,6 +400,125 @@ out_splice:
> > > > list_splice_tail_init(&desc->tx_list, &chan->ld_pending);
> > > > }
> > > >
> > > > +static void fsldma_cleanup_descriptor(struct fsldma_chan *chan);
> > > > +static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan);
> > > > +
>=20
> You should have re-arranged the patches to avoid introducing these
> forward declarations in this patch and then deleting them in the next
> patch. I reversed the order in my patch series.
I split it up according to Li Yang's advice. Maybe you miss the mail, pleas=
e refer to:
http://patchwork.ozlabs.org/patch/173605/
Thanks.
> > > > +/**
> > > > + * fsldma_clean_completed_descriptor - free all descriptors which
> > > > + * has been completed and acked
> > > > + * @chan: Freescale DMA channel
> > > > + *
> > > > + * This function is used on all completed and acked descriptors.
> > > > + * All descriptors should only be freed in this function.
> > > > + */
> > > > +static int
> > > > +fsldma_clean_completed_descriptor(struct fsldma_chan *chan)
> > > > +{
> > > > + struct fsl_desc_sw *desc, *_desc;
> > > > +
> > > > + /* Run the callback for each descriptor, in order */
> > > > + list_for_each_entry_safe(desc, _desc, &chan->ld_completed,
> node) {
> > > > +
> > > > + if (async_tx_test_ack(&desc->async_tx)) {
> > > > + /* Remove from the list of transactions */
> > > > + list_del(&desc->node);
> > > > +#ifdef FSL_DMA_LD_DEBUG
> > > > + chan_dbg(chan, "LD %p free\n", desc);
> > > > +#endif
> > > > + dma_pool_free(chan->desc_pool, desc,
> > > > + desc->async_tx.phys);
> > > > + }
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +/**
> > > > + * fsldma_run_tx_complete_actions - cleanup and free a single link
> > > descriptor
> > > > + * @chan: Freescale DMA channel
> > > > + * @desc: descriptor to cleanup and free
> > > > + * @cookie: Freescale DMA transaction identifier
> > > > + *
> > > > + * This function is used on a descriptor which has been executed
> by
> > > the DMA
> > > > + * controller. It will run any callbacks, submit any dependencies.
> > > > + */
> > > > +static dma_cookie_t fsldma_run_tx_complete_actions(struct
> fsl_desc_sw
> > > *desc,
> > > > + struct fsldma_chan *chan, dma_cookie_t cookie)
> > > > +{
> > > > + struct dma_async_tx_descriptor *txd =3D &desc->async_tx;
> > > > + struct device *dev =3D chan->common.device->dev;
> > > > + dma_addr_t src =3D get_desc_src(chan, desc);
> > > > + dma_addr_t dst =3D get_desc_dst(chan, desc);
> > > > + u32 len =3D get_desc_cnt(chan, desc);
> > > > +
> > > > + BUG_ON(txd->cookie < 0);
> > > > +
> > > > + if (txd->cookie > 0) {
> > > > + cookie =3D txd->cookie;
> > > > +
> > > > + /* Run the link descriptor callback function */
> > > > + if (txd->callback) {
> > > > +#ifdef FSL_DMA_LD_DEBUG
> > > > + chan_dbg(chan, "LD %p callback\n", desc);
> > > > +#endif
> > > > + txd->callback(txd->callback_param);
> > > > + }
> > > > +
> > > > + /* Unmap the dst buffer, if requested */
> > > > + if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
> > > > + if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
> > > > + dma_unmap_single(dev, dst, len,
> DMA_FROM_DEVICE);
> > > > + else
> > > > + dma_unmap_page(dev, dst, len,
> DMA_FROM_DEVICE);
> > > > + }
> > > > +
> > > > + /* Unmap the src buffer, if requested */
> > > > + if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
> > > > + if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
> > > > + dma_unmap_single(dev, src, len,
> DMA_TO_DEVICE);
> > > > + else
> > > > + dma_unmap_page(dev, src, len,
> DMA_TO_DEVICE);
> > > > + }
> > > > + }
> > > > +
> > > > + /* Run any dependencies */
> > > > + dma_run_dependencies(txd);
> > > > +
> > > > + return cookie;
> > > > +}
> > > > +
> > > > +/**
> > > > + * fsldma_clean_running_descriptor - move the completed descriptor
> > > from
> > > > + * ld_running to ld_completed
> > > > + * @chan: Freescale DMA channel
> > > > + * @desc: the descriptor which is completed
> > > > + *
> > > > + * Free the descriptor directly if acked by async_tx api, or move
> it
> > > to
> > > > + * queue ld_completed.
> > > > + */
> > > > +static int
> > > > +fsldma_clean_running_descriptor(struct fsldma_chan *chan,
> > > > + struct fsl_desc_sw *desc)
> > > > +{
> > > > + /* Remove from the list of transactions */
> > > > + list_del(&desc->node);
> > > > + /*
> > > > + * the client is allowed to attach dependent operations
> > > > + * until 'ack' is set
> > > > + */
> > > > + if (!async_tx_test_ack(&desc->async_tx)) {
> > > > + /*
> > > > + * Move this descriptor to the list of descriptors
> which is
> > > > + * completed, but still awaiting the 'ack' bit to be
> set.
> > > > + */
> > > > + list_add_tail(&desc->node, &chan->ld_completed);
> > > > + return 0;
> > > > + }
> > > > +
> > > > + dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
> > > > + return 0;
> > > > +}
> > > > +
> > > > static dma_cookie_t fsl_dma_tx_submit(struct
> dma_async_tx_descriptor
> > > *tx)
> > > > {
> > > > struct fsldma_chan *chan =3D to_fsl_chan(tx->chan);
> > > > @@ -534,8 +653,10 @@ static void fsl_dma_free_chan_resources(struct
> > > dma_chan *dchan)
> > > >
> > > > chan_dbg(chan, "free all channel resources\n");
> > > > spin_lock_irqsave(&chan->desc_lock, flags);
> > > > + fsldma_cleanup_descriptor(chan);
> > > > fsldma_free_desc_list(chan, &chan->ld_pending);
> > > > fsldma_free_desc_list(chan, &chan->ld_running);
> > > > + fsldma_free_desc_list(chan, &chan->ld_completed);
> > > > spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > >
> > > > dma_pool_destroy(chan->desc_pool);
> > > > @@ -819,46 +940,53 @@ static int fsl_dma_device_control(struct
> dma_chan
> > > *dchan,
> > > > * controller. It will run any callbacks, submit any dependencies,
> and
> > > then
> > > > * free the descriptor.
> > > > */
> > > > -static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
> > > > - struct fsl_desc_sw *desc)
> > > > +static void fsldma_cleanup_descriptor(struct fsldma_chan *chan)
> > > > {
> > > > - struct dma_async_tx_descriptor *txd =3D &desc->async_tx;
> > > > - struct device *dev =3D chan->common.device->dev;
> > > > - dma_addr_t src =3D get_desc_src(chan, desc);
> > > > - dma_addr_t dst =3D get_desc_dst(chan, desc);
> > > > - u32 len =3D get_desc_cnt(chan, desc);
> > > > + struct fsl_desc_sw *desc, *_desc;
> > > > + dma_cookie_t cookie =3D 0;
> > > > + dma_addr_t curr_phys =3D get_cdar(chan);
> > > > + int idle =3D dma_is_idle(chan);
> > > > + int seen_current =3D 0;
> > > >
> > > > - /* Run the link descriptor callback function */
> > > > - if (txd->callback) {
> > > > -#ifdef FSL_DMA_LD_DEBUG
> > > > - chan_dbg(chan, "LD %p callback\n", desc);
> > > > -#endif
> > > > - txd->callback(txd->callback_param);
> > > > - }
> > > > + fsldma_clean_completed_descriptor(chan);
> > > >
> > > > - /* Run any dependencies */
> > > > - dma_run_dependencies(txd);
> > > > + /* Run the callback for each descriptor, in order */
> > > > + list_for_each_entry_safe(desc, _desc, &chan->ld_running, node)
> {
> > > > + /*
> > > > + * do not advance past the current descriptor loaded
> into the
> > > > + * hardware channel, subsequent descriptors are either
> in
> > > > + * process or have not been submitted
> > > > + */
> > > > + if (seen_current)
> > > > + break;
> > > >
> > > > - /* Unmap the dst buffer, if requested */
> > > > - if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
> > > > - if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
> > > > - dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
> > > > - else
> > > > - dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
> > > > - }
> > > > + /*
> > > > + * stop the search if we reach the current descriptor
> and the
> > > > + * channel is busy
> > > > + */
> > > > + if (desc->async_tx.phys =3D=3D curr_phys) {
> > > > + seen_current =3D 1;
> > > > + if (!idle)
> > > > + break;
> > > > + }
>=20
> I really don't like the idea of trying to guess what the hardware is
> doing. Variables curr_phys and idle can be stale by the time the
> processor gets here. The DMA engine is very fast at processing
> descriptors.
I know hardware is very fast, but you cannot assume it has already been com=
pleted.
Below is the description about current list descriptor,
Current List Descriptor Address Registers (CLSDARn and ECLSDARn)
After finishing the last link descriptor in the current list, the DMA contr=
oller loads
the contents of the next list descriptor address register into the current =
list descriptor address register. If
NLSDARn[EOLSD] in the next list descriptor address register is clear, the D=
MA controller reads the new
current list descriptor from memory to process that list. If EOLSD in the n=
ext list descriptor address
register is set and the last link in the current list is finished, all DMA =
transfers are complete.
So it's a bug which you think the whole list is completed when an interrupt=
is raised, there is a potential
risk when an interrupt is raised by "Programmed Error". The "ld_running" is=
a s/w concept, we should not
depend on it to judge the status of descriptors list.
I know you don't like this process, but it's a safe and common process. You=
can refer to other dma
drivers, like ioap-adma, mv-xor and ibm-ppc440x-adma.
Said far point, usb also take this method to judge which descriptor is comp=
leted, I don't know which device
can use a s/w list to free all descriptors, you can refer to the implement =
of dl_reverse_done_list().
Thanks.
>=20
> It is much easier to reason about the hardware state if you have it tell
> you (via an interrupt) when it is ready for more descriptors. My patch
> takes this approach. I'll be posting it in a few minutes.
No, the interrupt is only report the state of hardware, we cannot assume al=
l descriptors
are finished when an interrupt is raised :)
>=20
> > > > +
> > > > + cookie =3D fsldma_run_tx_complete_actions(desc, chan,
> cookie);
> > > > +
> > > > + if (fsldma_clean_running_descriptor(chan, desc))
> > > > + break;
> > > >
> > > > - /* Unmap the src buffer, if requested */
> > > > - if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
> > > > - if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
> > > > - dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
> > > > - else
> > > > - dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
> > > > }
> > > >
> > > > -#ifdef FSL_DMA_LD_DEBUG
> > > > - chan_dbg(chan, "LD %p free\n", desc);
> > > > -#endif
> > > > - dma_pool_free(chan->desc_pool, desc, txd->phys);
> > > > + /*
> > > > + * Start any pending transactions automatically
> > > > + *
> > > > + * In the ideal case, we keep the DMA controller busy while
> we go
> > > > + * ahead and free the descriptors below.
> > > > + */
> > > > + fsl_chan_xfer_ld_queue(chan);
> > > > +
> > > > + if (cookie > 0)
> > > > + chan->common.completed_cookie =3D cookie;
> > > > }
> > > >
> > > > /**
> > > > @@ -954,11 +1082,15 @@ static enum dma_status fsl_tx_status(struct
> > > dma_chan *dchan,
> > > > enum dma_status ret;
> > > > unsigned long flags;
> > > >
> > > > - spin_lock_irqsave(&chan->desc_lock, flags);
> > > > ret =3D dma_cookie_status(dchan, cookie, txstate);
> > > > + if (ret =3D=3D DMA_SUCCESS)
> > > > + return ret;
> > > > +
> > > > + spin_lock_irqsave(&chan->desc_lock, flags);
> > > > + fsldma_cleanup_descriptor(chan);
> > > > spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > >
> > > > - return ret;
> > > > + return dma_cookie_status(dchan, cookie, txstate);
> > > > }
> > > >
> > > > /*----------------------------------------------------------------
> ----
> > > --------*/
> > > > @@ -1035,52 +1167,19 @@ static irqreturn_t fsldma_chan_irq(int irq,
> > > void *data)
> > > > static void dma_do_tasklet(unsigned long data)
> > > > {
> > > > struct fsldma_chan *chan =3D (struct fsldma_chan *)data;
> > > > - struct fsl_desc_sw *desc, *_desc;
> > > > - LIST_HEAD(ld_cleanup);
> > > > unsigned long flags;
> > > >
> > > > chan_dbg(chan, "tasklet entry\n");
> > > >
> > > > spin_lock_irqsave(&chan->desc_lock, flags);
> > > >
> > > > - /* update the cookie if we have some descriptors to cleanup
> */
> > > > - if (!list_empty(&chan->ld_running)) {
> > > > - dma_cookie_t cookie;
> > > > -
> > > > - desc =3D to_fsl_desc(chan->ld_running.prev);
> > > > - cookie =3D desc->async_tx.cookie;
> > > > - dma_cookie_complete(&desc->async_tx);
> > > > -
> > > > - chan_dbg(chan, "completed_cookie=3D%d\n", cookie);
> > > > - }
> > > > -
> > > > - /*
> > > > - * move the descriptors to a temporary list so we can drop
> the lock
> > > > - * during the entire cleanup operation
> > > > - */
> > > > - list_splice_tail_init(&chan->ld_running, &ld_cleanup);
> > > > -
> > > > /* the hardware is now idle and ready for more */
> > > > chan->idle =3D true;
> > > >
> > > > - /*
> > > > - * Start any pending transactions automatically
> > > > - *
> > > > - * In the ideal case, we keep the DMA controller busy while
> we go
> > > > - * ahead and free the descriptors below.
> > > > - */
> > > > - fsl_chan_xfer_ld_queue(chan);
> > > > - spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > > -
> > > > - /* Run the callback for each descriptor, in order */
> > > > - list_for_each_entry_safe(desc, _desc, &ld_cleanup, node) {
> > > > + /* Run all cleanup for this descriptor */
> > > > + fsldma_cleanup_descriptor(chan);
> > > >
> > > > - /* Remove from the list of transactions */
> > > > - list_del(&desc->node);
> > > > -
> > > > - /* Run all cleanup for this descriptor */
> > > > - fsldma_cleanup_descriptor(chan, desc);
> > > > - }
> > > > + spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > >
> > > > chan_dbg(chan, "tasklet exit\n");
> > > > }
> > > > @@ -1262,6 +1361,7 @@ static int __devinit
> fsl_dma_chan_probe(struct
> > > fsldma_device *fdev,
> > > > spin_lock_init(&chan->desc_lock);
> > > > INIT_LIST_HEAD(&chan->ld_pending);
> > > > INIT_LIST_HEAD(&chan->ld_running);
> > > > + INIT_LIST_HEAD(&chan->ld_completed);
> > > > chan->idle =3D true;
> > > >
> > > > chan->common.device =3D &fdev->common;
> > > > diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
> > > > index f5c3879..7ede908 100644
> > > > --- a/drivers/dma/fsldma.h
> > > > +++ b/drivers/dma/fsldma.h
> > > > @@ -140,6 +140,7 @@ struct fsldma_chan {
> > > > spinlock_t desc_lock; /* Descriptor operation lock */
> > > > struct list_head ld_pending; /* Link descriptors queue */
> > > > struct list_head ld_running; /* Link descriptors queue */
> > > > + struct list_head ld_completed; /* Link descriptors queue
> */
> > > > struct dma_chan common; /* DMA common channel */
> > > > struct dma_pool *desc_pool; /* Descriptors pool */
> > > > struct device *dev; /* Channel device */
> > > > --
> > > > 1.7.5.1
> > > >
> > > >
> >
> >
^ permalink raw reply
* Re: [RFC PATCH v5 12/19] memory-hotplug: introduce new function arch_remove_memory()
From: jencce zhou @ 2012-08-01 2:44 UTC (permalink / raw)
To: Wen Congyang
Cc: linux-s390, linux-ia64, linux-acpi, len.brown, linux-sh,
linux-kernel, cmetcalf, linux-mm, Yasuaki ISIMATU, paulus,
minchan.kim, kosaki.motohiro, rientjes, cl, linuxppc-dev, akpm,
liuj97
In-Reply-To: <50126E2F.8010301@cn.fujitsu.com>
2012/7/27 Wen Congyang <wency@cn.fujitsu.com>:
> We don't call __add_pages() directly in the function add_memory()
> because some other architecture related things need to be done
> before or after calling __add_pages(). So we should introduce
> a new function arch_remove_memory() to revert the things
> done in arch_add_memory().
>
> Note: the function for s390 is not implemented(I don't know how to
> implement it for s390).
>
> CC: David Rientjes <rientjes@google.com>
> CC: Jiang Liu <liuj97@gmail.com>
> CC: Len Brown <len.brown@intel.com>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Paul Mackerras <paulus@samba.org>
> CC: Christoph Lameter <cl@linux.com>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> CC: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
> arch/ia64/mm/init.c | 16 ++++
> arch/powerpc/mm/mem.c | 14 +++
> arch/s390/mm/init.c | 8 ++
> arch/sh/mm/init.c | 15 +++
> arch/tile/mm/init.c | 8 ++
> arch/x86/include/asm/pgtable_types.h | 1 +
> arch/x86/mm/init_32.c | 10 ++
> arch/x86/mm/init_64.c | 160 ++++++++++++++++++++++++++++++++++
> arch/x86/mm/pageattr.c | 47 +++++-----
> include/linux/memory_hotplug.h | 1 +
> mm/memory_hotplug.c | 1 +
> 11 files changed, 259 insertions(+), 22 deletions(-)
>
> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
> index 0eab454..1e345ed 100644
> --- a/arch/ia64/mm/init.c
> +++ b/arch/ia64/mm/init.c
> @@ -688,6 +688,22 @@ int arch_add_memory(int nid, u64 start, u64 size)
>
> return ret;
> }
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +int arch_remove_memory(u64 start, u64 size)
> +{
> + unsigned long start_pfn = start >> PAGE_SHIFT;
> + unsigned long nr_pages = size >> PAGE_SHIFT;
> + int ret;
> +
> + ret = __remove_pages(start_pfn, nr_pages);
> + if (ret)
> + pr_warn("%s: Problem encountered in __remove_pages() as"
> + " ret=%d\n", __func__, ret);
> +
> + return ret;
> +}
> +#endif
> #endif
>
in 3.5 ia64 implementation did not call __remove_pages at all. so why?
> /*
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index baaafde..249cef4 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -133,6 +133,20 @@ int arch_add_memory(int nid, u64 start, u64 size)
>
> return __add_pages(nid, zone, start_pfn, nr_pages);
> }
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +int arch_remove_memory(u64 start, u64 size)
> +{
> + unsigned long start_pfn = start >> PAGE_SHIFT;
> + unsigned long nr_pages = size >> PAGE_SHIFT;
> +
> + start = (unsigned long)__va(start);
> + if (remove_section_mapping(start, start + size))
> + return -EINVAL;
> +
> + return __remove_pages(start_pfn, nr_pages);
> +}
> +#endif
> #endif /* CONFIG_MEMORY_HOTPLUG */
>
> /*
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index 6adbc08..ca4bc46 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -257,4 +257,12 @@ int arch_add_memory(int nid, u64 start, u64 size)
> vmem_remove_mapping(start, size);
> return rc;
> }
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +int arch_remove_memory(u64 start, u64 size)
> +{
> + /* TODO */
> + return -EBUSY;
> +}
> +#endif
> #endif /* CONFIG_MEMORY_HOTPLUG */
> diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
> index 82cc576..fc84491 100644
> --- a/arch/sh/mm/init.c
> +++ b/arch/sh/mm/init.c
> @@ -558,4 +558,19 @@ int memory_add_physaddr_to_nid(u64 addr)
> EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
> #endif
>
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +int arch_remove_memory(u64 start, u64 size)
> +{
> + unsigned long start_pfn = start >> PAGE_SHIFT;
> + unsigned long nr_pages = size >> PAGE_SHIFT;
> + int ret;
> +
> + ret = __remove_pages(start_pfn, nr_pages);
> + if (unlikely(ret))
> + pr_warn("%s: Failed, __remove_pages() == %d\n", __func__,
> + ret);
> +
> + return ret;
> +}
> +#endif
> #endif /* CONFIG_MEMORY_HOTPLUG */
> diff --git a/arch/tile/mm/init.c b/arch/tile/mm/init.c
> index ef29d6c..2749515 100644
> --- a/arch/tile/mm/init.c
> +++ b/arch/tile/mm/init.c
> @@ -935,6 +935,14 @@ int remove_memory(u64 start, u64 size)
> {
> return -EINVAL;
> }
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +int arch_remove_memory(u64 start, u64 size)
> +{
> + /* TODO */
> + return -EBUSY;
> +}
> +#endif
> #endif
>
> struct kmem_cache *pgd_cache;
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 013286a..b725af2 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -334,6 +334,7 @@ static inline void update_page_count(int level, unsigned long pages) { }
> * as a pte too.
> */
> extern pte_t *lookup_address(unsigned long address, unsigned int *level);
> +extern int __split_large_page(pte_t *kpte, unsigned long address, pte_t *pbase);
>
> #endif /* !__ASSEMBLY__ */
>
> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
> index 575d86f..a690153 100644
> --- a/arch/x86/mm/init_32.c
> +++ b/arch/x86/mm/init_32.c
> @@ -842,6 +842,16 @@ int arch_add_memory(int nid, u64 start, u64 size)
>
> return __add_pages(nid, zone, start_pfn, nr_pages);
> }
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +int arch_remove_memory(unsigned long start, unsigned long size)
> +{
> + unsigned long start_pfn = start >> PAGE_SHIFT;
> + unsigned long nr_pages = size >> PAGE_SHIFT;
> +
> + return __remove_pages(start_pfn, nr_pages);
> +}
> +#endif
> #endif
>
> /*
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 2b6b4a3..f1554a9 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -675,6 +675,166 @@ int arch_add_memory(int nid, u64 start, u64 size)
> }
> EXPORT_SYMBOL_GPL(arch_add_memory);
>
> +static void __meminit
> +phys_pte_remove(pte_t *pte_page, unsigned long addr, unsigned long end)
> +{
> + unsigned pages = 0;
> + int i = pte_index(addr);
> +
> + pte_t *pte = pte_page + pte_index(addr);
> +
> + for (; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE, pte++) {
> +
> + if (addr >= end)
> + break;
> +
> + if (!pte_present(*pte))
> + continue;
> +
> + pages++;
> + set_pte(pte, __pte(0));
> + }
> +
> + update_page_count(PG_LEVEL_4K, -pages);
> +}
> +
> +static void __meminit
> +phys_pmd_remove(pmd_t *pmd_page, unsigned long addr, unsigned long end)
> +{
> + unsigned long pages = 0, next;
> + int i = pmd_index(addr);
> +
> + for (; i < PTRS_PER_PMD; i++, addr = next) {
> + unsigned long pte_phys;
> + pmd_t *pmd = pmd_page + pmd_index(addr);
> + pte_t *pte;
> +
> + if (addr >= end)
> + break;
> +
> + next = (addr & PMD_MASK) + PMD_SIZE;
> +
> + if (!pmd_present(*pmd))
> + continue;
> +
> + if (pmd_large(*pmd)) {
> + if ((addr & ~PMD_MASK) == 0 && next <= end) {
> + set_pmd(pmd, __pmd(0));
> + pages++;
> + continue;
> + }
> +
> + /*
> + * We use 2M page, but we need to remove part of them,
> + * so split 2M page to 4K page.
> + */
> + pte = alloc_low_page(&pte_phys);
> + __split_large_page((pte_t *)pmd, addr, pte);
> +
> + spin_lock(&init_mm.page_table_lock);
> + pmd_populate_kernel(&init_mm, pmd, __va(pte_phys));
> + spin_unlock(&init_mm.page_table_lock);
> + }
> +
> + spin_lock(&init_mm.page_table_lock);
> + pte = map_low_page((pte_t *)pmd_page_vaddr(*pmd));
> + phys_pte_remove(pte, addr, end);
> + unmap_low_page(pte);
> + spin_unlock(&init_mm.page_table_lock);
> + }
> + update_page_count(PG_LEVEL_2M, -pages);
> +}
> +
> +static void __meminit
> +phys_pud_remove(pud_t *pud_page, unsigned long addr, unsigned long end)
> +{
> + unsigned long pages = 0, next;
> + int i = pud_index(addr);
> +
> + for (; i < PTRS_PER_PUD; i++, addr = next) {
> + unsigned long pmd_phys;
> + pud_t *pud = pud_page + pud_index(addr);
> + pmd_t *pmd;
> +
> + if (addr >= end)
> + break;
> +
> + next = (addr & PUD_MASK) + PUD_SIZE;
> +
> + if (!pud_present(*pud))
> + continue;
> +
> + if (pud_large(*pud)) {
> + if ((addr & ~PUD_MASK) == 0 && next <= end) {
> + set_pud(pud, __pud(0));
> + pages++;
> + continue;
> + }
> +
> + /*
> + * We use 1G page, but we need to remove part of them,
> + * so split 1G page to 2M page.
> + */
> + pmd = alloc_low_page(&pmd_phys);
> + __split_large_page((pte_t *)pud, addr, (pte_t *)pmd);
> +
> + spin_lock(&init_mm.page_table_lock);
> + pud_populate(&init_mm, pud, __va(pmd_phys));
> + spin_unlock(&init_mm.page_table_lock);
> + }
> +
> + pmd = map_low_page(pmd_offset(pud, 0));
> + phys_pmd_remove(pmd, addr, end);
> + unmap_low_page(pmd);
> + __flush_tlb_all();
> + }
> + __flush_tlb_all();
> +
> + update_page_count(PG_LEVEL_1G, -pages);
> +}
> +
> +void __meminit
> +kernel_physical_mapping_remove(unsigned long start, unsigned long end)
> +{
> + unsigned long next;
> +
> + start = (unsigned long)__va(start);
> + end = (unsigned long)__va(end);
> +
> + for (; start < end; start = next) {
> + pgd_t *pgd = pgd_offset_k(start);
> + pud_t *pud;
> +
> + next = (start + PGDIR_SIZE) & PGDIR_MASK;
> + if (next > end)
> + next = end;
> +
> + if (!pgd_present(*pgd))
> + continue;
> +
> + pud = map_low_page((pud_t *)pgd_page_vaddr(*pgd));
> + phys_pud_remove(pud, __pa(start), __pa(end));
> + unmap_low_page(pud);
> + }
> +
> + __flush_tlb_all();
> +}
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +int __ref arch_remove_memory(unsigned long start, unsigned long size)
> +{
> + unsigned long start_pfn = start >> PAGE_SHIFT;
> + unsigned long nr_pages = size >> PAGE_SHIFT;
> + int ret;
> +
> + ret = __remove_pages(start_pfn, nr_pages);
> + WARN_ON_ONCE(ret);
> +
> + kernel_physical_mapping_remove(start, start + size);
> +
> + return ret;
> +}
> +#endif
> #endif /* CONFIG_MEMORY_HOTPLUG */
>
> static struct kcore_list kcore_vsyscall;
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index 931930a..c22963d 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -501,21 +501,13 @@ out_unlock:
> return do_split;
> }
>
> -static int split_large_page(pte_t *kpte, unsigned long address)
> +int __split_large_page(pte_t *kpte, unsigned long address, pte_t *pbase)
> {
> unsigned long pfn, pfninc = 1;
> unsigned int i, level;
> - pte_t *pbase, *tmp;
> + pte_t *tmp;
> pgprot_t ref_prot;
> - struct page *base;
> -
> - if (!debug_pagealloc)
> - spin_unlock(&cpa_lock);
> - base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
> - if (!debug_pagealloc)
> - spin_lock(&cpa_lock);
> - if (!base)
> - return -ENOMEM;
> + struct page *base = virt_to_page(pbase);
>
> spin_lock(&pgd_lock);
> /*
> @@ -523,10 +515,11 @@ static int split_large_page(pte_t *kpte, unsigned long address)
> * up for us already:
> */
> tmp = lookup_address(address, &level);
> - if (tmp != kpte)
> - goto out_unlock;
> + if (tmp != kpte) {
> + spin_unlock(&pgd_lock);
> + return 1;
> + }
>
> - pbase = (pte_t *)page_address(base);
> paravirt_alloc_pte(&init_mm, page_to_pfn(base));
> ref_prot = pte_pgprot(pte_clrhuge(*kpte));
> /*
> @@ -579,17 +572,27 @@ static int split_large_page(pte_t *kpte, unsigned long address)
> * going on.
> */
> __flush_tlb_all();
> + spin_unlock(&pgd_lock);
>
> - base = NULL;
> + return 0;
> +}
>
> -out_unlock:
> - /*
> - * If we dropped out via the lookup_address check under
> - * pgd_lock then stick the page back into the pool:
> - */
> - if (base)
> +static int split_large_page(pte_t *kpte, unsigned long address)
> +{
> + pte_t *pbase;
> + struct page *base;
> +
> + if (!debug_pagealloc)
> + spin_unlock(&cpa_lock);
> + base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
> + if (!debug_pagealloc)
> + spin_lock(&cpa_lock);
> + if (!base)
> + return -ENOMEM;
> +
> + pbase = (pte_t *)page_address(base);
> + if (__split_large_page(kpte, address, pbase))
> __free_page(base);
> - spin_unlock(&pgd_lock);
>
> return 0;
> }
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 8bf820d..0d500be 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -85,6 +85,7 @@ extern void __online_page_free(struct page *page);
>
> #ifdef CONFIG_MEMORY_HOTREMOVE
> extern bool is_pageblock_removable_nolock(struct page *page);
> +extern int arch_remove_memory(unsigned long start, unsigned long size);
> #endif /* CONFIG_MEMORY_HOTREMOVE */
>
> /* reasonably generic interface to expand the physical pages in a zone */
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a9e1579..0c932e1 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1071,6 +1071,7 @@ int __ref remove_memory(int nid, u64 start, u64 size)
line 1071? which version does this patch base on? thanks a lot.
> /* remove memmap entry */
> firmware_map_remove(start, start + size, "System RAM");
>
> + arch_remove_memory(start, size);
> out:
> unlock_memory_hotplug();
> return ret;
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply
* RE: [PATCH V3 1/5] powerpc/fsl-pci: Unify pci/pcie initialization code
From: Jia Hongtao-B38951 @ 2012-08-01 2:24 UTC (permalink / raw)
To: Kumar Gala, Li Yang-R58472
Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org, Li Yang-R58472
In-Reply-To: <0080192C-3C13-40DA-BE8A-4FCE67F79E04@kernel.crashing.org>
> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Tuesday, July 31, 2012 9:38 PM
> To: Li Yang-R58472
> Cc: Jia Hongtao-B38951; Wood Scott-B07421; linuxppc-dev@lists.ozlabs.org;
> Li Yang-R58472
> Subject: Re: [PATCH V3 1/5] powerpc/fsl-pci: Unify pci/pcie
> initialization code
>=20
>=20
> On Jul 31, 2012, at 2:21 AM, Li Yang wrote:
>=20
> > On Mon, Jul 30, 2012 at 10:46 PM, Kumar Gala <galak@kernel.crashing.org=
>
> wrote:
> >>
> >> On Jul 30, 2012, at 3:26 AM, Jia Hongtao-B38951 wrote:
> >>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> >>>> Sent: Saturday, July 28, 2012 5:17 AM
> >>>> To: Wood Scott-B07421
> >>>> Cc: Jia Hongtao-B38951; linuxppc-dev@lists.ozlabs.org; Wood Scott-
> B07421;
> >>>> Li Yang-R58472
> >>>> Subject: Re: [PATCH V3 1/5] powerpc/fsl-pci: Unify pci/pcie
> >>>> initialization code
> >>>>
> >>>>
> >>>> On Jul 27, 2012, at 3:24 PM, Scott Wood wrote:
> >>>>
> >>>>> On 07/27/2012 05:10 AM, Jia Hongtao-B38951 wrote:
> >>>>>> Hi kumar,
> >>>>>>
> >>>>>> I know "duplicate code from pci_process_bridge_OF_ranges()" is
> >>>>>> hard to accept but "refactor the code to have a shared function"
> >>>>>> is knotty. Actually this is the reason I didn't do the refactor.
> >>>>>
> >>>>> Maybe we should keep doing the init early? We could still have a
> >>>>> platform device for the PM stuff, but some init would be done
> before
> >>>> probe.
> >>>>>
> >>>>> Another possibility is to try to handle swiotlb init later --
> possibly
> >>>>> by reserving memory for it if the platform indicates it's a
> possibility
> >>>>> that it will be needed, then freeing the memory if it's not needed.
> >>>>>
> >>>>> -Scott
> >>>>
> >>>> I think the first option seems reasonable. Can we leave
> fsl_pci_init()
> >>>> as we now have it and just have the platform driver deal with PM
> restore
> >>>> via calling setup_pci_atmu() [probably need to update setup_pci_atmu
> to
> >>>> handle restore case, but seems like minor changes]
> >>>>
> >>>> - k
> >>>>
> >>>
> >>>
> >>> I think the second option is better if it's hard to decouple swiotlb
> >>> determination from pci init. I believe the better architecture that
> >>> PCI init in probe function of platform driver will bring us
> considerable
> >>> advantage. I really like to keep the completion of pci controller
> >>> platform driver not only for PM support but also for pci init.
> >>>
> >>> -Hongtao.
> >>>
> >>
> >> Shifting of swiotlb init has a lot more issues. Why do we need to do
> the PCI init in probe?
> >
> > The ordering issues are introduced by swiotlb. And the ideal way is
> > to solve the problem within swiotlb instead of changing PCI to
> > workaround it. Take the implementation of x86 as reference it's
> > possible to be addressed bu allocating first and free later approach.
> >
> > It is common sense that the initialization of a device is in the probe
> > function of the driver of the device. And the change will provide
> > better unification of PCI controller code. The PCI controller is
> > generic enough not to be taken care of at the platform area.
> >
> > Leo
>=20
> Than lets look at going with that approach.. Be careful with impact on
> other users of swiotlb on PPC, I believe one 44x board uses swiotlb.
>=20
> - k
I will be careful with this approach.
I have already noticed 44x. Thank you all the same.
-Hongtao.
^ permalink raw reply
* Re: 3.5+: yaboot, Invalid memory access
From: Tony Breeds @ 2012-08-01 2:02 UTC (permalink / raw)
To: Christian Kujau; +Cc: linuxppc-dev
In-Reply-To: <alpine.DEB.2.01.1207302235170.26923@trent.utfs.org>
[-- Attachment #1: Type: text/plain, Size: 945 bytes --]
On Mon, Jul 30, 2012 at 10:46:39PM -0700, Christian Kujau wrote:
> Hi,
>
> when trying to upgrade from 3.5 (final) to today's git checkout from
> Linus' tree, yaboot cannot boot and the following is printed:
>
> [...]
> returning from prom_init
> Invalid memory access at %SRR0: 00c62fd4 %SRR1: 00003030
>
> The whole message: http://nerdbynature.de/bits/3.5.0/yaboot/
>
> The Yaboot version is 1.3.16 (from Debian/testing), I haven't tried 1.3.17
> yet, its changelog does not say anything about "newer kernel support" so
> I'm not sure if this would help here.
Well the "returning from prom_init" message is from the kernel so yaboot
has done something. We'll work out if it was the right thing ;P
I had a look at the URL you provided.
1) can you also upload you vmlinux so we can poke at it.
2) What is the dmseg.txt file there? If you're unable to boot 3.5 how
did you get the dmesg?
Yours Tony
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply
* Re: [RFC PATCH v5 12/19] memory-hotplug: introduce new function arch_remove_memory()
From: Wen Congyang @ 2012-08-01 1:42 UTC (permalink / raw)
To: gerald.schaefer
Cc: linux-s390, linux-ia64, len.brown, linux-acpi, linux-sh,
Heiko Carstens, linux-kernel, cmetcalf, linux-mm, Yasuaki ISIMATU,
paulus, minchan.kim, kosaki.motohiro, rientjes, cl, linuxppc-dev,
akpm, liuj97
In-Reply-To: <20120731144000.33fd4a0a@thinkpad>
At 07/31/2012 08:40 PM, Gerald Schaefer Wrote:
> On Mon, 30 Jul 2012 18:35:37 +0800
> Wen Congyang <wency@cn.fujitsu.com> wrote:
>=20
>> At 07/30/2012 06:23 PM, Heiko Carstens Wrote:
>>> On Fri, Jul 27, 2012 at 06:32:15PM +0800, Wen Congyang wrote:
>>>> We don't call =5F=5Fadd=5Fpages() directly in the function add=5Fmemor=
y()
>>>> because some other architecture related things need to be done
>>>> before or after calling =5F=5Fadd=5Fpages(). So we should introduce
>>>> a new function arch=5Fremove=5Fmemory() to revert the things
>>>> done in arch=5Fadd=5Fmemory().
>>>>
>>>> Note: the function for s390 is not implemented(I don't know how to
>>>> implement it for s390).
>>>
>>> There is no hardware or firmware interface which could trigger a
>>> hot memory remove on s390. So there is nothing that needs to be
>>> implemented.
>>
>> Thanks for providing this information.
>>
>> According to this, arch=5Fremove=5Fmemory() for s390 can just return
>> -EBUSY.
>=20
> Yes, but there is a prototype mismatch for arch=5Fremove=5Fmemory() on s3=
90
> and also other architectures (u64 vs. unsigned long).
>=20
> arch/s390/mm/init.c:262: error: conflicting types for
> =E2=80=98arch=5Fremove=5Fmemory=E2=80=99 include/linux/memory=5Fhotplug.h=
:88: error: previous
> declaration of =E2=80=98arch=5Fremove=5Fmemory=E2=80=99 was here
>=20
> In memory=5Fhotplug.h you have:
> extern int arch=5Fremove=5Fmemory(unsigned long start, unsigned long size=
);
>=20
> On all archs other than x86 you have:
> int arch=5Fremove=5Fmemory(u64 start, u64 size)
Thanks for pointing it out. I will fix it.
Wen Congyang
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>=20
=
^ permalink raw reply
* [PATCH 7/7] carma: remove unnecessary DMA_INTERRUPT capability
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto; +Cc: B32616, linuxppc-dev, Ira W. Snyder
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
From: "Ira W. Snyder" <iws@ovro.caltech.edu>
These drivers set the DMA_INTERRUPT capability bit when requesting a DMA
controller channel. This was historical, and is no longer needed.
Recent changes to the drivers/dma/fsldma.c driver have removed support
for this flag. This makes the carma drivers unable to find a DMA channel
with the required capabilities.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/misc/carma/carma-fpga-program.c | 1 -
drivers/misc/carma/carma-fpga.c | 3 +--
2 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/drivers/misc/carma/carma-fpga-program.c b/drivers/misc/carma/carma-fpga-program.c
index a2d25e4..eaddfe9 100644
--- a/drivers/misc/carma/carma-fpga-program.c
+++ b/drivers/misc/carma/carma-fpga-program.c
@@ -978,7 +978,6 @@ static int fpga_of_probe(struct platform_device *op)
dev_set_drvdata(priv->dev, priv);
dma_cap_zero(mask);
dma_cap_set(DMA_MEMCPY, mask);
- dma_cap_set(DMA_INTERRUPT, mask);
dma_cap_set(DMA_SLAVE, mask);
dma_cap_set(DMA_SG, mask);
diff --git a/drivers/misc/carma/carma-fpga.c b/drivers/misc/carma/carma-fpga.c
index 8c279da..861b298 100644
--- a/drivers/misc/carma/carma-fpga.c
+++ b/drivers/misc/carma/carma-fpga.c
@@ -666,7 +666,7 @@ static int data_submit_dma(struct fpga_device *priv, struct data_buf *buf)
src = SYS_FPGA_BLOCK;
tx = chan->device->device_prep_dma_memcpy(chan, dst, src,
REG_BLOCK_SIZE,
- DMA_PREP_INTERRUPT);
+ 0);
if (!tx) {
dev_err(priv->dev, "unable to prep SYS-FPGA DMA\n");
return -ENOMEM;
@@ -1333,7 +1333,6 @@ static int data_of_probe(struct platform_device *op)
dma_cap_zero(mask);
dma_cap_set(DMA_MEMCPY, mask);
- dma_cap_set(DMA_INTERRUPT, mask);
dma_cap_set(DMA_SLAVE, mask);
dma_cap_set(DMA_SG, mask);
--
1.7.8.6
^ permalink raw reply related
* [PATCH 6/7] fsl-dma: fix a warning of unitialized cookie
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto; +Cc: Vinod Koul, B32616, Qiang Liu, Dan Williams, linuxppc-dev
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
From: Qiang Liu <qiang.liu@freescale.com>
Fix a warning of unitialized value when compile with -Wuninitialized.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
Cc: Kim Phillips <kim.phillips@freescale.com>
---
drivers/dma/fsldma.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 380c1b7..8588cf7 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -523,7 +523,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
struct fsl_desc_sw *desc = tx_to_fsl_desc(tx);
struct fsl_desc_sw *child;
- dma_cookie_t cookie;
+ dma_cookie_t cookie = 0;
spin_lock_irq(&chan->desc_lock);
--
1.7.8.6
^ permalink raw reply related
* [PATCH 5/7] fsl-dma: fix support for async_tx API
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto
Cc: Ira W. Snyder, Vinod Koul, B32616, Qiang Liu, Dan Williams,
linuxppc-dev
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
From: "Ira W. Snyder" <iws@ovro.caltech.edu>
The current fsldma driver does not support the async_tx API. This is due
to a bug: all descriptors are released when the hardware is finished
with them. The async_tx API requires that the ACK bit is set by software
before descriptors are freed.
This bug is easiest to reproduce by enabling both CONFIG_NET_DMA and
CONFIG_ASYNC_TX, and using the hardware offload support in MD RAID. The
network stack will force operations on shared DMA channels, and will
free descriptors which are being used by the MD RAID code.
The BUG_ON(async_tx_test_ack(depend_tx)) test in async_tx_submit() will
be hit, and panic the machine.
TASK = ee1a94a0[1390] 'md0_raid5' THREAD: ecf40000 CPU: 0
GPR00: 00000001 ecf41ca0 ee44/921a94a0 0000003f 00000001 c00593e4 00000000 00000001
GPR08: 00000000 a7a7a7a7 00000001 045/920000002 42028042 100a38d4 ed576d98 00000000
GPR16: ed5a11b0 00000000 2b162000 00000200 046/920000000 2d555000 ed3015e8 c15a7aa0
GPR24: 00000000 c155fc40 00000000 ecb63220 ecf41d28 e47/92f640bb0 ef640c30 ecf41ca0
NIP [c02b048c] async_tx_submit+0x6c/0x2b4
LR [c02b068c] async_tx_submit+0x26c/0x2b4
Call Trace:
[ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b448/92 (unreliable)
[ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c
[ecf41d20] [c0421064] async_copy_data+0xa0/0x17c
[ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10
[ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8
[ecf41e90] [c0429080] raid5d+0x3d4/0x5b4
[ecf41f40] [c04329b8] md_thread+0x138/0x16c
[ecf41f90] [c008277c] kthread+0x8c/0x90
[ecf41ff0] [c0011630] kernel_thread+0x4c/0x68
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Qiang Liu <qiang.liu@freescale.com>
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/dma/fsldma.c | 156 +++++++++++++++++++++++++++++++-------------------
drivers/dma/fsldma.h | 1 +
2 files changed, 97 insertions(+), 60 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 80edc63..380c1b7 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -410,16 +410,15 @@ static void fsl_dma_free_descriptor(struct fsldma_chan *chan, struct fsl_desc_sw
}
/**
- * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
+ * fsldma_run_tx_complete_actions - run callback and unmap descriptor
* @chan: Freescale DMA channel
* @desc: descriptor to cleanup and free
*
* This function is used on a descriptor which has been executed by the DMA
- * controller. It will run any callbacks, submit any dependencies, and then
- * free the descriptor.
+ * controller. It will run the callback and unmap the descriptor, if requested.
*/
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
- struct fsl_desc_sw *desc)
+static void fsldma_run_tx_complete_actions(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc)
{
struct dma_async_tx_descriptor *txd = &desc->async_tx;
struct device *dev = chan->common.device->dev;
@@ -427,6 +426,10 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
dma_addr_t dst = get_desc_dst(chan, desc);
u32 len = get_desc_cnt(chan, desc);
+ /* Cookies with value zero are already cleaned up */
+ if (txd->cookie == 0)
+ return;
+
/* Run the link descriptor callback function */
if (txd->callback) {
#ifdef FSL_DMA_LD_DEBUG
@@ -435,9 +438,6 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
txd->callback(txd->callback_param);
}
- /* Run any dependencies */
- dma_run_dependencies(txd);
-
/* Unmap the dst buffer, if requested */
if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
@@ -454,7 +454,68 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
}
- fsl_dma_free_descriptor(chan, desc);
+ /*
+ * A zeroed cookie indicates that cleanup actions have been
+ * run, but the descriptor has not yet been ACKed.
+ */
+ txd->cookie = 0;
+}
+
+/**
+ * fsldma_cleanup_descriptors - cleanup and free link descriptors
+ * @chan: Freescale DMA channel
+ *
+ * This function is used to clean up all descriptors which have been executed
+ * by the DMA controller. It will run any callbacks, submit any dependencies,
+ * and free any descriptors which have been ACKed.
+ *
+ * LOCKING: must NOT hold chan->desc_lock
+ * CONTEXT: any
+ */
+static void fsldma_cleanup_descriptors(struct fsldma_chan *chan)
+{
+ struct fsl_desc_sw *desc, *_desc;
+ dma_cookie_t cookie = 0;
+ LIST_HEAD(ld_cleanup);
+ unsigned long flags;
+
+ /*
+ * Move all descriptors onto a temporary list so that hardware
+ * interrupts can be enabled during cleanup
+ */
+ spin_lock_irqsave(&chan->desc_lock, flags);
+ list_splice_tail_init(&chan->ld_completed, &ld_cleanup);
+ spin_unlock_irqrestore(&chan->desc_lock, flags);
+
+ /* Run TX completion actions on completed descriptors */
+ list_for_each_entry_safe(desc, _desc, &ld_cleanup, node) {
+ struct dma_async_tx_descriptor *txd = &desc->async_tx;
+ cookie = txd->cookie;
+
+ /* Clean up this descriptor */
+ fsldma_run_tx_complete_actions(chan, desc);
+
+ /* Run any dependencies */
+ dma_run_dependencies(txd);
+
+ /* Test for ACK and free */
+ if (async_tx_test_ack(txd))
+ fsl_dma_free_descriptor(chan, desc);
+ }
+
+ spin_lock_irqsave(&chan->desc_lock, flags);
+
+ /*
+ * Replace any non-ACKed descriptors on the front of the ld_completed
+ * list so they will be freed in the order they were submitted
+ */
+ list_splice(&ld_cleanup, &chan->ld_completed);
+
+ /* Update the cookie, if it changed */
+ if (cookie > 0)
+ chan->common.completed_cookie = cookie;
+
+ spin_unlock_irqrestore(&chan->desc_lock, flags);
}
static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
@@ -578,9 +639,11 @@ static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
struct fsldma_chan *chan = to_fsl_chan(dchan);
chan_dbg(chan, "free all channel resources\n");
+ fsldma_cleanup_descriptors(chan);
spin_lock_irq(&chan->desc_lock);
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
+ fsldma_free_desc_list(chan, &chan->ld_completed);
spin_unlock_irq(&chan->desc_lock);
dma_pool_destroy(chan->desc_pool);
@@ -945,11 +1008,13 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
struct fsldma_chan *chan = to_fsl_chan(dchan);
enum dma_status ret;
- spin_lock_irq(&chan->desc_lock);
ret = dma_cookie_status(dchan, cookie, txstate);
- spin_unlock_irq(&chan->desc_lock);
+ if (ret == DMA_SUCCESS)
+ return ret;
- return ret;
+ tasklet_schedule(&chan->tasklet);
+
+ return dma_cookie_status(dchan, cookie, txstate);
}
/*----------------------------------------------------------------------------*/
@@ -1013,10 +1078,24 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
if (stat)
chan_err(chan, "irq: unhandled sr 0x%08x\n", stat);
+ spin_lock(&chan->desc_lock);
+ chan->idle = true;
+
+ /* Move all running descriptors onto the completed list */
+ list_splice_tail_init(&chan->ld_running, &chan->ld_completed);
+
+ /*
+ * Start any pending transactions automatically
+ *
+ * In the ideal case, we keep the DMA controller busy while we go
+ * ahead and free the descriptors in the tasklet.
+ */
+ fsl_chan_xfer_ld_queue(chan);
+ spin_unlock(&chan->desc_lock);
+
/*
- * Schedule the tasklet to handle all cleanup of the current
- * transaction. It will start a new transaction if there is
- * one pending.
+ * Schedule the tasklet to handle all cleanup of the completed
+ * descriptors.
*/
tasklet_schedule(&chan->tasklet);
chan_dbg(chan, "irq: Exit\n");
@@ -1026,53 +1105,9 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
static void dma_do_tasklet(unsigned long data)
{
struct fsldma_chan *chan = (struct fsldma_chan *)data;
- struct fsl_desc_sw *desc, *_desc;
- LIST_HEAD(ld_cleanup);
- unsigned long flags;
chan_dbg(chan, "tasklet entry\n");
-
- spin_lock_irqsave(&chan->desc_lock, flags);
-
- /* update the cookie if we have some descriptors to cleanup */
- if (!list_empty(&chan->ld_running)) {
- dma_cookie_t cookie;
-
- desc = to_fsl_desc(chan->ld_running.prev);
- cookie = desc->async_tx.cookie;
- dma_cookie_complete(&desc->async_tx);
-
- chan_dbg(chan, "completed_cookie=%d\n", cookie);
- }
-
- /*
- * move the descriptors to a temporary list so we can drop the lock
- * during the entire cleanup operation
- */
- list_splice_tail_init(&chan->ld_running, &ld_cleanup);
-
- /* the hardware is now idle and ready for more */
- chan->idle = true;
-
- /*
- * Start any pending transactions automatically
- *
- * In the ideal case, we keep the DMA controller busy while we go
- * ahead and free the descriptors below.
- */
- fsl_chan_xfer_ld_queue(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
-
- /* Run the callback for each descriptor, in order */
- list_for_each_entry_safe(desc, _desc, &ld_cleanup, node) {
-
- /* Remove from the list of transactions */
- list_del(&desc->node);
-
- /* Run all cleanup for this descriptor */
- fsldma_cleanup_descriptor(chan, desc);
- }
-
+ fsldma_cleanup_descriptors(chan);
chan_dbg(chan, "tasklet exit\n");
}
@@ -1253,6 +1288,7 @@ static int __devinit fsl_dma_chan_probe(struct fsldma_device *fdev,
spin_lock_init(&chan->desc_lock);
INIT_LIST_HEAD(&chan->ld_pending);
INIT_LIST_HEAD(&chan->ld_running);
+ INIT_LIST_HEAD(&chan->ld_completed);
chan->idle = true;
chan->common.device = &fdev->common;
diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index f5c3879..7ede908 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -140,6 +140,7 @@ struct fsldma_chan {
spinlock_t desc_lock; /* Descriptor operation lock */
struct list_head ld_pending; /* Link descriptors queue */
struct list_head ld_running; /* Link descriptors queue */
+ struct list_head ld_completed; /* Link descriptors queue */
struct dma_chan common; /* DMA common channel */
struct dma_pool *desc_pool; /* Descriptors pool */
struct device *dev; /* Channel device */
--
1.7.8.6
^ permalink raw reply related
* [PATCH 4/7] fsl-dma: move functions to avoid forward declarations
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto; +Cc: B32616, linuxppc-dev, Ira W. Snyder
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
From: "Ira W. Snyder" <iws@ovro.caltech.edu>
This function will be modified in the next patch in the series. By
moving the function in a patch separate from the changes, it will make
review easier.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/dma/fsldma.c | 96 +++++++++++++++++++++++++-------------------------
1 files changed, 48 insertions(+), 48 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index c34a628..80edc63 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -409,6 +409,54 @@ static void fsl_dma_free_descriptor(struct fsldma_chan *chan, struct fsl_desc_sw
dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
}
+/**
+ * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
+ * @chan: Freescale DMA channel
+ * @desc: descriptor to cleanup and free
+ *
+ * This function is used on a descriptor which has been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies, and then
+ * free the descriptor.
+ */
+static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc)
+{
+ struct dma_async_tx_descriptor *txd = &desc->async_tx;
+ struct device *dev = chan->common.device->dev;
+ dma_addr_t src = get_desc_src(chan, desc);
+ dma_addr_t dst = get_desc_dst(chan, desc);
+ u32 len = get_desc_cnt(chan, desc);
+
+ /* Run the link descriptor callback function */
+ if (txd->callback) {
+#ifdef FSL_DMA_LD_DEBUG
+ chan_dbg(chan, "LD %p callback\n", desc);
+#endif
+ txd->callback(txd->callback_param);
+ }
+
+ /* Run any dependencies */
+ dma_run_dependencies(txd);
+
+ /* Unmap the dst buffer, if requested */
+ if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
+ if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
+ dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
+ else
+ dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
+ }
+
+ /* Unmap the src buffer, if requested */
+ if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
+ if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
+ dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
+ else
+ dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
+ }
+
+ fsl_dma_free_descriptor(chan, desc);
+}
+
static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
{
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
@@ -807,54 +855,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
}
/**
- * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
- * @chan: Freescale DMA channel
- * @desc: descriptor to cleanup and free
- *
- * This function is used on a descriptor which has been executed by the DMA
- * controller. It will run any callbacks, submit any dependencies, and then
- * free the descriptor.
- */
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
- struct fsl_desc_sw *desc)
-{
- struct dma_async_tx_descriptor *txd = &desc->async_tx;
- struct device *dev = chan->common.device->dev;
- dma_addr_t src = get_desc_src(chan, desc);
- dma_addr_t dst = get_desc_dst(chan, desc);
- u32 len = get_desc_cnt(chan, desc);
-
- /* Run the link descriptor callback function */
- if (txd->callback) {
-#ifdef FSL_DMA_LD_DEBUG
- chan_dbg(chan, "LD %p callback\n", desc);
-#endif
- txd->callback(txd->callback_param);
- }
-
- /* Run any dependencies */
- dma_run_dependencies(txd);
-
- /* Unmap the dst buffer, if requested */
- if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
- else
- dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
- }
-
- /* Unmap the src buffer, if requested */
- if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
- else
- dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
- }
-
- fsl_dma_free_descriptor(chan, desc);
-}
-
-/**
* fsl_chan_xfer_ld_queue - transfer any pending transactions
* @chan : Freescale DMA channel
*
--
1.7.8.6
^ permalink raw reply related
* [PATCH 3/7] fsl-dma: add fsl_dma_free_descriptor() to reduce code duplication
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto; +Cc: B32616, linuxppc-dev, Ira W. Snyder
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
From: "Ira W. Snyder" <iws@ovro.caltech.edu>
There are several places where descriptors are freed using identical
code. Put this code into a function to reduce code duplication.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/dma/fsldma.c | 32 ++++++++++++++------------------
1 files changed, 14 insertions(+), 18 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 8f0505d..c34a628 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -400,6 +400,15 @@ out_splice:
list_splice_tail_init(&desc->tx_list, &chan->ld_pending);
}
+static void fsl_dma_free_descriptor(struct fsldma_chan *chan, struct fsl_desc_sw *desc)
+{
+ list_del(&desc->node);
+#ifdef FSL_DMA_LD_DEBUG
+ chan_dbg(chan, "LD %p free\n", desc);
+#endif
+ dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
+}
+
static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
{
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
@@ -499,13 +508,8 @@ static void fsldma_free_desc_list(struct fsldma_chan *chan,
{
struct fsl_desc_sw *desc, *_desc;
- list_for_each_entry_safe(desc, _desc, list, node) {
- list_del(&desc->node);
-#ifdef FSL_DMA_LD_DEBUG
- chan_dbg(chan, "LD %p free\n", desc);
-#endif
- dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
- }
+ list_for_each_entry_safe(desc, _desc, list, node)
+ fsl_dma_free_descriptor(chan, desc);
}
static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
@@ -513,13 +517,8 @@ static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
{
struct fsl_desc_sw *desc, *_desc;
- list_for_each_entry_safe_reverse(desc, _desc, list, node) {
- list_del(&desc->node);
-#ifdef FSL_DMA_LD_DEBUG
- chan_dbg(chan, "LD %p free\n", desc);
-#endif
- dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
- }
+ list_for_each_entry_safe_reverse(desc, _desc, list, node)
+ fsl_dma_free_descriptor(chan, desc);
}
/**
@@ -852,10 +851,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
}
-#ifdef FSL_DMA_LD_DEBUG
- chan_dbg(chan, "LD %p free\n", desc);
-#endif
- dma_pool_free(chan->desc_pool, desc, txd->phys);
+ fsl_dma_free_descriptor(chan, desc);
}
/**
--
1.7.8.6
^ permalink raw reply related
* [PATCH 2/7] fsl-dma: minimize locking overhead
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto; +Cc: B32616, linuxppc-dev, Ira W. Snyder
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
From: "Ira W. Snyder" <iws@ovro.caltech.edu>
The use of spin_lock_irqsave() was a stronger locking mechanism than was
actually needed in many cases.
As the current code is written, spin_lock_bh() everywhere is sufficient.
The next patch in this series will add some code to hardware interrupt
context. This patch is intended to minimize the differences in the next
patch and make review easier.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/dma/fsldma.c | 25 ++++++++++---------------
1 files changed, 10 insertions(+), 15 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 4f2f212..8f0505d 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -405,10 +405,9 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
struct fsl_desc_sw *desc = tx_to_fsl_desc(tx);
struct fsl_desc_sw *child;
- unsigned long flags;
dma_cookie_t cookie;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_irq(&chan->desc_lock);
/*
* assign cookies to all of the software descriptors
@@ -421,7 +420,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
/* put this transaction onto the tail of the pending queue */
append_ld_queue(chan, desc);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_irq(&chan->desc_lock);
return cookie;
}
@@ -530,13 +529,12 @@ static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
- unsigned long flags;
chan_dbg(chan, "free all channel resources\n");
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_irq(&chan->desc_lock);
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_irq(&chan->desc_lock);
dma_pool_destroy(chan->desc_pool);
chan->desc_pool = NULL;
@@ -755,7 +753,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
{
struct dma_slave_config *config;
struct fsldma_chan *chan;
- unsigned long flags;
int size;
if (!dchan)
@@ -765,7 +762,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
switch (cmd) {
case DMA_TERMINATE_ALL:
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_irq(&chan->desc_lock);
/* Halt the DMA engine */
dma_halt(chan);
@@ -775,7 +772,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
fsldma_free_desc_list(chan, &chan->ld_running);
chan->idle = true;
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_irq(&chan->desc_lock);
return 0;
case DMA_SLAVE_CONFIG:
@@ -935,11 +932,10 @@ static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
static void fsl_dma_memcpy_issue_pending(struct dma_chan *dchan)
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
- unsigned long flags;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_irq(&chan->desc_lock);
fsl_chan_xfer_ld_queue(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_irq(&chan->desc_lock);
}
/**
@@ -952,11 +948,10 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
enum dma_status ret;
- unsigned long flags;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_irq(&chan->desc_lock);
ret = dma_cookie_status(dchan, cookie, txstate);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_irq(&chan->desc_lock);
return ret;
}
--
1.7.8.6
^ permalink raw reply related
* [PATCH 1/7] fsl-dma: remove attribute DMA_INTERRUPT of dmaengine
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto; +Cc: Vinod Koul, B32616, Qiang Liu, Dan Williams, linuxppc-dev
In-Reply-To: <1343778352-5549-1-git-send-email-iws@ovro.caltech.edu>
From: Qiang Liu <qiang.liu@freescale.com>
Delete attribute DMA_INTERRUPT because fsl-dma doesn't support this function,
exception will be thrown if talitos is used to offload xor at the same time.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
Acked-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/dma/fsldma.c | 31 -------------------------------
1 files changed, 0 insertions(+), 31 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 8f84761..4f2f212 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -543,35 +543,6 @@ static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
}
static struct dma_async_tx_descriptor *
-fsl_dma_prep_interrupt(struct dma_chan *dchan, unsigned long flags)
-{
- struct fsldma_chan *chan;
- struct fsl_desc_sw *new;
-
- if (!dchan)
- return NULL;
-
- chan = to_fsl_chan(dchan);
-
- new = fsl_dma_alloc_descriptor(chan);
- if (!new) {
- chan_err(chan, "%s\n", msg_ld_oom);
- return NULL;
- }
-
- new->async_tx.cookie = -EBUSY;
- new->async_tx.flags = flags;
-
- /* Insert the link descriptor to the LD ring */
- list_add_tail(&new->node, &new->tx_list);
-
- /* Set End-of-link to the last link descriptor of new list */
- set_ld_eol(chan, new);
-
- return &new->async_tx;
-}
-
-static struct dma_async_tx_descriptor *
fsl_dma_prep_memcpy(struct dma_chan *dchan,
dma_addr_t dma_dst, dma_addr_t dma_src,
size_t len, unsigned long flags)
@@ -1352,12 +1323,10 @@ static int __devinit fsldma_of_probe(struct platform_device *op)
fdev->irq = irq_of_parse_and_map(op->dev.of_node, 0);
dma_cap_set(DMA_MEMCPY, fdev->common.cap_mask);
- dma_cap_set(DMA_INTERRUPT, fdev->common.cap_mask);
dma_cap_set(DMA_SG, fdev->common.cap_mask);
dma_cap_set(DMA_SLAVE, fdev->common.cap_mask);
fdev->common.device_alloc_chan_resources = fsl_dma_alloc_chan_resources;
fdev->common.device_free_chan_resources = fsl_dma_free_chan_resources;
- fdev->common.device_prep_dma_interrupt = fsl_dma_prep_interrupt;
fdev->common.device_prep_dma_memcpy = fsl_dma_prep_memcpy;
fdev->common.device_prep_dma_sg = fsl_dma_prep_sg;
fdev->common.device_tx_status = fsl_tx_status;
--
1.7.8.6
^ permalink raw reply related
* [PATCH 0/7] fsl-dma: fixes for Freescale DMA driver
From: Ira W. Snyder @ 2012-07-31 23:45 UTC (permalink / raw)
To: linux-crypto; +Cc: B32616, linuxppc-dev, Ira W. Snyder
From: "Ira W. Snyder" <iws@ovro.caltech.edu>
Hello everyone,
This is my alternative (simpler) attempt at solving the problems reported
by Qiang Liu with the async_tx API and MD RAID hardware offload support
when using the Freescale DMA driver.
The bug is caused by this driver freeing descriptors before they have been
ACKed by software using the async_tx API.
I don't like Qiang Liu's code to check where the hardware is in the
processing of the descriptor chain, and try to free a partial list of
descriptors. This was a source of bugs in this driver before I fixed them
several years ago.
Instead, the DMA controller raises an interrupt every time it has completed
a descriptor chain. This means it is ready for new descriptors: no need to
try and figure out where it is in the middle of a descriptor chain.
Qiang Liu: I do not have a hardware setup capable of using MD RAID. Please
test these patches to see if they fix the bug you reported. You may use
these patches as-is, or build upon them.
I have tested this using the drivers/dma/dmatest.c driver, as well as the
CARMA drivers. There are no regressions that I can find.
[ 355.069679] dma0chan3-copy0: terminating after 100000 tests, 0 failures (status 0)
[ 355.192278] dma0chan2-copy0: terminating after 100000 tests, 0 failures (status 0)
Ira W. Snyder (5):
fsl-dma: minimize locking overhead
fsl-dma: add fsl_dma_free_descriptor() to reduce code duplication
fsl-dma: move functions to avoid forward declarations
fsl-dma: fix support for async_tx API
carma: remove unnecessary DMA_INTERRUPT capability
Qiang Liu (2):
fsl-dma: remove attribute DMA_INTERRUPT of dmaengine
fsl-dma: fix a warning of unitialized cookie
drivers/dma/fsldma.c | 318 +++++++++++++++----------------
drivers/dma/fsldma.h | 1 +
drivers/misc/carma/carma-fpga-program.c | 1 -
drivers/misc/carma/carma-fpga.c | 3 +-
4 files changed, 159 insertions(+), 164 deletions(-)
--
1.7.8.6
^ permalink raw reply
* Re: [PATCH v4 3/7] fsl-dma: change release process of dma descriptor for supporting async_tx
From: Ira W. Snyder @ 2012-07-31 22:13 UTC (permalink / raw)
To: Liu Qiang-B32616
Cc: Li Yang-R58472, Vinod Koul, Phillips Kim-R1AAHA,
dan.j.williams@gmail.com, linux-crypto@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, davem@davemloft.net,
herbert@gondor.apana.org.au
In-Reply-To: <BCB48C05FCE8BC4D9E61E841ECBE6DB70C5D24@039-SN2MPN1-011.039d.mgd.msft.net>
On Tue, Jul 31, 2012 at 04:09:28AM +0000, Liu Qiang-B32616 wrote:
> > -----Original Message-----
> > From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> > Sent: Tuesday, July 31, 2012 5:10 AM
> > To: Liu Qiang-B32616
> > Cc: linux-crypto@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; Phillips
> > Kim-R1AAHA; herbert@gondor.hengli.com.au; davem@davemloft.net; Dan
> > Williams; Vinod Koul; Li Yang-R58472
> > Subject: Re: [PATCH v4 3/7] fsl-dma: change release process of dma
> > descriptor for supporting async_tx
> >
> > On Fri, Jul 27, 2012 at 05:16:09PM +0800, qiang.liu@freescale.com wrote:
> > > From: Qiang Liu <qiang.liu@freescale.com>
> > >
> > > Fix the potential risk when enable config NET_DMA and ASYNC_TX.
> > > Async_tx is lack of support in current release process of dma
> > descriptor,
> > > all descriptors will be released whatever is acked or no-acked by
> > async_tx,
> > > so there is a potential race condition when dma engine is uesd by
> > others
> > > clients (e.g. when enable NET_DMA to offload TCP).
> > >
> > > In our case, a race condition which is raised when use both of talitos
> > > and dmaengine to offload xor is because napi scheduler will sync all
> > > pending requests in dma channels, it affects the process of raid
> > operations
> > > due to ack_tx is not checked in fsl dma. The no-acked descriptor is
> > freed
> > > which is submitted just now, as a dependent tx, this freed descriptor
> > trigger
> > > BUG_ON(async_tx_test_ack(depend_tx)) in async_tx_submit().
> > >
> >
> > I'm preparing an alternative version of this patch that I think is
> > easier to understand (it is much shorter). I'll post it up here as soon
> > as I finish testing.
> Can you give a simple description/idea about your patch? My patch is for fix the
> problems when I build a raid environment with talitos offload xor.
> I think the new interface is clear enough and similar with the implement of other dma devices.
>
> And do you have any comments about this patch?
>
My patch will fix the same problem, in a simpler way. It will not
involve checking if the hardware is finished with a descriptor on
ld_running.
> >
> > It would be nice to know how to easily reproduce this bug, without
> > needing to set up a RAID system. I don't have access to any such
> > hardware. A driver similar to drivers/dma/dmatest.c (using the async_tx
> > API instead) would be wonderful.
> You can refer to raid5.c if you do not want to use hardware. Or you can use
> you ram (or other storage devices) to build a raid env to test.
> Thanks.
>
> >
> > Thanks,
> > Ira
> >
> > > TASK = ee1a94a0[1390] 'md0_raid5' THREAD: ecf40000 CPU: 0
> > > GPR00: 00000001 ecf41ca0 ee44/921a94a0 0000003f 00000001 c00593e4
> > 00000000 00000001
> > > GPR08: 00000000 a7a7a7a7 00000001 045/920000002 42028042 100a38d4
> > ed576d98 00000000
> > > GPR16: ed5a11b0 00000000 2b162000 00000200 046/920000000 2d555000
> > ed3015e8 c15a7aa0
> > > GPR24: 00000000 c155fc40 00000000 ecb63220 ecf41d28 e47/92f640bb0
> > ef640c30 ecf41ca0
> > > NIP [c02b048c] async_tx_submit+0x6c/0x2b4
> > > LR [c02b068c] async_tx_submit+0x26c/0x2b4
> > > Call Trace:
> > > [ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b448/92 (unreliable)
> > > [ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c
> > > [ecf41d20] [c0421064] async_copy_data+0xa0/0x17c
> > > [ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10
> > > [ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8
> > > [ecf41e90] [c0429080] raid5d+0x3d4/0x5b4
> > > [ecf41f40] [c04329b8] md_thread+0x138/0x16c
> > > [ecf41f90] [c008277c] kthread+0x8c/0x90
> > > [ecf41ff0] [c0011630] kernel_thread+0x4c/0x68
> > >
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Vinod Koul <vinod.koul@intel.com>
> > > Cc: Li Yang <leoli@freescale.com>
> > > Cc: Ira W. Snyder <iws@ovro.caltech.edu>
> > > Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
> > > ---
> > > drivers/dma/fsldma.c | 242 +++++++++++++++++++++++++++++++++++-------
> > --------
> > > drivers/dma/fsldma.h | 1 +
> > > 2 files changed, 172 insertions(+), 71 deletions(-)
> > >
> > > diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
> > > index 4f2f212..87f52c0 100644
> > > --- a/drivers/dma/fsldma.c
> > > +++ b/drivers/dma/fsldma.c
> > > @@ -400,6 +400,125 @@ out_splice:
> > > list_splice_tail_init(&desc->tx_list, &chan->ld_pending);
> > > }
> > >
> > > +static void fsldma_cleanup_descriptor(struct fsldma_chan *chan);
> > > +static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan);
> > > +
You should have re-arranged the patches to avoid introducing these
forward declarations in this patch and then deleting them in the next
patch. I reversed the order in my patch series.
> > > +/**
> > > + * fsldma_clean_completed_descriptor - free all descriptors which
> > > + * has been completed and acked
> > > + * @chan: Freescale DMA channel
> > > + *
> > > + * This function is used on all completed and acked descriptors.
> > > + * All descriptors should only be freed in this function.
> > > + */
> > > +static int
> > > +fsldma_clean_completed_descriptor(struct fsldma_chan *chan)
> > > +{
> > > + struct fsl_desc_sw *desc, *_desc;
> > > +
> > > + /* Run the callback for each descriptor, in order */
> > > + list_for_each_entry_safe(desc, _desc, &chan->ld_completed, node) {
> > > +
> > > + if (async_tx_test_ack(&desc->async_tx)) {
> > > + /* Remove from the list of transactions */
> > > + list_del(&desc->node);
> > > +#ifdef FSL_DMA_LD_DEBUG
> > > + chan_dbg(chan, "LD %p free\n", desc);
> > > +#endif
> > > + dma_pool_free(chan->desc_pool, desc,
> > > + desc->async_tx.phys);
> > > + }
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +/**
> > > + * fsldma_run_tx_complete_actions - cleanup and free a single link
> > descriptor
> > > + * @chan: Freescale DMA channel
> > > + * @desc: descriptor to cleanup and free
> > > + * @cookie: Freescale DMA transaction identifier
> > > + *
> > > + * This function is used on a descriptor which has been executed by
> > the DMA
> > > + * controller. It will run any callbacks, submit any dependencies.
> > > + */
> > > +static dma_cookie_t fsldma_run_tx_complete_actions(struct fsl_desc_sw
> > *desc,
> > > + struct fsldma_chan *chan, dma_cookie_t cookie)
> > > +{
> > > + struct dma_async_tx_descriptor *txd = &desc->async_tx;
> > > + struct device *dev = chan->common.device->dev;
> > > + dma_addr_t src = get_desc_src(chan, desc);
> > > + dma_addr_t dst = get_desc_dst(chan, desc);
> > > + u32 len = get_desc_cnt(chan, desc);
> > > +
> > > + BUG_ON(txd->cookie < 0);
> > > +
> > > + if (txd->cookie > 0) {
> > > + cookie = txd->cookie;
> > > +
> > > + /* Run the link descriptor callback function */
> > > + if (txd->callback) {
> > > +#ifdef FSL_DMA_LD_DEBUG
> > > + chan_dbg(chan, "LD %p callback\n", desc);
> > > +#endif
> > > + txd->callback(txd->callback_param);
> > > + }
> > > +
> > > + /* Unmap the dst buffer, if requested */
> > > + if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
> > > + if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
> > > + dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
> > > + else
> > > + dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
> > > + }
> > > +
> > > + /* Unmap the src buffer, if requested */
> > > + if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
> > > + if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
> > > + dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
> > > + else
> > > + dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
> > > + }
> > > + }
> > > +
> > > + /* Run any dependencies */
> > > + dma_run_dependencies(txd);
> > > +
> > > + return cookie;
> > > +}
> > > +
> > > +/**
> > > + * fsldma_clean_running_descriptor - move the completed descriptor
> > from
> > > + * ld_running to ld_completed
> > > + * @chan: Freescale DMA channel
> > > + * @desc: the descriptor which is completed
> > > + *
> > > + * Free the descriptor directly if acked by async_tx api, or move it
> > to
> > > + * queue ld_completed.
> > > + */
> > > +static int
> > > +fsldma_clean_running_descriptor(struct fsldma_chan *chan,
> > > + struct fsl_desc_sw *desc)
> > > +{
> > > + /* Remove from the list of transactions */
> > > + list_del(&desc->node);
> > > + /*
> > > + * the client is allowed to attach dependent operations
> > > + * until 'ack' is set
> > > + */
> > > + if (!async_tx_test_ack(&desc->async_tx)) {
> > > + /*
> > > + * Move this descriptor to the list of descriptors which is
> > > + * completed, but still awaiting the 'ack' bit to be set.
> > > + */
> > > + list_add_tail(&desc->node, &chan->ld_completed);
> > > + return 0;
> > > + }
> > > +
> > > + dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
> > > + return 0;
> > > +}
> > > +
> > > static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor
> > *tx)
> > > {
> > > struct fsldma_chan *chan = to_fsl_chan(tx->chan);
> > > @@ -534,8 +653,10 @@ static void fsl_dma_free_chan_resources(struct
> > dma_chan *dchan)
> > >
> > > chan_dbg(chan, "free all channel resources\n");
> > > spin_lock_irqsave(&chan->desc_lock, flags);
> > > + fsldma_cleanup_descriptor(chan);
> > > fsldma_free_desc_list(chan, &chan->ld_pending);
> > > fsldma_free_desc_list(chan, &chan->ld_running);
> > > + fsldma_free_desc_list(chan, &chan->ld_completed);
> > > spin_unlock_irqrestore(&chan->desc_lock, flags);
> > >
> > > dma_pool_destroy(chan->desc_pool);
> > > @@ -819,46 +940,53 @@ static int fsl_dma_device_control(struct dma_chan
> > *dchan,
> > > * controller. It will run any callbacks, submit any dependencies, and
> > then
> > > * free the descriptor.
> > > */
> > > -static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
> > > - struct fsl_desc_sw *desc)
> > > +static void fsldma_cleanup_descriptor(struct fsldma_chan *chan)
> > > {
> > > - struct dma_async_tx_descriptor *txd = &desc->async_tx;
> > > - struct device *dev = chan->common.device->dev;
> > > - dma_addr_t src = get_desc_src(chan, desc);
> > > - dma_addr_t dst = get_desc_dst(chan, desc);
> > > - u32 len = get_desc_cnt(chan, desc);
> > > + struct fsl_desc_sw *desc, *_desc;
> > > + dma_cookie_t cookie = 0;
> > > + dma_addr_t curr_phys = get_cdar(chan);
> > > + int idle = dma_is_idle(chan);
> > > + int seen_current = 0;
> > >
> > > - /* Run the link descriptor callback function */
> > > - if (txd->callback) {
> > > -#ifdef FSL_DMA_LD_DEBUG
> > > - chan_dbg(chan, "LD %p callback\n", desc);
> > > -#endif
> > > - txd->callback(txd->callback_param);
> > > - }
> > > + fsldma_clean_completed_descriptor(chan);
> > >
> > > - /* Run any dependencies */
> > > - dma_run_dependencies(txd);
> > > + /* Run the callback for each descriptor, in order */
> > > + list_for_each_entry_safe(desc, _desc, &chan->ld_running, node) {
> > > + /*
> > > + * do not advance past the current descriptor loaded into the
> > > + * hardware channel, subsequent descriptors are either in
> > > + * process or have not been submitted
> > > + */
> > > + if (seen_current)
> > > + break;
> > >
> > > - /* Unmap the dst buffer, if requested */
> > > - if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
> > > - if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
> > > - dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
> > > - else
> > > - dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
> > > - }
> > > + /*
> > > + * stop the search if we reach the current descriptor and the
> > > + * channel is busy
> > > + */
> > > + if (desc->async_tx.phys == curr_phys) {
> > > + seen_current = 1;
> > > + if (!idle)
> > > + break;
> > > + }
I really don't like the idea of trying to guess what the hardware is
doing. Variables curr_phys and idle can be stale by the time the
processor gets here. The DMA engine is very fast at processing
descriptors.
It is much easier to reason about the hardware state if you have it tell
you (via an interrupt) when it is ready for more descriptors. My patch
takes this approach. I'll be posting it in a few minutes.
> > > +
> > > + cookie = fsldma_run_tx_complete_actions(desc, chan, cookie);
> > > +
> > > + if (fsldma_clean_running_descriptor(chan, desc))
> > > + break;
> > >
> > > - /* Unmap the src buffer, if requested */
> > > - if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
> > > - if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
> > > - dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
> > > - else
> > > - dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
> > > }
> > >
> > > -#ifdef FSL_DMA_LD_DEBUG
> > > - chan_dbg(chan, "LD %p free\n", desc);
> > > -#endif
> > > - dma_pool_free(chan->desc_pool, desc, txd->phys);
> > > + /*
> > > + * Start any pending transactions automatically
> > > + *
> > > + * In the ideal case, we keep the DMA controller busy while we go
> > > + * ahead and free the descriptors below.
> > > + */
> > > + fsl_chan_xfer_ld_queue(chan);
> > > +
> > > + if (cookie > 0)
> > > + chan->common.completed_cookie = cookie;
> > > }
> > >
> > > /**
> > > @@ -954,11 +1082,15 @@ static enum dma_status fsl_tx_status(struct
> > dma_chan *dchan,
> > > enum dma_status ret;
> > > unsigned long flags;
> > >
> > > - spin_lock_irqsave(&chan->desc_lock, flags);
> > > ret = dma_cookie_status(dchan, cookie, txstate);
> > > + if (ret == DMA_SUCCESS)
> > > + return ret;
> > > +
> > > + spin_lock_irqsave(&chan->desc_lock, flags);
> > > + fsldma_cleanup_descriptor(chan);
> > > spin_unlock_irqrestore(&chan->desc_lock, flags);
> > >
> > > - return ret;
> > > + return dma_cookie_status(dchan, cookie, txstate);
> > > }
> > >
> > > /*--------------------------------------------------------------------
> > --------*/
> > > @@ -1035,52 +1167,19 @@ static irqreturn_t fsldma_chan_irq(int irq,
> > void *data)
> > > static void dma_do_tasklet(unsigned long data)
> > > {
> > > struct fsldma_chan *chan = (struct fsldma_chan *)data;
> > > - struct fsl_desc_sw *desc, *_desc;
> > > - LIST_HEAD(ld_cleanup);
> > > unsigned long flags;
> > >
> > > chan_dbg(chan, "tasklet entry\n");
> > >
> > > spin_lock_irqsave(&chan->desc_lock, flags);
> > >
> > > - /* update the cookie if we have some descriptors to cleanup */
> > > - if (!list_empty(&chan->ld_running)) {
> > > - dma_cookie_t cookie;
> > > -
> > > - desc = to_fsl_desc(chan->ld_running.prev);
> > > - cookie = desc->async_tx.cookie;
> > > - dma_cookie_complete(&desc->async_tx);
> > > -
> > > - chan_dbg(chan, "completed_cookie=%d\n", cookie);
> > > - }
> > > -
> > > - /*
> > > - * move the descriptors to a temporary list so we can drop the lock
> > > - * during the entire cleanup operation
> > > - */
> > > - list_splice_tail_init(&chan->ld_running, &ld_cleanup);
> > > -
> > > /* the hardware is now idle and ready for more */
> > > chan->idle = true;
> > >
> > > - /*
> > > - * Start any pending transactions automatically
> > > - *
> > > - * In the ideal case, we keep the DMA controller busy while we go
> > > - * ahead and free the descriptors below.
> > > - */
> > > - fsl_chan_xfer_ld_queue(chan);
> > > - spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > -
> > > - /* Run the callback for each descriptor, in order */
> > > - list_for_each_entry_safe(desc, _desc, &ld_cleanup, node) {
> > > + /* Run all cleanup for this descriptor */
> > > + fsldma_cleanup_descriptor(chan);
> > >
> > > - /* Remove from the list of transactions */
> > > - list_del(&desc->node);
> > > -
> > > - /* Run all cleanup for this descriptor */
> > > - fsldma_cleanup_descriptor(chan, desc);
> > > - }
> > > + spin_unlock_irqrestore(&chan->desc_lock, flags);
> > >
> > > chan_dbg(chan, "tasklet exit\n");
> > > }
> > > @@ -1262,6 +1361,7 @@ static int __devinit fsl_dma_chan_probe(struct
> > fsldma_device *fdev,
> > > spin_lock_init(&chan->desc_lock);
> > > INIT_LIST_HEAD(&chan->ld_pending);
> > > INIT_LIST_HEAD(&chan->ld_running);
> > > + INIT_LIST_HEAD(&chan->ld_completed);
> > > chan->idle = true;
> > >
> > > chan->common.device = &fdev->common;
> > > diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
> > > index f5c3879..7ede908 100644
> > > --- a/drivers/dma/fsldma.h
> > > +++ b/drivers/dma/fsldma.h
> > > @@ -140,6 +140,7 @@ struct fsldma_chan {
> > > spinlock_t desc_lock; /* Descriptor operation lock */
> > > struct list_head ld_pending; /* Link descriptors queue */
> > > struct list_head ld_running; /* Link descriptors queue */
> > > + struct list_head ld_completed; /* Link descriptors queue */
> > > struct dma_chan common; /* DMA common channel */
> > > struct dma_pool *desc_pool; /* Descriptors pool */
> > > struct device *dev; /* Channel device */
> > > --
> > > 1.7.5.1
> > >
> > >
>
>
^ permalink raw reply
* Re: [PATCH] scsi/ibmvscsi: add module alias for ibmvscsic
From: Robert Jennings @ 2012-07-31 17:57 UTC (permalink / raw)
To: Brian King
Cc: Olaf Hering, Brian J King, linux-scsi, James E.J. Bottomley,
linuxppc-dev, Robert C Jennings
In-Reply-To: <501805C3.7030504@linux.vnet.ibm.com>
On Tue, Jul 31, 2012 at 11:20 AM, Brian King <brking@linux.vnet.ibm.com> wrote:
>
> On 07/30/2012 10:08 PM, Benjamin Herrenschmidt wrote:
> > On Mon, 2012-07-30 at 21:06 +0200, Olaf Hering wrote:
> >>> So while this would work, I do wonder however whether we could
> >> instead
> >>> fix it by simplifying the whole thing as follow since iSeries is now
> >>> gone and so we don't need split backends anymore:
> >>>
> >>> scsi/ibmvscsi: Remove backend abstraction
> >>
> >> I cant that these things myself anymore.
> >
> > Brian, can somebody from your side own these ?
>
> I talked to Rob and he will be picking this up. Rob - can you submit
> a patch to the maintainers file, adding yourself as the ibmvscsi
> maintainer?
I've submitted a patch for the MAINTAINERS file and I'll take a look at these
patches to verify them as well. Thanks.
--Rob Jennings
^ permalink raw reply
* Re: [PATCH] scsi/ibmvscsi: add module alias for ibmvscsic
From: Robert Jennings @ 2012-07-31 17:54 UTC (permalink / raw)
To: Brian King
Cc: Olaf Hering, Brian J King, linux-scsi, James E.J. Bottomley,
linuxppc-dev, Robert C Jennings
In-Reply-To: <501805C3.7030504@linux.vnet.ibm.com>
[-- Attachment #1: Type: text/plain, Size: 854 bytes --]
On Tue, Jul 31, 2012 at 11:20 AM, Brian King <brking@linux.vnet.ibm.com>wrote:
> On 07/30/2012 10:08 PM, Benjamin Herrenschmidt wrote:
> > On Mon, 2012-07-30 at 21:06 +0200, Olaf Hering wrote:
> >>> So while this would work, I do wonder however whether we could
> >> instead
> >>> fix it by simplifying the whole thing as follow since iSeries is now
> >>> gone and so we don't need split backends anymore:
> >>>
> >>> scsi/ibmvscsi: Remove backend abstraction
> >>
> >> I cant that these things myself anymore.
> >
> > Brian, can somebody from your side own these ?
>
> I talked to Rob and he will be picking this up. Rob - can you submit
> a patch to the maintainers file, adding yourself as the ibmvscsi
> maintainer?
>
Submitted a patch for the MAINTAINERS file and I'll take a look at these
patches to verify them as well. Thanks.
--Rob Jennings
[-- Attachment #2: Type: text/html, Size: 1292 bytes --]
^ permalink raw reply
* Re: [PATCH] powerpc: Update g5_defconfig
From: Andrey Gusev @ 2012-07-31 16:40 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Andreas Schwab
In-Reply-To: <1343027168.2957.30.camel@pasglop>
23.07.2012 11:06, Benjamin Herrenschmidt пишет:
> On Mon, 2012-07-23 at 09:04 +0200, Andreas Schwab wrote:
>> Benjamin Herrenschmidt<benh@kernel.crashing.org> writes:
>>
>>> This updates the g5 defconfig to include nouveau instead of nvidiafb
>>> (which works much better nowadays, in fact the latter crashes on modern
>>> distros),
>> Does it? Nvidiafb is super stable here, whereas nouveau has a lot of
>> problems.
> nvidiafb crashes solid here when X gets involved.
>
> What do you mean by "lots of problems" ? I've been using nouveau for a
> long time now on my two quad g5's including 3D, gnome-shell etc...
> without anything more than a glitch or two on the edge of windows...
>
> Also can you remind me what machine model and what video card variant ?
>
> Cheers,
> Ben.
>
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
>
nouveau doesn't work on my machine. There is a bug in Xorg project
https://bugs.freedesktop.org/show_bug.cgi?id=21273 It need to make
a workaround for correct initialization of video card. Unfortunately,
I am not skilled enough for it.
Nvidiafb worked fine for me, but I am not using it.
Andrey
^ permalink raw reply
* Re: [PATCH] powerpc/fsl: mpic timer driver
From: Scott Wood @ 2012-07-31 16:26 UTC (permalink / raw)
To: Kumar Gala
Cc: Wang Dongsheng-B40534, paulus@samba.org,
linuxppc-dev@lists.ozlabs.org, Wood Scott-B07421
In-Reply-To: <267909D0-5FDC-4864-AA4B-189A651889A6@kernel.crashing.org>
On 07/31/2012 09:31 AM, Kumar Gala wrote:
>
> On Jul 31, 2012, at 2:58 AM, Wang Dongsheng-B40534 wrote:
>
>>
>>
>>> -----Original Message-----
>>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>>> Sent: Friday, July 27, 2012 9:14 PM
>>> To: Wang Dongsheng-B40534
>>> Cc: benh@kernel.crashing.org; paulus@samba.org; Wood Scott-B07421;
>>> linuxppc-dev@lists.ozlabs.org
>>> Subject: Re: [PATCH] powerpc/fsl: mpic timer driver
>>>
>>>
>>> On Jul 27, 2012, at 1:20 AM, <Dongsheng.wang@freescale.com>
>>> <Dongsheng.wang@freescale.com> wrote:
>>>
>>>> From: Wang Dongsheng <Dongsheng.Wang@freescale.com>
>>>>
>>>> Global timers A and B internal to the PIC. The two independent groups
>>>> of global timer, group A and group B, are identical in their
>>> functionality.
>>>> The hardware timer generates an interrupt on every timer cycle.
>>>> e.g
>>>> Power management can use the hardware timer to wake up the machine.
>>>>
>>>> Signed-off-by: Wang Dongsheng <Dongsheng.Wang@freescale.com>
>>>> Signed-off-by: Li Yang <leoli@freescale.com>
>>>
>>> How much of this is FSL specific vs openpic? OpenPIC spec's timer
>>> support (only a single group).
>>>
>> [Wang Dongsheng] Yes, OpenPIC only a single group timer.
>> FSL: add more register, features and group.
>> This patch only to support FSL chip.
>> "mpic_timer.c" -> "fsl_mpic_timer.c"
>> I will modify the description of the patch. how about?
>
> I'd rather we support both, can we not use the MPIC_FSL flag to deal with FSL specific behavior?
The device this driver binds against is "fsl,mpic-global-timer". We
don't have a binding for ordinary OpenPIC timers, and inferring them
from the basic OpenPIC node will cause AMP headaches.
Let someone who cares about ordinary OpenPIC drivers add support. :-)
-Scott
^ permalink raw reply
* Re: [PATCH] scsi/ibmvscsi: add module alias for ibmvscsic
From: Brian King @ 2012-07-31 16:20 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Olaf Hering, Brian J King, linux-scsi, James E.J. Bottomley,
linuxppc-dev, Robert C Jennings
In-Reply-To: <1343704122.8227.3.camel@pasglop>
On 07/30/2012 10:08 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2012-07-30 at 21:06 +0200, Olaf Hering wrote:
>>> So while this would work, I do wonder however whether we could
>> instead
>>> fix it by simplifying the whole thing as follow since iSeries is now
>>> gone and so we don't need split backends anymore:
>>>
>>> scsi/ibmvscsi: Remove backend abstraction
>>
>> I cant that these things myself anymore.
>
> Brian, can somebody from your side own these ?
I talked to Rob and he will be picking this up. Rob - can you submit
a patch to the maintainers file, adding yourself as the ibmvscsi maintainer?
Thanks,
Brian
--
Brian King
Power Linux I/O
IBM Linux Technology Center
^ permalink raw reply
* Re: [PATCH] dma/fsldma: fix a compilation warnings
From: Kumar Gala @ 2012-07-31 14:53 UTC (permalink / raw)
To: Liu Qiang-B32616
Cc: vinod.koul@intel.com, linuxppc-dev@ozlabs.org,
dan.j.williams@intel.com
In-Reply-To: <BCB48C05FCE8BC4D9E61E841ECBE6DB70C5FA7@039-SN2MPN1-011.039d.mgd.msft.net>
On Jul 31, 2012, at 9:48 AM, Liu Qiang-B32616 wrote:
> Hi Kumar,
> ________________________________________
>> From: Linuxppc-dev =
[linuxppc-dev-bounces+qiang.liu=3Dfreescale.com@lists.ozlabs.org] on =
behalf of Kumar Gala [galak@kernel.crashing.org]
>> Sent: Tuesday, July 31, 2012 8:57 AM
>> To: dan.j.williams@intel.com
>> Cc: vinod.koul@intel.com; linuxppc-dev@ozlabs.org
>> Subject: [PATCH] dma/fsldma: fix a compilation warnings
>=20
>> drivers/dma/fsldma.c: In function 'fsl_dma_tx_submit':
>> drivers/dma/fsldma.c:409:15: warning: 'cookie' may be used =
uninitialized in this function
>>=20
>> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
>> ---
>> drivers/dma/fsldma.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>=20
>> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
>> index 8f84761..6194eb7 100644
>> --- a/drivers/dma/fsldma.c
>> +++ b/drivers/dma/fsldma.c
>> @@ -406,7 +406,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct =
dma_async_tx_descriptor *tx)
>> struct fsl_desc_sw *desc =3D tx_to_fsl_desc(tx);
>> struct fsl_desc_sw *child;
>> unsigned long flags;
>> - dma_cookie_t cookie;
>> + dma_cookie_t cookie =3D 0;
> I have already submitted a patch to fix it. Please refer to:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-July/099819.html
>=20
> Thanks.
Ah, good.. missed it ;)
- k=
^ permalink raw reply
* RE: [PATCH] dma/fsldma: fix a compilation warnings
From: Liu Qiang-B32616 @ 2012-07-31 14:48 UTC (permalink / raw)
To: Kumar Gala, dan.j.williams@intel.com
Cc: vinod.koul@intel.com, linuxppc-dev@ozlabs.org
In-Reply-To: <1343743058-8013-1-git-send-email-galak@kernel.crashing.org>
Hi Kumar,=0A=
________________________________________=0A=
> From: Linuxppc-dev [linuxppc-dev-bounces+qiang.liu=3Dfreescale.com@lists.=
ozlabs.org] on behalf of Kumar Gala [galak@kernel.crashing.org]=0A=
> Sent: Tuesday, July 31, 2012 8:57 AM=0A=
> To: dan.j.williams@intel.com=0A=
> Cc: vinod.koul@intel.com; linuxppc-dev@ozlabs.org=0A=
> Subject: [PATCH] dma/fsldma: fix a compilation warnings=0A=
=0A=
> drivers/dma/fsldma.c: In function 'fsl_dma_tx_submit':=0A=
> drivers/dma/fsldma.c:409:15: warning: 'cookie' may be used uninitialized =
in this function=0A=
> =0A=
> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>=0A=
> ---=0A=
> drivers/dma/fsldma.c | 2 +-=0A=
> 1 file changed, 1 insertion(+), 1 deletion(-)=0A=
> =0A=
> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c=0A=
> index 8f84761..6194eb7 100644=0A=
> --- a/drivers/dma/fsldma.c=0A=
> +++ b/drivers/dma/fsldma.c=0A=
> @@ -406,7 +406,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_asyn=
c_tx_descriptor *tx)=0A=
> struct fsl_desc_sw *desc =3D tx_to_fsl_desc(tx);=0A=
> struct fsl_desc_sw *child;=0A=
> unsigned long flags;=0A=
> - dma_cookie_t cookie;=0A=
> + dma_cookie_t cookie =3D 0;=0A=
I have already submitted a patch to fix it. Please refer to:=0A=
https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-July/099819.html=0A=
=0A=
Thanks.=0A=
=0A=
> spin_lock_irqsave(&chan->desc_lock, flags);=0A=
> =0A=
> --=0A=
> 1.7.9.7=0A=
=0A=
_______________________________________________=0A=
Linuxppc-dev mailing list=0A=
Linuxppc-dev@lists.ozlabs.org=0A=
https://lists.ozlabs.org/listinfo/linuxppc-dev=0A=
=0A=
^ permalink raw reply
* [PATCH 3/3 v3] powerpc/mpic: FSL MPIC error interrupt support.
From: Varun Sethi @ 2012-07-31 14:42 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Bogdan Hamciuc, Varun Sethi
All SOC device error interrupts are muxed and delivered to the core as a single
MPIC error interrupt. Currently all the device drivers requiring access to device
errors have to register for the MPIC error interrupt as a shared interrupt.
With this patch we add interrupt demuxing capability in the mpic driver, allowing
device drivers to register for their individual error interrupts. This is achieved
by handling error interrupts in a cascaded fashion.
MPIC error interrupt is handled by the "error_int_handler", which subsequently demuxes
it using the EISR and delivers it to the respective drivers.
The error interrupt capability is dependent on the MPIC EIMR register, which was
introduced in FSL MPIC version 4.1 (P4080 rev2). So, error interrupt demuxing capability
is dependent on the MPIC version and can be used for versions >= 4.1.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Bogdan Hamciuc <bogdan.hamciuc@freescale.com>
[In the initial version of the patch we were using handle_simple_irq
as the handler for cascaded error interrupts, this resulted
in issues in case of threaded isrs (with RT kernel). This issue was
debugged by Bogdan and decision was taken to use the handle_level_irq
handler]
---
arch/powerpc/include/asm/mpic.h | 13 +++
arch/powerpc/sysdev/Makefile | 2 +-
arch/powerpc/sysdev/fsl_mpic_err.c | 152 ++++++++++++++++++++++++++++++++++++
arch/powerpc/sysdev/mpic.c | 41 ++++++++++-
arch/powerpc/sysdev/mpic.h | 22 +++++
5 files changed, 228 insertions(+), 2 deletions(-)
create mode 100644 arch/powerpc/sysdev/fsl_mpic_err.c
diff --git a/arch/powerpc/include/asm/mpic.h b/arch/powerpc/include/asm/mpic.h
index e14d35d..5cc8000 100644
--- a/arch/powerpc/include/asm/mpic.h
+++ b/arch/powerpc/include/asm/mpic.h
@@ -118,6 +118,9 @@
#define MPIC_MAX_CPUS 32
#define MPIC_MAX_ISU 32
+#define MPIC_MAX_ERR 32
+#define MPIC_FSL_ERR_INT 16
+
/*
* Tsi108 implementation of MPIC has many differences from the original one
*/
@@ -270,6 +273,7 @@ struct mpic
struct irq_chip hc_ipi;
#endif
struct irq_chip hc_tm;
+ struct irq_chip hc_err;
const char *name;
/* Flags */
unsigned int flags;
@@ -283,6 +287,8 @@ struct mpic
/* vector numbers used for internal sources (ipi/timers) */
unsigned int ipi_vecs[4];
unsigned int timer_vecs[8];
+ /* vector numbers used for FSL MPIC error interrupts */
+ unsigned int err_int_vecs[MPIC_MAX_ERR];
/* Spurious vector to program into unused sources */
unsigned int spurious_vec;
@@ -306,6 +312,11 @@ struct mpic
struct mpic_reg_bank cpuregs[MPIC_MAX_CPUS];
struct mpic_reg_bank isus[MPIC_MAX_ISU];
+ /* ioremap'ed base for error interrupt registers */
+ u32 __iomem *err_regs;
+ /* error interrupt config */
+ u32 err_int_config_done;
+
/* Protected sources */
unsigned long *protected;
@@ -370,6 +381,8 @@ struct mpic
#define MPIC_NO_RESET 0x00004000
/* Freescale MPIC (compatible includes "fsl,mpic") */
#define MPIC_FSL 0x00008000
+/* Freescale MPIC supports EIMR (error interrupt mask register)*/
+#define MPIC_FSL_HAS_EIMR 0x00010000
/* MPIC HW modification ID */
#define MPIC_REGSET_MASK 0xf0000000
diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index 1bd7ecb..a57600b 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -15,7 +15,7 @@ obj-$(CONFIG_PPC_DCR_NATIVE) += dcr-low.o
obj-$(CONFIG_PPC_PMI) += pmi.o
obj-$(CONFIG_U3_DART) += dart_iommu.o
obj-$(CONFIG_MMIO_NVRAM) += mmio_nvram.o
-obj-$(CONFIG_FSL_SOC) += fsl_soc.o
+obj-$(CONFIG_FSL_SOC) += fsl_soc.o fsl_mpic_err.o
obj-$(CONFIG_FSL_PCI) += fsl_pci.o $(fsl-msi-obj-y)
obj-$(CONFIG_FSL_PMC) += fsl_pmc.o
obj-$(CONFIG_FSL_LBC) += fsl_lbc.o
diff --git a/arch/powerpc/sysdev/fsl_mpic_err.c b/arch/powerpc/sysdev/fsl_mpic_err.c
new file mode 100644
index 0000000..1ebfa36
--- /dev/null
+++ b/arch/powerpc/sysdev/fsl_mpic_err.c
@@ -0,0 +1,152 @@
+/*
+ * Copyright (C) 2012 Freescale Semiconductor, Inc.
+ *
+ * Author: Varun Sethi <varun.sethi@freescale.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2 of the
+ * License.
+ *
+ */
+
+#include <linux/irq.h>
+#include <linux/smp.h>
+#include <linux/interrupt.h>
+
+#include <asm/io.h>
+#include <asm/irq.h>
+#include <asm/mpic.h>
+
+#include "mpic.h"
+
+#define MPIC_ERR_INT_BASE 0x3900
+#define MPIC_ERR_INT_EISR 0x0000
+#define MPIC_ERR_INT_EIMR 0x0010
+
+static inline u32 mpic_fsl_err_read(u32 __iomem *base, unsigned int err_reg)
+{
+ return in_be32(base + (err_reg >> 2));
+}
+
+static inline void mpic_fsl_err_write(u32 __iomem *base, u32 value)
+{
+ out_be32(base + (MPIC_ERR_INT_EIMR >> 2), value);
+}
+
+static void fsl_mpic_mask_err(struct irq_data *d)
+{
+ u32 eimr;
+ struct mpic *mpic = irq_data_get_irq_chip_data(d);
+ unsigned int src = virq_to_hw(d->irq) - mpic->err_int_vecs[0];
+
+ eimr = mpic_fsl_err_read(mpic->err_regs, MPIC_ERR_INT_EIMR);
+ eimr |= (1 << (31 - src));
+ mpic_fsl_err_write(mpic->err_regs, eimr);
+}
+
+static void fsl_mpic_unmask_err(struct irq_data *d)
+{
+ u32 eimr;
+ struct mpic *mpic = irq_data_get_irq_chip_data(d);
+ unsigned int src = virq_to_hw(d->irq) - mpic->err_int_vecs[0];
+
+ eimr = mpic_fsl_err_read(mpic->err_regs, MPIC_ERR_INT_EIMR);
+ eimr &= ~(1 << (31 - src));
+ mpic_fsl_err_write(mpic->err_regs, eimr);
+}
+
+static struct irq_chip fsl_mpic_err_chip = {
+ .irq_disable = fsl_mpic_mask_err,
+ .irq_mask = fsl_mpic_mask_err,
+ .irq_unmask = fsl_mpic_unmask_err,
+};
+
+void mpic_setup_error_int(struct mpic *mpic, int intvec)
+{
+ int i;
+
+ mpic->err_regs = ioremap(mpic->paddr + MPIC_ERR_INT_BASE, 0x1000);
+ if (!mpic->err_regs) {
+ pr_err("could not map mpic error registers\n");
+ return;
+ }
+ mpic->hc_err = fsl_mpic_err_chip;
+ mpic->hc_err.name = mpic->name;
+ mpic->flags |= MPIC_FSL_HAS_EIMR;
+ /* allocate interrupt vectors for error interrupts */
+ for (i = MPIC_MAX_ERR - 1; i >= 0; i--)
+ mpic->err_int_vecs[i] = --intvec;
+
+}
+
+int mpic_map_error_int(struct mpic *mpic, unsigned int virq, irq_hw_number_t hw)
+{
+ if ((mpic->flags & MPIC_FSL_HAS_EIMR) &&
+ (hw >= mpic->err_int_vecs[0] &&
+ hw <= mpic->err_int_vecs[MPIC_MAX_ERR - 1])) {
+ WARN_ON(mpic->flags & MPIC_SECONDARY);
+
+ pr_debug("mpic: mapping as Error Interrupt\n");
+ irq_set_chip_data(virq, mpic);
+ irq_set_chip_and_handler(virq, &mpic->hc_err,
+ handle_level_irq);
+ return 1;
+ }
+
+ return 0;
+}
+
+static irqreturn_t fsl_error_int_handler(int irq, void *data)
+{
+ struct mpic *mpic = (struct mpic *) data;
+ u32 eisr, eimr;
+ int errint;
+ unsigned int cascade_irq;
+
+ eisr = mpic_fsl_err_read(mpic->err_regs, MPIC_ERR_INT_EISR);
+ eimr = mpic_fsl_err_read(mpic->err_regs, MPIC_ERR_INT_EIMR);
+
+ if (!(eisr & ~eimr))
+ return IRQ_NONE;
+
+ while (eisr) {
+ errint = __builtin_clz(eisr);
+ cascade_irq = irq_linear_revmap(mpic->irqhost,
+ mpic->err_int_vecs[errint]);
+ WARN_ON(cascade_irq == NO_IRQ);
+ if (cascade_irq != NO_IRQ) {
+ generic_handle_irq(cascade_irq);
+ } else {
+ eimr |= 1 << (31 - errint);
+ mpic_fsl_err_write(mpic->err_regs, eimr);
+ }
+ eisr &= ~(1 << (31 - errint));
+ }
+
+ return IRQ_HANDLED;
+}
+
+int mpic_err_int_init(struct mpic *mpic, irq_hw_number_t irqnum)
+{
+ unsigned int virq;
+ int ret;
+
+ virq = irq_create_mapping(mpic->irqhost, irqnum);
+ if (virq == NO_IRQ) {
+ pr_err("Error interrupt setup failed\n");
+ return 0;
+ }
+
+ /* Mask all error interrupts */
+ mpic_fsl_err_write(mpic->err_regs, ~0);
+
+ ret = request_irq(virq, fsl_error_int_handler, IRQF_NO_THREAD,
+ "mpic-error-int", mpic);
+ if (ret) {
+ pr_err("Failed to register error interrupt handler\n");
+ return 0;
+ }
+
+ return 1;
+}
diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index 7e32db7..2a0b632 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -1026,6 +1026,9 @@ static int mpic_host_map(struct irq_domain *h, unsigned int virq,
return 0;
}
+ if (mpic_map_error_int(mpic, virq, hw))
+ return 0;
+
if (hw >= mpic->num_sources)
return -EINVAL;
@@ -1085,7 +1088,16 @@ static int mpic_host_xlate(struct irq_domain *h, struct device_node *ct,
*/
switch (intspec[2]) {
case 0:
- case 1: /* no EISR/EIMR support for now, treat as shared IRQ */
+ break;
+ case 1:
+ if (!mpic->err_int_config_done)
+ break;
+
+ if (intspec[3] >= ARRAY_SIZE(mpic->err_int_vecs))
+ return -EINVAL;
+
+ *out_hwirq = mpic->err_int_vecs[intspec[3]];
+
break;
case 2:
if (intspec[0] >= ARRAY_SIZE(mpic->ipi_vecs))
@@ -1302,6 +1314,8 @@ struct mpic * __init mpic_alloc(struct device_node *node,
mpic_map(mpic, mpic->paddr, &mpic->tmregs, MPIC_INFO(TIMER_BASE), 0x1000);
if (mpic->flags & MPIC_FSL) {
+ u32 brr1, version;
+
/*
* Yes, Freescale really did put global registers in the
* magic per-cpu area -- and they don't even show up in the
@@ -1309,6 +1323,26 @@ struct mpic * __init mpic_alloc(struct device_node *node,
*/
mpic_map(mpic, mpic->paddr, &mpic->thiscpuregs,
MPIC_CPU_THISBASE, 0x1000);
+
+ brr1 = _mpic_read(mpic->reg_type, &mpic->thiscpuregs,
+ MPIC_FSL_BRR1);
+ version = brr1 & MPIC_FSL_BRR1_VER;
+
+ /* Error interrupt mask register (EIMR) is required for
+ * handling individual device error interrupts. EIMR
+ * was added in MPIC version 4.1.
+ *
+ * Over here we reserve vector number space for error
+ * interrupt vectors. This space is stolen from the
+ * global vector number space, as in case of ipis
+ * and timer interrupts.
+ *
+ * Available vector space = intvec_top - 12, where 12
+ * is the number of vectors which have been consumed by
+ * ipis and timer interrupts.
+ */
+ if (version >= 0x401)
+ mpic_setup_error_int(mpic, intvec_top - 12);
}
/* Reset */
@@ -1474,6 +1508,11 @@ void __init mpic_init(struct mpic *mpic)
num_timers = 8;
}
+ /* FSL mpic error interrupt intialization */
+ if (mpic->flags & MPIC_FSL_HAS_EIMR)
+ mpic->err_int_config_done =
+ mpic_err_int_init(mpic, MPIC_FSL_ERR_INT);
+
/* Initialize timers to our reserved vectors and mask them for now */
for (i = 0; i < num_timers; i++) {
unsigned int offset = mpic_tm_offset(mpic, i);
diff --git a/arch/powerpc/sysdev/mpic.h b/arch/powerpc/sysdev/mpic.h
index 13f3e89..1a6995a 100644
--- a/arch/powerpc/sysdev/mpic.h
+++ b/arch/powerpc/sysdev/mpic.h
@@ -40,4 +40,26 @@ extern int mpic_set_affinity(struct irq_data *d,
const struct cpumask *cpumask, bool force);
extern void mpic_reset_core(int cpu);
+#ifdef CONFIG_FSL_SOC
+extern int mpic_map_error_int(struct mpic *mpic, unsigned int virq, irq_hw_number_t hw);
+extern int mpic_err_int_init(struct mpic *mpic, irq_hw_number_t irqnum);
+extern void mpic_setup_error_int(struct mpic *mpic, int intvec);
+#else
+static inline int mpic_map_error_int(struct mpic *mpic, unsigned int virq, irq_hw_number_t hw)
+{
+ return 0;
+}
+
+
+static inline int mpic_err_int_init(struct mpic *mpic, irq_hw_number_t irqnum)
+{
+ return -1;
+}
+
+static inline void mpic_setup_error_int(struct mpic *mpic, int intvec)
+{
+ return;
+}
+#endif
+
#endif /* _POWERPC_SYSDEV_MPIC_H */
--
1.7.4.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox