LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCHv2 8/8] Documentation: Add device tree bindings for Freescale VF610 sound.
From: Xiubo Li @ 2013-11-01  7:04 UTC (permalink / raw)
  To: r65073, timur, lgirdwood, broonie
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, perex, LW,
	linux, b42378, oskar, grant.likely, devicetree, ian.campbell,
	pawel.moll, swarren, rob.herring, linux-arm-kernel, fabio.estevam,
	linux-kernel, rob, r64188, shawn.guo, linuxppc-dev
In-Reply-To: <1383289495-24523-1-git-send-email-Li.Xiubo@freescale.com>

This adds the Document for Freescale VF610 sound driver under
Documentation/devicetree/bindings/sound/.

Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com>
---
 .../bindings/sound/fsl_audio_sgt15000_vf610.txt    | 46 ++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/fsl_audio_sgt15000_vf610.txt

diff --git a/Documentation/devicetree/bindings/sound/fsl_audio_sgt15000_vf610.txt b/Documentation/devicetree/bindings/sound/fsl_audio_sgt15000_vf610.txt
new file mode 100644
index 0000000..76ff838
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/fsl_audio_sgt15000_vf610.txt
@@ -0,0 +1,46 @@
+Freescale VF610 audio complex with SGTL5000 codec
+
+Required properties:
+- compatible: "fsl,vf610-sgtl5000"
+- model: The user-visible name of this sound complex.
+- saif-controllers: The phandle list of the SAI controller.
+- audio-codec: The phandle of the SGTL5000 audio codec.
+- audio-routing : A list of the connections between audio components.
+  Each entry is a pair of strings, the first being the connection's sink,
+  the second being the connection's source. Valid names could be power
+  supplies, SGTL5000 pins, and the jacks on the board:
+
+  -- Power supplies:
+     * Mic Bias
+
+  -- Board connectors:
+     * Mic Jack
+     * Line In Jack
+     * Headphone Jack
+     * Line Out Jack
+     * Ext Spk
+
+Example:
+
+sound {
+	compatible = "fsl,vf610-sgtl5000";
+	model = "vf610-sgtl5000";
+	saif-controller = <&sai2>;
+	audio-codec = <&codec>;
+	audio-routing =
+		"MIC_IN", "Mic Jack",
+		"Mic Jack", "Mic Bias",
+		"LINE_IN", "Line In Jack",
+		"Headphone Jack", "HP_OUT",
+		"Ext Spk", "LINE_OUT";
+};
+
+&i2c0 {
+	...
+
+	codec: sgtl5000@0a {
+	       compatible = "fsl,sgtl5000";
+	       reg = <0x0a>;
+	       clocks = <&clks VF610_CLK_SAI2>;
+       };
+};
-- 
1.8.4

^ permalink raw reply related

* [PATCHv1 0/8] ALSA: Add SAI driver and enable SGT15000 codec
From: Li Xiubo @ 2013-11-01  7:52 UTC (permalink / raw)
  To: Li Xiubo, Shawn Guo, timur@tabi.org, lgirdwood@gmail.com,
	broonie@kernel.org
  Cc: mark.rutland@arm.com, alsa-devel@alsa-project.org,
	linux-doc@vger.kernel.org, tiwai@suse.de, Huan Wang,
	perex@perex.cz, LW@KARO-electronics.de, linux@arm.linux.org.uk,
	Guangyu Chen, oskar@scara.com, grant.likely@linaro.org,
	devicetree@vger.kernel.org, ian.campbell@citrix.com,
	pawel.moll@arm.com, swarren@wwwdotorg.org,
	rob.herring@calxeda.com, linux-arm-kernel@lists.infradead.org,
	Fabio Estevam, linux-kernel@vger.kernel.org, rob@landley.net,
	Zhengxiong Jin, shawn.guo@linaro.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1383289495-24523-1-git-send-email-Li.Xiubo@freescale.com>

Sorry, forgot the Subject!

Sent again.


>=20
> Hello,
>=20
> This patch series is mostly Freescale's SAI SoC Digital Audio Interface
> driver implementation. And the implementation is only compatible with
> device tree definition.
>=20
> This patch series is based on linux-next and has been tested on Vybrid
> VF610 Tower board using device tree.
>=20
> Changed in v2:
> - Use default settings for the generic dmaengine PCM driver.
> - Separate receive and transmit setting in most functions, but some
> couldn't for the HW limitation.
> - Drop some not reduntant code.
> - Use devm_snd_soc_register_component() instead of
> snd_soc_register_component().
> - Use devm_snd_soc_register_card() instead of
> devm_snd_soc_register_card().
> - Adjust the code sentences sequence.
> - Make the namespacing consistent.
> - Rename CONFIG_SND_SOC_FSL_SGTL5000 to CONFIG_SND_SOC_FSL_SGTL5000_VF610=
.
> - Drop some meaningless lines.
> - Rename the binding document file.
>=20
> Added in v1:
> - Add SAI SoC Digital Audio Interface driver.
> - Add Freescale SAI ALSA SoC Digital Audio Interface node for VF610.
> - Enables SAI ALSA SoC DAI device for Vybrid VF610 TOWER board.
> - Add device tree bindings for Freescale SAI.
> - Revise the bugs about the sgt15000 codec.
> - Add SGT15000 based audio machine driver.
> - Enable SGT15000 codec based audio driver node for VF610.
> - Add device tree bindings for Freescale VF610 sound.
>=20
>=20
>=20
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH RFC v5 3/5] dma: of: Add common xlate function for matching by channel id
From: Arnd Bergmann @ 2013-11-01  7:52 UTC (permalink / raw)
  To: Alexander Popov
  Cc: devicetree, Lars-Peter Clausen, Vinod Koul, Gerhard Sittig,
	Dan Williams, Anatolij Gustschin, linuxppc-dev
In-Reply-To: <1383290374-17484-4-git-send-email-a13xp0p0v88@gmail.com>

On Friday 01 November 2013, Alexander Popov wrote:

> + * of_dma_xlate_by_chan_id - Translate dt property to DMA channel by channel id
> + * @dma_spec:	pointer to DMA specifier as found in the device tree
> + * @of_dma:	pointer to DMA controller data
> + *
> + * This function can be used as the of xlate callback for DMA driver which wants
> + * to match the channel based on the channel id. When using this xlate function
> + * the #dma-cells propety of the DMA controller dt node needs to be set to 1.
> + * The data parameter of of_dma_controller_register must be a pointer to the
> + * dma_device struct the function should match upon.
> + *
> + * Returns pointer to appropriate dma channel on success or NULL on error.
> + */
> +struct dma_chan *of_dma_xlate_by_chan_id(struct of_phandle_args *dma_spec,
> +					 struct of_dma *ofdma)
> +{
> +	struct of_dma_filter_by_chan_id_args args;
> +	dma_cap_mask_t cap;
> +
> +	args.dev = ofdma->of_dma_data;
> +	if (!args.dev)
> +		return NULL;
> +
> +	if (dma_spec->args_count != 1)
> +		return NULL;
> +
> +	dma_cap_zero(cap);
> +	dma_cap_set(DMA_SLAVE, cap);
> +
> +	args.chan_id = dma_spec->args[0];
> +
> +	return dma_request_channel(cap, of_dma_filter_by_chan_id, &args);
> +}
> +EXPORT_SYMBOL_GPL(of_dma_xlate_by_chan_id);

This seems rather clumsy, now that we have added the dma_get_slave_channel interface.
Can you try using that instead of dma_request_channel() now?

^ permalink raw reply

* Re: [PATCH 2/2] of: move definition of of_find_next_cache_node into common code.
From: Benjamin Herrenschmidt @ 2013-11-01  8:16 UTC (permalink / raw)
  To: Sudeep KarkadaNagesha
  Cc: grant.likely@linaro.org, devicetree@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, rob.herring@calxeda.com
In-Reply-To: <527231DA.9060002@arm.com>

On Thu, 2013-10-31 at 10:32 +0000, Sudeep KarkadaNagesha wrote:

> Thanks for the follow up. Grant wanted to see usage of this outside PPC and I
> pointed him[0] to the RFC[1] I had posted to support cacheinfo on ARM.
> 
> These patches are based on v3.12-rc1, let me know if you want me to
> rebase/repost on any particular version.

I've applied them to powerpc-next

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCHv2 1/8] ALSA: Add SAI SoC Digital Audio Interface driver.
From: Nicolin Chen @ 2013-11-01  8:59 UTC (permalink / raw)
  To: Xiubo Li
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur, perex,
	r65073, LW, linux, linux-arm-kernel, grant.likely, devicetree,
	ian.campbell, pawel.moll, swarren, rob.herring, broonie, oskar,
	fabio.estevam, lgirdwood, linux-kernel, rob, r64188, shawn.guo,
	linuxppc-dev
In-Reply-To: <1383289495-24523-2-git-send-email-Li.Xiubo@freescale.com>

Hi Xiubo,

On Fri, Nov 01, 2013 at 03:04:48PM +0800, Xiubo Li wrote:
> This adds Freescale SAI ASoC Audio support.
> This implementation is only compatible with device tree definition.
> Features:
> o Supports playback/capture
> o Supports 16/20/24 bit PCM
> o Supports 8k - 96k sample rates
> o Supports slave mode only.
> 

Just for curiosity, I found there're quite a bit code actually support
I2S master mode such as set_sysclk() and register configuration to FMT
SND_SOC_DAIFMT_CBS_CFS.

Is that really "supports slave mode only"? Or just you haven't make
it properly work?

> diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig
> index b7ab71f..9a8851e 100644
> --- a/sound/soc/fsl/Kconfig
> +++ b/sound/soc/fsl/Kconfig
> @@ -213,3 +213,19 @@ config SND_SOC_IMX_MC13783
>  	select SND_SOC_IMX_PCM_DMA
>  
>  endif # SND_IMX_SOC
> +
> +menuconfig SND_FSL_SOC
> +	tristate "SoC Audio for Freescale FSL CPUs"
> +	help
> +	  Say Y or M if you want to add support for codecs attached to
> +	  the FSL CPUs.
> +
> +	  This will enable Freeacale SAI, SGT15000 codec.
> +
> +if SND_FSL_SOC

Why check this here? SAI should be a regular IP module which can be used by
other SoC as well. We would never know if next i.MX SoC won't contain SAI.

> +
> +config SND_SOC_FSL_SAI
> +	tristate
> +	select SND_SOC_GENERIC_DMAENGINE_PCM

I think you should keep SAI beside SSI/SPDIF directly.

> +
> +endif # SND_FSL_SOC
> diff --git a/sound/soc/fsl/Makefile b/sound/soc/fsl/Makefile
> index 8db705b..e5acc03 100644
> --- a/sound/soc/fsl/Makefile
> +++ b/sound/soc/fsl/Makefile
> @@ -56,3 +56,8 @@ obj-$(CONFIG_SND_SOC_IMX_SGTL5000) += snd-soc-imx-sgtl5000.o
>  obj-$(CONFIG_SND_SOC_IMX_WM8962) += snd-soc-imx-wm8962.o
>  obj-$(CONFIG_SND_SOC_IMX_SPDIF) += snd-soc-imx-spdif.o
>  obj-$(CONFIG_SND_SOC_IMX_MC13783) += snd-soc-imx-mc13783.o
> +
> +# FSL ARM SAI/SGT15000 Platform Support

Should be SGTL5000, not SGT1. And SAI should not be used with SGTL5000 only,
it's a bit confusing to mention SGTL5000 here.

> +snd-soc-fsl-sai-objs := fsl-sai.o

And I think it should be better to put it along with fsl-ssi.o and fsl-spdif.o

> +
> +obj-$(CONFIG_SND_SOC_FSL_SAI) += snd-soc-fsl-sai.o

Ditto

> diff --git a/sound/soc/fsl/fsl-sai.c b/sound/soc/fsl/fsl-sai.c
> new file mode 100644
> index 0000000..bb57e67
> --- /dev/null
> +++ b/sound/soc/fsl/fsl-sai.c
> @@ -0,0 +1,472 @@
> +/*
> + * Freescale SAI ALSA SoC Digital Audio Interface driver.
> + *
> + * Copyright 2012-2013 Freescale Semiconductor, Inc.
> + *
> + * This program is free software; you can redistribute  it and/or modify it
> + * under  the terms of  the GNU General  Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.

There're too many double space inside. Could you pls revise it?

> + *
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/of_address.h>
> +#include <sound/core.h>
> +#include <sound/pcm_params.h>
> +#include <linux/delay.h>
> +#include <linux/dmaengine.h>
> +#include <sound/dmaengine_pcm.h>

I think it's better to keep <sound/xxx.h> together. And basically
we can keep header files ordered by alphabet.

> +
> +#include "fsl-sai.h"
> +
> +static int fsl_sai_set_dai_sysclk_tr(struct snd_soc_dai *cpu_dai,
> +		int clk_id, unsigned int freq, int fsl_dir)
> +{
> +	u32 val_cr2, reg_cr2;
> +	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> +
> +	if (fsl_dir == FSL_FMT_TRANSMITTER)
> +		reg_cr2 = FSL_SAI_TCR2;
> +	else
> +		reg_cr2 = FSL_SAI_RCR2;
> +
> +	val_cr2 = readl(sai->base + reg_cr2);
> +	switch (clk_id) {
> +	case FSL_SAI_CLK_BUS:
> +		val_cr2 &= ~FSL_SAI_CR2_MSEL_MASK;
> +		val_cr2 |= FSL_SAI_CR2_MSEL_BUS;
> +		break;
> +	case FSL_SAI_CLK_MAST1:
> +		val_cr2 &= ~FSL_SAI_CR2_MSEL_MASK;
> +		val_cr2 |= FSL_SAI_CR2_MSEL_MCLK1;
> +		break;
> +	case FSL_SAI_CLK_MAST2:
> +		val_cr2 &= ~FSL_SAI_CR2_MSEL_MASK;
> +		val_cr2 |= FSL_SAI_CR2_MSEL_MCLK2;
> +		break;
> +	case FSL_SAI_CLK_MAST3:
> +		val_cr2 &= ~FSL_SAI_CR2_MSEL_MASK;
> +		val_cr2 |= FSL_SAI_CR2_MSEL_MCLK3;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +	writel(val_cr2, sai->base + reg_cr2);
> +
> +	return 0;
> +}
> +
> +static int fsl_sai_set_dai_sysclk(struct snd_soc_dai *cpu_dai,
> +		int clk_id, unsigned int freq, int dir)
> +{
> +	int ret;
> +
> +	if (dir == SND_SOC_CLOCK_IN)
> +		return 0;
> +
> +	ret = fsl_sai_set_dai_sysclk_tr(cpu_dai, clk_id, freq,
> +					FSL_FMT_TRANSMITTER);
> +	if (ret) {
> +		dev_err(cpu_dai->dev,
> +				"Cannot set sai's transmitter sysclk: %d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	ret = fsl_sai_set_dai_sysclk_tr(cpu_dai, clk_id, freq,
> +					FSL_FMT_RECEIVER);
> +	if (ret) {
> +		dev_err(cpu_dai->dev,
> +				"Cannot set sai's receiver sysclk: %d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int fsl_sai_set_dai_clkdiv(struct snd_soc_dai *cpu_dai,
> +		int div_id, int div)
> +{
> +	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> +	u32 tcr2, rcr2;
> +
> +	if (div_id == FSL_SAI_TX_DIV) {
> +		tcr2 = readl(sai->base + FSL_SAI_TCR2);
> +		tcr2 &= ~FSL_SAI_CR2_DIV_MASK;
> +		tcr2 |= FSL_SAI_CR2_DIV(div);
> +		writel(tcr2, sai->base + FSL_SAI_TCR2);
> +
> +	} else if (div_id == FSL_SAI_RX_DIV) {
> +		rcr2 = readl(sai->base + FSL_SAI_RCR2);
> +		rcr2 &= ~FSL_SAI_CR2_DIV_MASK;
> +		rcr2 |= FSL_SAI_CR2_DIV(div);
> +		writel(rcr2, sai->base + FSL_SAI_RCR2);
> +
> +	} else
> +		return -EINVAL;

It would look better if using switch-case. And the last 'else'
could be 'default:'.

> +
> +	return 0;
> +}
> +
> +static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai *cpu_dai,
> +				unsigned int fmt, int fsl_dir)
> +{
> +	u32 val_cr2, val_cr3, val_cr4, reg_cr2, reg_cr3, reg_cr4;
> +	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> +
> +	if (fsl_dir == FSL_FMT_TRANSMITTER) {
> +		reg_cr2 = FSL_SAI_TCR2;
> +		reg_cr3 = FSL_SAI_TCR3;
> +		reg_cr4 = FSL_SAI_TCR4;
> +	} else {
> +		reg_cr2 = FSL_SAI_RCR2;
> +		reg_cr3 = FSL_SAI_RCR3;
> +		reg_cr4 = FSL_SAI_RCR4;
> +	}
> +
> +	val_cr2 = readl(sai->base + reg_cr2);
> +	val_cr3 = readl(sai->base + reg_cr3);
> +	val_cr4 = readl(sai->base + reg_cr4);
> +
> +	if (sai->fbt == FSL_SAI_FBT_MSB)
> +		val_cr4 |= FSL_SAI_CR4_MF;
> +	else if (sai->fbt == FSL_SAI_FBT_LSB)
> +		val_cr4 &= ~FSL_SAI_CR4_MF;
> +
> +	switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
> +	case SND_SOC_DAIFMT_I2S:
> +		val_cr4 |= FSL_SAI_CR4_FSE;
> +		val_cr4 |= FSL_SAI_CR4_FSP;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	switch (fmt & SND_SOC_DAIFMT_INV_MASK) {
> +	case SND_SOC_DAIFMT_IB_IF:
> +		val_cr4 |= FSL_SAI_CR4_FSP;
> +		val_cr2 &= ~FSL_SAI_CR2_BCP;
> +		break;
> +	case SND_SOC_DAIFMT_IB_NF:
> +		val_cr4 &= ~FSL_SAI_CR4_FSP;
> +		val_cr2 &= ~FSL_SAI_CR2_BCP;
> +		break;
> +	case SND_SOC_DAIFMT_NB_IF:
> +		val_cr4 |= FSL_SAI_CR4_FSP;
> +		val_cr2 |= FSL_SAI_CR2_BCP;
> +		break;
> +	case SND_SOC_DAIFMT_NB_NF:
> +		val_cr4 &= ~FSL_SAI_CR4_FSP;
> +		val_cr2 |= FSL_SAI_CR2_BCP;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
> +	case SND_SOC_DAIFMT_CBS_CFS:
> +		val_cr2 |= FSL_SAI_CR2_BCD_MSTR;
> +		val_cr4 |= FSL_SAI_CR4_FSD_MSTR;
> +		break;
> +	case SND_SOC_DAIFMT_CBM_CFM:
> +		val_cr2 &= ~FSL_SAI_CR2_BCD_MSTR;
> +		val_cr4 &= ~FSL_SAI_CR4_FSD_MSTR;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	val_cr3 |= FSL_SAI_CR3_TRCE;
> +
> +	if (fsl_dir == FSL_FMT_RECEIVER)
> +		val_cr2 |= FSL_SAI_CR2_SYNC;
> +
> +	writel(val_cr2, sai->base + reg_cr2);
> +	writel(val_cr3, sai->base + reg_cr3);
> +	writel(val_cr4, sai->base + reg_cr4);
> +
> +	return 0;

> +

Pls drop this extra line.

> +}
> +
> +static int fsl_sai_set_dai_fmt(struct snd_soc_dai *cpu_dai, unsigned int fmt)
> +{
> +	int ret;
> +
> +	ret = fsl_sai_set_dai_fmt_tr(cpu_dai, fmt, FSL_FMT_TRANSMITTER);
> +	if (ret) {
> +		dev_err(cpu_dai->dev,
> +				"Cannot set sai's transmitter format: %d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	ret = fsl_sai_set_dai_fmt_tr(cpu_dai, fmt, FSL_FMT_RECEIVER);
> +	if (ret) {
> +		dev_err(cpu_dai->dev,
> +				"Cannot set sai's receiver format: %d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int fsl_sai_set_dai_tdm_slot(struct snd_soc_dai *cpu_dai,
> +		unsigned int tx_mask, unsigned int rx_mask,
> +		int slots, int slot_width)
> +{
> +	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> +	u32 tcr4, rcr4;
> +
> +	tcr4 = readl(sai->base + FSL_SAI_TCR4);
> +	tcr4 &= ~FSL_SAI_CR4_FRSZ_MASK;
> +	tcr4 |= FSL_SAI_CR4_FRSZ(2);
> +	writel(tcr4, sai->base + FSL_SAI_TCR4);
> +	writel(tx_mask, sai->base + FSL_SAI_TMR);
> +
> +	rcr4 = readl(sai->base + FSL_SAI_RCR4);
> +	rcr4 &= ~FSL_SAI_CR4_FRSZ_MASK;
> +	rcr4 |= FSL_SAI_CR4_FRSZ(2);
> +	writel(rcr4, sai->base + FSL_SAI_RCR4);
> +	writel(rx_mask, sai->base + FSL_SAI_RMR);
> +
> +	return 0;
> +}
> +
> +static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
> +		struct snd_pcm_hw_params *params,
> +		struct snd_soc_dai *cpu_dai)
> +{
> +	u32 val_cr4, val_cr5, reg_cr4, reg_cr5, word_width;
> +	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> +
> +	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
> +		reg_cr4 = FSL_SAI_TCR4;
> +		reg_cr5 = FSL_SAI_TCR5;
> +	} else {
> +		reg_cr4 = FSL_SAI_RCR4;
> +		reg_cr5 = FSL_SAI_RCR5;
> +	}
> +
> +	val_cr4 = readl(sai->base + reg_cr4);
> +	val_cr4 &= ~FSL_SAI_CR4_SYWD_MASK;
> +
> +	val_cr5 = readl(sai->base + reg_cr5);
> +	val_cr5 &= ~FSL_SAI_CR5_WNW_MASK;
> +	val_cr5 &= ~FSL_SAI_CR5_W0W_MASK;
> +	val_cr5 &= ~FSL_SAI_CR5_FBT_MASK;
> +
> +	switch (params_format(params)) {
> +	case SNDRV_PCM_FORMAT_S16_LE:
> +		word_width = 16;
> +		break;
> +	case SNDRV_PCM_FORMAT_S20_3LE:
> +		word_width = 20;
> +		break;
> +	case SNDRV_PCM_FORMAT_S24_LE:
> +		word_width = 24;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	val_cr4 |= FSL_SAI_CR4_SYWD(word_width);
> +	val_cr5 |= FSL_SAI_CR5_WNW(word_width);
> +	val_cr5 |= FSL_SAI_CR5_W0W(word_width);
> +
> +	if (sai->fbt == FSL_SAI_FBT_MSB)
> +		val_cr5 |= FSL_SAI_CR5_FBT(word_width - 1);
> +	else if (sai->fbt == FSL_SAI_FBT_LSB)
> +		val_cr5 |= FSL_SAI_CR5_FBT(0);
> +
> +	writel(val_cr4, sai->base + reg_cr4);
> +	writel(val_cr5, sai->base + reg_cr5);
> +
> +	return 0;
> +}
> +
> +static int fsl_sai_trigger(struct snd_pcm_substream *substream, int cmd,
> +		struct snd_soc_dai *dai)
> +{
> +	struct fsl_sai *sai = snd_soc_dai_get_drvdata(dai);
> +	unsigned int tcsr, rcsr;
> +
> +	tcsr = readl(sai->base + FSL_SAI_TCSR);
> +	rcsr = readl(sai->base + FSL_SAI_RCSR);
> +
> +	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
> +		tcsr |= FSL_SAI_CSR_FRDE;
> +		rcsr &= ~FSL_SAI_CSR_FRDE;
> +	} else {
> +		rcsr |= FSL_SAI_CSR_FRDE;
> +		tcsr &= ~FSL_SAI_CSR_FRDE;
> +	}
> +
> +	switch (cmd) {
> +	case SNDRV_PCM_TRIGGER_START:
> +	case SNDRV_PCM_TRIGGER_RESUME:
> +	case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
> +		tcsr |= FSL_SAI_CSR_TERE;
> +		rcsr |= FSL_SAI_CSR_TERE;
> +		writel(rcsr, sai->base + FSL_SAI_RCSR);
> +		udelay(10);

Does SAI really needs this udelay() here? Required by IP's operation flow?
If so, I think it's better to add comments here to explain it.

> +		writel(tcsr, sai->base + FSL_SAI_TCSR);
> +		break;
> +
> +	case SNDRV_PCM_TRIGGER_STOP:
> +	case SNDRV_PCM_TRIGGER_SUSPEND:
> +	case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
> +		if (!(dai->playback_active || dai->capture_active)) {
> +			tcsr &= ~FSL_SAI_CSR_TERE;
> +			rcsr &= ~FSL_SAI_CSR_TERE;
> +		}
> +		writel(rcsr, sai->base + FSL_SAI_RCSR);
> +		udelay(10);
> +		writel(tcsr, sai->base + FSL_SAI_TCSR);
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct snd_soc_dai_ops fsl_sai_pcm_dai_ops = {
> +	.set_sysclk	= fsl_sai_set_dai_sysclk,
> +	.set_clkdiv	= fsl_sai_set_dai_clkdiv,
> +	.set_fmt	= fsl_sai_set_dai_fmt,
> +	.set_tdm_slot	= fsl_sai_set_dai_tdm_slot,
> +	.hw_params	= fsl_sai_hw_params,
> +	.trigger	= fsl_sai_trigger,
> +};
> +
> +static int fsl_sai_dai_probe(struct snd_soc_dai *dai)
> +{
> +	int ret;
> +	struct fsl_sai *sai = dev_get_drvdata(dai->dev);
> +
> +	ret = clk_prepare_enable(sai->clk);
> +	if (ret)
> +		return ret;
> +
> +	writel(0x0, sai->base + FSL_SAI_RCSR);
> +	writel(0x0, sai->base + FSL_SAI_TCSR);
> +	writel(sai->dma_params_tx.maxburst, sai->base + FSL_SAI_TCR1);
> +	writel(sai->dma_params_rx.maxburst, sai->base + FSL_SAI_RCR1);
> +
> +	dai->playback_dma_data = &sai->dma_params_tx;
> +	dai->capture_dma_data = &sai->dma_params_rx;
> +
> +	snd_soc_dai_set_drvdata(dai, sai);
> +
> +	return 0;
> +}
> +
> +int fsl_sai_dai_remove(struct snd_soc_dai *dai)

static

> +{
> +	struct fsl_sai *sai = dev_get_drvdata(dai->dev);
> +
> +	clk_disable_unprepare(sai->clk);
> +
> +	return 0;
> +}
> +
> +static struct snd_soc_dai_driver fsl_sai_dai = {
> +	.probe = fsl_sai_dai_probe,
> +	.remove = fsl_sai_dai_remove,
> +	.playback = {
> +		.channels_min = 1,
> +		.channels_max = 2,
> +		.rates = SNDRV_PCM_RATE_8000_96000,
> +		.formats = FSL_SAI_FORMATS,
> +	},
> +	.capture = {
> +		.channels_min = 1,
> +		.channels_max = 2,
> +		.rates = SNDRV_PCM_RATE_8000_96000,
> +		.formats = FSL_SAI_FORMATS,
> +	},
> +	.ops = &fsl_sai_pcm_dai_ops,
> +};
> +
> +static const struct snd_soc_component_driver fsl_component = {
> +	.name           = "fsl-sai",
> +};
> +
> +static int fsl_sai_probe(struct platform_device *pdev)
> +{
> +	struct resource *res;
> +	struct fsl_sai *sai;
> +	int ret = 0;
> +
> +	sai = devm_kzalloc(&pdev->dev, sizeof(*sai), GFP_KERNEL);
> +	if (!sai)
> +		return -ENOMEM;
> +
> +	sai->fbt = FSL_SAI_FBT_MSB;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	sai->base = devm_ioremap_resource(&pdev->dev, res);
> +	if (IS_ERR(sai->base))
> +		return PTR_ERR(sai->base);
> +
> +	sai->clk = devm_clk_get(&pdev->dev, "sai");
> +	if (IS_ERR(sai->clk)) {
> +		dev_err(&pdev->dev, "Cannot get sai's clock: %d\n", ret);
> +		return PTR_ERR(sai->clk);
> +	}
> +
> +	sai->dma_params_rx.addr = res->start + FSL_SAI_RDR;
> +	sai->dma_params_rx.maxburst = 6;
> +
> +	sai->dma_params_tx.addr = res->start + FSL_SAI_TDR;
> +	sai->dma_params_tx.maxburst = 6;
> +
> +	platform_set_drvdata(pdev, sai);
> +
> +	ret = devm_snd_soc_register_component(&pdev->dev, &fsl_component,
> +			&fsl_sai_dai, 1);
> +	if (ret)
> +		return ret;
> +
> +	ret = snd_dmaengine_pcm_register(&pdev->dev, NULL,
> +			SND_DMAENGINE_PCM_FLAG_NO_RESIDUE);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static int fsl_sai_remove(struct platform_device *pdev)
> +{
> +	snd_dmaengine_pcm_unregister(&pdev->dev);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id fsl_sai_ids[] = {
> +	{ .compatible = "fsl,vf610-sai", },

> +	{ /*sentinel*/ }

I think this could be left in blank without comments inside.
And if you really want to add it, pls add like: /* sentinel */
						  ^        ^
> +};
> +
> +static struct platform_driver fsl_sai_driver = {
> +	.probe = fsl_sai_probe,
> +	.remove = fsl_sai_remove,
> +
> +	.driver = {
> +		.name = "fsl-sai",
> +		.owner = THIS_MODULE,
> +		.of_match_table = fsl_sai_ids,
> +	},
> +};
> +module_platform_driver(fsl_sai_driver);
> +
> +MODULE_AUTHOR("Xiubo Li, <Li.Xiubo@freescale.com>");
> +MODULE_AUTHOR("Alison Wang, <b18965@freescale.com>");
> +MODULE_DESCRIPTION("Freescale Soc SAI Interface");
> +MODULE_LICENSE("GPL");

Should be better if added:
MODULE_ALIAS("platform:fsl-sai");

This would support module auto-load feature in some Linux-OS.

Best regards,
Nicolin Chen

> diff --git a/sound/soc/fsl/fsl-sai.h b/sound/soc/fsl/fsl-sai.h
> new file mode 100644
> index 0000000..1637679
> --- /dev/null
> +++ b/sound/soc/fsl/fsl-sai.h
> @@ -0,0 +1,120 @@
> +/*
> + * Copyright 2012-2013 Freescale Semiconductor, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __FSL_SAI_H
> +#define __FSL_SAI_H
> +
> +#include <sound/dmaengine_pcm.h>
> +
> +#define FSL_SAI_FORMATS (SNDRV_PCM_FMTBIT_S16_LE |\
> +			 SNDRV_PCM_FMTBIT_S20_3LE |\
> +			 SNDRV_PCM_FMTBIT_S24_LE)
> +
> +/* SAI Transmit/Recieve Control Register */
> +#define FSL_SAI_TCSR		0x00
> +#define FSL_SAI_RCSR		0x80
> +#define FSL_SAI_CSR_TERE	BIT(31)
> +#define FSL_SAI_CSR_FWF		BIT(17)
> +#define FSL_SAI_CSR_FRIE	BIT(8)
> +#define FSL_SAI_CSR_FRDE	BIT(0)
> +
> +/* SAI Transmit Data/FIFO/MASK Register */
> +#define FSL_SAI_TDR		0x20
> +#define FSL_SAI_TFR		0x40
> +#define FSL_SAI_TMR		0x60
> +
> +/* SAI Recieve Data/FIFO/MASK Register */
> +#define FSL_SAI_RDR		0xa0
> +#define FSL_SAI_RFR		0xc0
> +#define FSL_SAI_RMR		0xe0
> +
> +/* SAI Transmit and Recieve Configuration 1 Register */
> +#define FSL_SAI_TCR1		0x04
> +#define FSL_SAI_RCR1		0x84
> +
> +/* SAI Transmit and Recieve Configuration 2 Register */
> +#define FSL_SAI_TCR2		0x08
> +#define FSL_SAI_RCR2		0x88
> +#define FSL_SAI_CR2_SYNC	BIT(30)
> +#define FSL_SAI_CR2_MSEL_MASK	(0xff << 26)
> +#define FSL_SAI_CR2_MSEL_BUS	0
> +#define FSL_SAI_CR2_MSEL_MCLK1	BIT(26)
> +#define FSL_SAI_CR2_MSEL_MCLK2	BIT(27)
> +#define FSL_SAI_CR2_MSEL_MCLK3	(BIT(26) | BIT(27))
> +#define FSL_SAI_CR2_BCP		BIT(25)
> +#define FSL_SAI_CR2_BCD_MSTR	BIT(24)
> +#define FSL_SAI_CR2_DIV(x)	(x)
> +#define FSL_SAI_CR2_DIV_MASK	0xff
> +
> +/* SAI Transmit and Recieve Configuration 3 Register */
> +#define FSL_SAI_TCR3		0x0c
> +#define FSL_SAI_RCR3		0x8c
> +#define FSL_SAI_CR3_TRCE	BIT(16)
> +#define FSL_SAI_CR3_WDFL(x)	(x)
> +#define FSL_SAI_CR3_WDFL_MASK	0x1f
> +
> +/* SAI Transmit and Recieve Configuration 4 Register */
> +#define FSL_SAI_TCR4		0x10
> +#define FSL_SAI_RCR4		0x90
> +#define FSL_SAI_CR4_FRSZ(x)	(((x) - 1) << 16)
> +#define FSL_SAI_CR4_FRSZ_MASK	(0x1f << 16)
> +#define FSL_SAI_CR4_SYWD(x)	(((x) - 1) << 8)
> +#define FSL_SAI_CR4_SYWD_MASK	(0x1f << 8)
> +#define FSL_SAI_CR4_MF		BIT(4)
> +#define FSL_SAI_CR4_FSE		BIT(3)
> +#define FSL_SAI_CR4_FSP		BIT(1)
> +#define FSL_SAI_CR4_FSD_MSTR	BIT(0)
> +
> +/* SAI Transmit and Recieve Configuration 5 Register */
> +#define FSL_SAI_TCR5		0x14
> +#define FSL_SAI_RCR5		0x94
> +#define FSL_SAI_CR5_WNW(x)	(((x) - 1) << 24)
> +#define FSL_SAI_CR5_WNW_MASK	(0x1f << 24)
> +#define FSL_SAI_CR5_W0W(x)	(((x) - 1) << 16)
> +#define FSL_SAI_CR5_W0W_MASK	(0x1f << 16)
> +#define FSL_SAI_CR5_FBT(x)	((x) << 8)
> +#define FSL_SAI_CR5_FBT_MASK	(0x1f << 8)
> +
> +/* SAI audio dividers */
> +#define FSL_SAI_TX_DIV		0
> +#define FSL_SAI_RX_DIV		1
> +
> +/* SAI type */
> +#define FSL_SAI_DMA		BIT(0)
> +#define FSL_SAI_USE_AC97	BIT(1)
> +#define FSL_SAI_NET		BIT(2)
> +#define FSL_SAI_TRA_SYN		BIT(3)
> +#define FSL_SAI_REC_SYN		BIT(4)
> +#define FSL_SAI_USE_I2S_SLAVE	BIT(5)
> +
> +#define FSL_FMT_TRANSMITTER	0
> +#define FSL_FMT_RECEIVER	1
> +
> +/* SAI clock sources */
> +#define FSL_SAI_CLK_BUS		0
> +#define FSL_SAI_CLK_MAST1	1
> +#define FSL_SAI_CLK_MAST2	2
> +#define FSL_SAI_CLK_MAST3	3
> +
> +enum fsl_sai_fbt {
> +	FSL_SAI_FBT_MSB,
> +	FSL_SAI_FBT_LSB,
> +};
> +
> +struct fsl_sai {
> +	struct clk *clk;
> +
> +	void __iomem *base;
> +
> +	enum fsl_sai_fbt fbt;
> +
> +	struct snd_dmaengine_dai_dma_data dma_params_rx;
> +	struct snd_dmaengine_dai_dma_data dma_params_tx;
> +};
> +
> +#endif /* __FSL_SAI_H */
> -- 
> 1.8.4
> 

^ permalink raw reply

* Re: [PATCH v4 0/4] Add dual-fifo mode support of i.MX ssi
From: Shawn Guo @ 2013-11-01  9:11 UTC (permalink / raw)
  To: Nicolin Chen, Vinod Koul
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	s.hauer, swarren, timur, rob.herring, linux-kernel, broonie,
	dmaengine, ijc+devicetree, linuxppc-dev, linux-arm-kernel
In-Reply-To: <cover.1383226444.git.b42378@freescale.com>

Nicolin,

The dmaengine maintainer Vinod Koul should be copied on the series.
(Just added).

On Thu, Oct 31, 2013 at 09:44:12PM +0800, Nicolin Chen wrote:
> Nicolin Chen (4):
>   dma: imx-sdma: Add sdma firmware version 2 support
>   dma: imx-sdma: Add new dma type for ssi dual fifo script
>   ASoC: fsl_ssi: Add dual fifo mode support
>   ARM: dts: imx: use dual-fifo sdma script for ssi

Vinod,

To keep git bisect, the series has to be merged through single tree.
Since Mark has ACK-ed patch #3, I think it's easier to merge the series
through your tree, so for patch #4:

Acked-by: Shawn Guo <shawn.guo@linaro.org>

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-01  9:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, LKML, Oleg Nesterov,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131031151955.GY19466@laptop.lan>

On Thu, Oct 31, 2013 at 04:19:55PM +0100, Peter Zijlstra wrote:
> On Thu, Oct 31, 2013 at 08:07:56AM -0700, Paul E. McKenney wrote:
> > On Thu, Oct 31, 2013 at 10:04:57AM +0100, Peter Zijlstra wrote:
> > > On Wed, Oct 30, 2013 at 09:32:58PM -0700, Paul E. McKenney wrote:
> > > > Before C/C++11, the closest thing to such a prohibition is use of
> > > > volatile, for example, ACCESS_ONCE().  Even in C/C++11, you have to
> > > > use atomics to get anything resembing this prohibition.
> > > > 
> > > > If you just use normal variables, the compiler is within its rights
> > > > to transform something like the following:
> > > > 
> > > > 	if (a)
> > > > 		b = 1;
> > > > 	else
> > > > 		b = 42;
> > > > 
> > > > Into:
> > > > 
> > > > 	b = 42;
> > > > 	if (a)
> > > > 		b = 1;
> > > > 
> > > > Many other similar transformations are permitted.  Some are used to all
> > > > vector instructions to be used -- the compiler can do a write with an
> > > > overly wide vector instruction, then clean up the clobbered variables
> > > > later, if it wishes.  Again, if the variables are not marked volatile,
> > > > or, in C/C++11, atomic.
> > > 
> > > While I've heard you tell this story before, my mind keeps boggling how
> > > we've been able to use shared memory at all, all these years.
> > > 
> > > It seems to me stuff should have broken left, right and center if
> > > compilers were really aggressive about this.
> > 
> > Sometimes having stupid compilers is a good thing.  But they really are
> > getting more aggressive.
> 
> But surely we cannot go mark all data structures lodged in shared memory
> as volatile, that's insane.
> 
> I'm sure you're quite worried about this as well. Suppose we have:
> 
> struct foo {
> 	unsigned long value;
> 	void *ptr;
> 	unsigned long value1;
> };
> 
> And our ptr member is RCU managed. Then while the assignment using:
> rcu_assign_ptr() will use the volatile cast, what stops the compiler
> from wrecking ptr while writing either of the value* members and
> 'fixing' her up after?

Nothing at all!

We can reduce the probability by putting the pointer at one end or the
other, so that if the compiler uses (say) vector instructions to aggregate
individual assignments to the other fields, it will be less likely to hit
"ptr".  But yes, this is ugly and it would be really hard to get all
this right, and would often conflict with cache-locality needs.

> This is a completely untenable position.

Indeed it is!

C/C++ never was intended to be used for parallel programming, and this is
but one of the problems that can arise when we nevertheless use it for
parallel programming.  As compilers get smarter (for some definition of
"smarter") and as more systems have special-purpose hardware (such as
vector units) that are visible to the compiler, we can expect more of
this kind of trouble.

This was one of many reasons that I decided to help with the C/C++11
effort, whatever anyone might think about the results.

> How do the C/C++ people propose to deal with this?

By marking "ptr" as atomic, thus telling the compiler not to mess with it.
And thus requiring that all accesses to it be decorated, which in the
case of RCU could be buried in the RCU accessors.

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH v4 0/4] Add dual-fifo mode support of i.MX ssi
From: Nicolin Chen @ 2013-11-01  9:33 UTC (permalink / raw)
  To: Shawn Guo
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	Vinod Koul, s.hauer, swarren, timur, rob.herring, linux-kernel,
	broonie, dmaengine, ijc+devicetree, linuxppc-dev,
	linux-arm-kernel
In-Reply-To: <20131101091156.GC10324@S2101-09.ap.freescale.net>

On Fri, Nov 01, 2013 at 05:11:58PM +0800, Shawn Guo wrote:
> Nicolin,
> 
> The dmaengine maintainer Vinod Koul should be copied on the series.
> (Just added).
>

Ah, I was careless last night when I sent these patches. My VPN connection
was pretty laggy :( Please forgive me, Vinod.

Thank Shawn for adding for me. Much obliged.

> On Thu, Oct 31, 2013 at 09:44:12PM +0800, Nicolin Chen wrote:
> > Nicolin Chen (4):
> >   dma: imx-sdma: Add sdma firmware version 2 support
> >   dma: imx-sdma: Add new dma type for ssi dual fifo script
> >   ASoC: fsl_ssi: Add dual fifo mode support
> >   ARM: dts: imx: use dual-fifo sdma script for ssi
> 
> Vinod,
> 
> To keep git bisect, the series has to be merged through single tree.
> Since Mark has ACK-ed patch #3, I think it's easier to merge the series
> through your tree, so for patch #4:
> 
> Acked-by: Shawn Guo <shawn.guo@linaro.org>

^ permalink raw reply

* Re: [PATCH RFC v5 1/5] dma: mpc512x: reorder mpc8308 specific instructions
From: Anatolij Gustschin @ 2013-11-01 10:04 UTC (permalink / raw)
  To: Alexander Popov
  Cc: devicetree, Lars-Peter Clausen, Arnd Bergmann, Vinod Koul,
	Gerhard Sittig, Dan Williams, linuxppc-dev
In-Reply-To: <1383290374-17484-2-git-send-email-a13xp0p0v88@gmail.com>

On Fri,  1 Nov 2013 11:19:30 +0400
Alexander Popov <a13xp0p0v88@gmail.com> wrote:

> Concentrate the specific code for MPC8308 in the 'if' branch
> and handle MPC512x in the 'else' branch.
> This modification only reorders instructions but doesn't change behaviour.
> 
> Signed-off-by: Alexander Popov <a13xp0p0v88@gmail.com>
> ---
>  drivers/dma/mpc512x_dma.c | 42 +++++++++++++++++++++++++-----------------
>  1 file changed, 25 insertions(+), 17 deletions(-)

Acked-by: Anatolij Gustschin <agust@denx.de>

> 
> diff --git a/drivers/dma/mpc512x_dma.c b/drivers/dma/mpc512x_dma.c
> index 2fe4353..f41639f 100644
> --- a/drivers/dma/mpc512x_dma.c
> +++ b/drivers/dma/mpc512x_dma.c
> @@ -50,9 +50,17 @@
>  #define MPC_DMA_DESCRIPTORS	64
>  
>  /* Macro definitions */
> -#define MPC_DMA_CHANNELS	64
>  #define MPC_DMA_TCD_OFFSET	0x1000
>  
> +/*
> + * Maximum channel counts for individual hardware variants
> + * and the maximum channel count over all supported controllers,
> + * used for data structure size
> + */
> +#define MPC8308_DMACHAN_MAX	16
> +#define MPC512x_DMACHAN_MAX	64
> +#define MPC_DMA_CHANNELS	64
> +
>  /* Arbitration mode of group and channel */
>  #define MPC_DMA_DMACR_EDCG	(1 << 31)
>  #define MPC_DMA_DMACR_ERGA	(1 << 3)
> @@ -708,10 +716,10 @@ static int mpc_dma_probe(struct platform_device *op)
>  
>  	dma = &mdma->dma;
>  	dma->dev = dev;
> -	if (!mdma->is_mpc8308)
> -		dma->chancnt = MPC_DMA_CHANNELS;
> +	if (mdma->is_mpc8308)
> +		dma->chancnt = MPC8308_DMACHAN_MAX;
>  	else
> -		dma->chancnt = 16; /* MPC8308 DMA has only 16 channels */
> +		dma->chancnt = MPC512x_DMACHAN_MAX;
>  	dma->device_alloc_chan_resources = mpc_dma_alloc_chan_resources;
>  	dma->device_free_chan_resources = mpc_dma_free_chan_resources;
>  	dma->device_issue_pending = mpc_dma_issue_pending;
> @@ -745,7 +753,19 @@ static int mpc_dma_probe(struct platform_device *op)
>  	 * - Round-robin group arbitration,
>  	 * - Round-robin channel arbitration.
>  	 */
> -	if (!mdma->is_mpc8308) {
> +	if (mdma->is_mpc8308) {
> +		/* MPC8308 has 16 channels and lacks some registers */
> +		out_be32(&mdma->regs->dmacr, MPC_DMA_DMACR_ERCA);
> +
> +		/* enable snooping */
> +		out_be32(&mdma->regs->dmagpor, MPC_DMA_DMAGPOR_SNOOP_ENABLE);
> +		/* Disable error interrupts */
> +		out_be32(&mdma->regs->dmaeeil, 0);
> +
> +		/* Clear interrupts status */
> +		out_be32(&mdma->regs->dmaintl, 0xFFFF);
> +		out_be32(&mdma->regs->dmaerrl, 0xFFFF);
> +	} else {
>  		out_be32(&mdma->regs->dmacr, MPC_DMA_DMACR_EDCG |
>  					MPC_DMA_DMACR_ERGA | MPC_DMA_DMACR_ERCA);
>  
> @@ -766,18 +786,6 @@ static int mpc_dma_probe(struct platform_device *op)
>  		/* Route interrupts to IPIC */
>  		out_be32(&mdma->regs->dmaihsa, 0);
>  		out_be32(&mdma->regs->dmailsa, 0);
> -	} else {
> -		/* MPC8308 has 16 channels and lacks some registers */
> -		out_be32(&mdma->regs->dmacr, MPC_DMA_DMACR_ERCA);
> -
> -		/* enable snooping */
> -		out_be32(&mdma->regs->dmagpor, MPC_DMA_DMAGPOR_SNOOP_ENABLE);
> -		/* Disable error interrupts */
> -		out_be32(&mdma->regs->dmaeeil, 0);
> -
> -		/* Clear interrupts status */
> -		out_be32(&mdma->regs->dmaintl, 0xFFFF);
> -		out_be32(&mdma->regs->dmaerrl, 0xFFFF);
>  	}
>  
>  	/* Register DMA engine */

^ permalink raw reply

* Re: [PATCHv2 5/8] ASoC: SGTL5000: Enhance the SGTL5000 codec driver about regulator.
From: Nicolin Chen @ 2013-11-01 10:02 UTC (permalink / raw)
  To: Xiubo Li
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur, perex,
	r65073, LW, linux, linux-arm-kernel, grant.likely, devicetree,
	ian.campbell, pawel.moll, swarren, rob.herring, broonie, oskar,
	fabio.estevam, lgirdwood, linux-kernel, rob, r64188, shawn.guo,
	linuxppc-dev
In-Reply-To: <1383289495-24523-6-git-send-email-Li.Xiubo@freescale.com>

On Fri, Nov 01, 2013 at 03:04:52PM +0800, Xiubo Li wrote:
> On VF610 series there are no regulators used, and now whether the
> CONFIG_REGULATOR mirco is enabled or not, for the VF610 audio

micro? or macro?

> patch series, the board cannot be probe successfully.
> And this patch will solve this issue.
> 

I finally got your idea about what this patch does. But it seems to
be more likely a work around. Maybe I here can't think it through
comprehensively, but I think you can try to add a fixed regulator to
sgtl5000 in the devicetree, rather than disabling common functions
even if not breaking others.

> Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com>
> ---
>  sound/soc/codecs/sgtl5000.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c
> index 1f4093f..c2f6d86 100644
> --- a/sound/soc/codecs/sgtl5000.c
> +++ b/sound/soc/codecs/sgtl5000.c
> @@ -61,6 +61,7 @@ static const struct reg_default sgtl5000_reg_defaults[] = {
>  	{ SGTL5000_DAP_AVC_DECAY,		0x0050 },
>  };
>  
> +#ifdef CONFIG_REGULATOR
>  /* regulator supplies for sgtl5000, VDDD is an optional external supply */
>  enum sgtl5000_regulator_supplies {
>  	VDDA,
> @@ -93,6 +94,9 @@ static struct regulator_init_data ldo_init_data = {
>  	.num_consumer_supplies = 1,
>  	.consumer_supplies = &ldo_consumer[0],
>  };
> +#else
> +#define SGTL5000_SUPPLY_NUM 0
> +#endif
>  
>  /*
>   * sgtl5000 internal ldo regulator,
> @@ -112,7 +116,9 @@ struct sgtl5000_priv {
>  	int master;	/* i2s master or not */
>  	int fmt;	/* i2s data format */
>  	struct regulator_bulk_data supplies[SGTL5000_SUPPLY_NUM];
> +#ifdef CONFIG_REGULATOR
>  	struct ldo_regulator *ldo;
> +#endif
>  	struct regmap *regmap;
>  	struct clk *mclk;
>  };
> @@ -879,6 +885,7 @@ static int ldo_regulator_remove(struct snd_soc_codec *codec)
>  	return 0;
>  }
>  #else
> +#ifndef CONFIG_SND_SOC_FSL_SGTL5000_VF610

Checking a platform or SoC specific define here is just a bit....

Could you pls find a better solution? Like I said at first, use fixed regulator
or figure out why your System would hang with current sgtl5000 driver. If it
really has a critical bug in its regulator-related code, I think you can then
make a greater contribution by fixing the bug rather than bypassing it.

Best regards,
Nicolin Chen

>  static int ldo_regulator_register(struct snd_soc_codec *codec,
>  				struct regulator_init_data *init_data,
>  				int voltage)
> @@ -886,6 +893,7 @@ static int ldo_regulator_register(struct snd_soc_codec *codec,
>  	dev_err(codec->dev, "this setup needs regulator support in the kernel\n");
>  	return -EINVAL;
>  }
> +#endif
>  
>  static int ldo_regulator_remove(struct snd_soc_codec *codec)
>  {
> @@ -1137,6 +1145,7 @@ static int sgtl5000_resume(struct snd_soc_codec *codec)
>  #define sgtl5000_resume  NULL
>  #endif	/* CONFIG_SUSPEND */
>  
> +#ifdef CONFIG_REGULATOR
>  /*
>   * sgtl5000 has 3 internal power supplies:
>   * 1. VAG, normally set to vdda/2
> @@ -1373,6 +1382,7 @@ err_regulator_free:
>  	return ret;
>  
>  }
> +#endif
>  
>  static int sgtl5000_probe(struct snd_soc_codec *codec)
>  {
> @@ -1387,6 +1397,7 @@ static int sgtl5000_probe(struct snd_soc_codec *codec)
>  		return ret;
>  	}
>  
> +#ifdef CONFIG_REGULATOR
>  	ret = sgtl5000_enable_regulators(codec);
>  	if (ret)
>  		return ret;
> @@ -1395,6 +1406,7 @@ static int sgtl5000_probe(struct snd_soc_codec *codec)
>  	ret = sgtl5000_set_power_regs(codec);
>  	if (ret)
>  		goto err;
> +#endif
>  
>  	/* enable small pop, introduce 400ms delay in turning off */
>  	snd_soc_update_bits(codec, SGTL5000_CHIP_REF_CTRL,
> -- 
> 1.8.4
> 

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Peter Zijlstra @ 2013-11-01 10:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, LKML, Oleg Nesterov,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131101092814.GG4067@linux.vnet.ibm.com>

On Fri, Nov 01, 2013 at 02:28:14AM -0700, Paul E. McKenney wrote:
> > This is a completely untenable position.
> 
> Indeed it is!
> 
> C/C++ never was intended to be used for parallel programming, 

And yet pretty much all kernels ever written for SMP systems are written
in it; what drugs are those people smoking?

Furthermore there's a gazillion parallel userspace programs.

> and this is
> but one of the problems that can arise when we nevertheless use it for
> parallel programming.  As compilers get smarter (for some definition of
> "smarter") and as more systems have special-purpose hardware (such as
> vector units) that are visible to the compiler, we can expect more of
> this kind of trouble.
> 
> This was one of many reasons that I decided to help with the C/C++11
> effort, whatever anyone might think about the results.

Well, I applaud your efforts, but given the results I think the C/C++
people are all completely insane.

> > How do the C/C++ people propose to deal with this?
> 
> By marking "ptr" as atomic, thus telling the compiler not to mess with it.
> And thus requiring that all accesses to it be decorated, which in the
> case of RCU could be buried in the RCU accessors.

This seems contradictory; marking it atomic would look like:

struct foo {
	unsigned long value;
	__atomic void *ptr;
	unsigned long value1;
};

Clearly we cannot hide this definition in accessors, because then
accesses to value* won't see the annotation.

That said; mandating we mark all 'shared' data with __atomic is
completely untenable and is not backwards compatible.

To be safe we must assume all data shared unless indicated otherwise.

^ permalink raw reply

* Re: [PATCHv2 6/8] ASoC: fsl: add SGTL5000 based audio machine driver.
From: Nicolin Chen @ 2013-11-01 10:28 UTC (permalink / raw)
  To: Xiubo Li, shawn.guo
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur,
	lgirdwood, r65073, LW, linux, linux-arm-kernel, grant.likely,
	devicetree, ian.campbell, pawel.moll, swarren, rob.herring,
	broonie, perex, oskar, fabio.estevam, linux-kernel, rob, r64188,
	linuxppc-dev
In-Reply-To: <1383289495-24523-7-git-send-email-Li.Xiubo@freescale.com>

Hi Xiubo,

On Fri, Nov 01, 2013 at 03:04:53PM +0800, Xiubo Li wrote:
> This is the SGTL5000 codec based audio driver supported with both
> playback and capture dai link implemention.
> 
> This implementation is only compatible with device tree definition.
> 
> Signed-off-by: Alison Wang <b18965@freescale.com
> Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com>
> 
> Conflicts:
> 	sound/soc/fsl/Makefile
> ---
>  sound/soc/fsl/Kconfig              |  10 ++
>  sound/soc/fsl/Makefile             |   2 +

>  sound/soc/fsl/fsl-sgtl5000-vf610.c | 208 +++++++++++++++++++++++++++++++++++++

I just doubt if this file naming is appropriate. Even if we might not have
rigor rule for the file names, according to existing ones, they are all in
a same pattern: [SoC name]-[codec name].c

"imx-sgtl5000.c" for example

I think it would make user less confused about what this file exactly is if
this machine driver also follow the pattern: vf610-sgtl5000.c


@Shawn

What do you think about the file name?

>  3 files changed, 220 insertions(+)
>  create mode 100644 sound/soc/fsl/fsl-sgtl5000-vf610.c
> 
> diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig
> index 9a8851e..1b835ba 100644
> --- a/sound/soc/fsl/Kconfig
> +++ b/sound/soc/fsl/Kconfig
> @@ -228,4 +228,14 @@ config SND_SOC_FSL_SAI
>  	tristate
>  	select SND_SOC_GENERIC_DMAENGINE_PCM
>  
> +config SND_SOC_FSL_SGTL5000_VF610

Same problem with the this define.

> +	tristate "SoC Audio support for FSL boards with sgtl5000"

And 'FSL' here confuses me a lot. Because those boards based on i.MX series
also could be called FSL boards.

> +	depends on OF && I2C
> +	select SND_SOC_FSL_SAI
> +	select SND_SOC_FSL_PCM
> +	select SND_SOC_SGTL5000
> +	help
> +	  Say Y if you want to add support for SoC audio on an FSL board with
> +	  a sgtl5000 codec.
> +
>  endif # SND_FSL_SOC
> diff --git a/sound/soc/fsl/Makefile b/sound/soc/fsl/Makefile
> index e5acc03..26fc551 100644
> --- a/sound/soc/fsl/Makefile
> +++ b/sound/soc/fsl/Makefile
> @@ -59,5 +59,7 @@ obj-$(CONFIG_SND_SOC_IMX_MC13783) += snd-soc-imx-mc13783.o
>  
>  # FSL ARM SAI/SGT15000 Platform Support
>  snd-soc-fsl-sai-objs := fsl-sai.o
> +snd-soc-fsl-sgtl5000-vf610-objs := fsl-sgtl5000-vf610.o
>  
>  obj-$(CONFIG_SND_SOC_FSL_SAI) += snd-soc-fsl-sai.o
> +obj-$(CONFIG_SND_SOC_FSL_SGTL5000_VF610) += snd-soc-fsl-sgtl5000-vf610.o
> diff --git a/sound/soc/fsl/fsl-sgtl5000-vf610.c b/sound/soc/fsl/fsl-sgtl5000-vf610.c
> new file mode 100644
> index 0000000..f535b42
> --- /dev/null
> +++ b/sound/soc/fsl/fsl-sgtl5000-vf610.c
> @@ -0,0 +1,208 @@
> +/*
> + * Freeacale ALSA SoC Audio using SGT1500 as codec.
> + *
> + * Copyright 2012-2013 Freescale Semiconductor, Inc.
> + *
> + * The code contained herein is licensed under the GNU General Public
> + * License. You may obtain a copy of the GNU General Public License
> + * Version 2 or later at the following locations:
> + *
> + */
> +
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_platform.h>
> +#include <linux/i2c.h>
> +#include <linux/clk.h>
> +
> +#include "../codecs/sgtl5000.h"
> +#include "fsl-sai.h"
> +
> +static unsigned int sysclk_rate;
> +
> +static int fsl_sgtl5000_dai_init(struct snd_soc_pcm_runtime *rtd)

Naming issue here again.

At least from my point of view, if you actually merged imx-sgtl5000 with
vf610-sgtl5000 and made it also compatible to other freescale SoCs, you
could then fairly call it fsl_sgtl5000_xxxx.

Well, I might be a little picky here because it's a static function and
won't conflict others. Just the name here doesn't look so explicit to me.

Please reconsider about this whole file's naming.

Best regards,
Nicolin Chen

> +{
> +	int ret;
> +	struct device *dev = rtd->card->dev;
> +
> +	ret = snd_soc_dai_set_sysclk(rtd->codec_dai, SGTL5000_SYSCLK,
> +				     sysclk_rate, SND_SOC_CLOCK_IN);
> +	if (ret) {
> +		dev_err(dev, "could not set codec driver clock params :%d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	ret = snd_soc_dai_set_sysclk(rtd->cpu_dai, FSL_SAI_CLK_BUS,
> +				     sysclk_rate, SND_SOC_CLOCK_OUT);
> +	if (ret) {
> +		dev_err(dev, "could not set cpu dai driver clock params :%d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int sgtl5000_params(struct snd_pcm_substream *substream,
> +		struct snd_pcm_hw_params *params)
> +{
> +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> +	struct snd_soc_dai *cpu_dai = rtd->cpu_dai;
> +	unsigned int channels = params_channels(params);
> +
> +	/* TODO: The SAI driver should figure this out for us */
> +	switch (channels) {
> +	case 2:
> +		snd_soc_dai_set_tdm_slot(cpu_dai, 0xfffffffc, 0xfffffffc, 2, 0);
> +		break;
> +	case 1:
> +		snd_soc_dai_set_tdm_slot(cpu_dai, 0xfffffffe, 0xfffffffe, 1, 0);
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static struct snd_soc_ops fsl_sgtl5000_hifi_ops = {
> +	.hw_params = sgtl5000_params,
> +};
> +
> +static struct snd_soc_dai_link fsl_sgtl5000_dai = {
> +	.name = "HiFi",
> +	.stream_name = "HiFi",
> +	.codec_dai_name = "sgtl5000",
> +	.init = &fsl_sgtl5000_dai_init,
> +	.dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
> +		SND_SOC_DAIFMT_CBM_CFM,
> +	.ops = &fsl_sgtl5000_hifi_ops,
> +};
> +
> +static const struct snd_soc_dapm_widget fsl_sgtl5000_dapm_widgets[] = {
> +	SND_SOC_DAPM_MIC("Mic Jack", NULL),
> +	SND_SOC_DAPM_LINE("Line In Jack", NULL),
> +	SND_SOC_DAPM_HP("Headphone Jack", NULL),
> +	SND_SOC_DAPM_SPK("Line Out Jack", NULL),
> +	SND_SOC_DAPM_SPK("Ext Spk", NULL),
> +};
> +
> +static struct snd_soc_card fsl_sgt1500_card = {
> +	.owner = THIS_MODULE,
> +	.num_links = 1,
> +	.dai_link = &fsl_sgtl5000_dai,
> +	.dapm_widgets = fsl_sgtl5000_dapm_widgets,
> +	.num_dapm_widgets = ARRAY_SIZE(fsl_sgtl5000_dapm_widgets),
> +};
> +
> +static int fsl_sgtl5000_parse_dt(struct platform_device *pdev)
> +{
> +	int ret;
> +	struct device_node *sai_np, *codec_np;
> +	struct clk *codec_clk;
> +	struct i2c_client *codec_dev;
> +	struct device_node *np = pdev->dev.of_node;
> +
> +	ret = snd_soc_of_parse_card_name(&fsl_sgt1500_card, "model");
> +	if (ret)
> +		return ret;
> +
> +	ret = snd_soc_of_parse_audio_routing(&fsl_sgt1500_card,
> +			"audio-routing");
> +	if (ret)
> +		return ret;
> +
> +	sai_np = of_parse_phandle(np, "saif-controller", 0);
> +	if (!sai_np) {
> +		dev_err(&pdev->dev, "\"saif-controller\" phandle missing or "
> +				"invalid\n");
> +		return -EINVAL;
> +	}
> +	fsl_sgtl5000_dai.cpu_of_node = sai_np;
> +	fsl_sgtl5000_dai.platform_of_node = sai_np;
> +
> +	codec_np = of_parse_phandle(np, "audio-codec", 0);
> +	if (!codec_np) {
> +		dev_err(&pdev->dev, "\"audio-codec\" phandle missing or "
> +				"invalid\n");
> +		ret = -EINVAL;
> +		goto sai_np_fail;
> +	}
> +	fsl_sgtl5000_dai.codec_of_node = codec_np;
> +
> +	codec_dev = of_find_i2c_device_by_node(codec_np);
> +	if (!codec_dev) {
> +		dev_err(&pdev->dev, "failed to find codec platform device\n");
> +		ret = PTR_ERR(codec_dev);
> +		goto codec_np_fail;
> +	}
> +
> +	codec_clk = devm_clk_get(&codec_dev->dev, NULL);
> +	if (IS_ERR(codec_clk)) {
> +		dev_err(&pdev->dev, "failed to get codec clock\n");
> +		ret = PTR_ERR(codec_clk);
> +		goto codec_np_fail;
> +	}
> +
> +	sysclk_rate = clk_get_rate(codec_clk);
> +
> +codec_np_fail:
> +	of_node_put(codec_np);
> +sai_np_fail:
> +	of_node_put(sai_np);
> +
> +	return ret;
> +}
> +
> +static int fsl_sgtl5000_probe(struct platform_device *pdev)
> +{
> +	int ret;
> +
> +	fsl_sgt1500_card.dev = &pdev->dev;
> +
> +	ret = fsl_sgtl5000_parse_dt(pdev);
> +	if (ret) {
> +		dev_err(&pdev->dev,
> +				"parse sgtl5000 device tree failed :%d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	ret = devm_snd_soc_register_card(&pdev->dev, &fsl_sgt1500_card);
> +	if (ret) {
> +		dev_err(&pdev->dev, "register soc sound card failed :%d\n",
> +				ret);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int fsl_sgtl5000_remove(struct platform_device *pdev)
> +{
> +	snd_soc_unregister_card(&fsl_sgt1500_card);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id fsl_sgtl5000_dt_ids[] = {
> +	{ .compatible = "fsl,vf610-sgtl5000", },
> +	{ /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, fsl_sgtl5000_dt_ids);
> +
> +static struct platform_driver fsl_sgtl5000_driver = {
> +	.driver = {
> +		.name = "fsl-sgtl5000",
> +		.owner = THIS_MODULE,
> +		.of_match_table = fsl_sgtl5000_dt_ids,
> +	},
> +	.probe = fsl_sgtl5000_probe,
> +	.remove = fsl_sgtl5000_remove,
> +};
> +module_platform_driver(fsl_sgtl5000_driver);
> +
> +MODULE_AUTHOR("Xiubo Li <Li.Xiubo@freescale.com>");
> +MODULE_DESCRIPTION("Freescale SGTL5000 ASoC driver");
> +MODULE_LICENSE("GPL");
> -- 
> 1.8.4
> 

^ permalink raw reply

* Re: [PATCHv2 6/8] ASoC: fsl: add SGTL5000 based audio machine driver.
From: Oskar Schirmer @ 2013-11-01 10:17 UTC (permalink / raw)
  To: Xiubo Li
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur, perex,
	r65073, LW, linux, b42378, grant.likely, devicetree, ian.campbell,
	pawel.moll, swarren, rob.herring, broonie, linux-arm-kernel,
	fabio.estevam, lgirdwood, linux-kernel, rob, r64188, shawn.guo,
	linuxppc-dev
In-Reply-To: <1383289495-24523-7-git-send-email-Li.Xiubo@freescale.com>

Not that it would improve functionality, but:

On Fri, Nov 01, 2013 at 15:04:53 +0800, Xiubo Li wrote:
[...]
> diff --git a/sound/soc/fsl/fsl-sgtl5000-vf610.c b/sound/soc/fsl/fsl-sgtl5000-vf610.c
> new file mode 100644
> index 0000000..f535b42
> --- /dev/null
> +++ b/sound/soc/fsl/fsl-sgtl5000-vf610.c
> @@ -0,0 +1,208 @@
> +/*
> + * Freeacale ALSA SoC Audio using SGT1500 as codec.
          ^                             ^   ^
     "Freescale"                    "SGTL5000"

regards,
  Oskar

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-10-31 15:25 UTC (permalink / raw)
  To: Victor Kaplansky
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <OFDF23985D.1D949148-ON42257C15.002DEB25-42257C15.0036DF93@il.ibm.com>

On Thu, Oct 31, 2013 at 11:59:21AM +0200, Victor Kaplansky wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
> 06:32:58 AM:
> 
> > If you want to play the "omit memory barriers" game, then proving a
> > negative is in fact the task before you.
> 
> Generally it is not fair. Otherwise, anyone could put an smp_mb() at a
> random place, and expect others to "prove" that it is not needed.
> 
> It is not fair also because it should be virtually impossible to prove lack
> of any problem. OTH, if a problem exists, it should be easy for proponents
> of a memory barrier to build a test case or design a scenario demonstrating
> the problem.

I really don't care about "fair" -- I care instead about the kernel
working reliably.

And it should also be easy for proponents of removing memory barriers to
clearly articulate what orderings their code does and does not need.

> Actually, advocates of the memory barrier in our case do have an argument -
> - the rule of thumb saying that barriers should be paired. I consider this
> rule only as a general recommendation to look into potentially risky
> places.
> And indeed, in our case if the store to circular wasn't conditional, it
> would require a memory barrier to prevent the store to be performed before
> the read of @tail. But in our case the store is conditional, so no memory
> barrier is required.

You are assuming control dependencies that the C language does not
provide.  Now, for all I know right now, there might well be some other
reason why a full barrier is not required, but the "if" statement cannot
be that reason.

Please review section 1.10 of the C++11 standard (or the corresponding
section of the C11 standard, if you prefer).  The point is that the
C/C++11 covers only data dependencies, not control dependencies.

> > And the correctness of this code has been called into question.  :-(
> > An embarrassingly long time ago -- I need to get this either proven
> > or fixed.
> 
> I agree.

Glad we agree on something!

> > Before C/C++11, the closest thing to such a prohibition is use of
> > volatile, for example, ACCESS_ONCE().  Even in C/C++11, you have to
> > use atomics to get anything resembing this prohibition.
> >
> > If you just use normal variables, the compiler is within its rights
> > to transform something like the following:
> >
> >    if (a)
> >       b = 1;
> >    else
> >       b = 42;
> >
> > Into:
> >
> >    b = 42;
> >    if (a)
> >       b = 1;
> >
> > Many other similar transformations are permitted.  Some are used to all
> > vector instructions to be used -- the compiler can do a write with an
> > overly wide vector instruction, then clean up the clobbered variables
> > later, if it wishes.  Again, if the variables are not marked volatile,
> > or, in C/C++11, atomic.
> 
> All this can justify only compiler barrier() which is almost free from
> performance point of view, since current gcc anyway doesn't perform store
> hoisting optimization in our case.

If the above example doesn't get you to give up your incorrect assumption
about "if" statements having much effect on ordering, you need more help
than I can give you just now.

> (And I'm not getting into philosophical discussion whether kernel code
> should consider future possible bugs/features in gcc or C/C++11
> standard).

Should you wish to get into that discussion in the near future, you
will need to find someone else to discuss it with.

> > The compilers don't always know as much as they might about the
> underlying
> > hardware's memory model.
> 
> That's correct in general. But can you point out a problem that really
> exists?

We will see.

In the meantime, can you summarize the ordering requirements of your
code?

> > Of course, if this code is architecture specific,
> > it can avoid DEC Alpha's fun and games, which could also violate your
> > assumptions in the above paragraph:
> >
> >    http://www.openvms.compaq.com/wizard/wiz_2637.html
> 
> Are you talking about this paragraph from above link:
> 
> "For instance, your producer must issue a "memory barrier" instruction
>   after writing the data to shared memory and before inserting it on
>   the queue; likewise, your consumer must issue a memory barrier
>   instruction after removing an item from the queue and before reading
>   from its memory.  Otherwise, you risk seeing stale data, since, while
>   the Alpha processor does provide coherent memory, it does not provide
>   implicit ordering of reads and writes.  (That is, the write of the
>   producer's data might reach memory after the write of the queue, such
>   that the consumer might read the new item from the queue but get the
>   previous values from the item's memory."
> 
> If yes, I don't think it explains the need of memory barrier on Alpha
> in our case (we all agree about the need of smp_wmb() right before @head
> update by producer). If not, could you please point to specific paragraph?

Did you miss the following passage in the paragraph you quoted?

	"... likewise, your consumer must issue a memory barrier
	instruction after removing an item from the queue and before
	reading from its memory."

That is why DEC Alpha readers need a read-side memory barrier -- it says
so right there.  And as either you or Peter noted earlier in this thread,
this barrier can be supplied by smp_read_barrier_depends().

I can sympathize if you are having trouble believing this.  After all,
it took the DEC Alpha architects a full hour to convince me, and that was
in a face-to-face meeting instead of over email.  (Just for the record,
it took me even longer to convince them that their earlier documentation
did not clearly indicate the need for these read-side barriers.)  But
regardless of whether or not I sympathize, DEC Alpha is what it is.

> > Anyway, proving or fixing the code in Documentation/circular-buffers.txt
> > has been on my list for too long, so I will take a closer look at it.
> 
> Thanks!
> 
> I'm concerned more about performance overhead imposed by the full memory
> barrier in kfifo circular buffers. Even if it is needed on Alpha (I don't
> understand why) we could try to solve this with some memory barrier which
> is effective only on architectures which really need it.

By exactly how much does the memory barrier slow your code down on some
example system?  (Yes, I can believe that it is a problem, but is it
really a problem in your exact situation?)

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-10-31  6:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131030112526.GI16117@laptop.programming.kicks-ass.net>

On Wed, Oct 30, 2013 at 12:25:26PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 30, 2013 at 02:27:25AM -0700, Paul E. McKenney wrote:
> > On Mon, Oct 28, 2013 at 10:58:58PM +0200, Victor Kaplansky wrote:
> > > Oleg Nesterov <oleg@redhat.com> wrote on 10/28/2013 10:17:35 PM:
> > > 
> > > >       mb();   // XXXXXXXX: do we really need it? I think yes.
> > > 
> > > Oh, it is hard to argue with feelings. Also, it is easy to be on
> > > conservative side and put the barrier here just in case.
> > > But I still insist that the barrier is redundant in your example.
> > 
> > If you were to back up that insistence with a description of the orderings
> > you are relying on, why other orderings are not important, and how the
> > important orderings are enforced, I might be tempted to pay attention
> > to your opinion.
> 
> OK, so let me try.. a slightly less convoluted version of the code in
> kernel/events/ring_buffer.c coupled with a userspace consumer would look
> something like the below.
> 
> One important detail is that the kbuf part and the kbuf_writer() are
> strictly per cpu and we can thus rely on implicit ordering for those.
> 
> Only the userspace consumer can possibly run on another cpu, and thus we
> need to ensure data consistency for those. 
> 
> struct buffer {
> 	u64 size;
> 	u64 tail;
> 	u64 head;
> 	void *data;
> };
> 
> struct buffer *kbuf, *ubuf;
> 
> /*
>  * Determine there's space in the buffer to store data at @offset to
>  * @head without overwriting data at @tail.
>  */
> bool space(u64 tail, u64 offset, u64 head)
> {
> 	offset = (offset - tail) % kbuf->size;
> 	head   = (head   - tail) % kbuf->size;
> 
> 	return (s64)(head - offset) >= 0;
> }
> 
> /*
>  * If there's space in the buffer; store the data @buf; otherwise
>  * discard it.
>  */
> void kbuf_write(int sz, void *buf)
> {
> 	u64 tail = ACCESS_ONCE(ubuf->tail); /* last location userspace read */
> 	u64 offset = kbuf->head; /* we already know where we last wrote */
> 	u64 head = offset + sz;
> 
> 	if (!space(tail, offset, head)) {
> 		/* discard @buf */
> 		return;
> 	}
> 
> 	/*
> 	 * Ensure that if we see the userspace tail (ubuf->tail) such
> 	 * that there is space to write @buf without overwriting data
> 	 * userspace hasn't seen yet, we won't in fact store data before
> 	 * that read completes.
> 	 */
> 
> 	smp_mb(); /* A, matches with D */
> 
> 	write(kbuf->data + offset, buf, sz);
> 	kbuf->head = head % kbuf->size;
> 
> 	/*
> 	 * Ensure that we write all the @buf data before we update the
> 	 * userspace visible ubuf->head pointer.
> 	 */
> 	smp_wmb(); /* B, matches with C */
> 
> 	ubuf->head = kbuf->head;
> }
> 
> /*
>  * Consume the buffer data and update the tail pointer to indicate to
>  * kernel space there's 'free' space.
>  */
> void ubuf_read(void)
> {
> 	u64 head, tail;
> 
> 	tail = ACCESS_ONCE(ubuf->tail);
> 	head = ACCESS_ONCE(ubuf->head);
> 
> 	/*
> 	 * Ensure we read the buffer boundaries before the actual buffer
> 	 * data...
> 	 */
> 	smp_rmb(); /* C, matches with B */
> 
> 	while (tail != head) {
> 		obj = ubuf->data + tail;
> 		/* process obj */
> 		tail += obj->size;
> 		tail %= ubuf->size;
> 	}
> 
> 	/*
> 	 * Ensure all data reads are complete before we issue the
> 	 * ubuf->tail update; once that update hits, kbuf_write() can
> 	 * observe and overwrite data.
> 	 */
> 	smp_mb(); /* D, matches with A */
> 
> 	ubuf->tail = tail;
> }
> 
> 
> Now the whole crux of the question is if we need barrier A at all, since
> the STORES issued by the @buf writes are dependent on the ubuf->tail
> read.

The dependency you are talking about is via the "if" statement?
Even C/C++11 is not required to respect control dependencies.

This one is a bit annoying.  The x86 TSO means that you really only
need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
barrier, and so on -- but smp_mb() emits a full barrier.

Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
before reads, writes before writes, and reads before writes, but not
writes before reads?  Another approach would be to define a per-arch
barrier for this particular case.

> If the read shows no available space, we simply will not issue those
> writes -- therefore we could argue we can avoid the memory barrier.

Proving that means iterating through the permitted combinations of
compilers and architectures...  There is always hand-coded assembly
language, I suppose.

> However, that leaves D unpaired and me confused. We must have D because
> otherwise the CPU could reorder that write into the reads previous and
> the kernel could start overwriting data we're still reading.. which
> seems like a bad deal.

Yep.  If you were hand-coding only for x86 and s390, D would pair with
the required barrier() asm.

> Also, I'm not entirely sure on C, that too seems like a dependency, we
> simply cannot read the buffer @tail before we've read the tail itself,
> now can we? Similarly we cannot compare tail to head without having the
> head read completed.
> 
> Could we replace A and C with an smp_read_barrier_depends()?

C, yes, given that you have ACCESS_ONCE() on the fetch from ->tail
and that the value fetch from ->tail feeds into the address used for
the "obj =" assignment.  A, not so much -- again, compilers are not
required to respect control dependencies.

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-10-31  6:16 UTC (permalink / raw)
  To: Victor Kaplansky
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <OFF6C78E48.E477DB1E-ON42257C14.0050F9B3-42257C14.0051AC6D@il.ibm.com>

On Wed, Oct 30, 2013 at 04:52:05PM +0200, Victor Kaplansky wrote:
> Peter Zijlstra <peterz@infradead.org> wrote on 10/30/2013 01:25:26 PM:
> 
> > Also, I'm not entirely sure on C, that too seems like a dependency, we
> > simply cannot read the buffer @tail before we've read the tail itself,
> > now can we? Similarly we cannot compare tail to head without having the
> > head read completed.
> 
> No, this one we cannot omit, because our problem on consumer side is not
> with @tail, which is written exclusively by consumer, but with @head.
> 
> BTW, it is why you also don't need ACCESS_ONCE() around @tail, but only
> around
> @head read.

If you omit the ACCESS_ONCE() calls around @tail, the compiler is within
its rights to combine adjacent operations and also to invent loads and
stores, for example, in cases of register pressure.  It is also within
its rights to do piece-at-a-time loads and stores, which might sound
unlikely, but which can actually has happened when the compiler figures
out exactly what is to be stored at compile time, especially on hardware
that only allows small immediate values.

So the ACCESS_ONCE() calls are not optional, the current contents of
Documentation/circular-buffers.txt notwithstanding.

							Thanx, Paul

^ permalink raw reply

* Re: Gianfar driver crashes in Kernel v3.10
From: Claudiu Manoil @ 2013-11-01 11:51 UTC (permalink / raw)
  To: Thomas Hühn; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <05E1EDFE-EDFB-4E6F-AA83-D353B4032CA5@net.t-labs.tu-berlin.de>

Hi Thomas,

On 10/31/2013 1:51 PM, Thomas H=FChn wrote:
> Hi Claudiu,
>
>
>> Please try the following patch:
>> http://patchwork.ozlabs.org/patch/283235/
>>
>> It should help with your issue.
>>
>
> Several OpenWrt users including myself have tested your patch on
> TPLink-4900 routers.
> We do have positive feedback, as no crash nor system freeze was reporte=
d
> for different
> network loads and router setups.
> All different scenarios / details and two digit uptimes are in this
> forum thread:
> https://forum.openwrt.org/viewtopic.php?id=3D42062&p=3D13
>
> Thanks again for your work and  I hope to see this patch merged upstrea=
m.
>
> Greetings Thomas
>

Thanks for the testing and feedback.

The patch has been merged into davem/net-next.git:

http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=
=3D3ba405db1c1b05d157474c71e559393f7ea436ad

And I think it will be merged from there into the next
kernel release. I think that in time it will be back-ported
to stable kernel versions too. In some cases, requests are made
to the netdev mailing list to speedup inclusion of fixes (such
as this one) to certain stable kernel versions.

Thanks,
Claudiu

^ permalink raw reply

* Re: [PATCHv2 6/8] ASoC: fsl: add SGTL5000 based audio machine driver.
From: Shawn Guo @ 2013-11-01 12:07 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur, perex,
	r65073, LW, linux, Xiubo Li, linux-arm-kernel, grant.likely,
	devicetree, ian.campbell, pawel.moll, swarren, rob.herring,
	broonie, oskar, fabio.estevam, lgirdwood, linux-kernel, rob,
	r64188, linuxppc-dev
In-Reply-To: <20131101102805.GG28401@MrMyself>

On Fri, Nov 01, 2013 at 06:28:05PM +0800, Nicolin Chen wrote:
> >  sound/soc/fsl/fsl-sgtl5000-vf610.c | 208 +++++++++++++++++++++++++++++++++++++
> 
> I just doubt if this file naming is appropriate. Even if we might not have
> rigor rule for the file names, according to existing ones, they are all in
> a same pattern: [SoC name]-[codec name].c
> 
> "imx-sgtl5000.c" for example
> 
> I think it would make user less confused about what this file exactly is if
> this machine driver also follow the pattern: vf610-sgtl5000.c
> 
> 
> @Shawn
> 
> What do you think about the file name?

Yeah, it would be better to name the file following the existing the
pattern.

Shawn

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Victor Kaplansky @ 2013-11-01 13:12 UTC (permalink / raw)
  To: paulmck
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <20131031061602.GU4126@linux.vnet.ibm.com>

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
08:16:02 AM:

> > BTW, it is why you also don't need ACCESS_ONCE() around @tail, but only
> > around
> > @head read.

Just to be sure, that we are talking about the same code - I was
considering
ACCESS_ONCE() around @tail in point AAA in the following example from
Documentation/circular-buffers.txt for CONSUMER:

        unsigned long head = ACCESS_ONCE(buffer->head);
        unsigned long tail = buffer->tail;      /* AAA */

        if (CIRC_CNT(head, tail, buffer->size) >= 1) {
                /* read index before reading contents at that index */
                smp_read_barrier_depends();

                /* extract one item from the buffer */
                struct item *item = buffer[tail];

                consume_item(item);

                smp_mb(); /* finish reading descriptor before incrementing
tail */

                buffer->tail = (tail + 1) & (buffer->size - 1); /* BBB */
        }

>
> If you omit the ACCESS_ONCE() calls around @tail, the compiler is within
> its rights to combine adjacent operations and also to invent loads and
> stores, for example, in cases of register pressure.

Right. And I was completely aware about these possible transformations when
said that ACCESS_ONCE() around @tail in point AAA is redundant. Moved, or
even
completely dismissed reads of @tail in consumer code, are not a problem at
all,
since @tail is written exclusively by CONSUMER side.


> It is also within
> its rights to do piece-at-a-time loads and stores, which might sound
> unlikely, but which can actually has happened when the compiler figures
> out exactly what is to be stored at compile time, especially on hardware
> that only allows small immediate values.

As for writes to @tail, the ACCESS_ONCE around @tail at point AAA,
doesn't prevent in any way an imaginary super-optimizing compiler
from moving around the store to @tail (which appears in the code at point
BBB).

It is why ACCESS_ONCE at point AAA is completely redundant.

-- Victor

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Victor Kaplansky @ 2013-11-01 14:25 UTC (permalink / raw)
  To: paulmck
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <20131031064015.GV4126@linux.vnet.ibm.com>

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
08:40:15 AM:

> > void ubuf_read(void)
> > {
> >    u64 head, tail;
> >
> >    tail = ACCESS_ONCE(ubuf->tail);
> >    head = ACCESS_ONCE(ubuf->head);
> >
> >    /*
> >     * Ensure we read the buffer boundaries before the actual buffer
> >     * data...
> >     */
> >    smp_rmb(); /* C, matches with B */
> >
> >    while (tail != head) {
> >       obj = ubuf->data + tail;
> >       /* process obj */
> >       tail += obj->size;
> >       tail %= ubuf->size;
> >    }
> >
> >    /*
> >     * Ensure all data reads are complete before we issue the
> >     * ubuf->tail update; once that update hits, kbuf_write() can
> >     * observe and overwrite data.
> >     */
> >    smp_mb(); /* D, matches with A */
> >
> >    ubuf->tail = tail;
> > }

> > Could we replace A and C with an smp_read_barrier_depends()?
>
> C, yes, given that you have ACCESS_ONCE() on the fetch from ->tail
> and that the value fetch from ->tail feeds into the address used for
> the "obj =" assignment.

No! You must to have a full smp_rmb() at C. The race on the reader side
is not between fetch of @tail and read from address pointed by @tail.
The real race here is between a fetch of @head and read of obj from
memory pointed by @tail.

Regards,
-- Victor

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Peter Zijlstra @ 2013-11-01 14:56 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131031064015.GV4126@linux.vnet.ibm.com>

On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > Now the whole crux of the question is if we need barrier A at all, since
> > the STORES issued by the @buf writes are dependent on the ubuf->tail
> > read.
> 
> The dependency you are talking about is via the "if" statement?
> Even C/C++11 is not required to respect control dependencies.
> 
> This one is a bit annoying.  The x86 TSO means that you really only
> need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
> barrier, and so on -- but smp_mb() emits a full barrier.
> 
> Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
> before reads, writes before writes, and reads before writes, but not
> writes before reads?  Another approach would be to define a per-arch
> barrier for this particular case.

I suppose we can only introduce new barrier primitives if there's more
than 1 use-case.

> > If the read shows no available space, we simply will not issue those
> > writes -- therefore we could argue we can avoid the memory barrier.
> 
> Proving that means iterating through the permitted combinations of
> compilers and architectures...  There is always hand-coded assembly
> language, I suppose.

I'm starting to think that while the C/C++ language spec says they can
wreck the world by doing these silly optimization, real world users will
push back for breaking their existing code.

I'm fairly sure the GCC people _will_ get shouted at _loudly_ when they
break the kernel by doing crazy shit like that.

Given its near impossible to write a correct program in C/C++ and
tagging the entire kernel with __atomic is equally not going to happen,
I think we must find a practical solution.

Either that, or we really need to consider forking the language and
compiler :-(

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Victor Kaplansky @ 2013-11-01 16:06 UTC (permalink / raw)
  To: paulmck
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <20131031152543.GC4067@linux.vnet.ibm.com>

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
05:25:43 PM:

> I really don't care about "fair" -- I care instead about the kernel
> working reliably.

Though I don't see how putting a memory barrier without deep understanding
why it is needed helps kernel reliability, I do agree that reliability
is more important than performance.

> And it should also be easy for proponents of removing memory barriers to
> clearly articulate what orderings their code does and does not need.

I intentionally took a simplified example of circle buffer from
Documentation/circular-buffers.txt. I think both sides agree about
memory ordering requirements in the example. At least I didn't see anyone
argued about them.

> You are assuming control dependencies that the C language does not
> provide.  Now, for all I know right now, there might well be some other
> reason why a full barrier is not required, but the "if" statement cannot
> be that reason.
>
> Please review section 1.10 of the C++11 standard (or the corresponding
> section of the C11 standard, if you prefer).  The point is that the
> C/C++11 covers only data dependencies, not control dependencies.

I feel you made a wrong assumption about my expertise in compilers. I don't
need to reread section 1.10 of the C++11 standard, because I do agree that
potentially compiler can break the code in our case. And I do agree that
a compiler barrier() or some other means (including a change of the
standard)
can be required in future to prevent a compiler from moving memory accesses
around.

But "broken" compiler is much wider issue to be deeply discussed in this
thread. I'm pretty sure that kernel have tons of very subtle
code that actually creates locks and memory ordering. Such code
usually just use the "barrier()"  approach to tell gcc not to combine
or move memory accesses around it.

Let's just agree for the sake of this memory barrier discussion that we
*do* need compiler barrier to tell gcc not to combine or move memory
accesses around it.

> Glad we agree on something!

I'm glad too!

> Did you miss the following passage in the paragraph you quoted?
>
>    "... likewise, your consumer must issue a memory barrier
>    instruction after removing an item from the queue and before
>    reading from its memory."
>
> That is why DEC Alpha readers need a read-side memory barrier -- it says
> so right there.  And as either you or Peter noted earlier in this thread,
> this barrier can be supplied by smp_read_barrier_depends().

I did not miss that passage. That passage explains why consumer on Alpha
processor after reading @head is required to execute an additional
smp_read_barrier_depends() before it can *read* from memory pointed by
@tail. And I think that I understand why - because the reader have to wait
till local caches are fully updated and only then it can read data from
the data buffer.

But on the producer side, after we read @tail, we don't need to wait for
update of local caches before we start *write* data to the buffer, since
the
producer is the only one who write data there!

>
> I can sympathize if you are having trouble believing this.  After all,
> it took the DEC Alpha architects a full hour to convince me, and that was
> in a face-to-face meeting instead of over email.  (Just for the record,
> it took me even longer to convince them that their earlier documentation
> did not clearly indicate the need for these read-side barriers.)  But
> regardless of whether or not I sympathize, DEC Alpha is what it is.

Again, I do understand quirkiness of the DEC Alpha, and I still think that
there is no need in *full* memory barrier on producer side - the one
before writing data to the buffer and which you've put in kfifo
implementation.

Regard,
-- Victor

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Peter Zijlstra @ 2013-11-01 16:11 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131031064015.GV4126@linux.vnet.ibm.com>

On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > void kbuf_write(int sz, void *buf)
> > {
> > 	u64 tail = ACCESS_ONCE(ubuf->tail); /* last location userspace read */
> > 	u64 offset = kbuf->head; /* we already know where we last wrote */
> > 	u64 head = offset + sz;
> > 
> > 	if (!space(tail, offset, head)) {
> > 		/* discard @buf */
> > 		return;
> > 	}
> > 
> > 	/*
> > 	 * Ensure that if we see the userspace tail (ubuf->tail) such
> > 	 * that there is space to write @buf without overwriting data
> > 	 * userspace hasn't seen yet, we won't in fact store data before
> > 	 * that read completes.
> > 	 */
> > 
> > 	smp_mb(); /* A, matches with D */
> > 
> > 	write(kbuf->data + offset, buf, sz);
> > 	kbuf->head = head % kbuf->size;
> > 
> > 	/*
> > 	 * Ensure that we write all the @buf data before we update the
> > 	 * userspace visible ubuf->head pointer.
> > 	 */
> > 	smp_wmb(); /* B, matches with C */
> > 
> > 	ubuf->head = kbuf->head;
> > }

> > Now the whole crux of the question is if we need barrier A at all, since
> > the STORES issued by the @buf writes are dependent on the ubuf->tail
> > read.
> 
> The dependency you are talking about is via the "if" statement?
> Even C/C++11 is not required to respect control dependencies.

But surely we must be able to make it so; otherwise you'd never be able
to write:

void *ptr = obj1;

void foo(void)
{

	/* create obj2, obj3 */

	smp_wmb(); /* ensure the objs are complete */

	/* expose either obj2 or obj3 */
	if (x)
		ptr = obj2;
	else
		ptr = obj3;


	/* free the unused one */
	if (x)
		free(obj3);
	else
		free(obj2);
}

Earlier you said that 'volatile' or '__atomic' avoids speculative
writes; so would:

volatile void *ptr = obj1;

Make the compiler respect control dependencies again? If so, could we
somehow mark that !space() condition volatile?

Currently the above would be considered a valid pattern. But you're
saying its not because the compiler is free to expose both obj2 and obj3
(for however short a time) and thus the free of the 'unused' object is
incorrect and can cause use-after-free.

In fact; how can we be sure that:

void *ptr = NULL;

void bar(void)
{
	void *obj = malloc(...);

	/* fill obj */

	if (!err)
		rcu_assign_pointer(ptr, obj);
	else
		free(obj);
}

Does not get 'optimized' into:

void bar(void)
{
	void *obj = malloc(...);
	void *old_ptr = ptr;

	/* fill obj */

	rcu_assign_pointer(ptr, obj);
	if (err) { /* because runtime profile data says this is unlikely */
		ptr = old_ptr;
		free(obj);
	}
}

We _MUST_ be able to rely on control flow, otherwise me might as well
all go back to writing kernels in asm.

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Peter Zijlstra @ 2013-11-01 16:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131031064015.GV4126@linux.vnet.ibm.com>

On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> The dependency you are talking about is via the "if" statement?
> Even C/C++11 is not required to respect control dependencies.
> 
> This one is a bit annoying.  The x86 TSO means that you really only
> need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
> barrier, and so on -- but smp_mb() emits a full barrier.
> 
> Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
> before reads, writes before writes, and reads before writes, but not
> writes before reads?  Another approach would be to define a per-arch
> barrier for this particular case.

Supposing a sane language where we can rely on control flow; would that
change the story?

I'm afraid I'm now terminally confused between actual proper memory
model issues and fucked compilers.

^ permalink raw reply

* RE: perf events ring buffer memory barrier on powerpc
From: David Laight @ 2013-11-01 16:25 UTC (permalink / raw)
  To: Victor Kaplansky, paulmck
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra,
	Frederic Weisbecker, LKML, Oleg Nesterov, Linux PPC dev,
	Anton Blanchard
In-Reply-To: <OF3CB471BA.BACC7843-ON42257C16.0055EF23-42257C16.0058875B@il.ibm.com>

> But "broken" compiler is much wider issue to be deeply discussed in =
this
> thread. I'm pretty sure that kernel have tons of very subtle
> code that actually creates locks and memory ordering. Such code
> usually just use the "barrier()"  approach to tell gcc not to combine
> or move memory accesses around it.

gcc will do unexpected memory accesses for bit fields that are
adjacent to volatile data.
In particular it may generate 64bit sized (and aligned) RMW cycles
when accessing bit fields.
And yes, this has caused real problems.

	David

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox