Linux userland API discussions
* Re: [PATCH 05/14] ARM: call reset_controller_of_init from default time_init handler
From: Rob Herring @ 2015-02-15 22:17 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Jonathan Corbet, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <1423763164-5606-6-git-send-email-mcoquelin.stm32@gmail.com>

On Thu, Feb 12, 2015 at 11:45 AM, Maxime Coquelin
<mcoquelin.stm32@gmail.com> wrote:
> Some DT ARM platforms need the reset controllers to be initialized before
> the timers.
> This is the case of the stm32 and sunxi platforms.

I would say this is the exception, not the rule, and therefore should
be handled in a machine desc function. Or it could be part of your
timer setup. Or it is the bootloader's problem (like arch timer setup).

We just want to limit how much this mechanism gets used.
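
A minimal user-space sketch of the "machine desc function" option, assuming
it means a per-machine .init_time hook that does the reset setup itself.
Every name below is a hypothetical stand-in that only mimics the if/else
shape of time_init(); none of it is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the init steps; they only record call order. */
enum init_step { STEP_RESET = 1, STEP_CLK, STEP_TIMER };

static enum init_step call_log[8];
static size_t call_count;

static void log_step(enum init_step step)
{
	assert(call_count < sizeof(call_log) / sizeof(call_log[0]));
	call_log[call_count++] = step;
}

static void mock_reset_controller_init(void) { log_step(STEP_RESET); }
static void mock_clk_init(void)              { log_step(STEP_CLK); }
static void mock_timer_init(void)            { log_step(STEP_TIMER); }

struct mock_machine_desc {
	void (*init_time)(void);	/* optional per-machine override */
};

/* Machine-specific hook: only a platform that needs resets first runs this. */
static void stm32_like_init_time(void)
{
	mock_reset_controller_init();
	mock_clk_init();
	mock_timer_init();
}

/* Same shape as the if/else in time_init(): the default path is unchanged. */
static void mock_time_init(const struct mock_machine_desc *desc)
{
	call_count = 0;
	if (desc->init_time) {
		desc->init_time();
	} else {
		mock_clk_init();
		mock_timer_init();
	}
}
```

With this shape, a machine that leaves init_time NULL never touches the
reset code, which is the property the default handler would lose under the
patch as posted.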

Rob

>
> This patch adds a call to reset_controller_of_init() to the default
> .init_time callback when RESET_CONTROLLER is used by the platform.
>
> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
> ---
>  arch/arm/kernel/time.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
> index 0cc7e58..4601b1e 100644
> --- a/arch/arm/kernel/time.c
> +++ b/arch/arm/kernel/time.c
> @@ -20,6 +20,7 @@
>  #include <linux/irq.h>
>  #include <linux/kernel.h>
>  #include <linux/profile.h>
> +#include <linux/reset-controller.h>
>  #include <linux/sched.h>
>  #include <linux/sched_clock.h>
>  #include <linux/smp.h>
> @@ -117,6 +118,9 @@ void __init time_init(void)
>         if (machine_desc->init_time) {
>                 machine_desc->init_time();
>         } else {
> +#ifdef CONFIG_RESET_CONTROLLER
> +               reset_controller_of_init();
> +#endif
>  #ifdef CONFIG_COMMON_CLK
>                 of_clk_init(NULL);
>  #endif
> --
> 1.9.1
>

* Re: [PATCH 03/14] clocksource: Add ARM System timer driver
From: Rob Herring @ 2015-02-15 22:31 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Jonathan Corbet, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <1423763164-5606-4-git-send-email-mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On Thu, Feb 12, 2015 at 11:45 AM, Maxime Coquelin
<mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> This patch adds clocksource support for ARMv7-M's System timer,
> also known as SysTick.
>
> Signed-off-by: Maxime Coquelin <mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
>  .../devicetree/bindings/arm/system_timer.txt       | 15 +++++

Please include v7M in the name. System timer sounds very generic. This
is the only timer architecturally defined IIRC, so perhaps just
"armv7m_systick".

>  drivers/clocksource/Kconfig                        |  7 ++
>  drivers/clocksource/Makefile                       |  1 +
>  drivers/clocksource/arm_system_timer.c             | 74 ++++++++++++++++++++++

Same here.


>  4 files changed, 97 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/system_timer.txt
>  create mode 100644 drivers/clocksource/arm_system_timer.c
>
> diff --git a/Documentation/devicetree/bindings/arm/system_timer.txt b/Documentation/devicetree/bindings/arm/system_timer.txt
> new file mode 100644
> index 0000000..35268b7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/system_timer.txt
> @@ -0,0 +1,15 @@
> +* ARM System Timer
> +
> +ARMv7-M includes a system timer, known as SysTick. Current driver only
> +implements the clocksource feature.
> +
> +Required properties:
> +- compatible : Should be "arm,armv7m-systick"
> +- reg       : The address range of the timer
> +- clocks     : The input clock of the timer

You may want to consider supporting "clock-frequency" here too. On
simpler chips you may have only fixed clocks and may want to run a
kernel with COMMON_CLK disabled to save space.

> +
> +systick: system-timer {

This should be "systick: timer@e000e010".

Same for your dts file.

> +       compatible = "arm,armv7m-systick";
> +       reg = <0xe000e010 0x10>;
> +       clocks = <&clk_systick>;
> +};
> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> index fc01ec2..f9fe4ac 100644
> --- a/drivers/clocksource/Kconfig
> +++ b/drivers/clocksource/Kconfig
> @@ -124,6 +124,13 @@ config CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK
>         help
>          Use ARM global timer clock source as sched_clock
>
> +config ARM_SYSTEM_TIMER
> +       bool
> +       select CLKSRC_OF if OF
> +       select CLKSRC_MMIO
> +       help
> +         This options enables support for the ARM system timer unit
> +
>  config ATMEL_PIT
>         select CLKSRC_OF if OF
>         def_bool SOC_AT91SAM9 || SOC_SAMA5
> diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
> index 94d90b2..194400b 100644
> --- a/drivers/clocksource/Makefile
> +++ b/drivers/clocksource/Makefile
> @@ -42,6 +42,7 @@ obj-$(CONFIG_MTK_TIMER)               += mtk_timer.o
>
>  obj-$(CONFIG_ARM_ARCH_TIMER)           += arm_arch_timer.o
>  obj-$(CONFIG_ARM_GLOBAL_TIMER)         += arm_global_timer.o
> +obj-$(CONFIG_ARM_SYSTEM_TIMER)         += arm_system_timer.o
>  obj-$(CONFIG_CLKSRC_METAG_GENERIC)     += metag_generic.o
>  obj-$(CONFIG_ARCH_HAS_TICK_BROADCAST)  += dummy_timer.o
>  obj-$(CONFIG_ARCH_KEYSTONE)            += timer-keystone.o
> diff --git a/drivers/clocksource/arm_system_timer.c b/drivers/clocksource/arm_system_timer.c
> new file mode 100644
> index 0000000..69e6ef9
> --- /dev/null
> +++ b/drivers/clocksource/arm_system_timer.c
> @@ -0,0 +1,74 @@
> +/*
> + * Copyright (C) Maxime Coquelin 2015
> + * Author:  Maxime Coquelin <mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> + * License terms:  GNU General Public License (GPL), version 2
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/clocksource.h>
> +#include <linux/clockchips.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/clk.h>
> +#include <linux/bitops.h>
> +
> +#define SYST_CSR       0x00
> +#define SYST_RVR       0x04
> +#define SYST_CVR       0x08
> +#define SYST_CALIB     0x0c
> +
> +#define SYST_CSR_ENABLE BIT(0)
> +
> +#define SYSTICK_LOAD_RELOAD_MASK 0x00FFFFFF
> +
> +static void __init system_timer_of_register(struct device_node *np)
> +{
> +       struct clk *clk;
> +       void __iomem *base;
> +       unsigned long rate;
> +       int ret;
> +
> +       base = of_iomap(np, 0);
> +       if (!base) {
> +               pr_warn("system-timer: invalid base address\n");
> +               return;
> +       }
> +
> +       clk = of_clk_get(np, 0);
> +       if (IS_ERR(clk)) {
> +               pr_warn("system-timer: clk not found\n");
> +               ret = PTR_ERR(clk);
> +               goto out_unmap;
> +       }
> +
> +       ret = clk_prepare_enable(clk);
> +       if (ret)
> +               goto out_clk_put;
> +
> +       rate = clk_get_rate(clk);
> +
> +       writel_relaxed(SYSTICK_LOAD_RELOAD_MASK, base + SYST_RVR);
> +       writel_relaxed(SYST_CSR_ENABLE, base + SYST_CSR);
> +
> +       ret = clocksource_mmio_init(base + SYST_CVR, "arm_system_timer", rate,
> +                       200, 24, clocksource_mmio_readl_down);
> +       if (ret) {
> +               pr_err("failed to init clocksource (%d)\n", ret);
> +               goto out_clk_disable;
> +       }
> +
> +       pr_info("ARM System timer initialized as clocksource\n");
> +
> +       return;
> +
> +out_clk_disable:
> +       clk_disable_unprepare(clk);
> +out_clk_put:
> +       clk_put(clk);
> +out_unmap:
> +       iounmap(base);
> +       WARN(ret, "ARM System timer register failed (%d)\n", ret);
> +}
> +
> +CLOCKSOURCE_OF_DECLARE(arm_systick, "arm,armv7m-systick",
> +                       system_timer_of_register);
> --
> 1.9.1
>

* Re: [PATCH 02/14] ARM: ARMv7M: Enlarge vector table to 256 entries
From: Rob Herring @ 2015-02-15 22:42 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Geert Uytterhoeven, Jonathan Corbet, Rob Herring, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala, Philipp Zabel,
	Russell King, Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov
In-Reply-To: <CALszF6BDa9pUb534YN2z9DbYA+hPCnG8XYy5YbjJwSiseKz4xg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, Feb 13, 2015 at 2:42 AM, Maxime Coquelin
<mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hi Geert,
>
> 2015-02-12 21:34 GMT+01:00 Geert Uytterhoeven <geert-Td1EMuHUCqxL1ZNQvxDV9g@public.gmane.org>:
>> On Thu, Feb 12, 2015 at 6:45 PM, Maxime Coquelin
>> <mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> From Cortex-M4 and M7 reference manuals, the nvic supports up to 240
>>> interrupts. So the number of entries in vectors table is 256.
>>>
>>> This patch adds the missing entries, and changes the alignment, so that
>>> vector_table remains naturally aligned.
>>
>> Shouldn't this depend on ARCH_STM32, or some other M4 or M7 specific
>> Kconfig option, to avoid wasting the space on other CPUs?
>
> Actually, the STM32F429 has 90 interrupts, so it would need 106
> entries in the vector table.
> The maximum of supported interrupts is not only for Cortex-M4 and M7,
> this is also true for Cortex-M3.
>
> I see two possibilities:
>  1 - We declare the vector table for the maximum supported number of
> IRQs, as this patch does.
>         - Pro: it will be functional with all Cortex-M MCUs
>         - Con: Waste of less than 1KB of memory

The waste depends on the alignment size as well and could be up to
almost 2KB in the worst case, varying with the padding. We should
try to place it so that it is always aligned and the wasted space is
minimized.
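
The arithmetic behind those numbers can be sketched as follows. This is a
rough model under stated assumptions: 16 architectural entries ahead of the
external interrupts, 4 bytes per entry, a table aligned to the next power of
two of its size with an assumed minimum of 128 bytes, and a preceding section
that ends on a word boundary. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define V7M_SYSTEM_ENTRIES 16u	/* initial SP plus system exception slots */
#define V7M_ENTRY_BYTES     4u

/* Size of a vector table covering nr_irqs external interrupts. */
static uint32_t vector_table_bytes(uint32_t nr_irqs)
{
	return (V7M_SYSTEM_ENTRIES + nr_irqs) * V7M_ENTRY_BYTES;
}

/* Natural alignment: smallest power of two >= table size (assumed min 128). */
static uint32_t natural_alignment(uint32_t nr_irqs)
{
	uint32_t size = vector_table_bytes(nr_irqs);
	uint32_t align = 128;

	assert(nr_irqs <= 240);
	while (align < size)
		align <<= 1;
	return align;
}

/*
 * Worst-case padding in front of the table, assuming whatever precedes it
 * ends one word past an aligned boundary.
 */
static uint32_t worst_case_padding(uint32_t nr_irqs)
{
	return natural_alignment(nr_irqs) - V7M_ENTRY_BYTES;
}
```

For 240 interrupts this gives a 1024-byte table with up to 1020 bytes of
padding, so a footprint just under 2KB; for the 90 interrupts mentioned for
the STM32F429 it gives 424 bytes aligned to 512.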

Rob

>  2 - We introduce a config flag that provides the number of interrupts
>         - Pro: No more memory waste
>         - Con: Need to declare a per MCU model config flag.
>
> Then, regarding the natural alignment, is there a way to ensure it
> depending on the value of a config flag?
> Or we should keep it at the maximum value possible?
>
> Any feedback will be appreciated, especially from Uwe who maintains
> the efm32 machine.
>
> Kind regards,
> Maxime

* Re: [PATCH 05/14] ARM: call reset_controller_of_init from default time_init handler
From: Russell King - ARM Linux @ 2015-02-15 23:12 UTC (permalink / raw)
  To: Rob Herring
  Cc: Maxime Coquelin, Jonathan Corbet, Rob Herring, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala, Philipp Zabel,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell
In-Reply-To: <CAL_Jsq+Sk0C-1UHCKE18fEVwBbV=quV9mrJSKTO_UPXWcaYfCw@mail.gmail.com>

On Sun, Feb 15, 2015 at 04:17:31PM -0600, Rob Herring wrote:
> On Thu, Feb 12, 2015 at 11:45 AM, Maxime Coquelin
> <mcoquelin.stm32@gmail.com> wrote:
> > Some DT ARM platforms need the reset controllers to be initialized before
> > the timers.
> > This is the case of the stm32 and sunxi platforms.
> 
> I would say this is the exception, not the rule, and therefore should
> be handled in a machine desc function. Or it could be part of your
> timer setup. Or it is the bootloader's problem (like arch timer setup).
> 
> We just want to limit how much this mechanism gets used.

Can you clarify please - what is "this mechanism"?  Placing explicit
calls at this location, or the whole OF_DECLARE_* stuff?

Sebastian suggested using the OF_DECLARE_* stuff for the Dove PMU -
so maybe you have a comment on that too?

Thanks.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

* Re: [PATCH 03/14] clocksource: Add ARM System timer driver
From: Andreas Färber @ 2015-02-15 23:43 UTC (permalink / raw)
  To: Maxime Coquelin, Rob Herring
  Cc: Jonathan Corbet, Pawel Moll, Mark Rutland, Ian Campbell,
	Kumar Gala, Philipp Zabel, Russell King, Daniel Lezcano,
	Thomas Gleixner, Linus Walleij, Greg Kroah-Hartman, Jiri Slaby,
	Arnd Bergmann, Andrew Morton, David S. Miller,
	Mauro Carvalho Chehab, Joe Perches, Antti Palosaari, Tejun Heo,
	Will Deacon, Nikolay Borisov, Rusty Russell, Kees Cook, Micha
In-Reply-To: <1423763164-5606-4-git-send-email-mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On 12.02.2015 at 18:45, Maxime Coquelin wrote:
> This patch adds clocksource support for ARMv7-M's System timer,
> also known as SysTick.
> 
> Signed-off-by: Maxime Coquelin <mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
>  .../devicetree/bindings/arm/system_timer.txt       | 15 +++++
>  drivers/clocksource/Kconfig                        |  7 ++
>  drivers/clocksource/Makefile                       |  1 +
>  drivers/clocksource/arm_system_timer.c             | 74 ++++++++++++++++++++++
>  4 files changed, 97 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/system_timer.txt
>  create mode 100644 drivers/clocksource/arm_system_timer.c
> 
> diff --git a/Documentation/devicetree/bindings/arm/system_timer.txt b/Documentation/devicetree/bindings/arm/system_timer.txt
> new file mode 100644
> index 0000000..35268b7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/system_timer.txt
> @@ -0,0 +1,15 @@
> +* ARM System Timer
> +
> +ARMv7-M includes a system timer, known as SysTick. Current driver only
> +implements the clocksource feature.
> +
> +Required properties:
> +- compatible : Should be "arm,armv7m-systick"
> +- reg	     : The address range of the timer
> +- clocks     : The input clock of the timer
> +
> +systick: system-timer {
> +	compatible = "arm,armv7m-systick";
> +	reg = <0xe000e010 0x10>;
> +	clocks = <&clk_systick>;
> +};

Binding documentation is supposed to go into its own patch:
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Documentation/devicetree/bindings/submitting-patches.txt

> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> index fc01ec2..f9fe4ac 100644
> --- a/drivers/clocksource/Kconfig
> +++ b/drivers/clocksource/Kconfig
> @@ -124,6 +124,13 @@ config CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK
>  	help
>  	 Use ARM global timer clock source as sched_clock
>  
> +config ARM_SYSTEM_TIMER
> +	bool
> +	select CLKSRC_OF if OF
> +	select CLKSRC_MMIO
> +	help
> +	  This options enables support for the ARM system timer unit
> +
>  config ATMEL_PIT
>  	select CLKSRC_OF if OF
>  	def_bool SOC_AT91SAM9 || SOC_SAMA5
> diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
> index 94d90b2..194400b 100644
> --- a/drivers/clocksource/Makefile
> +++ b/drivers/clocksource/Makefile
> @@ -42,6 +42,7 @@ obj-$(CONFIG_MTK_TIMER)		+= mtk_timer.o
>  
>  obj-$(CONFIG_ARM_ARCH_TIMER)		+= arm_arch_timer.o
>  obj-$(CONFIG_ARM_GLOBAL_TIMER)		+= arm_global_timer.o
> +obj-$(CONFIG_ARM_SYSTEM_TIMER)		+= arm_system_timer.o
>  obj-$(CONFIG_CLKSRC_METAG_GENERIC)	+= metag_generic.o
>  obj-$(CONFIG_ARCH_HAS_TICK_BROADCAST)	+= dummy_timer.o
>  obj-$(CONFIG_ARCH_KEYSTONE)		+= timer-keystone.o
> diff --git a/drivers/clocksource/arm_system_timer.c b/drivers/clocksource/arm_system_timer.c
> new file mode 100644
> index 0000000..69e6ef9
> --- /dev/null
> +++ b/drivers/clocksource/arm_system_timer.c
> @@ -0,0 +1,74 @@
> +/*
> + * Copyright (C) Maxime Coquelin 2015
> + * Author:  Maxime Coquelin <mcoquelin.stm32-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> + * License terms:  GNU General Public License (GPL), version 2
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/clocksource.h>
> +#include <linux/clockchips.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/clk.h>
> +#include <linux/bitops.h>
> +
> +#define SYST_CSR	0x00
> +#define SYST_RVR	0x04
> +#define SYST_CVR	0x08
> +#define SYST_CALIB	0x0c
> +
> +#define SYST_CSR_ENABLE BIT(0)
> +
> +#define SYSTICK_LOAD_RELOAD_MASK 0x00FFFFFF
> +
> +static void __init system_timer_of_register(struct device_node *np)
> +{
> +	struct clk *clk;
> +	void __iomem *base;
> +	unsigned long rate;
> +	int ret;
> +
> +	base = of_iomap(np, 0);
> +	if (!base) {
> +		pr_warn("system-timer: invalid base address\n");
> +		return;
> +	}
> +
> +	clk = of_clk_get(np, 0);
> +	if (IS_ERR(clk)) {
> +		pr_warn("system-timer: clk not found\n");
> +		ret = PTR_ERR(clk);
> +		goto out_unmap;
> +	}
> +
> +	ret = clk_prepare_enable(clk);
> +	if (ret)
> +		goto out_clk_put;
> +
> +	rate = clk_get_rate(clk);
> +
> +	writel_relaxed(SYSTICK_LOAD_RELOAD_MASK, base + SYST_RVR);
> +	writel_relaxed(SYST_CSR_ENABLE, base + SYST_CSR);
> +
> +	ret = clocksource_mmio_init(base + SYST_CVR, "arm_system_timer", rate,
> +			200, 24, clocksource_mmio_readl_down);
> +	if (ret) {
> +		pr_err("failed to init clocksource (%d)\n", ret);
> +		goto out_clk_disable;
> +	}
> +
> +	pr_info("ARM System timer initialized as clocksource\n");
> +
> +	return;
> +
> +out_clk_disable:
> +	clk_disable_unprepare(clk);
> +out_clk_put:
> +	clk_put(clk);
> +out_unmap:
> +	iounmap(base);
> +	WARN(ret, "ARM System timer register failed (%d)\n", ret);
> +}
> +
> +CLOCKSOURCE_OF_DECLARE(arm_systick, "arm,armv7m-systick",
> +			system_timer_of_register);

I've used a SysTick based implementation on my stm32 branch myself, but
looking at efm32 I got the impression that it would be better to use one
of the 32-bit TIM2/TIM5 as clocksource and the other as clockevents?
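
To put a number on the 24-bit versus 32-bit question, here is a small
user-space sketch. It models how a down-counting register can be read as an
up-counting clocksource (invert, then mask, in the spirit of
clocksource_mmio_readl_down) and how long a counter runs before wrapping.
The helper names and the 168 MHz input clock are illustrative assumptions,
not values taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MASK 0x00FFFFFFu	/* SysTick counts 24 bits */

/* Convert a raw down-counter sample into an up-counting value. */
static uint32_t systick_up_count(uint32_t raw_cvr)
{
	return ~raw_cvr & SYSTICK_MASK;
}

/* Cycles elapsed between two samples, valid for less than one full wrap. */
static uint32_t systick_delta(uint32_t earlier_raw, uint32_t later_raw)
{
	return (systick_up_count(later_raw) - systick_up_count(earlier_raw)) &
	       SYSTICK_MASK;
}

/* Microseconds until a free-running counter of the given width wraps. */
static uint64_t wrap_interval_us(uint32_t bits, uint64_t rate_hz)
{
	assert(bits >= 1 && bits <= 32 && rate_hz > 0);
	return ((uint64_t)1 << bits) * 1000000u / rate_hz;
}
```

At the assumed 168 MHz, a 24-bit counter wraps roughly every 100 ms while a
32-bit one lasts about 25 s, which is one way to read the preference for the
32-bit timers.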

Still this implementation will be handy to have, also for other targets.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)

* Re: [PATCH 06/14] drivers: reset: Add STM32 reset driver
From: Andreas Färber @ 2015-02-15 23:59 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Jonathan Corbet, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <1423763164-5606-7-git-send-email-mcoquelin.stm32@gmail.com>

On 12.02.2015 at 18:45, Maxime Coquelin wrote:
> The STM32 MCUs family IP can be reset by accessing some shared registers.
> 
> The specificity is that some reset lines are used by the timers.
> At timer initialization time, the timer has to be reset, that's why
> we cannot use a regular driver.
> 
> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
> ---
>  .../devicetree/bindings/reset/st,stm32-reset.txt   |  19 ++++
>  drivers/reset/Makefile                             |   1 +
>  drivers/reset/reset-stm32.c                        | 124 +++++++++++++++++++++
>  3 files changed, 144 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/reset/st,stm32-reset.txt
>  create mode 100644 drivers/reset/reset-stm32.c
> 
> diff --git a/Documentation/devicetree/bindings/reset/st,stm32-reset.txt b/Documentation/devicetree/bindings/reset/st,stm32-reset.txt
> new file mode 100644
> index 0000000..add1298
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/reset/st,stm32-reset.txt
> @@ -0,0 +1,19 @@
> +STMicroelectronics STM32 Peripheral Reset Controller
> +====================================================
> +
> +Please also refer to reset.txt in this directory for common reset
> +controller binding usage.
> +
> +Required properties:
> +- compatible: Should be "st,stm32-reset"
> +- reg: should be register base and length as documented in the
> +  datasheet
> +- #reset-cells: 1, see below
> +
> +example:
> +
> +reset_ahb1: reset@40023810 {
> +	#reset-cells = <1>;
> +	compatible = "st,stm32-reset";
> +	reg = <0x40023810 0x4>;
> +};
[snip]

RM0090 has two different chapters on the RCC IP:
* Reset and clock control for STM32F42xxx and STM32F43xxx (RCC)
* Reset and clock control for STM32F405xx/07xx and STM32F415xx/17xx(RCC)

I therefore feel it is wrong to use "stm32-" here; instead I used
"st,stm32f429-rcc" (also relates to 12/14 discussion). This may apply to
other identifiers, too.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)

* Re: [PATCH RFC v3 0/7] epoll: Introduce new syscalls, epoll_ctl_batch and epoll_pwait1
From: Fam Zheng @ 2015-02-16  1:02 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86-DgEjT+Ai2ygdnm+yROfE0A, Alexander Viro,
	Andrew Morton, Kees Cook, Andy Lutomirski, David Herrmann,
	Alexei Starovoitov, Miklos Szeredi, David Drysdale, Oleg Nesterov,
	David S. Miller, Vivek Goyal, Mike Frysinger, Theodore Ts'o,
	Heiko Carstens, Rasmus Villemoes, Rashika Kheria, Hugh Dickins,
	Mathieu Desnoyers, Peter Zijlstra <peter>
In-Reply-To: <20150215150011.0340686c-T1hC0tSOHrs@public.gmane.org>

On Sun, 02/15 15:00, Jonathan Corbet wrote:
> On Fri, 13 Feb 2015 17:03:56 +0800
> Fam Zheng <famz-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> 
> > SYNOPSIS
> > 
> >        #include <sys/epoll.h>
> > 
> >        int epoll_pwait1(int epfd, int flags,
> >                         struct epoll_event *events,
> >                         int maxevents,
> >                         struct epoll_wait_params *params);
> 
> Quick, possibly dumb question: might it make sense to also pass in 
> sizeof(struct epoll_wait_params)?  That way, when somebody wants to add
> another parameter in the future, the kernel can tell which version is in
> use and they won't have to do an epoll_pwait2()?
> 

Flags can be used for that, if the change is not radically different.
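
For comparison, the size-based alternative from the question can be sketched
in user space like this. The two struct layouts and copy_params() are
hypothetical and are not the epoll_wait_params from this series; the point is
only that the callee can accept an older, smaller struct and zero-fill the
fields it does not carry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical original layout. */
struct wait_params_v1 {
	int clockid;
	long timeout_ns;
};

/* Hypothetical extended layout: same prefix, one field appended. */
struct wait_params_v2 {
	int clockid;
	long timeout_ns;
	unsigned int slack_ns;
};

/*
 * Copy a caller struct of any supported size into the newest layout.
 * Missing trailing fields read as zero; a larger struct is accepted only
 * if the part this code does not understand is all zero.
 */
static int copy_params(struct wait_params_v2 *dst, const void *src,
		       size_t src_size)
{
	const unsigned char *bytes = src;
	size_t i;

	assert(dst != NULL && src != NULL);
	if (src_size < sizeof(struct wait_params_v1))
		return -EINVAL;

	memset(dst, 0, sizeof(*dst));
	if (src_size > sizeof(*dst)) {
		for (i = sizeof(*dst); i < src_size; i++)
			if (bytes[i] != 0)
				return -E2BIG;
		src_size = sizeof(*dst);
	}
	memcpy(dst, src, src_size);
	return 0;
}
```

The flags approach avoids the extra argument but needs a new flag bit for
each extension; the size approach lets old binaries keep working unchanged
as long as new fields treat zero as "not set".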

Fam

* Re: [PATCH v17 1/7] mm: support madvise(MADV_FREE)
From: Minchan Kim @ 2015-02-16  4:36 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Michael Kerrisk (man-pages), Michal Hocko, Andrew Morton,
	linux-kernel, linux-mm, linux-api, Hugh Dickins, Johannes Weiner,
	Rik van Riel, KOSAKI Motohiro, Mel Gorman, Jason Evans,
	zhangyanfei, Kirill A. Shutemov, Kirill A. Shutemov
In-Reply-To: <20150212001403.GA2380@kernel.org>

Hi Shaohua,

On Wed, Feb 11, 2015 at 04:14:03PM -0800, Shaohua Li wrote:
> On Wed, Feb 11, 2015 at 09:56:20AM +0900, Minchan Kim wrote:
> > Hi Shaohua,
> > 
> > On Tue, Feb 10, 2015 at 02:38:26PM -0800, Shaohua Li wrote:
> > > On Mon, Feb 09, 2015 at 04:15:53PM +0900, Minchan Kim wrote:
> > > > On Fri, Feb 06, 2015 at 10:29:18AM -0800, Shaohua Li wrote:
> > > > > On Fri, Feb 06, 2015 at 02:51:03PM +0900, Minchan Kim wrote:
> > > > > > Hi Shaohua,
> > > > > > 
> > > > > > On Thu, Feb 05, 2015 at 04:33:11PM -0800, Shaohua Li wrote:
> > > > > > > 
> > > > > > > Hi Minchan,
> > > > > > > 
> > > > > > > > Sorry to jump into this thread so late, and sorry if some of these issues were discussed before.
> > > > > > > > I'm interested in this patch, so I tried it here. I use a simple test with
> > > > > > 
> > > > > > > No problem at all. Interest always wins over ignorance.
> > > > > > 
> > > > > > > jemalloc. Obviously this can improve performance when there is no memory
> > > > > > > pressure. Did you try setup with memory pressure?
> > > > > > 
> > > > > > Sure but it was not a huge memory system like yours.
> > > > > 
> > > > > Yes, I'd like to check the symptom in memory pressure, so choose such test.
> > > > > 
> > > > > > > In my test, jemalloc will map 61G vma, and use about 32G memory without
> > > > > > > MADV_FREE. If MADV_FREE is enabled, jemalloc will use whole 61G memory because
> > > > > > > madvise doesn't reclaim the unused memory. If I disable swap (tweak your patch
> > > > > > 
> > > > > > Yes, IIUC, jemalloc replaces MADV_DONTNEED with MADV_FREE completely.
> > > > > 
> > > > > right.
> > > > > > > slightly to make it work without swap), I got oom. If swap is enabled, my
> > > > > > 
> > > > > > You mean you modified anon aging logic so it works although there is no swap?
> > > > > > If so, I have no idea why OOM happens. I guess it should free all of freeable
> > > > > > pages during the aging so although system stall happens more, I don't expect
> > > > > > OOM. Anyway, with MADV_FREE with no swap, we should consider more things
> > > > > > about anonymous aging.
> > > > > 
> > > > > In the patch, MADV_FREE will be disabled and fallback to DONTNEED if no swap is
> > > > > enabled. Our production environment doesn't enable swap, so I tried to delete
> > > > > the 'no swap' check and make MADV_FREE always enabled regardless if swap is
> > > > > enabled. I didn't change anything else. With such change, I saw oom
> > > > > immediately. So definitely we have aging issue, the pages aren't reclaimed
> > > > > fast.
> > > > 
> > > > In current VM implementation, it doesn't age anonymous LRU list if we have no
> > > > swap. That's the reason to drop freeing pages instantly.
> > > > I think it could be enhanced later.
> > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-December/311591.html
> > > > 
> > > > > 
> > > > > > > system is totally stalled because of swap activity. Without the MADV_FREE,
> > > > > > > everything is ok. Considering we definitely don't want to waste too much
> > > > > > > memory, a system with memory pressure is normal, so sounds MADV_FREE will
> > > > > > > introduce big trouble here.
> > > > > > > 
> > > > > > > Did you think about move the MADV_FREE pages to the head of inactive LRU, so
> > > > > > > they can be reclaimed easily?
> > > > > > 
> > > > > > I think it's desirable if the page lived in active LRU.
> > > > > > > The reason I didn't do that was the volatile ranges system call, which
> > > > > > > was the motivation for MADV_FREE in my mind.
> > > > > > > At the last LSF/MM, there was concern about data's hotness.
> > > > > > > Some users want to keep it where it is in the LRU; others want to
> > > > > > > handle it as cold (tail of inactive list), warm (head of inactive list), or
> > > > > > > hot (head of active list), for example.
> > > > > > > The vrange syscall was just about volatility and did not depend on page hotness,
> > > > > > > so my decision was not to change the LRU order and to add a new
> > > > > > > hotness advice if we need it later.
> > > > > > 
> > > > > > > However, MADV_FREE's main customers are allocators and, afaik, they want
> > > > > > > to replace MADV_DONTNEED with MADV_FREE, so I think such pages are really cold,
> > > > > > > but we can't be sure, so the head of the inactive list is a good compromise.
> > > > > > > Another concern about the tail of the inactive list is that there could be
> > > > > > > plenty of pages there which were asynchronously written back in a
> > > > > > > previous reclaim path but not yet reclaimed, because they cannot be
> > > > > > > freed in the softirq context of writeback. It means we would end up
> > > > > > > freeing pages that could still become part of the working set ahead of
> > > > > > > pages the VM has already decided to evict.
> > > > > 
> > > > > Yes, they are definitely cold pages. I thought We should make sure the
> > > > > MADV_FREE pages are reclaimed first before other pages, at least in the anon
> > > > > LRU list, though there might be difficult to determine if we should reclaim
> > > > > writeback pages first or MADV_FREE pages first.
> > > > 
> > > > Frankly speaking, the issue with writeback page is just hurdle of
> > > > implementation, not design so if we could fix it, we might move
> > > > cold pages into tail of the inactive LRU. I tried it but don't have
> > > > time slot to continue these days. Hope to get a time to look soon.
> > > > https://lkml.org/lkml/2014/7/1/628
> > > > Even, it wouldn't be critical problem although we couldn't fix
> > > > the problem of writeback pages because they are already all
> > > > cold pages so it might be not important to keep order in LRU so
> > > > we could save working set and effort of VM to reclaim them
> > > > at the cost of moving all of hinting pages into tail of the LRU
> > > > whenever the syscall is called.
> > > > 
> > > > However, significant problem from my mind is we couldn't make
> > > > sure they are really cold pages. It would be true for allocators
> > > > but it's cache-friendly pages so it might be better to discard
> > > > tail pages of inactive LRU, which are really cold.
> > > > In addition, we couldn't expect all of usecase for MADV_FREE
> > > > so some of users might want to treat them as warm, not cold.
> > > > 
> > > > With moving them into inactive list's head, if we still see
> > > > a lot stall, I think it's a sign to add other logic, for example,
> > > > we could drop MADV_FREEed pages instantly if the zone is below
> > > > low min watermark when the syscall is called. Because everybody
> > > > doesn't like direct reclaim.
> > > 
> > > So I tried move the MADV_FREE pages to inactive list head or tail. It helps a
> > > little. But there are still stalls/oom. kswapd isn't fast enough to free the
> > > pages, App enters direct reclaim frequently. In one machine, no swap trigger,
> > > but MADV_FREE is 5x slower than MADV_DONTNEED. In another machine, MADV_FREE
> > 
> > It's expected. MADV_DONTNEED and MADV_FREE are really different.
> > MADV_DONTNEED is a self-sacrifice for others in the system, while MADV_FREE is
> > a greedy approach for itself, because any process asking for memory could
> > enter direct reclaim.
> > However, as I said earlier, we could mitigate the problem by checking
> > min_free_kbytes. If memory in the system is under min_free_kbytes, it is
> > pointless to impose reclaim overhead for hinted pages, because the
> > hint means "please free this when you are in trouble with memory" and we
> > already know that we are.
> > 
> > When I test the patch below on my 3G machine + 12 CPU + 8G swap with this
> > test: 12 processes (each process does 5 iterations: mmap 512M + memset + madvise),
> > 
> > 1. MADV_DONTNEED : 41.884sec, sys:3m4.552
> > 2. MADV_FREE : 1m28sec, sys: 5m23
> > 3. MADV_FREE + below patch : 37.188s, sys: 2m20
> > 
> > Could you test?
> >         
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 6d0fcb8..da15f8f 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -523,7 +523,7 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
> >  		 * XXX: In this implementation, MADV_FREE works like
> >  		 * MADV_DONTNEED on swapless system or full swap.
> >  		 */
> > -		if (get_nr_swap_pages() > 0)
> > +		if (get_nr_swap_pages() > 0 && min_free_kbytes < nr_free_pages())
> >  			return madvise_free(vma, prev, start, end);
> >  		/* passthrough */
> >  	case MADV_DONTNEED:
> 
> The throttling makes a lot of sense, definitely should be included in the
> patch. At least my jemalloc test has similar performance result with/without

Yep, I will post it with a small modification after a long vacation.
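
One detail worth checking in any follow-up: as far as I can tell, the quoted
condition compares min_free_kbytes, which is in kilobytes, against
nr_free_pages(), which counts pages. The user-space sketch below models the
same decision with the units normalized. The function names, parameters and
page size are assumptions for illustration only; this is not the modification
mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Convert a threshold in kilobytes to pages for a given page size in bytes. */
static unsigned long kbytes_to_pages(unsigned long kbytes,
				     unsigned long page_size)
{
	assert(page_size >= 1024 && page_size % 1024 == 0);
	return kbytes / (page_size / 1024);
}

/*
 * Model of the throttle: free lazily only when swap is available and free
 * memory is above the threshold; otherwise behave like MADV_DONTNEED.
 */
static bool use_lazy_free(long nr_swap_pages, unsigned long min_free_kbytes,
			  unsigned long nr_free_pages, unsigned long page_size)
{
	if (nr_swap_pages <= 0)
		return false;
	return kbytes_to_pages(min_free_kbytes, page_size) < nr_free_pages;
}
```

With 4 KiB pages, a 4096 KiB threshold is 1024 pages, so 2000 free pages
still allows lazy freeing, whereas a raw comparison of 4096 against 2000
would already have fallen back to MADV_DONTNEED.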

> the patch in memory pressure case. So overall I'm pretty happy with it.

Thanks for the testing.

> However, this only solves half of the problem. Pages which are MADV_FREE'd before
> the watermark is hit are still hard to reclaim later if there are other
> allocations. I'm not sure how severe this issue is. My jemalloc test frequently
> calls madvise (which falls back to DONTNEED with the above change), so it can
> free a lot of memory by itself under memory pressure. If an application uses
> MADV_FREE before the watermark is hit, but doesn't use it after the watermark
> is hit, we will have trouble.

Fair enough. It might make those pages close to the inactive LRU's tail
unlikely to be freed; instead they rotate back to the active LRU.
Hmm, I don't know how much trouble such anonymous LRU scanning without
freeing causes on a huge system.

Anyway, one idea is that we could use COW so that recently dirtied pages
are moved to the active LRU's head. Although it adds more overhead to
MADV_FREE than now, it could solve the above issue.

Also, I think we could make MADV_FREE easier to support on a swapless system.
On a swapless system, we don't move pages from the active LRU to the inactive
one, so when MADV_FREE is called, we could move those pages to the inactive
LRU, and if a recent access happens on those pages before the VM discards them,
we could move them from the inactive to the active list. So the inactive LRU
list would hold mostly freeable pages (if a swapoff race happens, some
non-freeable pages remain on the inactive list), so it's not a performance
problem as long as the VM does aging only when there are anonymous pages
on the inactive LRU list on a swapless system.

> 
> Thanks,
> Shaohua

-- 
Kind regards,
Minchan Kim


* RE: [PATCH RFC v3 0/7] epoll: Introduce new syscalls, epoll_ctl_batch and epoll_pwait1
From: Seymour, Shane M @ 2015-02-16  7:25 UTC (permalink / raw)
  To: Fam Zheng, Jonathan Corbet
  Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86@kernel.org, Alexander Viro, Andrew Morton,
	Kees Cook, Andy Lutomirski, David Herrmann, Alexei Starovoitov,
	Miklos Szeredi, David Drysdale, Oleg Nesterov, David S. Miller,
	Vivek Goyal, Mike Frysinger, Theodore Ts'o, Heiko Carstens,
	Rasmus Villemoes, Rashika Kheria, Hugh Dickins, Mathieu
In-Reply-To: <20150216010224.GA32421@ad.nay.redhat.com>

I found the manual pages really confusing so I had a go at rewriting
them (there were places in the manual page that didn't match the
functionality provided by your code as well as I could tell).

My apologies for a few formatting issues though. I still don't like
parts of epoll_pwait1 but it's less confusing than it was.

You are free to take some or all or none of the changes.

I did have a question I marked with **** below about what you
describe and what your code does.

1) epoll_ctl_batch
------------------

NAME
       epoll_ctl_batch - batch control interface for an epoll descriptor

SYNOPSIS

       #include <sys/epoll.h>

       int epoll_ctl_batch(int epfd, int flags,
                           int ncmds, struct epoll_ctl_cmd *cmds);

DESCRIPTION

       This system call is an extension of epoll_ctl(). The primary
       difference is that this system call allows you to batch multiple
       operations in one system call. This provides a more efficient
       interface for updating events on the epoll file descriptor epfd.

       The flags argument is reserved and must be 0.

       The argument ncmds is the number of cmds entries being passed in.
       This number must be greater than 0.

       Each operation is specified as an element in the cmds array, defined as:

           struct epoll_ctl_cmd {

                  /* Reserved flags for future extension, must be 0. */
                  int flags;

                  /* The same as epoll_ctl() op parameter. */
                  int op;

                  /* The same as epoll_ctl() fd parameter. */
                  int fd;

                  /* The same as the "events" field in struct epoll_event. */
                  uint32_t events;

                  /* The same as the "data" field in struct epoll_event. */
                  uint64_t data;

                  /* Output field, will be set to the return code after this
                   * command is executed by kernel */
                  int result;
           };

       This system call is not atomic when updating the epoll descriptor.
       All entries in cmds are executed in the order provided. If any cmds
       entry fails to be processed, no further entries are processed and
       the number of successfully processed entries is returned.

       Each single operation defined by a struct epoll_ctl_cmd has the same
       semantics as an epoll_ctl(2) call. See the epoll_ctl(2) manual page
       for more information about how to correctly set up the members of a
       struct epoll_ctl_cmd.

       Upon completion of the call the result member of each struct epoll_ctl_cmd
       may be set to 0 (successfully completed) or an error code depending on the
       result of the command. If the kernel fails to change the result (for
       example the location of the cmds argument is fully or partly read only)
       the result member of each struct epoll_ctl_cmd may be unchanged.

RETURN VALUE

       epoll_ctl_batch() returns a number greater than 0 to indicate the number
       of cmds entries processed. If all entries have been processed this will
       equal the ncmds parameter passed in.

       If one or more parameters are incorrect the value returned is -1 with
       errno set appropriately - no cmds entries have been processed when this
       happens.

       If processing any entry in the cmds argument results in an error the
       number returned is the zero-based index of the failing entry, which
       equals the number of entries processed successfully - this number will
       be less than ncmds. Since ncmds must be greater than 0 a return value of
       0 indicates an error associated with the very first cmds entry.
       A return value of 0 does not indicate a successful system call.

       To correctly test the return value from epoll_ctl_batch() use code similar
       to the following:

		ret = epoll_ctl_batch(epfd, flags, ncmds, cmds);
		if (ret < ncmds) {
			if (ret == -1) {
				/* An argument was invalid; see errno */
			} else {
				/*
				 * ret is the number of entries processed
				 * successfully, which is also the array index
				 * of the failing entry. cmds[ret].result may
				 * contain the negative errno value associated
				 * with that entry.
				 */
			}
		} else {
			/* Success */
		}

ERRORS

       EINVAL flags is non-zero, or ncmds is less than or equal to zero, or
              cmds is NULL.

       ENOMEM There was insufficient memory to handle the requested control
              operation.

       EFAULT The memory area pointed to by cmds is not accessible with read
              permissions.

       In the event that the return value is not the same as the ncmds parameter
       the result member of the failing struct epoll_ctl_cmd will contain a
       negative errno value related to the error. The errno values that can be set
       are those documented on the epoll_ctl(2) manual page.
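
To make the return value and result semantics above concrete, here is a
hedged userspace sketch that emulates them with a loop over the existing
epoll_ctl(2). The function name emulated_epoll_ctl_batch() is
hypothetical, the struct layout is copied from the DESCRIPTION above, and
rejecting a non-zero per-command flags field with -EINVAL is an
assumption. It illustrates the documented behaviour only; it is not the
proposed kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Layout copied from the DESCRIPTION section of the draft man page. */
struct epoll_ctl_cmd {
	int flags;
	int op;
	int fd;
	uint32_t events;
	uint64_t data;
	int result;
};

/*
 * Hypothetical emulation: run the commands in order, stop at the first
 * failure, store 0 or a negative errno in each processed entry, and
 * return the number of entries that succeeded.
 */
int emulated_epoll_ctl_batch(int epfd, int flags, int ncmds,
			     struct epoll_ctl_cmd *cmds)
{
	int i;

	if (flags != 0 || ncmds <= 0 || cmds == NULL) {
		errno = EINVAL;
		return -1;	/* nothing has been processed */
	}

	for (i = 0; i < ncmds; i++) {
		struct epoll_event ev = { .events = cmds[i].events };

		ev.data.u64 = cmds[i].data;

		if (cmds[i].flags != 0) {
			/* Assumption: reserved per-command flags are rejected. */
			cmds[i].result = -EINVAL;
			break;
		}
		if (epoll_ctl(epfd, cmds[i].op, cmds[i].fd, &ev) == -1) {
			cmds[i].result = -errno;
			break;
		}
		cmds[i].result = 0;
	}

	/* i is both the success count and the index of the failing entry. */
	assert(i <= ncmds);
	return i;
}
```

Adding the same descriptor twice in one batch, for example, would stop at
the second entry: the call returns 1, cmds[1].result holds -EEXIST, and
later entries keep whatever value their result member had before the call.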

CONFORMING TO

       epoll_ctl_batch() is Linux-specific.

SEE ALSO

       epoll_create(2), epoll_ctl(2), epoll_wait(2), epoll_pwait(2), epoll(7)


2) epoll_pwait1
---------------

NAME
       epoll_pwait1 - wait for an I/O event on an epoll file descriptor

SYNOPSIS

       #include <sys/epoll.h>

       int epoll_pwait1(int epfd, int flags,
                        struct epoll_event *events,
                        int maxevents,
                        struct epoll_wait_params *params);

DESCRIPTION

       The epoll_pwait1() syscall differs from epoll_pwait() only in its
       parameter list. The epfd, events and maxevents parameters are the same
       as in epoll_wait() and epoll_pwait(). The flags and params parameters
       are new.

       The flags argument is reserved and must be zero.

       The params argument is a pointer to a struct epoll_wait_params which is
       defined as:

           struct epoll_wait_params {
               int clockid;
               struct timespec timeout;
               sigset_t *sigmask;
               size_t sigsetsize;
           };

       The clockid member must be either CLOCK_REALTIME or CLOCK_MONOTONIC.
       It selects the clock used for the timeout. This differs from
       epoll_pwait(2) which has an implicit clock type of CLOCK_MONOTONIC.
       
       The timeout member specifies the minimum time that epoll_pwait1(2)
       will block. The time spent waiting will be rounded up to the clock
       granularity. Kernel scheduling delays mean that the blocking
       interval may overrun by a small amount. Specifying -1 for either
       the tv_sec or tv_nsec member of the struct timespec timeout
       causes epoll_pwait1(2) to block indefinitely. Specifying a timeout
       equal to zero (both the tv_sec and tv_nsec members of the struct
       timespec timeout are zero) causes epoll_pwait1(2) to return
       immediately, even if no events are available.

**** Are you really really sure about this for the -1 stuff? Your code copies
in the timespec and just passes it to timespec_to_ktime:

+	if (copy_from_user(&p, params, sizeof(p)))
+		return -EFAULT;
...
+	kt = timespec_to_ktime(p.timeout);

Compare that to something like the futex syscall which does this:

		if (copy_from_user(&ts, utime, sizeof(ts)) != 0)
			return -EFAULT;
		if (!timespec_valid(&ts))
			return -EINVAL;

		t = timespec_to_ktime(ts);

If the timespec is not valid it returns -EINVAL back to user space. By setting
tv_sec and/or tv_nsec to -1, are you relying on a side effect of the
conversion that could break your code if, in the unlikely event, someone
changes timespec_to_ktime()? Should it be:

+	if (copy_from_user(&p, params, sizeof(p)))
+		return -EFAULT;
+	if ((p.timeout.tv_sec == -1) || (p.timeout.tv_nsec == -1)) {
+		/* this is off the top of my head, no idea if it will compile */
+		p.timeout.tv_sec = KTIME_SEC_MAX;
+		p.timeout.tv_nsec = 0;
+	}
+	if (!timespec_valid(&p.timeout))
+		return -EINVAL;
...
+	kt = timespec_to_ktime(p.timeout);

I could of course be worried about nothing here. Is what I've suggested the
right thing to do? Anyone, feel free to chime in.
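
The ordering concern above (map the -1 sentinel first, validate
afterwards) can be tried out in userspace. The sketch below mirrors the
rules that the kernel's timespec_valid() applies, as I understand them: a
negative tv_sec or a tv_nsec outside 0..999999999 is invalid. The names
timespec_is_valid(), classify_timeout() and enum wait_mode are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Userspace mirror of the validity rules applied by timespec_valid(). */
bool timespec_is_valid(const struct timespec *ts)
{
	if (ts->tv_sec < 0)
		return false;
	/* The unsigned comparison also rejects negative tv_nsec values. */
	if ((unsigned long)ts->tv_nsec >= (unsigned long)NSEC_PER_SEC)
		return false;
	return true;
}

enum wait_mode { WAIT_INVALID, WAIT_POLL, WAIT_TIMED, WAIT_FOREVER };

/*
 * Classify a timeout the way the draft text describes it: the -1
 * sentinel is handled before validation, because a timespec carrying -1
 * would otherwise be rejected as invalid.
 */
enum wait_mode classify_timeout(const struct timespec *ts)
{
	if (ts->tv_sec == -1 || ts->tv_nsec == -1)
		return WAIT_FOREVER;
	if (!timespec_is_valid(ts))
		return WAIT_INVALID;
	if (ts->tv_sec == 0 && ts->tv_nsec == 0)
		return WAIT_POLL;
	return WAIT_TIMED;
}
```

Without the sentinel check, a timeout of { -1, 0 } fails validation, which
is why the order of the two tests matters.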

       Both sigmask and sigsetsize have the same semantics as epoll_pwait(2). The
       sigmask field may be specified as NULL, in which case epoll_pwait1(2)
       will behave like epoll_wait(2).

   User visibility of sigsetsize

       In epoll_pwait(2) and other syscalls, sigsetsize is not visible to
       an application developer as glibc has a wrapper around epoll_pwait(2).
       Now we pack several parameters in epoll_wait_params. In
       order to hide sigsetsize from application code this system call also
       needs to be wrapped, either by expanding the parameters and building the
       structure in the wrapper function, or by asking the application to
       provide only this part of the structure:

           struct epoll_wait_params_user {
               int clockid;
               struct timespec timeout;
               sigset_t *sigmask;
           };

      In the wrapper function it would be copied to a full structure and
      sigsetsize filled in.
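
A rough sketch of the copy step such a wrapper might perform is shown
below. Both structures are copied from the text above; the helper name
epoll_wait_params_fill() and the constant KERNEL_SIGSET_SIZE are
hypothetical. The value 8 assumes a 64-signal kernel signal set, which
holds on x86 and ARM but not on every architecture.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Full structure, as passed to the system call. */
struct epoll_wait_params {
	int clockid;
	struct timespec timeout;
	sigset_t *sigmask;
	size_t sigsetsize;
};

/* Application-facing structure without sigsetsize. */
struct epoll_wait_params_user {
	int clockid;
	struct timespec timeout;
	sigset_t *sigmask;
};

/* Assumption: kernel signal set size in bytes (64 signals / 8 bits). */
#define KERNEL_SIGSET_SIZE 8

/* Copy the user-visible fields and fill in sigsetsize for the kernel. */
void epoll_wait_params_fill(struct epoll_wait_params *full,
			    const struct epoll_wait_params_user *user)
{
	assert(full != NULL && user != NULL);

	full->clockid = user->clockid;
	full->timeout = user->timeout;
	full->sigmask = user->sigmask;
	full->sigsetsize = KERNEL_SIGSET_SIZE;
}
```

A libc wrapper would call this helper and then pass the full structure to
the raw system call, so application code never sees sigsetsize.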

RETURN VALUE

       When successful, epoll_pwait1() returns the number of file descriptors
       ready for the requested I/O, or zero if no file descriptor became ready
       before the requested timeout expired. When an error occurs,
       epoll_pwait1() returns -1 and errno is set appropriately.

ERRORS

       This system call can set errno to the same values as epoll_pwait(2), 
       as well as the following additional reasons:

       EINVAL flags is not zero, or clockid is not one of CLOCK_REALTIME or
              CLOCK_MONOTONIC.

       EFAULT The memory area pointed to by params is not accessible.


CONFORMING TO

       epoll_pwait1() is Linux-specific.

SEE ALSO

       epoll_create(2), epoll_ctl(2), epoll_wait(2), epoll_pwait(2), epoll(7)


* Re: [PATCH RFC v3 0/7] epoll: Introduce new syscalls, epoll_ctl_batch and epoll_pwait1
From: Fam Zheng @ 2015-02-16  8:12 UTC (permalink / raw)
  To: Seymour, Shane M
  Cc: Jonathan Corbet,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86@kernel.org, Alexander Viro,
	Andrew Morton, Kees Cook, Andy Lutomirski, David Herrmann,
	Alexei Starovoitov, Miklos Szeredi, David Drysdale, Oleg Nesterov,
	David S. Miller, Vivek Goyal, Mike Frysinger, Theodore Ts'o,
	Heiko Carstens, Rasmus Villemoes, Rashika Kheria,
	Hugh Dickins <hughd>
In-Reply-To: <DDB9C85B850785449757F9914A034FCB3BF41130-4I1V4pQFGigSZAcGdq5asR6epYMZPwEe5NbjCUgZEJk@public.gmane.org>

Hi Seymour,

On Mon, 02/16 07:25, Seymour, Shane M wrote:
> I found the manual pages really confusing so I had a go at rewriting
> them (there were places in the manual page that didn't match the
> functionality provided by your code as well as I could tell).

Could you point out which places don't match the code?

> 
> My apologies for a few formatting issues though. I still don't like
> parts of epoll_pwait1 but it's less confusing than it was.

Is there anything other than the timespec question that you don't like?

> 
> You are free to take some or all or none of the changes.
> 
> I did have a question I marked with **** below about what you
> describe and what your code does.
> 

<snip>

>        The timeout member specifies the minimum time that epoll_pwait1(2)
>        will block. The time spent waiting will be rounded up to the clock
>        granularity. Kernel scheduling delays mean that the blocking
>        interval may overrun by a small amount. Specifying -1 for either
>        the tv_sec or tv_nsec member of the struct timespec timeout
>        causes epoll_pwait1(2) to block indefinitely. Specifying a timeout
>        equal to zero (both the tv_sec and tv_nsec members of the struct
>        timespec timeout are zero) causes epoll_pwait1(2) to return
>        immediately, even if no events are available.
> 
> **** Are you really really sure about this for the -1 stuff? your code copies
> in the timespec and just passes it to timespec_to_ktime:
> 
> +	if (copy_from_user(&p, params, sizeof(p)))
> +		return -EFAULT;
> ...
> +	kt = timespec_to_ktime(p.timeout);
> 
> Compare that to something like the futex syscall which does this:
> 
> 		if (copy_from_user(&ts, utime, sizeof(ts)) != 0)
> 			return -EFAULT;
> 		if (!timespec_valid(&ts))
> 			return -EINVAL;
> 
> 		t = timespec_to_ktime(ts);
> 
> If the timespec is not valid it returns -EINVAL back to user space. By setting
> tv_sec and/or tv_nsec to -1, are you relying on a side effect of the
> conversion that could break your code if, in the unlikely event, someone
> changes timespec_to_ktime()? Should it be:
> 
> +	if (copy_from_user(&p, params, sizeof(p)))
> +		return -EFAULT;
> +	if ((p.timeout.tv_sec == -1) || (p.timeout.tv_nsec == -1)) {
> +		/* this is off the top of my head, no idea if it will compile */
> +		p.timeout.tv_sec = KTIME_SEC_MAX;
> +		p.timeout.tv_nsec = 0;
> +	}
> +	if (!timespec_valid(&p.timeout))
> +		return -EINVAL;
> ...
> +	kt = timespec_to_ktime(p.timeout);

OK. timespec_valid() is clear about this: a negative tv_sec is invalid, so I
don't think accepting -1 from the user is the right thing to do.

We cannot do a NULL pointer check as ppoll already does, because the
structure is embedded in epoll_wait_params.

Maybe it's best to use a flags bit (#define EPOLL_PWAIT1_BLOCK 1).  What do you
think?
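
One possible reading of the flag-bit idea, sketched in userspace: the
flag name and value come from the message above, while the helper
check_pwait1_timeout() and its exact rules (unknown flag bits rejected,
timeout ignored when the flag is set) are assumptions made for
illustration, not the behaviour of any posted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <time.h>

#define EPOLL_PWAIT1_BLOCK 1	/* name and value from the message above */

/*
 * Decide how long to wait. Returns 0 and sets *block_forever on success,
 * or a negative errno value for invalid input.
 */
int check_pwait1_timeout(int flags, const struct timespec *timeout,
			 bool *block_forever)
{
	/* Assumption: any flag bit other than EPOLL_PWAIT1_BLOCK is invalid. */
	if (flags & ~EPOLL_PWAIT1_BLOCK)
		return -EINVAL;

	if (flags & EPOLL_PWAIT1_BLOCK) {
		*block_forever = true;	/* the timeout value is ignored */
		return 0;
	}

	/* No sentinel values: the timespec must be valid as given. */
	if (timeout->tv_sec < 0 || timeout->tv_nsec < 0 ||
	    timeout->tv_nsec >= 1000000000L)
		return -EINVAL;

	*block_forever = false;
	return 0;
}
```

With this shape, a timespec containing -1 is an error unless the caller
explicitly asks to block, so no magic value has to pass through the
timespec conversion.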

Fam

<snip>


* Re: [PATCH v3 linux-trace 1/8] tracing: attach eBPF programs to tracepoints and syscalls
From: He Kuang @ 2015-02-16 11:26 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt
  Cc: Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa,
	Masami Hiramatsu, Linux API, Network Development, LKML,
	Linus Torvalds, Peter Zijlstra, Eric W. Biederman, wangnan0
In-Reply-To: <CAMEtUuzY_Po=WtFEFg1aqzJ8dEF4rHGcWDsaS44KYgACMNPPgA@mail.gmail.com>

Hi, Alexei

Another suggestion on the bpf syscall interface. Currently, BPF +
syscalls/kprobes depends on CONFIG_BPF_SYSCALL. In kernels used in
commercial products, CONFIG_BPF_SYSCALL is probably disabled; in this
case, bpf bytecode cannot be loaded into the kernel.

If we turn the functionality of BPF_SYSCALL into a loadable module, then
we can use it without any dependency on the kernel configuration. What
about changing the bpf syscall to a /dev node or /sys file which can be
exported by a kernel module?


* Re: [PATCH 00/14] Add support to STMicroelectronics STM32 family
From: Maxime Coquelin @ 2015-02-16 11:52 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Jonathan Corbet, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <54E0B7C4.7050900-l3A5Bk7waGM@public.gmane.org>

Hi Andreas,

2015-02-15 16:14 GMT+01:00 Andreas Färber <afaerber@suse.de>:
> Hi Maxime,
>
> Am 12.02.2015 um 18:45 schrieb Maxime Coquelin:
>> This patchset adds basic support for STMicroelectronics STM32 series MCUs.
>>
>> STM32 MCUs are Cortex-M CPU, used in various applications (consumer
>> electronics, industrial applications, hobbyists...).
>> Datasheets, user and programming manuals are publicly available on
>> STMicroelectronics website.
>>
>> With this series applied, the STM32F429 Discovery can boot successfully.
>>
>> Once this series is accepted, next steps will be to add DMA support, as USART,
>> I2C and SPI IPs don't have any FIFO. Then will come the clock driver, as today
>> the bootloader has to be patched to enable the needed clocks.
>
> This is somewhat unfortunate, as I have been working on the same thing
> and have demonstrated the STM32F429 Discovery Kit at ARM TechSymposium
> Europe in December and submitted a talk for LinuxCon Japan.
>
> https://github.com/afaerber/afboot-stm32
> https://github.com/afaerber/linux/commits/stm32

Hmm, I wasn't aware you were also working on it.
The good news is that we are not alone on this task :).
I do it in my spare time, so any contribution is more than welcome.

>
> On a brief look, it seems you are further along in terms of code quality
> and documenting. Do you spot anything that's missing in your series and
> could be added from my branch? The clk controller maybe? Also I already
> started looking into gpio and usb drivers. Me too, I skipped DMA support
> though.

The GPIO support is already part of the pinctrl patch.
The missing thing is the GPIO interrupt feature, but I am working on it.

Maybe you could focus on the clock support, as I see it is
well advanced in your tree?
I see one bug in it: the timer clocks should be 90MHz, but your patch
indicates 45MHz.
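
The factor of two most likely comes from the STM32F4 clock tree rule
that timers run at twice the APB clock whenever the APB prescaler is not
1 (with the RCC TIMPRE bit left at 0). A small worked sketch follows; the
function name stm32_timer_clk_hz() is hypothetical, and the 180MHz HCLK
with an APB1 prescaler of 4 is an assumed configuration that is not
stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * STM32F4 timer kernel clock, assuming the RCC TIMPRE bit is clear:
 * the timer runs at PCLK when the APB prescaler is 1, else at 2 * PCLK.
 */
uint32_t stm32_timer_clk_hz(uint32_t hclk_hz, uint32_t apb_prescaler)
{
	uint32_t pclk_hz;

	assert(apb_prescaler == 1 || apb_prescaler == 2 ||
	       apb_prescaler == 4 || apb_prescaler == 8 ||
	       apb_prescaler == 16);

	pclk_hz = hclk_hz / apb_prescaler;
	return apb_prescaler == 1 ? pclk_hz : 2 * pclk_hz;
}
```

Under the assumed configuration, PCLK1 is 180MHz / 4 = 45MHz and the
timers on that bus are clocked at 2 * 45MHz = 90MHz, which matches the
figures quoted above.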

I see you have started the LCD controller driver; maybe that is
another task you could handle?

Regarding USB, have you got it working?

Kind regards,
Maxime

>
> Regards,
> Andreas
>
>>
>> Maxime Coquelin (14):
>>   scripts: link-vmlinux: Don't pass page offset to kallsyms if XIP
>>     Kernel
>>   ARM: ARMv7M: Enlarge vector table to 256 entries
>>   clocksource: Add ARM System timer driver
>>   reset: Add reset_controller_of_init() function
>>   ARM: call reset_controller_of_init from default time_init handler
>>   drivers: reset: Add STM32 reset driver
>>   clockevent: Add STM32 Timer driver
>>   pinctrl: Add pinctrl driver for STM32 MCUs
>>   serial: stm32-usart: Add STM32 USART Driver
>>   ARM: Add STM32 family machine
>>   ARM: dts: Add ARM System timer as clockevent in armv7m
>>   ARM: dts: Introduce STM32F429 MCU
>>   ARM: configs: Add STM32 defconfig
>>   MAINTAINERS: Add entry for STM32 MCUs
>>
>>  Documentation/arm/stm32/overview.txt               |  32 +
>>  Documentation/arm/stm32/stm32f429-overview.txt     |  22 +
>>  .../devicetree/bindings/arm/system_timer.txt       |  15 +
>>  .../devicetree/bindings/pinctrl/pinctrl-stm32.txt  |  99 +++
>>  .../devicetree/bindings/reset/st,stm32-reset.txt   |  19 +
>>  .../devicetree/bindings/serial/st,stm32-usart.txt  |  18 +
>>  .../devicetree/bindings/timer/st,stm32-timer.txt   |  19 +
>>  MAINTAINERS                                        |   7 +
>>  arch/arm/Kconfig                                   |  22 +
>>  arch/arm/Makefile                                  |   1 +
>>  arch/arm/boot/dts/Makefile                         |   1 +
>>  arch/arm/boot/dts/armv7-m.dtsi                     |   7 +
>>  arch/arm/boot/dts/stm32f429-disco.dts              |  41 ++
>>  arch/arm/boot/dts/stm32f429.dtsi                   | 279 ++++++++
>>  arch/arm/configs/stm32_defconfig                   |  72 ++
>>  arch/arm/kernel/entry-v7m.S                        |   8 +-
>>  arch/arm/kernel/time.c                             |   4 +
>>  arch/arm/mach-stm32/Makefile                       |   1 +
>>  arch/arm/mach-stm32/Makefile.boot                  |   0
>>  arch/arm/mach-stm32/board-dt.c                     |  19 +
>>  drivers/clocksource/Kconfig                        |  16 +
>>  drivers/clocksource/Makefile                       |   2 +
>>  drivers/clocksource/arm_system_timer.c             |  74 ++
>>  drivers/clocksource/timer-stm32.c                  | 187 +++++
>>  drivers/pinctrl/Kconfig                            |   9 +
>>  drivers/pinctrl/Makefile                           |   1 +
>>  drivers/pinctrl/pinctrl-stm32.c                    | 779 +++++++++++++++++++++
>>  drivers/reset/Makefile                             |   1 +
>>  drivers/reset/core.c                               |  20 +
>>  drivers/reset/reset-stm32.c                        | 124 ++++
>>  drivers/tty/serial/Kconfig                         |  17 +
>>  drivers/tty/serial/Makefile                        |   1 +
>>  drivers/tty/serial/stm32-usart.c                   | 695 ++++++++++++++++++
>>  include/asm-generic/vmlinux.lds.h                  |   4 +-
>>  include/dt-bindings/pinctrl/pinctrl-stm32.h        |  43 ++
>>  include/linux/reset-controller.h                   |   6 +
>>  include/uapi/linux/serial_core.h                   |   3 +
>>  scripts/link-vmlinux.sh                            |   2 +-
>>  38 files changed, 2664 insertions(+), 6 deletions(-)
>>  create mode 100644 Documentation/arm/stm32/overview.txt
>>  create mode 100644 Documentation/arm/stm32/stm32f429-overview.txt
>>  create mode 100644 Documentation/devicetree/bindings/arm/system_timer.txt
>>  create mode 100644 Documentation/devicetree/bindings/pinctrl/pinctrl-stm32.txt
>>  create mode 100644 Documentation/devicetree/bindings/reset/st,stm32-reset.txt
>>  create mode 100644 Documentation/devicetree/bindings/serial/st,stm32-usart.txt
>>  create mode 100644 Documentation/devicetree/bindings/timer/st,stm32-timer.txt
>>  create mode 100644 arch/arm/boot/dts/stm32f429-disco.dts
>>  create mode 100644 arch/arm/boot/dts/stm32f429.dtsi
>>  create mode 100644 arch/arm/configs/stm32_defconfig
>>  create mode 100644 arch/arm/mach-stm32/Makefile
>>  create mode 100644 arch/arm/mach-stm32/Makefile.boot
>>  create mode 100644 arch/arm/mach-stm32/board-dt.c
>>  create mode 100644 drivers/clocksource/arm_system_timer.c
>>  create mode 100644 drivers/clocksource/timer-stm32.c
>>  create mode 100644 drivers/pinctrl/pinctrl-stm32.c
>>  create mode 100644 drivers/reset/reset-stm32.c
>>  create mode 100644 drivers/tty/serial/stm32-usart.c
>>  create mode 100644 include/dt-bindings/pinctrl/pinctrl-stm32.h
>>
>
>
> --
> SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
> Graham Norton; HRB 21284 (AG Nürnberg)
>


* Re: [PATCH 05/14] ARM: call reset_controller_of_init from default time_init handler
From: Maxime Coquelin @ 2015-02-16 12:02 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jonathan Corbet, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <CAL_Jsq+Sk0C-1UHCKE18fEVwBbV=quV9mrJSKTO_UPXWcaYfCw@mail.gmail.com>

2015-02-15 23:17 GMT+01:00 Rob Herring <robherring2@gmail.com>:
> On Thu, Feb 12, 2015 at 11:45 AM, Maxime Coquelin
> <mcoquelin.stm32@gmail.com> wrote:
>> Some DT ARM platforms need the reset controllers to be initialized before
>> the timers.
>> This is the case of the stm32 and sunxi platforms.
>
> I would say this is the exception, not the rule and therefore should
> be handled in a machine desc function. Or it could be part of your
> timer setup. Or is the bootloader's problem (like arch timer setup).

The only valid way in my opinion would be to implement the init_time
callback (as in your first proposal),
duplicating what the time_init() function does.

Then, if machines other than sunxi and stm32 some day have the same need,
we could consider moving the call to the reset_controller_of_init()
function into time_init().
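
To illustrate the ordering at stake, here is a userspace model of the
dispatch, not kernel code. The three init functions are stubs that only
record the order in which they run; stm32_init_time() is a hypothetical
name for the machine callback; and the assumption that the default path
ends with clocksource_of_init() comes from my reading of
arch/arm/kernel/time.c rather than from the hunk quoted in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Records the order in which the stubbed init steps run. */
static char init_log[64];

static void log_step(const char *name)
{
	strncat(init_log, name, sizeof(init_log) - strlen(init_log) - 1);
}

/* Stubs standing in for the kernel functions named in this thread. */
static void reset_controller_of_init(void) { log_step("reset;"); }
static void of_clk_init(const void *matches) { (void)matches; log_step("clk;"); }
static void clocksource_of_init(void) { log_step("timers;"); }

struct machine_desc {
	void (*init_time)(void);
};

/* Machine callback: the default steps, with reset init done first. */
void stm32_init_time(void)
{
	reset_controller_of_init();
	of_clk_init(NULL);
	clocksource_of_init();
}

/* Model of the default dispatch, without the proposed hunk. */
void time_init(const struct machine_desc *mdesc)
{
	if (mdesc->init_time) {
		mdesc->init_time();
	} else {
		of_clk_init(NULL);
		clocksource_of_init();
	}
}
```

A machine descriptor with init_time set to stm32_init_time would run
"reset, clk, timers", while a descriptor without the callback keeps the
generic "clk, timers" order.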

>
> We just want to limit how much this mechanism gets used.

Could you elaborate on the reason why we want to limit this mechanism, please?
I am not sure I understand.

Thanks,
Maxime

>
> Rob
>
>>
>> This patch adds a call to reset_controller_of_init() to the default
>> .init_time callback when RESET_CONTROLLER is used by the platform.
>>
>> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
>> ---
>>  arch/arm/kernel/time.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
>> index 0cc7e58..4601b1e 100644
>> --- a/arch/arm/kernel/time.c
>> +++ b/arch/arm/kernel/time.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/irq.h>
>>  #include <linux/kernel.h>
>>  #include <linux/profile.h>
>> +#include <linux/reset-controller.h>
>>  #include <linux/sched.h>
>>  #include <linux/sched_clock.h>
>>  #include <linux/smp.h>
>> @@ -117,6 +118,9 @@ void __init time_init(void)
>>         if (machine_desc->init_time) {
>>                 machine_desc->init_time();
>>         } else {
>> +#ifdef CONFIG_RESET_CONTROLLER
>> +               reset_controller_of_init();
>> +#endif
>>  #ifdef CONFIG_COMMON_CLK
>>                 of_clk_init(NULL);
>>  #endif
>> --
>> 1.9.1
>>


* Re: [PATCH 03/14] clocksource: Add ARM System timer driver
From: Maxime Coquelin @ 2015-02-16 12:08 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jonathan Corbet, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <CAL_JsqKoT_rWzt6ZCQXwg-NxM_Mnuqy6UwmPKBRodBCf0i7zyg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

2015-02-15 23:31 GMT+01:00 Rob Herring <robherring2@gmail.com>:
> On Thu, Feb 12, 2015 at 11:45 AM, Maxime Coquelin
> <mcoquelin.stm32@gmail.com> wrote:
>> This patch adds clocksource support for ARMv7-M's System timer,
>> also known as SysTick.
>>
>> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
>> ---
>>  .../devicetree/bindings/arm/system_timer.txt       | 15 +++++
>
> Please include v7M in the name. System timer sounds very generic. This
> is the only timer architecturally defined IIRC, so perhaps just
> "armv7m_systick".

Ok, let's go for "armv7m_systick".

>
>>  drivers/clocksource/Kconfig                        |  7 ++
>>  drivers/clocksource/Makefile                       |  1 +
>>  drivers/clocksource/arm_system_timer.c             | 74 ++++++++++++++++++++++
>
> Same here.

Agree, will be in the v2.

>
>
>>  4 files changed, 97 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/arm/system_timer.txt
>>  create mode 100644 drivers/clocksource/arm_system_timer.c
>>
>> diff --git a/Documentation/devicetree/bindings/arm/system_timer.txt b/Documentation/devicetree/bindings/arm/system_timer.txt
>> new file mode 100644
>> index 0000000..35268b7
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/system_timer.txt
>> @@ -0,0 +1,15 @@
>> +* ARM System Timer
>> +
>> +ARMv7-M includes a system timer, known as SysTick. Current driver only
>> +implements the clocksource feature.
>> +
>> +Required properties:
>> +- compatible : Should be "arm,armv7m-systick"
>> +- reg       : The address range of the timer
>> +- clocks     : The input clock of the timer
>
> You may want to consider supporting "clock-frequency" here too. In
> more simple chips you may just have fixed clocks and may want to run a
> kernel with COMMON_CLK disabled for size savings.

Ok, I will add this option in the v2.

>
>> +
>> +systick: system-timer {
>
> This should be "systick: timer@e000e010".
>
> Same for your dts file.

Right, it will be fixed in the v2.

Thanks for the review,
Maxime
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 03/14] clocksource: Add ARM System timer driver
From: Maxime Coquelin @ 2015-02-16 12:21 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Rob Herring, Jonathan Corbet, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <54E12F39.6030509-l3A5Bk7waGM@public.gmane.org>

2015-02-16 0:43 GMT+01:00 Andreas Färber <afaerber@suse.de>:
> Am 12.02.2015 um 18:45 schrieb Maxime Coquelin:
>> This patch adds clocksource support for ARMv7-M's System timer,
>> also known as SysTick.
>>
>> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
>> ---
>>  .../devicetree/bindings/arm/system_timer.txt       | 15 +++++
>>  drivers/clocksource/Kconfig                        |  7 ++
>>  drivers/clocksource/Makefile                       |  1 +
>>  drivers/clocksource/arm_system_timer.c             | 74 ++++++++++++++++++++++
>>  4 files changed, 97 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/arm/system_timer.txt
>>  create mode 100644 drivers/clocksource/arm_system_timer.c
>>
>> diff --git a/Documentation/devicetree/bindings/arm/system_timer.txt b/Documentation/devicetree/bindings/arm/system_timer.txt
>> new file mode 100644
>> index 0000000..35268b7
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/system_timer.txt
>> @@ -0,0 +1,15 @@
>> +* ARM System Timer
>> +
>> +ARMv7-M includes a system timer, known as SysTick. Current driver only
>> +implements the clocksource feature.
>> +
>> +Required properties:
>> +- compatible : Should be "arm,armv7m-systick"
>> +- reg             : The address range of the timer
>> +- clocks     : The input clock of the timer
>> +
>> +systick: system-timer {
>> +     compatible = "arm,armv7m-systick";
>> +     reg = <0xe000e010 0x10>;
>> +     clocks = <&clk_systick>;
>> +};
>
> Binding documentation is supposed to go into its own patch:
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Documentation/devicetree/bindings/submitting-patches.txt
OK, I will change this in v2.

>
...
>
> I've used a SysTick based implementation on my stm32 branch myself, but
> looking at efm32 I got the impression that it would be better to use one
> of the 32-bit TIM2/TIM5 as clocksource and the other as clockevents?
>
> Still this implementation will be handy to have, also for other targets.

My view is that we should use the generic parts of the Cortex-M as much
as possible.
Moreover, by doing that, we can keep one more IP instance under reset
with its associated clock gated,
and so maybe reduce the power consumption a little (I haven't done any
measurements).

Do you see a case where it could be better to use the STM32 timers?


Thanks,
Maxime
>
> Regards,
> Andreas
>
> --
> SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
> Graham Norton; HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: [PATCH 06/14] drivers: reset: Add STM32 reset driver
From: Maxime Coquelin @ 2015-02-16 12:25 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Jonathan Corbet, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Philipp Zabel, Russell King,
	Daniel Lezcano, Thomas Gleixner, Linus Walleij,
	Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Andrew Morton,
	David S. Miller, Mauro Carvalho Chehab, Joe Perches,
	Antti Palosaari, Tejun Heo, Will Deacon, Nikolay Borisov,
	Rusty Russell, Kees
In-Reply-To: <54E132D6.6010608@suse.de>

2015-02-16 0:59 GMT+01:00 Andreas Färber <afaerber@suse.de>:
> On 12.02.2015 at 18:45, Maxime Coquelin wrote:
>> The STM32 MCUs family IP can be reset by accessing some shared registers.
>>
>> The specificity is that some reset lines are used by the timers.
>> At timer initialization time, the timer has to be reset, that's why
>> we cannot use a regular driver.
>>
>> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
>> ---
>>  .../devicetree/bindings/reset/st,stm32-reset.txt   |  19 ++++
>>  drivers/reset/Makefile                             |   1 +
>>  drivers/reset/reset-stm32.c                        | 124 +++++++++++++++++++++
>>  3 files changed, 144 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/reset/st,stm32-reset.txt
>>  create mode 100644 drivers/reset/reset-stm32.c
>>
>> diff --git a/Documentation/devicetree/bindings/reset/st,stm32-reset.txt b/Documentation/devicetree/bindings/reset/st,stm32-reset.txt
>> new file mode 100644
>> index 0000000..add1298
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/reset/st,stm32-reset.txt
>> @@ -0,0 +1,19 @@
>> +STMicroelectronics STM32 Peripheral Reset Controller
>> +====================================================
>> +
>> +Please also refer to reset.txt in this directory for common reset
>> +controller binding usage.
>> +
>> +Required properties:
>> +- compatible: Should be "st,stm32-reset"
>> +- reg: should be register base and length as documented in the
>> +  datasheet
>> +- #reset-cells: 1, see below
>> +
>> +example:
>> +
>> +reset_ahb1: reset@40023810 {
>> +     #reset-cells = <1>;
>> +     compatible = "st,stm32-reset";
>> +     reg = <0x40023810 0x4>;
>> +};
> [snip]
>
> RM0090 has two different chapters on the RCC IP:
> * Reset and clock control for STM32F42xxx and STM32F43xxx (RCC)
> * Reset and clock control for STM32F405xx/07xx and STM32F415xx/17xx(RCC)
>
> I therefore feel it is wrong to use "stm32-" here; instead I used
> "st,stm32f429-rcc" (also relates to 12/14 discussion). This may apply to
> other identifiers, too.

In this first version, the reset driver was really generic, and was
compatible with the whole STM32 family.
The only difference would have been in the device trees.

Now, following the discussion with Philipp, I will rework the
implementation to add some named constants,
so I may also reconsider the compatible string; it will depend on how I
implement it.

Thanks,
Maxime

>
> Regards,
> Andreas
>
> --
> SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
> Graham Norton; HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: futex(2) man page update help request
From: Cyril Hrubis @ 2015-02-16 13:14 UTC (permalink / raw)
  To: Darren Hart
  Cc: Michael Kerrisk (man-pages), Thomas Gleixner, Ingo Molnar,
	Jakub Jelinek, linux-man@vger.kernel.org, lkml, Davidlohr Bueso,
	Arnd Bergmann, Steven Rostedt, Peter Zijlstra, Linux API,
	Carlos O'Donell
In-Reply-To: <CF9A658E.91322%dvhart@linux.intel.com>

Hi!
> I'll follow up with you in a couple weeks most likely. I have some urgent
> things that will be taking all my time and then some until then. Feel free
> to poke me though if I lose track of it :-)

FYI I've started to work on futex testcases for LTP. The first batch has
been committed in:

https://github.com/linux-test-project/ltp/commit/6270ba2ebe999ffdb1364e5e814d7e56070a0198

Some of these are loosely based on futextest; some are written from
scratch. The requeue operation, PI futexes and bitset are not covered
yet.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply

* [PATCH RESEND 0/12] fs: Introduce FALLOC_FL_INSERT_RANGE for fallocate
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david-FqsqvQoI3Ljby3iVrkZq2A, tytso-3s7WtUTddSA
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA, xfs-VZNHf3L845pBDgjK7y7TUQ,
	a.sangwan-Sze3O3UU22JBDgjK7y7TUQ, bfoster-H+wXaHxf7aLQT0dZR+AlfA,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-man-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Namjae Jeon, Namjae Jeon

From: Namjae Jeon <namjae.jeon-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>

Continuing the work on making non-linear editing of media files faster,
this series introduces the new flag FALLOC_FL_INSERT_RANGE for
fallocate.

This flag is the opposite of the FALLOC_FL_COLLAPSE_RANGE flag.
Specifying FALLOC_FL_INSERT_RANGE creates new space inside a file
by inserting a hole within the range specified by offset and len.
The user can then write new data, e.g. ads, into this space.
As with collapse range, offset and len currently have to be
block size aligned for both XFS and ext4.

The semantics of the flag are:
1) It creates space within the file by inserting a hole of len bytes
   starting at byte offset, without overwriting any existing data. All
   data blocks from offset to EOF are shifted to the right to make room
   for the hole.
2) It must be used exclusively; no other fallocate flag may be combined
   with it.
3) The offset and length supplied to fallocate must be fs block size
   aligned in the case of XFS and ext4.
4) Insert range does not work when offset is at or beyond i_size. If
   the user wants to insert space at the end of the file, they are
   advised to use either ftruncate(2) or fallocate(2) with mode 0.
5) It increases the size of the file by len bytes.
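
As a rough userspace-side illustration of rules 3) to 5) above (this is
not code from the series), the argument checks could be modelled in C
as below. The helper names insert_range_check() and
insert_range_new_size() are made up for this sketch, and returning
-EINVAL for every violation is an assumption; the real checks live in
the VFS and in each filesystem.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: returns 0 if (offset, len)
 * would satisfy the insert range rules listed above, -EINVAL otherwise.
 * blksize is the filesystem block size, i_size the current file size.
 */
int insert_range_check(int64_t offset, int64_t len,
		       int64_t blksize, int64_t i_size)
{
	if (blksize <= 0 || offset < 0 || len <= 0)
		return -EINVAL;
	/* rule 3: offset and len must be block size aligned */
	if (offset % blksize || len % blksize)
		return -EINVAL;
	/* rule 4: offset must lie inside the file, not at or beyond EOF */
	if (offset >= i_size)
		return -EINVAL;
	return 0;
}

/* rule 5: on success the file grows by len bytes */
int64_t insert_range_new_size(int64_t i_size, int64_t len)
{
	return i_size + len;
}
```

A caller whose arguments pass this check would then issue
fallocate(fd, FALLOC_FL_INSERT_RANGE, offset, len) with no other flag
set, as required by rule 2).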


Namjae Jeon (12):
 fs: Add support FALLOC_FL_INSERT_RANGE for fallocate
 xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate
 ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate
 xfsprog: xfsio: update xfs_io manpage for FALLOC_FL_INSERT_RANGE
 xfstests: generic/042: Standard insert range tests
 xfstests: generic/043: Delayed allocation insert range
 xfstests: generic/044: Multi insert range tests
 xfstests: generic/045: Delayed allocation multi insert
 xfstests: generic/046: Test multiple fallocate insert/collapse range calls
 xfstests: fsstress: Add fallocate insert range operation
 xfstests: fsx: Add fallocate insert range operation
 manpage: update FALLOC_FL_INSERT_RANGE flag in fallocate
-- 
1.7.9.5

^ permalink raw reply

* [PATCH RESEND 1/12] fs: Add support FALLOC_FL_INSERT_RANGE for fallocate
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david, tytso
  Cc: linux-man, Namjae Jeon, Namjae Jeon, linux-api, bfoster,
	linux-kernel, xfs, mtk.manpages, a.sangwan, linux-fsdevel,
	linux-ext4
In-Reply-To: <1424101680-3301-1-git-send-email-linkinjeon@gmail.com>

From: Namjae Jeon <namjae.jeon@samsung.com>

FALLOC_FL_INSERT_RANGE is the opposite of FALLOC_FL_COLLAPSE_RANGE. It
is needed by advertisers or anyone else who wants to add some data in
the middle of a file. FALLOC_FL_INSERT_RANGE creates space for writing
new data within a file by shifting extents to the right by the given
length. This command has the same limitations as
FALLOC_FL_COLLAPSE_RANGE: offset and len must be aligned to block
boundaries, and an offset at or beyond EOF is rejected; use
ftruncate(2) for that case.
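
The mode validation this patch extends can be sketched as a standalone
C model, shown here only to make the exclusivity rule concrete. The
TOY_FL_* names and toy_check_mode() are invented for the sketch; the
values mirror include/uapi/linux/falloc.h, with 0x20 being the value
proposed here. The punch hole checks of the real vfs_fallocate() are
left out.

```c
#include <errno.h>

/* Illustrative copies of the fallocate mode bits (hypothetical names) */
enum {
	TOY_FL_KEEP_SIZE      = 0x01,
	TOY_FL_PUNCH_HOLE     = 0x02,
	TOY_FL_COLLAPSE_RANGE = 0x08,
	TOY_FL_ZERO_RANGE     = 0x10,
	TOY_FL_INSERT_RANGE   = 0x20,	/* new in this patch */
};

/* Returns 0 if the mode passes the checks, a negative errno otherwise */
int toy_check_mode(int mode)
{
	const int supported = TOY_FL_KEEP_SIZE | TOY_FL_PUNCH_HOLE |
			      TOY_FL_COLLAPSE_RANGE | TOY_FL_ZERO_RANGE |
			      TOY_FL_INSERT_RANGE;

	/* unknown bits are not supported */
	if (mode & ~supported)
		return -EOPNOTSUPP;

	/* collapse range must be used exclusively */
	if ((mode & TOY_FL_COLLAPSE_RANGE) &&
	    (mode & ~TOY_FL_COLLAPSE_RANGE))
		return -EINVAL;

	/* insert range must be used exclusively */
	if ((mode & TOY_FL_INSERT_RANGE) &&
	    (mode & ~TOY_FL_INSERT_RANGE))
		return -EINVAL;

	return 0;
}
```

Note that combining insert range with collapse range is rejected by
either of the two exclusivity checks.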

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Cc: Brian Foster <bfoster@redhat.com>
---
 fs/open.c                   |    8 +++++++-
 include/uapi/linux/falloc.h |   17 +++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/open.c b/fs/open.c
index 813be03..762fb45 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -232,7 +232,8 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 
 	/* Return error if mode is not supported */
 	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
-		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE))
+		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |
+		     FALLOC_FL_INSERT_RANGE))
 		return -EOPNOTSUPP;
 
 	/* Punch hole and zero range are mutually exclusive */
@@ -250,6 +251,11 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	    (mode & ~FALLOC_FL_COLLAPSE_RANGE))
 		return -EINVAL;
 
+	/* Insert range should only be used exclusively. */
+	if ((mode & FALLOC_FL_INSERT_RANGE) &&
+	    (mode & ~FALLOC_FL_INSERT_RANGE))
+		return -EINVAL;
+
 	if (!(file->f_mode & FMODE_WRITE))
 		return -EBADF;
 
diff --git a/include/uapi/linux/falloc.h b/include/uapi/linux/falloc.h
index d1197ae..3e445a7 100644
--- a/include/uapi/linux/falloc.h
+++ b/include/uapi/linux/falloc.h
@@ -41,4 +41,21 @@
  */
 #define FALLOC_FL_ZERO_RANGE		0x10
 
+/*
+ * FALLOC_FL_INSERT_RANGE is used to insert space within the file size without
+ * overwriting any existing data. The contents of the file beyond offset are
+ * shifted towards right by len bytes to create a hole.  As such, this
+ * operation will increase the size of the file by len bytes.
+ *
+ * Different filesystems may implement different limitations on the granularity
+ * of the operation. Most will limit operations to filesystem block size
+ * boundaries, but this boundary may be larger or smaller depending on
+ * the filesystem and/or the configuration of the filesystem or file.
+ *
+ * Attempting to insert space using this flag at OR beyond the end of
+ * the file is considered an illegal operation - just use ftruncate(2) or
+ * fallocate(2) with mode 0 for such type of operations.
+ */
+#define FALLOC_FL_INSERT_RANGE		0x20
+
 #endif /* _UAPI_FALLOC_H_ */
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH RESEND 2/12] xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david, tytso
  Cc: linux-man, Namjae Jeon, Namjae Jeon, linux-api, bfoster,
	linux-kernel, xfs, mtk.manpages, a.sangwan, linux-fsdevel,
	linux-ext4
In-Reply-To: <1424101680-3301-1-git-send-email-linkinjeon@gmail.com>

From: Namjae Jeon <namjae.jeon@samsung.com>

This patch implements fallocate's FALLOC_FL_INSERT_RANGE for XFS.

1) Make sure that both offset and len are block size aligned.
2) Increase the i_size of the inode by len bytes.
3) Compute the file's logical block number for offset. If the computed
   block number is not the starting block of an extent, split the extent
   such that the block number is the starting block of an extent.
4) Shift all the extents lying between [offset, last allocated extent]
   to the right by len bytes. This step creates a hole of len bytes
   at offset.
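
Steps 3) and 4) can be pictured with a small standalone model that
works on a plain array of extents. This is only a sketch of the idea,
not the XFS code: struct toy_extent, toy_split() and toy_shift_right()
are invented names, and the in-core extent list, the bmap btree and
transactions are all ignored.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_extent {
	uint64_t startoff;	/* first file block covered */
	uint64_t startblock;	/* first disk block backing it */
	uint64_t blockcount;	/* length in blocks */
};

/*
 * Step 3: split the extent containing split_fsb so that split_fsb
 * becomes the first block of an extent. exts must have room for one
 * more entry. Returns the new extent count; nothing changes if
 * split_fsb lies in a hole or already starts an extent.
 */
size_t toy_split(struct toy_extent *exts, size_t n, uint64_t split_fsb)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t end = exts[i].startoff + exts[i].blockcount;
		uint64_t head;

		if (split_fsb <= exts[i].startoff || split_fsb >= end)
			continue;

		head = split_fsb - exts[i].startoff;

		/* make room for the tail right after extent i */
		for (size_t j = n; j > i + 1; j--)
			exts[j] = exts[j - 1];

		exts[i + 1].startoff = split_fsb;
		exts[i + 1].startblock = exts[i].startblock + head;
		exts[i + 1].blockcount = exts[i].blockcount - head;
		exts[i].blockcount = head;
		return n + 1;
	}
	return n;
}

/*
 * Step 4: walk from the last extent backwards and move every extent
 * that starts at or after offset_fsb to the right by len_fsb. Going
 * backwards means an extent never moves onto a neighbour that has not
 * been shifted yet.
 */
void toy_shift_right(struct toy_extent *exts, size_t n,
		     uint64_t offset_fsb, uint64_t len_fsb)
{
	for (size_t i = n; i-- > 0; ) {
		if (exts[i].startoff < offset_fsb)
			break;
		exts[i].startoff += len_fsb;
	}
}
```

For a single 8-block extent at file block 0, splitting at block 4 and
shifting right by 2 leaves blocks 0-3 in place, a hole at blocks 4-5,
and the old blocks 4-7 at file blocks 6-9.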

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_bmap.c |  358 ++++++++++++++++++++++++++++++++++++++++------
 fs/xfs/libxfs/xfs_bmap.h |   13 +-
 fs/xfs/xfs_bmap_util.c   |  126 +++++++++++-----
 fs/xfs/xfs_bmap_util.h   |    2 +
 fs/xfs/xfs_file.c        |   38 ++++-
 fs/xfs/xfs_trace.h       |    1 +
 6 files changed, 455 insertions(+), 83 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 61ec015..6699e53 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5518,50 +5518,86 @@ xfs_bmse_shift_one(
 	int				*current_ext,
 	struct xfs_bmbt_rec_host	*gotp,
 	struct xfs_btree_cur		*cur,
-	int				*logflags)
+	int				*logflags,
+	enum SHIFT_DIRECTION		SHIFT)
 {
 	struct xfs_ifork		*ifp;
 	xfs_fileoff_t			startoff;
-	struct xfs_bmbt_rec_host	*leftp;
+	struct xfs_bmbt_rec_host	*contp;
 	struct xfs_bmbt_irec		got;
-	struct xfs_bmbt_irec		left;
+	struct xfs_bmbt_irec		cont;
 	int				error;
 	int				i;
+	int				total_extents;
 
 	ifp = XFS_IFORK_PTR(ip, whichfork);
+	total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
 
 	xfs_bmbt_get_all(gotp, &got);
-	startoff = got.br_startoff - offset_shift_fsb;
 
 	/* delalloc extents should be prevented by caller */
 	XFS_WANT_CORRUPTED_RETURN(!isnullstartblock(got.br_startblock));
 
-	/*
-	 * Check for merge if we've got an extent to the left, otherwise make
-	 * sure there's enough room at the start of the file for the shift.
-	 */
-	if (*current_ext) {
-		/* grab the left extent and check for a large enough hole */
-		leftp = xfs_iext_get_ext(ifp, *current_ext - 1);
-		xfs_bmbt_get_all(leftp, &left);
+	if (SHIFT == SHIFT_LEFT) {
+		startoff = got.br_startoff - offset_shift_fsb;
 
-		if (startoff < left.br_startoff + left.br_blockcount)
+		/*
+		 * Check for merge if we've got an extent to the left,
+		 * otherwise make sure there's enough room at the start
+		 * of the file for the shift.
+		 */
+		if (*current_ext) {
+			/*
+			 * grab the left extent and check for a large
+			 * enough hole.
+			 */
+			contp = xfs_iext_get_ext(ifp, *current_ext - 1);
+			xfs_bmbt_get_all(contp, &cont);
+
+			if (startoff < cont.br_startoff + cont.br_blockcount)
+				return -EINVAL;
+
+			/* check whether to merge the extent or shift it down */
+			if (xfs_bmse_can_merge(&cont, &got, offset_shift_fsb)) {
+				return xfs_bmse_merge(ip, whichfork,
+						      offset_shift_fsb,
+						      *current_ext, gotp, contp,
+						      cur, logflags);
+			}
+		} else if (got.br_startoff < offset_shift_fsb)
 			return -EINVAL;
+	} else {
+		startoff = got.br_startoff + offset_shift_fsb;
+		/*
+		 * If this is not the last extent in the file, make sure there's
+		 * enough room between current extent and next extent for
+		 * accommodating the shift.
+		 */
+		if (*current_ext < (total_extents - 1)) {
+			contp = xfs_iext_get_ext(ifp, *current_ext + 1);
+			xfs_bmbt_get_all(contp, &cont);
+			if (startoff + got.br_blockcount > cont.br_startoff)
+				return -EINVAL;
 
-		/* check whether to merge the extent or shift it down */
-		if (xfs_bmse_can_merge(&left, &got, offset_shift_fsb)) {
-			return xfs_bmse_merge(ip, whichfork, offset_shift_fsb,
-					      *current_ext, gotp, leftp, cur,
-					      logflags);
+			/*
+			 * Unlike a left shift (which involves a hole punch),
+			 * a right shift does not modify extent neighbors
+			 * in any way. We should never find mergeable extents
+			 * in this scenario. Check anyways and warn if we
+			 * encounter two extents that could be one.
+			 */
+			if (xfs_bmse_can_merge(&got, &cont,  offset_shift_fsb))
+				WARN_ON_ONCE(1);
 		}
-	} else if (got.br_startoff < offset_shift_fsb)
-		return -EINVAL;
-
+	}
 	/*
 	 * Increment the extent index for the next iteration, update the start
 	 * offset of the in-core extent and update the btree if applicable.
 	 */
-	(*current_ext)++;
+	if (SHIFT == SHIFT_LEFT)
+		(*current_ext)++;
+	else
+		(*current_ext)--;
 	xfs_bmbt_set_startoff(gotp, startoff);
 	*logflags |= XFS_ILOG_CORE;
 	if (!cur) {
@@ -5581,10 +5617,10 @@ xfs_bmse_shift_one(
 }
 
 /*
- * Shift extent records to the left to cover a hole.
+ * Shift extent records to the left/right to cover/create a hole.
  *
  * The maximum number of extents to be shifted in a single operation is
- * @num_exts. @start_fsb specifies the file offset to start the shift and the
+ * @num_exts. @stop_fsb specifies the file offset at which to stop shift and the
  * file offset where we've left off is returned in @next_fsb. @offset_shift_fsb
  * is the length by which each extent is shifted. If there is no hole to shift
  * the extents into, this will be considered invalid operation and we abort
@@ -5594,12 +5630,13 @@ int
 xfs_bmap_shift_extents(
 	struct xfs_trans	*tp,
 	struct xfs_inode	*ip,
-	xfs_fileoff_t		start_fsb,
+	xfs_fileoff_t		*next_fsb,
 	xfs_fileoff_t		offset_shift_fsb,
 	int			*done,
-	xfs_fileoff_t		*next_fsb,
+	xfs_fileoff_t		stop_fsb,
 	xfs_fsblock_t		*firstblock,
 	struct xfs_bmap_free	*flist,
+	enum SHIFT_DIRECTION	SHIFT,
 	int			num_exts)
 {
 	struct xfs_btree_cur		*cur = NULL;
@@ -5609,10 +5646,11 @@ xfs_bmap_shift_extents(
 	struct xfs_ifork		*ifp;
 	xfs_extnum_t			nexts = 0;
 	xfs_extnum_t			current_ext;
+	xfs_extnum_t			total_extents;
+	xfs_extnum_t			stop_extent;
 	int				error = 0;
 	int				whichfork = XFS_DATA_FORK;
 	int				logflags = 0;
-	int				total_extents;
 
 	if (unlikely(XFS_TEST_ERROR(
 	    (XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
@@ -5628,6 +5666,7 @@ xfs_bmap_shift_extents(
 
 	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
+	ASSERT(SHIFT == SHIFT_LEFT || SHIFT == SHIFT_RIGHT);
 
 	ifp = XFS_IFORK_PTR(ip, whichfork);
 	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
@@ -5645,43 +5684,85 @@ xfs_bmap_shift_extents(
 	}
 
 	/*
+	 * There may be delalloc extents in the data fork before the range we
+	 * are collapsing out, so we cannot use the count of real extents here.
+	 * Instead we have to calculate it from the incore fork.
+	 */
+	total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
+	if (total_extents == 0) {
+		*done = 1;
+		goto del_cursor;
+	}
+
+	/*
+	 * In case of first right shift, we need to initialize next_fsb
+	 */
+	if (*next_fsb == NULLFSBLOCK) {
+		ASSERT(SHIFT == SHIFT_RIGHT);
+		gotp = xfs_iext_get_ext(ifp, total_extents - 1);
+		xfs_bmbt_get_all(gotp, &got);
+		*next_fsb = got.br_startoff;
+		if (stop_fsb > *next_fsb) {
+			*done = 1;
+			goto del_cursor;
+		}
+	}
+
+	/* Lookup the extent index at which we have to stop */
+	if (SHIFT == SHIFT_RIGHT) {
+		gotp = xfs_iext_bno_to_ext(ifp, stop_fsb, &stop_extent);
+		/* Make stop_extent exclusive of shift range */
+		stop_extent--;
+	} else
+		stop_extent = total_extents;
+
+	/*
 	 * Look up the extent index for the fsb where we start shifting. We can
 	 * henceforth iterate with current_ext as extent list changes are locked
 	 * out via ilock.
 	 *
 	 * gotp can be null in 2 cases: 1) if there are no extents or 2)
-	 * start_fsb lies in a hole beyond which there are no extents. Either
+	 * *next_fsb lies in a hole beyond which there are no extents. Either
 	 * way, we are done.
 	 */
-	gotp = xfs_iext_bno_to_ext(ifp, start_fsb, &current_ext);
+	gotp = xfs_iext_bno_to_ext(ifp, *next_fsb, &current_ext);
 	if (!gotp) {
 		*done = 1;
 		goto del_cursor;
 	}
 
-	/*
-	 * There may be delalloc extents in the data fork before the range we
-	 * are collapsing out, so we cannot use the count of real extents here.
-	 * Instead we have to calculate it from the incore fork.
-	 */
-	total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
-	while (nexts++ < num_exts && current_ext < total_extents) {
+	/* some sanity checking before we finally start shifting extents */
+	if ((SHIFT == SHIFT_LEFT && current_ext >= stop_extent) ||
+	     (SHIFT == SHIFT_RIGHT && current_ext <= stop_extent)) {
+		error = -EIO;
+		goto del_cursor;
+	}
+
+	while (nexts++ < num_exts) {
 		error = xfs_bmse_shift_one(ip, whichfork, offset_shift_fsb,
-					&current_ext, gotp, cur, &logflags);
+					   &current_ext, gotp, cur, &logflags,
+					   SHIFT);
 		if (error)
 			goto del_cursor;
+		/*
+		 * In case there was an extent merge after shifting extent,
+		 * extent numbers would change.
+		 * Update total extent count and grab the next record.
+		 */
+		if (SHIFT == SHIFT_LEFT) {
+			total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
+			stop_extent = total_extents;
+		}
 
-		/* update total extent count and grab the next record */
-		total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
-		if (current_ext >= total_extents)
+		if (current_ext == stop_extent) {
+			*done = 1;
+			*next_fsb = NULLFSBLOCK;
 			break;
+		}
 		gotp = xfs_iext_get_ext(ifp, current_ext);
 	}
 
-	/* Check if we are done */
-	if (current_ext == total_extents) {
-		*done = 1;
-	} else if (next_fsb) {
+	if (!*done) {
 		xfs_bmbt_get_all(gotp, &got);
 		*next_fsb = got.br_startoff;
 	}
@@ -5696,3 +5777,192 @@ del_cursor:
 
 	return error;
 }
+
+/*
+ * Splits an extent into two extents at split_fsb so that split_fsb becomes
+ * the first block of the new extent. @current_ext is the target extent
+ * to be split. @split_fsb is the block at which the extent is split.
+ * If split_fsb lies in a hole or at the start of an extent, just return 0.
+ */
+STATIC int
+xfs_bmap_split_extent_at(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	xfs_fileoff_t		split_fsb,
+	xfs_fsblock_t		*firstfsb,
+	struct xfs_bmap_free	*free_list)
+{
+	int				whichfork = XFS_DATA_FORK;
+	struct xfs_btree_cur		*cur = NULL;
+	struct xfs_bmbt_rec_host	*gotp;
+	struct xfs_bmbt_irec		got;
+	struct xfs_bmbt_irec		new; /* split extent */
+	struct xfs_mount		*mp = ip->i_mount;
+	struct xfs_ifork		*ifp;
+	xfs_fsblock_t			gotblkcnt; /* new block count for got */
+	xfs_extnum_t			current_ext;
+	int				error = 0;
+	int				logflags = 0;
+	int				i = 0;
+
+	if (unlikely(XFS_TEST_ERROR(
+	    (XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
+	     XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_BTREE),
+	     mp, XFS_ERRTAG_BMAPIFORMAT, XFS_RANDOM_BMAPIFORMAT))) {
+		XFS_ERROR_REPORT("xfs_bmap_split_extent_at",
+				 XFS_ERRLEVEL_LOW, mp);
+		return -EFSCORRUPTED;
+	}
+
+	if (XFS_FORCED_SHUTDOWN(mp))
+		return -EIO;
+
+	ifp = XFS_IFORK_PTR(ip, whichfork);
+	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+		/* Read in all the extents */
+		error = xfs_iread_extents(tp, ip, whichfork);
+		if (error)
+			return error;
+	}
+
+	gotp = xfs_iext_bno_to_ext(ifp, split_fsb, &current_ext);
+	/*
+	 * gotp can be null in 2 cases: 1) if there are no extents
+	 * or 2) split_fsb lies in a hole beyond which there are
+	 * no extents. Either way, we are done.
+	 */
+	if (!gotp)
+		return 0;
+
+	xfs_bmbt_get_all(gotp, &got);
+
+	/*
+	 * Check split_fsb lies in a hole or the start boundary offset
+	 * of the extent.
+	 */
+	if (got.br_startoff >= split_fsb)
+		return 0;
+
+	gotblkcnt = split_fsb - got.br_startoff;
+	new.br_startoff = split_fsb;
+	new.br_startblock = got.br_startblock + gotblkcnt;
+	new.br_blockcount = got.br_blockcount - gotblkcnt;
+	new.br_state = got.br_state;
+
+	if (ifp->if_flags & XFS_IFBROOT) {
+		cur = xfs_bmbt_init_cursor(mp, tp, ip, whichfork);
+		cur->bc_private.b.firstblock = *firstfsb;
+		cur->bc_private.b.flist = free_list;
+		cur->bc_private.b.flags = 0;
+	}
+
+	if (cur) {
+		error = xfs_bmbt_lookup_eq(cur, got.br_startoff,
+				got.br_startblock,
+				got.br_blockcount,
+				&i);
+		if (error)
+			goto del_cursor;
+		XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
+	}
+
+	xfs_bmbt_set_blockcount(gotp, gotblkcnt);
+	got.br_blockcount = gotblkcnt;
+
+	logflags = XFS_ILOG_CORE;
+	if (cur) {
+		error = xfs_bmbt_update(cur, got.br_startoff,
+				got.br_startblock,
+				got.br_blockcount,
+				got.br_state);
+		if (error)
+			goto del_cursor;
+	} else
+		logflags |= XFS_ILOG_DEXT;
+
+	/* Add new extent */
+	current_ext++;
+	xfs_iext_insert(ip, current_ext, 1, &new, 0);
+	XFS_IFORK_NEXT_SET(ip, whichfork,
+			   XFS_IFORK_NEXTENTS(ip, whichfork) + 1);
+
+	if (cur) {
+		error = xfs_bmbt_lookup_eq(cur, new.br_startoff,
+				new.br_startblock, new.br_blockcount,
+				&i);
+		if (error)
+			goto del_cursor;
+		XFS_WANT_CORRUPTED_GOTO(i == 0, del_cursor);
+		cur->bc_rec.b.br_state = new.br_state;
+
+		error = xfs_btree_insert(cur, &i);
+		if (error)
+			goto del_cursor;
+		XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
+	}
+
+	/*
+	 * Convert to a btree if necessary.
+	 */
+	if (xfs_bmap_needs_btree(ip, whichfork)) {
+		int tmp_logflags; /* partial log flag return val */
+
+		ASSERT(cur == NULL);
+		error = xfs_bmap_extents_to_btree(tp, ip, firstfsb, free_list,
+				&cur, 0, &tmp_logflags, whichfork);
+		logflags |= tmp_logflags;
+	}
+
+del_cursor:
+	if (cur) {
+		cur->bc_private.b.allocated = 0;
+		xfs_btree_del_cursor(cur,
+				error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
+	}
+
+	if (logflags)
+		xfs_trans_log_inode(tp, ip, logflags);
+	return error;
+}
+
+int
+xfs_bmap_split_extent(
+	struct xfs_inode        *ip,
+	xfs_fileoff_t           split_fsb)
+{
+	struct xfs_mount        *mp = ip->i_mount;
+	struct xfs_trans        *tp;
+	struct xfs_bmap_free    free_list;
+	xfs_fsblock_t           firstfsb;
+	int                     committed;
+	int                     error;
+
+	tp = xfs_trans_alloc(mp, XFS_TRANS_DIOSTRAT);
+	error = xfs_trans_reserve(tp, &M_RES(mp)->tr_write,
+			XFS_DIOSTRAT_SPACE_RES(mp, 0), 0);
+	if (error) {
+		xfs_trans_cancel(tp, 0);
+		return error;
+	}
+
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
+
+	xfs_bmap_init(&free_list, &firstfsb);
+
+	error = xfs_bmap_split_extent_at(tp, ip, split_fsb,
+			&firstfsb, &free_list);
+	if (error)
+		goto out;
+
+	error = xfs_bmap_finish(&tp, &free_list, &committed);
+	if (error)
+		goto out;
+
+	return xfs_trans_commit(tp, XFS_TRANS_RELEASE_LOG_RES);
+
+
+out:
+	xfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES | XFS_TRANS_ABORT);
+	return error;
+}
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index b9d8a49..6ed6cd1 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -166,6 +166,11 @@ static inline void xfs_bmap_init(xfs_bmap_free_t *flp, xfs_fsblock_t *fbp)
  */
 #define XFS_BMAP_MAX_SHIFT_EXTENTS	1
 
+enum SHIFT_DIRECTION {
+	SHIFT_LEFT = 0,
+	SHIFT_RIGHT,
+};
+
 #ifdef DEBUG
 void	xfs_bmap_trace_exlist(struct xfs_inode *ip, xfs_extnum_t cnt,
 		int whichfork, unsigned long caller_ip);
@@ -211,8 +216,10 @@ int	xfs_check_nostate_extents(struct xfs_ifork *ifp, xfs_extnum_t idx,
 		xfs_extnum_t num);
 uint	xfs_default_attroffset(struct xfs_inode *ip);
 int	xfs_bmap_shift_extents(struct xfs_trans *tp, struct xfs_inode *ip,
-		xfs_fileoff_t start_fsb, xfs_fileoff_t offset_shift_fsb,
-		int *done, xfs_fileoff_t *next_fsb, xfs_fsblock_t *firstblock,
-		struct xfs_bmap_free *flist, int num_exts);
+		xfs_fileoff_t *next_fsb, xfs_fileoff_t offset_shift_fsb,
+		int *done, xfs_fileoff_t stop_fsb, xfs_fsblock_t *firstblock,
+		struct xfs_bmap_free *flist, enum SHIFT_DIRECTION SHIFT,
+		int num_exts);
+int	xfs_bmap_split_extent(struct xfs_inode *ip, xfs_fileoff_t split_offset);
 
 #endif	/* __XFS_BMAP_H__ */
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 22a5dcb..841744c 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1376,22 +1376,19 @@ out:
 }
 
 /*
- * xfs_collapse_file_space()
- *	This routine frees disk space and shift extent for the given file.
- *	The first thing we do is to free data blocks in the specified range
- *	by calling xfs_free_file_space(). It would also sync dirty data
- *	and invalidate page cache over the region on which collapse range
- *	is working. And Shift extent records to the left to cover a hole.
- * RETURNS:
- *	0 on success
- *	errno on error
- *
+ * @next_fsb will keep track of the extent currently undergoing shift.
+ * @stop_fsb will keep track of the extent at which we have to stop.
+ * If we are shifting left, we will start with block (offset + len) and
+ * shift each extent till last extent.
+ * If we are shifting right, we will start with last extent inside file space
+ * and continue until we reach the block corresponding to offset.
  */
 int
-xfs_collapse_file_space(
-	struct xfs_inode	*ip,
-	xfs_off_t		offset,
-	xfs_off_t		len)
+xfs_shift_file_space(
+	struct xfs_inode        *ip,
+	xfs_off_t               offset,
+	xfs_off_t               len,
+	enum SHIFT_DIRECTION	SHIFT)
 {
 	int			done = 0;
 	struct xfs_mount	*mp = ip->i_mount;
@@ -1400,21 +1397,26 @@ xfs_collapse_file_space(
 	struct xfs_bmap_free	free_list;
 	xfs_fsblock_t		first_block;
 	int			committed;
-	xfs_fileoff_t		start_fsb;
+	xfs_fileoff_t		stop_fsb;
 	xfs_fileoff_t		next_fsb;
 	xfs_fileoff_t		shift_fsb;
 
-	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
+	ASSERT(SHIFT == SHIFT_LEFT || SHIFT == SHIFT_RIGHT);
 
-	trace_xfs_collapse_file_space(ip);
+	if (SHIFT == SHIFT_LEFT) {
+		next_fsb = XFS_B_TO_FSB(mp, offset + len);
+		stop_fsb = XFS_B_TO_FSB(mp, VFS_I(ip)->i_size);
+	} else {
+		/*
+		 * If right shift, delegate the work of initialization of
+		 * next_fsb to xfs_bmap_shift_extents as it has ilock held.
+		 */
+		next_fsb = NULLFSBLOCK;
+		stop_fsb = XFS_B_TO_FSB(mp, offset);
+	}
 
-	next_fsb = XFS_B_TO_FSB(mp, offset + len);
 	shift_fsb = XFS_B_TO_FSB(mp, len);
 
-	error = xfs_free_file_space(ip, offset, len);
-	if (error)
-		return error;
-
 	/*
 	 * Trim eofblocks to avoid shifting uninitialized post-eof preallocation
 	 * into the accessible region of the file.
@@ -1427,20 +1429,23 @@ xfs_collapse_file_space(
 
 	/*
 	 * Writeback and invalidate cache for the remainder of the file as we're
-	 * about to shift down every extent from the collapse range to EOF. The
-	 * free of the collapse range above might have already done some of
-	 * this, but we shouldn't rely on it to do anything outside of the range
-	 * that was freed.
+	 * about to shift down every extent from offset to EOF.
 	 */
 	error = filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
-					     offset + len, -1);
+					     offset, -1);
 	if (error)
 		return error;
 	error = invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
-					(offset + len) >> PAGE_CACHE_SHIFT, -1);
+					offset >> PAGE_CACHE_SHIFT, -1);
 	if (error)
 		return error;
 
+	if (SHIFT == SHIFT_RIGHT) {
+		error = xfs_bmap_split_extent(ip, stop_fsb);
+		if (error)
+			return error;
+	}
+
 	while (!error && !done) {
 		tp = xfs_trans_alloc(mp, XFS_TRANS_DIOSTRAT);
 		/*
@@ -1464,7 +1469,7 @@ xfs_collapse_file_space(
 		if (error)
 			goto out;
 
-		xfs_trans_ijoin(tp, ip, 0);
+		xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
 		xfs_bmap_init(&free_list, &first_block);
 
@@ -1472,10 +1477,9 @@ xfs_collapse_file_space(
 		 * We are using the write transaction in which max 2 bmbt
 		 * updates are allowed
 		 */
-		start_fsb = next_fsb;
-		error = xfs_bmap_shift_extents(tp, ip, start_fsb, shift_fsb,
-				&done, &next_fsb, &first_block, &free_list,
-				XFS_BMAP_MAX_SHIFT_EXTENTS);
+		error = xfs_bmap_shift_extents(tp, ip, &next_fsb, shift_fsb,
+				&done, stop_fsb, &first_block, &free_list,
+				SHIFT, XFS_BMAP_MAX_SHIFT_EXTENTS);
 		if (error)
 			goto out;
 
@@ -1484,18 +1488,70 @@ xfs_collapse_file_space(
 			goto out;
 
 		error = xfs_trans_commit(tp, XFS_TRANS_RELEASE_LOG_RES);
-		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	}
 
 	return error;
 
 out:
 	xfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES | XFS_TRANS_ABORT);
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }
 
 /*
+ * xfs_collapse_file_space()
+ *	This routine frees disk space and shift extent for the given file.
+ *	The first thing we do is to free data blocks in the specified range
+ *	by calling xfs_free_file_space(). It would also sync dirty data
+ *	and invalidate page cache over the region on which collapse range
+ *	is working. And Shift extent records to the left to cover a hole.
+ * RETURNS:
+ *	0 on success
+ *	errno on error
+ *
+ */
+int
+xfs_collapse_file_space(
+	struct xfs_inode	*ip,
+	xfs_off_t		offset,
+	xfs_off_t		len)
+{
+	int error;
+
+	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
+	trace_xfs_collapse_file_space(ip);
+
+	error = xfs_free_file_space(ip, offset, len);
+	if (error)
+		return error;
+
+	return xfs_shift_file_space(ip, offset, len, SHIFT_LEFT);
+}
+
+/*
+ * xfs_insert_file_space()
+ *	This routine creates a hole by shifting extents for the given file.
+ *	The first thing we do is to sync dirty data and invalidate page cache
+ *	over the region on which insert range is working. Then we split the
+ *	extent at the given offset in two by calling xfs_bmap_split_extent(),
+ *	and shift all extent records lying between [offset, last allocated
+ *	extent] to the right to make room for the hole.
+ * RETURNS:
+ *	0 on success
+ *	errno on error
+ */
+int
+xfs_insert_file_space(
+	struct xfs_inode	*ip,
+	loff_t			offset,
+	loff_t			len)
+{
+	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
+	trace_xfs_insert_file_space(ip);
+
+	return xfs_shift_file_space(ip, offset, len, SHIFT_RIGHT);
+}
+
+/*
  * We need to check that the format of the data fork in the temporary inode is
  * valid for the target inode before doing the swap. This is not a problem with
  * attr1 because of the fixed fork offset, but attr2 has a dynamically sized
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 736429a..af97d9a 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -63,6 +63,8 @@ int	xfs_zero_file_space(struct xfs_inode *ip, xfs_off_t offset,
 			    xfs_off_t len);
 int	xfs_collapse_file_space(struct xfs_inode *, xfs_off_t offset,
 				xfs_off_t len);
+int	xfs_insert_file_space(struct xfs_inode *, xfs_off_t offset,
+				xfs_off_t len);
 
 /* EOF block manipulation functions */
 bool	xfs_can_free_eofblocks(struct xfs_inode *ip, bool force);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 1cdba95..222a91a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -823,11 +823,13 @@ xfs_file_fallocate(
 	long			error;
 	enum xfs_prealloc_flags	flags = 0;
 	loff_t			new_size = 0;
+	int			do_file_insert = 0;
 
 	if (!S_ISREG(inode->i_mode))
 		return -EINVAL;
 	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
-		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE))
+		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |
+		     FALLOC_FL_INSERT_RANGE))
 		return -EOPNOTSUPP;
 
 	xfs_ilock(ip, XFS_IOLOCK_EXCL);
@@ -857,6 +859,28 @@ xfs_file_fallocate(
 		error = xfs_collapse_file_space(ip, offset, len);
 		if (error)
 			goto out_unlock;
+	} else if (mode & FALLOC_FL_INSERT_RANGE) {
+		unsigned blksize_mask = (1 << inode->i_blkbits) - 1;
+
+		if (offset & blksize_mask || len & blksize_mask) {
+			error = -EINVAL;
+			goto out_unlock;
+		}
+
+		/* Check for wrap through zero */
+		if (inode->i_size + len > inode->i_sb->s_maxbytes) {
+			error = -EFBIG;
+			goto out_unlock;
+		}
+
+		/* Offset should be less than i_size */
+		if (offset >= i_size_read(inode)) {
+			error = -EINVAL;
+			goto out_unlock;
+		}
+
+		new_size = i_size_read(inode) + len;
+		do_file_insert = 1;
 	} else {
 		flags |= XFS_PREALLOC_SET;
 
@@ -891,8 +915,20 @@ xfs_file_fallocate(
 		iattr.ia_valid = ATTR_SIZE;
 		iattr.ia_size = new_size;
 		error = xfs_setattr_size(ip, &iattr);
+		if (error)
+			goto out_unlock;
 	}
 
+	/*
+	 * Some operations are performed after the inode size is updated. For
+	 * example, insert range expands the address space of the file, shifts
+	 * all subsequent extents to create a hole inside the file. Updating
+	 * the size first ensures that shifted extents aren't left hanging
+	 * past EOF in the event of a crash or failure.
+	 */
+	if (do_file_insert)
+		error = xfs_insert_file_space(ip, offset, len);
+
 out_unlock:
 	xfs_iunlock(ip, XFS_IOLOCK_EXCL);
 	return error;
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 51372e3..7e45fa1 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -664,6 +664,7 @@ DEFINE_INODE_EVENT(xfs_alloc_file_space);
 DEFINE_INODE_EVENT(xfs_free_file_space);
 DEFINE_INODE_EVENT(xfs_zero_file_space);
 DEFINE_INODE_EVENT(xfs_collapse_file_space);
+DEFINE_INODE_EVENT(xfs_insert_file_space);
 DEFINE_INODE_EVENT(xfs_readdir);
 #ifdef CONFIG_XFS_POSIX_ACL
 DEFINE_INODE_EVENT(xfs_get_acl);
-- 
1.7.9.5

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related

* [PATCH RESEND 3/12] ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david, tytso
  Cc: linux-fsdevel, linux-kernel, linux-ext4, xfs, a.sangwan, bfoster,
	mtk.manpages, linux-man, linux-api, Namjae Jeon, Namjae Jeon
In-Reply-To: <1424101680-3301-1-git-send-email-linkinjeon@gmail.com>

From: Namjae Jeon <namjae.jeon@samsung.com>

This patch implements fallocate's FALLOC_FL_INSERT_RANGE for Ext4.

1) Make sure that both offset and len are block size aligned.
2) Increase the i_size of the inode by len bytes.
3) Compute the file's logical block number for offset. If the computed
   block number is not the starting block of an extent, split the extent
   so that the block number becomes the starting block of an extent.
4) Shift all the extents lying between [offset, last allocated extent]
   to the right by len bytes. This step creates a hole of len bytes
   at offset.
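
Steps 3 and 4 can be pictured with a small userspace model. This is only a
sketch for illustration, not the code in this patch: struct toy_extent and
toy_insert_range() are hypothetical names, extents are kept in a plain sorted
array instead of the on-disk extent tree, and all units are filesystem blocks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent: blocks [start, start + len). */
struct toy_extent {
	uint32_t start;	/* first logical block */
	uint32_t len;	/* number of blocks */
};

/*
 * Insert a hole of @shift blocks at logical block @at.
 * @ext is sorted by start and non-overlapping, @n entries are in use and
 * @cap is the array capacity. Returns the new entry count, or -1 if a
 * split is needed but the array is full.
 */
int toy_insert_range(struct toy_extent *ext, int n, int cap,
		     uint32_t at, uint32_t shift)
{
	int i, first;

	assert(n >= 0 && n <= cap);

	/* Step 3: split the extent that straddles @at, if there is one. */
	for (i = 0; i < n; i++) {
		uint32_t end = ext[i].start + ext[i].len;

		if (at > ext[i].start && at < end) {
			int j;

			if (n == cap)
				return -1;
			/* Open a slot for the right half after entry i. */
			for (j = n; j > i + 1; j--)
				ext[j] = ext[j - 1];
			ext[i + 1].start = at;
			ext[i + 1].len = end - at;
			ext[i].len = at - ext[i].start;
			n++;
			break;
		}
	}

	/* Find the first extent that starts at or after @at. */
	for (first = 0; first < n && ext[first].start < at; first++)
		;

	/* Step 4: shift right, last extent first, so ranges never overlap. */
	for (i = n; i > first; i--)
		ext[i - 1].start += shift;

	return n;
}
```

For example, with 4 KiB blocks a single extent covering blocks 0..4 and an
insert of 2 blocks at block 1 ends up as {start 0, len 1} and {start 3, len 4},
with the hole occupying blocks 1..2.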

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
---
 fs/ext4/ext4.h              |    6 +
 fs/ext4/extents.c           |  302 +++++++++++++++++++++++++++++++++++--------
 include/trace/events/ext4.h |   25 ++++
 3 files changed, 282 insertions(+), 51 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 98ee89c..6db57e6 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -90,6 +90,11 @@ typedef __u32 ext4_lblk_t;
 /* data type for block group number */
 typedef unsigned int ext4_group_t;
 
+enum SHIFT_DIRECTION {
+	SHIFT_LEFT = 0,
+	SHIFT_RIGHT,
+};
+
 /*
  * Flags used in mballoc's allocation_context flags field.
  *
@@ -2766,6 +2771,7 @@ extern int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 			__u64 start, __u64 len);
 extern int ext4_ext_precache(struct inode *inode);
 extern int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len);
+extern int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len);
 extern int ext4_swap_extents(handle_t *handle, struct inode *inode1,
 				struct inode *inode2, ext4_lblk_t lblk1,
 			     ext4_lblk_t lblk2,  ext4_lblk_t count,
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index bed4308..a07b109 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4924,7 +4924,8 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 
 	/* Return error if mode is not supported */
 	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
-		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE))
+		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |
+		     FALLOC_FL_INSERT_RANGE))
 		return -EOPNOTSUPP;
 
 	if (mode & FALLOC_FL_PUNCH_HOLE)
@@ -4944,6 +4945,9 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	if (mode & FALLOC_FL_COLLAPSE_RANGE)
 		return ext4_collapse_range(inode, offset, len);
 
+	if (mode & FALLOC_FL_INSERT_RANGE)
+		return ext4_insert_range(inode, offset, len);
+
 	if (mode & FALLOC_FL_ZERO_RANGE)
 		return ext4_zero_range(file, offset, len, mode);
 
@@ -5230,13 +5234,13 @@ ext4_access_path(handle_t *handle, struct inode *inode,
 /*
  * ext4_ext_shift_path_extents:
  * Shift the extents of a path structure lying between path[depth].p_ext
- * and EXT_LAST_EXTENT(path[depth].p_hdr) downwards, by subtracting shift
- * from starting block for each extent.
+ * and EXT_LAST_EXTENT(path[depth].p_hdr), by @shift blocks. @SHIFT tells
+ * if it is right shift or left shift operation.
  */
 static int
 ext4_ext_shift_path_extents(struct ext4_ext_path *path, ext4_lblk_t shift,
 			    struct inode *inode, handle_t *handle,
-			    ext4_lblk_t *start)
+			    enum SHIFT_DIRECTION SHIFT)
 {
 	int depth, err = 0;
 	struct ext4_extent *ex_start, *ex_last;
@@ -5258,19 +5262,25 @@ ext4_ext_shift_path_extents(struct ext4_ext_path *path, ext4_lblk_t shift,
 			if (ex_start == EXT_FIRST_EXTENT(path[depth].p_hdr))
 				update = 1;
 
-			*start = le32_to_cpu(ex_last->ee_block) +
-				ext4_ext_get_actual_len(ex_last);
-
 			while (ex_start <= ex_last) {
-				le32_add_cpu(&ex_start->ee_block, -shift);
-				/* Try to merge to the left. */
-				if ((ex_start >
-				     EXT_FIRST_EXTENT(path[depth].p_hdr)) &&
-				    ext4_ext_try_to_merge_right(inode,
-							path, ex_start - 1))
+				if (SHIFT == SHIFT_LEFT) {
+					le32_add_cpu(&ex_start->ee_block,
+						-shift);
+					/* Try to merge to the left. */
+					if ((ex_start >
+					    EXT_FIRST_EXTENT(path[depth].p_hdr))
+					    &&
+					    ext4_ext_try_to_merge_right(inode,
+					    path, ex_start - 1))
+						ex_last--;
+					else
+						ex_start++;
+				} else {
+					le32_add_cpu(&ex_last->ee_block, shift);
+					ext4_ext_try_to_merge_right(inode, path,
+						ex_last);
 					ex_last--;
-				else
-					ex_start++;
+				}
 			}
 			err = ext4_ext_dirty(handle, inode, path + depth);
 			if (err)
@@ -5285,7 +5295,10 @@ ext4_ext_shift_path_extents(struct ext4_ext_path *path, ext4_lblk_t shift,
 		if (err)
 			goto out;
 
-		le32_add_cpu(&path[depth].p_idx->ei_block, -shift);
+		if (SHIFT == SHIFT_LEFT)
+			le32_add_cpu(&path[depth].p_idx->ei_block, -shift);
+		else
+			le32_add_cpu(&path[depth].p_idx->ei_block, shift);
 		err = ext4_ext_dirty(handle, inode, path + depth);
 		if (err)
 			goto out;
@@ -5303,19 +5316,20 @@ out:
 
 /*
  * ext4_ext_shift_extents:
- * All the extents which lies in the range from start to the last allocated
- * block for the file are shifted downwards by shift blocks.
+ * All the extents which lie in the range from @start to the last allocated
+ * block for the @inode are shifted either towards left or right (depending
+ * upon @SHIFT) by @shift blocks.
  * On success, 0 is returned, error otherwise.
  */
 static int
 ext4_ext_shift_extents(struct inode *inode, handle_t *handle,
-		       ext4_lblk_t start, ext4_lblk_t shift)
+		       ext4_lblk_t start, ext4_lblk_t shift,
+		       enum SHIFT_DIRECTION SHIFT)
 {
 	struct ext4_ext_path *path;
 	int ret = 0, depth;
 	struct ext4_extent *extent;
-	ext4_lblk_t stop_block;
-	ext4_lblk_t ex_start, ex_end;
+	ext4_lblk_t stop, *iterator, ex_start, ex_end;
 
 	/* Let path point to the last extent */
 	path = ext4_find_extent(inode, EXT_MAX_BLOCKS - 1, NULL, 0);
@@ -5327,58 +5341,84 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle,
 	if (!extent)
 		goto out;
 
-	stop_block = le32_to_cpu(extent->ee_block) +
+	stop = le32_to_cpu(extent->ee_block) +
 			ext4_ext_get_actual_len(extent);
 
-	/* Nothing to shift, if hole is at the end of file */
-	if (start >= stop_block)
-		goto out;
+	/*
+	 * In case of left shift, don't start shifting extents until we make
+	 * sure the hole is big enough to accommodate the shift.
+	 */
+	if (SHIFT == SHIFT_LEFT) {
+		path = ext4_find_extent(inode, start - 1, &path, 0);
+		if (IS_ERR(path))
+			return PTR_ERR(path);
+		depth = path->p_depth;
+		extent =  path[depth].p_ext;
+		if (extent) {
+			ex_start = le32_to_cpu(extent->ee_block);
+			ex_end = le32_to_cpu(extent->ee_block) +
+				ext4_ext_get_actual_len(extent);
+		} else {
+			ex_start = 0;
+			ex_end = 0;
+		}
 
-	/*
-	 * Don't start shifting extents until we make sure the hole is big
-	 * enough to accomodate the shift.
-	 */
-	path = ext4_find_extent(inode, start - 1, &path, 0);
-	if (IS_ERR(path))
-		return PTR_ERR(path);
-	depth = path->p_depth;
-	extent =  path[depth].p_ext;
-	if (extent) {
-		ex_start = le32_to_cpu(extent->ee_block);
-		ex_end = le32_to_cpu(extent->ee_block) +
-			ext4_ext_get_actual_len(extent);
-	} else {
-		ex_start = 0;
-		ex_end = 0;
+		if ((start == ex_start && shift > ex_start) ||
+		    (shift > start - ex_end)) {
+			ext4_ext_drop_refs(path);
+			kfree(path);
+			return -EINVAL;
+		}
 	}
 
-	if ((start == ex_start && shift > ex_start) ||
-	    (shift > start - ex_end))
-		return -EINVAL;
+	/*
+	 * In case of left shift, iterator points to start and it is increased
+	 * till we reach stop. In case of right shift, iterator points to stop
+	 * and it is decreased till we reach start.
+	 */
+	if (SHIFT == SHIFT_LEFT)
+		iterator = &start;
+	else
+		iterator = &stop;
 
 	/* Its safe to start updating extents */
-	while (start < stop_block) {
-		path = ext4_find_extent(inode, start, &path, 0);
+	while (start < stop) {
+		path = ext4_find_extent(inode, *iterator, &path, 0);
 		if (IS_ERR(path))
 			return PTR_ERR(path);
 		depth = path->p_depth;
 		extent = path[depth].p_ext;
 		if (!extent) {
 			EXT4_ERROR_INODE(inode, "unexpected hole at %lu",
-					 (unsigned long) start);
+					 (unsigned long) *iterator);
 			return -EIO;
 		}
-		if (start > le32_to_cpu(extent->ee_block)) {
+		if (SHIFT == SHIFT_LEFT && *iterator >
+		    le32_to_cpu(extent->ee_block)) {
 			/* Hole, move to the next extent */
 			if (extent < EXT_LAST_EXTENT(path[depth].p_hdr)) {
 				path[depth].p_ext++;
 			} else {
-				start = ext4_ext_next_allocated_block(path);
+				*iterator = ext4_ext_next_allocated_block(path);
 				continue;
 			}
 		}
+
+		if (SHIFT == SHIFT_LEFT) {
+			extent = EXT_LAST_EXTENT(path[depth].p_hdr);
+			*iterator = le32_to_cpu(extent->ee_block) +
+					ext4_ext_get_actual_len(extent);
+		} else {
+			extent = EXT_FIRST_EXTENT(path[depth].p_hdr);
+			*iterator =  le32_to_cpu(extent->ee_block) > 0 ?
+				le32_to_cpu(extent->ee_block) - 1 : 0;
+			/* Update path extent in case we need to stop */
+			while (le32_to_cpu(extent->ee_block) < start)
+				extent++;
+			path[depth].p_ext = extent;
+		}
 		ret = ext4_ext_shift_path_extents(path, shift, inode,
-				handle, &start);
+				handle, SHIFT);
 		if (ret)
 			break;
 	}
@@ -5483,7 +5523,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
 	ext4_discard_preallocations(inode);
 
 	ret = ext4_ext_shift_extents(inode, handle, punch_stop,
-				     punch_stop - punch_start);
+				     punch_stop - punch_start, SHIFT_LEFT);
 	if (ret) {
 		up_write(&EXT4_I(inode)->i_data_sem);
 		goto out_stop;
@@ -5508,6 +5548,166 @@ out_mutex:
 	return ret;
 }
 
+/*
+ * ext4_insert_range:
+ * This function implements the FALLOC_FL_INSERT_RANGE flag of fallocate.
+ * The data blocks starting from @offset to the EOF are shifted by @len
+ * towards right to create a hole in the @inode. Inode size is increased
+ * by len bytes.
+ * Returns 0 on success, error otherwise.
+ */
+int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)
+{
+	struct super_block *sb = inode->i_sb;
+	handle_t *handle;
+	struct ext4_ext_path *path;
+	struct ext4_extent *extent;
+	ext4_lblk_t offset_lblk, len_lblk, ee_start_lblk = 0;
+	unsigned int credits, ee_len;
+	int ret = 0, depth, split_flag = 0;
+	loff_t ioffset;
+
+	/* Insert range works only on fs block size aligned offsets. */
+	if (offset & (EXT4_CLUSTER_SIZE(sb) - 1) ||
+			len & (EXT4_CLUSTER_SIZE(sb) - 1))
+		return -EINVAL;
+
+	if (!S_ISREG(inode->i_mode))
+		return -EOPNOTSUPP;
+
+	trace_ext4_insert_range(inode, offset, len);
+
+	offset_lblk = offset >> EXT4_BLOCK_SIZE_BITS(sb);
+	len_lblk = len >> EXT4_BLOCK_SIZE_BITS(sb);
+
+	/* Call ext4_force_commit to flush all data in case of data=journal */
+	if (ext4_should_journal_data(inode)) {
+		ret = ext4_force_commit(inode->i_sb);
+		if (ret)
+			return ret;
+	}
+
+	/*
+	 * Need to round down to align start offset to page size boundary
+	 * for page size > block size.
+	 */
+	ioffset = round_down(offset, PAGE_SIZE);
+
+	/* Write out all dirty pages */
+	ret = filemap_write_and_wait_range(inode->i_mapping, ioffset,
+			LLONG_MAX);
+	if (ret)
+		return ret;
+
+	/* Take mutex lock */
+	mutex_lock(&inode->i_mutex);
+
+	/* Currently just for extent based files */
+	if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
+		ret = -EOPNOTSUPP;
+		goto out_mutex;
+	}
+
+	/* Check for wrap through zero */
+	if (inode->i_size + len > inode->i_sb->s_maxbytes) {
+		ret = -EFBIG;
+		goto out_mutex;
+	}
+
+	/* Offset should be less than i_size */
+	if (offset >= i_size_read(inode)) {
+		ret = -EINVAL;
+		goto out_mutex;
+	}
+
+	truncate_pagecache(inode, ioffset);
+
+	/* Wait for existing dio to complete */
+	ext4_inode_block_unlocked_dio(inode);
+	inode_dio_wait(inode);
+
+	credits = ext4_writepage_trans_blocks(inode);
+	handle = ext4_journal_start(inode, EXT4_HT_TRUNCATE, credits);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		goto out_dio;
+	}
+
+	/* Expand file to avoid data loss if there is error while shifting */
+	inode->i_size += len;
+	EXT4_I(inode)->i_disksize += len;
+	inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+	ret = ext4_mark_inode_dirty(handle, inode);
+	if (ret)
+		goto out_stop;
+
+	down_write(&EXT4_I(inode)->i_data_sem);
+	ext4_discard_preallocations(inode);
+
+	path = ext4_find_extent(inode, offset_lblk, NULL, 0);
+	if (IS_ERR(path)) {
+		up_write(&EXT4_I(inode)->i_data_sem);
+		goto out_stop;
+	}
+
+	depth = ext_depth(inode);
+	extent = path[depth].p_ext;
+	if (extent) {
+		ee_start_lblk = le32_to_cpu(extent->ee_block);
+		ee_len = ext4_ext_get_actual_len(extent);
+
+		/*
+		 * If offset_lblk is not the starting block of extent, split
+		 * the extent @offset_lblk
+		 */
+		if ((offset_lblk > ee_start_lblk) &&
+				(offset_lblk < (ee_start_lblk + ee_len))) {
+			if (ext4_ext_is_unwritten(extent))
+				split_flag = EXT4_EXT_MARK_UNWRIT1 |
+					EXT4_EXT_MARK_UNWRIT2;
+			ret = ext4_split_extent_at(handle, inode, &path,
+					offset_lblk, split_flag,
+					EXT4_EX_NOCACHE |
+					EXT4_GET_BLOCKS_PRE_IO |
+					EXT4_GET_BLOCKS_METADATA_NOFAIL);
+		}
+
+		ext4_ext_drop_refs(path);
+		kfree(path);
+		if (ret < 0) {
+			up_write(&EXT4_I(inode)->i_data_sem);
+			goto out_stop;
+		}
+	}
+
+	ret = ext4_es_remove_extent(inode, offset_lblk,
+			EXT_MAX_BLOCKS - offset_lblk);
+	if (ret) {
+		up_write(&EXT4_I(inode)->i_data_sem);
+		goto out_stop;
+	}
+
+	/*
+	 * if offset_lblk lies in a hole which is at start of file, use
+	 * ee_start_lblk to shift extents
+	 */
+	ret = ext4_ext_shift_extents(inode, handle,
+		ee_start_lblk > offset_lblk ? ee_start_lblk : offset_lblk,
+		len_lblk, SHIFT_RIGHT);
+
+	up_write(&EXT4_I(inode)->i_data_sem);
+	if (IS_SYNC(inode))
+		ext4_handle_sync(handle);
+
+out_stop:
+	ext4_journal_stop(handle);
+out_dio:
+	ext4_inode_resume_unlocked_dio(inode);
+out_mutex:
+	mutex_unlock(&inode->i_mutex);
+	return ret;
+}
+
 /**
  * ext4_swap_extents - Swap extents between two inodes
  *
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 6e5abd6..2a89d66 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2478,6 +2478,31 @@ TRACE_EVENT(ext4_collapse_range,
 		  __entry->offset, __entry->len)
 );
 
+TRACE_EVENT(ext4_insert_range,
+	TP_PROTO(struct inode *inode, loff_t offset, loff_t len),
+
+	TP_ARGS(inode, offset, len),
+
+	TP_STRUCT__entry(
+		__field(dev_t,	dev)
+		__field(ino_t,	ino)
+		__field(loff_t,	offset)
+		__field(loff_t, len)
+	),
+
+	TP_fast_assign(
+		__entry->dev	= inode->i_sb->s_dev;
+		__entry->ino	= inode->i_ino;
+		__entry->offset	= offset;
+		__entry->len	= len;
+	),
+
+	TP_printk("dev %d,%d ino %lu offset %lld len %lld",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  (unsigned long) __entry->ino,
+		  __entry->offset, __entry->len)
+);
+
 TRACE_EVENT(ext4_es_shrink,
 	TP_PROTO(struct super_block *sb, int nr_shrunk, u64 scan_time,
 		 int nr_skipped, int retried),
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH RESEND 4/12] xfsprog: xfsio: update xfs_io manpage for FALLOC_FL_INSERT_RANGE
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david, tytso
  Cc: linux-fsdevel, linux-kernel, linux-ext4, xfs, a.sangwan, bfoster,
	mtk.manpages, linux-man, linux-api, Namjae Jeon, Namjae Jeon
In-Reply-To: <1424101680-3301-1-git-send-email-linkinjeon@gmail.com>

From: Namjae Jeon <namjae.jeon@samsung.com>

Update xfs_io manpage for FALLOC_FL_INSERT_RANGE.
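
A caller of finsert may want to check its arguments before issuing the
request. The sketch below mirrors the argument checks made by the kernel
patches in this series (block alignment, the size limit, and offset inside
the file). finsert_args_check() is a hypothetical helper, not part of xfs_io,
and the blksize and maxbytes parameters are assumptions the caller has to
supply.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if inserting @len bytes at @offset looks valid for a file of
 * @size bytes, or a negative errno in the style of the kernel checks.
 * @blksize must be a power of two; @maxbytes is the filesystem size limit
 * and is assumed to be at least @size.
 */
int finsert_args_check(int64_t size, int64_t offset, int64_t len,
		       int64_t blksize, int64_t maxbytes)
{
	int64_t mask = blksize - 1;

	assert(blksize > 0 && (blksize & mask) == 0);
	assert(size >= 0 && size <= maxbytes);

	/* Negative offsets and empty ranges make no sense for an insert. */
	if (offset < 0 || len <= 0)
		return -EINVAL;

	/* Both offset and len must be block size aligned. */
	if ((offset & mask) || (len & mask))
		return -EINVAL;

	/* The grown file must fit; written so the sum cannot overflow. */
	if (len > maxbytes - size)
		return -EFBIG;

	/* The insert point must lie inside the file. */
	if (offset >= size)
		return -EINVAL;

	return 0;
}
```

With a 4096-byte block size, "finsert 4k 8k" on a 20k file passes these
checks, while an offset of 1000 or an offset equal to the file size is
rejected with -EINVAL.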

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
---
 man/man8/xfs_io.8 |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8
index cf27b99..416206f 100644
--- a/man/man8/xfs_io.8
+++ b/man/man8/xfs_io.8
@@ -404,6 +404,11 @@ Call fallocate with FALLOC_FL_COLLAPSE_RANGE flag as described in the
 manual page to de-allocates blocks and eliminates the hole created in this process
 by shifting data blocks into the hole.
 .TP
+.BI finsert " offset length"
+Call fallocate with FALLOC_FL_INSERT_RANGE flag as described in the
+.BR fallocate (2)
+manual page to create the hole by shifting data blocks.
+.TP
 .BI fpunch " offset length"
 Punches (de-allocates) blocks in the file by calling fallocate with 
 the FALLOC_FL_PUNCH_HOLE flag as described in the
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH RESEND 5/12] xfstests: generic/042: Standard insert range tests
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david, tytso
  Cc: linux-man, Namjae Jeon, Namjae Jeon, linux-api, bfoster,
	linux-kernel, xfs, mtk.manpages, a.sangwan, linux-fsdevel,
	linux-ext4
In-Reply-To: <1424101680-3301-1-git-send-email-linkinjeon@gmail.com>

From: Namjae Jeon <namjae.jeon@samsung.com>

This testcase (042) tests various corner cases of the finsert range
functionality over different types of extents.
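
When reading the expected output, it helps to convert byte ranges into the
units used by the extent listing. The sketch below assumes the listing is in
512-byte units, as xfs_io's fiemap reports them, and that the "into allocated
space" case inserts 8k at offset 4k into a 20k file; neither detail is shown
in this patch. bytes_to_units() is a hypothetical helper used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive range of 512-byte units covering bytes [off, off + len). */
struct unit_range {
	uint64_t first;
	uint64_t last;
};

struct unit_range bytes_to_units(uint64_t off, uint64_t len)
{
	struct unit_range r;

	/* Assumes a non-empty, 512-byte aligned byte range. */
	assert(len > 0 && off % 512 == 0 && len % 512 == 0);

	r.first = off / 512;
	r.last = (off + len) / 512 - 1;
	return r;
}
```

Under those assumptions the inserted hole covers bytes 4k..12k, which is
[8..23], and the shifted tail covers bytes 12k..28k, which is [24..55].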

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
---
 common/punch          |    5 ++++
 common/rc             |    2 +-
 tests/generic/042     |   65 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/042.out |   78 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/group   |    1 +
 5 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 tests/generic/042
 create mode 100644 tests/generic/042.out

diff --git a/common/punch b/common/punch
index 237b4d8..a75f4cf 100644
--- a/common/punch
+++ b/common/punch
@@ -527,6 +527,11 @@ _test_generic_punch()
 		return
 	fi
 
+	# If zero_cmd is finsert, don't check unaligned offsets
+	if [ "$zero_cmd" == "finsert" ]; then
+		return
+	fi
+
 	echo "	16. data -> cache cold ->hole"
 	if [ "$remove_testfile" ]; then
 		rm -f $testfile
diff --git a/common/rc b/common/rc
index 5377ba0..4388e29 100644
--- a/common/rc
+++ b/common/rc
@@ -1520,7 +1520,7 @@ _require_xfs_io_command()
 	"falloc" )
 		testio=`$XFS_IO_PROG -F -f -c "falloc 0 1m" $testfile 2>&1`
 		;;
-	"fpunch" | "fcollapse" | "zero" | "fzero" )
+	"fpunch" | "fcollapse" | "zero" | "fzero" | "finsert" )
 		testio=`$XFS_IO_PROG -F -f -c "pwrite 0 20k" -c "fsync" \
 			-c "$command 4k 8k" $testfile 2>&1`
 		;;
diff --git a/tests/generic/042 b/tests/generic/042
new file mode 100644
index 0000000..9b83e8d
--- /dev/null
+++ b/tests/generic/042
@@ -0,0 +1,65 @@
+#! /bin/bash
+# FS QA Test No. generic/042
+#
+# Standard insert range tests
+# This testcase is one of the 4 testcases which try to
+# test various corner cases for finsert range functionality over different
+# types of extents. These tests are based on the generic/255 test case.
+# For the types of tests, check the description of _test_generic_punch
+# in common/punch.
+#-----------------------------------------------------------------------
+# Copyright (c) 2015 Samsung Electronics.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+    rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+# we need to include common/punch to get the definition of filter functions
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "finsert"
+
+testfile=$TEST_DIR/$seq.$$
+
+_test_generic_punch falloc fpunch finsert fiemap _filter_hole_fiemap $testfile
+_check_test_fs
+
+status=0
+exit
diff --git a/tests/generic/042.out b/tests/generic/042.out
new file mode 100644
index 0000000..2406d71
--- /dev/null
+++ b/tests/generic/042.out
@@ -0,0 +1,78 @@
+QA output created by 042
+	1. into a hole
+cf845a781c107ec1346e849c9dd1b7e8
+	2. into allocated space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+64e72217eebcbdf31b1b058f9f5f476a
+	3. into unwritten space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+cf845a781c107ec1346e849c9dd1b7e8
+	4. hole -> data
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+	5. hole -> unwritten
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+	6. data -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+be0f35d4292a20040766d87883b0abd1
+	7. data -> unwritten
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+be0f35d4292a20040766d87883b0abd1
+	8. unwritten -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+	9. unwritten -> data
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+	10. hole -> data -> hole
+0: [0..39]: hole
+1: [40..47]: extent
+2: [48..63]: hole
+0487b3c52810f994c541aa166215375f
+	11. data -> hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..39]: extent
+3: [40..47]: hole
+4: [48..63]: extent
+e3a8d52acc4d91a8ed19d7b6f4f26a71
+	12. unwritten -> data -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+0487b3c52810f994c541aa166215375f
+	13. data -> unwritten -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+2b22165f4a24a2c36fd05ef00b41df88
+	14. data -> hole @ EOF
+0: [0..23]: extent
+1: [24..39]: hole
+2: [40..55]: extent
+aa0f20d1edcdbce60d8ef82700ba30c3
+	15. data -> hole @ 0
+0: [0..15]: hole
+1: [16..55]: extent
+86c9d033be2761385c9cfa203c426bb2
diff --git a/tests/generic/group b/tests/generic/group
index fb67b57..0d41c72 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -44,6 +44,7 @@
 039 metadata auto quick
 040 metadata auto quick
 041 metadata auto quick
+042 auto quick prealloc
 053 acl repair auto quick
 062 attr udf auto quick
 068 other auto freeze dangerous stress
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH RESEND 6/12] xfstests: generic/043: Delayed allocation insert range
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david, tytso
  Cc: linux-man, Namjae Jeon, Namjae Jeon, linux-api, bfoster,
	linux-kernel, xfs, mtk.manpages, a.sangwan, linux-fsdevel,
	linux-ext4
In-Reply-To: <1424101680-3301-1-git-send-email-linkinjeon@gmail.com>

From: Namjae Jeon <namjae.jeon@samsung.com>

This testcase (043) tests various corner cases with delayed extents
for the finsert range functionality over different types of extents.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
---
 tests/generic/043     |   65 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/043.out |   78 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/group   |    1 +
 3 files changed, 144 insertions(+)
 create mode 100644 tests/generic/043
 create mode 100644 tests/generic/043.out

diff --git a/tests/generic/043 b/tests/generic/043
new file mode 100644
index 0000000..e70644d
--- /dev/null
+++ b/tests/generic/043
@@ -0,0 +1,65 @@
+#! /bin/bash
+# FS QA Test No. generic/043
+#
+# Delayed allocation insert range tests
+# This testcase is one of the 4 testcases which try to
+# test various corner cases for finsert range functionality over different
+# types of extents. These tests are based on the generic/255 test case.
+# For the types of tests, check the description of _test_generic_punch
+# in common/punch.
+#-----------------------------------------------------------------------
+# Copyright (c) 2015 Samsung Electronics.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+    rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+# we need to include common/punch to get the definition of filter functions
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "finsert"
+
+testfile=$TEST_DIR/$seq.$$
+
+_test_generic_punch -d falloc fpunch finsert fiemap _filter_hole_fiemap $testfile
+_check_test_fs
+
+status=0
+exit
diff --git a/tests/generic/043.out b/tests/generic/043.out
new file mode 100644
index 0000000..817ed09
--- /dev/null
+++ b/tests/generic/043.out
@@ -0,0 +1,78 @@
+QA output created by 043
+	1. into a hole
+cf845a781c107ec1346e849c9dd1b7e8
+	2. into allocated space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+64e72217eebcbdf31b1b058f9f5f476a
+	3. into unwritten space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+cf845a781c107ec1346e849c9dd1b7e8
+	4. hole -> data
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+	5. hole -> unwritten
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+	6. data -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+be0f35d4292a20040766d87883b0abd1
+	7. data -> unwritten
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+be0f35d4292a20040766d87883b0abd1
+	8. unwritten -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+	9. unwritten -> data
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+	10. hole -> data -> hole
+0: [0..39]: hole
+1: [40..47]: extent
+2: [48..63]: hole
+0487b3c52810f994c541aa166215375f
+	11. data -> hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..39]: extent
+3: [40..47]: hole
+4: [48..63]: extent
+e3a8d52acc4d91a8ed19d7b6f4f26a71
+	12. unwritten -> data -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+0487b3c52810f994c541aa166215375f
+	13. data -> unwritten -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+2b22165f4a24a2c36fd05ef00b41df88
+	14. data -> hole @ EOF
+0: [0..23]: extent
+1: [24..39]: hole
+2: [40..55]: extent
+aa0f20d1edcdbce60d8ef82700ba30c3
+	15. data -> hole @ 0
+0: [0..15]: hole
+1: [16..55]: extent
+86c9d033be2761385c9cfa203c426bb2
diff --git a/tests/generic/group b/tests/generic/group
index 0d41c72..c2156a1 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -45,6 +45,7 @@
 040 metadata auto quick
 041 metadata auto quick
 042 auto quick prealloc
+043 auto quick prealloc
 053 acl repair auto quick
 062 attr udf auto quick
 068 other auto freeze dangerous stress
-- 
1.7.9.5

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* [PATCH RESEND 7/12] xfstests: generic/044: Multi insert range tests
From: Namjae Jeon @ 2015-02-16 15:47 UTC (permalink / raw)
  To: david, tytso
  Cc: linux-man, Namjae Jeon, Namjae Jeon, linux-api, bfoster,
	linux-kernel, xfs, mtk.manpages, a.sangwan, linux-fsdevel,
	linux-ext4
In-Reply-To: <1424101680-3301-1-git-send-email-linkinjeon@gmail.com>

From: Namjae Jeon <namjae.jeon@samsung.com>

This test case (044) exercises various corner cases of the finsert range
functionality, with pre-existing holes, over different types of extents.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
---
 tests/generic/044     |   65 ++++++++++++++++++++++++++++++++++++++++
 tests/generic/044.out |   80 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/group   |    1 +
 3 files changed, 146 insertions(+)
 create mode 100644 tests/generic/044
 create mode 100644 tests/generic/044.out
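
For readers unfamiliar with the operation under test, the following is a
rough, copy-based model of what an insert range does to file contents,
assuming the usual semantics: data at and after the offset moves up by the
given length, and the inserted range reads back as zeroes. The helper name
emulate_insert_range is hypothetical and is not part of xfstests or xfs_io.
The real operation shifts extents in place rather than copying, so this
sketch illustrates only the resulting byte layout, not the extent map
checked in the .out file.

```shell
# Hypothetical helper (not part of xfstests): model "insert range" on an
# ordinary file by rewriting it. Assumption: bytes at and after $offset
# shift up by $length and the inserted range reads as zeroes.
emulate_insert_range()
{
    local file=$1 offset=$2 length=$3
    local out

    out=$(mktemp) || return 1

    {
        head -c "$offset" "$file"            # bytes before the insert point
        head -c "$length" /dev/zero          # inserted range, reads as zeroes
        tail -c +"$((offset + 1))" "$file"   # bytes from the insert point on
    } > "$out" && mv "$out" "$file"
}
```

For example, running the helper with offset 4 and length 4 on a file that
contains the 8 bytes AAAABBBB should leave a 12-byte file: AAAA, four zero
bytes, then BBBB.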

diff --git a/tests/generic/044 b/tests/generic/044
new file mode 100644
index 0000000..4d6be1b
--- /dev/null
+++ b/tests/generic/044
@@ -0,0 +1,65 @@
+#! /bin/bash
+# FS QA Test No. generic/044
+#
+# Multi insert range tests
+# This is one of four test cases that exercise various corner cases of
+# finsert range functionality over different types of extents.
+# These tests are based on the generic/255 test case.
+# For the types of tests, see the description of _test_generic_punch
+# in common/punch.
+#-----------------------------------------------------------------------
+# Copyright (c) 2015 Samsung Electronics.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+    rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+# we need to include common/punch to get the definitions of the filter functions
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "finsert"
+
+testfile=$TEST_DIR/$seq.$$
+
+_test_generic_punch -k falloc fpunch finsert fiemap _filter_hole_fiemap $testfile
+_check_test_fs
+
+status=0
+exit
diff --git a/tests/generic/044.out b/tests/generic/044.out
new file mode 100644
index 0000000..4ddfb65
--- /dev/null
+++ b/tests/generic/044.out
@@ -0,0 +1,80 @@
+QA output created by 044
+	1. into a hole
+cf845a781c107ec1346e849c9dd1b7e8
+	2. into allocated space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+64e72217eebcbdf31b1b058f9f5f476a
+	3. into unwritten space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+22b7303d274481990b5401b6263effe0
+	4. hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..55]: extent
+c4fef62ba1de9d91a977cfeec6632f19
+	5. hole -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..55]: extent
+1ca74f7572a0f4ab477fdbb5682e5f61
+	6. data -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..47]: hole
+4: [48..55]: extent
+be0f35d4292a20040766d87883b0abd1
+	7. data -> unwritten
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+bddb1f3895268acce30d516a99cb0f2f
+	8. unwritten -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..39]: hole
+4: [40..55]: extent
+f8fc47adc45b7cf72f988b3ddf5bff64
+	9. unwritten -> data
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+c4fef62ba1de9d91a977cfeec6632f19
+	10. hole -> data -> hole
+0: [0..7]: extent
+1: [8..39]: hole
+2: [40..63]: extent
+52af1bfcbf43f28af2328de32e0567e5
+	11. data -> hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..39]: extent
+3: [40..47]: hole
+4: [48..63]: extent
+e3a8d52acc4d91a8ed19d7b6f4f26a71
+	12. unwritten -> data -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+52af1bfcbf43f28af2328de32e0567e5
+	13. data -> unwritten -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+2b22165f4a24a2c36fd05ef00b41df88
+	14. data -> hole @ EOF
+0: [0..23]: extent
+1: [24..39]: hole
+2: [40..55]: extent
+aa0f20d1edcdbce60d8ef82700ba30c3
+	15. data -> hole @ 0
+0: [0..15]: hole
+1: [16..55]: extent
+86c9d033be2761385c9cfa203c426bb2
diff --git a/tests/generic/group b/tests/generic/group
index c2156a1..70444a3 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -46,6 +46,7 @@
 041 metadata auto quick
 042 auto quick prealloc
 043 auto quick prealloc
+044 auto quick prealloc
 053 acl repair auto quick
 062 attr udf auto quick
 068 other auto freeze dangerous stress
-- 
1.7.9.5
