Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH 00/16] mm: use augmented rbtrees for finding unmapped areas
From: Michel Lespinasse @ 2012-11-05 22:46 UTC (permalink / raw)
  To: linux-arm-kernel

Earlier this year, Rik proposed using augmented rbtrees to optimize
our search for a suitable unmapped area during mmap(). This prompted
my work on improving the augmented rbtree code. Rik doesn't seem to
have time to follow up on his idea at this time, so I'm sending this
series to revive the idea.

These changes are against v3.7-rc4. I have not converted all applicable
architectures yet, but we don't necessarily need to get them all onboard
at once - the series is fully bisectable and additional architectures
can be added later on. I am confident enough in my tests for patches 1-8;
however the second half of the series basically didn't get tested as
I don't have access to all the relevant architectures.

Change log since the previous (RFC) send:
- Added bug fix in validate_mm(), noticed by Sasha Levin and figured
  out by Bob Liu, which sometimes caused NULL pointer dereference when
  running with CONFIG_DEBUG_VM_RB=y
- Fixed generic and x86_64 arch_get_unmapped_area_topdown to avoid
  allocating new areas at addr=0 as suggested by Rik van Riel
- Converted more architectures to use the new vm_unmapped_area()
  search function
- Converted hugetlbfs (generic / i386 / sparc64 / tile) to use the new
  vm_unmapped_area() search function as well.

In this resend, I have kept Rik's Reviewed-by tags from the original
RFC submission for patches that haven't been updated other than applying
his suggestions.

Patch 1 is the validate_mm() fix from Bob Liu (+ fixed-the-fix from me :)

Patch 2 augments the VMA rbtree with a new rb_subtree_gap field,
indicating the length of the largest gap immediately preceding any
VMAs in a subtree.

Patch 3 adds new checks to CONFIG_DEBUG_VM_RB to verify the above
information is correctly maintained.

Patch 4 rearranges the vm_area_struct layout so that rbtree searches only
need data that is contained in the first cacheline (this one is from
Rik's original patch series)

Patch 5 adds a generic vm_unmapped_area() search function, which
allows for searching for an address space of any desired length,
within [low; high[ address constraints, with any desired alignment.
The generic arch_get_unmapped_area[_topdown] functions are also converted
to use this.
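
To make the idea behind patches 2 and 5 concrete, here is a minimal
user-space sketch, not the code from this series. Each node caches the
largest free gap found in its subtree, so a search can skip any subtree
that cannot satisfy the request. The names (struct vma, build(),
find_gap(), NO_GAP) are made up for illustration, the tree is built once
from a sorted array rather than maintained as an rbtree, and the low/high
limits, the alignment and the space above the last area are all left out.

```c
#include <assert.h>
#include <stddef.h>

#define NO_GAP (~0UL)	/* hypothetical "nothing found" marker */

struct vma {
	unsigned long start, end;	/* area covers [start, end) */
	unsigned long gap;		/* free bytes immediately before start */
	unsigned long subtree_gap;	/* largest gap in this subtree */
	struct vma *left, *right;
};

/* Build a balanced tree over v[lo..hi), which must be sorted by start. */
static struct vma *build(struct vma *v, size_t lo, size_t hi)
{
	struct vma *n;
	size_t mid;

	if (lo >= hi)
		return NULL;

	mid = lo + (hi - lo) / 2;
	n = &v[mid];
	n->gap = n->start - (mid ? v[mid - 1].end : 0);
	n->left = build(v, lo, mid);
	n->right = build(v, mid + 1, hi);

	/* The augmented value is the max over the node and both children. */
	n->subtree_gap = n->gap;
	if (n->left && n->left->subtree_gap > n->subtree_gap)
		n->subtree_gap = n->left->subtree_gap;
	if (n->right && n->right->subtree_gap > n->subtree_gap)
		n->subtree_gap = n->right->subtree_gap;
	return n;
}

/* Return the lowest address of a gap of at least len bytes, or NO_GAP. */
static unsigned long find_gap(const struct vma *n, unsigned long len)
{
	while (n && n->subtree_gap >= len) {
		/* Prefer the left subtree: it holds the lower addresses. */
		if (n->left && n->left->subtree_gap >= len) {
			n = n->left;
			continue;
		}
		if (n->gap >= len)
			return n->start - n->gap;
		/* Neither left nor this node fits, so the right side must. */
		n = n->right;
	}
	return NO_GAP;
}
```

With areas [0x0000,0x2000), [0x3000,0x4000) and [0x8000,0x9000), a
request for 0x1000 bytes lands at 0x2000 and a request for 0x2000 bytes
lands at 0x4000, each found in a single walk from the root.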

Patch 6 converts the x86_64 arch_get_unmapped_area[_topdown] functions
to use vm_unmapped_area() as well.

Patch 7 fixes cache coloring on x86_64, as suggested by Rik in his
previous series.

Patches 8 and 9 convert the generic and i386 hugetlbfs code to use
vm_unmapped_area()

Patches 10-16 convert extra architectures to use vm_unmapped_area()

I'm happy that this series removes more code than it adds, as calling
vm_unmapped_area() with the desired arguments is much shorter than
duplicating the brute force algorithm all over the place. There is
still a bit of repetition between various implementations of
arch_get_unmapped_area[_topdown] functions that could probably be
simplified somehow, but I feel we can keep that for a later step...

Michel Lespinasse (15):
  mm: add anon_vma_lock to validate_mm()
  mm: augment vma rbtree with rb_subtree_gap
  mm: check rb_subtree_gap correctness
  mm: vm_unmapped_area() lookup function
  mm: use vm_unmapped_area() on x86_64 architecture
  mm: fix cache coloring on x86_64 architecture
  mm: use vm_unmapped_area() in hugetlbfs
  mm: use vm_unmapped_area() in hugetlbfs on i386 architecture
  mm: use vm_unmapped_area() on mips architecture
  mm: use vm_unmapped_area() on arm architecture
  mm: use vm_unmapped_area() on sh architecture
  mm: use vm_unmapped_area() on sparc64 architecture
  mm: use vm_unmapped_area() in hugetlbfs on sparc64 architecture
  mm: use vm_unmapped_area() on sparc32 architecture
  mm: use vm_unmapped_area() in hugetlbfs on tile architecture

Rik van Riel (1):
  mm: rearrange vm_area_struct for fewer cache misses

 arch/arm/mm/mmap.c               |  119 ++--------
 arch/mips/mm/mmap.c              |   99 ++-------
 arch/sh/mm/mmap.c                |  126 ++---------
 arch/sparc/kernel/sys_sparc_32.c |   24 +--
 arch/sparc/kernel/sys_sparc_64.c |  132 +++---------
 arch/sparc/mm/hugetlbpage.c      |  123 +++--------
 arch/tile/mm/hugetlbpage.c       |  139 ++----------
 arch/x86/include/asm/elf.h       |    6 +-
 arch/x86/kernel/sys_x86_64.c     |  151 +++----------
 arch/x86/mm/hugetlbpage.c        |  130 ++---------
 arch/x86/vdso/vma.c              |    2 +-
 fs/hugetlbfs/inode.c             |   42 +---
 include/linux/mm.h               |   31 +++
 include/linux/mm_types.h         |   19 ++-
 mm/mmap.c                        |  452 +++++++++++++++++++++++++++++---------
 15 files changed, 616 insertions(+), 979 deletions(-)

-- 
1.7.7.3


* [PATCH v2] ARM: plat-versatile: move FPGA irq driver to drivers/irqchip
From: Arnd Bergmann @ 2012-11-05 22:42 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50983422.9090301@gmail.com>

On Monday 05 November 2012, Rob Herring wrote:
> But this should work:
> 
> if (!handle_arch_irq)
>         handle_arch_irq = fpga_handle_irq;
> 
> As long as the primary controller is always initialized first, this will
> work. This is guaranteed by DT of_irq_init, and you will probably have
> other problems if that wasn't the case for non-DT.

How about adding a top-level function in arch/arm that does the assignment
and hides the handle_arch_irq variable:

void set_handle_irq(void (*handle_irq)(struct pt_regs *))
{
	if (WARN_ON(handle_arch_irq))
		return;

	handle_arch_irq = handle_irq;
}
EXPORT_SYMBOL_GPL(set_handle_irq);

Hmm, maybe putting the top-level handler into a loadable module is a bit
far-fetched, but one can hope ;-)

	Arnd


* [PATCHv9 2/8] ARM: OMAP4: PM: add errata support
From: Kevin Hilman @ 2012-11-05 22:36 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1350552010-28760-3-git-send-email-t-kristo@ti.com>

Tero Kristo <t-kristo@ti.com> writes:

> Added similar PM errata flag support as omap3 has. This should be used
> in similar manner, set the flags during init time, and check the flag
> values during runtime.
>
> Signed-off-by: Tero Kristo <t-kristo@ti.com>

These allow basic suspend/resume to work on 4460/Panda-ES, so I'm going
to queue these up as fixes.

However, since they're not technically regressions, it may be too late
to get them in for v3.7, but they'll be in for v3.8 for sure.
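
For anyone following along, the usage pattern the patch description
refers to (set flags once at init, test them at runtime) looks roughly
like the user-space sketch below. The erratum names, the revision
numbers and the helper functions are hypothetical and only illustrate
the pattern; only the flag-word-plus-macro shape comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical erratum IDs, one bit each. */
#define ERRATUM_A	(1u << 0)
#define ERRATUM_B	(1u << 1)

/* Plays the role of pm44xx_errata in the patch. */
static uint16_t errata;

#define IS_ERRATUM(id)	(errata & (id))

/* Init time: decide once which workarounds this (made-up) revision needs. */
static void errata_init(unsigned int rev)
{
	errata = 0;
	if (rev < 2)
		errata |= ERRATUM_A;
	if (rev < 3)
		errata |= ERRATUM_B;
}

/* Runtime: cheap flag tests; returns how many workarounds were applied. */
static int suspend_path(void)
{
	int applied = 0;

	if (IS_ERRATUM(ERRATUM_A))
		applied++;	/* workaround A would go here */
	if (IS_ERRATUM(ERRATUM_B))
		applied++;	/* workaround B would go here */
	return applied;
}
```

The point of the split is that the revision checks run once, while the
hot path only tests bits.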

Kevin


> ---
>  arch/arm/mach-omap2/pm.h     |    7 +++++++
>  arch/arm/mach-omap2/pm44xx.c |    1 +
>  2 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/pm.h b/arch/arm/mach-omap2/pm.h
> index 707e9cb..f26f2d7 100644
> --- a/arch/arm/mach-omap2/pm.h
> +++ b/arch/arm/mach-omap2/pm.h
> @@ -100,6 +100,13 @@ extern void enable_omap3630_toggle_l2_on_restore(void);
>  static inline void enable_omap3630_toggle_l2_on_restore(void) { }
>  #endif		/* defined(CONFIG_PM) && defined(CONFIG_ARCH_OMAP3) */
>  
> +#if defined(CONFIG_ARCH_OMAP4)
> +extern u16 pm44xx_errata;
> +#define IS_PM44XX_ERRATUM(id)		(pm44xx_errata & (id))
> +#else
> +#define IS_PM44XX_ERRATUM(id)		0
> +#endif
> +
>  #ifdef CONFIG_POWER_AVS_OMAP
>  extern int omap_devinit_smartreflex(void);
>  extern void omap_enable_smartreflex_on_init(void);
> diff --git a/arch/arm/mach-omap2/pm44xx.c b/arch/arm/mach-omap2/pm44xx.c
> index ba06300..07e7ef2 100644
> --- a/arch/arm/mach-omap2/pm44xx.c
> +++ b/arch/arm/mach-omap2/pm44xx.c
> @@ -33,6 +33,7 @@ struct power_state {
>  };
>  
>  static LIST_HEAD(pwrst_list);
> +u16 pm44xx_errata;
>  
>  #ifdef CONFIG_SUSPEND
>  static int omap4_pm_suspend(void)


* [PATCH 1/6 v2] arm: use devicetree to get smp_twd clock
From: Russell King - ARM Linux @ 2012-11-05 22:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50983D75.50009@calxeda.com>

On Mon, Nov 05, 2012 at 04:28:05PM -0600, Mark Langsdorf wrote:
> On 11/04/2012 04:08 AM, Russell King - ARM Linux wrote:
> > On Fri, Nov 02, 2012 at 01:51:44PM -0500, Mark Langsdorf wrote:
> >> -static struct clk *twd_get_clock(void)
> >> +static struct clk *twd_get_clock(struct device_node *np)
> >>  {
> >> -	struct clk *clk;
> >> +	struct clk *clk = NULL;
> >>  	int err;
> >>  
> >> -	clk = clk_get_sys("smp_twd", NULL);
> >> +	if (np)
> >> +		clk = of_clk_get(np, 0);
> >> +	if (!clk)
> > 
> > What does a NULL return from of_clk_get() mean?  Where is this defined?
> 
> Well, it's a valid path if (np) is NULL. I'll add an IS_ERR(clk) and
> resubmit.

Hang on - what logic are you trying to achieve here?  Wouldn't:

	if (np)
		clk = of_clk_get(np, 0);
	else
		clk = clk_get_sys("smp_twd", NULL);

be sufficient?  If we have DT, why would we ever want to fall back to
"smp_twd" ?
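
To spell out why the !clk test is the wrong check, here is a small
user-space sketch of the error-pointer convention. It assumes that
of_clk_get() reports failure with an error-encoded pointer rather than
NULL, which is what the IS_ERR() suggestion above implies. The helpers
mirror the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() but are re-implemented
here, and the mock_* functions and struct clk contents are hypothetical
stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO	4095

/* User-space stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct clk { unsigned long rate; };	/* hypothetical stand-in */

static struct clk dt_clk = { 500000000 };
static struct clk sys_clk = { 250000000 };

/* Mock DT lookup: fails with an error pointer, never with NULL. */
static struct clk *mock_of_clk_get(int dt_has_clock)
{
	return dt_has_clock ? &dt_clk : ERR_PTR(-ENOENT);
}

/* Mock non-DT lookup by name. */
static struct clk *mock_clk_get_sys(void)
{
	return &sys_clk;
}

/* The if/else shape from above; the caller checks IS_ERR() on the result. */
static struct clk *get_clock(int have_np, int dt_has_clock)
{
	if (have_np)
		return mock_of_clk_get(dt_has_clock);
	return mock_clk_get_sys();
}
```

A failed lookup here yields a non-NULL pointer, so a NULL test silently
accepts it; only IS_ERR() catches it.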


* scheduler clock for MXS [Was: Re: Wakeup latency measured with SCHED_TRACER depends on HZ]
From: Russell King - ARM Linux @ 2012-11-05 22:28 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <5097E4A9.3090008@meduna.org>

On Mon, Nov 05, 2012 at 05:09:13PM +0100, Stanislav Meduna wrote:
> On 05.11.2012 14:46, Shawn Guo wrote:
> 
> >>> From my quick testing on imx23 with printk timestamp, it's not OK,
> >>> so we may need to leave imx23 out there.
> >>
> > I should say it's practically not OK since it wraps in such a short
> > period.  But it actually works as expected.
> > 
> >> Hmm, does it wrap after 2 seconds?
> > 
> > Yes, it does wrap after ~2 seconds.
> 
> This is weird. AFAIK the printk should be using sched_clock(),
> which is a weak symbol overridden in arch/arm/kernel/sched_clock.c
> and it should take care of the extension to never-ever-wrapping
> 64-bit timestamp. It looks like it does not, and if it does not,
> I think there is more to be worried of than just printk timestamps.

It most certainly does handle the wrapping correctly - it was designed
to from the very start.
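
For reference, the general technique for extending a short, wrapping
counter looks like the user-space sketch below. This is an illustration
of the epoch-plus-masked-delta idea, not the kernel's code: the struct,
the function names and the integer ns_per_cyc conversion are
simplifications invented for the example. It only stays correct if
clock_update() runs at least once per wrap period.

```c
#include <assert.h>
#include <stdint.h>

struct clock_ext {
	uint32_t mask;		/* counter width, e.g. 0xffff for 16 bits */
	uint32_t epoch_cyc;	/* raw counter value at the last update */
	uint64_t epoch_ns;	/* extended time at the last update */
	uint32_t ns_per_cyc;	/* simplified conversion factor */
};

/* Extended time for raw counter value cyc. */
static uint64_t clock_read(const struct clock_ext *c, uint32_t cyc)
{
	/* The masked subtraction gives the right delta across one wrap. */
	uint32_t delta = (cyc - c->epoch_cyc) & c->mask;

	return c->epoch_ns + (uint64_t)delta * c->ns_per_cyc;
}

/* Move the epoch forward; must happen at least once per wrap period. */
static void clock_update(struct clock_ext *c, uint32_t cyc)
{
	c->epoch_ns = clock_read(c, cyc);
	c->epoch_cyc = cyc;
}
```

With a 16-bit counter, an epoch taken at 0xfff0 and a later raw reading
of 0x0010, the masked delta is 0x20 cycles, so the extended time keeps
increasing across the wrap.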

> BTW this patch deserves IMHO looking at
>   https://patchwork.kernel.org/patch/1193631/
> but it is probably not the problem here.

Yes, that patch is probably required... if an update to the sched_clock
epoch happens on a different CPU, then the epoch cycles can be in advance
of the read clock cycle value.  That needs to get into my patch system.


* [PATCH 1/6 v2] arm: use devicetree to get smp_twd clock
From: Mark Langsdorf @ 2012-11-05 22:28 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20121104100822.GY21164@n2100.arm.linux.org.uk>

On 11/04/2012 04:08 AM, Russell King - ARM Linux wrote:
> On Fri, Nov 02, 2012 at 01:51:44PM -0500, Mark Langsdorf wrote:
>> -static struct clk *twd_get_clock(void)
>> +static struct clk *twd_get_clock(struct device_node *np)
>>  {
>> -	struct clk *clk;
>> +	struct clk *clk = NULL;
>>  	int err;
>>  
>> -	clk = clk_get_sys("smp_twd", NULL);
>> +	if (np)
>> +		clk = of_clk_get(np, 0);
>> +	if (!clk)
> 
> What does a NULL return from of_clk_get() mean?  Where is this defined?

Well, it's a valid path if (np) is NULL. I'll add an IS_ERR(clk) and
resubmit.

>> @@ -349,6 +348,10 @@ int __init twd_local_timer_register(struct twd_local_timer *tlt)
>>  	if (!twd_base)
>>  		return -ENOMEM;
>>  
>> +	twd_clk = twd_get_clock(NULL);
>> +
>> +	twd_clk = twd_get_clock(NULL);
>> +
> 
> Why twice?

No good reason. I'll resubmit with it cleaned up. Thanks for the review.

--Mark Langsdorf
Calxeda, Inc.


* [PATCHv9 6/8] ARM: OMAP4: retrigger localtimers after re-enabling gic
From: Kevin Hilman @ 2012-11-05 22:25 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1350552010-28760-7-git-send-email-t-kristo@ti.com>

Tero Kristo <t-kristo@ti.com> writes:

> From: Colin Cross <ccross@android.com>
>
> 'Workaround for ROM bug because of CA9 r2pX gic control'
> register change disables the gic distributor while the secondary

Just to clarify: this is referring to PATCH 3/8 of this series, correct?

Kevin


* [PATCHv9 0/8] ARM: OMAP4: core retention support
From: Kevin Hilman @ 2012-11-05 22:23 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1350552010-28760-1-git-send-email-t-kristo@ti.com>

Hi Tero,

Tero Kristo <t-kristo@ti.com> writes:

> Hi,
>
> Changes compared to previous version:
> - rebased on top of 3.7-rc1
> - applies on top of latest func pwrst code (v6)
> - added back patch #1 to this set (it wasn't queued yet after all)
> - added patch #7 for fixing a bug in the functional pwrst code
> - added patch #8 for fixing a regression with MUSB PHY power handling
>   (not quite sure if this is the correct way to fix this or not)
>
> Tested with omap4460 gp panda + omap4430 emu blaze boards, with cpuidle +
> suspend.
>
> Branch also available here:
> git://gitorious.org/~kristo/omap-pm/omap-pm-work.git
> branch: mainline-3.7-rc1-omap4-ret-v9

I tested this branch on 4430/Panda and 4460/Panda-ES and I'm seeing
several domains not hitting target power state in suspend[1].

Am I missing some other fixes?  Using omap2plus_defconfig, I tried your
branch alone, and merged with v3.7-rc4, and I get the same errors.

Kevin

[1]
# echo enabled > /sys/devices/platform/omap_uart.2/tty/ttyO2/power/wakeup
# echo mem > /sys/power/state 
[  102.271087] PM: Syncing filesystems ... done.
[  102.282196] Freezing user space processes ... (elapsed 0.02 seconds) done.
[  102.312133] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done.
[  102.343353] Suspending console(s) (use no_console_suspend to debug)
[  102.363433] PM: suspend of devices complete after 10.650 msecs
[  102.365631] PM: late suspend of devices complete after 2.166 msecs
[  102.369201] PM: noirq suspend of devices complete after 3.509 msecs
[  102.369232] Disabling non-boot CPUs ...
[  102.373016] CPU1: shutdown
[  104.357421] Powerdomain (core_pwrdm) didn't enter target state 1
[  104.357452] Powerdomain (tesla_pwrdm) didn't enter target state 1
[  104.357452] Powerdomain (ivahd_pwrdm) didn't enter target state 1
[  104.357482] Powerdomain (l3init_pwrdm) didn't enter target state 1
[  104.357482] Could not enter target state in pm_suspend
[  104.357666] Enabling non-boot CPUs ...
[  104.359863] CPU1: Booted secondary processor
[  104.360626] cpu cpu0: opp_init_cpufreq_table: Device OPP not found (-19)
[  104.360656] cpu cpu0: omap_cpu_init: cpu1: failed creating freq table[-19]
[  104.360656] CPU1 is up
[  104.362579] PM: noirq resume of devices complete after 1.892 msecs
[  104.364807] PM: early resume of devices complete after 1.373 msecs
[  104.710937] PM: resume of devices complete after 346.099 msecs
[  104.817901] Restarting tasks ... done.


* [PATCH 0/2] clk: ux500: Add some more clk lookups for u8500
From: Russell King - ARM Linux @ 2012-11-05 22:17 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CACRpkdZjLac9+dRwUm=ka7ZCouAx0bztNjEcjsPdsjZ=Ckz5rQ@mail.gmail.com>

On Mon, Nov 05, 2012 at 12:40:20PM +0100, Linus Walleij wrote:
> On Wed, Oct 31, 2012 at 2:40 PM, Ulf Hansson <ulf.hansson@stericsson.com> wrote:
> 
> > From: Ulf Hansson <ulf.hansson@linaro.org>
> >
> > Some more clock lookups added for rng clocks and for the nomadik ske
> > keypad clocks.
> >
> > Ulf Hansson (2):
> >   clk: ux500: Register rng clock lookups for u8500
> >   clk: ux500: Register nomadik keypad clock lookups for u8500
> 
> Acked-by: Linus Walleij <linus.walleij@linaro.org>
> 
> for these.
> 
> They have the right name and all, apb_pclk is
> "AMBA peripheral bus, peripheral block clock"
> so a clock for the silicon, right.
> 
> ... then how it's supposed to be used, that's another
> issue...

Well, the apb pclk is the APB bus clock which times all transfers on
the APB bus.  Without the APB bus clock running, you can't talk to any
peripherals attached to that bus.

If your SoC controls the APB bus clock to each peripheral individually
(like, I seem to remember your Ux500 stuff does) and the peripheral is
not being clocked, then although the bus master may be seeing a clock,
and will manipulate the bus signals, the target will remain unresponsive
due to lack of clock.

So, the general principle is that the APB PCLK needs to be 'enabled'
whenever the peripheral in question is expecting any kind of access via
the APB bus.

All rather simple really.
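
As a toy model of that principle, consider the user-space sketch below.
Everything in it is hypothetical (struct pclk, struct periph and the
helper names are invented for the example, and a real driver would go
through the clk API instead); it only shows the enable/access/disable
bracket, with an unclocked target modelled as a failed access.

```c
#include <assert.h>
#include <stdint.h>

struct pclk {
	int enable_count;	/* bus clock runs while this is > 0 */
};

struct periph {
	struct pclk *pclk;
	uint32_t reg;
};

static void pclk_enable(struct pclk *c)
{
	c->enable_count++;
}

static void pclk_disable(struct pclk *c)
{
	c->enable_count--;
}

/* Models the bus: an unclocked target does not respond to the write. */
static int periph_write(struct periph *p, uint32_t val)
{
	if (p->pclk->enable_count <= 0)
		return -1;
	p->reg = val;
	return 0;
}

/* Bracket every access with the bus clock enable and disable. */
static int periph_configure(struct periph *p, uint32_t val)
{
	int ret;

	pclk_enable(p->pclk);
	ret = periph_write(p, val);
	pclk_disable(p->pclk);
	return ret;
}
```

A bare periph_write() with the clock off fails and leaves the register
untouched, while periph_configure() succeeds and leaves the clock
disabled again afterwards.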


* [RFC] dmaengine: omap-dma: Allow DMA controller to prefetch data
From: Mark A. Greer @ 2012-11-05 22:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50814B83.7090203@ti.com>

On Fri, Oct 19, 2012 at 02:45:55PM +0200, Péter Ujfalusi wrote:
> Hi,
> 
> On 10/19/2012 01:33 AM, Russell King - ARM Linux wrote:
> > I would suggest getting some feedback from the ASoC people first, before
> > trying to invent new APIs to work around this stuff.  If they can live
> > with having prefetch enabled on OMAP then there isn't an issue here.  If
> > not, we need a solution to this.
> > 
> > I do not believe that precisely stopping and starting playback across a
> > suspend/resume event is really necessary (it's desirable but the world
> > doesn't collapse if you miss a few samples.)  It could be more of an
> > issue for pause/resume though, but as I say, that's for ASoC people to
> > comment on.
> 
> There is another issue with the prefetch in audio:
> we tend to like to know the position of the DMA and also to know how much data
> we have stored in buffers, FIFOs. This information is used by userspace to do
> echo cancellation and also used by PA for example to do runtime mixing
> directly in the audio buffer. We have means to extract this information from
> McBSP for example (and from tlv320dac33 codec) but AFAIK this information can
> not be retrieved from sDMA.
> We could assume that the sDMA FIFO is kept full and report that as a 'delay'
> or do not account this information.
> 
> For now I think the cyclic mode should not set the prefetch. If I recall right
> the cyclic mode is only used by audio at the moment.
> 
> > I'm merely pointing out here that we need their feedback here before
> > deciding if there's anything further that needs to happen.
> 
> Thanks Russell, I'll take a look at the implication of the prefetch for audio.

Ping?


* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
From: Michael Hope @ 2012-11-05 22:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <5097C3B5.4080406@gmail.com>

On 6 November 2012 02:48, Rob Herring <robherring2@gmail.com> wrote:
>
> On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> >> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> >>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> >>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> >>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>>>>>
> >>>>>>>> While v6 can support unaligned accesses, it is optional and current
> >>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>>>>>> v6.
> >>>>>>>
> >>>>>>> not true according to the gcc changes page
> >>>>>>
> >>>>>> What are you going to believe: documentation or what the compiler
> >>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
> >>>>>> only. I guess default here means it is the default unless you change the
> >>>>>> default in your build of gcc.
> >>>>>
> >>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
> >>>>> it seems a clear bug in gcc which might hopefully get fixed.
> >>>>> Thus in this case I think it is reasonable to follow the
> >>>>> gcc documentation, otherwise the code would break for ARMv6
> >>>>> when gcc gets fixed.
> >>>>
> >>>> But the compiler can't assume the state of the U bit. I think it is
> >>>> still legal on v6 to not support unaligned accesses, but on v7 it is
> >>>> required. All the standard v6 ARM cores support it, but I'm not sure
> >>>> about custom cores or if there are SOCs with buses that don't support
> >>>> unaligned accesses properly.
> >>>
> >>> Well, I read the "...since Linux version 2.6.28" comment
> >>> in the gcc changes page in the way that they assume the
> >>> U-bit is set. (Although I'm not sure it really is???)
> >>
> >> Actually, the kernel checks the arch version and the U bit on boot,
> >> and chooses the appropriate setting for the A bit depending on the
> >> result.  (See arch/arm/mm/alignment.c:alignment_init().)
> >
> > That is in the kernel itself, _after_ the decompressor has run.  It is
> > not relevant to any discussion about the decompressor.
> >
> >> Currently, we depend on the CPU reset behaviour or firmware/
> >> bootloader to set the U bit for v6, but the behaviour should be
> >> correct either way, though unaligned accesses will obviously
> >> perform (much) better with U=1.
> >
> > Will someone _PLEASE_ address my initial comments against this patch
> > in light of the fact that it's now been proven _NOT_ to be just a V7
> > issue, rather than everyone seemingly burying their heads in the sand
> > over this.
>
> I tried adding -munaligned-access on a v6 build and still get byte
> accesses rather than unaligned word accesses. So this does seem to be a
> v7 only issue based on what gcc will currently produce. Copying Michael
> Hope who can hopefully provide some insight on why v6 unaligned accesses
> are not enabled.

This looks like a bug.  Unaligned access is enabled for armv6 but
seems to only take effect for cores with Thumb-2.  Here's a test case
both with unaligned field access and unaligned block copy:

struct foo
{
  char a;
  int b;
  struct
  {
    int x[3];
  } c;
} __attribute__((packed));

int get_field(struct foo *p)
{
  return p->b;
}

void copy_block(struct foo *p, struct foo *q)
{
  p->c = q->c;
}

With -march=armv7-a you get the correct:

bar:
	ldr	r0, [r0, #1]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
	bx	lr	@ 21	*arm_return	[length = 12]

baz:
	str	r4, [sp, #-4]!	@ 25	*push_multi	[length = 4]
	mov	r2, r0	@ 2	*arm_movsi_vfp/1	[length = 4]
	ldr	r4, [r1, #5]!	@ unaligned	@ 9	unaligned_loadsi/2	[length = 4]
	ldr	ip, [r1, #4]	@ unaligned	@ 10	unaligned_loadsi/2	[length = 4]
	ldr	r1, [r1, #8]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
	str	r4, [r2, #5]	@ unaligned	@ 12	unaligned_storesi/2	[length = 4]
	str	ip, [r2, #9]	@ unaligned	@ 13	unaligned_storesi/2	[length = 4]
	str	r1, [r2, #13]	@ unaligned	@ 14	unaligned_storesi/2	[length = 4]
	ldmfd	sp!, {r4}
	bx	lr

With -march=armv6 you get a byte-by-byte field access and a correct
unaligned block copy:

bar:
	ldrb	r1, [r0, #2]	@ zero_extendqisi2
	ldrb	r3, [r0, #1]	@ zero_extendqisi2
	ldrb	r2, [r0, #3]	@ zero_extendqisi2
	ldrb	r0, [r0, #4]	@ zero_extendqisi2
	orr	r3, r3, r1, asl #8
	orr	r3, r3, r2, asl #16
	orr	r0, r3, r0, asl #24
	bx	lr

baz:
	str	r4, [sp, #-4]!
	mov	r2, r0
	ldr	r4, [r1, #5]!	@ unaligned
	ldr	ip, [r1, #4]	@ unaligned
	ldr	r1, [r1, #8]	@ unaligned
	str	r4, [r2, #5]	@ unaligned
	str	ip, [r2, #9]	@ unaligned
	str	r1, [r2, #13]	@ unaligned
	ldmfd	sp!, {r4}
	bx	lr

readelf -A shows that the compiler planned to use unaligned access in
both.  My suspicion is that GCC is using the extv pattern to extract
the field from memory, and that pattern is only enabled for Thumb-2
capable cores.
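
For comparison, the two code shapes seen above can be written out by
hand in portable C. This is only an illustrative sketch with made-up
function names: the first helper spells out the byte-by-byte
little-endian assembly that the armv6 field access compiles to, and the
second uses memcpy(), which leaves the compiler free to pick a single
unaligned load where the target permits it. What a given GCC version
actually emits for either one is not something this sketch claims.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise little-endian load: four byte loads combined with shifts. */
static uint32_t load_le32_bytes(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Host-endian load from any address, with no alignment requirement. */
static uint32_t load32_memcpy(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Both are safe at any alignment. They return the same value on a
little-endian host, which is the configuration shown in the listings
above.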

I've logged PR55218.  We'll discuss it at our next meeting.

-- Michael


* [PATCH 11/15] ARM: OMAP: timer: Interchange clksrc and clkevt for AM33XX
From: Santosh Shilimkar @ 2012-11-05 21:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <87pq3s7wro.fsf@deeprootsystems.com>

On Monday 05 November 2012 11:33 PM, Kevin Hilman wrote:
> "Bedia, Vaibhav" <vaibhav.bedia@ti.com> writes:
>
>> On Sat, Nov 03, 2012 at 18:34:30, Kevin Hilman wrote:
>> [...]
>>>>>
>>>>> Doesn't this also mean that you won't get timer wakeups
>>>>> in idle?  Or are you keeping the domain where the clockevent is
>>>>> on during idle?
>>>>>
>>>>
>>>> The lowest idle state that we are targeting will have MPU powered
>>>> off with external memory in self-refresh mode. Peripheral domain
>>>> with the clockevent will be kept on.
>>>
>>> Is this a limitation of the hardware?  or the software?
>>>
>>
>> Well, making the lowest idle state same as the suspend state will
>> require us to involve WKUP_M3 in the idle path and wakeup sources get
>> limited to the IPs in the WKUP domain alone. There's no IO daisy
>> chaining in AM33XX so that's one big difference compared to OMAP.  The
>> other potential problem is that the IPC mechanism that we have uses
>> interrupts.
>
> It can still interrupt the M3, it's only the interrupt back to the MPU
> that is the issue, right?  That being said, there's no reason it
> couldn't use polling in the idle path, right?
>
>> Assuming that the lowest idle state, say Cx, is the same as the
>> suspend state, we'll need to communicate with the WKUP_M3 using
>> interrupts once we decide to enter Cx. I am not sure if we can do
>> something in the cpuidle implementation to work around the "interrupt
>> for idle" problem.
>>
>> We could probably not wait for an ACK when we want to enter Cx,
>
> why not?
>
> Are the response times from the M3 really up to 500ms (guessing based on
> the timeout you used in the suspend path.)  That seems rather unlikely.
>
> Hmm, but as I think about it.  Why does the MPU need to wait for an ACK
> at all?  Why not just send the cmd and WFI?
>
>> but the problem of limited wakeup sources remains. If we let the
>> various drivers block the entry to Cx, since almost all the IPs are in
>> the peripheral domain a system which uses anything other than UART and
>> Timer in WKUP domain will probably never be able enter Cx.
>
> Even so, I think the system needs to be designed to hit the same power
> states in idle and suspend.  Then, the states can be restricted based
> on wakeup capabilities as you described.  This would be easy to do in the
> runtime PM implementation for this device.
>
> IMO, assuming that idle will not be useful from the beginning is leading
> down the path to poor design choices that will be much more difficult to
> fixup down the road in order to add idle support later.  We need to
> design both idle and suspend at the same time.
>
I agree with Kevin. Not supporting CPUIDLE deep states can hit the
active power numbers dearly. I just don't know why the SOCs don't share
the standard and must have design choices. But that's another discussion.

How about leaving the timer choices as is. PER timer for clock source
and wakeuptimer for clock event. Anyway in suspend the clock-source
can be suspended and that is evident from recent discussion. The only
downside is you won't count time in suspend which is anyway the case.

Vaibhav,
Do you guys see any implementation bottleneck for above ?

Regards
Santosh


* [PATCH v3 07/11] ARM: davinci - restructure header files for common clock migration
From: Murali Karicheri @ 2012-11-05 21:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50967641.4090503@ti.com>

On 11/04/2012 09:05 AM, Sekhar Nori wrote:
>
> On 10/25/2012 9:41 PM, Murali Karicheri wrote:
>> pll.h is added to migrate some of the PLL controller defines for sleep.S.
>> psc.h is modified to keep only PSC modules definitions needed by sleep.S
>> after migrating to common clock. The definitions under
>> ifdef CONFIG_COMMON_CLK will be removed in a subsequent patch.
>> davinci_watchdog_reset prototype is moved to time.h as clock.h is
>> being obsoleted. sleep.S and pm.c are modified to include the new header
>> file replacements.
>>
>> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
>> ---
>>   arch/arm/mach-davinci/devices.c           |    2 ++
>>   arch/arm/mach-davinci/include/mach/pll.h  |   46 +++++++++++++++++++++++++++++
>>   arch/arm/mach-davinci/include/mach/psc.h  |    4 +++
>>   arch/arm/mach-davinci/include/mach/time.h |    4 ++-
>>   arch/arm/mach-davinci/pm.c                |    4 +++
>>   arch/arm/mach-davinci/sleep.S             |    4 +++
>>   6 files changed, 63 insertions(+), 1 deletion(-)
>>   create mode 100644 arch/arm/mach-davinci/include/mach/pll.h
> With this patch a _third_ copy of PLL definitions is created in kernel
> sources. The existing PLL definitions in clock.h inside mach-davinci
> should be moved to mach/pll.h and the pll.h you introduced inside
> drivers/clk in 5/11 should be removed (this patch should appear before
> 5/11).
>
> The biggest disadvantage of this approach is inclusion of mach/ includes
> in drivers/clk. But duplicating code is definitely not the fix for this.
> Anyway, mach/ includes are not uncommon in drivers/clk (they are all
> probably suffering from the same issue).
>
> $ grep -rl "include <mach/" drivers/clk/*
> drivers/clk/clk-u300.c
> drivers/clk/mmp/clk-pxa168.c
> drivers/clk/mmp/clk-mmp2.c
> drivers/clk/mmp/clk-pxa910.c
> drivers/clk/mxs/clk-imx23.c
> drivers/clk/mxs/clk-imx28.c
> drivers/clk/spear/spear6xx_clock.c
> drivers/clk/spear/spear3xx_clock.c
> drivers/clk/spear/spear1340_clock.c
> drivers/clk/spear/spear1310_clock.c
> drivers/clk/ux500/clk-prcc.c
> drivers/clk/versatile/clk-integrator.c
> drivers/clk/versatile/clk-realview.c
>
> pll.h can probably be moved to include/linux/clk/ to avoid this. Would
> like to hear from Mike on this before going ahead.
>
> Anyway, instead of just commenting, I thought I would be more useful and
> went ahead and made some of the changes I have been talking about. I
> fixed the multiple PLL definitions issue, the build infrastructure issue
> and the commit ordering too.
>
> I pushed the patches I fixed to devel-common-clk branch of my git tree.
> It is build tested using davinci_all_defconfig but it's not runtime tested.
>
> Can you start from here and provide me incremental changes on top of
> this? That way we can collaborate to finish this faster.
>
> Thanks,
> Sekhar
>
I made a build from your branch and it doesn't boot up DM6446. I will 
debug this tomorrow. But what should I focus on? I thought it is a 
header file re-arrangement?

Murali


* [PATCH v3] ARM: zynq: Allow UART1 to be used as DEBUG_LL console.
From: Josh Cartwright @ 2012-11-05 21:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1352151949-18319-1-git-send-email-nbowler@elliptictech.com>

On Mon, Nov 05, 2012 at 04:45:49PM -0500, Nick Bowler wrote:
> The main UART on the Xilinx ZC702 board is UART1, located at address
> e0001000.  Add a Kconfig option to select this device as the low-level
> debugging port.  This allows the really early boot printouts to reach
> the USB serial adaptor on this board.
>
> For consistency's sake, add a choice entry for UART0 even though it is
> the default if UART1 is not selected.
>
> Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
> Tested-by: Josh Cartwright <josh.cartwright@ni.com>
> ---
> Sorry all for the phenomenal delay in sending this out.  Josh, I kept
> your Tested-By since this version is Obviously Equivalent™ to v2...

I re-tested again just in case and everything is good.

Thanks,
  Josh


* [PATCH 15/15] ARM: OMAP2+: AM33XX: Basic suspend resume support
From: Santosh Shilimkar @ 2012-11-05 21:52 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <87txt4aqyc.fsf@deeprootsystems.com>

On Monday 05 November 2012 11:10 PM, Kevin Hilman wrote:
> +Santosh (to help with EMIF questions/comments)
>
> On 11/02/2012 12:32 PM, Vaibhav Bedia wrote:
>> AM335x supports various low power modes as documented
>> in section 8.1.4.3 of the AM335x TRM which is available
>> @ http://www.ti.com/litv/pdf/spruh73f
>>
>> DeepSleep0 mode offers the lowest power mode with limited
>> wakeup sources without a system reboot and is mapped as
>> the suspend state in the kernel. In this state, MPU and
>> PER domains are turned off with the internal RAM held in
>> retention to facilitate resume process. As part of the boot
>> process, the assembly code is copied over to OCMCRAM using
>> the OMAP SRAM code.
>>
>> AM335x has a Cortex-M3 (WKUP_M3) which assists the MPU
>> in DeepSleep0 entry and exit. WKUP_M3 takes care of the
>> clockdomain and powerdomain transitions based on the
>> intended low power state. MPU needs to load the appropriate
>> WKUP_M3 binary onto the WKUP_M3 memory space before it can
>> leverage any of the PM features like DeepSleep.
>>
>> The IPC mechanism between MPU and WKUP_M3 uses a mailbox
>> sub-module and 8 IPC registers in the Control module. MPU
>> uses the assigned Mailbox for issuing an interrupt to
>> WKUP_M3 which then goes and checks the IPC registers for
>> the payload. WKUP_M3 has the ability to trigger an interrupt
>> to MPU by executing the "sev" instruction.
>>
>> In the current implementation when the suspend process
>> is initiated, MPU interrupts the WKUP_M3 to let it know about the
>> intent of entering DeepSleep0 and waits for an ACK. When
>> the ACK is received, MPU continues with its suspend process
>> to suspend all the drivers and then jumps to assembly in
>> OCMC RAM to put the PLLs in bypass, put the external RAM in
>> self-refresh mode and then finally execute the WFI instruction.
>> The WFI instruction triggers another interrupt to the WKUP_M3
>> which then continues with the power down sequence wherein the
>> clockdomain and powerdomain transition takes place. As part of
>> the sleep sequence, WKUP_M3 unmasks the interrupt lines for
>> the wakeup sources. When WKUP_M3 executes WFI, the hardware
>> disables the main oscillator.
>>
>> When a wakeup event occurs, WKUP_M3 starts the power-up
>> sequence by switching on the power domains and finally
>> enabling the clock to MPU. Since the MPU gets powered down
>> as part of the sleep sequence, in the resume path ROM code
>> starts executing. The ROM code detects a wakeup from sleep
>> and then jumps to the resume location in OCMC which was
>> populated in one of the IPC registers as part of the suspend
>> sequence.
>>
>> The low level code in OCMC relocks the PLLs, enables access
>> to external RAM and then jumps to the cpu_resume code of
>> the kernel to finish the resume process.
>>
>> Signed-off-by: Vaibhav Bedia <vaibhav.bedia@ti.com>
>
> Very well summarized.  Thanks for the thorough changelog.
>
> First, some general comments.  This is a big patch and probably should
> be broken up a bit, maybe into at least:
>
> - EMIF interface
> - SCM interface, new APIs
> - assembly/OCM code
> - pm33xx.[ch]
> - lastly, the late_init stuff that actually initializes
>
> I have a handful of comments below.  I wrote this up on the plane over
> the weekend, and I see that Santosh has already made some similar
> comments, but I'll send mine anyways.
>
[...]

>
>> +extern void __iomem *am33xx_get_emif_base(void);
>> +int wkup_mbox_msg(struct notifier_block *self, unsigned long len, void *msg);
>> +#endif
>> +
>> +#define	IPC_CMD_DS0			0x3
>> +#define IPC_CMD_RESET                   0xe
>> +#define DS_IPC_DEFAULT			0xffffffff
>> +
>> +#define IPC_RESP_SHIFT			16
>> +#define IPC_RESP_MASK			(0xffff << 16)
>> +
>> +#define M3_STATE_UNKNOWN		0
>> +#define M3_STATE_RESET			1
>> +#define M3_STATE_INITED			2
>> +#define M3_STATE_MSG_FOR_LP		3
>> +#define M3_STATE_MSG_FOR_RESET		4
>> +
>> +#define AM33XX_OCMC_END			0x40310000
>> +#define AM33XX_EMIF_BASE		0x4C000000
>> +
>> +/*
>> + * This is a subset of registers defined in drivers/memory/emif.h
>> + * Move that to include/linux/?
>> + */
>
> I'd probably suggest just moving the register definitions you
> need into <plat/emif_plat.h> so they can be shared with the driver.
>
> Also, the EMIF stuff would benefit greatly from using symbolic defines
> for the values being written.  Probably having those in
> <plat/emif_plat.h> would also help out here.
>
> Or, maybe the EMIF driver can provide some self-contained stubs that can
> be copied to OCP RAM for the functionality needed here?
>
> Santosh, what do you think of that?
>
That's what I mentioned in my comment. I also don't know why such a bad
hardware choice was made when we have a perfectly working EMIF IP across
low power states. Even the self-refresh control is managed inside
hardware upon idle.  My guess is DDR self-refresh might be the reason
to use OCMC RAM.

In either case, keeping the EMIF changes separate as part of the EMIF
driver/platform code is the right way to go about it. Maybe
disable_selfrefresh() needs to be kept in the assembly files since it
needs to run from SRAM, but the rest need not be part of the
PM code.

Regards
Santosh


^ permalink raw reply

* [PATCH v2] ARM: plat-versatile: move FPGA irq driver to drivers/irqchip
From: Rob Herring @ 2012-11-05 21:48 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20121102121556.GV21164@n2100.arm.linux.org.uk>

On 11/02/2012 07:15 AM, Russell King - ARM Linux wrote:
> On Thu, Nov 01, 2012 at 11:20:10PM +0100, Thomas Petazzoni wrote:
>> Linus,
>>
>> On Thu,  1 Nov 2012 22:28:49 +0100, Linus Walleij wrote:
>>
>>> +void fpga_handle_irq(struct pt_regs *regs);
>>
>> This function does not need to be exposed in a public header: as
>> proposed for the bcm2835 and armada-370-xp IRQ controller drivers, the
>> driver should directly do handle_arch_irq = fpga_handle_irq, and
>> therefore there is no need for the machine desc structure to reference
>> fpga_handle_irq anymore.
> 
> Err no, then you don't understand what's going on here.  This may or may
> not be a top-level IRQ handler.  Some ARM platforms have three of these
> cascaded, others have one of these cascaded off a VIC or GIC.
> 
> To override the top level IRQ handler unconditionally is going to break
> platforms.

But this should work:

if (!handle_arch_irq)
	handle_arch_irq = fpga_handle_irq;

As long as the primary controller is always initialized first, this will
work. This is guaranteed by DT of_irq_init, and you will probably have
other problems if that wasn't the case for non-DT.
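
The "first controller to initialize wins" idea above can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions: claim_top_level_irq() is a hypothetical helper, the handler bodies are empty stand-ins, and the global pointer merely mimics the kernel's handle_arch_irq.

```c
#include <stddef.h>

struct pt_regs;	/* opaque here; only pointers to it are used */

typedef void (*irq_handler_fn)(struct pt_regs *regs);

/* Stand-in for the kernel's global top-level IRQ handler pointer. */
static irq_handler_fn handle_arch_irq;

/* Empty stand-ins for a primary controller and the FPGA controller. */
static void gic_handle_irq(struct pt_regs *regs)  { (void)regs; }
static void fpga_handle_irq(struct pt_regs *regs) { (void)regs; }

/*
 * Hypothetical helper: install fn as the top-level handler only when
 * no handler has been installed yet. Returns 1 if fn was installed,
 * 0 if an earlier (primary) controller already claimed the slot.
 */
static int claim_top_level_irq(irq_handler_fn fn)
{
	if (handle_arch_irq)
		return 0;
	handle_arch_irq = fn;
	return 1;
}
```

With this shape, a cascaded FPGA controller that initializes after a GIC or VIC leaves the primary handler untouched, while a platform whose only controller is the FPGA one still gets it installed as the top-level handler.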

Rob

^ permalink raw reply

* [PATCH 2/8] ARM: zynq: move ttc timer code to drivers/clocksource
From: Josh Cartwright @ 2012-11-05 21:47 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAHTX3d++0fGSw7GQHcc-S1X1Qh-rfekpr-E8Jkg2_vFqdCFFTg@mail.gmail.com>

On Mon, Nov 05, 2012 at 12:22:55PM +0100, Michal Simek wrote:
> 2012/10/29 Josh Cartwright <josh.cartwright@ni.com>:
> > Suggested cleanup by Arnd Bergmann.  Move the ttc timer.c code to
> > drivers/clocksource, and out of the mach-zynq directory.
> >
> > The common.h (which only held the timer declaration) was renamed to
> > xilinx_ttc.h and moved into include/linux.
> >
> > Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > ---
> >  arch/arm/mach-zynq/Makefile                                    | 2 +-
> >  arch/arm/mach-zynq/common.c                                    | 2 +-
> >  drivers/clocksource/Makefile                                   | 1 +
> >  arch/arm/mach-zynq/timer.c => drivers/clocksource/xilinx_ttc.c | 1 -
> >  arch/arm/mach-zynq/common.h => include/linux/xilinx_ttc.h      | 4 ++--
> >  5 files changed, 5 insertions(+), 5 deletions(-)
> >  rename arch/arm/mach-zynq/timer.c => drivers/clocksource/xilinx_ttc.c (99%)
> >  rename arch/arm/mach-zynq/common.h => include/linux/xilinx_ttc.h (91%)
>
> Not going to apply this patch till there is clean way how to move all
> drivers there.  Especially I don't like to add xilinx_ttc.h to
> include/linux folder.

Okay;  I think it's best to defer the moving of the ttc driver from this
patchset.  It is not a dependency of the clk driver support stuff.
If you agree, I can spin up a v2 of the patchset w/o this change, and
without the serial CONFIG_OF stuff.  Should v2 contain the patches
you've already pulled into testing?

I'll give Rob's irqchip-like suggestion a spin and see how that works
out in parallel.

Thanks,

  Josh

^ permalink raw reply

* [PATCH v2 4/4] ARM: OMAP: gpmc: add DT bindings for GPMC timings and NAND
From: Jon Hunter @ 2012-11-05 21:46 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50942BDC.9060806@gmail.com>


On 11/02/2012 03:23 PM, Daniel Mack wrote:
> On 02.11.2012 20:57, Jon Hunter wrote:
>> On 11/02/2012 02:23 PM, Daniel Mack wrote:
>>> On 02.11.2012 20:18, Jon Hunter wrote:
>>>> On 11/02/2012 06:14 AM, Daniel Mack wrote:
> 
>>>>>>> diff --git a/Documentation/devicetree/bindings/bus/ti-gpmc.txt b/Documentation/devicetree/bindings/bus/ti-gpmc.txt
>>>>>>> new file mode 100644
>>>>>>> index 0000000..6f44487
>>>>>>> --- /dev/null
>>>>>>> +++ b/Documentation/devicetree/bindings/bus/ti-gpmc.txt
>>>>>>> @@ -0,0 +1,73 @@
>>>>>>> +Device tree bindings for OMAP general purpose memory controllers (GPMC)
>>>>>>> +
>>>>>>> +The actual devices are instantiated from the child nodes of a GPMC node.
>>>>>>> +
>>>>>>> +Required properties:
>>>>>>> +
>>>>>>> + - compatible:		Should be set to "ti,gpmc"
>>>>>>> + - reg:			A resource specifier for the register space
>>>>>>> +			(see the example below)
>>>>>>> + - ti,hwmods:		Should be set to "ti,gpmc" until the DT transition is
>>>>>>> +			completed.
>>>>>>> + - #address-cells:	Must be set to 2 to allow memory address translation
>>>>>>> + - #size-cells:		Must be set to 1 to allow CS address passing
>>>>>>> + - ranges:		Must be set up to reflect the memory layout
>>>>>>> +			Note that this property is not currently parsed.
>>>>>>> +			Calculated values derived from the contents of
>>>>>>> +			GPMC_CS_CONFIG7 as set up by the bootloader. That will
>>>>>>> +			change in the future, so be sure to fill the correct
>>>>>>> +			values here.
>>>>>>
>>>>>> I still think it would be good to add number of chip-selects and
>>>>>> wait-pins here.
>>>>>
>>>>> The number of chip-selects can be derived from the ranges property.
>>>>> Namely, each 4-value entry to this property maps to one chip-select. I
>>>>> can try and make that more clear in the documentation.
>>>>
>>>> Yes but that only tells you how many you are using. The binding should
>>>> describe the hardware and so should tell us how many chip-selects we
>>>> have. We should get away from using GPMC_CS_NUM in the code.
>>>
>>> Maybe I don't get your point, but we only need to care for as many cs
>>> lines as we actually use, right?
>>
>> But how many does your device have? How many clients can you support?
> 
> Well, you state that in the ranges property. Even if the chip could in
> theory support 8 CS lines - if the actual setup only uses the first one
> of them, the code would only need to allocate and set up the one that is
> in use. And as the entries in "ranges" are mandatory, there can actually
> be no mis-allocation.

Ah, I see your point now. Well typically, we have been putting the
device-level peripheral info in the device's *.dtsi (ie. am33xx.dtsi)
and then board specific stuff in the board *.dts file (am335x-bone.dts).
So I would envision that the device-level info (reg, ti,hwmods,
interrupt, num-cs) be in am33xx.dtsi and ranges be in am335x.dts. So it
would still be nice to catch any badly configured ranges property in the
driver by querying the number of chip-selects.
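
A minimal sketch of such a check, assuming (as stated earlier in the thread) that each chip-select is described by one 4-cell entry in "ranges" and that the device-level node carries the proposed "num-cs" value. The function name and the macro are hypothetical and not part of the posted patch.

```c
#include <stddef.h>

/* Assumption from the thread: one 4-cell "ranges" entry per chip-select. */
#define GPMC_RANGES_CELLS_PER_CS 4

/*
 * Hypothetical helper: given the number of 32-bit cells found in the
 * "ranges" property and the chip-select count from "num-cs", return
 * the number of chip-selects the board describes, or -1 when the
 * property is malformed or describes more chip-selects than exist.
 */
static int gpmc_count_ranges(size_t ranges_cells, unsigned int num_cs)
{
	size_t used;

	if (ranges_cells % GPMC_RANGES_CELLS_PER_CS)
		return -1;	/* truncated or padded entry */

	used = ranges_cells / GPMC_RANGES_CELLS_PER_CS;
	if (used > num_cs)
		return -1;	/* board asks for more CS than the SoC has */

	return (int)used;
}
```

A driver could run a check of this kind once at probe time and refuse to continue on -1, which would catch a board file whose ranges do not fit the SoC-level description.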

> I can still add the maximum number as a separate property, but I wanted
> to outline my idea here. Is "num-cs" a good name for the property?

Sounds good.

>> If we know how many the device has, then we can get rid of "#define
>> GPMC_CS_NUM". We currently allocate the CS by calling gpmc_cs_request().
>> Hmmm ... I now see that your patch is not calling this before
>> configuring the CS and so that needs to be fixed too.
> 
> It does implicitly, by calling gpmc_nand_init().

Yes, you are right!

>> Without knowing the total CS available, how do we ensure we have the CS
>> available that someone is asking for?
>>
>>>> What about wait-pins?
>>>
>>> Afaik, their use depends on the driver acting as GPMC client, right?
>>> Could you point me to code that acts conditionally and that should be
>>> reflected in DT?
>>
>> Again we need to know how many the device has. Clients may or may not
>> use these. However, if a client wants one they need to request one which
>> is just like a chip-select. This is not in the current driver but Afzal
>> has a patch for this [1].
> 
> Ah, thanks for the pointer to the patch. Ok, I'll add a "num-waitpins"
> property. Does that name sound appropriate?

Yes, that would be great!

>> Bottom line, for such hardware specific features, device tree is a good
>> place to describe how many resources we have. Then we can eliminate such
>> #defines from the driver code.
> 
> Agreed.
> 
>>> Quoting Documentation/devicetree/bindings/mtd/gpmc-nand.txt:
>>>
>>> 	For NAND specific properties such as ECC modes or bus width,
>>> 	please refer to Documentation/devicetree/bindings/mtd/nand.txt
>>
>> Ok, thanks I see that now. Looking at other bindings, some also include
>> these details but not all. Could be worth listing ecc-mode under
>> mandatory and bus-width under optional with a reference to nand.txt
>> binding. I don't think it is worth duplicating but listing the actual
>> property names would be nice.
> 
> Ok, I amended my local version. With the details above sorted out and
> "num-cs" and "num-waitpins" in place, do you think we're ready for v4?

Yes, thanks for doing this.

Cheers
Jon

^ permalink raw reply

* [PATCH v3] ARM: zynq: Allow UART1 to be used as DEBUG_LL console.
From: Nick Bowler @ 2012-11-05 21:45 UTC (permalink / raw)
  To: linux-arm-kernel

The main UART on the Xilinx ZC702 board is UART1, located at address
e0001000.  Add a Kconfig option to select this device as the low-level
debugging port.  This allows the really early boot printouts to reach
the USB serial adaptor on this board.

For consistency's sake, add a choice entry for UART0 even though it is
the default if UART1 is not selected.

Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Tested-by: Josh Cartwright <josh.cartwright@ni.com>
---
Sorry all for the phenomenal delay in sending this out.  Josh, I kept
your Tested-By since this version is Obviously Equivalent(tm) to v2...

v2: rebase on newest patch series, signoff.
v3: squash in style tweaks suggested by Michal Simek.

 arch/arm/Kconfig.debug                     |   17 +++++++++++++++++
 arch/arm/mach-zynq/common.c                |    6 +++---
 arch/arm/mach-zynq/include/mach/zynq_soc.h |   16 +++++++++++-----
 3 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index b0f3857b3a4c..7754d51f2b19 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -132,6 +132,23 @@ choice
 		  their output to UART1 serial port on DaVinci TNETV107X
 		  devices.
 
+	config DEBUG_ZYNQ_UART0
+		bool "Kernel low-level debugging on Xilinx Zynq using UART0"
+		depends on ARCH_ZYNQ
+		help
+		  Say Y here if you want the debug print routines to direct
+		  their output to UART0 on the Zynq platform.
+
+	config DEBUG_ZYNQ_UART1
+		bool "Kernel low-level debugging on Xilinx Zynq using UART1"
+		depends on ARCH_ZYNQ
+		help
+		  Say Y here if you want the debug print routines to direct
+		  their output to UART1 on the Zynq platform.
+
+		  If you have a ZC702 board and want early boot messages to
+		  appear on the USB serial adaptor, select this option.
+
 	config DEBUG_DC21285_PORT
 		bool "Kernel low-level debugging messages via footbridge serial port"
 		depends on FOOTBRIDGE
diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
index ba8d14f78d4d..93b91059faab 100644
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@ -84,9 +84,9 @@ static struct map_desc io_desc[] __initdata = {
 
 #ifdef CONFIG_DEBUG_LL
 	{
-		.virtual	= UART0_VIRT,
-		.pfn		= __phys_to_pfn(UART0_PHYS),
-		.length		= UART0_SIZE,
+		.virtual	= LL_UART_VADDR,
+		.pfn		= __phys_to_pfn(LL_UART_PADDR),
+		.length		= UART_SIZE,
 		.type		= MT_DEVICE,
 	},
 #endif
diff --git a/arch/arm/mach-zynq/include/mach/zynq_soc.h b/arch/arm/mach-zynq/include/mach/zynq_soc.h
index 1b8bf0ecbcb0..5ebbd8e6eeee 100644
--- a/arch/arm/mach-zynq/include/mach/zynq_soc.h
+++ b/arch/arm/mach-zynq/include/mach/zynq_soc.h
@@ -25,8 +25,9 @@
  * address that is known to work.
  */
 #define UART0_PHYS		0xE0000000
-#define UART0_SIZE		SZ_4K
-#define UART0_VIRT		0xF0001000
+#define UART1_PHYS		0xE0001000
+#define UART_SIZE		SZ_4K
+#define UART_VIRT		0xF0001000
 
 #define TTC0_PHYS		0xF8001000
 #define TTC0_SIZE		SZ_4K
@@ -36,12 +37,17 @@
 #define SCU_PERIPH_SIZE		SZ_8K
 #define SCU_PERIPH_VIRT		(TTC0_VIRT - SCU_PERIPH_SIZE)
 
+#if IS_ENABLED(CONFIG_DEBUG_ZYNQ_UART1)
+# define LL_UART_PADDR		UART1_PHYS
+#else
+# define LL_UART_PADDR		UART0_PHYS
+#endif
+
+#define LL_UART_VADDR		UART_VIRT
+
 /* The following are intended for the devices that are mapped early */
 
 #define TTC0_BASE			IOMEM(TTC0_VIRT)
 #define SCU_PERIPH_BASE			IOMEM(SCU_PERIPH_VIRT)
 
-#define LL_UART_PADDR	UART0_PHYS
-#define LL_UART_VADDR	UART0_VIRT
-
 #endif
-- 
1.7.8.6
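
The compile-time address selection in the zynq_soc.h hunk can be tried outside the kernel with a small stand-alone sketch. Two simplifications are assumptions here: a plain #ifdef replaces the kernel's IS_ENABLED() macro, and ll_uart_paddr() is a hypothetical accessor added only so the result can be inspected.

```c
/* Physical addresses as given in the patch. */
#define UART0_PHYS	0xE0000000UL
#define UART1_PHYS	0xE0001000UL

/*
 * Build with -DCONFIG_DEBUG_ZYNQ_UART1 to pick UART1. Without it the
 * selection falls back to UART0, matching the patch's #else branch.
 */
#ifdef CONFIG_DEBUG_ZYNQ_UART1
# define LL_UART_PADDR	UART1_PHYS
#else
# define LL_UART_PADDR	UART0_PHYS
#endif

/* Hypothetical accessor so the chosen address is visible at run time. */
static unsigned long ll_uart_paddr(void)
{
	return LL_UART_PADDR;
}
```

Both choices share a single virtual address in the patch (UART_VIRT), so only the physical side of the early mapping changes with the Kconfig option.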

^ permalink raw reply related

* [PATCH 13/15] ARM: DTS: AM33XX: Add nodes for OCMCRAM and Mailbox
From: Santosh Shilimkar @ 2012-11-05 21:45 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <50982D61.9060204@ti.com>

On Tuesday 06 November 2012 02:49 AM, Santosh Shilimkar wrote:
> On Tuesday 06 November 2012 12:59 AM, Kevin Hilman wrote:
>> "Bedia, Vaibhav" <vaibhav.bedia@ti.com> writes:
>>
>>> On Mon, Nov 05, 2012 at 20:23:11, Shilimkar, Santosh wrote:
>>> [...]
>>>>>
>>>> On OMAP the OCMC RAM is always clocked and doesn't need any special
>>>> clock enable. CM_L3_2_OCMC_RAM_CLKCTRL module mode field is read only.
>>>> Isn't it same on AMXX ?
>>>>
>>>
>>> On AM33xx, OCMC RAM is in PER domain and the corresponding CLKCTRL
>>> module mode fields are r/w. OCMC RAM needs to be disabled as part of
>>> the DeepSleep0 entry to let PER domain transition.
>>
>> After DeepSleep0, the ROM code is being given an address in OCMC RAM to
>> jump to.  If OCMC RAM is disabled as part of suspend, this means that
>> OCMC RAM contents are maintained even though PER domain transitions?
>>
>> If so, that needs to be more clearly documented.
>>
> That's a very good point. How does OCMC RAM retain its contents without
> a clock?
>
Ignore the question. I figured out from the other patch's changelog that
OCMC RAM supports retention. Please have that clearly captured in the
changelog.

Regards
Santosh

^ permalink raw reply

* [PATCH 12/15] ARM: OMAP: timer: Add suspend-resume callbacks for clockevent device
From: Jon Hunter @ 2012-11-05 21:20 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <B5906170F1614E41A8A28DE3B8D121433EBFEA88@DBDE01.ent.ti.com>


On 11/03/2012 08:17 AM, Bedia, Vaibhav wrote:
> On Sat, Nov 03, 2012 at 17:45:03, Kevin Hilman wrote:
>> On 11/02/2012 01:32 PM, Vaibhav Bedia wrote:
>>> From: Vaibhav Hiremath <hvaibhav@ti.com>
>>>
>>> The current OMAP timer code registers two timers -
>>> one as clocksource and one as clockevent.
>>> AM33XX has only one usable timer in the WKUP domain
>>> so one of the timers needs suspend-resume support
>>> to restore the configuration to pre-suspend state.
>>
>> The changelog describes "what", but doesn't answer "why?"
>>
> 
> Sorry, I'll try to take care of this in the future.
> 
>>> commit adc78e6 (timekeeping: Add suspend and resume
>>> of clock event devices) introduced .suspend and .resume
>>> callbacks for clock event devices. Leverages these
>>> callbacks to have AM33XX clockevent timer which is
>>> not in WKUP domain to behave properly across system
>>> suspend.
>>
>> You say it behaves properly without describing what improper
>> behavior is happening.
>>
> 
> There are two issues. One is that the clockevent timer doesn't
> get idled which blocks PER domain transition. 

Why is this? How is the dmtimer TIOCP_CFG register configured on AM33xx?
Is it using smart-idle?

> The next one is that
> the clockevent doesn't generate any further interrupts once the
> system resumes. We need to restore the pre-suspend configuration.
> I haven't tried but I guess we could have used the save and restore
> of timer registers here.

It would be interesting to try using a non-wakeup domain timer on
OMAP3/4 for clock events and seeing if suspend/resume works.

Do you know what the exact problem here is? I understand that the timer
context could get lost, but exactly what is not getting restarted by the
kernel? For example, the only place we set the interrupt enable is
during the clock event init and so if the context is lost, then I could
see no more interrupts occurring. So is it enough to just restore the
interrupt enable register, do you really need to program the timer again?

Cheers
Jon

^ permalink raw reply

* [PATCH 13/15] ARM: DTS: AM33XX: Add nodes for OCMCRAM and Mailbox
From: Santosh Shilimkar @ 2012-11-05 21:19 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <87a9uv97c0.fsf@deeprootsystems.com>

On Tuesday 06 November 2012 12:59 AM, Kevin Hilman wrote:
> "Bedia, Vaibhav" <vaibhav.bedia@ti.com> writes:
>
>> On Mon, Nov 05, 2012 at 20:23:11, Shilimkar, Santosh wrote:
>> [...]
>>>>
>>> On OMAP the OCMC RAM is always clocked and doesn't need any special
>>> clock enable. CM_L3_2_OCMC_RAM_CLKCTRL module mode field is read only.
>>> Isn't it same on AMXX ?
>>>
>>
>> On AM33xx, OCMC RAM is in PER domain and the corresponding CLKCTRL module
>> mode fields are r/w. OCMC RAM needs to be disabled as part of the DeepSleep0
>> entry to let PER domain transition.
>
> After DeepSleep0, the ROM code is being given an address in OCMC RAM to
> jump to.  If OCMC RAM is disabled as part of suspend, this means that
> OCMC RAM contents are maintained even though PER domain transitions?
>
> If so, that needs to be more clearly documented.
>
That's a very good point. How does OCMC RAM retain its contents without
a clock?

Regards
Santosh

^ permalink raw reply

* [PATCH 12/15] ARM: OMAP: timer: Add suspend-resume callbacks for clockevent device
From: Jon Hunter @ 2012-11-05 21:04 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1351859566-24818-13-git-send-email-vaibhav.bedia@ti.com>


On 11/02/2012 07:32 AM, Vaibhav Bedia wrote:
> From: Vaibhav Hiremath <hvaibhav@ti.com>
> 
> The current OMAP timer code registers two timers -
> one as clocksource and one as clockevent.
> AM33XX has only one usable timer in the WKUP domain
> so one of the timers needs suspend-resume support
> to restore the configuration to pre-suspend state.
> 
> commit adc78e6 (timekeeping: Add suspend and resume
> of clock event devices) introduced .suspend and .resume
> callbacks for clock event devices. Leverages these
> callbacks to have AM33XX clockevent timer which is
> not in WKUP domain to behave properly across system
> suspend.
> 
> Signed-off-by: Vaibhav Hiremath <hvaibhav@ti.com>
> Signed-off-by: Vaibhav Bedia <vaibhav.bedia@ti.com>
> ---
>  arch/arm/mach-omap2/timer.c |   31 +++++++++++++++++++++++++++++++
>  1 files changed, 31 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/mach-omap2/timer.c b/arch/arm/mach-omap2/timer.c
> index 6584ee0..e8781fd 100644
> --- a/arch/arm/mach-omap2/timer.c
> +++ b/arch/arm/mach-omap2/timer.c
> @@ -135,6 +135,35 @@ static void omap2_gp_timer_set_mode(enum clock_event_mode mode,
>  	}
>  }
>  
> +static void omap_clkevt_suspend(struct clock_event_device *unused)
> +{
> +	char name[10];
> +	struct omap_hwmod *oh;
> +
> +	sprintf(name, "timer%d", 2);
> +	oh = omap_hwmod_lookup(name);
> +	if (!oh)
> +		return;
> +
> +	omap_hwmod_idle(oh);
> +}
> +
> +static void omap_clkevt_resume(struct clock_event_device *unused)
> +{
> +	char name[10];
> +	struct omap_hwmod *oh;
> +
> +	sprintf(name, "timer%d", 2);
> +	oh = omap_hwmod_lookup(name);
> +	if (!oh)
> +		return;
> +
> +	omap_hwmod_enable(oh);
> +	__omap_dm_timer_load_start(&clkev,
> +			OMAP_TIMER_CTRL_ST | OMAP_TIMER_CTRL_AR, 0, 1);
> +	__omap_dm_timer_int_enable(&clkev, OMAP_TIMER_INT_OVERFLOW);
> +}
> +
>  static struct clock_event_device clockevent_gpt = {
>  	.name		= "gp_timer",
>  	.features       = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT,
> @@ -142,6 +171,8 @@ static struct clock_event_device clockevent_gpt = {
>  	.rating		= 300,
>  	.set_next_event	= omap2_gp_timer_set_next_event,
>  	.set_mode	= omap2_gp_timer_set_mode,
> +	.suspend	= omap_clkevt_suspend,
> +	.resume		= omap_clkevt_resume,

So these suspend/resume callbacks are going to be called for all OMAP2+
and AMxxxx devices? I don't think we want that. AFAIK OMAP timers will
idle on their own when stopped and don't require this.
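
One way to limit the callbacks to the SoCs that need them is to leave them out of the static initializer and assign them at init time. The following is a rough stand-alone sketch, not kernel code: the struct is a cut-down stand-in for struct clock_event_device, and needs_pm_callbacks is a hypothetical flag that a real implementation would derive from the SoC type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's struct clock_event_device. */
struct clock_event_device_sketch {
	const char *name;
	void (*suspend)(struct clock_event_device_sketch *dev);
	void (*resume)(struct clock_event_device_sketch *dev);
};

/* Empty stand-ins for omap_clkevt_suspend()/omap_clkevt_resume(). */
static void clkevt_suspend_sketch(struct clock_event_device_sketch *dev)
{
	(void)dev;
}

static void clkevt_resume_sketch(struct clock_event_device_sketch *dev)
{
	(void)dev;
}

/*
 * Hypothetical init-time hook: only devices whose clockevent timer
 * sits outside the WKUP domain get the suspend/resume callbacks.
 */
static void clkevt_setup_pm(struct clock_event_device_sketch *dev,
			    bool needs_pm_callbacks)
{
	if (!needs_pm_callbacks)
		return;

	dev->suspend = clkevt_suspend_sketch;
	dev->resume = clkevt_resume_sketch;
}
```

Devices that pass false keep NULL callbacks, so they would see no behavior change from the patch.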

Cheers
Jon

^ permalink raw reply

* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
From: Nicolas Pitre @ 2012-11-05 20:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <5097FB14.20404@gmail.com>

On Mon, 5 Nov 2012, Rob Herring wrote:

> On 11/05/2012 11:26 AM, Dave Martin wrote:
> > On Mon, Nov 05, 2012 at 11:13:51AM -0500, Nicolas Pitre wrote:
> >> On Mon, 5 Nov 2012, Russell King - ARM Linux wrote:
> >>
> >>> On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> >>>> Why not allow unaligned accesses in the decompressor, though, both
> >>>> for v6 and v7?
> >>>
> >>> EXACTLY.
> >>
> >> I have no objections to that.  In fact, I made a remark to this effect 
> >> in my initial review of this patch.  Whether or not gcc does take 
> >> advantage of this hardware ability in the end is orthogonal.
> > 
> > For the sake of argument, here's how it might look.
> > 
> > Currently, I make no attempt to restore the original state of the U bit.
> > The A bit is forced later by the kernel during boot, after a short window
> > during which we should only run low-level arch code and therefore where
> > no unaligned accesses should happen.
> > 
> > Does anyone think these issues are likely to be important?
> > 
> 
> And here is my updated version that does v6 somewhat differently:

If I had to choose, I'd prefer Dave's version as being a bit cleaner.

> 
> 8<------------------------------------------------------------------
> >From 76c2b7685397f13aa53f426822128430fc24b8a0 Mon Sep 17 00:00:00 2001
> From: Rob Herring <rob.herring@calxeda.com>
> Date: Mon, 5 Nov 2012 11:39:48 -0600
> Subject: [PATCH v2] ARM: decompressor: clear SCTLR.A bit for v6 and v7 cores
> 
> With recent compilers and move to generic unaligned.h in commit d25c881
> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
> by the LZO decompressor on v7 cores. So we need to make sure unaligned
> accesses are allowed by clearing the SCTLR A bit.
> 
> While v6 can support unaligned accesses, it is optional and current
> compilers won't emit unaligned accesses. In case this changes and to align
> with the kernel behavior, we clear the A bit and set the U bit.
> 
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> Acked-by: Nicolas Pitre <nico@linaro.org>
> Tested-by: Shawn Guo <shawn.guo@linaro.org>
> ---
>  arch/arm/boot/compressed/head.S |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index bc67cbf..f14d7ec 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -629,6 +629,11 @@ __armv4_mmu_cache_on:
>  		mcr	p15, 0, r0, c7, c10, 4	@ drain write buffer
>  		mcr	p15, 0, r0, c8, c7, 0	@ flush I,D TLBs
>  		mrc	p15, 0, r0, c1, c0, 0	@ read control reg
> +		mrc	p15, 0, r11, c0, c0	@ get processor ID
> +		and	r11, r11, #0xf0000
> +		tst	r11, #0x70000		@ ARMv6
> +		orreq	r0, r0, #1 << 22	@ set SCTLR.U
> +		biceq	r0, r0, #1 << 1		@ clear SCTLR.A
>  		orr	r0, r0, #0x5000		@ I-cache enable, RR cache replacement
>  		orr	r0, r0, #0x0030
>  #ifdef CONFIG_CPU_ENDIAN_BE8
> @@ -654,6 +659,7 @@ __armv7_mmu_cache_on:
>  #endif
>  		mrc	p15, 0, r0, c1, c0, 0	@ read control reg
>  		bic	r0, r0, #1 << 28	@ clear SCTLR.TRE
> +		bic	r0, r0, #1 << 1		@ clear SCTLR.A
>  		orr	r0, r0, #0x5000		@ I-cache enable, RR cache replacement
>  		orr	r0, r0, #0x003c		@ write buffer
>  #ifdef CONFIG_MMU
> -- 
> 1.7.10.4
> 
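
For readers less used to the assembly, the two SCTLR bit operations in the quoted patch can be written out in C. This is a sketch only: the bit positions (A at bit 1, U at bit 22) are taken from the patch comments, the function name is hypothetical, and the CPU ID test that decides whether the v6 path applies is reduced to a caller-supplied flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_A	(UINT32_C(1) << 1)	/* alignment fault checking enable */
#define SCTLR_U	(UINT32_C(1) << 22)	/* v6 unaligned access model */

/*
 * Hypothetical helper: return the control register value with strict
 * alignment checking turned off. set_u mirrors the v6 path of the
 * patch, which also sets the U bit; the v7 path only clears A.
 */
static uint32_t sctlr_allow_unaligned(uint32_t sctlr, bool set_u)
{
	sctlr &= ~SCTLR_A;
	if (set_u)
		sctlr |= SCTLR_U;
	return sctlr;
}
```

All other bits pass through unchanged, which corresponds to the read-modify-write sequence around the mrc instruction in the patch.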

^ permalink raw reply

* [PATCH 13/15] ARM: DTS: AM33XX: Add nodes for OCMCRAM and Mailbox
From: Kevin Hilman @ 2012-11-05 19:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <B5906170F1614E41A8A28DE3B8D121433EC0337B@DBDE01.ent.ti.com>

"Bedia, Vaibhav" <vaibhav.bedia@ti.com> writes:

> On Mon, Nov 05, 2012 at 20:23:11, Shilimkar, Santosh wrote:
> [...]
>> >
>> On OMAP the OCMC RAM is always clocked and doesn't need any special
>> clock enable. CM_L3_2_OCMC_RAM_CLKCTRL module mode field is read only.
>> Isn't it same on AMXX ?
>> 
>
> On AM33xx, OCMC RAM is in PER domain and the corresponding CLKCTRL module
> mode fields are r/w. OCMC RAM needs to be disabled as part of the DeepSleep0
> entry to let PER domain transition.

After DeepSleep0, the ROM code is being given an address in OCMC RAM to
jump to.  If OCMC RAM is disabled as part of suspend, this means that
OCMC RAM contents are maintained even though PER domain transitions?

If so, that needs to be more clearly documented.

Kevin

^ permalink raw reply

