LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 12/40] x86, compat: convert ia32 layer to use
From: Christoph Hellwig @ 2010-06-23 10:46 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Andrew Morton, Jason Baron, x86, Greg Ungerer, linux-kernel,
	Steven Rostedt, linuxppc-dev, Ingo Molnar, Paul Mackerras,
	Ian Munsie, H. Peter Anvin, Russell King, Thomas Gleixner,
	Christoph Hellwig
In-Reply-To: <20100623103619.GC5242@nowhere>

On Wed, Jun 23, 2010 at 12:36:21PM +0200, Frederic Weisbecker wrote:
> I think we wanted that to keep the sys32_ prefixed based naming, to avoid
> collisions with generic compat handler names.

For native syscalls we do this by adding a arch prefix inside the
syscall name, e.g.:

arch/s390/kernel/sys_s390.c:SYSCALL_DEFINE(s390_fallocate)(int fd, int mode, loff_t offset,
arch/sparc/kernel/sys_sparc_64.c:SYSCALL_DEFINE1(sparc_pipe_real, struct pt_regs *, regs)

^ permalink raw reply

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Andi Kleen @ 2010-06-23 10:37 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Christoph Lameter, WANG Cong, Heiko Carstens, Oleg Nesterov,
	linuxppc-dev, Paul Mackerras, H. Peter Anvin, Sam Ravnborg,
	Mike Frysinger, Eric Dumazet, Jeff Moyer, Ingo Molnar,
	KOSAKI Motohiro, David Rientjes, Ingo Molnar, Arnd Bergmann,
	Steven Rostedt, Ian Munsie, Thomas Gleixner, Johannes Berg,
	Roland McGrath, Lee Schermerhorn, Arnaldo Carvalho de Melo,
	Neil Horman, linux-mm, netdev, Jason Baron, Greg Kroah-Hartman,
	Roel Kluin, linux-kernel, kexec, Eric Biederman, linux-fsdevel,
	Simon Kagstrom, Andrew Morton, David S. Miller, Alexander Viro
In-Reply-To: <20100623102931.GB5242@nowhere>

, Frederic Weisbecker wrote:
> On Wed, Jun 23, 2010 at 12:19:38PM +0200, Andi Kleen wrote:
>> , Ian Munsie wrote:
>>> From: Ian Munsie<imunsie@au1.ibm.com>
>>>
>>> This patch converts numerous trivial compat syscalls through the generic
>>> kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.
>>
>> Why? This just makes the code look uglier and the functions harder
>> to grep for.
>
>
> Because it makes them usable with syscall tracing.

Ok that information is missing in the changelog then.

Also I hope the uglification<->usefullness factor is really worth it.
The patch is certainly no slouch on the uglification side.

It also has maintenance costs, e.g. I doubt ctags and cscope
will be able to deal with these kinds of macros, so it has a
high cost for everyone using these tools. For those
it would be actually better if you used separate annotation
that does not confuse standard C parsers.

-Andi

^ permalink raw reply

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: KOSAKI Motohiro @ 2010-06-23 10:30 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Christoph Lameter, WANG Cong, Frederic Weisbecker, Heiko Carstens,
	Oleg Nesterov, linuxppc-dev, Paul Mackerras, H. Peter Anvin,
	Sam Ravnborg, Mike Frysinger, Eric Dumazet, Jeff Moyer,
	Ingo Molnar, kosaki.motohiro, Ingo Molnar, Arnd Bergmann,
	David Rientjes, Steven Rostedt, Ian Munsie, Thomas Gleixner,
	Johannes Berg, Roland McGrath, Lee Schermerhorn,
	Arnaldo Carvalho de Melo, Neil Horman, linux-mm, netdev,
	Jason Baron, Greg Kroah-Hartman, Roel Kluin, linux-kernel, kexec,
	Eric Biederman, linux-fsdevel, Simon Kagstrom, Andrew Morton,
	David S. Miller, Alexander Viro
In-Reply-To: <4C21DFBA.2070202@linux.intel.com>

> , Ian Munsie wrote:
> > From: Ian Munsie<imunsie@au1.ibm.com>
> >
> > This patch converts numerous trivial compat syscalls through the generic
> > kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.
> 
> Why? This just makes the code look uglier and the functions harder
> to grep for.

I guess trace-syscall feature need to override COMPAT_SYSCALL_DEFINE. but
It's only guess...

^ permalink raw reply

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Frederic Weisbecker @ 2010-06-23 11:38 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Christoph Lameter, WANG Cong, Heiko Carstens, Oleg Nesterov,
	linuxppc-dev, Paul Mackerras, H. Peter Anvin, Sam Ravnborg,
	Mike Frysinger, Eric Dumazet, Jeff Moyer, Ingo Molnar,
	KOSAKI Motohiro, David Rientjes, Ingo Molnar, Arnd Bergmann,
	Steven Rostedt, Ian Munsie, Thomas Gleixner, Johannes Berg,
	Roland McGrath, Lee Schermerhorn, Arnaldo Carvalho de Melo,
	Neil Horman, linux-mm, netdev, Jason Baron, Greg Kroah-Hartman,
	Roel Kluin, linux-kernel, kexec, Eric Biederman, linux-fsdevel,
	Simon Kagstrom, Andrew Morton, David S. Miller, Alexander Viro
In-Reply-To: <4C21E3F8.9000405@linux.intel.com>

On Wed, Jun 23, 2010 at 12:37:44PM +0200, Andi Kleen wrote:
> , Frederic Weisbecker wrote:
>> On Wed, Jun 23, 2010 at 12:19:38PM +0200, Andi Kleen wrote:
>>> , Ian Munsie wrote:
>>>> From: Ian Munsie<imunsie@au1.ibm.com>
>>>>
>>>> This patch converts numerous trivial compat syscalls through the generic
>>>> kernel code to use the COMPAT_SYSCALL_DEFINE family of macros.
>>>
>>> Why? This just makes the code look uglier and the functions harder
>>> to grep for.
>>
>>
>> Because it makes them usable with syscall tracing.
>
> Ok that information is missing in the changelog then.


Agreed, the changelog lacks the purpose of what it does.



> Also I hope the uglification<->usefullness factor is really worth it.
> The patch is certainly no slouch on the uglification side.


It's worth because the kernel's syscall tracing is not complete, we lack all
the compat part.

These wrappers let us create TRACE_EVENT() for every syscalls automatically.
If we had to create them manually, the uglification would be way much more worse.

Most syscalls use the syscall wrappers already, so the uglification
is there mostly. We just forgot to uglify a bunch of them :)


> It also has maintenance costs, e.g. I doubt ctags and cscope
> will be able to deal with these kinds of macros, so it has a
> high cost for everyone using these tools. For those
> it would be actually better if you used separate annotation
> that does not confuse standard C parsers.


I haven't heard any complains about existing syscalls wrappers.

What kind of annotations could solve that?

^ permalink raw reply

* Re: [PATCH 12/40] x86, compat: convert ia32 layer to use
From: Frederic Weisbecker @ 2010-06-23 11:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Baron, x86, Greg Ungerer, linux-kernel, Steven Rostedt,
	linuxppc-dev, Ingo Molnar, Paul Mackerras, Ian Munsie,
	H. Peter Anvin, Russell King, Thomas Gleixner, Andrew Morton
In-Reply-To: <20100623104603.GB11845@lst.de>

On Wed, Jun 23, 2010 at 12:46:04PM +0200, Christoph Hellwig wrote:
> On Wed, Jun 23, 2010 at 12:36:21PM +0200, Frederic Weisbecker wrote:
> > I think we wanted that to keep the sys32_ prefixed based naming, to avoid
> > collisions with generic compat handler names.
> 
> For native syscalls we do this by adding a arch prefix inside the
> syscall name, e.g.:
> 
> arch/s390/kernel/sys_s390.c:SYSCALL_DEFINE(s390_fallocate)(int fd, int mode, loff_t offset,
> arch/sparc/kernel/sys_sparc_64.c:SYSCALL_DEFINE1(sparc_pipe_real, struct pt_regs *, regs)


In fact we sort of wanted to standardize the name of arch overriden compat
syscalls, so that userspace programs playing with syscalls tracing won't have
to deal with arch naming differences.

^ permalink raw reply

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Andi Kleen @ 2010-06-23 12:35 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Christoph Lameter, WANG Cong, Heiko Carstens, Oleg Nesterov,
	linuxppc-dev, Paul Mackerras, H. Peter Anvin, Sam Ravnborg,
	Mike Frysinger, Eric Dumazet, Jeff Moyer, Ingo Molnar,
	KOSAKI Motohiro, David Rientjes, Ingo Molnar, Arnd Bergmann,
	Steven Rostedt, Ian Munsie, Thomas Gleixner, Johannes Berg,
	Roland McGrath, Lee Schermerhorn, Arnaldo Carvalho de Melo,
	Neil Horman, linux-mm, netdev, Jason Baron, Greg Kroah-Hartman,
	Roel Kluin, linux-kernel, kexec, Eric Biederman, linux-fsdevel,
	Simon Kagstrom, Andrew Morton, David S. Miller, Alexander Viro
In-Reply-To: <20100623113806.GD5242@nowhere>


>
> I haven't heard any complains about existing syscalls wrappers.

At least for me they always interrupt my grepping.

>
> What kind of annotations could solve that?

If you put the annotation in a separate macro and leave the original
prototype alone. Then C parsers could still parse it.

-Andi

^ permalink raw reply

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Christoph Hellwig @ 2010-06-23 12:41 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Christoph Lameter, WANG Cong, Frederic Weisbecker, Heiko Carstens,
	Oleg Nesterov, linuxppc-dev, Paul Mackerras, H. Peter Anvin,
	Sam Ravnborg, Mike Frysinger, Eric Dumazet, Jeff Moyer,
	Ingo Molnar, KOSAKI Motohiro, Ingo Molnar, Arnd Bergmann,
	David Rientjes, Steven Rostedt, Ian Munsie, Thomas Gleixner,
	Johannes Berg, Roland McGrath, Lee Schermerhorn,
	Arnaldo Carvalho de Melo, Neil Horman, linux-mm, netdev,
	Jason Baron, Greg Kroah-Hartman, Roel Kluin, linux-kernel, kexec,
	Eric Biederman, linux-fsdevel, Simon Kagstrom, Andrew Morton,
	David S. Miller, Alexander Viro
In-Reply-To: <4C21FF9A.2040207@linux.intel.com>

On Wed, Jun 23, 2010 at 02:35:38PM +0200, Andi Kleen wrote:
> >I haven't heard any complains about existing syscalls wrappers.
> 
> At least for me they always interrupt my grepping.
> 
> >
> >What kind of annotations could solve that?
> 
> If you put the annotation in a separate macro and leave the original
> prototype alone. Then C parsers could still parse it.

I personally hate the way SYSCALL_DEFINE works with passion, mostly
for the grep reason, but also because it looks horribly ugly.

But there is no reason not to be consistent here.  We already use
the wrappers for all native system calls, so leaving the compat
calls out doesn't make any sense.  And I'd cheer for anyone who
comes up with a better scheme for the native and compat wrappers.

^ permalink raw reply

* unsubscribe
From: Frederic LEGER @ 2010-06-23 14:33 UTC (permalink / raw)
  To: linuxppc-dev

unsubscribe
 

^ permalink raw reply

* Re: [PATCH 01/40] ftrace syscalls: don't add events for unmapped syscalls
From: Steven Rostedt @ 2010-06-23 15:02 UTC (permalink / raw)
  To: Ian Munsie
  Cc: Jason Baron, linux-kernel, linuxppc-dev, Ingo Molnar,
	Paul Mackerras, Frederic Weisbecker, Ingo Molnar,
	Masami Hiramatsu
In-Reply-To: <1277287401-28571-2-git-send-email-imunsie@au1.ibm.com>

On Wed, 2010-06-23 at 20:02 +1000, Ian Munsie wrote:
> From: Ian Munsie <imunsie@au1.ibm.com>
> 
> FTRACE_SYSCALLS would create events for each and every system call, even
> if it had failed to map the system call's name with it's number. This
> resulted in a number of events being created that would not behave as
> expected.
> 
> This could happen, for example, on architectures who's symbol names are
> unusual and will not match the system call name. It could also happen
> with system calls which were mapped to sys_ni_syscall.
> 
> This patch changes the default system call number in the metadata to -1.
> If the system call name from the metadata is not successfully mapped to
> a system call number during boot, than the event initialisation routine
> will now return an error, preventing the event from being created.
> 
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> ---
>  include/linux/syscalls.h      |    2 ++
>  kernel/trace/trace_syscalls.c |    8 ++++++++
>  2 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 7f614ce..86f082b 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -159,6 +159,7 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  	  __attribute__((section("__syscalls_metadata")))	\
>  	  __syscall_meta_##sname = {				\
>  		.name 		= "sys"#sname,			\
> +		.syscall_nr	= -1,	/* Filled in at boot */	\
>  		.nb_args 	= nb,				\
>  		.types		= types_##sname,		\
>  		.args		= args_##sname,			\
> @@ -176,6 +177,7 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  	  __attribute__((section("__syscalls_metadata")))	\
>  	  __syscall_meta__##sname = {				\
>  		.name 		= "sys_"#sname,			\
> +		.syscall_nr	= -1,	/* Filled in at boot */	\
>  		.nb_args 	= 0,				\
>  		.enter_event	= &event_enter__##sname,	\
>  		.exit_event	= &event_exit__##sname,		\
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 34e3580..82246ce 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -431,6 +431,14 @@ void unreg_event_syscall_exit(struct ftrace_event_call *call)
>  int init_syscall_trace(struct ftrace_event_call *call)
>  {
>  	int id;
> +	int num;
> +
> +	num = ((struct syscall_metadata *)call->data)->syscall_nr;
> +	if (num < 0 || num >= NR_syscalls) {
> +		pr_debug("syscall %s metadata not mapped, disabling ftrace event\n",
> +				((struct syscall_metadata *)call->data)->name);
> +		return -ENOSYS;
> +	}

Perhaps this should be:

	if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)
		return -ENOSYS;

-- Steve

>  
>  	if (set_syscall_print_fmt(call) < 0)
>  		return -ENOMEM;

^ permalink raw reply

* Re: [PATCH 08/40] tracing: remove syscall bitmaps in preparation for compat support
From: Steven Rostedt @ 2010-06-23 15:16 UTC (permalink / raw)
  To: Ian Munsie
  Cc: Lai Jiangshan, Jason Baron, linux-kernel, linuxppc-dev,
	Ingo Molnar, Paul Mackerras, Frederic Weisbecker, Ingo Molnar,
	Masami Hiramatsu
In-Reply-To: <1277287401-28571-9-git-send-email-imunsie@au1.ibm.com>

On Wed, 2010-06-23 at 20:02 +1000, Ian Munsie wrote:
> From: Jason Baron <jbaron@redhat.com>
> 
> In preparation for compat syscall tracing support, let's store the enabled
> syscalls, with the struct syscall_metadata itself. That way we don't duplicate
> enabled information when the compat table points to an entry in the regular
> syscall table. Also, allows us to remove the bitmap data structures completely.
> 
> Signed-off-by: Jason Baron <jbaron@redhat.com>
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> ---
>  include/linux/syscalls.h      |    8 +++++++
>  include/trace/syscall.h       |    4 +++
>  kernel/trace/trace_syscalls.c |   42 +++++++++++++++++++---------------------
>  3 files changed, 32 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 86f082b..755d05b 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -163,6 +163,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  		.nb_args 	= nb,				\
>  		.types		= types_##sname,		\
>  		.args		= args_##sname,			\
> +		.ftrace_enter	= 0,				\
> +		.ftrace_exit	= 0,				\
> +		.perf_enter	= 0,				\
> +		.perf_exit	= 0,				\

I really hate this change!

You just removed a nice compressed bitmap (1 bit per syscall) to add 4
bytes per syscall. On my box I have 308 syscalls being traced. That was
308 bits per bitmask = 39 bytes * 2 = 78 * 2 (perf and ftrace) = 156.

Now we have 8 bytes per syscall (enter and exit), which is 1232 bytes.

Thus this change added 1076 bytes.

This may not seem as much, but the change is not worth 1K. Can't we just
add another bitmask or something for the compat case?

I also hate the moving of ftrace and perf internal data to an external
interface.

-- Steve

>  		.enter_event	= &event_enter_##sname,		\
>  		.exit_event	= &event_exit_##sname,		\
>  		.enter_fields	= LIST_HEAD_INIT(__syscall_meta_##sname.enter_fields), \
> @@ -179,6 +183,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  		.name 		= "sys_"#sname,			\
>  		.syscall_nr	= -1,	/* Filled in at boot */	\
>  		.nb_args 	= 0,				\
> +		.ftrace_enter	= 0,				\
> +		.ftrace_exit	= 0,				\
> +		.perf_enter	= 0,				\
> +		.perf_exit	= 0,				\
>  		.enter_event	= &event_enter__##sname,	\
>  		.exit_event	= &event_exit__##sname,		\
>  		.enter_fields	= LIST_HEAD_INIT(__syscall_meta__##sname.enter_fields), \

^ permalink raw reply

* Re: [PATCH 10/40] tracing: add tracing support for compat syscalls
From: Steven Rostedt @ 2010-06-23 15:26 UTC (permalink / raw)
  To: Ian Munsie
  Cc: Lai Jiangshan, Arnd Bergmann, Jason Baron, Li Zefan, linux-kernel,
	linuxppc-dev, Ingo Molnar, Paul Mackerras, Frederic Weisbecker,
	Andrew Morton, David S. Miller, Masami Hiramatsu
In-Reply-To: <1277287401-28571-11-git-send-email-imunsie@au1.ibm.com>

On Wed, 2010-06-23 at 20:02 +1000, Ian Munsie wrote:
> From: Jason Baron <jbaron@redhat.com>
> 
> Add core support to event tracing for compat syscalls. The basic idea is that we
> check if we have a compat task via 'is_compat_task()'. If so, we lookup in the
> new compat_syscalls_metadata table, the corresponding struct syscall_metadata, to
> check syscall arguments and whether or not we are enabled.
> 
> Signed-off-by: Jason Baron <jbaron@redhat.com>
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> ---
>  include/linux/compat.h        |    2 +
>  include/trace/syscall.h       |    4 ++
>  kernel/trace/trace.h          |    2 +
>  kernel/trace/trace_syscalls.c |   86 +++++++++++++++++++++++++++++++++++++----
>  4 files changed, 86 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/compat.h b/include/linux/compat.h
> index ab638e9..a94f13d 100644
> --- a/include/linux/compat.h
> +++ b/include/linux/compat.h
> @@ -363,6 +363,8 @@ extern ssize_t compat_rw_copy_check_uvector(int type,
>  
>  #else /* CONFIG_COMPAT */
>  
> +#define NR_syscalls_compat 0
> +
>  static inline int is_compat_task(void)
>  {
>  	return 0;
> diff --git a/include/trace/syscall.h b/include/trace/syscall.h
> index 75f3dce..67d4e64 100644
> --- a/include/trace/syscall.h
> +++ b/include/trace/syscall.h
> @@ -22,6 +22,7 @@
>  struct syscall_metadata {
>  	const char	*name;
>  	int		syscall_nr;
> +	int		compat_syscall_nr;

This is also bloating the kernel. I don't like to add fields to the
syscall_metadata lightly.

You're adding 4 more bytes to this structure to handle a few items. Find
a better way to do this.

>  	int		nb_args;
>  	const char	**types;
>  	const char	**args;
> @@ -38,6 +39,9 @@ struct syscall_metadata {
>  
>  #ifdef CONFIG_FTRACE_SYSCALLS
>  extern unsigned long arch_syscall_addr(int nr);
> +#ifdef CONFIG_COMPAT
> +extern unsigned long arch_compat_syscall_addr(int nr);
> +#endif
>  extern int init_syscall_trace(struct ftrace_event_call *call);
>  
>  extern int reg_event_syscall_enter(struct ftrace_event_call *call);
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 01ce088..53ace4b 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -79,12 +79,14 @@ enum trace_type {
>  struct syscall_trace_enter {
>  	struct trace_entry	ent;
>  	int			nr;
> +	int			compat;

You're adding 4 bytes to a trace (taking up precious buffer space) for a
single flag. If anything, set the 31st bit of nr if it is compat.

-- Steve

>  	unsigned long		args[];
>  };
>  
>  struct syscall_trace_exit {
>  	struct trace_entry	ent;
>  	int			nr;
> +	int			compat;
>  	long			ret;
>  };
>  

^ permalink raw reply

* [PATCH v3 0/2] powerpc: add support for new hcall H_BEST_ENERGY
From: Vaidyanathan Srinivasan @ 2010-06-23  6:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Anton Blanchard; +Cc: linuxppc-dev

Hi Ben,

The following series adds a kernel module for powerpc pseries platforms in
order to export platform energy management capabilities.

The module exports data from a new hypervisor call H_BEST_ENERGY.

Comments and suggestions made on the previous iteration of the patch
has been incorporated.

Changed in v3:

* Added more documentation in the cleanup patch
* Removed RFC tag, rebased and tested on 2.6.35-rc3
* Ready for inclusion in powerpc/next tree for further testing

Changes in v2:

[1] [RFC PATCH v2 0/2] powerpc: add support for new hcall H_BEST_ENERGY
http://lists.ozlabs.org/pipermail/linuxppc-dev/2010-May/082246.html

* Cleanup cpu/thread/core APIs
* Export APIs to module instead of threads_per_core
* Use of_find_node_by_path() instead of of_find_node_by_name()
* Error checking and whitespace cleanups

First version:
[2] [RFC] powerpc: add support for new hcall H_BEST_ENERGY
http://lists.ozlabs.org/pipermail/linuxppc-dev/2010-March/080796.html

Please review and include in powerpc/next tree for further testing.  I will
add the following enhancements incrementally:

* Detect h_call support from ibm,hypertas-functions
* Limit sysfs file creation only on hcall availability

This patch series will apply on 2.6.35-rc3 as well as powerpc/next tree.

Thanks,
Vaidy
---

Vaidyanathan Srinivasan (2):
      powerpc: cleanup APIs for cpu/thread/core mappings
      powerpc: add support for new hcall H_BEST_ENERGY


 arch/powerpc/include/asm/cputhreads.h           |   15 +
 arch/powerpc/include/asm/hvcall.h               |    3 
 arch/powerpc/kernel/smp.c                       |   19 +-
 arch/powerpc/mm/mmu_context_nohash.c            |   12 +
 arch/powerpc/platforms/pseries/Kconfig          |   10 +
 arch/powerpc/platforms/pseries/Makefile         |    1 
 arch/powerpc/platforms/pseries/pseries_energy.c |  258 +++++++++++++++++++++++
 7 files changed, 302 insertions(+), 16 deletions(-)
 create mode 100644 arch/powerpc/platforms/pseries/pseries_energy.c

-- 

^ permalink raw reply

* [PATCH v3 1/2] powerpc: cleanup APIs for cpu/thread/core mappings
From: Vaidyanathan Srinivasan @ 2010-06-23  6:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Anton Blanchard; +Cc: linuxppc-dev
In-Reply-To: <20100623060122.4957.33819.stgit@drishya.in.ibm.com>

These APIs take logical cpu number as input
Change cpu_first_thread_in_core() to cpu_leftmost_thread_sibling()
Change cpu_last_thread_in_core() to cpu_rightmost_thread_sibling()

These APIs convert core number (index) to logical cpu/thread numbers
Add cpu_first_thread_of_core(int core)
Changed cpu_thread_to_core() to cpu_core_of_thread(int cpu)

The goal is to make 'threads_per_core' accessible to the
pseries_energy module.  Instead of making an API to read
threads_per_core, this is a higher level wrapper function to
convert from logical cpu number to core number.

The current APIs cpu_first_thread_in_core() and
cpu_last_thread_in_core() returns logical CPU number while
cpu_thread_to_core() returns core number or index which is
not a logical CPU number.  The APIs are now clearly named to
distinguish 'core number' versus first and last 'logical cpu
number' in that core.

The new APIs cpu_{left,right}most_thread_sibling() work on
logical cpu numbers.  While cpu_first_thread_of_core() and
cpu_core_of_thread() work on core index.

Example usage:  (4 threads per core system)

cpu_leftmost_thread_sibling(5) = 4
cpu_rightmost_thread_sibling(5) = 7
cpu_core_of_thread(5) = 1
cpu_first_thread_of_core(1) = 4

cpu_core_of_thread() is used in cpu_to_drc_index() in the
module and cpu_first_thread_of_core() is used in
drc_index_to_cpu() in the module.

Made API changes to few callers.  Exported symbols for use
in modules.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/cputhreads.h |   15 +++++++++------
 arch/powerpc/kernel/smp.c             |   19 ++++++++++++++++---
 arch/powerpc/mm/mmu_context_nohash.c  |   12 ++++++------
 3 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
index a8e1844..26dc6bd 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -61,22 +61,25 @@ static inline cpumask_t cpu_online_cores_map(void)
 	return cpu_thread_mask_to_cores(cpu_online_map);
 }

-static inline int cpu_thread_to_core(int cpu)
-{
-	return cpu >> threads_shift;
-}
+#ifdef CONFIG_SMP
+int cpu_core_of_thread(int cpu);
+int cpu_first_thread_of_core(int core);
+#else
+static inline int cpu_core_of_thread(int cpu) { return cpu; }
+static inline int cpu_first_thread_of_core(int core) { return core; }
+#endif

 static inline int cpu_thread_in_core(int cpu)
 {
 	return cpu & (threads_per_core - 1);
 }

-static inline int cpu_first_thread_in_core(int cpu)
+static inline int cpu_leftmost_thread_sibling(int cpu)
 {
 	return cpu & ~(threads_per_core - 1);
 }

-static inline int cpu_last_thread_in_core(int cpu)
+static inline int cpu_rightmost_thread_sibling(int cpu)
 {
 	return cpu | (threads_per_core - 1);
 }
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 5c196d1..da4c2f8 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -468,7 +468,20 @@ out:
 	return id;
 }

-/* Must be called when no change can occur to cpu_present_mask,
+/* Helper routines for cpu to core mapping */
+int cpu_core_of_thread(int cpu)
+{
+	return cpu >> threads_shift;
+}
+EXPORT_SYMBOL_GPL(cpu_core_of_thread);
+
+int cpu_first_thread_of_core(int core)
+{
+	return core << threads_shift;
+}
+EXPORT_SYMBOL_GPL(cpu_first_thread_of_core);
+
+/* Must be called when no change can occur to cpu_present_map,
  * i.e. during cpu online or offline.
  */
 static struct device_node *cpu_to_l2cache(int cpu)
@@ -527,7 +540,7 @@ int __devinit start_secondary(void *unused)
 	notify_cpu_starting(cpu);
 	set_cpu_online(cpu, true);
 	/* Update sibling maps */
-	base = cpu_first_thread_in_core(cpu);
+	base = cpu_leftmost_thread_sibling(cpu);
 	for (i = 0; i < threads_per_core; i++) {
 		if (cpu_is_offline(base + i))
 			continue;
@@ -606,7 +619,7 @@ int __cpu_disable(void)
 		return err;

 	/* Update sibling maps */
-	base = cpu_first_thread_in_core(cpu);
+	base = cpu_leftmost_thread_sibling(cpu);
 	for (i = 0; i < threads_per_core; i++) {
 		cpumask_clear_cpu(cpu, cpu_sibling_mask(base + i));
 		cpumask_clear_cpu(base + i, cpu_sibling_mask(cpu));
diff --git a/arch/powerpc/mm/mmu_context_nohash.c b/arch/powerpc/mm/mmu_context_nohash.c
index ddfd7ad..22f3bc5 100644
--- a/arch/powerpc/mm/mmu_context_nohash.c
+++ b/arch/powerpc/mm/mmu_context_nohash.c
@@ -111,8 +111,8 @@ static unsigned int steal_context_smp(unsigned int id)
 		 * a core map instead but this will do for now.
 		 */
 		for_each_cpu(cpu, mm_cpumask(mm)) {
-			for (i = cpu_first_thread_in_core(cpu);
-			     i <= cpu_last_thread_in_core(cpu); i++)
+			for (i = cpu_leftmost_thread_sibling(cpu);
+			     i <= cpu_rightmost_thread_sibling(cpu); i++)
 				__set_bit(id, stale_map[i]);
 			cpu = i - 1;
 		}
@@ -264,14 +264,14 @@ void switch_mmu_context(struct mm_struct *prev, struct mm_struct *next)
 	 */
 	if (test_bit(id, stale_map[cpu])) {
 		pr_hardcont(" | stale flush %d [%d..%d]",
-			    id, cpu_first_thread_in_core(cpu),
-			    cpu_last_thread_in_core(cpu));
+			    id, cpu_leftmost_thread_sibling(cpu),
+			    cpu_rightmost_thread_sibling(cpu));

 		local_flush_tlb_mm(next);

 		/* XXX This clear should ultimately be part of local_flush_tlb_mm */
-		for (i = cpu_first_thread_in_core(cpu);
-		     i <= cpu_last_thread_in_core(cpu); i++) {
+		for (i = cpu_leftmost_thread_sibling(cpu);
+		     i <= cpu_rightmost_thread_sibling(cpu); i++) {
 			__clear_bit(id, stale_map[i]);
 		}
 	}

^ permalink raw reply related

* [PATCH v3 2/2] powerpc: add support for new hcall H_BEST_ENERGY
From: Vaidyanathan Srinivasan @ 2010-06-23  6:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Anton Blanchard; +Cc: linuxppc-dev
In-Reply-To: <20100623060122.4957.33819.stgit@drishya.in.ibm.com>

	Create sysfs interface to export data from H_BEST_ENERGY hcall
	that can be used by administrative tools on supported pseries
	platforms for energy management	optimizations.

	/sys/device/system/cpu/pseries_(de)activate_hint_list and
	/sys/device/system/cpu/cpuN/pseries_(de)activate_hint will provide
	hints for activation and deactivation of cpus respectively.

	Added new driver module
		arch/powerpc/platforms/pseries/pseries_energy.c
	under new config option CONFIG_PSERIES_ENERGY

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/hvcall.h               |    3 
 arch/powerpc/platforms/pseries/Kconfig          |   10 +
 arch/powerpc/platforms/pseries/Makefile         |    1 
 arch/powerpc/platforms/pseries/pseries_energy.c |  258 +++++++++++++++++++++++
 4 files changed, 271 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/platforms/pseries/pseries_energy.c

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 5119b7d..34b66e0 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -231,7 +231,8 @@
 #define H_GET_EM_PARMS		0x2B8
 #define H_SET_MPP		0x2D0
 #define H_GET_MPP		0x2D4
-#define MAX_HCALL_OPCODE	H_GET_MPP
+#define H_BEST_ENERGY		0x2F4
+#define MAX_HCALL_OPCODE	H_BEST_ENERGY

 #ifndef __ASSEMBLY__

diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index c667f0f..b3dd108 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -33,6 +33,16 @@ config PSERIES_MSI
        depends on PCI_MSI && EEH
        default y

+config PSERIES_ENERGY
+	tristate "pseries energy management capabilities driver"
+	depends on PPC_PSERIES
+	default y
+	help
+	  Provides interface to platform energy management capabilities
+	  on supported PSERIES platforms.
+	  Provides: /sys/devices/system/cpu/pseries_(de)activation_hint_list
+	  and /sys/devices/system/cpu/cpuN/pseries_(de)activation_hint
+
 config SCANLOG
 	tristate "Scanlog dump interface"
 	depends on RTAS_PROC && PPC_PSERIES
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 3dbef30..32ae72e 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_EEH)	+= eeh.o eeh_cache.o eeh_driver.o eeh_event.o eeh_sysfs.o
 obj-$(CONFIG_KEXEC)	+= kexec.o
 obj-$(CONFIG_PCI)	+= pci.o pci_dlpar.o
 obj-$(CONFIG_PSERIES_MSI)	+= msi.o
+obj-$(CONFIG_PSERIES_ENERGY)	+= pseries_energy.o

 obj-$(CONFIG_HOTPLUG_CPU)	+= hotplug-cpu.o
 obj-$(CONFIG_MEMORY_HOTPLUG)	+= hotplug-memory.o
diff --git a/arch/powerpc/platforms/pseries/pseries_energy.c b/arch/powerpc/platforms/pseries/pseries_energy.c
new file mode 100644
index 0000000..9a936b1
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/pseries_energy.c
@@ -0,0 +1,258 @@
+/*
+ * POWER platform energy management driver
+ * Copyright (C) 2010 IBM Corporation
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This pseries platform device driver provides access to
+ * platform energy management capabilities.
+ */
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/seq_file.h>
+#include <linux/sysdev.h>
+#include <linux/cpu.h>
+#include <linux/of.h>
+#include <asm/cputhreads.h>
+#include <asm/page.h>
+#include <asm/hvcall.h>
+
+
+#define MODULE_VERS "1.0"
+#define MODULE_NAME "pseries_energy"
+
+/* Helper Routines to convert between drc_index to cpu numbers */
+
+static u32 cpu_to_drc_index(int cpu)
+{
+	struct device_node *dn = NULL;
+	const int *indexes;
+	int i;
+	dn = of_find_node_by_path("/cpus");
+	if (dn == NULL)
+		goto err;
+	indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
+	if (indexes == NULL)
+		goto err;
+	/* Convert logical cpu number to core number */
+	i = cpu_core_of_thread(cpu);
+	/*
+	 * The first element indexes[0] is the number of drc_indexes
+	 * returned in the list.  Hence i+1 will get the drc_index
+	 * corresponding to core number i.
+	 */
+	WARN_ON(i > indexes[0]);
+	return indexes[i + 1];
+err:
+	printk(KERN_WARNING "cpu_to_drc_index(%d) failed", cpu);
+	return 0;
+}
+
+static int drc_index_to_cpu(u32 drc_index)
+{
+	struct device_node *dn = NULL;
+	const int *indexes;
+	int i, cpu;
+	dn = of_find_node_by_path("/cpus");
+	if (dn == NULL)
+		goto err;
+	indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
+	if (indexes == NULL)
+		goto err;
+	/*
+	 * First element in the array is the number of drc_indexes
+	 * returned.  Search through the list to find the matching
+	 * drc_index and get the core number
+	 */
+	for (i = 0; i < indexes[0]; i++) {
+		if (indexes[i + 1] == drc_index)
+			break;
+	}
+	/* Convert core number to logical cpu number */
+	cpu = cpu_first_thread_of_core(i);
+	return cpu;
+err:
+	printk(KERN_WARNING "drc_index_to_cpu(%d) failed", drc_index);
+	return 0;
+}
+
+/*
+ * pseries hypervisor call H_BEST_ENERGY provides hints to OS on
+ * preferred logical cpus to activate or deactivate for optimized
+ * energy consumption.
+ */
+
+#define FLAGS_MODE1	0x004E200000080E01
+#define FLAGS_MODE2	0x004E200000080401
+#define FLAGS_ACTIVATE  0x100
+
+static ssize_t get_best_energy_list(char *page, int activate)
+{
+	int rc, cnt, i, cpu;
+	unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];
+	unsigned long flags = 0;
+	u32 *buf_page;
+	char *s = page;
+
+	buf_page = (u32 *) get_zeroed_page(GFP_KERNEL);
+	if (!buf_page)
+		return -ENOMEM;
+
+	flags = FLAGS_MODE1;
+	if (activate)
+		flags |= FLAGS_ACTIVATE;
+
+	rc = plpar_hcall9(H_BEST_ENERGY, retbuf, flags, 0, __pa(buf_page),
+				0, 0, 0, 0, 0, 0);
+	if (rc != H_SUCCESS) {
+		free_page((unsigned long) buf_page);
+		return -EINVAL;
+	}
+
+	cnt = retbuf[0];
+	for (i = 0; i < cnt; i++) {
+		cpu = drc_index_to_cpu(buf_page[2*i+1]);
+		if ((cpu_online(cpu) && !activate) ||
+		    (!cpu_online(cpu) && activate))
+			s += sprintf(s, "%d,", cpu);
+	}
+	if (s > page) { /* Something to show */
+		s--; /* Suppress last comma */
+		s += sprintf(s, "\n");
+	}
+
+	free_page((unsigned long) buf_page);
+	return s-page;
+}
+
+static ssize_t get_best_energy_data(struct sys_device *dev,
+					char *page, int activate)
+{
+	int rc;
+	unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];
+	unsigned long flags = 0;
+
+	flags = FLAGS_MODE2;
+	if (activate)
+		flags |= FLAGS_ACTIVATE;
+
+	rc = plpar_hcall9(H_BEST_ENERGY, retbuf, flags,
+				cpu_to_drc_index(dev->id),
+				0, 0, 0, 0, 0, 0, 0);
+
+	if (rc != H_SUCCESS)
+		return -EINVAL;
+
+	return sprintf(page, "%lu\n", retbuf[1] >> 32);
+}
+
+/* Wrapper functions */
+
+static ssize_t cpu_activate_hint_list_show(struct sysdev_class *class,
+			struct sysdev_class_attribute *attr, char *page)
+{
+	return get_best_energy_list(page, 1);
+}
+
+static ssize_t cpu_deactivate_hint_list_show(struct sysdev_class *class,
+			struct sysdev_class_attribute *attr, char *page)
+{
+	return get_best_energy_list(page, 0);
+}
+
+static ssize_t percpu_activate_hint_show(struct sys_device *dev,
+			struct sysdev_attribute *attr, char *page)
+{
+	return get_best_energy_data(dev, page, 1);
+}
+
+static ssize_t percpu_deactivate_hint_show(struct sys_device *dev,
+			struct sysdev_attribute *attr, char *page)
+{
+	return get_best_energy_data(dev, page, 0);
+}
+
+/*
+ * Create sysfs interface:
+ * /sys/devices/system/cpu/pseries_activate_hint_list
+ * /sys/devices/system/cpu/pseries_deactivate_hint_list
+ * 	Comma separated list of cpus to activate or deactivate
+ * /sys/devices/system/cpu/cpuN/pseries_activate_hint
+ * /sys/devices/system/cpu/cpuN/pseries_deactivate_hint
+ *	Per-cpu value of the hint
+ */
+
+struct sysdev_class_attribute attr_cpu_activate_hint_list =
+		_SYSDEV_CLASS_ATTR(pseries_activate_hint_list, 0444,
+		cpu_activate_hint_list_show, NULL);
+
+struct sysdev_class_attribute attr_cpu_deactivate_hint_list =
+		_SYSDEV_CLASS_ATTR(pseries_deactivate_hint_list, 0444,
+		cpu_deactivate_hint_list_show, NULL);
+
+struct sysdev_attribute attr_percpu_activate_hint =
+		_SYSDEV_ATTR(pseries_activate_hint, 0444,
+		percpu_activate_hint_show, NULL);
+
+struct sysdev_attribute attr_percpu_deactivate_hint =
+		_SYSDEV_ATTR(pseries_deactivate_hint, 0444,
+		percpu_deactivate_hint_show, NULL);
+
+static int __init pseries_energy_init(void)
+{
+	int cpu, err;
+	struct sys_device *cpu_sys_dev;
+
+	/* Create the sysfs files */
+	err = sysfs_create_file(&cpu_sysdev_class.kset.kobj,
+				&attr_cpu_activate_hint_list.attr);
+	if (!err)
+		err = sysfs_create_file(&cpu_sysdev_class.kset.kobj,
+				&attr_cpu_deactivate_hint_list.attr);
+
+	for_each_possible_cpu(cpu) {
+		cpu_sys_dev = get_cpu_sysdev(cpu);
+		err = sysfs_create_file(&cpu_sys_dev->kobj,
+				&attr_percpu_activate_hint.attr);
+		if (err)
+			break;
+		err = sysfs_create_file(&cpu_sys_dev->kobj,
+				&attr_percpu_deactivate_hint.attr);
+		if (err)
+			break;
+	}
+	return err;
+
+}
+
+static void __exit pseries_energy_cleanup(void)
+{
+	int cpu;
+	struct sys_device *cpu_sys_dev;
+
+	/* Remove the sysfs files */
+	sysfs_remove_file(&cpu_sysdev_class.kset.kobj,
+				&attr_cpu_activate_hint_list.attr);
+
+	sysfs_remove_file(&cpu_sysdev_class.kset.kobj,
+				&attr_cpu_deactivate_hint_list.attr);
+
+	for_each_possible_cpu(cpu) {
+		cpu_sys_dev = get_cpu_sysdev(cpu);
+		sysfs_remove_file(&cpu_sys_dev->kobj,
+				&attr_percpu_activate_hint.attr);
+		sysfs_remove_file(&cpu_sys_dev->kobj,
+				&attr_percpu_deactivate_hint.attr);
+	}
+}
+
+module_init(pseries_energy_init);
+module_exit(pseries_energy_cleanup);
+MODULE_DESCRIPTION("Driver for pseries platform energy management");
+MODULE_AUTHOR("Vaidyanathan Srinivasan");
+MODULE_LICENSE("GPL");

^ permalink raw reply related

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: H. Peter Anvin @ 2010-06-23 15:51 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Christoph Lameter, WANG Cong, Heiko Carstens, Oleg Nesterov,
	linuxppc-dev, Paul Mackerras, Sam Ravnborg, Andi Kleen,
	Mike Frysinger, Eric Dumazet, Jeff Moyer, Ingo Molnar,
	KOSAKI Motohiro, David Rientjes, Ingo Molnar, Arnd Bergmann,
	Steven Rostedt, Ian Munsie, Thomas Gleixner, Johannes Berg,
	Roland McGrath, Lee Schermerhorn, Arnaldo Carvalho de Melo,
	Neil Horman, linux-mm, netdev, Jason Baron, Greg Kroah-Hartman,
	Roel Kluin, linux-kernel, kexec, Eric Biederman, linux-fsdevel,
	Simon Kagstrom, Andrew Morton, David S. Miller, Alexander Viro
In-Reply-To: <20100623113806.GD5242@nowhere>

On 06/23/2010 04:38 AM, Frederic Weisbecker wrote:
> 
> I haven't heard any complains about existing syscalls wrappers.
> 

Then you truly haven't been listening.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

^ permalink raw reply

* Re: [PATCH 10/40] tracing: add tracing support for compat syscalls
From: Frederic Weisbecker @ 2010-06-23 16:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Lai Jiangshan, Arnd Bergmann, Jason Baron, Li Zefan, linux-kernel,
	linuxppc-dev, Ingo Molnar, Paul Mackerras, Ian Munsie,
	Andrew Morton, David S. Miller, Masami Hiramatsu
In-Reply-To: <1277306786.9181.51.camel@gandalf.stny.rr.com>

On Wed, Jun 23, 2010 at 11:26:26AM -0400, Steven Rostedt wrote:
> On Wed, 2010-06-23 at 20:02 +1000, Ian Munsie wrote:
> > From: Jason Baron <jbaron@redhat.com>
> > 
> > Add core support to event tracing for compat syscalls. The basic idea is that we
> > check if we have a compat task via 'is_compat_task()'. If so, we lookup in the
> > new compat_syscalls_metadata table, the corresponding struct syscall_metadata, to
> > check syscall arguments and whether or not we are enabled.
> > 
> > Signed-off-by: Jason Baron <jbaron@redhat.com>
> > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> > ---
> >  include/linux/compat.h        |    2 +
> >  include/trace/syscall.h       |    4 ++
> >  kernel/trace/trace.h          |    2 +
> >  kernel/trace/trace_syscalls.c |   86 +++++++++++++++++++++++++++++++++++++----
> >  4 files changed, 86 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/linux/compat.h b/include/linux/compat.h
> > index ab638e9..a94f13d 100644
> > --- a/include/linux/compat.h
> > +++ b/include/linux/compat.h
> > @@ -363,6 +363,8 @@ extern ssize_t compat_rw_copy_check_uvector(int type,
> >  
> >  #else /* CONFIG_COMPAT */
> >  
> > +#define NR_syscalls_compat 0
> > +
> >  static inline int is_compat_task(void)
> >  {
> >  	return 0;
> > diff --git a/include/trace/syscall.h b/include/trace/syscall.h
> > index 75f3dce..67d4e64 100644
> > --- a/include/trace/syscall.h
> > +++ b/include/trace/syscall.h
> > @@ -22,6 +22,7 @@
> >  struct syscall_metadata {
> >  	const char	*name;
> >  	int		syscall_nr;
> > +	int		compat_syscall_nr;
> 
> This is also bloating the kernel. I don't like to add fields to the
> syscall_metadata lightly.
> 
> You're adding 4 more bytes to this structure to handle a few items. Find
> a better way to do this.


This is for cases when the compat and normal handlers are the same.
I haven't checked whether such case happen enough to make this a
benefit. I suspect it's not. Moreover I guess this adds more
complexity to the code.

May be this should be simply dropped.



 
> >  	int		nb_args;
> >  	const char	**types;
> >  	const char	**args;
> > @@ -38,6 +39,9 @@ struct syscall_metadata {
> >  
> >  #ifdef CONFIG_FTRACE_SYSCALLS
> >  extern unsigned long arch_syscall_addr(int nr);
> > +#ifdef CONFIG_COMPAT
> > +extern unsigned long arch_compat_syscall_addr(int nr);
> > +#endif
> >  extern int init_syscall_trace(struct ftrace_event_call *call);
> >  
> >  extern int reg_event_syscall_enter(struct ftrace_event_call *call);
> > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > index 01ce088..53ace4b 100644
> > --- a/kernel/trace/trace.h
> > +++ b/kernel/trace/trace.h
> > @@ -79,12 +79,14 @@ enum trace_type {
> >  struct syscall_trace_enter {
> >  	struct trace_entry	ent;
> >  	int			nr;
> > +	int			compat;
> 
> You're adding 4 bytes to a trace (taking up precious buffer space) for a
> single flag. If anything, set the 31st bit of nr if it is compat.



That too can be dropped I think. compat syscalls and normal syscalls are
different events, so the difference can be made in the event id or name.

^ permalink raw reply

* Re: [PATCH 39/40] trace syscalls: Clean confusing {s,r,}name and fix ABI breakage
From: Jason Baron @ 2010-06-23 18:03 UTC (permalink / raw)
  To: Ian Munsie
  Cc: Jesper Nilsson, Ingo Molnar, linux-kernel, Steven Rostedt,
	linuxppc-dev, Ingo Molnar, Paul Mackerras, Frederic Weisbecker,
	Andrew Morton, Christoph Hellwig
In-Reply-To: <1277287401-28571-40-git-send-email-imunsie@au1.ibm.com>

On Wed, Jun 23, 2010 at 08:03:20PM +1000, Ian Munsie wrote:
> From: Ian Munsie <imunsie@au1.ibm.com>
> 
> This patch cleans up the preprocessor macros defining system calls by
> standidising on the parameters passing around the system call name, with
> and without it's prefix.
> 
> Overall this makes the preprocessor code easier to understand and
> follow and less likely to introduce bugs due to misunderstanding what to
> place into each parameter.
> 
> The parameters henceforth should be named:
> 
> sname is the system call name WITHOUT the prefix (e.g. read)
> prefix is the prefix INCLUDING the trailing underscore (e.g. sys_)
> 
> These are mashed together to form the full syscall name when required
> like blah##prefix##sname. For just prefix##sname please use the provided
> SYSNAME(prefix,sname) macro instead as it will be safe if prefix itself
> is a macro.
> 
> This patch also fixes an ABI breakage - the ftrace events are once again
> named like 'sys_enter_read' instead of 'enter_sys_read'. The rwtop
> script shipped with perf relies on this ABI. Others may as well.
> 
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> ---

overall this patch is a major improvement! My question though is
about the naming of the compat syscalls in the context of events. I
believe this patch differeniates compat syscall event names as:
"sys32_enter_sname", and "compat_sys_enter_sname". I agree that we keep that
distinction for purposes of defining the actual syscall function. However,
we had discuessed previously about keeping the event name the same for
all compat syscalls. ie they are all called "compat_sys_sname" or some
such. Reason being you could just do "compat_*" to match all compat
events.

thanks,

-Jason

^ permalink raw reply

* Re: [PATCH 08/40] tracing: remove syscall bitmaps in preparation for compat support
From: Jason Baron @ 2010-06-23 19:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Lai Jiangshan, Frederic Weisbecker, linux-kernel, linuxppc-dev,
	Ingo Molnar, Paul Mackerras, Ian Munsie, Ingo Molnar,
	Masami Hiramatsu
In-Reply-To: <1277306204.9181.43.camel@gandalf.stny.rr.com>

On Wed, Jun 23, 2010 at 11:16:44AM -0400, Steven Rostedt wrote:
> On Wed, 2010-06-23 at 20:02 +1000, Ian Munsie wrote:
> > From: Jason Baron <jbaron@redhat.com>
> > 
> > In preparation for compat syscall tracing support, let's store the enabled
> > syscalls, with the struct syscall_metadata itself. That way we don't duplicate
> > enabled information when the compat table points to an entry in the regular
> > syscall table. Also, allows us to remove the bitmap data structures completely.
> > 
> > Signed-off-by: Jason Baron <jbaron@redhat.com>
> > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> > ---
> >  include/linux/syscalls.h      |    8 +++++++
> >  include/trace/syscall.h       |    4 +++
> >  kernel/trace/trace_syscalls.c |   42 +++++++++++++++++++---------------------
> >  3 files changed, 32 insertions(+), 22 deletions(-)
> > 
> > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> > index 86f082b..755d05b 100644
> > --- a/include/linux/syscalls.h
> > +++ b/include/linux/syscalls.h
> > @@ -163,6 +163,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> >  		.nb_args 	= nb,				\
> >  		.types		= types_##sname,		\
> >  		.args		= args_##sname,			\
> > +		.ftrace_enter	= 0,				\
> > +		.ftrace_exit	= 0,				\
> > +		.perf_enter	= 0,				\
> > +		.perf_exit	= 0,				\
> 
> I really hate this change!
> 
> You just removed a nice compressed bitmap (1 bit per syscall) to add 4
> bytes per syscall. On my box I have 308 syscalls being traced. That was
> 308 bits per bitmask = 39 bytes * 2 = 78 * 2 (perf and ftrace) = 156.
> 
> Now we have 8 bytes per syscall (enter and exit), which is 1232 bytes.
> 
> Thus this change added 1076 bytes.
> 
> This may not seem as much, but the change is not worth 1K. Can't we just
> add another bitmask or something for the compat case?
> 
> I also hate the moving of ftrace and perf internal data to an external
> interface.
> 
> -- Steve
> 

I made this change (I also wrote the original bitmap), b/c compat
syscalls can share "regular" syscalls. That is the compat syscall table
points to syscalls from non-compat mode. (looking at ia32 on x86 it
looks like at least half).

Thus, if we continue along the bitmap path, we would have to introduce
another 4 bitmaps for compat. 2 for enter and exit and 2 for perf and
ftrace. Thus, using your math above: 39 bytes * 8 = 312 bytes. So
approximately 1 byte per system call.

Instead, if we store this data in the syscall metadata, we actually only
need 4 bits per syscall. Now, the above implementation uses 4 chars,
where we really only need 1 char (or really 4 bits, which we could
eventually store in the last last bit of the four existing pointer
assuming they are 2 byte aligned for no increased storage space at all).
But even assuming we use 1 byte per system call we are going to have in
the worse case the above 312 bytes + (1 byte * # of non-shared compat
syscalls). So, yes we might need a little more storage in this scheme.
Another consideration too, is obviously the alignment of
syscall_metadata, since the extra 1 byte, might be more...

However, we don't have to compute the location of the bits in the
compat syscall map each time a tracing syscall is enable/disable. This
would be more expensive, especially if we don't store the compat syscall
number with each syscall meta data structure (which you have proposed
dropping). So with compat syscalls, we are setting two bit locations
with each enable/disable instead of 1 with this new scheme.

Also, I think the more important reason to store these bits in the
syscall meta data structure is simplicity. Not all arches start their tables
counting from 0 (requiring a constant shift factor), and obviously we
waste bits for non-implemented syscalls. I don't want to have to deal
with these arch specific implementation issues, if I don't need to.

thanks,

-Jason

^ permalink raw reply

* Re: [PATCH 12/40] x86, compat: convert ia32 layer to use
From: H. Peter Anvin @ 2010-06-23 19:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Andrew Morton, Jason Baron, x86, linux-kernel, Steven Rostedt,
	linuxppc-dev, Ingo Molnar, Paul Mackerras, Ian Munsie,
	Greg Ungerer, Russell King, Thomas Gleixner, Christoph Hellwig
In-Reply-To: <20100623114116.GE5242@nowhere>

On 06/23/2010 04:41 AM, Frederic Weisbecker wrote:
> On Wed, Jun 23, 2010 at 12:46:04PM +0200, Christoph Hellwig wrote:
>> On Wed, Jun 23, 2010 at 12:36:21PM +0200, Frederic Weisbecker wrote:
>>> I think we wanted that to keep the sys32_ prefixed based naming, to avoid
>>> collisions with generic compat handler names.
>>
>> For native syscalls we do this by adding a arch prefix inside the
>> syscall name, e.g.:
>>
>> arch/s390/kernel/sys_s390.c:SYSCALL_DEFINE(s390_fallocate)(int fd, int mode, loff_t offset,
>> arch/sparc/kernel/sys_sparc_64.c:SYSCALL_DEFINE1(sparc_pipe_real, struct pt_regs *, regs)
> 
> In fact we sort of wanted to standardize the name of arch overriden compat
> syscalls, so that userspace programs playing with syscalls tracing won't have
> to deal with arch naming differences.
> 

That seems totally wrong in so many ways.

What userspace sees is the system call name, e.g. fallocate or pipe.  It
should *not* be visible to userspace that there is an arch-specici
implementation.

There are already a huge amount of gratuitous user space ABI
differences, of course, partly because we *still* don't actually have a
systematic model for argument passing.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

^ permalink raw reply

* Re: [PATCH 08/40] tracing: remove syscall bitmaps in preparation for compat support
From: Jason Baron @ 2010-06-23 19:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Lai Jiangshan, Frederic Weisbecker, linux-kernel, linuxppc-dev,
	Ingo Molnar, Paul Mackerras, Ian Munsie, Ingo Molnar,
	Masami Hiramatsu
In-Reply-To: <20100623191454.GC2680@redhat.com>

On Wed, Jun 23, 2010 at 03:14:54PM -0400, Jason Baron wrote:
> On Wed, Jun 23, 2010 at 11:16:44AM -0400, Steven Rostedt wrote:
> > On Wed, 2010-06-23 at 20:02 +1000, Ian Munsie wrote:
> > > From: Jason Baron <jbaron@redhat.com>
> > > 
> > > In preparation for compat syscall tracing support, let's store the enabled
> > > syscalls, with the struct syscall_metadata itself. That way we don't duplicate
> > > enabled information when the compat table points to an entry in the regular
> > > syscall table. Also, allows us to remove the bitmap data structures completely.
> > > 
> > > Signed-off-by: Jason Baron <jbaron@redhat.com>
> > > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> > > ---
> > >  include/linux/syscalls.h      |    8 +++++++
> > >  include/trace/syscall.h       |    4 +++
> > >  kernel/trace/trace_syscalls.c |   42 +++++++++++++++++++---------------------
> > >  3 files changed, 32 insertions(+), 22 deletions(-)
> > > 
> > > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> > > index 86f082b..755d05b 100644
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -163,6 +163,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> > >  		.nb_args 	= nb,				\
> > >  		.types		= types_##sname,		\
> > >  		.args		= args_##sname,			\
> > > +		.ftrace_enter	= 0,				\
> > > +		.ftrace_exit	= 0,				\
> > > +		.perf_enter	= 0,				\
> > > +		.perf_exit	= 0,				\
> > 
> > I really hate this change!
> > 
> > You just removed a nice compressed bitmap (1 bit per syscall) to add 4
> > bytes per syscall. On my box I have 308 syscalls being traced. That was
> > 308 bits per bitmask = 39 bytes * 2 = 78 * 2 (perf and ftrace) = 156.
> > 
> > Now we have 8 bytes per syscall (enter and exit), which is 1232 bytes.
> > 
> > Thus this change added 1076 bytes.
> > 
> > This may not seem as much, but the change is not worth 1K. Can't we just
> > add another bitmask or something for the compat case?
> > 
> > I also hate the moving of ftrace and perf internal data to an external
> > interface.
> > 
> > -- Steve
> > 
> 
> I made this change (I also wrote the original bitmap), b/c compat
> syscalls can share "regular" syscalls. That is the compat syscall table
> points to syscalls from non-compat mode. (looking at ia32 on x86 it
> looks like at least half).
> 
> Thus, if we continue along the bitmap path, we would have to introduce
> another 4 bitmaps for compat. 2 for enter and exit and 2 for perf and
> ftrace. Thus, using your math above: 39 bytes * 8 = 312 bytes. So
> approximately 1 byte per system call.
> 
> Instead, if we store this data in the syscall metadata, we actually only
> need 4 bits per syscall. Now, the above implementation uses 4 chars,
> where we really only need 1 char (or really 4 bits, which we could
> eventually store in the last last bit of the four existing pointer
> assuming they are 2 byte aligned for no increased storage space at all).
> But even assuming we use 1 byte per system call we are going to have in
> the worse case the above 312 bytes + (1 byte * # of non-shared compat
> syscalls). So, yes we might need a little more storage in this scheme.
> Another consideration too, is obviously the alignment of
> syscall_metadata, since the extra 1 byte, might be more...
> 
> However, we don't have to compute the location of the bits in the
> compat syscall map each time a tracing syscall is enable/disable. This
> would be more expensive, especially if we don't store the compat syscall
> number with each syscall meta data structure (which you have proposed
> dropping). So with compat syscalls, we are setting two bit locations
> with each enable/disable instead of 1 with this new scheme.
> 
> Also, I think the more important reason to store these bits in the
> syscall meta data structure is simplicity. Not all arches start their tables
> counting from 0 (requiring a constant shift factor), and obviously we
> waste bits for non-implemented syscalls. I don't want to have to deal
> with these arch specific implementation issues, if I don't need to.
> 
> thanks,
> 
> -Jason
> 

Actually, looking at this further, what we probably want to do change
the "int nb_args" field, which is already in syscall_metadata into a bit
field. nb_args I think can be at most 6, or 3 bits, and we only need 4
bits for storing the enabled/disabled data, so we could even make it a
char. Thus, actually saving space with this patch :) (at least as far as
the syscall_metadata field is concerned).

thanks,

-Jason

^ permalink raw reply

* Re: [PATCH 08/40] tracing: remove syscall bitmaps in preparation for compat support
From: Steven Rostedt @ 2010-06-23 19:45 UTC (permalink / raw)
  To: Jason Baron
  Cc: Lai Jiangshan, Frederic Weisbecker, linux-kernel, linuxppc-dev,
	Ingo Molnar, Paul Mackerras, Ian Munsie, Ingo Molnar,
	Masami Hiramatsu
In-Reply-To: <20100623193455.GE2680@redhat.com>

On Wed, 2010-06-23 at 15:34 -0400, Jason Baron wrote:

> Actually, looking at this further, what we probably want to do change
> the "int nb_args" field, which is already in syscall_metadata into a bit
> field. nb_args I think can be at most 6, or 3 bits, and we only need 4
> bits for storing the enabled/disabled data, so we could even make it a
> char. Thus, actually saving space with this patch :) (at least as far as
> the syscall_metadata field is concerned).

Yeah, I'm fine with turning that into a count/flags field.

-- Steve

^ permalink raw reply

* Re: [PATCH 16/40] tags: recognize compat syscalls
From: Michal Marek @ 2010-06-24 12:02 UTC (permalink / raw)
  To: Ian Munsie
  Cc: Frederic Weisbecker, Jason Baron, linux-kernel, Steven Rostedt,
	linuxppc-dev, Ingo Molnar, Paul Mackerras, John Kacur, WANG Cong,
	Andrew Morton, Stefani Seibold
In-Reply-To: <1277287401-28571-17-git-send-email-imunsie@au1.ibm.com>

On 23.6.2010 12:02, Ian Munsie wrote:
> From: Jason Baron <jbaron@redhat.com>
> 
> make tags.sh recognize the new syscall macros:
> 
> COMPAT_SYSCALL_DEFINE#N()
> ARCH_COMPAT_SYSCALL_DEFINE#N()
> 
> Signed-off-by: Jason Baron <jbaron@redhat.com>
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>

Acked-by: Michal Marek <mmarek@suse.cz>

Michal

^ permalink raw reply

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Michal Marek @ 2010-06-24 12:05 UTC (permalink / raw)
  To: Andi Kleen
  Cc: WANG Cong, Frederic Weisbecker, Oleg Nesterov, linuxppc-dev,
	Paul Mackerras, Jeff Moyer, Ingo Molnar, Ingo Molnar,
	Arnd Bergmann, "Sam Ravnborg", Steven Rostedt, Ian Munsie,
	Thomas Gleixner, Jason Baron, Greg Kroah-Hartman, linux-kernel,
	Eric Biederman, Simon Kagstrom, Andrew Morton, David S. Miller,
	Alexander Viro
In-Reply-To: <4C21E3F8.9000405@linux.intel.com>

On 23.6.2010 12:37, Andi Kleen wrote:
> It also has maintenance costs, e.g. I doubt ctags and cscope
> will be able to deal with these kinds of macros, so it has a
> high cost for everyone using these tools.

FWIW, patch 16/40 of this series teaches 'make tags' to recognize these
macros: http://permalink.gmane.org/gmane.linux.kernel/1002103

Michal

^ permalink raw reply

* Continuing UCC UART woes
From: Gary Thomas @ 2010-06-24 12:54 UTC (permalink / raw)
  To: linuxppc-dev

I thought I had this working, but it seems to only work for UCC3.
Sadly, I can't get it to work on UCC4/UCC5/UCC8 at all.

Starting with UCC4, I have:
             /* ttyQE0 (UCC4) */
             serial_qe0: serial@3200 {
                 device_type = "serial";
                 compatible = "ucc_uart";
                 cell-index = <4>;
                 reg = <0x3200 0x200>;
                 interrupts = <35>;
                 interrupt-parent = <&qeic>;
                 port-number = <0>;
                 rx-clock-name = "brg9";
                 tx-clock-name = "brg10";
                 soft-uart;
             };

With this setup, I can receive characters from the device, but
no characters ever go out.  Same behaviour as before - the output
descriptors get filled, but no action seems to be taken, no interrupts,
etc.  Except randomly there will be output!  For example, if I
connect to the port with minicom, what I type is received by /dev/ttyQE2.
On some rare occasions, what I type is also echoed back, but when
this happens, it's only the first couple of characters, then nothing.

I verified that the CMXUCR registers are being set per the DTS
as well as the BRG registers.  They all look correct:

   ucc_set_qe_mux_rxtx - CMX: 0x000a0000

   root@ppc_target:~ stty </dev/ttyQE0 38400
   qe_setbrg - BRG: fddfd660 = 0x000106b6, CLK = 132000000, RATE = 9600, MULTIPLIER = 16
   qe_setbrg - BRG: fddfd664 = 0x000106b5, CLK = 132000000, RATE = 9600, MULTIPLIER = 1
   qe_setbrg - BRG: fddfd660 = 0x000101aa, CLK = 132000000, RATE = 38400, MULTIPLIER = 16
   qe_setbrg - BRG: fddfd664 = 0x00011ada, CLK = 132000000, RATE = 38400, MULTIPLIER = 1

I've also tried this on UCC5 & UCC8 which show the identical
behaviour - input works, output does not.  I've tried running
with only a single soft UART (UCC4) as well as having all three
configured.  No changes seen.

At this point, I'm using the stock driver from 2.6.33.3

One thing I noticed is that the firmware patch seems quite old.
I got the firmware package from http://opensource.freescale.com/firmware/
We were also told (by FreeScale) to look at https://www.freescale.com/webapp/Download?colCode=QERAMPKG

Looking at these two packages, it's unclear that they match.  Certainly
the dates are very different:

[gthomas@hermes 8358]$ ls -l fsl_qe_ucode QERAMPKG
fsl_qe_ucode:
total 16
-rw-rw-r-- 1 gthomas gthomas 5940 2007-12-10 14:39 fsl_qe_ucode_uart_8360_21.bin
-rw-r--r-- 1 gthomas gthomas 7892 2007-11-30 10:14 license.txt

QERAMPKG:
total 972
-rw-rw-r-- 1 gthomas gthomas 132915 2009-04-07 14:04 SlowProtocols_8323rev11.c
-rw-rw-r-- 1 gthomas gthomas 455446 2009-09-16 15:44 Soft_UART_Microcode_Rel_0_1_2.pdf
-rw-rw-r-- 1 gthomas gthomas  29379 2009-09-16 15:49 Soft_UART_mpc8360_r2.0.h
-rw-rw-r-- 1 gthomas gthomas  29379 2009-09-16 15:14 Soft_UART_mpc8360_r2.1.h
-rw-rw-r-- 1 gthomas gthomas  29379 2009-09-16 15:14 Soft_UART_mpc8568_r1.1.h
-rw-rw-r-- 1 gthomas gthomas 105457 2009-09-16 16:00 SWUART_8360rev20.c
-rw-rw-r-- 1 gthomas gthomas  34689 2009-09-16 15:32 SWUART_8360rev20.srx
-rw-rw-r-- 1 gthomas gthomas 105318 2009-09-16 15:59 SWUART_8360rev21.c
-rw-rw-r-- 1 gthomas gthomas  34689 2009-09-16 15:14 SWUART_8360rev21.srx

Any ideas what I'm doing wrong?

Thanks

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------

^ permalink raw reply

* [PATCH v1]460EX on-chip SATA driver<resubmisison>
From: Rupjyoti Sarmah @ 2010-06-24 13:27 UTC (permalink / raw)
  To: linux-ide, linux-kernel; +Cc: jgarzik, sr, rsarmah, linuxppc-dev

This patch enables the on-chip DWC SATA controller of the AppliedMicro processor 460EX.

Signed-off-by: Rupjyoti Sarmah <rsarmah@appliedmicro.com> 
Signed-off-by: Mark Miesfeld <mmiesfeld@appliedmicro.com>
Signed-off-by: Prodyut Hazarika <phazarika@appliedmicro.com>

---
This patch incorporates the changes advised in the mailing list. The device
tree changes were submitted as a seperate patch. 

 drivers/ata/Kconfig          |    9 +
 drivers/ata/Makefile         |    1 +
 drivers/ata/sata_dwc_460ex.c | 1756 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1766 insertions(+), 0 deletions(-)
 create mode 100644 drivers/ata/sata_dwc_460ex.c

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index aa85a98..29e29a2 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -98,6 +98,15 @@ config SATA_SIL24
 
 	  If unsure, say N.
 
+config SATA_DWC
+	tristate "DesignWare Cores SATA support"
+	depends on 460EX
+	help
+	  This option enables support for the on-chip SATA controller of the
+	  AppliedMicro processor 460EX.
+
+	  If unsure, say N.
+
 config ATA_SFF
 	bool "ATA SFF support"
 	default y
diff --git a/drivers/ata/Makefile b/drivers/ata/Makefile
index 7ef89d7..d863e66 100644
--- a/drivers/ata/Makefile
+++ b/drivers/ata/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_SATA_AHCI_PLATFORM) += ahci_platform.o libahci.o
 obj-$(CONFIG_SATA_FSL)		+= sata_fsl.o
 obj-$(CONFIG_SATA_INIC162X)	+= sata_inic162x.o
 obj-$(CONFIG_SATA_SIL24)	+= sata_sil24.o
+obj-$(CONFIG_SATA_DWC)		+= sata_dwc_460ex.o
 
 # SFF w/ custom DMA
 obj-$(CONFIG_PDC_ADMA)		+= pdc_adma.o
diff --git a/drivers/ata/sata_dwc_460ex.c b/drivers/ata/sata_dwc_460ex.c
new file mode 100644
index 0000000..ea24c1e
--- /dev/null
+++ b/drivers/ata/sata_dwc_460ex.c
@@ -0,0 +1,1756 @@
+/*
+ * drivers/ata/sata_dwc_460ex.c
+ *
+ * Synopsys DesignWare Cores (DWC) SATA host driver
+ *
+ * Author: Mark Miesfeld <mmiesfeld@amcc.com>
+ *
+ * Ported from 2.6.19.2 to 2.6.25/26 by Stefan Roese <sr@denx.de>
+ * Copyright 2008 DENX Software Engineering
+ *
+ * Based on versions provided by AMCC and Synopsys which are:
+ *          Copyright 2006 Applied Micro Circuits Corporation
+ *          COPYRIGHT (C) 2005  SYNOPSYS, INC.  ALL RIGHTS RESERVED
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#ifdef CONFIG_SATA_DWC_DEBUG
+#define DEBUG
+#endif
+
+#ifdef CONFIG_SATA_DWC_VDEBUG
+#define VERBOSE_DEBUG
+#define DEBUG_NCQ
+#endif
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/device.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/libata.h>
+#include <linux/slab.h>
+#include "libata.h"
+
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_cmnd.h>
+
+#define DRV_NAME        "sata-dwc"
+#define DRV_VERSION     "1.0"
+
+/* SATA DMA driver Globals */
+#define DMA_NUM_CHANS		1
+#define DMA_NUM_CHAN_REGS	8
+
+/* SATA DMA Register definitions */
+#define AHB_DMA_BRST_DFLT	64	/* 16 data items burst length*/
+
+struct dmareg {
+	u32 low;		/* Low bits 0-31 */
+	u32 high;		/* High bits 32-63 */
+};
+
+/* DMA Per Channel registers */
+struct dma_chan_regs {
+	struct dmareg sar;	/* Source Address */
+	struct dmareg dar;	/* Destination address */
+	struct dmareg llp;	/* Linked List Pointer */
+	struct dmareg ctl;	/* Control */
+	struct dmareg sstat;	/* Source Status not implemented in core */
+	struct dmareg dstat;	/* Destination Status not implemented in core*/
+	struct dmareg sstatar;	/* Source Status Address not impl in core */
+	struct dmareg dstatar;	/* Destination Status Address not implemente */
+	struct dmareg cfg;	/* Config */
+	struct dmareg sgr;	/* Source Gather */
+	struct dmareg dsr;	/* Destination Scatter */
+};
+
+/* Generic Interrupt Registers */
+struct dma_interrupt_regs {
+	struct dmareg tfr;	/* Transfer Interrupt */
+	struct dmareg block;	/* Block Interrupt */
+	struct dmareg srctran;	/* Source Transfer Interrupt */
+	struct dmareg dsttran;	/* Dest Transfer Interrupt */
+	struct dmareg error;	/* Error */
+};
+
+struct ahb_dma_regs {
+	struct dma_chan_regs	chan_regs[DMA_NUM_CHAN_REGS];
+	struct dma_interrupt_regs interrupt_raw;	/* Raw Interrupt */
+	struct dma_interrupt_regs interrupt_status;	/* Interrupt Status */
+	struct dma_interrupt_regs interrupt_mask;	/* Interrupt Mask */
+	struct dma_interrupt_regs interrupt_clear;	/* Interrupt Clear */
+	struct dmareg		statusInt;	/* Interrupt combined*/
+	struct dmareg		rq_srcreg;	/* Src Trans Req */
+	struct dmareg		rq_dstreg;	/* Dst Trans Req */
+	struct dmareg		rq_sgl_srcreg;	/* Sngl Src Trans Req*/
+	struct dmareg		rq_sgl_dstreg;	/* Sngl Dst Trans Req*/
+	struct dmareg		rq_lst_srcreg;	/* Last Src Trans Req*/
+	struct dmareg		rq_lst_dstreg;	/* Last Dst Trans Req*/
+	struct dmareg		dma_cfg;		/* DMA Config */
+	struct dmareg		dma_chan_en;		/* DMA Channel Enable*/
+	struct dmareg		dma_id;			/* DMA ID */
+	struct dmareg		dma_test;		/* DMA Test */
+	struct dmareg		res1;			/* reserved */
+	struct dmareg		res2;			/* reserved */
+	/*
+	 * DMA Comp Params
+	 * Param 6 = dma_param[0], Param 5 = dma_param[1],
+	 * Param 4 = dma_param[2] ...
+	 */
+	struct dmareg		dma_params[6];
+};
+
+/* Data structure for linked list item */
+struct lli {
+	u32		sar;		/* Source Address */
+	u32		dar;		/* Destination address */
+	u32		llp;		/* Linked List Pointer */
+	struct dmareg	ctl;		/* Control */
+	struct dmareg	dstat;		/* Destination Status */
+};
+
+enum {
+	SATA_DWC_DMAC_LLI_SZ =	(sizeof(struct lli)),
+	SATA_DWC_DMAC_LLI_NUM =	256,
+	SATA_DWC_DMAC_LLI_TBL_SZ = (SATA_DWC_DMAC_LLI_SZ * \
+					SATA_DWC_DMAC_LLI_NUM),
+	SATA_DWC_DMAC_TWIDTH_BYTES = 4,
+	SATA_DWC_DMAC_CTRL_TSIZE_MAX = (0x00000800 * \
+						SATA_DWC_DMAC_TWIDTH_BYTES),
+};
+
+/* DMA Register Operation Bits */
+enum {
+	DMA_EN	=		0x00000001, /* Enable AHB DMA */
+	DMA_CTL_LLP_SRCEN =	0x10000000, /* Blk chain enable Src */
+	DMA_CTL_LLP_DSTEN =	0x08000000, /* Blk chain enable Dst */
+};
+
+#define	DMA_CTL_BLK_TS(size)	((size) & 0x000000FFF)	/* Blk Transfer size */
+#define DMA_CHANNEL(ch)		(0x00000001 << (ch))	/* Select channel */
+	/* Enable channel */
+#define	DMA_ENABLE_CHAN(ch)	((0x00000001 << (ch)) |			\
+				 ((0x000000001 << (ch)) << 8))
+	/* Disable channel */
+#define	DMA_DISABLE_CHAN(ch)	(0x00000000 | ((0x000000001 << (ch)) << 8))
+	/* Transfer Type & Flow Controller */
+#define	DMA_CTL_TTFC(type)	(((type) & 0x7) << 20)
+#define	DMA_CTL_SMS(num)	(((num) & 0x3) << 25) /* Src Master Select */
+#define	DMA_CTL_DMS(num)	(((num) & 0x3) << 23)/* Dst Master Select */
+	/* Src Burst Transaction Length */
+#define DMA_CTL_SRC_MSIZE(size) (((size) & 0x7) << 14)
+	/* Dst Burst Transaction Length */
+#define	DMA_CTL_DST_MSIZE(size) (((size) & 0x7) << 11)
+	/* Source Transfer Width */
+#define	DMA_CTL_SRC_TRWID(size) (((size) & 0x7) << 4)
+	/* Destination Transfer Width */
+#define	DMA_CTL_DST_TRWID(size) (((size) & 0x7) << 1)
+
+/* Assign HW handshaking interface (x) to destination / source peripheral */
+#define	DMA_CFG_HW_HS_DEST(int_num) (((int_num) & 0xF) << 11)
+#define	DMA_CFG_HW_HS_SRC(int_num) (((int_num) & 0xF) << 7)
+#define	DMA_LLP_LMS(addr, master) (((addr) & 0xfffffffc) | (master))
+
+/*
+ * This define is used to set block chaining disabled in the control low
+ * register.  It is already in little endian format so it can be &'d dirctly.
+ * It is essentially: cpu_to_le32(~(DMA_CTL_LLP_SRCEN | DMA_CTL_LLP_DSTEN))
+ */
+enum {
+	DMA_CTL_LLP_DISABLE_LE32 = 0xffffffe7,
+	DMA_CTL_TTFC_P2M_DMAC =	0x00000002, /* Per to mem, DMAC cntr */
+	DMA_CTL_TTFC_M2P_PER =	0x00000003, /* Mem to per, peripheral cntr */
+	DMA_CTL_SINC_INC =	0x00000000, /* Source Address Increment */
+	DMA_CTL_SINC_DEC =	0x00000200,
+	DMA_CTL_SINC_NOCHANGE =	0x00000400,
+	DMA_CTL_DINC_INC =	0x00000000, /* Destination Address Increment */
+	DMA_CTL_DINC_DEC =	0x00000080,
+	DMA_CTL_DINC_NOCHANGE =	0x00000100,
+	DMA_CTL_INT_EN =	0x00000001, /* Interrupt Enable */
+
+/* Channel Configuration Register high bits */
+	DMA_CFG_FCMOD_REQ =	0x00000001, /* Flow Control - request based */
+	DMA_CFG_PROTCTL	=	(0x00000003 << 2),/* Protection Control */
+
+/* Channel Configuration Register low bits */
+	DMA_CFG_RELD_DST =	0x80000000, /* Reload Dest / Src Addr */
+	DMA_CFG_RELD_SRC =	0x40000000,
+	DMA_CFG_HS_SELSRC =	0x00000800, /* Software handshake Src/ Dest */
+	DMA_CFG_HS_SELDST =	0x00000400,
+	DMA_CFG_FIFOEMPTY =     (0x00000001 << 9), /* FIFO Empty bit */
+
+/* Channel Linked List Pointer Register */
+	DMA_LLP_AHBMASTER1 =	0,	/* List Master Select */
+	DMA_LLP_AHBMASTER2 =	1,
+
+	SATA_DWC_MAX_PORTS = 1,
+
+	SATA_DWC_SCR_OFFSET = 0x24,
+	SATA_DWC_REG_OFFSET = 0x64,
+};
+
+/* DWC SATA Registers */
+struct sata_dwc_regs {
+	u32 fptagr;		/* 1st party DMA tag */
+	u32 fpbor;		/* 1st party DMA buffer offset */
+	u32 fptcr;		/* 1st party DMA Xfr count */
+	u32 dmacr;		/* DMA Control */
+	u32 dbtsr;		/* DMA Burst Transac size */
+	u32 intpr;		/* Interrupt Pending */
+	u32 intmr;		/* Interrupt Mask */
+	u32 errmr;		/* Error Mask */
+	u32 llcr;		/* Link Layer Control */
+	u32 phycr;		/* PHY Control */
+	u32 physr;		/* PHY Status */
+	u32 rxbistpd;		/* Recvd BIST pattern def register */
+	u32 rxbistpd1;		/* Recvd BIST data dword1 */
+	u32 rxbistpd2;		/* Recvd BIST pattern data dword2 */
+	u32 txbistpd;		/* Trans BIST pattern def register */
+	u32 txbistpd1;		/* Trans BIST data dword1 */
+	u32 txbistpd2;		/* Trans BIST data dword2 */
+	u32 bistcr;		/* BIST Control Register */
+	u32 bistfctr;		/* BIST FIS Count Register */
+	u32 bistsr;		/* BIST Status Register */
+	u32 bistdecr;		/* BIST Dword Error count register */
+	u32 res[15];		/* Reserved locations */
+	u32 testr;		/* Test Register */
+	u32 versionr;		/* Version Register */
+	u32 idr;		/* ID Register */
+	u32 unimpl[192];	/* Unimplemented */
+	u32 dmadr[256];	/* FIFO Locations in DMA Mode */
+};
+
+enum {
+	SCR_SCONTROL_DET_ENABLE	=	0x00000001,
+	SCR_SSTATUS_DET_PRESENT	=	0x00000001,
+	SCR_SERROR_DIAG_X	=	0x04000000,
+/* DWC SATA Register Operations */
+	SATA_DWC_TXFIFO_DEPTH	=	0x01FF,
+	SATA_DWC_RXFIFO_DEPTH	=	0x01FF,
+	SATA_DWC_DMACR_TMOD_TXCHEN =	0x00000004,
+	SATA_DWC_DMACR_TXCHEN	= (0x00000001 | SATA_DWC_DMACR_TMOD_TXCHEN),
+	SATA_DWC_DMACR_RXCHEN	= (0x00000002 | SATA_DWC_DMACR_TMOD_TXCHEN),
+	SATA_DWC_DMACR_TXRXCH_CLEAR =	SATA_DWC_DMACR_TMOD_TXCHEN,
+	SATA_DWC_INTPR_DMAT	=	0x00000001,
+	SATA_DWC_INTPR_NEWFP	=	0x00000002,
+	SATA_DWC_INTPR_PMABRT	=	0x00000004,
+	SATA_DWC_INTPR_ERR	=	0x00000008,
+	SATA_DWC_INTPR_NEWBIST	=	0x00000010,
+	SATA_DWC_INTPR_IPF	=	0x10000000,
+	SATA_DWC_INTMR_DMATM	=	0x00000001,
+	SATA_DWC_INTMR_NEWFPM	=	0x00000002,
+	SATA_DWC_INTMR_PMABRTM	=	0x00000004,
+	SATA_DWC_INTMR_ERRM	=	0x00000008,
+	SATA_DWC_INTMR_NEWBISTM	=	0x00000010,
+	SATA_DWC_LLCR_SCRAMEN	=	0x00000001,
+	SATA_DWC_LLCR_DESCRAMEN	=	0x00000002,
+	SATA_DWC_LLCR_RPDEN	=	0x00000004,
+/* This is all error bits, zero's are reserved fields. */
+	SATA_DWC_SERROR_ERR_BITS =	0x0FFF0F03
+};
+
+#define SATA_DWC_SCR0_SPD_GET(v)	(((v) >> 4) & 0x0000000F)
+#define SATA_DWC_DMACR_TX_CLEAR(v)	(((v) & ~SATA_DWC_DMACR_TXCHEN) |\
+						 SATA_DWC_DMACR_TMOD_TXCHEN)
+#define SATA_DWC_DMACR_RX_CLEAR(v)	(((v) & ~SATA_DWC_DMACR_RXCHEN) |\
+						 SATA_DWC_DMACR_TMOD_TXCHEN)
+#define SATA_DWC_DBTSR_MWR(size)	(((size)/4) & SATA_DWC_TXFIFO_DEPTH)
+#define SATA_DWC_DBTSR_MRD(size)	((((size)/4) & SATA_DWC_RXFIFO_DEPTH)\
+						 << 16)
+struct sata_dwc_device {
+	struct device		*dev;		/* generic device struct */
+	struct ata_probe_ent	*pe;		/* ptr to probe-ent */
+	struct ata_host		*host;
+	u8			*reg_base;
+	struct sata_dwc_regs	*sata_dwc_regs;	/* DW Synopsys SATA specific */
+	int			irq_dma;
+};
+
+#define SATA_DWC_QCMD_MAX	32
+
+struct sata_dwc_device_port {
+	struct sata_dwc_device	*hsdev;
+	int			cmd_issued[SATA_DWC_QCMD_MAX];
+	struct lli		*llit[SATA_DWC_QCMD_MAX];  /* DMA LLI table */
+	dma_addr_t		llit_dma[SATA_DWC_QCMD_MAX];
+	u32			dma_chan[SATA_DWC_QCMD_MAX];
+	int			dma_pending[SATA_DWC_QCMD_MAX];
+};
+
+/*
+ * Commonly used DWC SATA driver Macros
+ */
+#define HSDEV_FROM_HOST(host)  ((struct sata_dwc_device *)\
+					(host)->private_data)
+#define HSDEV_FROM_AP(ap)  ((struct sata_dwc_device *)\
+					(ap)->host->private_data)
+#define HSDEVP_FROM_AP(ap)   ((struct sata_dwc_device_port *)\
+					(ap)->private_data)
+#define HSDEV_FROM_QC(qc)	((struct sata_dwc_device *)\
+					(qc)->ap->host->private_data)
+#define HSDEV_FROM_HSDEVP(p)	((struct sata_dwc_device *)\
+						(hsdevp)->hsdev)
+
+enum {
+	SATA_DWC_CMD_ISSUED_NOT		= 0,
+	SATA_DWC_CMD_ISSUED_PEND	= 1,
+	SATA_DWC_CMD_ISSUED_EXEC	= 2,
+	SATA_DWC_CMD_ISSUED_NODATA	= 3,
+
+	SATA_DWC_DMA_PENDING_NONE	= 0,
+	SATA_DWC_DMA_PENDING_TX		= 1,
+	SATA_DWC_DMA_PENDING_RX		= 2,
+};
+
+struct sata_dwc_host_priv {
+	void	__iomem	 *scr_addr_sstatus;
+	u32	sata_dwc_sactive_issued ;
+	u32	sata_dwc_sactive_queued ;
+	u32	dma_interrupt_count;
+	struct	ahb_dma_regs	*sata_dma_regs;
+	struct	device	*dwc_dev;
+};
+struct sata_dwc_host_priv host_pvt;
+/*
+ * Prototypes
+ */
+static void sata_dwc_bmdma_start_by_tag(struct ata_queued_cmd *qc, u8 tag);
+static int sata_dwc_qc_complete(struct ata_port *ap, struct ata_queued_cmd *qc,
+				u32 check_status);
+static void sata_dwc_dma_xfer_complete(struct ata_port *ap, u32 check_status);
+static void sata_dwc_port_stop(struct ata_port *ap);
+static void sata_dwc_clear_dmacr(struct sata_dwc_device_port *hsdevp, u8 tag);
+static int dma_dwc_init(struct sata_dwc_device *hsdev, int irq);
+static void dma_dwc_exit(struct sata_dwc_device *hsdev);
+static int dma_dwc_xfer_setup(struct scatterlist *sg, int num_elems,
+			      struct lli *lli, dma_addr_t dma_lli,
+			      void __iomem *addr, int dir);
+static void dma_dwc_xfer_start(int dma_ch);
+
+static void sata_dwc_tf_dump(struct ata_taskfile *tf)
+{
+	dev_vdbg(host_pvt.dwc_dev, "taskfile cmd: 0x%02x protocol: %s flags:"
+		"0x%lx device: %x\n", tf->command, ata_get_cmd_descript\
+		(tf->protocol), tf->flags, tf->device);
+	dev_vdbg(host_pvt.dwc_dev, "feature: 0x%02x nsect: 0x%x lbal: 0x%x "
+		"lbam: 0x%x lbah: 0x%x\n", tf->feature, tf->nsect, tf->lbal,
+		 tf->lbam, tf->lbah);
+	dev_vdbg(host_pvt.dwc_dev, "hob_feature: 0x%02x hob_nsect: 0x%x "
+		"hob_lbal: 0x%x hob_lbam: 0x%x hob_lbah: 0x%x\n",
+		tf->hob_feature, tf->hob_nsect, tf->hob_lbal, tf->hob_lbam,
+		tf->hob_lbah);
+}
+
+/*
+ * Function: get_burst_length_encode
+ * arguments: datalength: length in bytes of data
+ * returns value to be programmed in register corrresponding to data length
+ * This value is effectively the log(base 2) of the length
+ */
+static  int get_burst_length_encode(int datalength)
+{
+	int items = datalength >> 2;	/* div by 4 to get lword count */
+
+	if (items >= 64)
+		return 5;
+
+	if (items >= 32)
+		return 4;
+
+	if (items >= 16)
+		return 3;
+
+	if (items >= 8)
+		return 2;
+
+	if (items >= 4)
+		return 1;
+
+	return 0;
+}
+
+static  void clear_chan_interrupts(int c)
+{
+	out_le32(&(host_pvt.sata_dma_regs->interrupt_clear.tfr.low),
+		 DMA_CHANNEL(c));
+	out_le32(&(host_pvt.sata_dma_regs->interrupt_clear.block.low),
+		 DMA_CHANNEL(c));
+	out_le32(&(host_pvt.sata_dma_regs->interrupt_clear.srctran.low),
+		 DMA_CHANNEL(c));
+	out_le32(&(host_pvt.sata_dma_regs->interrupt_clear.dsttran.low),
+		 DMA_CHANNEL(c));
+	out_le32(&(host_pvt.sata_dma_regs->interrupt_clear.error.low),
+		 DMA_CHANNEL(c));
+}
+
+/*
+ * Function: dma_request_channel
+ * arguments: None
+ * returns channel number if available else -1
+ * This function assigns the next available DMA channel from the list to the
+ * requester
+ */
+static int dma_request_channel(void)
+{
+	int i;
+
+	for (i = 0; i < DMA_NUM_CHANS; i++) {
+		if (!(in_le32(&(host_pvt.sata_dma_regs->dma_chan_en.low)) &\
+			DMA_CHANNEL(i)))
+			return i;
+	}
+	dev_err(host_pvt.dwc_dev, "%s NO channel chan_en: 0x%08x\n", __func__,
+		in_le32(&(host_pvt.sata_dma_regs->dma_chan_en.low)));
+	return -1;
+}
+
+/*
+ * Function: dma_dwc_interrupt
+ * arguments: irq, dev_id, pt_regs
+ * returns channel number if available else -1
+ * Interrupt Handler for DW AHB SATA DMA
+ */
+static irqreturn_t dma_dwc_interrupt(int irq, void *hsdev_instance)
+{
+	int chan;
+	u32 tfr_reg, err_reg;
+	unsigned long flags;
+	struct sata_dwc_device *hsdev =
+		(struct sata_dwc_device *)hsdev_instance;
+	struct ata_host *host = (struct ata_host *)hsdev->host;
+	struct ata_port *ap;
+	struct sata_dwc_device_port *hsdevp;
+	u8 tag = 0;
+	unsigned int port = 0;
+
+	spin_lock_irqsave(&host->lock, flags);
+	ap = host->ports[port];
+	hsdevp = HSDEVP_FROM_AP(ap);
+	tag = ap->link.active_tag;
+
+	tfr_reg = in_le32(&(host_pvt.sata_dma_regs->interrupt_status.tfr\
+			.low));
+	err_reg = in_le32(&(host_pvt.sata_dma_regs->interrupt_status.error\
+			.low));
+
+	dev_dbg(ap->dev, "eot=0x%08x err=0x%08x pending=%d active port=%d\n",
+		tfr_reg, err_reg, hsdevp->dma_pending[tag], port);
+
+	for (chan = 0; chan < DMA_NUM_CHANS; chan++) {
+		/* Check for end-of-transfer interrupt. */
+		if (tfr_reg & DMA_CHANNEL(chan)) {
+			/*
+			 * Each DMA command produces 2 interrupts.  Only
+			 * complete the command after both interrupts have been
+			 * seen. (See sata_dwc_isr())
+			 */
+			host_pvt.dma_interrupt_count++;
+			sata_dwc_clear_dmacr(hsdevp, tag);
+
+			if (hsdevp->dma_pending[tag] ==
+			    SATA_DWC_DMA_PENDING_NONE) {
+				dev_err(ap->dev, "DMA not pending eot=0x%08x "
+					"err=0x%08x tag=0x%02x pending=%d\n",
+					tfr_reg, err_reg, tag,
+					hsdevp->dma_pending[tag]);
+			}
+
+			if ((host_pvt.dma_interrupt_count % 2) == 0)
+				sata_dwc_dma_xfer_complete(ap, 1);
+
+			/* Clear the interrupt */
+			out_le32(&(host_pvt.sata_dma_regs->interrupt_clear\
+				.tfr.low),
+				 DMA_CHANNEL(chan));
+		}
+
+		/* Check for error interrupt. */
+		if (err_reg & DMA_CHANNEL(chan)) {
+			/* TODO Need error handler ! */
+			dev_err(ap->dev, "error interrupt err_reg=0x%08x\n",
+				err_reg);
+
+			/* Clear the interrupt. */
+			out_le32(&(host_pvt.sata_dma_regs->interrupt_clear\
+				.error.low),
+				 DMA_CHANNEL(chan));
+		}
+	}
+	spin_unlock_irqrestore(&host->lock, flags);
+	return IRQ_HANDLED;
+}
+
+/*
+ * Function: dma_request_interrupts
+ * arguments: hsdev
+ * returns status
+ * This function registers ISR for a particular DMA channel interrupt
+ */
+static int dma_request_interrupts(struct sata_dwc_device *hsdev, int irq)
+{
+	int retval = 0;
+	int chan;
+
+	for (chan = 0; chan < DMA_NUM_CHANS; chan++) {
+		/* Unmask error interrupt */
+		out_le32(&(host_pvt.sata_dma_regs)->interrupt_mask.error.low,
+			 DMA_ENABLE_CHAN(chan));
+
+		/* Unmask end-of-transfer interrupt */
+		out_le32(&(host_pvt.sata_dma_regs)->interrupt_mask.tfr.low,
+			 DMA_ENABLE_CHAN(chan));
+	}
+
+	retval = request_irq(irq, dma_dwc_interrupt, 0, "SATA DMA", hsdev);
+	if (retval) {
+		dev_err(host_pvt.dwc_dev, "%s: could not get IRQ %d\n",
+		__func__, irq);
+		return -ENODEV;
+	}
+
+	/* Mark this interrupt as requested */
+	hsdev->irq_dma = irq;
+	return 0;
+}
+
+/*
+ * Function: map_sg_to_lli
+ * The Synopsis driver has a comment proposing that better performance
+ * is possible by only enabling interrupts on the last item in the linked list.
+ * However, it seems that could be a problem if an error happened on one of the
+ * first items.  The transfer would halt, but no error interrupt would occur.
+ * Currently this function sets interrupts enabled for each linked list item:
+ * DMA_CTL_INT_EN.
+ */
+static int map_sg_to_lli(struct scatterlist *sg, int num_elems,
+			struct lli *lli, dma_addr_t dma_lli,
+			void __iomem *dmadr_addr, int dir)
+{
+	int i, idx = 0;
+	int fis_len = 0;
+	dma_addr_t next_llp;
+	int bl;
+
+	dev_dbg(host_pvt.dwc_dev, "%s: sg=%p nelem=%d lli=%p dma_lli=0x%08x"
+		" dmadr=0x%08x\n", __func__, sg, num_elems, lli, (u32)dma_lli,
+		(u32)dmadr_addr);
+
+	bl = get_burst_length_encode(AHB_DMA_BRST_DFLT);
+
+	for (i = 0; i < num_elems; i++, sg++) {
+		u32 addr, offset;
+		u32 sg_len, len;
+
+		addr = (u32) sg_dma_address(sg);
+		sg_len = sg_dma_len(sg);
+
+		dev_dbg(host_pvt.dwc_dev, "%s: elem=%d sg_addr=0x%x sg_len"
+			"=%d\n", __func__, i, addr, sg_len);
+
+		while (sg_len) {
+			if (idx >= SATA_DWC_DMAC_LLI_NUM) {
+				/* The LLI table is not large enough. */
+				dev_err(host_pvt.dwc_dev, "LLI table overrun "
+				"(idx=%d)\n", idx);
+				break;
+			}
+			len = (sg_len > SATA_DWC_DMAC_CTRL_TSIZE_MAX) ?
+				SATA_DWC_DMAC_CTRL_TSIZE_MAX : sg_len;
+
+			offset = addr & 0xffff;
+			if ((offset + sg_len) > 0x10000)
+				len = 0x10000 - offset;
+
+			/*
+			 * Make sure a LLI block is not created that will span
+			 * 8K max FIS boundary.  If the block spans such a FIS
+			 * boundary, there is a chance that a DMA burst will
+			 * cross that boundary -- this results in an error in
+			 * the host controller.
+			 */
+			if (fis_len + len > 8192) {
+				dev_dbg(host_pvt.dwc_dev, "SPLITTING: fis_len="
+					"%d(0x%x) len=%d(0x%x)\n", fis_len,
+					 fis_len, len, len);
+				len = 8192 - fis_len;
+				fis_len = 0;
+			} else {
+				fis_len += len;
+			}
+			if (fis_len == 8192)
+				fis_len = 0;
+
+			/*
+			 * Set DMA addresses and lower half of control register
+			 * based on direction.
+			 */
+			if (dir == DMA_FROM_DEVICE) {
+				lli[idx].dar = cpu_to_le32(addr);
+				lli[idx].sar = cpu_to_le32((u32)dmadr_addr);
+
+				lli[idx].ctl.low = cpu_to_le32(
+					DMA_CTL_TTFC(DMA_CTL_TTFC_P2M_DMAC) |
+					DMA_CTL_SMS(0) |
+					DMA_CTL_DMS(1) |
+					DMA_CTL_SRC_MSIZE(bl) |
+					DMA_CTL_DST_MSIZE(bl) |
+					DMA_CTL_SINC_NOCHANGE |
+					DMA_CTL_SRC_TRWID(2) |
+					DMA_CTL_DST_TRWID(2) |
+					DMA_CTL_INT_EN |
+					DMA_CTL_LLP_SRCEN |
+					DMA_CTL_LLP_DSTEN);
+			} else {	/* DMA_TO_DEVICE */
+				lli[idx].sar = cpu_to_le32(addr);
+				lli[idx].dar = cpu_to_le32((u32)dmadr_addr);
+
+				lli[idx].ctl.low = cpu_to_le32(
+					DMA_CTL_TTFC(DMA_CTL_TTFC_M2P_PER) |
+					DMA_CTL_SMS(1) |
+					DMA_CTL_DMS(0) |
+					DMA_CTL_SRC_MSIZE(bl) |
+					DMA_CTL_DST_MSIZE(bl) |
+					DMA_CTL_DINC_NOCHANGE |
+					DMA_CTL_SRC_TRWID(2) |
+					DMA_CTL_DST_TRWID(2) |
+					DMA_CTL_INT_EN |
+					DMA_CTL_LLP_SRCEN |
+					DMA_CTL_LLP_DSTEN);
+			}
+
+			dev_dbg(host_pvt.dwc_dev, "%s setting ctl.high len: "
+				"0x%08x val: 0x%08x\n", __func__,
+				len, DMA_CTL_BLK_TS(len / 4));
+
+			/* Program the LLI CTL high register */
+			lli[idx].ctl.high = cpu_to_le32(DMA_CTL_BLK_TS\
+						(len / 4));
+
+			/* Program the next pointer.  The next pointer must be
+			 * the physical address, not the virtual address.
+			 */
+			next_llp = (dma_lli + ((idx + 1) * sizeof(struct \
+							lli)));
+
+			/* The last 2 bits encode the list master select. */
+			next_llp = DMA_LLP_LMS(next_llp, DMA_LLP_AHBMASTER2);
+
+			lli[idx].llp = cpu_to_le32(next_llp);
+			idx++;
+			sg_len -= len;
+			addr += len;
+		}
+	}
+
+	/*
+	 * The last next ptr has to be zero and the last control low register
+	 * has to have LLP_SRC_EN and LLP_DST_EN (linked list pointer source
+	 * and destination enable) set back to 0 (disabled.) This is what tells
+	 * the core that this is the last item in the linked list.
+	 */
+	if (idx) {
+		lli[idx-1].llp = 0x00000000;
+		lli[idx-1].ctl.low &= DMA_CTL_LLP_DISABLE_LE32;
+
+		/* Flush cache to memory */
+		dma_cache_sync(NULL, lli, (sizeof(struct lli) * idx),
+			       DMA_BIDIRECTIONAL);
+	}
+
+	return idx;
+}
+
+/*
+ * Function: dma_dwc_xfer_start
+ * arguments: Channel number
+ * Return : None
+ * Enables the DMA channel
+ */
+static void dma_dwc_xfer_start(int dma_ch)
+{
+	/* Enable the DMA channel */
+	out_le32(&(host_pvt.sata_dma_regs->dma_chan_en.low),
+		 in_le32(&(host_pvt.sata_dma_regs->dma_chan_en.low)) |
+		 DMA_ENABLE_CHAN(dma_ch));
+}
+
+static int dma_dwc_xfer_setup(struct scatterlist *sg, int num_elems,
+			      struct lli *lli, dma_addr_t dma_lli,
+			      void __iomem *addr, int dir)
+{
+	int dma_ch;
+	int num_lli;
+	/* Acquire DMA channel */
+	dma_ch = dma_request_channel();
+	if (dma_ch == -1) {
+		dev_err(host_pvt.dwc_dev, "%s: dma channel unavailable\n",
+			 __func__);
+		return -EAGAIN;
+	}
+
+	/* Convert SG list to linked list of items (LLIs) for AHB DMA */
+	num_lli = map_sg_to_lli(sg, num_elems, lli, dma_lli, addr, dir);
+
+	dev_dbg(host_pvt.dwc_dev, "%s sg: 0x%p, count: %d lli: %p dma_lli:"
+		" 0x%0xlx addr: %p lli count: %d\n", __func__, sg, num_elems,
+		 lli, (u32)dma_lli, addr, num_lli);
+
+	clear_chan_interrupts(dma_ch);
+
+	/* Program the CFG register. */
+	out_le32(&(host_pvt.sata_dma_regs->chan_regs[dma_ch].cfg.high),
+		 DMA_CFG_PROTCTL | DMA_CFG_FCMOD_REQ);
+	out_le32(&(host_pvt.sata_dma_regs->chan_regs[dma_ch].cfg.low), 0);
+
+	/* Program the address of the linked list */
+	out_le32(&(host_pvt.sata_dma_regs->chan_regs[dma_ch].llp.low),
+		 DMA_LLP_LMS(dma_lli, DMA_LLP_AHBMASTER2));
+
+	/* Program the CTL register with src enable / dst enable */
+	out_le32(&(host_pvt.sata_dma_regs->chan_regs[dma_ch].ctl.low),
+		 DMA_CTL_LLP_SRCEN | DMA_CTL_LLP_DSTEN);
+	return 0;
+}
+
+/*
+ * Function: dma_dwc_exit
+ * arguments: None
+ * returns status
+ * This function exits the SATA DMA driver
+ */
+static void dma_dwc_exit(struct sata_dwc_device *hsdev)
+{
+	dev_dbg(host_pvt.dwc_dev, "%s:\n", __func__);
+	if (host_pvt.sata_dma_regs)
+		iounmap(host_pvt.sata_dma_regs);
+
+	if (hsdev->irq_dma)
+		free_irq(hsdev->irq_dma, hsdev);
+}
+
+/*
+ * Function: dma_dwc_init
+ * arguments: hsdev
+ * returns status
+ * This function initializes the SATA DMA driver
+ */
+static int dma_dwc_init(struct sata_dwc_device *hsdev, int irq)
+{
+	int err;
+
+	err = dma_request_interrupts(hsdev, irq);
+	if (err) {
+		dev_err(host_pvt.dwc_dev, "%s: dma_request_interrupts returns"
+			" %d\n", __func__, err);
+		goto error_out;
+	}
+
+	/* Enabe DMA */
+	out_le32(&(host_pvt.sata_dma_regs->dma_cfg.low), DMA_EN);
+
+	dev_notice(host_pvt.dwc_dev, "DMA initialized\n");
+	dev_dbg(host_pvt.dwc_dev, "SATA DMA registers=0x%p\n", host_pvt.\
+		sata_dma_regs);
+
+	return 0;
+
+error_out:
+	dma_dwc_exit(hsdev);
+
+	return err;
+}
+
+static int sata_dwc_scr_read(struct ata_link *link, unsigned int scr, u32 *val)
+{
+	if (scr > SCR_NOTIFICATION) {
+		dev_err(link->ap->dev, "%s: Incorrect SCR offset 0x%02x\n",
+			__func__, scr);
+		return -EINVAL;
+	}
+
+	*val = in_le32((void *)link->ap->ioaddr.scr_addr + (scr * 4));
+	dev_dbg(link->ap->dev, "%s: id=%d reg=%d val=val=0x%08x\n",
+		__func__, link->ap->print_id, scr, *val);
+
+	return 0;
+}
+
+static int sata_dwc_scr_write(struct ata_link *link, unsigned int scr, u32 val)
+{
+	dev_dbg(link->ap->dev, "%s: id=%d reg=%d val=val=0x%08x\n",
+		__func__, link->ap->print_id, scr, val);
+	if (scr > SCR_NOTIFICATION) {
+		dev_err(link->ap->dev, "%s: Incorrect SCR offset 0x%02x\n",
+			 __func__, scr);
+		return -EINVAL;
+	}
+	out_le32((void *)link->ap->ioaddr.scr_addr + (scr * 4), val);
+
+	return 0;
+}
+
+static u32 core_scr_read(unsigned int scr)
+{
+	return in_le32((void __iomem *)(host_pvt.scr_addr_sstatus) +\
+			(scr * 4));
+}
+
+static void core_scr_write(unsigned int scr, u32 val)
+{
+	out_le32((void __iomem *)(host_pvt.scr_addr_sstatus) + (scr * 4),
+		val);
+}
+
+static void clear_serror(void)
+{
+	u32 val;
+	val = core_scr_read(SCR_ERROR);
+	core_scr_write(SCR_ERROR, val);
+
+}
+
+static void clear_interrupt_bit(struct sata_dwc_device *hsdev, u32 bit)
+{
+	out_le32(&hsdev->sata_dwc_regs->intpr,
+		 in_le32(&hsdev->sata_dwc_regs->intpr));
+}
+
+static u32 qcmd_tag_to_mask(u8 tag)
+{
+	return 0x00000001 << (tag & 0x1f);
+}
+
+/* See ahci.c */
+static void sata_dwc_error_intr(struct ata_port *ap,
+				struct sata_dwc_device *hsdev, uint intpr)
+{
+	struct sata_dwc_device_port *hsdevp = HSDEVP_FROM_AP(ap);
+	struct ata_eh_info *ehi = &ap->link.eh_info;
+	unsigned int err_mask = 0, action = 0;
+	struct ata_queued_cmd *qc;
+	u32 serror;
+	u8 status, tag;
+	u32 err_reg;
+
+	ata_ehi_clear_desc(ehi);
+
+	serror = core_scr_read(SCR_ERROR);
+	status = ap->ops->sff_check_status(ap);
+
+	err_reg = in_le32(&(host_pvt.sata_dma_regs->interrupt_status.error.\
+			low));
+	tag = ap->link.active_tag;
+
+	dev_err(ap->dev, "%s SCR_ERROR=0x%08x intpr=0x%08x status=0x%08x "
+		"dma_intp=%d pending=%d issued=%d dma_err_status=0x%08x\n",
+		__func__, serror, intpr, status, host_pvt.dma_interrupt_count,
+		hsdevp->dma_pending[tag], hsdevp->cmd_issued[tag], err_reg);
+
+	/* Clear error register and interrupt bit */
+	clear_serror();
+	clear_interrupt_bit(hsdev, SATA_DWC_INTPR_ERR);
+
+	/* This is the only error happening now.  TODO check for exact error */
+
+	err_mask |= AC_ERR_HOST_BUS;
+	action |= ATA_EH_RESET;
+
+	/* Pass this on to EH */
+	ehi->serror |= serror;
+	ehi->action |= action;
+
+	qc = ata_qc_from_tag(ap, tag);
+	if (qc)
+		qc->err_mask |= err_mask;
+	else
+		ehi->err_mask |= err_mask;
+
+	ata_port_abort(ap);
+}
+
+/*
+ * Function : sata_dwc_isr
+ * arguments : irq, void *dev_instance, struct pt_regs *regs
+ * Return value : irqreturn_t - status of IRQ
+ * This Interrupt handler called via port ops registered function.
+ * .irq_handler = sata_dwc_isr
+ */
+static irqreturn_t sata_dwc_isr(int irq, void *dev_instance)
+{
+	struct ata_host *host = (struct ata_host *)dev_instance;
+	struct sata_dwc_device *hsdev = HSDEV_FROM_HOST(host);
+	struct ata_port *ap;
+	struct ata_queued_cmd *qc;
+	unsigned long flags;
+	u8 status, tag;
+	int handled, num_processed, port = 0;
+	uint intpr, sactive, sactive2, tag_mask;
+	struct sata_dwc_device_port *hsdevp;
+	host_pvt.sata_dwc_sactive_issued = 0;
+
+	spin_lock_irqsave(&host->lock, flags);
+
+	/* Read the interrupt register */
+	intpr = in_le32(&hsdev->sata_dwc_regs->intpr);
+
+	ap = host->ports[port];
+	hsdevp = HSDEVP_FROM_AP(ap);
+
+	dev_dbg(ap->dev, "%s intpr=0x%08x active_tag=%d\n", __func__, intpr,
+		ap->link.active_tag);
+
+	/* Check for error interrupt */
+	if (intpr & SATA_DWC_INTPR_ERR) {
+		sata_dwc_error_intr(ap, hsdev, intpr);
+		handled = 1;
+		goto DONE;
+	}
+
+	/* Check for DMA SETUP FIS (FP DMA) interrupt */
+	if (intpr & SATA_DWC_INTPR_NEWFP) {
+		clear_interrupt_bit(hsdev, SATA_DWC_INTPR_NEWFP);
+
+		tag = (u8)(in_le32(&hsdev->sata_dwc_regs->fptagr));
+		dev_dbg(ap->dev, "%s: NEWFP tag=%d\n", __func__, tag);
+		if (hsdevp->cmd_issued[tag] != SATA_DWC_CMD_ISSUED_PEND)
+			dev_warn(ap->dev, "CMD tag=%d not pending?\n", tag);
+
+		host_pvt.sata_dwc_sactive_issued |= qcmd_tag_to_mask(tag);
+
+		qc = ata_qc_from_tag(ap, tag);
+		/*
+		 * Start FP DMA for NCQ command.  At this point the tag is the
+		 * active tag.  It is the tag that matches the command about to
+		 * be completed.
+		 */
+		qc->ap->link.active_tag = tag;
+		sata_dwc_bmdma_start_by_tag(qc, tag);
+
+		handled = 1;
+		goto DONE;
+	}
+	sactive = core_scr_read(SCR_ACTIVE);
+	tag_mask = (host_pvt.sata_dwc_sactive_issued | sactive) ^ sactive;
+
+	/* If no sactive issued and tag_mask is zero then this is not NCQ */
+	if (host_pvt.sata_dwc_sactive_issued == 0 && tag_mask == 0) {
+		if (ap->link.active_tag == ATA_TAG_POISON)
+			tag = 0;
+		else
+			tag = ap->link.active_tag;
+		qc = ata_qc_from_tag(ap, tag);
+
+		/* DEV interrupt w/ no active qc? */
+		if (unlikely(!qc || (qc->tf.flags & ATA_TFLAG_POLLING))) {
+			dev_err(ap->dev, "%s interrupt with no active qc "
+				"qc=%p\n", __func__, qc);
+			ap->ops->sff_check_status(ap);
+			handled = 1;
+			goto DONE;
+		}
+		status = ap->ops->sff_check_status(ap);
+
+		qc->ap->link.active_tag = tag;
+		hsdevp->cmd_issued[tag] = SATA_DWC_CMD_ISSUED_NOT;
+
+		if (status & ATA_ERR) {
+			dev_dbg(ap->dev, "interrupt ATA_ERR (0x%x)\n", status);
+			sata_dwc_qc_complete(ap, qc, 1);
+			handled = 1;
+			goto DONE;
+		}
+
+		dev_dbg(ap->dev, "%s non-NCQ cmd interrupt, protocol: %s\n",
+			__func__, ata_get_cmd_descript(qc->tf.protocol));
+DRVSTILLBUSY:
+		if (ata_is_dma(qc->tf.protocol)) {
+			/*
+			 * Each DMA transaction produces 2 interrupts. The DMAC
+			 * transfer complete interrupt and the SATA controller
+			 * operation done interrupt. The command should be
+			 * completed only after both interrupts are seen.
+			 */
+			host_pvt.dma_interrupt_count++;
+			if (hsdevp->dma_pending[tag] == \
+					SATA_DWC_DMA_PENDING_NONE) {
+				dev_err(ap->dev, "%s: DMA not pending "
+					"intpr=0x%08x status=0x%08x pending"
+					"=%d\n", __func__, intpr, status,
+					hsdevp->dma_pending[tag]);
+			}
+
+			if ((host_pvt.dma_interrupt_count % 2) == 0)
+				sata_dwc_dma_xfer_complete(ap, 1);
+		} else if (ata_is_pio(qc->tf.protocol)) {
+			ata_sff_hsm_move(ap, qc, status, 0);
+			handled = 1;
+			goto DONE;
+		} else {
+			if (unlikely(sata_dwc_qc_complete(ap, qc, 1)))
+				goto DRVSTILLBUSY;
+		}
+
+		handled = 1;
+		goto DONE;
+	}
+
+	/*
+	 * This is a NCQ command. At this point we need to figure out for which
+	 * tags we have gotten a completion interrupt.  One interrupt may serve
+	 * as completion for more than one operation when commands are queued
+	 * (NCQ).  We need to process each completed command.
+	 */
+
+	 /* process completed commands */
+	sactive = core_scr_read(SCR_ACTIVE);
+	tag_mask = (host_pvt.sata_dwc_sactive_issued | sactive) ^ sactive;
+
+	if (sactive != 0 || (host_pvt.sata_dwc_sactive_issued) > 1 || \
+							tag_mask > 1) {
+		dev_dbg(ap->dev, "%s NCQ:sactive=0x%08x  sactive_issued=0x%08x"
+			"tag_mask=0x%08x\n", __func__, sactive,
+			host_pvt.sata_dwc_sactive_issued, tag_mask);
+	}
+
+	if ((tag_mask | (host_pvt.sata_dwc_sactive_issued)) != \
+					(host_pvt.sata_dwc_sactive_issued)) {
+		dev_warn(ap->dev, "Bad tag mask?  sactive=0x%08x "
+			 "(host_pvt.sata_dwc_sactive_issued)=0x%08x  tag_mask"
+			 "=0x%08x\n", sactive, host_pvt.sata_dwc_sactive_issued,
+			  tag_mask);
+	}
+
+	/* read just to clear ... not bad if currently still busy */
+	status = ap->ops->sff_check_status(ap);
+	dev_dbg(ap->dev, "%s ATA status register=0x%x\n", __func__, status);
+
+	tag = 0;
+	num_processed = 0;
+	while (tag_mask) {
+		num_processed++;
+		while (!(tag_mask & 0x00000001)) {
+			tag++;
+			tag_mask <<= 1;
+		}
+
+		tag_mask &= (~0x00000001);
+		qc = ata_qc_from_tag(ap, tag);
+
+		/* To be picked up by completion functions */
+		qc->ap->link.active_tag = tag;
+		hsdevp->cmd_issued[tag] = SATA_DWC_CMD_ISSUED_NOT;
+
+		/* Let libata/scsi layers handle error */
+		if (status & ATA_ERR) {
+			dev_dbg(ap->dev, "%s ATA_ERR (0x%x)\n", __func__,
+				status);
+			sata_dwc_qc_complete(ap, qc, 1);
+			handled = 1;
+			goto DONE;
+		}
+
+		/* Process completed command */
+		dev_dbg(ap->dev, "%s NCQ command, protocol: %s\n", __func__,
+			ata_get_cmd_descript(qc->tf.protocol));
+		if (ata_is_dma(qc->tf.protocol)) {
+			host_pvt.dma_interrupt_count++;
+			if (hsdevp->dma_pending[tag] == \
+					SATA_DWC_DMA_PENDING_NONE)
+				dev_warn(ap->dev, "%s: DMA not pending?\n",
+					__func__);
+			if ((host_pvt.dma_interrupt_count % 2) == 0)
+				sata_dwc_dma_xfer_complete(ap, 1);
+		} else {
+			if (unlikely(sata_dwc_qc_complete(ap, qc, 1)))
+				goto STILLBUSY;
+		}
+		continue;
+
+STILLBUSY:
+		ap->stats.idle_irq++;
+		dev_warn(ap->dev, "STILL BUSY IRQ ata%d: irq trap\n",
+			ap->print_id);
+	} /* while tag_mask */
+
+	/*
+	 * Check to see if any commands completed while we were processing our
+	 * initial set of completed commands (read status clears interrupts,
+	 * so we might miss a completed command interrupt if one came in while
+	 * we were processing --we read status as part of processing a completed
+	 * command).
+	 */
+	sactive2 = core_scr_read(SCR_ACTIVE);
+	if (sactive2 != sactive) {
+		dev_dbg(ap->dev, "More completed - sactive=0x%x sactive2"
+			"=0x%x\n", sactive, sactive2);
+	}
+	handled = 1;
+
+DONE:
+	spin_unlock_irqrestore(&host->lock, flags);
+	return IRQ_RETVAL(handled);
+}
+
+static void sata_dwc_clear_dmacr(struct sata_dwc_device_port *hsdevp, u8 tag)
+{
+	struct sata_dwc_device *hsdev = HSDEV_FROM_HSDEVP(hsdevp);
+
+	if (hsdevp->dma_pending[tag] == SATA_DWC_DMA_PENDING_RX) {
+		out_le32(&(hsdev->sata_dwc_regs->dmacr),
+			 SATA_DWC_DMACR_RX_CLEAR(
+				 in_le32(&(hsdev->sata_dwc_regs->dmacr))));
+	} else if (hsdevp->dma_pending[tag] == SATA_DWC_DMA_PENDING_TX) {
+		out_le32(&(hsdev->sata_dwc_regs->dmacr),
+			 SATA_DWC_DMACR_TX_CLEAR(
+				 in_le32(&(hsdev->sata_dwc_regs->dmacr))));
+	} else {
+		/*
+		 * This should not happen, it indicates the driver is out of
+		 * sync.  If it does happen, clear dmacr anyway.
+		 */
+		dev_err(host_pvt.dwc_dev, "%s DMA protocol RX and"
+			"TX DMA not pending tag=0x%02x pending=%d"
+			" dmacr: 0x%08x\n", __func__, tag,
+			hsdevp->dma_pending[tag],
+			in_le32(&(hsdev->sata_dwc_regs->dmacr)));
+		out_le32(&(hsdev->sata_dwc_regs->dmacr),
+			SATA_DWC_DMACR_TXRXCH_CLEAR);
+	}
+}
+
+static void sata_dwc_dma_xfer_complete(struct ata_port *ap, u32 check_status)
+{
+	struct ata_queued_cmd *qc;
+	struct sata_dwc_device_port *hsdevp = HSDEVP_FROM_AP(ap);
+	struct sata_dwc_device *hsdev = HSDEV_FROM_AP(ap);
+	u8 tag = 0;
+
+	tag = ap->link.active_tag;
+	qc = ata_qc_from_tag(ap, tag);
+	if (!qc) {
+		dev_err(ap->dev, "failed to get qc");
+		return;
+	}
+
+#ifdef DEBUG_NCQ
+	if (tag > 0) {
+		dev_info(ap->dev, "%s tag=%u cmd=0x%02x dma dir=%s proto=%s "
+			 "dmacr=0x%08x\n", __func__, qc->tag, qc->tf.command,
+			 ata_get_cmd_descript(qc->dma_dir),
+			 ata_get_cmd_descript(qc->tf.protocol),
+			 in_le32(&(hsdev->sata_dwc_regs->dmacr)));
+	}
+#endif
+
+	if (ata_is_dma(qc->tf.protocol)) {
+		if (hsdevp->dma_pending[tag] == SATA_DWC_DMA_PENDING_NONE) {
+			dev_err(ap->dev, "%s DMA protocol RX and TX DMA not "
+				"pending dmacr: 0x%08x\n", __func__,
+				in_le32(&(hsdev->sata_dwc_regs->dmacr)));
+		}
+
+		hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
+		sata_dwc_qc_complete(ap, qc, check_status);
+		ap->link.active_tag = ATA_TAG_POISON;
+	} else {
+		sata_dwc_qc_complete(ap, qc, check_status);
+	}
+}
+
+static int sata_dwc_qc_complete(struct ata_port *ap, struct ata_queued_cmd *qc,
+				u32 check_status)
+{
+	u8 status = 0;
+	u32 mask = 0x0;
+	u8 tag = qc->tag;
+	struct sata_dwc_device_port *hsdevp = HSDEVP_FROM_AP(ap);
+	host_pvt.sata_dwc_sactive_queued = 0;
+	dev_dbg(ap->dev, "%s checkstatus? %x\n", __func__, check_status);
+
+	if (hsdevp->dma_pending[tag] == SATA_DWC_DMA_PENDING_TX)
+		dev_err(ap->dev, "TX DMA PENDING\n");
+	else if (hsdevp->dma_pending[tag] == SATA_DWC_DMA_PENDING_RX)
+		dev_err(ap->dev, "RX DMA PENDING\n");
+	dev_dbg(ap->dev, "QC complete cmd=0x%02x status=0x%02x ata%u:"
+		" protocol=%d\n", qc->tf.command, status, ap->print_id,
+		 qc->tf.protocol);
+
+	/* clear active bit */
+	mask = (~(qcmd_tag_to_mask(tag)));
+	host_pvt.sata_dwc_sactive_queued = (host_pvt.sata_dwc_sactive_queued) \
+						& mask;
+	host_pvt.sata_dwc_sactive_issued = (host_pvt.sata_dwc_sactive_issued) \
+						& mask;
+	ata_qc_complete(qc);
+	return 0;
+}
+
+static void sata_dwc_enable_interrupts(struct sata_dwc_device *hsdev)
+{
+	/* Enable selective interrupts by setting the interrupt maskregister*/
+	out_le32(&hsdev->sata_dwc_regs->intmr,
+		 SATA_DWC_INTMR_ERRM |
+		 SATA_DWC_INTMR_NEWFPM |
+		 SATA_DWC_INTMR_PMABRTM |
+		 SATA_DWC_INTMR_DMATM);
+	/*
+	 * Unmask the error bits that should trigger an error interrupt by
+	 * setting the error mask register.
+	 */
+	out_le32(&hsdev->sata_dwc_regs->errmr, SATA_DWC_SERROR_ERR_BITS);
+
+	dev_dbg(host_pvt.dwc_dev, "%s: INTMR = 0x%08x, ERRMR = 0x%08x\n",
+		 __func__, in_le32(&hsdev->sata_dwc_regs->intmr),
+		in_le32(&hsdev->sata_dwc_regs->errmr));
+}
+
+static void sata_dwc_setup_port(struct ata_ioports *port, unsigned long base)
+{
+	port->cmd_addr = (void *)base + 0x00;
+	port->data_addr = (void *)base + 0x00;
+
+	port->error_addr = (void *)base + 0x04;
+	port->feature_addr = (void *)base + 0x04;
+
+	port->nsect_addr = (void *)base + 0x08;
+
+	port->lbal_addr = (void *)base + 0x0c;
+	port->lbam_addr = (void *)base + 0x10;
+	port->lbah_addr = (void *)base + 0x14;
+
+	port->device_addr = (void *)base + 0x18;
+	port->command_addr = (void *)base + 0x1c;
+	port->status_addr = (void *)base + 0x1c;
+
+	port->altstatus_addr = (void *)base + 0x20;
+	port->ctl_addr = (void *)base + 0x20;
+}
+
+/*
+ * Function : sata_dwc_port_start
+ * arguments : struct ata_ioports *port
+ * Return value : returns 0 if success, error code otherwise
+ * This function allocates the scatter gather LLI table for AHB DMA
+ */
+static int sata_dwc_port_start(struct ata_port *ap)
+{
+	int err = 0;
+	struct sata_dwc_device *hsdev;
+	struct sata_dwc_device_port *hsdevp = NULL;
+	struct device *pdev;
+	int i;
+
+	hsdev = HSDEV_FROM_AP(ap);
+
+	dev_dbg(ap->dev, "%s: port_no=%d\n", __func__, ap->port_no);
+
+	hsdev->host = ap->host;
+	pdev = ap->host->dev;
+	if (!pdev) {
+		dev_err(ap->dev, "%s: no ap->host->dev\n", __func__);
+		err = -ENODEV;
+		goto CLEANUP;
+	}
+
+	/* Allocate Port Struct */
+	hsdevp = kzalloc(sizeof(*hsdevp), GFP_KERNEL);
+	if (!hsdevp) {
+		dev_err(ap->dev, "%s: kmalloc failed for hsdevp\n", __func__);
+		err = -ENOMEM;
+		goto CLEANUP;
+	}
+	hsdevp->hsdev = hsdev;
+
+	for (i = 0; i < SATA_DWC_QCMD_MAX; i++)
+		hsdevp->cmd_issued[i] = SATA_DWC_CMD_ISSUED_NOT;
+
+	ap->bmdma_prd = 0;	/* set these so libata doesn't use them */
+	ap->bmdma_prd_dma = 0;
+
+	/*
+	 * DMA - Assign scatter gather LLI table. We can't use the libata
+	 * version since it's PRD is IDE PCI specific.
+	 */
+	for (i = 0; i < SATA_DWC_QCMD_MAX; i++) {
+		hsdevp->llit[i] = dma_alloc_coherent(pdev,
+						     SATA_DWC_DMAC_LLI_TBL_SZ,
+						     &(hsdevp->llit_dma[i]),
+						     GFP_ATOMIC);
+		if (!hsdevp->llit[i]) {
+			dev_err(ap->dev, "%s: dma_alloc_coherent failed\n",
+				 __func__);
+			err = -ENOMEM;
+			goto CLEANUP;
+		}
+	}
+
+	if (ap->port_no == 0)  {
+		dev_dbg(ap->dev, "%s: clearing TXCHEN, RXCHEN in DMAC\n",
+			__func__);
+		out_le32(&hsdev->sata_dwc_regs->dmacr,
+			 SATA_DWC_DMACR_TXRXCH_CLEAR);
+
+		dev_dbg(ap->dev, "%s: setting burst size in DBTSR\n",
+			 __func__);
+		out_le32(&hsdev->sata_dwc_regs->dbtsr,
+			 (SATA_DWC_DBTSR_MWR(AHB_DMA_BRST_DFLT) |
+			  SATA_DWC_DBTSR_MRD(AHB_DMA_BRST_DFLT)));
+	}
+
+	/* Clear any error bits before libata starts issuing commands */
+	clear_serror();
+	ap->private_data = hsdevp;
+
+CLEANUP:
+	if (err) {
+		sata_dwc_port_stop(ap);
+		dev_dbg(ap->dev, "%s: fail\n", __func__);
+	} else {
+		dev_dbg(ap->dev, "%s: done\n", __func__);
+	}
+
+	return err;
+}
+
+static void sata_dwc_port_stop(struct ata_port *ap)
+{
+	int i;
+	struct sata_dwc_device *hsdev = HSDEV_FROM_AP(ap);
+	struct sata_dwc_device_port *hsdevp = HSDEVP_FROM_AP(ap);
+
+	dev_dbg(ap->dev, "%s: ap->id = %d\n", __func__, ap->print_id);
+
+	if (hsdevp && hsdev) {
+		/* deallocate LLI table */
+		for (i = 0; i < SATA_DWC_QCMD_MAX; i++) {
+			dma_free_coherent(ap->host->dev,
+					  SATA_DWC_DMAC_LLI_TBL_SZ,
+					 hsdevp->llit[i], hsdevp->llit_dma[i]);
+		}
+
+		kfree(hsdevp);
+	}
+	ap->private_data = NULL;
+}
+
+/*
+ * Function : sata_dwc_exec_command_by_tag
+ * arguments : ata_port *ap, ata_taskfile *tf, u8 tag, u32 cmd_issued
+ * Return value : None
+ * This function keeps track of individual command tag ids and calls
+ * ata_exec_command in libata
+ */
+static void sata_dwc_exec_command_by_tag(struct ata_port *ap,
+					 struct ata_taskfile *tf,
+					 u8 tag, u32 cmd_issued)
+{
+	unsigned long flags;
+	struct sata_dwc_device_port *hsdevp = HSDEVP_FROM_AP(ap);
+
+	dev_dbg(ap->dev, "%s cmd(0x%02x): %s tag=%d\n", __func__, tf->command,
+		ata_get_cmd_descript(tf), tag);
+
+	spin_lock_irqsave(&ap->host->lock, flags);
+	hsdevp->cmd_issued[tag] = cmd_issued;
+	spin_unlock_irqrestore(&ap->host->lock, flags);
+	/*
+	 * Clear SError before executing a new command.
+	 * sata_dwc_scr_write and read can not be used here. Clearing the PM
+	 * managed SError register for the disk needs to be done before the
+	 * task file is loaded.
+	 */
+	clear_serror();
+	ata_sff_exec_command(ap, tf);
+}
+
+static void sata_dwc_bmdma_setup_by_tag(struct ata_queued_cmd *qc, u8 tag)
+{
+	sata_dwc_exec_command_by_tag(qc->ap, &qc->tf, tag,
+				     SATA_DWC_CMD_ISSUED_PEND);
+}
+
+static void sata_dwc_bmdma_setup(struct ata_queued_cmd *qc)
+{
+	u8 tag = qc->tag;
+
+	if (ata_is_ncq(qc->tf.protocol)) {
+		dev_dbg(qc->ap->dev, "%s: ap->link.sactive=0x%08x tag=%d\n",
+			__func__, qc->ap->link.sactive, tag);
+	} else {
+		tag = 0;
+	}
+	sata_dwc_bmdma_setup_by_tag(qc, tag);
+}
+
+static void sata_dwc_bmdma_start_by_tag(struct ata_queued_cmd *qc, u8 tag)
+{
+	int start_dma;
+	u32 reg, dma_chan;
+	struct sata_dwc_device *hsdev = HSDEV_FROM_QC(qc);
+	struct ata_port *ap = qc->ap;
+	struct sata_dwc_device_port *hsdevp = HSDEVP_FROM_AP(ap);
+	int dir = qc->dma_dir;
+	dma_chan = hsdevp->dma_chan[tag];
+
+	if (hsdevp->cmd_issued[tag] != SATA_DWC_CMD_ISSUED_NOT) {
+		start_dma = 1;
+		if (dir == DMA_TO_DEVICE)
+			hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_TX;
+		else
+			hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_RX;
+	} else {
+		dev_err(ap->dev, "%s: Command not pending cmd_issued=%d "
+			"(tag=%d) DMA NOT started\n", __func__,
+			hsdevp->cmd_issued[tag], tag);
+		start_dma = 0;
+	}
+
+	dev_dbg(ap->dev, "%s qc=%p tag: %x cmd: 0x%02x dma_dir: %s "
+		"start_dma? %x\n", __func__, qc, tag, qc->tf.command,
+		ata_get_cmd_descript(qc->dma_dir), start_dma);
+	sata_dwc_tf_dump(&(qc->tf));
+
+	if (start_dma) {
+		reg = core_scr_read(SCR_ERROR);
+		if (reg & SATA_DWC_SERROR_ERR_BITS) {
+			dev_err(ap->dev, "%s: ****** SError=0x%08x ******\n",
+				__func__, reg);
+		}
+
+		if (dir == DMA_TO_DEVICE)
+			out_le32(&hsdev->sata_dwc_regs->dmacr,
+				SATA_DWC_DMACR_TXCHEN);
+		else
+			out_le32(&hsdev->sata_dwc_regs->dmacr,
+				SATA_DWC_DMACR_RXCHEN);
+
+		/* Enable AHB DMA transfer on the specified channel */
+		dma_dwc_xfer_start(dma_chan);
+	}
+}
+
+static void sata_dwc_bmdma_start(struct ata_queued_cmd *qc)
+{
+	u8 tag = qc->tag;
+
+	if (ata_is_ncq(qc->tf.protocol)) {
+		dev_dbg(qc->ap->dev, "%s: ap->link.sactive=0x%08x tag=%d\n",
+			__func__, qc->ap->link.sactive, tag);
+	} else {
+		tag = 0;
+	}
+	dev_dbg(qc->ap->dev, "%s\n", __func__);
+	sata_dwc_bmdma_start_by_tag(qc, tag);
+}
+
+/*
+ * Function : sata_dwc_qc_prep_by_tag
+ * arguments : ata_queued_cmd *qc, u8 tag
+ * Return value : None
+ * qc_prep for a particular queued command based on tag
+ */
+static void sata_dwc_qc_prep_by_tag(struct ata_queued_cmd *qc, u8 tag)
+{
+	struct scatterlist *sg = qc->sg;
+	struct ata_port *ap = qc->ap;
+	u32 dma_chan;
+	struct sata_dwc_device *hsdev = HSDEV_FROM_AP(ap);
+	struct sata_dwc_device_port *hsdevp = HSDEVP_FROM_AP(ap);
+	int err;
+
+	dev_dbg(ap->dev, "%s: port=%d dma dir=%s n_elem=%d\n",
+		__func__, ap->port_no, ata_get_cmd_descript(qc->dma_dir),
+		 qc->n_elem);
+
+	dma_chan = dma_dwc_xfer_setup(sg, qc->n_elem, hsdevp->llit[tag],
+				      hsdevp->llit_dma[tag],
+				      (void *__iomem)(&hsdev->sata_dwc_regs->\
+				      dmadr), qc->dma_dir);
+	if (dma_chan < 0) {
+		dev_err(ap->dev, "%s: dma_dwc_xfer_setup returns err %d\n",
+			__func__, err);
+		return;
+	}
+	hsdevp->dma_chan[tag] = dma_chan;
+}
+
+static unsigned int sata_dwc_qc_issue(struct ata_queued_cmd *qc)
+{
+	u32 sactive;
+	u8 tag = qc->tag;
+	struct ata_port *ap = qc->ap;
+
+#ifdef DEBUG_NCQ
+	if (qc->tag > 0 || ap->link.sactive > 1)
+		dev_info(ap->dev, "%s ap id=%d cmd(0x%02x)=%s qc tag=%d "
+			 "prot=%s ap active_tag=0x%08x ap sactive=0x%08x\n",
+			 __func__, ap->print_id, qc->tf.command,
+			 ata_get_cmd_descript(&qc->tf),
+			 qc->tag, ata_get_cmd_descript(qc->tf.protocol),
+			 ap->link.active_tag, ap->link.sactive);
+#endif
+
+	if (!ata_is_ncq(qc->tf.protocol))
+		tag = 0;
+	sata_dwc_qc_prep_by_tag(qc, tag);
+
+	if (ata_is_ncq(qc->tf.protocol)) {
+		sactive = core_scr_read(SCR_ACTIVE);
+		sactive |= (0x00000001 << tag);
+		core_scr_write(SCR_ACTIVE, sactive);
+
+		dev_dbg(qc->ap->dev, "%s: tag=%d ap->link.sactive = 0x%08x "
+			"sactive=0x%08x\n", __func__, tag, qc->ap->link.sactive,
+			sactive);
+
+		ap->ops->sff_tf_load(ap, &qc->tf);
+		sata_dwc_exec_command_by_tag(ap, &qc->tf, qc->tag,
+					     SATA_DWC_CMD_ISSUED_PEND);
+	} else {
+		ata_sff_qc_issue(qc);
+	}
+	return 0;
+}
+
+/*
+ * Function : sata_dwc_qc_prep
+ * arguments : ata_queued_cmd *qc
+ * Return value : None
+ * qc_prep for a particular queued command
+ */
+
+static void sata_dwc_qc_prep(struct ata_queued_cmd *qc)
+{
+	if ((qc->dma_dir == DMA_NONE) || (qc->tf.protocol == ATA_PROT_PIO))
+		return;
+
+#ifdef DEBUG_NCQ
+	if (qc->tag > 0)
+		dev_info(qc->ap->dev, "%s: qc->tag=%d ap->active_tag=0x%08x\n",
+			 __func__, tag, qc->ap->link.active_tag);
+
+	return ;
+#endif
+}
+
+static void sata_dwc_error_handler(struct ata_port *ap)
+{
+	ap->link.flags |= ATA_LFLAG_NO_HRST;
+	ata_sff_error_handler(ap);
+}
+
+/*
+ * scsi mid-layer and libata interface structures
+ */
+static struct scsi_host_template sata_dwc_sht = {
+	ATA_NCQ_SHT(DRV_NAME),
+	/*
+	 * test-only: Currently this driver doesn't handle NCQ
+	 * correctly. We enable NCQ but set the queue depth to a
+	 * max of 1. This will get fixed in in a future release.
+	 */
+	.sg_tablesize		= LIBATA_MAX_PRD,
+	.can_queue		= ATA_DEF_QUEUE,	/* ATA_MAX_QUEUE */
+	.dma_boundary		= ATA_DMA_BOUNDARY,
+};
+
+static struct ata_port_operations sata_dwc_ops = {
+	.inherits		= &ata_sff_port_ops,
+
+	.error_handler		= sata_dwc_error_handler,
+
+	.qc_prep		= sata_dwc_qc_prep,
+	.qc_issue		= sata_dwc_qc_issue,
+
+	.scr_read		= sata_dwc_scr_read,
+	.scr_write		= sata_dwc_scr_write,
+
+	.port_start		= sata_dwc_port_start,
+	.port_stop		= sata_dwc_port_stop,
+
+	.bmdma_setup		= sata_dwc_bmdma_setup,
+	.bmdma_start		= sata_dwc_bmdma_start,
+};
+
+static const struct ata_port_info sata_dwc_port_info[] = {
+	{
+		.flags		= ATA_FLAG_SATA | ATA_FLAG_NO_LEGACY |
+				  ATA_FLAG_MMIO | ATA_FLAG_NCQ,
+		.pio_mask	= 0x1f,	/* pio 0-4 */
+		.udma_mask	= ATA_UDMA6,
+		.port_ops	= &sata_dwc_ops,
+	},
+};
+
+static int sata_dwc_probe(struct of_device *ofdev,
+			const struct of_device_id *match)
+{
+	struct sata_dwc_device *hsdev;
+	u32 idr, versionr;
+	char *ver = (char *)&versionr;
+	u8 *base = NULL;
+	int err = 0;
+	int irq, rc;
+	struct ata_host *host;
+	struct ata_port_info pi = sata_dwc_port_info[0];
+	const struct ata_port_info *ppi[] = { &pi, NULL };
+
+	/* Allocate DWC SATA device */
+	hsdev = kmalloc(sizeof(*hsdev), GFP_KERNEL);
+	if (hsdev == NULL) {
+		dev_err(&ofdev->dev, "kmalloc failed for hsdev\n");
+		err = -ENOMEM;
+		goto error_out;
+	}
+	memset(hsdev, 0, sizeof(*hsdev));
+
+	/* Ioremap SATA registers */
+	base = of_iomap(ofdev->dev.of_node, 0);
+	if (!base) {
+		dev_err(&ofdev->dev, "ioremap failed for SATA register"
+			" address\n");
+		err = -ENODEV;
+		goto error_out;
+	}
+	hsdev->reg_base = base;
+	dev_dbg(&ofdev->dev, "ioremap done for SATA register address\n");
+
+	/* Synopsys DWC SATA specific Registers */
+	hsdev->sata_dwc_regs = (void *__iomem)(base + SATA_DWC_REG_OFFSET);
+
+	/* Allocate and fill host */
+	host = ata_host_alloc_pinfo(&ofdev->dev, ppi, SATA_DWC_MAX_PORTS);
+	if (!host) {
+		dev_err(&ofdev->dev, "ata_host_alloc_pinfo failed\n");
+		err = -ENOMEM;
+		goto error_out;
+	}
+
+	host->private_data = hsdev;
+
+	/* Setup port */
+	host->ports[0]->ioaddr.cmd_addr = base;
+	host->ports[0]->ioaddr.scr_addr = base + SATA_DWC_SCR_OFFSET;
+	host_pvt.scr_addr_sstatus = base + SATA_DWC_SCR_OFFSET;
+	sata_dwc_setup_port(&host->ports[0]->ioaddr, (unsigned long)base);
+
+	/* Read the ID and Version Registers */
+	idr = in_le32(&hsdev->sata_dwc_regs->idr);
+	versionr = in_le32(&hsdev->sata_dwc_regs->versionr);
+	dev_notice(&ofdev->dev, "id %d, controller version %c.%c%c\n",
+		   idr, ver[0], ver[1], ver[2]);
+
+	/* Get SATA DMA interrupt number */
+	irq = irq_of_parse_and_map(ofdev->dev.of_node, 1);
+	if (irq == NO_IRQ) {
+		dev_err(&ofdev->dev, "no SATA DMA irq\n");
+		err = -ENODEV;
+		goto error_out;
+	}
+
+	/* Get physical SATA DMA register base address */
+	host_pvt.sata_dma_regs = of_iomap(ofdev->dev.of_node, 1);
+	if (!(host_pvt.sata_dma_regs)) {
+		dev_err(&ofdev->dev, "ioremap failed for AHBDMA register"
+			" address\n");
+		err = -ENODEV;
+		goto error_out;
+	}
+
+	/* Save dev for later use in dev_xxx() routines */
+	host_pvt.dwc_dev = &ofdev->dev;
+
+	/* Initialize AHB DMAC */
+	dma_dwc_init(hsdev, irq);
+
+	/* Enable SATA Interrupts */
+	sata_dwc_enable_interrupts(hsdev);
+
+	/* Get SATA interrupt number */
+	irq = irq_of_parse_and_map(ofdev->dev.of_node, 0);
+	if (irq == NO_IRQ) {
+		dev_err(&ofdev->dev, "no SATA DMA irq\n");
+		err = -ENODEV;
+		goto error_out;
+	}
+
+	/*
+	 * Now, register with libATA core, this will also initiate the
+	 * device discovery process, invoking our port_start() handler &
+	 * error_handler() to execute a dummy Softreset EH session
+	 */
+	rc = ata_host_activate(host, irq, sata_dwc_isr, 0, &sata_dwc_sht);
+
+	if (rc != 0)
+		dev_err(&ofdev->dev, "failed to activate host");
+
+	dev_set_drvdata(&ofdev->dev, host);
+	return 0;
+
+error_out:
+	/* Free SATA DMA resources */
+	dma_dwc_exit(hsdev);
+
+	if (base)
+		iounmap(base);
+	return err;
+}
+
+static int sata_dwc_remove(struct of_device *ofdev)
+{
+	struct device *dev = &ofdev->dev;
+	struct ata_host *host = dev_get_drvdata(dev);
+	struct sata_dwc_device *hsdev = host->private_data;
+
+	ata_host_detach(host);
+	dev_set_drvdata(dev, NULL);
+
+	/* Free SATA DMA resources */
+	dma_dwc_exit(hsdev);
+
+	iounmap(hsdev->reg_base);
+	kfree(hsdev);
+	kfree(host);
+	dev_dbg(&ofdev->dev, "done\n");
+	return 0;
+}
+
+static const struct of_device_id sata_dwc_match[] = {
+	{ .compatible = "amcc,sata-460ex", },
+	{}
+};
+MODULE_DEVICE_TABLE(of, sata_dwc_match);
+
+static struct of_platform_driver sata_dwc_driver = {
+	.driver = {
+		.name = DRV_NAME,
+		.owner = THIS_MODULE,
+		.of_match_table = sata_dwc_match,
+	},
+	.probe = sata_dwc_probe,
+	.remove = sata_dwc_remove,
+};
+
+static int __init sata_dwc_init(void)
+{
+	return	of_register_platform_driver(&sata_dwc_driver);
+}
+
+static void __exit sata_dwc_exit(void)
+{
+	of_unregister_platform_driver(&sata_dwc_driver);
+}
+
+module_init(sata_dwc_init);
+module_exit(sata_dwc_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Mark Miesfeld <mmiesfeld@amcc.com>");
+MODULE_DESCRIPTION("DesignWare Cores SATA controller low lever driver");
+MODULE_VERSION(DRV_VERSION);
-- 
1.5.6.3

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox