* [PATCH] arm64: add architecture specified current_pt_regs @ 2016-02-18 11:48 ` Zhi-zhou Zhang 0 siblings, 0 replies; 10+ messages in thread From: Zhi-zhou Zhang @ 2016-02-18 11:48 UTC (permalink / raw) To: linux-arm-kernel From: zhizhou <zhizhou.zh@gmail.com> This patch is based on the implementation of arm. The generic current_pt_regs is implemented with current->stack. It need to access memory that would be too expensive. Signed-off-by: zhizhou <zhizhou.zh@gmail.com> --- arch/arm64/include/asm/ptrace.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index e9e5467..1865d54 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -185,5 +185,9 @@ static inline int valid_user_regs(struct user_pt_regs *regs) extern unsigned long profile_pc(struct pt_regs *regs); +#define current_pt_regs(void) ({ (struct pt_regs *) \ + ((current_stack_pointer | (THREAD_SIZE - 1)) - 0xf) - 1; \ +}) + #endif /* __ASSEMBLY__ */ #endif -- 2.5.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH] arm64: add architecture specified current_pt_regs @ 2016-02-18 11:48 ` Zhi-zhou Zhang 0 siblings, 0 replies; 10+ messages in thread From: Zhi-zhou Zhang @ 2016-02-18 11:48 UTC (permalink / raw) To: robin.murphy, catalin.marinas, will.deacon Cc: linux-arm-kernel, linux-kernel, zhizhou From: zhizhou <zhizhou.zh@gmail.com> This patch is based on the implementation of arm. The generic current_pt_regs is implemented with current->stack. It need to access memory that would be too expensive. Signed-off-by: zhizhou <zhizhou.zh@gmail.com> --- arch/arm64/include/asm/ptrace.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index e9e5467..1865d54 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -185,5 +185,9 @@ static inline int valid_user_regs(struct user_pt_regs *regs) extern unsigned long profile_pc(struct pt_regs *regs); +#define current_pt_regs(void) ({ (struct pt_regs *) \ + ((current_stack_pointer | (THREAD_SIZE - 1)) - 0xf) - 1; \ +}) + #endif /* __ASSEMBLY__ */ #endif -- 2.5.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH] arm64: add architecture specified current_pt_regs 2016-02-18 11:48 ` Zhi-zhou Zhang @ 2016-02-18 11:58 ` Catalin Marinas -1 siblings, 0 replies; 10+ messages in thread From: Catalin Marinas @ 2016-02-18 11:58 UTC (permalink / raw) To: linux-arm-kernel On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: > From: zhizhou <zhizhou.zh@gmail.com> > > This patch is based on the implementation of arm. The generic > current_pt_regs is implemented with current->stack. It need to access > memory that would be too expensive. Do you have any performance numbers? > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h > index e9e5467..1865d54 100644 > --- a/arch/arm64/include/asm/ptrace.h > +++ b/arch/arm64/include/asm/ptrace.h > @@ -185,5 +185,9 @@ static inline int valid_user_regs(struct user_pt_regs *regs) > > extern unsigned long profile_pc(struct pt_regs *regs); > > +#define current_pt_regs(void) ({ (struct pt_regs *) \ > + ((current_stack_pointer | (THREAD_SIZE - 1)) - 0xf) - 1; \ > +}) I don't think this works well with the separate IRQ stack that we merged in 4.5-rc1. current_thread_info() explicitly uses "sp_el0" while current_stack_pointer is just "sp" (though I don't think we ever use current_pt_regs in interrupt context). -- Catalin ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: add architecture specified current_pt_regs @ 2016-02-18 11:58 ` Catalin Marinas 0 siblings, 0 replies; 10+ messages in thread From: Catalin Marinas @ 2016-02-18 11:58 UTC (permalink / raw) To: Zhi-zhou Zhang; +Cc: robin.murphy, will.deacon, linux-kernel, linux-arm-kernel On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: > From: zhizhou <zhizhou.zh@gmail.com> > > This patch is based on the implementation of arm. The generic > current_pt_regs is implemented with current->stack. It need to access > memory that would be too expensive. Do you have any performance numbers? > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h > index e9e5467..1865d54 100644 > --- a/arch/arm64/include/asm/ptrace.h > +++ b/arch/arm64/include/asm/ptrace.h > @@ -185,5 +185,9 @@ static inline int valid_user_regs(struct user_pt_regs *regs) > > extern unsigned long profile_pc(struct pt_regs *regs); > > +#define current_pt_regs(void) ({ (struct pt_regs *) \ > + ((current_stack_pointer | (THREAD_SIZE - 1)) - 0xf) - 1; \ > +}) I don't think this works well with the separate IRQ stack that we merged in 4.5-rc1. current_thread_info() explicitly uses "sp_el0" while current_stack_pointer is just "sp" (though I don't think we ever use current_pt_regs in interrupt context). -- Catalin ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] arm64: add architecture specified current_pt_regs 2016-02-18 11:58 ` Catalin Marinas @ 2016-02-19 2:30 ` Zhi-zhou -1 siblings, 0 replies; 10+ messages in thread From: Zhi-zhou @ 2016-02-19 2:30 UTC (permalink / raw) To: linux-arm-kernel On Thu, Feb 18, 2016 at 7:58 PM, Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: > > From: zhizhou <zhizhou.zh@gmail.com> > > > > This patch is based on the implementation of arm. The generic > > current_pt_regs is implemented with current->stack. It need to access > > memory that would be too expensive. > > Do you have any performance numbers? I'm using QEMU, so no. Actually this macro isn't heavily used. I just think using the generic implementation is not very nice. It get task_struct from sp_el0, then get stack(which is equal to sp_el0) from task_struct. There are two unnecessary memory accesses. > > > > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h > > index e9e5467..1865d54 100644 > > --- a/arch/arm64/include/asm/ptrace.h > > +++ b/arch/arm64/include/asm/ptrace.h > > @@ -185,5 +185,9 @@ static inline int valid_user_regs(struct user_pt_regs *regs) > > > > extern unsigned long profile_pc(struct pt_regs *regs); > > > > +#define current_pt_regs(void) ({ (struct pt_regs *) \ > > + ((current_stack_pointer | (THREAD_SIZE - 1)) - 0xf) - 1; \ > > +}) > > I don't think this works well with the separate IRQ stack that we merged > in 4.5-rc1. current_thread_info() explicitly uses "sp_el0" while > current_stack_pointer is just "sp" (though I don't think we ever use > current_pt_regs in interrupt context). > Thank you, that's a problem. So how about write it like this? diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index e9e5467..b68e01c 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -185,5 +185,10 @@ static inline int valid_user_regs(struct user_pt_regs *regs) extern unsigned long profile_pc(struct pt_regs *regs); +#define current_pt_regs(void) ({ (struct pt_regs *) \ + (((unsigned long)current_thread_info() | \ + (THREAD_SIZE - 1)) - 0xf) - 1; \ +}) + #endif /* __ASSEMBLY__ */ #endif > -- > Catalin -- Regards, Zhi-zhou ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: add architecture specified current_pt_regs @ 2016-02-19 2:30 ` Zhi-zhou 0 siblings, 0 replies; 10+ messages in thread From: Zhi-zhou @ 2016-02-19 2:30 UTC (permalink / raw) To: Catalin Marinas; +Cc: robin.murphy, will.deacon, linux-kernel, linux-arm-kernel On Thu, Feb 18, 2016 at 7:58 PM, Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: > > From: zhizhou <zhizhou.zh@gmail.com> > > > > This patch is based on the implementation of arm. The generic > > current_pt_regs is implemented with current->stack. It need to access > > memory that would be too expensive. > > Do you have any performance numbers? I'm using QEMU, so no. Actually this macro isn't heavily used. I just think using the generic implementation is not very nice. It get task_struct from sp_el0, then get stack(which is equal to sp_el0) from task_struct. There are two unnecessary memory accesses. > > > > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h > > index e9e5467..1865d54 100644 > > --- a/arch/arm64/include/asm/ptrace.h > > +++ b/arch/arm64/include/asm/ptrace.h > > @@ -185,5 +185,9 @@ static inline int valid_user_regs(struct user_pt_regs *regs) > > > > extern unsigned long profile_pc(struct pt_regs *regs); > > > > +#define current_pt_regs(void) ({ (struct pt_regs *) \ > > + ((current_stack_pointer | (THREAD_SIZE - 1)) - 0xf) - 1; \ > > +}) > > I don't think this works well with the separate IRQ stack that we merged > in 4.5-rc1. current_thread_info() explicitly uses "sp_el0" while > current_stack_pointer is just "sp" (though I don't think we ever use > current_pt_regs in interrupt context). > Thank you, that's a problem. So how about write it like this? diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h index e9e5467..b68e01c 100644 --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -185,5 +185,10 @@ static inline int valid_user_regs(struct user_pt_regs *regs) extern unsigned long profile_pc(struct pt_regs *regs); +#define current_pt_regs(void) ({ (struct pt_regs *) \ + (((unsigned long)current_thread_info() | \ + (THREAD_SIZE - 1)) - 0xf) - 1; \ +}) + #endif /* __ASSEMBLY__ */ #endif > -- > Catalin -- Regards, Zhi-zhou ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH] arm64: add architecture specified current_pt_regs 2016-02-19 2:30 ` Zhi-zhou @ 2016-02-19 10:32 ` Will Deacon -1 siblings, 0 replies; 10+ messages in thread From: Will Deacon @ 2016-02-19 10:32 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 19, 2016 at 10:30:09AM +0800, Zhi-zhou wrote: > On Thu, Feb 18, 2016 at 7:58 PM, Catalin Marinas > <catalin.marinas@arm.com> wrote: > > > > On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: > > > From: zhizhou <zhizhou.zh@gmail.com> > > > > > > This patch is based on the implementation of arm. The generic > > > current_pt_regs is implemented with current->stack. It need to access > > > memory that would be too expensive. > > > > Do you have any performance numbers? > > I'm using QEMU, so no. Actually this macro isn't heavily used. I just > think using the generic > implementation is not very nice. It get task_struct from sp_el0, then > get stack(which is > equal to sp_el0) from task_struct. There are two unnecessary memory accesses. I'd much rather use the generic implementation unless there's a compelling reason not to. "I think it's not very nice" doesn't really cut it for me! Will ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: add architecture specified current_pt_regs @ 2016-02-19 10:32 ` Will Deacon 0 siblings, 0 replies; 10+ messages in thread From: Will Deacon @ 2016-02-19 10:32 UTC (permalink / raw) To: Zhi-zhou; +Cc: Catalin Marinas, robin.murphy, linux-kernel, linux-arm-kernel On Fri, Feb 19, 2016 at 10:30:09AM +0800, Zhi-zhou wrote: > On Thu, Feb 18, 2016 at 7:58 PM, Catalin Marinas > <catalin.marinas@arm.com> wrote: > > > > On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: > > > From: zhizhou <zhizhou.zh@gmail.com> > > > > > > This patch is based on the implementation of arm. The generic > > > current_pt_regs is implemented with current->stack. It need to access > > > memory that would be too expensive. > > > > Do you have any performance numbers? > > I'm using QEMU, so no. Actually this macro isn't heavily used. I just > think using the generic > implementation is not very nice. It get task_struct from sp_el0, then > get stack(which is > equal to sp_el0) from task_struct. There are two unnecessary memory accesses. I'd much rather use the generic implementation unless there's a compelling reason not to. "I think it's not very nice" doesn't really cut it for me! Will ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] arm64: add architecture specified current_pt_regs 2016-02-19 10:32 ` Will Deacon @ 2016-02-19 12:01 ` Zhizhou Zhang -1 siblings, 0 replies; 10+ messages in thread From: Zhizhou Zhang @ 2016-02-19 12:01 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 19, 2016 at 6:32 PM, Will Deacon <will.deacon@arm.com> wrote: > On Fri, Feb 19, 2016 at 10:30:09AM +0800, Zhi-zhou wrote: >> On Thu, Feb 18, 2016 at 7:58 PM, Catalin Marinas >> <catalin.marinas@arm.com> wrote: >> > >> > On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: >> > > From: zhizhou <zhizhou.zh@gmail.com> >> > > >> > > This patch is based on the implementation of arm. The generic >> > > current_pt_regs is implemented with current->stack. It need to access >> > > memory that would be too expensive. >> > >> > Do you have any performance numbers? >> >> I'm using QEMU, so no. Actually this macro isn't heavily used. I just >> think using the generic >> implementation is not very nice. It get task_struct from sp_el0, then >> get stack(which is >> equal to sp_el0) from task_struct. There are two unnecessary memory accesses. > > I'd much rather use the generic implementation unless there's a compelling > reason not to. "I think it's not very nice" doesn't really cut it for me! Refer to memory twice may incur cache eviction. I'm really a newbie to kernel. Anyway, I have a look at kernel and I'm very curious what's the 'compelling reason' of 16 architectures implement their own current_pt_regs? > > Will -- Regards, Zhizhou ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: add architecture specified current_pt_regs @ 2016-02-19 12:01 ` Zhizhou Zhang 0 siblings, 0 replies; 10+ messages in thread From: Zhizhou Zhang @ 2016-02-19 12:01 UTC (permalink / raw) To: Will Deacon; +Cc: Catalin Marinas, robin.murphy, linux-kernel, linux-arm-kernel On Fri, Feb 19, 2016 at 6:32 PM, Will Deacon <will.deacon@arm.com> wrote: > On Fri, Feb 19, 2016 at 10:30:09AM +0800, Zhi-zhou wrote: >> On Thu, Feb 18, 2016 at 7:58 PM, Catalin Marinas >> <catalin.marinas@arm.com> wrote: >> > >> > On Thu, Feb 18, 2016 at 07:48:35PM +0800, Zhi-zhou Zhang wrote: >> > > From: zhizhou <zhizhou.zh@gmail.com> >> > > >> > > This patch is based on the implementation of arm. The generic >> > > current_pt_regs is implemented with current->stack. It need to access >> > > memory that would be too expensive. >> > >> > Do you have any performance numbers? >> >> I'm using QEMU, so no. Actually this macro isn't heavily used. I just >> think using the generic >> implementation is not very nice. It get task_struct from sp_el0, then >> get stack(which is >> equal to sp_el0) from task_struct. There are two unnecessary memory accesses. > > I'd much rather use the generic implementation unless there's a compelling > reason not to. "I think it's not very nice" doesn't really cut it for me! Refer to memory twice may incur cache eviction. I'm really a newbie to kernel. Anyway, I have a look at kernel and I'm very curious what's the 'compelling reason' of 16 architectures implement their own current_pt_regs? > > Will -- Regards, Zhizhou ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-02-19 12:01 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-02-18 11:48 [PATCH] arm64: add architecture specified current_pt_regs Zhi-zhou Zhang 2016-02-18 11:48 ` Zhi-zhou Zhang 2016-02-18 11:58 ` Catalin Marinas 2016-02-18 11:58 ` Catalin Marinas 2016-02-19 2:30 ` Zhi-zhou 2016-02-19 2:30 ` Zhi-zhou 2016-02-19 10:32 ` Will Deacon 2016-02-19 10:32 ` Will Deacon 2016-02-19 12:01 ` Zhizhou Zhang 2016-02-19 12:01 ` Zhizhou Zhang
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.