public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
* [Linux-ia64] 010626 kernel, copy_from_user() broken?
@ 2001-07-12 11:16 Richard Hirst
  2001-07-19 19:25 ` David Mosberger
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Richard Hirst @ 2001-07-12 11:16 UTC (permalink / raw)
  To: linux-ia64

Hi,
  Summary:  I had to change PIPE_DEPTH in arch/ia64/lib/copy_user.S
from 21 to 4 to make copy_from_user() work with non-aligned user
addresses on my B3 cpu.  PIPE_DEPTH was 4 in the 010530 kernel.

Long description of how I got to this point:

  I'm working on the debian installer for ia64.  Most of the functionality,
including /bin/mount, is provided by busybox.  The first thing rcS does
on booting the installer is "mount proc /proc -t proc".  That worked fine
with the 010530 kernel, but fails with 010626.

Basically the kernel seems to have trouble reading the filesystem type
parameter, thinks it is an empty string, and ends up calling modprobe
for module name "".

Busybox mount uses getopt() to process its arguments, and when it finds
-t it simply sents char *filesystemtype = optarg.  That means the type
argument to mount() is different from device and dir, because it is
referencing env space.  If I make busybox malloc space for the type
string and copy optarg there, it works fine.

I got busybox mount to print the first three args to mount() before and
after the call:

0x60000000000150e0=/proc
0x60000000000160f0=/proc
0x80000fffffffbf84=proc

same before and after, as expected.  I got the kernel sys_mount to print
the address in user space of the type arg, which matched the users idea
of the value.  sys_mount() calls copy_mount_options (type, &type_page),
but if I then printk("'%s'\n", (char *)type_page), it yields ''.

I added printk's to copy_mount_options(), and saw it calls

copy_from_user(0xe00000003dc98000,0x80000fffffffbf84, 0x4000), which
claims to copy 0x7c bytes (i.e. to end of page).

I then did "export a=b" and tried the mount again; this time the type
param was at user address 0x80000fffffffbf80, and the mount worked.

Checking kernel changes, I found this.  Reverting the change made it work
again:

 
diff -urN linux-2.4.5/arch/ia64/lib/copy_user.S linux-2.4.5-lia/arch/ia64/lib/copy_user.S
--- linux-2.4.5/arch/ia64/lib/copy_user.S	Sun Apr 29 15:49:26 2001
+++ linux-2.4.5-lia/arch/ia64/lib/copy_user.S	Tue Jun 26 22:31:21 2001
@@ -35,7 +35,7 @@
 // Tuneable parameters
 //
 #define COPY_BREAK	16	// we do byte copy below (must be >\x16)
-#define PIPE_DEPTH	4	// pipe depth
+#define PIPE_DEPTH	21	// pipe depth
 
 #define EPI		p[PIPE_DEPTH-1] // PASTE(p,16+PIPE_DEPTH-1)
 


Richard



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Linux-ia64] 010626 kernel, copy_from_user() broken?
  2001-07-12 11:16 [Linux-ia64] 010626 kernel, copy_from_user() broken? Richard Hirst
@ 2001-07-19 19:25 ` David Mosberger
  2001-07-19 22:14 ` Nakajima, Jun
  2001-07-20 17:33 ` Richard Hirst
  2 siblings, 0 replies; 4+ messages in thread
From: David Mosberger @ 2001-07-19 19:25 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Thu, 12 Jul 2001 12:16:58 +0100, Richard Hirst <rhirst@linuxcare.com> said:

  Rich> Hi, Summary: I had to change PIPE_DEPTH in
  Rich> arch/ia64/lib/copy_user.S from 21 to 4 to make
  Rich> copy_from_user() work with non-aligned user addresses on my B3
  Rich> cpu.  PIPE_DEPTH was 4 in the 010530 kernel.

Thanks for reporting this (and for tracking it down!).  It turns out
that the author of the copy_user() recovery code either knowingly or
accidentally assumed that PIPE_DEPTH=4, which is of course not good
as it defeats the whole purpose of making the sw-pipelined loop
tunable.  The patch below should fix this.  Can you try it instead and
let me know how it goes?

Thanks,

	--david

--- lia64/arch/ia64/lib/copy_user.S	Tue Jun 26 22:31:21 2001
+++ lia64-kdb/arch/ia64/lib/copy_user.S	Thu Jul 19 12:21:26 2001
@@ -37,7 +37,7 @@
 #define COPY_BREAK	16	// we do byte copy below (must be >\x16)
 #define PIPE_DEPTH	21	// pipe depth
 
-#define EPI		p[PIPE_DEPTH-1] // PASTE(p,16+PIPE_DEPTH-1)
+#define EPI		p[PIPE_DEPTH-1]
 
 //
 // arguments
@@ -148,8 +148,8 @@
 	//
 
 	//
-	// Optimization. If dst1 is 8-byte aligned (not rarely), we don't need
-	// to copy the head to dst1, to start 8-byte copy software pipleline.
+	// Optimization. If dst1 is 8-byte aligned (quite common), we don't need
+	// to copy the head to dst1, to start 8-byte copy software pipeline.
 	// We know src1 is not 8-byte aligned in this case.
 	//
 	cmp.eq p14,p15=r0,dst2
@@ -233,15 +233,23 @@
 #define SWITCH(pred, shift)	cmp.eq pred,p0=shift,rshift
 #define CASE(pred, shift)	\
 	(pred)	br.cond.spnt.few copy_user_bit##shift
-#define BODY(rshift)							\
-copy_user_bit##rshift:							\
-1:									\
-	EX(failure_out,(EPI) st8 [dst1]=tmp,8);				\
-(EPI_1) shrp tmp=val1[PIPE_DEPTH-3],val1[PIPE_DEPTH-2],rshift;		\
-	EX(failure_in2,(p16) ld8 val1[0]=[src1],8);			\
-	br.ctop.dptk.few 1b;						\
-	;;								\
-	br.cond.spnt.few .diff_align_do_tail
+#define BODY(rshift)						\
+copy_user_bit##rshift:						\
+1:								\
+	EX(failure_out,(EPI) st8 [dst1]=tmp,8);			\
+(EPI_1) shrp tmp=val1[PIPE_DEPTH-3],val1[PIPE_DEPTH-2],rshift;	\
+	EX(3f,(p16) ld8 val1[0]=[src1],8);			\
+	br.ctop.dptk.few 1b;					\
+	;;							\
+	br.cond.sptk.few .diff_align_do_tail;			\
+2:								\
+(EPI)	st8 [dst1]=tmp,8;					\
+(EPI_1)	shrp tmp=val1[PIPE_DEPTH-3],val1[PIPE_DEPTH-2],rshift;	\
+3:								\
+(p16)	mov val1[0]=r0;						\
+	br.ctop.dptk.few 2b;					\
+	;;							\
+	br.cond.sptk.few failure_in2
 
 	//
 	// Since the instruction 'shrp' requires a fixed 128-bit value
@@ -581,13 +589,7 @@
 	br.ret.dptk.few rp
 
 failure_in2:
-	sub ret0=endsrc,src1	// number of bytes to zero, i.e. not copied
-	;;
-3:
-(p16)	mov val1[0]=r0
-(EPI)	st8 [dst1]=val1[PIPE_DEPTH-1],8
-	br.ctop.dptk.few 3b
-	;;
+	sub ret0=endsrc,src1
 	cmp.ne p6,p0=dst1,enddst	// Do we need to finish the tail ?
 	sub len=enddst,dst1,1		// precompute len
 (p6)	br.cond.dptk.few failure_in1bis


^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [Linux-ia64] 010626 kernel, copy_from_user() broken?
  2001-07-12 11:16 [Linux-ia64] 010626 kernel, copy_from_user() broken? Richard Hirst
  2001-07-19 19:25 ` David Mosberger
@ 2001-07-19 22:14 ` Nakajima, Jun
  2001-07-20 17:33 ` Richard Hirst
  2 siblings, 0 replies; 4+ messages in thread
From: Nakajima, Jun @ 2001-07-19 22:14 UTC (permalink / raw)
  To: linux-ia64

I think using '21' (latency of the L3 cache on Itanium) would make more
sense than '4' (something in between the one of L1 and L2) for copy_user,
because we might want to optimize the worst cases. If the data are available
in L1 or L2, the size to copy should be small and it's done quickly anyway.

See "Itanium(tm) Processor Microarchitecture Reference"
(http://developer.intel.com/design/ia-64/downloads/245474.htm) for such
details.

For some reason I know the author accidentally assumed that PIPE_DEPTH=4
when he saw #define ...

Jun

-----Original Message-----
From: David Mosberger [mailto:davidm@hpl.hp.com]
Sent: Thursday, July 19, 2001 12:26 PM
To: Richard Hirst
Cc: linux-ia64@linuxia64.org
Subject: Re: [Linux-ia64] 010626 kernel, copy_from_user() broken?


>>>>> On Thu, 12 Jul 2001 12:16:58 +0100, Richard Hirst
<rhirst@linuxcare.com> said:

  Rich> Hi, Summary: I had to change PIPE_DEPTH in
  Rich> arch/ia64/lib/copy_user.S from 21 to 4 to make
  Rich> copy_from_user() work with non-aligned user addresses on my B3
  Rich> cpu.  PIPE_DEPTH was 4 in the 010530 kernel.

Thanks for reporting this (and for tracking it down!).  It turns out
that the author of the copy_user() recovery code either knowingly or
accidentally assumed that PIPE_DEPTH=4, which is of course not good
as it defeats the whole purpose of making the sw-pipelined loop
tunable.  The patch below should fix this.  Can you try it instead and
let me know how it goes?

Thanks,

	--david

--- lia64/arch/ia64/lib/copy_user.S	Tue Jun 26 22:31:21 2001
+++ lia64-kdb/arch/ia64/lib/copy_user.S	Thu Jul 19 12:21:26 2001
@@ -37,7 +37,7 @@
 #define COPY_BREAK	16	// we do byte copy below (must be >\x16)
 #define PIPE_DEPTH	21	// pipe depth
 
-#define EPI		p[PIPE_DEPTH-1] // PASTE(p,16+PIPE_DEPTH-1)
+#define EPI		p[PIPE_DEPTH-1]
 
 //
 // arguments
@@ -148,8 +148,8 @@
 	//
 
 	//
-	// Optimization. If dst1 is 8-byte aligned (not rarely), we don't
need
-	// to copy the head to dst1, to start 8-byte copy software
pipleline.
+	// Optimization. If dst1 is 8-byte aligned (quite common), we don't
need
+	// to copy the head to dst1, to start 8-byte copy software pipeline.
 	// We know src1 is not 8-byte aligned in this case.
 	//
 	cmp.eq p14,p15=r0,dst2
@@ -233,15 +233,23 @@
 #define SWITCH(pred, shift)	cmp.eq pred,p0=shift,rshift
 #define CASE(pred, shift)	\
 	(pred)	br.cond.spnt.few copy_user_bit##shift
-#define BODY(rshift)							\
-copy_user_bit##rshift:							\
-1:									\
-	EX(failure_out,(EPI) st8 [dst1]=tmp,8);				\
-(EPI_1) shrp tmp=val1[PIPE_DEPTH-3],val1[PIPE_DEPTH-2],rshift;		\
-	EX(failure_in2,(p16) ld8 val1[0]=[src1],8);			\
-	br.ctop.dptk.few 1b;						\
-	;;								\
-	br.cond.spnt.few .diff_align_do_tail
+#define BODY(rshift)						\
+copy_user_bit##rshift:						\
+1:								\
+	EX(failure_out,(EPI) st8 [dst1]=tmp,8);			\
+(EPI_1) shrp tmp=val1[PIPE_DEPTH-3],val1[PIPE_DEPTH-2],rshift;	\
+	EX(3f,(p16) ld8 val1[0]=[src1],8);			\
+	br.ctop.dptk.few 1b;					\
+	;;							\
+	br.cond.sptk.few .diff_align_do_tail;			\
+2:								\
+(EPI)	st8 [dst1]=tmp,8;					\
+(EPI_1)	shrp tmp=val1[PIPE_DEPTH-3],val1[PIPE_DEPTH-2],rshift;	\
+3:								\
+(p16)	mov val1[0]=r0;						\
+	br.ctop.dptk.few 2b;					\
+	;;							\
+	br.cond.sptk.few failure_in2
 
 	//
 	// Since the instruction 'shrp' requires a fixed 128-bit value
@@ -581,13 +589,7 @@
 	br.ret.dptk.few rp
 
 failure_in2:
-	sub ret0=endsrc,src1	// number of bytes to zero, i.e. not copied
-	;;
-3:
-(p16)	mov val1[0]=r0
-(EPI)	st8 [dst1]=val1[PIPE_DEPTH-1],8
-	br.ctop.dptk.few 3b
-	;;
+	sub ret0=endsrc,src1
 	cmp.ne p6,p0=dst1,enddst	// Do we need to finish the tail ?
 	sub len=enddst,dst1,1		// precompute len
 (p6)	br.cond.dptk.few failure_in1bis

_______________________________________________
Linux-IA64 mailing list
Linux-IA64@linuxia64.org
http://lists.linuxia64.org/lists/listinfo/linux-ia64



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Linux-ia64] 010626 kernel, copy_from_user() broken?
  2001-07-12 11:16 [Linux-ia64] 010626 kernel, copy_from_user() broken? Richard Hirst
  2001-07-19 19:25 ` David Mosberger
  2001-07-19 22:14 ` Nakajima, Jun
@ 2001-07-20 17:33 ` Richard Hirst
  2 siblings, 0 replies; 4+ messages in thread
From: Richard Hirst @ 2001-07-20 17:33 UTC (permalink / raw)
  To: linux-ia64

On Thu, Jul 19, 2001 at 12:25:35PM -0700, David Mosberger wrote:
> >>>>> On Thu, 12 Jul 2001 12:16:58 +0100, Richard Hirst <rhirst@linuxcare.com> said:
> 
>   Rich> Hi, Summary: I had to change PIPE_DEPTH in
>   Rich> arch/ia64/lib/copy_user.S from 21 to 4 to make
>   Rich> copy_from_user() work with non-aligned user addresses on my B3
>   Rich> cpu.  PIPE_DEPTH was 4 in the 010530 kernel.
> 
> Thanks for reporting this (and for tracking it down!).  It turns out
> that the author of the copy_user() recovery code either knowingly or
> accidentally assumed that PIPE_DEPTH=4, which is of course not good
> as it defeats the whole purpose of making the sw-pipelined loop
> tunable.  The patch below should fix this.  Can you try it instead and
> let me know how it goes?

Works fine, thanks.

Richard



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2001-07-20 17:33 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-07-12 11:16 [Linux-ia64] 010626 kernel, copy_from_user() broken? Richard Hirst
2001-07-19 19:25 ` David Mosberger
2001-07-19 22:14 ` Nakajima, Jun
2001-07-20 17:33 ` Richard Hirst

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox