All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29
       [not found] ` <bug-12995-10286-V0hAGp6uBxO456/isadD/XN4h3HLQggn@public.gmane.org/>
@ 2009-04-01 19:41   ` Andrew Morton
  2009-04-01 19:50     ` Chuck Lever
  2009-04-02  7:48     ` Brian Murphy
  0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2009-04-01 19:41 UTC (permalink / raw)
  To: brm-EoLPgbz1DBzQT0dZR+AlfA
  Cc: bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r, Chuck Lever,
	Olaf Kirch, linux-nfs


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 1 Apr 2009 12:31:51 GMT
bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12995
> 
>            Summary: NFS mount from avr32 platform crashes on 2.6.29
>            Product: File System
>            Version: 2.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: NFS
>         AssignedTo: trond.myklebust@fys.uio.no
>         ReportedBy: brm-EoLPgbz1DBzQT0dZR+AlfA@public.gmane.org
>         Regression: No
> 
> 
> The problem turns out to be in the recent(ish) changes in lockd.
> 
> Specifically this struct definition
> 
> struct nsm_handle {
>         struct list_head        sm_link;
>         atomic_t                sm_count;
>         char                    *sm_mon_name;
>         char                    *sm_name;
>         struct sockaddr_storage sm_addr;
>         size_t                  sm_addrlen;
>         unsigned int            sm_monitored : 1,
>                                 sm_sticky : 1;  /* don't unmonitor */
>         struct nsm_private      sm_priv;
>         char                    sm_addrbuf[NSM_ADDRBUF];
> };
> 
> results in my avr32 compiler (and my ia64 compiler) in aligning the sm_priv
> structure (which is a char array) on an odd boundary. The subsequent use
> of this field typecast to a u64 (nsm_init_private) as part of an nfs mount
> causes a crash with unaligned access.
> 
> The compiler only allocates *one byte* to the two bit bitfield.
> Moving the bitfield to the end of the structure fixes the problem in my case.
> It seems to me that one should be very careful with typecasting this sm_priv
> data to anything with larger alignment but especially to a 64 bit type since
> (at least on a 64 bit system) this may demand 64 bit alignment.
> 
> In any case it looks like (with newer gcc at least) that bitfields are
> extremely dangerous.
> 
> Perhaps the solution is to malloc the nsm_private data and sm_priv is then a
> pointer to this data. This would guarantee the correct alignment.
> 

nsm_private is:

struct nsm_private {
	unsigned char		data[SM_PRIV_SIZE];
};

so the compiler is permitted to byte-align this.

I assume that some code somewhere is accessing this with a
larger-than-one-byte typecast?

(Your bug report isn't complete - if it had included the trace then I'd
know where the crash is occurring!)

If so then something like this:

--- a/include/linux/lockd/xdr.h~a
+++ a/include/linux/lockd/xdr.h
@@ -9,6 +9,7 @@
 #ifndef LOCKD_XDR_H
 #define LOCKD_XDR_H
 
+#include <linux/compiler.h>
 #include <linux/fs.h>
 #include <linux/nfs.h>
 #include <linux/sunrpc/xdr.h>
@@ -16,9 +17,10 @@
 #define SM_MAXSTRLEN		1024
 #define SM_PRIV_SIZE		16
 
+/* suitable comment about the __aligned goes here */
 struct nsm_private {
 	unsigned char		data[SM_PRIV_SIZE];
-};
+} __aligned(sizeof(unsigned long));
 
 struct svc_rqst;
 
would be an appropriate fix, as it actually expresses what's going on.

If you agree then please cook up and test a patch?

Please send the patch via email - we very much try to avoid merging
patches out of bugzilla.

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29
  2009-04-01 19:41   ` [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29 Andrew Morton
@ 2009-04-01 19:50     ` Chuck Lever
  2009-04-01 20:07       ` Andrew Morton
  2009-04-02  7:48     ` Brian Murphy
  1 sibling, 1 reply; 7+ messages in thread
From: Chuck Lever @ 2009-04-01 19:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: brm-EoLPgbz1DBzQT0dZR+AlfA,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r, Olaf Kirch,
	linux-nfs

On Apr 1, 2009, at 3:41 PM, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not  
> via the
> bugzilla web interface).
>
> On Wed, 1 Apr 2009 12:31:51 GMT
> bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=12995
>>
>>           Summary: NFS mount from avr32 platform crashes on 2.6.29
>>           Product: File System
>>           Version: 2.5
>>          Platform: All
>>        OS/Version: Linux
>>              Tree: Mainline
>>            Status: NEW
>>          Severity: normal
>>          Priority: P1
>>         Component: NFS
>>        AssignedTo: trond.myklebust@fys.uio.no
>>        ReportedBy: brm-EoLPgbz1DBzQT0dZR+AlfA@public.gmane.org
>>        Regression: No
>>
>>
>> The problem turns out to be in the recent(ish) changes in lockd.
>>
>> Specifically this struct definition
>>
>> struct nsm_handle {
>>        struct list_head        sm_link;
>>        atomic_t                sm_count;
>>        char                    *sm_mon_name;
>>        char                    *sm_name;
>>        struct sockaddr_storage sm_addr;
>>        size_t                  sm_addrlen;
>>        unsigned int            sm_monitored : 1,
>>                                sm_sticky : 1;  /* don't unmonitor */
>>        struct nsm_private      sm_priv;
>>        char                    sm_addrbuf[NSM_ADDRBUF];
>> };
>>
>> results in my avr32 compiler (and my ia64 compiler) in aligning the  
>> sm_priv
>> structure (which is a char array) on an odd boundary. The  
>> subsequent use
>> of this field typecast to a u64 (nsm_init_private) as part of an  
>> nfs mount
>> causes a crash with unaligned access.
>>
>> The compiler only allocates *one byte* to the two bit bitfield.
>> Moving the bitfield to the end of the structure fixes the problem  
>> in my case.
>> It seems to me that one should be very careful with typecasting  
>> this sm_priv
>> data to anything with larger alignment but especially to a 64 bit  
>> type since
>> (at least on a 64 bit system) this may demand 64 bit alignment.
>>
>> In any case it looks like (with newer gcc at least) that bitfields  
>> are
>> extremely dangerous.
>>
>> Perhaps the solution is to malloc the nsm_private data and sm_priv  
>> is then a
>> pointer to this data. This would guarantee the correct alignment.
>>
>
> nsm_private is:
>
> struct nsm_private {
> 	unsigned char		data[SM_PRIV_SIZE];
> };
>
> so the compiler is permitted to byte-align this.
>
> I assume that some code somewhere is accessing this with a
> larger-than-one-byte typecast?

nsm_init_private().

A patch just went upstream for 2.6.30 that addresses that, but did not  
mention that it fixed an oops.

> (Your bug report isn't complete - if it had included the trace then  
> I'd
> know where the crash is occurring!)
>
> If so then something like this:
>
> --- a/include/linux/lockd/xdr.h~a
> +++ a/include/linux/lockd/xdr.h
> @@ -9,6 +9,7 @@
> #ifndef LOCKD_XDR_H
> #define LOCKD_XDR_H
>
> +#include <linux/compiler.h>
> #include <linux/fs.h>
> #include <linux/nfs.h>
> #include <linux/sunrpc/xdr.h>
> @@ -16,9 +17,10 @@
> #define SM_MAXSTRLEN		1024
> #define SM_PRIV_SIZE		16
>
> +/* suitable comment about the __aligned goes here */
> struct nsm_private {
> 	unsigned char		data[SM_PRIV_SIZE];
> -};
> +} __aligned(sizeof(unsigned long));

>  struct svc_rqst;
>
> would be an appropriate fix, as it actually expresses what's going on.
>
> If you agree then please cook up and test a patch?
>
> Please send the patch via email - we very much try to avoid merging
> patches out of bugzilla.
>
> Thanks.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29
  2009-04-01 19:50     ` Chuck Lever
@ 2009-04-01 20:07       ` Andrew Morton
  2009-04-01 20:34         ` Chuck Lever
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2009-04-01 20:07 UTC (permalink / raw)
  To: Chuck Lever
  Cc: brm-EoLPgbz1DBzQT0dZR+AlfA,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r, okir, linux-nfs,
	stable

(cc stable)

On Wed, 1 Apr 2009 15:50:51 -0400
Chuck Lever <chuck.lever@oracle.com> wrote:

> >>
> >> Perhaps the solution is to malloc the nsm_private data and sm_priv  
> >> is then a
> >> pointer to this data. This would guarantee the correct alignment.
> >>
> >
> > nsm_private is:
> >
> > struct nsm_private {
> > 	unsigned char		data[SM_PRIV_SIZE];
> > };
> >
> > so the compiler is permitted to byte-align this.
> >
> > I assume that some code somewhere is accessing this with a
> > larger-than-one-byte typecast?
> 
> nsm_init_private().
> 
> A patch just went upstream for 2.6.30 that addresses that

commit ad5b365c1266b0c9e8e254a3c1cc4ef66bf33cba
Author:     Mans Rullgard <mans-2StjZFpD7GcAvxtiuMwx3w@public.gmane.org>
AuthorDate: Sat Mar 28 19:55:20 2009 +0000
Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
CommitDate: Wed Apr 1 13:24:14 2009 -0400

    NSM: Fix unaligned accesses in nsm_init_private()
    
    This fixes unaligned accesses in nsm_init_private() when
    creating nlm_reboot keys.
    
    Signed-off-by: Mans Rullgard <mans-2StjZFpD7GcAvxtiuMwx3w@public.gmane.org>
    Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
index 5e2c4d5..6d5d4a4 100644
--- a/fs/lockd/mon.c
+++ b/fs/lockd/mon.c
@@ -16,6 +16,8 @@
 #include <linux/sunrpc/svc.h>
 #include <linux/lockd/lockd.h>
 
+#include <asm/unaligned.h>
+
 #define NLMDBG_FACILITY		NLMDBG_MONITOR
 #define NSM_PROGRAM		100024
 #define NSM_VERSION		1
@@ -274,10 +276,12 @@ static void nsm_init_private(struct nsm_handle *nsm)
 {
 	u64 *p = (u64 *)&nsm->sm_priv.data;
 	struct timespec ts;
+	s64 ns;
 
 	ktime_get_ts(&ts);
-	*p++ = timespec_to_ns(&ts);
-	*p = (unsigned long)nsm;
+	ns = timespec_to_ns(&ts);
+	put_unaligned(ns, p);
+	put_unaligned((unsigned long)nsm, p + 1);
 }
 
 static struct nsm_handle *nsm_create_handle(const struct sockaddr *sap,

Is this the best way of fixing it?  We'll crash again if some other
code starts to access .data with non-byte-sized accesses.  What makes
this especially risky is that the code will pass testing on x86.

__aligned would be one way.  But it would be far far cleaner to just do


--- a/include/linux/lockd/xdr.h~a
+++ a/include/linux/lockd/xdr.h
@@ -17,7 +17,7 @@
 #define SM_PRIV_SIZE		16
 
 struct nsm_private {
-	unsigned char		data[SM_PRIV_SIZE];
+	u64 data[SM_PRIV_SIZE/sizeof(u64)];
 };
 
 struct svc_rqst;

all the typecasting and the put_unaligned()s just go away then.

> but did not mention that it fixed an oops.

That was a fairly significant failing.  Guys, please please do remember
that we're also maintaining the stable kernels.  When merging a patch
please always think whether it needs to be backported.


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29
  2009-04-01 20:07       ` Andrew Morton
@ 2009-04-01 20:34         ` Chuck Lever
  2009-04-01 20:46           ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Chuck Lever @ 2009-04-01 20:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: brm-EoLPgbz1DBzQT0dZR+AlfA,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r, okir, linux-nfs,
	stable

On Apr 1, 2009, at 4:07 PM, Andrew Morton wrote:
> (cc stable)
>
> On Wed, 1 Apr 2009 15:50:51 -0400
> Chuck Lever <chuck.lever@oracle.com> wrote:
>
>>>>
>>>> Perhaps the solution is to malloc the nsm_private data and sm_priv
>>>> is then a
>>>> pointer to this data. This would guarantee the correct alignment.
>>>>
>>>
>>> nsm_private is:
>>>
>>> struct nsm_private {
>>> 	unsigned char		data[SM_PRIV_SIZE];
>>> };
>>>
>>> so the compiler is permitted to byte-align this.
>>>
>>> I assume that some code somewhere is accessing this with a
>>> larger-than-one-byte typecast?
>>
>> nsm_init_private().
>>
>> A patch just went upstream for 2.6.30 that addresses that
>
> commit ad5b365c1266b0c9e8e254a3c1cc4ef66bf33cba
> Author:     Mans Rullgard <mans-2StjZFpD7GcAvxtiuMwx3w@public.gmane.org>
> AuthorDate: Sat Mar 28 19:55:20 2009 +0000
> Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
> CommitDate: Wed Apr 1 13:24:14 2009 -0400
>
>    NSM: Fix unaligned accesses in nsm_init_private()
>
>    This fixes unaligned accesses in nsm_init_private() when
>    creating nlm_reboot keys.
>
>    Signed-off-by: Mans Rullgard <mans-2StjZFpD7GcAvxtiuMwx3w@public.gmane.org>
>    Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
>    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
>
> diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
> index 5e2c4d5..6d5d4a4 100644
> --- a/fs/lockd/mon.c
> +++ b/fs/lockd/mon.c
> @@ -16,6 +16,8 @@
> #include <linux/sunrpc/svc.h>
> #include <linux/lockd/lockd.h>
>
> +#include <asm/unaligned.h>
> +
> #define NLMDBG_FACILITY		NLMDBG_MONITOR
> #define NSM_PROGRAM		100024
> #define NSM_VERSION		1
> @@ -274,10 +276,12 @@ static void nsm_init_private(struct nsm_handle  
> *nsm)
> {
> 	u64 *p = (u64 *)&nsm->sm_priv.data;
> 	struct timespec ts;
> +	s64 ns;
>
> 	ktime_get_ts(&ts);
> -	*p++ = timespec_to_ns(&ts);
> -	*p = (unsigned long)nsm;
> +	ns = timespec_to_ns(&ts);
> +	put_unaligned(ns, p);
> +	put_unaligned((unsigned long)nsm, p + 1);
> }
>
> static struct nsm_handle *nsm_create_handle(const struct sockaddr  
> *sap,
>
> Is this the best way of fixing it?  We'll crash again if some other
> code starts to access .data with non-byte-sized accesses.  What makes
> this especially risky is that the code will pass testing on x86.
>
> __aligned would be one way.  But it would be far far cleaner to just  
> do
>
>
> --- a/include/linux/lockd/xdr.h~a
> +++ a/include/linux/lockd/xdr.h
> @@ -17,7 +17,7 @@
> #define SM_PRIV_SIZE		16
>
> struct nsm_private {
> -	unsigned char		data[SM_PRIV_SIZE];
> +	u64 data[SM_PRIV_SIZE/sizeof(u64)];
> };
>
> struct svc_rqst;
>
> all the typecasting and the put_unaligned()s just go away then.

True enough.  It should be noted, however, that opaque data types are  
defined as character arrays in most RPC implementations, and by  
automated code generators like rpcgen.  The size of an RPC opaque type  
is usually expressed by a number of octets.

In general it may be best to keep that tradition (due to a large body  
of legacy code) and always use __aligned, simply because that clearly  
documents the problem that is being fixed.

>> but did not mention that it fixed an oops.
>
> That was a fairly significant failing.  Guys, please please do  
> remember
> that we're also maintaining the stable kernels.  When merging a patch
> please always think whether it needs to be backported.


--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29
  2009-04-01 20:34         ` Chuck Lever
@ 2009-04-01 20:46           ` Andrew Morton
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2009-04-01 20:46 UTC (permalink / raw)
  To: Chuck Lever
  Cc: brm-EoLPgbz1DBzQT0dZR+AlfA,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r, okir, linux-nfs,
	stable

On Wed, 1 Apr 2009 16:34:53 -0400
Chuck Lever <chuck.lever@oracle.com> wrote:

> On Apr 1, 2009, at 4:07 PM, Andrew Morton wrote:
> > (cc stable)
> >
> > On Wed, 1 Apr 2009 15:50:51 -0400
> > Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> >>>>
> >>>> Perhaps the solution is to malloc the nsm_private data and sm_priv
> >>>> is then a
> >>>> pointer to this data. This would guarantee the correct alignment.
> >>>>
> >>>
> >>> nsm_private is:
> >>>
> >>> struct nsm_private {
> >>> 	unsigned char		data[SM_PRIV_SIZE];
> >>> };
> >>>
> >>> so the compiler is permitted to byte-align this.
> >>>
> >>> I assume that some code somewhere is accessing this with a
> >>> larger-than-one-byte typecast?
> >>
> >> nsm_init_private().
> >>
> >> A patch just went upstream for 2.6.30 that addresses that
> >
> > commit ad5b365c1266b0c9e8e254a3c1cc4ef66bf33cba
> > Author:     Mans Rullgard <mans-2StjZFpD7GcAvxtiuMwx3w@public.gmane.org>
> > AuthorDate: Sat Mar 28 19:55:20 2009 +0000
> > Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
> > CommitDate: Wed Apr 1 13:24:14 2009 -0400
> >
> >    NSM: Fix unaligned accesses in nsm_init_private()
> >
> >    This fixes unaligned accesses in nsm_init_private() when
> >    creating nlm_reboot keys.
> >
> >    Signed-off-by: Mans Rullgard <mans-2StjZFpD7GcAvxtiuMwx3w@public.gmane.org>
> >    Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> >    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> >
> > diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
> > index 5e2c4d5..6d5d4a4 100644
> > --- a/fs/lockd/mon.c
> > +++ b/fs/lockd/mon.c
> > @@ -16,6 +16,8 @@
> > #include <linux/sunrpc/svc.h>
> > #include <linux/lockd/lockd.h>
> >
> > +#include <asm/unaligned.h>
> > +
> > #define NLMDBG_FACILITY		NLMDBG_MONITOR
> > #define NSM_PROGRAM		100024
> > #define NSM_VERSION		1
> > @@ -274,10 +276,12 @@ static void nsm_init_private(struct nsm_handle  
> > *nsm)
> > {
> > 	u64 *p = (u64 *)&nsm->sm_priv.data;
> > 	struct timespec ts;
> > +	s64 ns;
> >
> > 	ktime_get_ts(&ts);
> > -	*p++ = timespec_to_ns(&ts);
> > -	*p = (unsigned long)nsm;
> > +	ns = timespec_to_ns(&ts);
> > +	put_unaligned(ns, p);
> > +	put_unaligned((unsigned long)nsm, p + 1);
> > }
> >
> > static struct nsm_handle *nsm_create_handle(const struct sockaddr  
> > *sap,
> >
> > Is this the best way of fixing it?  We'll crash again if some other
> > code starts to access .data with non-byte-sized accesses.  What makes
> > this especially risky is that the code will pass testing on x86.
> >
> > __aligned would be one way.  But it would be far far cleaner to just  
> > do
> >
> >
> > --- a/include/linux/lockd/xdr.h~a
> > +++ a/include/linux/lockd/xdr.h
> > @@ -17,7 +17,7 @@
> > #define SM_PRIV_SIZE		16
> >
> > struct nsm_private {
> > -	unsigned char		data[SM_PRIV_SIZE];
> > +	u64 data[SM_PRIV_SIZE/sizeof(u64)];
> > };
> >
> > struct svc_rqst;
> >
> > all the typecasting and the put_unaligned()s just go away then.
> 
> True enough.  It should be noted, however, that opaque data types are  
> defined as character arrays in most RPC implementations, and by  
> automated code generators like rpcgen.  The size of an RPC opaque type  
> is usually expressed by a number of octets.
> 
> In general it may be best to keep that tradition (due to a large body  
> of legacy code) and always use __aligned, simply because that clearly  
> documents the problem that is being fixed.
> 

struct nsm_private {
	union {
		unsigned char data[SM_PRIV_SIZE];
		u64 aligned_data[SM_PRIV_SIZE/sizeof(u64)];
	};
};

:)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29
  2009-04-01 19:41   ` [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29 Andrew Morton
  2009-04-01 19:50     ` Chuck Lever
@ 2009-04-02  7:48     ` Brian Murphy
       [not found]       ` <49D46DBD.508-EoLPgbz1DBzQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 7+ messages in thread
From: Brian Murphy @ 2009-04-02  7:48 UTC (permalink / raw)
  To: Andrew Morton, bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r,
	Chuck Lever, Olaf Kirch, linux-nfs

Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Wed, 1 Apr 2009 12:31:51 GMT
> bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:
>
>   
>> http://bugzilla.kernel.org/show_bug.cgi?id=12995
>>
>>            Summary: NFS mount from avr32 platform crashes on 2.6.29
>>            Product: File System
>>            Version: 2.5
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: NFS
>>         AssignedTo: trond.myklebust@fys.uio.no
>>         ReportedBy: brm-EoLPgbz1DBzQT0dZR+AlfA@public.gmane.org
>>         Regression: No
>>
>>
>> The problem turns out to be in the recent(ish) changes in lockd.
>>
>> Specifically this struct definition
>>
>> struct nsm_handle {
>>         struct list_head        sm_link;
>>         atomic_t                sm_count;
>>         char                    *sm_mon_name;
>>         char                    *sm_name;
>>         struct sockaddr_storage sm_addr;
>>         size_t                  sm_addrlen;
>>         unsigned int            sm_monitored : 1,
>>                                 sm_sticky : 1;  /* don't unmonitor */
>>         struct nsm_private      sm_priv;
>>         char                    sm_addrbuf[NSM_ADDRBUF];
>> };
>>
>> results in my avr32 compiler (and my ia64 compiler) in aligning the sm_priv
>> structure (which is a char array) on an odd boundary. The subsequent use
>> of this field typecast to a u64 (nsm_init_private) as part of an nfs mount
>> causes a crash with unaligned access.
>>
>> The compiler only allocates *one byte* to the two bit bitfield.
>> Moving the bitfield to the end of the structure fixes the problem in my case.
>> It seems to me that one should be very careful with typecasting this sm_priv
>> data to anything with larger alignment but especially to a 64 bit type since
>> (at least on a 64 bit system) this may demand 64 bit alignment.
>>
>> In any case it looks like (with newer gcc at least) that bitfields are
>> extremely dangerous.
>>
>> Perhaps the solution is to malloc the nsm_private data and sm_priv is then a
>> pointer to this data. This would guarantee the correct alignment.
>>
>>     
>
> nsm_private is:
>
> struct nsm_private {
> 	unsigned char		data[SM_PRIV_SIZE];
> };
>
> so the compiler is permitted to byte-align this.
>
> I assume that some code somewhere is accessing this with a
> larger-than-one-byte typecast?
>
> (Your bug report isn't complete - if it had included the trace then I'd
> know where the crash is occurring!)
>
>   
I wrote where the crash was occuring.
> If so then something like this:
>
> --- a/include/linux/lockd/xdr.h~a
> +++ a/include/linux/lockd/xdr.h
> @@ -9,6 +9,7 @@
>  #ifndef LOCKD_XDR_H
>  #define LOCKD_XDR_H
>  
> +#include <linux/compiler.h>
>  #include <linux/fs.h>
>  #include <linux/nfs.h>
>  #include <linux/sunrpc/xdr.h>
> @@ -16,9 +17,10 @@
>  #define SM_MAXSTRLEN		1024
>  #define SM_PRIV_SIZE		16
>  
> +/* suitable comment about the __aligned goes here */
>  struct nsm_private {
>  	unsigned char		data[SM_PRIV_SIZE];
> -};
> +} __aligned(sizeof(unsigned long));
>  
>  struct svc_rqst;
>  
> would be an appropriate fix, as it actually expresses what's going on.
>
> If you agree then please cook up and test a patch?
>
> Please send the patch via email - we very much try to avoid merging
> patches out of bugzilla.
>
> Thanks.
>   

It seems to me that there is something wrong here. Shouldn't the 
"private data"
either be declared as a struct (and no longer private) or be a void 
pointer, allocated
by the user of the private data (with malloc).
Otherwise there will always be problems when the size of the structure 
placed in
the private area grows in addition to the alignment issues.

/Brian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29
       [not found]       ` <49D46DBD.508-EoLPgbz1DBzQT0dZR+AlfA@public.gmane.org>
@ 2009-04-02 17:04         ` Chuck Lever
  0 siblings, 0 replies; 7+ messages in thread
From: Chuck Lever @ 2009-04-02 17:04 UTC (permalink / raw)
  To: Brian Murphy
  Cc: Andrew Morton, bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r,
	Olaf Kirch, linux-nfs

On Apr 2, 2009, at 3:48 AM, Brian Murphy wrote:
> Andrew Morton wrote:
>> (switched to email.  Please respond via emailed reply-to-all, not  
>> via the
>> bugzilla web interface).
>>
>> On Wed, 1 Apr 2009 12:31:51 GMT
>> bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:
>>
>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=12995
>>>
>>>           Summary: NFS mount from avr32 platform crashes on 2.6.29
>>>           Product: File System
>>>           Version: 2.5
>>>          Platform: All
>>>        OS/Version: Linux
>>>              Tree: Mainline
>>>            Status: NEW
>>>          Severity: normal
>>>          Priority: P1
>>>         Component: NFS
>>>        AssignedTo: trond.myklebust@fys.uio.no
>>>        ReportedBy: brm-EoLPgbz1DBzQT0dZR+AlfA@public.gmane.org
>>>        Regression: No
>>>
>>>
>>> The problem turns out to be in the recent(ish) changes in lockd.
>>>
>>> Specifically this struct definition
>>>
>>> struct nsm_handle {
>>>        struct list_head        sm_link;
>>>        atomic_t                sm_count;
>>>        char                    *sm_mon_name;
>>>        char                    *sm_name;
>>>        struct sockaddr_storage sm_addr;
>>>        size_t                  sm_addrlen;
>>>        unsigned int            sm_monitored : 1,
>>>                                sm_sticky : 1;  /* don't unmonitor */
>>>        struct nsm_private      sm_priv;
>>>        char                    sm_addrbuf[NSM_ADDRBUF];
>>> };
>>>
>>> results in my avr32 compiler (and my ia64 compiler) in aligning  
>>> the sm_priv
>>> structure (which is a char array) on an odd boundary. The  
>>> subsequent use
>>> of this field typecast to a u64 (nsm_init_private) as part of an  
>>> nfs mount
>>> causes a crash with unaligned access.
>>>
>>> The compiler only allocates *one byte* to the two bit bitfield.
>>> Moving the bitfield to the end of the structure fixes the problem  
>>> in my case.
>>> It seems to me that one should be very careful with typecasting  
>>> this sm_priv
>>> data to anything with larger alignment but especially to a 64 bit  
>>> type since
>>> (at least on a 64 bit system) this may demand 64 bit alignment.
>>>
>>> In any case it looks like (with newer gcc at least) that bitfields  
>>> are
>>> extremely dangerous.
>>>
>>> Perhaps the solution is to malloc the nsm_private data and sm_priv  
>>> is then a
>>> pointer to this data. This would guarantee the correct alignment.
>>>
>>>
>>
>> nsm_private is:
>>
>> struct nsm_private {
>> 	unsigned char		data[SM_PRIV_SIZE];
>> };
>>
>> so the compiler is permitted to byte-align this.
>>
>> I assume that some code somewhere is accessing this with a
>> larger-than-one-byte typecast?
>>
>> (Your bug report isn't complete - if it had included the trace then  
>> I'd
>> know where the crash is occurring!)
>>
>>
> I wrote where the crash was occuring.
>> If so then something like this:
>>
>> --- a/include/linux/lockd/xdr.h~a
>> +++ a/include/linux/lockd/xdr.h
>> @@ -9,6 +9,7 @@
>> #ifndef LOCKD_XDR_H
>> #define LOCKD_XDR_H
>> +#include <linux/compiler.h>
>> #include <linux/fs.h>
>> #include <linux/nfs.h>
>> #include <linux/sunrpc/xdr.h>
>> @@ -16,9 +17,10 @@
>> #define SM_MAXSTRLEN		1024
>> #define SM_PRIV_SIZE		16
>> +/* suitable comment about the __aligned goes here */
>> struct nsm_private {
>> 	unsigned char		data[SM_PRIV_SIZE];
>> -};
>> +} __aligned(sizeof(unsigned long));
>>  struct svc_rqst;
>> would be an appropriate fix, as it actually expresses what's going  
>> on.
>>
>> If you agree then please cook up and test a patch?
>>
>> Please send the patch via email - we very much try to avoid merging
>> patches out of bugzilla.
>>
>> Thanks.
>>
>
> It seems to me that there is something wrong here. Shouldn't the  
> "private data"
> either be declared as a struct (and no longer private) or be a void  
> pointer, allocated
> by the user of the private data (with malloc).
> Otherwise there will always be problems when the size of the  
> structure placed in
> the private area grows in addition to the alignment issues.

nsm_private is a struct which allows an in-core representation of an  
opaque object passed via a standard network protocol called NSM.

The contents of the struct are determined by the kernel, so they are  
not really private in this context.  The receiving end (rpc.statd, in  
user space) does not interpret the contents, it merely stores them and  
returns them when requested.  The use of the term "private" here  
matches the name of the argument to the SM_MON RPC procedure.  It is  
not the same as the term "private" when referring to, say, a file  
system-specific data structure chained off of an inode.

We can be fairly certain the size of the object will not change.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-04-02 17:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <bug-12995-10286@http.bugzilla.kernel.org/>
     [not found] ` <bug-12995-10286-V0hAGp6uBxO456/isadD/XN4h3HLQggn@public.gmane.org/>
2009-04-01 19:41   ` [Bugme-new] [Bug 12995] New: NFS mount from avr32 platform crashes on 2.6.29 Andrew Morton
2009-04-01 19:50     ` Chuck Lever
2009-04-01 20:07       ` Andrew Morton
2009-04-01 20:34         ` Chuck Lever
2009-04-01 20:46           ` Andrew Morton
2009-04-02  7:48     ` Brian Murphy
     [not found]       ` <49D46DBD.508-EoLPgbz1DBzQT0dZR+AlfA@public.gmane.org>
2009-04-02 17:04         ` Chuck Lever

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.