linux-nfs.vger.kernel.org archive mirror
* [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check
@ 2012-02-23  9:16 Gregor Gruener
  2012-02-23 13:02 ` Jim Rees
  2012-02-29 20:48 ` Steve Dickson
  0 siblings, 2 replies; 9+ messages in thread
From: Gregor Gruener @ 2012-02-23  9:16 UTC
  To: linux-nfs

Customers are using Unicode characters such as umlauts (ö, ä, ü) in group
names, and this causes problems with NFS ID mapping. Groups whose names
contain umlauts are mapped to the NFS group "nobody".
This patch deactivates the ASCII character check. It is probably more of a
temporary fix; I think it would be nicer to adjust the check to support
Unicode characters instead of deactivating it.

========================================================================================================

diff -Nurp nfs-utils-1.2.2/utils/idmapd/idmapd.c nfs-utils-1.2.2-fag/utils/idmapd/idmapd.c
--- nfs-utils-1.2.2/utils/idmapd/idmapd.c	2010-02-18 13:35:00.000000000 +0100
+++ nfs-utils-1.2.2-fag/utils/idmapd/idmapd.c	2011-10-05 09:09:05.815103100 +0200
@@ -850,9 +850,6 @@ validateascii(char *string, u_int32_t le
 	for (i = 0; i < len; i++) {
 		if (string[i] == '\0')
 			break;
-
-		if (string[i] & 0x80)
-			return (-1);
 	}

 	if ((i >= len) || string[i] != '\0')



=========================================================================================================
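
To illustrate why the removed test rejects these names: in UTF-8, every
non-ASCII character is encoded using bytes whose high bit (0x80) is set, so
"ü" becomes the two bytes 0xC3 0xBC. The following is a minimal sketch only;
has_high_bit() is a hypothetical helper written for this illustration and is
not a function in idmapd.c.

```c
#include <stddef.h>

/*
 * Hypothetical helper mirroring the removed test: returns 1 if any byte
 * before the terminating NUL has its high bit (0x80) set, 0 otherwise.
 * A name for which this returns 1 would have failed the old check.
 */
static int has_high_bit(const char *s)
{
	size_t i;

	for (i = 0; s[i] != '\0'; i++) {
		if ((unsigned char)s[i] & 0x80)
			return 1;
	}
	return 0;
}
```

A pure ASCII name such as "staff" passes, while "büro" encoded as UTF-8
("b\xc3\xbcro") trips the test on its second byte.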




Developer's Certificate of Origin 1.1

         By making a contribution to this project, I certify that:

         (a) The contribution was created in whole or in part by me and I
             have the right to submit it under the open source license
             indicated in the file; or

         (b) The contribution is based upon previous work that, to the best
             of my knowledge, is covered under an appropriate open source
             license and I have the right under that license to submit that
             work with modifications, whether created in whole or in part
             by me, under the same open source license (unless I am
             permitted to submit under a different license), as indicated
             in the file; or

         (c) The contribution was provided directly to me by some other
             person who certified (a), (b) or (c) and I have not modified
             it.

         (d) I understand and agree that this project and the contribution
             are public and that a record of the contribution (including all
             personal information I submit with it, including my sign-off) is
             maintained indefinitely and may be redistributed consistent with
             this project or the open source license(s) involved.

Signed-off-by: Gregor Gruener <ggruner@redhat.com>



* Re: [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check
  2012-02-23  9:16 [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check Gregor Gruener
@ 2012-02-23 13:02 ` Jim Rees
  2012-02-28 19:55   ` J. Bruce Fields
  2012-02-29 20:48 ` Steve Dickson
  1 sibling, 1 reply; 9+ messages in thread
From: Jim Rees @ 2012-02-23 13:02 UTC
  To: Gregor Gruener; +Cc: linux-nfs

Gregor Gruener wrote:

  Customers are using Unicode characters such as umlauts (ö, ä, ü) in group
  names, and this causes problems with NFS ID mapping. Groups whose names
  contain umlauts are mapped to the NFS group "nobody".
  This patch deactivates the ASCII character check. It is probably more of a
  temporary fix; I think it would be nicer to adjust the check to support
  Unicode characters instead of deactivating it.

Maybe the name of the routine should also be changed to something other than
validateascii().  But I'm curious why that check was put in there.

I found this in rfc5661 section 22.1, which seems to be a bug in the spec:

   1.  A US-ASCII string name that is the actual name of the attribute.
       This name must be unique.  This string name can be 1 to 128 UTF-8
       characters long.


* Re: [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check
  2012-02-23 13:02 ` Jim Rees
@ 2012-02-28 19:55   ` J. Bruce Fields
  0 siblings, 0 replies; 9+ messages in thread
From: J. Bruce Fields @ 2012-02-28 19:55 UTC
  To: Jim Rees; +Cc: Gregor Gruener, linux-nfs

On Thu, Feb 23, 2012 at 08:02:45AM -0500, Jim Rees wrote:
> Gregor Gruener wrote:
> 
>   Customers are using Unicode characters such as umlauts (ö, ä, ü) in group
>   names, and this causes problems with NFS ID mapping. Groups whose names
>   contain umlauts are mapped to the NFS group "nobody".
>   This patch deactivates the ASCII character check. It is probably more of a
>   temporary fix; I think it would be nicer to adjust the check to support
>   Unicode characters instead of deactivating it.
> 
> Maybe the name of the routine should also be changed to something other than
> validateascii().  But I'm curious why that check was put in there.

Sounds totally wrong.

By the nfs rfc's of course, if we're going to check anything we should
be checking for utf8.

Though even then, it's hard to see how failing and mapping to nobody
really helps anyone here once they already have a non-utf8 name.

Best might be to allow a non-utf8 mapping and print a one-time warning.
And work on fixing any account-creation tools to enforce utf8.
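
The suggestion above (check for UTF-8 rather than ASCII, but accept the name
anyway and warn only once) could be sketched roughly as follows. This is an
illustration under stated assumptions, not code from nfs-utils:
is_valid_utf8() and note_name_encoding() are hypothetical names, and the
warning goes to stderr here only to keep the example self-contained.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if s[0..len) is well-formed UTF-8 (RFC 3629), else 0. */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
	size_t i = 0;

	while (i < len) {
		unsigned char c = s[i];
		unsigned long cp;	/* decoded code point */
		size_t n, k;		/* n = continuation bytes expected */

		if (c < 0x80) {			/* plain ASCII byte */
			i++;
			continue;
		} else if ((c & 0xE0) == 0xC0) {
			n = 1; cp = c & 0x1F;
		} else if ((c & 0xF0) == 0xE0) {
			n = 2; cp = c & 0x0F;
		} else if ((c & 0xF8) == 0xF0) {
			n = 3; cp = c & 0x07;
		} else {
			return 0;		/* stray continuation or bad lead byte */
		}

		if (len - i <= n)
			return 0;		/* sequence truncated */
		for (k = 1; k <= n; k++) {
			if ((s[i + k] & 0xC0) != 0x80)
				return 0;	/* not a continuation byte */
			cp = (cp << 6) | (s[i + k] & 0x3F);
		}

		/* Reject overlong forms, surrogates, and values past U+10FFFF. */
		if ((n == 1 && cp < 0x80) || (n == 2 && cp < 0x800) ||
		    (n == 3 && cp < 0x10000))
			return 0;
		if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
			return 0;

		i += n + 1;
	}
	return 1;
}

/*
 * Hypothetical policy hook: never rejects a name, but prints a warning
 * the first time a non-UTF-8 name is seen. Returns 1 if the name is
 * valid UTF-8 and 0 otherwise, so callers can still tell the difference.
 */
static int note_name_encoding(const char *name)
{
	static int warned;

	if (is_valid_utf8((const unsigned char *)name, strlen(name)))
		return 1;
	if (!warned) {
		warned = 1;
		fprintf(stderr, "warning: name is not valid UTF-8; mapping it anyway\n");
	}
	return 0;
}
```

With this split, the mapping path never fails on encoding alone; the return
value of note_name_encoding() is informational.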

> 
> I found this in rfc5661 section 22.1, which seems to be a bug in the spec:
> 
>    1.  A US-ASCII string name that is the actual name of the attribute.
>        This name must be unique.  This string name can be 1 to 128 UTF-8
>        characters long.

Oops.

--b.


* Re: [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check
  2012-02-23  9:16 [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check Gregor Gruener
  2012-02-23 13:02 ` Jim Rees
@ 2012-02-29 20:48 ` Steve Dickson
  2012-02-29 21:38   ` Jim Rees
                     ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: Steve Dickson @ 2012-02-29 20:48 UTC
  To: Gregor Gruener; +Cc: linux-nfs



On 02/23/2012 04:16 AM, Gregor Gruener wrote:
> Customers are using Unicode characters such as umlauts (ö, ä, ü) in group
> names, and this causes problems with NFS ID mapping. Groups whose names
> contain umlauts are mapped to the NFS group "nobody".
> This patch deactivates the ASCII character check. It is probably more of a
> temporary fix; I think it would be nicer to adjust the check to support
> Unicode characters instead of deactivating it.
> 
> ========================================================================================================
> 
> diff -Nurp nfs-utils-1.2.2/utils/idmapd/idmapd.c nfs-utils-1.2.2-fag/utils/idmapd/idmapd.c
> --- nfs-utils-1.2.2/utils/idmapd/idmapd.c    2010-02-18 13:35:00.000000000 +0100
> +++ nfs-utils-1.2.2-fag/utils/idmapd/idmapd.c    2011-10-05 09:09:05.815103100 +0200
> @@ -850,9 +850,6 @@ validateascii(char *string, u_int32_t le
>      for (i = 0; i < len; i++) {
>          if (string[i] == '\0')
>              break;
> -
> -        if (string[i] & 0x80)
> -            return (-1);
>      }
> 
>      if ((i >= len) || string[i] != '\0')
> 
> 
> 
> =========================================================================================================
> 
> 
> 
> 
> Developer's Certificate of Origin 1.1
> 
>         By making a contribution to this project, I certify that:
> 
>         (a) The contribution was created in whole or in part by me and I
>             have the right to submit it under the open source license
>             indicated in the file; or
> 
>         (b) The contribution is based upon previous work that, to the best
>             of my knowledge, is covered under an appropriate open source
>             license and I have the right under that license to submit that
>             work with modifications, whether created in whole or in part
>             by me, under the same open source license (unless I am
>             permitted to submit under a different license), as indicated
>             in the file; or
> 
>         (c) The contribution was provided directly to me by some other
>             person who certified (a), (b) or (c) and I have not modified
>             it.
> 
>         (d) I understand and agree that this project and the contribution
>             are public and that a record of the contribution (including all
>             personal information I submit with it, including my sign-off) is
>             maintained indefinitely and may be redistributed consistent with
>             this project or the open source license(s) involved.
> 
> Signed-off-by: Gregor Gruener <ggruner@redhat.com>
Committed... 

steved.



* Re: [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check
  2012-02-29 20:48 ` Steve Dickson
@ 2012-02-29 21:38   ` Jim Rees
  2012-03-01 15:22     ` Steve Dickson
  2012-05-17 17:40   ` Sorin Faibish
  2012-05-17 17:43   ` pNFS block performance evaluation (sorry for the wrong subject title) Sorin Faibish
  2 siblings, 1 reply; 9+ messages in thread
From: Jim Rees @ 2012-02-29 21:38 UTC
  To: Steve Dickson; +Cc: Gregor Gruener, linux-nfs

Steve Dickson wrote:

  On 02/23/2012 04:16 AM, Gregor Gruener wrote:
  > Customers are using Unicode characters such as umlauts (ö, ä, ü) in group
  > names, and this causes problems with NFS ID mapping. Groups whose names
  > contain umlauts are mapped to the NFS group "nobody".
  > This patch deactivates the ASCII character check. It is probably more of a
  > temporary fix; I think it would be nicer to adjust the check to support
  > Unicode characters instead of deactivating it.
  > Signed-off-by: Gregor Gruener <ggruner@redhat.com>
  ...
  Committed... 

I still think the name should be changed.  The only thing validateascii() is
doing now is verifying that the string is null terminated.
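
As a sketch of what is left once the high-bit test is gone: the routine
reduces to "is there a NUL within the first len bytes?". The name
check_terminated() below is purely hypothetical (no replacement name is
proposed in this thread), and the 0 / -1 return convention is an assumption
made for the example rather than the actual return values in idmapd.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the post-patch behaviour: succeed (0) only if
 * buf contains a terminating NUL within its first len bytes, else fail (-1).
 * No restriction is placed on the byte values before the NUL.
 */
static int check_terminated(const char *buf, size_t len)
{
	return memchr(buf, '\0', len) != NULL ? 0 : -1;
}
```

A UTF-8 name such as "b\xc3\xbcro" now passes as long as the buffer length
covers its terminator, which is the behaviour change the patch is after.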


* Re: [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check
  2012-02-29 21:38   ` Jim Rees
@ 2012-03-01 15:22     ` Steve Dickson
  0 siblings, 0 replies; 9+ messages in thread
From: Steve Dickson @ 2012-03-01 15:22 UTC
  To: Jim Rees; +Cc: Gregor Gruener, linux-nfs

Hey,

On 02/29/2012 04:38 PM, Jim Rees wrote:
> Steve Dickson wrote:
> 
>   On 02/23/2012 04:16 AM, Gregor Gruener wrote:
>   > Customers are using Unicode characters such as umlauts (ö, ä, ü) in group
>   > names, and this causes problems with NFS ID mapping. Groups whose names
>   > contain umlauts are mapped to the NFS group "nobody".
>   > This patch deactivates the ASCII character check. It is probably more of a
>   > temporary fix; I think it would be nicer to adjust the check to support
>   > Unicode characters instead of deactivating it.
>   > Signed-off-by: Gregor Gruener <ggruner@redhat.com>
>   ...
>   Committed... 
> 
> I still think the name should be changed.  The only thing validateascii() is
> doing now is verifying that the string is null terminated.
Sorry Jim...  I kinda overlooked that part of the discussion... 
I guess validateascii() is not a published interface so we
could change it... any suggestions? (Patches always welcome! ;-) )

steved. 


* Re: [PATCH 2.6.32-131.6.1.el6.x86_64] idmapd.c: deactivate the ASCII characters check
  2012-02-29 20:48 ` Steve Dickson
  2012-02-29 21:38   ` Jim Rees
@ 2012-05-17 17:40   ` Sorin Faibish
  2012-05-17 17:43   ` pNFS block performance evaluation (sorry for the wrong subject title) Sorin Faibish
  2 siblings, 0 replies; 9+ messages in thread
From: Sorin Faibish @ 2012-05-17 17:40 UTC
  To: linux-nfs@vger.kernel.org

Team,

These are some preliminary results with pNFS block. They look very
promising in both performance and scalability. They uncovered some
instability, but mainly due to HW issues; no SW issues. These results are
our proof that the latest pNFS block code is stable and performing
reasonably well.

For reference, single-client pNFS write performance was 86 MB/sec.
Thank you very much for your patience and support.



We’ve finished the testing and analysis of pNFS vs. MPFS.  It took much
longer than expected due to several stability issues uncovered on both the
client and the server.  Some of the issues occurred during read testing,
which had never been a focus of previous testing efforts.  Anyway, please
find the results below.  To help interpret them, please refer to the
following key:

SW8xN-MPFS     Sequential Write 8 clients by N threads using MPFS (special build using latest kernel on Fedora 15)
SR8xN-MPFS     Sequential Read 8 clients by N threads using MPFS (special build on Fedora 15)
SW8xN-pNFS     Sequential Write 8 clients by N threads using pNFS (on Fedora 15)
SR8xN-pNFS     Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
SW8xN-pNFS2    Sequential Write 8 clients by N threads using pNFS (on Fedora 15)
SR8xN-pNFS2    Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
SR8xN-pNFS3    Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
SW8xN-MPFS2    Sequential Write 8 clients by N threads using MPFS
SR8xN-MPFS2    Sequential Read 8 clients by N threads using MPFS

I think/hope the rest is self-explanatory based on labels, etc.  To  
summarize:


         pNFS2        pNFS4vs.MPFS
SW8x1    653121.85     -8%
SW8x2    835101.05    -10%
SW8x3    846188.63     -8%
SW8x4    852617.66     -6%
SR8x1    674129.88     43%
SR8x2    634800.33     34%
SR8x3    629843.20     28%
SR8x4    577197.70     19%

Let me know what you think or if you have any questions.  I’m moving on to  
other projects that I’ve committed to and am currently behind on.

-- 
--
Best Regards

Sorin Faibish
Corporate Distinguished Engineer
Fast Data Group - Office of CTO
                           EMC2
where information lives

Phone: 508-249-5745
Mobile: 617-510-0422
email: sfaibish@emc.com


* pNFS block performance evaluation (sorry for the wrong subject title)
  2012-02-29 20:48 ` Steve Dickson
  2012-02-29 21:38   ` Jim Rees
  2012-05-17 17:40   ` Sorin Faibish
@ 2012-05-17 17:43   ` Sorin Faibish
  2012-05-18  9:37     ` tao.peng
  2 siblings, 1 reply; 9+ messages in thread
From: Sorin Faibish @ 2012-05-17 17:43 UTC
  To: linux-nfs@vger.kernel.org

Team,

These are some preliminary results with pNFS block. They look very
promising in both performance and scalability. They uncovered some
instability, but mainly due to HW issues; no SW issues. These results are
our proof that the latest pNFS block code is stable and performing
reasonably well.

For reference, single-client pNFS write performance was 86 MB/sec.
Thank you very much for your patience and support.



We’ve finished the testing and analysis of pNFS vs. MPFS.  It took much
longer than expected due to several stability issues uncovered on both the
client and the server.  Some of the issues occurred during read testing,
which had never been a focus of previous testing efforts.  Anyway, please
find the results below.  To help interpret them, please refer to the
following key:

SW8xN-MPFS     Sequential Write 8 clients by N threads using MPFS (special build using latest kernel on Fedora 15)
SR8xN-MPFS     Sequential Read 8 clients by N threads using MPFS (special build on Fedora 15)
SW8xN-pNFS     Sequential Write 8 clients by N threads using pNFS (on Fedora 15)
SR8xN-pNFS     Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
SW8xN-pNFS2    Sequential Write 8 clients by N threads using pNFS (on Fedora 15)
SR8xN-pNFS2    Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
SR8xN-pNFS3    Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
SW8xN-MPFS2    Sequential Write 8 clients by N threads using MPFS
SR8xN-MPFS2    Sequential Read 8 clients by N threads using MPFS

I think/hope the rest is self-explanatory based on labels, etc.  To
summarize:


         pNFS2        pNFS4vs.MPFS
SW8x1    653121.85     -8%
SW8x2    835101.05    -10%
SW8x3    846188.63     -8%
SW8x4    852617.66     -6%
SR8x1    674129.88     43%
SR8x2    634800.33     34%
SR8x3    629843.20     28%
SR8x4    577197.70     19%

Let me know what you think or if you have any questions.  I’m moving on to
other projects that I’ve committed to and am currently behind on.

-- 
--
Best Regards

Sorin Faibish
Corporate Distinguished Engineer
Fast Data Group - Office of CTO
                             EMC2
where information lives

Phone: 508-249-5745
Mobile: 617-510-0422
email: sfaibish@emc.com


* RE: pNFS block performance evaluation (sorry for the wrong subject title)
  2012-05-17 17:43   ` pNFS block performance evaluation (sorry for the wrong subject title) Sorin Faibish
@ 2012-05-18  9:37     ` tao.peng
  0 siblings, 0 replies; 9+ messages in thread
From: tao.peng @ 2012-05-18  9:37 UTC
  To: faibish_sorin, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 3312 bytes --]

> -----Original Message-----
> From: linux-nfs-owner@vger.kernel.org [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of Sorin
> Faibish
> Sent: Friday, May 18, 2012 1:43 AM
> To: linux-nfs@vger.kernel.org
> Subject: pNFS block performance evaluation (sorry for the wrong subject title)
> 
> Team,
> 
> These are some preliminary results with pNFS block. They look very
> promising in both performance and scalability. They uncovered some
> instability, but mainly due to HW issues; no SW issues. These results are
> our proof that the latest pNFS block code is stable and performing
> reasonably well.
> 
To be specific, the performance test was run with some additional patches
that make the client send the real I/O size in LAYOUTGET requests. I have
attached the patches to this email for your reference. These patches were
sent out before and drew some objections. I will rework them to address the
comments and resend them later.

Thanks,
Tao
> For reference, single-client pNFS write performance was 86 MB/sec.
> Thank you very much for your patience and support.
> 
> 
> 
> We’ve finished the testing and analysis of pNFS vs. MPFS.  It took much
> longer than expected due to several stability issues uncovered on both the
> client and the server.  Some of the issues occurred during read testing,
> which had never been a focus of previous testing efforts.  Anyway, please
> find the results below.  To help with the below interpretation, please
> reference the following key:
> 
> SW8xN-MPFS     Sequential Write 8 clients by N threads using MPFS (special build using latest kernel on Fedora 15)
> SR8xN-MPFS     Sequential Read 8 clients by N threads using MPFS (special build on Fedora 15)
> SW8xN-pNFS     Sequential Write 8 clients by N threads using pNFS (on Fedora 15)
> SR8xN-pNFS     Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
> SW8xN-pNFS2    Sequential Write 8 clients by N threads using pNFS (on Fedora 15)
> SR8xN-pNFS2    Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
> SR8xN-pNFS3    Sequential Read 8 clients by N threads using pNFS (on Fedora 15)
> SW8xN-MPFS2    Sequential Write 8 clients by N threads using MPFS
> SR8xN-MPFS2    Sequential Read 8 clients by N threads using MPFS
> 
> I think/hope the rest is self-explanatory based on labels, etc.  To
> summarize:
> 
> 
>          pNFS2        pNFS4vs.MPFS
> SW8x1    653121.85     -8%
> SW8x2    835101.05    -10%
> SW8x3    846188.63     -8%
> SW8x4    852617.66     -6%
> SR8x1    674129.88     43%
> SR8x2    634800.33     34%
> SR8x3    629843.20     28%
> SR8x4    577197.70     19%
> 
> Let me know what you think or if you have any questions.  I’m moving on to
> other projects that I’ve committed to and am currently behind on.
> 
> --
> --
> Best Regards
> 
> Sorin Faibish
> Corporate Distinguished Engineer
> Fast Data Group - Office of CTO
>                              EMC2
> where information lives
> 
> Phone: 508-249-5745
> Mobile: 617-510-0422
> email: sfaibish@emc.com


[-- Attachment #2: 0001-nfsv41-export-pnfs_find_alloc_layout.patch --]
[-- Type: application/octet-stream, Size: 1490 bytes --]

From 55a7c6a10189d04645a84e9b2eb3d9b5d2f7e44d Mon Sep 17 00:00:00 2001
From: Peng Tao <bergwolf@gmail.com>
Date: Sat, 19 Nov 2011 04:23:26 -0500
Subject: [PATCH 1/4] nfsv41: export pnfs_find_alloc_layout

So that a layout driver can obtain the layout header, allocating one when
there is none.

Signed-off-by: Peng Tao <peng_tao@emc.com>
---
 fs/nfs/pnfs.c |    3 ++-
 fs/nfs/pnfs.h |    4 ++++
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index baf7353..3be29c7 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -848,7 +848,7 @@ alloc_init_layout_hdr(struct inode *ino,
 	return lo;
 }
 
-static struct pnfs_layout_hdr *
+struct pnfs_layout_hdr *
 pnfs_find_alloc_layout(struct inode *ino,
 		       struct nfs_open_context *ctx,
 		       gfp_t gfp_flags)
@@ -875,6 +875,7 @@ pnfs_find_alloc_layout(struct inode *ino,
 		pnfs_free_layout_hdr(new);
 	return nfsi->layout;
 }
+EXPORT_SYMBOL_GPL(pnfs_find_alloc_layout);
 
 /*
  * iomode matching rules:
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 1509530..9614ac9 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -209,6 +209,10 @@ struct pnfs_layout_segment *pnfs_update_layout(struct inode *ino,
 					       u64 count,
 					       enum pnfs_iomode iomode,
 					       gfp_t gfp_flags);
+struct pnfs_layout_hdr *
+pnfs_find_alloc_layout(struct inode *ino,
+		       struct nfs_open_context *ctx,
+		       gfp_t gfp_flags);
 
 void nfs4_deviceid_mark_client_invalid(struct nfs_client *clp);
 
-- 
1.6.6


[-- Attachment #3: 0002-nfsv41-add-and-export-pnfs_find_get_layout_locked.patch --]
[-- Type: application/octet-stream, Size: 1817 bytes --]

From 904e0e42d12b86f8b03791a325b90c1a415b9dcb Mon Sep 17 00:00:00 2001
From: Peng Tao <bergwolf@gmail.com>
Date: Sat, 19 Nov 2011 04:27:00 -0500
Subject: [PATCH 2/4] nfsv41: add and export pnfs_find_get_layout_locked

It tries to find the lseg in the local cache but does not retrieve the
layout from the server.

Signed-off-by: Peng Tao <peng_tao@emc.com>
---
 fs/nfs/pnfs.c |   25 +++++++++++++++++++++++++
 fs/nfs/pnfs.h |    5 +++++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 3be29c7..734e670 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -933,6 +933,31 @@ pnfs_find_lseg(struct pnfs_layout_hdr *lo,
 }
 
 /*
+ * Find and reference lseg with ino->i_lock held.
+ */
+struct pnfs_layout_segment *
+pnfs_find_get_layout_locked(struct inode *ino,
+			loff_t pos,
+			u64 count,
+			enum pnfs_iomode iomode)
+{
+	struct pnfs_layout_segment *lseg = NULL;
+	struct pnfs_layout_range range = {
+		.iomode = iomode,
+		.offset = pos,
+		.length = count,
+	};
+
+	if (NFS_I(ino)->layout == NULL)
+		goto out;
+
+	lseg = pnfs_find_lseg(NFS_I(ino)->layout, &range);
+out:
+	return lseg;
+}
+EXPORT_SYMBOL_GPL(pnfs_find_get_layout_locked);
+
+/*
  * Layout segment is retreived from the server if not cached.
  * The appropriate layout segment is referenced and returned to the caller.
  */
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 9614ac9..0c55fc1 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -213,6 +213,11 @@ struct pnfs_layout_hdr *
 pnfs_find_alloc_layout(struct inode *ino,
 		       struct nfs_open_context *ctx,
 		       gfp_t gfp_flags);
+struct pnfs_layout_segment *
+pnfs_find_get_layout_locked(struct inode *ino,
+			loff_t pos,
+			u64 count,
+			enum pnfs_iomode iomode);
 
 void nfs4_deviceid_mark_client_invalid(struct nfs_client *clp);
 
-- 
1.6.6


[-- Attachment #4: 0003-nfsv41-get-lseg-before-issue-LD-IO-if-pgio-doesn-t-c.patch --]
[-- Type: application/octet-stream, Size: 2962 bytes --]

From d3d1626dd2d362aee6c971591c37a49a5c5e1031 Mon Sep 17 00:00:00 2001
From: Peng Tao <bergwolf@gmail.com>
Date: Sat, 19 Nov 2011 04:33:39 -0500
Subject: [PATCH 3/4] nfsv41: get lseg before issue LD IO if pgio doesn't carry one

This gives the LD the option not to ask for a layout in pg_init.

Signed-off-by: Peng Tao <peng_tao@emc.com>
---
 fs/nfs/pnfs.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 734e670..c8dc0b1 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1254,6 +1254,7 @@ pnfs_do_multiple_writes(struct nfs_pageio_descriptor *desc, struct list_head *he
 	struct nfs_write_data *data;
 	const struct rpc_call_ops *call_ops = desc->pg_rpc_callops;
 	struct pnfs_layout_segment *lseg = desc->pg_lseg;
+	const bool has_lseg = !!lseg;
 
 	desc->pg_lseg = NULL;
 	while (!list_empty(head)) {
@@ -1262,7 +1263,29 @@ pnfs_do_multiple_writes(struct nfs_pageio_descriptor *desc, struct list_head *he
 		data = list_entry(head->next, struct nfs_write_data, list);
 		list_del_init(&data->list);
 
+		if (!has_lseg) {
+			struct nfs_page *req = nfs_list_entry(data->pages.next);
+			__u64 length = data->npages << PAGE_CACHE_SHIFT;
+
+			lseg = pnfs_update_layout(desc->pg_inode,
+						  req->wb_context,
+						  req_offset(req),
+						  length,
+						  IOMODE_RW,
+						  GFP_NOFS);
+			if (!lseg || length > (lseg->pls_range.length)) {
+				put_lseg(lseg);
+				lseg = NULL;
+				pnfs_write_through_mds(desc,data);
+				continue;
+			}
+		}
+
 		trypnfs = pnfs_try_to_write_data(data, call_ops, lseg, how);
+		if (!has_lseg) {
+			put_lseg(lseg);
+			lseg = NULL;
+		}
 		if (trypnfs == PNFS_NOT_ATTEMPTED)
 			pnfs_write_through_mds(desc, data);
 	}
@@ -1350,6 +1373,7 @@ pnfs_do_multiple_reads(struct nfs_pageio_descriptor *desc, struct list_head *hea
 	struct nfs_read_data *data;
 	const struct rpc_call_ops *call_ops = desc->pg_rpc_callops;
 	struct pnfs_layout_segment *lseg = desc->pg_lseg;
+	const bool has_lseg = !!lseg;
 
 	desc->pg_lseg = NULL;
 	while (!list_empty(head)) {
@@ -1358,7 +1382,29 @@ pnfs_do_multiple_reads(struct nfs_pageio_descriptor *desc, struct list_head *hea
 		data = list_entry(head->next, struct nfs_read_data, list);
 		list_del_init(&data->list);
 
+		if (!has_lseg) {
+			struct nfs_page *req = nfs_list_entry(data->pages.next);
+			__u64 length = data->npages << PAGE_CACHE_SHIFT;
+
+			lseg = pnfs_update_layout(desc->pg_inode,
+						  req->wb_context,
+						  req_offset(req),
+						  length,
+						  IOMODE_READ,
+						  GFP_KERNEL);
+			if (!lseg || length > lseg->pls_range.length) {
+				put_lseg(lseg);
+				lseg = NULL;
+				pnfs_read_through_mds(desc, data);
+				continue;
+			}
+		}
+
 		trypnfs = pnfs_try_to_read_data(data, call_ops, lseg);
+		if (!has_lseg) {
+			put_lseg(lseg);
+			lseg = NULL;
+		}
 		if (trypnfs == PNFS_NOT_ATTEMPTED)
 			pnfs_read_through_mds(desc, data);
 	}
-- 
1.6.6


[-- Attachment #5: 0004-pnfsblock-do-not-ask-for-layout-in-pg_init.patch --]
[-- Type: application/octet-stream, Size: 3116 bytes --]

From 9813911e4763a36d3340a0ec9bdc118dc8d253b0 Mon Sep 17 00:00:00 2001
From: Peng Tao <bergwolf@gmail.com>
Date: Sat, 19 Nov 2011 04:46:49 -0500
Subject: [PATCH 4/4] pnfsblock: do not ask for layout in pg_init

Asking for a layout in pg_init will always make the client ask for only a
4KB layout in every layoutget. This way, the client drops the IO size
information that is meaningful for the MDS in handing out layouts.

Instead, if the layout is not found in the cache, do not send layoutget
at once. Wait until just before issuing IO in pnfs_do_multiple_reads/writes,
because that is where we know the real size of the current IO. By telling
the MDS the real IO size, we give it a better chance to hand out a proper
layout.

Signed-off-by: Peng Tao <peng_tao@emc.com>
---
 fs/nfs/blocklayout/blocklayout.c |   54 ++++++++++++++++++++++++++++++++++++-
 1 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index 48cfac3..fd585fe 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -39,6 +39,7 @@
 #include <linux/prefetch.h>
 
 #include "blocklayout.h"
+#include "../internal.h"
 
 #define NFSDBG_FACILITY	NFSDBG_PNFS_LD
 
@@ -990,14 +991,63 @@ bl_clear_layoutdriver(struct nfs_server *server)
 	return 0;
 }
 
+/* While RFC doesn't limit maximum size of layout, we better limit it ourself. */
+#define PNFSBLK_MAXRSIZE (0x1<<22)
+#define PNFSBLK_MAXWSIZE (0x1<<21)
+static void
+bl_pg_init_read(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
+{
+	struct inode *ino = pgio->pg_inode;
+	struct pnfs_layout_hdr *lo;
+
+	BUG_ON(pgio->pg_lseg != NULL);
+	spin_lock(&ino->i_lock);
+	lo = pnfs_find_alloc_layout(ino, req->wb_context, GFP_KERNEL);
+	if (!lo || test_bit(lo_fail_bit(IOMODE_READ), &lo->plh_flags)) {
+		spin_unlock(&ino->i_lock);
+		nfs_pageio_reset_read_mds(pgio);
+		return;
+	}
+
+	pgio->pg_bsize = PNFSBLK_MAXRSIZE;
+	pgio->pg_lseg = pnfs_find_get_layout_locked(ino,
+						req_offset(req),
+						req->wb_bytes,
+						IOMODE_READ);
+	spin_unlock(&ino->i_lock);
+}
+
+static void
+bl_pg_init_write(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
+{
+	struct inode *ino = pgio->pg_inode;
+	struct pnfs_layout_hdr *lo;
+
+	BUG_ON(pgio->pg_lseg != NULL);
+	spin_lock(&ino->i_lock);
+	lo = pnfs_find_alloc_layout(ino, req->wb_context, GFP_NOFS);
+	if (!lo || test_bit(lo_fail_bit(IOMODE_RW), &lo->plh_flags)) {
+		spin_unlock(&ino->i_lock);
+		nfs_pageio_reset_write_mds(pgio);
+		return;
+	}
+
+	pgio->pg_bsize = PNFSBLK_MAXWSIZE;
+	pgio->pg_lseg = pnfs_find_get_layout_locked(ino,
+						req_offset(req),
+						req->wb_bytes,
+						IOMODE_RW);
+	spin_unlock(&ino->i_lock);
+}
+
 static const struct nfs_pageio_ops bl_pg_read_ops = {
-	.pg_init = pnfs_generic_pg_init_read,
+	.pg_init = bl_pg_init_read,
 	.pg_test = pnfs_generic_pg_test,
 	.pg_doio = pnfs_generic_pg_readpages,
 };
 
 static const struct nfs_pageio_ops bl_pg_write_ops = {
-	.pg_init = pnfs_generic_pg_init_write,
+	.pg_init = bl_pg_init_write,
 	.pg_test = pnfs_generic_pg_test,
 	.pg_doio = pnfs_generic_pg_writepages,
 };
-- 
1.6.6
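
The decision that patches 3/4 and 4/4 describe (size the request from the
actual number of pages, and fall back to the MDS when no layout segment
covers it) can be modelled in user space roughly as follows. This is only a
sketch: MODEL_PAGE_SHIFT, struct model_lseg, io_length() and choose_path()
are hypothetical names invented for the illustration, and a 4 KiB page size
is assumed. It is not kernel code.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

enum io_path { IO_VIA_LAYOUT, IO_VIA_MDS };

/* Stand-in for a layout segment: only the covered length matters here. */
struct model_lseg {
	uint64_t length;
};

/* Byte length of an IO spanning npages pages (cf. npages << PAGE_CACHE_SHIFT). */
static uint64_t io_length(unsigned int npages)
{
	return (uint64_t)npages << MODEL_PAGE_SHIFT;
}

/*
 * Mirrors the fallback test in the patch: go through the MDS when there is
 * no segment, or when the segment is shorter than the IO being issued.
 */
static enum io_path choose_path(const struct model_lseg *lseg, unsigned int npages)
{
	if (lseg == NULL || io_length(npages) > lseg->length)
		return IO_VIA_MDS;
	return IO_VIA_LAYOUT;
}
```

For example, with a segment of 1 << 21 bytes (the value of PNFSBLK_MAXWSIZE
in patch 4/4), a 256-page IO (1 MiB) uses the layout, while a 1024-page IO
(4 MiB) would fall back to the MDS in this model.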



end of thread, other threads:[~2012-05-18  9:37 UTC | newest]
