* wsize & PAGE_SIZE issues on IA64 clients.
@ 2003-03-27 21:24 Mark Price
  2003-03-27 23:57 ` Trond Myklebust
  2003-03-31 19:09 ` Greg Lindahl
  0 siblings, 2 replies; 9+ messages in thread
From: Mark Price @ 2003-03-27 21:24 UTC
  To: nfs


Hi Folks,

I've been chasing a severe write performance problem from a SuSE Sles-8
(2.4.19) system on an IA64 box to an AIX NFS server. Writes to the same
AIX server from an IA32 box also running Sles-8 performed as expected.

These are v3 mounts with a 4K rsize/wsize, e.g.:

mount -t nfs -o nfsvers=3,udp,rsize=4096,wsize=4096,hard rs75:/linux_test /linux_test

I tracked the problem down to the default PAGE_SIZE on ia64, which is 16K, 
versus the wsize which was 4K. 

In nfs_updatepage() if the wsize is smaller than the page size the write 
is performed synchronously. It appears though that the block size is then 
reduced further, in this case to 512 bytes. From what I could work out, 
it was cp_new_stat() in linux/fs/stat.c that determined the new preferred 
block size from the remote filesystem. 

On ia32 both the page size and the wsize were 4K, and no problem was seen. 
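
The arithmetic behind the slowdown can be made concrete with a small sketch. It assumes, as described above, that one page is pushed out in transfers no larger than the effective write size. The function name `write_requests_per_page` is hypothetical and is not part of the kernel or any NFS tool.

```python
def write_requests_per_page(page_size: int, write_size: int) -> int:
    """Return how many write requests are needed to cover one page.

    Illustrative arithmetic only: assumes each request carries at most
    `write_size` bytes and that a full page is being written.
    """
    if page_size <= 0 or write_size <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division without floating point.
    return -(-page_size // write_size)


if __name__ == "__main__":
    KIB = 1024
    for label, page, size in [
        ("ia64, wsize=4K", 16 * KIB, 4 * KIB),
        ("ia64, 512-byte blocks", 16 * KIB, 512),
        ("ia32, wsize=4K", 4 * KIB, 4 * KIB),
    ]:
        print(f"{label}: {write_requests_per_page(page, size)} request(s) per page")
```

Under these assumptions a 16K page needs four requests at a 4K wsize and 32 requests at 512 bytes, while the ia32 case needs only one.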

Can someone give me a rough explanation of why that logic is used? i.e. why
is the page written synchronously if the wsize is smaller than the page
size? And why, even when it's written synchronously, wasn't the wsize used
instead of the remote filesystem's block size?

The fix/workaround was to either increase wsize to 16K or reduce
PAGE_SIZE to 4K. Obviously, increasing wsize to 16K makes more sense.

However this leads to a problem between Linux clients and servers where 
the maximum supported block size on the server is 8K and the page size on 
the IA64 client is 16K.

Is increasing the maximum block size for the server as simple as changing
NFSSVC_MAXBLKSIZE (linux/include/linux/nfsd/const.h) to 16K or 32K? Or is
more porting work required?

Cheers, Mark. 

-- 
Mark Price
IBM Linux Change Team
(503)-578-7524






-------------------------------------------------------
This SF.net email is sponsored by:
The Definitive IT and Networking Event. Be There!
NetWorld+Interop Las Vegas 2003 -- Register today!
http://ads.sourceforge.net/cgi-bin/redirect.pl?keyn0001en
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

* RE: wsize & PAGE_SIZE issues on IA64 clients.
@ 2003-03-28  0:01 Lever, Charles
  0 siblings, 0 replies; 9+ messages in thread
From: Lever, Charles @ 2003-03-28  0:01 UTC
  To: Mark Price; +Cc: nfs

hi mark-

> In nfs_updatepage() if the wsize is smaller than the page size the write
> is performed synchronously. It appears though that the block size is then
> reduced further, in this case to 512 bytes. From what I could work out,
> it was cp_new_stat() in linux/fs/stat.c that determined the new preferred
> block size from the remote filesystem.
>
> On ia32 both the page size and the wsize were 4K, and no problem was seen.
>
> Can someone give me rough explanation of why that logic is used? ie. Why
> the page is written synchronously if its smaller than the page size? and
> Why even when its written synchronously the wsize wasn't used, but the
> remote filesystems block size was used?

i can answer the first question.

the problem is that databases need synchronous writes to
always go to disk from the lowest to highest byte; otherwise,
a system restart during a write could result in a torn
database page that can not be detected by some databases.

the Linux NFS client is handed write requests a page at a
time by the VFS layer.  if the NFS client queued pages for
sync writes the way it does for async writes, there is a window
where asynchronous events on the client (like an interrupt,
or the VM decides it needs to reclaim memory, or the fs
syncer runs) can push out incomplete writes to the server.
there is a good chance that even during a small test dd run
with wsize=32k, some 32k writes will be broken into smaller
writes on the network.

so the low-risk solution is to make the client always
do sync writes in page-sized pieces; that way, the byte
order at the server is always guaranteed.
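
as a rough illustration of "page-sized pieces", here is a minimal sketch that splits one write into pieces that never cross a page boundary and are issued from the lowest byte to the highest. it is not the kernel's code: the helper name `page_sized_pieces` is hypothetical, and splitting at page-aligned boundaries is an assumption about what the client does.

```python
from typing import Iterator, Tuple


def page_sized_pieces(offset: int, length: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pieces in ascending byte order.

    Each piece stays within a single page, so a synchronous writer that
    sends them one after another never puts a higher byte on the wire
    before a lower one. Illustrative only.
    """
    if offset < 0 or length < 0 or page_size <= 0:
        raise ValueError("offset/length must be non-negative, page_size positive")
    end = offset + length
    while offset < end:
        # End of the page that contains `offset`.
        page_end = (offset // page_size + 1) * page_size
        piece_end = min(page_end, end)
        yield offset, piece_end - offset
        offset = piece_end


if __name__ == "__main__":
    # A 32k application write on a client with 4k pages.
    for piece in page_sized_pieces(0, 32 * 1024, 4096):
        print(piece)
```

under these assumptions a 32k write on a 4k-page client becomes eight 4k pieces, and an unaligned write is trimmed at the first page boundary before continuing.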

i've never seen a problem where the writes are further
reduced in size; i imagine this is an application issue,
not an NFS client issue.

> The fix/workaround was to either increase wsize to 16K, or reduce
> PAGE_SIZE to 4K, obviously increasing wsize to 16K makes more sense.

correct.

> However this leads to a problem between Linux clients and servers where
> the maximum supported block size on the server is 8K and the page size on
> the IA64 client is 16K.

correct again.

> Is increasing the maximum blocksize for the server as simple as changing
> NFSSVC_MAXBLKSIZE (linux/include/linux/nfsd/const.h) to 16K or 32K ? Or is
> more porting work required?

neil can answer this with authority, but my impression is
there is more to it than simply bumping NFSSVC_MAXBLKSIZE.


* RE: wsize & PAGE_SIZE issues on IA64 clients.
@ 2003-03-31 23:25 Lever, Charles
  2003-03-31 23:46 ` Mark Price
  0 siblings, 1 reply; 9+ messages in thread
From: Lever, Charles @ 2003-03-31 23:25 UTC
  To: Greg Lindahl; +Cc: nfs

> Is the "wsize < PAGE_SIZE" problem worth a printf in mount or mountd or
> the kernel?

perhaps an entry in the NFS FAQ might be more timely, considering
how long it would take this kind of change to make it into the
common commercial distributions.
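
for anyone who wants the check in the meantime, a minimal user-space sketch of the proposed warning follows. it is not code from mount, mountd or the kernel: the function names `parse_mount_options` and `wsize_warning` are hypothetical, the option string is assumed to be the comma-separated form passed to `mount -o`, and the local page size is read with `os.sysconf("SC_PAGE_SIZE")`.

```python
import os
from typing import Dict, Optional, Union


def parse_mount_options(options: str) -> Dict[str, Union[str, bool]]:
    """Parse a comma-separated option string such as 'udp,wsize=4096'."""
    parsed: Dict[str, Union[str, bool]] = {}
    for item in options.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flags without a value (e.g. 'hard') are stored as True.
        parsed[key] = value if sep else True
    return parsed


def wsize_warning(options: str, page_size: Optional[int] = None) -> Optional[str]:
    """Return a warning string if wsize is smaller than the page size, else None."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")  # page size of this machine
    wsize = parse_mount_options(options).get("wsize")
    if not isinstance(wsize, str):
        return None  # wsize not given, nothing to compare
    if int(wsize) < page_size:
        return (f"warning: wsize={wsize} is smaller than the page size "
                f"({page_size}); writes may be slow")
    return None


if __name__ == "__main__":
    opts = "nfsvers=3,udp,rsize=4096,wsize=4096,hard"
    print(wsize_warning(opts, page_size=16384))
```

passing `page_size` explicitly makes it possible to check a mount line for an ia64 client from any machine; a non-numeric wsize value raises `ValueError` rather than being silently ignored.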



Thread overview: 9+ messages
2003-03-27 21:24 wsize & PAGE_SIZE issues on IA64 clients Mark Price
2003-03-27 23:57 ` Trond Myklebust
2003-03-31 19:09 ` Greg Lindahl
  -- strict thread matches above, loose matches on Subject: below --
2003-03-28  0:01 Lever, Charles
2003-03-31 23:25 Lever, Charles
2003-03-31 23:46 ` Mark Price
2003-04-01  5:48   ` Trond Myklebust
2003-04-01  6:11     ` Mark Price
2003-04-01  6:25       ` Trond Myklebust
