From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wu Fengguang
Subject: Re: [RFC] nfs: use 2*rsize readahead size
Date: Thu, 25 Feb 2010 20:37:55 +0800
Message-ID: <20100225123755.GB9077@localhost>
References: <20100224024100.GA17048@localhost> <20100224032934.GF16175@discord.disaster> <20100224041822.GB27459@localhost> <20100224052215.GH16175@discord.disaster>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Cc: Dave Chinner, Trond Myklebust, "linux-nfs@vger.kernel.org", "linux-fsdevel@vger.kernel.org", Linux Memory Management List, LKML
To: Akshat Aranya
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-Id: linux-fsdevel.vger.kernel.org

On Wed, Feb 24, 2010 at 07:18:26PM +0800, Akshat Aranya wrote:
> On Wed, Feb 24, 2010 at 12:22 AM, Dave Chinner wrote:
>
> >
> >> It sounds silly to have
> >>
> >>         client_readahead_size > server_readahead_size
> >
> > I don't think it is - the client readahead has to take into account
> > the network latency as well as the server latency. e.g. a network
> > with a high bandwidth but high latency is going to need much more
> > client side readahead than a high bandwidth, low latency network to
> > get the same throughput. Hence it is not uncommon to see larger
> > readahead windows on network clients than for local disk access.
> >
> > Also, the NFS server may not even be able to detect sequential IO
> > patterns because of the combined access patterns from the clients,
> > and so the only effective readahead might be what the clients
> > issue....
> >
>
> In my experiments, I have observed that the server-side readahead
> shuts off rather quickly even with a single client because the client
> readahead causes multiple pending read RPCs on the server which are
> then serviced in random order and the pattern observed by the
> underlying file system is non-sequential.
> In our file system, we had
> to override what the VFS thought was a random workload and continue to
> do readahead anyway.

What's the server side kernel version, plus client/server side readahead
size? I'd expect the context readahead to handle it well.

With the patchset in , you can actually see the readahead details:

        # echo 1 > /debug/tracing/events/readahead/enable
        # cp test-file /dev/null
        # cat /debug/tracing/trace  # trimmed output
        readahead-initial(dev=0:15, ino=100177, req=0+2, ra=0+4-2, async=0) = 4
        readahead-subsequent(dev=0:15, ino=100177, req=2+2, ra=4+8-8, async=1) = 8
        readahead-subsequent(dev=0:15, ino=100177, req=4+2, ra=12+16-16, async=1) = 16
        readahead-subsequent(dev=0:15, ino=100177, req=12+2, ra=28+32-32, async=1) = 32
        readahead-subsequent(dev=0:15, ino=100177, req=28+2, ra=60+60-60, async=1) = 24
        readahead-subsequent(dev=0:15, ino=100177, req=60+2, ra=120+60-60, async=1) = 0

And I've actually verified the NFS case with the help of such traces
long ago. When client_readahead_size <= server_readahead_size, the
readahead requests may look a bit random at first, and then will
quickly turn into a perfect series of sequential context readaheads.

Thanks,
Fengguang
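
For illustration only (not part of the patchset): a minimal Python sketch
that parses trace lines in the format shown above and checks whether
consecutive readahead windows are contiguous, which is one way to tell
"a bit random" from "a perfect series of sequential" readaheads. The
field interpretation (req=offset+length, ra=start+size-async_size,
trailing number = pages actually read) is an assumption inferred from
the sample output, and parse_trace_line / is_sequential are hypothetical
helper names.

```python
import re

# Assumed layout, inferred from the sample trace lines only:
#   req=<offset>+<length>, ra=<start>+<size>-<async_size>, ") = <pages read>"
TRACE_RE = re.compile(
    r"(?P<event>readahead-\w+)\("
    r"dev=(?P<dev>[\d:]+), ino=(?P<ino>\d+), "
    r"req=(?P<req_off>\d+)\+(?P<req_len>\d+), "
    r"ra=(?P<ra_start>\d+)\+(?P<ra_size>\d+)-(?P<ra_async>\d+), "
    r"async=(?P<is_async>[01])\) = (?P<actual>\d+)"
)


def parse_trace_line(line):
    """Return a dict of fields for one trace line, or None if it does not match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    rec = {}
    for key, value in m.groupdict().items():
        # event and dev stay strings; everything else is a page count or flag.
        rec[key] = value if key in ("event", "dev") else int(value)
    return rec


def is_sequential(records):
    """True if every readahead window starts where the previous one ended."""
    return all(
        prev["ra_start"] + prev["ra_size"] == cur["ra_start"]
        for prev, cur in zip(records, records[1:])
    )


SAMPLE = """\
readahead-initial(dev=0:15, ino=100177, req=0+2, ra=0+4-2, async=0) = 4
readahead-subsequent(dev=0:15, ino=100177, req=2+2, ra=4+8-8, async=1) = 8
readahead-subsequent(dev=0:15, ino=100177, req=4+2, ra=12+16-16, async=1) = 16
readahead-subsequent(dev=0:15, ino=100177, req=12+2, ra=28+32-32, async=1) = 32
readahead-subsequent(dev=0:15, ino=100177, req=28+2, ra=60+60-60, async=1) = 24
readahead-subsequent(dev=0:15, ino=100177, req=60+2, ra=120+60-60, async=1) = 0
"""

records = [r for r in map(parse_trace_line, SAMPLE.splitlines()) if r]

if __name__ == "__main__":
    for r in records:
        print(r["event"], r["ra_start"], r["ra_size"], r["actual"])
    print("sequential:", is_sequential(records))
```

Run against a server-side trace, a False result from is_sequential would
point at the reordered-RPC pattern described in the quoted report.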