From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mathieu Avila
Date: Fri, 2 Mar 2007 17:54:08 +0100
Subject: [Cluster-devel] What's the issue with read-ahead ?
Message-ID: <20070302175408.19329b7f@mathieu.toulouse>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hello,

I'm having trouble with the read-ahead feature in GFS when doing long
sequential reads.

I understand that there are 3 different code paths when doing a read:

- If the file is accessed with direct I/O, it goes into
  do_read_direct (ops_file.c :: 406); otherwise it goes into
  "do_read_buf", and from there:
- if the inode is stuffed or its data is journaled, it goes into
  do_read_readi(file, buf, size, offset); (ops_file.c :: 376)
- otherwise it goes into the common kernel function:
  generic_file_read(file, buf, size, offset); (ops_file.c :: 378)

In the 2nd case, the tunable parameter "max_readahead" is used to read
blocks ahead in gfs_start_ra.

In the 3rd case, everything is handled by the kernel code, and the
read-ahead value is the one associated with the "struct file *":
file->f_ra.ra_pages. This value is set by default by the kernel, using
the read-ahead value that comes from the block device. The block device
here is the diapered one (diaper.c). Its structure is initialized to
zero, so I get a device with a read-ahead of 0.

I changed this by initializing the read-ahead of the "struct file *"
when entering the read_gfs function, and it worked: I was able to get a
factor-of-2 improvement when reading unstuffed, unjournaled files in
buffered mode. But I'm not sure this is correct with respect to the
lock states and data coherency among nodes.

So, was the diaper read-ahead voluntarily set to zero to avoid that
kind of coherency problem? If not, can we safely set it to the value of
the underlying block device, so that the kernel will do the same thing
that do_read_readi/gfs_start_ra performs?
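For reference, the change I tried amounts to roughly the following. This
is only a sketch against 2.6-era structures, not the exact patch:
"real_bdev" here stands for the real (non-diapered) block device, however
GFS is supposed to reach it.

```c
/*
 * Sketch only: on entry to the GFS buffered-read path, seed the
 * per-file read-ahead window from the underlying device, since the
 * diaper device advertises ra_pages == 0.  Field names are from
 * 2.6.x kernels; "real_bdev" is a placeholder for the real device.
 */
if (file->f_ra.ra_pages == 0 && real_bdev)
        file->f_ra.ra_pages =
                real_bdev->bd_inode->i_mapping->backing_dev_info->ra_pages;
```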
Thanks in advance,

-- 
Mathieu