From: Wendy Cheng <wcheng@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] Problem in ops_address.c :: gfs_writepage ?
Date: Mon, 19 Feb 2007 17:00:10 -0500
Message-ID: <45DA1DEA.3080302@redhat.com>
In-Reply-To: <20070219160050.14eb6c10@mathieu.toulouse>

Mathieu Avila wrote:
> Hello all,
>
> I need advice about a bug in GFS that may also affect other filesystems
> (like ext3).
>
> The problem:
> It is possible that the function "ops_address.c :: gfs_writepage" does
> not write the page it's asked to, because the transaction lock is
> taken. This is valid, and in such case, it should return an error code,
> so that the kernel knows it was not possible to write the page. But
> this function does not return an error code; instead, it returns 0.
> I've looked at ext3; it does the same. This is valid and there's no
> corruption, as the page is "redirtied" so that it will be flushed later.
> Returning an error code is not a solution, because it's possible that no
> page is flushable, and also 'sync' misinterprets the error code as an
> I/O error. There may be other implications, too.
>
> The problem comes when there is quite a stress on the filesystem.
> I've made a test program that opens 1 file, writes 1 GB
> (at least more than the system's total memory), then opens a 2nd file,
> and writes as much data as it can.
> When the number of dirty pages goes beyond the threshold set by
> /proc/sys/vm/dirty_ratio,
> some pages must be flushed synchronously, so that the writer is blocked
> in writing, and the system does not starve of free clean pages to use.
>
> But it is precisely in that situation that gfs_writepage is repeatedly
> unable to do its job, because of the transaction lock.
>   
Yes, we did have this problem in the past with direct I/O and the SYNC flag.
> [snip]
> we've experienced it using the test program "bonnie++" whose purpose is
> to test filesystem performance. Bonnie++ makes multiple files of 1 GB
> when it is asked to run long multi-GB writes. There is no problem with
> 5 GB (5 files of 1 GB) but many machines in the cluster are OOM killed
> with 10 GB bonnies....
>   
I would like to know more about your experiments. Are these bonnie++
instances run on each cluster node with independent file sets?

> Setting more aggressive parameters for dirty_ratio and pdflush is not
> a complete solution (although the problem happens much later or not at
> all), and kills performance.
>
> Proposed solution:
>
> Keep a counter in gfs_inode of the pages that gfs_writepage was asked
> to write but did not, and at the end of do_do_write_buf, call
> "balance_dirty_pages_ratelimited(file->f_mapping);" that many times. The
> counter is possibly shared by multiple processes, but we are assured
> that there is no transaction at that moment so pages can be flushed, if
> "balance_dirty_pages_ratelimited" determines that it must reclaim dirty
> pages. Otherwise performance is not affected.
>   
In general, this approach looks OK if we do have this flushing problem.
However, the GFS flush code is embedded in the glock code, so I would
think it would be better to do this within the glock code. My cluster
nodes happen to be down at the moment. I will look into this when the
cluster is re-assembled.

-- Wendy


