All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@osdl.org>
To: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [EXPERIMENT,RFC] FAT: Add "flush" option for hotplug devices
Date: Thu, 20 Oct 2005 13:50:14 -0700	[thread overview]
Message-ID: <20051020135014.2289fa01.akpm@osdl.org> (raw)
In-Reply-To: <871x2gf8f5.fsf@devron.myhome.or.jp>

OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> wrote:
>
> Hi,
> 
> This adds new "flush" option on experiment for hotplug devices.
> 
> Current implementation of "flush" option does,
> 
> 	- synchronizing data pages at ->release() (last close(2))
> 	- if user's work seems to be done (fs is not active), all
> 	  metadata syncs by pdflush()

Seems like a sensible thing to do.

> This option would provide kind of sane progress, and dirty buffers is
> flushed more frequently (if fs is not active).

Your implementation doesn't really do this.  bdi_write_congested() only
returns true if the device is super-busy.  To determine whether it's "not
active" we'd need to peek at the queue's disk_stat accounting, or at the
queue's outstanding read/write requests.  We covered this a couple of weeks
ago in the context of Con's swap prefetch work.

>  This option doesn't
> provide any robustness (robustness is provided by other options), but
> probably the option is proper for hotplug devices.

Well...  It does a full fsync_super() - that's pretty robust.

> +EXPORT_SYMBOL(filemap_write_and_wait);

_GPL please.

> +EXPORT_SYMBOL(fsync_super);

Ditto

> +/*
> + * Copyright (C) 2005, OGAWA Hirofumi
> + * Released under GPL v2.
> + */
> +
> +#include <linux/fs.h>
> +#include <linux/blkdev.h>
> +#include <linux/writeback.h>
> +#include <linux/msdos_fs.h>
> +
> +int fat_sync_fdata(struct inode *inode, struct file *filp)
> +{
> +	int err = 0;
> +
> +	if (filp->f_mode & FMODE_WRITE) {
> +#if 1
> +		current->flags |= PF_SYNCWRITE;
> +		err = filemap_write_and_wait(inode->i_mapping);
> +		current->flags &= ~PF_SYNCWRITE;
> +#else
> +		down(&inode->i_sem);
> +#if 1
> +		err = generic_osync_inode(inode, inode->i_mapping, OSYNC_DATA);
> +#else
> +		err = filp->f_op->fsync(filp, filp->f_dentry, 1);
> +#endif
> +		up(&inode->i_sem);
> +#endif
> +	}
> +	return err;
> +}

Can't we just split up do_fsync() a bit and use that?

> +static void fat_pdflush_handler(unsigned long arg)
> +{
> +	struct super_block *sb = (struct super_block *)arg;
> +	fsync_super(sb);
> +}

It would be nice if /proc/sys/vm/dirty_writeback_centisecs was a per-fs
thing.   That's non-trivial.

> +static void fat_flush_timer(unsigned long data)
> +{
> +	struct super_block *sb = (struct super_block *)data;
> +	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> +	struct backing_dev_info *bdi = blk_get_backing_dev_info(sb->s_bdev);
> +	unsigned long last_flush_jiff;
> +
> +	if (bdi_write_congested(bdi)) {

As indicated above, this won't be very effective.

> +		mod_timer(&sbi->flush_timer, jiffies + (HZ / 10));
> +		return;
> +	}
> +
> +	last_flush_jiff = sbi->last_flush_jiff;
> +
> +	if (!time_after_eq(jiffies, last_flush_jiff + (HZ / 2))) {
> +		mod_timer(&sbi->flush_timer, last_flush_jiff + (HZ / 2));
> +		return;
> +	}

What's the above doing?

> +	if (pdflush_operation(fat_pdflush_handler, (unsigned long)sb) < 0)
> +		mod_timer(&sbi->flush_timer, jiffies + HZ);
> +}
> +
> +void __fat_mark_flush(struct super_block *sb)
> +{
> +	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> +
> +	sbi->last_flush_jiff = jiffies;
> +	/*
> +	 * make sure by smb_wmb() that dirty buffers before here is
> +	 * processed at the timer routine.
> +	 */
> +	smp_wmb();
> +
> +	if (!timer_pending(&sbi->flush_timer))
> +		mod_timer(&sbi->flush_timer, jiffies + HZ);
> +}
> +EXPORT_SYMBOL(__fat_mark_flush);

_GPL?

> +void fat_flush_stop(struct super_block *sb)
> +{
> +	del_timer_sync(&MSDOS_SB(sb)->flush_timer);
> +}

whoops, the pdflush_operation could still be in progress.

To avoid umount races I think the pdflush callback is going to need to take
sb_lock, increment s_count, take ->s_umount, test ->s_root.  Like, for
example, __sync_inodes.

> +void fat_flush_init(struct super_block *sb)
> +{
> +	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> +	init_timer(&sbi->flush_timer);
> +	sbi->flush_timer.data = (unsigned long)sb;
> +	sbi->flush_timer.function = fat_flush_timer;

-mm has setup_timer()

> +
> +static inline void fat_mark_flush(struct super_block *sb)
> +{
> +	if (MSDOS_SB(sb)->options.flush)
> +		__fat_mark_flush(sb);
> +}

It'd be nice to make this a more generic thing, so other filesystems can
use it without copying lots of code.

> +		case Opt_flush:

MS_FLUSH?   I added MS_DIRSYNC a few years ago - it wasn't too complex.



  reply	other threads:[~2005-10-20 20:50 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-10-20 16:41 [EXPERIMENT,RFC] FAT: Add "flush" option for hotplug devices OGAWA Hirofumi
2005-10-20 20:50 ` Andrew Morton [this message]
2005-10-20 21:08   ` running linux TCP/IP stack on a real slow machine Kallol Biswas
2005-10-22  7:45   ` [EXPERIMENT,RFC] FAT: Add "flush" option for hotplug devices OGAWA Hirofumi
2005-10-22 10:15 ` Christoph Hellwig
2005-10-22 10:45   ` OGAWA Hirofumi
2005-10-22 10:56     ` Christoph Hellwig
2005-10-22 11:24       ` OGAWA Hirofumi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20051020135014.2289fa01.akpm@osdl.org \
    --to=akpm@osdl.org \
    --cc=hirofumi@mail.parknet.co.jp \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.