linux-fsdevel.vger.kernel.org archive mirror
From: Jeff Moyer <jmoyer@redhat.com>
To: Kent Overstreet <koverstreet@google.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	tytso@google.com, tj@kernel.org,
	Dave Kleikamp <dave.kleikamp@oracle.com>,
	Zach Brown <zab@zabbo.net>,
	Dmitry Monakhov <dmonakhov@openvz.org>,
	"Maxim V. Patlasov" <mpatlasov@parallels.com>,
	michael.mesnier@intel.com, jeffrey.d.skirvin@intel.com
Subject: Re: [RFC, PATCH] Extensible AIO interface
Date: Tue, 02 Oct 2012 13:41:17 -0400
Message-ID: <x49zk44ojpe.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: <20121001222341.GF26488@google.com> (Kent Overstreet's message of "Mon, 1 Oct 2012 15:23:41 -0700")

Kent Overstreet <koverstreet@google.com> writes:

> So, I and other people keep running into things where we really need to
> add an interface to pass some auxiliary... stuff along with a pread() or
> pwrite().
>
> A few examples:
>
> * IO scheduler hints. Some userspace program wants to, per IO, specify
> either priorities or a cgroup - by specifying a cgroup you can have a
> fileserver in userspace that makes use of cfq's per cgroup bandwidth
> quotas.

You can do this today by splitting I/O between processes and placing
those processes in different cgroups.  For I/O priority, there is
ioprio_set(2), which costs an extra system call but works.  Not
elegant, but possible.
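
A minimal sketch of the ioprio_set route, to show where the extra
system call lands.  The sketch_* names are hypothetical helpers, not
part of any library; the constants are defined locally and are assumed
to mirror the kernel's ioprio encoding.  Note that the priority sticks
to the calling thread for later I/O as well, so this is not a true
per-I/O hint.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed to match the kernel's ioprio encoding; defined locally because
 * userspace headers do not reliably export these values. */
#define SKETCH_IOPRIO_CLASS_SHIFT 13
#define SKETCH_IOPRIO_CLASS_BE    2   /* best-effort scheduling class */
#define SKETCH_IOPRIO_WHO_PROCESS 1   /* "who" names a single thread */

/* Pack a scheduling class and a level (0 = highest, 7 = lowest). */
int sketch_ioprio_value(int ioclass, int level)
{
    return (ioclass << SKETCH_IOPRIO_CLASS_SHIFT) | level;
}

/* Hypothetical helper: set the calling thread's I/O priority, then write.
 * Returns the number of bytes written, or -1 with errno set. */
ssize_t sketch_pwrite_prio(int fd, const void *buf, size_t len,
                           off_t off, int level)
{
    int prio = sketch_ioprio_value(SKETCH_IOPRIO_CLASS_BE, level);

    /* The extra system call.  glibc has no wrapper, so use syscall(2);
     * a "who" value of 0 means the calling thread. */
    if (syscall(SYS_ioprio_set, SKETCH_IOPRIO_WHO_PROCESS, 0, prio) == -1)
        return -1;

    return pwrite(fd, buf, len, off);
}
```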

> * Cache hints. For bcache and other things, userspace may want to specify
> "this data should be cached", "this data should bypass the cache", etc.

Please explain how you will differentiate this from posix_fadvise.

> * Passing checksums out to userspace. We've got bio integrity, which is
> a (somewhat) generic interface for passing data checksums between the
> filesystem and the hardware. There are various circumstances under which
> you may want to pass these checksums out to userspace, and if so we
> ought to have a generic way of doing it.

Yes, that needs a new interface.

> Hence, AIO attributes.

*No.*  Start with the non-AIO case.

> * FUTURE STUFF:
>
> Return values:
>
> Some attributes are probably going to want to return something to
> userspace.
>
> If nothing else, we want this so that userspace can tell if anything
> handled the attributes it specified - as dynamic as the io stack can be,
> with something extensible like this there really isn't any generic way
> of knowing ahead of time if something is going to interpret any
> attribute - we want to return at least an error code.

Seems odd to me.  Why not expose supported attributes via some other
call?  fcntl?
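
A purely illustrative sketch of what an up-front query could look like.
Everything named SKETCH_* or sketch_* here is hypothetical: no such
fcntl command or attribute bits exist in any kernel, and the command
number is arbitrary.  The point is only the shape: ask once which
attributes are understood, instead of decoding a per-I/O return code.

```c
#include <fcntl.h>

/* HYPOTHETICAL: this fcntl command does not exist.  The number is
 * arbitrary and is expected to be rejected by existing kernels. */
#define SKETCH_F_GET_IO_ATTRS 0x7ffffff0

/* HYPOTHETICAL attribute bits a kernel might report as supported. */
#define SKETCH_ATTR_CACHE_HINT (1 << 0)
#define SKETCH_ATTR_CHECKSUM   (1 << 1)

/* Return the supported-attribute mask for fd, or 0 if the query itself
 * fails (which is what is expected on every existing kernel). */
int sketch_supported_attrs(int fd)
{
    int mask = fcntl(fd, SKETCH_F_GET_IO_ATTRS);

    return mask < 0 ? 0 : mask;
}

/* Decide once, up front, whether an attribute is worth attaching. */
int sketch_attr_supported(int fd, int attr)
{
    return (sketch_supported_attrs(fd) & attr) != 0;
}
```

With a query like this, a caller that finds an attribute unsupported can
skip it entirely and fall back to plain pread()/pwrite().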

> One could imagine sticking the return in the attribute itself, but I
> don't want to do this. For some things (checksums), the attribute will
> contain a pointer to a buffer - that's fine. But I don't want the
> attributes themselves to be writeable.

One could imagine that attributes don't return anything, because, well,
they're properties of something else, and properties don't return
anything.

Cheers,
Jeff