From: Dan Williams <dan.j.williams@intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@fb.com>,
"Jason B. Akers" <jason.b.akers@intel.com>,
IDE/ATA development list <linux-ide@vger.kernel.org>,
"Karkra, Kapil" <kapil.karkra@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 0/5] Enable use of Solid State Hybrid Drives
Date: Wed, 12 Nov 2014 08:47:50 -0800
Message-ID: <CAPcyv4i3y=hNrhXtpECQPEfbK5gzUb++BxapF9FubQyJ01faqA@mail.gmail.com>
In-Reply-To: <20141110042244.GN23575@dastard>
On Sun, Nov 9, 2014 at 8:22 PM, Dave Chinner <david@fromorbit.com> wrote:
> [Been distracted with other issues, so just getting back to this.]
>
> On Thu, Oct 30, 2014 at 10:07:47AM -0700, Dan Williams wrote:
>> On Thu, Oct 30, 2014 at 12:21 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Wed, Oct 29, 2014 at 03:24:11PM -0700, Dan Williams wrote:
>> >> On Wed, Oct 29, 2014 at 3:09 PM, Dave Chinner <david@fromorbit.com> wrote:
>> >> > On Wed, Oct 29, 2014 at 03:10:51PM -0600, Jens Axboe wrote:
>> >> >> As for the fs accessing this, the io nice fields are readily exposed
>> >> >> through the ->bi_rw setting. So while the above example uses ionice to
>> >> >> set a task io priority (that a bio will then inherit), nothing prevents
>> >> >> you from passing it in directly from the kernel.
>> >> >
>> >> > Right, but now the filesystem needs to provide that on a per-inode
>> >> > basis, not from the task structure as the task that is submitting
>> >> > the bio is not necessarily the task doing the read/write syscall.
>> >> >
>> >> > e.g. the write case above doesn't actually inherit the task priority
>> >> > at the bio level at all because the IO is being dispatched by a
>> >> > background flusher thread, not the ioniced task calling write(2).
>> >>
>> >> When the ioniced task calling write(2) inserts the page into the page
>> >> cache then the current priority is recorded in the struct page. The
>> >
>> > It does? Can you point me to where the page cache code does this,
>> > because I've clearly missed something important go by in the past
>> > few months...
>>
>> Sorry, should have been more clear that this patch set added that
>> capability in patch-4. The idea is to claim some unused extended page
>> flags to stash priority bits. Yes, the PageSetAdvice() helper needs
>> to be fixed up to do the flags update atomically, and yes this
>> precludes hinting on 32-bit platforms. I also think that
>> bio_add_page() is the better place to read the per-page priority into
>> the bio. We felt ok deferring these items until after the initial
>> RFC.
>
> I think that using page flags for this is a 'orrible idea. Yeah,
> it's a neat hack that you can use for proof of concept
> demonstrations, but my biggest concern is that it isn't a scalable
> channel for carrying IO priority information through the page cache.
> e.g. it can't carry existing ionice priority scheduling information,
> it can't carry blkcg IO control information, etc.
>
> So, really, I think that this buffered write IO priority issue is
> bigger than this patch series, and we need to solve it properly
> rather than hack ugly special cases into core infrastructure
> that are an evolutionary dead-end....
>
>> >> > IOWs, to make effective use of this the task will need different
>> >> > cache hints for each different type of data it needs to do IO on, and
>> >> > so overloading IO priorities just seems the wrong direction to be
>> >> > starting from.
>> >>
>> >> There's also the fadvise() enabling that could be bolted on top of
>> >> this capability. But, before that step, is a thread-id per-caching
>> >> context too much to ask?
>> >
>> > If we do it that way, we are stuck with it forever. So let's get our
>> > ducks in line first before pulling the trigger...
>>
>> Are you objecting to ionice as the interface or per-pid based hinting
>> in general?
>
> Neither. It's the implementation I don't like.
>
Fair enough. The page flags hack was indeed a hack to get an RFC out
the door instead of implementing a proper look-aside data structure
for remembering page cache io-priority. We'll iterate from here...
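
For illustration only, here is a rough userspace sketch of what a
look-aside structure for page cache io-priority might look like. It is
not taken from the patch set: every name below (prio_table,
prio_record, prio_lookup, PRIO_SLOTS) is hypothetical, the mapping
pointer is modelled as a plain integer, and locking, eviction and
deletion are left out. The idea is simply to key the full ioprio value
by (mapping, page index) rather than squeezing a few bits into page
flags.

```c
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none come from the RFC patches. */
#define PRIO_SLOTS 1024u   /* table size, must be a power of two */
#define PRIO_NONE  0u      /* returned when no priority was recorded */

struct prio_entry {
	uintptr_t mapping;  /* stand-in for an address_space pointer; 0 = empty */
	uint64_t  index;    /* page offset within the file */
	uint16_t  ioprio;   /* full ioprio value (class and data), not a few bits */
};

struct prio_table {
	struct prio_entry slot[PRIO_SLOTS];  /* must start zero-initialized */
};

/* Mix the two key halves and reduce to a slot number. */
static size_t prio_hash(uintptr_t mapping, uint64_t index)
{
	uint64_t h = (uint64_t)mapping * 0x9E3779B97F4A7C15ull
		   ^ index * 0xC2B2AE3D27D4EB4Full;

	return (size_t)(h >> 32) & (PRIO_SLOTS - 1);
}

/*
 * Remember the priority in effect when a page was dirtied.
 * Linear probing; returns 0 on success, -1 if the table is full.
 */
static int prio_record(struct prio_table *t, uintptr_t mapping,
		       uint64_t index, uint16_t ioprio)
{
	size_t start = prio_hash(mapping, index);

	for (size_t i = 0; i < PRIO_SLOTS; i++) {
		struct prio_entry *e = &t->slot[(start + i) & (PRIO_SLOTS - 1)];

		if (e->mapping == 0 ||
		    (e->mapping == mapping && e->index == index)) {
			e->mapping = mapping;
			e->index = index;
			e->ioprio = ioprio;
			return 0;
		}
	}
	return -1;
}

/* Fetch the recorded priority at writeback time, or PRIO_NONE. */
static uint16_t prio_lookup(const struct prio_table *t, uintptr_t mapping,
			    uint64_t index)
{
	size_t start = prio_hash(mapping, index);

	for (size_t i = 0; i < PRIO_SLOTS; i++) {
		const struct prio_entry *e =
			&t->slot[(start + i) & (PRIO_SLOTS - 1)];

		if (e->mapping == 0)
			return PRIO_NONE;
		if (e->mapping == mapping && e->index == index)
			return e->ioprio;
	}
	return PRIO_NONE;
}
```

In this sketch the write(2) path would call prio_record() when it
dirties a page, and whatever builds the bio during writeback would call
prio_lookup() for each page it adds. Because the entry holds a whole
16-bit value, it could in principle carry the existing ionice class and
level as well as a caching hint, which is the scalability concern
raised above about page flags.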