Date: Fri, 9 Sep 2016 12:38:51 +0200
From: Kevin Wolf
Message-ID: <20160909103851.GA6682@noname.redhat.com>
Subject: Re: [Qemu-devel] vxhs caching behaviour (was: [PATCH v4 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support)
To: ashish mittal
Cc: Jeff Cody, qemu-devel@nongnu.org, Paolo Bonzini, Markus Armbruster, "Daniel P. Berrange", Ashish Mittal, Stefan Hajnoczi, Ketan.Nilangekar@veritas.com, Abhijit.Dey@veritas.com

Am 08.09.2016 um 22:46 hat ashish mittal geschrieben:
> Hi Kevin,
>
> By design, our writeback cache is on non-volatile SSD device. We do
> async writes to this cache and also maintain a persistent index map of
> the data written. This gives us the capability to recover write-back
> cache if needed.

So your server application uses something like O_SYNC to write data to
the cache SSD, so that the data isn't sitting only in the kernel page
cache or in a volatile write cache of the SSD?
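
For illustration, "durable" here would mean something roughly like the
following (a hypothetical sketch, not VxHS code; the function name and
arguments are made up): either the cache file is opened with
O_SYNC/O_DSYNC, or every write is paired with an explicit fdatasync().

/* Hypothetical sketch of a durable write to the cache SSD; not VxHS code. */
#include <unistd.h>

static int cache_write_durable(int cache_fd, const void *buf,
                               size_t len, off_t off)
{
    ssize_t n = pwrite(cache_fd, buf, len, off);

    if (n < 0 || (size_t)n != len) {
        return -1;
    }

    /* Without O_SYNC/O_DSYNC on the open(), an explicit fdatasync() is
     * needed so the data reaches stable storage instead of staying in
     * the kernel page cache or the SSD's volatile write cache. */
    return fdatasync(cache_fd);
}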

Kevin

> On Thu, Sep 8, 2016 at 7:20 AM, Kevin Wolf wrote:
> > Am 08.09.2016 um 16:00 hat Jeff Cody geschrieben:
> >> > >> +/*
> >> > >> + * This is called by QEMU when a flush gets triggered from within
> >> > >> + * a guest at the block layer, either for IDE or SCSI disks.
> >> > >> + */
> >> > >> +int vxhs_co_flush(BlockDriverState *bs)
> >> > >> +{
> >> > >> +    BDRVVXHSState *s = bs->opaque;
> >> > >> +    uint64_t size = 0;
> >> > >> +    int ret = 0;
> >> > >> +
> >> > >> +    ret = qemu_iio_ioctl(s->qnio_ctx,
> >> > >> +            s->vdisk_hostinfo[s->vdisk_cur_host_idx].vdisk_rfd,
> >> > >> +            VDISK_AIO_FLUSH, &size, NULL, IIO_FLAG_SYNC);
> >> > >> +
> >> > >> +    if (ret < 0) {
> >> > >> +        /*
> >> > >> +         * Currently not handling the flush ioctl
> >> > >> +         * failure because of network connection
> >> > >> +         * disconnect. Since all the writes are
> >> > >> +         * commited into persistent storage hence
> >> > >> +         * this flush call is noop and we can safely
> >> > >> +         * return success status to the caller.
> >> > >
> >> > > I'm not sure I understand here. Are you saying the qemu_iio_ioctl() call
> >> > > above is a noop?
> >> >
> >> > Yes, qemu_iio_ioctl(VDISK_AIO_FLUSH) is only a place-holder at present
> >> > in case we later want to add some functionality to it. I have now
> >> > added a comment to this affect to avoid any confusion.
> >>
> >> The problem is you don't know which version of the qnio library any given
> >> QEMU binary will be using, since it is a shared library. Future versions
> >> may implement the flush ioctl as expressed above, in which case we may hide
> >> a valid error.
> >>
> >> Am I correct in assuming that this call suppresses errors because an error
> >> is returned for an unknown ioctl operation of VDISK_AIO_FLUSH? If so, and
> >> you want a placeholder here for flushing, you should go all the way and stub
> >> out the underlying ioctl call to return success. Then QEMU can at least
> >> rely on the error return from the flush operation.
> >
> > So what's the story behind the missing flush command?
> >
> > Does the server always use something like O_SYNC, i.e. all potential
> > write caches in the stack operate in a writethrough mode? So each write
> > request is only completed successfully if it is ensured that the data is
> > safe on disk rather than in a volatile writeback cache?
> >
> > As soon as any writeback cache can be involved (e.g. the kernel page
> > cache or a volatile disk cache) and there is no flush command (a real
> > one, not just stubbed), the driver is not operating correctly and
> > therefore not ready for inclusion.
> >
> > So Ashish, can you tell us something about caching behaviour across the
> > storage stack when vxhs is involved?
> >
> > Kevin
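
To make the expectation concrete, here is a rough sketch of the flush
handler quoted above with the error suppression removed. It only reuses
the qemu_iio_ioctl() call from the patch; whether VDISK_AIO_FLUSH
actually does anything on the server side is exactly the open question
in this thread.

/* Sketch only: same qemu_iio_ioctl() call as in the quoted patch, but the
 * result is propagated to the block layer instead of being ignored. */
int vxhs_co_flush(BlockDriverState *bs)
{
    BDRVVXHSState *s = bs->opaque;
    uint64_t size = 0;
    int ret;

    ret = qemu_iio_ioctl(s->qnio_ctx,
                         s->vdisk_hostinfo[s->vdisk_cur_host_idx].vdisk_rfd,
                         VDISK_AIO_FLUSH, &size, NULL, IIO_FLAG_SYNC);
    if (ret < 0) {
        /* Let the guest see the failed flush; only a flush that is known
         * to be a no-op end to end may safely report success here. */
        return ret;
    }

    return 0;
}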