From: Bharata B Rao <bharata@linux.vnet.ibm.com>
Date: Tue, 14 Aug 2012 15:04:31 +0530
Message-ID: <20120814093430.GE24944@in.ibm.com>
References: <20120809130010.GA7960@in.ibm.com> <20120809130216.GC7960@in.ibm.com> <5028F815.40309@redhat.com> <20120814043801.GB24944@in.ibm.com> <502A0C66.3060107@redhat.com>
In-Reply-To: <502A0C66.3060107@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v6 2/2] block: Support GlusterFS as a QEMU block backend
To: Kevin Wolf
Cc: Anthony Liguori, Anand Avati, Stefan Hajnoczi, Vijay Bellur, Amar Tumballi, qemu-devel@nongnu.org, Blue Swirl, Paolo Bonzini

On Tue, Aug 14, 2012 at 10:29:26AM +0200, Kevin Wolf wrote:
> > Yes, and that will result in port=0, which is the default. So this is to
> > cater for cases like gluster://[1:2:3:4:5]:/volname/image
>
> So you consider this a valid URL? I would have expected it to be invalid.
> But let me see, there must be some official definition of a URL...
>
> Alright, so RFC 2234 says that having no digits after the colon is
> valid. It also says that you shouldn't generate such URLs. And it
> doesn't say what it means when it's there... Common interpretation seems
> to be that it's treated as if it wasn't specified, i.e. the default port
> for the scheme is used.
>
> So if 0 is the default port for glusterfs, your code looks okay. But it
> doesn't seem to be a very useful default port number.

I know, but gluster prefers to be called with port=0, which it interprets
as the default.

While we are at this, let me bring up another issue. Gluster supports
3 transport types:

- socket, in which case the server will be a hostname, IPv4 or IPv6 address.
- rdma, in which case the server will be interpreted the same way as for
  socket.
- unix, in which case the server will be a path to a unix domain socket,
  which will look like any other filesystem path (e.g. /tmp/glusterd.socket).

I don't think we can fit 'unix' within the standard URI scheme (RFC 3986)
easily, but I am planning to specify the 'unix' transport as below:

gluster://[/path/to/unix/domain/socket]/volname/image?transport=unix

i.e., I am asking the user to put the unix domain socket path within square
brackets when the transport type is unix. Do you think this is fine?
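To make the bracket convention concrete, here is a rough standalone sketch
of how the socket path could be pulled out of such a URI. The helper name
parse_unix_socket() and the whole structure are invented for illustration;
this is not code from the patch:

#define _GNU_SOURCE             /* for strndup() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Extract the bracketed unix domain socket path from a gluster:// URI.
 * Returns a malloc'd copy of the path, or NULL if there is no bracketed
 * server part. */
static char *parse_unix_socket(const char *uri)
{
    const char *prefix = "gluster://[";
    const char *start, *end;

    if (strncmp(uri, prefix, strlen(prefix)) != 0) {
        return NULL;
    }
    start = uri + strlen(prefix);
    end = strchr(start, ']');   /* the closing bracket ends the path */
    if (!end || end == start) {
        return NULL;
    }
    return strndup(start, end - start);
}

int main(void)
{
    char *sock = parse_unix_socket(
        "gluster://[/tmp/glusterd.socket]/volname/image?transport=unix");

    if (sock) {
        printf("unix socket: %s\n", sock);  /* /tmp/glusterd.socket */
        free(sock);
    }
    return 0;
}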
> >> Is 'socket' really the only valid transport and will it stay like this
> >> without changes to qemu?
> >
> > There are others like 'unix' and 'rdma'. I will fix this error message to
> > reflect that.
> >
> > However QEMU needn't change for such transport types because I am not
> > interpreting the transport type in QEMU but instead passing it on directly
> > to GlusterFS.
>
> Maybe then just specify "[?transport=...]" instead of giving a specific
> option value?

Sure, but as I noted above, let me first finalize how to specify the 'unix'
transport type.

> >>> +static int qemu_gluster_send_pipe(BDRVGlusterState *s, GlusterAIOCB *acb)
> >>> +{
> >>> +    int ret = 0;
> >>> +    while (1) {
> >>> +        fd_set wfd;
> >>> +        int fd = s->fds[GLUSTER_FD_WRITE];
> >>> +
> >>> +        ret = write(fd, (void *)&acb, sizeof(acb));
> >>> +        if (ret >= 0) {
> >>> +            break;
> >>> +        }
> >>> +        if (errno == EINTR) {
> >>> +            continue;
> >>> +        }
> >>> +        if (errno != EAGAIN) {
> >>> +            break;
> >>> +        }
> >>> +
> >>> +        FD_ZERO(&wfd);
> >>> +        FD_SET(fd, &wfd);
> >>> +        do {
> >>> +            ret = select(fd + 1, NULL, &wfd, NULL, NULL);
> >>> +        } while (ret < 0 && errno == EINTR);
> >>
> >> What's the idea behind this? While we're hanging in this loop no one will
> >> read anything from the pipe, so it's unlikely that it magically becomes
> >> ready.
> >
> > I write to the pipe and wait for the reader to read it. The reader
> > (qemu_gluster_aio_event_reader) is already waiting on the other end of the
> > pipe.
>
> qemu_gluster_aio_event_reader() isn't called while we're looping here. It
> will only be called from the main loop, after this function has returned.

Maybe I am not understanding you correctly here. Let me be a bit verbose.

This routine is called by the AIO callback routine that we registered to be
called by gluster on AIO completion. Hence it runs in the context of a
separate (gluster) thread. It writes to the pipe and waits until the data is
read from the read end. As soon as data is available on the pipe, I think
the routine registered for reading (qemu_gluster_aio_event_reader) would be
called, and it would further handle the AIO completion.

As per my understanding, the original coroutine thread that initiated the
AIO read or write request would be blocked in qemu_aio_wait(). That thread
would wake up and run qemu_gluster_aio_event_reader(). So I am not clear why
qemu_gluster_aio_event_reader() won't be called until this routine
(qemu_gluster_send_pipe) returns.
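For what it's worth, the handshake we are discussing reduces to a small
standalone program: a worker thread stands in for the gluster completion
callback and writes an acb pointer into the pipe, while the main thread
stands in for QEMU's main loop, select()s on the read end, and completes
the request. This is only a demo of the mechanism with made-up names
(DemoAIOCB, gluster_thread), not code from the patch:

#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

typedef struct DemoAIOCB {
    int ret;                        /* result of the fake I/O */
} DemoAIOCB;

static int fds[2];                  /* [0] = read end, [1] = write end */

/* Runs in the "gluster" thread: signal completion through the pipe. */
static void *gluster_thread(void *opaque)
{
    DemoAIOCB *acb = opaque;
    ssize_t ret;

    acb->ret = 0;                   /* pretend the I/O succeeded */
    do {
        ret = write(fds[1], &acb, sizeof(acb));
    } while (ret < 0 && errno == EINTR);
    return NULL;
}

int main(void)
{
    pthread_t tid;
    DemoAIOCB req = { .ret = -1 };
    DemoAIOCB *done = NULL;
    fd_set rfd;

    if (pipe(fds) < 0) {
        return 1;
    }
    pthread_create(&tid, NULL, gluster_thread, &req);

    /* "Main loop": wait for the read end to become readable, then read
     * the acb pointer back and complete the request. */
    FD_ZERO(&rfd);
    FD_SET(fds[0], &rfd);
    select(fds[0] + 1, &rfd, NULL, NULL, NULL);
    if (read(fds[0], &done, sizeof(done)) == sizeof(done)) {
        printf("completed acb %p, ret=%d\n", (void *)done, done->ret);
    }

    pthread_join(tid, NULL);
    return 0;
}

(Compile with -pthread.)

Regards,
Bharata.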