From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42075)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jcody@redhat.com>) id 1c7lt2-0000VW-RJ
	for qemu-devel@nongnu.org; Fri, 18 Nov 2016 11:19:41 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jcody@redhat.com>) id 1c7lsy-0003mG-Rl
	for qemu-devel@nongnu.org; Fri, 18 Nov 2016 11:19:36 -0500
Received: from mx1.redhat.com ([209.132.183.28]:38910)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <jcody@redhat.com>) id 1c7lsy-0003lp-HD
	for qemu-devel@nongnu.org; Fri, 18 Nov 2016 11:19:32 -0500
Date: Fri, 18 Nov 2016 11:19:28 -0500
From: Jeff Cody <jcody@redhat.com>
Message-ID: <20161118161928.GB18745@localhost.localdomain>
References: <1475035789-685-1-git-send-email-ashish.mittal@veritas.com>
	<20160928214510.GA2837@stefanha-x1.localdomain>
	<CAAo6VWPbVYia_JsbtUhqwAufr1_1xOWsFmSxP1NfRijAOv54bg@mail.gmail.com>
	<CAJSP0QW-44yjuvXAc3dpUKaHb8ULSaid6DLhQH23Y3AVpvuNSQ@mail.gmail.com>
	<20161118072621.GA2607@localhost.localdomain>
	<D8AD54FE-DE7F-4809-BAC7-270CA6F56795@veritas.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <D8AD54FE-DE7F-4809-BAC7-270CA6F56795@veritas.com>
Subject: Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add
 Veritas HyperScale VxHS block device support
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Ketan Nilangekar <Ketan.Nilangekar@veritas.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>, ashish mittal <ashmit602@gmail.com>, qemu-devel <qemu-devel@nongnu.org>, Paolo Bonzini <pbonzini@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Markus Armbruster <armbru@redhat.com>, "Daniel P. Berrange" <berrange@redhat.com>, Fam Zheng <famz@redhat.com>, Ashish Mittal <Ashish.Mittal@veritas.com>, Abhijit Dey <Abhijit.Dey@veritas.com>, Buddhi Madhav <Buddhi.Madhav@veritas.com>, "Venkatesha M.G." <Venkatesha.Mg@veritas.com>, Abhishek Kane <Abhishek.Kane@veritas.com>

On Fri, Nov 18, 2016 at 10:34:54AM +0000, Ketan Nilangekar wrote:
> 
> 
> 
> 
> 
> On 11/18/16, 12:56 PM, "Jeff Cody" <jcody@redhat.com> wrote:
> 
> >On Wed, Nov 16, 2016 at 08:12:41AM +0000, Stefan Hajnoczi wrote:
> >> On Tue, Nov 15, 2016 at 10:38 PM, ashish mittal <ashmit602@gmail.com> wrote:
> >> > On Wed, Sep 28, 2016 at 2:45 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >> >> On Tue, Sep 27, 2016 at 09:09:49PM -0700, Ashish Mittal wrote:
> >> >> 5.
> >> >> I don't see any endianness handling or portable alignment of struct
> >> >> fields in the network protocol code.  Binary network protocols need to
> >> >> take care of these issue for portability.  This means libqnio compiled
> >> >> for different architectures will not work.  Do you plan to support any
> >> >> other architectures besides x86?
> >> >>
> >> >
> >> > No, we support only x86 and do not plan to support any other arch.
> >> > Please let me know if this necessitates any changes to the configure
> >> > script.
> >> 
> >> I think no change to ./configure is necessary.  The library will only
> >> ship on x86 so other platforms will never attempt to compile the code.
> >> 
> >> >> 6.
> >> >> The networking code doesn't look robust: kvset uses assert() on input
> >> >> from the network so the other side of the connection could cause SIGABRT
> >> >> (coredump), the client uses the msg pointer as the cookie for the
> >> >> response packet so the server can easily crash the client by sending a
> >> >> bogus cookie value, etc.  Even on the client side these things are
> >> >> troublesome but on a server they are guaranteed security issues.  I
> >> >> didn't look into it deeply.  Please audit the code.
> >> >>
> >> >
> >> > By design, our solution on OpenStack platform uses a closed set of
> >> > nodes communicating on dedicated networks. VxHS servers on all the
> >> > nodes are on a dedicated network. Clients (qemu) connects to these
> >> > only after reading the server IP from the XML (read by libvirt). The
> >> > XML cannot be modified without proper access. Therefore, IMO this
> >> > problem would be  relevant only if someone were to use qnio as a
> >> > generic mode of communication/data transfer, but for our use-case, we
> >> > will not run into this problem. Is this explanation acceptable?
> >> 
> >> No.  The trust model is that the guest is untrusted and in the worst
> >> case may gain code execution in QEMU due to security bugs.
> >> 
> >> You are assuming block/vxhs.c and libqnio are trusted but that
> >> assumption violates the trust model.
> >> 
> >> In other words:
> >> 1. Guest exploits a security hole inside QEMU and gains code execution
> >> on the host.
> >> 2. Guest uses VxHS client file descriptor on host to send a malicious
> >> packet to VxHS server.
> >> 3. VxHS server is compromised by guest.
> >> 4. Compromised VxHS server sends malicious packets to all other
> >> connected clients.
> >> 5. All clients have been compromised.
> >> 
> >> This means both the VxHS client and server must be robust.  They have
> >> to validate inputs to avoid buffer overflows, assertion failures,
> >> infinite loops, etc.
> >> 
> >> Stefan
> >
> >
> >The libqnio code is important with respect to the VxHS driver.  It is a bit
> >different than other existing external protocol drivers, in that the current
> >user and developer base is small, and the code itself is pretty new.  So I
> >think for the VxHS driver here upstream, we really do need to get some of
> >the libqnio issues squared away.  I don't know if we've ever explicitly
> >address the extent to which libqnio issues affect the driver
> >merging, so I figure it is probably worth discussing here.
> >
> >To try and consolidate libqnio discussion, here is what I think I've read /
> >seen from others as the major issues that should be addressed in libqnio:
> >
> >* Code auditing, static analysis, and general code cleanup.  Things like
> >  memory leaks shouldn't be happening, and some prior libqnio compiler
> >  warnings imply that there is more code analysis that should be done with
> >  libqnio.
> >
> >  (With regards to memory leaks:  Valgrind may be useful to track these down:
> >
> >    # valgrind  ./qemu-io -c 'write -pP 0xae 66000 128k' \
> >            vxhs://localhost/test.raw
> >
> >    ==30369== LEAK SUMMARY:
> >    ==30369==    definitely lost: 4,168 bytes in 2 blocks
> >    ==30369==    indirectly lost: 1,207,720 bytes in 58,085 blocks) 
> 
> We have done and are doing exhaustive memory leak tests using valgrind.
> Memory leaks within qnio have been addressed to some extent. We will post
> detailed valgrind results to this thread.
>

That is good to hear.  I ran the above on the latest HEAD from the qnio
github repo, so I look forward to checking out the latest code once it is
available.

> >
> >* Potential security issues such as buffer overruns, input validation, etc., 
> >  need to be audited.
> 
> We have known a few such issues from previous comments and have addressed
> some of those. If there are any important outstanding ones, please let us
> know and we will fix them on priority.
>

One concern is that the issues noted are not from an exhaustive review on
Stefan's part, AFAIK.  When Stefan called for auditing the code, that is
really a call to look for other potential security flaws as well, using the
perspective he outlined on the trust model.


> >
> >* Async operations need to be truly asynchronous, without blocking calls.
> 
> There is only one blocking call in libqnio reconnect which we has been
> pointed out. We will fix this soon.
>

Great, thanks!


> >
> >* Daniel pointed out that there is no authentication method for taking to a
> >  remote server.  This seems a bit scary.  Maybe all that is needed here is
> >  some clarification of the security scheme for authentication?  My
> >  impression from above is that you are relying on the networks being
> >  private to provide some sort of implicit authentication, though, and this
> >  seems fragile (and doesn't protect against a compromised guest or other
> >  process on the server, for one).
> 
> Our auth scheme is based on network isolation at L2/L3 level. If there is
> a simplified authentication mechanism which we can implement without
> imposing significant penalties on IO performance, please let us know and
> we will implement that if feasible.
> 
> >
> >(if I've missed anything, please add it here!)
> >
> >-Jeff