Message-ID: <4E5414A4.2040603@redhat.com>
Date: Tue, 23 Aug 2011 16:59:16 -0400
From: Don Dutile
References: <4E53E328.90601@siemens.com> <20110823181751.GB6326@redhat.com>
 <1314123707.2859.87.camel@bling.home> <20110823182632.GA6489@redhat.com>
 <1314126740.2859.108.camel@bling.home> <20110823193008.GC6489@redhat.com>
In-Reply-To: <20110823193008.GC6489@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] pci: Error on PCI capability collisions
To: "Michael S. Tsirkin"
Cc: Jan Kiszka , Alex Williamson , qemu-devel

On 08/23/2011 03:30 PM, Michael S. Tsirkin wrote:
> On Tue, Aug 23, 2011 at 01:12:19PM -0600, Alex Williamson wrote:
>> On Tue, 2011-08-23 at 21:26 +0300, Michael S. Tsirkin wrote:
>>> On Tue, Aug 23, 2011 at 12:21:47PM -0600, Alex Williamson wrote:
>>>> On Tue, 2011-08-23 at 21:17 +0300, Michael S. Tsirkin wrote:
>>>>> On Tue, Aug 23, 2011 at 07:28:08PM +0200, Jan Kiszka wrote:
>>>>>> From: Alex Williamson
>>>>>>
>>>>>> Nothing good can happen when we overlap capabilities
>>>>>>
>>>>>> [ Jan: rebased over qemu, minor formatting ]
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka
>>>>>
>>>>> I'll stick an assert there instead.
>>>>> Normal devices don't generate overlapping caps unless there's a bug,
>>>>> and device assignment should do its own checks.
>>>>>
>>>>> I really have a mind to rip out the used array too.
>>>>
>>>> So you'd rather kill qemu rather than have a reasonable error return
>>>> path... great :(
>>>>
>>>> Alex
>>>
>>> Well that will make it possible to make pci_add_capability return void,
>>> less work for callers :) Dev assignment is really the only place where
>>> capability offsets need to be verified.
>>
>> A few issues with that... Since when is error handling so difficult that
>> we need to pretend that nothing ever fails just to make it easy for the
>> caller?
>
> It isn't but no need to introduce error codes just for fun.
>
>> Why is device assignment such a special case?
>
> Assigned devices are under the guest control so should be assumed
> untrusted, and we must verify anything we get from them.
>
> For example, I think it's generally a mistake to read a device
> register and use that as an array index, we must check it's in range
> first. It's best to do these range checks in the dev assignment code
> so that it's easy to verify that all values are used safely.
>

So we want to pollute the dev assignment code with knowledge of this
array for bounds checking, the same array you're threatening to remove?
The patch is simple, the return error checking is simple, and once we
write error-free code, we can remove all error checking.

I found the current array and its error checking fairly handy when the
array was overflowed, which resulted in oddly succeeding/failing
sequences during device assignment (yes, due to bad hardware --
shocking! ;-) ). The error checking quickly pointed out the problem and
made it easy to debug. I would expect those writing the code would
appreciate keeping the array and its related checking, and would find
overlap and bounds checking a welcome addition.
Adding such checks in each potentially erroring caller doesn't reduce
the code size (they'll have to be replicated in several areas), and the
return check is simple and common (and already exists), so removing it
will be more work than augmenting the existing framework. Additionally,
that assumes each coder creates the correct check, in different
variants/locations.

ACK to Jan's patch.

>> It's actually
>> rather ironic that we're trying to add error checking to catch bugs that
>> real hardware is exposing, but assuming that emulated drivers always get
>> it right. How will a return void help the emulated driver that has a
>> coding error?
>
> Drivers use fixed offsets so they will always fail or always work.
> If we return an error they might seem to work but behave incorrectly
> without the right capability.
>
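
To make the trade-off concrete, here is a minimal sketch of an overlap
and bounds check with an error return path. It is not the QEMU code:
`toy_pci_dev`, `toy_add_capability`, and the 256-byte `used` map are
hypothetical names chosen for illustration, and real capability handling
involves more than claiming a byte range.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_CONFIG_SPACE_SIZE 256 /* conventional PCI config space, in bytes */

/* Hypothetical device model, not QEMU's PCIDevice. */
struct toy_pci_dev {
    uint8_t used[TOY_CONFIG_SPACE_SIZE]; /* 1 = byte claimed by a capability */
};

/*
 * Claim the config-space range [offset, offset + size) for a capability.
 * Returns 0 on success, or -EINVAL if the range is empty, runs past the
 * end of config space, or overlaps a previously claimed range.
 */
static int toy_add_capability(struct toy_pci_dev *dev,
                              unsigned offset, unsigned size)
{
    unsigned i;

    /* Bounds check first, so the loop below never indexes past used[]. */
    if (size == 0 || offset >= TOY_CONFIG_SPACE_SIZE ||
        size > TOY_CONFIG_SPACE_SIZE - offset) {
        return -EINVAL;
    }

    /* Overlap check: refuse if any byte in the range is already taken. */
    for (i = offset; i < offset + size; i++) {
        if (dev->used[i]) {
            return -EINVAL;
        }
    }

    /* Range is free: mark it as used. */
    memset(dev->used + offset, 1, size);
    return 0;
}
```

With a return value like this, a caller can report the collision and
fail setup of that one device, where an assert would take down the
whole process.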