From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59247) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YIcMQ-0001mU-MK for qemu-devel@nongnu.org; Tue, 03 Feb 2015 07:13:43 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YIcMN-0003oT-FS for qemu-devel@nongnu.org; Tue, 03 Feb 2015 07:13:42 -0500 Received: from mx-v6.kamp.de ([2a02:248:0:51::16]:37173 helo=mx01.kamp.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YIcMN-0003nX-5y for qemu-devel@nongnu.org; Tue, 03 Feb 2015 07:13:39 -0500 Message-ID: <54D0BB6F.8060407@kamp.de> Date: Tue, 03 Feb 2015 13:13:35 +0100 From: Peter Lieven MIME-Version: 1.0 References: <1422907754-21543-1-git-send-email-den@openvz.org> <1422907754-21543-2-git-send-email-den@openvz.org> <54CFE0BA.4040902@kamp.de> <54CFE22E.5000309@openvz.org> <54D07952.1080500@parallels.com> <54D0B16A.7060600@kamp.de> <20150203113718.GF4488@noname.redhat.com> <54D0B550.8060101@kamp.de> <54D0B64C.30609@parallels.com> In-Reply-To: <54D0B64C.30609@parallels.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 1/2] glusterfs: fix max_discard List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Denis V. Lunev" , Kevin Wolf Cc: Paolo Bonzini , qemu-devel@nongnu.org Am 03.02.2015 um 12:51 schrieb Denis V. Lunev: > On 03/02/15 14:47, Peter Lieven wrote: >> Am 03.02.2015 um 12:37 schrieb Kevin Wolf: >>> Am 03.02.2015 um 12:30 hat Peter Lieven geschrieben: >>>> Am 03.02.2015 um 08:31 schrieb Denis V. Lunev: >>>>> On 02/02/15 23:46, Denis V. Lunev wrote: >>>>>> On 02/02/15 23:40, Peter Lieven wrote: >>>>>>> Am 02.02.2015 um 21:09 schrieb Denis V. Lunev: >>>>>>>> qemu_gluster_co_discard calculates size to discard as follows >>>>>>>> size_t size = nb_sectors * BDRV_SECTOR_SIZE; >>>>>>>> ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb); >>>>>>>> >>>>>>>> glfs_discard_async is declared as follows: >>>>>>>> int glfs_discard_async (glfs_fd_t *fd, off_t length, size_t lent, >>>>>>>> glfs_io_cbk fn, void *data) __THROW >>>>>>>> This is problematic on i686 as sizeof(size_t) == 4. >>>>>>>> >>>>>>>> Set bl_max_discard to SIZE_MAX >> BDRV_SECTOR_BITS to avoid overflow >>>>>>>> on i386. >>>>>>>> >>>>>>>> Signed-off-by: Denis V. Lunev >>>>>>>> CC: Kevin Wolf >>>>>>>> CC: Peter Lieven >>>>>>>> --- >>>>>>>> block/gluster.c | 9 +++++++++ >>>>>>>> 1 file changed, 9 insertions(+) >>>>>>>> >>>>>>>> diff --git a/block/gluster.c b/block/gluster.c >>>>>>>> index 1eb3a8c..8a8c153 100644 >>>>>>>> --- a/block/gluster.c >>>>>>>> +++ b/block/gluster.c >>>>>>>> @@ -622,6 +622,11 @@ out: >>>>>>>> return ret; >>>>>>>> } >>>>>>>> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp) >>>>>>>> +{ >>>>>>>> + bs->bl.max_discard = MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX); >>>>>>>> +} >>>>>>>> + >>>>>>> Looking at the gluster code bl.max_transfer_length should have the same limit, but thats a different patch. >>>>>> ha, the same applies to nbd code too. >>>>>> >>>>>> I'll do this stuff tomorrow and also I think that some >>>>>> audit in other drivers could reveal something interesting. >>>>>> >>>>>> Den >>>>> ok. The situation is well rotten here on i686. >>>>> >>>>> The problem comes from the fact that QEMUIOVector >>>>> and iovec uses size_t as length. All API calls use >>>>> this abstraction. Thus all conversion operations >>>>> from nr_sectors to size could bang at any moment. >>>>> >>>>> Putting dirty hands here is problematic from my point >>>>> of view. Should we really care about this? 32bit >>>>> applications are becoming old good history of IT... >>>> The host has to be 32bit to be in trouble. And at least if we have KVM the host >>>> has to support long mode. >>>> >>>> I have on my todo to add generic code for honouring bl.max_transfer_length >>>> in block.c. We could change default maximum from INT_MAX to SIZE_MAX >> BDRV_SECTOR_BITS >>>> for bl.max_transfer_length. >>> So the conclusion is that we'll apply this series as it is and you'll >>> take care of the rest later? >> Yes, and actually we need a macro like >> >> #define BDRV_MAX_REQUEST_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX) >> >> as limit for everything. Because bdrv_check_byte_request already has a size_t argument. >> So we could already create an overflow in bdrv_check_request when we convert >> nb_sectors to size_t. >> >> I will create a patch to catch at least this overflow shortly. >> >> Peter >> > I like this macro :) I see I looked at an old checkout in bdrv_check_request there is already such a check meanwhile. static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num, int nb_sectors) { if (nb_sectors < 0 || nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) { return -EIO; } return bdrv_check_byte_request(bs, sector_num * BDRV_SECTOR_SIZE, nb_sectors * BDRV_SECTOR_SIZE); } > > I vote to move MIN(SIZE_MAX >> BDRV_SECTOR_BITS, INT_MAX) into generic code > on discard/write_zero paths immediately and drop this exact patch. > > Patch 2 of this set would be better to have additional > +bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS; > > I'll wait Peter's patch and respin on top of it to avoid unnecessary commits. > > Den