From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7690BECAAA1 for ; Fri, 28 Oct 2022 13:50:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ooPk7-0004mB-GM; Fri, 28 Oct 2022 09:49:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ooPjp-0004dd-8x for qemu-devel@nongnu.org; Fri, 28 Oct 2022 09:49:42 -0400 Received: from mga14.intel.com ([192.55.52.115]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ooPji-0000tZ-Ao for qemu-devel@nongnu.org; Fri, 28 Oct 2022 09:49:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666964966; x=1698500966; h=message-id:date:subject:to:cc:references:from: in-reply-to:content-transfer-encoding:mime-version; bh=WQuEQYrdGATjY1ti77ZY+EFWj4X+W8zq+v4+JqkSiok=; b=i6fCI57PrWP3MMH8nWTjhHLqtvSthUax/cUzpIhYqqL13D927WD9Q3ue 0pyD8nTmA74R8BbgUu1abrhaCgpuB+KkQZNFmVcW8zs6zs0gmsl0H9RYZ JehIsJFgSAnlCZ5MMrOpkSxjyOx8q03ITaZXaQjxlRFwwkPyx3sftwJ0A bQL6nUgRX8AiJnyjPAS2U1xMLvv4wzJv3O7yQofyQDP3NlcL6QoKHHvS4 64rKEB+24YBT5FA8accKK+ZO+AeAQu7jzn90wj4fF4ah1HO4jqjHFunBJ y7TgPPx4ieV0o74VidsqaRRHQumRBVPRBMDVWhZscStu+3VYLgUb9c21H A==; X-IronPort-AV: E=McAfee;i="6500,9779,10513"; a="308481173" X-IronPort-AV: E=Sophos;i="5.95,221,1661842800"; d="scan'208";a="308481173" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Oct 2022 06:49:23 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10513"; a="627526622" X-IronPort-AV: E=Sophos;i="5.95,221,1661842800"; d="scan'208";a="627526622" Received: from orsmsx603.amr.corp.intel.com ([10.22.229.16]) by orsmga007.jf.intel.com with ESMTP; 28 Oct 2022 06:49:23 -0700 Received: from orsmsx610.amr.corp.intel.com (10.22.229.23) by ORSMSX603.amr.corp.intel.com (10.22.229.16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Fri, 28 Oct 2022 06:49:23 -0700 Received: from orsmsx610.amr.corp.intel.com (10.22.229.23) by ORSMSX610.amr.corp.intel.com (10.22.229.23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Fri, 28 Oct 2022 06:49:22 -0700 Received: from ORSEDG601.ED.cps.intel.com (10.7.248.6) by orsmsx610.amr.corp.intel.com (10.22.229.23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31 via Frontend Transport; Fri, 28 Oct 2022 06:49:22 -0700 Received: from NAM04-MW2-obe.outbound.protection.outlook.com (104.47.73.171) by edgegateway.intel.com (134.134.137.102) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2375.31; Fri, 28 Oct 2022 06:49:22 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=iT65ADLOtz63KyA0OF+2eXXgx8D0xroIMt1vObTuz7eWqnKZndsqloAgAVZKJ7+es+Fudh4kMAa9j4Y0vydAsvz3PmGnj0tRHahDsrBFyyxfoSF4glxwXId/qjqDhbHCu5F77FVlGlD+4WHFRovs////yRQ45lmDUC9I0j3LBbd6bwSlvUU2bceWywV/bODLVagOEvoZSklNApdczh83fatQXgpSWyZ2+GNWLo1Gyvw7Co6rH6X5zns23G0heHbpjNVJUoP+vnSZkPWetUQ/OInUQyaOBigsTSpw3/1jjJyS4VmAfQZPKHjySpnL8Nmv5UPtOcFaOR4AP8ABtm0Ymw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=qqzlwt156ZaQVbMQuNw247KhCSbLYpvmpmqnUCqGhX8=; b=IYFjVJJrZkzwIcJ2TEtcZf0C2x/0NPi3GKBbFXEt6Gc4OEPnEwwO0Yn7FFWFeQsXD/9MhfZzx1THnAZbXi5xXZoaCaC37vvFRJUnBTAGvFX4qd8LWvCLCmhIZKFA573ujtWdXpiy+LrUR2Cqg9DhLgydbPm6N6fBBnLEPz/rMX9TKhVaK86pYkGDz2geJPDGwCRNxaytq1Hccmusp19gjeFjYaOk+F/J8ZXqzZrlRgu903bUsI7ZOtiYUFddsrUUT/TYn1o1Y8mF039mNtkBCPmKINwAnMkfNOU9hpJrsI/PAfUDUUBXl4CA/IG3l2PzBnZ8cNxtiHfHWy2ykQ+c8g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=intel.com; Received: from DS0PR11MB7529.namprd11.prod.outlook.com (2603:10b6:8:141::20) by CY8PR11MB7291.namprd11.prod.outlook.com (2603:10b6:930:9b::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5746.23; Fri, 28 Oct 2022 13:49:21 +0000 Received: from DS0PR11MB7529.namprd11.prod.outlook.com ([fe80::37e3:9536:43ed:2ecf]) by DS0PR11MB7529.namprd11.prod.outlook.com ([fe80::37e3:9536:43ed:2ecf%4]) with mapi id 15.20.5746.028; Fri, 28 Oct 2022 13:49:20 +0000 Message-ID: Date: Fri, 28 Oct 2022 21:49:36 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0 Thunderbird/91.11.0 Subject: Re: [PATCH V5 4/4] intel-iommu: PASID support Content-Language: en-US To: Jason Wang , , CC: , , , References: <20221028061436.30093-1-jasowang@redhat.com> <20221028061436.30093-5-jasowang@redhat.com> From: Yi Liu In-Reply-To: <20221028061436.30093-5-jasowang@redhat.com> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 7bit X-ClientProxiedBy: SI2PR04CA0005.apcprd04.prod.outlook.com (2603:1096:4:197::16) To DS0PR11MB7529.namprd11.prod.outlook.com (2603:10b6:8:141::20) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR11MB7529:EE_|CY8PR11MB7291:EE_ X-MS-Office365-Filtering-Correlation-Id: 3628fbdd-7d84-48e5-eb7e-08dab8eb36e2 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: ymSxXikjU5UeOQ2JdyRUzMmkacxywmPSxyhxkRm7GRDvHirWKXKaQa1B/kh7iVk+z3IxOK27TjrCD5iOFAD9i2+ic6GKfNfuVoJRC1sUNuGChHkd0RzePu8NXb28jl4B5C7FK/wJLLzMs6wh77VR8B8lS2kmQ//K8n+dkWA12R3LHOVsF6z4LxZaQPIFZRFg6DGpSabUoZ1rgge1uegdMcFM7bXMEO3VhKK8DezwW2w/kOFd84WIoXDOFkgk5hkJ+pmcXOoE+7aUATSJs0CQWK0reEoeoIIZBbFXuZwCd12lbQubBbgg1HcaTzRc9pA3j9+wQk+jKP7WNKfDOxR4Qmj/4tsdfk94x3pn6qoBw5QiKdGg7hgyP/0Vt/zQBzzdpaSP7ywZ4Whg+LLCeDU5EswR9Xn5Ny6H/6NgCn9GXxjD+qtRPwT83Zcc4/QRokIM7Ekq7zQUc/W3JHW1G76ijLHhf8UBNSV5MNoL4APvgYo+6PP+CMJ+gpgsZxYQC+jeHR6HBgVRAufTz1K2cJPQLf107L3+pJqSxBxSUHZM170S5wuu9qBUyrxiTIIwZsayOipZJXAtf3HjXhS45ZBOzpxEjE9qmulKqj0Ovi+7tS3NXbsuaHKpoDjlcCQqj/+0cSxhW9/NiN0YEcLVPRHRuVfxWnQYs4GmWTEkCj+tM/DFCpDvdv1uvynRikgTMObOqwb7vGwAI1Q2zC/1pEb8gZxebtUoLrXh2I4SorUvS9U1tkltmU1vEUY7Zutvbp02kOyvlvZJfcabXT8QK1IVRf/uUyDSrWnI5f5ld2yaxVo= X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DS0PR11MB7529.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(366004)(346002)(396003)(136003)(39860400002)(376002)(451199015)(6512007)(31686004)(66946007)(36756003)(31696002)(2616005)(83380400001)(2906002)(82960400001)(38100700002)(6666004)(4326008)(186003)(66556008)(66476007)(30864003)(6486002)(26005)(53546011)(86362001)(316002)(6506007)(478600001)(8936002)(5660300002)(8676002)(41300700001)(45980500001)(43740500002)(579004); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?NjhsVzZxQS9rNXhIakpWNWhsc1NEelJBZE9aM0NXbjVFN2RwVTEzSTI4QmV3?= =?utf-8?B?cGtqVi9iL1Y3aFVOZUkxQnh4Vzh5NUxoTzJVblRDNllrQzZGVVBFRnFHZzIz?= =?utf-8?B?YlVuNTBtWWhCZWFUTUpIYkI1U1NRNzIwZzNlUW9NbkJLWkNHUVhBZkN3UVpX?= =?utf-8?B?b3VOR3lXeWRuL1d5MWM3Q2xsWmFiMGRXSUlHM1JlUnhTMWR0NWVjUVRHZGxC?= =?utf-8?B?VnVlZDJZenlVN21SZlZISUZCb1ZSaXkyUGZadS9TTmQ1YXZNbzBmckp5bUxY?= =?utf-8?B?dUVIWjJqTXhtSlRLeGsrMFJ0Zjg3Q2ZFSStGM3ViRlhpSk8rWW5vZFQ5UThV?= =?utf-8?B?Y0dZRXFtcWxaNFFGRHRsWDQxbzhqNVhYUlZNREUvTWtBRkQ4NGpuRmtzaDU3?= =?utf-8?B?RWcrSkMvcmFidFg2eVh1OTFBQ1p1ZzI1ajkrcXU2aUlkWE5YWFd2QlhoNWhO?= =?utf-8?B?bHhSa1pRUy9lTUIrT013Ti9rSGVWVmw2MTI2MGt3bGJoUWJaSUUyWHJQc3Nj?= =?utf-8?B?TzFyMVZNK1hITVNBb0kycEtheXFzemRBSEhoaTZLbG9CU3RqbXdPOVQ5ZlEr?= =?utf-8?B?NHo1UHpscDRtb0ovSmM2eFhMdS94bU9lSGxkVkNsVDVUSUN4YU95OFNVUnZ2?= =?utf-8?B?TDVsTmFPWDZXWFZvNWFLWElRTlFqeGtIYlJUdFVHMVpZTk5BcUx6dkZldDlB?= =?utf-8?B?L0ZRMkp5M3JnWmJvS1pyU0hsZ0FzdWhKYStidEYwc2wwL0JVcUVlRG1zVXpS?= =?utf-8?B?bDdIZGthemp2UklhSGdkRldlMEZQeDZKa1RQbDJTeXZYNmlETjlpOWRVYUJz?= =?utf-8?B?aHFHUDgrRTdsMERqcFFQZVRhaklwKy8xcmxid2lmRThXVUtBT1hiME16YlNa?= =?utf-8?B?MGR1QlpFbkxYYWoxS0drTEt0emh4Q2xQZUQya3NiR3hPZWIyWnpYWmJGMjE3?= =?utf-8?B?MFljZXNmVXlBaEFjWXh3S0dvd3VTeS9lNE8vcGJtMVRuK2VvK3VYY2wzRXVQ?= =?utf-8?B?TGtaQTJvVFNPMG1lSWV1RS9lSGdxUmpwUFBJOEE4MEFncnlBcHlsTFErZFQw?= =?utf-8?B?cjBDS2FUQVJPWlpyT3U3Y3lHbm1zL3RJMjYwUlNyRzQwTUR1dVJxZXBMT2hR?= =?utf-8?B?TlE3bVduNFJxVW0zNjJWMVNadUhLWHpaZzZzTTJGcHN6QlNoVDFJWk5zRGli?= =?utf-8?B?N2c3d2JPdHR0bmFhNFdScDdMbldkMTJuYks1YkNGQ09ZOU9HZVRGdEZVblpK?= =?utf-8?B?bEdVL2VMclJaaGYxUzJxSG1vaVNKU3JjYzBNQ2l6T1ZCZ3I3bnNMK1IyTVpx?= =?utf-8?B?UlJyMlI2ZzV3a2lIeDVWSHlySzlsSHEwSGcwM2JZVXIwa0c4LzEvalc2VGZQ?= =?utf-8?B?U1VESUFGVC9qSUczRkFpZDdqM1pCeFN4bERHNWVEQnZvR3NHUSszcTF2cGsy?= =?utf-8?B?czVmSlFrZjIxaitZNGt2aDYrZFQ5OG8vZS9Ea0oxYW9VNXBlUDI5VmJFWDFu?= =?utf-8?B?TVhjK044TXJyOUZDZ29wVWNncDA3eVd3b2JyK1Nza2pFQXdMb0xGUWhueE9O?= =?utf-8?B?cFZzVDAzSk03T3AvSGVUTGpGTTl3eGVFT3UwSGJFZE1lYVRCVkJQYUR0VU9M?= =?utf-8?B?V1lPSmNoK014ZkJ0UzhEdFJWNEg1SkFPVG9lOWdCZWpJWjJZZUpUd1NVSnJ2?= =?utf-8?B?SzR5TGtZdU40R2tsSEtGQ1lyTkZBcVcwbkdnTVNPSzhCVzN2bUlGbk5WQXJ1?= =?utf-8?B?WGRpWnVsVW0vQTFsTHZQTE93WDRmbDBvT2VHUExrRUN1NU1sWHE0SU5mOGda?= =?utf-8?B?SFc5dWd6cGlHV1NMUkJvTk92N2FpV3pDK3lzOVltUlJielBaWFdQeXdMWW51?= =?utf-8?B?NWNBQ3VZNnQ1djRqcCtZV0NONEtuL1U1aDhpZXR2WnU5dkZUZlQxb2p4MjVT?= =?utf-8?B?Zmx4R0M0djZ3Q1pmWklhSFhnR25na2t3KzhQcDNMaUZHSVZxSE1sQkIxWnNL?= =?utf-8?B?V2xBRURPZEVjNFFkUnp0N3JPcmdOYldpN1RLbEpWK2htUW5WSlI2QTFPcC9y?= =?utf-8?B?NEpFc3hjaU5TMmt2UmtrUTY4LzAwWnpIckpFVzZ4aStYSXI3ZjRmT1NFRlRm?= =?utf-8?Q?P0VxZmFpbU2l03oeKqdP1n7nB?= X-MS-Exchange-CrossTenant-Network-Message-Id: 3628fbdd-7d84-48e5-eb7e-08dab8eb36e2 X-MS-Exchange-CrossTenant-AuthSource: DS0PR11MB7529.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 28 Oct 2022 13:49:20.7449 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: jEo+ygCAlf0q3AbUdgdGd1GHfshMIeH561lA9mCWXSrc75LJtysgqFBpwbt1BZPssZm+N7gELzJI1D5L7c2grw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY8PR11MB7291 X-OriginatorOrg: intel.com Received-SPF: pass client-ip=192.55.52.115; envelope-from=yi.l.liu@intel.com; helo=mga14.intel.com X-Spam_score_int: -75 X-Spam_score: -7.6 X-Spam_bar: ------- X-Spam_report: (-7.6 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.516, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_HI=-5, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Qemu-devel" Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org On 2022/10/28 14:14, Jason Wang wrote: > This patch introduce ECAP_PASID via "x-pasid-mode". Based on the > existing support for scalable mode, we need to implement the following > missing parts: > > 1) tag VTDAddressSpace with PASID and support IOMMU/DMA translation > with PASID > 2) tag IOTLB with PASID > 3) PASID cache and its flush > 4) PASID based IOTLB invalidation > > For simplicity PASID cache is not implemented so we can simply > implement the PASID cache flush as a no and leave it to be implemented > in the future. For PASID based IOTLB invalidation, since we haven't > had L1 stage support, the PASID based IOTLB invalidation is not > implemented yet. For PASID based device IOTLB invalidation, it > requires the support for vhost so we forbid enabling device IOTLB when > PASID is enabled now. Those work could be done in the future. > > Note that though PASID based IOMMU translation is ready but no device > can issue PASID DMA right now. In this case, PCI_NO_PASID is used as > PASID to identify the address without PASID. vtd_find_add_as() has > been extended to provision address space with PASID which could be > utilized by the future extension of PCI core to allow device model to > use PASID based DMA translation. a quick comment. For PCI_NO_PASID. perhaps you can use PASID#0. As below code, if RPS bit is 0, then PASID#0 is used as rid2pasid, and rid2pasid is supposed to be used for translating requests without PASID. So I guess you can use pasid#0 for requests without PASID if RPS bit 0. This is high chance, currently no need to report RPS bit as 1. static int vtd_dev_get_rid2pasid(IntelIOMMUState *s, uint8_t bus_num, uint8_t devfn, uint32_t *rid_pasid) { VTDContextEntry ce; int ret; /* * Currently, ECAP.RPS bit is likely to be reported as "Clear". * And per VT-d 3.1 spec, it will use PASID #0 as RID2PASID when * RPS bit is reported as "Clear". */ if (likely(!(s->ecap & VTD_ECAP_RPS))) { *rid_pasid = 0; return 0; } /* * In future, to improve performance, could try to fetch context * entry from cache firstly. */ ret = vtd_dev_to_context_entry(s, bus_num, devfn, &ce); if (!ret) { *rid_pasid = VTD_CE_GET_RID2PASID(&ce); } return ret; } > > This feature would be useful for: > > 1) prototyping PASID support for devices like virtio > 2) future vPASID work > 3) future PRS and vSVA work > > Reviewed-by: Peter Xu > Signed-off-by: Jason Wang > --- > Changes since V3: > - rearrange the member for vtd_iotlb_key structure > - reorder the pasid parameter ahead of addr for vtd_lookup_iotlb() > - allow access size from 1 to 8 for vtd_mem_ir_fault_ops > Changes since V2: > - forbid device-iotlb with PASID > - report PASID based qualified fault > - log PASID during errors > --- > hw/i386/intel_iommu.c | 416 +++++++++++++++++++++++++-------- > hw/i386/intel_iommu_internal.h | 16 +- > hw/i386/trace-events | 2 + > include/hw/i386/intel_iommu.h | 7 +- > include/hw/pci/pci_bus.h | 2 + > 5 files changed, 339 insertions(+), 104 deletions(-) > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index 9029ee98f4..7ca077b824 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -58,6 +58,14 @@ > struct vtd_as_key { > PCIBus *bus; > uint8_t devfn; > + uint32_t pasid; > +}; > + > +struct vtd_iotlb_key { > + uint64_t gfn; > + uint32_t pasid; > + uint32_t level; > + uint16_t sid; > }; > > static void vtd_address_space_refresh_all(IntelIOMMUState *s); > @@ -199,14 +207,24 @@ static inline gboolean vtd_as_has_map_notifier(VTDAddressSpace *as) > } > > /* GHashTable functions */ > -static gboolean vtd_uint64_equal(gconstpointer v1, gconstpointer v2) > +static gboolean vtd_iotlb_equal(gconstpointer v1, gconstpointer v2) > { > - return *((const uint64_t *)v1) == *((const uint64_t *)v2); > + const struct vtd_iotlb_key *key1 = v1; > + const struct vtd_iotlb_key *key2 = v2; > + > + return key1->sid == key2->sid && > + key1->pasid == key2->pasid && > + key1->level == key2->level && > + key1->gfn == key2->gfn; > } > > -static guint vtd_uint64_hash(gconstpointer v) > +static guint vtd_iotlb_hash(gconstpointer v) > { > - return (guint)*(const uint64_t *)v; > + const struct vtd_iotlb_key *key = v; > + > + return key->gfn | ((key->sid) << VTD_IOTLB_SID_SHIFT) | > + (key->level) << VTD_IOTLB_LVL_SHIFT | > + (key->pasid) << VTD_IOTLB_PASID_SHIFT; > } > > static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2) > @@ -214,7 +232,8 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2) > const struct vtd_as_key *key1 = v1; > const struct vtd_as_key *key2 = v2; > > - return (key1->bus == key2->bus) && (key1->devfn == key2->devfn); > + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn) && > + (key1->pasid == key2->pasid); > } > > /* > @@ -302,13 +321,6 @@ static void vtd_reset_caches(IntelIOMMUState *s) > vtd_iommu_unlock(s); > } > > -static uint64_t vtd_get_iotlb_key(uint64_t gfn, uint16_t source_id, > - uint32_t level) > -{ > - return gfn | ((uint64_t)(source_id) << VTD_IOTLB_SID_SHIFT) | > - ((uint64_t)(level) << VTD_IOTLB_LVL_SHIFT); > -} > - > static uint64_t vtd_get_iotlb_gfn(hwaddr addr, uint32_t level) > { > return (addr & vtd_slpt_level_page_mask(level)) >> VTD_PAGE_SHIFT_4K; > @@ -316,15 +328,17 @@ static uint64_t vtd_get_iotlb_gfn(hwaddr addr, uint32_t level) > > /* Must be called with IOMMU lock held */ > static VTDIOTLBEntry *vtd_lookup_iotlb(IntelIOMMUState *s, uint16_t source_id, > - hwaddr addr) > + uint32_t pasid, hwaddr addr) > { > + struct vtd_iotlb_key key; > VTDIOTLBEntry *entry; > - uint64_t key; > int level; > > for (level = VTD_SL_PT_LEVEL; level < VTD_SL_PML4_LEVEL; level++) { > - key = vtd_get_iotlb_key(vtd_get_iotlb_gfn(addr, level), > - source_id, level); > + key.gfn = vtd_get_iotlb_gfn(addr, level); > + key.level = level; > + key.sid = source_id; > + key.pasid = pasid; > entry = g_hash_table_lookup(s->iotlb, &key); > if (entry) { > goto out; > @@ -338,10 +352,11 @@ out: > /* Must be with IOMMU lock held */ > static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id, > uint16_t domain_id, hwaddr addr, uint64_t slpte, > - uint8_t access_flags, uint32_t level) > + uint8_t access_flags, uint32_t level, > + uint32_t pasid) > { > VTDIOTLBEntry *entry = g_malloc(sizeof(*entry)); > - uint64_t *key = g_malloc(sizeof(*key)); > + struct vtd_iotlb_key *key = g_malloc(sizeof(*key)); > uint64_t gfn = vtd_get_iotlb_gfn(addr, level); > > trace_vtd_iotlb_page_update(source_id, addr, slpte, domain_id); > @@ -355,7 +370,13 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id, > entry->slpte = slpte; > entry->access_flags = access_flags; > entry->mask = vtd_slpt_level_page_mask(level); > - *key = vtd_get_iotlb_key(gfn, source_id, level); > + entry->pasid = pasid; > + > + key->gfn = gfn; > + key->sid = source_id; > + key->level = level; > + key->pasid = pasid; > + > g_hash_table_replace(s->iotlb, key, entry); > } > > @@ -448,7 +469,8 @@ static void vtd_set_frcd_and_update_ppf(IntelIOMMUState *s, uint16_t index) > /* Must not update F field now, should be done later */ > static void vtd_record_frcd(IntelIOMMUState *s, uint16_t index, > uint16_t source_id, hwaddr addr, > - VTDFaultReason fault, bool is_write) > + VTDFaultReason fault, bool is_write, > + bool is_pasid, uint32_t pasid) > { > uint64_t hi = 0, lo; > hwaddr frcd_reg_addr = DMAR_FRCD_REG_OFFSET + (((uint64_t)index) << 4); > @@ -456,7 +478,8 @@ static void vtd_record_frcd(IntelIOMMUState *s, uint16_t index, > assert(index < DMAR_FRCD_REG_NR); > > lo = VTD_FRCD_FI(addr); > - hi = VTD_FRCD_SID(source_id) | VTD_FRCD_FR(fault); > + hi = VTD_FRCD_SID(source_id) | VTD_FRCD_FR(fault) | > + VTD_FRCD_PV(pasid) | VTD_FRCD_PP(is_pasid); > if (!is_write) { > hi |= VTD_FRCD_T; > } > @@ -487,7 +510,8 @@ static bool vtd_try_collapse_fault(IntelIOMMUState *s, uint16_t source_id) > /* Log and report an DMAR (address translation) fault to software */ > static void vtd_report_dmar_fault(IntelIOMMUState *s, uint16_t source_id, > hwaddr addr, VTDFaultReason fault, > - bool is_write) > + bool is_write, bool is_pasid, > + uint32_t pasid) > { > uint32_t fsts_reg = vtd_get_long_raw(s, DMAR_FSTS_REG); > > @@ -514,7 +538,8 @@ static void vtd_report_dmar_fault(IntelIOMMUState *s, uint16_t source_id, > return; > } > > - vtd_record_frcd(s, s->next_frcd_reg, source_id, addr, fault, is_write); > + vtd_record_frcd(s, s->next_frcd_reg, source_id, addr, fault, > + is_write, is_pasid, pasid); > > if (fsts_reg & VTD_FSTS_PPF) { > error_report_once("There are pending faults already, " > @@ -819,13 +844,15 @@ static int vtd_get_pe_from_pasid_table(IntelIOMMUState *s, > > static int vtd_ce_get_rid2pasid_entry(IntelIOMMUState *s, > VTDContextEntry *ce, > - VTDPASIDEntry *pe) > + VTDPASIDEntry *pe, > + uint32_t pasid) > { > - uint32_t pasid; > dma_addr_t pasid_dir_base; > int ret = 0; > > - pasid = VTD_CE_GET_RID2PASID(ce); > + if (pasid == PCI_NO_PASID) { > + pasid = VTD_CE_GET_RID2PASID(ce); > + } > pasid_dir_base = VTD_CE_GET_PASID_DIR_TABLE(ce); > ret = vtd_get_pe_from_pasid_table(s, pasid_dir_base, pasid, pe); > > @@ -834,15 +861,17 @@ static int vtd_ce_get_rid2pasid_entry(IntelIOMMUState *s, > > static int vtd_ce_get_pasid_fpd(IntelIOMMUState *s, > VTDContextEntry *ce, > - bool *pe_fpd_set) > + bool *pe_fpd_set, > + uint32_t pasid) > { > int ret; > - uint32_t pasid; > dma_addr_t pasid_dir_base; > VTDPASIDDirEntry pdire; > VTDPASIDEntry pe; > > - pasid = VTD_CE_GET_RID2PASID(ce); > + if (pasid == PCI_NO_PASID) { > + pasid = VTD_CE_GET_RID2PASID(ce); > + } > pasid_dir_base = VTD_CE_GET_PASID_DIR_TABLE(ce); > > /* > @@ -888,12 +917,13 @@ static inline uint32_t vtd_ce_get_level(VTDContextEntry *ce) > } > > static uint32_t vtd_get_iova_level(IntelIOMMUState *s, > - VTDContextEntry *ce) > + VTDContextEntry *ce, > + uint32_t pasid) > { > VTDPASIDEntry pe; > > if (s->root_scalable) { > - vtd_ce_get_rid2pasid_entry(s, ce, &pe); > + vtd_ce_get_rid2pasid_entry(s, ce, &pe, pasid); > return VTD_PE_GET_LEVEL(&pe); > } > > @@ -906,12 +936,13 @@ static inline uint32_t vtd_ce_get_agaw(VTDContextEntry *ce) > } > > static uint32_t vtd_get_iova_agaw(IntelIOMMUState *s, > - VTDContextEntry *ce) > + VTDContextEntry *ce, > + uint32_t pasid) > { > VTDPASIDEntry pe; > > if (s->root_scalable) { > - vtd_ce_get_rid2pasid_entry(s, ce, &pe); > + vtd_ce_get_rid2pasid_entry(s, ce, &pe, pasid); > return 30 + ((pe.val[0] >> 2) & VTD_SM_PASID_ENTRY_AW) * 9; > } > > @@ -953,31 +984,33 @@ static inline bool vtd_ce_type_check(X86IOMMUState *x86_iommu, > } > > static inline uint64_t vtd_iova_limit(IntelIOMMUState *s, > - VTDContextEntry *ce, uint8_t aw) > + VTDContextEntry *ce, uint8_t aw, > + uint32_t pasid) > { > - uint32_t ce_agaw = vtd_get_iova_agaw(s, ce); > + uint32_t ce_agaw = vtd_get_iova_agaw(s, ce, pasid); > return 1ULL << MIN(ce_agaw, aw); > } > > /* Return true if IOVA passes range check, otherwise false. */ > static inline bool vtd_iova_range_check(IntelIOMMUState *s, > uint64_t iova, VTDContextEntry *ce, > - uint8_t aw) > + uint8_t aw, uint32_t pasid) > { > /* > * Check if @iova is above 2^X-1, where X is the minimum of MGAW > * in CAP_REG and AW in context-entry. > */ > - return !(iova & ~(vtd_iova_limit(s, ce, aw) - 1)); > + return !(iova & ~(vtd_iova_limit(s, ce, aw, pasid) - 1)); > } > > static dma_addr_t vtd_get_iova_pgtbl_base(IntelIOMMUState *s, > - VTDContextEntry *ce) > + VTDContextEntry *ce, > + uint32_t pasid) > { > VTDPASIDEntry pe; > > if (s->root_scalable) { > - vtd_ce_get_rid2pasid_entry(s, ce, &pe); > + vtd_ce_get_rid2pasid_entry(s, ce, &pe, pasid); > return pe.val[0] & VTD_SM_PASID_ENTRY_SLPTPTR; > } > > @@ -1011,18 +1044,19 @@ static bool vtd_slpte_nonzero_rsvd(uint64_t slpte, uint32_t level) > static int vtd_iova_to_slpte(IntelIOMMUState *s, VTDContextEntry *ce, > uint64_t iova, bool is_write, > uint64_t *slptep, uint32_t *slpte_level, > - bool *reads, bool *writes, uint8_t aw_bits) > + bool *reads, bool *writes, uint8_t aw_bits, > + uint32_t pasid) > { > - dma_addr_t addr = vtd_get_iova_pgtbl_base(s, ce); > - uint32_t level = vtd_get_iova_level(s, ce); > + dma_addr_t addr = vtd_get_iova_pgtbl_base(s, ce, pasid); > + uint32_t level = vtd_get_iova_level(s, ce, pasid); > uint32_t offset; > uint64_t slpte; > uint64_t access_right_check; > uint64_t xlat, size; > > - if (!vtd_iova_range_check(s, iova, ce, aw_bits)) { > - error_report_once("%s: detected IOVA overflow (iova=0x%" PRIx64 ")", > - __func__, iova); > + if (!vtd_iova_range_check(s, iova, ce, aw_bits, pasid)) { > + error_report_once("%s: detected IOVA overflow (iova=0x%" PRIx64 "," > + "pasid=0x%" PRIx32 ")", __func__, iova, pasid); > return -VTD_FR_ADDR_BEYOND_MGAW; > } > > @@ -1035,8 +1069,9 @@ static int vtd_iova_to_slpte(IntelIOMMUState *s, VTDContextEntry *ce, > > if (slpte == (uint64_t)-1) { > error_report_once("%s: detected read error on DMAR slpte " > - "(iova=0x%" PRIx64 ")", __func__, iova); > - if (level == vtd_get_iova_level(s, ce)) { > + "(iova=0x%" PRIx64 ", pasid=0x%" PRIx32 ")", > + __func__, iova, pasid); > + if (level == vtd_get_iova_level(s, ce, pasid)) { > /* Invalid programming of context-entry */ > return -VTD_FR_CONTEXT_ENTRY_INV; > } else { > @@ -1048,15 +1083,16 @@ static int vtd_iova_to_slpte(IntelIOMMUState *s, VTDContextEntry *ce, > if (!(slpte & access_right_check)) { > error_report_once("%s: detected slpte permission error " > "(iova=0x%" PRIx64 ", level=0x%" PRIx32 ", " > - "slpte=0x%" PRIx64 ", write=%d)", __func__, > - iova, level, slpte, is_write); > + "slpte=0x%" PRIx64 ", write=%d, pasid=0x%" > + PRIx32 ")", __func__, iova, level, > + slpte, is_write, pasid); > return is_write ? -VTD_FR_WRITE : -VTD_FR_READ; > } > if (vtd_slpte_nonzero_rsvd(slpte, level)) { > error_report_once("%s: detected splte reserve non-zero " > "iova=0x%" PRIx64 ", level=0x%" PRIx32 > - "slpte=0x%" PRIx64 ")", __func__, iova, > - level, slpte); > + "slpte=0x%" PRIx64 ", pasid=0x%" PRIX32 ")", > + __func__, iova, level, slpte, pasid); > return -VTD_FR_PAGING_ENTRY_RSVD; > } > > @@ -1084,9 +1120,10 @@ static int vtd_iova_to_slpte(IntelIOMMUState *s, VTDContextEntry *ce, > error_report_once("%s: xlat address is in interrupt range " > "(iova=0x%" PRIx64 ", level=0x%" PRIx32 ", " > "slpte=0x%" PRIx64 ", write=%d, " > - "xlat=0x%" PRIx64 ", size=0x%" PRIx64 ")", > + "xlat=0x%" PRIx64 ", size=0x%" PRIx64 ", " > + "pasid=0x%" PRIx32 ")", > __func__, iova, level, slpte, is_write, > - xlat, size); > + xlat, size, pasid); > return s->scalable_mode ? -VTD_FR_SM_INTERRUPT_ADDR : > -VTD_FR_INTERRUPT_ADDR; > } > @@ -1300,18 +1337,19 @@ next: > */ > static int vtd_page_walk(IntelIOMMUState *s, VTDContextEntry *ce, > uint64_t start, uint64_t end, > - vtd_page_walk_info *info) > + vtd_page_walk_info *info, > + uint32_t pasid) > { > - dma_addr_t addr = vtd_get_iova_pgtbl_base(s, ce); > - uint32_t level = vtd_get_iova_level(s, ce); > + dma_addr_t addr = vtd_get_iova_pgtbl_base(s, ce, pasid); > + uint32_t level = vtd_get_iova_level(s, ce, pasid); > > - if (!vtd_iova_range_check(s, start, ce, info->aw)) { > + if (!vtd_iova_range_check(s, start, ce, info->aw, pasid)) { > return -VTD_FR_ADDR_BEYOND_MGAW; > } > > - if (!vtd_iova_range_check(s, end, ce, info->aw)) { > + if (!vtd_iova_range_check(s, end, ce, info->aw, pasid)) { > /* Fix end so that it reaches the maximum */ > - end = vtd_iova_limit(s, ce, info->aw); > + end = vtd_iova_limit(s, ce, info->aw, pasid); > } > > return vtd_page_walk_level(addr, start, end, level, true, true, info); > @@ -1379,7 +1417,7 @@ static int vtd_ce_rid2pasid_check(IntelIOMMUState *s, > * has valid rid2pasid setting, which includes valid > * rid2pasid field and corresponding pasid entry setting > */ > - return vtd_ce_get_rid2pasid_entry(s, ce, &pe); > + return vtd_ce_get_rid2pasid_entry(s, ce, &pe, PCI_NO_PASID); > } > > /* Map a device to its corresponding domain (context-entry) */ > @@ -1462,12 +1500,13 @@ static int vtd_sync_shadow_page_hook(IOMMUTLBEvent *event, > } > > static uint16_t vtd_get_domain_id(IntelIOMMUState *s, > - VTDContextEntry *ce) > + VTDContextEntry *ce, > + uint32_t pasid) > { > VTDPASIDEntry pe; > > if (s->root_scalable) { > - vtd_ce_get_rid2pasid_entry(s, ce, &pe); > + vtd_ce_get_rid2pasid_entry(s, ce, &pe, pasid); > return VTD_SM_PASID_ENTRY_DID(pe.val[1]); > } > > @@ -1485,10 +1524,10 @@ static int vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as, > .notify_unmap = true, > .aw = s->aw_bits, > .as = vtd_as, > - .domain_id = vtd_get_domain_id(s, ce), > + .domain_id = vtd_get_domain_id(s, ce, vtd_as->pasid), > }; > > - return vtd_page_walk(s, ce, addr, addr + size, &info); > + return vtd_page_walk(s, ce, addr, addr + size, &info, vtd_as->pasid); > } > > static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as) > @@ -1532,13 +1571,14 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as) > * 1st-level translation or 2nd-level translation, it depends > * on PGTT setting. > */ > -static bool vtd_dev_pt_enabled(IntelIOMMUState *s, VTDContextEntry *ce) > +static bool vtd_dev_pt_enabled(IntelIOMMUState *s, VTDContextEntry *ce, > + uint32_t pasid) > { > VTDPASIDEntry pe; > int ret; > > if (s->root_scalable) { > - ret = vtd_ce_get_rid2pasid_entry(s, ce, &pe); > + ret = vtd_ce_get_rid2pasid_entry(s, ce, &pe, pasid); > if (ret) { > /* > * This error is guest triggerable. We should assumt PT > @@ -1572,19 +1612,20 @@ static bool vtd_as_pt_enabled(VTDAddressSpace *as) > return false; > } > > - return vtd_dev_pt_enabled(s, &ce); > + return vtd_dev_pt_enabled(s, &ce, as->pasid); > } > > /* Return whether the device is using IOMMU translation. */ > static bool vtd_switch_address_space(VTDAddressSpace *as) > { > - bool use_iommu; > + bool use_iommu, pt; > /* Whether we need to take the BQL on our own */ > bool take_bql = !qemu_mutex_iothread_locked(); > > assert(as); > > use_iommu = as->iommu_state->dmar_enabled && !vtd_as_pt_enabled(as); > + pt = as->iommu_state->dmar_enabled && vtd_as_pt_enabled(as); > > trace_vtd_switch_address_space(pci_bus_num(as->bus), > VTD_PCI_SLOT(as->devfn), > @@ -1604,11 +1645,53 @@ static bool vtd_switch_address_space(VTDAddressSpace *as) > if (use_iommu) { > memory_region_set_enabled(&as->nodmar, false); > memory_region_set_enabled(MEMORY_REGION(&as->iommu), true); > + /* > + * vt-d spec v3.4 3.14: > + * > + * """ > + * Requests-with-PASID with input address in range 0xFEEx_xxxx > + * are translated normally like any other request-with-PASID > + * through DMA-remapping hardware. > + * """ > + * > + * Need to disable ir for as with PASID. > + */ > + if (as->pasid != PCI_NO_PASID) { > + memory_region_set_enabled(&as->iommu_ir, false); > + } else { > + memory_region_set_enabled(&as->iommu_ir, true); > + } > } else { > memory_region_set_enabled(MEMORY_REGION(&as->iommu), false); > memory_region_set_enabled(&as->nodmar, true); > } > > + /* > + * vtd-spec v3.4 3.14: > + * > + * """ > + * Requests-with-PASID with input address in range 0xFEEx_xxxx are > + * translated normally like any other request-with-PASID through > + * DMA-remapping hardware. However, if such a request is processed > + * using pass-through translation, it will be blocked as described > + * in the paragraph below. > + * > + * Software must not program paging-structure entries to remap any > + * address to the interrupt address range. Untranslated requests > + * and translation requests that result in an address in the > + * interrupt range will be blocked with condition code LGN.4 or > + * SGN.8. > + * """ > + * > + * We enable per as memory region (iommu_ir_fault) for catching > + * the tranlsation for interrupt range through PASID + PT. > + */ > + if (pt && as->pasid != PCI_NO_PASID) { > + memory_region_set_enabled(&as->iommu_ir_fault, true); > + } else { > + memory_region_set_enabled(&as->iommu_ir_fault, false); > + } > + > if (take_bql) { > qemu_mutex_unlock_iothread(); > } > @@ -1709,12 +1792,15 @@ static void vtd_report_fault(IntelIOMMUState *s, > int err, bool is_fpd_set, > uint16_t source_id, > hwaddr addr, > - bool is_write) > + bool is_write, > + bool is_pasid, > + uint32_t pasid) > { > if (is_fpd_set && vtd_is_qualified_fault(err)) { > trace_vtd_fault_disabled(); > } else { > - vtd_report_dmar_fault(s, source_id, addr, err, is_write); > + vtd_report_dmar_fault(s, source_id, addr, err, is_write, > + is_pasid, pasid); > } > } > > @@ -1739,13 +1825,14 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, > uint8_t bus_num = pci_bus_num(bus); > VTDContextCacheEntry *cc_entry; > uint64_t slpte, page_mask; > - uint32_t level; > + uint32_t level, pasid = vtd_as->pasid; > uint16_t source_id = PCI_BUILD_BDF(bus_num, devfn); > int ret_fr; > bool is_fpd_set = false; > bool reads = true; > bool writes = true; > uint8_t access_flags; > + bool rid2pasid = (pasid == PCI_NO_PASID) && s->root_scalable; > VTDIOTLBEntry *iotlb_entry; > > /* > @@ -1758,15 +1845,17 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, > > cc_entry = &vtd_as->context_cache_entry; > > - /* Try to fetch slpte form IOTLB */ > - iotlb_entry = vtd_lookup_iotlb(s, source_id, addr); > - if (iotlb_entry) { > - trace_vtd_iotlb_page_hit(source_id, addr, iotlb_entry->slpte, > - iotlb_entry->domain_id); > - slpte = iotlb_entry->slpte; > - access_flags = iotlb_entry->access_flags; > - page_mask = iotlb_entry->mask; > - goto out; > + /* Try to fetch slpte form IOTLB, we don't need RID2PASID logic */ > + if (!rid2pasid) { > + iotlb_entry = vtd_lookup_iotlb(s, source_id, pasid, addr); > + if (iotlb_entry) { > + trace_vtd_iotlb_page_hit(source_id, addr, iotlb_entry->slpte, > + iotlb_entry->domain_id); > + slpte = iotlb_entry->slpte; > + access_flags = iotlb_entry->access_flags; > + page_mask = iotlb_entry->mask; > + goto out; > + } > } > > /* Try to fetch context-entry from cache first */ > @@ -1777,10 +1866,11 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, > ce = cc_entry->context_entry; > is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD; > if (!is_fpd_set && s->root_scalable) { > - ret_fr = vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set); > + ret_fr = vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set, pasid); > if (ret_fr) { > vtd_report_fault(s, -ret_fr, is_fpd_set, > - source_id, addr, is_write); > + source_id, addr, is_write, > + false, 0); > goto error; > } > } > @@ -1788,11 +1878,12 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, > ret_fr = vtd_dev_to_context_entry(s, bus_num, devfn, &ce); > is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD; > if (!ret_fr && !is_fpd_set && s->root_scalable) { > - ret_fr = vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set); > + ret_fr = vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set, pasid); > } > if (ret_fr) { > vtd_report_fault(s, -ret_fr, is_fpd_set, > - source_id, addr, is_write); > + source_id, addr, is_write, > + false, 0); > goto error; > } > /* Update context-cache */ > @@ -1803,11 +1894,15 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, > cc_entry->context_cache_gen = s->context_cache_gen; > } > > + if (rid2pasid) { > + pasid = VTD_CE_GET_RID2PASID(&ce); > + } > + > /* > * We don't need to translate for pass-through context entries. > * Also, let's ignore IOTLB caching as well for PT devices. > */ > - if (vtd_dev_pt_enabled(s, &ce)) { > + if (vtd_dev_pt_enabled(s, &ce, pasid)) { > entry->iova = addr & VTD_PAGE_MASK_4K; > entry->translated_addr = entry->iova; > entry->addr_mask = ~VTD_PAGE_MASK_4K; > @@ -1828,18 +1923,31 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, > return true; > } > > + /* Try to fetch slpte form IOTLB for RID2PASID slow path */ > + if (rid2pasid) { > + iotlb_entry = vtd_lookup_iotlb(s, source_id, pasid, addr); > + if (iotlb_entry) { > + trace_vtd_iotlb_page_hit(source_id, addr, iotlb_entry->slpte, > + iotlb_entry->domain_id); > + slpte = iotlb_entry->slpte; > + access_flags = iotlb_entry->access_flags; > + page_mask = iotlb_entry->mask; > + goto out; > + } > + } > + > ret_fr = vtd_iova_to_slpte(s, &ce, addr, is_write, &slpte, &level, > - &reads, &writes, s->aw_bits); > + &reads, &writes, s->aw_bits, pasid); > if (ret_fr) { > vtd_report_fault(s, -ret_fr, is_fpd_set, source_id, > - addr, is_write); > + addr, is_write, pasid != PCI_NO_PASID, pasid); > goto error; > } > > page_mask = vtd_slpt_level_page_mask(level); > access_flags = IOMMU_ACCESS_FLAG(reads, writes); > - vtd_update_iotlb(s, source_id, vtd_get_domain_id(s, &ce), addr, slpte, > - access_flags, level); > + vtd_update_iotlb(s, source_id, vtd_get_domain_id(s, &ce, pasid), > + addr, slpte, access_flags, level, pasid); > out: > vtd_iommu_unlock(s); > entry->iova = addr & page_mask; > @@ -2031,7 +2139,7 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id) > QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) { > if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), > vtd_as->devfn, &ce) && > - domain_id == vtd_get_domain_id(s, &ce)) { > + domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid)) { > vtd_sync_shadow_page_table(vtd_as); > } > } > @@ -2039,7 +2147,7 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id) > > static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s, > uint16_t domain_id, hwaddr addr, > - uint8_t am) > + uint8_t am, uint32_t pasid) > { > VTDAddressSpace *vtd_as; > VTDContextEntry ce; > @@ -2047,9 +2155,11 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s, > hwaddr size = (1 << am) * VTD_PAGE_SIZE; > > QLIST_FOREACH(vtd_as, &(s->vtd_as_with_notifiers), next) { > + if (pasid != PCI_NO_PASID && pasid != vtd_as->pasid) > + continue; > ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), > vtd_as->devfn, &ce); > - if (!ret && domain_id == vtd_get_domain_id(s, &ce)) { > + if (!ret && domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid)) { > if (vtd_as_has_map_notifier(vtd_as)) { > /* > * As long as we have MAP notifications registered in > @@ -2093,7 +2203,7 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id, > vtd_iommu_lock(s); > g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, &info); > vtd_iommu_unlock(s); > - vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am); > + vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am, PCI_NO_PASID); > } > > /* Flush IOTLB > @@ -3162,6 +3272,7 @@ static Property vtd_properties[] = { > DEFINE_PROP_BOOL("caching-mode", IntelIOMMUState, caching_mode, FALSE), > DEFINE_PROP_BOOL("x-scalable-mode", IntelIOMMUState, scalable_mode, FALSE), > DEFINE_PROP_BOOL("snoop-control", IntelIOMMUState, snoop_control, false), > + DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false), > DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true), > DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true), > DEFINE_PROP_END_OF_LIST(), > @@ -3436,7 +3547,64 @@ static const MemoryRegionOps vtd_mem_ir_ops = { > }, > }; > > -VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn) > +static void vtd_report_ir_illegal_access(VTDAddressSpace *vtd_as, > + hwaddr addr, bool is_write) > +{ > + IntelIOMMUState *s = vtd_as->iommu_state; > + uint8_t bus_n = pci_bus_num(vtd_as->bus); > + uint16_t sid = PCI_BUILD_BDF(bus_n, vtd_as->devfn); > + bool is_fpd_set = false; > + VTDContextEntry ce; > + > + assert(vtd_as->pasid != PCI_NO_PASID); > + > + /* Try out best to fetch FPD, we can't do anything more */ > + if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) == 0) { > + is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD; > + if (!is_fpd_set && s->root_scalable) { > + vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set, vtd_as->pasid); > + } > + } > + > + vtd_report_fault(s, VTD_FR_SM_INTERRUPT_ADDR, > + is_fpd_set, sid, addr, is_write, > + true, vtd_as->pasid); > +} > + > +static MemTxResult vtd_mem_ir_fault_read(void *opaque, hwaddr addr, > + uint64_t *data, unsigned size, > + MemTxAttrs attrs) > +{ > + vtd_report_ir_illegal_access(opaque, addr, false); > + > + return MEMTX_ERROR; > +} > + > +static MemTxResult vtd_mem_ir_fault_write(void *opaque, hwaddr addr, > + uint64_t value, unsigned size, > + MemTxAttrs attrs) > +{ > + vtd_report_ir_illegal_access(opaque, addr, true); > + > + return MEMTX_ERROR; > +} > + > +static const MemoryRegionOps vtd_mem_ir_fault_ops = { > + .read_with_attrs = vtd_mem_ir_fault_read, > + .write_with_attrs = vtd_mem_ir_fault_write, > + .endianness = DEVICE_LITTLE_ENDIAN, > + .impl = { > + .min_access_size = 1, > + .max_access_size = 8, > + }, > + .valid = { > + .min_access_size = 1, > + .max_access_size = 8, > + }, > +}; > + > +VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, > + int devfn, unsigned int pasid) > { > /* > * We can't simply use sid here since the bus number might not be > @@ -3445,6 +3613,7 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn) > struct vtd_as_key key = { > .bus = bus, > .devfn = devfn, > + .pasid = pasid, > }; > VTDAddressSpace *vtd_dev_as; > char name[128]; > @@ -3455,13 +3624,21 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn) > > new_key->bus = bus; > new_key->devfn = devfn; > + new_key->pasid = pasid; > + > + if (pasid == PCI_NO_PASID) { > + snprintf(name, sizeof(name), "vtd-%02x.%x", PCI_SLOT(devfn), > + PCI_FUNC(devfn)); > + } else { > + snprintf(name, sizeof(name), "vtd-%02x.%x-pasid-%x", PCI_SLOT(devfn), > + PCI_FUNC(devfn), pasid); > + } > > - snprintf(name, sizeof(name), "vtd-%02x.%x", PCI_SLOT(devfn), > - PCI_FUNC(devfn)); > vtd_dev_as = g_new0(VTDAddressSpace, 1); > > vtd_dev_as->bus = bus; > vtd_dev_as->devfn = (uint8_t)devfn; > + vtd_dev_as->pasid = pasid; > vtd_dev_as->iommu_state = s; > vtd_dev_as->context_cache_entry.context_cache_gen = 0; > vtd_dev_as->iova_tree = iova_tree_new(); > @@ -3502,6 +3679,24 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn) > VTD_INTERRUPT_ADDR_FIRST, > &vtd_dev_as->iommu_ir, 1); > > + /* > + * This region is used for catching fault to access interrupt > + * range via passthrough + PASID. See also > + * vtd_switch_address_space(). We can't use alias since we > + * need to know the sid which is valid for MSI who uses > + * bus_master_as (see msi_send_message()). > + */ > + memory_region_init_io(&vtd_dev_as->iommu_ir_fault, OBJECT(s), > + &vtd_mem_ir_fault_ops, vtd_dev_as, "vtd-no-ir", > + VTD_INTERRUPT_ADDR_SIZE); > + /* > + * Hook to root since when PT is enabled vtd_dev_as->iommu > + * will be disabled. > + */ > + memory_region_add_subregion_overlap(MEMORY_REGION(&vtd_dev_as->root), > + VTD_INTERRUPT_ADDR_FIRST, > + &vtd_dev_as->iommu_ir_fault, 2); > + > /* > * Hook both the containers under the root container, we > * switch between DMAR & noDMAR by enable/disable > @@ -3622,7 +3817,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > "legacy mode", > bus_n, PCI_SLOT(vtd_as->devfn), > PCI_FUNC(vtd_as->devfn), > - vtd_get_domain_id(s, &ce), > + vtd_get_domain_id(s, &ce, vtd_as->pasid), > ce.hi, ce.lo); > if (vtd_as_has_map_notifier(vtd_as)) { > /* This is required only for MAP typed notifiers */ > @@ -3632,10 +3827,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n) > .notify_unmap = false, > .aw = s->aw_bits, > .as = vtd_as, > - .domain_id = vtd_get_domain_id(s, &ce), > + .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid), > }; > > - vtd_page_walk(s, &ce, 0, ~0ULL, &info); > + vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid); > } > } else { > trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), > @@ -3735,6 +3930,10 @@ static void vtd_init(IntelIOMMUState *s) > s->ecap |= VTD_ECAP_SC; > } > > + if (s->pasid) { > + s->ecap |= VTD_ECAP_PASID; > + } > + > vtd_reset_caches(s); > > /* Define registers with default values and bit semantics */ > @@ -3808,7 +4007,7 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn) > > assert(0 <= devfn && devfn < PCI_DEVFN_MAX); > > - vtd_as = vtd_find_add_as(s, bus, devfn); > + vtd_as = vtd_find_add_as(s, bus, devfn, PCI_NO_PASID); > return &vtd_as->as; > } > > @@ -3851,6 +4050,11 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp) > return false; > } > > + if (s->pasid && !s->scalable_mode) { > + error_setg(errp, "Need to set scalable mode for PASID"); > + return false; > + } > + > return true; > } > > @@ -3887,6 +4091,16 @@ static void vtd_realize(DeviceState *dev, Error **errp) > X86MachineState *x86ms = X86_MACHINE(ms); > PCIBus *bus = pcms->bus; > IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev); > + X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s); > + > + if (s->pasid && x86_iommu->dt_supported) { > + /* PASID-based-Device-TLB Invalidate Descriptor is not > + * implemented and it requires support from vhost layer which > + * needs to be implemented in the future. > + */ > + error_setg(errp, "PASID based device IOTLB is not supported"); > + return; > + } > > if (!vtd_decide_config(s, errp)) { > return; > @@ -3913,7 +4127,7 @@ static void vtd_realize(DeviceState *dev, Error **errp) > > sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->csrmem); > /* No corresponding destroy */ > - s->iotlb = g_hash_table_new_full(vtd_uint64_hash, vtd_uint64_equal, > + s->iotlb = g_hash_table_new_full(vtd_iotlb_hash, vtd_iotlb_equal, > g_free, g_free); > s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal, > g_free, g_free); > diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h > index 930ce61feb..f090e61e11 100644 > --- a/hw/i386/intel_iommu_internal.h > +++ b/hw/i386/intel_iommu_internal.h > @@ -114,8 +114,9 @@ > VTD_INTERRUPT_ADDR_FIRST + 1) > > /* The shift of source_id in the key of IOTLB hash table */ > -#define VTD_IOTLB_SID_SHIFT 36 > -#define VTD_IOTLB_LVL_SHIFT 52 > +#define VTD_IOTLB_SID_SHIFT 20 > +#define VTD_IOTLB_LVL_SHIFT 28 > +#define VTD_IOTLB_PASID_SHIFT 30 > #define VTD_IOTLB_MAX_SIZE 1024 /* Max size of the hash table */ > > /* IOTLB_REG */ > @@ -191,6 +192,7 @@ > #define VTD_ECAP_SC (1ULL << 7) > #define VTD_ECAP_MHMV (15ULL << 20) > #define VTD_ECAP_SRS (1ULL << 31) > +#define VTD_ECAP_PASID (1ULL << 40) > #define VTD_ECAP_SMTS (1ULL << 43) > #define VTD_ECAP_SLTS (1ULL << 46) > > @@ -211,6 +213,8 @@ > #define VTD_CAP_DRAIN_READ (1ULL << 55) > #define VTD_CAP_DRAIN (VTD_CAP_DRAIN_READ | VTD_CAP_DRAIN_WRITE) > #define VTD_CAP_CM (1ULL << 7) > +#define VTD_PASID_ID_SHIFT 20 > +#define VTD_PASID_ID_MASK ((1ULL << VTD_PASID_ID_SHIFT) - 1) > > /* Supported Adjusted Guest Address Widths */ > #define VTD_CAP_SAGAW_SHIFT 8 > @@ -262,6 +266,8 @@ > #define VTD_FRCD_SID(val) ((val) & VTD_FRCD_SID_MASK) > /* For the low 64-bit of 128-bit */ > #define VTD_FRCD_FI(val) ((val) & ~0xfffULL) > +#define VTD_FRCD_PV(val) (((val) & 0xffffULL) << 40) > +#define VTD_FRCD_PP(val) (((val) & 0x1) << 31) > > /* DMA Remapping Fault Conditions */ > typedef enum VTDFaultReason { > @@ -379,6 +385,11 @@ typedef union VTDInvDesc VTDInvDesc; > #define VTD_INV_DESC_IOTLB_AM(val) ((val) & 0x3fULL) > #define VTD_INV_DESC_IOTLB_RSVD_LO 0xffffffff0000ff00ULL > #define VTD_INV_DESC_IOTLB_RSVD_HI 0xf80ULL > +#define VTD_INV_DESC_IOTLB_PASID_PASID (2ULL << 4) > +#define VTD_INV_DESC_IOTLB_PASID_PAGE (3ULL << 4) > +#define VTD_INV_DESC_IOTLB_PASID(val) (((val) >> 32) & VTD_PASID_ID_MASK) > +#define VTD_INV_DESC_IOTLB_PASID_RSVD_LO 0xfff00000000001c0ULL > +#define VTD_INV_DESC_IOTLB_PASID_RSVD_HI 0xf80ULL > > /* Mask for Device IOTLB Invalidate Descriptor */ > #define VTD_INV_DESC_DEVICE_IOTLB_ADDR(val) ((val) & 0xfffffffffffff000ULL) > @@ -413,6 +424,7 @@ typedef union VTDInvDesc VTDInvDesc; > /* Information about page-selective IOTLB invalidate */ > struct VTDIOTLBPageInvInfo { > uint16_t domain_id; > + uint32_t pasid; > uint64_t addr; > uint8_t mask; > }; > diff --git a/hw/i386/trace-events b/hw/i386/trace-events > index e49814dd64..04fd71bfc4 100644 > --- a/hw/i386/trace-events > +++ b/hw/i386/trace-events > @@ -12,6 +12,8 @@ vtd_inv_desc_cc_devices(uint16_t sid, uint16_t fmask) "context invalidate device > vtd_inv_desc_iotlb_global(void) "iotlb invalidate global" > vtd_inv_desc_iotlb_domain(uint16_t domain) "iotlb invalidate whole domain 0x%"PRIx16 > vtd_inv_desc_iotlb_pages(uint16_t domain, uint64_t addr, uint8_t mask) "iotlb invalidate domain 0x%"PRIx16" addr 0x%"PRIx64" mask 0x%"PRIx8 > +vtd_inv_desc_iotlb_pasid_pages(uint16_t domain, uint64_t addr, uint8_t mask, uint32_t pasid) "iotlb invalidate domain 0x%"PRIx16" addr 0x%"PRIx64" mask 0x%"PRIx8" pasid 0x%"PRIx32 > +vtd_inv_desc_iotlb_pasid(uint16_t domain, uint32_t pasid) "iotlb invalidate domain 0x%"PRIx16" pasid 0x%"PRIx32 > vtd_inv_desc_wait_sw(uint64_t addr, uint32_t data) "wait invalidate status write addr 0x%"PRIx64" data 0x%"PRIx32 > vtd_inv_desc_wait_irq(const char *msg) "%s" > vtd_inv_desc_wait_write_fail(uint64_t hi, uint64_t lo) "write fail for wait desc hi 0x%"PRIx64" lo 0x%"PRIx64 > diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h > index e49fff2a6c..46d973e629 100644 > --- a/include/hw/i386/intel_iommu.h > +++ b/include/hw/i386/intel_iommu.h > @@ -97,11 +97,13 @@ struct VTDPASIDEntry { > struct VTDAddressSpace { > PCIBus *bus; > uint8_t devfn; > + uint32_t pasid; > AddressSpace as; > IOMMUMemoryRegion iommu; > MemoryRegion root; /* The root container of the device */ > MemoryRegion nodmar; /* The alias of shared nodmar MR */ > MemoryRegion iommu_ir; /* Interrupt region: 0xfeeXXXXX */ > + MemoryRegion iommu_ir_fault; /* Interrupt region for catching fault */ > IntelIOMMUState *iommu_state; > VTDContextCacheEntry context_cache_entry; > QLIST_ENTRY(VTDAddressSpace) next; > @@ -113,6 +115,7 @@ struct VTDAddressSpace { > struct VTDIOTLBEntry { > uint64_t gfn; > uint16_t domain_id; > + uint32_t pasid; > uint64_t slpte; > uint64_t mask; > uint8_t access_flags; > @@ -261,6 +264,7 @@ struct IntelIOMMUState { > uint8_t aw_bits; /* Host/IOVA address width (in bits) */ > bool dma_drain; /* Whether DMA r/w draining enabled */ > bool dma_translation; /* Whether DMA translation supported */ > + bool pasid; /* Whether to support PASID */ > > /* > * Protects IOMMU states in general. Currently it protects the > @@ -272,6 +276,7 @@ struct IntelIOMMUState { > /* Find the VTD Address space associated with the given bus pointer, > * create a new one if none exists > */ > -VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn); > +VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, > + int devfn, unsigned int pasid); > > #endif > diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h > index eb94e7e85c..5653175957 100644 > --- a/include/hw/pci/pci_bus.h > +++ b/include/hw/pci/pci_bus.h > @@ -28,6 +28,8 @@ enum PCIBusFlags { > PCI_BUS_CXL = 0x0004, > }; > > +#define PCI_NO_PASID UINT32_MAX > + > struct PCIBus { > BusState qbus; > enum PCIBusFlags flags; -- Regards, Yi Liu