From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============4870875021044948093==" MIME-Version: 1.0 From: Walker, Benjamin Subject: Re: [SPDK] Set not only O_DIRECT but also O_DSYNC to BDEV_AIO Date: Tue, 14 Nov 2017 18:01:08 +0000 Message-ID: <1510682467.2168.26.camel@intel.com> In-Reply-To: KAWPR01MB1282CEE4464EC5353E0CD61AA2280@KAWPR01MB1282.jpnprd01.prod.outlook.com List-ID: To: spdk@lists.01.org --===============4870875021044948093== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable On Tue, 2017-11-14 at 07:41 +0000, =E6=9D=BE=E6=9C=AC=E5=91=A8=E5=B9=B3 / M= ATSUMOTO=EF=BC=8CSHUUHEI wrote: > Hi, > = > Current BDEV AIO uses O_DIRECT to avoid IO cache effects but does not use > O_DSYNC. > O_DSYNC assures that IO is written to persistent storage. > = > About the difference of IO command sequence, > for SCSI disk O_DSYNC issues extra IO command and it may affect IO perfor= mance > but > for NVMe-SSD O_DSYNC issues no extra IO command. > Hence I estimate this is the reason of indifference of performance. > = O_DIRECT avoids using operating system caches in system memory. O_DSYNC avo= ids using volatile caches in the SSD itself by setting the FUA bit on I/O (Force Unit Access) and additionally by issuing a SCSI synchronize cache command a= fter each I/O on devices that report having a volatile write cache. This is why = you see an extra command for SCSI devices - they are reporting that they suppor= t a volatile write cache. That's very common for a SAS/SATA HDD. The SPDK API provides the user with a mechanism to query whether a block de= vice has a volatile write cache (spdk_bdev_has_write_cache) and an API to instru= ct the device to make data in its volatile caches persistent (spdk_bdev_flush)= . The existence of volatile write caches and the semantics around flushing are we= ll established in traditional block stacks and provide significant performance benefits to some types of devices (particularly lower end consumer grade devices), so we've chosen to provide those same semantics in SPDK. Altering= the spdk bdev aio module to always specify O_DSYNC will greatly reduce the performance on these types of devices, so I think choosing just O_DIRECT an= d not O_DSYNC is the correct choice for the flag. This provides the user the traditional semantics of flushing block devices that they expect. Note that the Intel P3700 does not have a volatile write cache, so sending = flush requests does nothing (it has a write cache, it just isn't volatile). The r= eason Jim was asking about preconditioning in another branch of this thread is be= cause I don't think either of us expect to see any performance difference on the = Intel P3700 when the O_DSYNC flag is added, regardless of workload. If there is a difference even after preconditioning, then it certainly warrants investiga= tion. Thanks Shuhei! Ben --===============4870875021044948093== Content-Type: application/x-pkcs7-signature MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="smime.p7s" MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIKdTCCBOsw ggPToAMCAQICEFLpAsoR6ESdlGU4L6MaMLswDQYJKoZIhvcNAQEFBQAwbzELMAkGA1UEBhMCU0Ux FDASBgNVBAoTC0FkZFRydXN0IEFCMSYwJAYDVQQLEx1BZGRUcnVzdCBFeHRlcm5hbCBUVFAgTmV0 d29yazEiMCAGA1UEAxMZQWRkVHJ1c3QgRXh0ZXJuYWwgQ0EgUm9vdDAeFw0xMzAzMTkwMDAwMDBa Fw0yMDA1MzAxMDQ4MzhaMHkxCzAJBgNVBAYTAlVTMQswCQYDVQQIEwJDQTEUMBIGA1UEBxMLU2Fu dGEgQ2xhcmExGjAYBgNVBAoTEUludGVsIENvcnBvcmF0aW9uMSswKQYDVQQDEyJJbnRlbCBFeHRl cm5hbCBCYXNpYyBJc3N1aW5nIENBIDRBMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA 4LDMgJ3YSVX6A9sE+jjH3b+F3Xa86z3LLKu/6WvjIdvUbxnoz2qnvl9UKQI3sE1zURQxrfgvtP0b Pgt1uDwAfLc6H5eqnyi+7FrPsTGCR4gwDmq1WkTQgNDNXUgb71e9/6sfq+WfCDpi8ScaglyLCRp7 ph/V60cbitBvnZFelKCDBh332S6KG3bAdnNGB/vk86bwDlY6omDs6/RsfNwzQVwo/M3oPrux6y6z yIoRulfkVENbM0/9RrzQOlyK4W5Vk4EEsfW2jlCV4W83QKqRccAKIUxw2q/HoHVPbbETrrLmE6RR Z/+eWlkGWl+mtx42HOgOmX0BRdTRo9vH7yeBowIDAQABo4IBdzCCAXMwHwYDVR0jBBgwFoAUrb2Y ejS0Jvf6xCZU7wO94CTLVBowHQYDVR0OBBYEFB5pKrTcKP5HGE4hCz+8rBEv8Jj1MA4GA1UdDwEB /wQEAwIBhjASBgNVHRMBAf8ECDAGAQH/AgEAMDYGA1UdJQQvMC0GCCsGAQUFBwMEBgorBgEEAYI3 CgMEBgorBgEEAYI3CgMMBgkrBgEEAYI3FQUwFwYDVR0gBBAwDjAMBgoqhkiG+E0BBQFpMEkGA1Ud HwRCMEAwPqA8oDqGOGh0dHA6Ly9jcmwudHJ1c3QtcHJvdmlkZXIuY29tL0FkZFRydXN0RXh0ZXJu YWxDQVJvb3QuY3JsMDoGCCsGAQUFBwEBBC4wLDAqBggrBgEFBQcwAYYeaHR0cDovL29jc3AudHJ1 c3QtcHJvdmlkZXIuY29tMDUGA1UdHgQuMCygKjALgQlpbnRlbC5jb20wG6AZBgorBgEEAYI3FAID oAsMCWludGVsLmNvbTANBgkqhkiG9w0BAQUFAAOCAQEAKcLNo/2So1Jnoi8G7W5Q6FSPq1fmyKW3 sSDf1amvyHkjEgd25n7MKRHGEmRxxoziPKpcmbfXYU+J0g560nCo5gPF78Wd7ZmzcmCcm1UFFfIx fw6QA19bRpTC8bMMaSSEl8y39Pgwa+HENmoPZsM63DdZ6ziDnPqcSbcfYs8qd/m5d22rpXq5IGVU tX6LX7R/hSSw/3sfATnBLgiJtilVyY7OGGmYKCAS2I04itvSS1WtecXTt9OZDyNbl7LtObBrgMLh ZkpJW+pOR9f3h5VG2S5uKkA7Th9NC9EoScdwQCAIw+UWKbSQ0Isj2UFL7fHKvmqWKVTL98sRzvI3 seNC4DCCBYIwggRqoAMCAQICEzMAAIu5Kz5Fe8d0qN0AAAAAi7kwDQYJKoZIhvcNAQEFBQAweTEL MAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNBMRQwEgYDVQQHEwtTYW50YSBDbGFyYTEaMBgGA1UEChMR SW50ZWwgQ29ycG9yYXRpb24xKzApBgNVBAMTIkludGVsIEV4dGVybmFsIEJhc2ljIElzc3Vpbmcg Q0EgNEEwHhcNMTcwMTA5MjEyMzU4WhcNMTgwMTA0MjEyMzU4WjBFMRkwFwYDVQQDExBXYWxrZXIs IEJlbmphbWluMSgwJgYJKoZIhvcNAQkBFhliZW5qYW1pbi53YWxrZXJAaW50ZWwuY29tMIIBIjAN BgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAxFugJYk4Vd/Yvdmr8BdnGDdCkN1bc1KNCAQBhzC/ BWXw5nxpXWMYFBkTxahM78PtuwdtPDFqoHsMNEaX0miWeYjB6zKbKl7y0LEsSxlu9wjllEdWTYOP 9/m3UC0oITDn7L01adbsD5Sin6W1FMmjcBVrD51oy2orpwfvan3TNVRRQxt8dQz38hivXnona5tt toi+V8ved7o251HApvEwW7QtDfdML+RmBKBSf0MzGjZHPzoBfRrsBUZ0yRHJxlkYNeY99EAUUHwT npsySQSf0cxLmvA6/a4qPOUSitHit+cJQ58/EOt6PLrPGAbdu5sz9O+Iv+FUJakwUtg0sAY4RQID AQABo4ICNTCCAjEwHQYDVR0OBBYEFAU2hsr+3sx/M5e5WafmYD18VvX1MB8GA1UdIwQYMBaAFB5p KrTcKP5HGE4hCz+8rBEv8Jj1MGUGA1UdHwReMFwwWqBYoFaGVGh0dHA6Ly93d3cuaW50ZWwuY29t L3JlcG9zaXRvcnkvQ1JML0ludGVsJTIwRXh0ZXJuYWwlMjBCYXNpYyUyMElzc3VpbmclMjBDQSUy MDRBLmNybDCBnwYIKwYBBQUHAQEEgZIwgY8waQYIKwYBBQUHMAKGXWh0dHA6Ly93d3cuaW50ZWwu Y29tL3JlcG9zaXRvcnkvY2VydGlmaWNhdGVzL0ludGVsJTIwRXh0ZXJuYWwlMjBCYXNpYyUyMElz c3VpbmclMjBDQSUyMDRBLmNydDAiBggrBgEFBQcwAYYWaHR0cDovL29jc3AuaW50ZWwuY29tLzAL BgNVHQ8EBAMCB4AwPAYJKwYBBAGCNxUHBC8wLQYlKwYBBAGCNxUIhsOMdYSZ5VGD/YEohY6fU4KR wAlngd69OZXwQwIBZAIBCTAfBgNVHSUEGDAWBggrBgEFBQcDBAYKKwYBBAGCNwoDDDApBgkrBgEE AYI3FQoEHDAaMAoGCCsGAQUFBwMEMAwGCisGAQQBgjcKAwwwTwYDVR0RBEgwRqApBgorBgEEAYI3 FAIDoBsMGWJlbmphbWluLndhbGtlckBpbnRlbC5jb22BGWJlbmphbWluLndhbGtlckBpbnRlbC5j b20wDQYJKoZIhvcNAQEFBQADggEBAMQUzXgrfwDLl92M7wNqp24Xe1poeurJ8YVAy5a2UukwC/uX uXE8Duoz2jMJL90QETn17H7EQQu1J7kc059H6GyDU42MkzPA3mqZQimrTgOaalPXxWXoVl/UUoLB PJZXGF3Ef1p8b1UVdSnZZ8wTD/QTUw7UhgljKZ1td/raLV1h96x6lKCVkZ0UKU8be5M3FHQ/GZJ9 CgUjvN0m2mYOUHDkNzsUTJb4bsV7vZDa3zixm4Gxu2F/uq328AEJ6JJmXA+jjFOzQ0FI8sa7XOSR 1UPvZSrwyA00M/zFZaDTln+sFPFNseYYGYFU7P711D8Wj1Hv1V/C2G4rSRBJG5f1WF8xggIXMIIC EwIBATCBkDB5MQswCQYDVQQGEwJVUzELMAkGA1UECBMCQ0ExFDASBgNVBAcTC1NhbnRhIENsYXJh MRowGAYDVQQKExFJbnRlbCBDb3Jwb3JhdGlvbjErMCkGA1UEAxMiSW50ZWwgRXh0ZXJuYWwgQmFz aWMgSXNzdWluZyBDQSA0QQITMwAAi7krPkV7x3So3QAAAACLuTAJBgUrDgMCGgUAoF0wGAYJKoZI hvcNAQkDMQsGCSqGSIb3DQEHATAcBgkqhkiG9w0BCQUxDxcNMTcxMTE0MTgwMTA3WjAjBgkqhkiG 9w0BCQQxFgQUWSQMUW4x/jKBXTEPrJkbaBmI4wUwDQYJKoZIhvcNAQEBBQAEggEAYm9E8TNxPNn7 ULpqwGKoAXi75sSmKSmkvYqmixUsoMHxuDTTU8PxcHv9povPA8DhMHqnJJk2O9EqE/kJXX7oSvOd FKi85D4HGhiyxAJP/GSAv/CslOOIn82fEU0Lguw9wejUk9RwbFeQfEHy/Z3Iuug4N02sRNEJt0hv Gv7acLLNlzgx31bV8l1jSdCdIt5+OA0VoVrS87DDcNjJ58KNUtdOMEySdcESqnsXAhPf6P29fKG2 N0FGmYEYmv5BQ4FYn4i0G89jyfwzaQDFQ5PDlC3Ndy7NShGRz+aBg++3h5oLa83jxvsdWBgrMqed cxEO5+ywngcKC5xNojpCplEoywAAAAAAAA== --===============4870875021044948093==--