From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.6 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6FFA8C433DF for ; Tue, 26 May 2020 16:20:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 495BD2073B for ; Tue, 26 May 2020 16:20:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="goDmTWmf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727110AbgEZQUQ (ORCPT ); Tue, 26 May 2020 12:20:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35642 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726282AbgEZQUQ (ORCPT ); Tue, 26 May 2020 12:20:16 -0400 Received: from mail-vs1-xe36.google.com (mail-vs1-xe36.google.com [IPv6:2607:f8b0:4864:20::e36]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 603B6C03E96D for ; Tue, 26 May 2020 09:20:16 -0700 (PDT) Received: by mail-vs1-xe36.google.com with SMTP id e7so12144414vsm.6 for ; Tue, 26 May 2020 09:20:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=e1lFxIAJ9dpaucvsxWY6az6Xo7utzX7wkq2rEO3BC1Y=; b=goDmTWmfSUcurp0dgIS0zO6w//UllpIncUVrLajyRx7m1Ye6kvllVJXNjph7nNQx9C mWRxpo4fRan7W+Y12VHWbYRBQFrCJ7q34UDLgwhjMbxS2wCHruXFJ2AjCen2PZldsLXU dYpqRqcb/gIcABsL4FVPoCrMhmvf2EYF/UCv6Kpf7vRpyrZGs4LV3Qkxnz/0K2RSHR0R kU936UnqQ/xk6niArDCuSu2mXD3jO/ilxW7ivh9HN5hOl8JuQ8gMI1UUv6dfFbFPe27q 1RKJjQdKn0UAqzJa/GtWC4fKjvSrBnd5PlJTyv5d00fbOQYjnNmlZnLrPXSa51AOCOVt hP+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=e1lFxIAJ9dpaucvsxWY6az6Xo7utzX7wkq2rEO3BC1Y=; b=DUN9EaLDKBPmBFC+6sbtg6wM321IBCya2RrssJ47cG7MAJ0dBisyxYLnL4H+xASPCJ MhNyzcApGc4x0fCG0l5iV5H67S4KrAfAe6HnK0kXAfB5NZFwXi69axd6hvxuklJ5ZdTj Q7d97JkI9VCe+qI5jr1Z7/hw5sjg6HK5l8uvmHBU4h8T7CRLKwD3fzpKI+cg59RsZl/A JhmWaZ3R8fZPBsQn1AHbq3VXKj22gIrtK1q8btp5BX8oxlnr/UN1IL6u8dxqDevmujJz Xa8bBdTWUpHGftwD0qQJfUWlSGrCXNjv3kenn0Xe+PeN9t5JlK1JW/o1jPSJkNe9zYkb PvEQ== X-Gm-Message-State: AOAM530jNpuXKka/6t8nzuwDCYtQVKyuduRduZX0FBJV1S6FlW9AX2KD UpZNCPqaMAbo0LsZ5SGDXoBr2PtV X-Google-Smtp-Source: ABdhPJxVbjLI1xbehMbsY/N/slH7uHIGHc6YcymwZG/7jpRYUbo7/lmL01ViN0Vou/xNrxsPhnLfKg== X-Received: by 2002:a67:f557:: with SMTP id z23mr1637751vsn.32.1590510015536; Tue, 26 May 2020 09:20:15 -0700 (PDT) Received: from google.com (239.145.196.35.bc.googleusercontent.com. [35.196.145.239]) by smtp.gmail.com with ESMTPSA id n17sm25736vkn.29.2020.05.26.09.20.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 May 2020 09:20:14 -0700 (PDT) Date: Tue, 26 May 2020 16:20:13 +0000 From: Dennis Zhou To: xiaohui li Cc: dennisszhou@gmail.com, linux-block@vger.kernel.org Subject: Re: question about the usage of avg_lat in io latency controller Message-ID: <20200526162013.GA37786@google.com> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Hello, On Tue, May 26, 2020 at 07:27:34PM +0800, xiaohui li wrote: > hi Dennis Zhou and other who are familiar with io latency controller : > > i have a question about the io latency controller : > how to configure the min latency value of a blk cgroup ? > using the avg_lat value may be not right. > > from the Documentation/admin-guide/cgroup-v2.rst, i know we can do > configuring work in this way: > ------------------- > Use the avg_lat value as a basis for > your real setting, setting at 10-15% higher than the value in io.stat > ------------------- > > but when i have found the avg_lat value is the total sum of running > average of io latency in the past time, and it can't reflect the > average time cost of per single IO request. > but whether one thread can be throttled depends on the compare result > of stat.mean and iolat->min_lat_nsec. > and stat.mean can reflect the average time cost of per single IO request. > > so from above analysis, if do the configuring min io latency value > work of a blk group, use the avg_lat may not be appropriate, because > it is the total sum of running average. > why not make use of the stat.mean to do the configuring min io latency > value work ? > The logic isn't perfect. The idea is that you only care about this metric when the device under some constant load. For spinning disks, it's calculated by contributing the average latency of an interval cur_win_nsec's long to an exponential moving average. See iolat_update_total_lat_avg(). > one experiment on my device: > cat io.stat > 8:0 rbytes=586723328 wbytes=99248033792 rios=143243 wios=331782 > dbytes=0 dios=0 use_delay=12 delay_nsec=0 depth=1 avg_lat=11526 > win=800 > > so the avg_lat value 11526(ns) is so big, it can't be the average time > cost of per single IO request on our device. > so it can not be used for do the configuring min io latency value work. > Yeah so I can't remember if this change preceeded the other change, but timing is now calculated based on issue time which comes the time marked before it enters the block layer. This means it includes any delays that have been incurred. But from the output, you can see that you're at depth=1 and use_delay=12 meaning it is in play iirc. To get the value you're looking for, we'd have to plumb through rq->io_start_time_ns. There is an inherent mismatch here as blk-iolatency works at the bio layer and not the request layer. However, blk-iolatency is meant to protect a single cgroup from the system. In this case, it seems like given your workload, workload + the workload on the system is saturating the disk. Meaningfully, someone else is throttling you as their iolatency is being violated. I'd consider switching to iocost for a proportional io controller. > Maybe I'm not familiar with the io latency controller code. > If any mistake exist in my above analysis, welcome to add your suggestions. Thanks, Dennis