To: yong w, minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
    axboe@kernel.dk, akpm@linux-foundation.org, songmuchun@bytedance.com,
    linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [RFC PATCH V1] zram:calculate available memory when zram is used
Message-ID: <13c28e69-cfd9-bcf6-ab77-445c6fa4cc6e@redhat.com>
Date: Mon, 31 May 2021 10:16:57 +0200

On 30.05.21 19:18, yong w wrote:
> When zram is used, available+Swap free memory is obviously bigger than
> I actually can use, because zram can compress memory by compression
> algorithm and zram compressed data will occupy memory too.
>
> So, I count the compression rate of zram in the kernel. The available
> memory is calculated as follows:
> available + swapfree - swapfree * compress ratio
> MemAvailable in /proc/meminfo returns available + zram will save space
>

This will mean that we can easily have MemAvailable > MemTotal, right?
I'm not sure if there are some user space scripts that will be a little
disrupted by that. Like calculating "MemUnavailable = MemTotal -
MemAvailable".

MemAvailable: "An estimate of how much memory is available for starting
new applications, without swapping"

Although zram isn't "traditional swapping", there is still a performance
impact when having to go to zram because it adds an indirection and
requires (de)compression. Similar to having very fast swap space (like
PMEM). Let's not call something "memory" that doesn't have the same
semantics as real memory as in "MemTotal".

This doesn't feel right.
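To make the MemAvailable > MemTotal concern concrete, here is a rough
userspace sketch (not kernel code, and not part of the patch). It mirrors
the arithmetic of the quoted count_avail_swap() and shows what a naive
"MemTotal - MemAvailable" computation would do with the result. The function
names, the guard for ratios below 100, and the sample numbers are all
assumptions made up for illustration.

```c
#include <stdint.h>

/*
 * Illustration only; every name here is hypothetical.
 * Mirrors the arithmetic in the quoted count_avail_swap():
 *   ratio_pct == 0 -> count half of the free zram swap as a gain
 *   otherwise      -> free * (100 - 10000 / ratio_pct) / 100
 * ratio_pct is orig_size * 100 / compr_data_size, so 300 means 3:1.
 */
uint64_t zram_gain_kb(uint64_t swap_free_kb, unsigned int ratio_pct)
{
	if (ratio_pct == 0)
		return swap_free_kb / 2;
	if (ratio_pct < 100)	/* guard added for this sketch; not in the patch */
		return 0;
	return swap_free_kb * (100 - 10000 / ratio_pct) / 100;
}

/* What a script doing unsigned "MemTotal - MemAvailable" would compute. */
uint64_t mem_unavailable_naive_kb(uint64_t total_kb, uint64_t avail_kb)
{
	return total_kb - avail_kb;	/* wraps around if avail_kb > total_kb */
}

/* A defensive variant that clamps at zero instead of wrapping. */
uint64_t mem_unavailable_clamped_kb(uint64_t total_kb, uint64_t avail_kb)
{
	return avail_kb > total_kb ? 0 : total_kb - avail_kb;
}
```

With 8 GiB of free zram swap (8388608 kB) and a minimum ratio of 300, the
sketch adds 5620367 kB. On a machine with MemTotal = 4194304 kB and
2097152 kB available before the addition, the reported MemAvailable would be
7717519 kB, i.e. above MemTotal, and the naive subtraction wraps around.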

> Signed-off-by: wangyong
> ---
>  drivers/block/zram/zcomp.h    |  1 +
>  drivers/block/zram/zram_drv.c |  4 ++
>  drivers/block/zram/zram_drv.h |  1 +
>  fs/proc/meminfo.c             |  2 +-
>  include/linux/swap.h          | 10 +++++
>  mm/swapfile.c                 | 95 +++++++++++++++++++++++++++++++++++++++++++
>  mm/vmscan.c                   |  1 +
>  7 files changed, 113 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
> index 40f6420..deb2dbf 100644
> --- a/drivers/block/zram/zcomp.h
> +++ b/drivers/block/zram/zcomp.h
> @@ -40,4 +40,5 @@ int zcomp_decompress(struct zcomp_strm *zstrm,
>  		const void *src, unsigned int src_len, void *dst);
>
>  bool zcomp_set_max_streams(struct zcomp *comp, int num_strm);
> +int get_zram_major(void);
>  #endif /* _ZCOMP_H_ */
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index cf8deec..1c6cbd4 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -59,6 +59,10 @@ static void zram_free_page(struct zram *zram, size_t index);
>  static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
>  				u32 index, int offset, struct bio *bio);
>
> +int get_zram_major(void)
> +{
> +	return zram_major;
> +}
>
>  static int zram_slot_trylock(struct zram *zram, u32 index)
>  {
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 6e73dc3..5d8701a 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -88,6 +88,7 @@ struct zram_stats {
>  	atomic64_t bd_reads;		/* no. of reads from backing device */
>  	atomic64_t bd_writes;		/* no. of writes from backing device */
>  #endif
> +	atomic_t min_compr_ratio;
>  };
>
>  struct zram {
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index 6fa761c..f7bf350 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -57,7 +57,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>
>  	show_val_kb(m, "MemTotal:       ", i.totalram);
>  	show_val_kb(m, "MemFree:        ", i.freeram);
> -	show_val_kb(m, "MemAvailable:   ", available);
> +	show_val_kb(m, "MemAvailable:   ", available + count_avail_swaps());
>  	show_val_kb(m, "Buffers:        ", i.bufferram);
>  	show_val_kb(m, "Cached:         ", cached);
>  	show_val_kb(m, "SwapCached:     ", total_swapcache_pages());
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 032485e..3225a2f 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -514,6 +514,8 @@ extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
>  extern void exit_swap_address_space(unsigned int type);
>  extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
>  sector_t swap_page_sector(struct page *page);
> +extern void update_zram_zstats(void);
> +extern u64 count_avail_swaps(void);
>
>  static inline void put_swap_device(struct swap_info_struct *si)
>  {
> @@ -684,6 +686,14 @@ static inline swp_entry_t get_swap_page(struct page *page)
>  	return entry;
>  }
>
> +void update_zram_zstats(void)
> +{
> +}
> +
> +u64 count_avail_swaps(void)
> +{
> +}
> +
>  #endif /* CONFIG_SWAP */
>
>  #ifdef CONFIG_THP_SWAP
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index cbb4c07..93a9dcb 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -44,6 +44,7 @@
>  #include
>  #include
>  #include
> +#include "../drivers/block/zram/zram_drv.h"
>
>  static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
>  				 unsigned char);
> @@ -3408,6 +3409,100 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>  	return error;
>  }
>
> +u64 count_avail_swap(struct swap_info_struct *si)
> +{
> +	u64 result;
> +	struct zram *z;
> +	unsigned int free;
> +	unsigned int ratio;
> +
> +	result = 0;
> +	if (!si)
> +		return 0;
> +
> +	//zram calculate available mem
> +	if (si->flags & SWP_USED && si->swap_map) {
> +		if (si->bdev->bd_disk->major == get_zram_major()) {
> +			z = (struct zram *)si->bdev->bd_disk->private_data;
> +			down_read(&z->init_lock);
> +			ratio = atomic_read(&z->stats.min_compr_ratio);
> +			free = (si->pages << (PAGE_SHIFT - 10))
> +				- (si->inuse_pages << (PAGE_SHIFT - 10));
> +			if (!ratio)
> +				result += free / 2;
> +			else
> +				result = free * (100 - 10000 / ratio) / 100;
> +			up_read(&z->init_lock);
> +		}
> +	} else
> +		result += (si->pages << (PAGE_SHIFT - 10))
> +			- (si->inuse_pages << (PAGE_SHIFT - 10));
> +
> +	return result;
> +}
> +
> +u64 count_avail_swaps(void)
> +{
> +	int type;
> +	u64 result;
> +	struct swap_info_struct *si;
> +
> +	result = 0;
> +	spin_lock(&swap_lock);
> +	for (type = 0; type < nr_swapfiles; type++) {
> +		si = swap_info[type];
> +		spin_lock(&si->lock);
> +		result += count_avail_swap(si);
> +		spin_unlock(&si->lock);
> +	}
> +	spin_unlock(&swap_lock);
> +
> +	return result;
> +}
> +
> +void update_zram_zstat(struct swap_info_struct *si)
> +{
> +	struct zram *z;
> +	struct zram_stats *stat;
> +	int ratio;
> +	u64 orig_size, compr_data_size;
> +
> +	if (!si)
> +		return;
> +
> +	//update zram min compress ratio
> +	if (si->flags & SWP_USED && si->swap_map) {
> +		if (si->bdev->bd_disk->major == get_zram_major()) {
> +			z = (struct zram *)si->bdev->bd_disk->private_data;
> +			down_read(&z->init_lock);
> +			stat = &z->stats;
> +			ratio = atomic_read(&stat->min_compr_ratio);
> +			orig_size = atomic64_read(&stat->pages_stored) << PAGE_SHIFT;
> +			compr_data_size = atomic64_read(&stat->compr_data_size);
> +			if (compr_data_size && (!ratio
> +			    || ((orig_size * 100) / compr_data_size < ratio)))
> +				atomic_set(&stat->min_compr_ratio,
> +					   (orig_size * 100) / compr_data_size);
> +			up_read(&z->init_lock);
> +		}
> +	}
> +}
> +
> +void update_zram_zstats(void)
> +{
> +	int type;
> +	struct swap_info_struct *si;
> +
> +	spin_lock(&swap_lock);
> +	for (type = 0; type < nr_swapfiles; type++) {
> +		si = swap_info[type];
> +		spin_lock(&si->lock);
> +		update_zram_zstat(si);
> +		spin_unlock(&si->lock);
> +	}
> +	spin_unlock(&swap_lock);
> +}
> +
>  void si_swapinfo(struct sysinfo *val)
>  {
>  	unsigned int type;
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index eb31452..ffaf59b 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4159,6 +4159,7 @@ static int kswapd(void *p)
>  						alloc_order);
>  		reclaim_order = balance_pgdat(pgdat, alloc_order,
>  						highest_zoneidx);
> +		update_zram_zstats();
>  		if (reclaim_order < alloc_order)
>  			goto kswapd_try_sleep;
>  	}
> --
> 2.7.4

--
Thanks,

David / dhildenb