From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qk0-f198.google.com (mail-qk0-f198.google.com [209.85.220.198])
    by kanga.kvack.org (Postfix) with ESMTP id 746126B0573
    for ; Tue, 1 Aug 2017 15:33:45 -0400 (EDT)
Received: by mail-qk0-f198.google.com with SMTP id q66so12204759qki.1
    for ; Tue, 01 Aug 2017 12:33:45 -0700 (PDT)
Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28])
    by mx.google.com with ESMTPS id r64si28489426qkr.100.2017.08.01.12.33.44
    for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
    Tue, 01 Aug 2017 12:33:44 -0700 (PDT)
Date: Tue, 1 Aug 2017 21:33:41 +0200
From: Andrea Arcangeli
Subject: Re: [PATCH v2 4/4] mm: fix KSM data corruption
Message-ID: <20170801193341.GA24406@redhat.com>
References: <1501566977-20293-1-git-send-email-minchan@kernel.org>
    <1501566977-20293-5-git-send-email-minchan@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1501566977-20293-5-git-send-email-minchan@kernel.org>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Minchan Kim
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    kernel-team, Nadav Amit, Mel Gorman, Hugh Dickins

Hello,

On Tue, Aug 01, 2017 at 02:56:17PM +0900, Minchan Kim wrote:
> CPU0              CPU1              CPU2              CPU3
> ----              ----              ----              ----
> Write the same
> value on page
>
> [cache PTE as
>  dirty in TLB]
>
>                   MADV_FREE
>                   pte_mkclean()
>
>                                     4 > clear_refs
>                                     pte_wrprotect()
>
>                                                       write_protect_page()
>                                                       [ success, no flush ]
>
>                                                       pages_identical()
>                                                       [ ok ]
>
> Write to page
> different value
>
> [Ok, using stale
>  PTE]
>
>                                                       replace_page()
>
> Later, CPU1, CPU2 and CPU3 would flush the TLB, but that is too late. CPU0
> already wrote on the page, but KSM ignored this write, and it got lost.
> "
>
> In the above scenario, MADV_FREE is fixed by changing the TLB batching API,
> including [set|clear]_tlb_flush_pending. What remains is the soft-dirty part.
>
> This patch changes soft-dirty to use the TLB batching API instead of
> flush_tlb_mm, and makes KSM check for a pending TLB flush with
> mm_tlb_flush_pending, so that it flushes the TLB to avoid data loss if
> other parallel threads have a TLB flush pending.
>
> [1] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
>
> Note:
> I could not reproduce this problem with Nadav's test program, which
> needs its timing tuned to my system's speed, so I have not confirmed
> that the patch works.
> Nadav, could you test this patch on your test machine?

Reviewed-by: Andrea Arcangeli
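
The interleaving quoted above can be replayed in a small userspace toy model, which may help when reasoning about why the pending-flush check matters. This is only a sketch under stated assumptions: the names `PTE`, `MM`, `CPU` and `simulate` are hypothetical and are not kernel code, a "TLB" is modeled as a cached copy of the PTE, and the `tlb_flush_pending` counter merely stands in for the state that mm_tlb_flush_pending reports. It does not reproduce the actual patch.

```python
from dataclasses import dataclass, field


@dataclass
class PTE:
    writable: bool = True
    dirty: bool = False


@dataclass
class MM:
    # Hypothetical stand-in for an mm with one page and one PTE.
    pte: PTE = field(default_factory=PTE)
    page: int = 0
    tlb_flush_pending: int = 0  # models the batched "flush pending" state


class CPU:
    """A CPU whose 'TLB' is just a cached copy of the PTE."""

    def __init__(self, mm):
        self.mm = mm
        self.tlb = None

    def flush(self):
        self.tlb = None

    def write(self, value):
        if self.tlb is None:
            # TLB miss: walk the page table and cache the current PTE.
            self.tlb = PTE(self.mm.pte.writable, self.mm.pte.dirty)
        if not self.tlb.writable:
            raise PermissionError("write fault")
        if not self.tlb.dirty:
            # The first write marks both the cached entry and the PTE dirty.
            self.tlb.dirty = True
            self.mm.pte.dirty = True
        self.mm.page = value


def simulate(ksm_checks_pending):
    """Return True if CPU0's late write is silently lost by the merge."""
    mm = MM(page=1)
    cpu0 = CPU(mm)
    cpu0.write(1)  # CPU0: caches the PTE as writable and dirty

    mm.tlb_flush_pending += 1  # CPU1: MADV_FREE, flush deferred
    mm.pte.dirty = False       # pte_mkclean()

    mm.tlb_flush_pending += 1  # CPU2: clear_refs, flush deferred
    mm.pte.writable = False    # pte_wrprotect()

    # CPU3: KSM write-protect step. The PTE already looks clean and
    # read-only, so without the extra check no flush is issued.
    needs_flush = mm.pte.writable or mm.pte.dirty
    if ksm_checks_pending:
        needs_flush = needs_flush or mm.tlb_flush_pending > 0
    if needs_flush:
        cpu0.flush()
    compared = mm.page  # content seen by the page comparison

    try:
        cpu0.write(2)  # CPU0: goes through a stale TLB entry if not flushed
    except PermissionError:
        pass  # a fault is taken, so the write is not silently dropped

    # CPU3: the merge keeps the compared content; a later write is lost.
    lost = mm.page != compared

    cpu0.flush()  # CPU1 and CPU2 flush, too late in the buggy case
    mm.tlb_flush_pending = 0
    return lost
```

Calling `simulate(False)` replays the buggy interleaving, where CPU0's second write slips through the stale entry after the comparison; `simulate(True)` adds the pending-flush check, so the stale entry is evicted and the write faults instead.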