From: Eric Dumazet
Date: Fri, 22 Feb 2019 08:39:22 -0800
Subject: Re: [PATCH] tcp: detect use sendpage for slab-based objects
To: Vasily Averin
Cc: netdev
In-Reply-To: <80f35733-e3cf-c7da-1822-87054903dc67@virtuozzo.com>
X-Mailing-List: netdev@vger.kernel.org

On Fri, Feb 22, 2019 at 6:02 AM Vasily Averin wrote:
>
> On 2/21/19 7:00 PM, Eric Dumazet wrote:
> > On Thu, Feb 21, 2019 at 7:30 AM Vasily Averin wrote:
> >> index 2079145a3b7c..cf9572f4fc0f 100644
> >> --- a/net/ipv4/tcp.c
> >> +++ b/net/ipv4/tcp.c
> >> @@ -996,6 +996,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> >>  			goto wait_for_memory;
> >>
> >>  		if (can_coalesce) {
> >> +			WARN_ON_ONCE(PageSlab(page));
> >
> > Please use VM_WARN_ON_ONCE() to make this a nop for CONFIG_DEBUG_VM=n.
> >
> > Also the whole tcp_sendpage() should be protected, not only the coalescing part.
> >
> > (The get_page() done a few lines later should not be attempted either.)
> >
> >>  			skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
> >>  		} else {
> >>  			get_page(page);
> >> --
> >
> > It seems the bug has nothing to do with TCP, and belongs to the caller.
> >
> > Otherwise you need to add the check to all existing .sendpage() /
> > .sendpage_locked() handlers out there.
>
> Eric, could you please elaborate once again why tcp_sendpage() should not handle slab objects?

Simply because SLAB has its own way of managing objects within a page,
and does not care about the underlying page having its refcount elevated.

ptr = kmalloc(xx);
...
/* here you can attempt cheating and take an extra reference on the underlying page */
kfree(ptr); /* SLAB does not care about the page count; it puts ptr on the free list anyway */

ptr2 = kmalloc(xx); /* ptr2 can be the same as ptr (the object was kfree()d earlier) */

This means that something else will happily reuse the piece of memory
that you wanted to use for zero-copy.

This is a serious bug IMO, since it would allow data corruption.

>
> There is a known restriction: sendpage should not be called for pages with a zero refcount,
> because the internal put_page() releases the page. All sendpage callers I know of have such a check.
> However, why should they add another check, for PageSlab?
>
> Let me explain the problem once again:
> I do not see any bugs in tcp or in any sendpage caller;
> there is a false alert on the receiving side that crashes a correctly working host.

This is not a false alert, but a very fundamental issue.

We cannot mix kmalloc() and page fragments; this simply is buggy.

>
> There is a network block device with XFS on top.
> XFS submits IO requests with slab objects,
> the block device driver checks that the page count is positive and decides to use sendpage,
> and sendpage calls tcp_sendpage(), which can merge two neighbouring slab objects into one tcp fragment.
>
> If the data is sent to a remote host, nothing bad happens; the network device sends the data out successfully.
> However, if the data is received locally, tcp_recvmsg detects a strange vector with "merged" slab objects.
> It is not a real problem, the data can still be accessed correctly, yet this check calls BUG_ON and crashes the host.
>
> In this way, the recently added hardening check forces all .sendpage callers to modify code that has worked correctly for ages.
> That looks abnormal to me, but I do not understand how to fix this problem correctly.
>
> I do not like the idea of keeping the current state -- it can crash correctly working hosts in some rare corner cases.
> I do not like the idea of fixing all callers -- why should they modify correctly working code to protect against a false positive?
> I do not like the idea of modifying tcp -- to block merging of fragments with slab objects, as I proposed earlier.
> We could trigger a warning in the tcp code -- to inform .sendpage callers that they are affected --
> however, I agree with your "bug has nothing to do with TCP" and do not understand why we need to modify tcp_sendpage().
>
> Maybe it is better to replace the BUG_ON with a WARN_ON in the hardening check?
> Could you perhaps advise some other solution?
>
> Thank you,
> Vasily Averin
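
For reference, a minimal sketch of the caller-side check Eric describes: refuse to hand a slab-backed or zero-refcount page to .sendpage() and fall back to a copying path (regular sendmsg) instead. The helper name page_is_sendpage_safe() is illustrative only, not an existing kernel API; it simply combines the two constraints discussed in the thread (PageSlab and the page refcount).

#include <linux/mm.h>
#include <linux/page-flags.h>

/*
 * Hypothetical helper (name is not from the thread): returns true only if
 * the page may be passed by reference to a .sendpage() implementation.
 */
static inline bool page_is_sendpage_safe(struct page *page)
{
	/*
	 * Slab-backed memory must not be sent by reference: the slab
	 * allocator ignores the page refcount, so the get_page() done in
	 * do_tcp_sendpages() cannot keep the kmalloc'ed object alive.
	 */
	if (PageSlab(page))
		return false;

	/*
	 * A page with a zero refcount would be freed by the final
	 * put_page() inside the networking code.
	 */
	return page_count(page) >= 1;
}

The in-TCP variant Eric mentions would amount to a VM_WARN_ON_ONCE(PageSlab(page)) at the top of do_tcp_sendpages(), which compiles away when CONFIG_DEBUG_VM is not set.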