From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jesper Dangaard Brouer Subject: Re: [PATCH v3 net-next 08/14] mlx4: use order-0 pages for RX Date: Tue, 14 Feb 2017 20:38:22 +0100 Message-ID: <20170214203822.72d41268@redhat.com> References: <20170214131206.44b644f6@redhat.com> <20170214.120426.2032015522492111544.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: ttoukan.linux@gmail.com, edumazet@google.com, alexander.duyck@gmail.com, netdev@vger.kernel.org, tariqt@mellanox.com, kafai@fb.com, saeedm@mellanox.com, willemb@google.com, bblanco@plumgrid.com, ast@kernel.org, eric.dumazet@gmail.com, linux-mm@kvack.org, brouer@redhat.com To: David Miller Return-path: In-Reply-To: <20170214.120426.2032015522492111544.davem@davemloft.net> Sender: owner-linux-mm@kvack.org List-Id: netdev.vger.kernel.org On Tue, 14 Feb 2017 12:04:26 -0500 (EST) David Miller wrote: > From: Tariq Toukan > Date: Tue, 14 Feb 2017 16:56:49 +0200 > > > Internally, I already implemented "dynamic page-cache" and > > "page-reuse" mechanisms in the driver, and together they totally > > bridge the performance gap. It sounds like you basically implemented a page_pool scheme... > I worry about a dynamically growing page cache inside of drivers > because it is invisible to the rest of the kernel. Exactly, that is why I wanted a separate standardized thing, I call the page_pool, which is part of the MM-tree and interacts with the page allocator. E.g. it must implement/support a way the page allocator can reclaim pages from it (admit I didn't implement this in RFC patches). > It responds only to local needs. Generally true, but a side effect of recycling these pages, result in less fragmentation of the page allocator/buddy system. > The price of the real page allocator comes partly because it can > respond to global needs. > > If a driver consumes some unreasonable percentage of system memory, it > is keeping that memory from being used from other parts of the system > even if it would be better for networking to be slightly slower with > less cache because that other thing that needs memory is more > important. (That is why I want to have OOM protection at device level, with the recycle feedback from page pool we have this knowledge, and further I wanted to allow blocking a specific RX queue[1]) [1] https://prototype-kernel.readthedocs.io/en/latest/vm/page_pool/design/memory_model_nic.html#userspace-delivery-and-oom > I think this is one of the primary reasons that the MM guys severely > chastise us when we build special purpose local caches into networking > facilities. > > And the more I think about it the more I think they are right. +1 > One path I see around all of this is full integration. Meaning that > we can free pages into the page allocator which are still DMA mapped. > And future allocations from that device are prioritized to take still > DMA mapped objects. I like this idea. Are you saying that this should be done per DMA engine or per device? If this is per device, it is almost the page_pool idea. > Yes, we still need to make the page allocator faster, but this kind of > work helps everyone not just 100GB ethernet NICs. True. And Mel already have some generic improvements to the page allocator queued for the next merge. And I have the responsibility to get the bulking API into shape. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org