public inbox for linux-kernel@vger.kernel.org
From: "Török Edwin" <edwintorok@gmail.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Mike Waychison <mikew@google.com>, Ying Han <yinghan@google.com>,
	Ingo Molnar <mingo@elte.hu>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Rohit Seth <rohitseth@google.com>,
	Hugh Dickins <hugh@veritas.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY
Date: Mon, 01 Dec 2008 13:37:28 +0200
Message-ID: <4933CC78.2030707@gmail.com>
In-Reply-To: <20081201111301.GB13903@wotan.suse.de>

On 2008-12-01 13:13, Nick Piggin wrote:
> BTW. I think your source code (I see you updated it since last posting)
> should be very easy to give good hints to the kernel about the IO. I
> will try a few simple tricks and we can see if they help. (this pattern
> of touching memory corresponds well to how your app works?)

It corresponds well to the latencies involved, but it models only part
of the behaviour:

- In some cases mmap is used to read a file sequentially (PROT_READ,
MAP_PRIVATE), running operations such as memchr and memcpy over the
mapping. My testcase models this.
- In some cases it is used to mmap archives and containers that keep
their index at the end (like zip), so access jumps back and forth
between the end of the file and the offsets recorded there. (Using pread
here may be better, but using mmap simplified the code a lot.)
- There are multiple threads, each processing a different file. The only
data shared between threads is the signature database, so once a thread
has started working on a file, no other thread touches it.
- The goal is to process as many files as possible. This works very well
on some files (mostly PE files), but not on others (where I can't load
all cores to 400%).

In either case the application page-faults a lot and calls mmap() often,
which is what my testcase attempts to model.
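
A minimal sketch of the sequential case, for illustration only. It is
not taken from clamav or from the testcase: count_byte_mmap() is a
hypothetical name, and the madvise(MADV_SEQUENTIAL) call is an assumed
example of the kind of hint that could be given to the kernel, not
something either program is known to do.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes madvise() on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count occurrences of `byte` in a file by scanning
 * a read-only private mapping from front to back.
 * Returns the count, or -1 on error. */
long count_byte_mmap(const char *path, int byte)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (base == MAP_FAILED)
        return -1;

    /* Assumed hint: tell the kernel the range is read sequentially.
     * It is advisory only, so a failure here is ignored. */
    (void)madvise((void *)base, len, MADV_SEQUENTIAL);

    long count = 0;
    const char *p = base;
    const char *end = base + len;
    /* Each memchr() call touches pages in order; untouched pages fault in. */
    while (p < end && (p = memchr(p, byte, (size_t)(end - p))) != NULL) {
        count++;
        p++;
    }

    munmap((void *)base, len);
    return count;
}
```

Calling count_byte_mmap(path, '\n') counts the newlines in a file. Every
page that is not yet resident is faulted in during the scan, which is
the behaviour described above.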

You can disable mmap usage in clamav completely, but the last time I
tried, that slowed things down (it falls back to fread, and for zip it
reads the entire file into memory).
Perhaps I should try turning off mmap for just some portions.
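
As a rough sketch of what replacing mmap with pread for the
index-at-the-end pattern could look like: the layout below is a
hypothetical toy container, not the real zip format, and
read_index_offset() and read_span() are made-up names. It assumes the
last 8 bytes of the file hold the little-endian offset of the index.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes pread() */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `len` bytes at offset `off` without moving the file
 * position. Returns 0 on success, -1 on error or unexpected EOF. */
int read_span(int fd, void *buf, size_t len, off_t off)
{
    unsigned char *dst = buf;

    while (len > 0) {
        ssize_t n = pread(fd, dst, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* file is shorter than expected */
        dst += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Toy layout (assumption): the final 8 bytes store, little-endian, the
 * offset at which the index starts. Returns 0 and sets *out on success. */
int read_index_offset(int fd, uint64_t *out)
{
    assert(out != NULL);

    struct stat st;
    unsigned char raw[8];

    if (fstat(fd, &st) < 0 || st.st_size < (off_t)sizeof raw)
        return -1;

    off_t trailer = st.st_size - (off_t)sizeof raw;
    if (read_span(fd, raw, sizeof raw, trailer) < 0)
        return -1;

    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | raw[i];

    if (v > (uint64_t)trailer)  /* index must start before the trailer */
        return -1;

    *out = v;
    return 0;
}
```

A caller would first call read_index_offset() and then read_span() at
the returned offset. Only the touched ranges are read, and no mapping is
set up or torn down per file, at the cost of an extra copy into the
caller's buffer.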

If you find something that improves my testcase, I can try it on the
real application and let you know whether it helps there too (and
perhaps create a new testcase).

If you want, you can test on the original application too (it's open
source, after all!).
I found that scanning my local copy of my Gmail inbox is a good
testcase. I can walk you through how to configure and set up clamav to
test.

Best regards,
--Edwin

