linux-kernel.vger.kernel.org archive mirror
From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, Takashi Iwai <tiwai@suse.de>,
	Hyeoncheol Lee <cheol.lee@lge.com>,
	yjay.kim@lge.com, Sangseok Lee <sangseok.lee@lge.com>,
	Hugh Dickins <hughd@google.com>,
	linux-mm@kvack.org, "Darrick J . Wong" <darrick.wong@oracle.com>,
	stable@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: Re: [PATCH v3 1/3] mm: support anonymous stable page
Date: Mon, 28 Nov 2016 09:41:52 +0900
Message-ID: <20161128004152.GA30427@bbox>
In-Reply-To: <20161127131910.GB4919@tigerII>

Hi Sergey,

I'm going on a long vacation, so please forgive me if I respond slowly. :)

On Sun, Nov 27, 2016 at 10:19:10PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> On (11/25/16 17:35), Minchan Kim wrote:
> [..]
> > Unfortunately, zram has used the per-cpu stream feature since v4.7.
> > It aims to increase the cache hit ratio of the scratch buffer used
> > for compression. The downside of that approach is that zram must ask
> > for memory for the compressed page in per-cpu context, which requires
> > a stricter gfp flag, so the allocation can fail. If it does, zram
> > retries the allocation outside per-cpu context, where it can succeed,
> > then compresses the data again and copies it into that memory.
> > 
> > In this scenario, zram assumes the data never changes, but that is
> > not true unless stable pages are supported. So, if the data changes
> > under us, zram can overrun the buffer: the second compression can
> > produce a larger result than the first one, and zram blindly copies
> > the larger object into the smaller buffer. The overrun breaks
> > zsmalloc's free object chaining, so the system crashes as shown above.
> 
> very interesting find! didn't see this coming.
> 
> > Unfortunately, reuse_swap_page should be atomic, so we cannot wait on
> > writeback there. The approach in this patch is simply to return false if
> > we find the page needs to be stable. Although that increases the memory
> > footprint temporarily, it happens rarely and the memory should be easy to
> > reclaim when it does. It is also better than waiting for IO completion,
> > which is on the critical path for application latency.
> 
> wondering - how many pages can it hold? we are in low memory, that's why we
> failed to zsmalloc in fast path, so how likely is this to worsen memory pressure?

Actually, I don't have real numbers, but what I can say for sure is that
it's really hard to hit in the normal stress tests I have done until now.
That's why it took a long time to find (i.e., I could encounter the bug
only once every two days). But once I understood the problem, I could
reproduce it in 15 minutes.

As for memory pressure, my testing already ran under severe memory
pressure (i.e., many memory allocation failures and frequent OOM kills),
so the patch doesn't make any meaningful difference before and after.

> just asking. in async zram the window between zram_rw_page() and actual
> write of a page is even bigger, isn't it?

Yes. That's why I found the problem with that feature enabled. Lucky. ;)

> 
> we *probably* and *maybe* can try to handle it in zram:
> 
> -- store the previous clen before re-compression
> -- check if new clen > saved_clen and if it is - we can't use the previously
>    allocated handle and need to allocate a new one again. if it's less than
>    or equal to the saved one - store the object (wasting some space,
>    yes. but we are in low mem).

That was my first attempt, but I changed my mind.
It can prevent the crash, but broken data could still go to the disk
(i.e., zram). If someone wants to read the block directly (e.g.,
open /dev/zram0; read /dev/zram or DIO), they cannot read the data
until someone writes stable data into those sectors.
Instead, they will see many decompression failure messages.
It's weird.

I believe the stable page problem should be solved by the generic layer,
not by the driver itself.
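
To make the objection concrete, here is a minimal userspace sketch of the
driver-side check suggested above. It is not zram code: store_recompressed(),
enum store_result, and the flat byte buffers are hypothetical names and
simplifications made up for illustration. It only models the second pass,
where the handle was sized from the first compression and the page is
compressed again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome codes for this sketch; not zram identifiers. */
enum store_result {
	STORE_OK,           /* object fits in the handle sized by the first pass */
	STORE_NEED_REALLOC  /* page changed and grew; the old handle is too small */
};

/*
 * Model of the second pass. `saved_clen` is the compressed length from the
 * first pass (the size the handle was allocated for). `new_clen` is the
 * length obtained after compressing the same page again.
 */
static enum store_result store_recompressed(unsigned char *handle,
					    size_t saved_clen,
					    const unsigned char *cbuf,
					    size_t new_clen)
{
	assert(handle != NULL && cbuf != NULL);

	/* Copying blindly at this point is the buffer overrun. */
	if (new_clen > saved_clen)
		return STORE_NEED_REALLOC;

	/* Fits (possibly wasting saved_clen - new_clen bytes). */
	memcpy(handle, cbuf, new_clen);
	return STORE_OK;
}
```

The length comparison is enough to avoid the overrun, but nothing in this
function can tell whether `cbuf` was produced from a page that was modified
while it was being compressed, which is why the check alone does not address
the broken-data case.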

> 
> -- we, maybe, can also try harder in zsmalloc. once we have detected that
>    zsmalloc has failed, we can declare it an emergency and
>    store objects of size X in higher classes (assuming that there is a
>    bigger size class available with an allocated and unused object).

It cannot solve the problem I mentioned above either, and I don't want
to make zram complicated to solve that problem. :(



> 
> 	-ss
