linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hugh.dickins@tiscali.co.uk>
To: Nitin Gupta <ngupta@vflare.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Karel Zak <kzak@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] swap: Fix swap size in case of block devices
Date: Tue, 1 Sep 2009 10:23:39 +0100 (BST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0909011011140.12934@sister.anvils> (raw)
In-Reply-To: <d760cf2d0909010011g75a918c0hedd4b2571afc054c@mail.gmail.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3791 bytes --]

On Tue, 1 Sep 2009, Nitin Gupta wrote:
> On Tue, Sep 1, 2009 at 12:56 AM, Hugh Dickins<hugh.dickins@tiscali.co.uk> wrote:
> > On Mon, 31 Aug 2009, Nitin Gupta wrote:
> >> For block devices, setup_swap_extents() leaves p->pages untouched.
> >> For regular files, it sets p->pages
> >>       == total usable swap pages (including header page) - 1;
> >
> > I think you're overlooking the "page < sis->max" condition
> > in setup_swap_extents()'s loop.  So at the end of the loop,
> > if no pages were lost to fragmentation, we have
> >
> >                sis->max = page_no;             /* no change */
> >                sis->pages = page_no - 1;       /* no change */
> >
> 
> Oh, I missed this loop condition. The variable naming is so bad, I
> find it very hard to follow this part of code.
> 
> Still, if there is even a single page in swap file that is not usable
> (i.e. non-contiguous on disk) -- which is what usually happens for swap
> files of any practical size -- setup_swap_extents() gives correct value
> in sis->pages == total usable pages (including header) - 1;
> 
> However, if all the file pages are usable, it gives off-by-one error, as
> you noted.

Right, I see your point now: when the regular file is fragmented thus,
setup_swap_extents() would allow it to use the final page of the file,
which would otherwise be (erroneously) disallowed.

But I would reword your "what usually happens" to "what happens in
the general case": perhaps I'm wrong, but I think that usually these
days people are creating swap files on filesystems with 4kB block
size, where there's no issue of intra-page fragmentation lowering
that page count (but there may still be inter-page fragmentation
to make swapping to the file less efficient than to a partition).

> 
> > Yes, I'd dislike that discrepancy between regular files and block
> > devices, if I could see it.  Though I'd probably still be cautious
> > about the disk partitions.
> 
> > dd if=/dev/zero of=/swap bs=200k        # says 204800 bytes (205kB)
> > mkswap /swap                            # says size = 196 KiB
> > swapon /swap                            # dmesg says Adding 192k swap
> 
> > which is what I've come to expect from the off-by-one,
> > even on regular files.
> 
> In general, its not correct to compare size repored by mkswap and
> swapon like this. The size reported by mkswap includes pages which
> are not contiguous on disk. While, kernel considers only
> PAGE_SIZE-length, PAGE_SIZE-aligned contiguous run of blocks. So, size
> reported by mkswap and swapon can vary wildly. For e.g.:
> 
> (on mtdram with ext2 fs)
> dd if=/dev/zero of=swap.dd bs=1M count=10
> mkswap swap.dd # says size = 10236 KiB
> swapon swap.dd # says Adding 10112k swap

If the filesystem has block size 1kB or 2kB, yes.

> 
> ====
> 
> So, to summarize:
> 
> 1. mkswap always behaves correctly: It sets number of pages in swap file
> minus one as 'last_page' in swap header (since this is a 0-based index).
> This same value (total pages - 1) is printed out as size since it knows
> that first page is swap header.
> 
> 2. swapon() for block devices: off-by-one error causing last swap page
> to remain unused.
> 
> 3. swapon() for regular files:
>   3.1 off-by-one error if every swap page in this file is usable i.e.
>       every PAGE_SIZE-length, PAGE_SIZE-aligned chunk is contiguous on
>       disk.
>   3.2 correct size value if there is at least one swap page which is
>       unusable -- which is expected from swap file of any practical
>       size.
> 
> 
> I will go through swap code again to find other possible off-by-one
> errors. The revised patch will fix these inconsistencies.

Thanks.

Hugh

      reply	other threads:[~2009-09-01  9:24 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-30 16:19 [PATCH] swap: Fix swap size in case of block devices Nitin Gupta
2009-08-31 11:27 ` Hugh Dickins
2009-08-31 17:21   ` Nitin Gupta
2009-08-31 19:26     ` Hugh Dickins
2009-09-01  7:11       ` Nitin Gupta
2009-09-01  9:23         ` Hugh Dickins [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0909011011140.12934@sister.anvils \
    --to=hugh.dickins@tiscali.co.uk \
    --cc=akpm@linux-foundation.org \
    --cc=kzak@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ngupta@vflare.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).