From: Hugh Dickins <hugh.dickins@tiscali.co.uk>
To: Nitin Gupta <ngupta@vflare.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>, Karel Zak <kzak@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] swap: Fix swap size in case of block devices
Date: Tue, 1 Sep 2009 10:23:39 +0100 (BST)
Message-ID: <Pine.LNX.4.64.0909011011140.12934@sister.anvils>
In-Reply-To: <d760cf2d0909010011g75a918c0hedd4b2571afc054c@mail.gmail.com>
On Tue, 1 Sep 2009, Nitin Gupta wrote:
> On Tue, Sep 1, 2009 at 12:56 AM, Hugh Dickins<hugh.dickins@tiscali.co.uk> wrote:
> > On Mon, 31 Aug 2009, Nitin Gupta wrote:
> >> For block devices, setup_swap_extents() leaves p->pages untouched.
> >> For regular files, it sets p->pages
> >> == total usable swap pages (including header page) - 1;
> >
> > I think you're overlooking the "page < sis->max" condition
> > in setup_swap_extents()'s loop. So at the end of the loop,
> > if no pages were lost to fragmentation, we have
> >
> > sis->max = page_no; /* no change */
> > sis->pages = page_no - 1; /* no change */
> >
>
> Oh, I missed this loop condition. The variable naming is so bad, I
> find it very hard to follow this part of code.
>
> Still, if there is even a single page in swap file that is not usable
> (i.e. non-contiguous on disk) -- which is what usually happens for swap
> files of any practical size -- setup_swap_extents() gives correct value
> in sis->pages == total usable pages (including header) - 1;
>
> However, if all the file pages are usable, it gives off-by-one error, as
> you noted.
Right, I see your point now: when the regular file is fragmented thus,
setup_swap_extents() would allow it to use the final page of the file,
which would otherwise be (erroneously) disallowed.
But I would reword your "what usually happens" to "what happens in
the general case": perhaps I'm wrong, but I think that usually these
days people are creating swap files on filesystems with 4kB block
size, where there's no issue of intra-page fragmentation lowering
that page count (but there may still be inter-page fragmentation
to make swapping to the file less efficient than to a partition).
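
To make the block-size point concrete, here is a rough Python model of the rule under discussion: a swap page only counts when it is backed by a PAGE_SIZE-length, PAGE_SIZE-aligned contiguous run of disk blocks. This is a sketch, not the kernel code. The name `count_usable_pages` and the `block_map` list (on-disk block number for each file block) are hypothetical, invented for illustration, and the scan order is a simplification.

```python
def count_usable_pages(block_map, blocks_per_page):
    """Count page-sized chunks of a file that could back a swap page.

    block_map[i] is the (hypothetical) on-disk block number of file block i.
    A chunk is usable only if its first disk block is page-aligned and the
    remaining blocks follow it contiguously on disk.
    """
    usable = 0
    probe = 0
    while probe + blocks_per_page <= len(block_map):
        first = block_map[probe]
        aligned = first % blocks_per_page == 0
        contiguous = all(
            block_map[probe + i] == first + i for i in range(blocks_per_page)
        )
        if aligned and contiguous:
            usable += 1
            probe += blocks_per_page  # move on to the next page-sized chunk
        else:
            probe += 1  # skip one block and try again
    return usable


# 4kB blocks on a 4kB page: one block per page, so nothing can be lost.
scattered_4k = [7, 3, 100, 42]
# 1kB blocks: four blocks per page; the middle run is broken (21 -> 23).
fragmented_1k = [8, 9, 10, 11, 20, 21, 23, 24, 32, 33, 34, 35]
```

With one block per page, every block trivially passes both checks, so `count_usable_pages(scattered_4k, 1)` counts all four pages however scattered they are. With four blocks per page, `count_usable_pages(fragmented_1k, 4)` finds only two usable pages out of three.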
>
> > Yes, I'd dislike that discrepancy between regular files and block
> > devices, if I could see it. Though I'd probably still be cautious
> > about the disk partitions.
>
> > dd if=/dev/zero of=/swap bs=200k # says 204800 bytes (205kB)
> > mkswap /swap # says size = 196 KiB
> > swapon /swap # dmesg says Adding 192k swap
>
> > which is what I've come to expect from the off-by-one,
> > even on regular files.
>
> In general, it's not correct to compare the sizes reported by mkswap
> and swapon like this. The size reported by mkswap includes pages which
> are not contiguous on disk, whereas the kernel considers only
> PAGE_SIZE-length, PAGE_SIZE-aligned contiguous runs of blocks. So the
> sizes reported by mkswap and swapon can vary wildly. For example:
>
> (on mtdram with ext2 fs)
> dd if=/dev/zero of=swap.dd bs=1M count=10
> mkswap swap.dd # says size = 10236 KiB
> swapon swap.dd # says Adding 10112k swap
If the filesystem has block size 1kB or 2kB, yes.
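
For reference, the figures quoted in the two examples can be turned into page counts with a few lines of Python. This is only arithmetic on the numbers already in the thread, assuming a 4096-byte PAGE_SIZE (an assumption, though 200k - 196 KiB = 4 KiB is consistent with it). The helper names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pages_from_bytes(nbytes):
    """Whole pages that fit in nbytes."""
    return nbytes // PAGE_SIZE


def pages_from_kib(kib):
    """Whole pages in a size given in KiB, as printed by mkswap/swapon."""
    return (kib * 1024) // PAGE_SIZE


# First example: 200k file
small_file = pages_from_bytes(204800)      # pages in the file
small_mkswap = pages_from_kib(196)         # "size = 196 KiB"
small_swapon = pages_from_kib(192)         # "Adding 192k swap"

# Second example: 10M file on ext2 over mtdram
big_file = pages_from_bytes(10 * 1024 * 1024)
big_mkswap = pages_from_kib(10236)         # "size = 10236 KiB"
big_swapon = pages_from_kib(10112)         # "Adding 10112k swap"

print(small_file, small_mkswap, small_swapon)
print(big_file, big_mkswap, big_swapon, big_mkswap - big_swapon)
```

In the first case mkswap is one page short of the file (the header) and swapon one page short of mkswap (the off-by-one). In the second case the gap between mkswap and swapon is 31 pages, far more than an off-by-one could explain, which fits pages being lost to the small block size.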
>
> ====
>
> So, to summarize:
>
> 1. mkswap always behaves correctly: It sets number of pages in swap file
> minus one as 'last_page' in swap header (since this is a 0-based index).
> This same value (total pages - 1) is printed out as size since it knows
> that first page is swap header.
>
> 2. swapon() for block devices: off-by-one error causing last swap page
> to remain unused.
>
> 3. swapon() for regular files:
> 3.1 off-by-one error if every swap page in this file is usable i.e.
> every PAGE_SIZE-length, PAGE_SIZE-aligned chunk is contiguous on
> disk.
> 3.2 correct size value if there is at least one swap page which is
> unusable -- which is expected from swap file of any practical
> size.
>
>
> I will go through swap code again to find other possible off-by-one
> errors. The revised patch will fix these inconsistencies.
Thanks.
Hugh
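
As a footnote to the quoted summary, its three cases can be modelled in a few lines of Python. This is a toy model built only from what the thread states: mkswap stores total pages - 1 as last_page, the file scan stops on the "page < sis->max" condition, and sis->pages ends up as page_no - 1. It assumes that swapon takes last_page as its page limit, and it treats each page-sized chunk of the file as simply usable or not. All function names are hypothetical.

```python
def mkswap_reported_pages(total_pages):
    """mkswap: last_page is a 0-based index, so total - 1 (header excluded)."""
    return total_pages - 1


def swapon_reported_pages(usable, is_block_device=False):
    """Model of the page count swapon ends up with.

    usable[i] says whether chunk i of the swap area (chunk 0 is the header)
    can be used. For a block device every chunk is taken as usable.
    """
    max_pages = len(usable) - 1  # last_page, used here as the page limit
    if is_block_device:
        return max_pages - 1  # limit minus the header page
    page_no = 0
    for ok in usable:
        if page_no >= max_pages:  # the "page < sis->max" loop condition
            break
        if ok:
            page_no += 1
    return page_no - 1  # sis->pages = page_no - 1


def true_swap_pages(usable):
    """Usable pages actually available for swap, excluding the header."""
    return sum(usable) - 1


all_good = [True] * 50
one_bad = [True] * 50
one_bad[10] = False
```

For `all_good`, both the block-device and regular-file paths report 48 pages while 49 are really available (cases 2 and 3.1). For `one_bad`, the regular-file path reports 48, which matches the 48 pages truly usable (case 3.2).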
Thread overview: 6+ messages
2009-08-30 16:19 [PATCH] swap: Fix swap size in case of block devices Nitin Gupta
2009-08-31 11:27 ` Hugh Dickins
2009-08-31 17:21 ` Nitin Gupta
2009-08-31 19:26 ` Hugh Dickins
2009-09-01 7:11 ` Nitin Gupta
2009-09-01 9:23 ` Hugh Dickins [this message]