linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andi Kleen <andi@firstfloor.org>
To: Hugh Dickins <hughd@google.com>
Cc: angelo <angelo70@gmail.com>, Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	Andi Kleen <andi@firstfloor.org>,
	Jeff Layton <jlayton@poochiereds.net>,
	Eryu Guan <eguan@redhat.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix cpu hangs on truncating last page of a 16t sparse file
Date: Mon, 28 Sep 2015 19:03:32 +0200	[thread overview]
Message-ID: <20150928170332.GA12732@two.firstfloor.org> (raw)
In-Reply-To: <alpine.LSU.2.11.1509270953460.1024@eggly.anvils>

> I can't tell you why MAX_LFS_FILESIZE was defined to exclude half
> of the available range.  I've always assumed that it's because there
> were known or feared areas of the code, which manipulate between
> bytes and pages, and might hit sign extension issues - though
> I cannot identify those places myself.

The limit was intentional to handle old user space. I don't think
it has anything to do with the kernel.

off_t is sometimes used signed, mainly with lseek SEEK_CUR/END when you
want to seek backwards. It would be quite odd to sometimes
have off_t be signed (SEEK_CUR/END) and sometimes be unsigned
(when using SEEK_SET).  So it made some sense to set the limit
to the signed max value.

Here's the original "Large file standard" that describes
the issues in more details:

http://www.unix.org/version2/whatsnew/lfs20mar.html

This document explicitly requests signed off_t:

>>>


Mixed sizes of off_t
    During a period of transition from existing systems to systems able to support an arbitrarily large file size, most systems will need to support binaries with two or more sizes of the off_t data type (and related data types). This mixed off_t environment may occur on a system with an ABI that supports different sizes of off_t. It may occur on a system which has both a 64-bit and a 32-bit ABI. Finally, it may occur when using a distributed system where clients and servers have differing sizes of off_t. In effect, the period of transition will not end until we need 128-bit file sizes, requiring yet another transition! The proposed changes may also be used as a model for the 64 to 128-bit file size transition. 
Offset maximum
    Most, but unfortunately not all, of the numeric values in the SUS are protected by opaque type definitions. In theory this allows programs to use these types rather than the underlying C language data types to avoid issues like overflow. However, most existing code maps these opaque data types like off_t to long integers that can overflow for the values needed to represent the offsets possible in large files.

    To protect existing binaries from arbitrarily large files, a new value (offset maximum) will be part of the open file description. An offset maximum is the largest offset that can be used as a file offset. Operations attempting to go beyond the offset maximum will return an error. The offset maximum is normally established as the size of the off_t "extended signed integral type" used by the program creating the file description.

    The open() function and other interfaces establish the offset maximum for a file description, returning an error if the file size is larger than the offset maximum at the time of the call. Returning errors when the 
<<<

-Andi

  parent reply	other threads:[~2015-09-28 17:03 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <560723F8.3010909@gmail.com>
2015-09-27  1:36 ` [PATCH] mm: fix cpu hangs on truncating last page of a 16t sparse file Hugh Dickins
2015-09-27  2:14   ` angelo
2015-09-27  2:21   ` angelo
2015-09-27 17:59     ` Hugh Dickins
2015-09-27 23:26       ` Dave Chinner
2015-09-27 23:56         ` Jeff Layton
2015-09-28  1:06           ` Dave Chinner
2015-09-28 17:03       ` Andi Kleen [this message]
2015-09-28 18:08         ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150928170332.GA12732@two.firstfloor.org \
    --to=andi@firstfloor.org \
    --cc=akpm@linux-foundation.org \
    --cc=angelo70@gmail.com \
    --cc=david@fromorbit.com \
    --cc=eguan@redhat.com \
    --cc=hch@infradead.org \
    --cc=hughd@google.com \
    --cc=jlayton@poochiereds.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).