From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: stable-review@kernel.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk,
Eric Sandeen <sandeen@redhat.com>, "Theodore Tso" <tytso@mit.edu>,
Greg Kroah-Hartman <gregkh@suse.de>
Subject: [71/90] ext4: make trim/discard optional (and off by default)
Date: Thu, 10 Dec 2009 20:25:49 -0800
Message-ID: <20091211042812.322370572@linux.site>
In-Reply-To: <20091211043502.GA17916@kroah.com>

2.6.31-stable review patch.  If anyone has any objections, please let us know.

------------------

(cherry picked from commit 5328e635315734d42080de9a5a1ee87bf4cae0a4)

It is anticipated that when sb_issue_discard starts doing
real work on trim-capable devices, we may see issues.  Make
this mount-time optional, and default it to off until we know
that things are working out OK.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
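
Illustration only: a minimal userspace sketch of the option handling
described above, assuming a plain unsigned int in place of
sbi->s_mount_opt.  The flag value is copied from the patch.  The
parse_opts() helper is hypothetical, and test_opt() here takes the
flags word directly, whereas the kernel macro takes the super_block.
It shows that discard starts out off, that "discard" turns it on, and
that the last of discard/nodiscard on the option string wins.

```c
#include <assert.h>
#include <string.h>

/* Flag value as added to fs/ext4/ext4.h by this patch. */
#define EXT4_MOUNT_DISCARD	0x40000000U

/* Userspace stand-ins for the kernel macros (simplified signatures). */
#define set_opt(o, opt)		((o) |= EXT4_MOUNT_##opt)
#define clear_opt(o, opt)	((o) &= ~EXT4_MOUNT_##opt)
#define test_opt(o, opt)	((o) & EXT4_MOUNT_##opt)

/*
 * Hypothetical helper, not kernel code: walk a comma-separated option
 * string and return the resulting flags word.  Only the two options
 * added by this patch are recognized; anything else is skipped.
 */
static unsigned int parse_opts(const char *opts)
{
	unsigned int mount_opt = 0;	/* discard is off by default */
	char buf[128];
	char *tok;

	assert(opts != NULL);

	/* strtok() modifies its input, so work on a bounded copy. */
	strncpy(buf, opts, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (tok = strtok(buf, ","); tok; tok = strtok(NULL, ",")) {
		if (strcmp(tok, "discard") == 0)
			set_opt(mount_opt, DISCARD);
		else if (strcmp(tok, "nodiscard") == 0)
			clear_opt(mount_opt, DISCARD);
	}
	return mount_opt;
}
```

With this sketch, test_opt(parse_opts(""), DISCARD) is zero, and
test_opt(parse_opts("noatime,discard"), DISCARD) is non-zero, which is
the condition the mballoc.c hunk checks before calling
sb_issue_discard().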
 Documentation/filesystems/ext4.txt |    6 ++++++
 fs/ext4/ext4.h                     |    1 +
 fs/ext4/mballoc.c                  |   21 +++++++++++++--------
 fs/ext4/super.c                    |   14 +++++++++++++-
 4 files changed, 33 insertions(+), 9 deletions(-)
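
Illustration only: a worked example of the discard_block arithmetic in
the mballoc.c hunk, written as a standalone function.  The name
discard_start() and the uint64_t/uint32_t types are assumptions made
for this sketch; the kernel code uses ext4_fsblk_t and reads the
values from the superblock.  The point is that the first operand is
widened before the multiply, so the product does not wrap at 32 bits
on large filesystems.

```c
#include <stdint.h>

/* Stand-in for ext4_fsblk_t (assumed 64-bit for this sketch). */
typedef uint64_t fsblk_t;

/*
 * Absolute filesystem block number of a freed extent:
 *   group * blocks_per_group + start_blk + first_data_block
 * The cast on 'group' mirrors the (ext4_fsblk_t) cast in the patch.
 */
static fsblk_t discard_start(uint32_t group, uint32_t blocks_per_group,
			     uint32_t start_blk, uint32_t first_data_block)
{
	return (fsblk_t)group * blocks_per_group
		+ start_blk
		+ first_data_block;
}
```

For example, with 32768 blocks per group and s_first_data_block == 0,
an extent starting at block 100 of group 3 begins at filesystem block
3 * 32768 + 100 = 98404.  With 8192 blocks per group and
s_first_data_block == 1, block 5 of group 2 maps to
2 * 8192 + 5 + 1 = 16390.  For group 200000 at 32768 blocks per group
the result is 6553600000, which no longer fits in 32 bits.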
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -338,6 +338,12 @@ noauto_da_alloc replacing existing file
system crashes before the delayed allocation
blocks are forced to disk.
+discard Controls whether ext4 should issue discard/TRIM
+nodiscard(*) commands to the underlying block device when
+ blocks are freed. This is useful for SSD devices
+ and sparse/thinly-provisioned LUNs, but it is off
+ by default until sufficient testing has been done.
+
Data Mode
=========
There are 3 different data modes:
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -747,6 +747,7 @@ struct ext4_inode_info {
#define EXT4_MOUNT_DELALLOC 0x8000000 /* Delalloc support */
#define EXT4_MOUNT_DATA_ERR_ABORT 0x10000000 /* Abort on file data write */
#define EXT4_MOUNT_BLOCK_VALIDITY 0x20000000 /* Block validity checking */
+#define EXT4_MOUNT_DISCARD 0x40000000 /* Issue DISCARD requests */
#define clear_opt(o, opt) o &= ~EXT4_MOUNT_##opt
#define set_opt(o, opt) o |= EXT4_MOUNT_##opt
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2810,7 +2810,6 @@ static void release_blocks_on_commit(jou
struct ext4_group_info *db;
int err, count = 0, count2 = 0;
struct ext4_free_data *entry;
- ext4_fsblk_t discard_block;
struct list_head *l, *ltmp;
list_for_each_safe(l, ltmp, &txn->t_private_list) {
@@ -2840,13 +2839,19 @@ static void release_blocks_on_commit(jou
 			page_cache_release(e4b.bd_bitmap_page);
 		}
 		ext4_unlock_group(sb, entry->group);
-		discard_block = (ext4_fsblk_t) entry->group * EXT4_BLOCKS_PER_GROUP(sb)
-						+ entry->start_blk
-						+ le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block);
-		trace_ext4_discard_blocks(sb, (unsigned long long)discard_block,
-					  entry->count);
-		sb_issue_discard(sb, discard_block, entry->count);
-
+		if (test_opt(sb, DISCARD)) {
+			ext4_fsblk_t discard_block;
+			struct ext4_super_block *es = EXT4_SB(sb)->s_es;
+
+			discard_block = (ext4_fsblk_t)entry->group *
+						EXT4_BLOCKS_PER_GROUP(sb)
+					+ entry->start_blk
+					+ le32_to_cpu(es->s_first_data_block);
+			trace_ext4_discard_blocks(sb,
+					(unsigned long long)discard_block,
+					entry->count);
+			sb_issue_discard(sb, discard_block, entry->count);
+		}
 		kmem_cache_free(ext4_free_ext_cachep, entry);
 		ext4_mb_release_desc(&e4b);
 	}
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -906,6 +906,9 @@ static int ext4_show_options(struct seq_
if (test_opt(sb, NO_AUTO_DA_ALLOC))
seq_puts(seq, ",noauto_da_alloc");
+ if (test_opt(sb, DISCARD))
+ seq_puts(seq, ",discard");
+
ext4_show_quota_options(seq, sb);
return 0;
@@ -1086,7 +1089,8 @@ enum {
Opt_usrquota, Opt_grpquota, Opt_i_version,
Opt_stripe, Opt_delalloc, Opt_nodelalloc,
Opt_block_validity, Opt_noblock_validity,
- Opt_inode_readahead_blks, Opt_journal_ioprio
+ Opt_inode_readahead_blks, Opt_journal_ioprio,
+ Opt_discard, Opt_nodiscard,
};
static const match_table_t tokens = {
@@ -1152,6 +1156,8 @@ static const match_table_t tokens = {
{Opt_auto_da_alloc, "auto_da_alloc=%u"},
{Opt_auto_da_alloc, "auto_da_alloc"},
{Opt_noauto_da_alloc, "noauto_da_alloc"},
+ {Opt_discard, "discard"},
+ {Opt_nodiscard, "nodiscard"},
{Opt_err, NULL},
};
@@ -1580,6 +1586,12 @@ set_qf_format:
else
set_opt(sbi->s_mount_opt,NO_AUTO_DA_ALLOC);
break;
+ case Opt_discard:
+ set_opt(sbi->s_mount_opt, DISCARD);
+ break;
+ case Opt_nodiscard:
+ clear_opt(sbi->s_mount_opt, DISCARD);
+ break;
default:
ext4_msg(sb, KERN_ERR,
"Unrecognized mount option \"%s\" "