* ATA support for 4k sector size
From: Matthew Wilcox @ 2009-02-25 22:24 UTC
To: linux-ide, linux-kernel

The two patches following this add support for drives which have sector
sizes other than 512 bytes.  I haven't been able to test this as I don't
have the hardware.

Individual host drivers will have to be updated to support sizes other
than 512 bytes.

Support for logical sector sizes that differ from physical sector sizes
depends on the READ CAPACITY 16 patch I posted in December that isn't in
scsi-misc yet.

The approach I've taken to generating the tables of which commands need
a 512-byte transfer size and which use the drive's sector size is
'innovative'.  Review is encouraged ;-)
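
As a rough illustration of the table idea in the cover letter, the lookup can be thought of as a 256-bit bitmap indexed by command opcode. This is only a sketch under assumptions: `struct cmd_bitmap`, `bitmap_set`, `bitmap_test` and `sect_size_for` are hypothetical names, not the helpers the patches add, and the bitmap here is filled at run time rather than generated at build time.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per possible 8-bit ATA command opcode (hypothetical type). */
struct cmd_bitmap {
	uint32_t words[256 / 32];
};

/* Mark a command as transferring whole device sectors. */
static void bitmap_set(struct cmd_bitmap *map, uint8_t cmd)
{
	map->words[cmd / 32] |= UINT32_C(1) << (cmd % 32);
}

/* Return 1 if the command's bit is set, 0 otherwise. */
static int bitmap_test(const struct cmd_bitmap *map, uint8_t cmd)
{
	return (map->words[cmd / 32] >> (cmd % 32)) & 1;
}

/*
 * Commands present in sector_cmds transfer multiples of the device's
 * sector size; everything else is assumed to transfer 512-byte units.
 */
static unsigned sect_size_for(const struct cmd_bitmap *sector_cmds,
			      uint8_t cmd, unsigned dev_sect_size)
{
	assert(dev_sect_size >= 512);
	return bitmap_test(sector_cmds, cmd) ? dev_sect_size : 512;
}
```

With opcode 0x25 (ATA_CMD_READ_EXT) set in the bitmap and 0xEC (ATA_CMD_ID_ATA) left clear, a 4096-byte drive would use 4096 for the former and 512 for the latter.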
* [PATCH 1/2] ata: Define new commands from ATA8
From: Matthew Wilcox @ 2009-02-25 22:24 UTC
To: linux-ide, linux-kernel; +Cc: Matthew Wilcox, Matthew Wilcox

Later patches require knowledge of some commands which aren't currently
defined.  While I'm at it, add all the commands that are in ATA8 and
reorder the existing commands to be in the same order as ATA8.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
---
 include/linux/ata.h |  103 +++++++++++++++++++++++++++++++++-----------------
 1 files changed, 68 insertions(+), 35 deletions(-)

diff --git a/include/linux/ata.h b/include/linux/ata.h
index 08a86d5..8d09d81 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -194,65 +194,98 @@ enum {
 	ATA_REG_IRQ		= ATA_REG_NSECT,
 
 	/* ATA device commands */
-	ATA_CMD_DEV_RESET	= 0x08, /* ATAPI device reset */
+	ATA_CMD_CFA_ERASE_SECTORS = 0xC0,
+	ATA_CMD_CFA_REQUEST_EXT_ERROR = 0x03,
+	ATA_CMD_CFA_TRANSLATE_SECTOR = 0x87,
+	ATA_CMD_CFA_WRITE_MULTI_WITHOUT_ERASE = 0xCD,
+	ATA_CMD_CFA_WRITE_SECTORS_WITHOUT_ERASE = 0x38,
+	ATA_CMD_CHK_MEDIA_CARD_TYPE = 0xD1,
 	ATA_CMD_CHK_POWER	= 0xE5, /* check power mode */
-	ATA_CMD_STANDBY		= 0xE2, /* place in standby power mode */
-	ATA_CMD_IDLE		= 0xE3, /* place in idle power mode */
+	ATA_CMD_CONFIG_STREAM	= 0x51,
+	ATA_CMD_CONF_OVERLAY	= 0xB1,
+	ATA_CMD_DEV_RESET	= 0x08, /* ATAPI device reset */
+	ATA_CMD_DLOAD_MCODE	= 0x92,
 	ATA_CMD_EDD		= 0x90, /* execute device diagnostic */
 	ATA_CMD_FLUSH		= 0xE7,
 	ATA_CMD_FLUSH_EXT	= 0xEA,
 	ATA_CMD_ID_ATA		= 0xEC,
 	ATA_CMD_ID_ATAPI	= 0xA1,
-	ATA_CMD_READ		= 0xC8,
-	ATA_CMD_READ_EXT	= 0x25,
-	ATA_CMD_WRITE		= 0xCA,
-	ATA_CMD_WRITE_EXT	= 0x35,
-	ATA_CMD_WRITE_FUA_EXT	= 0x3D,
+	ATA_CMD_IDLE		= 0xE3, /* place in idle power mode */
+	ATA_CMD_IDLEIMMEDIATE	= 0xE1,
+	ATA_CMD_MEDIA_LOCK	= 0xDE,
+	ATA_CMD_MEDIA_UNLOCK	= 0xDF,
+	ATA_CMD_NVCACHE		= 0xB6,
+	ATA_CMD_NOP		= 0x00,
+	ATA_CMD_PACKET		= 0xA0,
+	ATA_CMD_PMP_READ	= 0xE4, /* aka READ BUFFER */
+	ATA_CMD_READ		= 0xC8, /* aka READ DMA */
+	ATA_CMD_READ_EXT	= 0x25, /* aka READ DMA EXT */
+	ATA_CMD_READ_QUEUED	= 0xC7,
+	ATA_CMD_READ_QUEUED_EXT	= 0x26,
 	ATA_CMD_FPDMA_READ	= 0x60,
-	ATA_CMD_FPDMA_WRITE	= 0x61,
-	ATA_CMD_PIO_READ	= 0x20,
-	ATA_CMD_PIO_READ_EXT	= 0x24,
-	ATA_CMD_PIO_WRITE	= 0x30,
-	ATA_CMD_PIO_WRITE_EXT	= 0x34,
+	ATA_CMD_READ_LOG_EXT	= 0x2F,
+	ATA_CMD_READ_LOG_DMA_EXT= 0x47,
 	ATA_CMD_READ_MULTI	= 0xC4,
 	ATA_CMD_READ_MULTI_EXT	= 0x29,
-	ATA_CMD_WRITE_MULTI	= 0xC5,
-	ATA_CMD_WRITE_MULTI_EXT	= 0x39,
-	ATA_CMD_WRITE_MULTI_FUA_EXT = 0xCE,
-	ATA_CMD_SET_FEATURES	= 0xEF,
-	ATA_CMD_SET_MULTI	= 0xC6,
-	ATA_CMD_PACKET		= 0xA0,
-	ATA_CMD_VERIFY		= 0x40,
-	ATA_CMD_VERIFY_EXT	= 0x42,
-	ATA_CMD_STANDBYNOW1	= 0xE0,
-	ATA_CMD_IDLEIMMEDIATE	= 0xE1,
-	ATA_CMD_SLEEP		= 0xE6,
-	ATA_CMD_INIT_DEV_PARAMS	= 0x91,
 	ATA_CMD_READ_NATIVE_MAX	= 0xF8,
 	ATA_CMD_READ_NATIVE_MAX_EXT = 0x27,
+	ATA_CMD_PIO_READ	= 0x20, /* aka READ SECTOR(S) */
+	ATA_CMD_PIO_READ_EXT	= 0x24, /* aka READ SECTOR(S) EXT */
+	ATA_CMD_READ_STREAM_DMA_EXT = 0x2A,
+	ATA_CMD_READ_STREAM_EXT	= 0x2B,
+	ATA_CMD_VERIFY		= 0x40, /* aka READ VERIFY SECTOR(S) */
+	ATA_CMD_VERIFY_EXT	= 0x42, /* aka READ VERIFY SECTOR(S) EXT */
+	ATA_CMD_SEC_DISABLE_PASSWORD = 0xF6,
+	ATA_CMD_SEC_ERASE_PREPARE = 0xF3,
+	ATA_CMD_SEC_ERASE_UNIT	= 0xF4,
+	ATA_CMD_SEC_FREEZE_LOCK	= 0xF5,
+	ATA_CMD_SEC_SET_PASSWORD = 0xF1,
+	ATA_CMD_SEC_UNLOCK	= 0xF2,
+	ATA_CMD_SERVICE		= 0xA2,
+	ATA_CMD_SET_FEATURES	= 0xEF,
 	ATA_CMD_SET_MAX		= 0xF9,
 	ATA_CMD_SET_MAX_EXT	= 0x37,
-	ATA_CMD_READ_LOG_EXT	= 0x2f,
-	ATA_CMD_PMP_READ	= 0xE4,
-	ATA_CMD_PMP_WRITE	= 0xE8,
-	ATA_CMD_CONF_OVERLAY	= 0xB1,
-	ATA_CMD_SEC_FREEZE_LOCK	= 0xF5,
+	ATA_CMD_SET_MULTI	= 0xC6,
+	ATA_CMD_SLEEP		= 0xE6,
 	ATA_CMD_SMART		= 0xB0,
-	ATA_CMD_MEDIA_LOCK	= 0xDE,
-	ATA_CMD_MEDIA_UNLOCK	= 0xDF,
-	/* marked obsolete in the ATA/ATAPI-7 spec */
-	ATA_CMD_RESTORE		= 0x10,
+	ATA_CMD_STANDBY		= 0xE2, /* place in standby power mode */
+	ATA_CMD_STANDBYNOW1	= 0xE0,
+	ATA_CMD_TRUSTED_NON_DATA = 0x5B,
+	ATA_CMD_TRUSTED_RECEIVE	= 0x5C,
+	ATA_CMD_TRUSTED_RECEIVE_DMA = 0x5D,
+	ATA_CMD_TRUSTED_SEND	= 0x5E,
+	ATA_CMD_TRUSTED_SEND_DMA = 0x5F,
+	ATA_CMD_PMP_WRITE	= 0xE8, /* aka WRITE BUFFER */
+	ATA_CMD_WRITE		= 0xCA, /* aka WRITE DMA */
+	ATA_CMD_WRITE_EXT	= 0x35, /* aka WRITE DMA EXT */
+	ATA_CMD_WRITE_FUA_EXT	= 0x3D, /* aka WRITE DMA FUA EXT */
+	ATA_CMD_WRITE_DMA_QUEUED = 0xCC,
+	ATA_CMD_WRITE_DMA_QUEUED_EXT = 0x36,
+	ATA_CMD_WRITE_DMA_QUEUED_FUA_EXT = 0x3E,
+	ATA_CMD_FPDMA_WRITE	= 0x61, /* aka WRITE FPDMA QUEUED */
+	ATA_CMD_WRITE_LOG_EXT	= 0x3F,
+	ATA_CMD_WRITE_LOG_DMA_EXT = 0x57,
+	ATA_CMD_WRITE_MULTI	= 0xC5,
+	ATA_CMD_WRITE_MULTI_EXT	= 0x39,
+	ATA_CMD_WRITE_MULTI_FUA_EXT = 0xCE,
+	ATA_CMD_PIO_WRITE	= 0x30, /* aka WRITE SECTOR(S) */
+	ATA_CMD_PIO_WRITE_EXT	= 0x34, /* aka WRITE SECTOR(S) EXT */
+	ATA_CMD_WRITE_STREAM_DMA_EXT = 0x3A,
+	ATA_CMD_WRITE_STREAM_EXT = 0x3B,
+	ATA_CMD_WRITE_UNCORRECTABLE_EXT = 0x45,
 	/* EXABYTE specific */
 	ATA_EXABYTE_ENABLE_NEST	= 0xF0,
 
 	/* READ_LOG_EXT pages */
 	ATA_LOG_SATA_NCQ	= 0x10,
 
-	/* READ/WRITE LONG (obsolete) */
+	/* Obsolete */
 	ATA_CMD_READ_LONG	= 0x22,
 	ATA_CMD_READ_LONG_ONCE	= 0x23,
 	ATA_CMD_WRITE_LONG	= 0x32,
 	ATA_CMD_WRITE_LONG_ONCE	= 0x33,
+	ATA_CMD_INIT_DEV_PARAMS	= 0x91,
+	ATA_CMD_RESTORE		= 0x10,
 
 	/* SETFEATURES stuff */
 	SETFEATURES_XFER	= 0x03,
-- 
1.5.6.5
* [PATCH 2/2] ata: Add support for Long Logical Sectors and Long Physical Sectors
From: Matthew Wilcox @ 2009-02-25 22:24 UTC
To: linux-ide, linux-kernel; +Cc: Matthew Wilcox

From: Matthew Wilcox <willy@linux.intel.com>

ATA 8 permits devices that have sector sizes larger than 512 bytes.
We support this by recording the sector size in the ata_device and use
it instead of the ATA_SECT_SIZE when the data transfer is a multiple
of sectors.  Drivers must indicate their support for sector sizes other
than 512 by implementing the 'sector_size_supported' port operation.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
---
 drivers/ata/Makefile       |   11 ++++
 drivers/ata/ata-commands.c |  132 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/ata/bitops.c       |   77 +++++++++++++++++++++++++
 drivers/ata/bitops.h       |   12 ++++
 drivers/ata/libata-core.c  |   67 ++++++++++++++++++++++
 drivers/ata/libata-scsi.c  |   52 +++++++++++++-----
 include/linux/libata.h     |    3 +
 7 files changed, 340 insertions(+), 14 deletions(-)
 create mode 100644 drivers/ata/ata-commands.c
 create mode 100644 drivers/ata/bitops.c
 create mode 100644 drivers/ata/bitops.h

diff --git a/drivers/ata/Makefile b/drivers/ata/Makefile
index 7f1ecf9..a67d5dc 100644
--- a/drivers/ata/Makefile
+++ b/drivers/ata/Makefile
@@ -84,3 +84,14 @@ libata-objs := libata-core.o libata-scsi.o libata-eh.o
 libata-$(CONFIG_ATA_SFF)	+= libata-sff.o
 libata-$(CONFIG_SATA_PMP)	+= libata-pmp.o
 libata-$(CONFIG_ATA_ACPI)	+= libata-acpi.o
+
+hostprogs-y := ata-commands
+ata-commands-objs := ata-commands.o bitops.o
+HOSTCFLAGS_ata-commands.o := -Iinclude
+clean-files := all-commands sector-commands
+quiet_cmd_ata-cmds = ATA-CMD
+      cmd_ata-cmds = $(obj)/ata-commands || (rm -f all-commands sector-commands)
+
+$(obj)/all-commands: $(obj)/ata-commands
+	$(call if_changed,ata-cmds)
+$(obj)/libata-core.o: $(obj)/all-commands
diff --git a/drivers/ata/ata-commands.c b/drivers/ata/ata-commands.c
new file mode 100644
index 0000000..afb40c1
--- /dev/null
+++ b/drivers/ata/ata-commands.c
@@ -0,0 +1,132 @@
+/*
+ * ata-commands.c
+ *
+ * Copyright (c) 2008 Intel Corporation
+ * Author: Matthew Wilcox <willy@linux.intel.com>
+ *
+ * This file is part of the Linux kernel, and is made available under
+ * the terms of the GNU General Public License, version 2
+ */
+
+#include <stdio.h>
+
+#include "bitops.h"
+
+/* We have to define some types to include ata.h */
+
+typedef unsigned char u8;
+typedef unsigned short u16;
+typedef unsigned int u32;
+typedef unsigned long long u64;
+typedef int bool;
+#define false 0
+#define true 1
+#define _LINUX_SWAB_H
+#include "linux/ata.h"
+
+void set_sector_bits(void)
+{
+	set_bit(ATA_CMD_CFA_TRANSLATE_SECTOR);
+	set_bit(ATA_CMD_CFA_WRITE_MULTI_WITHOUT_ERASE);
+	set_bit(ATA_CMD_CFA_WRITE_SECTORS_WITHOUT_ERASE);
+	set_bit(ATA_CMD_READ);
+	set_bit(ATA_CMD_READ_EXT);
+	set_bit(ATA_CMD_READ_QUEUED);
+	set_bit(ATA_CMD_READ_QUEUED_EXT);
+	set_bit(ATA_CMD_FPDMA_READ);
+	set_bit(ATA_CMD_READ_MULTI);
+	set_bit(ATA_CMD_READ_MULTI_EXT);
+	set_bit(ATA_CMD_PIO_READ);
+	set_bit(ATA_CMD_PIO_READ_EXT);
+	set_bit(ATA_CMD_READ_STREAM_DMA_EXT);
+	set_bit(ATA_CMD_READ_STREAM_EXT);
+	set_bit(ATA_CMD_VERIFY);
+	set_bit(ATA_CMD_VERIFY_EXT);
+	set_bit(ATA_CMD_WRITE);
+	set_bit(ATA_CMD_WRITE_EXT);
+	set_bit(ATA_CMD_WRITE_FUA_EXT);
+	set_bit(ATA_CMD_WRITE_DMA_QUEUED);
+	set_bit(ATA_CMD_WRITE_DMA_QUEUED_EXT);
+	set_bit(ATA_CMD_WRITE_DMA_QUEUED_FUA_EXT);
+	set_bit(ATA_CMD_FPDMA_WRITE);
+	set_bit(ATA_CMD_WRITE_MULTI);
+	set_bit(ATA_CMD_WRITE_MULTI_EXT);
+	set_bit(ATA_CMD_WRITE_MULTI_FUA_EXT);
+	set_bit(ATA_CMD_PIO_WRITE);
+	set_bit(ATA_CMD_PIO_WRITE_EXT);
+	set_bit(ATA_CMD_WRITE_STREAM_DMA_EXT);
+	set_bit(ATA_CMD_WRITE_STREAM_EXT);
+}
+
+void set_512_bits(void)
+{
+	set_bit(ATA_CMD_CFA_ERASE_SECTORS);
+	set_bit(ATA_CMD_CFA_REQUEST_EXT_ERROR);
+	set_bit(ATA_CMD_CHK_MEDIA_CARD_TYPE);
+	set_bit(ATA_CMD_CHK_POWER);
+	set_bit(ATA_CMD_CONFIG_STREAM);
+	set_bit(ATA_CMD_CONF_OVERLAY);
+	set_bit(ATA_CMD_DEV_RESET);
+	set_bit(ATA_CMD_DLOAD_MCODE);
+	set_bit(ATA_CMD_EDD);
+	set_bit(ATA_CMD_FLUSH);
+	set_bit(ATA_CMD_FLUSH_EXT);
+	set_bit(ATA_CMD_ID_ATA);
+	set_bit(ATA_CMD_ID_ATAPI);
+	set_bit(ATA_CMD_IDLE);
+	set_bit(ATA_CMD_IDLEIMMEDIATE);
+	set_bit(ATA_CMD_MEDIA_LOCK);
+	set_bit(ATA_CMD_MEDIA_UNLOCK);
+	set_bit(ATA_CMD_NVCACHE);
+	set_bit(ATA_CMD_NOP);
+	set_bit(ATA_CMD_PACKET);
+	set_bit(ATA_CMD_PMP_READ);
+	set_bit(ATA_CMD_READ_LOG_EXT);
+	set_bit(ATA_CMD_READ_LOG_DMA_EXT);
+	set_bit(ATA_CMD_READ_NATIVE_MAX);
+	set_bit(ATA_CMD_READ_NATIVE_MAX_EXT);
+	set_bit(ATA_CMD_RESTORE);
+	set_bit(ATA_CMD_SEC_DISABLE_PASSWORD);
+	set_bit(ATA_CMD_SEC_ERASE_PREPARE);
+	set_bit(ATA_CMD_SEC_ERASE_UNIT);
+	set_bit(ATA_CMD_SEC_FREEZE_LOCK);
+	set_bit(ATA_CMD_SEC_SET_PASSWORD);
+	set_bit(ATA_CMD_SEC_UNLOCK);
+	set_bit(ATA_CMD_SERVICE);
+	set_bit(ATA_CMD_SET_FEATURES);
+	set_bit(ATA_CMD_SET_MAX);
+	set_bit(ATA_CMD_SET_MAX_EXT);
+	set_bit(ATA_CMD_SET_MULTI);
+	set_bit(ATA_CMD_SLEEP);
+	set_bit(ATA_CMD_SMART);
+	set_bit(ATA_CMD_STANDBY);
+	set_bit(ATA_CMD_STANDBYNOW1);
+	set_bit(ATA_CMD_TRUSTED_NON_DATA);
+	set_bit(ATA_CMD_TRUSTED_RECEIVE);
+	set_bit(ATA_CMD_TRUSTED_RECEIVE_DMA);
+	set_bit(ATA_CMD_TRUSTED_SEND);
+	set_bit(ATA_CMD_TRUSTED_SEND_DMA);
+	set_bit(ATA_CMD_PMP_WRITE);
+	set_bit(ATA_CMD_WRITE_LOG_EXT);
+	set_bit(ATA_CMD_WRITE_LOG_DMA_EXT);
+	set_bit(ATA_CMD_WRITE_UNCORRECTABLE_EXT);
+	set_bit(ATA_CMD_INIT_DEV_PARAMS);
+	set_bit(ATA_EXABYTE_ENABLE_NEST);
+}
+
+int main(int argc, char **argv)
+{
+	FILE *out;
+
+	set_sector_bits();
+	out = fopen("drivers/ata/sector-commands", "w");
+	output_array(out);
+	fclose(out);
+
+	set_512_bits();
+	out = fopen("drivers/ata/all-commands", "w");
+	output_array(out);
+	fclose(out);
+
+	return 0;
+}
diff --git a/drivers/ata/bitops.c b/drivers/ata/bitops.c
new file mode 100644
index 0000000..f6a6f06
--- /dev/null
+++ b/drivers/ata/bitops.c
@@ -0,0 +1,77 @@
+/*
+ * bitops.c
+ *
+ * Copyright 2008 Intel Corporation
+ * Author: Matthew Wilcox <willy@linux.intel.com>
+ *
+ * This file is part of the Linux kernel, and is made available under
+ * the terms of the GNU General Public License, version 2
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "bitops.h"
+
+static unsigned int *array;
+static unsigned maxbit, allocbit;
+
+static void expand_array(unsigned bit)
+{
+	unsigned size;
+	void *tmp;
+
+	if (bit > maxbit)
+		maxbit = bit;
+	if (bit < allocbit)
+		return;
+
+	size = ((bit + 63) / 64) * 8;
+	tmp = realloc(array, size);
+	if (!tmp) {
+		fprintf(stderr, "Memory allocation failure\n");
+		exit(1);
+	}
+	array = tmp;
+	memset(array + allocbit / 8, 0, size - allocbit / 8);
+	allocbit = size * 8;
+}
+
+void set_bit(unsigned bit)
+{
+	expand_array(bit);
+	array[bit / 32] |= 1 << (bit % 32);
+}
+
+static const char *sep(unsigned i, unsigned w)
+{
+	if (i >= w - 1)
+		return "\n";
+	if (i % 6 == 5)
+		return "\n\t";
+	return " ";
+}
+
+void output_array(FILE *file)
+{
+	unsigned i, words = (maxbit + 32) / 32;
+
+	fprintf(file, "{\n#ifdef CONFIG_64BIT\n\t");
+	for (i = 0; i < words; i += 2) {
+		fprintf(file, "0x%08x%08xULL,%s", array[i + 1], array[i],
+			sep(i + 1, words));
+	}
+	fprintf(file, "#else\n\t");
+	for (i = 0; i < words; i++) {
+		fprintf(file, "0x%08x,%s", array[i], sep(i, words));
+	}
+	fprintf(file, "#endif\n};\n");
+}
+
+void free_array(void)
+{
+	free(array);
+	array = NULL;
+	maxbit = allocbit = 0;
+}
diff --git a/drivers/ata/bitops.h b/drivers/ata/bitops.h
new file mode 100644
index 0000000..3da560c
--- /dev/null
+++ b/drivers/ata/bitops.h
@@ -0,0 +1,12 @@
+/*
+ * A library for creating statically initialised bit arrays
+ */
+
+/* Call this to set a bit. */
+extern void set_bit(unsigned bit);
+
+/* Call this to output the array. */
+extern void output_array(FILE *file);
+
+/* Remove all bits from the array. */
+extern void free_array(void);
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 9fbf059..61378d8 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -600,6 +600,32 @@ void ata_tf_from_fis(const u8 *fis, struct ata_taskfile *tf)
 	tf->hob_nsect	= fis[13];
 }
 
+/**
+ * ata_sect_size - Returns the sector size to use for a command
+ * @command: The ATA command byte
+ * @dev_sect_size: The size of the drive's sectors
+ *
+ * Some commands are specified to transfer (a multiple of) 512 bytes of data
+ * while others transfer a multiple of the number of bytes in a sector.  This
+ * function knows which commands transfer how much data.
+ */
+unsigned ata_sect_size(u8 command, unsigned dev_sect_size)
+{
+	static const unsigned long sector_commands[] =
+#include "sector-commands"
+	static unsigned long known_commands[] =
+#include "all-commands"
+
+	if (test_bit(command, sector_commands))
+		return dev_sect_size;
+	if (!test_bit(command, known_commands)) {
+		printk(KERN_ERR "Unknown ata cmd %d, assuming "
+				"512 byte sector size\n", command);
+		set_bit(command, known_commands);
+	}
+	return 512;
+}
+
 static const u8 ata_rw_cmds[] = {
 	/* pio multi */
 	ATA_CMD_READ_MULTI,
@@ -1333,6 +1359,25 @@ static u64 ata_id_n_sectors(const u16 *id)
 	}
 }
 
+/*
+ * ATA supports sector sizes up to 2^33 - 1.  The reported sector size may
+ * not be a power of two.  The extra bytes are used for user-visible data
+ * integrity calculations.  Note this is not the same as the ECC which is
+ * accessed through the SCT Command Transport or READ / WRITE LONG.
+ */
+static u64 ata_id_sect_size(const u16 *id)
+{
+	u16 word_106 = id[106];
+	u64 sz;
+
+	if ((word_106 & 0xc000) != 0x4000)
+		return ATA_SECT_SIZE;
+	if (!(word_106 & (1 << 12)))
+		return ATA_SECT_SIZE;
+	sz = (id[117] | ((u64)id[118] << 16)) * 2;
+	return sz;
+}
+
 u64 ata_tf_to_lba48(const struct ata_taskfile *tf)
 {
 	u64 sectors = 0;
@@ -2301,6 +2346,20 @@ static void ata_dev_config_ncq(struct ata_device *dev,
 		snprintf(desc, desc_sz, "NCQ (depth %d/%d)", hdepth, ddepth);
 }
 
+static int ata_check_sect_size(struct ata_device *dev)
+{
+	/* Every host can handle 512 byte sectors */
+	if (dev->sect_size == 512)
+		return 0;
+	/* Linux doesn't handle sectors larger than 4GB.  This may be
+	 * a problem around 2050 or so.  Deal with it then. */
+	if (dev->sect_size > 0xffffffffULL)
+		return -EINVAL;
+	if (!dev->link->ap->ops->sector_size_supported)
+		return -EINVAL;
+	return dev->link->ap->ops->sector_size_supported(dev) ? 0 : -EINVAL;
+}
+
 /**
  *	ata_dev_configure - Configure the specified ATA/ATAPI device
  *	@dev: Target device to configure
@@ -2384,6 +2443,7 @@ int ata_dev_configure(struct ata_device *dev)
 	dev->max_sectors = 0;
 	dev->cdb_len = 0;
 	dev->n_sectors = 0;
+	dev->sect_size = ATA_SECT_SIZE;
 	dev->cylinders = 0;
 	dev->heads = 0;
 	dev->sectors = 0;
@@ -2423,6 +2483,13 @@ int ata_dev_configure(struct ata_device *dev)
 		}
 
 		dev->n_sectors = ata_id_n_sectors(id);
+		dev->sect_size = ata_id_sect_size(id);
+		rc = ata_check_sect_size(dev);
+		if (rc) {
+			ata_dev_printk(dev, KERN_ERR, "sector size %lld not "
+					"supported.\n", dev->sect_size);
+			goto err_out_nosup;
+		}
 
 		if (dev->id[59] & 0x100)
 			dev->multi_count = dev->id[59] & 0xff;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index b9747fa..9e89601 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -50,7 +50,6 @@
 
 #include "libata.h"
 
-#define SECTOR_SIZE		512
 #define ATA_SCSI_RBUF_SIZE	4096
 
 static DEFINE_SPINLOCK(ata_scsi_rbuf_lock);
@@ -455,7 +454,7 @@ static int ata_get_identity(struct ata_port *ap, struct scsi_device *sdev,
 
 /**
  *	ata_cmd_ioctl - Handler for HDIO_DRIVE_CMD ioctl
- *	@scsidev: Device to which we are issuing command
+ *	@sdev: Device to which we are issuing command
  *	@arg: User provided data for issuing command
  *
  *	LOCKING:
@@ -464,7 +463,7 @@ static int ata_get_identity(struct ata_port *ap, struct scsi_device *sdev,
  *	RETURNS:
  *	Zero on success, negative errno on error.
  */
-int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg)
+int ata_cmd_ioctl(struct scsi_device *sdev, void __user *arg)
 {
 	int rc = 0;
 	u8 scsi_cmd[MAX_COMMAND_SIZE];
@@ -486,7 +485,8 @@ int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg)
 	memset(scsi_cmd, 0, sizeof(scsi_cmd));
 
 	if (args[3]) {
-		argsize = SECTOR_SIZE * args[3];
+		unsigned sect_size = ata_sect_size(args[0], sdev->sector_size);
+		argsize = sect_size * args[3];
 		argbuf = kmalloc(argsize, GFP_KERNEL);
 		if (argbuf == NULL) {
 			rc = -ENOMEM;
@@ -518,7 +518,7 @@ int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg)
 
 	/* Good values for timeout and retries?  Values below
 	   from scsi_ioctl_send_command() for default case... */
-	cmd_result = scsi_execute(scsidev, scsi_cmd, data_dir, argbuf, argsize,
+	cmd_result = scsi_execute(sdev, scsi_cmd, data_dir, argbuf, argsize,
 				  sensebuf, (10*HZ), 5, 0, NULL);
 
 	if (driver_byte(cmd_result) == DRIVER_SENSE) {/* sense data available */
@@ -1630,6 +1630,7 @@ nothing_to_do:
 static unsigned int ata_scsi_rw_xlat(struct ata_queued_cmd *qc)
 {
 	struct scsi_cmnd *scmd = qc->scsicmd;
+	struct ata_device *dev = qc->dev;
 	const u8 *cdb = scmd->cmnd;
 	unsigned int tf_flags = 0;
 	u64 block;
@@ -1686,9 +1687,9 @@ static unsigned int ata_scsi_rw_xlat(struct ata_queued_cmd *qc)
 		goto nothing_to_do;
 
 	qc->flags |= ATA_QCFLAG_IO;
-	qc->nbytes = n_block * ATA_SECT_SIZE;
+	qc->nbytes = n_block * dev->sect_size;
 
-	rc = ata_build_rw_tf(&qc->tf, qc->dev, block, n_block, tf_flags,
+	rc = ata_build_rw_tf(&qc->tf, dev, block, n_block, tf_flags,
 			     qc->tag);
 	if (likely(rc == 0))
 		return 0;
@@ -2354,10 +2355,25 @@ saving_not_supp:
  */
 static unsigned int ata_scsiop_read_cap(struct ata_scsi_args *args, u8 *rbuf)
 {
-	u64 last_lba = args->dev->n_sectors - 1; /* LBA of the last block */
+	struct ata_device *dev = args->dev;
+	u64 last_lba = dev->n_sectors - 1; /* LBA of the last block */
+	u32 sector_size;
+	u8 log_per_phys = 1;
+	u16 first_sector_offset = 0;
+	u16 word_106 = dev->id[106];
 
 	VPRINTK("ENTER\n");
 
+	if ((word_106 & 0xc000) == 0x4000) {
+		/* Number and offset of logical sectors per physical sector */
+		if (word_106 & (1 << 13))
+			log_per_phys = word_106 & 0xf;
+		if ((dev->id[209] & 0xc000) == 0x4000)
+			first_sector_offset = dev->id[209] & 0x3fff;
+	}
+
+	sector_size = dev->sect_size;
+
 	if (args->cmd->cmnd[0] == READ_CAPACITY) {
 		if (last_lba >= 0xffffffffULL)
 			last_lba = 0xffffffff;
@@ -2368,9 +2384,10 @@ static unsigned int ata_scsiop_read_cap(struct ata_scsi_args *args, u8 *rbuf)
 		rbuf[2] = last_lba >> (8 * 1);
 		rbuf[3] = last_lba;
 
-		/* sector size */
-		rbuf[6] = ATA_SECT_SIZE >> 8;
-		rbuf[7] = ATA_SECT_SIZE & 0xff;
+		rbuf[4] = sector_size >> (8 * 3);
+		rbuf[5] = sector_size >> (8 * 2);
+		rbuf[6] = sector_size >> (8 * 1);
+		rbuf[7] = sector_size;
 	} else {
 		/* sector count, 64-bit */
 		rbuf[0] = last_lba >> (8 * 7);
@@ -2383,8 +2400,15 @@ static unsigned int ata_scsiop_read_cap(struct ata_scsi_args *args, u8 *rbuf)
 		rbuf[7] = last_lba;
 
 		/* sector size */
-		rbuf[10] = ATA_SECT_SIZE >> 8;
-		rbuf[11] = ATA_SECT_SIZE & 0xff;
+		rbuf[8] = sector_size >> (8 * 3);
+		rbuf[9] = sector_size >> (8 * 2);
+		rbuf[10] = sector_size >> (8 * 1);
+		rbuf[11] = sector_size;
+
+		rbuf[12] = 0;
+		rbuf[13] = log_per_phys;
+		rbuf[14] = first_sector_offset >> 8;
+		rbuf[15] = first_sector_offset;
 	}
 
 	return 0;
@@ -2858,7 +2882,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
 	}
 
 	/* READ/WRITE LONG use a non-standard sect_size */
-	qc->sect_size = ATA_SECT_SIZE;
+	qc->sect_size = ata_sect_size(tf->command, dev->sect_size);
 	switch (tf->command) {
 	case ATA_CMD_READ_LONG:
 	case ATA_CMD_READ_LONG_ONCE:
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 5d87bc0..e0d17ec 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -583,6 +583,7 @@ struct ata_device {
 #endif
 	/* n_sector is CLEAR_BEGIN, read comment above CLEAR_BEGIN */
 	u64			n_sectors;	/* size of device, if ATA */
+	u64			sect_size;	/* Logical, not physical */
 	unsigned int		class;		/* ATA_DEV_xxx */
 	unsigned long		unpark_deadline;
 
@@ -781,6 +782,7 @@ struct ata_port_operations {
 	unsigned int (*read_id)(struct ata_device *dev, struct ata_taskfile *tf, u16 *id);
 
 	void (*dev_config)(struct ata_device *dev);
+	bool (*sector_size_supported)(struct ata_device *dev);
 
 	void (*freeze)(struct ata_port *ap);
 	void (*thaw)(struct ata_port *ap);
@@ -988,6 +990,7 @@ extern unsigned int ata_do_dev_read_id(struct ata_device *dev,
 					struct ata_taskfile *tf, u16 *id);
 extern void ata_qc_complete(struct ata_queued_cmd *qc);
 extern int ata_qc_complete_multiple(struct ata_port *ap, u32 qc_active);
+extern unsigned ata_sect_size(u8 command, unsigned dev_sect_size);
 extern void ata_scsi_simulate(struct ata_device *dev, struct scsi_cmnd *cmd,
 			      void (*done)(struct scsi_cmnd *));
 extern int ata_std_bios_param(struct scsi_device *sdev,
-- 
1.5.6.5
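
For readers who want to try the IDENTIFY DEVICE decoding outside the kernel, here is a standalone sketch that mirrors the logic of ata_id_sect_size() from the patch above. It assumes a userspace copy of the 256-word IDENTIFY buffer; `id_logical_sect_size` and `SECT_SIZE_DEFAULT` are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define SECT_SIZE_DEFAULT 512u	/* stands in for ATA_SECT_SIZE */

/* Logical sector size in bytes from a 256-word IDENTIFY DEVICE buffer. */
static uint64_t id_logical_sect_size(const uint16_t *id)
{
	uint16_t word_106;

	assert(id);
	word_106 = id[106];

	/* Word 106 is valid only when bit 15 is 0 and bit 14 is 1. */
	if ((word_106 & 0xc000) != 0x4000)
		return SECT_SIZE_DEFAULT;
	/* Bit 12 clear: logical sectors are the default 256 words long. */
	if (!(word_106 & (1u << 12)))
		return SECT_SIZE_DEFAULT;
	/* Words 117-118 give the size in 16-bit words; convert to bytes. */
	return ((uint64_t)id[117] | ((uint64_t)id[118] << 16)) * 2;
}
```

A buffer with word 106 set to 0x5000 and word 117 set to 2048 would decode to 4096-byte logical sectors; an all-zero buffer falls back to 512.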
* Re: ATA support for 4k sector size
From: H. Peter Anvin @ 2009-02-25 22:53 UTC
To: Matthew Wilcox; +Cc: linux-ide, linux-kernel

Matthew Wilcox wrote:
> The two patches following this add support for drives which have sector
> sizes other than 512 bytes.  I haven't been able to test this as I don't
> have the hardware.
>
> Individual host drivers will have to be updated to support sizes other
> than 512 bytes.
>
> Support for logical sector sizes that differ from physical sector sizes
> depends on the READ CAPACITY 16 patch I posted in December that isn't in
> scsi-misc yet.
>
> The approach I've taken to generating the tables of which commands need
> a 512-byte transfer size and which use the drive's sector size is
> 'innovative'.  Review is encouraged ;-)

What sector size do we report to user space for this?  I'm asking
because logical sector size is visible in most partition formats.

	-hpa
* Re: ATA support for 4k sector size
From: Martin K. Petersen @ 2009-02-25 23:27 UTC
To: H. Peter Anvin; +Cc: Matthew Wilcox, linux-ide, linux-kernel, sandeen

>>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes:

>> The two patches following this add support for drives which have
>> sector sizes other than 512 bytes.  I haven't been able to test this
>> as I don't have the hardware.

hpa> What sector size do we report to user space for this?  I'm asking
hpa> because logical sector size is visible in most partition formats.

There are several flavors of drives we have to deal with:

  512-byte logical / 512-byte hardware   (current)
  512-byte logical / 4096-byte hardware  (ATA, doing read-modify-write)
  4096-byte logical / 4096-byte hardware (SCSI initially, ATA later)

Because of 63-sector legacy problems a bunch of ATA vendors will
initially ship 512/4096 drives that are not naturally aligned.
I.e. logical sector 63 will be aligned on a 4KB hardware sector
boundary to overcome the misaligned default partitioning.

I have been working on some alignment patches the last week.  They hook
into the stuff Matthew has been doing in libata and I'll post them
shortly.  For each block device you'll get a hardware sector size
exposed as well as whether the device (partition) is naturally aligned
or not.  This works for both ATA and SCSI devices.

I'll defer to people like yourself for how this needs to work wrt. boot
loaders and creating partition tables.  I'm CC:ing Eric Sandeen because
he's also looking at this...

-- 
Martin K. Petersen	Oracle Linux Engineering
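
The alignment cases described above can be made concrete with a small sketch. It assumes a drive with a fixed number of logical sectors per hardware sector and models the shifted layout by naming one logical sector that is known to start a hardware sector; `lba_is_aligned` and its parameters are hypothetical, and this is not the encoding ATA uses in IDENTIFY word 209.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if logical sector lba starts on a hardware sector boundary.
 * log_per_phys:  logical sectors per hardware sector (8 for 512/4096).
 * first_aligned: any logical sector known to start a hardware sector
 *                (0 for a naturally aligned drive, 63 for the shifted
 *                layout described in the message above).
 */
static int lba_is_aligned(uint64_t lba, unsigned log_per_phys,
			  uint64_t first_aligned)
{
	assert(log_per_phys != 0);
	return (lba % log_per_phys) == (first_aligned % log_per_phys);
}
```

On a shifted 512/4096 drive a partition starting at sector 63 is aligned and one starting at sector 2048 is not; on a naturally aligned drive the opposite holds.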
* Re: ATA support for 4k sector size
From: H. Peter Anvin @ 2009-02-25 23:33 UTC
To: Martin K. Petersen; +Cc: Matthew Wilcox, linux-ide, linux-kernel, sandeen

Martin K. Petersen wrote:
>
> I'll defer to people like yourself for how this needs to work wrt. boot
> loaders and creating partition tables.  I'm CC:ing Eric Sandeen because
> he's also looking at this...
>

I wish it was left to people like myself.  Realistically, when it comes
to disks with 4096-byte logical sectors it is going to matter how the
firmware chooses to expose it.

Most likely, the universe will explode at this time, since very few
bootloaders can deal with a sector size other than 512 bytes, and
virtually every partition table format contains a sector size
dependency, which also means you'll break any mechanical imaging
solution.

It's going to hurt :(

	-hpa
* Re: ATA support for 4k sector size
From: Martin K. Petersen @ 2009-02-25 23:51 UTC
To: H. Peter Anvin
Cc: Martin K. Petersen, Matthew Wilcox, linux-ide, linux-kernel, sandeen

>>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes:

hpa> I wish it was left to people like myself.  Realistically, when it
hpa> comes to disks with 4096-byte logical sectors it is going to matter
hpa> how the firmware chooses to expose it.

Well.  In the short term ATA is going to emulate 512 (at a penalty for
misaligned I/O).

SCSI is switching to 4KB wholesale.  I have a 4KB/4KB drive here that I
have had fun booting from the last few days.

hpa> Most likely, the universe will explode at this time, since very few
hpa> bootloaders can deal with a sector size other than 512 bytes, and
hpa> virtually every partition table format contains a sector size
hpa> dependency, which also means you'll break any mechanical imaging
hpa> solution.

hpa> It's going to hurt :(

Yep, it sucks :|

-- 
Martin K. Petersen	Oracle Linux Engineering
* Re: ATA support for 4k sector size
From: Karel Zak @ 2009-02-26 12:43 UTC
To: H. Peter Anvin
Cc: Martin K. Petersen, Matthew Wilcox, linux-ide, linux-kernel, sandeen

On Wed, Feb 25, 2009 at 03:33:46PM -0800, H. Peter Anvin wrote:
> virtually every partition table format contains a sector size
> dependency

 Is that true of "every partition table format"?  I see in
 fs/partitions/ that only ibm.c and msdos.c care about the sector size.
 It seems that the other formats are based on 512-byte sectors only.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
* Re: ATA support for 4k sector size
From: H. Peter Anvin @ 2009-02-26 15:17 UTC
To: Karel Zak
Cc: Martin K. Petersen, Matthew Wilcox, linux-ide, linux-kernel, sandeen

Karel Zak wrote:
> On Wed, Feb 25, 2009 at 03:33:46PM -0800, H. Peter Anvin wrote:
>> virtually every partition table format contains a sector size
>> dependency
>
>  Is that true of "every partition table format"?  I see in
>  fs/partitions/ that only ibm.c and msdos.c care about the sector size.
>  It seems that the other formats are based on 512-byte sectors only.
>

And do they do so correctly?  Looking at the UEFI spec for one, the GPT
partition table format definitely has a logical sector size dependency.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
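
One way to see the GPT dependency mentioned above: the UEFI specification places the primary GPT header at LBA 1, so its byte offset on disk moves with the logical sector size. The sketch below only illustrates that arithmetic; `gpt_header_byte_offset` is a hypothetical helper, not code from any partition parser.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of the primary GPT header, which lives at LBA 1. */
static uint64_t gpt_header_byte_offset(uint32_t logical_sect_size)
{
	assert(logical_sect_size != 0);
	return (uint64_t)1 * logical_sect_size;
}
```

A parser that hard-codes byte offset 512 would therefore look in the wrong place on a disk with 4096-byte logical sectors, where the header sits at byte offset 4096.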
* Re: ATA support for 4k sector size
From: H. Peter Anvin @ 2009-02-25 23:42 UTC
To: Martin K. Petersen; +Cc: Matthew Wilcox, linux-ide, linux-kernel, sandeen

Martin K. Petersen wrote:
>
> Because of 63-sector legacy problems a bunch of ATA vendors will
> initially ship 512/4096 drives that are not naturally aligned.
> I.e. logical sector 63 will be aligned on a 4KB hardware sector
> boundary to overcome the misaligned default partitioning.
>

I was under the impression Vista didn't do this?  Is there any way to
force these drives into a sane mode (at the expense of a total data
loss)?

	-hpa
* Re: ATA support for 4k sector size
From: Martin K. Petersen @ 2009-02-25 23:55 UTC
To: H. Peter Anvin
Cc: Martin K. Petersen, Matthew Wilcox, linux-ide, linux-kernel, sandeen

>>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes:

>> Because of 63-sector legacy problems a bunch of ATA vendors will
>> initially ship 512/4096 drives that are not naturally aligned.
>> I.e. logical sector 63 will be aligned on a 4KB hardware sector
>> boundary to overcome the misaligned default partitioning.
>>

hpa> I was under the impression Vista didn't do this?  Is there any way
hpa> to force these drives into a sane mode (at the expense of a total
hpa> data loss)?

Modern Windows aligns the first partition on a 1 MB boundary.

As far as disks go, initially the plan was to have "legacy" branded
drives with 63-sector alignment.  But I think that has been abandoned in
favor of instant one-time formatting.  I.e. you can pick your poison
*once* and that formatting will be done in constant time.  Any
subsequent changes to blocking and alignment will require a real
low-level format.

That's all fine and dandy if you go down and buy a drive at Fry's and
you're the first to use it.  But we don't have that luxury on systems
that come preinstalled with Windows.  In that case we have to deal with
whatever the OEM decided during manufacturing.

IOW, we have to deal with all the possible configurations.

-- 
Martin K. Petersen	Oracle Linux Engineering
* Re: ATA support for 4k sector size 2009-02-25 23:55 ` Martin K. Petersen @ 2009-02-25 23:57 ` H. Peter Anvin 2009-02-26 0:07 ` Martin K. Petersen 0 siblings, 1 reply; 38+ messages in thread From: H. Peter Anvin @ 2009-02-25 23:57 UTC (permalink / raw) To: Martin K. Petersen; +Cc: Matthew Wilcox, linux-ide, linux-kernel, sandeen Martin K. Petersen wrote: > > Modern Windows aligns the first partition on a 1 MB boundary. > > As far as disks go, initially the plan was to have "legacy" branded > drives with 63-sector alignment. But I think that has been abandoned in > favor of instant one-time formatting. I.e. you can pick your poison > *once* and that formatting will be done in constant time. Any > subsequent changes to blocking and alignment will require a real > low-level format. > Why one-time? -hpa ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-25 23:57 ` H. Peter Anvin @ 2009-02-26 0:07 ` Martin K. Petersen 2009-02-26 0:10 ` H. Peter Anvin 0 siblings, 1 reply; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 0:07 UTC (permalink / raw) To: H. Peter Anvin Cc: Martin K. Petersen, Matthew Wilcox, linux-ide, linux-kernel, sandeen >>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes: hpa> Martin K. Petersen wrote: >> >> Modern Windows aligns the first partition on a 1 MB boundary. >> >> As far as disks go, initially the plan was to have "legacy" branded >> drives with 63-sector alignment. But I think that has been abandoned >> in favor of instant one-time formatting. I.e. you can pick your >> poison *once* and that formatting will be done in constant time. Any >> subsequent changes to blocking and alignment will require a real >> low-level format. >> hpa> Why one-time? It's a compromise to avoid hours of low-level formatting before you can use a drive. New drives will come from the factory formatted in a special way that can be switched instantaneously. But once you start writing you're stuck with it. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 0:07 ` Martin K. Petersen @ 2009-02-26 0:10 ` H. Peter Anvin 2009-02-26 0:17 ` Martin K. Petersen 0 siblings, 1 reply; 38+ messages in thread From: H. Peter Anvin @ 2009-02-26 0:10 UTC (permalink / raw) To: Martin K. Petersen; +Cc: Matthew Wilcox, linux-ide, linux-kernel, sandeen Martin K. Petersen wrote: > > hpa> Why one-time? > > It's a compromise to avoid hours of low-level formatting before you can > use a drive. New drives will come from the factory formatted in a > special way that can be switched instantaneously. But once you start > writing you're stuck with it. > That's ridiculously stupid. All you need is an offset parameter, which can be flipped at will. Now, flipping it will obviously scramble all data, but there is absolutely no reason why it shouldn't be possible to instantaneously flip it back and forth. -hpa ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 0:10 ` H. Peter Anvin @ 2009-02-26 0:17 ` Martin K. Petersen 0 siblings, 0 replies; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 0:17 UTC (permalink / raw) To: H. Peter Anvin Cc: Martin K. Petersen, Matthew Wilcox, linux-ide, linux-kernel, sandeen >>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes: hpa> That's ridiculously stupid. All you need is an offset parameter, hpa> which can be flipped at will. Now, flipping it will obviously hpa> scramble all data, but there is absolutely no reason why it hpa> shouldn't be possible to instantaneously flip it back and forth. I'm just quoting what I was told a few months ago. I'm not participating in IDEMA and for all I know that approach could have changed again. I'll ask one of my contacts to find out what the current status is... -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-25 23:27 ` Martin K. Petersen 2009-02-25 23:33 ` H. Peter Anvin 2009-02-25 23:42 ` H. Peter Anvin @ 2009-02-25 23:49 ` david 2009-02-26 0:04 ` Martin K. Petersen 2009-02-26 2:50 ` Theodore Tso 3 siblings, 1 reply; 38+ messages in thread From: david @ 2009-02-25 23:49 UTC (permalink / raw) To: Martin K. Petersen Cc: H. Peter Anvin, Matthew Wilcox, linux-ide, linux-kernel, sandeen On Wed, 25 Feb 2009, Martin K. Petersen wrote: >>>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes: > >>> The two patches following this add support for drives which have >>> sector sizes other than 512 bytes. I haven't been able to test this >>> as I don't have the hardware. > > hpa> What sector size do we report to user space for this? I'm asking > hpa> because logical sector size is visible in most partition formats. > > There are several flavors of drives we have to deal with: > > 512-byte logical / 512-byte hardware (current) > 512-byte logical / 4096-byte hardware (ATA, doing read-modify-write) > 4096-byte logical / 4096-byte hardware (SCSI initially, ATA later) add to this good support for SSDs ?? logical / 128K hardware or similar. David Lang > Because of 63-sector legacy problems a bunch of ATA vendors will > initially ship 512/4096 drives that are not naturally aligned. > I.e. logical sector 63 will be aligned on a 4KB hardware sector > boundary to overcome the misaligned default partitioning. > > I have been working on some alignment patches the last week. They hook > into the stuff Matthew has been doing in libata and I'll post them > shortly. > > For each block device you'll get a hardware sector size exposed as well > as whether the device (partition) is naturally aligned or not. This > works for both ATA and SCSI devices. > > I'll defer to people like yourself for how this needs to work wrt. boot > loaders and creating partition tables. I'm CC:ing Eric Sandeen because > he's also looking at this... 
> > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-25 23:49 ` david @ 2009-02-26 0:04 ` Martin K. Petersen 2009-02-26 0:13 ` david 0 siblings, 1 reply; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 0:04 UTC (permalink / raw) To: david Cc: Martin K. Petersen, H. Peter Anvin, Matthew Wilcox, linux-ide, linux-kernel, sandeen >>>>> "david" == david <david@lang.hm> writes: >> 512-byte logical / 512-byte hardware (current) 512-byte logical / >> 4096-byte hardware (ATA, doing read-modify-write) 4096-byte logical / >> 4096-byte hardware (SCSI initially, ATA later) david> add to this good support for SSDs david> ?? logical / 128K hardware david> or similar. Yep. And that goes for RAID arrays too. For SCSI there are some knobs we can query to get this information and my alignment changes are using those (and they are in turn what Willy's stuff hooks into). I've been lobbying the SSD vendors whose architecture is prone to misalignment problems to propose a similar set of knobs for ATA. But so far it's just been a lot of talking. My topology changes are a bit abstract in the sense that they expose: - smallest I/O you can submit without incurring a penalty (hw sector, raid chunk size) - optimal I/O size for the device in question - biggest I/O you can submit without incurring a penalty - alignment We can use these parameters to lay out partitions and filesystems optimally. Just like we currently do with XFS but implemented in a more generic way that all filesystems can use. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 0:04 ` Martin K. Petersen @ 2009-02-26 0:13 ` david 2009-02-26 0:20 ` Martin K. Petersen 0 siblings, 1 reply; 38+ messages in thread From: david @ 2009-02-26 0:13 UTC (permalink / raw) To: Martin K. Petersen Cc: H. Peter Anvin, Matthew Wilcox, linux-ide, linux-kernel, sandeen On Wed, 25 Feb 2009, Martin K. Petersen wrote: >>>>>> "david" == david <david@lang.hm> writes: > >>> 512-byte logical / 512-byte hardware (current) 512-byte logical / >>> 4096-byte hardware (ATA, doing read-modify-write) 4096-byte logical / >>> 4096-byte hardware (SCSI initially, ATA later) > > david> add to this good support for SSDs > > david> ?? logical / 128K hardware > > david> or similar. > > Yep. And that goes for RAID arrays too. > > For SCSI there some knobs we can query to get this information and my > alignment changes are using those (and they are in turn what Willy's > stuff hooks into). > > I've been lobbying the SSD vendors whose architecture is prone to > misalignment problems to propose a similar set of knobs for ATA. But so > far it's just been a lot of talking. even if we can't get them to give us any info from the drive directly, we still want to allow the sysadmin to configure the use of the systems when they can find the info in other ways. > My topology changes are a bit abstract in the sense that they expose: > > - smallest I/O you can submit without incurring a penalty (hw sector, > raid chunk size) > > - optimal I/O size for the device in question > > - biggest I/O you can submit without incurring a penalty > > - alignment > > We can use these parameters to lay out partitions and filesystems > optimally. Just like we currently do with XFS but implemented in a more > generic way that all filesystems can use. if you have the smallest and largest I/O you can submit without a penalty and the alignment, isn't the optimal I/O size everything between these two? 
(or at least everything between these two may be close enough that defining an 'optimal' size may not be worthwhile) David Lang ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 0:13 ` david @ 2009-02-26 0:20 ` Martin K. Petersen 0 siblings, 0 replies; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 0:20 UTC (permalink / raw) To: david Cc: Martin K. Petersen, H. Peter Anvin, Matthew Wilcox, linux-ide, linux-kernel, sandeen >>>>> "david" == david <david@lang.hm> writes: david> even if we can't get them to give us any info from the drive david> directly, we still want to allow the sysadmin to configure the david> use of the systems when they can find the info in other ways. You can specify RAID parameters on the mkfs.xfs command line today. mke[234]fs have similar knobs. david> if you have the smallest and largest I/O you can submit without a david> penalty and the alignment, isn't the optimal I/O size everything david> between these two? (or at least everything between these two may david> be close enough that defining an 'optimal' size may not be david> worthwhile) These parameters come straight out of SCSI. I described them wrong. Maximum is the biggest I/O the device can handle, full stop. Optimal is the preferred I/O size for the array. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
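To make the topology parameters from this sub-thread a little more concrete, here is a minimal C sketch of how a tool might test whether an I/O avoids a read-modify-write penalty. The struct `io_topology`, its field names, and `io_is_penalty_free()` are invented for this illustration; they are not taken from the patches being discussed, and the real interface may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the values discussed above (all in bytes).
 * Field names are illustrative only, not an existing kernel structure. */
struct io_topology {
	uint32_t min_io;    /* smallest I/O without a penalty (hw sector, RAID chunk) */
	uint32_t opt_io;    /* preferred I/O size for the device, 0 if unknown */
	uint32_t align_off; /* byte offset of the first naturally aligned position */
};

/* Returns true if an I/O of 'len' bytes starting at byte 'start' begins on a
 * naturally aligned boundary and covers whole min_io units. */
bool io_is_penalty_free(const struct io_topology *t, uint64_t start, uint64_t len)
{
	assert(t->min_io != 0);

	if (len == 0 || start < t->align_off)
		return false;

	return (start - t->align_off) % t->min_io == 0 &&
	       len % t->min_io == 0;
}
```

For a 512/4096 drive whose lowest aligned LBA is 7, `align_off` would be 7 * 512 = 3584, so a 4096-byte write starting at LBA 63 (byte 32256) passes the check while the same write at LBA 0 does not.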
* Re: ATA support for 4k sector size 2009-02-25 23:27 ` Martin K. Petersen ` (2 preceding siblings ...) 2009-02-25 23:49 ` david @ 2009-02-26 2:50 ` Theodore Tso 2009-02-26 3:05 ` Martin K. Petersen 2009-02-26 3:07 ` Matthew Wilcox 3 siblings, 2 replies; 38+ messages in thread From: Theodore Tso @ 2009-02-26 2:50 UTC (permalink / raw) To: Martin K. Petersen Cc: H. Peter Anvin, Matthew Wilcox, linux-ide, linux-kernel, sandeen On Wed, Feb 25, 2009 at 06:27:18PM -0500, Martin K. Petersen wrote: > > Because of 63-sector legacy problems a bunch of ATA vendors will > initially ship 512/4096 drives that are not naturally aligned. > I.e. logical sector 63 will be aligned on a 4KB hardware sector > boundary to overcome the misaligned default partitioning. > Are we *sure* that this is what they plan to be doing? Is there a way we can query the hardware to find out for sure what drives are doing what? I'll note that Vista starts all new partitions at the 1MB boundary, so its filesystems will be naturally aligned. As I mentioned in a recent blog entry: http://thunk.org/tytso/blog/2009/02/20/aligning-filesystems-to-an-ssds-erase-block-size/ ... this is one place Vista is ahead of Linux. So while Microsoft's market share has slipped, 85% of all new x86 machines still have some variant of Redmond-spawn installed on them, and with the advent of Windows 7 coming soon, and Microsoft making it harder and harder for vendors to ship machines upgraded to Windows XP, it seems... surprising... that new disks meant for Windows Vista or Windows 7 systems would be misaligned to start at logical sector 63. - Ted ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 2:50 ` Theodore Tso @ 2009-02-26 3:05 ` Martin K. Petersen 2009-02-26 3:07 ` Matthew Wilcox 1 sibling, 0 replies; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 3:05 UTC (permalink / raw) To: Theodore Tso Cc: Martin K. Petersen, H. Peter Anvin, Matthew Wilcox, linux-ide, linux-kernel, sandeen >>>>> "Ted" == Theodore Tso <tytso@mit.edu> writes: >> Because of 63-sector legacy problems a bunch of ATA vendors will >> initially ship 512/4096 drives that are not naturally aligned. >> I.e. logical sector 63 will be aligned on a 4KB hardware sector >> boundary to overcome the misaligned default partitioning. >> Ted> Are we *sure* that this is what they plan to be doing? I have asked my contacts to verify. That was the plan as of last October. Ted> Is there a way we can query the hardware to find out for sure what Ted> drives are doing what? Yep. And with my patches the appropriate alignment is exposed in sysfs for each block device. Hardware workarounds for DOS partition table brain damage have existed for a long time. On a lot of RAID arrays you have to pick a LUN personality. And by choosing DOS/Windows/Linux you often align sector 63 to the internal RAID chunk size. I'm working with several RAID vendors to make sure they set the right alignment offset when they do that. That isn't currently the case. The alignment knobs only recently made their appearance in the spec to accommodate the 4KB sector transition. Ted> I'll note that Vista starts all new partitions at the 1MB boundary, Ted> so its filesystems will be naturally aligned. Maybe. See above. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 2:50 ` Theodore Tso 2009-02-26 3:05 ` Martin K. Petersen @ 2009-02-26 3:07 ` Matthew Wilcox 2009-02-26 3:23 ` Martin K. Petersen 2009-02-26 5:16 ` Martin K. Petersen 1 sibling, 2 replies; 38+ messages in thread From: Matthew Wilcox @ 2009-02-26 3:07 UTC (permalink / raw) To: Theodore Tso, Martin K. Petersen, H. Peter Anvin, linux-ide, linux-kernel, sandeen On Wed, Feb 25, 2009 at 09:50:43PM -0500, Theodore Tso wrote: > On Wed, Feb 25, 2009 at 06:27:18PM -0500, Martin K. Petersen wrote: > > Because of 63-sector legacy problems a bunch of ATA vendors will > > initially ship 512/4096 drives that are not naturally aligned. > > I.e. logical sector 63 will be aligned on a 4KB hardware sector > > boundary to overcome the misaligned default partitioning. > > Are we *sure* that this is what they plan to be doing? Is there a way > we can query the hardware to find out for sure what drives are doing > what? The drive I have that's pretending to be a 512/4k drive reports this: $ sudo sg_readcap -l /dev/sdc Read Capacity results: Protection: prot_en=0, p_type=0 Last logical block address=625142447 (0x2542eaaf), Number of logical blocks=625142448 Logical block length=512 bytes Logical blocks per physical block=3 (log base 2) [actual=8] Lowest aligned logical block address=0 Hence: Device size: 320072933376 bytes, 305245.3 MiB, 320.07 GB This disagrees with Martin's assertion. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 38+ messages in thread
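The figures in the sg_readcap output above can be re-derived with a few lines of C. This is only a worked-arithmetic sketch; the helper names are made up for the example and do not come from sg3_utils or the kernel.

```c
#include <stdint.h>

/* READ CAPACITY(16) reports logical blocks per physical block as a
 * power-of-two exponent; 3 means 2^3 = 8 logical blocks. */
uint32_t logical_per_physical(unsigned int exponent)
{
	return (uint32_t)1 << exponent;
}

/* Physical block size in bytes: 8 * 512 = 4096 for the drive above. */
uint32_t physical_block_bytes(unsigned int exponent, uint32_t logical_block_len)
{
	return logical_per_physical(exponent) * logical_block_len;
}

/* Device size in bytes: number of logical blocks times logical block length. */
uint64_t device_size_bytes(uint64_t num_logical_blocks, uint32_t logical_block_len)
{
	return num_logical_blocks * (uint64_t)logical_block_len;
}
```

With the reported values, `device_size_bytes(625142448, 512)` gives 320072933376 bytes, matching the "Device size" line, and `physical_block_bytes(3, 512)` gives the 4096-byte physical sector.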
* Re: ATA support for 4k sector size 2009-02-26 3:07 ` Matthew Wilcox @ 2009-02-26 3:23 ` Martin K. Petersen 2009-12-11 7:05 ` James Andrewartha 2009-02-26 5:16 ` Martin K. Petersen 1 sibling, 1 reply; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 3:23 UTC (permalink / raw) To: Matthew Wilcox Cc: Theodore Tso, Martin K. Petersen, H. Peter Anvin, linux-ide, linux-kernel, sandeen >>>>> "Matthew" == Matthew Wilcox <matthew@wil.cx> writes: Matthew> Lowest aligned logical block address=0 Matthew> This disagrees with Martin's assertion. The original roadmap was to transition to 4KB sectors in 2006, coinciding with the Vista release. Given how long this has taken (we're now talking ~2011 for GA) it may very well be that the alignment knobs will be unused because everybody will be using Vista or 7 by then. That doesn't change the RAID array alignment problem, however. And we need to prepare our partitioning tools to align correctly regardless. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 3:23 ` Martin K. Petersen @ 2009-12-11 7:05 ` James Andrewartha 2009-12-11 7:26 ` H. Peter Anvin 2009-12-11 7:32 ` Martin K. Petersen 0 siblings, 2 replies; 38+ messages in thread From: James Andrewartha @ 2009-12-11 7:05 UTC (permalink / raw) To: Martin K. Petersen Cc: Matthew Wilcox, Theodore Tso, H. Peter Anvin, linux-ide, linux-kernel, sandeen Martin K. Petersen wrote: >>>>>> "Matthew" == Matthew Wilcox <matthew@wil.cx> writes: > > Matthew> Lowest aligned logical block address=0 > > Matthew> This disagrees with Martin's assertion. > > The original roadmap was to transition to 4KB sectors in 2006, > coinciding with the Vista release. > > Given how long this has taken (we're now talking ~2011 for GA) it may > very well be that the alignment knobs will be unused because everybody > will be using Vista or 7 by then. The plan does seem to have changed - WD's shipping 4k sector drives that are sane and require a drive jumper or tool for WinXP and old cloning utilities. http://techreport.com/discussions.x/18115 http://wdc.com/en/products/advancedformat/ http://www.wdc.com/wdproducts/library/whitepapers/en/2579-771430-A00.pdf -- James Andrewartha ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-12-11 7:05 ` James Andrewartha @ 2009-12-11 7:26 ` H. Peter Anvin 2009-12-11 7:32 ` Martin K. Petersen 1 sibling, 0 replies; 38+ messages in thread From: H. Peter Anvin @ 2009-12-11 7:26 UTC (permalink / raw) To: James Andrewartha Cc: Martin K. Petersen, Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen On 12/10/2009 11:05 PM, James Andrewartha wrote: > Martin K. Petersen wrote: >>>>>>> "Matthew" == Matthew Wilcox <matthew@wil.cx> writes: >> >> Matthew> Lowest aligned logical block address=0 >> >> Matthew> This disagrees with Martin's assertion. >> >> The original roadmap was to transition to 4KB sectors in 2006, >> coinciding with the Vista release. >> >> Given how long this has taken (we're now talking ~2011 for GA) it may >> very well be that the alignment knobs will be unused because everybody >> will be using Vista or 7 by then. > > The plan does seem to have changed - WD's shipping 4k sector drives that > are sane and require a drive jumper or tool for WinXP and old cloning > utilities. > > http://techreport.com/discussions.x/18115 > http://wdc.com/en/products/advancedformat/ > http://www.wdc.com/wdproducts/library/whitepapers/en/2579-771430-A00.pdf > Even more important, the alignment can be changed as opposed to being fixed for the manufacture of the disk. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-12-11 7:05 ` James Andrewartha 2009-12-11 7:26 ` H. Peter Anvin @ 2009-12-11 7:32 ` Martin K. Petersen 1 sibling, 0 replies; 38+ messages in thread From: Martin K. Petersen @ 2009-12-11 7:32 UTC (permalink / raw) To: James Andrewartha Cc: Martin K. Petersen, Matthew Wilcox, Theodore Tso, H. Peter Anvin, linux-ide, linux-kernel, sandeen >>>>> "James" == James Andrewartha <jamesa@daa.com.au> writes: >> Given how long this has taken (we're now talking ~2011 for GA) it may >> very well be that the alignment knobs will be unused because >> everybody will be using Vista or 7 by then. James> The plan does seem to have changed - WD's shipping 4k sector James> drives that are sane and require a drive jumper or tool for WinXP James> and old cloning utilities. The disk vendors have agreed to transition no later than 2011. There are definitely drives coming out with 4KB physical blocks before then. There has been a lot of discussion of the merits of shipping 1-aligned drives by default given that Vista and Windows 7 handle alignment correctly. I don't think there have been any firm decisions in IDEMA wrt. 0 vs. 1-aligned. It may be up to each vendor's discretion, target market segment, etc. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 3:07 ` Matthew Wilcox 2009-02-26 3:23 ` Martin K. Petersen @ 2009-02-26 5:16 ` Martin K. Petersen 2009-02-26 12:36 ` Matthew Wilcox 2009-02-26 15:32 ` H. Peter Anvin 1 sibling, 2 replies; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 5:16 UTC (permalink / raw) To: Matthew Wilcox Cc: Theodore Tso, Martin K. Petersen, H. Peter Anvin, linux-ide, linux-kernel, sandeen >>>>> "Matthew" == Matthew Wilcox <matthew@wil.cx> writes: Matthew> Lowest aligned logical block address=0 Matthew> This disagrees with Martin's assertion. Quick answer from one of my contacts. Desktop drives will indeed ship with an alignment of 1(*). The alignment is hardwired at time of manufacture and can't be changed. (*) I had to go back and reread the ATA spec to grok this. READ CAPACITY(16) indicates the lowest naturally aligned LBA. With LBA 63 offset in play that would be LBA 7. ATA, on the other hand, indicates how much LBA 0 is offset from the beginning of the first physical sector. If LBA 63 is naturally aligned that means that LBA 0 is offset 512 bytes (physical sector 0 starts at LBA -1 if you will). Hence IDENTIFY DEVICE word 209 will contain 0x4001. So you need to tweak your RC16 response a bit... -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
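The 0x4001 value mentioned above can be picked apart with a small C sketch. The bit layout assumed here (bit 15 clear and bit 14 set marking the word as valid, bits 13:0 holding the offset of LBA 0 within the first physical sector) is my reading of ATA8-ACS and is consistent with the 0x4000 and 0x4001 values quoted in this thread; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of IDENTIFY DEVICE word 209:
 *   bit 15 = 0 and bit 14 = 1  -> word contains valid information
 *   bits 13:0                  -> logical sector offset of LBA 0 within
 *                                 the first physical sector */
bool word209_valid(uint16_t word209)
{
	return (word209 & 0xC000) == 0x4000;
}

/* Offset in logical sectors; only meaningful if word209_valid() is true. */
uint16_t word209_offset(uint16_t word209)
{
	return word209 & 0x3FFF;
}
```

Under that assumption 0x4001 decodes to an offset of 1 logical sector (the "1-aligned" case where LBA 63 is naturally aligned), and 0x4000 decodes to an offset of 0.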
* Re: ATA support for 4k sector size 2009-02-26 5:16 ` Martin K. Petersen @ 2009-02-26 12:36 ` Matthew Wilcox 2009-02-26 15:32 ` H. Peter Anvin 1 sibling, 0 replies; 38+ messages in thread From: Matthew Wilcox @ 2009-02-26 12:36 UTC (permalink / raw) To: Martin K. Petersen Cc: Theodore Tso, H. Peter Anvin, linux-ide, linux-kernel, sandeen On Thu, Feb 26, 2009 at 12:16:00AM -0500, Martin K. Petersen wrote: > >>>>> "Matthew" == Matthew Wilcox <matthew@wil.cx> writes: > > Matthew> Lowest aligned logical block address=0 > > Matthew> This disagrees with Martin's assertion. > > Quick answer from one of my contacts. Desktop drives will indeed ship > with an alignment of 1(*). The alignment is hardwired at time of > manufacture and can't be changed. Hm. I'll have to poke my contacts about this drive they've given me then. I just checked (with hdparm --Istdout) and word 209 is 0x4000 with this drive. > (*) I had to go back and reread the ATA spec to grok this. READ > CAPACITY(16) indicates the lowest naturally aligned LBA. With LBA 63 > offset in play that would be LBA 7. > > ATA, on the other hand, indicates how much LBA 0 is offset from the > beginning of the first physical sector. If LBA 63 is naturally aligned > that means that LBA 0 is offset 512 bytes (physical sector 0 starts at > LBA -1 if you will). Hence IDENTIFY DEVICE word 209 will contain 0x4001. > > So you need to tweak your RC16 response a bit... You're right. I think I want something like: ((1 << log_per_phys) - first_sector_offset) % (1 << log_per_phys); If you have 8 logical sectors per physical, and ATA reports 0x4001, SCSI wants to hear 7. The only corner case is ATA reporting 0x4000 and SCSI wanting to hear 0, not 8, hence the % (1 << log_per_phys). -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." 
^ permalink raw reply [flat|nested] 38+ messages in thread
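The conversion Matthew proposes above can be written out as a standalone C function. This is a sketch built around the expression from his message; the function name and the precondition check are additions for the example and are not taken from the actual libata patch.

```c
#include <assert.h>
#include <stdint.h>

/* Convert the ATA view (how far LBA 0 is offset into the first physical
 * sector) into the SCSI READ CAPACITY(16) view (the lowest naturally
 * aligned LBA). Mirrors the expression from the message:
 *   ((1 << log_per_phys) - first_sector_offset) % (1 << log_per_phys)
 */
uint32_t rc16_lowest_aligned_lba(unsigned int log_per_phys,
				 uint32_t first_sector_offset)
{
	uint32_t per_phys = (uint32_t)1 << log_per_phys;

	/* The offset must fall inside one physical sector. */
	assert(first_sector_offset < per_phys);

	/* The modulo maps the "offset 0" case to 0 instead of per_phys. */
	return (per_phys - first_sector_offset) % per_phys;
}
```

With 8 logical sectors per physical sector (`log_per_phys` = 3), an ATA offset of 1 yields 7 and an offset of 0 yields 0, which are the two cases called out in the message.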
* Re: ATA support for 4k sector size 2009-02-26 5:16 ` Martin K. Petersen 2009-02-26 12:36 ` Matthew Wilcox @ 2009-02-26 15:32 ` H. Peter Anvin 2009-02-26 20:35 ` Martin K. Petersen 1 sibling, 1 reply; 38+ messages in thread From: H. Peter Anvin @ 2009-02-26 15:32 UTC (permalink / raw) To: Martin K. Petersen Cc: Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen Martin K. Petersen wrote: > > Quick answer from one of my contacts. Desktop drives will indeed ship > with an alignment of 1(*). The alignment is hardwired at time of > manufacture and can't be changed. > Oh God. This is a disaster. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 15:32 ` H. Peter Anvin @ 2009-02-26 20:35 ` Martin K. Petersen 2009-02-26 21:02 ` H. Peter Anvin 0 siblings, 1 reply; 38+ messages in thread From: Martin K. Petersen @ 2009-02-26 20:35 UTC (permalink / raw) To: H. Peter Anvin Cc: Martin K. Petersen, Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen >>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes: >> Quick answer from one of my contacts. Desktop drives will indeed >> ship with an alignment of 1(*). The alignment is hardwired at time >> of manufacture and can't be changed. >> hpa> Oh God. hpa> This is a disaster. Rationale being that modern Microsoft operating systems know how to interpret the alignment bits. Legacy XP will work without changes thanks to the shifted alignment. And Vista+ will do the right thing to align partition 1 to what the drive reports. Also note that Windows only aligns the first partition. That's something we need to be aware of when setting up dual boot systems. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 20:35 ` Martin K. Petersen @ 2009-02-26 21:02 ` H. Peter Anvin 2009-03-16 14:51 ` Greg Freemyer 0 siblings, 1 reply; 38+ messages in thread From: H. Peter Anvin @ 2009-02-26 21:02 UTC (permalink / raw) To: Martin K. Petersen Cc: Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen Martin K. Petersen wrote: >>>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes: > >>> Quick answer from one of my contacts. Desktop drives will indeed >>> ship with an alignment of 1(*). The alignment is hardwired at time >>> of manufacture and can't be changed. >>> > > hpa> Oh God. > > hpa> This is a disaster. > > Rationale being that modern Microsoft operating systems know how to > interpret the alignment bits. Legacy XP will work without changes > thanks to the shifted alignment. And Vista+ will do the right thing to > align partition 1 to what the drive reports. > > Also note that Windows only aligns the first partition. That's > something we need to be aware of when setting up dual boot systems. > Yeah, but all of this completely breaks the disk image abstraction, which is a very powerful paradigm. -hpa ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size 2009-02-26 21:02 ` H. Peter Anvin @ 2009-03-16 14:51 ` Greg Freemyer 2009-03-16 16:27 ` H. Peter Anvin 2009-03-18 14:33 ` James Bottomley 0 siblings, 2 replies; 38+ messages in thread From: Greg Freemyer @ 2009-03-16 14:51 UTC (permalink / raw) To: H. Peter Anvin Cc: Martin K. Petersen, Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen On Thu, Feb 26, 2009 at 5:02 PM, H. Peter Anvin <hpa@zytor.com> wrote: > Martin K. Petersen wrote: >>>>>>> >>>>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes: >> >>>> Quick answer from one of my contacts. Desktop drives will indeed >>>> ship with an alignment of 1(*). The alignment is hardwired at time >>>> of manufacture and can't be changed. >>>> >> >> hpa> Oh God. >> >> hpa> This is a disaster. >> >> Rationale being that modern Microsoft operating systems know how to >> interpret the alignment bits. Legacy XP will work without changes >> thanks to the shifted alignment. And Vista+ will do the right thing to >> align partition 1 to what the drive reports. >> >> Also note that Windows only aligns the first partition. That's >> something we need to be aware of when setting up dual boot systems. >> > > Yeah, but all of this completely breaks the disk image abstraction, which is > a very powerful paradigm. > > -hpa > If the reported geometry of these drives was changed to have sectors / track be a multiple of 8, wouldn't that fix most of the issues? I.e., if the drive were to report 56 sectors per track, then a traditional partitioning tool would start the first partition at sector 56 and a Vista-like partitioning tool would place the first partition at sector 2048. Both would have the same 4K sector alignment. If my logic is sound, is there any way to get this recommendation upstream to hardware manufacturers? It seems like an almost trivial change for them. 
FYI: It sounds to me like partitioning tools should totally drop efforts to align with cylinders, instead they should start asking what the unit of atomic read/writes is at the physical layer and if any offsets are needed to align the partition with the atomic write areas. That would fit better for both SSD technology and for this 4K sectors issue than trying to continue to support cylinders at all. Thanks Greg -- Greg Freemyer Head of EDD Tape Extraction and Processing team Litigation Triage Solutions Specialist http://www.linkedin.com/in/gregfreemyer First 99 Days Litigation White Paper - http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf The Norcross Group The Intersection of Evidence & Technology http://www.norcrossgroup.com ^ permalink raw reply [flat|nested] 38+ messages in thread
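Greg's alignment argument, and the contrast with the 1-aligned drives described earlier in the thread, can be checked with a short C sketch. The helper below is hypothetical and simply tests a partition's starting LBA against the lowest naturally aligned LBA that a drive would report via READ CAPACITY(16).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if 'lba' sits on a physical sector boundary.
 *   per_phys       - logical sectors per physical sector (8 for 512/4096);
 *                    must be non-zero
 *   lowest_aligned - lowest naturally aligned LBA reported by the drive
 *                    (0 for a 0-aligned drive, 7 when LBA 63 is aligned) */
bool lba_is_aligned(uint64_t lba, uint32_t per_phys, uint32_t lowest_aligned)
{
	return lba % per_phys == lowest_aligned % per_phys;
}
```

On a 0-aligned 512/4096 drive, starts at sector 56 and sector 2048 both pass while sector 63 does not. On a drive where LBA 63 is aligned (lowest aligned LBA 7) the situation is reversed: sector 63 passes and sector 2048 does not, which is why the partitioning tool has to take the reported alignment into account rather than assume a fixed start.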
* Re: ATA support for 4k sector size 2009-03-16 14:51 ` Greg Freemyer @ 2009-03-16 16:27 ` H. Peter Anvin 2009-03-16 17:37 ` Greg Freemyer 2009-03-18 14:33 ` James Bottomley 1 sibling, 1 reply; 38+ messages in thread From: H. Peter Anvin @ 2009-03-16 16:27 UTC (permalink / raw) To: Greg Freemyer Cc: Martin K. Petersen, Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen Greg Freemyer wrote: > If the reported geometry of these drives was changed to have sectors / > track be a multiple of 8, wouldn't that fix most of the issues. > > ie. If the drive were to report 56 sectors per track, then a > traditional partitioning tool would start the first partition as > sector 56 and a Vista like partitioning tool would place the first > partition at sector 2048. Both would have the same 4K sector > alignment. > > If my logic is sound, anyway to get this recommendation upstream to > hardware manufacturers. It seems like an almost trivial change for > them. > > FYI: It sounds to me like partitioning tools should totally drop > efforts to align with cylinders, instead they should start asking what > the unit of atomic read/writes is at the physical layer and if any > offsets are needed to align the partition with the atomic write areas. > > That would fit better for both SSD technology and for this 4K sectors > issue than trying to continue to support cylinders at all. As long as BIOSes played along with it (which some of them may not do -- remember the geometry that matters is the one reported by the BIOS) However, it definitely would be a major step in the right direction, as it would let *most* systems Do The Right Thing instead of weirdly misaligning the partitions and trying to cope with that. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size
2009-03-16 16:27 ` H. Peter Anvin
@ 2009-03-16 17:37 ` Greg Freemyer
2009-03-16 18:11 ` H. Peter Anvin
0 siblings, 1 reply; 38+ messages in thread
From: Greg Freemyer @ 2009-03-16 17:37 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Martin K. Petersen, Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen
On Mon, Mar 16, 2009 at 12:27 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Greg Freemyer wrote:
>> If the reported geometry of these drives was changed to have sectors /
>> track be a multiple of 8, wouldn't that fix most of the issues.
>>
>> ie. If the drive were to report 56 sectors per track, then a
>> traditional partitioning tool would start the first partition as
>> sector 56 and a Vista like partitioning tool would place the first
>> partition at sector 2048. Both would have the same 4K sector
>> alignment.
>>
>> If my logic is sound, anyway to get this recommendation upstream to
>> hardware manufacturers. It seems like an almost trivial change for
>> them.
>>
>> FYI: It sounds to me like partitioning tools should totally drop
>> efforts to align with cylinders, instead they should start asking what
>> the unit of atomic read/writes is at the physical layer and if any
>> offsets are needed to align the partition with the atomic write areas.
>>
>> That would fit better for both SSD technology and for this 4K sectors
>> issue than trying to continue to support cylinders at all.
>
> As long as BIOSes played along with it (which some of them may not do --
> remember the geometry that matters is the one reported by the BIOS)
> However, it definitely would be a major step in the right direction, as
> it would let *most* systems Do The Right Thing instead of weirdly
> misaligning the partitions and trying to cope with that.
>
> -hpa
I'm not intimate with the details, but I would hope most boot loaders
by now use LBA values to find the boot code, not CHS.
If so the issue becomes the partitioning tools (fdisk etc.) putting
the partitions at the right place. Can't those tools bypass the BIOS
somehow and ask the drive itself what its geometry is?
From what I understand Vista has already made the jump and is now
ignoring CHS and instead just putting the first partition at 1 MiB
into the drive. (sector 2048 with 512 byte sectors.)
Sounds like fdisk and friends should be updated to do the same.
A bigger issue in my mind is lots of clones, images, etc. are probably
LBA based today and simply start the first partition at sector 63.
Thus the hardware vendors will need to have drives that perform well
with partitions that start at sector 63. The existing scheme
described may be as good as it gets for that need.
Also, how are SSD manufacturers handling this? Their erase blocks
don't align with partitions that start at sector 63 either I assume?
Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
^ permalink raw reply [flat|nested] 38+ messages in thread
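The start sectors mentioned in this message (56, 63 and 2048) can be checked against a 4K physical sector with a few lines of arithmetic. The following is only a sketch: it assumes 512-byte logical sectors, 4096-byte physical sectors, and a drive whose LBA 0 sits on a physical-sector boundary. The name `is_4k_aligned` is made up for illustration and does not come from fdisk or any other tool.

```python
LOGICAL_SECTOR = 512     # bytes per logical sector (assumed, as in the thread)
PHYSICAL_SECTOR = 4096   # bytes per physical sector on a "4k" drive (assumed)


def is_4k_aligned(start_lba: int) -> bool:
    """True if a partition starting at start_lba begins on a physical-sector
    boundary. Assumes LBA 0 is itself aligned (no drive-side offset)."""
    return (start_lba * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0


# Start sectors discussed in the thread.
for lba in (56, 63, 2048):
    print(lba, is_4k_aligned(lba))
```

Under these assumptions, 56 and 2048 are multiples of 8 logical sectors and land on a boundary, while 63 does not.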
* Re: ATA support for 4k sector size
2009-03-16 17:37 ` Greg Freemyer
@ 2009-03-16 18:11 ` H. Peter Anvin
2009-03-22 1:20 ` Bill Davidsen
0 siblings, 1 reply; 38+ messages in thread
From: H. Peter Anvin @ 2009-03-16 18:11 UTC (permalink / raw)
To: Greg Freemyer
Cc: Martin K. Petersen, Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen
Greg Freemyer wrote:
>
> I'm not intimate with the details, but I would hope most boot loaders
> by now use LBA values to find the boot code, not CHS.
>
Yes, for "true" hard drives this is pretty much universal these days
(except for MS-DOS and its ilk.) For USB and so on some BIOSes are
still stuck in old times. Sigh.
> If so the issue becomes the partitioning tools (fdisk etc.) putting
> the partitions at the right place. Can't those tools bypass the BIOS
> somehow and ask the drive itself what its geometry is?
It can, but it MUST NOT do so. The fields that are to be entered into
the partition table are BIOS CHS values, not any other kind of CHS.
> From what I understand Vista has already made the jump and is now
> ignoring CHS and instead just putting the first partition at 1 MiB
> into the drive. (sector 2048 with 512 byte sectors.)
>
> Sounds like fdisk and friends should be updated to do the same.
Yes, that is way overdue.
> A bigger issue in my mind is lots of clones, images, etc. are probably
> LBA based today and simply start the first partition at sector 63.
> Thus the hardware vendors will need to have drives that perform well
> with partitions that start at sector 63. The existing scheme
> described may be as good as it gets for that need.
>
> Also, how are SSD manufacturers handling this? Their erase blocks
> don't align with partitions that start at sector 63 either I assume?
Some SSD manufacturers (I don't know which ones) are actually examining
the partition table and doing different things. I know this because
they are permanently bricked if one writes an invalid partition table.
Not recommended.
-hpa
^ permalink raw reply [flat|nested] 38+ messages in thread
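The link between the reported sectors per track and the "sector 63" start can be made concrete with the conventional CHS-to-LBA mapping. This is a sketch under stated assumptions: it uses the standard formula LBA = (C * heads + H) * sectors_per_track + (S - 1), and it assumes the traditional DOS-style placement of the first partition at cylinder 0, head 1, sector 1. The function name `chs_to_lba` and the 255-head geometry are illustrative choices, not values taken from this thread.

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads: int, sectors_per_track: int) -> int:
    """Convert a CHS address to an LBA using the conventional mapping.
    Sectors are 1-based; cylinders and heads are 0-based."""
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be in 1..sectors_per_track")
    if not 0 <= head < heads:
        raise ValueError("head must be in 0..heads-1")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


# Assumed traditional first-partition start: C/H/S = 0/1/1.
print(chs_to_lba(0, 1, 1, heads=255, sectors_per_track=63))  # 63
print(chs_to_lba(0, 1, 1, heads=255, sectors_per_track=56))  # 56
```

With that placement the first partition starts at an LBA equal to the sectors-per-track value, which is why a reported geometry of 56 sectors per track would move a cylinder-aligning tool from sector 63 to sector 56.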
* Re: ATA support for 4k sector size
2009-03-16 18:11 ` H. Peter Anvin
@ 2009-03-22 1:20 ` Bill Davidsen
0 siblings, 0 replies; 38+ messages in thread
From: Bill Davidsen @ 2009-03-22 1:20 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-ide
H. Peter Anvin wrote:
> Greg Freemyer wrote:
>>
>> I'm not intimate with the details, but I would hope most boot loaders
>> by now use LBA values to find the boot code, not CHS.
>>
>
> Yes, for "true" hard drives this is pretty much universal these days
> (except for MS-DOS and its ilk.) For USB and so on some BIOSes are
> still stuck in old times. Sigh.
>
>> If so the issue becomes the partitioning tools (fdisk etc.) putting
>> the partitions at the right place. Can't those tools bypass the BIOS
>> somehow and ask the drive itself what its geometry is?
>
> It can, but it MUST NOT do so. The fields that are to be entered into
> the partition table are BIOS CHS values, not any other kind of CHS.
>
I can recall playing with the disk "geometry" in the expert part of
fdisk and not getting any particular bad effects, so I'm not sure why
you MUST NOT other than the SSD you mentioned. However, the reason it
was long ago is that aligning things in some possibly better way didn't
make a measurable difference in performance, so I went back to
defaults, so I agree with your advice, it doesn't seem to help.
--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: ATA support for 4k sector size
2009-03-16 14:51 ` Greg Freemyer
2009-03-16 16:27 ` H. Peter Anvin
@ 2009-03-18 14:33 ` James Bottomley
1 sibling, 0 replies; 38+ messages in thread
From: James Bottomley @ 2009-03-18 14:33 UTC (permalink / raw)
To: Greg Freemyer
Cc: H. Peter Anvin, Martin K. Petersen, Matthew Wilcox, Theodore Tso, linux-ide, linux-kernel, sandeen
On Mon, 2009-03-16 at 10:51 -0400, Greg Freemyer wrote:
> On Thu, Feb 26, 2009 at 5:02 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> > Martin K. Petersen wrote:
> >>>>>>>
> >>>>>>> "hpa" == H Peter Anvin <hpa@zytor.com> writes:
> >>
> >>>> Quick answer from one of my contacts. Desktop drives will indeed
> >>>> ship with an alignment of 1(*). The alignment is hardwired at time
> >>>> of manufacture and can't be changed.
> >>>>
> >>
> >> hpa> Oh God.
> >>
> >> hpa> This is a disaster.
> >>
> >> Rationale being that modern Microsoft operating systems know how to
> >> interpret the alignment bits. Legacy XP will work without changes
> >> thanks to the shifted alignment. And Vista+ will do the right thing to
> >> align partition 1 to what the drive reports.
> >>
> >> Also note that Windows only aligns the first partition. That's
> >> something we need to be aware of when setting up dual boot systems.
> >>
> >
> > Yeah, but all of this completely breaks the disk image abstraction, which is
> > a very powerful paradigm.
> >
> > -hpa
> >
> If the reported geometry of these drives was changed to have sectors /
> track be a multiple of 8, wouldn't that fix most of the issues.
>
> ie. If the drive were to report 56 sectors per track, then a
> traditional partitioning tool would start the first partition as
> sector 56 and a Vista like partitioning tool would place the first
> partition at sector 2048. Both would have the same 4K sector
> alignment.
>
> If my logic is sound, anyway to get this recommendation upstream to
> hardware manufacturers. It seems like an almost trivial change for
> them.
This is Ted Ts'o's proposed fix for the problem as well ... we do the
C/H/S translation in scsicam.c and it's then stored in the DOS
partition label. The only problems are that changing this on the fly
might be problematic for things that believe scsicam_bios_param without
first checking the DOS label (because they'll then be mismatched). And
also, if the vendors use their power to offset the first 4k block,
we'll be mismatched again. Plus, once the values are written in the
label we can't update them.
> FYI: It sounds to me like partitioning tools should totally drop
> efforts to align with cylinders, instead they should start asking what
> the unit of atomic read/writes is at the physical layer and if any
> offsets are needed to align the partition with the atomic write areas.
>
> That would fit better for both SSD technology and for this 4K sectors
> issue than trying to continue to support cylinders at all.
The DOS label is the problematic one ... all the rest have (mostly)
more sensible schemes. The problem is that the DOS label is the most
prevalent one.
James
^ permalink raw reply [flat|nested] 38+ messages in thread
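The "alignment of 1" quoted above, and the concern about vendors offsetting the first 4k block, can be illustrated with a small sketch. It rests on an assumption about what that alignment value means: that LBA 0 is placed one logical sector into the first physical sector, so a logical sector `lba` starts a physical sector when `(lba + offset)` is a multiple of 8. The names `is_aligned` and `next_aligned_lba` are hypothetical, and the sizes (512-byte logical, 4096-byte physical) are assumed as elsewhere in the thread.

```python
SECTORS_PER_PHYSICAL = 8   # 4096-byte physical / 512-byte logical (assumed)


def is_aligned(lba: int, offset: int = 0) -> bool:
    """True if lba starts a physical sector, given the drive's alignment
    offset in logical sectors (assumed meaning: position of LBA 0 within
    the first physical sector)."""
    return (lba + offset) % SECTORS_PER_PHYSICAL == 0


def next_aligned_lba(min_lba: int, offset: int = 0) -> int:
    """Smallest LBA >= min_lba that starts a physical sector."""
    remainder = (min_lba + offset) % SECTORS_PER_PHYSICAL
    if remainder == 0:
        return min_lba
    return min_lba + SECTORS_PER_PHYSICAL - remainder


# Offset 0: sector 63 is misaligned, sector 2048 is aligned.
print(is_aligned(63), is_aligned(2048))
# Offset 1 (shifted drive): sector 63 becomes aligned, sector 2048 does not.
print(is_aligned(63, offset=1), is_aligned(2048, offset=1))
# A tool that honours the offset would have to move a 1 MiB start.
print(next_aligned_lba(2048, offset=1))
```

On this reading, a shifted drive makes a legacy sector-63 start aligned, while a fixed geometry change such as 56 sectors per track, or a hard-coded 1 MiB start, would be misaligned again unless the tool also reads the offset from the drive.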
* hdparm-9.12 released
2009-02-25 22:24 ATA support for 4k sector size Matthew Wilcox
` (2 preceding siblings ...)
2009-02-25 22:53 ` ATA support for 4k sector size H. Peter Anvin
@ 2009-02-26 18:22 ` Mark Lord
3 siblings, 0 replies; 38+ messages in thread
From: Mark Lord @ 2009-02-26 18:22 UTC (permalink / raw)
To: linux-ide; +Cc: linux-kernel, Matthew Wilcox
hdparm-9.12 is now available for download from sourceforge.net.
Changes from 9.11 include:
- added logical/physical sector size reporting
- updated -I output with SATA-2.6 additions
- support APM level retrieval with -B flag
- updated -C output to match ATA8
- added "form factor" and "rotation" display to -I,
  courtesy of Martin K. Petersen.
Cheers
--
Mark Lord
Real-Time Remedies Inc.
mlord@pobox.com
^ permalink raw reply [flat|nested] 38+ messages in thread